On Tech

Author: Steve Smith

Continuous Delivery and Cost of Delay

Use Cost of Delay to value Continuous Delivery features

When building a Continuous Delivery pipeline, we want to value and prioritise our backlog of planned features to maximise our return on investment. The time-honoured, ineffective IT approach of valuation by intuition and prioritisation by cost is particularly ill-suited to Continuous Delivery, due to its focus upon one-off infrastructure improvements to enable product flow. How can we value and prioritise our backlog of planned pipeline features to maximise economic benefits?

To value our backlog, we can calculate the Cost of Delay of each feature – its economic value over a period of time if it was immediately available. Described by Don Reinertsen as “the golden key that unlocks many doors”, Cost of Delay can be calculated by quantifying the value of change or the cost of the status quo via the following economic benefit types:

  • Increase Revenue – improve profit margin
  • Protect Revenue – sustain profit margin
  • Reduce Costs – reduce costs currently incurred
  • Avoid Costs – reduce costs potentially incurred

Cost of Delay allows us to quantify the opportunity cost between a feature being available now or later, and using money as the unit of measurement transforms stakeholder conversations from cost-cutting to delivering value. Calculation accuracy is less important than the process of collaborative information discovery, with assumptions and probabilities preferably co-owned by stakeholders and published via an information radiator.

Cost of Delay = economic value over time if immediately available

To prioritise our backlog, we can use Cost of Delay Divided By Duration (CD3) – a variant of the Weighted Shortest Job First scheduling policy. With CD3 we divide Cost of Delay by duration, with a higher score resulting in a higher priority. This is an effective scheduling policy as the duration denominator promotes batch size reduction.

CD3 = Cost of Delay / Duration
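As a minimal sketch of CD3 scheduling in Python – the Cost of Delay figures come from the worked example below, while the duration estimates are hypothetical, as the article’s estimates appeared in an image:

    def cd3(cost_of_delay_per_day, duration_in_days):
        # Cost of Delay Divided by Duration: higher scores are scheduled first
        return cost_of_delay_per_day / duration_in_days

    backlog = [
        ("Support Oranges application", 6700, 10),   # hypothetical 10 day estimate
        ("Support test framework", 10848, 12),       # hypothetical 12 day estimate
        ("Support database migrator", 27500, 15),    # hypothetical 15 day estimate
    ]

    # Sort the backlog by descending CD3 score to produce the work queue
    for name, cod, days in sorted(backlog, key=lambda f: cd3(f[1], f[2]), reverse=True):
        print(f"{name}: CD3 = {cd3(cod, days):.0f}")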

As the goal of Continuous Delivery is to decrease cycle time by reducing the transaction cost of releasing software, a pipeline feature will likely yield an Avoid Cost or Reduce Cost benefit intrinsically linked to release cadence. We can therefore calculate the Cost of Delay as one of the below:

  1. Reduce Cost: Automate action(s) to decrease wait times within release processing time

    = (wait time in minutes / cycle time in days) * minute price in £

  2. Avoid Cost: Automate action(s) to decrease probability of repeating release processing time due to rework

    = (processing time in minutes / cycle time in days) * minute price in £ * % cost probability per year
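As a minimal sketch of these two calculations – the function names are mine, and the £10000 minute price matches the worked example that follows:

    def reduce_cost_cod(wait_time_mins, cycle_time_days, minute_price):
        # Reduce Cost: value per day of removing wait time from release processing
        return (wait_time_mins / cycle_time_days) * minute_price

    def avoid_cost_cod(processing_time_mins, cycle_time_days, minute_price, annual_probability):
        # Avoid Cost: value per day of avoiding a probable repeat of processing time
        return (processing_time_mins / cycle_time_days) * minute_price * annual_probability

    # e.g. the Oranges application below: 3 wait times of 20 minutes, 90 day cycle time
    print(reduce_cost_cod(20 + 20 + 20, 90, 10000))  # ≈ £6667 per day (£6700 after rounding)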

For example, consider an organisation building a Continuous Delivery pipeline to support its Apples, Bananas, and Oranges applications by fully automating its release scripts. The rate of business change is variable, with an Apples cycle time of 1 month, a Bananas cycle time of 2 months, and an Oranges cycle time of 3 months. Our pipeline has already fully automated the deploy, stop, and start actions for our Apples and Bananas applications but lacks support for our Oranges application, our test framework, and our database migrator.
[Figure: Application estate]

Once our development team have provided their cost estimates, how do we determine which feature to implement next without resorting to intuition?

[Figure: Backlog durations]

We begin by agreeing with our pipeline stakeholders an arbitrary price of £10000 for a minute of our time, and calculate the Cost of Delay for supporting the Oranges application as:
Support Oranges application

= (wait time / cycle time) * minute price
= ((20 + 20 + 20) / 90) * 10000
= 0.67 * 10000
= £6700 per day

Given that the test framework has failed twice in the past year, each time causing a repeat of release processing time specifically due to its lack of pipeline support, the Cost of Delay is:
Support test framework

= (100 / months in a year) * occurrences
= (100 / 12) * 2
≈ 16% cost probability per year

= (processing time / cycle time) * minute price * % cost probability
= ((100 / 30) + (100 / 60) + (160 / 90)) * 10000 * 16%
= 6.78 * 10000 * 16%
= £10848 per day (£5328 Apples, £2672 Bananas, £2848 Oranges)

The Cost of Delay for supporting the database migrator is:

Support database migrator

= (wait time / cycle time) * minute price
= ((45 / 30) + (45 / 60) + (45 / 90)) * 10000
= 2.75 * 10000
= £27500 per day (£15000 Apples, £7500 Bananas, £5000 Oranges)

Now that we have established the value of the planned pipeline features, we can use CD3 to produce an optimal work queue. CD3 confirms that support for the database migrator is our most urgent priority:

[Figure: Backlog prioritised by CD3]

This example shows that using Cost of Delay and CD3 within Continuous Delivery validates Mary Poppendieck’s argument that “basing development decisions on economic models helps the development team make good tradeoff decisions”. As well as learning that support for the database migrator is twice as valuable as any current alternative, we can offer new options to our pipeline stakeholders – for example, if an Apples-specific database migrator required only 5 days, it would become our most desirable feature (£15000 per day / 5 days = CD3 score of 3000).

No Projects

Projects kill flow and teams. Focus on products, not projects

Since the Dawn of Computer Time, enormous sums of money and embarrassing amounts of time have been squandered upon software projects that have delivered little or no return on investment, with projects floundering between segregated Business and IT divisions squabbling over overestimated value-add and underestimated delivery dates. Given Grant Rule’s assertion that “studies too numerous to mention show that software projects are challenged or fail”, why are software projects so prone to failure and why do they persist?

To answer these questions, we must understand what constitutes a software project and why its delivery model is incongruent with product development. If we start with the PRINCE2 project definition of “a temporary organization that is needed to produce a unique and predefined outcome or result at a pre-specified time using predetermined resources”, we can offer a concise definition as follows:

A project is a fixed amount of time and money assigned to deliver value-add

The key characteristic of a software project appears to be its fixed end date, which as a delivery model has been repeatedly debunked by IT practitioners such as Allan Kelly denouncing “endless, pointless discussions about when it will be done… successful software doesn’t have a pre-specified end date” and Marc Lankhorst arguing that “over 80% of IT spending in large organisations is on maintenance”. However, the fixed end date of a software project is invariably a consequence of its requirement for a collection of value-adding features to be simultaneously delivered, suggesting an augmented definition of:

A project is a fixed amount of time and money assigned to deliver a large batch of value-add

Once we view software projects as large batches of value-add, we can apply The Principles Of Product Development Flow by Don Reinertsen and better understand why so many projects fail:

  1. Increased cycle time – a project might not be deliverable on a particular date unless either demand is throttled or capacity is increased, e.g. artificially reduce user demand or increase staffing levels
  2. Increased variability – a project might be delayed due to unpredictable blockages in the value stream, e.g. testing of features B and C blocked while testing of feature A takes longer than expected
  3. Increased feedback delays – a project might incur significant costs due to slow feedback on bad design decisions and/or defects increasing rework, e.g. failures in feature C not detected until features A and B have passed testing
  4. Increased risk – a project might have an increased probability and cost of failure due to increased requirements/technology change, increased variation, and increased feedback delays
  5. Increased overheads – a project might endure development inefficiencies due to increased requirements/technology change, e.g. feature C development time increased by the need to understand the complexity of features A and B
  6. Increased inefficiencies – a project might encounter increased transaction costs due to increased requirements/technology change, e.g. feature A slow to release as features B and C are also required for release
  7. Increased irresponsibility – a project might suffer from diluted responsibilities, e.g. staff member has responsibility for delivery of feature A but is unincentivised to participate in delivery of features B or C

Don also provides a compelling explanation as to why the project delivery model remains prevalent: large batches can become institutionalised as they “appear to have scale economies that increase efficiency [and] appear to reduce variability”. Software projects might indeed appear efficient due to perceived value stream inefficiencies and the counter-intuitiveness of batch size reduction, but from a product development standpoint the project is an inefficient, ineffective delivery model that impedes value, quality, and flow.

There is a compelling alternative to the project delivery model – product development flow, in which we apply economic theory to Lean product development practices in order to flow product designs through our organisation. Product development flow emphasises the benefits of batch size reduction and encourages a one piece continuous flow delivery model, in order to reduce costs and improve return on investment.

Discarding the project delivery model in favour of product development flow requires an entirely different mindset, as epitomised by Grant urging us to “accommodate the ideas of flow production and lean systems thinking” and Allan affirming that “BAU isn’t a dirty word… enhancing products is Business As Usual, we should be proud of that”. On that basis the No Projects movement was conceived by Joshua Arnold to promote the valuation of products over projects, and anointed as:

Projects kill flow and teams. Focus on products, not projects

Application antipattern: Serialisation

Serialisation increases batch size and cycle time

When designing applications for Continuous Delivery, our goal is to grow an architecture that minimises batch size and facilitates a low cycle time. However, architectural decisions are often local optimisations that value efficiency over effectiveness and compromise our ability to rapidly release software, and a good example is the use of object serialisation and pseudo-serialisation between consumer/producer applications.

Object serialisation occurs when the producer implementation of an API is serialised across the wire and reused by the consumer application. This approach is promoted by binary web services such as Hessian.

[Figure: Object serialisation]

Pseudo-serialisation occurs when the producer implementation of an abstraction encapsulating the API is reused by the consumer application. This approach often involves auto-generating code from a schema and is promoted by tools such as JAXB and WSDL Binding.

[Figure: Pseudo-serialisation]

Both object serialisation and pseudo-serialisation impede quality by creating a consumer/producer binary dependency that significantly increases the probability of runtime communication failures. When a consumer is dependent upon a producer implementation of an API, even a minor syntax change in the producer can cause runtime incompatibilities with the unchanged consumer. As observed by Ian Cartwright, serialising objects over the wire means “we’ve coupled our components together as tightly as if we’d just done RPC”.

A common solution to combat this increased risk of failure is to couple consumer/producer versioning, so that both applications are always released at the same version and at the same point in time. This strategy is enormously detrimental to Continuous Delivery as it inflates batch size and cycle time, with larger change sets per release resulting in an increased transaction cost, an increased risk of release failure, and an increased potential for undesirable behaviours.

[Figure: Producer and consumer versions]

For example, when a feature is in development and our counterpart application is unchanged, it must still be released simultaneously. This overproduction of application artifacts increases the amount of inventory waste in our value stream.

[Figure: Wasteful versions]

Alternatively, when a feature is in development and our counterpart application is also in development, the release of our feature will be blocked until the counterpart is ready. This delays customer feedback and increases our holding costs, which could have a considerable economic impact if our new feature is expected to drive revenue growth.

[Figure: Blocked versions]

The solution to this antipattern is to understand that an API is a contract, not an object, and that document-centric messaging is consequently a far more effective method of continuously delivering distributed applications. By communicating context-neutral documents between consumer and producer, we eliminate shared code artifacts and allow our applications to be released independently.

While document-centric messaging reduces the risk of runtime incompatibilities, a new producer version could still introduce an API change that would adversely affect one or more consumers. We can protect consumer applications by implementing the Tolerant Reader pattern and leniently parsing a minimal amount of information from the API, but the producer remains unaware of consumer usage patterns and as a result any incompatibility will remain undetected until integration testing at the earliest.
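As a minimal sketch of the Tolerant Reader pattern in Python – the document format and field names are hypothetical – a consumer leniently parses only the fields it needs and ignores everything else:

    import json

    def read_order(document):
        # Parse only the minimal fields this consumer needs; tolerate
        # missing optional fields and ignore unknown ones
        data = json.loads(document)
        return {
            "id": data["orderId"],                    # required field
            "status": data.get("status", "UNKNOWN"),  # optional field
        }

    # A newer producer version can add fields without breaking this consumer
    print(read_order('{"orderId": 42, "status": "SHIPPED", "newField": true}'))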

A more holistic approach is the use of Consumer Driven Contracts, where each consumer supplies the producer with a testable specification defining its expectations of a conversation. Each contract self-documents consumer/producer interactions and can be plugged into the producer commit build to assert it remains unaffected by different producer versions. When a change in the producer codebase introduces an API incompatibility, it can be identified and assessed for consumer impact before the new producer version is even created.
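A minimal sketch of a consumer-driven contract, assuming a hypothetical contract format and a producer that returns dictionaries – the consumer publishes its expectations as an executable check that runs in the producer commit build:

    # The consumer's expectations of the conversation, expressed as data
    CONSUMER_CONTRACT = {"orderId": int, "status": str}

    def check_contract(producer_response, contract=CONSUMER_CONTRACT):
        # Assert a candidate producer version still satisfies this consumer
        for field, field_type in contract.items():
            assert field in producer_response, f"missing field: {field}"
            assert isinstance(producer_response[field], field_type), f"wrong type: {field}"

    # Run in the producer commit build before a new producer version is created
    check_contract({"orderId": 42, "status": "SHIPPED", "extra": "ignored"})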

By using document-centric messaging and Consumer Driven Contracts, we can continuously deliver distributed applications with a low batch size and a correspondingly low cycle time. The impact of architectural decisions upon Continuous Delivery should not be underestimated.

Release more with less

Continuous Delivery enables batch size reduction

Continuous Delivery aims to overcome the large delivery costs traditionally associated with releasing software, and in The Principles of Product Development Flow Don Reinertsen describes delivery cost as a function of transaction cost and holding cost. While transaction costs are incurred by releasing a product increment, holding costs are incurred by not releasing a product increment and are proportional to batch size – the quantity of in-flight value-adding features, and the unit of work within a value stream.

[Figure: Economic batch size, after Reinertsen]

The above graph shows that a reduction in transaction cost alone will not dramatically impact delivery cost without a corresponding reduction in batch size, and this mirrors our assertion that automation alone cannot improve cycle time. However, Don also states that “the primary controllable factor that enables small batches is low transaction cost per batch”, and by implementing Continuous Delivery we can minimise transaction costs and subsequently release smaller change sets more frequently, obtaining the following benefits:

  1. Improved cycle time – smaller change sets reduce queued work (e.g. pending deployments), and due to Little’s Law cycle time is decreased without constraining demand (e.g. fewer deployments) or increasing capacity (e.g. more deployment staff) – see the sketch after this list
  2. Improved flow – smaller change sets reduce the probability of unpredictable, costly value stream blockages (e.g. multiple deployments awaiting signoff)
  3. Improved feedback – smaller change sets shrink customer feedback loops, enabling product development to be guided by Validated Learning (e.g. measure revenue impact of new user interface)
  4. Improved risk – smaller change sets reduce the quantity of modified code in each release, decreasing both defect probability (i.e. less code to misbehave) and defect cost (i.e. less complex code to debug and fix)
  5. Improved overheads – smaller change sets reduce transaction costs by encouraging optimisations, with more frequent releases necessitating faster tooling (e.g. multi-core processors for Continuous Integration) and streamlined processes (e.g. enterprise-grade test automation)
  6. Improved efficiency – smaller change sets reduce waste by narrowing defect feedback loops, and decreasing the probability of defective code harming value-adding features (e.g. user interface change dependent upon defective API call)
  7. Improved ownership – smaller change sets reduce the diluted sense of responsibility in large releases, increasing emotional investment by limiting change owners (e.g. single developer responsible for change set, feedback in days not weeks)

Despite the business-facing value proposition of Continuous Delivery, there may be no incentive from the business team to increase release cadence. However, the benefits of releasing smaller change sets more frequently – improved feedback, risk, overheads, efficiency, and ownership – are also operationally advantageous, and this should be viewed as an opportunity to educate those unaware of the power of batch size reduction. Such a scenario is similar to the growth of Continuous Integration a decade ago, when the operational benefits of frequently integrating smaller code changes overcame the lack of business incentive to increase the rate of source code integration.

Business requirements = minimum release cadence
Operational requirements = maximum release cadence

A persistent problem with increasing release cadence is Eric Ries’ assertion that “the benefits of small batches are counter-intuitive”, and in organisations long accustomed to a high delivery cost it seems only natural to artificially constrain demand or increase capacity to smooth the value stream. For example, our organisation has a 28 day cycle time of which 7 days are earmarked for release testing. In this situation, decreasing cadence to a 36 day cycle time appears less costly than increasing cadence to a 14 day cycle time, as release testing will ostensibly decrease to 19% of our cycle time rather than increase to 50%. However, this ignores both the holding cost of constraining demand and the long-unimplemented optimisations we would be compelled to introduce to achieve a higher release cadence (e.g. increased level of test automation).
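Spelling out the release testing arithmetic:

= 7 days testing / 36 day cycle time = 19% of cycle time
= 7 days testing / 14 day cycle time = 50% of cycle time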

Improving cycle time is not just about using Continuous Delivery to reduce transaction costs – we must also be courageous, and release more with less.

Build Continuous Delivery in

Building Continuous Delivery into an organisation requires radical change

While Continuous Delivery has a well-defined value proposition and a seminal book on how to implement a deployment pipeline, there is a dearth of information on how to transform an organisation for Continuous Delivery. Despite its culture-focussed principles and an adoption process described by Jez Humble as “organisational-focussed rather than tools-centric”, many Continuous Delivery initiatives fail to emphasise an organisational model in which software is always releasable. This contravenes Lean Thinking and the Deming 95/5 Rule – that 95% of problems are attributable to system faults, while only 5% are due to special causes of variation. Building an automated deployment pipeline can eliminate the 5% of special causes of variation in our value stream (e.g. release failures), but it cannot address the remaining 95% of problems caused by our organisation structure (e.g. wait times between silos). From this we can infer that:

Continuous Delivery = 95% organisation, 5% automation

Establishing a Continuous Delivery culture requires a change management programme more challenging, time-consuming, and valuable than any technology-based efforts. Donella Meadows recommended that to effect change we “arrange the structures and conditions to reduce the probability of destructive behaviours and to encourage the possibility of beneficial ones”, and we can achieve this by using the change patterns of Linda Rising and Mary Lynn Manns within the change management supermodel of Jurgen Appelo:

  • Dance with the System
  • Mind the People
  • Stimulate the Network
  • Change the Environment

To dance with the system, we propose a made-to-order Continuous Delivery programme, with a tailor-made business case that emphasises reduced transaction costs and/or increased customer value according to the needs of our organisation. We must identify a Local Sponsor to support our efforts and a Corporate Angel to increase awareness, and we should communicate successful case studies to our stakeholders as External Validation.

To mind the people, we construct a collaborative, bottom-up change programme that encourages participation. We need to Involve Everyone from the outset, and apply a Personal Touch with each individual stakeholder to pitch Continuous Delivery in terms of their incentives. We should use Corridor Politics to promote our change initiative, Just Say Thanks to our contributors, and highlight value stream waste without dispute – as Morgan Wootten said, “a lighthouse doesn’t blow a horn, it shines a light”.

To stimulate the network, we emulate the Diffusion of Innovations theory of Everett Rogers and exploit the social network that comprises our organisation. We must encourage Innovators to spark an interest in our change initiative, and then form a group of Early Adopters to offer us early feedback. We need to Ask For Help from Connectors to evangelise to their peers on our behalf, and by Staying In Touch with our supporters we can work towards an Early Majority invested in Continuous Delivery.

To change the environment, we focus upon changing our organisation structure and processes to instil a culture of Continuous Delivery. We need to radiate our value stream In Your Space to raise awareness of cycle time, lead times, and wait times using Just Enough repackaged Lean terminology (e.g. “average time to market” instead of cycle time). We must work as Bridge Builders between different siloed teams to reduce our communications burden, and we should develop our pipeline Step By Step to encourage the good practices and discourage the bad (e.g. enforcing the decoupling of deployment from release in a user interface).

Building Continuous Delivery into an organisation can be achieved by automating a deployment pipeline and implementing a change management programme, but we should remember Jurgen Appelo’s advice that changing people “is hard to do without an expensive operating table”. Our change programme must be tailored to business requirements, personalised for each stakeholder, and focussed upon improving the environment – and we should always remember:

Building a Continuous Delivery pipeline is easy. Building a Continuous Delivery organisation is hard

Continuous Delivery != DevOps

Continuous Delivery and DevOps are interdependent, not equivalent

Since the publication of Dave Farley and Jez Humble’s seminal book on Continuous Delivery in 2010, its rise within the IT industry has been paralleled by the growth of the DevOps movement. While Continuous Delivery has an explicit goal of optimising for cycle time and an established set of principles and practices, DevOps is a more organic philosophy, defined as “aligning development and operations roles and processes in the context of shared business objectives” and still gradually codifying into principles and practices. Continuous Delivery and DevOps possess a shared background in agile methods and Lean Thinking, and a shared desire to eliminate Waterscrumfall silos – but what is the nature of their relationship?

In Continuous Delivery, practitioners such as Jez Humble have warned that organisations require “a culture that enables collaboration and understanding between the functional groups that deliver IT services”, which refers to the culture-centric principles – Continuous Improvement, Done Means Released, and Everybody Is Responsible – that reduce handover delays between siloed teams. DevOps provides an implementation strategy for these principles – its emphasis upon “the integration of Agile principles with Operations practices” aligns Development and Operations working practices and encourages cooperation. However, these principles can also be implemented independently of DevOps – for example, an organisation might forgo a QA team in favour of mandatory Development support for production releases, as at Facebook.

In DevOps, one of the four key areas described by Patrick Debois is Extend Delivery To Production. The intention is for the delivery mechanism to act as a focal point for collaboration between Development and Operations, resulting in improved speed/reliability of releases and a sense of shared responsibility for production systems. Continuous Delivery offers an implementation strategy for this key area – a deployment pipeline provides a shared one-button workflow, encourages the emergence of a shared codebase and toolchain, and facilitates a release cadence that minimises change sets and the risk of failure. However, it should be noted that Extend Delivery To Production could be accomplished without Continuous Delivery – for example, a push-based Continuous Deployment mechanism might underpin the value stream instead of a pull-based pipeline, as at IMVU.

From the above we can surmise that Continuous Delivery and DevOps are interdependent, but the inherent fuzziness of the DevOps philosophy allows different interpretations of the relationship. For example, Jeff Sussna recently contended that “delivering software as service makes operations an explicit part of the customer value proposition… customers view functionality and operability as inseparable aspects of service” and that by defining DevOps “not in terms of how IT structures itself, but rather in terms of what customers expect” we can say “DevOps IS Continuous Delivery”. While it is an interesting approach to couple DevOps to customer expectations, the commonly accepted definitions focus upon internal organisational change in order to meet business objectives, which may or may not include operability as a first-class concept. It is evident that SaaS customers will have explicit operability requirements, but for many organisations the reality is that customers explicitly expect functionality and timeliness while implicitly expecting operability. For example, Jeff uses a restaurant review metaphor to describe the combined value of functionality and operability (“the food was great but the service was terrible”), but restaurant customers cannot observe back-of-house operability and will likely only comment upon front-of-house operability if it impacts upon functionality and/or timeliness.

Jeff also makes a comparison of nomenclature, suggesting that “for agile development and Continuous Delivery the name describes the value… in the case of DevOps, the name describes the implementation, not the desired outcome”. Surely the desired outcome of DevOps is expressed in the portmanteau – Development and Operations teams seamlessly working together to deliver value-adding features to the customer.

Optimal cycle time strategy

How should you try to optimise cycle time from idea to customer? How can you optimise accessible constraints, and radiate the inaccessible?

The goal of Continuous Delivery is to optimise for cycle time, so that we can reduce lost opportunity costs and improve our time-to-market. However, how do we construct a cycle time strategy, and how might it be implemented without a comprehensive change mandate? A study of Continuous Delivery experience reports and Lean Thinking suggests some common impediments to optimising cycle time:

  1. Excessive rework
  2. Long lead times
  3. Incongruent organisation structure

From the above we can therefore form an ideal cycle time strategy:

Optimise cycle time = optimise product integrity + optimise lead times + optimise organisation

Optimising product integrity is essential as rework has a pernicious influence upon delivery cadence, highlighted by David Anderson stating that “unplanned rework due to bugs lengthens lead times… and greatly reduces throughput”. By using practices such as Acceptance Test Driven Development and root cause analysis as well as applying Continuous Delivery principles such as Build Quality In and Repeatable Reliable Process, we can trim our defect waste and gradually remove rework from the value stream.

Optimising lead times encourages us to recognise that unreleased product increments are valueless inventory, and that we should accelerate our pathway to production until we obtain a First Mover Advantage over our competitors. By introducing Work In Progress limits to reduce batch sizes and employing the Continuous Delivery principles of Automate Almost Everything and Bring Pain Forward, we can curtail our inventory waste and deliver value-adding features to our customers faster.

Optimising an organisation offers both the greatest challenge and the greatest potential for cycle time optimisations, particularly in siloed organisations. Described by Jez Humble as a “response to the historical expense of computing resources and the high transaction cost of putting out a release [that results in] lower software quality, lower production stability, and less frequent releases”, the siloed organisation remains a prevalent model despite its inherent coordination costs. By restructuring our organisation into product-centric, cross-functional teams and instilling the Continuous Delivery principles of Everybody Is Responsible and Continuous Improvement, we can eliminate our wait waste and obtain a significant cycle time reduction.

At the outset of our Continuous Delivery programme, a value stream mapping and analysis of product defects will likely indicate our expected cycle time impediments, and we should present these findings to our stakeholders along with our ideal cycle time optimisation strategy. However, the ambitious scope of our strategy means that without executive sponsorship our change mandate is unlikely to extend to such radical notions as establishing cross-functional teams. In this situation we should use the confines of our mandate to derive an organisation-specific optimal cycle time strategy:

Optimise cycle time = optimise product integrity + optimise lead times + optimise organisation

Rather than being discouraged by the limitations of our mandate, we can use it to guide our optimisation efforts according to constraint accessibility. If we cannot optimise the organisation, we optimise lead times. If we cannot optimise lead times, we optimise product integrity. After each successful change is implemented, we communicate to our stakeholders both the net gain in cycle time and the larger, inaccessible potential improvements:

Optimise the accessible, radiate the inaccessible

In this manner we can gradually build confidence in our Continuous Delivery programme, until our change mandate is broadened to encompass the comprehensive change required to dramatically improve both our cycle time and our product revenues.

The Strangler Pipeline – Autonomation

The Strangler Pipeline is grounded in autonomation

Previous entries in the Strangler Pipeline series:

  1. The Strangler Pipeline – Introduction
  2. The Strangler Pipeline – Challenges
  3. The Strangler Pipeline – Scaling Up
  4. The Strangler Pipeline – Legacy and Greenfield

The introduction of Continuous Delivery to an organisation is an exciting opportunity for Development and Operations to Automate Almost Everything into a Repeatable Reliable Process, and at Sky Network Services we aspired to emulate organisations such as LMAX, Springer, and 7Digital by building a fully automated Continuous Delivery pipeline to manage our Landline Fulfilment and Network Management platforms. We began by identifying our Development and Operations stakeholders, and establishing a business-facing programme to automate our value stream. We emphasised to our stakeholders that automation was only a step towards our end goal of improving upon our cycle time of 26 days, and that the Theory Of Constraints warns that automating the wrong constraint will have little or no impact upon cycle time.

Our determination to value cycle time optimisation above automation in the Strangler Pipeline was soon justified by the influx of new business projects. The unprecedented growth in our application estate led to a new goal of retaining our existing cycle time while integrating our greenfield application platforms, and as our core business domain is telecommunications not Continuous Delivery we concluded that fully automating our pipeline would not be cost-effective. By following Jez Humble and Dave Farley’s advice to “optimise globally, not locally”, we focussed pipeline stakeholder meetings upon value stream constraints and successfully moved to an autonomation model aimed at stakeholder-driven optimisations.

Described by Taiichi Ohno as one of “the two pillars of the Toyota Production System”, autonomation is defined as automation with a human touch. It refers to the combination of human intelligence and automation where full automation is considered uneconomical. While the most prominent example of autonomation is problem detection at Toyota, we have applied autonomation within the Strangler Pipeline as follows:

  • Commit stage. While automating the creation of an aggregate artifact when a constituent application artifact is committed would reduce the processing time of platform creation, it would have zero impact upon cycle time and would replace Operations responsibility for release versioning with arbitrary build numbers. Instead the Development teams are empowered to track application compatibilities and create aggregate binaries via a user interface, with application versions selectable in picklists and aggregate version numbers auto-completed in order to reduce errors.
  • Failure detection and resolution. Although creating an automated rollback or self-healing releases would harden the Strangler Pipeline, we agreed that such a solution was not a constraint upon cycle time and would be costly to implement. When a pipeline failure occurs it is recorded in the metadata of the application artifact, and we Stop The Line to prevent further use until a human has logged onto the relevant server(s) to diagnose and correct the problem.
  • Pipeline updates. Although the high frequency of Strangler Pipeline updates implies value in further automation of its own Production release process, a single pipeline update cannot improve cycle time and we wish to retain scheduling flexibility – as pipeline updates increase the probability of release failure, it would be unwise to release a new pipeline version immediately prior to a Production platform release. Instead a Production request is submitted for each signed off pipeline artifact, and while the majority are immediately released the Operations team reserve the right to delay if their calendar warns of a pending Production platform release.

Autonomation emphasises the role of root cause analysis, and after every major release failure we hold a session to identify the root cause of the problem, the lessons learned, and the necessary counter-measures to solve it permanently. At the time of writing our analysis shows that 13% of release failures were caused by pipeline defects, 10% by misconfiguration of TeamCity Deployment Builds, and the majority originated in our siloed organisational structure. This data provides an opportunity to measure our adoption of the principles of Continuous Delivery according to Shuhari:

  • shu – By scaling our automated release mechanism to manage greenfield and legacy application platforms, we have implemented Repeatable Reliable Process, Automate Almost Everything, and Keep Everything In Version Control
  • ha – By introducing combinational static analysis tests and a pipeline user interface to reduce our defect rate and TeamCity usability issues, we have matured to Bring The Pain Forward and Build Quality In
  • ri – Sky Network Services is a Waterscrumfall organisation where Business, Development, and Operations work concurrently on different projects with different priorities, which means we sometimes fall foul of Conway’s Law and compete over constrained resources to the detriment of cycle time. We have yet to achieve Done Means Released, Everybody Is Responsible, and Continuous Improvement

An example of our organisational structure impeding cycle time would be the first release of the new Messaging application 186-13, which resulted in the following value stream audit:

[Figure: Messaging 186-13 value stream]

While each pipeline operation was successful in less than 20 seconds, the disparity between Commit start time and Production finish time indicates significant delivery problems. Substantial wait times between environments contributed to a lead time of 63 days, far in excess of our average lead time of 6 days. Our analysis showed that Development started work on Messaging 186-13 before Operations ordered the necessary server hardware, and as a result hardware lead times restricted environment availability at every stage. No individual or team was at fault for this situation – the fault lay in the system, with Development and Operations working upon different business projects at the time with non-aligned goals.

With the majority of the Sky Network Services application estate now managed by the Strangler Pipeline, it seems timely to reflect upon our goal of retaining our original cycle time of 26 days. Our data suggests that we have been successful, with the cycle time of our Landline Fulfilment and Network Management platforms now 25 days and our greenfield platforms between 18 and 21 days. However, examples such as Messaging 186-13 remind us that cycle time cannot be improved by automation alone, and we must now redouble our efforts to implement Done Means Released, Everybody Is Responsible, and Continuous Improvement. By building the Strangler Pipeline we have followed Donella Meadows’ change management advice to “reduce the probability of destructive behaviours and to encourage the possibility of beneficial ones” and given all we have achieved I am confident that we can Continuously Improve together.

My thanks to my colleagues at Sky Network Services

The Strangler Pipeline – Legacy and greenfield

The Strangler Pipeline uses the Stage Strangler pattern to manage legacy and greenfield applications

Previous entries in the Strangler Pipeline series:

  1. The Strangler Pipeline – Introduction
  2. The Strangler Pipeline – Challenges
  3. The Strangler Pipeline – Scaling Up

When our Continuous Delivery journey began at Sky Network Services, one of our goals was to introduce a Repeatable, Reliable Process for our Landline Fulfilment and Network Management platforms by creating a pipeline deployer to replace the disparate Ruby and Perl deployers used by Development and Operations. The combination of a consistent release mechanism and our newly-developed Artifact Container would have enabled us to Bring The Pain Forward from failed deployments, improve lead times, and easily integrate future greenfield platforms and applications into the pipeline. However, the simultaneous introduction of multiple business projects meant that events conspired against us.

While pipeline development was focussed upon improving slow platform build times, business deadlines for the Fibre Broadband project left our Fibre, Numbering, and Providers technical teams with greenfield Landline Fulfilment applications that were compatible with our Artifact Container and incompatible with the legacy Perl deployer. Out of necessity those teams dutifully followed Conway’s Law and created deployment buttons in TeamCity housing application-specific deployers as follows:

  • Fibre: A loathed Ant deployer
  • Numbering: A loved Ant deployer
  • Providers: A loved Maven/Java deployer

Over a period of months, it became apparent that this approach was far from ideal for Operations. Each Landline Fulfilment platform release became a slower, more arduous process as the Perl deployer had to be accompanied by a TeamCity button for each greenfield application. Not only did these extra steps increase processing times, but the use of a Continuous Integration tool ill-suited to release management also introduced symptoms of the Deployment Build antipattern, and errors started to creep into deployments.

While Landline Fulfilment releases operated via this multi-step process, a pipeline deployer was developed for the greenfield application platforms. The Landline Assurance, Wifi Fulfilment, and Wifi Assurance technical teams had no time to spare for release tooling and immediately integrated into the pipeline. The pipeline deployer proved successful and consequently demand grew for the pipeline to manage Landline Fulfilment releases as a single aggregate artifact – although surprisingly Operations requested the pipelining of greenfield applications first, due to the proliferation of per-application, per-environment deployment buttons in TeamCity.

A migration method was therefore required for pipelining the entire Landline Fulfilment platform that would not increase the risk of release failure or incur further development costs, and with those constraints in mind we adapted the Strangler pattern for Continuous Delivery as the Stage Strangler pattern. First coined by Martin Fowler and Michael Feathers, the Strangler pattern describes how to gradually wrap a legacy application in a greenfield application in order to safely replace existing features, add new features, and ultimately replace the entire application. By creating a Stage Interface for the different Landline Fulfilment deployers already in use, we were able to kick off a series of conversations with the Landline Fulfilment technical teams about pipeline integration.
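As a minimal sketch of the Stage Interface idea – the interface and class names are hypothetical – each existing deployer is wrapped behind a common contract so the pipeline can delegate deployments without caring which tool performs them:

    from abc import ABC, abstractmethod

    class DeployerStage(ABC):
        # The Stage Interface: one contract for every legacy or greenfield deployer
        @abstractmethod
        def deploy(self, application, version, environment):
            ...

    class PerlDeployerStage(DeployerStage):
        def deploy(self, application, version, environment):
            print(f"legacy Perl deployer: {application} {version} -> {environment}")

    class NumberingDeployerStage(DeployerStage):
        def deploy(self, application, version, environment):
            print(f"Numbering Ant deployer: {application} {version} -> {environment}")

    # The pipeline selects a stage per application, and stages are strangled over time
    stages = {"numbering": NumberingDeployerStage(), "legacy": PerlDeployerStage()}
    stages["numbering"].deploy("numbering", "1.0", "test")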

We began the Stage Strangler process with the Fibre application deployer, as the Fibre team were only too happy to discard it. We worked together on the necessary changes, deleting the Fibre deployer and introducing a set of version-toggled pipeline deployment buttons in TeamCity. The change in release mechanism was advertised to stakeholders well in advance, and a smooth cutover built up our credibility within Development and Operations.

[Figure: Deploying Fibre]

While immediate replacement of the Numbering application deployer was proposed due to the Deficient Deployer antipattern causing per-server deployment steps for Operations, the Numbering team successfully argued for its retention as it provided additional application monitoring capabilities. We updated the Numbering deployer to conform to our Stage Interface and eliminate the Deficient Deployer symptoms, and then wrote a Numbering-specific pipeline stage that delegated Numbering deployments to that deployer.

[Figure: Deploy Numbering]

The Providers team had invested a lot of time in their application deployer – a custom Maven/Java deployer with an application-specific sign-off process embedded within the Artifactory binary repository. Despite Maven’s Continuous Delivery incompatibilities, build numbers being polluted by release numbers, and the sign-off process triggering the Artifact Promotion antipattern, the Providers team resolutely wished to retain their deployer due to their sunk costs. This resulted in a long-running debate over the relative merits of the different technical solutions, but the Stage Strangler helped us move the conversation forward by shaping it around pipeline compatibility rather than technical uniformity. We wrote a Providers-specific pipeline stage that delegated Providers deployments to that deployer, and the Providers team removed their sign-off process in favour of a platform-wide sign-off process managed by Operations.

[Figure: Deploy Providers]

As all greenfield applications have now been successfully integrated into the pipeline and the remaining Landline Fulfilment legacy applications are in the process of being strangled, it would be accurate to say that the Stage Strangler pattern provided us with a minimal cost, minimal risk method of integrating applications and their existing release mechanisms into our Continuous Delivery pipeline. The use of the Strangler pattern has empowered technical teams to make their own decisions on release tooling, and a sign of our success is that development of new pipeline features continues unabated while the Numbering and Providers teams debate the value of strangling their own deployers in favour of a universal pipeline deployer.

[Figure: Deploy Anything]

The Strangler Pipeline – Scaling up

The Strangler Pipeline scales via an Artifact Container and Aggregate Artifacts

Previous entries in the Strangler Pipeline series:

  1. The Strangler Pipeline – Introduction
  2. The Strangler Pipeline – Challenges

While Continuous Delivery experience reports abound from organisations such as LMAX and Springer, the pipelines described tend to be focussed upon applying the Repeatable, Reliable Process and Automate Almost Everything principles to the release of a single application. Our Continuous Delivery journey at Sky Network Services has been a contrasting experience, as our sprawling application estate has led to significant scalability demands in addition to more common challenges such as slow build times and unrepeatable release mechanisms.

When pipeline development began 18 months ago, the Sky Network Services application estate consisted of our Network Inventory and Landline Fulfilment platforms of ~25 applications, with a well-established cycle time of monthly Production releases.

However, in a short period of time the demand for pipeline scalability skyrocketed due to the introduction of Fibre Broadband, Landline Assurance, Wifi Fulfilment, Wifi Realtime, and Wifi Assurance.

In under a year our application estate more than doubled in size to 6 platforms of ~65 applications with the following characteristics:

  • Different application technologies – applications are Scala or Java, built by Ant/Maven/Ruby, with Spring/Yadic application containers and Tomcat/Jetty/Java web containers
  • Different platform owners – the Landline Fulfilment platform is owned by multiple teams
  • Different platforms for same applications – the Orders and Services applications are used by both Landline Fulfilment and Wifi Fulfilment
  • Different application lifecycles – applications may be updated every day, once a week, or less frequently

To attain our scalability goals without sacrificing cycle time we followed the advice of Jez Humble and Dave Farley that “the simplest approach, and one that scales up to a surprising degree, is to have a [single] pipeline”, and we built a single pipeline based upon the Artifact Container and Aggregate Artifact pipeline patterns.

For the commit stage of application artifacts, the pipeline provides an interface rather than an implementation. While a single application pipeline would be solely responsible for the assembly and unit testing of application artifacts, this strategy would not scale for multi-application pipelines. Rather than incur significant costs in imposing a common build process upon all applications, the commit interface asks that each application artifact be fully acceptance-tested, provide associated pipeline metadata, and conform to our Artifact Container. This ensures that application artifacts are readily accessible to the pipeline with minimal integration costs, and that the pipeline itself remains independent of different application technologies.
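A minimal sketch of such a commit interface check, with hypothetical metadata field names:

    REQUIRED_METADATA = {"name", "version", "platforms", "acceptance_tested"}

    def conforms_to_container(metadata):
        # An artifact is admissible if it carries the required pipeline metadata
        # and declares itself fully acceptance-tested, however it was built
        return REQUIRED_METADATA <= metadata.keys() and metadata["acceptance_tested"]

    print(conforms_to_container({"name": "orders", "version": "317",
                                 "platforms": ["wifi-fulfilment"],
                                 "acceptance_tested": True}))  # True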

For the creation of platform artifacts, the pipeline contains a commit stage implementation that creates and persists aggregate artifacts to the artifact repository. Whereas an application commit is automatically triggered by a version control modification, a platform commit is manually triggered by a platform owner specifying the platform version and a list of pre-built constituent application artifacts. The pipeline compares constituent metadata against its aggregate definitions to ensure a valid aggregate can be built, before creating an aggregate XML file to act as a version manifest for future releases of that platform version. The use of aggregate artifacts provides a tool for different teams to collaborate on the same platform, different platforms to share the same application artifacts, and for different application lifecycles to be encapsulated behind a communicable platform release version.
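A minimal sketch of a platform commit, assuming hypothetical aggregate definitions and manifest element names – validate the constituents against the aggregate definition, then write an XML version manifest:

    from xml.etree.ElementTree import Element, SubElement, tostring

    AGGREGATE_DEFINITIONS = {"wifi-fulfilment": {"orders", "services"}}

    def create_aggregate(platform, version, constituents):
        # Reject the platform commit if the constituents do not form a valid aggregate
        expected = AGGREGATE_DEFINITIONS[platform]
        if set(constituents) != expected:
            raise ValueError(f"invalid aggregate: expected {expected}")
        manifest = Element("aggregate", name=platform, version=version)
        for application, app_version in constituents.items():
            SubElement(manifest, "constituent", name=application, version=app_version)
        return tostring(manifest, encoding="unicode")

    print(create_aggregate("wifi-fulfilment", "1.0", {"orders": "317", "services": "192"}))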

While the Strangler Pipeline manages the release of application artifacts via a Repeatable Reliable Process akin to a single application pipeline, the use of the Aggregate Artifact pattern means that an incremental release mechanism is readily available for platform artifacts. When the release of an aggregate artifact into an environment is triggered, the pipeline inspects the metadata of each aggregate constituent and only releases the application artifacts that have not previously entered the target environment. For example, if Wifi Fulfilment 1.0 was previously released containing Orders 317 and Services 192, a release of Wifi Fulfilment 2.0 containing Orders 317 and Services 202 would only release the updated Services artifact. This approach reduces lead times and by minimising change sets reduces the risk of release failure.
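A minimal sketch of that incremental release mechanism, using the Wifi Fulfilment example above (the data shapes are hypothetical):

    def incremental_release(aggregate_constituents, environment_state):
        # Release only the application artifacts whose versions have not
        # previously entered the target environment
        return {app: version for app, version in aggregate_constituents.items()
                if environment_state.get(app) != version}

    # Wifi Fulfilment 1.0 previously released Orders 317 and Services 192
    environment = {"orders": "317", "services": "192"}
    wifi_fulfilment_2_0 = {"orders": "317", "services": "202"}

    print(incremental_release(wifi_fulfilment_2_0, environment))  # {'services': '202'}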

A good heuristic for pipeline scalability is that a state of Authority without Responsibility is a smell. For example, we initially implemented a per-application configuration whitelist as a hardcoded regex within the pipeline. That might have sufficed in a single application pipeline, but the maintenance cost in a multi-application pipeline became a painful burden as different application-specific configuration policies evolved. The problem was solved by making the whitelist itself configurable, which empowered teams to be responsible for their own configuration and allowed configuration to change independent of a pipeline version.
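A minimal sketch of the configurable whitelist – the patterns and application names are hypothetical – with each application owning its configuration policy as data rather than a regex hardcoded into the pipeline:

    import re

    # Whitelists live in configuration owned by each team, so policies can
    # change without releasing a new pipeline version
    WHITELISTS = {
        "orders":   [r"^orders\..*"],
        "services": [r"^services\..*", r"^shared\.db\..*"],
    }

    def is_permitted(application, configuration_key):
        patterns = WHITELISTS.get(application, [])
        return any(re.match(pattern, configuration_key) for pattern in patterns)

    print(is_permitted("services", "shared.db.url"))  # True
    print(is_permitted("orders", "shared.db.url"))    # False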

In hindsight, while the widespread adoption of our Artifact Container has protected the pipeline from application-specific behaviours impeding pipeline scalability, it is the use of the Aggregate Artifact pattern that has so successfully enabled scalable application platform releases. The Strangler Pipeline has the ability to release application platform versions containing a single updated application, multiple updated applications, or even other application platforms themselves.
