At Webjet’s UnITed Conference, Lachlan McKerrow went through the history of the organisation’s 10-year agile journey, from being a siloed organisation practising waterfall processes for deploying monoliths, to having adaptive, constantly learning cross-functional teams, iteratively delivering microservices.
Before 2011, Webjet’s engineering team was set up very differently from how it is today. There were teams of developers, testers, and ITOps (plus a part-time Business Analyst) sitting in their own workspaces, and there was little collaboration. The system was a monolith, known as TSA (Travel Service Aggregator), and deliverables were “thrown over the wall”.
Programming was done solo. The process was regimented, and individuals followed the plan, with tech leaders and managers telling teams what to do. None of the staff asked questions. Designs were done in a closed room by management, and each mockup and specification had to be signed off by senior management. If a mockup was wrong, it was slow to change. Things would sit on the shelf for a long time because of the hoops they had to go through.
Delivery was done over an 8 to 12-week cycle, a “Big Bang” with a batch of things being brought to production. Many times, if things went wrong, everything had to be rolled back, and there would be finger-pointing that ended with “that’s what you told me to do.” There was no accountability by individuals or teams.
It was a long and torturous process to go from development to production.
Trying Out Agile
In 2011, Webjet started trying out the agile way. By then, the team had 11 developers and 1.5 BAs. The development teams were then reorganised and divided into two streams: Scrum (for Project Contrail), with 5 in the Azure team, and Waterfall, with 4.5 developers in the TSA team, 1.5 in Apps, and 5 in ITOps, with some people having multiple roles.
The design process remained the same. It was still management-centred, done in a closed room, and each mock still had to be signed off by senior management.
The Azure team developed the flights path as a mobile site, and the TSA and ITOps teams kept the desktop site running. Releases were on an 8.5-week cycle. There were only modest enhancements to the desktop site, as fewer resources were available for it. From the process of delivering the mobile site, however, came some significant insights.
The First Ah-Ha Moment
In late 2012 and early 2013, Contrail was the first attempt to put something into the cloud (through Azure), for which Webjet was the first customer commercially signed by Microsoft. The development partner, Readify, was practising Scrum, and Webjet thought that this might be the way forward. The mobile site, unfortunately, didn’t go to production, as ITOps hadn’t been involved. They couldn’t support it, and if something happened, they wouldn’t know how to fix it.
Scrum was trialled with one team before getting others to do the same. This made the development effort costly, but the learning was valuable.
After learning from the Contrail project and the first attempt at Scrum, the teams were again reorganised. A dedicated product owner was brought on board. Two large development teams (named Snipers and Samurai, each with a Business Analyst) and a product support team (named SWAT) were established. Testers and ITOps remained in their separate teams. There was a 6 to 8-week delivery cycle for the monolith.
The teams got Professional Scrum Master training, and Scrum ceremonies (such as planning poker) were followed. There were visits to REA Group and Seek to see how they operated, and some of their practices were adopted. There was a move away from Scrum to Lean, then to Kanban. The design was still management-centred, but the mocks didn’t have to be signed off by senior management anymore. There was now an emphasis on user-centred design, with User eXperience expertise brought in. Feature toggles were introduced to allow code to be put live on production but turned off.
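To illustrate the feature-toggle idea, here is a minimal sketch in Python. It is not Webjet’s implementation: the `FeatureToggles` class, the `new_search` toggle name, and the `search_page` function are all hypothetical, and a real system would more likely read toggle state from configuration than from an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """Hypothetical in-memory toggle store, used here for illustration only."""
    enabled: dict[str, bool] = field(default_factory=dict)

    def is_enabled(self, name: str) -> bool:
        # Unknown toggles default to off, so unfinished code stays dark.
        return self.enabled.get(name, False)


def search_page(toggles: FeatureToggles) -> str:
    # The new code path is deployed but only runs when the toggle is on.
    if toggles.is_enabled("new_search"):
        return "new search"
    return "old search"
```

With no toggles set, `search_page(FeatureToggles())` takes the old path; passing `FeatureToggles({"new_search": True})` switches the new code on without another deployment.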
A DevOps role was then added to each team. New systems like Octopus (aka the Kraken, as it kept breaking things), Git, ARR, and TeamCity were installed. The “bus” (a concept borrowed from Seek) was created for the deployment of the monolith, scheduled weekly.
The outcome wasn’t very good. Planning meetings were hated, and no one wanted to go to them. Estimates were not accurate, and they were sandbagged (overestimated) to make burndown charts look good. No one wanted to rotate through the SWAT team. Accountability was diffused across teams. There was, however, less time spent on ceremonies and planning meetings.
As for the design process, adjusting for mock-up errors was easier, and the focus was now on the customer, something that the REA Group pointed out was key. The Pattern Library was introduced for consistency and ease of front-end development.
Delivery was still a Big Bang, with some rollbacks. Deployments gradually became easier, although they still had problems. However, when things went wrong, the focus was now on what needed to be done to fix it rather than on finger-pointing. The bus started to depart every 2 to 3 weeks.
The Second Ah-Ha Moment
In early 2016 came the second major realisation: Webjet was “doing” agile, but not “being” agile. It was a case of “monkey-see, monkey-do”, following the practices and processes of agile without understanding why they work.
An Agile Coach was brought in to review, and Webjet was told that its practice of agile was fine but that, like everyone, it could do better. What was missing was a better awareness and a deeper understanding of the 4 values and 12 principles of the Agile Manifesto. To be truly agile, agility has to be part of the organisation’s DNA.
The teams were now reorganised into line-of-business cross-functional delivery teams, taking a cue from Spotify. Quality Assurance was now part of the same team, and the test teams were disbanded. DevOps moved out of these teams, and a Platform team was created. Responsibility for testing and deployment of microservices was fully with the delivery team, and ITOps was no longer involved in this process. There was an office redesign with collaborative workspaces. The organisation was structured into teams and guilds (grouped by roles).
UI resources were attached to teams for a project instead of sitting inside the team. UX optimised workflows with the aid of user testing. UI and UX improved all booking paths through consistency (with the pattern library). There was a weekly cycle for monoliths, and a daily cycle for microservices.
The process kept on evolving, adapting to the needs of the organisation. The strengths of Kanban practices and Scrum sprints were brought together as “Webban”. Standups started with a report on production support issues by the rostered team member (which included Quality Assurance). Psychological safety was promoted to enhance the learning process. Servant leadership, where leaders support and guide instead of telling people what to do, was adopted. Business Analysts asked the 5 Whys to identify the actual problem and core value, and who the customer is.
There was a shift of quality to the left. Business and UX did the ideation, with Business Analysts and Solution Architects joining to refine it. Product Backlog Items (PBIs) were written in Behaviour-Driven Development (BDD) format (Given/When/Then) and reviewed by Quality Assurance. Kick-off meetings were held. Hack Days were introduced to further encourage collaboration and break down silos in the office.
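As a rough illustration of the Given/When/Then format, the sketch below expresses one acceptance criterion as a plain Python test. The PBI itself, the `Flight` type, the `search_flights` helper, and the airport codes are all invented for the example; they are not taken from Webjet’s backlog or codebase.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    origin: str
    destination: str


def search_flights(flights, origin, destination):
    """Return flights matching the requested route (hypothetical helper)."""
    return [f for f in flights
            if f.origin == origin and f.destination == destination]


def test_search_returns_only_matching_route():
    # Given a catalogue with flights on two routes
    flights = [Flight("MEL", "SYD"), Flight("MEL", "BNE")]
    # When the customer searches Melbourne to Sydney
    results = search_flights(flights, "MEL", "SYD")
    # Then only the Melbourne to Sydney flight is returned
    assert results == [Flight("MEL", "SYD")]
```

Writing the criterion this way gives Quality Assurance something concrete to review before any feature code exists, which is the point of shifting quality to the left.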
Quality Assurance took the role of bus drivers for the monolith. A guide to developing microservices was written, and Continuous Integration/Continuous Delivery (CI/CD) pipelines were made to work with development/production configuration from the get-go. Teams had to think of how to get “Hello World” to production, and only after this was done did they start writing feature code.
As a result of all of this, teams became more empowered, taking ownership and accountability. Team members developed T-shaped skillsets, having the breadth of knowledge to cover things when necessary, but having deep knowledge based on their roles.
Features were released iteratively. For example, mobile search was first enabled for only one route, to see how customers behaved, address problems, and assess the outcomes. If things went wrong in production, there was a calm, supportive approach to getting it sorted, with a Root Cause Analysis conducted to learn from it. This happened when Flights First was released to production on a Friday at 3pm with only a development config in place, not a production one. Holding off the release to Monday morning for more development environment testing wouldn’t have uncovered the issue.
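One way to picture “Hello World with development/production configuration from the get-go” is the sketch below. It is an assumption-laden illustration, not Webjet’s pipeline: the `CONFIGS` table, the URLs, and the `load_config` and `hello_world` functions are all made up for the example.

```python
# Hypothetical per-environment settings; the values are placeholders.
CONFIGS = {
    "development": {"api_url": "http://localhost:8080", "debug": True},
    "production": {"api_url": "https://api.example.com", "debug": False},
}


def load_config(environment: str) -> dict:
    """Return the settings for one environment, failing fast if none exist."""
    try:
        return CONFIGS[environment]
    except KeyError:
        raise RuntimeError(
            f"no configuration defined for {environment!r}"
        ) from None


def hello_world(environment: str) -> str:
    # The service does nothing useful yet, but it proves the config path works.
    config = load_config(environment)
    return f"Hello World from {config['api_url']}"
```

Failing fast on a missing environment means a service cannot quietly reach production while still relying on development settings.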
The Third Ah-Ha Moment
The third major realisation, which came in mid-2019, was that no agile framework fully fits any organisation and that the practice of Agile can’t stand still. It’s important to look around and see what is out there.
Being More Agile
In 2019, further refinement of agility came with Vertical Slicing, introduced through the Paper Planes and Elephant Carpaccio workshops by Alistair Cockburn. The guide to deploying microservices was enhanced to emphasise getting value to the customers as early as possible. There were sessions on Pair Programming and the Heart of Agile.
Business Analysts were designated as Delivery Drivers, responsible for bringing features to production. Cypress was introduced as a replacement for Hymie (our in-house automated testing system), bringing the full spectrum of automated testing to the individual teams.
A framework can’t simply be taken from another organisation, turned into a template, and shoehorned into processes and practices. The approach is to look at what others have done and found useful, consider whether it is a good fit for the company’s environment and culture, and let go of old practices. It’s a constant evolution, with the process continually adapting as teams, work, and business demands change. In all of this, the customer always comes first.