During my 1.5 years working as a Cloud Product Owner on a major defense company's private Cloud, I had the opportunity to apply FinOps methodologies and pass the FinOps Practitioner certification. FinOps combines financial management and operations, but the key is having the right information at the right time to make collaborative decisions.
Enterprises are adopting a cloud-centric approach to save money in the short term, improve the time-to-market (TTM) of services and benefit from the expertise of cloud providers, in addition to reducing the CapEx related to private equipment and data centers. The biggest difference is that technical teams can spend money autonomously through code, without going through procurement. There are other changes for the company to take into account (traditional vs cloud-centric):
Long infrastructure cycle vs Fast infrastructure availability
Siloed purchasing power by procurement vs Procurement achievable by technical team
Low need for optimization vs Pressure to reduce waste
Fixed spending vs Dynamic spending that changes daily
Upfront visibility vs Discovering details after the fact
CapEx vision vs OpEx vision
To inform, identify optimization targets and operationalize changes, it is recommended to follow a progressive approach by thinking in different maturity phases. 100% coverage is not achieved from the beginning and, like Agile, teams improve as they practice FinOps. The main components of this discipline are collaboration, setting common goals, identifying optimization opportunities and automation. FinOps is based on 6 principles:
Teams must collaborate to continuously improve and make fast decisions.
Business value drives Cloud decisions in terms of scope, resources and time without thinking only about cost.
Everyone is responsible for their Cloud consumption and pays for what they use.
Financial reporting must be accessible and timely through a fast and consistent process to enable decision making.
A centralized team drives FinOps to encourage stakeholder collaboration.
Take advantage of the variable cost model of the Cloud, despite its greater complexity and lower predictability.
The team dedicated to FinOps must have the right skills to coordinate all the players in the company, who have different roles and backgrounds: technical, finance, engineering, data analysis, purchasing, etc. However, it should not become the guardian or supervisor of the discipline: it is only a facilitator, made up of a mix of skills at the crossroads of these groups. Its activities are therefore numerous and include negotiating with suppliers, managing processes and tracking consumption by internal teams. The variety of players means that communications must be adapted to what interests each of them: cost, usage, deadline, application, service, scope, etc. Since everyone is responsible for their own consumption, everyone must have access to the information that concerns them in order to adapt. To achieve this, communication should not go through a single person a few times a year but reach all groups in near real time. Since the FinOps team is made up of several people, it must be funded. This can be done through direct funding by the IT department, through a tax on Cloud consumption or by allocating a share of the savings achieved.
Since FinOps has many similarities with other methodologies such as DevOps and Agile, it is all the more important to understand the key concepts of continuous improvement, increasing maturity, MVP, alignment, automation and rapid decision-making. Although the language must be common to all, the players come with their own motivations, definitions and points of view, which must be reconciled. Where a finance person will define a service in terms of usage rate and costs, an engineer will define it in terms of robustness, availability and capacity.
Understand total out-of-pocket costs by using common metrics for everyone. Spending data must be mapped to the business so that costs are distributed properly and can be invoiced correctly afterwards. A critical part of FinOps is the tag strategy, which makes this mapping possible. Budgets must be defined and then compared to the forecasts made with the data available.
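As a minimal sketch of how a tag strategy maps spending to the business, assume billing line items that carry a `team` tag. The field names (`resource`, `cost`, `tags`), the `untagged` bucket and all figures are hypothetical and do not follow any provider's export format.

```python
from collections import defaultdict

UNTAGGED = "untagged"  # hypothetical bucket for resources missing the tag


def allocate_by_tag(line_items, tag_key="team"):
    """Sum cost per tag value; items without the tag land in the untagged bucket."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, UNTAGGED)
        totals[owner] += item["cost"]
    return dict(totals)


# Illustrative line items, not real billing data
line_items = [
    {"resource": "vm-001", "cost": 120.0, "tags": {"team": "payments"}},
    {"resource": "vm-002", "cost": 80.0, "tags": {"team": "search"}},
    {"resource": "bucket-7", "cost": 30.0, "tags": {}},
    {"resource": "db-3", "cost": 200.0, "tags": {"team": "payments"}},
]

costs = allocate_by_tag(line_items)

# Share of spend that can be attributed to an owner
tagged_share = 1 - costs.get(UNTAGGED, 0.0) / sum(costs.values())
```

Tracking the share of spend that lands in the untagged bucket is one simple way to measure how well the tag strategy is being followed.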
Enable real-time decision-making by delivering spending data daily to stakeholders so that they can compare it to their budget. This makes it possible to identify anomalies, whether a spending threshold being exceeded or an unusual spike in usage. Daily delivery of data also makes it possible to identify underutilized services and reduce the waste of IT resources.
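The two kinds of anomaly mentioned here can be sketched as follows. This is one possible approach, not a prescribed method: the fixed daily budget, the 7-day window and the z-score threshold of 3 are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalies(daily_spend, daily_budget, z_threshold=3.0, window=7):
    """Return (day_index, reason) pairs for budget overruns and unusual spikes."""
    anomalies = []
    for day, spend in enumerate(daily_spend):
        # Anomaly type 1: the spending threshold is exceeded
        if spend > daily_budget:
            anomalies.append((day, "over_budget"))
        # Anomaly type 2: spend is far above the recent trailing average
        history = daily_spend[max(0, day - window):day]
        if len(history) >= 2:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and (spend - mu) / sigma > z_threshold:
                anomalies.append((day, "spike"))
    return anomalies


# Illustrative daily spend with one abnormal day (index 6)
spend = [100, 102, 98, 101, 99, 100, 180, 101]
result = find_anomalies(spend, daily_budget=150)
```

With these made-up figures, day 6 is reported both as over budget and as a spike, while the days around it are not flagged.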
Calibrate performance by analyzing trends and variance over time to compare the current trend with the past. More than a self-analysis over time, it is also necessary to compare oneself to industry peers to estimate a level of efficiency.
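A minimal sketch of the trend comparison, assuming spend is already aggregated per month. The function name and the figures are hypothetical.

```python
def period_over_period(series):
    """Relative change between each period and the one before it."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


# Illustrative monthly spend
monthly_spend = [1000.0, 1100.0, 1045.0]
changes = period_over_period(monthly_spend)  # roughly +10%, then -5%
```

The same relative measure can be applied to a unit cost, such as cost per customer or per transaction, which is easier to compare with industry peers than raw spend.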
Optimize usage to avoid under- and over-consumption of resources which, in addition to saving money, improves performance. Consumption can be reduced by turning off resources outside working hours, particularly development and test environments if nothing runs at night. Optimizing usage also involves automating the entire lifecycle from creation to destruction. Savings should be expressed in both $ and % so that the largest opportunities can be identified and tackled in order.
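To make the orders of magnitude concrete, here is a minimal sketch that assumes a hypothetical development VM billed at $0.50 per hour and used 12 hours a day on weekdays only. All figures are illustrative.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def off_hours_savings(hourly_rate, hours_on_per_day, days_on_per_week):
    """Weekly savings, as (amount, fraction), from running only during working hours."""
    always_on = hourly_rate * HOURS_PER_WEEK
    scheduled = hourly_rate * hours_on_per_day * days_on_per_week
    saved = always_on - scheduled
    return saved, saved / always_on


# Hypothetical dev VM: $0.50/hour, on 12 hours a day, 5 days a week
saved_amount, saved_fraction = off_hours_savings(0.50, 12, 5)
```

With these assumed figures the schedule removes 108 of the 168 weekly hours, about 64% of the always-on cost. Returning both the amount and the fraction makes it possible to rank opportunities in $ as well as in %.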
Optimize rates by mixing reserved and on-demand pricing, which involves complex decisions that follow the principle of trading flexibility for a lower cost. Sustained use and volume discounts, which depend on spending levels, can also affect costs. The impact of licenses should not be overlooked.
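The flexibility-for-cost trade-off can be sketched with a simple break-even calculation. It assumes a simplified model in which a reservation is paid for every hour whether or not the resource runs, while on-demand is paid only for hours used. The rates are hypothetical, and real offers add terms (upfront payments, commitment length) that this sketch ignores.

```python
def break_even_utilization(on_demand_rate, reserved_rate):
    """Fraction of time a resource must run for a reservation to pay off."""
    return reserved_rate / on_demand_rate


def cheaper_option(on_demand_rate, reserved_rate, expected_utilization):
    """Compare the effective hourly cost of both options at a given utilization."""
    on_demand_cost = on_demand_rate * expected_utilization  # pay only when running
    reserved_cost = reserved_rate  # paid for every hour, used or not
    return "reserved" if reserved_cost < on_demand_cost else "on_demand"


# Hypothetical rates: $1.00/hour on demand, $0.60/hour reserved
threshold = break_even_utilization(1.00, 0.60)
steady_workload = cheaper_option(1.00, 0.60, expected_utilization=0.8)
bursty_workload = cheaper_option(1.00, 0.60, expected_utilization=0.3)
```

Under these assumptions the reservation pays off above 60% utilization, so the steady workload favors the reserved rate and the bursty one stays on demand.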
Align forecasts with the business through small case studies that document discoveries, savings opportunities and the value delivered, so that they are retained over time. Create a format that works for you without being too complex. A regular cadence of trend monitoring must be established early on and collaboration encouraged, otherwise everyone will miss out on information.
The complexity of the Cloud also shows in the invoice, which contains raw data that is not meant to be read by humans. It is better to reason with the equation Cost = Rate * Time. Managing time, which comes down to consuming less, is decentralized because it depends on the teams, while reducing the rate, which comes down to lowering prices, is centralized because it benefits from economies of scale. A single VM can be billed at several rates: for compute time, depending on whether it is reserved or on demand, and for related services such as storage volume and data transfer.
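As an illustration of one VM billed at several rates, the sketch below applies the equation line by line. The line labels, rates and quantities are hypothetical. For compute the quantity is time in hours, while for storage and data transfer it is a volume.

```python
def line_cost(rate, quantity):
    """Cost = Rate * Time, generalized to any billed quantity."""
    return rate * quantity


# Hypothetical monthly bill lines for a single VM: (label, rate, quantity)
vm_bill = [
    ("compute, reserved hours", 0.06, 500),   # $/hour, hours
    ("compute, on-demand hours", 0.10, 220),  # $/hour, hours
    ("storage", 0.02, 100),                   # $/GB-month, GB
    ("data transfer out", 0.05, 40),          # $/GB, GB
]

total = sum(line_cost(rate, quantity) for _, rate, quantity in vm_bill)

# Compute hours are the lever teams control; rates are negotiated centrally
compute_hours = sum(qty for label, _, qty in vm_bill if label.startswith("compute"))
```

Separating the rate from the quantity on each line shows which lever applies: teams act on the quantities, while the central FinOps function acts on the rates.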
To make FinOps possible and help decision-making, it is organized around the inform, optimize and operate lifecycle.
The information phase supports FinOps principles 3 and 4 by seeking to know what is used, what the money is spent on, the value it brings and who is responsible for it in the company. Teams must be transparent about the data they share in order to receive frequent feedback from the right people. The data must be clean, consistent and presented in a useful report format, with a common language and cost metrics that everyone agrees on and understands. The Prius effect must be taken into account: continuous feedback on our consumption pushes us towards responsible behavior. Data also allows teams to calibrate themselves and compare with each other and with competitors over time. To apply this, the hierarchy and classification of the company must be taken into account to allocate costs properly and to define tags. Tags can be used at any level and must reflect the company structure, with neither too many nor too few of them. Alert thresholds must be established and basic budgetary rules put in place for each team. This work in the present will make it possible to forecast the future thanks to the data collected over time.
The optimization phase supports principles 2, 5 and 6 of FinOps by seeking to identify opportunities for continuous improvement and to understand consumption and its link with the business. To do this, optimization relies on the pairing of objectives and metrics. Objectives must be recorded in a tracking system to give visibility on their progress, the associated actions and their prioritization. Metrics must be reviewed as the lifecycle evolves so that they become more nuanced, and expressed as an objective accompanied by 3 to 5 key results that are expected to be achieved. The optimization phase is where we identify what we are going to try to optimize, choosing a mix of easy and hard items, knowing that the number of easy items will shrink over time until only complex ones remain.
The operation phase supports FinOps principles 1, 2 and 6 and depends more on the business, culture and working relationships of the company than on the Cloud and technology. Teams must align to embody the new processes and face the constraints despite the difficulty for employees and the company. Based on the right metrics such as the percentage of resources consumed, the company can define a threshold below which a purchase is made.
Although it encourages reducing consumption to save money, Cloud financial management is also part of a green approach that seeks to use resources in the way that best meets the need and reduces waste. The measurements collected by proper tooling can also be displayed for information purposes in dedicated dashboards, which generally show electricity consumption, CO2 emissions and the impact of unused VMs. It is important to note that virtualized resources consume less energy than their physical equivalents, that opting for new-generation servers brings better energy efficiency and that requesting more resources than are actually needed leads, at scale, to enormous losses.