The Evolving Job of a Startup CTO — Part 1

Vaidik Kapoor
28 min readJul 31, 2023

--

As a tech consultant and advisor, I am usually hired to help solve a burning problem that the company cannot solve internally. While working with startups specifically, I have observed that the founders (or the tech leaders) are aware of the presence of one or many problems, but they often only talk about the symptoms. Naturally, if they understood the problem or the root cause, they would have solved it themselves. But sometimes, while executing fast, they cannot make the time to retrospect, investigate, and fail to see and clearly articulate these problems. I thought it might be a good idea to write about some common symptoms I have observed and their potential root causes.

This topic is particularly interesting to me not only because this is what I do as a tech advisor and consultant but also because there is so much leverage in solving these problems. Tech is where most of the heavy lifting happens in most startups. So, an attempt to unbundle this seems like a way to build clarity for myself and help other startups in the process.

There is a lot of ground to cover, which is impossible to get done in one article. So, I intend to write about this topic in subsequent articles in this series. Through this article, I want to talk with the founders, especially the CTO, if there is one.

Let’s go with the first thing I usually hear in conversations with my clients.

We hired more engineers, but we are not shipping fast enough (or worse, have slowed down)

It is one of the most counterintuitive things that startup founders struggle with. Founders, naturally, want to make sure that their business grows faster — serving more customers, generating more revenue, expanding in new markets, raising money to grow faster, etc. But to be able to do all that, they need people. That’s when they hire so that they can step back.

But interesting things happen when they step back, and what happens can sort of depend on the composition of the founders (if more than one) from the perspective of their technical experience. There are two possible founder compositions:

  • The non-technical founders — the founder(s) do not have a software engineering background.
  • At least one of the founders is technical — one (or more) founders have worked full-time as a software engineer in the recent past, meaning that they can still take another job as a software engineer if they want.

The non-technical founders

When founders with no tech background start a new company, they usually have a founding engineer on the team. Work in such a situation happens by sitting at a common table, where the founders usually decide what needs to be built with inputs from the founding engineer on ideas and (most importantly) feasibility. Work gets decided and prioritised on a daily basis. It goes into execution when the founding engineer takes over. They write the code, put it out on a shared environment (like a staging environment) where their work can be tested, the entire team tests the changes, the code is deployed, and then the founders are out again to get some users to use what they have built.

At least one of the founders is technical

This is not very different from the previous scenario. The major difference is that the technical founder takes on the role of the founding engineer, and hence, the technical founder is mostly writing the code. Depending on the situation, they are probably accompanied by one or more founding engineers in building the software.

Now, let’s explore where things start to go south.

Growing

So now the startup is growing. The startup is scaling “something”.

Maybe there is a product that a few users use, and the company has raised a seed round. So they are scaling to build more features to solve problems for a wider audience and work towards achieving PMF.

Or, maybe there is PMF, and now the company is scaling to onboard more customers (i.e. scaling sales, improving the onboarding experience, improving support, improving quality, optimizing margins, etc.).

Each of these scenarios would most likely lead to hiring more people, some specialists and some generalists. For the scope of this article, we are concerned about hiring more engineers. Engineers are essential to be able to do most of the above-listed things, i.e. building more features, improving onboarding experience at scale (think automation and product experience), improving support (quality issues, missing features, automating support, etc.), optimizing margins (reducing tech cost?), etc.

We hired more engineers, but we are not faster.

The founders hired engineers to work on more things simultaneously and grow faster. Besides product and engineering, other functions also need attention, like sales, support, customer success, HR, etc. The founders must spend time setting up these functions as well. So the founders hired even more people in engineering and product and perhaps have stepped back a little from day-to-day execution in product development.

But, things are not going as the founders had planned. Everything seems to have slowed down. Here are some of the common symptoms I hear:

  1. Product feature releases take more time and often miss their deadline.
  2. Small changes take painfully long to get done.
  3. Product execution is not up to the mark. New features are not properly baked, leading to a poor first-time user experience. Often features need rework before the release.
  4. Quality issues in the product have started to creep up, leading to frustrated customers.
  5. Sales and product managers are not able to meet the commitments they make to customers.
  6. The catch-all — important things don’t get done fast enough without pressure, and the founders don’t understand why.

These are only the symptoms. Founders must identify the root causes and clearly articulate the problems leading to these symptoms.

Side note: If all this sounds familiar, we should chat. This is the kind of stuff I love talking about, learning about and solving. Working together or not, I’d love to have a conversation.

The Root Causes

From my experience of solving these problems in different contexts, I’d say that most root causes are common across companies. But at the same time, there could be nuances in some businesses where the specifics might differ, or these recommendations might need some tuning. So please digest with consideration what I am about to discuss next.

Product feature releases take more time and often miss their deadline.

Faster is always better. Every team must strive to be faster. But if they are not getting faster, they should at least not slow down. Speed is existential for every company and even more so for startups.

In the initial days, the execution was much faster with a small team. So why does execution slow down with a bigger team? Here are some reasons that I have experienced first-hand.

Lack of clear direction and focus

If everything is important, then nothing gets done. After all, only so much can be achieved with finite resources (law of nature). Tech leaders have to provide their teams clarity with what needs to be done with the finite amount of resources (time, manpower and money) to achieve a definite goal without creating wastage (think “task done” but value not delivered). To achieve goals with finite resources (including time), planning has to be done at some cadence (for low cognitive overhead and discipline). I will discuss this in the next section.

Even after stepping back from hands-on execution, founders must make sure that their teams have clarity to execute well. New information is collected on an ongoing basis. So founders must continuously engage with their teams to have conversations and provide them clarity (written if it helps).

Poor planning or no planning at all

Since resources are finite, work must be planned so that (ideally) every task done always delivers some value to customers or the business. Writing code is only “work done” and does not necessarily mean value is delivered to anyone. For example, the backend team deployed the API for a feature, but the frontend work or integration with the front end remains. This is a classic case of a task done, but the feature the customer will use (the value) is not delivered.

Planning is a big topic, and a ton is written about it (Agile, Scrum, Kanban, Extreme Programming, etc.). I will abstract it into a few simple rules that I like to follow:

  • Plan Do Check Act — stick to a Plan Do Check Act cycle. When we think of moving fast, frameworks like Scrum (synonymous with Agile) and Kanban come to mind. If implemented poorly, they can lead to poorer behaviours in the team (more on this later). Those frameworks are great. Learn them. But until you fully understand them, I’d suggest tech leaders stick to a simpler Plan-Do-Check-Act cycle and do it at a regular and well-defined cadence. For most web products, a cycle of 2 weeks (also popularly known as a sprint) makes sense. When information changes frequently in the early days, a 1-week cycle might also make sense. For hardware product companies, a different kind of cadence will make sense.
  • Plan with clarity — this is related to the previous section. Defining what needs to be done and then planning needs clarity of what must be done to solve the customer’s and your business’s problems. So if your Plan-Do-Check-Act cycle is 2 weeks and starts on Monday, make sure the plan is in place at least on Friday. To put a plan in place so that your teams can execute without your day-to-day involvement, make sure that, as a leader, you strive to articulate the necessary clarity for yourself first and then use it while planning the upcoming cycle. Following this structure will also provide leaders with a structure to have ongoing conversations with their team, provide them clarity and help them learn about the rapidly changing product and business context.
  • Release at a cadence — moving fast and being agile is not just about the raw execution speed. It is also about releasing frequently, learning fast and reducing waste. What you have not released and not got anyone to use yet is not useful because there is no feedback to learn about the usefulness of what you have built. So building further on top of it could mean you are not heading in the right direction. I love how Intercom has articulated this in their article Shipping is your company’s heartbeat.
  • Moving fast and being agile is about being smart about what you choose to build and when. It is also about how much investment to make behind a feature or an idea. New ideas could require significant investments consuming months of a team’s work. So it is important to define what is absolutely necessary to be done and then validate the next steps.
  • A great forcing function to ensure that your team is releasing fast and regularly so that you can learn from customer feedback is to force yourself to plan features in your Plan-Do-Check-Act cycles to release at least at the end of every cycle. If there is an idea that seems to take longer if it goes into execution, force yourselves to cut it down to a smaller scope so that it can be released at the end of a cycle. However, that small-scoped idea must still be valuable to customers if released (this could be a private release to a few customers by feature flagging in production or even shown in a demo to customers).

Founders must own the Plan-Do-Check-Act cycle, which is a way to get minimally but critically involved in execution. It will allow them to converse with their teams on an ongoing basis to provide them with new information and context about customer requirements and the changing needs of the business, and ensure that valuable work is prioritised and planned for their teams to execute. This will further help ensure that their teams are working on the most important things and continuously delivering value to customers. Continuous delivery will enable the founders to learn fast from real customer feedback and do timely course correction.

Inability to ship fast with high confidence

Lack of confidence is rooted in high risk in doing something. It is the fear of breaking things in production. Nobody likes to break things and cause trouble. In the context of shipping fast and releasing software frequently, the risk is the software breaking, i.e. introducing bugs as we change the software. To ensure we don’t ship broken software, two things are essential:

  • getting the requirements right (partly covered in planning, and I will cover the remaining part in the next section)
  • a good quality assurance (i.e. testing) process.

In the early days of building a product, things moved fast because the codebase and the software were not as big yet. So quality assurance via a manual Regression Testing process where everyone on the team, including the founders, was involved hands-on in manually testing the changes. When the codebase grows, manually testing every change is not scalable, efficient or effective:

  • Humans are bad at doing repetitive manual labour with high accuracy. Testing is a repetitive process which requires high accuracy.
  • Manpower is costly. So humans try to optimize their testing efforts by being selective. But humans are also individually and uniquely biased. So every human looks at the process of testing differently, which makes manual testing less deterministic and difficult to scale (there are ways to deal with this, but the cost angle of human labour always introduces the need for judgement). This makes manual testing ineffective and hard to scale.

I am not saying manual testing should not be done. Manual testing is extremely important for Exploratory Testing. Exploratory Testing is the process of discovering unknown behaviours (side effects) and user-experience related issues in the software. It is partly the practice of intentionally breaking the software before customers discover those broken experience issues. It should be a cross-functional effort, at least involving engineers, designers and product managers, but anyone in the company can get involved in this.

Coming back to the inability to ship fast, Regression Testing is a part of the QA process that is repetitive. If it is done manually, it will lead to:

  • the slow pace of execution and poor quality of releases
  • quality issues in production, which will lead to low morale and self-esteem of the team, and low confidence, which in turn will make the cycle of shipping slower

Founders must ensure that as the product and the codebase grows, strategic engineering investments are made to ensure that (at least a part) of the Regression Testing process is automated. The right level of test automation is not “100% tests coverage”. The right level of test automation is what allows us to release with high confidence. So just straight 100% unit test coverage does not help. Invest strategically in the right kind of test cases. Functional integration tests are a good starting point.

Here are some references on testing strategy:

  1. Write tests. Not too many. Mostly integration.
  2. Fixing a Test Hourglass

Lack of feedback on work at the right time

As founders (and leaders), an important part of our jobs is to provide timely feedback. I am not talking about feedback for personal growth (yes, that is also important). I am talking about feedback on work. Is the solution developed to solve customer problems accurately? Will it create some other problems? Does the solution fit well in the product and the larger scheme of things?

Sooner or later, these things get surfaced to the founders. At that time, the right thing (most of the time) is to intervene to get it fixed and ship the right solution to the customers. But delayed intervention leads to rework, which means wasted time and effort. If the teams repeatedly fail to get the requirements right, it will lead to rework, which is one of the root causes of the inability to ship fast.

Continuous conversation with teams (as highlighted in the “Lack of clear direction and focus” section) can reduce the occurrences of these. But I’ll reiterate that new information comes in really fast and often cannot be continuously conveyed to teams. Founders, by virtue of their position and their vantage point, have the leverage to consume and process a lot more information. Their judgement is a lot more reliable in the company. Their continuous feedback on work being done is extremely important to avoid rework. Even with experienced product managers in place, founders’ feedback is essential to help teams meet their goals in shipping the right solutions. Usually, nobody in a startup understands their business domain better than the founders.

Plan-Do-Check-Act cycles provide the “Check” phase (end of cycle) as the minimal intervention that founders have at their disposal to review work and provide feedback before the work is shipped. But the end of a cycle is already delayed. What can founders do to provide faster feedback? Here are a few ideas:

  • Introduce a mid-cycle review of work or at least “key work”. Use judgment to define key work.
  • Founders probably don’t have the time to review everything. So again, use your judgement to set clear expectations with your product, design or engineering leads to get “key work” reviewed as early as possible, ideally before starting software development. Shift Left.
  • Make yourselves available so your teams can approach you for your input.

Founders must make sure that they provide regular feedback to their teams on work, by reviewing solutions early for “key work / initiatives”. Delaying feedback leads to rework and possibly poor customer experience, which leads to wasted effort, inability to ship fast and frustrated customers.

Small changes take painfully long to get done.

I was recently talking to two non-technical founders. They have a fairly high number of customers using their platform. So, the product seems valuable to a large enough audience. They want to expand and grow. Their challenge is that everything in product engineering moves too slowly. One of them said: “Even a simple fix on phone number validation in the login screen to handle an additional zero prefix has taken us more than a month to get done. We just cannot move at this speed.” They are right. They will die if they move at such speed. It is also mind-boggling to know why something so simple would take so long. That’s probably an extreme example, which could be rooted in a cultural problem. But there could be other slightly more complex yet simple improvements that take a lot of time to get done. Here are some of the reasons that I have noticed so far from my experience:

Cultural problems

  • Lack of a sense of urgency or the intention to solve customer problems
  • Lack of customer centricity

Technical problems

  • Inability to ship fast with high confidence (already covered previously)

I have covered the inability to ship fast already, so I will not discuss it again. Let’s talk about Cultural Problems briefly.

Lack of customer centricity is the absence of empathy for customers and solving their problems from their lens. If the customer struggles to log in because of a zero prefix added by an auto-complete on their phone, it is a simple problem that a team can solve. But they don’t solve it because they are not close to their customers, and they are not made aware of the emotions that their customer feels when the product experience breaks. In this case, the customers are low-income group cart vendors using the product to make a living. Their level of education in tech is also fairly low. So as builders of the product, it is the team’s job to make it easy for them to use the product.

Continuous conversations (as covered in the “Lack of clear direction and focus” section) can solve this problem to an extent if customer experience issues are also talked about in those conversations. Another idea is to get your product managers, designers and engineers to speak to customers directly and get a first-hand experience of what the customers love and what they would like to see get improved. To make it a part of the culture, systemise it. Make it a habit.

Lack of urgency or the intention to solve customer problems is a different beast. The first thing is to identify if it is an individual problem (a junior or a leader) or is it a systemic problem. Individuals can be coached if you, the founders, have the time and resources. If you don’t have the time and resources, it is best to take the hard call and part ways. If it is a systemic team-wide problem, it needs a cultural intervention. At a high level, this comes down to doing the following:

  • Coaching team leaders on customer centricity, what it means and helping them build a sense of urgency.
  • Reiterating to the entire company the urgency for solving customer problems and being customer-centric.

To do both of these things, different kinds of management tactics (communication, business review, work review, customer support tour duty, etc.) can be used. I will not go in-depth about this because neither I am an expert at this, nor is this a small topic to cover. But enough books have been written about management, goal setting, communication and customer-centricity.

Founders must find ways to embed customer centricity in their teams and culture, explicitly state their expectation of how customer problems must be addressed by teams on a daily basis, and hold their teams accountable for solving customer problems with utmost urgency.

Product execution is not up to the mark. New features…

…are not properly baked, leading to a poor first-time user experience. Often features need rework before the release.

Founders often say this. Product engineering teams release features that either do not cover all the cases of the problems they had set out to solve or feature implementations have a poor user experience (think of things like validations, references to other entities in the software, etc.).

The first one unarguably is a bigger concern, leading to unimpressed customers who do not become promoters of your product, or worse, they become frustrated and stop to trust your company’s ability to solve their problems.

Missed user experience issues, in my opinion, might not be a very big problem as long as your team can do a fast follow-up to release fixes. However, they still lead to unplanned work (productivity killer) later in terms of support requests.

Why do these things happen? In my experience, teams struggle to execute product features well because of the following reasons:

  • Lack of domain experience and context
  • Lack of agile and product management practices
  • Lack of customer centricity and the desire to solve customer problems (already covered previously)

Lack of domain experience and context

Over time, this has become one of my favourite topics that I talk to teams about when I coach them. Domain experience is so underrated. It is something that you can hire for, but even if you don’t have it, it can be learned if you focus on it and approach it from first principles. Let’s look at this with an example to understand it and the side effects of its lacking in a team.

In one of my engagements, I worked with the engineering team at an e-commerce marketplace company to help them improve their test automation setup. This team was responsible for the product catalogue service responsible for controlling what product assortment is made discoverable to customers in a location. While working on test automation, we discussed a particular scenario that had to be tested, which led us to discuss some gory details of their architecture. Initially, the product catalogue would hold the mappings of products to merchants. So a customer at a certain location can be serviced only by the merchants who can service that location, allowing the customer to discover products available with these merchants. Over time, they added “bigger” merchants who could service a larger geolocation. But they wanted different pricing in different areas. So… they created the concept of a backend merchant and a frontend merchant. The backend merchant is, well, the backend, only responsible for holding the inventory and responsible for order fulfilment and logistics. A frontend merchant was mapped to a backend merchant so the frontend merchant would show the backend merchant’s inventory. The frontend merchant controlled the pricing of products. Many frontend merchants could be mapped to one backend merchant. This overtime led to so many complications in the architecture that any new person’s head would go spinning. It was too hard to follow because of entities created out of thin air that did not reflect the reality of the real world (AKA the domain). A good architecture is easy to follow because it represents the domain clearly, making the architecture easy to understand. In this case, there should have been only one type of merchant (like before). A separate service should store location-wise pricing overrides for a merchant. This would have made it much easier to follow what was happening in the system.

In another engagement with a company that builds a DevOps tool as a SaaS service, the founders faced difficulties in having their engineering team support their existing customers and onboard new customers. When the team receives concerns or feedback from their customers, they would either not act with urgency or, worse, they would propose incorrect solutions, leaving the customers frustrated. It seemed obvious to me that when engineers build products for other engineers, they can communicate well and “they get it”. They can easily understand what their customers want. And that is true to quite an extent. But the product engineers in this team had no background or exposure to DevOps at all. They also, unfortunately, had no inclination to learn about DevOps. So they regularly struggled to get to the root of the problems their customers brought to them. They did not understand the domain well, and unfortunately, they did not even want to learn.

Understanding that you need to learn about the domain is also being customer-centric. A great way to learn about the domain of the business is to do frequent customer conversations.

Founders must systemise how their team continuously learns more about the business domain and becomes an expert of their domain. There are several ways to do this but getting the team to do regular customer conversations is one of the most definite and powerful ways to make it happen.

Lack of agile and product management practices

This section introduces jargon (which I have avoided so far). I will try my best to break them down to reduce the obscurity of jargon. To me, and to a lot of really smart folks (who are not product managers by their job function) I have been lucky to work with, have principles and disciplines that they stick to, principles that are rooted in logic and First Principles to maximise return on investment of time and money. Instead of getting into the specifics of product management practices, I would prefer to stick to these foundational principles for product management:

  • Develop an effective strategy
  • Set and stick to clear priorities
  • Set measurable outcomes to determine success
  • Support product engineering teams

When a product engineering team is tasked to do something, the company is making an investment to build something that they can sell to customers and make more money than what was put in to build it. This depends on the fact that you know what to build that customers really want and would pay you for.

When it comes to knowing what customers want, let’s just agree that nobody knows enough about what customers want unless we systematically make an effort to learn. We build with the best information available to us, but we have to constantly be on the lookout for more new information that helps us either be sure of our current understanding and correct it when we are not.

At the minimum, we want to learn from our customers which problems we can solve for them, and when we attempt to solve those problems, have we actually solved those problems well enough to our customer’s expectations? In practice, this looks like:

Learning about unsolved problems

  • Doing frequent customer interviews, surveys or creating any touch point that gets you to interact with customers directly or indirectly to understand the challenges in whatever job they are trying to do.

Getting feedback

  • Build something, and then demo it to customers to see if the new feature or solution solves their problem and does that well enough. We don’t know if we have done our job well unless the customer says it and is ready to pay for it. So show early and show regularly before it is too late.
  • Discover user experience issues with existing solutions (including your product) by regularly reviewing support tickets, analytics, and bugs. Talk to customers for feedback. Our systems already have a lot of information (hopefully), and we can use that to build a better understanding of how our customers use the product and where they struggle.

Obviously, do all of this regularly, with discipline, without failing. Demo after every major release. Review support tickets and bugs regularly. Talk to customers regularly. If you follow the Plan-Do-Check-Act cycle, you already have a cadence in your company. Tie things to it to maintain discipline. For example, review support tickets and bugs once every two weeks (at least).

Most of this might sound obvious. But doing this is so important to build the clarity to decide where to invest to get maximum returns on your investment (engineering time).

Clarity guides strategy. Strategy guides priorities. Measuring and using metrics for success helps stay objective.

Now that we know learning from customers regularly is important and we can receive new information that can change direction, we must support product engineering teams to execute. To support product engineering teams, it is important to plan work so they can ship in small iterations frequently. This enables you to demo to customers regularly and learn from their feedback. We have already discussed some of this earlier in this article. Now, we are now inching into the territory.

How frequent is frequent enough? Ideally, every day or even better if multiple times a day. That is probably how you operated in the early days of your startup. That ideally should not have changed. If it has, then you need to fix it. A good starting point for shipping frequently is at least once in 2 weeks (sprints or Plan-Do-Check-Act cycles). We are now inching into the territory. Why 2 weeks? 2 weeks strike a good balance between time, flexibility, team efficiency, focus and customer collaboration. But this is the widely followed norm for software teams. Different teams (like ones working on ops or hardware) can have different reasons to choose a different duration for sprint cycles.

But we already follow Scrum, and we are still slow.

You may say that “We already follow Scrum, and we are still slow”. I didn’t write all this to tell you to follow Scrum. But I did write all this to tell you to work towards being more agile. Scrum is one of the tools to help you be more agile.

Agile does not mean scrum. Scrum does not mean agile. Agile means agile. What does that mean? Ship small, ship frequently, learn fast from feedback and course correct often. Scrum is only one of the ways to do it. You can very well be agile without Scrum. So if Scrum is not working for you, you have somewhere lost the essense of why we do Scrum, which is to be agile. So learn Scrum to do Scrum well, or don’t do Scrum at all. It can be counter productive and introduce more inneficiencies.

When you (genuinely) learn Scrum, you understand the different techniques and systems that help your teams be more agile. So follow Scrum or use First Principles to create your own processes — whatever works for you. But strive to be more agile in your execution.

Founders must institute the discipline in their teams to regularly learn from customers (directly or indirectly using data) and understand problems can you solve for your customers. To learn regularly and frequently, you must ship frequently as well. A good starting point for founders to make this happen is questioning if their teams and execution is truly agile. If that leads to an unsure feeling, work with your teams (or team leads if you have them) to understand where is execution slow in the process of product management and building software. Use first principles to solve your bottlenecks. If that doesn’t help (and it doesn’t always help if you have not had the experience of building software yourself), learn about Agile, Scrum and Extreme Programming. If you don’t have the time to do it yourself (a scarce resource for founders), reach out to someone (an experienced engineering leader) who can help you understand this.

If you are looking for an experienced tech leader to discuss your execution problems, this is what I do, and I’d be happy to chat.

Software quality issues have started to creep up, leading to frustrated customers.

This is another big one that I see so many teams struggle with. Shipping new features is easy. Prioritising bugs and the work that improves quality is not straightforward because it is hard to understand its impact (besides noisy customer frustration). There are two sub-problems while dealing with quality:

  1. Fixing known reported bugs timely
  2. Ensuring that you don’t introduce new bugs (bugs in new features or breaking existing features) with every release

Fixing known reported bugs timely

Often in the early days, bugs are reported and fixed as they come. Essentially, every team and every engineer is on-call. Every reported bug (especially the ones reported by the founders) is a high priority. So someone leaves their current work (interrupted) and jumps on to it, which is not bad in the early days, but such interruptions impact our ability to focus and deliver planned work. So not prioritising some bugs if they are not severe is absolutely fine because the cost of interruptions is high. But someone has to take that call to prioritise incoming bugs. To prioritise something, it is important to have a process to track it and prioritise it. If you prioritise not fixing it now, maybe you want to fix it later or use that bug as information in guiding new product development to avoid similar mistakes.

Unplanned work, like fixing bugs, is hard to deal with. If you are getting a lot of them, the only way to sanely deal with them is to have some bandwidth carved out so that the execution of planned work is not impacted.

Whatever you choose to do, you must track bugs (and collect information to recreate them to help developers fix them fast) so you can prioritise them to solve now or later. An additional benefit of tracking bugs is to be able to report quality metrics (release frequency vs reported bugs over a period of time). This helps you understand the quality aspect of your software development process (the system, which is what we will talk about in the next section).

Another useful feedback loop for quality is to build some kind of cadence with the Customer Support (or equivalent) team to get a first-hand report on customer issues. Customer Support teams are the ones that customers reach out to (or yell at). They not only have an idea about bugs but also the customer impact of those bugs. Using their insights into prioritising bugs can be really helpful. But more importantly, if you choose to invest time in improving the quality of your product, its impact should indirectly show up in how frequently customers are reaching out to your Customer Support team and what they think of the product experience.

Ensuring that you don’t introduce new bugs with every release

Shipping on-demand every day as and when the team feels that they are ready to ship is not enough, not without the explicit guardrails of quality anyways. If shipping every day leads to bugs (in what you shipped, and worse, in unrelated areas of the product), you have a misplaced sense of moving fast and being agile. All you are doing is frustrating customers, making progress but taking two steps back, creating more work for yourself to do later in terms of engineering and getting back the confidence of frustrated customers.

Move fast but at least don’t break everything?

If you don’t track bugs, you cannot track a trend of quality over time. Reporting bugs on a Slack channel is not enough. That is just communication. Unstructured bug reports are initially helpful. But as you scale, you need some amount of structured information to at least be able to import them in a spreadsheet, create filters, and generate metrics (number of bugs over a 4-week rolling window, number of valid bugs vs reported bugs, number of SEV 1 bugs over a rolling window, etc.). A useful way to think about quality in your software delivery process is by comparing the number of new bugs vs the number of releases over a rolling window. If the release frequency is increasing, but the number of new bugs with releases is growing faster, you have a problem — a false sense of moving fast.

Anyways, exact metrics are not as important as the ability to get metrics when you need them. So, track bugs.

Besides tracking bugs, how do we ensure we don’t introduce more bugs? It is through proper risk management (non-technical founders should understand this). Lack of risk management in the software delivery process makes the problem exponentially worse at a larger scale. So the sooner you nip it in the bud, the easier to scale your team and software delivery process.

What is risk management in software delivery? I love the term safety nets — a way to make software delivery safer. Let’s look at some tactics that are safety nets for your engineering team to ship software safely:

  • Automated quality assurance, Continuous Integration & Continuous Delivery. We have covered this in the “Inability to ship fast with high confidence” section.
  • Reviewing changes, i.e. pull requests. Have someone else in engineering review new changes for logical bugs, architectural problems, reliability concerns, and security. Systemise this as a process.
  • Limit the blast radius of changes. You can ensure that not every change impacts every customer immediately. You can release changes to customers slowly. One of the easiest ways to achieve this is by using Feature Flags to limit the exposure of new features or changes. When Feature Flags are insufficient because of engineering complexity, you might have to look at more complex engineering investments like Canary Releases or Blue-Green Deployments.

All these topics are fairly complex in themselves. Also, there is a lot more to risk management in software engineering (database migrations, database query optimization, security, reliability, performance, etc.), but I am being selective with these for the problem we are discussing, i.e. reducing leakage of bugs to production. I can’t cover all of these in this article. Perhaps another one. But if you research these on Google, you will find a ton of wisdom.

Sales and product managers are not able to meet the commitments they make to customers.

  • Poor planning (already covered previously)
  • Inability to ship fast with high confidence (already covered previously)
  • The sales team makes commitments on behalf of the product and engineering team

Briefly discussing the sales team making commitments on behalf of product engineering, I’d just say that it is impossible for someone to make a near-accurate commitment on the deadline of something they will not do themselves. It is not just about the sales team but anyone in the company.

Founders must discourage (or rather outrigtly stop) their non-tech teams to make commitments on behalf of their tech team. Situations where a commitment must be made immediately will not completely go away (for example, a critical strategic customer, a strategic time bound partnership decision, etc. Critical opportunities for survival can be time bound sometimes.) must be very few). Deal with them carefully and also work towards reducing such occurrences as much as possible.

The Catch-All — important things don’t get done fast enough without pressure, and the founders don’t understand why.

I have literally heard this in every job I have ever had and every engagement I have ever taken. I can count on one hand the number of founders who have not had this problem (good for them). As a founder, you understand the urgency of certain things, and you want them done fast for various reasons. But they don’t happen. There can be a number of reasons for this. I have tried my best to list them down (but I am sure there are more reasons I’d love to learn about them):

  • Poor planning (already covered previously)
  • Lack of a sense of urgency (already covered previously)
  • Lack of customer centricity (already covered previously)
  • Lack of clear direction and focus (already covered previously)
  • Lack of context and domain experience (already covered previously)

There is probably a lot more to this. I have only listed the most common problems I have come across with the companies I have worked with and the founders I have interacted with.

Easier said than done

It’s easy to say that the founders should do this and that. I fully empathise with their situation and the difficulties of being a founder. It’s hard. They surely have a lot on their plate. A lot of it could also feel like jargon. But every time I have attacked these root causes, I have seen teams improve. I also understand that I could be ignorant of the realities of the life of a founder (which is also probably true). But then, building a startup is not easy. My intention behind this article was to help provide a framework rooted in the industry’s knowledge and my personal experiences to make the job slightly easier.

I have used the word “systematic” a lot. The reason is that we are discussing The Evolving Job of a Startup CTO (or founder) managing technical teams. When you step away from daily execution, you are not doing everything yourself. When you were involved hands-on, you used first principles or just followed your gut. You can’t scale that. So how do you get your teams to execute better, move fast, be quality focussed and meet commitments? You build systems and culture that get your team to operate in a way that helps you succeed as a business without hands-building the product at the micro level all the time.

If you are a startup founder needing help streamlining product and engineering, I’d love to chat and see if I can help. If you’d like me to double-click on any of these topics, let me know.

Originally published at https://vaidik.in on July 31, 2023.

--

--

Vaidik Kapoor

Engineering Leader, Technology Advisor | Previously, led engineering at Blinkit, Wingify