Code Review Comments — Consequences will never be the same

Nowadays many development teams conduct code reviews, which are a means of feedback about the suggested code change.

Why review code?

The purpose is to ensure immediate and future quality. Benefits include:

  • Verifying the correctness of the functionality, error handling and edge cases
  • Readability — making sure that other humans are able to read the code
  • Enforcing common conventions (linters and other automatic tools can help with mundane syntax and styling conventions)
  • Finding mistakes (a.k.a bugs)
  • Knowledge sharing
  • Promoting good design
  • Mentoring opportunities to become craftsmen
  • Preventing premature optimizations
  • Keeping things simple
  • Spotting missing tests
  • Non functional requirements (performance, security, SLAs, GDPR)
  • Spotting violations of abstractions and architecture
  • Promoting best practices
  • Tricky topics like memory leaks and threading
  • Focusing on one responsibility per pull request

Problems with Code Review

All the benefits of code review come with a price.

It takes time until developers start the code review, in which the author of the code must switch to another task, causing “work in progress” items. This also imposes constant context switches for the reviewers, allowing them for less focus time.

Besides that, there are some potential problems that might arise:

  • Management or other stakeholders don’t always allow time for code reviews, as they perceive it as optional or wasteful
  • Management is pushing for code reviews but the developers are reluctant
  • Not rewarding code reviews. For example, a person who is conducting a lot of code reviews compared to other developers will end up with a bad performance review for not delivering as much as his colleagues
  • Can distract from shipping, trying to make the perfect code
  • Weird managerial initiatives, such as turning code reviews to competition between developers or giving public grades
  • One person is a bottleneck (the team lead, the tech lead, the code owner)
  • The code review is done poorly

Poor code reviews

There are many ways for which the code review would earn the title “poor”.
In essence, it means that either the pull request does not use its max potential, or it does, but there are consequences on the developers.

The symptoms are evident in the code review comments. Let’s examine them:

  • The comments only target parts of the changes, e.g. ignoring test or configuration files, only first few files, only the “main” file.
    If not giving an indication, it’s unclear whether the reviewer might continue later, or will review again after the first set of fixes, or prioritized some files due to lack of time. This might result in the author not sure what to expect, or doing a good job only on the main files.
  • Superficial code reviews. All the files were reviewed, but not thoroughly, rather skimmed through. This was well expressed in the joke:
    5 lines of code → 10 comments.
    500 lines of code → “looks good to me.”
    This is bad because the author might think that the rest of his code is excellent, when in fact it wasn’t really reviewed.
  • Incorrect comments, resulting in a waste of time. This could be as simple as requesting to use the wrong method instead of the correct one, but in more extreme cases, requesting to restructure big parts based on partial knowledge or wrong assumptions. Then it is tremendously more expensive.
  • Hostility, toxicity, aggressiveness. It might be a specific person who does it for all, or targets a specific person, a.k.a as bullying. The effect of such behavior is less collaboration, fear, killing creativity and experimental spirit, low moral and demotivation.
    Example: “I don’t understand why a skilled developer would do this.” or “What the hell?”
  • Passive-aggressiveness. The reviewer comments in an unclear way, that is not constructive in regards to what a problem could be, and whether she actually prefers a change.
    Example: “Go ahead if you think it’s good.”
  • Deadlocks — developers wait for other developers to review first.
  • Fear to approve — developers are afraid to approve a pull request, fearing there might be an error they missed.
  • Fear to comment seniors— developers are afraid to comment when the author is a senior developer or a manager.
  • Fear to comment juniors— a senior developer or a manager refrains from commenting on some issues as she doesn’t want to hurt a junior author.
  • Pet peeve — a developer desperately and repeatedly brings her agenda which was already rejected. This wastes everybody’s time.
    Example: “If you used Kafka, it would be great because…”
  • Personal taste of the reviewer vs. real issues or conventions, a.k.a “religion”.
    Example: “Parenthesis are not mandatory in this case.”
  • Nitpicking. Similar to personal taste
    Example: “please sort constants in alphabetical order.”
  • Irrelevant distracting comments — the reviewer comments about things that are not directly related to the change, risking the author to extend the scope and introduce unplanned work. Such suggestions should be made outside of the code review.
    Example: “Hey, why not create a micro-service with its own database here?”
  • Mistakes (a.k.a bugs) that are not caught. It is accepted that code reviews catch no more than 20% of the bugs, yet some are very obvious from reading the code.
  • Avoidance — some developers almost don’t contribute to code reviews.
  • Ego fights — either between different reviewers, or between the author and a reviewer. By reading such comments, it’s obvious that the discussion is not professional.
    Example:
    “Would be good to consider performance here, rather than a naive implementation”
    “If you knew a bit about our product, you’d realize this is not the case”
  • Opposition — two senior developers request opposing changes. The author needs to decide which one to listen to, thus possibly creating a problem with the other.
    Example:
    Senior A: “Always use a named subject in tests”
    Senior B: “Always use an implicit subject in tests”
  • Dialog — Some comments become a long dialog that needs resolution offline.
  • Irrelevant show-off — the reviewer’s comment is completely irrelevant and its only purpose is to show his knowledge.
    Example in Ruby code: “If this was java, we could use Int64…”
  • Personal — the comments describe the author instead of the code.
    Example: “You made a mistake” instead of “There is a mistake”
  • The psychologist — similar to the personal, but goes further with the attack, explaining why the error was made.
    Example: “You did this because you’re used to Java, but this is Ruby”
  • Not constructive comments, which are vague and unclear what the expected change is.
    Example: “This is bad / inefficient / wrong / unreadable”
  • Useless comments, which makes you really wonder why it’s there.
    Example: “Not sure why I don’t like it”, or “Not elegant”.
  • Giving clues instead of explaining what’s wrong and why. Results in a huge waste of time figuring out what needs fixing.
    Example: “Please read the (32 pages) documentation”
  • No reasoning- Requesting a change without taking the opportunity to teach.
    Example: “Invert ‘if’ please”
  • Too clever: requesting to rewrite a “dumb” and readable code into clever and unreadable code, for the sake of feeling sophisticated.
  • Culture differences between the author and the reviewer might cause misunderstanding and be perceived as hostile, commanding or failure to understand that there is a request for a change. This can happen when one culture is very direct and the other is very polite.
    Example: “Make this method static” (sounds commanding)
    Example: “Would be nice to have it as a static method” (yeah, wouldn’t it?)
  • Behavior differences. Similarly to the cultural differences, but the differences originate in the people’s behavior.
    Take DISC for example. It is a distinction between 4 main expected behaviors, or tendencies. The “C” is very data driven, needs a lot of time and information to decide, usually communicates in long written form. The “D” on the other hand makes quick decisions with missing information and communicates in short sentences.
  • “Sorry, no time to fix” — the author has a tendency to avoid fixes in multiple occasions, with the same reasoning.
  • Reasoning errors — the author has a tendency to explain why they made a mistake, or they blame other people or the internet in general. This adds zero value, creates noise and demonstrates lack of responsibility.
    Example: “Right, I should have done it more object oriented. I used procedural code here because I watched a great talk about functional programming and got confused.”
  • Not using common terminology in comments which eases the communication, like known design patterns, SOLID principles or industry standards.
    Example: “Consider creating a class whose job will be to translate our API to acme’s API…” instead of “The ‘adapter’ pattern could be beneficial in this case…”
  • Always/never very strong opinions in comments.
    Example: “Never use metaprogramming”, might cause developers to avoid it in the future, even when it’s a good solution for a given problem.

Alternative and supplemental approaches

  • Not doing any code reviews, everyone pushes their changes. If a problem is discovered it will be fixed.
  • Design reviews with other developers before the development starts, so the approach and design are agreed upon
  • Usage of tools like Rubocop, linters, code quality tools, etc.
  • Usage of automated security checks tools
  • Usage of automated code review tools, that is, no humans review the code
  • Trunk based development — always push to production, code review is done later voluntarily
  • Pair programming
  • Automated tests, such as unit, component, integration and end-to-end
  • Monitoring and alerting
  • Only designated people conduct code reviews, such as a manager, a code owner or a tech lead

Call to Action

We understand that good code reviews consist of:

  • Empathy and politeness
  • Teaching, giving reasons for requested changes
  • Global and consistent impact
  • High quality comments

Which result in:

  • Encouraged and grateful developers
  • Knowledge sharing
  • Best practices being applied
  • Developers constantly improving
  • Psychological safety that builds trust

Assuming you prefer good code reviews, and you already have 1:1s and learning activities, what else can you do?

Review the reviews!

Read code review comments and get statistics about the comments — the amount of comments given and received per developer. This is not a competition and numbers might change between teams with different maturity levels and cultures, areas in the code, different tech stacks, etc.

  • Identify good, knowledgeable, caring reviewers and reward them.
  • You can probably spot people who rarely review, or some that receive a lot of comments, have a lot of back and forth and are slow to deliver because of that.
  • Identify the people who struggle and what they struggle with.
  • Create development plans for people based on your findings. You also have numbers for measurable goals.
  • Watch out for bullying, and stop it immediately.

I am interested in your insights, so if you wish, you can fill in this short survey (up to 5 minutes).

--

--

Happily programming since 1984

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store