Code coverage would be very useful regardless of which tool we use to measure it, especially when reviewing PRs: it provides a quantitative measure that can help the reviewer and the contributor decide whether testing is sufficient.

I'd prefer to use specific tools instead of a one-size-fits-all tool, but I'm open to whatever works best. Setting up the tooling is likely going to be an iterative process.

@Mikhail If you want to set anything up and present it here, that would be awesome. (Of course, the discussion can continue here.)

I think we should also consider the tool's false positive ratio. Deeper analysis often produces tons of false positives, and the triaging overhead makes people less interested in static analysis results.

    Discussion on software engineering practices and tools tends to gather many
    opinions :) I suggest breaking this out into a doc to keep the discussion

    I appreciate that you've started with a list of requirements. I would add a 

    6. Analysis results should be integrated into the code review workflow.
    7. It should also be possible to run analysis and evaluate results locally.
    8. Analysis rules and thresholds should be easily configurable.

    And some thoughts on the previous requirements:

     > 2. Tool should keep history of reports.

    Seems nice-to-have but not required. I believe the most value is viewing the
    delta during code review, and also maybe a snapshot of the overall state of
    master. If we want trends we could also import data into 

     > 4. Tool should incorporate code coverage and static analysis reports. (Or
    more if applicable)

    Is the idea to have a single tool responsible for all code analysis? We
    currently have a variety of tools running in our build. It would be
    challenging to find a single tool that aggregates all current (and future)
    analysis, especially considering the different language ecosystems. Having
    targeted tools responsible for different pieces allows us to pick-and-choose
    what works best for Beam.

        Let me summarize and answer the main questions that I see:
        1. It seems that we do want to have some statistics on coverage and
        to integrate automatic requirements into our build system.
        2. Implementation is still to be discussed.

        Let's talk about implementation further.

        My requirements for the choice of tool are:
        1. Tool should give us an option to deep-dive into findings.
        2. Tool should keep history of reports.
        3. Tool should give an option to break build (allow for hardcoded
        4. Tool should incorporate code coverage and static analysis reports.
        (Or more if applicable)
        5. Tool should support most or all languages we utilize in Beam.

        Let me dive into SonarQube a bit first. (All of this is to the best of
        my understanding of how it works.)
        It hits most of the points, potentially with some tweaks.
        This tool relies on reports generated by common tools. It also tracks
        the history of builds and lets you navigate it. It is multi-language.
        I'm still working on figuring out how to configure it though.

        Common thresholds/checks that are suggested by SonarQube:
        Many checks can be applied to new code only. This lets us leave
        legacy code as it is, but keep all new additions clean and neat (ish).
        Test coverage by line/branch: Relies on the Cobertura report. Usually
        coverage by branch is suggested. (Every "if" should be tested with
        both a positive and a negative condition result.)
        Method complexity: Number of different paths/conditions that a method
        can be invoked with. Suggested max number is 15. Generally describes
        how easy it is to test/understand a method.
        Bugs/vulnerabilities: Generally, the output of FindBugs. Reflects
        commonly vulnerable/dangerous code that might cause errors, or just
        errors in code. I believe that SonarQube allows for custom code
        analysis as well, but that is not required.
        Technical debt: An estimate of how much time it will take to clean up
        the code to make it shiny. Includes code duplication, commented-out
        code, not following naming conventions, long methods, ifs that can be
        inverted, public methods that can be private, etc. I'm not familiar
        with the explicit list, but in my experience the suggestions are
        usually relevant.
        More on metrics can be found here:

        Suggested alternatives:
        This tool looks great and I'll check more on it. But it has a
        restriction of 14 or 7 builds per week (not sure how they will
        estimate our project). Also, I'm not sure if we can break pre-commit
        based on a report from Coverity. Looks good for generating historical
        data.
        I'll check more on this one. Looks great to have it integrated in PRs,
        although it requires plugin installation by each developer. I don't
        think it allows breaking builds, and it only does coverage. Am I
        correct?


            It would be very useful to have line and/or branch coverage visible.
            These are both very weak proxies for quality or reliability, so IMO
            strict thresholds are not helpful. One thing that is super useful is
            to integrate line coverage into code review, like this:
            It is very easy to
            notice major missing tests.

            We have never really used SonarQube. It was turned on as a
            possibility in the early days but never worked on past that point.
            Could be nice. I suspect there's a lot to be gained by just finding
            very low numbers and improving them. So just running JaCoCo's
            offline HTML generation would do it (this also integrates with
            Jenkins). I tried this the other day and discovered that our Gradle
            config is broken and does not wire tests and coverage reporting
            together properly. Last thing: How is "technical debt" measured? I'm
            skeptical of quantitative measures for qualitative notions.


                I don't have any experience using SonarQube, but Coverity
                worked well for me. Looks like it already has the Beam repo:

                    checkstyle and findbugs are already run as precommit checks,
                    are they not?

                        Hi everyone,

                        In our current builds we (can) run multiple code quality
                        tools, like Checkstyle, FindBugs, and test coverage via
                        Cobertura. However, we do not utilize many of those
                        signals.

                        I suggest adding requirements on code based on those
                        tools. Specifically, I suggest adding pre-commit checks
                        that will require PRs to conform to some quality checks.

                        We can see a good example of thresholds to add in the
                        default quality gate config of the Apache-provided
                        SonarQube:
                        80% test coverage on new code,
                        5% technical debt on new code,
                        no bugs/vulnerabilities added.
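
To make the thresholds above concrete, here is a minimal sketch in Python of what such a gate on new code could look like. It is not how SonarQube implements its quality gate. The `NewCodeMetrics` structure and the `gate_failures` function are hypothetical names made up for illustration, and reading "5% technical debt" as a maximum allowed debt ratio is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NewCodeMetrics:
    """Metrics for the code a PR adds or changes (hypothetical structure)."""
    coverage_pct: float       # test coverage on new code, 0-100
    debt_ratio_pct: float     # technical debt ratio on new code, 0-100
    new_bugs: int             # bugs introduced by the PR
    new_vulnerabilities: int  # vulnerabilities introduced by the PR


# Thresholds from the list above. Treating the 5% figure as an upper bound
# on the debt ratio is an assumption, not something the thread states.
MIN_COVERAGE_PCT = 80.0
MAX_DEBT_RATIO_PCT = 5.0


def gate_failures(m: NewCodeMetrics) -> List[str]:
    """Return the reasons the gate fails; an empty list means it passes."""
    failures = []
    if m.coverage_pct < MIN_COVERAGE_PCT:
        failures.append(
            f"coverage on new code is {m.coverage_pct:.1f}%, "
            f"below {MIN_COVERAGE_PCT:.1f}%")
    if m.debt_ratio_pct > MAX_DEBT_RATIO_PCT:
        failures.append(
            f"technical debt on new code is {m.debt_ratio_pct:.1f}%, "
            f"above {MAX_DEBT_RATIO_PCT:.1f}%")
    if m.new_bugs > 0:
        failures.append(f"{m.new_bugs} new bug(s)")
    if m.new_vulnerabilities > 0:
        failures.append(f"{m.new_vulnerabilities} new vulnerability(ies)")
    return failures


if __name__ == "__main__":
    for reason in gate_failures(NewCodeMetrics(72.5, 3.0, 1, 0)):
        print("FAIL:", reason)
```

A pre-commit job could fail the build whenever the returned list is non-empty and print the reasons, so the contributor sees which threshold was missed.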

                        As another part of this proposal, I want to suggest the
                        use of SonarQube for tracking code statistics and as
                        the agent for enforcing code quality thresholds. It is
                        an Apache-provided tool that integrates with Jenkins or
                        Gradle via plugins.

                        I believe some reporting to SonarQube was configured for
                        the mvn builds of some Beam sub-projects, but was lost
                        during the migration to Gradle.

                        I was looking for other options, but so far I have found
                        only general configs for Gradle builds that fail the
                        build if code coverage for the project is too low. Such
                        an approach would force us to backfill tests for all
                        existing code, which can be tedious and would demand
                        learning legacy code that is not part of current work.
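
The difference between a whole-project threshold and a new-code threshold can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the file-to-line-numbers dictionaries are hypothetical, and a real setup would read these sets from a coverage report and from the PR diff.

```python
from typing import Dict, Set

# Hypothetical input shape: file name -> set of line numbers.
LineSets = Dict[str, Set[int]]


def _pct(covered: int, total: int) -> float:
    """Percentage helper; nothing to cover counts as fully covered."""
    return 100.0 if total == 0 else 100.0 * covered / total


def project_coverage(executable: LineSets, covered: LineSets) -> float:
    """Line coverage over every executable line in the project."""
    total = sum(len(lines) for lines in executable.values())
    hit = sum(len(lines & covered.get(name, set()))
              for name, lines in executable.items())
    return _pct(hit, total)


def new_code_coverage(executable: LineSets, covered: LineSets,
                      changed: LineSets) -> float:
    """Line coverage over executable lines touched by the PR only."""
    total = 0
    hit = 0
    for name, changed_lines in changed.items():
        relevant = changed_lines & executable.get(name, set())
        total += len(relevant)
        hit += len(relevant & covered.get(name, set()))
    return _pct(hit, total)


if __name__ == "__main__":
    # Made-up numbers: a poorly tested legacy file plus a well tested new file.
    executable = {"legacy.py": set(range(1, 101)), "new.py": set(range(1, 11))}
    covered = {"legacy.py": set(range(1, 11)), "new.py": set(range(1, 10))}
    changed = {"new.py": set(range(1, 11))}
    print(f"project:  {project_coverage(executable, covered):.1f}%")
    print(f"new code: {new_code_coverage(executable, covered, changed):.1f}%")
```

With these made-up numbers the project as a whole sits at roughly 17% while the PR's own lines are at 90%, so a project-wide 80% threshold would block the PR until the legacy file is backfilled, whereas a new-code threshold would not.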

                        I suggest we discuss and come to a conclusion on two
                        points in this thread:
                        1. Do we want to add code quality checks to our
                        pre-commit jobs and require them to pass before PR is

                            Suggested: Add code quality checks listed above at
                            first, adjust them as we see fit in the future.

                        2. What tools do we want to utilize for analyzing code

                            Under discussion. Suggested: SonarQube, but it will
                            depend on the level of functionality we want.



