Code coverage would be very useful, regardless of what tool we use to measure
it. Especially when reviewing PRs, because it provides a quantitative measure
which can help reviewer and contributor to decide if testing is sufficient.
I'd prefer to use specific tools instead of one-fits-it-all tool but I'm open to
whatever works best. The process of setting the tooling up is likely going to be
iterative.
@Mikhail If you want to setup anything and present it here, that would be
awesome. (Of course discussion can continue here.)
On 03.01.19 19:34, Heejong Lee wrote:
I think we should also consider false positive ratio of the tool. Oftentimes
deeper analysis easily produces tons of false positives which make people less
interested in static analysis results because of triaging overheads.
On Thu, Jan 3, 2019 at 4:18 PM Scott Wegner <sc...@apache.org
<mailto:sc...@apache.org>> wrote:
Discussion on software engineering practices and tools tends to gather many
opinions :) I suggest breaking this out into a doc to keep the discussion
organized.
I appreciate that you've started with a list of requirements. I would add a
few:
6. Analysis results should be integrated into the code review workflow.
7. It should also be possible to run analysis and evaluate results locally.
8. Analysis rules and thresholds should be easily configurable.
And some thoughts on the previous requirements:
> 2. Tool should keep history of reports.
Seems nice-to-have but not required. I believe the most value is viewing the
delta during code review, and also maybe a snapshot of the overall state of
master. If we want trends we could also import data into
s.apache.org/beam-community-metrics
<http://s.apache.org/beam-community-metrics>
> 4. Tool should encorporate code coverage and static analysis reports. (Or
more if applicable)
Is the idea to have a single tool responsible for all code analysis? We
currently have a variety of tools running in our build. It would be
challenging to find a single tool that aggregates all current (and future)
analysis, especially considering the different language ecosystems. Having
targeted tools responsible for different pieces allows us to pick-and-choose
what works best for Beam.
On Thu, Jan 3, 2019 at 3:43 PM Mikhail Gryzykhin <mig...@google.com
<mailto:mig...@google.com>> wrote:
Let me summarize and answer main question that I see:
1. Seems that we do want to have some statistics on coverage and
integrate automatic requirements into our build system.
2. Implementation is still to be discussed.
Lets talk about implementation further.
My requirements for choice are:
1. Tool should give us an option for deep-dive into findings.
2. Tool should keep history of reports.
3. Tool should give an option to break build (allow for hardcoded
requirements)
4. Tool should encorporate code coverage and static analysis reports.
(Or more if applicable)
5. Tool should support most or all languages we utilize in beam.
Let me dive into SonarQube a bit first. (All up to my understanding of
how it works.)
Hits most of the points, potentially with some tweaks.
This tool relies on reports generated by common tools. It also tracks
history of builds and allows to navigate it. Multi language. I'm still
working on figuring out how to configure it though.
Common thresholds/checks that are suggested by SonarQube:
Many checks are possible to apply to new code only. This allows not to
fix legacy code, but keep all new additions clean and neat (ish).
Test coverage by line/branch: Relies on cubertura report. Usually
coverage by branch is suggested. (all "if" case lines should be tested
with positive and negative condition result)
Method complexity: Amount of different paths/conditions that method can
be invoked with. Suggested max number is 15. Generally describes how
easy it is to test/understand method.
Bugs/vulnerabilities: Generally, output of Findbug. Reflects commonly
vulnerable/dangerous code that might cause errors. Or just errors in
code. I believe that sonar allows for custom code analysis as well, but
that is not required.
Technical debt: estimations on how much time will it take to cleanup
code to make it shiny. Includes code duplications, commented code, not
following naming conventions, long methods, ifs that can be inverted,
public methods that can be private, etc. I'm not familiar with explicit
list, but on my experience suggestions are usually relevant.
More on metrics can be found here:
https://docs.sonarqube.org/latest/user-guide/metric-definitions/
Suggested alternatives:
https://scan.coverity.com/
This tool looks great and I'll check more on it. But it has a
restriction to 14 or 7 builds per week (not sure how will they estimate
our project). Also, I'm not sure if we can break pre-commit based on
report from coverity. Looks good for generating historical data.
https://docs.codecov.io/docs/browser-extension
I'll check more on this one. Looks great to have it integrated in PRs.
Although it requires plugin installation by each developer. I don't
think it allows to break builds and only does coverage. Am I correct?
Regards,
--Mikhail
Have feedback <http://go/migryz-feedback>?
On Thu, Jan 3, 2019 at 2:18 PM Kenneth Knowles <k...@apache.org
<mailto:k...@apache.org>> wrote:
It would be very useful to have line and/or branch coverage visible.
These are both very weak proxies for quality or reliability, so IMO
strict thresholds are not helpful. One thing that is super useful is
to integrate line coverage into code review, like this:
https://docs.codecov.io/docs/browser-extension. It is very easy to
notice major missing tests.
We have never really used Sonarqube. It was turned on as a
possibility in the early days but never worked on past that point.
Could be nice. I suspect there's a lot to be gained by just finding
very low numbers and improving them. So just running Jacoco's
offline HTML generation would do it (also this integrates with
Jenkins). I tried this the other day and discovered that our gradle
config is broken and does not wire tests and coverage reporting
together properly. Last thing: How is "technical debt" measured? I'm
skeptical of quantitative measures for qualitative notions.
Kenn
On Thu, Jan 3, 2019 at 1:58 PM Heejong Lee <heej...@google.com
<mailto:heej...@google.com>> wrote:
I don't have any experience of using SonarQube but Coverity
worked well for me. Looks like it already has beam repo:
https://scan.coverity.com/projects/11881
On Thu, Jan 3, 2019 at 1:27 PM Reuven Lax <re...@google.com
<mailto:re...@google.com>> wrote:
checkstyle and findbugs are already run as precommit checks,
are they not?
On Thu, Jan 3, 2019 at 7:19 PM Mikhail Gryzykhin
<mig...@google.com <mailto:mig...@google.com>> wrote:
Hi everyone,
In our current builds we (can) run multiple code quality
checks tools like checkstyle, findbugs, code test
coverage via cubertura. However we do not utilize many
of those signals.
I suggest to add requirements to code based on those
tools. Specifically, I suggest to add pre-commit checks
that will require PRs to conform to some quality checks.
We can see good example of thresholds to add at Apache
SonarQube provided default quality gate config
<https://builds.apache.org/analysis/quality_gates/show/1>:
80% tests coverage on new code,
5% technical technical debt on new code,
No bugs/Vulnerabilities added.
As another part of this proposal, I want to suggest the
use of SonarQube for tracking code statistics and as
agent for enforcing code quality thresholds. It is
Apache provided tool that has integration with Jenkins
or Gradle via plugins.
I believe some reporting to SonarQube was configured for
mvn builds of some of Beam sub-projects, but was lost
during migration to gradle.
I was looking for other options, but so far found only
general configs to gradle builds that will fail build if
code coverage for project is too low. Such approach will
force us to backfill tests for all existing code that
can be tedious and demand learning of all legacy code
that might not be part of current work.
I suggest to discuss and come to conclusion on two
points in this tread:
1. Do we want to add code quality checks to our
pre-commit jobs and require them to pass before PR is
merged?
Suggested: Add code quality checks listed above at
first, adjust them as we see fit in the future.
2. What tools do we want to utilize for analyzing code
quality?
Under discussion. Suggested: SonarQube, but will
depend on functionality level we want to achieve.
Regards,
--Mikhail
--
Got feedback? tinyurl.com/swegner-feedback
<https://tinyurl.com/swegner-feedback>