I wouldn't consider build time a blocker to adding the report. Even if the build becomes somewhat slower, we can run the coverage report periodically as a separate job and still get value from it.
On Mon, Jul 6, 2020, 2:38 PM Robert Bradshaw <[email protected]> wrote:

> This sounds useful to me, and as it's purely informational it would be a low cost to try out. The one question is how it would impact build runtimes -- do you have an estimate for what the cost is here?
>
> On Sun, Jul 5, 2020 at 1:14 PM Saavan Nanavati <[email protected]> wrote:
> >
> > Hey everyone,
> >
> > Currently, during the Jenkins build process, we don't generate any code coverage reports for the Python SDK. This email includes a proposal to generate Python coverage reports during the pre-commit build, upload them to codecov.io for analysis and visualization, and automatically post the resulting stats back to GitHub PRs to help developers decide whether their tests need revision.
> >
> > You can view/comment on the proposal here, or read the full text of the proposal at the end of this email. Let me know what you think, or if there are any suggestions for improvements. Thanks!
> >
> > Python Coverage Reports For CI/CD
> >
> > Author: Saavan Nanavati ([email protected])
> > Reviewer: Udi Meiri ([email protected])
> >
> > Overview
> >
> > This is a proposal for generating code coverage reports for the Python SDK during Jenkins’ pre-commit phase, and uploading them to codecov.io for analysis, with integration back into GitHub using the service’s sister app.
> >
> > This would extend the pre-commit build time but provide valuable information for developers to revise and improve their tests before their PR is merged, rather than after, when it’s less likely developers will go back to improve their coverage numbers.
> > This particular 3rd-party service has a litany of awesome benefits:
> >
> > It’s free for open-source projects
> >
> > It seamlessly integrates into GitHub via a comment-bot (example here)
> >
> > It overlays coverage report information directly onto GitHub code using Sourcegraph
> >
> > It requires no changes to Jenkins, thereby reducing the risk of breaking the live test-infra
> >
> > It’s extensible and can later be used for the Java & Go SDKs if it proves to be awesome
> >
> > It has an extremely responsive support team that’s happy to help open-source projects
> >
> > A proof-of-concept can be seen here and here.
> >
> > Goals
> >
> > Provide coverage stats for the Python SDK that update with every pre-commit run
> >
> > Integrate these reports into GitHub so developers can take advantage of the information
> >
> > Open a discussion for how these coverage results can be utilized in code reviews
> >
> > Non-Goals
> >
> > Calculate coverage statistics using external tests located outside of the Python SDK
> >
> > This is ideal, but it would require not only merging multiple coverage reports together but, more importantly, waiting for these tests to be triggered in the first place. The main advantage of calculating coverage during pre-commit is that developers can revise their PRs before merging, which is not guaranteed if this is a goal.
> >
> > However, it could be something to explore in the future.
> >
> > Background
> >
> > Providing code coverage for the Python SDK has been a problem since at least 2017 (BEAM-2762), with the previous solution being to calculate coverage in post-commit with coverage.py and then send the report to coveralls.io, which would post to GitHub. At some point, this solution broke and the Tox environment used to compute coverage, cover, was turned off, but it still remains in the codebase.
> > There have been four main barriers in the past to re-implementing coverage; they will be addressed here.
> >
> > It’s difficult to unify coverage for some integration tests, especially ones that rely on 3rd-party dependencies like GCP, since it’s not possible to calculate coverage for the dependencies.
> >
> > As stated earlier, this is a non-goal for the proposal.
> >
> > The test reporter outputs results in the same directory, which sometimes causes previous results to be overwritten. This occurs when using different parameters for the same test (e.g. running a test with Dataflow vs. DirectRunner).
> >
> > This was mainly a post-commit problem, but it does require exploration since it could be an issue for pre-commit. However, even in the worst case, the coverage numbers would still be valuable: you can still see how coverage changed between commits, even if the absolute numbers are slightly inaccurate.
> >
> > It’s time-consuming and non-trivial to modify and test changes to Jenkins.
> >
> > We don’t need to - this proposal integrates directly with codecov.io, making Jenkins an irrelevant part of the testing infrastructure with regard to code coverage. It’s not just easier, it’s better, because it provides many benefits (mentioned in the Overview section) that a Jenkins-based solution like Cobertura doesn’t have. Lastly, using codecov.io would place less strain and maintenance on Jenkins (fewer jobs to run, no need to store the reports, etc.).
> >
> > coveralls.io isn’t the best to work with (email thread)
> >
> > codecov.io ;)
> >
> > Design Details
> >
> > To utilize codecov.io, the existing Tox environment cover needs to be added to the list of toxTasks that Gradle will run in pre-commit. To reduce strain (and more-or-less duplicate information), coverage only needs to be calculated for one Python version (say 3.7, with cloud dependencies) rather than all of them.
> > To be compatible with Jenkins and codecov.io, the Tox environment should be configured as follows:
> >
> > passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
> > deps =
> >     ...
> >     codecov
> > commands =
> >     ...
> >     codecov -t CODECOV_TOKEN
> >
> > There are two options for what goes into the ellipsis (...). First, we can use coverage.py to compute coverage (which is how it’s currently set up), but that uses the old nose test runner. Alternatively, we can switch to computing coverage with pytest and pytest-cov, which would give more accurate numbers and is the preferred solution.
> >
> > Additionally, the CODECOV_TOKEN, which can be retrieved here, should be added as an environment variable in Jenkins.
> >
> > Finally, a YAML file should be added so codecov.io can resolve file path errors. This can also be used to define success thresholds and more (for example, here), though, ideally, this coverage task should never fail a Jenkins build due to the risk of false-negatives.
> >
> > A proof-of-concept for this design can be seen here and here (this is code coverage for my fork).
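For anyone curious what the YAML piece of the proposal could look like, below is a minimal sketch of a codecov.yml. This is illustrative only, not part of the proposal: the `informational` flags and the path-fix pattern are assumptions about how one might keep the check from ever failing a build (matching the "should never fail a Jenkins build" goal above), and the real paths and thresholds would have to match the Beam repository layout.

```yaml
# Hypothetical codecov.yml sketch -- not the actual Beam configuration.
codecov:
  require_ci_to_pass: false   # don't gate the coverage report on other CI statuses

coverage:
  status:
    project:
      default:
        informational: true   # post the numbers, but never fail the PR check
    patch:
      default:
        informational: true

# "fixes" rewrites report paths so codecov.io can resolve files on GitHub,
# e.g. stripping a CI-specific prefix. The pattern here is a placeholder.
fixes:
  - "sdks/python/::"
```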
