The idea was to add coverage to one of the precommit tox tasks that already
runs, which should keep the additional overhead to a minimum.
py37-cloud is the current candidate, since it has the most dependencies
installed, so fewer tests are skipped.
Saavan, do you have an estimate for the overhead?

Once coverage information is available, we could make use of it for review
purposes.


On Mon, Jul 6, 2020 at 3:06 PM Mikhail Gryzykhin <
gryzykhin.mikh...@gmail.com> wrote:

> I wouldn't consider build time a blocker to adding the report. Even if the
> build time is noticeably slower, we can run the coverage report periodically
> as a separate job and still get use out of it.
>
> On Mon, Jul 6, 2020, 2:38 PM Robert Bradshaw <rober...@google.com> wrote:
>
>> This sounds useful to me, and since it's purely informational it would be
>> low-cost to try out. The one question is how it would impact build
>> runtimes--do you have an estimate of the cost here?
>>
>> On Sun, Jul 5, 2020 at 1:14 PM Saavan Nanavati <saa...@google.com> wrote:
>> >
>> > Hey everyone,
>> >
>> > Currently, during the Jenkins build process, we don't generate any code
>> coverage reports for the Python SDK. This email includes a proposal to
>> generate Python coverage reports during the pre-commit build, upload them
>> to codecov.io for analysis and visualization, and automatically post the
>> resulting stats back to GitHub PRs to help developers decide whether their
>> tests need revision.
>> >
>> > You can view/comment on the proposal here, or read the full text of the
>> proposal at the end of this email. Let me know what you think, or if you
>> have any suggestions for improvement. Thanks!
>> >
>> > Python Coverage Reports For CI/CD
>> >
>> > Author: Saavan Nanavati (saa...@google.com)
>> >
>> > Reviewer: Udi Meiri (eh...@google.com)
>> >
>> >
>> > Overview
>> >
>> >
>> > This is a proposal for generating code coverage reports for the Python
>> SDK during Jenkins’ pre-commit phase, and uploading them to codecov.io
>> for analysis, with integration back into GitHub using the service’s sister
>> app.
>> >
>> >
>> > This would extend the pre-commit build time but provide valuable
>> information for developers to revise and improve their tests before their
>> PR is merged, rather than after, when it’s less likely that developers will
>> go
>> back to improve their coverage numbers.
>> >
>> >
>> > This particular third-party service has a long list of benefits:
>> >
>> > It’s free for open-source projects
>> >
>> > It seamlessly integrates into GitHub via a comment-bot (example here)
>> >
>> > It overlays coverage report information directly onto GitHub code using
>> Sourcegraph
>> >
>> > It requires no changes to Jenkins, thereby reducing the risk of
>> breaking the live test-infra
>> >
>> > It’s extensible and can later be used for the Java & Go SDKs if it
>> proves to be awesome
>> >
>> > It has an extremely responsive support team that’s happy to help
>> open-source projects
>> >
>> >
>> > A proof-of-concept can be seen here and here.
>> >
>> >
>> > Goals
>> >
>> >
>> > Provide coverage stats for the Python SDK that update with every
>> pre-commit run
>> >
>> > Integrate these reports into GitHub so developers can take advantage of
>> the information
>> >
>> > Open a discussion for how these coverage results can be utilized in
>> code reviews
>> >
>> >
>> > Non-Goals
>> >
>> > Calculate coverage statistics using external tests located outside of
>> the Python SDK
>> >
>> >
>> > This would be ideal, but it would require not only merging multiple
>> coverage reports together but, more importantly, waiting for those tests to
>> be triggered in the first place. The main advantage of calculating coverage
>> during pre-commit is that developers can revise their PRs before merging,
>> which would not be guaranteed if this were a goal.
>> >
>> >
>> > However, it could be something to explore for the future.
>> >
>> > Background
>> >
>> >
>> > Providing code coverage for the Python SDK has been a problem since at
>> least 2017 (BEAM-2762). The previous solution was to calculate coverage in
>> post-commit with coverage.py and send the report to coveralls.io, which
>> would post to GitHub. At some point, this solution broke, and the Tox
>> environment used to compute coverage, cover, was turned off but still
>> remains in the codebase.
>> >
>> >
>> > In the past, there have been four main barriers to re-implementing
>> coverage; they are addressed here.
>> >
>> >
>> > It’s difficult to unify coverage for some integration tests, especially
>> ones that rely on third-party dependencies like GCP, since it’s not
>> possible to calculate coverage for the dependencies.
>> >
>> > As stated earlier, this is a non-goal for the proposal.
>> >
>> >
>> > The test reporter outputs results to the same directory, which sometimes
>> causes previous results to be overwritten. This occurs when the same test
>> is run with different parameters (e.g. running a test with Dataflow vs the
>> DirectRunner).
>> >
>> > This was mainly a post-commit problem, but it does require exploration
>> since it could also be an issue for pre-commit. However, even in the worst
>> case the coverage numbers would still be valuable, since you can still see
>> how coverage changed between commits even if the absolute numbers are
>> slightly inaccurate.
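>> >
>> > If overwriting does turn out to be a problem in pre-commit, one possible
>> mitigation (a sketch only, not something this proposal depends on) is
>> coverage.py's parallel mode, which writes a separate data file per run and
>> merges them afterwards:
>> >
>> > coverage run --parallel-mode -m pytest   # one data file per run: .coverage.<host>.<pid>.<suffix>
>> > coverage combine                         # merge the per-run data files
>> > coverage xml                             # single report to upload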
>> >
>> >
>> > It’s time-consuming and non-trivial to modify and test changes to
>> Jenkins
>> >
>> > We don’t need to: this proposal integrates directly with codecov.io,
>> making Jenkins irrelevant to code coverage in the testing infrastructure.
>> It’s not just easier, it’s better, because it provides many benefits
>> (mentioned in the Overview section) that a Jenkins-based solution like
>> Cobertura doesn’t have. Lastly, using codecov.io would place less strain
>> and maintenance burden on Jenkins (fewer jobs to run, no need to store the
>> reports, etc.).
>> >
>> >
>> > coveralls.io isn’t the best to work with (email thread)
>> >
>> > codecov.io ;)
>> >
>> > Design Details
>> >
>> >
>> > To utilize codecov.io, the existing Tox environment, cover, needs to be
>> added to the list of toxTasks that Gradle runs in pre-commit. To reduce
>> strain (and avoid mostly duplicate information), coverage only needs to be
>> calculated for one Python version (say 3.7, with cloud dependencies) rather
>> than all of them.
>> >
>> >
>> > To be compatible with Jenkins and codecov.io, the Tox environment
>> should be configured as follows:
>> >
>> > passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
>> > deps =
>> >     ...
>> >     codecov
>> > commands =
>> >     ...
>> >     codecov -t CODECOV_TOKEN
>> >
>> >
>> > There are two options for what goes into the ellipsis (...). First, we
>> can keep using coverage.py to compute coverage (which is how it’s currently
>> set up), but that relies on the old nose test runner. Alternatively, we can
>> switch to computing coverage with pytest and pytest-cov, which would give
>> more accurate numbers and is the preferred solution.
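>> >
>> > For illustration only, the pytest-based option might fill in the ellipsis
>> roughly as follows (the module path and flags are placeholders, not final
>> choices):
>> >
>> > deps =
>> >     pytest
>> >     pytest-cov
>> >     codecov
>> > commands =
>> >     pytest --cov=apache_beam --cov-report=xml {posargs}
>> >     codecov -t CODECOV_TOKEN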
>> >
>> >
>> > Additionally, the CODECOV_TOKEN, which can be retrieved here, should be
>> added as an environment variable in Jenkins.
>> >
>> >
>> > Finally, a YAML file should be added so codecov.io can resolve file
>> path errors. This can also be used to define success thresholds and more
>> (for example, here), though ideally this coverage task should never fail a
>> Jenkins build, due to the risk of false negatives.
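>> >
>> > As a rough sketch (the keys are standard codecov.yml options, but the
>> path mapping is illustrative and any thresholds would still need to be
>> worked out), the file could look something like:
>> >
>> > coverage:
>> >   status:
>> >     project:
>> >       default:
>> >         informational: true   # post the numbers, never fail the build
>> >     patch:
>> >       default:
>> >         informational: true
>> > fixes:
>> >   - "::sdks/python/"   # illustrative: map report paths onto repo paths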
>> >
>> >
>> > A proof-of-concept for this design can be seen here and here (this is
>> code coverage for my fork).
>> >
>> >
>>
>
