I wouldn't consider build time a blocker to adding the report. Even if the build becomes somewhat slower, we can run the coverage report periodically as a separate job and still get value from it.
On Mon, Jul 6, 2020, 2:38 PM Robert Bradshaw <[email protected]> wrote:

> This sounds useful to me, and as it's purely informational it would be a low cost to try out. The one question is how it would impact build runtimes -- do you have an estimate for what the cost is here?
>
> On Sun, Jul 5, 2020 at 1:14 PM Saavan Nanavati <[email protected]> wrote:
> >
> > Hey everyone,
> >
> > Currently, during the Jenkins build process, we don't generate any code coverage reports for the Python SDK. This email includes a proposal to generate Python coverage reports during the pre-commit build, upload them to codecov.io for analysis and visualization, and automatically post the resulting stats back to GitHub PRs to help developers decide whether their tests need revision.
> >
> > You can view/comment on the proposal here, or read the full text of the proposal at the end of this email. Let me know what you think, or if there are any suggestions for improvements. Thanks!
> >
> > Python Coverage Reports For CI/CD
> >
> > Author: Saavan Nanavati ([email protected])
> > Reviewer: Udi Meiri ([email protected])
> >
> > Overview
> >
> > This is a proposal for generating code coverage reports for the Python SDK during Jenkins’ pre-commit phase, and uploading them to codecov.io for analysis, with integration back into GitHub using the service’s sister app.
> >
> > This would extend the pre-commit build time but provide valuable information for developers to revise and improve their tests before their PR is merged, rather than after, when it’s less likely developers will go back to improve their coverage numbers.
> > This particular 3rd-party service has a litany of awesome benefits:
> >
> > It’s free for open-source projects
> >
> > It seamlessly integrates into GitHub via a comment-bot (example here)
> >
> > It overlays coverage report information directly onto GitHub code using Sourcegraph
> >
> > It requires no changes to Jenkins, thereby reducing the risk of breaking the live test-infra
> >
> > It’s extensible and can later be used for the Java & Go SDKs if it proves to be awesome
> >
> > It has an extremely responsive support team that’s happy to help open-source projects
> >
> > A proof-of-concept can be seen here and here.
> >
> > Goals
> >
> > Provide coverage stats for the Python SDK that update with every pre-commit run
> >
> > Integrate these reports into GitHub so developers can take advantage of the information
> >
> > Open a discussion for how these coverage results can be utilized in code reviews
> >
> > Non-Goals
> >
> > Calculate coverage statistics using external tests located outside of the Python SDK
> >
> > This is ideal, but it would require not only merging multiple coverage reports together but, more importantly, waiting for these tests to be triggered in the first place. The main advantage of calculating coverage during pre-commit is that developers can revise their PRs before merging, which is not guaranteed if this is a goal.
> >
> > However, it could be something to explore in the future.
> >
> > Background
> >
> > Providing code coverage for the Python SDK has been a problem since at least 2017 (BEAM-2762), with the previous solution being to calculate coverage in post-commit with coverage.py and then send the report to coveralls.io, which would post to GitHub. At some point, this solution broke and the Tox environment used to compute coverage, cover, was turned off, but it still remains in the codebase.
> > There have been four main barriers in the past to re-implementing coverage; they will be addressed here.
> >
> > It’s difficult to unify coverage for some integration tests, especially ones that rely on 3rd-party dependencies like GCP, since it’s not possible to calculate coverage for the dependencies.
> >
> > As stated earlier, this is a non-goal for the proposal.
> >
> > The test reporter outputs results in the same directory, which sometimes causes previous results to be overwritten. This occurs when using different parameters for the same test (e.g. running a test with Dataflow vs. DirectRunner).
> >
> > This was mainly a post-commit problem, but it does require exploration since it could be an issue for pre-commit. However, even in the worst case, the coverage numbers would still be valuable: you can still see how coverage changed between commits, even if the absolute numbers are slightly inaccurate.
> >
> > It’s time-consuming and non-trivial to modify and test changes to Jenkins.
> >
> > We don’t need to - this proposal integrates directly with codecov.io, making Jenkins an irrelevant part of the testing infrastructure with regard to code coverage. It’s not just easier, it’s better, because it provides many benefits (mentioned in the Overview section) that a Jenkins-based solution like Cobertura doesn’t have. Lastly, using codecov.io would place less strain and maintenance on Jenkins (fewer jobs to run, no need to store the reports, etc.).
> >
> > coveralls.io isn’t the best to work with (email thread)
> >
> > codecov.io ;)
> >
> > Design Details
> >
> > To utilize codecov.io, the existing Tox environment cover needs to be added to the list of toxTasks that Gradle will run in pre-commit. To reduce strain (and more-or-less duplicate information), coverage only needs to be calculated for one Python version (say 3.7, with cloud dependencies) rather than all of them.
> > To be compatible with Jenkins and codecov.io, the Tox environment should be configured as follows:
> >
> > passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
> > deps =
> >     ...
> >     codecov
> > commands =
> >     ...
> >     codecov -t CODECOV_TOKEN
> >
> > There are two options for what goes into the ellipsis (...). First, we can use coverage.py to compute coverage (which is how it’s currently set up), but that uses the old nose test runner. Alternatively, we can switch to computing coverage with pytest and pytest-cov, which would give more accurate numbers and is the preferred solution.
> >
> > Additionally, the CODECOV_TOKEN, which can be retrieved here, should be added as an environment variable in Jenkins.
> >
> > Finally, a YAML file should be added so codecov.io can resolve file path errors. This can also be used to define success thresholds and more (for example, here), though, ideally, this coverage task should never fail a Jenkins build due to the risk of false-negatives.
> >
> > A proof-of-concept for this design can be seen here and here (this is code coverage for my fork).
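For anyone curious what the YAML piece of the proposal could look like, below is a minimal sketch of a codecov.yml. This is illustrative only, not part of the proposal: the `informational` flags and the path-fix pattern are assumptions about how one might keep the check from ever failing a build (matching the "should never fail a Jenkins build" goal above), and the real paths and thresholds would have to match the Beam repository layout.

```yaml
# Hypothetical codecov.yml sketch -- not the actual Beam configuration.
codecov:
  require_ci_to_pass: false   # don't gate the coverage report on other CI statuses

coverage:
  status:
    project:
      default:
        informational: true   # post the numbers, but never fail the PR check
    patch:
      default:
        informational: true

# "fixes" rewrites report paths so codecov.io can resolve files on GitHub,
# e.g. stripping a CI-specific prefix. The pattern here is a placeholder.
fixes:
  - "sdks/python/::"
```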
