This sounds useful to me, and since it's purely informational, it would be low cost to try out. My one question is how it would impact build runtimes -- do you have an estimate of the cost here?
On Sun, Jul 5, 2020 at 1:14 PM Saavan Nanavati <[email protected]> wrote:
>
> Hey everyone,
>
> Currently, during the Jenkins build process, we don't generate any code
> coverage reports for the Python SDK. This email includes a proposal to
> generate Python coverage reports during the pre-commit build, upload them
> to codecov.io for analysis and visualization, and automatically post the
> resulting stats back to GitHub PRs to help developers decide whether their
> tests need revision.
>
> You can view/comment on the proposal here, or read the full text of the
> proposal at the end of this email. Let me know what you think, or if there
> are any suggestions for improvements. Thanks!
>
> Python Coverage Reports For CI/CD
>
> Author: Saavan Nanavati ([email protected])
> Reviewer: Udi Meiri ([email protected])
>
> Overview
>
> This is a proposal for generating code coverage reports for the Python SDK
> during Jenkins' pre-commit phase and uploading them to codecov.io for
> analysis, with integration back into GitHub using the service's sister app.
>
> This would extend the pre-commit build time, but it would give developers
> valuable information for revising and improving their tests before their
> PR is merged, rather than after, when it's less likely they will go back
> to improve their coverage numbers.
>
> This third-party service has a long list of benefits:
>
> - It's free for open-source projects.
> - It integrates seamlessly into GitHub via a comment bot (example here).
> - It overlays coverage information directly onto GitHub code using
>   Sourcegraph.
> - It requires no changes to Jenkins, reducing the risk of breaking the
>   live test infra.
> - It's extensible and can later be used for the Java & Go SDKs if it
>   proves to be awesome.
> - It has an extremely responsive support team that's happy to help
>   open-source projects.
>
> A proof-of-concept can be seen here and here.
>
> Goals
>
> - Provide coverage stats for the Python SDK that update with every
>   pre-commit run.
> - Integrate these reports into GitHub so developers can take advantage of
>   the information.
> - Open a discussion on how these coverage results can be used in code
>   reviews.
>
> Non-Goals
>
> - Calculate coverage statistics using external tests located outside the
>   Python SDK.
>
> This would be ideal, but it would require not only merging multiple
> coverage reports together but, more importantly, waiting for those tests
> to be triggered in the first place. The main advantage of calculating
> coverage during pre-commit is that developers can revise their PRs before
> merging, which would not be guaranteed if this were a goal. However, it
> could be something to explore in the future.
>
> Background
>
> Providing code coverage for the Python SDK has been a problem since at
> least 2017 (BEAM-2762). The previous solution calculated coverage in
> post-commit with coverage.py and sent the report to coveralls.io, which
> posted the results to GitHub. At some point this solution broke; the Tox
> environment used to compute coverage, cover, was turned off but still
> remains in the codebase.
>
> In the past, there have been four main barriers to re-implementing
> coverage, each of which is addressed here:
>
> 1. It's difficult to unify coverage for some integration tests, especially
> ones that rely on third-party dependencies like GCP, since it's not
> possible to calculate coverage for the dependencies.
>
> As stated earlier, this is a non-goal for the proposal.
> 2. The test reporter writes results to the same directory, which sometimes
> causes previous results to be overwritten. This happens when the same test
> runs with different parameters (e.g. with Dataflow vs. DirectRunner).
>
> This was mainly a post-commit problem, but it does require exploration
> since it could be an issue for pre-commit too. Even in the worst case,
> though, the coverage numbers would still be valuable: you could still see
> how coverage changed between commits even if the absolute numbers were
> slightly inaccurate.
>
> 3. It's time-consuming and non-trivial to modify and test changes to
> Jenkins.
>
> We don't need to: this proposal integrates directly with codecov.io,
> making Jenkins irrelevant to the coverage pipeline. That's not just
> easier, it's better, because it provides many benefits (mentioned in the
> Overview section) that a Jenkins-based solution like Cobertura doesn't
> have. It also puts less strain and maintenance on Jenkins (fewer jobs to
> run, no need to store the reports, etc.).
>
> 4. coveralls.io isn't the best to work with (email thread).
>
> codecov.io ;)
>
> Design Details
>
> To utilize codecov.io, the existing Tox environment cover needs to be
> added to the list of toxTasks that Gradle runs in pre-commit. To reduce
> strain (and avoid more-or-less duplicate information), coverage only needs
> to be calculated for one Python version (say 3.7, with cloud dependencies)
> rather than all of them.
>
> To be compatible with Jenkins and codecov.io, the Tox environment should
> be configured as follows:
>
>     passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
>     deps =
>         ...
>         codecov
>     commands =
>         ...
>         codecov -t CODECOV_TOKEN
>
> There are two options for what goes into the ellipses (...). First, we can
> use coverage.py to compute coverage (which is how the environment is
> currently set up), but that uses the old nose test runner. Alternatively,
> we can switch to computing coverage with pytest and pytest-cov, which
> would give more accurate numbers and is the preferred solution.
>
> Additionally, the CODECOV_TOKEN, which can be retrieved here, should be
> added as an environment variable in Jenkins.
>
> Finally, a YAML file should be added so codecov.io can resolve file path
> errors. It can also be used to define success thresholds and more (for
> example, here), though ideally this coverage task should never fail a
> Jenkins build, due to the risk of false negatives.
>
> A proof-of-concept for this design can be seen here and here (this is code
> coverage for my fork).
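To make the Design Details concrete for anyone skimming: below is a rough
sketch of what the cover environment could look like if we go with the
pytest-cov option. The test target and report flags are my assumptions, not
something the proposal pins down, and the ellipses stand for the
environment's existing dependencies and commands:

    [testenv:cover]
    passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
    deps =
        ...
        pytest
        pytest-cov
        codecov
    commands =
        ...
        # Run the suite under coverage and emit an XML report for the uploader.
        pytest --cov=apache_beam --cov-report=xml
        # The uploader reads CODECOV_TOKEN from the environment (via passenv);
        # an explicit token would be: codecov -t {env:CODECOV_TOKEN}
        codecov

One small note on the snippet in the proposal: "codecov -t CODECOV_TOKEN"
would pass the literal string; tox needs the {env:CODECOV_TOKEN}
substitution, or no -t flag at all, since CODECOV_* is already in passenv.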
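On the Gradle side, I'd expect the wiring to be a couple of lines, assuming
the toxTask helper from Beam's BeamModulePlugin and an existing Python
pre-commit aggregation task (preCommitPy37 below is a placeholder name, as
is the new task name):

    // In sdks/python/test-suites/tox/py37/build.gradle (illustrative):
    // Register the tox 'cover' environment as a Gradle task.
    toxTask "testPy37Coverage", "cover"

    // Hook it into the Python 3.7 pre-commit aggregate so Jenkins runs it.
    preCommitPy37.dependsOn "testPy37Coverage"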
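And a minimal sketch of the codecov.yml mentioned at the end: a fixes entry
to map the paths in the coverage report (relative to sdks/python) back to
repository paths, plus informational statuses so the check can never fail a
build. The keys are standard codecov.io options, but the exact path mapping
is my guess at what Beam would need:

    # Map report paths (relative to sdks/python) onto repository paths.
    fixes:
      - "::sdks/python/"

    # Post coverage results without ever failing the build.
    coverage:
      status:
        project:
          default:
            informational: true
        patch:
          default:
            informational: true

With informational: true, the GitHub status always reports success, so a
coverage regression still shows up in the PR comment but can't block a
merge -- which matches the proposal's "never fail a Jenkins build" goal.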
