GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23117
[WIP][SPARK-7721][INFRA] Run and generate test coverage report from Python via Jenkins

## What changes were proposed in this pull request?

### Background

The test script that generates coverage information was merged into Spark in https://github.com/apache/spark/pull/20204, so we can now generate the coverage report and site, for example, by:

```
run-tests-with-coverage --python-executables=python3 --modules=pyspark-sql
```

which works like the `run-tests` script in `./python`.

### Proposed change

The next step is to host this coverage report via `github.io` automatically from Jenkins (see https://spark-test.github.io/pyspark-coverage-site/). This uses my testing account for Spark, @spark-test, which was shared with Felix and Shivaram a long time ago for testing purposes, including AppVeyor.

To cut this short, this PR targets running the coverage in [spark-master-test-sbt-hadoop-2.7](https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/). That job will clone the site repository and replace its contents with the up-to-date PySpark test coverage from the latest commit, for instance as below:

```bash
# Clone the PySpark coverage site.
git clone https://github.com/spark-test/pyspark-coverage-site.git

# Copy the generated coverage HTML.
cp -r .../python/test_coverage/htmlcov/* pyspark-coverage-site/

# Check out a temporary orphan branch.
git checkout --orphan latest_branch

# Add all the files.
git add -A

# Commit the current test coverage results.
git commit -am "Coverage report at latest commit in Apache Spark"

# Delete the old branch.
git branch -D gh-pages

# Rename the temporary branch to gh-pages.
git branch -m gh-pages

# Finally, force-push to our repository.
git push -f origin gh-pages
```

This way, a single, up-to-date coverage report is shown on the `github.io` page. The commands above were manually tested.

### TODO:

- [ ] Write a draft
- [ ] Set a hidden `SPARK_TEST_KEY` for @spark-test's password in Jenkins via Jenkins's credentials feature.
  This should be set both at `SparkPullRequestBuilder`, so that we (or I) can test, and at `spark-master-test-sbt-hadoop-2.7`.
- [ ] Make the PR builder's tests pass
- [ ] Enable this build only at spark-master-test-sbt-hadoop-2.7 right before merging this in.

## How was this patch tested?

It will be tested via Jenkins.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-7721

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23117.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #23117

----

commit d88d5aa73db636f8c73ace9f83f339781ea50531
Author: hyukjinkwon <gurwls223@...>
Date: 2018-11-22T08:08:20Z

    Run and generate test coverage report from Python via Jenkins

----

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
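As a postscript: the manual git sequence in the PR description could be consolidated into a single step for the Jenkins job. A minimal sketch follows; the function name `publish_coverage_site`, its arguments, and the committer identity are hypothetical illustrations, not part of the PR:

```bash
#!/usr/bin/env bash
# Sketch of the publishing step as one function. The git sequence
# mirrors the commands in the PR description: clone the site repo,
# copy in the fresh coverage HTML, rebuild history as a single commit
# on an orphan branch, and force-push it as gh-pages.
set -euo pipefail

publish_coverage_site() {
  local site_repo_url=$1   # coverage site repository URL
  local htmlcov_dir=$2     # absolute path to the generated coverage HTML
  local workdir
  workdir=$(mktemp -d)

  # Clone the PySpark coverage site.
  git clone "$site_repo_url" "$workdir/pyspark-coverage-site"
  cd "$workdir/pyspark-coverage-site"

  # Copy the generated coverage HTML over the old site contents.
  cp -r "$htmlcov_dir"/. .

  # Replace history: orphan branch, single commit, force-push as gh-pages
  # so the site always holds exactly one up-to-date commit.
  git checkout --orphan latest_branch
  git add -A
  git -c user.name=jenkins -c user.email=jenkins@invalid \
    commit -m "Coverage report at latest commit in Apache Spark"
  git branch -D gh-pages
  git branch -m gh-pages
  git push -f origin gh-pages
}
```

Because the old branch is deleted and the orphan branch pushed with `-f`, the site repository never accumulates history from previous coverage runs.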