GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/23117

    [WIP][SPARK-7721][INFRA] Run and generate test coverage report from Python via Jenkins

    ## What changes were proposed in this pull request?
    
    
    ### Background
    
    As the current status, the test script that generates coverage information was already merged
    into Spark: https://github.com/apache/spark/pull/20204
    
    So we can now generate the coverage report and site by running, for example:
    
    ```
    run-tests-with-coverage --python-executables=python3 --modules=pyspark-sql
    ```
    
    just like the `run-tests` script in `./python`.
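    As a rough local sketch (not part of this PR's changes), the flow would look like the following, assuming the script lives in `./python` and writes its HTML report under `python/test_coverage/htmlcov`, the same directory copied in the Jenkins steps further below:
    
    ```bash
    # Hedged sketch: run the SQL module tests with coverage from ./python
    # and locate the generated HTML report.
    cd ./python
    ./run-tests-with-coverage --python-executables=python3 --modules=pyspark-sql
    
    # The report is assumed to be written under python/test_coverage/htmlcov.
    ls test_coverage/htmlcov/index.html
    ```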
    
    
    ### Proposed change
    
    The next step is to host this coverage report via `github.io` automatically
    by Jenkins (see https://spark-test.github.io/pyspark-coverage-site/).
    
    This uses my testing account for Spark, @spark-test, which was shared with Felix and Shivaram a long time ago for testing purposes, including AppVeyor.
    
    To cut this short, this PR targets running the coverage in [spark-master-test-sbt-hadoop-2.7](https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/).
    
    In that specific job, it will clone the coverage site repository and replace its contents with the up-to-date PySpark test coverage from the latest commit, for instance as below:
    
    ```bash
    # Clone PySpark coverage site.
    git clone https://github.com/spark-test/pyspark-coverage-site.git
    
    # Copy generated coverage HTML.
    cp -r .../python/test_coverage/htmlcov/* pyspark-coverage-site/
    
    # Check out a temporary orphan branch.
    git checkout --orphan latest_branch
    
    # Add all the files.
    git add -A
    
    # Commit current test coverage results.
    git commit -am "Coverage report at latest commit in Apache Spark"
    
    # Delete the old branch.
    git branch -D gh-pages
    
    # Rename the temporary branch to gh-pages.
    git branch -m gh-pages
    
    # Finally, force-push to our repository.
    git push -f origin gh-pages
    ```
    
    This way, one single up-to-date coverage report can be shown on the `github.io` page. The commands above were manually tested.
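    
    Purely as an illustration (not the actual Jenkins configuration), these commands might be wrapped into one shell step, with @spark-test's password injected through the hidden `SPARK_TEST_KEY` from the TODO list below; the credential-in-URL form and the exact script shape here are assumptions:
    
    ```bash
    #!/usr/bin/env bash
    # Hypothetical Jenkins shell step. SPARK_TEST_KEY is assumed to be a hidden
    # variable holding @spark-test's password; the push URL form is illustrative.
    set -euo pipefail
    
    git clone https://github.com/spark-test/pyspark-coverage-site.git
    cp -r ./python/test_coverage/htmlcov/* pyspark-coverage-site/
    
    cd pyspark-coverage-site
    git checkout --orphan latest_branch
    git add -A
    git commit -am "Coverage report at latest commit in Apache Spark"
    git branch -D gh-pages
    git branch -m gh-pages
    git push -f "https://spark-test:${SPARK_TEST_KEY}@github.com/spark-test/pyspark-coverage-site.git" gh-pages
    ```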
    
    ### TODO:
    
    - [ ] Write a draft
    - [ ] Set a hidden `SPARK_TEST_KEY` for @spark-test's password in Jenkins via Jenkins's feature for hidden variables.
      This should be set in both `SparkPullRequestBuilder` (so that we, or I, can test) and `spark-master-test-sbt-hadoop-2.7`.
    - [ ] Make the PR builder's tests pass
    - [ ] Enable this build only at `spark-master-test-sbt-hadoop-2.7` right before getting this in.
    
    ## How was this patch tested?
    
    It will be tested via Jenkins.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-7721

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23117.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23117
    
----
commit d88d5aa73db636f8c73ace9f83f339781ea50531
Author: hyukjinkwon <gurwls223@...>
Date:   2018-11-22T08:08:20Z

    Run and generate test coverage report from Python via Jenkins

----


---
