GitHub user guoxiaolongzte opened a pull request:

    https://github.com/apache/spark/pull/21036

    [SPARK-23958][CORE] HadoopRDD filters empty files to avoid generating empty 
tasks that degrade Spark computing performance.

    ## What changes were proposed in this pull request?
    
    HadoopRDD filters out empty files so that it does not generate empty tasks, 
which degrade Spark computing performance.
    
    A file is considered empty when its length is zero.
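
The idea can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `SplitInfo` and `EmptySplitFilter` are hypothetical names standing in for Hadoop input splits and the filtering step, and the only assumption carried over from the description is that an empty file has length zero.

```scala
// Hypothetical stand-in for a Hadoop input split: only the path and byte length matter here.
final case class SplitInfo(path: String, length: Long)

object EmptySplitFilter {
  // Keep only splits that contain data, so no task is scheduled for a zero-length file.
  def nonEmpty(splits: Seq[SplitInfo]): Seq[SplitInfo] =
    splits.filter(_.length > 0)

  def main(args: Array[String]): Unit = {
    val splits = Seq(
      SplitInfo("part-00000", 0L),
      SplitInfo("part-00001", 1024L),
      SplitInfo("part-00002", 0L)
    )
    // Each remaining split would become one partition, and therefore one task.
    nonEmpty(splits).foreach(s => println(s"${s.path} (${s.length} bytes)"))
  }
}
```

In this sketch three input files yield a single task instead of three, because the two zero-length entries are dropped before partitions are created.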
    
    ## How was this patch tested?
    
    manual tests
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/guoxiaolongzte/spark SPARK-23958

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21036.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21036
    
----
commit e4ccdf913157b45f11efe8b8900d1f805d853278
Author: guoxiaolong <guo.xiaolong1@...>
Date:   2018-04-11T02:48:51Z

    [SPARK-23958][CORE] HadoopRDD filters empty files to avoid generating empty 
tasks that degrade Spark computing performance.

----

