GitHub user HeartSaVioR opened a pull request:

    https://github.com/apache/spark/pull/22952

    [SPARK-20568][SS] Rename files which are completed in previous batch

    ## What changes were proposed in this pull request?
    
    This patch adds an option to rename input files whose processing completed
    in the previous batch, so that end users can clean up processed files and
    reclaim storage.
    
    The option applies only to "micro-batch" execution. For batch queries, all
    input files must be kept so that repeated executions of the same query
    return the same result.
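
The idea described above can be illustrated with a minimal sketch. This is not the code in this patch and it does not use any Spark API. The function names `archive_completed_files` and `run_micro_batches`, and the single flat archive directory, are hypothetical and chosen for illustration only. The sketch assumes that the input files of batch N-1 are renamed when batch N starts, so the files of the most recent batch stay in place.

```python
import shutil
import tempfile
from pathlib import Path


def archive_completed_files(completed_files, archive_dir):
    """Move each file that was fully processed in the previous batch.

    Hypothetical helper: files are moved into one flat directory, so two
    inputs with the same base name would collide. A real implementation
    would need to preserve enough of the source path to avoid that.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in map(Path, completed_files):
        if not src.exists():
            continue  # already moved or deleted; nothing to do
        dest = archive_dir / src.name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


def run_micro_batches(batches, archive_dir):
    """Process batches in order, archiving batch N-1 inputs when batch N starts."""
    previous = []
    archived = []
    for batch in batches:
        archived += archive_completed_files(previous, archive_dir)
        for path in batch:
            Path(path).read_text()  # stand-in for the real processing step
        previous = batch
    return archived


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    first = root / "first.txt"
    second = root / "second.txt"
    first.write_text("1")
    second.write_text("2")
    moved = run_micro_batches([[first], [second]], root / "archive")
    print([p.name for p in moved])
```

A one-shot batch query has no "previous batch" in this sense, which matches the restriction to micro-batch execution: moving its inputs would change the result of running the same query again.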
    
    ## How was this patch tested?
    
    Added unit tests and tested manually on a local Mac. (The logic is very
    simple, so it is unclear whether manual verification against HDFS is needed.)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/spark SPARK-20568

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22952.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22952
    
----
commit 8a1d2e187c667833b2de8eb4cba2fa04dca9c6ff
Author: Jungtaek Lim <kabhwan@...>
Date:   2018-11-05T04:32:51Z

    SPARK-20568 Rename files which are completed in previous batch

----


---
