GitHub user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22952#discussion_r235632761
  
    --- Diff: docs/structured-streaming-programming-guide.md ---
    @@ -530,6 +530,12 @@ Here are the details of all the sources in Spark.
             "s3://a/dataset.txt"<br/>
             "s3n://a/b/dataset.txt"<br/>
             "s3a://a/b/c/dataset.txt"<br/>
    +        <code>cleanSource</code>: option to clean up completed files after processing.<br/>
    +        Available options are "archive", "delete", "no_op". If the option is not provided, the default value is "no_op".<br/>
    +        When "archive" is provided, additional option <code>sourceArchiveDir</code> must be provided as well. The value of "sourceArchiveDir" must be outside of source path, to ensure archived files are never included to new source files again.<br/>
    --- End diff --
    
    Yeah, I guess you're right. I'll add a check for this in the initialization of FileStreamSource.
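
    A minimal sketch of what such an initialization-time check might look like, assuming the check in question is the documented constraint that "sourceArchiveDir" must lie outside the source path. This is illustrative Python, not the code in the PR: the function name `validate_clean_source`, the helper `_normalize`, and the plain prefix comparison are hypothetical, and glob patterns in the source path are not handled. Only the option names and values come from the diff above.

```python
import posixpath
from urllib.parse import urlparse

# Option values taken from the documentation diff above.
VALID_MODES = ("archive", "delete", "no_op")


def _normalize(path):
    """Split a path or URI into (scheme, authority, normalized path)."""
    parsed = urlparse(path)
    # normpath collapses "." and ".." segments and drops a trailing slash.
    return parsed.scheme, parsed.netloc, posixpath.normpath(parsed.path or "/")


def validate_clean_source(options, source_path):
    """Hypothetical init-time check; raises ValueError on a bad configuration.

    Assumes source_path is a plain directory (no glob patterns).
    """
    mode = options.get("cleanSource", "no_op")
    if mode not in VALID_MODES:
        raise ValueError("cleanSource must be one of %s, got %r" % (VALID_MODES, mode))
    if mode != "archive":
        return

    archive_dir = options.get("sourceArchiveDir")
    if not archive_dir:
        raise ValueError('sourceArchiveDir is required when cleanSource is "archive"')

    src = _normalize(source_path)
    arc = _normalize(archive_dir)
    # Different scheme or authority means a different file system: no overlap.
    if src[:2] != arc[:2]:
        return

    # Compare with trailing slashes so "/data/in_old" is not mistaken
    # for a child of "/data/in".
    src_dir = src[2].rstrip("/") + "/"
    arc_dir = arc[2].rstrip("/") + "/"
    if arc_dir.startswith(src_dir):
        raise ValueError(
            "sourceArchiveDir %r must be outside of source path %r"
            % (archive_dir, source_path)
        )
```

    Failing at initialization rather than at archive time means a misconfigured query is rejected before any file is moved. A real check in Spark would presumably work on Hadoop paths and would need to decide how to treat glob patterns in the source path.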


