Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20386#discussion_r164908529
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceWriter.java ---
    @@ -63,32 +68,42 @@
       DataWriterFactory<Row> createWriterFactory();
     
       /**
    -   * Commits this writing job with a list of commit messages. The commit messages are collected from
    -   * successful data writers and are produced by {@link DataWriter#commit()}.
    +   * Handles a commit message which is collected from a successful data writer.
    +   *
    +   * Note that, implementations might need to cache all commit messages before calling
    +   * {@link #commit()} or {@link #abort()}.
    --- End diff --
    
    In what case would an implementation not cache and commit all at once? What
    is the point of a commit if not to make sure all of the data shows up at the
    same time?

