spark git commit: [SPARK-19499][SS] Add more notes in the comments of Sink.addBatch()

zsxwing Tue, 07 Feb 2017 20:25:53 -0800

Repository: spark
Updated Branches:
  refs/heads/branch-2.1 e642a07d5 -> 706d6c154



[SPARK-19499][SS] Add more notes in the comments of Sink.addBatch()

## What changes were proposed in this pull request?

addBatch method in Sink trait is supposed to be a synchronous method to 
coordinate with the fault-tolerance design in StreamingExecution (being 
different with the compute() method in DStream)

We need to add more notes in the comments of this method to remind the 
developers

## How was this patch tested?

existing tests

Author: CodingCat <zhunans...@gmail.com>

Closes #16840 from CodingCat/SPARK-19499.

(cherry picked from commit d4cd975718716be11a42ce92a47c45be1a46bd60)
Signed-off-by: Shixiong Zhu <shixi...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/706d6c15
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/706d6c15
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/706d6c15

Branch: refs/heads/branch-2.1
Commit: 706d6c154d2471c00253bf9b0c4e867752f841fe
Parents: e642a07
Author: CodingCat <zhunans...@gmail.com>
Authored: Tue Feb 7 20:25:18 2017 -0800
Committer: Shixiong Zhu <shixi...@databricks.com>
Committed: Tue Feb 7 20:25:25 2017 -0800

----------------------------------------------------------------------
 .../scala/org/apache/spark/sql/execution/streaming/Sink.scala   | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/706d6c15/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala
----------------------------------------------------------------------
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala
index 2571b59..d10cd30 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala
@@ -31,8 +31,11 @@ trait Sink {
    * this method is called more than once with the same batchId (which will 
happen in the case of
    * failures), then `data` should only be added once.
    *
-   * Note: You cannot apply any operators on `data` except consuming it (e.g., 
`collect/foreach`).
+   * Note 1: You cannot apply any operators on `data` except consuming it 
(e.g., `collect/foreach`).
    * Otherwise, you may get a wrong result.
+   *
+   * Note 2: The method is supposed to be executed synchronously, i.e. the 
method should only return
+   * after data is consumed by sink successfully.
    */
   def addBatch(batchId: Long, data: DataFrame): Unit
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-19499][SS] Add more notes in the comments of Sink.addBatch()

Reply via email to