Hi,

While reviewing StreamExecution and how batches are displayed in the
web UI, I noticed that currentBatchId is -1 when StreamExecution is
created [1] and becomes 0 when no offsets are available [2].

That leads to my question about setting the job description for a
query using getBatchDescriptionString [3]. It branches on
currentBatchId and gives "init" when it's -1 [4], which never happens
given the above.

This in turn leads to the PR for SPARK-20464 "Add a job group and
description for streaming queries and fix cancellation of running jobs
using the job group", which sets the job description after
populateStartOffsets [5].

Shouldn't the job description be set before populateStartOffsets so
that getBatchDescriptionString has a chance of giving "init" and we
don't see two 0s?
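
To illustrate the ordering in question, here is a minimal sketch. It is a
simplified model and not the actual StreamExecution code: the object name,
the two descriptionSet* helpers, and the stand-in populateStartOffsets
(which only models the "no offsets available" case from [2]) are all
assumptions made for illustration.

```scala
// Simplified, hypothetical model of the ordering question.
// This is NOT the real StreamExecution implementation.
object BatchDescriptionSketch {

  // Models the branching described for getBatchDescriptionString:
  // a negative batch id is reported as "init", otherwise the id itself.
  def batchDescription(currentBatchId: Long): String =
    if (currentBatchId < 0) "init" else currentBatchId.toString

  // Hypothetical stand-in for populateStartOffsets in the case where
  // no offsets are available: the batch id moves from -1 to 0.
  def populateStartOffsets(currentBatchId: Long): Long = 0L

  // Ordering as described for the SPARK-20464 change: the description
  // is computed after populateStartOffsets has already run.
  def descriptionSetAfter(): String = {
    var currentBatchId = -1L
    currentBatchId = populateStartOffsets(currentBatchId)
    batchDescription(currentBatchId)
  }

  // Proposed ordering: the description is computed first, while the
  // batch id is still -1.
  def descriptionSetBefore(): String = {
    val currentBatchId = -1L
    val description = batchDescription(currentBatchId)
    populateStartOffsets(currentBatchId)
    description
  }

  def main(args: Array[String]): Unit = {
    println(descriptionSetAfter())   // prints "0"
    println(descriptionSetBefore())  // prints "init"
  }
}
```

In this model the "init" branch is reachable only when the description is
computed before the batch id is bumped to 0.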

Help appreciated.

[1] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L116
[2] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala?utf8=%E2%9C%93#L516
[3] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala?utf8=%E2%9C%93#L878-L883
[4] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala?utf8=%E2%9C%93#L879
[5] 
https://github.com/apache/spark/commit/6fc6cf88d871f5b05b0ad1a504e0d6213cf9d331#diff-6532dd3b63bdab0364fbcf2303e290e4R294

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
