[ https://issues.apache.org/jira/browse/SPARK-38033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim reassigned SPARK-38033: ------------------------------------ Assignee: LeeeeLiu > The structured streaming processing cannot be started because the commitId > and offsetId are inconsistent > -------------------------------------------------------------------------------------------------------- > > Key: SPARK-38033 > URL: https://issues.apache.org/jira/browse/SPARK-38033 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming > Affects Versions: 2.4.6, 3.0.3 > Reporter: LeeeeLiu > Assignee: LeeeeLiu > Priority: Major > > Streaming Processing could not start due to an unexpected machine shutdown. > The exception is as follows > > {code:java} > ERROR 22/01/12 02:48:36 MicroBatchExecution: Query > streaming_4a026335eafd4bb498ee51752b49f7fb [id = > 647ba9e4-16d2-4972-9824-6f9179588806, runId = > 92385d5b-f31f-40d0-9ac7-bb7d9796d774] terminated with error > java.lang.IllegalStateException: batch 113258 doesn't exist > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$4.apply(MicroBatchExecution.scala:256) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$4.apply(MicroBatchExecution.scala:256) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$populateStartOffsets(MicroBatchExecution.scala:255) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:169) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:166) > at > org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:160) > at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:281) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193) > {code} > I checked checkpoint file on HDFS and found the latest offset is 113259. But > commits is 113257. The following > {code:java} > commits > /tmp/streaming_xxxxxxxx/commits/113253 > /tmp/streaming_xxxxxxxx/commits/113254 > /tmp/streaming_xxxxxxxx/commits/113255 > /tmp/streaming_xxxxxxxx/commits/113256 > /tmp/streaming_xxxxxxxx/commits/113257 > offset > /tmp/streaming_xxxxxxxx/offsets/113253 > /tmp/streaming_xxxxxxxx/offsets/113254 > /tmp/streaming_xxxxxxxx/offsets/113255 > /tmp/streaming_xxxxxxxx/offsets/113256 > /tmp/streaming_xxxxxxxx/offsets/113257 > /tmp/streaming_xxxxxxxx/offsets/113259{code} > Finally, I deleted offsets “/tmp/streaming_xxxxxxxx/offsets/113259” and the > program started normally. I think there is a problem here and we should try > to handle this exception or give some resolution in the log. > > -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org