The JVM omits the stack trace when the same exception is thrown too many
times. This usually happens when multiple Spark tasks on the same executor
JVM throw the same exception. See
https://stackoverflow.com/a/3010106
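
If you want the executors to print the full trace again, here is a minimal
sketch (assuming you can restart the application; the app name below is
made up) that disables the JIT "fast throw" optimization via the executor
JVM options:

  import org.apache.spark.sql.SparkSession

  // Sketch only: -XX:-OmitStackTraceInFastThrow keeps the frames for
  // exceptions the JIT would otherwise rethrow without a stack trace.
  // Executor JVM options must be in place before the executors launch,
  // e.g. when the session for a new application is built; the driver's
  // own options usually have to go through spark-submit or spark-defaults.
  val spark = SparkSession.builder()
    .appName("long-running-query")  // hypothetical app name
    .config("spark.executor.extraJavaOptions", "-XX:-OmitStackTraceInFastThrow")
    .getOrCreate()

The same setting can also be passed with --conf on spark-submit or in
spark-defaults.conf.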

Best Regards,
Ryan


On Tue, Apr 28, 2020 at 10:45 PM lec ssmi <shicheng31...@gmail.com> wrote:

> It is probably a data quality problem on my side. What puzzles me is why
> the driver-side exception stack contains no specific exception information.
>
> Edgardo Szrajber <szraj...@yahoo.com> wrote on Tue, Apr 28, 2020 at 3:32 PM:
>
>> The exception occurred while aborting the stage. It might be interesting
>> to try to understand the reason for the abort.
>> Maybe a timeout? How long does the query run?
>> Bentzi
>>
>>
>> On Tue, Apr 28, 2020 at 9:25, Jungtaek Lim
>> <kabhwan.opensou...@gmail.com> wrote:
>> The root cause of the exception occurred on the executor side ("Lost task
>> 10.3 in stage 1.0 (TID 81, spark6, executor 1)"), so you may need to check there.
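>>
>> As a minimal sketch of how you could surface that on the executor (the
>> writer class and its body here are assumptions, not taken from your
>> query): catch the NPE inside the ForeachWriter, log the offending row to
>> the executor log, and rethrow.
>>
>>   import org.apache.spark.sql.{ForeachWriter, Row}
>>
>>   class LoggingForeachWriter extends ForeachWriter[Row] {
>>     override def open(partitionId: Long, epochId: Long): Boolean = true
>>
>>     override def process(row: Row): Unit = {
>>       try {
>>         // ... your real per-record sink logic goes here ...
>>       } catch {
>>         case e: NullPointerException =>
>>           // Written to the executor's stderr, so the failed task in the
>>           // Spark UI points at the record that triggered the NPE.
>>           System.err.println(s"NPE while processing row: $row")
>>           throw e
>>       }
>>     }
>>
>>     override def close(errorOrNull: Throwable): Unit = ()
>>   }
>>
>> You would plug it in with something like
>> df.writeStream.foreach(new LoggingForeachWriter).start().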
>>
>> On Tue, Apr 28, 2020 at 2:52 PM lec ssmi <shicheng31...@gmail.com> wrote:
>>
>> Hi:
>>   One of my long-running queries occasionally encountered the following
>> exception:
>>
>>
>>   Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 1.0 failed 4 times, most recent failure: Lost task 10.3 in stage 1.0 (TID 81, spark6, executor 1): java.lang.NullPointerException
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
>> at scala.Option.foreach(Option.scala:257)
>> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
>> at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:929)
>> at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:927)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>> at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
>> at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:927)
>> at org.apache.spark.sql.execution.streaming.ForeachSink.addBatch(ForeachSink.scala:49)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3$$anonfun$apply$16.apply(MicroBatchExecution.scala:475)
>> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3.apply(MicroBatchExecution.scala:473)
>> at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)
>> at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:472)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:133)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121)
>> at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)
>> at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:121)
>> at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
>> at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:117)
>> at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
>> ... 1 more
>>
>>
>>
>> According to the exception stack, it seems to have nothing to do with the
>> logic of my code. Is this a Spark bug or something? The Spark version is
>> 2.3.1.
>>
>> Best
>> Lec Ssmi
>>
>>
