Hi,

With only the stack trace there's nothing to look into. It'd be better to provide a simple reproducer (code and the problematic inputs) so that someone can give it a try.
You may also want to give it a try with 3.0.0. RC2 is better to test against, but preview2 would be easier for end users to test.

On Sat, May 23, 2020 at 8:14 PM, Srinivas V <srini....@gmail.com> wrote:

> Hello,
> I am listening to a Kafka topic through Spark Structured Streaming
> [2.4.5]. After processing messages for a few minutes, I am getting the
> NullPointerException below. I have three beans used here: 1. Event,
> 2. StateInfo, 3. SessionUpdateInfo. I suspect that the problem is with
> StateInfo: it might be failing when it writes state to HDFS, or it could
> be failing while I update accumulators. But why would it fail for some
> events and not for others? Once it fails, it stops the streaming query.
> When I send all fields null except EventId in my testing, it works fine.
> Any idea what could be happening?
> Attaching the full stack trace as well.
> This is a YARN cluster, saving state in HDFS.
>
> Exception:
>
> 20/05/23 09:46:46 ERROR org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask:
> Aborting commit for partition 42 (task 118121, attempt 9, stage 824.0)
> 20/05/23 09:46:46 ERROR org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask:
> Aborted commit for partition 42 (task 118121, attempt 9, stage 824.0)
> 20/05/23 09:46:46 ERROR org.apache.spark.executor.Executor: Exception in task
> 42.9 in stage 824.0 (TID 118121)
> java.lang.NullPointerException: Null value appeared in non-nullable field:
> top level input bean
> If the schema is inferred from a Scala tuple/case class, or a Java bean,
> please try to use scala.Option[_] or other nullable types (e.g.
> java.lang.Integer instead of int/scala.Int).
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.serializefromobject_doConsume_0$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
> at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
> at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:117)
> at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
> at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
> at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
> at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
> at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
> at org.apache.spark.scheduler.Task.run(Task.scala:123)
> at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 20/05/23 09:47:48 ERROR org.apache.spark.executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
>
> Regards
> Srini V
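P.S. The hint in the quoted exception message (use nullable wrapper types instead of primitives in the bean) can be sketched roughly like this. This is a minimal illustration, not the reporter's actual code; the class and field names below are assumptions:

```java
// Sketch of a state bean that only uses nullable field types, per the
// exception's suggestion. With Spark's bean encoder, a primitive field
// like `int` is mapped as non-nullable, so a null value arriving in that
// slot raises "Null value appeared in non-nullable field". Reference
// types and wrappers like java.lang.Integer can represent null.
public class StateInfo implements java.io.Serializable {
    private String eventId;   // reference type, nullable
    private Integer count;    // java.lang.Integer instead of int, so null is representable

    public String getEventId() { return eventId; }
    public void setEventId(String eventId) { this.eventId = eventId; }
    public Integer getCount() { return count; }
    public void setCount(Integer count) { this.count = count; }
}
```

If any of the three beans (or a nested field in them) uses a primitive where the data can actually be null, that would also explain why only some events fail while the all-null-except-EventId test passes.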