Re: Spark 3.0.1 Structured streaming - checkpoints fail

2020-12-23 Thread David Morin
workers need to read/write to the checkpoint. The distributed file system has to be immediately consistent: when one node writes to it, the other nodes should be able to read it immediately. The solutions/workarounds depend on wher

Re: Spark 3.0.1 Structured streaming - checkpoints fail

2020-12-23 Thread Jungtaek Lim
where you are hosting your Spark application. From: David Morin Date: Wednesday, December 23, 2020 at 11:08 AM To: "user@spark.apache.org" Subject: [EXTERNAL] Spark 3.0.

Re: Spark 3.0.1 Structured streaming - checkpoints fail

2020-12-23 Thread David Morin
nodes should be able to read it immediately. The solutions/workarounds depend on where you are hosting your Spark application. From: David Morin Date: Wednesday, December 23, 2020 at 11:08 AM To: "user@s

Re: Spark 3.0.1 Structured streaming - checkpoints fail

2020-12-23 Thread David Morin
nodes should be able to read it immediately. The solutions/workarounds depend on where you are hosting your Spark application. From: David Morin Date: Wednesday, December 23, 2020 at 11:08 AM To: "user@spark.apache.org" Subject:

Re: Spark 3.0.1 Structured streaming - checkpoints fail

2020-12-23 Thread Lalwani, Jayesh
on where you are hosting your Spark application. From: David Morin Date: Wednesday, December 23, 2020 at 11:08 AM To: "user@spark.apache.org" Subject: [EXTERNAL] Spark 3.0.1 Structured streaming - checkpoints fail

Spark 3.0.1 Structured streaming - checkpoints fail

2020-12-23 Thread David Morin
Hello, I have an issue with my PySpark job related to checkpointing. Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 16997.0 failed 4 times, most recent failure: Lost task 3.3 in stage 16997.0 (TID 206609, 10.XXX, executor 4):
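
The replies in this thread say the checkpoint file system has to be immediately consistent: when one node writes to it, the other nodes should be able to read it right away. As a rough illustration of that read-after-write property, here is a minimal sketch in plain Python. The function name `probe_read_after_write` and the `_probe_` marker file prefix are hypothetical, not part of Spark. It also assumes the checkpoint directory is reachable as an ordinary path on the machine running it, so it only checks visibility from a single node; a real check would write from one node and read from another through the cluster's own file system client.

```python
import contextlib
import os
import tempfile
import uuid


def probe_read_after_write(checkpoint_dir: str) -> bool:
    """Write a marker file, then check it is immediately listable and readable.

    Hypothetical helper for illustration only; it is not a Spark API.
    """
    os.makedirs(checkpoint_dir, exist_ok=True)
    name = f"_probe_{uuid.uuid4().hex}"   # unique name so probes never collide
    path = os.path.join(checkpoint_dir, name)
    payload = uuid.uuid4().hex            # random content to compare on read

    with open(path, "w", encoding="utf-8") as f:
        f.write(payload)

    try:
        # The file should show up in a directory listing right after the write...
        listed = name in os.listdir(checkpoint_dir)
        # ...and its content should be readable and identical.
        with open(path, encoding="utf-8") as f:
            read_back = f.read()
        return listed and read_back == payload
    except FileNotFoundError:
        # The write was not yet visible: the property does not hold here.
        return False
    finally:
        # Clean up the marker so the checkpoint directory is left untouched.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)


if __name__ == "__main__":
    # Demo against a throwaway local directory (an assumption for this sketch).
    with tempfile.TemporaryDirectory() as tmp:
        print(probe_read_after_write(os.path.join(tmp, "checkpoint")))
```

A passing probe on one machine does not prove the store is consistent across nodes; it only shows the shape of the check. As the replies note, the actual fix depends on where the Spark application is hosted.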