Thanks.
My Spark applications run on nodes based on Docker images, but in
standalone mode (1 driver, n workers).
Can we use S3 directly with a consistency add-on such as S3Guard (s3a) or
AWS EMR Consistent View
<https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-consistent-view.html>?
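
To make it concrete, here is roughly the setup I have in mind (a rough
sketch only; the bucket, DynamoDB table, region and Kafka source are
placeholders, and the fs.s3a.* settings are the standard Hadoop S3A/S3Guard
ones as I understand them, not something we run today):

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("structured-streaming-s3-checkpoints")
    # S3A filesystem (hadoop-aws and the AWS SDK must be on the classpath)
    .config("spark.hadoop.fs.s3a.impl",
            "org.apache.hadoop.fs.s3a.S3AFileSystem")
    # S3Guard: back S3 listings with a DynamoDB metadata store for consistency
    .config("spark.hadoop.fs.s3a.metadatastore.impl",
            "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore")
    .config("spark.hadoop.fs.s3a.s3guard.ddb.table", "my-s3guard-table")  # placeholder
    .config("spark.hadoop.fs.s3a.s3guard.ddb.region", "eu-west-1")        # placeholder
    .getOrCreate()
)

events = (
    spark.readStream
    .format("kafka")  # placeholder source, just for the example
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://my-bucket/output/query6/")  # placeholder bucket
    # checkpoint on shared storage instead of a local file:/opt/spark/... path
    .option("checkpointLocation", "s3a://my-bucket/checkpoints/query6/")
    .start()
)

query.awaitTermination()

Only the checkpointLocation part is the piece I am sure we need; the
S3Guard wiring above is my assumption about how to get consistent listings.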

On Wed, Dec 23, 2020 at 5:48 PM, Lalwani, Jayesh <jlalw...@amazon.com>
wrote:

> Yes. It is necessary to have a distributed file system because all the
> workers need to read from and write to the checkpoint. The distributed
> file system has to be immediately consistent: when one node writes to it,
> the other nodes should be able to read it immediately.
>
> The solutions/workarounds depend on where you are hosting your Spark
> application.
>
>
>
> *From: *David Morin <morin.david....@gmail.com>
> *Date: *Wednesday, December 23, 2020 at 11:08 AM
> *To: *"user@spark.apache.org" <user@spark.apache.org>
> *Subject: *[EXTERNAL] Spark 3.0.1 Structured streaming - checkpoints fail
>
>
>
> Hello,
>
>
>
> I have an issue with my Pyspark job related to checkpoint.
>
>
>
> Caused by: org.apache.spark.SparkException: Job aborted due to stage
> failure: Task 3 in stage 16997.0 failed 4 times, most recent failure: Lost
> task 3.3 in stage 16997.0 (TID 206609, 10.XXX, executor 4):
> java.lang.IllegalStateException: Error reading delta file
> file:/opt/spark/workdir/query6/checkpointlocation/state/0/3/1.delta of
> HDFSStateStoreProvider[id = (op=0,part=3),dir =
> file:/opt/spark/workdir/query6/checkpointlocation/state/0/3]: 
> *file:/opt/spark/workdir/query6/checkpointlocation/state/0/3/1.delta
> does not exist*
>
>
>
> This job is based on Spark 3.0.1 and Structured Streaming.
>
> This Spark cluster (1 driver and 6 executors) runs without HDFS, and we
> don't want to manage an HDFS cluster if possible.
>
> Is it necessary to have a distributed filesystem? What are the different
> solutions/workarounds?
>
>
>
> Thanks in advance
>
> David
>
