To: user@spark.apache.org
Subject: Re: Huge partitioning job takes longer to close after all tasks finished
Hi,
you are definitely not using Spark 2.1 the way it should be used.
Try using sessions, and follow their guidelines; this issue was
specifically resolved as part of the Spark 2.1 release.
Regards,
Gourav
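For reference, the "sessions" Gourav mentions are the SparkSession entry point introduced in Spark 2.0. A minimal sketch of setting one up (the app name is a placeholder, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

// Unified entry point introduced in Spark 2.0, replacing the separate
// SparkContext / SQLContext / HiveContext objects.
val spark = SparkSession.builder()
  .appName("PartitionedWriteJob")   // placeholder application name
  .getOrCreate()

// The underlying SparkContext is still reachable when lower-level
// APIs (e.g. RDD checkpointing) are needed:
val sc = spark.sparkContext
```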
On Wed, Mar 8, 2017 at 8:00 PM, Swapnil Shinde
wrote:
Thank you, liu. Can you please explain what you mean by enabling Spark's
fault-tolerance mechanism?
I observed that after all tasks finish, Spark works on concatenating the
same partitions from all tasks on the file system, e.g.:
task1 - partition1, partition2, partition3
task2 - partition1,
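The per-task partition files Swapnil describes typically come from a partitioned DataFrame write like the sketch below (`df`, the partition column, and the output path are placeholders, not from the thread). Each task writes its own part-file under every partition directory it touches; after all tasks finish, the output committer promotes the task-attempt files into the final directory, which is the single-driver step observed at the end of the job:

```scala
// Each task writes its own part-file under every partition directory
// it produces, e.g. country=US/part-00000, country=US/part-00001, ...
df.write
  .partitionBy("country")        // placeholder partition column
  .parquet("hdfs:///tmp/out")    // placeholder output path

// After the tasks complete, the commit phase renames/moves the
// task-attempt output into place -- the work seen "after all tasks
// finished" in the subject line.
```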
Did you enable Spark's fault-tolerance mechanism? If an RDD is
checkpointed, then at the end of the job Spark starts a separate job
that writes the checkpoint data to the file system, so it is highly
available before being persisted.
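Liu's point refers to reliable RDD checkpointing, which does launch a separate job to write data out at the end. A minimal sketch (the checkpoint directory and input path are placeholders):

```scala
// Reliable checkpointing: the RDD is written to a fault-tolerant file
// system so its lineage can be truncated after a failure.
sc.setCheckpointDir("hdfs:///tmp/checkpoints")  // placeholder path

val rdd = sc.textFile("hdfs:///data/input")     // placeholder input
rdd.checkpoint()  // marks the RDD; the actual write happens in a
                  // separate job, triggered after the first action
rdd.count()       // action that triggers both the compute and the
                  // follow-up checkpointing job
```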
2017-03-08 2:45 GMT+08:00 Swapnil Shinde :
> Hello all
> I have