Re: Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Nirav Patel
Awesome! It looks promising. Thanks Rishabh and Marcelo. On Wed, Feb 3, 2016 at 12:09 PM, Rishabh Wadhawan wrote: > Check out this link > http://spark.apache.org/docs/latest/configuration.html and check > spark.shuffle.service. Thanks > > On Feb 3, 2016, at 1:02 PM,

Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Nirav Patel
Hi, I have a Spark job running in yarn-client mode. At some point during a join stage, an executor (container) runs out of memory and YARN kills it. Because of this the entire job restarts, and it keeps doing so on every failure. What is the best way to checkpoint? I see there's a checkpoint API and other
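
For context, a minimal sketch of the RDD checkpoint API the poster is asking about, against the Spark 1.5.x Scala API; the paths and the key-extraction logic are illustrative, not from the thread:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("join-job"))
    // On YARN the checkpoint dir must be on a fault-tolerant filesystem such as HDFS.
    sc.setCheckpointDir("hdfs:///tmp/checkpoints")

    val left  = sc.textFile("hdfs:///data/left").map(line => (line.split(",")(0), line))
    val right = sc.textFile("hdfs:///data/right").map(line => (line.split(",")(0), line))

    val joined = left.join(right)
    joined.checkpoint() // truncate the lineage so recovery does not recompute from the start
    joined.count()      // an action is needed to actually materialize the checkpoint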

Re: Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Marcelo Vanzin
Without the exact error from the driver that caused the job to restart, it's hard to tell. But a simple way to improve things is to install the Spark shuffle service on the YARN nodes, so that even if an executor crashes, its shuffle output is still available to other executors. On Wed, Feb 3,
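
A sketch of the NodeManager-side setup Marcelo describes, following the Spark 1.5 "Running on YARN" docs: put the spark-<version>-yarn-shuffle.jar on each NodeManager's classpath, register the auxiliary service in yarn-site.xml, and restart the NodeManagers.

    <!-- yarn-site.xml on every NodeManager -->
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>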

Re: Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Nirav Patel
Do you mean this setup? https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin wrote: > Without the exact error from the driver that caused the job to restart, > it's hard to tell. But a simple

Re: Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Marcelo Vanzin
Yes, but you don't necessarily need to use dynamic allocation (just enable the external shuffle service). On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel wrote: > Do you mean this setup? > > https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
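
On the application side this amounts to a single flag; a sketch of spark-defaults.conf (or the equivalent --conf on spark-submit), with dynamic allocation deliberately left off:

    spark.shuffle.service.enabled  true
    # spark.dynamicAllocation.enabled stays false (the default); the two settings are independent.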

Re: Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Rishabh Wadhawan
Hi Nirav, there is a difference between dynamic resource allocation and the shuffle service. With dynamic allocation, once you enable its configuration, Spark will determine the number of executors required to run your workload for you, which means decreasing the
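
For comparison, a sketch of the dynamic allocation settings from the Spark 1.5 configuration page; the executor bounds are illustrative values, and note that dynamic allocation in turn requires the external shuffle service:

    spark.dynamicAllocation.enabled        true
    spark.shuffle.service.enabled          true   # required by dynamic allocation
    spark.dynamicAllocation.minExecutors   2      # illustrative value
    spark.dynamicAllocation.maxExecutors   50     # illustrative value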

Re: Spark 1.5.2 Yarn Application Master - resiliency

2016-02-03 Thread Rishabh Wadhawan
Check out this link http://spark.apache.org/docs/latest/configuration.html and check spark.shuffle.service. Thanks > On Feb 3, 2016, at 1:02 PM, Marcelo Vanzin wrote: > > Yes, but you don't necessarily need to use