On Wed, Oct 22, 2014 at 2:17 PM, Ashwin Shankar <ashwinshanka...@gmail.com> wrote:
>> That's not something you might want to do usually. In general, a
>> SparkContext maps to a user application
>
> My question was basically this. In this page in the official doc, under
> "Scheduling within an application" section, it talks about multiuser and
> fair sharing within an app. How does multiuser within an application
> work(how users connect to an app,run their stuff) ? When would I want to use
> this ?

I see. The way I read that page, Spark supports all of those scheduling
options, but it doesn't give you a way to submit jobs from different users
to a running SparkContext hosted in a different process. For that, you'll
need something like the job server that I referenced before, or you'll have
to write your own framework to support it.

Personally, I'd use the information on that page when dealing with
concurrent jobs in the same SparkContext, all still belonging to the same
user. I'd avoid building any application where a single SparkContext is
shared by multiple users in any way.

>> As far as I understand, this will cause executors to be killed, which
>> means that Spark will start retrying tasks to rebuild the data that
>> was held by those executors when needed.
>
> I basically wanted to find out if there were any "gotchas" related to
> preemption on Spark. Things like say half of an application's executors got
> preempted say while doing reduceByKey, will the application progress with
> the remaining resources/fair share ?

Jobs should still make progress as long as at least one executor is
available. The gotcha would be the one I mentioned: Spark will fail your
job after "x" executors have failed, which might be a common occurrence
when preemption is enabled. That said, the limit is configurable, so you
can set "x" to a very large value and your job should keep chugging along.

The options you'd want to take a look at are: spark.task.maxFailures and
spark.yarn.max.executor.failures

--
Marcelo
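
For illustration, here is a rough sketch of one way the two options named above could be passed at submit time, by assembling a spark-submit command line with `--conf` flags. The helper name `spark_submit_args`, the application file `my_app.py`, and the values 100 and 1000 are all made up for the example; suitable limits depend on how often your cluster preempts executors, and the master and other settings are left out.

```python
# Hypothetical helper (not part of Spark): builds a spark-submit argument
# list that raises the two failure limits mentioned in this thread.
def spark_submit_args(app, task_max_failures, executor_max_failures):
    conf = {
        # Failures tolerated for a single task before the job is failed.
        "spark.task.maxFailures": task_max_failures,
        # Executor failures tolerated before the YARN application is failed.
        "spark.yarn.max.executor.failures": executor_max_failures,
    }
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", "{}={}".format(key, value)]
    args.append(app)  # application file goes last, after all options
    return args


if __name__ == "__main__":
    # Illustrative values only; "my_app.py" is a placeholder application.
    print(" ".join(spark_submit_args("my_app.py", 100, 1000)))
```

The same two keys could instead go into spark-defaults.conf or be set on a SparkConf before the context is created; the command-line form is used here only because it keeps the sketch self-contained.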