Re: Is the Spark fair scheduler for Kubernetes?
Hi,

On Mon, Apr 11, 2022 at 7:43 AM Jason Jun wrote:
> The official doc, https://spark.apache.org/docs/latest/job-scheduling.html,
> doesn't mention whether it works on a Kubernetes cluster.

You could use the Volcano scheduler for more advanced setups on Kubernetes. Here is an article explaining how to make use of the new integration between Spark and Volcano in 3.3 (not yet released!): https://martin-grigorov.medium.com/native-integration-between-apache-spark-and-volcano-kubernetes-scheduler-488f54dbbab3

Regards,
Martin

> Can anyone quickly answer this?
>
> TIA.
> Jason
Is the Spark fair scheduler for Kubernetes?
The official doc, https://spark.apache.org/docs/latest/job-scheduling.html, doesn't mention whether it works on a Kubernetes cluster. Can anyone quickly answer this?

TIA.
Jason
Spark FAIR Scheduler vs FIFO Scheduler
Good morning,

I have a conceptual question. In an application I am working on, when I write some results to HDFS (*action 1*), I use ~30 executors out of 200. I would like to improve resource utilization in this case. I am aware that repartitioning the df to 200 before action 1 would produce 200 tasks and full executor utilization, but for several reasons that is not what I want to do.

What I would like to do is use the other ~170 executors to work on the actions (jobs) coming after action 1. Normally *action 2* starts after action 1 (FIFO), but here I want them to start at the same time, using the idle executors.

My question is: is this achievable with the FAIR scheduler approach and, if yes, how?

As I read it, the fair scheduler needs a pool of jobs and then schedules their tasks in a round-robin fashion. If I submit action 1 and action 2 at the same time (multi-threading) to a fair pool, which of the following happens?

1. At every moment, all (or almost all) executors are used in parallel (30 for action 1, the rest for action 2).
2. For a certain small amount of time X, 30 executors are used for action 1, then for another time X the other executors are used for action 2, then again X units of time for action 1, and so on.

Of the two, 1 would actually improve cluster utilization, while 2 would only allow both jobs to advance at the same time. Can someone who knows the FAIR scheduler help me understand how it works?

Thanks,
*Alessandro Liparoti*
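A rough way to reason about the two outcomes listed in the question is a toy slot allocator. The sketch below is only an illustration under simplifying assumptions: one task per executor slot, no locality, no pool weights or minShare, and both jobs already submitted from separate threads. The function name `simulate` and the task counts are hypothetical; this is not Spark's scheduler code. The Spark job scheduling documentation does say that under FIFO later jobs can start right away when the job at the head of the queue does not need the whole cluster, which is what the FIFO branch models.

```python
def simulate(total_slots, jobs, mode):
    """Assign free slots to jobs in one scheduling round.

    jobs is a list of (name, pending_task_count) in submission order.
    Returns a dict mapping job name to the number of tasks given a slot.
    """
    running = {name: 0 for name, _ in jobs}
    pending = {name: count for name, count in jobs}
    free = total_slots

    if mode == "FIFO":
        # Offer slots to jobs strictly in submission order; a later job
        # only gets what earlier jobs could not use.
        for name, _ in jobs:
            take = min(free, pending[name])
            running[name] += take
            pending[name] -= take
            free -= take
    elif mode == "FAIR":
        # Hand out one slot at a time to the job that currently runs the
        # fewest tasks and still has pending work (round-robin-like).
        while free > 0:
            candidates = [name for name, _ in jobs if pending[name] > 0]
            if not candidates:
                break
            name = min(candidates, key=lambda n: running[n])
            running[name] += 1
            pending[name] -= 1
            free -= 1
    else:
        raise ValueError(f"unknown mode: {mode}")
    return running


# Hypothetical numbers taken from the question: 200 slots, action 1 has 30 tasks.
small_first = [("action1", 30), ("action2", 500)]
# Hypothetical contrast case: action 1 could fill the whole cluster on its own.
large_first = [("action1", 1000), ("action2", 500)]

for label, jobs in (("small first job", small_first), ("large first job", large_first)):
    for mode in ("FIFO", "FAIR"):
        print(label, mode, simulate(200, jobs, mode))
```

In this model the two modes give the same split when action 1 has only 30 tasks (30 slots for action 1, 170 for action 2), which corresponds to outcome 1 in the question. The modes only differ when the first job could occupy every slot by itself.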
Re: Fair scheduler pool leak
If I understand what you're trying to do correctly, I think you really just want one pool, but you want to change the mode *within* the pool to be FAIR as well:
https://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties

You'd still need to change the conf file to set up that pool, but that should be fairly straightforward?

Another approach to what you're asking might be to expose the scheduler configuration as command line confs as well, which seems reasonable and simple.

On Sat, Apr 7, 2018 at 5:55 PM, Matthias Boehm <mboe...@gmail.com> wrote:
> well, the point was "in a programmatic way without the need for
> additional configuration files which is a hassle for a library" -
> anyway, I appreciate your comments.
>
> Regards,
> Matthias
Re: Fair scheduler pool leak
No, these pools are not created per job but per parfor worker and are thus used to execute many jobs. For all scripts with a single top-level parfor this is equivalent to static initialization. However, yes, we create these pools dynamically on demand to avoid unnecessary initialization and to handle scenarios of nested parfor.

At the end of the day, we just want to configure fair scheduling in a programmatic way without the need for additional configuration files which is a hassle for a library that is meant to work out-of-the-box. Simply setting 'spark.scheduler.mode' to FAIR does not do the trick because we end up with a single default fair scheduler pool in FIFO mode, which is equivalent to FIFO. Providing a way to set the mode of the default scheduler would be awesome.

Regarding why fair scheduling showed generally better performance for out-of-core datasets, I don't have a good answer. My guess was isolated job scheduling and better locality of in-memory partitions.

Regards,
Matthias

On Sat, Apr 7, 2018 at 8:50 AM, Mark Hamstra <m...@clearstorydata.com> wrote:
> Sorry, but I'm still not understanding this use case. Are you somehow
> creating additional scheduling pools dynamically as Jobs execute? If so,
> that is a very unusual thing to do. Scheduling pools are intended to be
> statically configured -- initialized, living and dying with the Application.
Re: Fair scheduler pool leak
Well, the point was "in a programmatic way without the need for additional configuration files which is a hassle for a library" - anyway, I appreciate your comments.

Regards,
Matthias

On Sat, Apr 7, 2018 at 3:43 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
>> Providing a way to set the mode of the default scheduler would be awesome.
>
> That's trivial: Just use the pool configuration XML file and define a pool
> named "default" with the characteristics that you want (including
> schedulingMode FAIR).
>
> You only get the default construction of the pool named "default" if you
> don't define your own "default".
Re: Fair scheduler pool leak
> Providing a way to set the mode of the default scheduler would be awesome.

That's trivial: Just use the pool configuration XML file and define a pool named "default" with the characteristics that you want (including schedulingMode FAIR).

You only get the default construction of the pool named "default" if you don't define your own "default".

On Sat, Apr 7, 2018 at 2:32 PM, Matthias Boehm <mboe...@gmail.com> wrote:
> No, these pools are not created per job but per parfor worker and are thus used to execute many jobs. For all scripts with a single top-level parfor this is equivalent to static initialization. However, yes, we create these pools dynamically on demand to avoid unnecessary initialization and to handle scenarios of nested parfor.
>
> At the end of the day, we just want to configure fair scheduling in a programmatic way without the need for additional configuration files which is a hassle for a library that is meant to work out-of-the-box. Simply setting 'spark.scheduler.mode' to FAIR does not do the trick because we end up with a single default fair scheduler pool in FIFO mode, which is equivalent to FIFO. Providing a way to set the mode of the default scheduler would be awesome.
>
> Regarding why fair scheduling showed generally better performance for out-of-core datasets, I don't have a good answer. My guess was isolated job scheduling and better locality of in-memory partitions.
>
> Regards,
> Matthias
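For a library that wants to avoid shipping a configuration file, one possible middle ground is to generate the pool file at startup and point Spark at it. The sketch below is an assumption-laden illustration, not something proposed in the thread: the helper name `write_allocation_file` is hypothetical, while the `<allocations>`/`<pool>` layout and the `spark.scheduler.allocation.file` property follow the Spark job scheduling documentation. It defines a pool named "default" with schedulingMode FAIR, as suggested in the message above.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_allocation_file(pools, directory=None):
    """Write a fair scheduler allocation file and return its path.

    pools maps a pool name to a dict with optional keys
    schedulingMode, weight and minShare.
    """
    root = ET.Element("allocations")
    for name, props in pools.items():
        pool = ET.SubElement(root, "pool", name=name)
        for key in ("schedulingMode", "weight", "minShare"):
            if key in props:
                ET.SubElement(pool, key).text = str(props[key])

    # mkstemp returns an open file descriptor; wrap it so it is closed.
    fd, path = tempfile.mkstemp(prefix="fairscheduler-", suffix=".xml", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        ET.ElementTree(root).write(handle, encoding="utf-8", xml_declaration=True)
    return path


# Override the pool named "default" so that jobs inside it are scheduled FAIR.
path = write_allocation_file(
    {"default": {"schedulingMode": "FAIR", "weight": 1, "minShare": 0}}
)

# Settings the application would pass to its Spark configuration before
# the context is created (shown as a plain dict to stay Spark-free).
conf = {
    "spark.scheduler.mode": "FAIR",
    "spark.scheduler.allocation.file": path,
}
print(conf)
```

The file has to exist before the SparkContext is created, and the caller is responsible for deleting the temporary file afterwards.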
Re: Fair scheduler pool leak
Sorry, but I'm still not understanding this use case. Are you somehow creating additional scheduling pools dynamically as Jobs execute? If so, that is a very unusual thing to do. Scheduling pools are intended to be statically configured -- initialized, living and dying with the Application.

On Sat, Apr 7, 2018 at 12:33 AM, Matthias Boehm <mboe...@gmail.com> wrote:
> Thanks for the clarification, Imran - that helped. I was mistakenly assuming that these pools are removed via weak references, as the ContextCleaner does for RDDs, broadcasts, accumulators, etc. For the time being, we'll just work around it, but I'll file a nice-to-have improvement JIRA. Also, you're right, we do indeed see these warnings, but they're usually hidden when running with the ERROR log level, or with INFO due to its overwhelming output.
>
> Just to give the context: We use these scheduler pools in SystemML's parallel for loop construct (parfor), which allows combining data- and task-parallel computation. If the data fits into the remote memory budget, the optimizer may decide to execute the entire loop as a single Spark job (with groups of iterations mapped to Spark tasks). If the data is too large and non-partitionable, the parfor loop is executed as a multi-threaded operator in the driver and each worker might spawn several data-parallel Spark jobs in the context of the worker's scheduler pool, for operations that don't fit into the driver.
>
> We decided to use these fair scheduler pools (w/ fair scheduling across pools, FIFO per pool) instead of the default FIFO scheduler because it gave us better and more robust performance back in the Spark 1.x line. This was especially true for concurrent jobs over shared input data (e.g., for hyperparameter tuning) and when the data size exceeded aggregate memory. The only downside was that we had to guard against scenarios where concurrent jobs would lazily pull a shared RDD into cache because that led to thread contention on the executors' block managers and spurious replicated in-memory partitions.
>
> Regards,
> Matthias
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
Re: Fair scheduler pool leak
Thanks for the clarification, Imran - that helped. I was mistakenly assuming that these pools are removed via weak references, as the ContextCleaner does for RDDs, broadcasts, accumulators, etc. For the time being, we'll just work around it, but I'll file a nice-to-have improvement JIRA. Also, you're right, we do indeed see these warnings, but they're usually hidden when running with the ERROR log level, or with INFO due to its overwhelming output.

Just to give the context: We use these scheduler pools in SystemML's parallel for loop construct (parfor), which allows combining data- and task-parallel computation. If the data fits into the remote memory budget, the optimizer may decide to execute the entire loop as a single Spark job (with groups of iterations mapped to Spark tasks). If the data is too large and non-partitionable, the parfor loop is executed as a multi-threaded operator in the driver and each worker might spawn several data-parallel Spark jobs in the context of the worker's scheduler pool, for operations that don't fit into the driver.

We decided to use these fair scheduler pools (w/ fair scheduling across pools, FIFO per pool) instead of the default FIFO scheduler because it gave us better and more robust performance back in the Spark 1.x line. This was especially true for concurrent jobs over shared input data (e.g., for hyperparameter tuning) and when the data size exceeded aggregate memory. The only downside was that we had to guard against scenarios where concurrent jobs would lazily pull a shared RDD into cache because that led to thread contention on the executors' block managers and spurious replicated in-memory partitions.

Regards,
Matthias

On Fri, Apr 6, 2018 at 8:08 AM, Imran Rashid <iras...@cloudera.com> wrote:
> Hi Matthias,
>
> This doesn't look possible now. It may be worth filing an improvement JIRA for it.
>
> But I'm trying to understand what you're trying to do a little better. So you intentionally have each thread create a new unique pool when it submits a job? So that pool will just get the default pool configuration, and you will see lots of these messages in your logs?
>
> https://github.com/apache/spark/blob/6ade5cbb498f6c6ea38779b97f2325d5cf5013f2/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala#L196-L200
>
> What is the use case for creating pools this way?
>
> Also, if I understand correctly, it doesn't even matter if the thread dies -- that pool will still stay around, as the rootPool will retain a reference to it (the pools aren't really tied to specific threads).
>
> Imran
Re: Fair scheduler pool leak
Hi Matthias,

This doesn't look possible now. It may be worth filing an improvement JIRA for it.

But I'm trying to understand what you're trying to do a little better. So you intentionally have each thread create a new unique pool when it submits a job? So that pool will just get the default pool configuration, and you will see lots of these messages in your logs?

https://github.com/apache/spark/blob/6ade5cbb498f6c6ea38779b97f2325d5cf5013f2/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala#L196-L200

What is the use case for creating pools this way?

Also, if I understand correctly, it doesn't even matter if the thread dies -- that pool will still stay around, as the rootPool will retain a reference to it (the pools aren't really tied to specific threads).

Imran

On Thu, Apr 5, 2018 at 9:46 PM, Matthias Boehm <mboe...@gmail.com> wrote:
> Hi all,
>
> For concurrent Spark jobs spawned from the driver, we use Spark's fair scheduler pools, which are set and unset in a thread-local manner by each worker thread. Typically (for rather long jobs), this works very well. Unfortunately, in an application with lots of very short parallel sections, we see 1000s of these pools remaining in the Spark UI, which indicates some kind of leak. Each worker cleans up its local property by setting it to null, but not all pools are properly removed. I've checked and reproduced this behavior with Spark 2.1-2.3.
>
> Now my question: Is there a way to explicitly remove these pools, either globally, or locally while the thread is still alive?
>
> Regards,
> Matthias
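The retention behavior described above (a pool is created when a job first names it, and the root pool keeps a reference regardless of what the submitting thread does afterwards) can be illustrated with a small toy model. Everything in the sketch is hypothetical: `ToyRootPool`, `ToyContext` and `run_sections` are stand-ins written for this note and are not Spark classes. Reusing a small fixed set of pool names, shown in the second call, is one possible workaround under that model and is not necessarily the one used in the thread.

```python
import threading


class ToyRootPool:
    """Toy stand-in for a root pool that only ever adds child pools."""

    def __init__(self):
        self._pools = {}
        self._lock = threading.Lock()

    def add_job(self, pool_name, job_id):
        # The pool is created on first use and then retained; nothing in
        # this model ever removes it.
        with self._lock:
            self._pools.setdefault(pool_name, []).append(job_id)

    def pool_count(self):
        with self._lock:
            return len(self._pools)


class ToyContext:
    """Toy context with a thread-local 'current pool' property."""

    def __init__(self):
        self.root_pool = ToyRootPool()
        self._local = threading.local()

    def set_local_pool(self, name):
        # Setting the property to None only clears this thread's setting.
        self._local.pool = name

    def submit_job(self, job_id):
        name = getattr(self._local, "pool", None) or "default"
        self.root_pool.add_job(name, job_id)


def run_sections(ctx, sections, pool_name_for):
    """Run short parallel sections, each in its own worker thread."""

    def worker(index):
        ctx.set_local_pool(pool_name_for(index))
        ctx.submit_job(index)
        ctx.set_local_pool(None)  # thread-local cleanup; the pool stays

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(sections)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return ctx.root_pool.pool_count()


# A unique pool name per section: the number of retained pools grows with it.
leaky = run_sections(ToyContext(), 1000, lambda i: f"parfor-{i}")
# A fixed set of 8 names: the number of retained pools stays bounded.
bounded = run_sections(ToyContext(), 1000, lambda i: f"parfor-{i % 8}")
print(leaky, bounded)
```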
Fair scheduler pool leak
Hi all,

For concurrent Spark jobs spawned from the driver, we use Spark's fair scheduler pools, which are set and unset in a thread-local manner by each worker thread. Typically (for rather long jobs), this works very well. Unfortunately, in an application with lots of very short parallel sections, we see 1000s of these pools remaining in the Spark UI, which indicates some kind of leak. Each worker cleans up its local property by setting it to null, but not all pools are properly removed. I've checked and reproduced this behavior with Spark 2.1-2.3.

Now my question: Is there a way to explicitly remove these pools, either globally, or locally while the thread is still alive?

Regards,
Matthias
Re: fair scheduler
@Crystal You can use Spark on YARN. YARN has a fair scheduler, which you enable by modifying yarn-site.xml.

Sent from my iPad

On August 11, 2014, at 6:49, Matei Zaharia matei.zaha...@gmail.com wrote:

> Hi Crystal,
>
> The fair scheduler is only for jobs running concurrently within the same SparkContext (i.e. within an application), not for separate applications on the standalone cluster manager. It has no effect there. To run more of those concurrently, you need to set a cap on how many cores they each grab with spark.cores.max.
>
> Matei
>
> On August 10, 2014 at 12:13:08 PM, 李宜芳 (xuite...@gmail.com) wrote:
>
>> Hi,
>>
>> I am trying to switch from FIFO to FAIR in standalone mode.
>>
>> My environment:
>> hadoop 1.2.1
>> spark 0.8.0
>> using standalone mode
>>
>> I modified the code:
>>
>> ClusterScheduler.scala - System.getProperty("spark.scheduler.mode", "FAIR")
>> SchedulerBuilder.scala - val DEFAULT_SCHEDULING_MODE = SchedulingMode.FAIR
>> LocalScheduler.scala - System.getProperty("spark.scheduler.mode", "FAIR")
>> spark-env.sh - export SPARK_JAVA_OPTS="-Dspark.scheduler.mode=FAIR"
>>
>> export SPARK_JAVA_OPTS=" -Dspark.scheduler.mode=FAIR"
>> ./run-example org.apache.spark.examples.SparkPi spark://streaming1:7077
>>
>> But it does not work. How can I switch from FIFO to FAIR?
>>
>> Regards,
>> Crystal Lee

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
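To make the spark.cores.max advice concrete, here is a minimal sketch of how a per-application cap could be derived. The even split, the helper names `cores_cap_per_app` and `submit_args`, and the example numbers are assumptions made for illustration; only the `spark.cores.max` property itself comes from the reply above.

```python
def cores_cap_per_app(total_cores, concurrent_apps):
    """Split the cluster's cores evenly across applications (rounding down)."""
    if concurrent_apps < 1:
        raise ValueError("concurrent_apps must be at least 1")
    cap = total_cores // concurrent_apps
    if cap < 1:
        raise ValueError("not enough cores for that many applications")
    return cap


def submit_args(total_cores, concurrent_apps):
    """Build the configuration arguments that carry the cap."""
    cap = cores_cap_per_app(total_cores, concurrent_apps)
    return ["--conf", f"spark.cores.max={cap}"]


# Hypothetical standalone cluster: 32 cores shared by 4 applications.
print(" ".join(submit_args(32, 4)))
```

With a cap like this each application leaves cores free for the others instead of grabbing everything the standalone cluster offers.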
fair scheduler
Hi,

I am trying to switch from FIFO to FAIR in standalone mode.

My environment:
hadoop 1.2.1
spark 0.8.0
using standalone mode

I modified the code:

ClusterScheduler.scala - System.getProperty("spark.scheduler.mode", "FAIR")
SchedulerBuilder.scala - val DEFAULT_SCHEDULING_MODE = SchedulingMode.FAIR
LocalScheduler.scala - System.getProperty("spark.scheduler.mode", "FAIR")
spark-env.sh - export SPARK_JAVA_OPTS="-Dspark.scheduler.mode=FAIR"

export SPARK_JAVA_OPTS=" -Dspark.scheduler.mode=FAIR"
./run-example org.apache.spark.examples.SparkPi spark://streaming1:7077

But it does not work. How can I switch from FIFO to FAIR?

Regards,
Crystal Lee