Hi,
On Mon, Apr 11, 2022 at 7:43 AM Jason Jun wrote:
> The official doc, https://spark.apache.org/docs/latest/job-scheduling.html,
> doesn't mention whether it works for Kubernetes clusters?
>
You could use the Volcano scheduler for more advanced setups on Kubernetes.
Here is an article explaining
The official doc, https://spark.apache.org/docs/latest/job-scheduling.html,
doesn't mention whether it works for Kubernetes clusters?
Can anyone quickly answer this?
TIA.
Jason
at the same
time, using the idle executors.
My question is: is this achievable with the FAIR scheduler approach,
and if so, how?
As I understand it, the fair scheduler needs a pool of jobs and then schedules
their tasks in a round-robin fashion. If I submit action 1 and action 2 at
the same time
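Concurrency is the key precondition here: within one SparkContext, jobs only run concurrently if they are submitted from separate driver threads, whatever the scheduler mode. A minimal sketch of submitting action 1 and action 2 at the same time under FAIR scheduling (the app name, pool names, and toy RDDs are illustrative assumptions; pools not declared in fairscheduler.xml are created on the fly with default parameters):

```scala
import org.apache.spark.sql.SparkSession

object ConcurrentActionsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("fair-scheduling-sketch") // illustrative name
      .master("local[4]")
      .config("spark.scheduler.mode", "FAIR") // enable FAIR scheduling
      .getOrCreate()
    val sc = spark.sparkContext

    // Each action runs in its own driver thread, so the scheduler sees
    // two concurrent jobs instead of a FIFO sequence from one thread.
    val t1 = new Thread(() => {
      sc.setLocalProperty("spark.scheduler.pool", "pool1") // thread-local
      println(sc.parallelize(1 to 1000000).sum())          // action 1
    })
    val t2 = new Thread(() => {
      sc.setLocalProperty("spark.scheduler.pool", "pool2")
      println(sc.parallelize(1 to 1000000).count())        // action 2
    })
    t1.start(); t2.start()
    t1.join(); t2.join()
    spark.stop()
  }
}
```

With the mode left at FIFO, the second job would only receive whatever executors the first job leaves idle; under FAIR, tasks from both pools are interleaved.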
> >> dynamically on demand to avoid unnecessary
> >> initialization and handle scenarios of nested parfor.
> >>
> >> At the end of the day, we just want to configure fair scheduling in a
> >> programmatic way without the need for additional configuration files
> >> which is a hassle for a library that is meant to work out-of-the-box.
the trick,
because we end up with a single default fair scheduler pool in FIFO
mode, which is equivalent to plain FIFO scheduling. Providing a way to set
the mode of the default scheduler would be awesome.
Regarding why fair scheduling showed generally better performance for
out-of-core datasets, I don't have a good answer
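For what it's worth, the default pool's internal mode can be overridden from the allocation file itself by declaring a pool named "default", since setting spark.scheduler.mode=FAIR alone only affects scheduling across pools. A sketch (the second pool's name, weight, and minShare values are illustrative):

```xml
<?xml version="1.0"?>
<!-- fairscheduler.xml: point spark.scheduler.allocation.file at this file. -->
<allocations>
  <!-- Overriding the pool named "default" changes the pool that jobs
       without an explicit spark.scheduler.pool end up in. -->
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <!-- Illustrative extra pool for higher-priority jobs. -->
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```

This still requires shipping a file, which is exactly the hassle described above, but it avoids patching Spark itself.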
> > What is the use case for creating pools this way?
> >
> > Also if I understand correctly, it doesn't even matter if the thread
> dies --
> > that pool will still stay around, as the rootPool will retain a
> reference to
> > it (the pools aren't really tied
in the driver, and each worker
might spawn several data-parallel Spark jobs in the context of the
worker's scheduler pool, for operations that don't fit into the
driver.
We decided to use these fair scheduler pools (with fair scheduling
across pools, FIFO per pool) instead of the default FIFO scheduler
because
Hi all,
for concurrent Spark jobs spawned from the driver, we use Spark's fair
scheduler pools, which are set and unset in a thread-local manner by
each worker thread. Typically (for rather long jobs), this works very
well. Unfortunately, in an application with lots of very short
parallel
@Crystal
You can use Spark on YARN. YARN has a fair scheduler; modify yarn-site.xml.
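For completeness, switching YARN itself to fair scheduling is one property in yarn-site.xml; note that this schedules across YARN applications, which is separate from Spark's in-application pools:

```xml
<!-- yarn-site.xml: switch the ResourceManager to the YARN FairScheduler. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```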
Sent from my iPad
On Aug 11, 2014, at 6:49, Matei Zaharia matei.zaha...@gmail.com wrote:
Hi Crystal,
The fair scheduler is only for jobs running concurrently within the same
SparkContext (i.e. within an application
Hi,
I am trying to switch from FIFO to FAIR with standalone mode.
My environment:
hadoop 1.2.1
spark 0.8.0 using standalone mode
and I modified the code:
ClusterScheduler.scala - System.getProperty("spark.scheduler.mode",
"FAIR"))
SchedulerBuilder.scala -
val DEFAULT_SCHEDULING_MODE =
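Editing ClusterScheduler.scala shouldn't be necessary: since that line reads a JVM system property, calling System.setProperty("spark.scheduler.mode", "FAIR") in the driver before constructing the SparkContext has the same effect on 0.8. On modern Spark versions the usual route is plain configuration (the allocation-file path below is a placeholder):

```
# conf/spark-defaults.conf (Spark 1.0+; not available in 0.8)
spark.scheduler.mode             FAIR
spark.scheduler.allocation.file  /path/to/fairscheduler.xml
```

The same two properties can also be set on a SparkConf or passed with --conf at spark-submit time.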