[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-03 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035897#comment-16035897
 ] 

Sean Owen commented on SPARK-20662:
---

I still don't understand why YARN's capacity scheduler doesn't answer this. It 
shouldn't be reimplemented elsewhere, including in HoS. I agree with [~vanzin].

> Block jobs that have greater than a configured number of tasks
> --
>
> Key: SPARK-20662
> URL: https://issues.apache.org/jira/browse/SPARK-20662
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.6.0, 2.0.0
> Reporter: Xuefu Zhang
>
> In a shared cluster, it's desirable for an admin to block large Spark jobs.
> While there may not be a single metric that defines the size of a job, the
> number of tasks is usually a good indicator. Thus, it would be useful for the
> Spark scheduler to block a job whose number of tasks reaches a configured
> limit. By default, the limit could be infinite, to retain the existing
> behavior.
> MapReduce has mapreduce.job.max.map and mapreduce.job.max.reduce, which block
> an MR job at job submission time.
> The proposed configuration is spark.job.max.tasks, with a default value of -1
> (infinite).






[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035525#comment-16035525
 ] 

Marcelo Vanzin commented on SPARK-20662:


bq. For multiple users in an enterprise deployment, it's good to provide admin 
knobs. In this case, an admin just wanted to block bad jobs.

Your definition of a bad job is the problem (well, one of the problems). 
"Number of tasks" is not an indication that a job is large. Each task may be 
really small.

Spark shouldn't be in the business of defining what is a good or bad job, and
that doesn't mean it's targeted at single-user vs. multi-user environments. It's 
just something that needs to be controlled at a different layer. If the admin 
is really worried about resource usage, he has control over the RM, and 
shouldn't rely on applications behaving nicely to enforce those controls. 
Applications misbehave. Users mess with configuration. Those are all things 
outside of the admin's control.







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035519#comment-16035519
 ] 

Xuefu Zhang commented on SPARK-20662:
-

I can understand the counter-argument here if Spark is targeted at single-user
cases. For multiple users in an enterprise deployment, it's good to provide 
admin knobs. In this case, an admin just wanted to block bad jobs. I don't 
think the RM meets that goal.

This is actually implemented in Hive on Spark. However, I thought this was
generic and might be desirable for others as well. In addition, blocking a job
at submission is better than killing it after it has started to run.

If Spark doesn't think this is useful, then very well.







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035487#comment-16035487
 ] 

Marcelo Vanzin commented on SPARK-20662:


BTW if you really, really, really think this is a good idea and you really want 
it, you can write a listener that just cancels jobs or kills the application 
whenever a stage with more than x tasks is submitted.

No need for any changes in Spark.
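
For what it's worth, a minimal sketch of such a listener (the class name and
threshold are illustrative, and cancelling the offending stage is just one
possible reaction):

{code}
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerStageSubmitted}

// Illustrative listener: cancel any stage submitted with more than maxTasks
// tasks. A harsher variant could call sc.cancelAllJobs() or sc.stop() to kill
// the whole application instead.
class MaxTasksListener(sc: SparkContext, maxTasks: Int) extends SparkListener {
  override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit = {
    val info = stageSubmitted.stageInfo
    if (info.numTasks > maxTasks) {
      sc.cancelStage(info.stageId)
    }
  }
}

// Registered from application code, e.g.:
// sc.addSparkListener(new MaxTasksListener(sc, maxTasks = 100000))
{code}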







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035481#comment-16035481
 ] 

Sean Owen commented on SPARK-20662:
---

It's not equivalent to block the job, but why is that more desirable? Your use
case is what resource queues are for, and things like the capacity scheduler.
Yes, you limit the amount of resources a person is entitled to, for just that
reason. A job that's blocked for being "too big" during busy hours may be fine
to run off-hours, but this would mean the job is never runnable, ever. The
capacity scheduler, in contrast, can let someone use resources when nobody else
wants them but preempt when someone else needs them, so it doesn't really cost
anyone else. It just doesn't seem like this is a wheel to reinvent in Spark.
Possibly in Spark's own standalone resource manager, but if you need
functionality like this, you're not likely to get by with a standalone cluster
anyway.







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035478#comment-16035478
 ] 

Marcelo Vanzin commented on SPARK-20662:


bq. It's probably not a good idea to let one job take all resources while
starving others.

I'm pretty sure that's why resource managers have queues.

What you want here is a client-controlled, opt-in, application-level "nicety 
config" that tells the application not to submit more tasks than a limit at a
time. That control already exists: set a maximum number of executors for the
app. Number of executors times number of cores per executor = the maximum
number of concurrently running tasks.
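
For reference, a minimal sketch of how that bound can be expressed through
existing settings (the values below are purely illustrative):

{code}
import org.apache.spark.SparkConf

// Illustrative values only: 50 executors x 4 cores per executor caps the
// application at 200 concurrently running tasks.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.maxExecutors", "50")
  .set("spark.executor.cores", "4")
{code}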







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035462#comment-16035462
 ] 

Xuefu Zhang commented on SPARK-20662:
-

[~lyc] I'm talking about mapreduce.job.max.map, which is the maximum number of
map tasks that an MR job may have. If a submitted MR job contains more map
tasks than that, it will be rejected. Similarly for mapreduce.job.max.reduce.

[~sowen], [~vanzin], I don't think blocking a (perhaps ridiculously) large job
is equivalent to letting it run slowly forever. The use case I have is this:
while a YARN queue can be used to limit how many resources are used, a queue
can be shared by a team or by multiple applications. It's probably not a good
idea to let one job take all resources while starving others. Secondly, many of
the users who submit ridiculously large jobs have no idea what they are doing
and don't even realize that their jobs are huge. Lastly, and more importantly,
our application environment has a global timeout, beyond which a job will be
killed. If a large job gets killed this way, significant resources are wasted.
Thus, blocking such a job at submission time helps preserve resources.

BTW, if these scenarios don't apply to a user, there is nothing to worry about,
because the default should keep them happy.

In addition to spark.job.max.tasks, I'd also propose spark.stage.max.tasks,
which limits the number of tasks any stage of a job may contain. The rationale
is that spark.job.max.tasks alone tends to favor jobs with a small number of
stages. With both, we can not only cover MR's mapreduce.job.max.map and
mapreduce.job.max.reduce, but also control the overall size of a job.
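
To make the intent concrete, here is a purely illustrative sketch of the kind
of submission-time guard the two proposed settings imply (spark.job.max.tasks
and spark.stage.max.tasks are proposed keys, not existing Spark configuration,
and the helper below is hypothetical):

{code}
// Hypothetical guard: reject a job whose stages, or whose total task count,
// exceed the proposed limits. A value <= 0 means "no limit".
def checkJobSize(stageTaskCounts: Seq[Int], maxJobTasks: Int, maxStageTasks: Int): Unit = {
  if (maxStageTasks > 0 && stageTaskCounts.exists(_ > maxStageTasks)) {
    throw new IllegalArgumentException(
      s"A stage exceeds spark.stage.max.tasks = $maxStageTasks")
  }
  val totalTasks = stageTaskCounts.sum
  if (maxJobTasks > 0 && totalTasks > maxJobTasks) {
    throw new IllegalArgumentException(
      s"Job has $totalTasks tasks, exceeding spark.job.max.tasks = $maxJobTasks")
  }
}
{code}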








[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034964#comment-16034964
 ] 

Marcelo Vanzin commented on SPARK-20662:


Yeah, I don't really understand this request. It doesn't matter how many tasks
a job creates; what really matters is how many resources the cluster manager
allows the application to allocate. If a job has 1 million tasks but the
cluster manager allocates a single vcpu for the job, it will take forever, but
it won't really bog down the cluster.







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034334#comment-16034334
 ] 

Sean Owen commented on SPARK-20662:
---

Isn't this better handled by the resource manager? For example, YARN lets you
cap these things in a bunch of ways already, and the resource manager is a
better place to manage, well, resources.
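
As a sketch of what that looks like from the Spark side (the queue name below
is just an example; the queue's capacity, maximum capacity, and preemption
policy are whatever the admin configures on the RM):

{code}
import org.apache.spark.SparkConf

// Submit the application to a capped YARN queue; "analytics" is an example
// queue name. spark-submit's --queue flag is the equivalent command-line knob.
val conf = new SparkConf()
  .set("spark.yarn.queue", "analytics")
{code}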







[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread lyc (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034282#comment-16034282
 ] 

lyc commented on SPARK-20662:
-

Do you mean `mapreduce.job.running.map.limit`? That conf is documented as "The
maximum number of simultaneous map tasks per job. There is no limit if this
value is 0 or negative."

That is about task concurrency, and the behavior seems to be that the scheduler
stops scheduling tasks once the job has that many running tasks, and starts
scheduling again when some tasks are done.

This seems like it could be done in `DAGScheduler`; I'd like to give it a try
if the idea is accepted. cc @Marcelo Vanzin



