[GitHub] spark issue #7786: [SPARK-9468][Yarn][Core] Avoid scheduling tasks on preemp...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/7786

If you want to add a new config for the "kill preempted containers" functionality, that would probably be an acceptable compromise.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user chemikadze commented on the issue: https://github.com/apache/spark/pull/7786

@vanzin If those were implemented, would this have any chance of getting merged? We use preemption quite a lot, and the current behavior is not the best we can get: logs sometimes get flooded with preemption side effects (RPC errors, etc.), which makes them hard to read and confuses some users. I agree that depending on task size the effect can be either positive or negative (longer tasks won't be able to complete anyway and will waste resources, but lots of shorter ones will not get a chance to run). Does that just mean it should be configurable behavior (spark.yarn.releasePreemptedContainers=true/false)?
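For illustration, the switch proposed above might look like this in spark-defaults.conf. Note that spark.yarn.releasePreemptedContainers is only the name suggested in this thread, not an existing Spark property:

```properties
# Hypothetical property from this discussion -- not an actual Spark config.
# When true, executors whose containers are marked for preemption would be
# released early instead of continuing to receive tasks.
spark.yarn.releasePreemptedContainers=true
```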
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/7786

@vanzin, sorry for the delay. I think your concern is valid: this change will probably hurt performance, since 15 seconds (the preemption wait) is not a short time. But your suggestion of killing the executors ahead of preemption to release the resources is also risky. On YARN's side, preemption is based on a decision made at that moment, and the decision can change over time as resources fluctuate; if resources later become sufficient and preemption is no longer needed, this pre-killing and re-acquiring of containers would also hurt performance to some extent. Maybe the current implementation is sufficient to address the preemption problem and we don't need to do anything more. I'm going to close this; thanks a lot for your time and review.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/7786

@jerryshao do you plan to implement either of the above suggestions? Otherwise we should probably close this PR.
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/7786

> But I'm just trying to point out that the current change doesn't really make things better. Without killing the executor, you'll still be holding on to resources, except now you wouldn't be using them. So might as well keep using them for as long as you can, or give them back as soon as possible (i.e. as soon as the executor becomes idle).

I agree. Doing a clean shutdown and release is probably what the schedulers prefer, especially if it starts up the pre-empting code faster.
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/7786

@steveloughran

> I suspect that if you get told you are being pre-empted, you aren't likely to get containers elsewhere

That's very possible. But I'm just trying to point out that the current change doesn't really make things better. Without killing the executor, you'll still be holding on to resources, except now you wouldn't be using them. So might as well keep using them for as long as you can, or give them back as soon as possible (i.e. as soon as the executor becomes idle).
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/7786

@vanzin I suspect that if you get told you are being pre-empted, you aren't likely to get containers elsewhere; pre-emption is a sign of demand being too high and of your queue being lower priority. But pre-requesting a new container while continuing the current work might be a nice trick, keeping the live executors busy while queueing up early requests for the replacements.
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/7786

> by default Yarn will preempt the container 15 seconds after the warning

That's a long time, and you can run a lot of tasks in that time. Unless Spark actively goes and gets rid of these executors to play nice, this change would probably harm performance more than it helps. So maybe instead we should focus on stopping the executors so that dynamic allocation can bring up new ones elsewhere? Otherwise the current state is probably better.
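To make the tradeoff in this thread concrete: whether it pays to keep scheduling on a preemption-marked executor depends on how expected task duration compares with the remaining grace period. The following is a minimal illustrative sketch, not Spark scheduler code; the object and method names are invented here, and 15 seconds is the default YARN warning-to-preemption window mentioned above:

```scala
// Illustrative sketch only -- not Spark's actual scheduler logic.
object PreemptionPolicy {
  // Default YARN warning-to-preemption window discussed in this thread.
  val gracePeriodMs: Long = 15000L

  // Schedule a new task on a preemption-marked executor only if the task
  // is expected to finish before the container is actually taken away.
  def shouldSchedule(avgTaskDurationMs: Long, elapsedSinceWarningMs: Long): Boolean =
    avgTaskDurationMs <= gracePeriodMs - elapsedSinceWarningMs
}
```

Under this model, short tasks still make useful progress during the grace period, while long tasks are kept off doomed containers, which is the "positive and negative, depending on task size" effect described earlier in the thread.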