[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20998 Will do, and it's okay. My view limited in the source code yet, but you guys have more practical experience. So I learned from your points. It's beneficial. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20998 @Ngone51 can you instead leave the behavior as is, and just update the comment? Sorry that its going to be a small change in the end, and all the extra work the bad comments led you to do, but still appreciate you noticing this and fixing. a good PR with a quality test too. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20998 @squito I completely agree that the comment is inaccurate. Note that this is for a specific taskset, so impact is limited to that taskset (w.r.t using executors for spec exec) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20998 I'm not even really concerned about the case for two hosts -- I agree its fine if we do something sub-optimal. I'm more concerned about code-clarity and the behavior in general. It seems cleaner to me if speculation doesn't worry about where its failed before, and those exclusions are left to the blacklist. But it sounds like you're saying the prior behavior was really desirable -- you think its better if speculation always excludes hosts that task has ever failed on? I'm happy to defer to your opinion on this, I haven't really stressed speculative execution yet. Then lets just change that comment in the code to be consistent. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89228/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20998 **[Test build #89228 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89228/testReport)** for PR 20998 at commit [`5901728`](https://github.com/apache/spark/commit/5901728d6be8ad33e39d56006a2bc8cc02cfff38). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20998 **[Test build #89228 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89228/testReport)** for PR 20998 at commit [`5901728`](https://github.com/apache/spark/commit/5901728d6be8ad33e39d56006a2bc8cc02cfff38). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20998 @squito My concern is, in large workloads, some nodes simply become bad for some tasks (transient env or hardware issues, colocating containers, etc) while being fine for others; speculative tasks should alleviate performance concerns and not increase chances of application failure due to locality preference affinity. For cluster sizes which are very small, speculative execution is less relevant than for those which are large - and here we are tuning for the former. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20998 @mridulm more thoughts? I think this is the right change but I will leave open for a bit to get more input --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89164/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20998 **[Test build #89164 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89164/testReport)** for PR 20998 at commit [`2ed9584`](https://github.com/apache/spark/commit/2ed958418ff182bca0a3af1bf35999130312e78f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20998 **[Test build #89164 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89164/testReport)** for PR 20998 at commit [`2ed9584`](https://github.com/apache/spark/commit/2ed958418ff182bca0a3af1bf35999130312e78f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20998 Hi, @squito . Thank for review and comments. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20998 Hi, @mridulm, thank for your comment. Actually, I have the same worry with you. May be we can make this change as a second choice for `hasAttemptOnHost `, in case of there's really no other hosts to select. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20998 Adding isRunning can cause a single 'bad' node (from task pov - not necessarily only bad hardware: just that task fails on node) can keep tasks to fail repeatedly causing app to exit. Particularly with blacklist'ing, I am not very sure how the interactions will play out .. @squito might have more comments. In general, this is not a benign change imo and can have non trivial side effects. In the specific usecase of only two machines, it is an unfortunate side effect. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89026/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20998 **[Test build #89026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89026/testReport)** for PR 20998 at commit [`3584a09`](https://github.com/apache/spark/commit/3584a09410d72207d8a00dcbf385b2e276925fc8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20998 Hi, @felixcheung , thank for trigger a task and your comments. > shouldn't this be up to the scheduler backend? Actually, it is `TaskSchedulerImpl` who holds a thread to check whether there are any speculative tasks need to be scheduled periodically. If any, then, call `backend.reviveOffers` to offer resources . But, it's `TaskSetManager` who decides whether we need to launch a speculative task for a certain task. > multiple tasks/attempts can run simultaneously on the same physical host? I think multiple task attempts(actually, speculative tasks) can run on the sam physical host, but not simultaneously, as long as there's no running attempt on it. In PR description, I illustrate a case which a speculative task chose to run on a host, where a previous task attempts have been run on, but failed finally. I think if the task's failure is not relevant to the host, 'run on the same host' can be acceptable. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20998 **[Test build #89026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89026/testReport)** for PR 20998 at commit [`3584a09`](https://github.com/apache/spark/commit/3584a09410d72207d8a00dcbf385b2e276925fc8). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20998 sounds fair, but shouldn't this be up to the scheduler backend? multiple tasks/attempts can run simultaneously on the same physical host? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20998 Jenkins, ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20998 ping @pwendell @kayousterhout . pls help review, thanks :) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20998: [SPARK-23888][CORE] speculative task should not run on a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20998 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org