[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227077#comment-14227077 ]

Matei Zaharia commented on SPARK-3628:

FYI I merged this into 1.2.0, since the patch is now quite a bit smaller. We should decide whether we want to back port it to branch-1.1, so I'll leave it open for that reason. I don't think there's much point backporting it further because the issue is somewhat rare, but we can do it if people ask for it.

Don't apply accumulator updates multiple times for tasks in result stages

Key: SPARK-3628
URL: https://issues.apache.org/jira/browse/SPARK-3628
Project: Spark
Issue Type: Bug
Components: Spark Core
Reporter: Matei Zaharia
Assignee: Nan Zhu
Priority: Blocker
Fix For: 1.2.0

In previous versions of Spark, accumulator updates only got applied once for accumulators that are only used in actions (i.e. result stages), letting you use them to deterministically compute a result. Unfortunately, this got broken in some recent refactorings. This is related to https://issues.apache.org/jira/browse/SPARK-732, but that issue is about applying the same semantics to intermediate stages too, which is more work and may not be what we want for debugging.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222973#comment-14222973 ]

Nan Zhu commented on SPARK-3628:

Hmm, OK. But for this case, shall I submit individual patches for 0.9.x and 1.0.x, because there are some merge conflicts when applying the patch directly?
[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222672#comment-14222672 ]

Patrick Wendell commented on SPARK-3628:

I took a quick look at the current patch and I'm re-assigning the target version to 1.2.1. From what I can tell, this involves nontrivial changes to the DAGScheduler. That's too critical a component to modify substantially without significant testing. Let's try to get a fix into master and then put it into 1.2.1 down the road.
[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146714#comment-14146714 ]

Nan Zhu commented on SPARK-3628:

https://github.com/apache/spark/pull/2524
[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146718#comment-14146718 ]

Apache Spark commented on SPARK-3628:

User 'CodingCat' has created a pull request for this issue: https://github.com/apache/spark/pull/2524
[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142756#comment-14142756 ]

Matei Zaharia commented on SPARK-3628:

BTW the problem is that this used to be guarded against in the TaskSetManager (see https://github.com/apache/spark/blob/branch-0.6/core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala#L253), and that went away at some point.
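
The kind of guard Matei Zaharia's comment above refers to can be illustrated with a minimal sketch in plain Python (no Spark dependency). This is not the code at the linked TaskSetManager line, nor the patch in the pull request. The names `TaskCompletion`, `ResultStageTracker`, `on_task_finished` and `run_stage` are hypothetical. The sketch assumes the scheduler records which partitions of a result stage have already finished and ignores the accumulator update from any later completion of the same partition, for example from a speculative or resubmitted task.

```python
from dataclasses import dataclass, field


@dataclass
class TaskCompletion:
    """Hypothetical 'task finished' event as seen by a scheduler."""
    partition: int      # index of the partition the task computed
    accum_update: int   # amount the task wants to add to the accumulator


@dataclass
class ResultStageTracker:
    """Toy bookkeeping for one result stage; not one of Spark's classes."""
    num_partitions: int
    accumulator: int = 0
    finished: list = field(default_factory=list, init=False)

    def __post_init__(self):
        # One flag per partition: has a completion already been accepted?
        self.finished = [False] * self.num_partitions

    def on_task_finished(self, event: TaskCompletion) -> bool:
        """Apply the update only for the first completion of a partition.

        Returns True if the update was applied, False if it was ignored.
        """
        if self.finished[event.partition]:
            # Duplicate completion: skip the update so the accumulator
            # is not incremented twice for the same partition.
            return False
        self.finished[event.partition] = True
        self.accumulator += event.accum_update
        return True


def run_stage(events, num_partitions):
    """Feed completion events to a fresh tracker; return the final value."""
    tracker = ResultStageTracker(num_partitions)
    for event in events:
        tracker.on_task_finished(event)
    return tracker.accumulator


if __name__ == "__main__":
    # Partition 0 completes twice; the second update is ignored.
    events = [TaskCompletion(0, 10), TaskCompletion(1, 5), TaskCompletion(0, 10)]
    print(run_stage(events, num_partitions=2))  # 15
```

Without the `finished` check, the same event sequence would yield 25 instead of 15, which is the kind of non-deterministic result the issue describes.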