mridulm commented on code in PR #38371: URL: https://github.com/apache/spark/pull/38371#discussion_r1009018499
##########
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:
##########
@@ -3089,13 +3089,14 @@ class DAGSchedulerSuite extends SparkFunSuite with TempLocalSparkContext with Ti
     submit(finalRdd, Array(0, 1), properties = new Properties())

     // Finish the first 2 shuffle map stages.
-    completeShuffleMapStageSuccessfully(0, 0, 2)
+    completeShuffleMapStageSuccessfully(0, 0, 2, Seq("hostA", "hostB"))

Review Comment:
   This change is not required.
   The fetch failure is caused by the stage 1 partition on hostB going missing. By default, `completeShuffleMapStageSuccessfully` completes tasks progressively on hostA, hostB, and so on, so passing `Seq("hostA", "hostB")` explicitly does not change anything here.
   The fetch failure results in recomputing stage 0 (it has two partitions, one on hostA and one on hostB), stage 1 (due to the fetch failure), and stage 2, of course.
   In this case, since there is output on hostB for stage 0 (partition 1) and stage 1 (partition 0), they are recomputed.
   If it is confusing, we can add this to the javadoc of `completeShuffleMapStageSuccessfully`.
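
To illustrate the default host assignment described in the comment, here is a minimal standalone sketch. It is not the actual helper from `DAGSchedulerSuite`: the object `DefaultHostSketch` and the functions `hostFor` and `hostsFor` are hypothetical names, and the fallback rule (task index 0 completes on hostA, index 1 on hostB, and so on) is assumed from the review comment rather than copied from the suite.

```scala
// Standalone sketch, not the real DAGSchedulerSuite helper.
// Assumption: when no host names are passed, the task at index idx
// completes on hostA, hostB, ... in task-index order.
object DefaultHostSketch {
  // Hypothetical helper: use the explicit host for this task index if one
  // was given, otherwise fall back to the progressive default.
  def hostFor(idx: Int, hostNames: Seq[String] = Seq.empty): String =
    if (idx < hostNames.size) hostNames(idx) else s"host${('A' + idx).toChar}"

  // Hosts for all tasks of a stage with numTasks tasks.
  def hostsFor(numTasks: Int, hostNames: Seq[String] = Seq.empty): Seq[String] =
    (0 until numTasks).map(idx => hostFor(idx, hostNames))

  def main(args: Array[String]): Unit = {
    // Under the assumed default, passing the hosts explicitly for a
    // two-task stage gives the same placement as passing nothing.
    assert(hostsFor(2) == hostsFor(2, Seq("hostA", "hostB")))
    println(hostsFor(2).mkString(", "))
  }
}
```

If the real helper behaves this way, the extra `Seq("hostA", "hostB")` argument in the diff is redundant for a stage with two map tasks, which is the point of the comment.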