[ https://issues.apache.org/jira/browse/SPARK-34154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-34154: --------------------------------- Fix Version/s: 3.1.1 > Flaky Test: LocalityPlacementStrategySuite.handle large number of containers > and tasks (SPARK-18750) > ---------------------------------------------------------------------------------------------------- > > Key: SPARK-34154 > URL: https://issues.apache.org/jira/browse/SPARK-34154 > Project: Spark > Issue Type: Bug > Components: YARN > Affects Versions: 3.0.2, 3.2.0, 3.1.1 > Reporter: Dongjoon Hyun > Assignee: Attila Zsolt Piros > Priority: Major > Fix For: 3.0.2, 3.2.0, 3.1.1, 3.1.2 > > > `LocalityPlacementStrategySuite` hangs sometimes like the following. We can > retriever, but it takes our resource significantly because it hangs until the > timeout (6 hours) occurs. > [https://github.com/apache/spark/runs/1719480243] > [https://github.com/apache/spark/runs/1724459002] > [https://github.com/apache/spark/runs/1717958874] > [https://github.com/apache/spark/runs/1731673955] (branch-3.0) > {code:java} > [info] LocalityPlacementStrategySuite: > 17299[info] *** Test still running after 3 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17300[info] *** Test still running after 8 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17301[info] *** Test still running after 13 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17302[info] *** Test still running after 18 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17303[info] *** Test still running after 23 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17304[info] *** Test still running after 28 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17305[info] *** Test still running after 33 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17306[info] *** Test still running after 38 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17307[info] *** Test still running after 43 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17308[info] *** Test still running after 48 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17309[info] *** Test still running after 53 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17310[info] *** Test still running after 58 minutes, 6 seconds: suite name: > LocalityPlacementStrategySuite, test name: handle large number of containers > and tasks (SPARK-18750). > 17311[info] *** Test still running after 1 hour, 3 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17312[info] *** Test still running after 1 hour, 8 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17313[info] *** Test still running after 1 hour, 13 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17314[info] *** Test still running after 1 hour, 18 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17315[info] *** Test still running after 1 hour, 23 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17316[info] *** Test still running after 1 hour, 28 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17317[info] *** Test still running after 1 hour, 33 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17318[info] *** Test still running after 1 hour, 38 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17319[info] *** Test still running after 1 hour, 43 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17320[info] *** Test still running after 1 hour, 48 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17321[info] *** Test still running after 1 hour, 53 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17322[info] *** Test still running after 1 hour, 58 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17323[info] *** Test still running after 2 hours, 3 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17324[info] *** Test still running after 2 hours, 8 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17325[info] *** Test still running after 2 hours, 13 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17326[info] *** Test still running after 2 hours, 18 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17327[info] *** Test still running after 2 hours, 23 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17328[info] *** Test still running after 2 hours, 28 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17329[info] *** Test still running after 2 hours, 33 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17330[info] *** Test still running after 2 hours, 38 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17331[info] *** Test still running after 2 hours, 43 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17332[info] *** Test still running after 2 hours, 48 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17333[info] *** Test still running after 2 hours, 53 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17334[info] *** Test still running after 2 hours, 58 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17335[info] *** Test still running after 3 hours, 3 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17336[info] *** Test still running after 3 hours, 8 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17337[info] *** Test still running after 3 hours, 13 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17338[info] *** Test still running after 3 hours, 18 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17339[info] *** Test still running after 3 hours, 23 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17340[info] *** Test still running after 3 hours, 28 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17341[info] *** Test still running after 3 hours, 33 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17342[info] *** Test still running after 3 hours, 38 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17343[info] *** Test still running after 3 hours, 43 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17344[info] *** Test still running after 3 hours, 48 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17345[info] *** Test still running after 3 hours, 53 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17346[info] *** Test still running after 3 hours, 58 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17347[info] *** Test still running after 4 hours, 3 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17348[info] *** Test still running after 4 hours, 8 minutes, 6 seconds: suite > name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17349[info] *** Test still running after 4 hours, 13 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). > 17350[info] *** Test still running after 4 hours, 18 minutes, 6 seconds: > suite name: LocalityPlacementStrategySuite, test name: handle large number of > containers and tasks (SPARK-18750). {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org