[ https://issues.apache.org/jira/browse/SPARK-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323531#comment-14323531 ]
Weizhong commented on SPARK-5801:
---------------------------------

This happens because, in standalone mode, the worker creates temp directories for the executor, such as
"/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/spark-675133f0-b2c8-44a1-8775-5e394674609b",
and DiskBlockManager then creates its own temp directories beneath them, such as
"/mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/spark-675133f0-b2c8-44a1-8775-5e394674609b/spark-69c1ea15-4e7f-454a-9f57-19763c7bdd17/spark-b036335c-60fa-48ab-a346-f1b420af2027".

> Shuffle creates too many nested directories
> -------------------------------------------
>
>                 Key: SPARK-5801
>                 URL: https://issues.apache.org/jira/browse/SPARK-5801
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.2.1
>            Reporter: Kay Ousterhout
>            Priority: Critical
>
> When running Spark on EC2, there are 4 nested shuffle directories before the
> hashed directory names, for example:
> /mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/spark-675133f0-b2c8-44a1-8775-5e394674609b/spark-69c1ea15-4e7f-454a-9f57-19763c7bdd17/spark-b036335c-60fa-48ab-a346-f1b420af2027/0c
> My understanding is that this should look like:
> /mnt/spark/spark-5824d912-25af-4187-bc6a-29ae42cd78e5/0c
> This happened when I was using the sort-based shuffle (all default
> configurations for Spark on EC2).
> This is not a correctness problem (the shuffle still works fine).
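
To illustrate the nesting described in the comment, here is a minimal Python sketch. It is not Spark's code: `create_spark_style_dir`, `nest`, and `depth_below` are hypothetical helpers, and the only assumption carried over from the report is that each layer creates its own "spark-<uuid>" directory inside the directory it was handed, rather than reusing it.

```python
import os
import tempfile
import uuid


def create_spark_style_dir(parent: str) -> str:
    """Create and return a 'spark-<uuid>' directory under parent (hypothetical helper)."""
    path = os.path.join(parent, f"spark-{uuid.uuid4()}")
    os.makedirs(path)
    return path


def nest(root: str, layers: int) -> str:
    """Simulate `layers` components, each creating a directory inside the previous one."""
    current = root
    for _ in range(layers):
        current = create_spark_style_dir(current)
    return current


def depth_below(root: str, path: str) -> int:
    """Count the path components between root and path."""
    return len(os.path.relpath(path, root).split(os.sep))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        reported = nest(root, 4)   # four layers, as in the reported path
        expected = nest(root, 1)   # one layer, as the reporter expected
        print(depth_below(root, reported))  # prints 4
        print(depth_below(root, expected))  # prints 1
```

Under this assumption, the depth grows by one for every layer that creates a fresh directory, so a flatter layout would require the inner layers to reuse the directory passed to them.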