[ 
https://issues.apache.org/jira/browse/SPARK-34280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275783#comment-17275783
 ] 

Attila Zsolt Piros commented on SPARK-34280:
--------------------------------------------

It would be interesting to know whether the context cleaner was running for 
those blocks beforehand.

If you have the log it would be nice to see whether it has any of these: 
 - ["Error cleaning shuffle"|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ContextCleaner.scala#L223-L237]
 - ["Error deleting data"|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala#L107-L121]
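
For anyone who wants to check a saved log for these two messages, a minimal sketch follows. It assumes the log is plain text and that the messages appear verbatim as substrings of a log line; the function name `find_cleanup_errors` is made up for illustration and is not part of Spark.

```python
# The two message prefixes linked above; a real log line will usually have
# more text (timestamp, logger name, shuffle id or file path) around them.
PATTERNS = ("Error cleaning shuffle", "Error deleting data")


def find_cleanup_errors(lines):
    """Return (line_number, pattern, line) for every line containing a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern in PATTERNS:
            if pattern in line:
                hits.append((number, pattern, line.rstrip("\n")))
    return hits
```

It can be fed an open file object directly, e.g. `find_cleanup_errors(open(path, errors="replace"))`, where `path` points at the driver or executor log you saved.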

If not, re-run it with the logger "org.apache.spark.ContextCleaner" set to 
DEBUG level.

(I guess this is not much help for you, Holden, but who knows who else reads 
this and looks for answers to similar questions.)


> Avoid migrating un-needed shuffle files
> ---------------------------------------
>
>                 Key: SPARK-34280
>                 URL: https://issues.apache.org/jira/browse/SPARK-34280
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.1.0, 3.2.0, 3.1.1, 3.1.2
>            Reporter: Holden Karau
>            Priority: Major
>
> In Spark 3.1 we introduced shuffle migrations. However, it is possible that a 
> shuffle file will still exist after it is no longer needed. I've only 
> observed this in a back port branch with SQL, so I'll do some more digging.



