[ https://issues.apache.org/jira/browse/SPARK-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152070#comment-14152070 ]

Andrew Or commented on SPARK-3685:
----------------------------------

Yeah, there will be a design doc soon on the possible solutions for dealing 
with shuffles. Note that one of the main motivations for doing this is to free 
up containers in Yarn when an application is not using them, so maintaining a 
pool of executor containers does not achieve what we want. Also, DFS shuffle is 
only one of the solutions we will consider, but we probably won't end up 
relying on it because of the overhead it adds (i.e. we'll probably need a 
different solution down the road either way).

It could be a warning, but I think an exception is appropriate here because the 
user clearly thinks that their shuffle files are going into HDFS when they're 
not. Also, the fact that it fails fast means the user knows Spark won't do what 
they want before even a single shuffle file is written. Either way, I don't 
feel strongly about this.
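
For illustration, the fail-fast check could be as simple as inspecting the URI 
scheme of each configured dir. This is just a sketch; the method name 
checkLocalDirs and the exact message are mine, not the actual patch:

    import java.net.URI

    // Hypothetical sketch: reject any configured local dir whose URI scheme
    // is not the local file system. Paths without a scheme (e.g. "/tmp/foo")
    // yield a null scheme from java.net.URI and are accepted.
    def checkLocalDirs(localDirs: Seq[String]): Unit = {
      localDirs.foreach { dir =>
        val scheme = new URI(dir).getScheme
        if (scheme != null && scheme != "file") {
          throw new IllegalArgumentException(
            s"spark.local.dir must be a local path, got scheme '$scheme' in: $dir")
        }
      }
    }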



> Spark's local dir should accept only local paths
> ------------------------------------------------
>
>                 Key: SPARK-3685
>                 URL: https://issues.apache.org/jira/browse/SPARK-3685
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 1.1.0
>            Reporter: Andrew Or
>
> When you try to set local dirs to "hdfs:/tmp/foo" it doesn't work. What it 
> will actually do is create a folder called "hdfs:" and put "tmp" inside it. 
> This is because in Util#getOrCreateLocalRootDirs we use java.io.File instead 
> of Hadoop's file system to parse this path. We also need to resolve the path 
> appropriately.
> This may not have an urgent use case, but it fails silently and does the 
> least expected thing.


