[ https://issues.apache.org/jira/browse/SPARK-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263516#comment-14263516 ]
Cheng Lian commented on SPARK-1529:
-----------------------------------

Hi [~srowen], first of all, we are not trying to put shuffle and temp files in HDFS. When this ticket was created, the initial motivation was to support MapR, because MapR exposes its local file system only through MapR volumes and the HDFS {{FileSystem}} interface. That particular problem was later worked around with NFS, and the ticket remained unresolved simply because we lacked the capacity to work on it.

[~rkannan82] Thanks for looking into this! Several months ago I implemented a prototype that simply replaced the Java NIO file system operations with the corresponding HDFS {{FileSystem}} calls. According to an earlier benchmark done with {{spark-perf}}, this introduced a ~15% performance penalty for shuffles. We therefore planned to write a specialized {{FileSystem}} implementation that merely wraps the normal Java NIO operations, so as to avoid the performance penalty as much as possible, and then replace all local file system access with that specialized {{FileSystem}} implementation (a rough, illustrative sketch of such a wrapper follows the quoted issue summary below).

> Support setting spark.local.dirs to a hadoop FileSystem
> --------------------------------------------------------
>
>                 Key: SPARK-1529
>                 URL: https://issues.apache.org/jira/browse/SPARK-1529
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Cheng Lian
>
> In some environments, like with MapR, local volumes are accessed through the
> Hadoop filesystem interface. We should allow setting spark.local.dir to a
> Hadoop filesystem location.
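For illustration only, here is a minimal sketch of what such an NIO-backed wrapper could look like. It assumes Hadoop's {{RawLocalFileSystem}} as the base class and only sketches the read path; the class name {{NioLocalFileSystem}} and all implementation details are hypothetical and not taken from any actual Spark patch.

{code:scala}
import java.io.{IOException, InputStream}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

import org.apache.hadoop.fs.{FSDataInputStream, Path, PositionedReadable, RawLocalFileSystem, Seekable}

// Hypothetical sketch: a local FileSystem whose data path goes straight to
// java.nio FileChannels instead of Hadoop's checksummed local streams.
class NioLocalFileSystem extends RawLocalFileSystem {

  // Minimal Seekable/PositionedReadable adapter over a FileChannel.
  // FSDataInputStream only accepts streams that implement both interfaces.
  private class NioInputStream(channel: FileChannel)
      extends InputStream with Seekable with PositionedReadable {

    override def read(): Int = {
      val buf = ByteBuffer.allocate(1)
      if (channel.read(buf) <= 0) -1 else buf.get(0) & 0xff
    }

    override def read(b: Array[Byte], off: Int, len: Int): Int =
      channel.read(ByteBuffer.wrap(b, off, len))

    override def seek(pos: Long): Unit = { channel.position(pos) }
    override def getPos(): Long = channel.position()
    override def seekToNewSource(targetPos: Long): Boolean = false

    // Positioned read delegates to FileChannel's positional read.
    override def read(pos: Long, b: Array[Byte], off: Int, len: Int): Int =
      channel.read(ByteBuffer.wrap(b, off, len), pos)

    override def readFully(pos: Long, b: Array[Byte], off: Int, len: Int): Unit = {
      var done = 0
      while (done < len) {
        val n = read(pos + done, b, off + done, len - done)
        if (n < 0) throw new IOException("Unexpected EOF")
        done += n
      }
    }

    override def readFully(pos: Long, b: Array[Byte]): Unit =
      readFully(pos, b, 0, b.length)

    override def close(): Unit = channel.close()
  }

  // Serve reads directly from an NIO FileChannel.
  override def open(f: Path, bufferSize: Int): FSDataInputStream = {
    val channel = FileChannel.open(Paths.get(f.toUri.getPath), StandardOpenOption.READ)
    new FSDataInputStream(new NioInputStream(channel))
  }
}
{code}

The write path could be handled analogously by returning an {{FSDataOutputStream}} backed by a {{java.nio}} output stream, and whether such a shim actually removes the ~15% overhead would of course need to be re-measured with {{spark-perf}}.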