Thanks for the awesome response, Steve. As you say, it's not ideal, but the clarification greatly helps. Cheers, everyone :) -Ashic.
Subject: Re: Recent spark sc.textFile needs hadoop for folders?!?
From: ste...@hortonworks.com
To: as...@live.com
CC: guha.a...@gmail.com; user@spark.apache.org
Date: Fri, 26 Jun 2015 08:54:31 +0000

On 26 Jun 2015, at 09:29, Ashic Mahtab <as...@live.com> wrote:

> Thanks for the replies, guys. Is this a permanent change as of 1.3, or will it go away at some point?

Don't blame the Spark team; complain to the Hadoop team for being slow to embrace the Java 1.7 APIs for low-level filesystem IO.

> Also, does it require an entire Hadoop installation, or just WinUtils.exe?
>
> Thanks,
> Ashic.

You really only need a HADOOP_HOME dir with a bin/ subdir containing the DLLs and exes needed to work with the specific Hadoop JARs you are running with.

This should be all you need for Hadoop 2.6:
https://github.com/steveloughran/clusterconfigs/tree/master/clusters/morzine/hadoop_home

I know it's a pain; we really do need to fix it.

-Steve
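
For anyone landing on this thread from Windows, here is a minimal sketch of what the workaround looks like in driver code. It relies on the hadoop.home.dir system property, which Hadoop's Shell class checks before falling back to the HADOOP_HOME environment variable; the C:\hadoop path and the input folder name below are placeholder assumptions, not paths from this thread.

    import org.apache.spark.{SparkConf, SparkContext}

    object TextFileOnWindows {
      def main(args: Array[String]): Unit = {
        // On Windows, point Hadoop at a directory whose bin\ subdir holds
        // winutils.exe and the DLLs matching the Hadoop JARs on the classpath.
        // "C:\\hadoop" is a placeholder; adjust to your local layout.
        if (System.getProperty("os.name").toLowerCase.contains("win")) {
          System.setProperty("hadoop.home.dir", "C:\\hadoop")
        }

        val conf = new SparkConf()
          .setAppName("textFile-on-windows")
          .setMaster("local[*]")
        val sc = new SparkContext(conf)
        try {
          // Reading a folder of text files is what triggers the winutils
          // requirement discussed above.
          val lines = sc.textFile("data/input-folder")
          println(s"line count: ${lines.count()}")
        } finally {
          sc.stop()
        }
      }
    }

Setting the HADOOP_HOME environment variable before launching the JVM achieves the same thing and keeps the path out of your code, as Steve describes above.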