Thanks for the replies, guys.

Is this a permanent change as of 1.3, or will it go away at some point? Also, does it require an entire Hadoop installation, or just WinUtils.exe?

Thanks,
Ashic.

Date: Fri, 26 Jun 2015 18:22:03 +1000
Subject: Re: Recent spark sc.textFile needs hadoop for folders?!?
From: guha.a...@gmail.com
To: as...@live.com
CC: user@spark.apache.org

It's a problem since 1.3 I think

On 26 Jun 2015 04:00, "Ashic Mahtab" <as...@live.com> wrote:

Hello,

Just trying out spark 1.4 (we're using 1.1 at present). On Windows, I've noticed the following:

* On 1.4, sc.textFile("D:\\folder\\").collect() fails from both spark-shell.cmd and when running a scala application referencing the spark-core package from maven.
* sc.textFile("D:\\folder\\file.txt").collect() succeeds.
* On 1.1, both succeed.
* When referencing the binaries in the scala application, this is the error:

15/06/25 18:30:13 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:978)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:978)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)

This seems quite strange... is this a known issue? Worse, is this a feature? I don't have to be using hadoop at all... just want to process some files and data in Cassandra.

Regards,
Ashic.
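
For anyone hitting the same trace: the "null\bin\winutils.exe" in the error suggests that no Hadoop home directory was resolved on the machine. Below is a minimal sketch of that lookup, assuming the home directory is taken from the "hadoop.home.dir" JVM property and, failing that, from the HADOOP_HOME environment variable. WinutilsCheck, hadoopHome and winutilsPath are hypothetical names for illustration, not part of Spark or Hadoop, and the sketch only builds and checks the path; it does not call either library.

```scala
import java.io.File

object WinutilsCheck {
  // Hypothetical helper: resolve the Hadoop home directory, checking the
  // "hadoop.home.dir" JVM property first and HADOOP_HOME second (assumed order).
  def hadoopHome(
      props: collection.Map[String, String] = sys.props,
      env: collection.Map[String, String] = sys.env
  ): Option[String] =
    props.get("hadoop.home.dir").orElse(env.get("HADOOP_HOME"))

  // Build the location where winutils.exe would be looked for. When no home
  // is set, the literal text "null" ends up in the path, as in the error above.
  def winutilsPath(home: Option[String]): String = {
    val base = home.getOrElse("null")
    Seq(base, "bin", "winutils.exe").mkString(File.separator)
  }

  def main(args: Array[String]): Unit = {
    val path = winutilsPath(hadoopHome())
    println(s"expected winutils location: $path (exists: ${new File(path).isFile})")
  }
}
```

A commonly reported workaround is to place winutils.exe under a bin folder and point the property at the parent folder before the SparkContext is created, for example System.setProperty("hadoop.home.dir", "C:\\hadoop"), where C:\hadoop is only a placeholder path. Whether that alone is enough for a given Spark and Hadoop build is an assumption worth verifying locally.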