Re: Recent spark sc.textFile needs hadoop for folders?!?

Akhil Das Fri, 26 Jun 2015 01:07:52 -0700

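You just need to set HADOOP_HOME, which appears to be null in the
stack trace. If you don't have winutils.exe, you can download
<https://github.com/srccodes/hadoop-common-2.2.0-bin/archive/master.zip>,
extract it, and point HADOOP_HOME at the extracted folder so that
winutils.exe sits in its bin subfolder (the trace shows Hadoop looking
for null\bin\winutils.exe).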
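
If you would rather set this from code than from the environment, here is
a rough sketch of a check you could run before creating the SparkContext.
It is not part of Spark or Hadoop: the class name HadoopHomeCheck and the
C:\hadoop fallback folder are made up for illustration. It assumes that
Hadoop's Shell class consults the hadoop.home.dir system property first
and falls back to the HADOOP_HOME environment variable.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Spark or Hadoop): checks that
// winutils.exe can be found before any Hadoop class is loaded.
final class HadoopHomeCheck {

    // Assumed lookup order of org.apache.hadoop.util.Shell:
    // the hadoop.home.dir system property first, then HADOOP_HOME.
    static String resolveHadoopHome(String systemProperty, String envVariable) {
        return systemProperty != null ? systemProperty : envVariable;
    }

    // The stack trace shows Hadoop looking for <home>\bin\winutils.exe.
    static Path winutilsPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    public static void main(String[] args) {
        // Assumed install folder; pass your own as the first argument.
        String fallback = args.length > 0 ? args[0] : "C:\\hadoop";

        String home = resolveHadoopHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"));

        if (home == null) {
            // Neither setting is present, which matches the "null" in the error.
            System.setProperty("hadoop.home.dir", fallback);
            home = fallback;
        }

        Path winutils = winutilsPath(home);
        System.out.println(Files.exists(winutils)
                ? "Found " + winutils
                : "Missing " + winutils);
    }
}
```

The property has to be set before the first Hadoop class is touched,
because the trace shows the lookup happening in a static initializer
(Shell.<clinit>). In a Scala application that means calling
System.setProperty("hadoop.home.dir", ...) before the first sc.textFile.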


Thanks
Best Regards

On Thu, Jun 25, 2015 at 11:30 PM, Ashic Mahtab <as...@live.com> wrote:

> Hello,
> Just trying out spark 1.4 (we're using 1.1 at present). On Windows, I've
> noticed the following:
>
> * On 1.4, sc.textFile("D:\\folder\\").collect() fails from both
> spark-shell.cmd and when running a scala application referencing the
> spark-core package from maven.
> * sc.textFile("D:\\folder\\file.txt").collect() succeeds.
> * On 1.1, both succeed.
> * When referencing the binaries in the scala application, this is the
> error:
>
> 15/06/25 18:30:13 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
>     at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
>     at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
>     at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
>     at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
>     at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
>     at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:978)
>     at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:978)
>     at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>     at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>
> This seems quite strange...is this a known issue? Worse, is this a
> *feature*? I don't have to be using hadoop at all... just want to process
> some files and data in Cassandra.
>
> Regards,
> Ashic.
>
>
