Thanks for the answer. I am now trying to set HADOOP_HOME, but the issue
still persists. Also, I can see only windows-utils.exe in my HADOOP_HOME,
but no WINUTILS.EXE.

I do not have Hadoop installed on my system, as I am not using HDFS, but I
am using Spark 1.3.1 prebuilt with Hadoop 2.6. Am I missing something?
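
In case it helps, here is (roughly) the check I am running on my side
before creating the SparkContext; C:\hadoop below is just a placeholder
for wherever the prebuilt binaries actually live:

import os

# Point HADOOP_HOME at the directory that should contain bin\winutils.exe
# (placeholder path; adjust to your own layout).
os.environ["HADOOP_HOME"] = r"C:\hadoop"

winutils = os.path.join(os.environ["HADOOP_HOME"], "bin", "winutils.exe")
print("HADOOP_HOME = %s" % os.environ["HADOOP_HOME"])
print("winutils.exe exists: %s" % os.path.exists(winutils))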

Best
Ayan

On Tue, Apr 28, 2015 at 12:45 AM, Steve Loughran <ste...@hortonworks.com>
wrote:

>
>  This is a Hadoop-side stack trace.
>
>  It looks like the code is trying to get the filesystem permissions by
> running
>
>  %HADOOP_HOME%\bin\WINUTILS.EXE ls -F
>
>  and something is triggering a null pointer exception.
>
>  There isn't any HADOOP JIRA with this specific stack trace in it, so
> it's not a known/fixed problem.
>
>  At a guess, your HADOOP_HOME environment variable isn't pointing to the
> right place. If that's the case, there should have been a warning in the
> logs.
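>
>  As a quick sanity check, something like this (only a sketch; substitute
> any existing local path for C:\tmp) reproduces roughly the call the
> Hadoop code is making:
>
> import os, subprocess
>
> # loadPermissionInfo() shells out to "winutils.exe ls -F <path>"; if
> # HADOOP_HOME doesn't resolve to a real winutils, the command array ends
> # up with a null entry and ProcessBuilder.start() throws the NPE in
> # your trace.
> winutils = os.path.join(os.environ["HADOOP_HOME"], "bin", "winutils.exe")
> subprocess.check_call([winutils, "ls", "-F", r"C:\tmp"])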
>
>>> Py4JJavaError: An error occurred while calling
>>> z:org.apache.spark.api.python.PythonRDD.collectAndServe.
>>> : java.lang.NullPointerException
>>>  at java.lang.ProcessBuilder.start(Unknown Source)
>>>  at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
>>>  at org.apache.hadoop.util.Shell.run(Shell.java:455)
>>>  at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>>>  at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
>>>  at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
>>>  at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
>>>  at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:582)
>>>  at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:557)
>>>  at org.apache.hadoop.fs.LocatedFileStatus.<init>(LocatedFileStatus.java:42)
>>>  at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1699)
>>>  at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1681)
>>>  at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:268)
>>>  at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>>>  at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>>>  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
>>>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>>  at scala.Option.getOrElse(Option.scala:120)
>>>  at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>>  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>>>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>>  at scala.Option.getOrElse(Option.scala:120)
>>>  at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>>  at org.apache.spark.api.python.PythonRDD.getPartitions(PythonRDD.scala:57)
>>>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>>  at scala.Option.getOrElse(Option.scala:120)
>>>  at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1512)
>>>  at org.apache.spark.rdd.RDD.collect(RDD.scala:813)
>>>  at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:374)
>>>  at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>>>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>>  at java.lang.reflect.Method.invoke(Unknown Source)
>>>  at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>>>  at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>>>  at py4j.Gateway.invoke(Gateway.java:259)
>>>  at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>>>  at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>>  at py4j.GatewayConnection.run(GatewayConnection.java:207)
>>>  at java.lang.Thread.run(Unknown Source)


-- 
Best Regards,
Ayan Guha
