Hi all, Question from a newbie here about your excellent Spark:
I've just installed Spark 1.5.2, pre-built for Hadoop 2.4 and later, and I'm working through the introductory documentation using local[4] to begin with. In pyspark, I'm able to run examples such as the simple application from the quick-start guide, provided that I remove the sc initialisation. However, if I try to run any Python script using spark-submit, I get the verbose error message shown below and no output. I haven't been able to fix this, so any assistance would be very gratefully received.

My machine runs Windows 10 Home with 8 GB RAM on a 64-bit Intel Core i3 @ 3.4 GHz. I'm using Python 2.7.11 under Anaconda 2.4.1.

Source, from http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications:

    from pyspark import SparkContext

    logFile = "README.md"  # Should be some file on your system
    sc = SparkContext("local", "Simple App")
    logData = sc.textFile(logFile).cache()
    numAs = logData.filter(lambda s: 'a' in s).count()
    numBs = logData.filter(lambda s: 'b' in s).count()
    print("Lines with a: %i, lines with b: %i" % (numAs, numBs))

Error output:

    Traceback (most recent call last):
      File "c:/Users/Peter/spark-1.5.2-bin-hadoop2.4/SimpleApp.py", line 3, in <module>
        sc = SparkContext("local", "Simple App")
      File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\pyspark.zip\pyspark\context.py", line 113, in __init__
      File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\pyspark.zip\pyspark\context.py", line 170, in _do_init
      File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\pyspark.zip\pyspark\context.py", line 224, in _initialize_context
      File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py", line 701, in __call__
      File "c:\Users\Peter\spark-1.5.2-bin-hadoop2.4\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py", line 300, in get_return_value
    py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
    : java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1387)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1341)
        at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:484)
        at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:484)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:484)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
        at py4j.Gateway.invoke(Gateway.java:214)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
        at py4j.GatewayConnection.run(GatewayConnection.java:207)
        at java.lang.Thread.run(Unknown Source)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/1-5-2-prebuilt-for-2-4-spark-submit-standalone-Python-scripts-not-running-tp25804.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
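For reference, the quick-start page suggests submitting the script along these lines; the install path below is just the one that appears in the traceback, and the exact command may differ on your setup:

```shell
# Illustrative spark-submit invocation per the quick-start guide
# (path taken from the traceback above; adjust for your install).
cd c:/Users/Peter/spark-1.5.2-bin-hadoop2.4
bin/spark-submit --master "local[4]" SimpleApp.py
```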
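In case it helps to see what the job is supposed to compute: the script is just a filter-and-count over lines of text. A plain-Python sketch of the same logic, with made-up sample lines instead of README.md (no Spark needed), would be:

```python
# Plain-Python sketch of the filter/count logic the Spark job performs.
# The sample lines are illustrative; the real job reads README.md.
lines = ["apache spark", "big data", "hello world"]

num_as = sum(1 for s in lines if 'a' in s)  # count lines containing 'a'
num_bs = sum(1 for s in lines if 'b' in s)  # count lines containing 'b'

print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
# prints "Lines with a: 2, lines with b: 1"
```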
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org