That means the user who launch zeppelin don’t have permission to use python. Please switch to another user with proper permission.
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 192.168.168.29): java.io.IOException: Cannot run program "/opt/spark-2.0.2-bin-hadoop2.7/python": error=13, Permission denied Best Regard, Jeff Zhang From: Alec Lee <alec.in...@gmail.com<mailto:alec.in...@gmail.com>> Reply-To: "users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>" <users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>> Date: Friday, January 6, 2017 at 12:52 AM To: "users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>" <users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>> Subject: pyspark can't run through Hello, all I recently come cross good tool - zeppelin, it is easy to use. But I have some troubles to make pyspark work in my server. The code below used to work fine, but for no reason it pop up errors like permission deny. Code******** %pyspark import pandas ## http://blog.csdn.net/lsshlsw/article/details/53768756 ## possible error, fix as above PlateText = sc.textFile("hdfs://node9:54310/user/hadoop/tmp/plate_A216E6.csv") header = PlateText.first() #print header #print PlateText.count() Error msg *********** Traceback (most recent call last): File "/tmp/zeppelin_pyspark-7122867288384299951.py", line 267, in <module> raise Exception(traceback.format_exc()) Exception: Traceback (most recent call last): File "/tmp/zeppelin_pyspark-7122867288384299951.py", line 265, in <module> exec(code) File "<stdin>", line 3, in <module> File "/opt/spark-2.0.2-bin-hadoop2.7/python/pyspark/rdd.py", line 1328, in first rs = self.take(1) File "/opt/spark-2.0.2-bin-hadoop2.7/python/pyspark/rdd.py", line 1310, in take res = self.context.runJob(self, takeUpToNumLeft, p) File "/opt/spark-2.0.2-bin-hadoop2.7/python/pyspark/context.py", line 933, in runJob port = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, partitions) File "/opt/spark-2.0.2-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__ answer, self.gateway_client, self.target_id, self.name) File "/opt/spark-2.0.2-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63, in deco return f(*a, **kw) File "/opt/spark-2.0.2-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value format(target_id, ".", name), value) Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 192.168.168.29): java.io.IOException: Cannot run program "/opt/spark-2.0.2-bin-hadoop2.7/python": error=13, Permission denied at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163) at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:114) at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319) at org.apache.spark.rdd.RDD.iterator(RDD.scala:283) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) at org.apache.spark.scheduler.Task.run(Task.scala:86) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: error=13, Permission denied at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:248) at java.lang.ProcessImpl.start(ProcessImpl.java:134) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ... 14 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1873) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1886) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1899) at org.apache.spark.api.python.PythonRDD$.runJob(PythonRDD.scala:441) at org.apache.spark.api.python.PythonRDD.runJob(PythonRDD.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:280) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:214) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Cannot run program "/opt/spark-2.0.2-bin-hadoop2.7/python": error=13, Permission denied at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163) at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:65) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:114) at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319) at org.apache.spark.rdd.RDD.iterator(RDD.scala:283) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) at org.apache.spark.scheduler.Task.run(Task.scala:86) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more Caused by: java.io.IOException: error=13, Permission denied at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.<init>(UNIXProcess.java:248) at java.lang.ProcessImpl.start(ProcessImpl.java:134) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ... 14 more I install zeppelin with root user, and all the zeppelin ownership is root, what user permission zeppelin suppose to have? Here I also post the zeppelin-env.sh export SPARK_HOME=/opt/spark export PYSPARK_PYTHON=/opt/spark/python export PYTHONPATH=/opt/spark/python/lib/py4j-0.10.4-src.zip:/opt/spark/python: I also correspondingly make the changes in zeppelin interpreter. thanks AL