Hi Tom, are you using an sbt-built assembly by any chance? If so, take a look at SPARK-5808.
I haven't had any problems with the Maven-built assembly. Setting SPARK_HOME on the executors is a workaround if you want to use the sbt assembly.

On Fri, Feb 20, 2015 at 2:56 PM, Tom Graves <tgraves...@yahoo.com.invalid> wrote:
> Trying to run pyspark on yarn in client mode with basic wordcount example I
> see the following error when doing the collect:
>
> Error from python worker:
>   /usr/bin/python: No module named sql
> PYTHONPATH was:
>   /grid/3/tmp/yarn-local/usercache/tgraves/filecache/20/spark-assembly-1.3.0-hadoop2.6.0.1.1411101121.jar
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
>         at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
>         at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
>         at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:105)
>         at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:69)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>         at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:308)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
>
> any ideas on this?
> Tom
>
> On Wednesday, February 18, 2015 2:14 AM, Patrick Wendell
> <pwend...@gmail.com> wrote:
>
> Please vote on releasing the following candidate as Apache Spark version
> 1.3.0!
>
> The tag to be voted on is v1.3.0-rc1 (commit f97b0d4a):
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=f97b0d4a6b26504916816d7aefcf3132cd1da6c2
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-1.3.0-rc1/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1069/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-1.3.0-rc1-docs/
>
> Please vote on releasing this package as Apache Spark 1.3.0!
>
> The vote is open until Saturday, February 21, at 08:03 UTC and passes
> if a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.3.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> == How can I help test this release? ==
> If you are a Spark user, you can help us test this release by
> taking a Spark 1.2 workload and running on this release candidate,
> then reporting any regressions.
>
> == What justifies a -1 vote for this release? ==
> This vote is happening towards the end of the 1.3 QA period,
> so -1 votes should only occur for significant regressions from 1.2.1.
> Bugs already present in 1.2.X, minor regressions, or bugs related
> to new features will not block this release.
>
> - Patrick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org

-- 
Marcelo
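
P.S. In case it helps with the SPARK_HOME workaround mentioned at the top: a minimal sketch of one way to pass SPARK_HOME to the executors, using the spark.executorEnv.<NAME> configuration property on the spark-submit command line. The helper name, the script name wordcount.py, and the path /opt/spark are placeholders I made up for illustration; the path would need to point at a real Spark install that exists on every node. I haven't verified this against your cluster, so treat it as a starting point only.

```python
import shlex


def spark_submit_command(script, spark_home, master="yarn-client"):
    """Build a spark-submit argv that exports SPARK_HOME to the executors.

    Hypothetical helper for illustration; it only assembles the command line
    and does not launch anything.
    """
    return [
        "spark-submit",
        "--master", master,
        # spark.executorEnv.<NAME> sets environment variable NAME
        # in the executor processes.
        "--conf", "spark.executorEnv.SPARK_HOME=" + spark_home,
        script,
    ]


if __name__ == "__main__":
    # Placeholder values: adjust the script and the install path for your setup.
    argv = spark_submit_command("wordcount.py", "/opt/spark")
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the argument list in one place makes it easy to switch between the sbt and Maven assemblies by changing only the path.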