Re: No module named pyspark - latest built

2014-11-14 Thread Andrew Or
I see. The general known constraints on building your assembly jar for
pyspark on Yarn are:

Build with Java 6
Do NOT build on RedHat
Build with Maven

Some of these are documented here
<http://spark.apache.org/docs/latest/building-with-maven.html> (bottom).
Maybe we should make it more explicit.
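
A rough illustration of why the Java version may matter. My understanding (an assumption, not something stated in this thread) is that an assembly built with a newer JDK can exceed the 65535-entry limit of the classic ZIP format, and such archives may not be importable by older Python versions. The sketch below only counts entries in a jar; the name jar_entry_report is hypothetical.

```python
import zipfile

# The classic ZIP format stores the entry count in 16 bits.
ZIP_CLASSIC_MAX_ENTRIES = 65535


def jar_entry_report(jar_path):
    """Return (entry_count, exceeds_classic_limit) for a jar/zip file."""
    with zipfile.ZipFile(jar_path) as jar:
        count = len(jar.infolist())
    return count, count > ZIP_CLASSIC_MAX_ENTRIES
```

Calling jar_entry_report on the assembly jar and getting True as the second value would be a hint, not proof, that the archive needs the ZIP64 extension.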

2014-11-13 2:31 GMT-08:00 jamborta :

> it was built with 1.6 (tried 1.7, too)


Re: No module named pyspark - latest built

2014-11-13 Thread jamborta
It was built with Java 1.6 (I tried 1.7, too).

On Thu, Nov 13, 2014 at 2:52 AM, Andrew Or-2 [via Apache Spark User
List]  wrote:
> Hey Jamborta,
>
> What java version did you build the jar with?




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/No-module-named-pyspark-latest-built-tp18740p18833.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: No module named pyspark - latest built

2014-11-12 Thread Andrew Or
Hey Jamborta,

What java version did you build the jar with?

2014-11-12 16:48 GMT-08:00 jamborta :

> I have figured out that building the fat jar with sbt does not seem to
> include the pyspark scripts when using the following command:
>
> sbt/sbt -Pdeb -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive clean publish-local assembly
>
> However, the Maven command works OK:
>
> mvn -Pdeb -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive -DskipTests clean package
>
> Am I running the correct sbt command?


Re: No module named pyspark - latest built

2014-11-12 Thread Xiangrui Meng
You need to build with Maven for the Python files to be included in the
assembly jar. See https://github.com/apache/spark/pull/1223 . -Xiangrui
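
One way to check whether a given assembly jar actually carries the Python files is to list its entries. This is a minimal sketch, assuming the package sits at the top level of the jar as pyspark/__init__.py (an assumption about the jar layout); the function name jar_contains_pyspark is hypothetical.

```python
import zipfile


def jar_contains_pyspark(jar_path):
    """Return True if the archive has a top-level pyspark package."""
    with zipfile.ZipFile(jar_path) as jar:
        # A jar is a zip archive, so its entries can be listed directly.
        return "pyspark/__init__.py" in jar.namelist()
```

For example, jar_contains_pyspark("spark-assembly-1.2.0-SNAPSHOT-hadoop2.3.0.jar") could be run against both the sbt-built and the Maven-built jar to compare them.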

On Wed, Nov 12, 2014 at 4:48 PM, jamborta  wrote:
> I have figured out that building the fat jar with sbt does not seem to
> include the pyspark scripts when using the following command:
>
> sbt/sbt -Pdeb -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive clean publish-local assembly
>
> However, the Maven command works OK:
>
> mvn -Pdeb -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive -DskipTests clean package
>
> Am I running the correct sbt command?

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: No module named pyspark - latest built

2014-11-12 Thread Tamas Jambor
Thanks. Will it work with sbt at some point?

On Thu, 13 Nov 2014 01:03 Xiangrui Meng  wrote:

> You need to use maven to include python files. See
> https://github.com/apache/spark/pull/1223 . -Xiangrui


Re: No module named pyspark - latest built

2014-11-12 Thread jamborta
I have figured out that building the fat jar with sbt does not seem to
include the pyspark scripts when using the following command:

sbt/sbt -Pdeb -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive clean publish-local assembly

However, the Maven command works OK:

mvn -Pdeb -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Phive -DskipTests clean package

Am I running the correct sbt command?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/No-module-named-pyspark-latest-built-tp18740p18787.html



Re: No module named pyspark - latest built

2014-11-12 Thread jamborta
I forgot to mention that this setup works in Spark standalone mode; the
problem only occurs when I run on YARN.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/No-module-named-pyspark-latest-built-tp18740p18777.html



No module named pyspark - latest built

2014-11-12 Thread jamborta
Hi all,

I am trying to run Spark with the latest build (from branch-1.2). As far as
I can see, all the paths are set and SparkContext starts up OK. However, I
cannot run anything that goes to the nodes. I get the following error:

Error from python worker:
  /usr/bin/python2.7: No module named pyspark
PYTHONPATH was:
  /mnt/yarn/nm/usercache/massive/filecache/15/spark-assembly-1.2.0-SNAPSHOT-hadoop2.3.0.jar
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
    at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
    at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
    at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:102)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)

Any idea where it is picking up this path from?

Thanks,
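
For context on why a jar can appear on PYTHONPATH at all: Python can import packages directly from a zip archive listed on sys.path, and a jar is a zip archive. The sketch below demonstrates that mechanism with a throwaway archive. The names demo_pkg, demo-assembly.jar, build_demo_archive and import_from_archive are hypothetical, and this is an illustration of the import mechanism, not of how Spark itself sets the path.

```python
import importlib
import os
import sys
import zipfile


def build_demo_archive(directory):
    """Create a small jar-like zip holding one Python package."""
    path = os.path.join(directory, "demo-assembly.jar")
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "ANSWER = 42\n")
    return path


def import_from_archive(archive_path, module_name):
    """Import module_name with archive_path temporarily on sys.path."""
    sys.path.insert(0, archive_path)
    importlib.invalidate_caches()
    try:
        # Raises ImportError if the archive does not contain the module,
        # which mirrors the "No module named ..." failure.
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive_path)
```

If the archive lacks the package, the import fails even though the archive path itself is present, which would be consistent with the error shown above.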



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/No-module-named-pyspark-latest-built-tp18740.html