Re: sbt run with spark.ContextCleaner ERROR

2014-05-07 Thread Tathagata Das
Okay, this needs to be fixed. Thanks for reporting this! On Mon, May 5, 2014 at 11:00 PM, wxhsdp wrote: > Hi, TD > > i tried on v1.0.0-rc3 and still got the error

Re: How to read a multipart s3 file?

2014-05-07 Thread kamatsuoka
Yes, I'm using s3:// for both. I was using s3n:// but I got frustrated by how slow it is at writing files. In particular, the phase where it moves the temporary files to their permanent location takes as long as writing the file itself. I can't believe anyone uses this.

Re: master attempted to re-register the worker and then took all workers as unregistered

2014-05-07 Thread Cheney Sun
Hi Nan, In the worker's log, I see the following exception thrown when it tries to launch an executor. (The SPARK_HOME is wrongly specified on purpose, so there is no such file "/usr/local/spark1/bin/compute-classpath.sh".) After the exception was thrown several times, the worker was requested to kill the

Re: Spark and Java 8

2014-05-07 Thread Kristoffer Sjögren
Running Hadoop and HDFS on an unsupported JVM runtime sounds a little adventurous. But as long as Spark can run in a separate Java 8 runtime, it's all good. I think having lambdas and type inference is huge when writing these jobs, and using Scala (paying the price of complexity, poor tooling etc etc) f

Re: details about event log

2014-05-07 Thread wxhsdp
any ideas? thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/details-about-event-log-tp5411p5476.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Unable to load native-hadoop library problem

2014-05-07 Thread Sophia
Hi, everyone,

[root@CHBM220 spark-0.9.1]# SPARK_JAR=.assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar \
  ./bin/spark-class org.apache.spark.deploy.yarn.Client \
  --jar examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar \
  --class org.apache.spark.examples.SparkPi \
  --args yar

Re: How to use spark-submit

2014-05-07 Thread Tathagata Das
Doesn't the run-example script work for you? Also, are you on the latest commit of branch-1.0? TD On Mon, May 5, 2014 at 7:51 PM, Soumya Simanta wrote: > Yes, I'm struggling with a similar problem where my classes are not found on the worker nodes. I'm using 1.0.0_SNAPSHOT. I would really
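For context on the thread above, a spark-submit invocation in the Spark 1.0 branch typically looks like the sketch below. All paths, the master URL, and the extra jar are illustrative assumptions, not values from this thread; the --jars flag is the usual way to ship dependency jars to worker nodes so classes are found there.

```shell
# Illustrative spark-submit usage (paths and master URL are made up):
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master-host:7077 \
  --jars /path/to/extra-dependency.jar \
  /path/to/spark-examples-assembly.jar 10
```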

Is there anything that I need to modify?

2014-05-07 Thread Sophia
[root@CHBM220 spark-0.9.1]# SPARK_JAR=.assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar \
  ./bin/spark-class org.apache.spark.deploy.yarn.Client \
  --jar examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar \
  --class org.apache.spark.examples.SparkPi \
  --args yarn-standalone

Re: Easy one

2014-05-07 Thread Ian Ferreira
Thanks!

From: Aaron Davidson
Date: Tuesday, May 6, 2014 at 5:32 PM
Subject: Re: Easy one

If you're using standalone mode, you need to make sure the Spark Workers know about the extra memory. This can be configured in spark-env.sh on the workers as export SPARK_WORKER_MEMORY
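A minimal sketch of the spark-env.sh setting Aaron mentions; the 4g value is an illustrative assumption, and the second line is optional:

```shell
# spark-env.sh on each standalone worker (values are illustrative)
export SPARK_WORKER_MEMORY=4g   # total memory the worker may hand out to executors
export SPARK_DAEMON_MEMORY=1g   # optional: heap for the worker daemon process itself
```

The workers must be restarted after changing this file for the setting to take effect.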

Re: log4j question

2014-05-07 Thread Sophia
I have tried to check the logs, but my log4j.properties does not take effect. What should I do to see the running logs?
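One common fix, sketched here as an assumption about the setup: Spark picks up conf/log4j.properties from the classpath, so copying conf/log4j.properties.template to conf/log4j.properties and setting the root level is usually enough. A minimal file along the lines of Spark's own template:

```properties
# conf/log4j.properties (start from conf/log4j.properties.template)
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```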

Re: How to read a multipart s3 file?

2014-05-07 Thread Han JU
Just a few additions to the other answers: If you output to, say, `s3://bucket/myfile`, then you can use this bucket as the input of other jobs (sc.textFile('s3://bucket/myfile')). By default all `part-xxx` files will be used. There's also `sc.wholeTextFiles` that you can play with. If your file is s
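A short Scala sketch of the pattern described above. The bucket and path are made-up examples, and this needs a Spark runtime plus S3 credentials to actually run:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadMultipart {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("read-multipart"))

    // Pointing at the output directory picks up every part-xxxxx file under it.
    val lines = sc.textFile("s3n://my-bucket/myfile")

    // Alternatively, wholeTextFiles yields (path, content) pairs, one per part file.
    val files = sc.wholeTextFiles("s3n://my-bucket/myfile")

    println(s"line count: ${lines.count()}, file count: ${files.count()}")
    sc.stop()
  }
}
```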

Re: How to read a multipart s3 file?

2014-05-07 Thread Nicholas Chammas
Amazon also strongly discourages the use of s3:// because the block file system it maps to is deprecated. http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-file-systems.html Note > The configuration of Hadoop running on Amazon EMR differs from the default > configuration