RE: Worker failed to connect when built with SPARK_HADOOP_VERSION=2.2.0

2013-12-02 Thread Liu, Raymond
What version of the code are you using? Support for 2.2.0 is not yet merged into trunk; check out https://github.com/apache/incubator-spark/pull/199. Best Regards, Raymond Liu

Re: Worker failed to connect when built with SPARK_HADOOP_VERSION=2.2.0

2013-12-02 Thread Maxime Lemaire
Horia, if you don't need YARN support you can get it to work by setting SPARK_YARN to false: SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=false sbt/sbt assembly. Raymond, OK, thank you, so that's why; I'm using the latest release, 0.8.0 (September 25, 2013).
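For reference, the two build variants at issue, run from the Spark source root (the SPARK_YARN=true line is taken from the Spark 0.8 build instructions, not from Maxime's message):

    # Hadoop 2.2.0 client libraries, YARN support off -- works on 0.8.0:
    SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=false sbt/sbt assembly

    # YARN support on -- fails against 2.2.0 until the patch Raymond linked lands:
    SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly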

Connection Error Problem

2013-12-02 Thread lihu
Hi, I ran the SparkLR example on Spark 0.9. When I run a small set of data, about 8 MB, it succeeds, but when I run about 800 MB of data, the connection problem below occurs. At first I thought this was due to the Hadoop file, because I do not set the Hadoop file, but it can pass in

Re: Connection Error Problem

2013-12-02 Thread lihu
I also hit this problem in Spark 0.8. Here is the SparkLR code; I only change N and D between the two experiments:

    val N = 1   // Number of data points
    val D = 100 // Number of dimensions
    val R = 0.7 // Scaling factor
    val ITERATIONS = 10
    val rand = new
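For readers without the examples directory handy, here is a sketch of the SparkLR program as it ships with the Spark 0.8 examples, trimmed and lightly adapted (the master URL is a placeholder, and lihu varies N and D per experiment):

    import java.util.Random
    import scala.math.exp
    import org.apache.spark.SparkContext
    import org.apache.spark.util.Vector

    object SparkLR {
      val N = 10000        // Number of data points
      val D = 100          // Number of dimensions
      val R = 0.7          // Scaling factor
      val ITERATIONS = 10
      val rand = new Random(42)

      case class DataPoint(x: Vector, y: Double)

      def generatePoint(i: Int) = {
        val y = if (i % 2 == 0) -1 else 1
        DataPoint(Vector(D, _ => rand.nextGaussian + y * R), y)
      }

      def main(args: Array[String]) {
        val sc = new SparkContext("local[4]", "SparkLR")   // placeholder master URL
        val points = sc.parallelize(1 to N).map(generatePoint).cache()

        // Start from random weights, then run batch gradient descent.
        var w = Vector(D, _ => 2 * rand.nextDouble - 1)
        for (i <- 1 to ITERATIONS) {
          val gradient = points.map { p =>
            (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
          }.reduce(_ + _)
          w -= gradient
        }
        println("Final w: " + w)
      }
    }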

How to balance task load

2013-12-02 Thread Hao REN
Hi, when running some tests on EC2 with Spark, I notice that tasks are not fairly distributed across executors. For example, a map action produces 4 tasks, but they all go to one executor. From the Executors (3) page of the web UI: Memory: 0.0 B Used (19.0 GB Total); Disk: 0.0 B Used; Executor ID / Address / RDD
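One thing worth checking, though the thread is truncated so this is only a guess at the cause: if the RDD has fewer partitions than the cluster has cores, the scheduler has nothing to spread. A minimal Scala sketch of forcing more partitions, assuming an existing SparkContext sc (the path and counts are hypothetical):

    // The second argument to parallelize/textFile controls partition count;
    // aim for at least as many partitions as the cluster has cores.
    val data = sc.parallelize(1 to 1000000, 12)       // 12 slices
    val lines = sc.textFile("hdfs://host/input", 12)  // 12 minimum splits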

Re: Could not find resource path for Web UI: org/apache/spark/ui/static

2013-12-02 Thread Walrus theCat
Anyone have any ideas based on the stack trace? Thanks On Sun, Dec 1, 2013 at 9:09 PM, Walrus theCat walrusthe...@gmail.com wrote: Shouldn't? I imported the new 0.8.0 jars into my build path and had to update my imports accordingly. The only way I upload the Spark jars myself is that they
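If the static resources are simply missing from the jars on the classpath, one guess at a fix is to depend on the published 0.8.0 artifact, which does bundle org/apache/spark/ui/static, instead of a locally built classes directory. In sbt (the coordinate is an assumption based on the 0.8.0-incubating release, not something confirmed in the thread):

    // Spark 0.8.0 was built for Scala 2.9.3.
    libraryDependencies += "org.apache.spark" % "spark-core_2.9.3" % "0.8.0-incubating"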

Serializable incompatible with Externalizable error

2013-12-02 Thread Matt Cheah
Hi everyone, I'm running into a case where I'm creating a Java RDD of an Externalizable class, and getting this stack trace: java.io.InvalidClassException (java.io.InvalidClassException: com.palantir.finance.datatable.server.spark.WritableDataRow; Serializable incompatible with

Re: RDD cache question

2013-12-02 Thread Yadid Ayzenberg
Thanks Mark, that makes perfect sense. I guess I still don't have a full picture in my head when it comes to caching: how is the RDD cache managed (assuming there is not enough memory for all the cached RDDs)? Is it LRU or LFU, or something else? Thanks, Yadid On 11/30/13 10:56 PM, Mark
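For what it's worth, the Spark documentation of this era describes cache eviction as least-recently-used (LRU). If recomputing evicted partitions is expensive, persist can spill them to disk instead; a minimal sketch, assuming an existing SparkContext sc and a hypothetical input path:

    import org.apache.spark.storage.StorageLevel

    // MEMORY_ONLY (the cache() default) drops LRU partitions and recomputes them
    // on demand; MEMORY_AND_DISK writes evicted partitions to local disk instead.
    val rdd = sc.textFile("hdfs://host/big-input").persist(StorageLevel.MEMORY_AND_DISK)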

Re: Spark worker processes hang while processing small 2MB dataset

2013-12-02 Thread K. Shankari
So if I run my code directly from the spark-shell, it works as well. Luckily, I have a fairly small main function. I wonder if there is something funky going on with my spark context - that seems to be the main difference in launching the program. Anyway, I am unblocked now, so I will go off and
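For comparison, spark-shell builds its context automatically, while a standalone 0.8 program constructs its own; the jar list passed to the constructor is one plausible place for the "funky" difference (the master URL and paths below are hypothetical):

    import org.apache.spark.SparkContext

    // spark-shell provides `sc` for you; a standalone driver must supply the
    // master URL, Spark home, and application jars itself.
    val sc = new SparkContext("spark://master:7077", "MyApp",
      "/opt/spark", Seq("target/myapp.jar"))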

Re: Serializable incompatible with Externalizable error

2013-12-02 Thread Andrew Ash
At least from http://stackoverflow.com/questions/817853/what-is-the-difference-between-serializable-and-externalizable-in-java it looks like Externalizable is roughly an old-Java version of Serializable. Does that class implement both interfaces? Can you take away the Externalizable interface if
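That exception usually means the serialized bytes and the local class disagree about which mechanism the class uses, for example after a code change that left mismatched jars between driver and workers. For reference, a minimal sketch of a clean Externalizable implementation in Scala; WritableDataRowLike is a hypothetical stand-in for the class in the trace, and the key constraints are no-arg construction plus matching write/read order:

    import java.io.{Externalizable, ObjectInput, ObjectOutput}

    class WritableDataRowLike extends Externalizable {
      var values: Array[Double] = Array.empty  // must be constructible with no args

      override def writeExternal(out: ObjectOutput): Unit = {
        out.writeInt(values.length)
        values.foreach(v => out.writeDouble(v))
      }

      override def readExternal(in: ObjectInput): Unit = {
        values = Array.fill(in.readInt())(in.readDouble())
      }
    }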