AbstractMethodError

2013-12-23 Thread leosand...@gmail.com
I wrote an example, MyWordCount, and just set spark.akka.frameSize larger than the default. But when I run this jar, there is a problem: 13/12/19 18:53:48 INFO ClusterTaskSetManager: Lost TID 0 (task 0.0:0) 13/12/19 18:53:48 INFO ClusterTaskSetManager: Loss was due to java.lang.AbstractMethodError
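For context, a minimal sketch of how such an override looked in the Spark 0.8 era, when configuration went through Java system properties set before the SparkContext was created; the master URL, paths, and the value 100 are placeholders, not details from the thread:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object MyWordCount {
  def main(args: Array[String]) {
    // Frame size in MB; must be set before the SparkContext is constructed.
    System.setProperty("spark.akka.frameSize", "100")

    val sc = new SparkContext("spark://master:7077", "MyWordCount")
    sc.textFile("hdfs:///input/words.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile("hdfs:///output/word-counts")
    sc.stop()
  }
}
```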

Re: AbstractMethodError

2013-12-23 Thread Azuryy Yu
Leo, which version of Spark are you using? The error is caused by code compiled with Scala 2.10. Spark 0.8.x uses Scala 2.9, so you must use the same major Scala version to compile your Spark code. On Mon, Dec 23, 2013 at 4:00 PM, leosand...@gmail.com leosand...@gmail.com wrote: I write a example MyWordCount , just set
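A hedged sketch of what matching the Scala version looks like in an sbt build of that era; the coordinates reflect Spark 0.8.x being published for Scala 2.9.3, and the exact versions here are assumptions:

```scala
name := "my-word-count"

// Must match the Scala major version Spark itself was built with.
scalaVersion := "2.9.3"

libraryDependencies += "org.apache.spark" % "spark-core_2.9.3" % "0.8.1-incubating"
```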

Re: ADD_JARS and jar dependencies in sbt

2013-12-23 Thread Gary Malouf
In your own project, use something like the sbt-assembly plugin to build a jar of your code and all of its dependencies. Once you have that, use ADD_JARS to add that jar alone and you should be set. On Mon, Dec 23, 2013 at 7:29 AM, Aureliano Buendia buendia...@gmail.com wrote: Hi, It seems
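As a sketch, wiring in sbt-assembly at the time meant one line in project/plugins.sbt; the plugin version and jar path below are illustrative:

```scala
// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.9.2")

// After running `sbt assembly`, point ADD_JARS at the single fat jar, e.g.:
//   ADD_JARS=target/scala-2.9.3/my-app-assembly-0.1.jar ./spark-shell
```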

Re: ADD_JARS doubt.!!!!!

2013-12-23 Thread Gary Malouf
I would not recommend putting your text files in via ADD_JARS. The better thing to do is to put those files in HDFS or locally on your driver server, load them into memory and then use Spark's broadcast variable concept to spread the data out across the cluster. On Mon, Dec 23, 2013 at 1:57 AM,
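A minimal sketch of the suggested pattern, assuming a lookup file on the driver's local disk; paths and names are placeholders:

```scala
import scala.io.Source
import org.apache.spark.SparkContext

object BroadcastExample {
  def main(args: Array[String]) {
    val sc = new SparkContext("local[2]", "BroadcastExample")

    // Load the file into driver memory once...
    val lookup: Set[String] = Source.fromFile("/data/words.txt").getLines().toSet

    // ...then broadcast it: each executor gets one copy, not one per task.
    val lookupBc = sc.broadcast(lookup)

    val matches = sc.textFile("hdfs:///input/data.txt")
      .filter(line => lookupBc.value.exists(w => line.contains(w)))
    println(matches.count())
    sc.stop()
  }
}
```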

Deploy my application on spark cluster

2013-12-23 Thread Pankaj Mittal
Hi, I have a scenario where Kafka is going to be the input source for data. How can I deploy my application, which has all the logic for transforming the Kafka input stream? I am a little bit confused about the usage of Spark in cluster mode. After running Spark in cluster mode, I want to deploy my
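For what the transformation logic itself might look like, here is a hedged sketch against the Spark 0.8-era streaming API, where StreamingContext exposed a built-in kafkaStream method; the ZooKeeper quorum, group id, and topic name are placeholders:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object KafkaTransform {
  def main(args: Array[String]) {
    val ssc = new StreamingContext("spark://master:7077", "KafkaTransform", Seconds(5))

    // One receiver thread for the "events" topic, coordinated via ZooKeeper.
    val lines = ssc.kafkaStream("zk1:2181", "my-group", Map("events" -> 1))

    // The transformation logic goes here, e.g. a streaming word count.
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()

    ssc.start()
  }
}
```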

Re: debugging NotSerializableException while using Kryo

2013-12-23 Thread Ameet Kini
Thanks Imran. I tried setting spark.closure.serializer to org.apache.spark.serializer.KryoSerializer and now end up seeing a NullPointerException when the executor starts up. This is a snippet of the executor's log. Notice how "registered TileIdWritable" and "registered ArgWritable" are logged, so I see
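For readers following along, a sketch of the setup being described, with stub stand-ins for the thread's custom classes (the real ones are not shown in the digest):

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Stubs standing in for the thread's custom writables.
class TileIdWritable extends Serializable
class ArgWritable extends Serializable

class MyRegistrator extends KryoRegistrator {
  def registerClasses(kryo: Kryo) {
    kryo.register(classOf[TileIdWritable])
    kryo.register(classOf[ArgWritable])
  }
}

// Spark 0.8-style configuration, set before creating the SparkContext:
//   System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
//   System.setProperty("spark.closure.serializer", "org.apache.spark.serializer.KryoSerializer")
//   System.setProperty("spark.kryo.registrator", "MyRegistrator")
```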

Unable to load additional JARs in yarn-client mode

2013-12-23 Thread Karavany, Ido
Hi All, For our application we need to use the yarn-client mode featured in 0.8.1. (Yarn 2.0.5) We've successfully executed it in both yarn-client and yarn-standalone modes with our Java applications. While in yarn-standalone there is a way to add external JARs, we couldn't find a way to add those in

Re: failed to compile spark because of the missing packages

2013-12-23 Thread Patrick Wendell
Hey Nan, You shouldn't copy lib_managed manually. SBT will deal with that. Try just using the same .gitignore settings that we have in the spark github. Seems like you are accidentally including some files that cause this to get messed up. - Patrick On Mon, Dec 23, 2013 at 8:37 AM, Nan Zhu

Re: debugging NotSerializableException while using Kryo

2013-12-23 Thread Jie Deng
Maybe try making your class implement Serializable... 2013/12/23 Ameet Kini ameetk...@gmail.com Thanks Imran. I tried setting spark.closure.serializer to org.apache.spark.serializer.KryoSerializer and now end up seeing NullPointerException when the executor starts up. This is a snippet of
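In Scala terms, the suggestion amounts to something like this (a sketch, not the actual class from the thread):

```scala
// Making a class Java-serializable is one keyword:
class TileId(val zoom: Int, val x: Int, val y: Int) extends Serializable
```

As the follow-up later in this digest notes, this falls back to Java serialization rather than fixing the Kryo path.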

Noob Spark questions

2013-12-23 Thread Ognen Duzlevski
Hello, I am new to Spark and have installed it and played with it a bit; mostly I am reading through the Fast Data Processing with Spark book. One of the first things I realized is that I have to learn Scala; the real-time data analytics part is not supported by the Python API, correct? I don't

Re: failed to compile spark because of the missing packages

2013-12-23 Thread Nan Zhu
Hi Patrick, Thanks for the reply. I still failed to compile the code, even after I made the following attempts: 1. downloaded spark-0.8.1.tgz, 2. decompressed it and copied the files to the local github repo directory (.gitignore is just copied from

Re: Noob Spark questions

2013-12-23 Thread Jie Deng
I am using Java, and Spark has APIs for Java as well. Though there is a saying that Java in Spark is slower than the Scala shell, it depends on your requirements. I am not an expert in Spark, but as far as I know, Spark provides different levels of storage, including memory or disk. And for the disk
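A sketch of the storage levels being referred to; the RDD API lets you pick memory, disk, or both per dataset (paths are placeholders):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext("local[2]", "StorageLevels")

// Kept in memory only; recomputed if it does not fit (same as cache()).
val hot = sc.textFile("hdfs:///input/hot.txt").persist(StorageLevel.MEMORY_ONLY)

// Kept in memory, spilling partitions to local disk when memory is full.
val big = sc.textFile("hdfs:///input/big.txt").persist(StorageLevel.MEMORY_AND_DISK)
```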

Re: failed to compile spark because of the missing packages

2013-12-23 Thread Nan Zhu
I finally solved the issue manually. I found that when I compile with sbt, the lib/ directories under streaming/ and repl/ are missing. The reason is that the official .gitignore intends to ignore "lib/", while in the distributed tgz files these two lib/ directories are included….
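A sketch of what the fix implies for a local checkout, assuming git's negation syntax is used to re-include the two checked-in directories:

```
# .gitignore: stop ignoring the lib/ directories that ship in the tgz.
lib/
!streaming/lib/
!repl/lib/
```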

Re: debugging NotSerializableException while using Kryo

2013-12-23 Thread Ameet Kini
Using Java serialization would make the NPE go away, but it would be a less preferable solution. My application is network-intensive, and serialization cost is significant. In other words, these objects are ideal candidates for Kryo. On Mon, Dec 23, 2013 at 3:41 PM, Jie Deng

Re: debugging NotSerializableException while using Kryo

2013-12-23 Thread Michael (Bach) Bui
What Spark version are you using? By looking at the code at Executor.scala line 195, you will at least know what caused the NPE. We can start from there. On Dec 23, 2013, at 10:21 AM, Ameet Kini ameetk...@gmail.com wrote: Thanks Imran. I tried setting spark.closure.serializer to

Re: Noob Spark questions

2013-12-23 Thread Mark Hamstra
"Though there is a saying that Java in Spark is slower than Scala shell" That shouldn't be said. The Java API is mostly a thin wrapper around the Scala implementation, and the performance of the Java API is intended to be equivalent to that of the Scala API. If you're finding that not to be true,

Re: Noob Spark questions

2013-12-23 Thread Ognen Duzlevski
Hello, On Mon, Dec 23, 2013 at 3:23 PM, Jie Deng deng113...@gmail.com wrote: I am using Java, and Spark has APIs for Java as well. Though there is a saying that Java in Spark is slower than Scala shell, well, depends on your requirement. I am not an expert in Spark, but as far as I know,

mapPartitions versus map overhead?

2013-12-23 Thread Huan Dao
Hi all, is there any overhead of mapPartitions versus map? That is, if I implement an algorithm using map-reduce versus mapPartitions-reduce. Thanks, Huan Dao
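A sketch of the difference: map invokes the function once per element, while mapPartitions invokes it once per partition, so any per-invocation setup cost is paid per partition instead of per element; the results are otherwise the same.

```scala
import org.apache.spark.SparkContext

val sc = new SparkContext("local[2]", "MapVsMapPartitions")
val nums = sc.parallelize(1 to 1000, 8)

// map: one function call per element.
val viaMap = nums.map(_ * 2)

// mapPartitions: one call per partition; setup runs 8 times, not 1000.
val viaPartitions = nums.mapPartitions { iter =>
  val factor = 2 // placeholder for expensive per-partition setup
  iter.map(_ * factor)
}

println(viaMap.reduce(_ + _) == viaPartitions.reduce(_ + _)) // true
```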

Re: Unable to load additional JARs in yarn-client mode

2013-12-23 Thread Matei Zaharia
I’m surprised by this, but one way that will definitely work is to assemble your application into a single JAR. If passing them to the constructor doesn’t work, that’s probably a bug. Matei On Dec 23, 2013, at 12:03 PM, Karavany, Ido ido.karav...@intel.com wrote: Hi All, For our
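The constructor being referred to is presumably the multi-argument SparkContext constructor of that era, which accepted a list of jars for Spark to ship to executors; a hedged sketch with placeholder paths:

```scala
import org.apache.spark.SparkContext

val jars = Seq("/path/to/my-app.jar", "/path/to/extra-dependency.jar")
val sc = new SparkContext(
  "yarn-client",               // master, matching the thread's setup
  "MyApp",                     // application name
  System.getenv("SPARK_HOME"), // Spark home on the cluster nodes
  jars)                        // jars Spark distributes to executors
```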

RE: Unable to load additional JARs in yarn-client mode

2013-12-23 Thread Liu, Raymond
Ido, when you say add external JARs, do you mean -addJars, which adds some jars for the SparkContext to use in the AM env? If so, I think you don't need it for yarn-client mode at all: in yarn-client mode, the SparkContext runs locally, so I think you just need to make sure those jars are in the