Exception while using kryo with broadcast

2015-04-15 Thread Jeetendra Gangele
Hi All, I am getting the below exception while using the Kryo serializer with a
broadcast variable. I am broadcasting a HashMap with the lines below.

Map matchData = RddForMarch.collectAsMap();
final Broadcast> dataMatchGlobal = jsc.broadcast(matchData);






15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7)
java.io.IOException: java.lang.UnsupportedOperationException
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
        at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103)
        at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException
        at java.util.AbstractMap.put(AbstractMap.java:203)
        at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135)
        at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142)
        at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000)
        ... 18 more
15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared
15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped
15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
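The `Caused by` frame points at `java.util.AbstractMap.put`: the map being broadcast is a wrapper that never overrides `put()`, while Kryo's `MapSerializer` rebuilds a map on the executor by instantiating the same class and calling `put()` on it. A minimal pure-JDK sketch of that failure mode (the `ReadOnlyMap` class is a hypothetical stand-in for the wrapper returned by `collectAsMap()`; it is not Spark code):

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class PutFailsRepro {
    // Hypothetical stand-in for the wrapper returned by collectAsMap():
    // it only implements entrySet(), so put() falls through to
    // AbstractMap.put(), which unconditionally throws.
    static class ReadOnlyMap extends AbstractMap<Long, String> {
        private final Map<Long, String> backing = new HashMap<>();
        ReadOnlyMap() { backing.put(1L, "acme"); }
        @Override public Set<Map.Entry<Long, String>> entrySet() {
            return backing.entrySet();
        }
    }

    public static void main(String[] args) {
        Map<Long, String> m = new ReadOnlyMap();
        try {
            // This mimics what a put-based deserializer (such as Kryo's
            // MapSerializer.read()) does while repopulating the map.
            m.put(2L, "other");
            System.out.println("put succeeded");
        } catch (UnsupportedOperationException e) {
            System.out.println("put threw UnsupportedOperationException");
        }
    }
}
```

Reading the map (`entrySet()`, `get()`) is fine; only put-based reconstruction breaks, which is why the map works on the driver but fails when deserialized on an executor.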


RE: Exception while using kryo with broadcast

2015-08-06 Thread Shuai Zheng
Hi,

I have exactly the same issue on Spark 1.4.1 (on EMR, latest default AMI 4.0),
run as a YARN client. After wrapping the map in another java HashMap, the
exception disappears.

But may I know what the right solution is? Has a JIRA ticket been created for
this? I want to monitor it; even though it can be bypassed by wrapping in
another HashMap, that is ugly, so I want to remove that piece of code later.

BTW: the interesting thing is that when I run this in local mode, even with
Kryo, there is no issue (so it passed my unit tests and dev tests).

Regards,

Shuai

 

From: Jeetendra Gangele [mailto:gangele...@gmail.com] 
Sent: Wednesday, April 15, 2015 10:59 AM
To: Imran Rashid
Cc: Akhil Das; user
Subject: Re: Exception while using kryo with broadcast

 

This worked with java serialization. I am using 1.2.0; you are right that with
1.2.1 or 1.3.0 this issue will not occur.

I will test this and let you know.

 

On 15 April 2015 at 19:48, Imran Rashid  wrote:

oh interesting.  The suggested workaround is to wrap the result from
collectAsMap into another hashmap, you should try that:

Map matchData = RddForMarch.collectAsMap();
Map tmp = new HashMap(matchData);
final Broadcast> dataMatchGlobal = jsc.broadcast(tmp);
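The reason this wrapping works can be sketched without Spark at all: the `HashMap` copy constructor iterates the source's `entrySet()` and calls `put()` only on the new `HashMap`, so it succeeds even when the source map's own `put()` throws, and the resulting plain `HashMap` deserializes cleanly via put-based serializers. A minimal sketch, where `ReadOnlyMap` is a hypothetical stand-in for the map returned by `collectAsMap()`:

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class WrapWorkaround {
    // Hypothetical stand-in for the wrapper returned by collectAsMap():
    // entrySet() works, but put() inherits AbstractMap.put() and throws.
    static class ReadOnlyMap extends AbstractMap<Long, String> {
        private final Map<Long, String> backing = new HashMap<>();
        ReadOnlyMap() { backing.put(1L, "acme"); }
        @Override public Set<Map.Entry<Long, String>> entrySet() {
            return backing.entrySet();
        }
    }

    public static void main(String[] args) {
        Map<Long, String> fromCollectAsMap = new ReadOnlyMap();

        // The copy constructor reads via entrySet() and writes into the
        // new HashMap, so the source's broken put() is never invoked.
        Map<Long, String> tmp = new HashMap<>(fromCollectAsMap);

        // The copy is a plain, fully mutable HashMap.
        tmp.put(2L, "other");
        System.out.println("entries after copy+put: " + tmp.size());
    }
}
```

In the real job the copy (`tmp` in the snippet above) is what gets passed to `jsc.broadcast(...)`.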

 

Can you please clarify:

* Does it work w/ java serialization in the end?  Or is this kryo only?

* which Spark version you are using? (one of the relevant bugs was fixed in
1.2.1 and 1.3.0)

 

 

 

On Wed, Apr 15, 2015 at 9:06 AM, Jeetendra Gangele 
wrote:

This looks like a known issue? Check this out:

http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html

Can you please suggest any workaround? I am broadcasting a HashMap returned
from RDD.collectAsMap().

 

On 15 April 2015 at 19:33, Imran Rashid  wrote:

this is a really strange exception ... I'm especially surprised that it
doesn't work w/ java serialization.  Do you think you could try to boil it
down to a minimal example?

 

On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele 
wrote:

Yes, without Kryo it did work; when I remove the Kryo registration it works.

 

On 15 April 2015 at 19:24, Jeetendra Gangele  wrote:

It's not working in combination with Broadcast.

Without Kryo it's also not working.

 

 

On 15 April 2015 at 19:20, Akhil Das  wrote:

Is it working without kryo?




Thanks

Best Regards

 

On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele 
wrote:

Hi All, I am getting the below exception while using the Kryo serializer with a
broadcast variable. I am broadcasting a HashMap with the lines below.

Map matchData = RddForMarch.collectAsMap();
final Broadcast> dataMatchGlobal = jsc.broadcast(matchData);

 

 

 

 

 

 


Re: Exception while using kryo with broadcast

2015-04-15 Thread Akhil Das
Is it working without kryo?

Thanks
Best Regards

On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele 
wrote:



Re: Exception while using kryo with broadcast

2015-04-15 Thread Jeetendra Gangele
It's not working in combination with Broadcast.
Without Kryo it's also not working.

On 15 April 2015 at 19:20, Akhil Das  wrote:

> Is it working without kryo?
>
> Thanks
> Best Regards
>


Re: Exception while using kryo with broadcast

2015-04-15 Thread Jeetendra Gangele
Yes, without Kryo it did work; when I remove the Kryo registration it works.

On 15 April 2015 at 19:24, Jeetendra Gangele  wrote:

> its not working with the combination of Broadcast.
> Without Kyro also not working.
>
>
> On 15 April 2015 at 19:20, Akhil Das  wrote:
>
>> Is it working without kryo?
>>
>> Thanks
>> Best Regards
>>


Re: Exception while using kryo with broadcast

2015-04-15 Thread Imran Rashid
this is a really strange exception ... I'm especially surprised that it
doesn't work w/ java serialization.  Do you think you could try to boil it
down to a minimal example?

On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele 
wrote:

> Yes Without Kryo it did work out.when I remove kryo registration it did
> worked out
>
> On 15 April 2015 at 19:24, Jeetendra Gangele  wrote:
>
>> its not working with the combination of Broadcast.
>> Without Kyro also not working.
>>
>>
>> On 15 April 2015 at 19:20, Akhil Das  wrote:
>>
>>> Is it working without kryo?
>>>
>>> Thanks
>>> Best Regards
>>>


Re: Exception while using kryo with broadcast

2015-04-15 Thread Jeetendra Gangele
This looks like a known issue? Check this out:
http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html

Can you please suggest any workaround? I am broadcasting a HashMap returned
from RDD.collectAsMap().

On 15 April 2015 at 19:33, Imran Rashid  wrote:

> this is a really strange exception ... I'm especially surprised that it
> doesn't work w/ java serialization.  Do you think you could try to boil it
> down to a minimal example?
>
> On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele 
> wrote:
>
>> Yes Without Kryo it did work out.when I remove kryo registration it did
>> worked out
>>
>> On 15 April 2015 at 19:24, Jeetendra Gangele 
>> wrote:
>>
>>> its not working with the combination of Broadcast.
>>> Without Kyro also not working.
>>>
>>>
>>> On 15 April 2015 at 19:20, Akhil Das  wrote:
>>>
 Is it working without kryo?

 Thanks
 Best Regards



Re: Exception while using kryo with broadcast

2015-04-15 Thread Imran Rashid
oh interesting.  The suggested workaround is to wrap the result from
collectAsMap into another hashmap, you should try that:

Map matchData = RddForMarch.collectAsMap();
Map tmp = new HashMap(matchData);
final Broadcast> dataMatchGlobal = jsc.broadcast(tmp);

Can you please clarify:
* Does it work w/ java serialization in the end?  Or is this kryo only?
* which Spark version you are using? (one of the relevant bugs was fixed in
1.2.1 and 1.3.0)



On Wed, Apr 15, 2015 at 9:06 AM, Jeetendra Gangele 
wrote:

> This looks like known issue? check this out
>
> http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html
>
> Can you please suggest any work around I am broad casting HashMap return
> from RDD.collectasMap().
>
> On 15 April 2015 at 19:33, Imran Rashid  wrote:
>
>> this is a really strange exception ... I'm especially surprised that it
>> doesn't work w/ java serialization.  Do you think you could try to boil it
>> down to a minimal example?
>>
>> On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele 
>> wrote:
>>
>>> Yes Without Kryo it did work out.when I remove kryo registration it did
>>> worked out
>>>
>>> On 15 April 2015 at 19:24, Jeetendra Gangele 
>>> wrote:
>>>
 its not working with the combination of Broadcast.
 Without Kyro also not working.


 On 15 April 2015 at 19:20, Akhil Das 
 wrote:

> Is it working without kryo?
>
> Thanks
> Best Regards
>

Re: Exception while using kryo with broadcast

2015-04-15 Thread Jeetendra Gangele
This worked with java serialization. I am using 1.2.0; you are right that with
1.2.1 or 1.3.0 this issue will not occur.
I will test this and let you know.

On 15 April 2015 at 19:48, Imran Rashid  wrote:

> oh interesting.  The suggested workaround is to wrap the result from
> collectAsMap into another hashmap, you should try that:
>
> Map matchData = RddForMarch.collectAsMap();
> Map tmp = new HashMap(matchData);
> final Broadcast> dataMatchGlobal = jsc.broadcast(tmp);
>
> Can you please clarify:
> * Does it work w/ java serialization in the end?  Or is this kryo only?
> * which Spark version you are using? (one of the relevant bugs was fixed
> in 1.2.1 and 1.3.0)
>
>
>
> On Wed, Apr 15, 2015 at 9:06 AM, Jeetendra Gangele 
> wrote:
>
>> This looks like known issue? check this out
>>
>> http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html
>>
>> Can you please suggest any work around I am broad casting HashMap return
>> from RDD.collectasMap().
>>
>> On 15 April 2015 at 19:33, Imran Rashid  wrote:
>>
>>> this is a really strange exception ... I'm especially surprised that it
>>> doesn't work w/ java serialization.  Do you think you could try to boil it
>>> down to a minimal example?
>>>
>>> On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele >> > wrote:
>>>
 Yes Without Kryo it did work out.when I remove kryo registration it did
 worked out

 On 15 April 2015 at 19:24, Jeetendra Gangele 
 wrote:

> its not working with the combination of Broadcast.
> Without Kyro also not working.
>
>
> On 15 April 2015 at 19:20, Akhil Das 
> wrote:
>
>> Is it working without kryo?
>>
>> Thanks
>> Best Regards
>>