Execption while using kryo with broadcast
Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
Yes Without Kryo it did work out.when I remove kryo registration it did worked out On 15 April 2015 at 19:24, Jeetendra Gangele gangele...@gmail.com wrote: its not working with the combination of Broadcast. Without Kyro also not working. On 15 April 2015 at 19:20, Akhil Das ak...@sigmoidanalytics.com wrote: Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
This looks like known issue? check this out http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html Can you please suggest any work around I am broad casting HashMap return from RDD.collectasMap(). On 15 April 2015 at 19:33, Imran Rashid iras...@cloudera.com wrote: this is a really strange exception ... I'm especially surprised that it doesn't work w/ java serialization. Do you think you could try to boil it down to a minimal example? On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele gangele...@gmail.com wrote: Yes Without Kryo it did work out.when I remove kryo registration it did worked out On 15 April 2015 at 19:24, Jeetendra Gangele gangele...@gmail.com wrote: its not working with the combination of Broadcast. Without Kyro also not working. On 15 April 2015 at 19:20, Akhil Das ak...@sigmoidanalytics.com wrote: Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
this is a really strange exception ... I'm especially surprised that it doesn't work w/ java serialization. Do you think you could try to boil it down to a minimal example? On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele gangele...@gmail.com wrote: Yes Without Kryo it did work out.when I remove kryo registration it did worked out On 15 April 2015 at 19:24, Jeetendra Gangele gangele...@gmail.com wrote: its not working with the combination of Broadcast. Without Kyro also not working. On 15 April 2015 at 19:20, Akhil Das ak...@sigmoidanalytics.com wrote: Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
oh interesting. The suggested workaround is to wrap the result from collectAsMap into another hashmap, you should try that: MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); MapString, String tmp = new HashMapString, String(matchData); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(tmp); Can you please clarify: * Does it work w/ java serialization in the end? Or is this kryo only? * which Spark version you are using? (one of the relevant bugs was fixed in 1.2.1 and 1.3.0) On Wed, Apr 15, 2015 at 9:06 AM, Jeetendra Gangele gangele...@gmail.com wrote: This looks like known issue? check this out http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html Can you please suggest any work around I am broad casting HashMap return from RDD.collectasMap(). On 15 April 2015 at 19:33, Imran Rashid iras...@cloudera.com wrote: this is a really strange exception ... I'm especially surprised that it doesn't work w/ java serialization. Do you think you could try to boil it down to a minimal example? On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele gangele...@gmail.com wrote: Yes Without Kryo it did work out.when I remove kryo registration it did worked out On 15 April 2015 at 19:24, Jeetendra Gangele gangele...@gmail.com wrote: its not working with the combination of Broadcast. Without Kyro also not working. On 15 April 2015 at 19:20, Akhil Das ak...@sigmoidanalytics.com wrote: Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
its not working with the combination of Broadcast. Without Kyro also not working. On 15 April 2015 at 19:20, Akhil Das ak...@sigmoidanalytics.com wrote: Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO storage.MemoryStore: MemoryStore cleared 15/04/15 12:58:51 INFO storage.BlockManager: BlockManager stopped 15/04/15 12:58:51 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
Re: Execption while using kryo with broadcast
This worked with java serialization.I am using 1.2.0 you are right if I use 1.2.1 or 1.3.0 this issue will not occur I will test this and let you know On 15 April 2015 at 19:48, Imran Rashid iras...@cloudera.com wrote: oh interesting. The suggested workaround is to wrap the result from collectAsMap into another hashmap, you should try that: MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); MapString, String tmp = new HashMapString, String(matchData); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(tmp); Can you please clarify: * Does it work w/ java serialization in the end? Or is this kryo only? * which Spark version you are using? (one of the relevant bugs was fixed in 1.2.1 and 1.3.0) On Wed, Apr 15, 2015 at 9:06 AM, Jeetendra Gangele gangele...@gmail.com wrote: This looks like known issue? check this out http://apache-spark-user-list.1001560.n3.nabble.com/java-io-InvalidClassException-org-apache-spark-api-java-JavaUtils-SerializableMapWrapper-no-valid-cor-td20034.html Can you please suggest any work around I am broad casting HashMap return from RDD.collectasMap(). On 15 April 2015 at 19:33, Imran Rashid iras...@cloudera.com wrote: this is a really strange exception ... I'm especially surprised that it doesn't work w/ java serialization. Do you think you could try to boil it down to a minimal example? On Wed, Apr 15, 2015 at 8:58 AM, Jeetendra Gangele gangele...@gmail.com wrote: Yes Without Kryo it did work out.when I remove kryo registration it did worked out On 15 April 2015 at 19:24, Jeetendra Gangele gangele...@gmail.com wrote: its not working with the combination of Broadcast. Without Kyro also not working. On 15 April 2015 at 19:20, Akhil Das ak...@sigmoidanalytics.com wrote: Is it working without kryo? Thanks Best Regards On Wed, Apr 15, 2015 at 6:38 PM, Jeetendra Gangele gangele...@gmail.com wrote: Hi All I am getting below exception while using Kyro serializable with broadcast variable. I am broadcating a hasmap with below line. MapLong, MatcherReleventData matchData =RddForMarch.collectAsMap(); final BroadcastMapLong, MatcherReleventData dataMatchGlobal = jsc.broadcast(matchData); 15/04/15 12:58:51 ERROR executor.Executor: Exception in task 0.3 in stage 4.0 (TID 7) java.io.IOException: java.lang.UnsupportedOperationException at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003) at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164) at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64) at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:103) at com.insideview.yam.CompanyMatcher$4.call(CompanyMatcher.java:1) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1002) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:204) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:58) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.UnsupportedOperationException at java.util.AbstractMap.put(AbstractMap.java:203) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135) at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729) at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:142) at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1000) ... 18 more 15/04/15 12:58:51 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown 15/04/15 12:58:51 INFO