Hi all,

I'm getting the exception below while converting my RDD to a Map. My data is hardly a 200MB Snappy-compressed file, and the code looks like this:

@SuppressWarnings("unchecked")
public Tuple2<Map<String, byte[]>, String> getMatchData(String location, String key) {
        ParquetInputFormat.setReadSupportClass(this.getJob(),
                        (Class<AvroReadSupport<GenericRecord>>) (Class<?>) AvroReadSupport.class);
        JavaPairRDD<Void, GenericRecord> avroRDD = this.getSparkContext().newAPIHadoopFile(location,
                        (Class<ParquetInputFormat<GenericRecord>>) (Class<?>) ParquetInputFormat.class,
                        Void.class, GenericRecord.class, this.getJob().getConfiguration());
        JavaPairRDD<String, GenericRecord> kv =
                        avroRDD.mapToPair(new MapAvroToKV(key, new SpelExpressionParser()));
        Schema schema = kv.first()._2().getSchema();
        JavaPairRDD<String, byte[]> bytesKV = kv.mapToPair(new AvroToBytesFunction());
        Map<String, byte[]> map = bytesKV.collectAsMap();
        Map<String, byte[]> hashmap = new HashMap<String, byte[]>(map);
        return new Tuple2<>(hashmap, schema.toString());
}


Please share any thoughts; thanks in advance.



04/27 03:13:18 INFO YarnClusterScheduler: Cancelling stage 11
04/27 03:13:18 INFO DAGScheduler: ResultStage 11 (collectAsMap at DoubleClickSORJob.java:281) failed in 0.734 s due to Job aborted due to stage failure: Task 0 in stage 11.0 failed 4 times, most recent failure: Lost task 0.3 in stage 11.0 (TID 15, ip-172-31-50-58.ec2.internal, executor 1): java.io.IOException: java.lang.NegativeArraySizeException
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1276)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:81)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NegativeArraySizeException
        at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:325)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:60)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:43)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:244)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$10.apply(TorrentBroadcast.scala:286)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303)
        at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:287)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:221)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1269)
        ... 11 more
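For anyone reading the trace: Kryo's Input.readBytes is allocating a byte[] from a length field that came back negative, which commonly indicates a corrupted or overflowed size (for example, a serialized blob whose length no longer fits in a signed int). The exception itself is just the JVM refusing a negative array length; this standalone snippet (my illustration, not Spark or Kryo code) reproduces it:

```java
public class NegativeArrayDemo {
    public static void main(String[] args) {
        // Simulate a length field that overflowed past Integer.MAX_VALUE:
        // in two's complement, MAX_VALUE + 1 wraps around to a negative int.
        int corruptedLength = Integer.MAX_VALUE + 1; // -2147483648
        try {
            byte[] payload = new byte[corruptedLength];
        } catch (NegativeArraySizeException e) {
            // This is the same exception class Kryo surfaces in the trace above.
            System.out.println("caught NegativeArraySizeException for length " + corruptedLength);
        }
    }
}
```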

Driver stacktrace:
04/27 03:13:18 INFO DAGScheduler: Job 11 failed: collectAsMap at DoubleClickSORJob.java:281, took 3.651447 s
04/27 03:13:18 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 4 times, most recent failure: Lost task 0.3 in stage 11.0 (TID 15, ip-172-31-50-58.ec2.internal, executor 1): java.io.IOException: java.lang.NegativeArraySizeException
        [same stack trace as above]
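In case it is relevant, one workaround I am planning to try (an assumption on my part, not a confirmed fix: it presumes the failure is Kryo overflowing its serialization buffer while handling the collected map) is raising the Kryo buffer limits at submit time; the class and jar names below are placeholders:

```shell
# spark.kryoserializer.buffer.max caps how large a single Kryo-serialized
# object may grow; the defaults are modest, so a large collected map could
# overflow them. The sizes below are starting guesses, not recommendations.
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer=64m \
  --conf spark.kryoserializer.buffer.max=1g \
  --class com.example.MyJob \
  my-job.jar
```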



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/javaRDD-to-collectasMap-throuwa-ava-lang-NegativeArraySizeException-tp28631.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
