[ https://issues.apache.org/jira/browse/SPARK-36787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raghu updated SPARK-36787:
--------------------------
Description:

Spark JavaPairRDD processing fails with the error:

{code:java}
com.esotericsoftware.kryo.KryoException: Buffer underflow.
Serialization trace:
topologyInfo_ (org.apache.spark.storage.BlockManagerId)
loc (org.apache.spark.scheduler.HighlyCompressedMapStatus)
	at com.esotericsoftware.kryo.io.Input.require(Input.java:199)
	at com.esotericsoftware.kryo.io.Input.readVarInt(Input.java:373)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:127)
	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:804)
	at com.twitter.chill.TraversableSerializer.read(Traversable.scala:43)
	at com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:734)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:734)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:816)
	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:397)
	at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:103)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3.$anonfun$run$1(TaskResultGetter.scala:75)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:63)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}

The Kryo trace log shows the following right before the error. Note that the read positions jump backwards (pos=726, then pos=801, then pos=609), as if two readers were interleaving on one buffer. Could this be a multithreading issue in Spark?

{code}
00:15 TRACE: [kryo] Read field: loc (org.apache.spark.scheduler.HighlyCompressedMapStatus) pos=534
00:15 TRACE: [kryo] Read field: loc (org.apache.spark.scheduler.HighlyCompressedMapStatus) pos=726
00:15 TRACE: [kryo] Read class 15: org.apache.spark.storage.BlockManagerId
00:15 TRACE: [kryo] Read class 15: org.apache.spark.storage.BlockManagerId
00:15 TRACE: [kryo] Read field: topologyInfo_ (org.apache.spark.storage.BlockManagerId) pos=801
00:15 TRACE: [kryo] Read field: topologyInfo_ (org.apache.spark.storage.BlockManagerId) pos=609
{code}

> Kryo Buffer underflow error in Spark 3.0
> ----------------------------------------
>
>                 Key: SPARK-36787
>                 URL: https://issues.apache.org/jira/browse/SPARK-36787
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.2
>        Environment: Dataproc Image: 2.0.20-debian10
>                     Apache Spark: 3.1.2
>                     Scala: 2.12.14
>            Reporter: Raghu
>            Priority: Blocker

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
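The backwards-jumping pos= values in the trace are consistent with two threads sharing one positional reader, which Kryo `Input`/`Kryo` instances do not support. As a stdlib-only sketch of that failure mode (no Kryo or Spark code involved; the record framing, class name, and reader roles are hypothetical, chosen only to mimic the symptom), two logical readers interleaving on a single shared buffer position misread each other's framing, and a subsequent read runs off the end of the buffer:

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class SharedBufferDemo {

    // Build one buffer holding two length-prefixed records:
    // one meant for reader A, one meant for reader B.
    static ByteBuffer twoRecords() {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(4).putInt(1111);   // record for reader A: len=4, payload=1111
        buf.putInt(4).putInt(2222);   // record for reader B: len=4, payload=2222
        buf.flip();
        return buf;
    }

    // Two logical readers take turns on ONE shared position, the way two
    // threads would if they shared a single deserializer instance.
    // Returns {aLen, aVal, bLen, bVal} as each reader observed them.
    static int[] interleavedReads(ByteBuffer shared) {
        int aLen = shared.getInt();   // A reads its length field (4): ok
        int bLen = shared.getInt();   // B reads A's payload as a "length"
        int aVal = shared.getInt();   // A reads B's length field as its payload
        int bVal = shared.getInt();   // B happens to land on its own payload
        return new int[]{aLen, aVal, bLen, bVal};
    }

    public static void main(String[] args) {
        ByteBuffer shared = twoRecords();
        int[] r = interleavedReads(shared);
        // Both readers see garbled framing: neither (len, val) pair matches
        // the record that was written for it.
        System.out.printf("A saw len=%d val=%d; B saw len=%d val=%d%n",
                r[0], r[1], r[2], r[3]);

        // The next read runs off the end of the buffer -- the stdlib
        // analogue of Kryo's "Buffer underflow".
        try {
            shared.getInt();
            System.out.println("no underflow");
        } catch (BufferUnderflowException e) {
            System.out.println("BufferUnderflowException, as in the Kryo trace");
        }
    }
}
```

Under this reading, the fix direction would be ensuring each thread deserializing task results gets its own serializer instance rather than sharing one; the sketch only illustrates why shared-position reads produce both the backwards pos= jumps and the eventual underflow.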