[ https://issues.apache.org/jira/browse/SPARK-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196288#comment-14196288 ]
Vitaliy Migov commented on SPARK-3958:
--------------------------------------

Observed the same exception (Spark 1.1.0). After changing the broadcast factory to HttpBroadcastFactory (as suggested in SPARK-4133; a config sketch for this switch appears below the quoted issue description), the exception looks like:

java.io.FileNotFoundException: http://10.8.0.22:44907/broadcast_0

Full logs are attached to the issue: spark_ex.logs

It seems that something is wrong with the handling of "broadcast_0" in the BlockManager:

14/11/04 17:20:39 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1216.0 B, free 983.1 MB)
14/11/04 17:20:39 DEBUG BlockManager: Put block broadcast_0 locally took 84 ms
14/11/04 17:20:39 DEBUG BlockManager: Putting block broadcast_0 without replication took 86 ms
14/11/04 17:20:39 DEBUG BlockManager: Getting local block broadcast_0
14/11/04 17:20:39 DEBUG BlockManager: Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
14/11/04 17:20:39 DEBUG BlockManager: Getting block broadcast_0 from memory
14/11/04 17:20:39 INFO BlockManager: Found block broadcast_0 locally
14/11/04 17:20:57 WARN BlockManager: Block broadcast_0 already exists on this machine; not re-adding it
14/11/04 17:20:57 DEBUG BlockManager: Getting local block broadcast_0
14/11/04 17:20:57 DEBUG BlockManager: Block broadcast_0 not registered locally
14/11/04 17:20:57 DEBUG BlockManager: Getting remote block broadcast_0
14/11/04 17:20:57 DEBUG BlockManagerMasterActor: [actor] received message GetLocations(broadcast_0) from Actor[akka://sparkDriver/temp/$l]
14/11/04 17:20:57 DEBUG BlockManager: Block broadcast_0 not found
14/11/04 17:20:57 DEBUG BlockManagerMasterActor: [actor] handled message (0.117676 ms) GetLocations(broadcast_0) from Actor[akka://sparkDriver/temp/$l]
14/11/04 17:20:57 INFO HttpBroadcast: Started reading broadcast variable 0
14/11/04 17:20:57 DEBUG HttpBroadcast: broadcast read server: http://10.8.0.22:44907 id: broadcast-0
14/11/04 17:20:57 DEBUG HttpBroadcast: broadcast not using security
14/11/04 17:20:57 DEBUG RecurringTimer: Callback for BlockGenerator called at time 1415114457200
14/11/04 17:20:57 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.io.FileNotFoundException: http://10.8.0.22:44907/broadcast_0

> Possible stream-corruption issues in TorrentBroadcast
> -----------------------------------------------------
>
>                 Key: SPARK-3958
>                 URL: https://issues.apache.org/jira/browse/SPARK-3958
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Blocker
>         Attachments: spark_ex.logs
>
>
> TorrentBroadcast deserialization sometimes fails with decompression errors,
> which are most likely caused by stream-corruption exceptions.
> For example, this can manifest itself as a Snappy PARSING_ERROR when
> deserializing a broadcasted task:
> {code}
> 14/10/14 17:20:55.016 DEBUG BlockManager: Getting local block broadcast_8
> 14/10/14 17:20:55.016 DEBUG BlockManager: Block broadcast_8 not registered locally
> 14/10/14 17:20:55.016 INFO TorrentBroadcast: Started reading broadcast variable 8
> 14/10/14 17:20:55.017 INFO TorrentBroadcast: Reading broadcast variable 8 took 5.3433E-5 s
> 14/10/14 17:20:55.017 ERROR Executor: Exception in task 2.0 in stage 8.0 (TID 18)
> java.io.IOException: PARSING_ERROR(2)
> 	at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
> 	at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)
> 	at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:594)
> 	at org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:125)
> 	at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:88)
> 	at org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:58)
> 	at org.apache.spark.io.SnappyCompressionCodec.compressedInputStream(CompressionCodec.scala:128)
> 	at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:216)
> 	at org.apache.spark.broadcast.TorrentBroadcast.readObject(TorrentBroadcast.scala:170)
> 	at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> 	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
> 	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:164)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> SPARK-3630 is an umbrella ticket for investigating all causes of these Kryo
> and Snappy deserialization errors. This ticket is for a more narrowly-focused
> exploration of the TorrentBroadcast version of these errors, since the similar
> errors that we've seen in sort-based shuffle seem to be explained by a
> different cause (see SPARK-3948).
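Config sketch for the factory switch mentioned in the comment above. This is a minimal illustration, not a definitive recipe: it assumes the Spark 1.1.x API, a local master, and a made-up app name and broadcast map; spark.broadcast.factory and org.apache.spark.broadcast.HttpBroadcastFactory are the setting and class that SPARK-4133 refers to.

{code}
// Sketch only: switch from the default TorrentBroadcast to HTTP-based broadcast
// and exercise the broadcast read path that fails in the attached logs.
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastFactorySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("broadcast-factory-sketch")   // made-up app name
      .setMaster("local[2]")                    // made-up master URL
      // Spark 1.1.x setting suggested in SPARK-4133
      .set("spark.broadcast.factory",
           "org.apache.spark.broadcast.HttpBroadcastFactory")

    val sc = new SparkContext(conf)
    try {
      // Driver side: this is where "Block broadcast_0 stored as values in memory"
      // gets logged.
      val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))

      // Executor side: reading lookup.value triggers "Started reading broadcast
      // variable 0", which is where the FileNotFoundException / PARSING_ERROR
      // reported above surfaces.
      val tagged = sc.parallelize(Seq(1, 2, 1)).map(k => lookup.value.getOrElse(k, "?"))
      println(tagged.collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
{code}

Note that with a local master the driver and executor share a JVM, so this only walks the broadcast code path; reproducing the remote-fetch failures shown in the logs would require running against an actual cluster.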