[ 
https://issues.apache.org/jira/browse/TINKERPOP-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357229#comment-15357229
 ] 

Dan LaRocque commented on TINKERPOP-1341:
-----------------------------------------

I made a PR: https://github.com/apache/tinkerpop/pull/353

> UnshadedKryoAdapter fails to deserialize StarGraph when SparkConf sets 
> spark.rdd.compress=true whereas GryoSerializer works
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1341
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1341
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 3.2.1, 3.3.0
>            Reporter: Dylan Bethune-Waddell
>            Priority: Minor
>
> While bulk loading a large dataset into Titan I ran into OOM errors, so I 
> tried tweaking some Spark configuration settings. I already have trouble 
> bulk loading with the new GryoRegistrator/UnshadedKryo serialization shim in 
> master: a few hundred tasks into the edge loading stage (stage 5), exceptions 
> are thrown complaining that CompactBuffer[].class must be explicitly 
> registered with Kryo. With spark.rdd.compress=true, the same setup instead 
> fails a few hundred tasks into the vertex loading stage (stage 1) of 
> BulkLoaderVertexProgram. Using GryoSerializer instead of KryoSerializer with 
> GryoRegistrator does not fail and successfully loads the data with this 
> compression flag on, whereas before I would just get OOM errors until 
> eventually the job was set back so far that it failed. So this setting seems 
> desirable in some cases, and the new serialization code seems to break it. 
> This could be an upstream Spark issue, based on this open JIRA ticket 
> (https://issues.apache.org/jira/browse/SPARK-3630). Here is the exception 
> that is thrown, with the middle bits cut out:
> com.esotericsoftware.kryo.KryoException: java.io.IOException: PARSING_ERROR(2)
>         at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
>         at com.esotericsoftware.kryo.io.Input.require(Input.java:169)
>         at com.esotericsoftware.kryo.io.Input.readLong_slow(Input.java:715)
>         at com.esotericsoftware.kryo.io.Input.readLong(Input.java:665)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:113)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:103)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:48)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:30)
>         at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.readEdges(StarGraphSerializer.java:134)
>         at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:91)
>         at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:45)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
>         at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:42)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:30)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:46)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:36)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>         at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
> ........................................................ and so on 
> .....................................
> Caused by: java.io.IOException: PARSING_ERROR(2)
>         at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
>         at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)
>         at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:594)
>         at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:358)
>         at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:167)
>         at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:150)
>         at com.esotericsoftware.kryo.io.Input.fill(Input.java:140)
>         ... 51 more
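
For reference, the two serializer setups compared in the issue description can be sketched as plain Spark property maps. This is only an illustrative sketch, not code from the issue or the PR: the class SerializerConfigs and its method names are hypothetical, and the fully qualified names used for KryoSerializer, GryoRegistrator and GryoSerializer are assumed to be their usual locations and should be checked against the TinkerPop and Spark versions in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the Spark properties for the two setups
// described in the issue. It only assembles key/value pairs and does not
// depend on Spark or TinkerPop being on the classpath.
final class SerializerConfigs {

    // Setup reported as failing with spark.rdd.compress=true:
    // Spark's KryoSerializer combined with TinkerPop's GryoRegistrator.
    static Map<String, String> kryoWithRegistrator(boolean compressRdd) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.serializer",
                "org.apache.spark.serializer.KryoSerializer");
        conf.put("spark.kryo.registrator",
                "org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator");
        conf.put("spark.rdd.compress", Boolean.toString(compressRdd));
        return conf;
    }

    // Setup reported as working with spark.rdd.compress=true:
    // TinkerPop's GryoSerializer, with no separate registrator.
    static Map<String, String> gryoSerializer(boolean compressRdd) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.serializer",
                "org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer");
        conf.put("spark.rdd.compress", Boolean.toString(compressRdd));
        return conf;
    }

    public static void main(String[] args) {
        // Print both property sets in key=value form for side-by-side comparison.
        System.out.println("# KryoSerializer + GryoRegistrator");
        kryoWithRegistrator(true).forEach((k, v) -> System.out.println(k + "=" + v));
        System.out.println("# GryoSerializer");
        gryoSerializer(true).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The only differences between the two maps are the spark.serializer value and the presence of spark.kryo.registrator, which is the comparison the reporter describes.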



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
