[ https://issues.apache.org/jira/browse/TINKERPOP-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623507#comment-17623507 ]
ASF GitHub Bot commented on TINKERPOP-2817:
-------------------------------------------

ministat commented on code in PR #1835:
URL: https://github.com/apache/tinkerpop/pull/1835#discussion_r1003924906


##########
hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/structure/io/graphson/GraphSONRecordWriter.java:
##########
@@ -58,11 +61,15 @@ public final class GraphSONRecordWriter extends RecordWriter<NullWritable, VertexWritable>
     public GraphSONRecordWriter(final DataOutputStream outputStream, final Configuration configuration) {
         this.outputStream = outputStream;
         this.hasEdges = configuration.getBoolean(Constants.GREMLIN_HADOOP_GRAPH_WRITER_HAS_EDGES, true);
-        this.graphsonWriter = GraphSONWriter.build().mapper(
-                GraphSONMapper.build().
-                        version(GraphSONVersion.valueOf(configuration.get(Constants.GREMLIN_HADOOP_GRAPHSON_VERSION, "V3_0"))).
-                        typeInfo(TypeInfo.PARTIAL_TYPES).
-                        addRegistries(IoRegistryHelper.createRegistries(ConfUtil.makeApacheConfiguration(configuration))).create()).create();
+        GraphSONVersion graphSONVersion =

Review Comment:
   ack


> "Could not find a type identifier for the class : class java.lang.Byte"
> occurs when dumping graph to graphson format
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2817
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2817
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: hadoop
>    Affects Versions: 3.4.6
>            Reporter: Redriver
>            Priority: Major
>
> When I used hadoop-gremlin 3.4.6 to export a graph to HDFS, I encountered an error.
> My Gremlin query looks like:
> {code:java}
> graph = GraphFactory.open('conf/fdb-psave-export.properties')
> graph.compute(SparkGraphComputer).program(CloneVertexProgram.build().create()).submit().get()
> {code}
> {code:java}
> org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not find a type identifier for the class : class java.lang.Byte. Make sure the value to serialize has a type identifier registered for its class.
>         at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3906) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:3177) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertex(GraphSONWriter.java:82) ~[gremlin-core-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONRecordWriter.write(GraphSONRecordWriter.java:72) ~[hadoop-gremlin-3.4.6.jar:3.4.6]
>         at org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONRecordWriter.write(GraphSONRecordWriter.java:42) ~[hadoop-gremlin-3.4.6.jar:3.4.6]
>         at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356) ~[spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130) ~[spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127) ~[spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.scheduler.Task.run(Task.scala:121) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:416) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:422) [spark-core_2.11-2.4.0.jar:2.4.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_202]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_202]
>         at java.lang.Thread.run(Thread.java:748) [na:1.8.0_202]
> {code}
> It looks like the default GraphSONRecordWriter/GraphSONRecordReader does not support serialization/deserialization of java.lang.Byte. However, according to https://tinkerpop.apache.org/docs/3.4.6/dev/io/#_extended_2, registering the extended module GraphSONXModuleV2d0/GraphSONXModuleV3d0 can fix this. An experiment with a quick fix confirms that this approach works.
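For context, the workaround the reporter points to in the IO docs amounts to registering the extended GraphSON module on the mapper before building the writer. The snippet below is a minimal, illustrative sketch of that idea for a GraphSON 3.0 mapper on a 3.4.x classpath; the class name ExtendedGraphSONWriterExample and the method buildWriter are made up for the example, while GraphSONMapper, GraphSONWriter, GraphSONVersion, TypeInfo and GraphSONXModuleV3d0 are the TinkerPop classes referenced in the issue and docs. It is not the actual patch in PR #1835; it only shows where the extended module plugs in.

{code:java}
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONMapper;
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONVersion;
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter;
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONXModuleV3d0;
import org.apache.tinkerpop.gremlin.structure.io.graphson.TypeInfo;

// Illustrative wrapper; only the builder calls matter.
public class ExtendedGraphSONWriterExample {

    public static GraphSONWriter buildWriter() {
        // Same mapper setup GraphSONRecordWriter uses (GraphSON 3.0, partial types),
        // plus the extended module, which registers type identifiers/serializers for
        // "extended" JDK types such as java.lang.Byte, java.time.*, BigInteger, etc.
        final GraphSONMapper mapper = GraphSONMapper.build()
                .version(GraphSONVersion.V3_0)
                .typeInfo(TypeInfo.PARTIAL_TYPES)
                .addCustomModule(GraphSONXModuleV3d0.build().create(false)) // false = no key normalization
                .create();
        return GraphSONWriter.build().mapper(mapper).create();
    }
}
{code}

In 3.4.6, GraphSONRecordWriter/GraphSONRecordReader build their mapper internally and expose no configuration hook for extra modules, so a fix along these lines has to live inside those classes themselves, which appears to be what the change under review in PR #1835 is doing.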