[ https://issues.apache.org/jira/browse/TINKERPOP-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17622970#comment-17622970 ]
ASF GitHub Bot commented on TINKERPOP-2817:
-------------------------------------------

ministat commented on code in PR #1835:
URL: https://github.com/apache/tinkerpop/pull/1835#discussion_r1002905316


##########
hadoop-gremlin/src/test/java/org/apache/tinkerpop/gremlin/hadoop/structure/io/graphson/GraphSONXModuleV3d0RecordReaderWriterTest.java:
##########
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.*;
+import org.apache.hadoop.mapreduce.lib.input.FileSplit;
+import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.RecordReaderWriterTest;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.VertexWritable;
+import org.apache.tinkerpop.gremlin.structure.Direction;
+import org.apache.tinkerpop.gremlin.structure.Vertex;
+import org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.util.List;
+import java.util.Optional;
+import java.util.UUID;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+public class GraphSONXModuleV3d0RecordReaderWriterTest extends RecordReaderWriterTest {

Review Comment:
   RecordReaderWriterTest is tied to the grateful-dead-xxx.json test data because validateFileSplits checks the vertex count, in-edge count, and out-edge count. I used tinkerpop-classic-xxx.json instead of grateful-dead-xxx.json: I only changed the vertex id type from Int32 to Byte, which works because tinkerpop-classic-xxx.json has just 6 vertices. None of the types in grateful-dead-xxx.json can be changed to Byte.
> "Could not find a type identifier for the class : class java.lang.Byte" occurs when dumping graph to graphson format
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2817
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2817
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: hadoop
>    Affects Versions: 3.4.6
>            Reporter: Redriver
>            Priority: Major
>
> When I used hadoop-gremlin 3.4.6 to export a graph to HDFS, I encountered an error.
> My gremlin query looks like:
> {code:java}
> graph = GraphFactory.open('conf/fdb-psave-export.properties')
> graph.compute(SparkGraphComputer).program(CloneVertexProgram.build().create()).submit().get()
> {code}
> {code:java}
> org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not find a type identifier for the class : class java.lang.Byte. Make sure the value to serialize has a type identifier registered for its class.
>     at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3906) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:3177) ~[gremlin-shaded-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertex(GraphSONWriter.java:82) ~[gremlin-core-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONRecordWriter.write(GraphSONRecordWriter.java:72) ~[hadoop-gremlin-3.4.6.jar:3.4.6]
>     at org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONRecordWriter.write(GraphSONRecordWriter.java:42) ~[hadoop-gremlin-3.4.6.jar:3.4.6]
>     at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356) ~[spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130) ~[spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127) ~[spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.scheduler.Task.run(Task.scala:121) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:416) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:422) [spark-core_2.11-2.4.0.jar:2.4.0]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_202]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_202]
>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_202]
> {code}
> It looks like the default GraphSONRecordWriter/GraphSONRecordReader does not support serialization/deserialization of java.lang.Byte. But according to https://tinkerpop.apache.org/docs/3.4.6/dev/io/#_extended_2, adding the custom module GraphSONXModuleV2d0/GraphSONXModuleV3d0 can fix this. An experiment with a quick fix confirms that the fix works.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
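For context on the setup that triggers the error: the contents of the reporter's conf/fdb-psave-export.properties are not shown in the issue, but a HadoopGraph configuration that writes GraphSON via GraphSONOutputFormat generally looks like the sketch below. The property keys are the standard hadoop-gremlin keys; every value (paths, Spark settings) is an illustrative placeholder, not the reporter's actual configuration.

```properties
# Hypothetical sketch of a HadoopGraph GraphSON export configuration.
# Keys are standard hadoop-gremlin keys; values are placeholders.
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
gremlin.hadoop.inputLocation=data/graph-input.json
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true
# SparkGraphComputer settings (illustrative)
spark.master=local[4]
spark.serializer=org.apache.spark.serializer.KryoSerializer
```

With such a configuration, the GraphSONRecordWriter in the stack trace is the component that serializes each vertex, which is where the missing Byte type identifier surfaces.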
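The fix the reporter points at, registering the extended GraphSON module, can be sketched in isolation. This is a minimal illustration rather than the hadoop-gremlin wiring from PR #1835: it builds a GraphSON 3.0 mapper by hand and registers GraphSONXModuleV3d0 through addCustomModule, which is what supplies java.lang.Byte with a type identifier (gx:Byte) per the extended-types section of the IO docs. The class name ByteGraphSONExample is invented for the example; it requires gremlin-core on the classpath.

```java
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONMapper;
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONVersion;
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONXModuleV3d0;

public class ByteGraphSONExample {

    // Builds a GraphSON 3.0 mapper with the extended-type module registered,
    // then serializes a java.lang.Byte. Without the addCustomModule(...) call,
    // the same serialization fails with "Could not find a type identifier for
    // the class : class java.lang.Byte", as in the stack trace above.
    static String serializeByte() throws Exception {
        GraphSONMapper mapper = GraphSONMapper.build()
                .version(GraphSONVersion.V3_0)
                .addCustomModule(GraphSONXModuleV3d0.build().create(false))
                .create();
        return mapper.createMapper().writeValueAsString(Byte.valueOf((byte) 1));
    }

    public static void main(String[] args) throws Exception {
        // Expected to emit a gx:Byte-typed GraphSON value.
        System.out.println(serializeByte());
    }
}
```

The equivalent fix for the hadoop-gremlin path is to make the record writer/reader build its mapper with this module registered, which is what the quoted PR is testing.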