[
https://issues.apache.org/jira/browse/TINKERPOP-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17622970#comment-17622970
]
ASF GitHub Bot commented on TINKERPOP-2817:
-------------------------------------------
ministat commented on code in PR #1835:
URL: https://github.com/apache/tinkerpop/pull/1835#discussion_r1002905316
##########
hadoop-gremlin/src/test/java/org/apache/tinkerpop/gremlin/hadoop/structure/io/graphson/GraphSONXModuleV3d0RecordReaderWriterTest.java:
##########
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.*;
+import org.apache.hadoop.mapreduce.lib.input.FileSplit;
+import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.RecordReaderWriterTest;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.VertexWritable;
+import org.apache.tinkerpop.gremlin.structure.Direction;
+import org.apache.tinkerpop.gremlin.structure.Vertex;
+import org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.util.List;
+import java.util.Optional;
+import java.util.UUID;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+public class GraphSONXModuleV3d0RecordReaderWriterTest extends RecordReaderWriterTest {
Review Comment:
RecordReaderWriterTest is tied to grateful-dead-xxx.json because
validateFileSplits checks the vertex count, inEdge count, and outEdge
count. I used tinkerpop-classic-xxx.json instead of grateful-dead-xxx.json
because the only change I made was replacing the vertex id type Int32 with
Byte, which works since tinkerpop-classic-xxx.json has only 6 vertices.
I cannot change any type in grateful-dead-xxx.json to Byte.
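The id change described above only works when every vertex id fits in a
java.lang.Byte (-128 to 127). A minimal sketch of that constraint, assuming
GraphSON typed values of the form {"@type": ..., "@value": ...} with the
"g:Int32" and "gx:Byte" type identifiers. The class ByteIdSketch and its
methods are hypothetical helpers for illustration, not part of TinkerPop or
of this PR, and the rule "use gx:Byte whenever the id fits" is an assumption
made only for this example:

```java
// Hypothetical helper, not TinkerPop code: shows which ids can be typed as Byte.
class ByteIdSketch {

    // True when the id can be stored in a java.lang.Byte (-128..127).
    static boolean fitsInByte(long id) {
        return id >= Byte.MIN_VALUE && id <= Byte.MAX_VALUE;
    }

    // Renders the id as a GraphSON typed value. Choosing gx:Byte whenever the
    // id fits is an assumption of this sketch, not TinkerPop behavior.
    static String typedId(long id) {
        String type = fitsInByte(id) ? "gx:Byte" : "g:Int32";
        return "{\"@type\":\"" + type + "\",\"@value\":" + id + "}";
    }

    public static void main(String[] args) {
        System.out.println(typedId(6));   // {"@type":"gx:Byte","@value":6}
        System.out.println(typedId(200)); // {"@type":"g:Int32","@value":200}
    }
}
```

With only 6 vertices, every id in tinkerpop-classic-xxx.json stays inside the
Byte range, so the test data can be rewritten by hand.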
> "Could not find a type identifier for the class : class java.lang.Byte"
> occurs when dumping graph to graphson format
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: TINKERPOP-2817
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2817
> Project: TinkerPop
> Issue Type: Bug
> Components: hadoop
> Affects Versions: 3.4.6
> Reporter: Redriver
> Priority: Major
>
> When I used hadoop-gremlin 3.4.6 to export a graph to HDFS, I encountered an
> error.
> My gremlin query looks like:
> {code:java}
> graph = GraphFactory.open('conf/fdb-psave-export.properties')
> graph.compute(SparkGraphComputer).program(CloneVertexProgram.build().create()).submit().get()
> {code}
> {code:java}
> org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not
> find a type identifier for the class : class java.lang.Byte. Make sure the
> value to serialize has a type identifier registered for its class.
> at
> org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509)
> ~[gremlin-shaded-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482)
> ~[gremlin-shaded-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319)
> ~[gremlin-shaded-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3906)
> ~[gremlin-shaded-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:3177)
> ~[gremlin-shaded-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertex(GraphSONWriter.java:82)
> ~[gremlin-core-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONRecordWriter.write(GraphSONRecordWriter.java:72)
> ~[hadoop-gremlin-3.4.6.jar:3.4.6]
> at
> org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONRecordWriter.write(GraphSONRecordWriter.java:42)
> ~[hadoop-gremlin-3.4.6.jar:3.4.6]
> at
> org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
> ~[spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
> ~[spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
> ~[spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at org.apache.spark.scheduler.Task.run(Task.scala:121)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:416)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:422)
> [spark-core_2.11-2.4.0.jar:2.4.0]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [na:1.8.0_202]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [na:1.8.0_202]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_202]
> {code}
> It looks like the default GraphSONRecordWriter/GraphSONRecordReader does not
> support the serialization/deserialization of java.lang.Byte. But according to
> https://tinkerpop.apache.org/docs/3.4.6/dev/io/#_extended_2, adding the
> custom module GraphSONXModuleV2d0/GraphSONXModuleV3d0 can fix this. An
> experiment with a quick fix confirms that this approach works.
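
The failure mode in the issue description can be pictured as a lookup in a
registry of type identifiers: a class with no registered identifier is
rejected, and registering the extended types makes the same value
serializable. The following is a simplified stand-alone model of that idea,
not TinkerPop's actual implementation. TypeIdRegistrySketch and its methods
are hypothetical names; only the identifiers "g:Int32", "g:Int64", "gx:Byte"
and "gx:Int16" are taken from GraphSON:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics a serializer that needs a type identifier per class.
class TypeIdRegistrySketch {

    private final Map<Class<?>, String> typeIds = new HashMap<>();

    TypeIdRegistrySketch() {
        // Stand-in for the core types that are known by default.
        typeIds.put(Integer.class, "g:Int32");
        typeIds.put(Long.class, "g:Int64");
    }

    // Stand-in for adding an extended module such as GraphSONXModuleV3d0.
    void registerExtendedTypes() {
        typeIds.put(Byte.class, "gx:Byte");
        typeIds.put(Short.class, "gx:Int16");
    }

    // Writes a typed value, or fails when the class has no identifier.
    String write(Number value) {
        String typeId = typeIds.get(value.getClass());
        if (typeId == null) {
            throw new IllegalStateException(
                "Could not find a type identifier for the class : " + value.getClass());
        }
        return "{\"@type\":\"" + typeId + "\",\"@value\":" + value + "}";
    }

    public static void main(String[] args) {
        TypeIdRegistrySketch registry = new TypeIdRegistrySketch();
        try {
            registry.write(Byte.valueOf((byte) 1));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // fails before registration
        }
        registry.registerExtendedTypes();
        System.out.println(registry.write(Byte.valueOf((byte) 1)));
    }
}
```

In the real library the equivalent step is registering the custom module on
the GraphSON mapper, as described in the documentation linked above.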
--
This message was sent by Atlassian Jira
(v8.20.10#820010)