[ https://issues.apache.org/jira/browse/ARROW-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Kovalenko updated ARROW-17998:
------------------------------------
    Labels: json json-schema  (was: )

> [Java] JSON representation of pojo.Schema is incompatible with flatbuffers JSON generated via C++ API
> -----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17998
>                 URL: https://issues.apache.org/jira/browse/ARROW-17998
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Format, Java
>    Affects Versions: 6.0.1
>            Reporter: Pavel Kovalenko
>            Priority: Major
>              Labels: json, json-schema
>
> I have a JSON representation of an arrow::Schema, generated from the flatbuffers format in C++:
>
> {code:java}
> // Serialized Schema flatbuffer, obtained elsewhere.
> const void* schemaBytes;
> std::string fbsSchemaFile;
> flatbuffers::LoadFile("/path/to/Schema.fbs", false, &fbsSchemaFile);
> flatbuffers::Parser parser;
> parser.Parse(fbsSchemaFile.c_str());
> std::string json;
> // Render the flatbuffer table as text using the parsed Schema.fbs definition.
> flatbuffers::GenerateTextFromTable(parser, schemaBytes,
>     "org.apache.arrow.flatbuf.Schema", &json);
> return json;{code}
>
> When I try to read this JSON in Java and create a pojo.Schema:
>
> {code:java}
> String json; // Read from file.
> Schema.fromJSON(json);{code}
>
> it fails, because the JSON produced by flatbuffers and the JSON expected by the Java Jackson bindings differ.
>
> C++ Schema Flatbuffers JSON example:
> {code:java}
> {
>   fields: [
>     {
>       name: "cc_call_center_sk",
>       type_type: "Int",
>       type: {
>         bitWidth: 32,
>         is_signed: true
>       },
>       children: [
>       ],
>       custom_metadata: [
>         {
>           key: "metadata",
>           value: "some_metadata"
>         }
>       ]
>     },
>   ],
>   custom_metadata: [
>     {
>       key: "metadata",
>       value: "some_metadata"
>     }
>   ]
> }{code}
> Java Schema JSON example:
> {code:java}
> {
>   "fields" : [ {
>     "name" : "cc_call_center_sk",
>     "nullable" : true,
>     "type" : {
>       "name" : "int",
>       "bitWidth" : 32,
>       "isSigned" : true
>     },
>     "children" : [ ],
>     "metadata" : [ {
>       "value" : "some_metadata",
>       "key" : "metadata"
>     } ]
>   } ],
>   "metadata" : [ {
>     "value" : "some_metadata",
>     "key" : "metadata"
>   } ]
> } {code}
> The type id is declared differently:
> the `type_type` field is used in C++ flatbuffers;
> the `name` field inside the `type` field is used in Java.
>
> The metadata field is also named differently:
> `custom_metadata` is used in C++ flatbuffers;
> `metadata` is used in Java.
>
> This makes it impossible to reuse the JSON representation from Java in C++ and vice versa.
> The same issue probably exists in other languages.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
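
The key differences listed in the report can be illustrated with a small sketch. It is not part of Arrow: `FlatbufSchemaKeys` and `toJavaStyle` are hypothetical names, and the code assumes the flatbuffers text has already been parsed into plain `Map`/`List` objects by some JSON parser (the C++ output shown above has unquoted keys, so that step is out of scope here). Only the renames visible in the two examples are covered (`type_type` folded into `type.name`, `custom_metadata` to `metadata`, `is_signed` to `isSigned`). Lowercasing the type id is confirmed only for `Int` in the report, and the `nullable` field is not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not an Arrow API): rewrites a flatbuffers-style schema
// or field object, already parsed into Maps and Lists, into the key layout
// shown in the Java JSON example.
final class FlatbufSchemaKeys {
    private FlatbufSchemaKeys() {}

    @SuppressWarnings("unchecked")
    static Map<String, Object> toJavaStyle(Map<String, Object> node) {
        Map<String, Object> out = new LinkedHashMap<>();
        Object typeId = node.get("type_type");
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String key = e.getKey();
            Object value = e.getValue();
            switch (key) {
                case "type_type":
                    // Folded into type.name when "type" is handled.
                    break;
                case "custom_metadata":
                    out.put("metadata", value);
                    break;
                case "type":
                    out.put("type", convertType(typeId, (Map<String, Object>) value));
                    break;
                case "fields":
                case "children": {
                    // Recurse into nested field objects.
                    List<Object> converted = new ArrayList<>();
                    for (Object child : (List<Object>) value) {
                        converted.add(toJavaStyle((Map<String, Object>) child));
                    }
                    out.put(key, converted);
                    break;
                }
                default:
                    out.put(key, value);
            }
        }
        // A type id without an accompanying "type" object still needs a name.
        if (typeId != null && !out.containsKey("type")) {
            out.put("type", convertType(typeId, new LinkedHashMap<>()));
        }
        return out;
    }

    private static Map<String, Object> convertType(Object typeId, Map<String, Object> type) {
        Map<String, Object> out = new LinkedHashMap<>();
        if (typeId != null) {
            // Assumption: the Java name is the lowercased type id ("Int" -> "int").
            out.put("name", typeId.toString().toLowerCase(Locale.ROOT));
        }
        for (Map.Entry<String, Object> e : type.entrySet()) {
            // Assumption: is_signed is the only type key that needs renaming here.
            String key = "is_signed".equals(e.getKey()) ? "isSigned" : e.getKey();
            out.put(key, e.getValue());
        }
        return out;
    }
}
```

Working on a generic map tree keeps the sketch independent of any particular JSON library. A real fix would more likely belong in the Jackson bindings or in the flatbuffers text generation.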