[I] Avro schema parser uses type name instead of field name [AVRO] [arrow-rs]

via GitHub Wed, 26 Nov 2025 02:01:56 -0800


EmilyMatt opened a new issue, #8928:
URL: https://github.com/apache/arrow-rs/issues/8928


   **Describe the bug**
   If I provide a schema like so:
   ```
   {
     "namespace": "ns1",
     "name": "main",
     "type": "record",
     "fields": [
       {
         "name": "f1",
         "type": {
           "type": "record",
           "namespace": "ns2",
           "name": "record2",
           "fields": [
             {
               "name": "f1_1",
               "type": "string"
             }
           ]
         }
       },
       {
         "name": "f2",
         "type": "ns2.record2"
       }
     ]
   }
   ```
   
   The schema parser uses "record2" as the field name, even though that is the 
type name; the field name should be "f1". As a result, converting an Arrow 
schema that does not carry the Avro schema JSON in its metadata always fails, 
and in general the resulting schema does not match the output RecordBatch.
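
To make the distinction concrete, here is a minimal, self-contained sketch. It is not the arrow-avro implementation: `AvroType`, `AvroField`, and the helper functions are hypothetical names made up for illustration. It contrasts the naming described above (the record type name wins) with the naming I would expect (the field name always wins).

```rust
/// Hypothetical, simplified stand-in for an Avro type (not an arrow-avro type).
#[derive(Debug, Clone)]
enum AvroType {
    Str,
    Record {
        namespace: String,
        name: String,
        fields: Vec<AvroField>,
    },
}

/// Hypothetical stand-in for one entry in a record's "fields" array.
#[derive(Debug, Clone)]
struct AvroField {
    name: String,
    ty: AvroType,
}

/// Full name of a named type, e.g. "ns2.record2". This identifies the type,
/// not the column that holds a value of that type.
fn full_type_name(ty: &AvroType) -> Option<String> {
    match ty {
        AvroType::Record { namespace, name, .. } => Some(format!("{namespace}.{name}")),
        AvroType::Str => None,
    }
}

/// Naming as described in this issue: a record-typed field ends up
/// carrying the record's type name.
fn buggy_column_name(field: &AvroField) -> &str {
    match &field.ty {
        AvroType::Record { name, .. } => name.as_str(),
        AvroType::Str => field.name.as_str(),
    }
}

/// Expected naming: the column is always named after the field.
fn expected_column_name(field: &AvroField) -> &str {
    field.name.as_str()
}

/// Builds the "f1" field from the schema above.
fn example_field() -> AvroField {
    AvroField {
        name: "f1".to_string(),
        ty: AvroType::Record {
            namespace: "ns2".to_string(),
            name: "record2".to_string(),
            fields: vec![AvroField {
                name: "f1_1".to_string(),
                ty: AvroType::Str,
            }],
        },
    }
}
```

For the "f1" field, `buggy_column_name` yields "record2" while `expected_column_name` yields "f1"; the full type name "ns2.record2" should only matter when resolving the reference used by "f2".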
   
   **To Reproduce**
   
   Create an Avro file with the writer schema above, then build a reader 
schema from the following Arrow schema:
   ```
   Schema::new(vec![Field::new(
       "f1",
       DataType::Struct(vec![Field::new("f1_1", DataType::Utf8, false)].into()),
       false,
   )])
   .with_metadata(HashMap::from([
       (AVRO_NAMESPACE_METADATA_KEY.into(), "ns1".into()),
       (AVRO_NAME_METADATA_KEY.into(), "main".into()),
   ]));
   ```
   (Convert it with `AvroSchema::try_from()`, etc.)
   
   Reading fails with a field-name mismatch: the writer schema correctly has 
the field "f1", but the newly created reader schema has "record2" instead.
   
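The mismatch follows from Avro schema resolution matching record fields by name. The sketch below is only an illustration under that assumption, not the arrow-avro resolution code; `unmatched_reader_fields` is a hypothetical helper that lists reader fields with no same-named writer field.

```rust
/// Hypothetical helper (not an arrow-avro API): returns every reader field
/// name that has no identically named field in the writer schema.
fn unmatched_reader_fields<'a>(writer: &[&str], reader: &[&'a str]) -> Vec<&'a str> {
    reader
        .iter()
        .copied()
        // Keep only reader names that no writer field shares.
        .filter(|r| !writer.iter().any(|w| w == r))
        .collect()
}
```

With writer fields "f1" and "f2", a reader field named "f1" resolves, whereas a reader field named "record2" is left unmatched, which is the situation described above.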
   **Expected behavior**
   
   The field name ("f1"), not the type name, should be propagated to the 
resulting schema.
   
   **Additional context**

