[ 
https://issues.apache.org/jira/browse/AVRO-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Skraba updated AVRO-2702:
------------------------------
    Status: Patch Available  (was: Open)

> Avro ResolvingGrammarGenerator does not honor "avro.java.string" property in 
> inner record schemas
> -------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-2702
>                 URL: https://issues.apache.org/jira/browse/AVRO-2702
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.1
>            Reporter: Thorsten Hake
>            Assignee: Adam Bellemare
>            Priority: Major
>              Labels: ClassCastException, Deserialize
>             Fix For: 1.10.1
>
>         Attachments: Bar.kt
>
>
> The type property "avro.java.string" selects the CharSequence 
> implementation used for a string type in Java. The Avro Maven plugin sets 
> this property in the Java code it generates when the <stringType> property 
> is set to "String".
> However, the ResolvingGrammarGenerator, which helps match the writer schema 
> to the reader schema, does not honor this property for inner records within 
> unions. Strings in the inner record are deserialized to 
> org.apache.avro.util.Utf8 instead of java.lang.String, while string fields 
> belonging to the outer record are correctly deserialized to 
> java.lang.String.
> If you try to deserialize an Avro record whose schema has an inner record 
> within a union type, using the Java code generated by the Maven plugin 
> (with <stringType> set to "String"), you'll get a ClassCastException:
> {noformat}
> Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 
> cannot be cast to class java.lang.String
> {noformat}
> This is because the generated Java code expects strings to be deserialized 
> according to the "avro.java.string" property, which does not happen for the 
> inner record.
> I would expect the deserializer to treat strings in the inner record the 
> same as strings in the outer record.
> Example:
> writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "foo",
>   "fields": [
>     {
>       "name": "k",
>       "type": "string"
>     },
>     {
>       "name": "value",
>       "type": [
>         "null",
>         {
>           "type": "record",
>           "name": "bar",
>           "fields": [
>             {
>               "name": "str",
>               "type": "string"
>             }
>           ]
>         }
>       ]
>     }
>   ]
> }
> {code}
> reader schema:
> {code:json}
> {
>   "type": "record",
>   "name": "foo",
>   "fields": [
>     {
>       "name": "k",
>       "type": {
>         "type": "string",
>         "avro.java.string": "String"
>       }
>     },
>     {
>       "name": "value",
>       "type": [
>         "null",
>         {
>           "type": "record",
>           "name": "bar",
>           "fields": [
>             {
>               "name": "str",
>               "type": {
>                 "type": "string",
>                 "avro.java.string": "String"
>               }
>             }
>           ]
>         }
>       ]
>     }
>   ]
> }
> {code}
> You'll find example Kotlin code demonstrating the problem in the attached 
> Bar.kt.
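
For readers without the attachment, the failure mode in the description can be sketched in plain Java with no Avro dependency. This is only an illustration under stated assumptions: FakeUtf8 is a hypothetical stand-in for org.apache.avro.util.Utf8 (a CharSequence that is not a java.lang.String), strictCast imitates code that blindly casts a decoded value to String, and lenient shows one possible defensive conversion. None of these names come from Avro or from the actual patch.

```java
import java.nio.charset.StandardCharsets;

class StringTypeMismatchSketch {

    /** Hypothetical stand-in for a Utf8-like type: a CharSequence that is not a String. */
    static final class FakeUtf8 implements CharSequence {
        private final byte[] bytes;

        FakeUtf8(String s) {
            this.bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        @Override public int length() { return toString().length(); }
        @Override public char charAt(int index) { return toString().charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return toString().subSequence(start, end);
        }
        @Override public String toString() {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    /** Imitates code that assumes every decoded string is already a java.lang.String. */
    static String strictCast(Object decoded) {
        return (String) decoded; // throws ClassCastException for FakeUtf8
    }

    /** One possible defensive alternative: accept any CharSequence and convert it. */
    static String lenient(CharSequence decoded) {
        return decoded == null ? null : decoded.toString();
    }

    /** Returns true if strictCast rejects the value with a ClassCastException. */
    static boolean castFails(Object decoded) {
        try {
            strictCast(decoded);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object outer = "key";               // outer field "k": decoded as java.lang.String
        Object inner = new FakeUtf8("val"); // inner field "str": decoded as a Utf8-like value

        System.out.println(castFails(outer));                // false
        System.out.println(castFails(inner));                // true
        System.out.println(lenient((CharSequence) inner));   // val
    }
}
```

The lenient conversion only hides the symptom on the consumer side. The behaviour the report asks for is that the resolver applies "avro.java.string" to the inner record as it does to the outer one.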



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
