[ 
https://issues.apache.org/jira/browse/AVRO-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated AVRO-3512:
---------------------------------
    Labels: pull-request-available  (was: )

> aliases to the null namespace do not work as expected
> -----------------------------------------------------
>
>                 Key: AVRO-3512
>                 URL: https://issues.apache.org/jira/browse/AVRO-3512
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java, spec
>    Affects Versions: 1.11.0
>            Reporter: Radai Rosenblatt
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: AVRO-3512.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> the avro spec allows for the "null namespace" (when no namespace is specified 
> anywhere). it also has [the 
> following|https://avro.apache.org/docs/current/spec.html#Aliases] to say 
> about aliases:
> {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully 
> qualified names of its aliases are "a.c" and "x.y"
> {quote}
> which means a "simple" alias ("c" above) inherits any namespace defined on 
> the declaring type.
>  
> now suppose i were to use aliases on a namespaced schema to be able to read 
> data written using a schema that is in the null namespace (has no namespace).
> here is my writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "AncientSchema",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "AncientEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>       }
>     }
>   ]
> }
> {code}
> and reader schema:
> {code:json}
> {
>   "type": "record",
>   "namespace": "much.namespace",
>   "name": "ModernRecord",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "ModernEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>         "aliases": [
>            ".AncientEnum"
>         ]
>       }
>     }
>   ],
>   "aliases": [
>     ".AncientSchema"
>   ]
> }
> {code}
> notice the dots used in the aliases. as far as i understand the spec this 
> should be the only legal way to do this. and it does indeed work .... to a 
> point.
>  
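
The naming rule the report relies on can be sketched in plain Java, without the Avro library. This is only an illustration: `AliasNames` and `resolveAlias` are hypothetical names, not part of the Avro API, and treating a leading dot as "null namespace" is the reporter's reading of the spec rather than something the spec states explicitly.

```java
class AliasNames {
    /**
     * Hypothetical helper (not an Avro API): computes the full name an alias
     * refers to, given the namespace of the type that declares it.
     * Assumption: everything before the last dot is the namespace, so a
     * leading dot (".Foo") yields an empty, i.e. null, namespace.
     */
    static String resolveAlias(String alias, String declaringNamespace) {
        int lastDot = alias.lastIndexOf('.');
        if (lastDot < 0) {
            // Simple alias: inherits the namespace of the declaring type.
            boolean noNamespace = declaringNamespace == null || declaringNamespace.isEmpty();
            return noNamespace ? alias : declaringNamespace + "." + alias;
        }
        String namespace = alias.substring(0, lastDot);
        String simpleName = alias.substring(lastDot + 1);
        // An empty namespace means the alias targets the null namespace.
        return namespace.isEmpty() ? simpleName : namespace + "." + simpleName;
    }

    public static void main(String[] args) {
        // The example from the spec: type "a.b" with aliases "c" and "x.y".
        System.out.println(resolveAlias("c", "a"));    // a.c
        System.out.println(resolveAlias("x.y", "a"));  // x.y
        // The reader schema above: the dotted alias targets the null namespace.
        System.out.println(resolveAlias(".AncientEnum", "much.namespace")); // AncientEnum
        // Without the dot, the alias lands in the reader's own namespace.
        System.out.println(resolveAlias("AncientEnum", "much.namespace"));  // much.namespace.AncientEnum
    }
}
```

Under these assumptions, an alias whose leading dot is dropped resolves to much.namespace.AncientEnum and no longer names the writer's AncientEnum.
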
> when testing this i found multiple issues with avro's handling of such 
> aliases, dating back to late avro 1.7.*
>  
>  # without these aliases, decoding does fail, but it fails over the nested 
> enum, whereas it should have failed "immediately" on the fullname mismatch on 
> the top level record schema. in fact, on further testing i think avro (at 
> least in java) doesn't bother comparing the fullnames on the top level writer 
> vs reader schemas at all?
>  # while the schema with the aliases parse()es fine, Schema.toString() strips 
> out the dots from the aliases, thereby creating a "monsanto terminator 
> schema" - once printed and parsed again the aliases would become "simple 
> aliases" and stop working
>  # the spec doesn't explicitly talk about how to use aliases to "target" the 
> null namespace. if this is an intentional feature I think the spec should be 
> expanded a little to cover it?
>  
> i have code to reproduce all these issues in 
> [https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java]
>  (coded against master)
>  
> i also have code to reproduce all the above against multiple older avro 
> versions in 
> [https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
