[ https://issues.apache.org/jira/browse/AVRO-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated AVRO-3512:
---------------------------------
    Labels: pull-request-available  (was: )

> aliases to the null namespace do not work as expected
> -----------------------------------------------------
>
>                 Key: AVRO-3512
>                 URL: https://issues.apache.org/jira/browse/AVRO-3512
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java, spec
>    Affects Versions: 1.11.0
>            Reporter: Radai Rosenblatt
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: AVRO-3512.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> the avro spec allows for the "null namespace" (when no namespace is specified
> anywhere). it also has [the
> following|https://avro.apache.org/docs/current/spec.html#Aliases] to say
> about aliases:
> {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully
> qualified names of its aliases are "a.c" and "x.y"
> {quote}
> which means a "simple" alias ("c" above) inherits any namespace defined on
> the declaring type.
>
> now suppose i want to use aliases on a namespaced schema to be able to read
> data written using a schema that is in the null namespace (has no namespace).
> here is my writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "AncientSchema",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "AncientEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>       }
>     }
>   ]
> }
> {code}
> and reader schema:
> {code:json}
> {
>   "type": "record",
>   "namespace": "much.namespace",
>   "name": "ModernRecord",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "ModernEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>         "aliases": [
>           ".AncientEnum"
>         ]
>       }
>     }
>   ],
>   "aliases": [
>     ".AncientSchema"
>   ]
> }
> {code}
> notice the leading dots used in the aliases. as far as i understand the spec
> this should be the only legal way to do this. and it does indeed work .... to
> a point.
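The alias resolution rule quoted above can be sketched in plain Java with no Avro dependency. `AliasNames` and `resolve` are hypothetical names used only for illustration and are not part of the Avro API. Treating a leading dot as "the null namespace" is the reading of the spec described in this report, so it is an assumption here rather than confirmed behaviour.

```java
import java.util.Objects;

/**
 * Hypothetical helper (NOT Avro API): computes the fullname an alias refers to,
 * following the rule quoted from the spec plus the assumed leading-dot convention.
 */
final class AliasNames {

  private AliasNames() {}

  /**
   * @param declaringNamespace namespace of the type declaring the alias,
   *                           or null/empty for the null namespace
   * @param alias              the alias exactly as written in the schema
   * @return the fullname the alias is taken to refer to
   */
  static String resolve(String declaringNamespace, String alias) {
    Objects.requireNonNull(alias, "alias");
    int lastDot = alias.lastIndexOf('.');
    if (lastDot < 0) {
      // "simple" alias: inherits the namespace of the declaring type
      boolean noNamespace = declaringNamespace == null || declaringNamespace.isEmpty();
      return noNamespace ? alias : declaringNamespace + "." + alias;
    }
    String namespace = alias.substring(0, lastDot);
    String simpleName = alias.substring(lastDot + 1);
    // assumption: a leading dot gives an empty namespace, i.e. the null namespace
    return namespace.isEmpty() ? simpleName : namespace + "." + simpleName;
  }

  public static void main(String[] args) {
    System.out.println(resolve("a", "c"));                          // a.c
    System.out.println(resolve("a", "x.y"));                        // x.y
    System.out.println(resolve("much.namespace", ".AncientEnum"));  // AncientEnum
    System.out.println(resolve("much.namespace", "AncientEnum"));   // much.namespace.AncientEnum
  }
}
```

The last two calls show why the leading dot matters for the reader schema above: under this reading, only `.AncientEnum` names the writer's enum in the null namespace, while the same alias without the dot would point into `much.namespace`.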
>
> when testing this i found multiple issues with avro's handling of such
> aliases, dating back to late avro 1.7.*
>
> # without these aliases, decoding does fail, but it fails over the nested
> enum, whereas it should have failed "immediately" on the fullname mismatch on
> the top level record schema. in fact, on further testing i think avro (at
> least in java) doesn't bother comparing the fullnames on the top level writer
> vs reader schemas at all?
> # while the schema with the aliases parse()es fine, Schema.toString() strips
> out the dots from the aliases, thereby creating a "monsanto terminator
> schema" - once printed and parsed again the aliases would become "simple
> aliases" and stop working
> # the spec doesn't explicitly talk about how to use aliases to "target" the
> null namespace. if this is an intentional feature i think the spec should be
> expanded a little to cover it?
>
> i have code to reproduce all these issues in
> [https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java]
> (coded against master)
>
> i also have code to reproduce all the above against multiple older avro
> versions in
> [https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)