[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=771012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771012 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/May/22 19:59
            Start Date: 16/May/22 19:59
    Worklog Time Spent: 10m
      Work Description: martin-g merged PR #1685:
URL: https://github.com/apache/avro/pull/1685


Issue Time Tracking
-------------------

    Worklog Id:     (was: 771012)
    Time Spent: 1h 10m  (was: 1h)

> aliases to the null namespace do not work as expected
> -----------------------------------------------------
>
>                 Key: AVRO-3512
>                 URL: https://issues.apache.org/jira/browse/AVRO-3512
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java, spec
>    Affects Versions: 1.11.0
>            Reporter: Radai Rosenblatt
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: AVRO-3512.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> the avro spec allows for the "null namespace" (when no namespace is specified
> anywhere). it also has [the
> following|https://avro.apache.org/docs/current/spec.html#Aliases] to say
> about aliases:
> {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully
> qualified names of its aliases are "a.c" and "x.y"
> {quote}
> which means a "simple" alias ("c" above) inherits any namespace defined on
> the declaring type.
>
> now suppose i was to use aliases on a namespaced schema to be able to read
> data written using a schema that is in the null namespace (has no namespace).
> here are my writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "AncientSchema",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "AncientEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>       }
>     }
>   ]
> }
> {code}
> and reader schema:
> {code:json}
> {
>   "type": "record",
>   "namespace": "much.namespace",
>   "name": "ModernRecord",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "ModernEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>         "aliases": [
>           ".AncientEnum"
>         ]
>       }
>     }
>   ],
>   "aliases": [
>     ".AncientSchema"
>   ]
> }
> {code}
> notice the dots used in the aliases. as far as i understand the spec this
> should be the only legal way to do this. and it does indeed work to a
> point.
>
> when testing this i found multiple issues with avro's handling of such
> aliases, dating back to late avro 1.7.*
>
> # without these aliases, decoding does fail, but it fails over the nested
> enum, whereas it should have failed "immediately" on the fullname mismatch on
> the top level record schema. in fact, on further testing i think avro (at
> least in java) doesnt bother comparing the fullnames on the top level writer
> vs reader schemas at all?
> # while the schema with the aliases parse()es fine, Schema.toString() strips
> out the dots from the aliases, thereby creating a "monsanto terminator
> schema" - once printed and parsed again the aliases would become "simple
> aliases" and stop working
> # the spec doesnt explicitly talk about how to use aliases to "target" the
> null namespace. if this is an intentional feature I think the spec should be
> expanded a little to cover it?
>
> i have code to reproduce all these issues in
> [https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java]
> (coded against master)
>
> i also have code to reproduce all the above against multiple older avro
> versions in
> [https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java]

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
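The alias rule quoted in the issue, together with the reporter's reading that a leading dot targets the null namespace, can be sketched in plain Java. This is a minimal illustration and not Avro's implementation: the class `AliasResolution` and its `resolve` method are hypothetical names, and treating `.AncientEnum` as "null namespace" is the reporter's interpretation, which the ticket itself says the spec does not spell out.

```java
import java.util.Objects;

/** Illustrative sketch only; hypothetical helper, not part of the Avro library. */
final class AliasResolution {

    private AliasResolution() {}

    /**
     * Resolves an alias to a fullname.
     *
     * @param alias              the alias as written in the schema
     * @param declaringNamespace namespace of the declaring type; null or "" means the null namespace
     */
    static String resolve(String alias, String declaringNamespace) {
        Objects.requireNonNull(alias, "alias");
        if (alias.indexOf('.') >= 0) {
            // A dotted alias is already a fullname. Assumption (reporter's reading):
            // a leading dot means "null namespace", so the fullname is the bare name.
            return alias.startsWith(".") ? alias.substring(1) : alias;
        }
        if (declaringNamespace == null || declaringNamespace.isEmpty()) {
            // Simple alias on a type in the null namespace stays in the null namespace.
            return alias;
        }
        // Simple alias inherits the namespace of the declaring type.
        return declaringNamespace + "." + alias;
    }

    public static void main(String[] args) {
        System.out.println(resolve("c", "a"));                         // a.c
        System.out.println(resolve("x.y", "a"));                       // x.y
        System.out.println(resolve(".AncientEnum", "much.namespace")); // AncientEnum
        // What the alias becomes once the leading dot has been stripped on print:
        System.out.println(resolve("AncientEnum", "much.namespace"));  // much.namespace.AncientEnum
    }
}
```

The last two calls show why the second problem in the list matters: once the leading dot is lost, the alias resolves into `much.namespace` and no longer matches the writer's `AncientEnum`.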
[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770812 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/May/22 12:56
            Start Date: 16/May/22 12:56
    Worklog Time Spent: 10m
      Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1127639076

   I am going to merge this PR even without the confirmation of the current Java behavior in the JIRA ticket.
   I will adapt it later if the Java impl changes!


Issue Time Tracking
-------------------

    Worklog Id:     (was: 770812)
    Time Spent: 1h  (was: 50m)
[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770683 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/May/22 07:14
            Start Date: 16/May/22 07:14
    Worklog Time Spent: 10m
      Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1127311267

   I will extract the "Alias is a Name" to a separate issue/PR. It is a bigger change that is not really related to this issue/PR.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 770683)
    Time Spent: 50m  (was: 40m)
[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770341 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 19:39
            Start Date: 13/May/22 19:39
    Worklog Time Spent: 10m
      Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1126390216

   > I'm wondering if the aliases are all fully qualified names then we should store them in a 'Name' struct? Not sure upsides or downsides though

   I like the idea! Let's try it and see!


Issue Time Tracking
-------------------

    Worklog Id:     (was: 770341)
    Time Spent: 40m  (was: 0.5h)
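The "store aliases in a 'Name' struct" idea from the comment above can be illustrated with a small value type. The PR under discussion is against the Rust implementation; the Java sketch below is only an analogy, and `QualifiedName`, `parse`, `fullname`, and `toAliasString` are hypothetical names, not Avro API. It assumes, as the issue reporter does, that a leading dot denotes the null namespace.

```java
import java.util.Objects;

/** Hypothetical value type for a namespace-qualified name; not part of Avro. */
final class QualifiedName {
    private final String namespace; // "" stands for the null namespace
    private final String name;

    private QualifiedName(String namespace, String name) {
        this.namespace = namespace;
        this.name = name;
    }

    /** Parses a name or alias, inheriting enclosingNamespace when raw has no dot. */
    static QualifiedName parse(String raw, String enclosingNamespace) {
        Objects.requireNonNull(raw, "raw");
        int lastDot = raw.lastIndexOf('.');
        if (lastDot < 0) {
            return new QualifiedName(enclosingNamespace == null ? "" : enclosingNamespace, raw);
        }
        // ".AncientEnum" yields namespace "" (null namespace) and name "AncientEnum".
        return new QualifiedName(raw.substring(0, lastDot), raw.substring(lastDot + 1));
    }

    String fullname() {
        return namespace.isEmpty() ? name : namespace + "." + name;
    }

    /** Serialized form that keeps the leading dot, so printing and re-parsing is lossless. */
    String toAliasString() {
        return namespace.isEmpty() ? "." + name : fullname();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof QualifiedName)) {
            return false;
        }
        QualifiedName other = (QualifiedName) o;
        return namespace.equals(other.namespace) && name.equals(other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(namespace, name);
    }
}
```

Because the namespace is stored explicitly, an alias parsed from `.AncientEnum` compares equal to the writer's `AncientEnum` in the null namespace, and `toAliasString()` cannot silently turn it into a simple alias when the schema is printed.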
[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770253 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 16:03
            Start Date: 13/May/22 16:03
    Worklog Time Spent: 10m
      Work Description: radai-rosenblatt commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1126214594

   looks OK to me and my lack of familiarity with rust :-)

   my only concern is i still dont know if aliases into the null namespace are a bug or a feature. personally i definitely think they should be a feature (and i have used them at work before), but there's been no response on my issue yet


Issue Time Tracking
-------------------

    Worklog Id:     (was: 770253)
    Time Spent: 0.5h  (was: 20m)
[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770173&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770173 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 13:06
            Start Date: 13/May/22 13:06
    Worklog Time Spent: 10m
      Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1126036967

   // CC @jklamer @radai-rosenblatt


Issue Time Tracking
-------------------

    Worklog Id:     (was: 770173)
    Time Spent: 20m  (was: 10m)
[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected
[ https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770172 ]

ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 13:05
            Start Date: 13/May/22 13:05
    Worklog Time Spent: 10m
      Work Description: martin-g opened a new pull request, #1685:
URL: https://github.com/apache/avro/pull/1685

   ### Jira

   - [X] My PR addresses the following [Avro Jira](https://issues.apache.org/jira/browse/AVRO/) issues and references them in the PR title. For example, "AVRO-1234: My Avro PR"
     - https://issues.apache.org/jira/browse/AVRO-3512

   ### Tests

   - [X] My PR adds new unit tests

   ### Commits

   - [X] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](https://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"

   ### Documentation

   - [X] A typo is fixed in the documentation.


Issue Time Tracking
-------------------

            Worklog Id:     (was: 770172)
    Remaining Estimate: 0h
            Time Spent: 10m