[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=771012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771012
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 16/May/22 19:59
Start Date: 16/May/22 19:59
Worklog Time Spent: 10m 
  Work Description: martin-g merged PR #1685:
URL: https://github.com/apache/avro/pull/1685




Issue Time Tracking
---

Worklog Id: (was: 771012)
Time Spent: 1h 10m  (was: 1h)

> aliases to the null namespace do not work as expected
> -
>
>                 Key: AVRO-3512
>                 URL: https://issues.apache.org/jira/browse/AVRO-3512
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java, spec
>    Affects Versions: 1.11.0
>            Reporter: Radai Rosenblatt
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: AVRO-3512.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> the avro spec allows for the "null namespace" (when no namespace is specified 
> anywhere). it also has [the 
> following|https://avro.apache.org/docs/current/spec.html#Aliases] to say 
> about aliases:
> {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully 
> qualified names of its aliases are "a.c" and "x.y"
> {quote}
> which means a "simple" alias ("c" above) inherits any namespace defined on 
> the declaring type.
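
The rule quoted above can be sketched in a few lines. This is only an illustration, not Avro's implementation: `alias_fullname` is a hypothetical helper, and reading a leading dot as "the null namespace" is the interpretation this issue proposes, not something the spec states explicitly.

```python
from typing import Optional


def alias_fullname(alias: str, declaring_namespace: Optional[str]) -> str:
    """Return the fully qualified name an alias refers to (hypothetical helper)."""
    if "." in alias:
        # A dotted alias is already qualified. A leading dot is read here as
        # "the null namespace", so ".c" names plain "c" (assumption from this issue).
        return alias[1:] if alias.startswith(".") else alias
    # A simple alias inherits the namespace of the declaring type, if there is one.
    return f"{declaring_namespace}.{alias}" if declaring_namespace else alias


# The spec's example: a type named "a.b" with aliases "c" and "x.y".
print(alias_fullname("c", "a"))    # a.c
print(alias_fullname("x.y", "a"))  # x.y
print(alias_fullname(".c", "a"))   # c
```
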
>  
> now suppose i want to use aliases on a namespaced schema to be able to read 
> data written using a schema that is in the null namespace (has no namespace).
> here is my writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "AncientSchema",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "AncientEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>       }
>     }
>   ]
> }
> {code}
> and reader schema:
> {code:json}
> {
>   "type": "record",
>   "namespace": "much.namespace",
>   "name": "ModernRecord",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "ModernEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>         "aliases": [
>           ".AncientEnum"
>         ]
>       }
>     }
>   ],
>   "aliases": [
>     ".AncientSchema"
>   ]
> }
> {code}
> notice the leading dots in the aliases. as far as i understand the spec, this 
> should be the only legal way to do this, and it does indeed work, up to a 
> point.
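
To make the intent of the dotted aliases concrete, the sketch below walks the reader schema and lists which writer fullnames each named type would accept. It is a simplified illustration under stated assumptions, not Avro's resolution logic: `qualify` and `accepted_names` are hypothetical helpers, unions, arrays, and maps are ignored, and a leading dot is read as the null namespace.

```python
import json

READER = json.loads("""
{
  "type": "record",
  "namespace": "much.namespace",
  "name": "ModernRecord",
  "fields": [
    {
      "name": "enumField",
      "type": {
        "type": "enum",
        "name": "ModernEnum",
        "symbols": ["THE", "SPEC", "IS", "A", "LIE"],
        "aliases": [".AncientEnum"]
      }
    }
  ],
  "aliases": [".AncientSchema"]
}
""")


def qualify(name, namespace):
    """Qualify a name; a leading dot is read as the null namespace (assumption)."""
    if name.startswith("."):
        return name[1:]
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


def accepted_names(schema, enclosing_namespace=None):
    """Map each named type's fullname to the set of writer fullnames it accepts."""
    result = {}
    if isinstance(schema, dict) and schema.get("type") in ("record", "enum", "fixed"):
        namespace = schema.get("namespace", enclosing_namespace)
        fullname = qualify(schema["name"], namespace)
        aliases = {qualify(alias, namespace) for alias in schema.get("aliases", [])}
        result[fullname] = {fullname} | aliases
        # Recurse into record fields; nested types inherit the namespace.
        for field in schema.get("fields", []):
            result.update(accepted_names(field["type"], namespace))
    return result


accepted = accepted_names(READER)
print("AncientSchema" in accepted["much.namespace.ModernRecord"])  # True
print("AncientEnum" in accepted["much.namespace.ModernEnum"])      # True
```
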
>  
> when testing this i found multiple issues with avro's handling of such 
> aliases, dating back to late avro 1.7.*
>  
>  # without these aliases, decoding does fail, but it fails on the nested 
> enum, whereas it should have failed "immediately" on the fullname mismatch on 
> the top level record schema. in fact, on further testing i think avro (at 
> least in java) doesn't bother comparing the fullnames of the top level writer 
> vs reader schemas at all?
>  # while the schema with the aliases parse()es fine, Schema.toString() strips 
> out the dots from the aliases, thereby creating a "monsanto terminator 
> schema": once printed and parsed again, the aliases become "simple 
> aliases" and stop working
>  # the spec doesn't explicitly talk about how to use aliases to "target" the 
> null namespace. if this is an intentional feature i think the spec should be 
> expanded a little to cover it?
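
The second item in the list is easy to see in miniature. The sketch below only models the behaviour described in the report: `lossy_print` is a hypothetical stand-in for what `Schema.toString()` is said to do to aliases, and `qualify` is an illustrative helper, not Avro code.

```python
def qualify(name, namespace):
    """Qualify a name; a leading dot is read as the null namespace (assumption)."""
    if name.startswith("."):
        return name[1:]
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


def lossy_print(alias):
    """Stand-in for the reported printing behaviour: the leading dot is dropped."""
    return alias.lstrip(".")


original = ".AncientEnum"
printed = lossy_print(original)

# Before the round trip the alias targets the null namespace...
print(qualify(original, "much.namespace"))  # AncientEnum
# ...afterwards it is a simple alias and inherits the reader's namespace.
print(qualify(printed, "much.namespace"))   # much.namespace.AncientEnum
```
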
>  
> i have code to reproduce all these issues in 
> [https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java]
>  (coded against master)
>  
> i also have code to reproduce all the above against multiple older avro 
> versions in 
> [https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770812
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 16/May/22 12:56
Start Date: 16/May/22 12:56
Worklog Time Spent: 10m 
  Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1127639076

   I am going to merge this PR even without the confirmation of the current 
Java behavior in the JIRA ticket.
   I will adapt it later if the Java impl changes!




Issue Time Tracking
---

Worklog Id: (was: 770812)
Time Spent: 1h  (was: 50m)






[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770683
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 16/May/22 07:14
Start Date: 16/May/22 07:14
Worklog Time Spent: 10m 
  Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1127311267

   I will extract the "Alias is a Name" to a separate issue/PR. It is a bigger 
change that is not really related to this issue/PR.




Issue Time Tracking
---

Worklog Id: (was: 770683)
Time Spent: 50m  (was: 40m)






[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770341
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 13/May/22 19:39
Start Date: 13/May/22 19:39
Worklog Time Spent: 10m 
  Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1126390216

   > I'm wondering if the aliases are all fully qualified names then we should 
store them in a 'Name' struct? Not sure upsides or downsides though
   
   I like the idea! Let's try it and see!
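
   One way to picture the "alias as a Name" idea, written in Python for brevity even though the PR is against the Rust implementation. This is a hypothetical sketch, not the design that was adopted: the `Name` class, `parse`, and `render` are all illustrative, and the leading-dot convention for the null namespace is assumed from the Jira issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Name:
    """A parsed name: simple name plus optional namespace (None = null namespace)."""

    name: str
    namespace: Optional[str]

    @classmethod
    def parse(cls, raw: str, enclosing_namespace: Optional[str]) -> "Name":
        if "." in raw:
            namespace, _, simple = raw.rpartition(".")
            # ".c" partitions into ("", ".", "c"); an empty namespace means null.
            return cls(simple, namespace or None)
        # A simple name inherits the enclosing namespace.
        return cls(raw, enclosing_namespace)

    def render(self, enclosing_namespace: Optional[str]) -> str:
        """Write the name so that parse() in the same context returns self."""
        if self.namespace == enclosing_namespace:
            return self.name
        if self.namespace is None:
            return "." + self.name
        return f"{self.namespace}.{self.name}"


alias = Name.parse(".AncientEnum", "much.namespace")
print(alias)                           # Name(name='AncientEnum', namespace=None)
print(alias.render("much.namespace"))  # .AncientEnum
```

   Keeping the namespace as a separate field means the printer can decide whether a leading dot is needed, so a print-then-parse round trip keeps the alias pointing at the null namespace.
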




Issue Time Tracking
---

Worklog Id: (was: 770341)
Time Spent: 40m  (was: 0.5h)






[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770253
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 13/May/22 16:03
Start Date: 13/May/22 16:03
Worklog Time Spent: 10m 
  Work Description: radai-rosenblatt commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1126214594

   looks OK to me and my lack of familiarity with rust :-)
   
   my only concern is i still don't know if aliases into the null namespace are 
a bug or a feature. 
   personally i definitely think they should be a feature (and i have used them 
at work before), but there's been no response on my issue yet




Issue Time Tracking
---

Worklog Id: (was: 770253)
Time Spent: 0.5h  (was: 20m)






[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770173&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770173
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 13/May/22 13:06
Start Date: 13/May/22 13:06
Worklog Time Spent: 10m 
  Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1126036967

   // CC @jklamer @radai-rosenblatt 




Issue Time Tracking
---

Worklog Id: (was: 770173)
Time Spent: 20m  (was: 10m)






[jira] [Work logged] (AVRO-3512) aliases to the null namespace do not work as expected

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770172
 ]

ASF GitHub Bot logged work on AVRO-3512:


Author: ASF GitHub Bot
Created on: 13/May/22 13:05
Start Date: 13/May/22 13:05
Worklog Time Spent: 10m 
  Work Description: martin-g opened a new pull request, #1685:
URL: https://github.com/apache/avro/pull/1685

   ### Jira
   
   - [X] My PR addresses the following [Avro 
Jira](https://issues.apache.org/jira/browse/AVRO/) issues and references them 
in the PR title. For example, "AVRO-1234: My Avro PR"
     - https://issues.apache.org/jira/browse/AVRO-3512
   
   ### Tests
   
   - [X] My PR adds new unit tests
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines. In 
addition, my commits follow the guidelines from "[How to write a good git 
commit message](https://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [X] A typo is fixed in the documentation.




Issue Time Tracking
---

Worklog Id: (was: 770172)
Remaining Estimate: 0h
Time Spent: 10m
