Re: Unable to get Avro schema alias working for an array item

2022-01-15 Thread Spencer Nelson
This should work according to the spec. What language and Avro library are
you using, and with what version?

Aliases are a bit tricky to use correctly. When deserializing, you may need
to supply a writer's schema that uses oldFieldName1 and oldFieldName2
alongside a reader's schema that uses newFieldName1 and newFieldName2. In
other words, you may need to provide both the old and new schemas to the
deserializer. This is built in to how aliases work
(https://avro.apache.org/docs/current/spec.html#Aliases). That may sound a
little abstract; it's easier to describe in the context of a particular
language.
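
To make the idea less abstract, here is a minimal sketch in plain Python. It
is not the API of any real Avro library: resolve() is a toy function invented
for illustration, it works on already-decoded dicts, and it ignores
record-name matching and most other resolution rules. The old_schema below is
also an assumption, since the thread does not show the writer's schema. The
only point is that the reader's aliases are matched against field names in the
writer's schema, at every nesting level, including inside array items.

```python
# Illustrative only: a toy resolver, not the API of any real Avro library.

def resolve(writer, reader, datum):
    """Map a datum written with `writer` onto the field names of `reader`."""
    if isinstance(reader, dict) and reader.get("type") == "record":
        writer_fields = {f["name"]: f for f in writer["fields"]}
        out = {}
        for rf in reader["fields"]:
            # A reader field matches a writer field by its name or any alias.
            candidates = [rf["name"], *rf.get("aliases", [])]
            match = next((n for n in candidates if n in writer_fields), None)
            if match is None:
                # No writer field found: fall back to the reader's default.
                out[rf["name"]] = rf["default"]
            else:
                out[rf["name"]] = resolve(
                    writer_fields[match]["type"], rf["type"], datum[match]
                )
        return out
    if isinstance(reader, dict) and reader.get("type") == "array":
        return [resolve(writer["items"], reader["items"], d) for d in datum]
    return datum  # primitives and unions of primitives pass through


# Assumed writer's schema (old field names); not shown in the thread.
old_schema = {
    "type": "record", "name": "MyStats",
    "fields": [
        {"name": "oldFieldName1", "type": ["null", "int"], "default": None},
        {"name": "statusRecords", "type": {"type": "array", "items": {
            "type": "record", "name": "StatusAvroRecord",
            "fields": [
                {"name": "recordId", "type": "long"},
                {"name": "oldFieldName2", "type": ["null", "string"],
                 "default": None},
            ]}}},
    ],
}

# Reader's schema (new field names, with aliases pointing at the old ones).
new_schema = {
    "type": "record", "name": "MyCompanyRecordAvroEncoder",
    "fields": [
        {"name": "newFieldName1", "type": ["null", "int"], "default": None,
         "aliases": ["oldFieldName1"]},
        {"name": "statusRecords", "type": {"type": "array", "items": {
            "type": "record", "name": "StatusAvroRecord",
            "fields": [
                {"name": "recordId", "type": "long"},
                {"name": "newFieldName2", "type": ["null", "string"],
                 "default": None, "aliases": ["oldFieldName2"]},
            ]}}, "default": []},
    ],
}

received = {
    "oldFieldName1": 300,
    "statusRecords": [{"recordId": 100, "oldFieldName2": "XYZ"}],
}
result = resolve(old_schema, new_schema, received)
print(result)
```

If the deserializer is only ever given the new schema, it has no old field
names to match the aliases against, which is one way a field can silently end
up with its default value.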


On Sat, Jan 15, 2022 at 8:37 AM Spencer Lu wrote:

> Hi everyone,
>
> We have an application that receives Avro data, and it needs to rename
> certain fields in the data before sending it downstream. The
> application is using the following Avro schema to send the data
> downstream (note that 2 of the fields have aliases defined):
>
> {
> "name":"MyCompanyRecordAvroEncoder",
> "aliases":["com.mycompany.avro.MyStats"],
> "type":"record",
> "fields":[
> {"name":"newFieldName1","type":["null","int"],"default":null,"aliases":["oldFieldName1"]},
> {"name":"statusRecords","type":{"type":"array","items":{"name":"StatusAvroRecord","type":"record","fields":[
> {"name":"recordId","type":"long"},
> {"name":"recordName","type":["null","string"],"default":null},
> {"name":"newFieldName2","type":["null","string"],"default":null,"aliases":["oldFieldName2"]}
> ]}},"default":[]}
> ]
> }
>
> We see that our application receives the following Avro data:
>
> {
> "oldFieldName1": 300,
> "statusRecords": [
> {
> "recordId": 100,
> "recordName": "Record1",
> "oldFieldName2": "{\"type\":\"XYZ\",\"properties\":{\"property1\":-1.2,\"property2\":\"Value\"}}"
> }
> ]
> }
>
> Then the application sends the following Avro data downstream:
>
> {
>  "newFieldName1": 300,
>  "statusRecords": [
>  {
>  "recordId": 100,
>  "recordName": "Record1",
>  "newFieldName2": null
>  }
>  ]
> }
>
> As you can see, newFieldName1 is aliased to oldFieldName1 and has the
> value from oldFieldName1, so its alias is working.
>
> However, newFieldName2 is aliased to oldFieldName2, but it is null
> instead of having the value from oldFieldName2, so its alias is not
> working.
>
> The only difference I see between newFieldName1 and newFieldName2 is
> that newFieldName2 is a field within an array item. Do aliases not
> work for fields in array items? Or is there some other issue?
>
> Any idea how I can get the alias for newFieldName2 to work?
>
> Thanks,
> Spencer
>


Re: Spec wording on fullnames is not clear

2021-12-27 Thread Spencer Nelson
Trick question! "c" is a *field name*, not a type name, so the fullname is
either "a.d" or "d". Fields don't have fullnames.

But your question is still good. I don't think this is clear in the Avro
specification either. I asked avro-dev about this about a year ago and got
no response:
http://mail-archives.us.apache.org/mod_mbox/avro-dev/202103.mbox/%3cCAB6dobWX1=_fctgvgm-d5r17pv_69u27tdzvljmwc+aizow...@mail.gmail.com%3e

As I mentioned in that email, there are even more tricky cases than the one
you listed. What if the "a.b" schema definition is wrapped inside another
schema with an explicit "namespace" field? Like this:

{
  "type": "record",
  "name": "wrapper",
  "namespace": "wrapping",
  "fields": [
    {
      "name": "inside",
      "type": {
        "type": "record",
        "name": "a.b",
        "fields": [
          {
            "name": "c",
            "type": {
              "type": "record",
              "name": "d",
              "fields": []
            }
          }
        ]
      }
    }
  ]
}

Now is the interior one "a.d" (since "a.b" is a fullname, so it implicitly
creates a namespace of "a"), or is it "wrapping.d" (since that's the first
explicit namespace)?

The spec just says that when a type is named with dots in it (like "a.b"),
then it ignores any namespaces, but it doesn't say it creates one for all
children. I think implementations are inconsistent in how they handle this,
and it needs to be cleaned up in the spec.
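
To make the two readings concrete, here is a small Python sketch that walks a
schema and lists the fullname of every record under each interpretation. The
function fullnames() and its dotted_name_sets_ns flag are hypothetical names
made up for this illustration, and the sketch only handles records, arrays,
and unions. It does not claim to match what any particular implementation
does.

```python
def fullnames(schema, enclosing_ns=None, dotted_name_sets_ns=True):
    """Yield the fullname of every record in `schema`, outermost first.

    dotted_name_sets_ns=True:  a dotted name like "a.b" makes "a" the
                               namespace inherited by nested types.
    dotted_name_sets_ns=False: a dotted name only names that one type;
                               nested types keep the enclosing namespace.
    """
    if isinstance(schema, list):  # union: visit each branch
        for branch in schema:
            yield from fullnames(branch, enclosing_ns, dotted_name_sets_ns)
        return
    if not isinstance(schema, dict):  # primitive or a reference by name
        return
    if schema.get("type") == "array":
        yield from fullnames(schema["items"], enclosing_ns, dotted_name_sets_ns)
        return
    if schema.get("type") != "record":
        return

    name = schema["name"]
    if "." in name:
        # Dotted name is already a fullname; any "namespace" key is ignored.
        full = name
        child_ns = name.rpartition(".")[0] if dotted_name_sets_ns else enclosing_ns
    else:
        ns = schema.get("namespace", enclosing_ns)
        full = f"{ns}.{name}" if ns else name
        child_ns = ns
    yield full
    for field in schema["fields"]:
        yield from fullnames(field["type"], child_ns, dotted_name_sets_ns)


inner = {
    "type": "record", "name": "a.b",
    "fields": [{"name": "c",
                "type": {"type": "record", "name": "d", "fields": []}}],
}
wrapper = {
    "type": "record", "name": "wrapper", "namespace": "wrapping",
    "fields": [{"name": "inside", "type": inner}],
}

reading_a = list(fullnames(wrapper, dotted_name_sets_ns=True))
reading_b = list(fullnames(wrapper, dotted_name_sets_ns=False))
print(reading_a)  # ['wrapping.wrapper', 'a.b', 'a.d']
print(reading_b)  # ['wrapping.wrapper', 'a.b', 'wrapping.d']
```

Running the same function on the unwrapped "a.b" schema alone gives "a.d"
under the first reading and "d" under the second, which are the two candidate
answers to the original question.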



On Mon, Dec 27, 2021 at 1:53 PM Brennan Vincent wrote:

> It is a.c
>
> > On Dec 27, 2021, at 9:42 AM, Askar Safin wrote:
> >
> > Hi. I'm writing an Avro implementation in Rust for personal use. I have a
> > question. Consider this Avro schema:
> >
> > {
> >   "type": "record",
> >   "name": "a.b",
> >   "fields": [
> >     {
> >       "name": "c",
> >       "type": {
> >         "type": "record",
> >         "name": "d",
> >         "fields": []
> >       }
> >     }
> >   ]
> > }
> >
> > What is the fullname of record "c"? "a.c" or "c"? I think the Avro
> > specification is vague about this and should be fixed. When I attempt to
> > interpret the Avro spec literally, I come to the conclusion that the
> > fullname is "a.c". But this contradicts my common sense.
> >
> > ==
> > Askar Safin
> > http://safinaskar.com
> > https://sr.ht/~safinaskar
> > https://github.com/
>