Schema version in namespace? good practice

2024-05-15 Thread Vignesh Kumar Kathiresan via user
Hi,

I am new to Avro and started working with it recently. I am in the process of
designing a schema evolution process. We use Java applications and the
Maven plugin to auto-generate the classes from .avsc schema files.

I am thinking of adding the version to the namespace on each evolution,
since, according to the specification, compatibility is based on the
unqualified name only. That way I can have a central library that keeps
track of all the versions, and all the client applications can just import
the library and use different versions of the schemas (Java classes), instead
of every client importing the required schema files and auto-generating the
classes on their end every time they upgrade to a newer version of the schema.
Is it good practice to include a version_id in the namespace?
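
To make the naming rule in question concrete, here is a minimal sketch in
plain Python (no Avro library involved). It assumes the rule from the schema
resolution section of the specification, where named types are matched on
their unqualified name. The schemas, the namespaces and the helper functions
are all hypothetical and only illustrate the idea; a schema registry's
compatibility check may apply further rules of its own.

```python
def unqualified_name(schema: dict) -> str:
    """Return the part of a named schema's name after the last dot."""
    return schema["name"].rsplit(".", 1)[-1]


def full_name(schema: dict) -> str:
    """Return the fullname: a dotted name wins, otherwise namespace + name."""
    name = schema["name"]
    namespace = schema.get("namespace", "")
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


def names_match_for_resolution(writer: dict, reader: dict) -> bool:
    """Compare two named schemas by unqualified name only (namespace ignored)."""
    return unqualified_name(writer) == unqualified_name(reader)


# Hypothetical schemas: the version lives in the namespace, not in the name.
order_v1 = {
    "type": "record",
    "namespace": "com.example.orders.v1",
    "name": "Order",
    "fields": [{"name": "id", "type": "string"}],
}
order_v2 = {
    "type": "record",
    "namespace": "com.example.orders.v2",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
}

print(full_name(order_v1))
print(full_name(order_v2))
print(names_match_for_resolution(order_v1, order_v2))
```

The two fullnames differ, so classes generated from them would end up in
different Java packages and could sit side by side in one library, while the
unqualified name "Order" stays the same in both versions.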

Also, we use a schema registry with full compatibility checks turned on.

Thanks,
Vignesh


RE: Formal spec for Avro Schema

2024-05-15 Thread Clemens Vasters via user
Hi Martin,

I am saying that the specification of the schema is currently entangled with 
the specification of the serialization framework. Avro Schema is useful and 
usable even if you never touch the Avro binaries (the framework, an 
implementation using the spec).

I am indeed proposing to separate the schema spec from the specs of the Avro
binary encoding and the Avro JSON encoding. That also avoids strange
entanglements such as the JSON encoding pointing to the schema description's
default values section, which is itself rather lacking in precision; for
example, the encoding rule for bytes or fixed is "defined" with a rather terse
example: "\u00ff"
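
For readers who have not looked at that section: the specification says that
default values for bytes and fixed are JSON strings in which Unicode code
points 0-255 are mapped to unsigned 8-bit byte values 0-255. The following is
a minimal Python sketch of that mapping. The function names and the example
field are made up for illustration and are not part of any Avro library.

```python
import json


def default_to_bytes(json_default: str) -> bytes:
    """Map each code point 0-255 of a JSON string default to one byte."""
    if any(ord(ch) > 0xFF for ch in json_default):
        raise ValueError("code points above 255 cannot be mapped to one byte")
    return bytes(ord(ch) for ch in json_default)


def bytes_to_default(value: bytes) -> str:
    """Inverse mapping: one character per byte, code point equal to the byte."""
    return "".join(chr(b) for b in value)


# Hypothetical field definition that uses the example from the specification.
field = json.loads(r'{"name": "mask", "type": "bytes", "default": "\u00ff"}')

raw = default_to_bytes(field["default"])
print(raw)                                # the single byte 0xFF
print(json.dumps(bytes_to_default(raw)))  # serialized back as a JSON string
```

The mapping is the same as ISO-8859-1, which is exactly the kind of detail a
standalone spec would need to state explicitly rather than by example.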

Microsoft would like to propose Avro and Avro Schema in several standardization 
efforts, but we need a spec that works in those contexts and that can stand on 
its own. I would also like to see “application/avro” as a formal media type, 
but the route towards that only goes via formal standardization of both schema 
and encodings.

I believe the Avro project’s reach and importance is such that schema and 
encodings should have formal specs that can stand on their own as JSON and CBOR 
and AMQP and XML and OPC/Binary and other serialization schemas/formats do. I 
don’t think existence of a formal spec gets in the way of progress and Avro is 
so mature that the spec captures a fairly stable state.

Best Regards
Clemens

From: Martin Grigorov 
Sent: Wednesday, May 15, 2024 10:54 AM
To: d...@avro.apache.org
Cc: user@avro.apache.org
Subject: Re: Formal spec for Avro Schema

Hi Clemens,

On Wed, May 15, 2024 at 11:18 AM Clemens Vasters wrote:

> Hi Martin,
>
> we find Avro Schema to be a great fit for describing application data
> structures in general and even independent of wire-serialization scenarios.
>
> Therefore, I would like to have a spec that focuses specifically on the
> schema format, is grounded in the IETF RFC specs, and which follows the
> conventions set by IETF, so that folks who need a sane schema format to
> describe data structures independent of implementation can use that.

Do you say that the specification document is implementation dependent?
If this is the case then maybe we should work on improving it instead of
duplicating it.

> The benefit for the Avro serialization framework of having such a formal
> spec that is untangled from the wire-serialization specs is that all
> schemas defined by that schema model are compatible with the framework.

What do you mean by "framework" here?

> The differences are organization, scope, and language style (including
> keywords etc.). The expressed ruleset is the same.

I don't think it is a good idea to add a second document that is very similar
to the specification but uses a different language style.
To me this looks like a duplication.
IMO it would be better to suggest (many) (smaller) improvements for the
existing document.

> Best Regards
> Clemens

-----Original Message-----
From: Martin Grigorov <mgrigo...@apache.org>
Sent: Wednesday, May 15, 2024 9:26 AM
To: d...@avro.apache.org
Cc: user@avro.apache.org
Subject: Re: Formal spec for Avro Schema

Hi Clemens,

What is the difference between your document and the specification [1] ?
I haven't read it completely but it looks very similar to the specification to 
me.

1. https://avro.apache.org/docs/1.11.1/specification/
2. https://github.com/apache/avro/tree/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification
   (sources of the specification)

On Wed, May 15, 2024 at 9:28 AM Clemens Vasters wrote:

> I wrote a formal spec for the Avro Schema format.
>
>
>
> https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333
>
>
>
> Where would that go in the repo?
>
> *Clemens Vasters*
>
> Messaging Platform Architect
>
> Microsoft Azure
>
> +49 151 44063557

Re: Formal spec for Avro Schema

2024-05-15 Thread Martin Grigorov
Hi Elliot,

I am not sure which document you are referring to - the new proposal by
Clemens or the official specification.
Please start a new email thread or file a Jira ticket if you think
something needs to be improved in the specification!

On Wed, May 15, 2024 at 10:56 AM Elliot West  wrote:

> I note that the enum type appears to be missing the specification of the
> default attribute.
>
> On Wed, 15 May 2024 at 08:26, Martin Grigorov 
> wrote:
>
>> Hi Clemens,
>>
>> What is the difference between your document and the specification [1] ?
>> I haven't read it completely but it looks very similar to the
>> specification to me.
>>
>> 1. https://avro.apache.org/docs/1.11.1/specification/
>> 2. https://github.com/apache/avro/tree/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification
>>    (sources of the specification)
>>
>> On Wed, May 15, 2024 at 9:28 AM Clemens Vasters
>>  wrote:
>>
>>> I wrote a formal spec for the Avro Schema format.
>>>
>>>
>>>
>>> https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333
>>>
>>>
>>>
>>> Where would that go in the repo?
>>>
>>>
>>>
>>>
>>>
>>>
>>> 
>>>
>>> *Clemens Vasters*
>>>
>>> Messaging Platform Architect
>>>
>>> Microsoft Azure
>>>
>>> +49 151 44063557
>>>
>>> cleme...@microsoft.com
>>> European Microsoft Innovation Center GmbH | Gewürzmühlstrasse 11 |
>>> 80539 Munich| Germany
>>> 
>>> Geschäftsführer/General Managers: Keith Dolliver, Benjamin O. Orndorff
>>> Amtsgericht Aachen, HRB 12066
>>>
>>>
>>>
>>>
>>>
>>


Re: Formal spec for Avro Schema

2024-05-15 Thread Martin Grigorov
Hi Clemens,

On Wed, May 15, 2024 at 11:18 AM Clemens Vasters
 wrote:

> Hi Martin,
>
> we find Avro Schema to be a great fit for describing application data
> structures in general and even independent of wire-serialization scenarios.


> Therefore, I would like to have a spec that focuses specifically on the
> schema format, is grounded in the IETF RFC specs, and which follows the
> conventions set by IETF, so that folks who need a sane schema format to
> describe data structures independent of implementation can use that.
>

Do you say that the specification document is implementation dependent ?
If this is the case then maybe we should work on improving it instead of
duplicating it.


>
> The benefit for the Avro serialization framework of having such a formal
> spec that is untangled from the wire-serialization specs is that all
> schemas defined by that schema model are compatible with the framework.
>

What do you mean by "framework" here ?


>
> The differences are organization, scope, and language style (including
> keywords etc.). The expressed ruleset is the same.
>

I don't think it is a good idea to add a second document that is very
similar to the specification but uses a different language style.
To me this looks like a duplication.
IMO it would be better to suggest (many) (smaller) improvements for the
existing document.



>
> Best Regards
> Clemens
>
> -----Original Message-----
> From: Martin Grigorov 
> Sent: Wednesday, May 15, 2024 9:26 AM
> To: d...@avro.apache.org
> Cc: user@avro.apache.org
> Subject: Re: Formal spec for Avro Schema
>
> Hi Clemens,
>
> What is the difference between your document and the specification [1] ?
> I haven't read it completely but it looks very similar to the
> specification to me.
>
> 1. https://avro.apache.org/docs/1.11.1/specification/
> 2. https://github.com/apache/avro/tree/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification
>    (sources of the specification)
>
> On Wed, May 15, 2024 at 9:28 AM Clemens Vasters wrote:
>
> > I wrote a formal spec for the Avro Schema format.
> >
> >
> >
> > https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333
> >
> >
> >
> > Where would that go in the repo?
> >
> > *Clemens Vasters*
> >
> > Messaging Platform Architect
> >
> > Microsoft Azure
> >
> > +49 151 44063557
> >
> > cleme...@microsoft.com
> > European Microsoft Innovation Center GmbH | Gewürzmühlstrasse 11 |
> > 80539 Munich | Germany
> > Geschäftsführer/General Managers: Keith Dolliver, Benjamin O. Orndorff
> > Amtsgericht Aachen, HRB 12066
> >
> >
> >
> >
> >
>


RE: Formal spec for Avro Schema

2024-05-15 Thread Clemens Vasters via user
Hi Martin,

we find Avro Schema to be a great fit for describing application data 
structures in general and even independent of wire-serialization scenarios.

Therefore, I would like to have a spec that focuses specifically on the schema 
format, is grounded in the IETF RFC specs, and which follows the conventions 
set by IETF, so that folks who need a sane schema format to describe data 
structures independent of implementation can use that.

The benefit for the Avro serialization framework of having such a formal spec 
that is untangled from the wire-serialization specs is that all schemas defined 
by that schema model are compatible with the framework.

The differences are organization, scope, and language style (including keywords 
etc.). The expressed ruleset is the same.

Best Regards
Clemens

-----Original Message-----
From: Martin Grigorov 
Sent: Wednesday, May 15, 2024 9:26 AM
To: d...@avro.apache.org
Cc: user@avro.apache.org
Subject: Re: Formal spec for Avro Schema

Hi Clemens,

What is the difference between your document and the specification [1] ?
I haven't read it completely but it looks very similar to the specification to 
me.

1. https://avro.apache.org/docs/1.11.1/specification/
2. https://github.com/apache/avro/tree/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification
   (sources of the specification)

On Wed, May 15, 2024 at 9:28 AM Clemens Vasters 
 wrote:

> I wrote a formal spec for the Avro Schema format.
>
>
>
> https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333
>
>
>
> Where would that go in the repo?
>
> *Clemens Vasters*
>
> Messaging Platform Architect
>
> Microsoft Azure
>
> +49 151 44063557
>
> cleme...@microsoft.com
> European Microsoft Innovation Center GmbH | Gewürzmühlstrasse 11 |
> 80539 Munich | Germany
> Geschäftsführer/General Managers: Keith Dolliver, Benjamin O. Orndorff
> Amtsgericht Aachen, HRB 12066
>
>
>
>
>


Re: Formal spec for Avro Schema

2024-05-15 Thread Elliot West
I note that the enum type appears to be missing the specification of the
default attribute.
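
For context, and regardless of which document the remark refers to: in the
current Avro specification the enum "default" attribute names the symbol a
reader falls back to when the writer sent a symbol that is missing from the
reader's enum. A minimal Python sketch of that resolution rule follows; the
schemas and the function name are hypothetical and no Avro library is used.

```python
def resolve_enum_symbol(writer_symbol: str, reader_schema: dict) -> str:
    """Resolve a written enum symbol against the reader's enum schema."""
    symbols = reader_schema["symbols"]
    if writer_symbol in symbols:
        return writer_symbol
    default = reader_schema.get("default")
    if default is None:
        raise ValueError(f"unknown symbol {writer_symbol!r} and no enum default")
    if default not in symbols:
        raise ValueError("enum default must be one of the declared symbols")
    return default


# Hypothetical reader schemas, one with and one without a default.
reader_with_default = {
    "type": "enum",
    "name": "Suit",
    "symbols": ["UNKNOWN", "SPADES", "HEARTS"],
    "default": "UNKNOWN",
}
reader_without_default = {
    "type": "enum",
    "name": "Suit",
    "symbols": ["SPADES", "HEARTS"],
}

print(resolve_enum_symbol("HEARTS", reader_with_default))
print(resolve_enum_symbol("CLUBS", reader_with_default))
```

Without the default, the second call would have to fail, which is why the
attribute matters for a spec that claims to cover schema evolution.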

On Wed, 15 May 2024 at 08:26, Martin Grigorov  wrote:

> Hi Clemens,
>
> What is the difference between your document and the specification [1] ?
> I haven't read it completely but it looks very similar to the
> specification to me.
>
> 1. https://avro.apache.org/docs/1.11.1/specification/
> 2. https://github.com/apache/avro/tree/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification
>    (sources of the specification)
>
> On Wed, May 15, 2024 at 9:28 AM Clemens Vasters
>  wrote:
>
>> I wrote a formal spec for the Avro Schema format.
>>
>>
>>
>> https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333
>>
>>
>>
>> Where would that go in the repo?
>>
>>
>>
>>
>>
>>
>> 
>>
>> *Clemens Vasters*
>>
>> Messaging Platform Architect
>>
>> Microsoft Azure
>>
>> +49 151 44063557
>>
>> cleme...@microsoft.com
>> European Microsoft Innovation Center GmbH | Gewürzmühlstrasse 11 | 80539
>> Munich| Germany
>> 
>> Geschäftsführer/General Managers: Keith Dolliver, Benjamin O. Orndorff
>> Amtsgericht Aachen, HRB 12066
>>
>>
>>
>>
>>
>


Re: Formal spec for Avro Schema

2024-05-15 Thread Martin Grigorov
Hi Clemens,

What is the difference between your document and the specification [1] ?
I haven't read it completely but it looks very similar to the specification
to me.

1. https://avro.apache.org/docs/1.11.1/specification/
2. https://github.com/apache/avro/tree/main/doc/content/en/docs/%2B%2Bversion%2B%2B/Specification
   (sources of the specification)

On Wed, May 15, 2024 at 9:28 AM Clemens Vasters
 wrote:

> I wrote a formal spec for the Avro Schema format.
>
>
>
> https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333
>
>
>
> Where would that go in the repo?
>
>
>
>
>
>
> 
>
> *Clemens Vasters*
>
> Messaging Platform Architect
>
> Microsoft Azure
>
> +49 151 44063557
>
> cleme...@microsoft.com
> European Microsoft Innovation Center GmbH | Gewürzmühlstrasse 11 | 80539
> Munich| Germany
> Geschäftsführer/General Managers: Keith Dolliver, Benjamin O. Orndorff
> Amtsgericht Aachen, HRB 12066
>
>
>
>
>


Formal spec for Avro Schema

2024-05-15 Thread Clemens Vasters via user
I wrote a formal spec for the Avro Schema format.

https://gist.github.com/clemensv/498c481965c425b218ee156b38b49333

Where would that go in the repo?


Clemens Vasters
Messaging Platform Architect
Microsoft Azure
+49 151 44063557
cleme...@microsoft.com
European Microsoft Innovation Center GmbH | Gewürzmühlstrasse 11 | 80539 
Munich| Germany
Geschäftsführer/General Managers: Keith Dolliver, Benjamin O. Orndorff
Amtsgericht Aachen, HRB 12066