Re: Recomended naming of types to support for schema evolution

Martin Mucha Mon, 30 Dec 2019 15:02:07 -0800

Hi, thanks for the answer.

I don't understand Avro well enough and don't know the schema registry at
all, so the following questions may be naive.

a) How is the schema registry with its 5-byte header different from single
object encoding with its 10-byte header?
b) Will the schema registry somehow relieve me of having to parse
individual schemas? What if I want to (or have to) send two different
versions of a certain schema?
c) What I have here is (seemingly) a pretty similar setup (which, by the
way, was recommended here as an alternative to the Confluent schema
registry): a registry without an extra service. It is a trivial map keyed
by the schema fingerprint (a long) taken from the single object encoding
header, pairing each fingerprint with its schema. So when the bytes
"arrive" I can easily read the header, find the fingerprint, get hold of
the schema and decode the payload. Trivial. But the snag is that a single
Schema.Names instance can contain just one Name of a given "identity", and
equality is based on the fully qualified type name, i.e. namespace and
name. Thus if you have a schema in two versions that share the same
namespace and name, they cannot be parsed using the same Parser. Does the
schema registry (from the Confluent platform, right?) work differently
than this? Does this "use it for decoding" process bypass Avro's
new Schema.Parser().parse and everything beneath it?
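
To make (a) and (c) concrete, here is a minimal sketch of the two headers
and of the fingerprint-keyed map, using only the JDK. The class and method
names (WireHeaders, register, writerSchemaFor) are hypothetical and not part
of Avro or the Confluent client. The sketch assumes the layouts as I
understand them: single object encoding is the marker bytes C3 01 followed
by the 8-byte little-endian CRC-64-AVRO fingerprint of the writer schema,
and the Confluent framing is a zero magic byte followed by a 4-byte
big-endian schema ID. The schemas are kept as JSON strings here purely to
avoid a dependency on Avro.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: reads the schema reference out of either header. */
final class WireHeaders {
    static final int SINGLE_OBJECT_HEADER = 10; // C3 01 + 8-byte fingerprint
    static final int CONFLUENT_HEADER = 5;      // 00 + 4-byte schema ID

    /** Returns the writer-schema fingerprint of a single-object-encoded message. */
    static long singleObjectFingerprint(byte[] msg) {
        if (msg.length < SINGLE_OBJECT_HEADER
                || (msg[0] & 0xFF) != 0xC3 || (msg[1] & 0xFF) != 0x01) {
            throw new IllegalArgumentException("not a single-object-encoded message");
        }
        // The fingerprint is stored little-endian, right after the 2-byte marker.
        return ByteBuffer.wrap(msg, 2, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    /** Returns the registry-assigned schema ID of a Confluent-framed message. */
    static int confluentSchemaId(byte[] msg) {
        if (msg.length < CONFLUENT_HEADER || msg[0] != 0) {
            throw new IllegalArgumentException("not a Confluent-framed message");
        }
        // ByteBuffer is big-endian by default, which matches this framing.
        return ByteBuffer.wrap(msg, 1, 4).getInt();
    }

    // The "registry without an extra service": fingerprint -> schema JSON.
    private final Map<Long, String> schemaJsonByFingerprint = new HashMap<>();

    void register(long fingerprint, String schemaJson) {
        schemaJsonByFingerprint.put(fingerprint, schemaJson);
    }

    /** Looks up the writer schema for an incoming single-object-encoded message. */
    Optional<String> writerSchemaFor(byte[] msg) {
        return Optional.ofNullable(schemaJsonByFingerprint.get(singleObjectFingerprint(msg)));
    }
}
```

The practical difference this shows is where the key comes from: the
fingerprint is derived from the schema itself, while the Confluent ID is
handed out by the registry service. In a real setup the map values would
presumably be parsed org.apache.avro.Schema objects rather than strings,
which is exactly where the one-name-per-Parser snag comes in.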

I really don't know how this works or should work, as there are close to
no complete, working examples and the documentation does not help much.
For example, if an Avro schema evolves from v1 to v2 and the type names and
namespaces aren't the same, how will the fields be paired up? Completely
puzzling. I need nothing less than schema evolution with backward and
forward compatibility and with schema reuse (i.e. no hacks with a top-level
union, but reuse through schema imports). I think I can hack my way through
by using one parser per set consisting of one schema of a given version
plus all its needed imports, which will make everything work (well, I
don't yet know of anything that will fail), but it does not feel right at
all. I would like to know what the correct Avro way is. I suppose it should
be possible without the Confluent schema registry, just with single object
encoding, as I cannot see any difference between them, but please correct
me if I'm wrong.

thanks,
Mar.

On Mon, 30 Dec 2019 at 20:32, Lee Hambley <lee.hamb...@gmail.com> wrote:

> Hi Martin,
>
> I believe the answer is "just use the schema registry". When you then
> encode for the network, your library should give you a binary package with
> a 5 byte header that includes the schema version and name from the
> registry. The reader will then go to the registry, find that schema at
> that version and use it for decoding.
>
> In my experience the naming etc. doesn't matter; only things like defaults
> in enums need some thought, but you'll see that for yourself with
> experience.
>
> HTH, Regards,
>
> Lee Hambley
> http://lee.hambley.name/
> +49 (0) 170 298 5667
>
>
> On Mon, 30 Dec 2019 at 17:26, Martin Mucha <alfon...@gmail.com> wrote:
>
>> Hi,
>> I'm relatively new to Avro, and I'm still struggling to get schema
>> evolution and related issues right. But today it should be a simple
>> question.
>>
>> What is the recommended naming of types if we want to use schema
>> evolution? Should the namespace contain some information about the schema
>> version? Or should it be in the type name itself? Or neither? What is the
>> best practice? Is evolution even possible if the namespace/type name is
>> different?
>>
>> I thought "neither" was the case, so I built the app so that the version
>> ID appears nowhere except in the directory structure. Only the latest
>> version is compiled to Java classes using the Maven plugin, and all other
>> avsc files are parsed in code (to be able to build some sort of schema
>> registry, identify the writer schema in use via single object encoding,
>> and use schema evolution). However, I used a separate Parser instance to
>> parse each schema. But if one would like to use schema imports, one cannot
>> have a separate parser for every schema, and having a global one in this
>> setup is also not possible, as each type can be registered just once in
>> org.apache.avro.Schema.Names. By the way, I favored this variant (i.e. no
>> ID in name/namespace) because in this setup, after I introduce a new
>> schema version, I do not have to change imports in the whole project, but
>> just one line in pom.xml saying which directory should be compiled into
>> Java files.
>>
>> So what would be your suggestion for a correct naming/versioning scheme?
>> thanks,
>> M.
>>
>
