Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-02 Thread Ryan Skraba
Hello! This is a good discussion. For your question:

> when sending avro bytes (obtained by the provided serializer [1]), are they,
> or can they be, somehow paired with the schema used to serialize the data?

The answer is no, not in the serializer method you've provided -- it serializes *only* the data, without
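
As a minimal sketch of what a serializer of that shape looks like (assuming a code-generated SpecificRecord class; the class and method names here are illustrative, not the exact code from the thread):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import org.apache.avro.io.BinaryEncoder;
    import org.apache.avro.io.EncoderFactory;
    import org.apache.avro.specific.SpecificDatumWriter;
    import org.apache.avro.specific.SpecificRecordBase;

    public class BareAvroSerializer {
        // Writes only the Avro binary body: no schema, no schema id.
        public static byte[] serialize(SpecificRecordBase record) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            SpecificDatumWriter<SpecificRecordBase> writer =
                    new SpecificDatumWriter<>(record.getSchema());
            writer.write(record, encoder);
            encoder.flush();
            // The resulting bytes are pure field data; a reader must obtain
            // the writer schema through some other channel.
            return out.toByteArray();
        }
    }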

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-01 Thread Svante Karlsson
First of all, you can use Confluent's schema registry as you wish - it's not in the paid bundle, as long as you are not hosting Kafka as a service (i.e. Amazon et al.). And I would recommend that you do: it's good and trivial to operate. Second, take a look at the serializer in my pet project at:
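
For reference, a sketch of wiring Confluent's Avro serializer into a producer (the broker and registry URLs are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class ProducerSetup {
        public static KafkaProducer<String, Object> create() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "io.confluent.kafka.serializers.KafkaAvroSerializer");
            props.put("schema.registry.url", "http://localhost:8081");
            return new KafkaProducer<>(props);
        }
    }

The serializer registers the record's schema with the registry and prepends the returned schema id to every message, so consumers can always recover the exact writer schema.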

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-01 Thread Martin Mucha
Thanks for the answer! Re: "which byte[] are we talking about?" - actually, I don't know. Please, let's break it down together. I'm pretty sure that we're not using the Confluent platform (IIUC, the paid bundle, right?). I shared a serializer before [1], so you're saying that this won't include either

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-01 Thread Svante Karlsson
For clarity: what byte[] are we talking about? You are slightly missing my point if we are speaking about Kafka. The Confluent encoding is <0> <schema_id> <avro_binary_payload>; the avro_binary_payload does not in any case contain the schema or the schema id. The schema id is a Confluent thing. (In an avrofile the
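
A sketch of peeling apart that layout by hand, purely to illustrate the framing (in practice the Confluent deserializer does this, plus the registry lookup, for you):

    import java.nio.ByteBuffer;
    import java.util.Arrays;

    public class ConfluentWireFormat {
        // Layout: <magic byte 0> <4-byte big-endian schema id> <avro binary payload>
        public static int readSchemaId(byte[] message) {
            ByteBuffer buffer = ByteBuffer.wrap(message);
            byte magic = buffer.get();
            if (magic != 0) {
                throw new IllegalArgumentException("Not Confluent-encoded: magic=" + magic);
            }
            return buffer.getInt();
        }

        public static byte[] avroPayload(byte[] message) {
            return Arrays.copyOfRange(message, 5, message.length); // skip 1 + 4 header bytes
        }
    }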

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-01 Thread Martin Mucha
Thanks for the answer. What I knew already is that in each message there is _somehow_ present either _some_ schema ID or the full schema. I saw some byte-array manipulations to get a _somehow_ defined schema ID out of the byte[], which worked, but that's definitely not how it should be done. What I'm looking

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-01 Thread Svante Karlsson
In an avrofile the schema is at the beginning, but if you refer to a single-record serialization like Kafka's, then you have to add something that you can use to get hold of the schema. Confluent's Avro encoder for Kafka uses Confluent's schema registry, which uses an int32 as the schema id. This is prepended (+ a
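
To illustrate the avrofile half of that (names here are placeholders): a DataFileWriter writes the schema into the file header before any records, which is exactly what a bare single-record encoding lacks:

    import java.io.File;
    import java.io.IOException;
    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class AvroFileExample {
        public static void write(Schema schema, GenericRecord record, File file)
                throws IOException {
            try (DataFileWriter<GenericRecord> fileWriter =
                         new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
                fileWriter.create(schema, file); // writer schema goes into the file header
                fileWriter.append(record);
            }
        }
    }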

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-08-01 Thread Martin Mucha
Hi, just one more question, not strictly related to the subject. Initially I thought I'd be OK with using some initial version of the schema in place of the writer schema. That works, but all columns from schemas older than this initial one would just be ignored. So I need to know EXACTLY the schema,

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-07-30 Thread Martin Mucha
Thank you very much for the in-depth answer. I understand better now how it works; I will test it shortly. Thank you for your time. Martin.

On Tue, Jul 30, 2019 at 17:09, Ryan Skraba wrote:
> Hello! It's the same issue in your example code as allegro, even with
> the SpecificDatumReader.
>

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-07-30 Thread Ryan Skraba
Hello! It's the same issue in your example code as with allegro, even with the SpecificDatumReader.

This line:

    datumReader = new SpecificDatumReader<>(schema)

should be:

    datumReader = new SpecificDatumReader<>(originalSchema, schema)

In Avro, the original schema is commonly known as the writer
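
A sketch of the fix in context (the deserializer shape and names are illustrative, not the exact code from the thread):

    import java.io.IOException;
    import org.apache.avro.Schema;
    import org.apache.avro.io.BinaryDecoder;
    import org.apache.avro.io.DecoderFactory;
    import org.apache.avro.specific.SpecificDatumReader;

    public class EvolvingDeserializer {
        // Resolves bytes produced under writerSchema into readerSchema,
        // filling an added optional field from its default value.
        public static <T> T deserialize(byte[] data, Schema writerSchema,
                                        Schema readerSchema) throws IOException {
            SpecificDatumReader<T> datumReader =
                    new SpecificDatumReader<>(writerSchema, readerSchema);
            BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(data, null);
            return datumReader.read(null, decoder);
        }
    }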

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-07-30 Thread Martin Mucha
Thanks for the answer. Actually, I see exactly the same behavior with Avro 1.9.0 and the following deserializer in our other app, which uses strictly the Avro codebase, failing with the same exceptions. So let's leave the "allegro" library and lots of other tools out of our discussion. I can use whichever

Re: AVRO schema evolution: adding optional column with default fails deserialization

2019-07-30 Thread Ryan Skraba
Hello! Schema evolution relies on both the writer and reader schemas being available. It looks like the allegro tool you are using uses the GenericDatumReader in a way that assumes the reader and writer schemas are the same:
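
A sketch of the difference (schemas are placeholders):

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;

    public class ReaderConstruction {
        // Assumes the data was written with exactly this schema; decoding
        // bytes produced under a different writer schema can fail or misread.
        static GenericDatumReader<GenericRecord> sameSchemaReader(Schema schema) {
            return new GenericDatumReader<>(schema);
        }

        // Resolves writer-schema bytes against the reader schema, applying
        // defaults for fields the writer did not know about.
        static GenericDatumReader<GenericRecord> evolvingReader(Schema writerSchema,
                                                                Schema readerSchema) {
            return new GenericDatumReader<>(writerSchema, readerSchema);
        }
    }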