The Confluent tools seem very oriented towards a Java-heavy infrastructure, and I'd rather not re-implement all of their fairly complex tooling in Ruby and Go. I'd much prefer a simplified model that is easier to implement. As an aside, Confluent *could* support such a standard by using a custom "fingerprint type" that is just their id number.
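To make the idea concrete, here is a minimal sketch (in Python, for brevity) of a framing that could carry either a registry id or a schema fingerprint behind a one-byte "fingerprint type" tag, as suggested above. The tag values and function names are hypothetical, chosen only for illustration; they are not part of any existing standard or of Confluent's tooling.

```python
import struct

# Hypothetical tag values for the one-byte "fingerprint type".
TYPE_REGISTRY_ID = 0x00      # 32-bit registry id (Confluent-style)
TYPE_MD5_FINGERPRINT = 0x01  # 128-bit schema fingerprint

def frame_with_registry_id(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an Avro binary payload with <type byte><4-byte big-endian id>."""
    return struct.pack(">BI", TYPE_REGISTRY_ID, schema_id) + avro_payload

def unframe(message: bytes):
    """Split a framed message into (fingerprint_type, id_or_fingerprint, payload)."""
    ftype = message[0]
    if ftype == TYPE_REGISTRY_ID:
        (schema_id,) = struct.unpack(">I", message[1:5])
        return ftype, schema_id, message[5:]
    if ftype == TYPE_MD5_FINGERPRINT:
        return ftype, message[1:17], message[17:]
    raise ValueError("unknown fingerprint type: %d" % ftype)
```

A reader that only understands one fingerprint type can still reject unknown framings cleanly, which is the interoperability property the custom-type idea is after.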
On Thu, Jul 9, 2015 at 2:21 PM Svante Karlsson <svante.karls...@csi.se> wrote:

> >> What causes the schema normalization to be incomplete?
>
> Bad implementation. I use C++ Avro and it's not complete and not very
> active.
>
> >> And is that a problem? As long as the reader can get the schema, it
> >> shouldn't matter that there are duplicates – as long as the differences
> >> between the duplicates do not affect decoding.
>
> Not really a problem; we tend to use machine-generated schemas and they
> are always identical.
>
> I think there are holes in the simplification of types, if I remember
> correctly. Namespaces should be collapsed,
> {"type" : "string"} -> "string", etc.
>
> The current implementation can't reliably decide whether two types are
> identical. If you correct the problem later, then a registered schema
> would actually change its hash, since it can now be simplified. Whether
> this is a problem depends on your application.
>
> We currently encode this as you suggest: <schema_type (byte)><schema_id
> (32/128bits)><avro (binary)>. The binary fields should probably have a
> defined endianness as well.
>
> I agree that a de facto way of encoding this would be nice. Currently I
> would say that the Confluent / LinkedIn way is the norm.
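The type-simplification Svante describes can be sketched as a small recursive rewrite. This is only an illustration of the {"type": "string"} -> "string" collapse mentioned above, not a complete canonical form: it deliberately omits namespace collapsing and other normalization steps, and the function name is my own.

```python
AVRO_PRIMITIVES = {"null", "boolean", "int", "long",
                   "float", "double", "bytes", "string"}

def simplify(schema):
    """Collapse redundant type wrappers, e.g. {"type": "string"} -> "string",
    recursing into unions, records, arrays, and maps. A sketch, not a full
    canonicalization (namespaces are left untouched)."""
    if isinstance(schema, list):  # union
        return [simplify(s) for s in schema]
    if isinstance(schema, dict):
        # A dict whose only key is "type" naming a primitive collapses.
        if set(schema) == {"type"} and schema["type"] in AVRO_PRIMITIVES:
            return schema["type"]
        out = dict(schema)
        if "fields" in out:  # record
            out["fields"] = [dict(f, type=simplify(f["type"]))
                             for f in out["fields"]]
        if out.get("type") == "array":
            out["items"] = simplify(out["items"])
        if out.get("type") == "map":
            out["values"] = simplify(out["values"])
        return out
    return schema
```

The point of the thread follows directly from this: a fingerprint is a hash of the simplified JSON, so any later fix to the simplification rules changes the hash of already-registered schemas.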