Re: Effort towards Avro 2.0?

Christophe Taton Wed, 04 Dec 2013 10:39:41 -0800

Hi Douglas,

When you write middleware that lets users define custom types, extensions
are pretty much required.

Middleware doesn't need to know, and shouldn't need to know, these
user-defined custom types ahead of time: you don't want to rebuild and
restart your middleware every time a user defines a new type they want the
middleware to handle.

An explicit bytes field always works, but is both inefficient and unwieldy:

   - inefficient because you'll end up serializing your data twice, once
   from the actual type into the bytes field, then a second time as a bytes
   field;
   - unwieldy because as a user, I'll have to encode and decode the bytes
   field manually every time I want to access this field from the original
   record, unless I keep track of the decoded extension externally to the Avro
   record.
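
A minimal sketch of the two points above, using a hand-rolled version of
Avro's binary encoding for long and bytes rather than the Avro library. The
GeoTag type, the helper names, and the use of an MD5 of the schema text as the
16-byte fingerprint are all assumptions made for illustration; the envelope
follows the fingerprint + payload record quoted below.

```python
import hashlib


def encode_long(n):
    """Zigzag + variable-length encoding, as Avro uses for int/long."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Inverse of encode_long; returns (value, next position)."""
    z = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return (z >> 1) ^ -(z & 1), pos


def encode_bytes(b):
    """Avro bytes/string: a long length prefix followed by the raw bytes."""
    return encode_long(len(b)) + b


# Hypothetical user-defined type that the middleware has never seen.
USER_SCHEMA = ('{"type":"record","name":"GeoTag","fields":['
               '{"name":"label","type":"string"},{"name":"zoom","type":"long"}]}')
# Assumption: an MD5 of the schema text stands in for the fingerprint.
FINGERPRINT = hashlib.md5(USER_SCHEMA.encode("utf-8")).digest()


def encode_geotag(label, zoom):
    """Pass 1: serialize the user type into a standalone byte string."""
    return encode_bytes(label.encode("utf-8")) + encode_long(zoom)


def decode_geotag(buf):
    """Manual decode the user must repeat on every access to the field."""
    n, pos = decode_long(buf, 0)
    label = buf[pos:pos + n].decode("utf-8")
    zoom, _ = decode_long(buf, pos + n)
    return label, zoom


def wrap_extension(payload):
    """Pass 2: the already-encoded payload is written again as a bytes field."""
    return FINGERPRINT + encode_bytes(payload)


def unwrap_extension(envelope):
    """The middleware only sees a fingerprint and an opaque payload."""
    n, pos = decode_long(envelope, 16)
    return envelope[:16], envelope[pos:pos + n]


payload = encode_geotag("office", -3)
envelope = wrap_extension(payload)
fingerprint, raw = unwrap_extension(envelope)
label, zoom = decode_geotag(raw)
```

The payload is built in full by encode_geotag and then copied again by
wrap_extension, and the caller has to run decode_geotag by hand to get back to
label and zoom.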

C.


On Wed, Dec 4, 2013 at 8:07 AM, Douglas Creager <[email protected]> wrote:

> On Tue, Dec 3, 2013, at 07:49 AM, Doug Cutting wrote:
> > On Mon, Dec 2, 2013 at 1:42 PM, Christophe Taton
> > <[email protected]> wrote:
> > > - New extension data type, similar to ProtocolBuffer extensions
> > >   (incompatible change).
> >
> > Extensions might be implemented as something like:
> >
> >   {"type":"record", "name":"extension", "fields":[
> >     {"name":"fingerprint", "type": {"type":"fixed", "size":16}},
> >     {"name":"payload", "type":"bytes"}
> >     ]
> >   }
>
> I'd also want to know more about the kind of use cases that you'd need
> protobuf-style extensions for.  I like Doug's solution if each record
> can have a different set of extensions.  If all of the records will have
> the same set of extensions, my hunch is that you'd only need to use
> extra fields and schema resolution.  Either way, I can't think of a use
> case where a new data type in the spec is a noticeable improvement.
>
> –doug
>
