On Tue, Dec 3, 2013 at 7:49 AM, Doug Cutting <[email protected]> wrote:
> On Mon, Dec 2, 2013 at 1:42 PM, Christophe Taton
> <[email protected]> wrote:
> > - New extension data type, similar to ProtocolBuffer extensions
> >   (incompatible change).
>
> Extensions might be implemented as something like:
>
> {"type":"record", "name":"extension", "fields":[
>   {"name":"fingerprint", "type": {"type":"fixed", "size":16}},
>   {"name":"payload", "type":"bytes"}
>  ]
> }
>
> One could then use this with:
>
> {"type":"record", "name":"Foo", "fields":[
>   {"name":"bar", "type":"extension"}
>  ]
> }
>
> The implementation could then find the schema for the extension at
> runtime given its fingerprint. The reader could have a table mapping
> fingerprints to schemas.
>
> In particular, the specific compiler, when it sees a schema like:
>
> {"type":"record", "name":"Bar", "isExtension":true, "fields":[
>   {"name":"x", "type":"long"}
>  ]
> }
>
> Might emit code to add entries to the extension mapping table used by
> SpecificDatumReader, e.g.:
>
>   static {
>     SpecificData.addExtension(getSchema());
>   }
>
> Might something like this work?

Yes, this is very much the idea.

In a prototype I made a few months ago, I found it useful to let the
user specify the fingerprint scheme:
 - in some scenarios, an extension could be prefixed by a string that
   contains the JSON schema;
 - in other scenarios, I may want to use fingerprints to identify the
   schema of the extension;
 - in yet other cases, I may want to use an external mapping
   maintained by another system (e.g. the schema repository being
   worked on in AVRO-1124).

C.
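
To make the "table mapping fingerprints to schemas" idea concrete, here is a minimal, self-contained sketch in plain Java. It is only an illustration under stated assumptions, not Avro code: the names ExtensionSchemaResolver and ExtensionRegistry are hypothetical, the fingerprint is assumed to be an MD5 of the raw schema text (chosen only because it yields the 16 bytes of the "fingerprint" field above; a real implementation would more likely hash a canonical form of the schema), and schemas are kept as JSON strings to avoid depending on the Avro library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical hook: maps an extension identifier to its schema JSON.
// Other strategies (inline schema, external repository) could implement it too.
interface ExtensionSchemaResolver {
  String resolve(byte[] id); // returns null when the identifier is unknown
}

// Hypothetical reader-side table keyed by a 16-byte fingerprint.
class ExtensionRegistry implements ExtensionSchemaResolver {
  private final Map<String, String> schemasByFingerprint = new HashMap<>();

  // Assumption: fingerprint = MD5 of the schema text as given (16 bytes).
  static byte[] fingerprint(String schemaJson) {
    try {
      return MessageDigest.getInstance("MD5")
          .digest(schemaJson.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException e) {
      // MD5 is one of the algorithms every Java platform must provide.
      throw new IllegalStateException(e);
    }
  }

  // Hex-encode the bytes so they can serve as a map key
  // (byte[] has identity-based equals/hashCode).
  private static String key(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  // Registers a schema and returns the fingerprint a writer would emit.
  byte[] addExtension(String schemaJson) {
    byte[] fp = fingerprint(schemaJson);
    schemasByFingerprint.put(key(fp), schemaJson);
    return fp;
  }

  @Override
  public String resolve(byte[] id) {
    return schemasByFingerprint.get(key(id));
  }
}
```

A reader would call resolve() with the "fingerprint" field of an extension record and, if a schema comes back, use it to decode the "payload" bytes. Keeping the lookup behind an interface leaves room for the other identification strategies mentioned above.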
