On 02/05/2012 00:24, Scott Carey wrote:

The second time that "location" is used, it should be used by reference, and not re-defined. I believe that
  "name":"position2",
  "type":"some.domain.location"
should work, provided the type "some.domain.location" is defined previously in the schema, as it is in "position1".



Thanks, that did the job. Obvious, I suppose, when you think about it!

We're attempting to use Avro to define some specifications that we're putting forward (as a set of Avro schemas) to a standards body on which we have a presence.

With Avro as it stands right now, that specification would consist of a set of schemas plus a layer that you have to implement on top of Avro to manage schemas. This isn't ideal: code (in Java, C# or whatever) should not form part of our spec. I like the idea of the parser having callbacks on specific events, such as "type not defined". That would provide a lot of what we need, but not all.
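
To make the callback idea concrete, here is a minimal sketch in Python of a schema walker that fires a "type not defined" callback. It is not Avro's parser or API: `resolve`, `on_undefined`, `REGISTRY`, the enclosing "journey" record, the "waypoint" type and the lat/lon fields are all hypothetical names chosen for illustration. Only the naming rules (named types are record, enum and fixed; a name may be referenced once it has been defined) follow Avro.

```python
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def full_name(name, namespace):
    """Qualify a name with the enclosing namespace unless it is already dotted."""
    return name if "." in name or not namespace else f"{namespace}.{name}"


def resolve(schema, known, on_undefined, namespace=None):
    """Walk a schema, register named types, and report unknown references.

    on_undefined(name) is the hypothetical callback: it may return a schema
    for the missing name, or None to signal that the name really is unknown.
    """
    if isinstance(schema, str):                      # primitive or reference by name
        if schema in PRIMITIVES:
            return
        name = full_name(schema, namespace)
        if name not in known:
            supplied = on_undefined(name)
            if supplied is None:
                raise ValueError(f"undefined type: {name}")
            resolve(supplied, known, on_undefined, namespace)
    elif isinstance(schema, list):                   # union: check every branch
        for branch in schema:
            resolve(branch, known, on_undefined, namespace)
    elif schema["type"] in ("record", "enum", "fixed"):
        name = full_name(schema["name"], schema.get("namespace", namespace))
        known[name] = schema                         # register before the fields
        inner_ns = name.rpartition(".")[0] or None
        for field in schema.get("fields", []):
            resolve(field["type"], known, on_undefined, inner_ns)
    elif schema["type"] == "array":
        resolve(schema["items"], known, on_undefined, namespace)
    elif schema["type"] == "map":
        resolve(schema["values"], known, on_undefined, namespace)
    else:                                            # e.g. {"type": "string"}
        resolve(schema["type"], known, on_undefined, namespace)


# "location" is defined once (position1) and then used by reference (position2).
JOURNEY = {
    "type": "record", "name": "journey", "namespace": "some.domain",
    "fields": [
        {"name": "position1", "type": {
            "type": "record", "name": "location",
            "fields": [{"name": "lat", "type": "double"},
                       {"name": "lon", "type": "double"}]}},
        {"name": "position2", "type": "some.domain.location"},
    ],
}

# Stand-in for an external schema store that the callback consults.
REGISTRY = {
    "some.domain.waypoint": {
        "type": "record", "name": "some.domain.waypoint",
        "fields": [{"name": "at", "type": "some.domain.location"}]},
}

known, asked = {}, []


def lookup(name):
    asked.append(name)
    return REGISTRY.get(name)


resolve(JOURNEY, known, lookup)                                   # no callback needed
resolve({"type": "array", "items": "some.domain.waypoint"}, known, lookup)
```

In this sketch the first call never triggers the callback, because "some.domain.location" is defined before it is referenced; the second call asks the callback for "some.domain.waypoint" exactly once. The point of the design is that schema management stays behind one hook rather than in code that would have to be part of a spec.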

For our particular scenarios, we don't have access to the domain objects defined by the schemas at runtime. In other words, we are entirely schema-driven (apart from some code generation for our core functionality). Class types of objects at runtime are not something we're interested in -- the actual type defined in the schema is. So, for the location example, we don't have a location object; we have a location schema and therefore its QName, i.e. some.domain.location.

How we proceed in our thin veneer on top of Avro is to serialise any non-primitive (i.e. anything that is not one of Avro's built-in types) as an array of bytes. The type information (some.domain.location) is also serialised as part of our schema, so all the information is there to reconstruct a location object at the *endpoint*.
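
A rough Python sketch of such an envelope, under stated assumptions: the wrapper is modelled as a record with a string field followed by a bytes field (the field order and the function names are hypothetical, not the actual layout described above). The encoding itself follows Avro's binary rules: a long is zig-zag encoded and written as a variable-length integer, and both string and bytes are written as a long length followed by the raw data.

```python
def encode_long(n: int) -> bytes:
    """Avro int/long encoding: zig-zag, then base-128 varint (64-bit range)."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_long(buf: bytes, pos: int):
    """Inverse of encode_long; returns (value, next position)."""
    shift = z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_bytes(data: bytes) -> bytes:
    return encode_long(len(data)) + data


def encode_envelope(type_name: str, payload: bytes) -> bytes:
    """Hypothetical envelope: type name (string), then opaque payload (bytes)."""
    return encode_bytes(type_name.encode("utf-8")) + encode_bytes(payload)


def decode_envelope(buf: bytes):
    """Recover (type name, payload) so the endpoint can pick the right schema."""
    n, pos = decode_long(buf, 0)
    type_name = buf[pos:pos + n].decode("utf-8")
    pos += n
    n, pos = decode_long(buf, pos)
    return type_name, buf[pos:pos + n]


# The payload would itself be an Avro-encoded location; here it is opaque bytes.
wire = encode_envelope("some.domain.location", b"\x01\x02\x03")
name, payload = decode_envelope(wire)
```

The receiver reads the type name first, looks up the matching schema, and only then interprets the payload, which is what lets the endpoint reconstruct the object without any class types being shared.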

Given that the schema is all there is for us, we've also had to custom-code a type for collections, e.g. a list of locations is typed as "list<some.domain.location>".
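
For comparison, Avro's own array type takes an "items" schema, so a type string in the "list<...>" convention could in principle be translated into it. The following Python sketch shows one way to do that; `to_avro_type` is a hypothetical helper, and whether the built-in array type fits the scenario described here is an assumption.

```python
def to_avro_type(type_string: str):
    """Translate 'list<T>' (possibly nested) into an Avro array schema.

    Anything that is not of the form list<...> is returned unchanged and
    treated as a primitive name or a reference to a named type.
    """
    s = type_string.strip()
    if s.startswith("list<") and s.endswith(">"):
        return {"type": "array", "items": to_avro_type(s[len("list<"):-1])}
    return s


locations = to_avro_type("list<some.domain.location>")
nested = to_avro_type("list<list<some.domain.location>>")
```

Here `locations` is an array schema whose items refer to "some.domain.location" by name, so the referenced type still has to be defined earlier in the same schema.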

Any comments or thoughts?

Peter
