Avro schema doesn't honor backward compatibility

2016-02-01 Thread Raghvendra Singh

I have this Avro schema:

{
  "namespace": "xx..x.x",
  "type": "record",
  "name": "MyPayLoad",
  "fields": [
    {"name": "filed1", "type": "string"},
    {"name": "filed2", "type": "long"},
    {"name": "filed3", "type": "boolean"},
    {
      "name": "metrics",
      "type": {
        "type": "array",
        "items": {
          "name": "MyRecord",
          "type": "record",
          "fields": [
            {"name": "min", "type": "long"},
            {"name": "max", "type": "long"},
            {"name": "sum", "type": "long"},
            {"name": "count", "type": "long"}
          ]
        }
      }
    }
  ]
}

Here is the code which we use to parse the data

public static final MyPayLoad parseBinaryPayload(byte[] payload) {
    DatumReader<MyPayLoad> payloadReader = new SpecificDatumReader<>(MyPayLoad.class);
    Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
    MyPayLoad myPayLoad = null;
    try {
        myPayLoad = payloadReader.read(null, decoder);
    } catch (IOException e) {
        logger.log(Level.SEVERE, e.getMessage(), e);
    }
    return myPayLoad;
}

Now I want to add one more field to the schema, so that it looks like the
following:

{
  "namespace": "xx..x.x",
  "type": "record",
  "name": "MyPayLoad",
  "fields": [
    {"name": "filed1", "type": "string"},
    {"name": "filed2", "type": "long"},
    {"name": "filed3", "type": "boolean"},
    {
      "name": "metrics",
      "type": {
        "type": "array",
        "items": {
          "name": "MyRecord",
          "type": "record",
          "fields": [
            {"name": "min", "type": "long"},
            {"name": "max", "type": "long"},
            {"name": "sum", "type": "long"},
            {"name": "count", "type": "long"}
          ]
        }
      }
    },
    {"name": "agentType", "type": ["null", "string"], "default": "APP_AGENT"}
  ]
}

Note the field that was added, and that a default is also defined. The problem
is that if we receive data which was written using the older schema, I get
this error:

java.io.EOFException: null
    at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) ~[avro-1.7.4.jar:1.7.4]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) ~[avro-1.7.4.jar:1.7.4]
    at com.appdynamics.blitz.shared.util.X.parseBinaryPayload(BlitzAvroSharedUtil.java:38) ~[blitz-shared.jar:na]

What I understood from the documentation is that this change should have been
backward compatible, but somehow that doesn't seem to be the case. Any idea
what I am doing wrong?


Re: Avro schema doesn't honor backward compatibility

2016-02-01 Thread Raghvendra Singh
Thanks Prajwal

I tried what you suggested, but I still get the same error.
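
A plausible explanation for why the error persists (a sketch: the two-schema
SpecificDatumReader constructor and the generated MyPayLoad.getClassSchema()
are real Avro APIs, but the writerSchema parameter is illustrative and not
part of the original code): the single-argument SpecificDatumReader(MyPayLoad.class)
uses the class's current schema as both writer and reader schema, so bytes
encoded with the old schema cannot be resolved against the new one. Schema
resolution only applies when the reader also knows the schema the data was
actually written with:

    import java.io.IOException;
    import org.apache.avro.Schema;
    import org.apache.avro.io.DatumReader;
    import org.apache.avro.io.Decoder;
    import org.apache.avro.io.DecoderFactory;
    import org.apache.avro.specific.SpecificDatumReader;

    public static MyPayLoad parseBinaryPayload(byte[] payload, Schema writerSchema)
            throws IOException {
        // writerSchema: the (old) schema the bytes were produced with.
        // MyPayLoad.getClassSchema(): the (new) schema we want to read into.
        DatumReader<MyPayLoad> reader =
                new SpecificDatumReader<>(writerSchema, MyPayLoad.getClassSchema());
        Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
        return reader.read(null, decoder);
    }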



On Mon, Feb 1, 2016 at 2:05 PM, Prajwal Tuladhar  wrote:

> Hi,
>
> I think your usage of the default for field "agentType" is invalid here.
>
> When generating code from the invalid schema, the build complains:
>
>> [INFO] --- avro-maven-plugin:1.7.6-cdh5.4.4:schema (default) @ test-app ---
>> [WARNING] Avro: Invalid default for field agentType: "APP_AGENT" not a ["null","string"]
>
>
> Try:
>
>> {
>>   "namespace": "xx..x.x",
>>   "type": "record",
>>   "name": "MyPayLoad",
>>   "fields": [
>>     {"name": "filed1", "type": "string"},
>>     {"name": "filed2", "type": "long"},
>>     {"name": "filed3", "type": "boolean"},
>>     {
>>       "name": "metrics",
>>       "type": {
>>         "type": "array",
>>         "items": {
>>           "name": "MyRecord",
>>           "type": "record",
>>           "fields": [
>>             {"name": "min", "type": "long"},
>>             {"name": "max", "type": "long"},
>>             {"name": "sum", "type": "long"},
>>             {"name": "count", "type": "long"}
>>           ]
>>         }
>>       }
>>     },
>>     {"name": "agentType", "type": ["null", "string"], "default": null}
>>   ]
>> }

Re: Avro schema doesn't honor backward compatibility

2016-02-01 Thread Prajwal Tuladhar
Hi,

I think your usage of the default for field "agentType" is invalid here: the
default value of a union field must match the first branch of the union, so
["null", "string"] requires a default of null.

When generating code from the invalid schema, the build complains:

> [INFO] --- avro-maven-plugin:1.7.6-cdh5.4.4:schema (default) @ test-app ---
> [WARNING] Avro: Invalid default for field agentType: "APP_AGENT" not a ["null","string"]


Try:

> {
>   "namespace": "xx..x.x",
>   "type": "record",
>   "name": "MyPayLoad",
>   "fields": [
>     {"name": "filed1", "type": "string"},
>     {"name": "filed2", "type": "long"},
>     {"name": "filed3", "type": "boolean"},
>     {
>       "name": "metrics",
>       "type": {
>         "type": "array",
>         "items": {
>           "name": "MyRecord",
>           "type": "record",
>           "fields": [
>             {"name": "min", "type": "long"},
>             {"name": "max", "type": "long"},
>             {"name": "sum", "type": "long"},
>             {"name": "count", "type": "long"}
>           ]
>         }
>       }
>     },
>     {"name": "agentType", "type": ["null", "string"], "default": null}
>   ]
> }
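
A change like this can also be verified up front. A minimal sketch, assuming
Avro 1.7.7 or later (which added org.apache.avro.SchemaCompatibility), with
oldSchemaJson/newSchemaJson standing for the two schema versions above:

    Schema oldSchema = new Schema.Parser().parse(oldSchemaJson);
    Schema newSchema = new Schema.Parser().parse(newSchemaJson);
    // Can data written with oldSchema be read with newSchema?
    SchemaCompatibility.SchemaPairCompatibility result =
        SchemaCompatibility.checkReaderWriterCompatibility(newSchema, oldSchema);
    System.out.println(result.getType()); // COMPATIBLE or INCOMPATIBLE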


Re: Autogenerating Java classes from IDL

2016-02-01 Thread tl

> On 31.01.2016, at 01:13, Evan McClain  wrote:
> 
> I have used IDLs with the avro-maven-plugin and that worked for me.
> Evan

Maven is a black art for me, but your answer encouraged me to stare at the
command-line output of avro-tools again (since I considered it improbable that
the maven-plugin can invoke things that the CLI can't). "idl2schemata" does
indeed provide a good part of what I want, in that it omits the protocol
declarations and only transforms the schema parts.
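
For reference, a minimal invocation of that subcommand (the jar and file names
here are only illustrative):

    java -jar avro-tools.jar idl2schemata MyProtocol.avdl target/schemata/

It writes one .avsc file per named type declared in the IDL protocol into the
output directory.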

I guess that Maven would allow me to set up a process where Java classes would
be generated from the JSON schemata that idl2schemata extracts from the IDL
protocol definition, but I'm having trouble figuring out how to configure the
POM.

Just to prove that I'm not totally useless, I googled up the following snippet
that should generate classes from a schema.


  
<execution>
  <phase>generate-sources</phase>
  <goals>
    <goal>schema</goal>
  </goals>
  <configuration>
    <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
    <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
  </configuration>
</execution>

I reckon that a similar instruction, preceding this one, could initiate the
conversion from an IDL to a schema (like the CLI tool does with
"idl2schemata"). What should the relevant snippet look like?
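
(One guess, assuming the standard avro-maven-plugin goals: the plugin also has
an "idl-protocol" goal that generates Java classes directly from .avdl files,
which would make the intermediate schema step unnecessary. The snippet below
is a sketch along those lines, not a tested configuration.)

<execution>
  <phase>generate-sources</phase>
  <goals>
    <goal>idl-protocol</goal>
  </goals>
  <configuration>
    <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
    <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
  </configuration>
</execution>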


Thanks!
Thomas

> On Sat, Jan 30, 2016, 8:52 AM tl  wrote:
> Hi,
> 
> I started working with Avro only recently, so maybe I missed something, but
> it seems to me that, while schemas can be defined in JSON as well as in IDL,
> builder classes can only be autogenerated from JSON schemas. This is a pity,
> since IDLs are much easier to write and read than the JSON representation.
> I wrote a few IDL schemas, converted them to avpr and tweaked those avpr
> files to become valid avsc schema files from which I autogenerated the
> classes. This is a rather convoluted process. I wouldn't mind so much if I
> didn't know that I or somebody else will have to update the schemas and
> classes from time to time. This doesn't look like a robust workflow.
> 
> Deleting the IDLs and avpr files and doing updates only in the avsc schemas
> reduces the workflow to 2 steps, but I lose the nice properties of IDLs [0].
> Using only generic mapping would reduce the workflow by one step too, but
> I'd lose static type checking.
> 
> I’d love to be able to autogenerate the Java classes from the IDLs directly. 
> Is there a way?
> 
> Regards,
> Thomas
> 
> 
> [0] Regarding IDLs it would be cool if I could forward reference objects in 
> the schema or even write nested schemas but that’s a relatively minor gripe.
>