Great, thank you very much guys, this works. Very much appreciated.

On Tue, Feb 2, 2016 at 12:46 PM, kulkarni.swar...@gmail.com <kulkarni.swar...@gmail.com> wrote:
> Raghvendra,
>
> You need to use
>
>     DatumReader<ControllerPayload> payloadReader =
>         new SpecificDatumReader<>(SCHEMA_V1, SCHEMA_V2);
>
> so that you provide both the writer schema (SCHEMA_V1) and the reader
> schema (SCHEMA_V2) to Avro. In your current case Avro assumes both are
> the same, which is certainly not the case, and hence it fails. I think
> this is what Ryan was referring to as well.
>
> Hope that helps.
>
> On Tue, Feb 2, 2016 at 1:44 PM, Raghvendra Singh <rsi...@appdynamics.com> wrote:
>
>> Hi Ryan,
>>
>> Thanks for your answer. Here is what I am doing in my environment:
>>
>> 1. Write the data using the old schema:
>>
>>     SpecificDatumWriter<ControllerPayload> datumWriter =
>>         new SpecificDatumWriter<>(SCHEMA_V1);
>>
>> 2. Now try to read the data written with the old schema using the new
>> schema:
>>
>>     DatumReader<ControllerPayload> payloadReader =
>>         new SpecificDatumReader<>(SCHEMA_V2);
>>
>> Here SCHEMA_V1 is the old schema, which doesn't have the field, while
>> SCHEMA_V2 is the new one, which has the extra field.
>>
>> Your suggestion "You should run setSchema on your SpecificDatumReader
>> to set the schema the data was written with" is kind of a workaround,
>> where I have to read the data with the schema it was written with, so
>> this is not exactly backward compatible. Note that if I do this then I
>> have to maintain all the schemas while reading and somehow know which
>> version the data was written with, which will make schema evolution
>> pretty painful.
>>
>> Please let me know if I didn't understand your email correctly or
>> there is something I missed.
>>
>> -raghu
>>
>> On Tue, Feb 2, 2016 at 9:19 AM, Ryan Blue <b...@cloudera.com> wrote:
>>
>>> Hi Raghvendra,
>>>
>>> It looks like the problem is that you're using the new schema in
>>> place of the schema that the data was written with. You should run
>>> setSchema on your SpecificDatumReader to set the schema the data was
>>> written with.
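Swarnim's two-argument constructor and Ryan's setSchema call amount to the same resolving read. A minimal sketch of both, assuming the SCHEMA_V1/SCHEMA_V2 constants and generated ControllerPayload class named in the thread (they are not defined here):

```java
import java.io.IOException;

import org.apache.avro.io.DatumReader;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;

public class PayloadParser {

    // Parse bytes that were written with SCHEMA_V1, resolving them
    // into SCHEMA_V2 so the new field gets its declared default.
    public static ControllerPayload parse(byte[] payload) throws IOException {
        // Writer schema first, reader schema second.
        DatumReader<ControllerPayload> reader =
            new SpecificDatumReader<>(SCHEMA_V1, SCHEMA_V2);

        // Equivalent alternative: construct with the reader schema only,
        // then tell the reader what the data was written with:
        //   SpecificDatumReader<ControllerPayload> r = new SpecificDatumReader<>(SCHEMA_V2);
        //   r.setSchema(SCHEMA_V1);

        Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
        return reader.read(null, decoder);
    }
}
```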
>>>
>>> What's happening is that the schema you're using, the new one, has
>>> the new field, so Avro assumes it is present and tries to read it. By
>>> setting the schema that the data was actually written with, the datum
>>> reader will know that it isn't present and will use your default
>>> instead. When you read data encoded with the new schema, you need to
>>> use it as the written schema instead so the datum reader knows that
>>> the field should be read.
>>>
>>> Does that make sense?
>>>
>>> rb
>>>
>>> On 02/01/2016 12:31 PM, Raghvendra Singh wrote:
>>>
>>>> (Also asked at
>>>> http://stackoverflow.com/questions/34733604/avro-schema-doesnt-honor-backward-compatibilty#)
>>>>
>>>> I have this Avro schema:
>>>>
>>>> {
>>>>   "namespace": "xx.xxxx.xxxxx.xxxxx",
>>>>   "type": "record",
>>>>   "name": "MyPayLoad",
>>>>   "fields": [
>>>>     {"name": "filed1", "type": "string"},
>>>>     {"name": "filed2", "type": "long"},
>>>>     {"name": "filed3", "type": "boolean"},
>>>>     {
>>>>       "name": "metrics",
>>>>       "type": {
>>>>         "type": "array",
>>>>         "items": {
>>>>           "name": "MyRecord",
>>>>           "type": "record",
>>>>           "fields": [
>>>>             {"name": "min", "type": "long"},
>>>>             {"name": "max", "type": "long"},
>>>>             {"name": "sum", "type": "long"},
>>>>             {"name": "count", "type": "long"}
>>>>           ]
>>>>         }
>>>>       }
>>>>     }
>>>>   ]
>>>> }
>>>>
>>>> Here is the code we use to parse the data:
>>>>
>>>> public static final MyPayLoad parseBinaryPayload(byte[] payload) {
>>>>     DatumReader<MyPayLoad> payloadReader =
>>>>         new SpecificDatumReader<>(MyPayLoad.class);
>>>>     Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
>>>>     MyPayLoad myPayLoad = null;
>>>>     try {
>>>>         myPayLoad = payloadReader.read(null, decoder);
>>>>     } catch (IOException e) {
>>>>         logger.log(Level.SEVERE, e.getMessage(), e);
>>>>     }
>>>>
>>>>     return myPayLoad;
>>>> }
>>>>
>>>> Now I want to add one more field to the schema, so the schema looks
>>>> like below:
>>>>
>>>> {
>>>>   "namespace": "xx.xxxx.xxxxx.xxxxx",
>>>>   "type": "record",
>>>>   "name": "MyPayLoad",
>>>>   "fields": [
>>>>     {"name": "filed1", "type": "string"},
>>>>     {"name": "filed2", "type": "long"},
>>>>     {"name": "filed3", "type": "boolean"},
>>>>     {
>>>>       "name": "metrics",
>>>>       "type": {
>>>>         "type": "array",
>>>>         "items": {
>>>>           "name": "MyRecord",
>>>>           "type": "record",
>>>>           "fields": [
>>>>             {"name": "min", "type": "long"},
>>>>             {"name": "max", "type": "long"},
>>>>             {"name": "sum", "type": "long"},
>>>>             {"name": "count", "type": "long"}
>>>>           ]
>>>>         }
>>>>       }
>>>>     },
>>>>     {"name": "agentType", "type": ["null", "string"], "default": "APP_AGENT"}
>>>>   ]
>>>> }
>>>>
>>>> Note the field added and also that a default is defined. The problem
>>>> is that if we receive data which was written using the older schema,
>>>> I get this error:
>>>>
>>>> java.io.EOFException: null
>>>>     at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) ~[avro-1.7.4.jar:1.7.4]
>>>>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) ~[avro-1.7.4.jar:1.7.4]
>>>>     at com.appdynamics.blitz.shared.util.XXXXXXXXXXXXX.parseBinaryPayload(BlitzAvroSharedUtil.java:38) ~[blitz-shared.jar:na]
>>>>
>>>> From what I understood from this document
>>>> https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html
>>>> this should have been backward compatible, but somehow that doesn't
>>>> seem to be the case. Any idea what I am doing wrong?
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Cloudera, Inc.

--
Swarnim
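The fix the thread converges on can be shown end to end with Avro's generic API. A self-contained sketch (schemas cut down from the question to one field plus the added one, and the default simplified to null so it matches the union's first branch; this is an illustration, not the poster's actual code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaEvolutionDemo {
    private static final Schema V1 = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"MyPayLoad\",\"fields\":["
        + "{\"name\":\"filed1\",\"type\":\"string\"}]}");

    private static final Schema V2 = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"MyPayLoad\",\"fields\":["
        + "{\"name\":\"filed1\",\"type\":\"string\"},"
        + "{\"name\":\"agentType\",\"type\":[\"null\",\"string\"],"
        + "\"default\":null}]}");

    public static void main(String[] args) throws IOException {
        // Encode a record with the old schema, as the producer did.
        GenericRecord record = new GenericData.Record(V1);
        record.put("filed1", "hello");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(V1).write(record, encoder);
        encoder.flush();

        // Decode with BOTH schemas: writer (V1) first, reader (V2) second.
        // Reading with V2 alone reproduces the EOFException from the thread,
        // because Avro would try to decode the agentType union index from
        // bytes that were never written.
        Decoder decoder =
            DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord decoded =
            new GenericDatumReader<GenericRecord>(V1, V2).read(null, decoder);

        // The missing field is filled from the reader schema's default.
        System.out.println(decoded.get("filed1"));
        System.out.println(decoded.get("agentType"));
    }
}
```

The same two-schema pattern applies to SpecificDatumReader; the only per-record knowledge the consumer needs is which writer schema version produced the bytes, which is why Avro data files and schema registries store that alongside the data.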