Issue with reading old data with a new Avro Schema

2015-04-08 Thread Nicolas Phung
Hello,

I'm trying to read old Avro binary data with a new schema (I added a new
field).

This is the Avro Schema (OLD) I was using to write Avro binary data before:
{
namespace: com.hello.world,
type: record,
name: Toto,
fields:
{
name: a,
type: [
string,
null
]
},
{
name: b,
type: string
}
]
}

This is the Avro Schema (NEW) I'm using to read the Avro binary data:

{
namespace: com.hello.world,
type: record,
name: Toto,
fields:
{
name: a,
type: [
string,
null
]
},
{
name: b,
type: string
},
{
name: c,
type: string,
default: na
}
]
}

However, I can't read the old data with the new schema. I get the
following error:

15/04/08 17:32:22 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 (TID 3)
java.io.EOFException
    at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473)
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128)
    at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259)
    at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:272)
    at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:113)
    at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:353)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
    at com.miguno.kafka.avro.AvroDecoder.fromBytes(AvroDecoder.scala:31)

From my understanding, I should be able to read the old data with the new
schema, since the new field has a default value. But it doesn't seem to
work. Am I doing something wrong?

I have filed a report: https://issues.apache.org/jira/browse/AVRO-1661

Regards,
Nicolas PHUNG
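
One possible explanation, offered as an assumption rather than a confirmed
diagnosis: an EOFException raised inside readString is what one would expect
if the decoder is given only the new schema and tries to read field c from
bytes that end after field b. Avro binary data carries no field names or
markers, so schema resolution needs the schema the data was written with in
addition to the schema to read into (the Java GenericDatumReader has a
constructor taking both a writer and a reader schema). The block below is a
minimal standard-library Python sketch of the wire format for these two
schemas, not the Avro library itself. The names encode, decode and
decode_resolved are hypothetical, and only strings and the [string, null]
union are handled.

```python
import io

# Simplified stand-ins for the two schemas in this thread: (field name, type).
OLD_FIELDS = [("a", ["string", "null"]), ("b", "string")]
NEW_FIELDS = OLD_FIELDS + [("c", "string")]
NEW_DEFAULTS = {"c": "na"}


def write_long(out, n):
    """Avro int/long: zigzag encoding, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    while z & ~0x7F:
        out.write(bytes([(z & 0x7F) | 0x80]))
        z >>= 7
    out.write(bytes([z]))


def read_long(buf):
    shift = result = 0
    while True:
        b = buf.read(1)
        if not b:
            raise EOFError("ran out of bytes while reading a length or index")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return (result >> 1) ^ -(result & 1)
        shift += 7


def read_value(buf, ftype):
    if isinstance(ftype, list):          # union: branch index comes first
        ftype = ftype[read_long(buf)]
    if ftype == "null":
        return None
    length = read_long(buf)              # string: byte length, then UTF-8 bytes
    data = buf.read(length)
    if len(data) < length:
        raise EOFError("ran out of bytes while reading a string")
    return data.decode("utf-8")


def encode(record, fields):
    """Write a record as the concatenation of its fields, in schema order."""
    out = io.BytesIO()
    for name, ftype in fields:
        value = record[name]
        if isinstance(ftype, list):
            write_long(out, ftype.index("null" if value is None else "string"))
            if value is None:
                continue
        raw = value.encode("utf-8")
        write_long(out, len(raw))
        out.write(raw)
    return out.getvalue()


def decode(data, fields):
    """Read every field of `fields` straight from the bytes (single schema)."""
    buf = io.BytesIO(data)
    return {name: read_value(buf, ftype) for name, ftype in fields}


def decode_resolved(data, writer_fields, reader_fields, defaults):
    """Read with the writer's layout, then fill reader-only fields from defaults."""
    written = decode(data, writer_fields)
    return {name: written[name] if name in written else defaults[name]
            for name, _ in reader_fields}


old_bytes = encode({"a": "x", "b": "y"}, OLD_FIELDS)
resolved = decode_resolved(old_bytes, OLD_FIELDS, NEW_FIELDS, NEW_DEFAULTS)
```

In this sketch, decode(old_bytes, NEW_FIELDS) raises EOFError when it reaches
field c, while decode_resolved returns the record with c set to its default.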


Re: Issue with reading old data with a new Avro Schema

2015-04-08 Thread Nicolas Phung
OLD:
{
  "namespace": "com.hello.world",
  "type": "record",
  "name": "Toto",
  "fields": [
    {
      "name": "a",
      "type": ["string", "null"]
    },
    {
      "name": "b",
      "type": "string"
    }
  ]
}

NEW:
{
  "namespace": "com.hello.world",
  "type": "record",
  "name": "Toto",
  "fields": [
    {
      "name": "a",
      "type": ["string", "null"]
    },
    {
      "name": "b",
      "type": "string"
    },
    {
      "name": "c",
      "type": "string",
      "default": "na"
    }
  ]
}

Sorry, bad copy-paste. The Avro schema itself should be fine, because I'm
using sbt-avro to generate the class from it.
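
The two points raised here (that the schema text parses as JSON, and that
every field added in the new schema declares a default) can be checked
independently of any code generator. A minimal sketch using only Python's
json module follows; the function name added_fields_without_default is
hypothetical, and the check looks only at top-level record fields.

```python
import json

OLD = """
{"namespace": "com.hello.world", "type": "record", "name": "Toto",
 "fields": [
   {"name": "a", "type": ["string", "null"]},
   {"name": "b", "type": "string"}
 ]}
"""

NEW = """
{"namespace": "com.hello.world", "type": "record", "name": "Toto",
 "fields": [
   {"name": "a", "type": ["string", "null"]},
   {"name": "b", "type": "string"},
   {"name": "c", "type": "string", "default": "na"}
 ]}
"""


def added_fields_without_default(old_text, new_text):
    """Return names of fields present only in the new schema that lack a default.

    json.loads raises json.JSONDecodeError (a ValueError) if either text is
    not valid JSON, for example when the "[" after "fields" is missing.
    """
    old = json.loads(old_text)
    new = json.loads(new_text)
    old_names = {field["name"] for field in old["fields"]}
    return [field["name"] for field in new["fields"]
            if field["name"] not in old_names and "default" not in field]


missing = added_fields_without_default(OLD, NEW)
```

An empty list means every added field has a default, which is the condition
the new schema needs to satisfy for old data to be readable with it.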

On Wed, Apr 8, 2015 at 6:57 PM, Lukas Steiblys lu...@doubledutch.me wrote:

 The schema is not valid JSON. Maybe you forgot the “[” after “fields:”?

 Lukas
   *From:* Nicolas Phung nicolas.ph...@gmail.com
 *Sent:* Wednesday, April 8, 2015 9:45 AM
 *To:* user@avro.apache.org
 *Subject:* Issue with reading old data with a new Avro Schema
