A strange problem when I am trying to read avro record with a subset of the schema.
-----------------------------------------------------------------------------------

                 Key: AVRO-793
                 URL: https://issues.apache.org/jira/browse/AVRO-793
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.5.0
         Environment: Avro 1.5, Windows XP / Ubuntu 10.0.4
            Reporter: Yingzhong Xu
            Priority: Critical

Hi all,

When I try to read an Avro file with a subset of the writer's schema (because I do not need all the details), I run into a strange problem.

1. I write data using this schema:

    {
      "name": "relation",
      "type": "record",
      "fields": [
        { "name": "timestamp", "type": "long" },
        { "name": "type",
          "type": {
            "type": "map",
            "values": {
              "type": "array",
              "items": {
                "type": "record",
                "name": "sdf",
                "fields": [
                  { "name": "device", "type": "string" },
                  { "name": "children", "type": { "type": "array", "items": "string" } }
                ]
              }
            }
          }
        }
      ]
    }

2. Here is a JSON object for that schema:

    {
      "timestamp": 1234567890,
      "type": {
        "WMA": [
          { "device": "WMA1", "children": ["WMB1", "WMB2"] },
          { "device": "WMA2", "children": ["WMB1", "WMB2"] }
        ]
      }
    }

3. The record is written successfully, and reading works if I use this schema, which leaves out "device":

    {
      "name": "relation",
      "type": "record",
      "fields": [
        { "name": "timestamp", "type": "long" },
        { "name": "type",
          "type": {
            "type": "map",
            "values": {
              "type": "array",
              "items": {
                "type": "record",
                "name": "sdf",
                "fields": [
                  { "name": "children", "type": { "type": "array", "items": "string" } }
                ]
              }
            }
          }
        }
      ]
    }

   The result is:

    {
      "timestamp": 1234567890,
      "type": {
        "WMA": [
          { "children": ["WMB1", "WMB2"] },
          { "children": ["WMB1", "WMB2"] }
        ]
      }
    }

4. But if I want to ignore the "children" part instead of "device", I use this schema for reading:

    {
      "name": "relation",
      "type": "record",
      "fields": [
        { "name": "timestamp", "type": "long" },
        { "name": "type",
          "type": {
            "type": "map",
            "values": {
              "type": "array",
              "items": {
                "type": "record",
                "name": "sdf",
                "fields": [
                  { "name": "device", "type": "string" }
                ]
              }
            }
          }
        }
      ]
    }

Unfortunately, I get an exception:

    java.lang.ArrayIndexOutOfBoundsException: -8
    cause:java.lang.ArrayIndexOutOfBoundsException
        at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
        at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
        at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
        at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
        at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
        at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
        at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
        at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
        at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
        at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
        at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
        at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
        at AvroUtilTest.read(AvroUtilTest.java:77)
        at AvroUtilTest.main(AvroUtilTest.java:61)

As Scott Carey suggested, I tried the following and it worked. How can this bug be fixed?

Scott Carey:
    2: If you change the schema you write with by reversing the order of the fields of "sdf" (array, then string), are the results the same?
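
For readers following the `skipArray` and `skipString` frames in the trace, here is a minimal sketch of what skipping the writer's "children" field involves at the byte level. It is not Avro's code and does not reproduce AVRO-793. It uses only the JDK and follows the binary layout described in the Avro specification: zig-zag varint longs, length-prefixed UTF-8 strings, and arrays written as counted blocks terminated by a zero count. The class and method names (`SkipArraySketch`, `encodeSdfArray`, `readDevices`, and the helpers) are hypothetical and chosen for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, JDK-only illustration of the Avro binary layout; not Avro's implementation.
class SkipArraySketch {
    // Avro long: zig-zag encoded, then written as a little-endian base-128 varint.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Avro string: a long byte length followed by that many UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    static void skipString(ByteBuffer in) {
        int len = (int) readLong(in);
        in.position(in.position() + len);
    }

    // Avro array: blocks of (count, items...), ended by a block with count 0.
    // A negative count is followed by the block's size in bytes, which lets a
    // reader jump over the whole block without decoding each item.
    static void skipStringArray(ByteBuffer in) {
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {
                int byteSize = (int) readLong(in);
                in.position(in.position() + byteSize);
            } else {
                for (long i = 0; i < count; i++) skipString(in);
            }
        }
    }

    // Encodes an array of "sdf" records; each row is {device, child1, child2, ...}.
    static byte[] encodeSdfArray(String[][] records) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (records.length > 0) writeLong(out, records.length);
        for (String[] r : records) {
            writeString(out, r[0]);                         // "device"
            if (r.length > 1) writeLong(out, r.length - 1); // block count for "children"
            for (int i = 1; i < r.length; i++) writeString(out, r[i]);
            writeLong(out, 0);                              // end of "children"
        }
        writeLong(out, 0);                                  // end of the outer array
        return out.toByteArray();
    }

    // Projects each record down to "device": read the string, skip the array after it.
    static List<String> readDevices(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        List<String> devices = new ArrayList<>();
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) { count = -count; readLong(in); } // byte size not needed here
            for (long i = 0; i < count; i++) {
                devices.add(readString(in));
                skipStringArray(in);
            }
        }
        return devices;
    }

    public static void main(String[] args) {
        String[][] records = { {"WMA1", "WMB1", "WMB2"}, {"WMA2", "WMB1", "WMB2"} };
        System.out.println(readDevices(encodeSdfArray(records))); // prints [WMA1, WMA2]
    }
}
```

The point of the sketch is that a skipped field must consume exactly the bytes the writer produced for it. If a reader starts skipping from the wrong position, the next length prefix it decodes is arbitrary data, which is one way a negative value could surface in a frame such as `readInt`. Whether that is what happens in this issue is not established here.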