A strange problem when I am trying to read avro record with a subset of the 
schema.
-----------------------------------------------------------------------------------

                 Key: AVRO-793
                 URL: https://issues.apache.org/jira/browse/AVRO-793
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.5.0
         Environment: Avro 1.5, Windows XP/Ubuntu 10.0.4
            Reporter: Yingzhong Xu
            Priority: Critical


Hi, all. When I try to read an Avro file using a subset of the writer
schema (because I do not need all the details), I run into a strange problem.
1. I write data using this schema:
{
    "name": "relation",
    "type": "record",
    "fields": [
        {
            "name": "timestamp",
            "type": "long"
        },
        {
            "name": "type",
            "type": {
                "type": "map",
                "values": {
                    "type": "array",
                    "items": {
                        "type": "record",
                        "name": "sdf",
                        "fields": [
                            {
                                "name": "device",
                                "type": "string"
                            },
                            {
                                "name": "children",
                                "type": {
                                    "type": "array",
                                    "items": "string"
                                }
                            }
                        ]
                    }
                }
            }
        }
    ]
}

2. Here is a JSON object that conforms to that schema:
{
    "timestamp": 1234567890,
    "type": {
        "WMA": [
            {
                "device": "WMA1",
                "children": ["WMB1", "WMB2"]
            },
            {
                "device": "WMA2",
                "children": ["WMB1", "WMB2"]
            }
        ]
    }
}

3. I write that record successfully. Reading also works if I use this
schema, which omits "device":
{
    "name": "relation",
    "type": "record",
    "fields": [
        {
            "name": "timestamp",
            "type": "long"
        },
        {
            "name": "type",
            "type": {
                "type": "map",
                "values": {
                    "type": "array",
                    "items": {
                        "type": "record",
                        "name": "sdf",
                        "fields": [
                            {
                                "name": "children",
                                "type": {
                                    "type": "array",
                                    "items": "string"
                                }
                            }
                        ]
                    }
                }
            }
        }
    ]
}

The result is:
{
    "timestamp": 1234567890,
    "type": {
        "WMA": [
            {
                "children": ["WMB1", "WMB2"]
            },
            {
                "children": ["WMB1", "WMB2"]
            }
        ]
    }
}
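
For reference, the Avro specification matches record fields by name during schema resolution and ignores writer fields that the reader schema does not declare, while the data itself stays in writer field order. The sketch below only illustrates that rule for the "sdf" record; the class ProjectionPlanSketch and its plan method are hypothetical names made up for this illustration and are not part of Avro.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of by-name field resolution; not Avro's ResolvingDecoder. */
class ProjectionPlanSketch {

    /** For each writer field, in writer order, decide whether the reader keeps or skips it. */
    static List<String> plan(List<String> writerFields, List<String> readerFields) {
        List<String> actions = new ArrayList<>();
        for (String field : writerFields) {
            // A field the reader does not declare still has to be consumed from the stream.
            actions.add((readerFields.contains(field) ? "READ " : "SKIP ") + field);
        }
        return actions;
    }

    public static void main(String[] args) {
        List<String> writer = Arrays.asList("device", "children");
        // Reader schema with only "children".
        System.out.println(plan(writer, Arrays.asList("children"))); // [SKIP device, READ children]
        // Reader schema with only "device".
        System.out.println(plan(writer, Arrays.asList("device")));   // [READ device, SKIP children]
    }
}
```

Under this rule, a reader schema that keeps only "children" has to skip the leading "device" string of every "sdf" record.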

4. But if I want to ignore the "children" part instead of "device", I use this
schema for reading:
{
    "name": "relation",
    "type": "record",
    "fields": [
        {
            "name": "timestamp",
            "type": "long"
        },
        {
            "name": "type",
            "type": {
                "type": "map",
                "values": {
                    "type": "array",
                    "items": {
                        "type": "record",
                        "name": "sdf",
                        "fields": [
                            {
                                "name": "device",
                                "type": "string"
                            }
                        ]
                    }
                }
            }
        }
    ]
}

Unfortunately, I get an exception:

java.lang.ArrayIndexOutOfBoundsException: -8
cause:java.lang.ArrayIndexOutOfBoundsException
at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
at AvroUtilTest.read(AvroUtilTest.java:77)
at AvroUtilTest.main(AvroUtilTest.java:61)
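
For comparison, here is a minimal hand-rolled sketch of what skipping is expected to do on one "sdf" record, assuming the binary encoding described in the Avro specification (zig-zag varint longs, length-prefixed UTF-8 strings, arrays as counted blocks ending with a zero count). It is not Avro's BinaryDecoder and does not reproduce the exception; the class SdfSkipSketch and all of its methods are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the binary layout of one "sdf" record; not Avro's own code. */
class SdfSkipSketch {
    private final byte[] buf;
    private int pos = 0;

    SdfSkipSketch(byte[] buf) { this.buf = buf; }

    /** Avro long: zig-zag encoded, then written as a base-128 varint, low group first. */
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes the fields in writer-schema order: "device", then "children". */
    static byte[] encodeSdf(String device, List<String> children) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, device);
        if (!children.isEmpty()) {
            writeLong(out, children.size()); // a single block holding every item
            for (String c : children) writeString(out, c);
        }
        writeLong(out, 0); // zero count terminates the array
        return out.toByteArray();
    }

    long readLong() {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1); // undo zig-zag
    }

    String readString() {
        int len = (int) readLong();
        String s = new String(buf, pos, len, StandardCharsets.UTF_8);
        pos += len;
        return s;
    }

    /** Skipping a string: read its length, then jump over that many bytes. */
    void skipString() { pos += (int) readLong(); }

    List<String> readStringArray() {
        List<String> items = new ArrayList<>();
        for (long n = readLong(); n != 0; n = readLong()) {
            if (n < 0) { n = -n; readLong(); } // negative count: a byte size follows
            for (long i = 0; i < n; i++) items.add(readString());
        }
        return items;
    }

    void skipStringArray() {
        for (long n = readLong(); n != 0; n = readLong()) {
            if (n < 0) {
                pos += (int) readLong(); // byte size allows jumping over the whole block
            } else {
                for (long i = 0; i < n; i++) skipString();
            }
        }
    }

    /** Step 4 projection: keep "device", skip the trailing "children" array. */
    static String readDeviceOnly(byte[] data) {
        SdfSkipSketch d = new SdfSkipSketch(data);
        String device = d.readString();
        d.skipStringArray();
        if (d.pos != data.length) throw new IllegalStateException("decoder misaligned");
        return device;
    }

    /** Step 3 projection: skip the leading "device" string, keep "children". */
    static List<String> readChildrenOnly(byte[] data) {
        SdfSkipSketch d = new SdfSkipSketch(data);
        d.skipString();
        List<String> children = d.readStringArray();
        if (d.pos != data.length) throw new IllegalStateException("decoder misaligned");
        return children;
    }

    public static void main(String[] args) {
        byte[] data = encodeSdf("WMA1", Arrays.asList("WMB1", "WMB2"));
        System.out.println(readDeviceOnly(data));   // WMA1
        System.out.println(readChildrenOnly(data)); // [WMB1, WMB2]
    }
}
```

In this sketch both projections leave the position exactly at the end of the record, so the encoded layout itself permits skipping either field.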



I tried what Scott Carey suggested (quoted below), and reading then worked.
How can this bug be fixed?
Scott Carey:
2:  If you change the schema you write with by reversing the order of
the fields of "sdf" (array, then string), are the results the same?
