Hi,

I am trying to write a very simple Avro schema (stripped down to isolate my 
current issue) to write an Avro data file from data stored in JSON format. The 
catch is that one field is optional, and either avro-tools or I am not handling 
it correctly.

The goal is not to write my own serialiser. The end goal is to have this 
running in Flume; I am still in the early stages.

The data (works), in a file named so.log:

{
  "valid":  {"boolean":true}
, "source": {"bytes":"live"}
}

The schema, in a file named so.avsc:

{
  "type":"record",
  "name":"Event",
  "fields":[
      {"name":"valid", "type": ["null", "boolean"],"default":null}
    , {"name":"source","type": ["null", "bytes"],"default":null}
  ]
}

I can easily generate an Avro file with the following command:

java -jar avro-tools-1.7.6.jar fromjson --schema-file so.avsc so.log

So far so good. The thing is that "source" is optional, so I would expect the 
following data to be valid as well:

{
  "valid": {"boolean":true}
}

But running the same command gives me the error:

Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got END_OBJECT
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:99)
at org.apache.avro.tool.Main.run(Main.java:84)
at org.apache.avro.tool.Main.main(Main.java:73)

I tried many variations of the schema, including some that do not follow the 
Avro spec. The schema shown here is, as far as I know, what the spec says it 
should be.
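
For illustration, here is a minimal sketch of one possible workaround that
stays short of a custom serialiser: pre-process each JSON record so that every
nullable field is present before handing the file to fromjson. It assumes that
the Avro JSON decoder requires every field of the record to appear in the
input, and that the null branch of a union is written as a bare null. The
helper name fill_missing_nullable_fields is hypothetical and not part of
avro-tools.

```python
import json


def fill_missing_nullable_fields(record, schema):
    """Return a copy of `record` where every absent field whose type is a
    union containing "null" is set to None (serialised as JSON null).

    Hypothetical helper: handles only top-level fields of a record schema.
    """
    filled = dict(record)
    for field in schema["fields"]:
        field_type = field["type"]
        is_nullable_union = isinstance(field_type, list) and "null" in field_type
        if field["name"] not in filled and is_nullable_union:
            filled[field["name"]] = None
    return filled


# The schema from so.avsc above.
schema = json.loads("""
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "valid",  "type": ["null", "boolean"], "default": null},
    {"name": "source", "type": ["null", "bytes"],   "default": null}
  ]
}
""")

# The record that triggers the exception: "source" is left out.
record = json.loads('{"valid": {"boolean": true}}')

print(json.dumps(fill_missing_nullable_fields(record, schema)))
# prints {"valid": {"boolean": true}, "source": null}
```

Under the assumption above, the printed record could be written back to so.log
in place of the shortened one. Fields that are already present, such as
{"bytes":"live"}, are left untouched.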

Would anybody know what I am doing wrong, and how I can actually have optional 
elements without writing my own serialiser?

Thanks,

--
Guillaume ROGER, Datawarehouse Engineer
Spil Games, http://www.spilgames.com
Arendstraat 23, 1223 RE Hilversum, The Netherlands
