Sean,

Thanks for the reply.

Your suggestion makes sense. The default example wraps a 
GenericDatumWriter in a DataFileWriter, then calls the create, append, and 
close methods on the DataFileWriter in sequence to write out the container file. 

My problem with using ProtobufDatumWriter in the same way is that I do 
not have an Avro Schema object to pass to dataFileWriter.create(schema, 
file). As I understand it, avro-protobuf should have a way to convert the 
protobuf schema to an Avro schema automatically, but I have not found any 
utility class that does the schema conversion. Correct me if I am wrong. 
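
For reference, here is a minimal sketch of what I am trying to end up with. It assumes avro, avro-protobuf, and protobuf-java are on the classpath, and that ProtobufData.get().getSchema(Class) is the intended way to derive the Avro schema from a generated protobuf class. The class name ProtobufToAvroFile and the method write are hypothetical names used only for illustration.

```java
import com.google.protobuf.Message;
import java.io.File;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.protobuf.ProtobufData;
import org.apache.avro.protobuf.ProtobufDatumWriter;

// Hypothetical helper; the class and method names are illustrative only.
public final class ProtobufToAvroFile {
    private ProtobufToAvroFile() {}

    // Writes the given protobuf messages to an Avro container file.
    public static <T extends Message> void write(
            Class<T> type, Iterable<T> messages, File out) throws IOException {
        // Assumption: ProtobufData derives the Avro schema from the
        // descriptor of the generated protobuf class.
        Schema schema = ProtobufData.get().getSchema(type);

        ProtobufDatumWriter<T> datumWriter = new ProtobufDatumWriter<T>(type);

        // DataFileWriter adds the container header (magic bytes, schema,
        // sync marker) that avro-tools expects to find.
        try (DataFileWriter<T> fileWriter = new DataFileWriter<T>(datumWriter)) {
            fileWriter.create(schema, out);
            for (T message : messages) {
                fileWriter.append(message);
            }
        }
    }
}
```

With my class this would be called as ProtobufToAvroFile.write(MyProto.class, Collections.singletonList(myProto), new File(args[0])).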

Lan



> On Aug 24, 2015, at 3:14 PM, Sean Busbey <bus...@cloudera.com> wrote:
> 
> Hiya Lan!
> 
> You need to use a container file instead of just writing via the datum writer 
> yourself.
> 
> Take a look at the "Getting Started (Java)" section on serialization[1]. The 
> example there uses the GenericDatumWriter, but you ought to be able to switch 
> it out for your ProtobufDatumWriter.
> 
> 
> 
> 
> [1]: http://avro.apache.org/docs/1.7.7/gettingstartedjava.html#Serializing-N101DE
> 
> On Mon, Aug 24, 2015 at 12:54 PM, Lan Jiang <ljia...@gmail.com> wrote:
> Hi, there
> 
> I am trying to convert a protobuf object to Avro. I am using:
> 
> //myProto object is deserialized using google protobuf API
> ProtobufDatumWriter<MyProto> pbWriter = new ProtobufDatumWriter<MyProto>(MyProto.class);
> FileOutputStream fo = new FileOutputStream(args[0]);
> Encoder e = EncoderFactory.get().binaryEncoder(fo, null);
> pbWriter.write(myProto, e);
> fo.flush();
> 
> The avro file was created successfully. If I cat the file, I can see the data 
> in the file. However, when I tried to use avro-tools to get schema or meta 
> info about the saved avro file, it says
> 
> Exception in thread "main" java.io.IOException: Not a data file.
>       at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
>       at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
>       at org.apache.avro.tool.DataFileGetSchemaTool.run(DataFileGetSchemaTool.java:47)
> 
> Looking at the Avro source code, the error means the first 4 bytes of the 
> file do not match the MAGIC bytes that start a data file. I am trying to 
> see whether I have done anything wrong. 
> 
> Appreciate any help you can give me.
> 
> Lan
> 
> 
> 
> -- 
> Sean
