Sean,

Thanks for the reply.
Your suggestion makes sense. The default example wraps a GenericDatumWriter in a DataFileWriter, then calls the create/append/close methods on the DataFileWriter in sequence to write out the container file.

My problem with using ProtobufDatumWriter in the same way is that I do not have an Avro schema object to pass to dataFileWriter.create(schema, file). As I understand it, avro-protobuf should have a way to convert the protobuf schema to an Avro schema automatically, but I have not found any utility class that does the schema conversion. Correct me if I am wrong.

Lan

> On Aug 24, 2015, at 3:14 PM, Sean Busbey <bus...@cloudera.com> wrote:
>
> Hiya Lan!
>
> You need to use a container file instead of just writing via the datum
> writer yourself.
>
> Take a look at the "Getting Started (Java)" section on serialization[1].
> The example there uses the GenericDatumWriter, but you ought to be able to
> switch it out for your ProtobufDatumWriter.
>
> [1]: http://avro.apache.org/docs/1.7.7/gettingstartedjava.html#Serializing-N101DE
>
> On Mon, Aug 24, 2015 at 12:54 PM, Lan Jiang <ljia...@gmail.com> wrote:
> > Hi, there
> >
> > I am trying to convert a protobuf object to Avro. I am using
> >
> > //myProto object is deserialized using google protobuf API
> > ProtobufDatumWriter<MyProto> pbWriter = new ProtobufDatumWriter<MyProto>(MyProto.class);
> > FileOutputStream fo = new FileOutputStream(args[0]);
> > Encoder e = EncoderFactory.get().binaryEncoder(fo, null);
> > pbWriter.write(myProto, e);
> > fo.flush();
> >
> > The avro file was created successfully. If I cat the file, I can see the
> > data in the file. However, when I tried to use avro-tools to get schema or
> > meta info about the saved avro file, it says
> >
> > Exception in thread "main" java.io.IOException: Not a data file.
> >     at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
> >     at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> >     at org.apache.avro.tool.DataFileGetSchemaTool.run(DataFileGetSchemaTool.java:47)
> >
> > Looking at the Avro source code, the error means the first 4 bytes of the
> > file do not match the 4 MAGIC bytes. I am trying to see if I have done
> > anything wrong.
> >
> > Appreciate any help you can give me.
> >
> > Lan
>
> --
> Sean
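
For reference, the "Not a data file" check mentioned in the quoted message can be reproduced without avro-tools. The following is only a minimal sketch using the Java standard library, not Avro's own code: the class name AvroMagicCheck and the method hasAvroMagic are hypothetical names made up for illustration. The one fact it relies on is that an Avro object container file starts with the four bytes 'O', 'b', 'j', 1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of Avro): reports whether a file begins
// with the Avro object container file magic bytes.
public class AvroMagicCheck {
    // Container files start with ASCII 'O', 'b', 'j' followed by the byte 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // True if the given bytes begin with the magic sequence.
    public static boolean hasAvroMagic(byte[] header) {
        return header.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, MAGIC.length), MAGIC);
    }

    // Reads up to the first four bytes of the file and checks them.
    public static boolean hasAvroMagic(Path file) throws IOException {
        byte[] header = new byte[MAGIC.length];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // A single read() may return fewer bytes than requested, so loop.
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // end of file before four bytes were read
                }
                read += n;
            }
        }
        return read == MAGIC.length && hasAvroMagic(header);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: AvroMagicCheck <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(hasAvroMagic(file)
                ? "starts with the Avro container magic"
                : "does not start with the Avro container magic");
    }
}
```

A file written by calling the datum writer directly on a binaryEncoder holds only the encoded datum, so it would be expected to fail this check, while a file written through DataFileWriter should pass it.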