Steve,

Thanks so much for the reply.  I hope that I can trouble you for a little 
more guidance.  We want to read and write Avro data files whose schema is not 
known until run-time, when we read the file metadata and transform that into 
our own internal record structure.  So we are not mapping to a C++ struct/class 
with defined compile-time members.  We just want to loop over the records and 
columns in the data file, transforming them serially.  Can this be done without 
incurring the performance penalty of GenericDatum that you speak of?

Different question: do you know if the full complement of compression codecs is 
available in C++?  We don't need "everything possible", but we want to be able 
to read 99.9% of files that we are likely to encounter in practice.

Thanks
John


From: Steve Roehrs [mailto:steve.roe...@rlmgroup.com.au]
Sent: Sunday, August 10, 2014 11:25 PM
To: user@avro.apache.org
Subject: RE: State of the C++ vs Java implementations

Hi John

You can definitely read and write Avro data files using C++.  The 
DataFileWriter and DataFileReader classes are what you need.

The README is severely out of date.

I can't comment on the relative performance of the Java and C++ APIs.  We used 
the C++ API for our application, but for performance reasons we don't use the 
GenericDatum class, as it performs poorly for our particular mix of 
data.  I don't know if the Java API fares any better in this regard.

Regards,

Steve Roehrs
Senior Software Engineer | Lockheed Martin

| p: +61 8 7389 4525    | m: +61 4 3891 5622     | f: +61 8 7389 4551
| w: www.rlmgroup.com.au | e: steve.roe...@rlmgroup.com.au
| Company address: 82-86 Woomera Ave, Edinburgh, SA 5111
________________________________
From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Wednesday, August 06, 2014 6:28 AM
To: user@avro.apache.org
Subject: State of the C++ vs Java implementations

Greetings,

I would like to read and write Avro files (such as those manipulated by 
MapReduce applications) from a C++ program.  While there are higher-level 
wrappers (such as Hive), I am interested in reading/writing the files directly.  
There are both C++ and Java library implementations; however, in the C++ API 
README I see "And the file and rpc containers are not yet implemented."  Does 
this mean that I can't read and write Avro files using the C++ library?

We have a very good C++/JNI wrapper generator, so using the Java library is 
not terribly difficult.  Given that, which interface would you recommend?  Does 
the C++ interface (assuming it works) have significant performance advantages?

Thanks
john
