Why does JsonGenerator encodeBinary encode bytes as a Unicode string?

2014-09-26 Thread Patrick Nip

I am wondering why the avro-cpp JsonGenerator encodeBinary encodes every byte as 

class AVRO_DECL JsonGenerator {
    // ...
    void escapeCtl(char c) {
        out_.write('\\');
        out_.write('U');
        out_.write('0');
        out_.write('0');
        out_.write(toHex((static_cast<unsigned char>(c)) / 16));
        out_.write(toHex((static_cast<unsigned char>(c)) % 16));
    }

Is this a bug?
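
For context, this may be intended behaviour rather than a bug: the Avro specification's JSON encoding represents bytes as a JSON string in which each byte becomes the Unicode code point of the same value (0 to 255), so one \u00XX escape per byte is a valid rendering. The sketch below is a standalone illustration of that mapping, not avro-cpp's implementation; toHexDigit, escapeByte and bytesToJsonString are hypothetical names. It emits a lowercase \u, which is what the JSON grammar specifies, whereas the listing above writes an uppercase U.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of avro-cpp): map a value 0..15 to a hex digit.
char toHexDigit(unsigned int nibble) {
    return static_cast<char>(nibble < 10 ? '0' + nibble : 'a' + (nibble - 10));
}

// Hypothetical helper: render one byte as the JSON escape \u00XX.
std::string escapeByte(std::uint8_t b) {
    std::string out = "\\u00";
    out += toHexDigit(b / 16u);
    out += toHexDigit(b % 16u);
    return out;
}

// Render a byte sequence as a quoted JSON string, one escape per byte.
std::string bytesToJsonString(const std::vector<std::uint8_t>& bytes) {
    std::string out = "\"";
    for (std::uint8_t b : bytes) {
        out += escapeByte(b);
    }
    out += '"';
    return out;
}
```

For example, bytesToJsonString({35, 36}) yields the quoted string "\u0023\u0024".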

2014-09-25 Thread Patrick Nip
According to the documentation:


avro::GenericDatum::GenericDatum(const ValidSchema& schema)

Constructs a datum corresponding to the given avro type.

The value will be the appropriate default corresponding to the data type.

schema  The schema that defines the avro type.

A GenericDatum, when given a valid schema, should have all its fields filled 
with reasonable defaults. But this seems to fail for a union schema. I have 
written the following piece of code, which takes a schema as input, then tries 
to encode the resulting datum and save the result in a file:

avro::ValidSchema load(const char* filename)
{
    std::ifstream ifs(filename);
    avro::ValidSchema result;
    avro::compileJsonSchema(ifs, result);
    return result;
}

int
main(int argc, char ** argv)
{
    avro::ValidSchema sch = load(argv[1]); // load a schema

    avro::GenericDatum metaDatum( sch );
    // write the result to a specified file
    std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( argv[2], 1 );
    avro::EncoderPtr en = avro::jsonEncoder( sch );
    en->init( *out );
    avro::encode( *en, metaDatum );
    en->flush();

    return 0;
}

Case in point:

[pnip =avro_rhel6= my_avro]$ cat union.schema

["bytes", "long"]

[pnip =avro_rhel6= my_avro]$ ./schemaTest union.schema /tmp/result

terminate called after throwing an instance of 'avro::Exception'

  what():  Not that many names


It seems to work on other schema types (I have not checked them all yet) but 
fails on a union type schema.

Is it a legal Avro schema to have a name tied to different types in different records?

2014-09-14 Thread Patrick Nip
Hi All,

Is the following a legal schema:



  "metadata" : {
    "schema" : {
      "family" : "search",
      "version" : "v1",
      "attrs" : [ "srch" ]

  "metadata" : {
    "schema" : {
      "family" : "UDB",
      "version" : "v1",
      "attrs" : [ "login", "reg" ]





Note that metadata appears twice, in 2 different records, with different 


How to update a union field generically

2014-09-14 Thread Patrick Nip
I am trying to use the C++ generic interface to update the following record:


"type": "record",
"namespace": "com.abc.v1",
"name": "def",
"fields": [

"name": "id",
"type": [







The following is how I try to do the update through the GenericDatum. The 
code compiles, but when run it throws an exception:

uncaught exception of type N4avro9ExceptionE

- Not that many names

std::vector<uint8_t> postdata;
uint8_t idarray[] = { 35, 36, 37, 38 };
std::copy( idarray, idarray + 3, std::back_inserter( postdata ));

avro::GenericDatum dat( schema ); // dat was filled with null by default
dat.setFieldAt(0, postdata); // trying to replace the null with bytes

std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
avro::EncoderPtr en = avro::jsonEncoder( schema );
en->init( *out );
avro::encode( *en, dat );
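
For reference, here is what the encoded output would presumably look like once the union holds bytes. The Avro specification's JSON encoding writes a null union value as null, and any other union value as a JSON object with a single name/value pair whose name is the branch's type name. The sketch below builds that text by hand with the standard library only; it does not use avro-cpp. It assumes the elided union in the schema above is ["null", "bytes"], as the code comments suggest, and bytesAsJsonString, unionAsJson and recordAsJson are hypothetical names.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Render bytes as a JSON string, one \u00XX escape per byte
// (each byte maps to the code point of the same value).
std::string bytesAsJsonString(const std::vector<std::uint8_t>& bytes) {
    std::string out = "\"";
    char buf[7];  // backslash, 'u', four hex digits, terminating NUL
    for (std::uint8_t b : bytes) {
        std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(b));
        out += buf;
    }
    out += '"';
    return out;
}

// Assumed union ["null", "bytes"]: a null pointer selects the null branch,
// anything else selects the bytes branch, tagged with its type name.
std::string unionAsJson(const std::vector<std::uint8_t>* bytes) {
    if (bytes == nullptr) {
        return "null";
    }
    return "{\"bytes\":" + bytesAsJsonString(*bytes) + "}";
}

// A record with the single field "id" from the schema above.
std::string recordAsJson(const std::vector<std::uint8_t>* id) {
    return "{\"id\":" + unionAsJson(id) + "}";
}
```

With the three bytes 35, 36, 37 copied into postdata above, recordAsJson(&postdata) gives {"id":{"bytes":"\u0023\u0024\u0025"}}, and recordAsJson(nullptr) gives {"id":null}. An encoder is free to write printable bytes literally instead of escaping them; both forms denote the same JSON string.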

Re: How to update a union field generically

2014-09-14 Thread Patrick Nip
Complete code sample:

#include "attributes.hh"
#include <avro/Encoder.hh>
#include <avro/Decoder.hh>
#include <boost/algorithm/string.hpp>
#include <boost/typeof/typeof.hpp>
#include <iomanip>
#include <vector>
#include <fstream>
#include <avro/Compiler.hh>
#include <avro/Generic.hh>
#include <avro/Schema.hh>

int main(int argc, char ** argv)
{
  avro::ValidSchema schema;
  std::ifstream ifs("unit/id.json");
  avro::compileJsonSchema(ifs, schema);

  std::vector<uint8_t> postdata;
  uint8_t idarray[] = { 35, 36, 37, 38 };
  std::copy( idarray, idarray + 3, std::back_inserter( postdata ));

  avro::GenericDatum dat( schema ); // dat was filled with null by default
  avro::GenericRecord datr = dat.value<avro::GenericRecord>();
  datr.setFieldAt(0, postdata); // trying to replace the null with bytes

  std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
  avro::EncoderPtr en = avro::jsonEncoder( schema );
  en->init( *out );
  avro::encode( *en, dat );

  return 0;
}


[pnip =avro_rhel6= avro]$ ./schemaTest

terminate called after throwing an instance of 'avro::Exception'

  what():  Invalid operation. Expected: Bytes got Union


[pnip =avro_rhel6= avro]$

From: Yahoo <p...@yahoo-inc.com>
Reply-To: user@avro.apache.org
Date: Sunday, September 14, 2014 at 9:45 PM
To: user@avro.apache.org
Subject: How to update a union field generically

Can avro cpp support multiple records?

2014-09-05 Thread Patrick Nip

More specifically, if I have the following payload with 2 records, metadata 
and data, can the avro cpp library parse each record successively? If it can, 
could you share a code sample showing how that can be done? I know the Java 
library's BinaryDecoder has no problem dealing with it. You just need to 
construct different GenericDatumReaders using different schemas and read from 
the same decoder. E.g.

new GenericDatumReader<GenericRecord>(metaDataSchema).read(null, decoder);
new GenericDatumReader<GenericRecord>(dataSchema).read(null, decoder);

But is it possible to do the same thing in C++?

  "metadata" : {
  "data" : {
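
On why this works at all: Avro's binary encoding has no per-record framing. A decoder consumes bytes according to whatever schema it is handed, so two records written back to back can be read back to back from the same position in the stream. The sketch below illustrates that idea with the standard library only and is not the avro-cpp API. Cursor, Metadata, Data and their field layouts are hypothetical (the payload above is truncated, so the real fields are unknown). Only the wire rules are taken from the Avro specification: a long is a zig-zag varint, and a string is a long length followed by that many bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a binary decoder: a read position over one buffer.
class Cursor {
public:
    explicit Cursor(const std::vector<std::uint8_t>& buf) : buf_(buf), pos_(0) {}

    // Avro long: zig-zag value stored as a little-endian base-128 varint.
    std::int64_t readLong() {
        std::uint64_t raw = 0;
        for (int shift = 0; ; shift += 7) {
            if (shift > 63) throw std::runtime_error("varint too long");
            std::uint8_t b = next();
            raw |= static_cast<std::uint64_t>(b & 0x7f) << shift;
            if ((b & 0x80) == 0) break;
        }
        // Undo zig-zag: even values are non-negative, odd values negative.
        return static_cast<std::int64_t>(raw >> 1) ^
               -static_cast<std::int64_t>(raw & 1);
    }

    // Avro string: a long length followed by that many bytes.
    std::string readString() {
        std::int64_t n = readLong();
        if (n < 0 || static_cast<std::uint64_t>(n) > buf_.size() - pos_) {
            throw std::runtime_error("bad string length");
        }
        auto first = buf_.begin() + static_cast<std::ptrdiff_t>(pos_);
        std::string s(first, first + static_cast<std::ptrdiff_t>(n));
        pos_ += static_cast<std::size_t>(n);
        return s;
    }

    bool atEnd() const { return pos_ == buf_.size(); }

private:
    std::uint8_t next() {
        if (pos_ >= buf_.size()) throw std::runtime_error("unexpected end");
        return buf_[pos_++];
    }

    const std::vector<std::uint8_t>& buf_;
    std::size_t pos_;
};

// Hypothetical record layouts, chosen only for illustration.
struct Metadata { std::string family; std::string version; };
struct Data { std::int64_t id; };

// Each reader consumes exactly the bytes of its own record, in field order.
Metadata readMetadata(Cursor& c) {
    Metadata m;
    m.family = c.readString();
    m.version = c.readString();
    return m;
}

Data readData(Cursor& c) {
    Data d;
    d.id = c.readLong();
    return d;
}
```

Calling readMetadata and then readData on the same Cursor mirrors the two successive read calls in the Java snippet: the second read starts where the first one stopped.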