Why does JsonGenerator encodeBinary encode bytes as a Unicode string?

2014-09-26 Thread Patrick Nip
Hi,

I am wondering why avro-cpp's JsonGenerator encodeBinary encodes every byte as a
Unicode escape sequence.


class AVRO_DECL JsonGenerator {
    // ...
    void escapeCtl(char c) {
        out_.write('\\');
        out_.write('U');
        out_.write('0');
        out_.write('0');
        out_.write(toHex((static_cast<unsigned char>(c)) / 16));
        out_.write(toHex((static_cast<unsigned char>(c)) % 16));
    }


Is this a bug?
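
For reference, the effect of the quoted escapeCtl on a single byte can be reproduced outside the library. The sketch below is a standalone approximation, not the avro-cpp source: escapeByte and escapeAll are hypothetical names, the output goes to a std::string instead of out_, and the toHex helper (with lowercase digits) is an assumption. Two points may help when judging whether the behaviour is a bug: the Avro specification's JSON encoding represents bytes as a JSON string whose code points 0-255 map to byte values (its example is "\u00FF"), and JSON itself defines only the lowercase \u escape, whereas the quoted snippet writes an uppercase 'U'.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the toHex helper called by the quoted snippet.
// Lowercase hex digits are an assumption.
char toHex(unsigned int nibble) {
    assert(nibble < 16);
    return static_cast<char>(nibble < 10 ? '0' + nibble : 'a' + (nibble - 10));
}

// Mirrors the quoted escapeCtl step by step, but appends to a std::string
// instead of writing to the generator's output stream.
std::string escapeByte(char c) {
    const unsigned int b = static_cast<unsigned char>(c);
    std::string out;
    out += '\\';
    out += 'U';          // uppercase, exactly as in the quoted snippet
    out += '0';
    out += '0';
    out += toHex(b / 16);
    out += toHex(b % 16);
    return out;
}

// Escapes every byte of a buffer, which is the behaviour the question
// describes for encodeBinary.
std::string escapeAll(const std::string& bytes) {
    std::string out;
    for (char c : bytes) {
        out += escapeByte(c);
    }
    return out;
}
```

With this sketch, the byte 0x23 ('#') becomes the six characters \U0023, so a four-byte payload grows to 24 characters of output.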

2014-09-25 Thread Patrick Nip
According to the documentation:

https://avro.apache.org/docs/current/api/cpp/html/classavro_1_1GenericDatum.html#a879e7b725023bfd8246e15f07cb5bef0


avro::GenericDatum::GenericDatum(const ValidSchema& schema)

Constructs a datum corresponding to the given avro type.

The value will be the appropriate default corresponding to the data type.

Parameters:
schema  The schema that defines the avro type.


A GenericDatum constructed from a valid schema should have all its fields filled
with reasonable defaults, but this seems to fail for a union schema. I wrote the
following program, which takes a schema file as input, then tries to encode the
default datum and save the result to a file:


avro::ValidSchema load(const char* filename)
{
    std::ifstream ifs(filename);
    avro::ValidSchema result;
    avro::compileJsonSchema(ifs, result);
    return result;
}

int
main(int argc, char ** argv)
{
    avro::ValidSchema sch = load(argv[1]); // load a schema

    avro::GenericDatum metaDatum( sch );
    std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream(
        argv[2], 1 ); // write the result to a specified file
    avro::EncoderPtr en = avro::jsonEncoder( sch );
    en->init ( *out );
    avro::encode ( *en, metaDatum );
    en->flush();

    return 0;
}


Case in point:


[pnip =avro_rhel6= my_avro]$ cat union.schema
["bytes", "long"]

[pnip =avro_rhel6= my_avro]$ ./schemaTest union.schema /tmp/result
terminate called after throwing an instance of 'avro::Exception'
  what():  Not that many names
Aborted


It seems to work for other schema types (I have not checked them all yet), but
it fails for a union schema.

Is it a legal Avro schema to have a name tied to different types in different records?

2014-09-14 Thread Patrick Nip
Hi All,

Is the following a legal schema?

{
  {
    "metadata" : {
      "schema" : {
        "family" : "search",
        "version" : "v1",
        "attrs" : [ "srch" ]
      }
    }
  }{
    "metadata" : {
      "schema" : {
        "family" : "UDB",
        "version" : "v1",
        "attrs" : [ "login", "reg" ]
      }
    }
  }
}

Note that metadata appears twice, in two different records, with a different
definition in each.

Thanks
Patrick


How to update a union field generically

2014-09-14 Thread Patrick Nip
I am trying to use the C++ generic interface to update the following record:

{
  "type": "record",
  "namespace": "com.abc.v1",
  "name": "def",
  "fields": [
    {
      "name": "id",
      "type": [
        "null",
        "bytes"
      ]
    }
  ]
}


I am trying to do the update through GenericDatum, as shown below. The code
compiles, but at run time it throws this exception:


uncaught exception of type N4avro9ExceptionE

- Not that many names


std::vector<uint8_t> postdata;
uint8_t idarray[] = { 35, 36, 37, 38 };
std::copy( idarray, idarray + 3, std::back_inserter( postdata ));

avro::GenericDatum dat( schema ); // dat was filled with null by default
dat.setFieldAt(0, postdata); // trying to replace the null with bytes

std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
avro::EncoderPtr en = avro::jsonEncoder( schema );
en->init ( *out );
avro::encode ( *en, dat );





Re: How to update a union field generically

2014-09-14 Thread Patrick Nip
Complete code sample:


#include "attributes.hh"
#include <avro/Encoder.hh>
#include <avro/Decoder.hh>
#include <boost/algorithm/string.hpp>
#include <boost/typeof/typeof.hpp>
#include <iomanip>
#include <vector>
#include <fstream>
#include <avro/Compiler.hh>
#include <avro/Generic.hh>
#include <avro/Schema.hh>


int
main(int argc, char ** argv)
{
  avro::ValidSchema schema;
  std::ifstream ifs("unit/id.json");
  avro::compileJsonSchema(ifs, schema);

  std::vector<uint8_t> postdata;
  uint8_t idarray[] = { 35, 36, 37, 38 };
  std::copy( idarray, idarray + 3, std::back_inserter( postdata ));

  avro::GenericDatum dat( schema ); // dat was filled with null by default
  // take the record by reference so the change applies to dat
  avro::GenericRecord& datr = dat.value<avro::GenericRecord>();

  datr.setFieldAt(0, postdata); // trying to replace the null with bytes

  std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
  avro::EncoderPtr en = avro::jsonEncoder( schema );
  en->init ( *out );
  avro::encode ( *en, dat );

  return 0;
}


Result:


[pnip =avro_rhel6= avro]$ ./schemaTest
terminate called after throwing an instance of 'avro::Exception'
  what():  Invalid operation. Expected: Bytes got Union
Aborted
[pnip =avro_rhel6= avro]$


Can avro cpp support multiple records?

2014-09-05 Thread Patrick Nip


More specifically, suppose I have the following payload with two records,
metadata and data. Can the avro cpp library parse each record in succession? If
it can, could you share a code sample showing how that is done? I know the Java
library's BinaryDecoder has no problem dealing with it. You just construct a
separate GenericDatumReader for each schema and read from the same decoder.
E.g.

new GenericDatumReader<GenericRecord>(metaDataSchema).read(null, decoder);
new GenericDatumReader<GenericRecord>(dataSchema).read(null, decoder);

But is it possible to do the same thing in C++?

{
  "metadata" : {
     ...
  },
  "data" : {
     ...
  }
}