Why does JsonGenerator encodeBinary encode bytes as a Unicode string?
Hi, I am wondering why the avro-cpp JsonGenerator's encodeBinary encodes every byte as a Unicode escape sequence:

    class AVRO_DECL JsonGenerator {
        // ...
        void escapeCtl(char c) {
            out_.write('\\');
            out_.write('U');
            out_.write('0');
            out_.write('0');
            out_.write(toHex((static_cast<unsigned char>(c)) / 16));
            out_.write(toHex((static_cast<unsigned char>(c)) % 16));
        }
        // ...
    };
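For context on the question above: the Avro specification's JSON encoding represents bytes as a JSON string in which each byte maps to the Unicode code point of the same value (0 to 255), so escaping a byte as \u00XX is at least consistent with that mapping. The sketch below is not the avro-cpp implementation. It is a standalone illustration using only the standard library, and toHexDigit, escapeByte, and bytesToJsonString are hypothetical names chosen for this example. It emits a lowercase \u because that is the only Unicode escape form JSON defines, so the uppercase 'U' in the quoted snippet may be worth checking against a strict JSON parser.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Map a value in [0, 15] to its lowercase hexadecimal digit.
// (Hypothetical helper; avro-cpp has its own toHex.)
char toHexDigit(unsigned int n) {
    assert(n < 16);
    return static_cast<char>(n < 10 ? '0' + n : 'a' + (n - 10));
}

// Escape one byte as the six-character JSON sequence \u00XX,
// i.e. the Unicode code point whose value equals the byte.
std::string escapeByte(std::uint8_t b) {
    std::string s = "\\u00";
    s += toHexDigit(b / 16);  // high nibble
    s += toHexDigit(b % 16);  // low nibble
    return s;
}

// Render a byte buffer as a quoted JSON string, escaping every byte.
// Assumption: escaping everything is a simplification for this sketch.
std::string bytesToJsonString(const std::vector<std::uint8_t>& bytes) {
    std::string out = "\"";
    for (std::uint8_t b : bytes) {
        out += escapeByte(b);
    }
    out += '"';
    return out;
}
```

With this sketch, bytesToJsonString({35, 36, 37}) yields the JSON string "\u0023\u0024\u0025". An encoder could also write printable ASCII bytes unescaped and still decode to the same byte values; escaping every byte just keeps the logic uniform.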
Is this a bug?
According to the documentation (https://avro.apache.org/docs/current/api/cpp/html/classavro_1_1GenericDatum.html#a879e7b725023bfd8246e15f07cb5bef0):

    avro::GenericDatum::GenericDatum(const ValidSchema& schema)

    Constructs a datum corresponding to the given avro type. The value
    will be the appropriate default corresponding to the data type.

    Parameters:
        schema  The schema that defines the avro type.

So a GenericDatum constructed from a valid schema should have all of its fields filled with reasonable defaults, but this seems to fail for a union schema. I have written the following program, which takes a schema file as input, encodes the default datum, and saves the result to a file:

    avro::ValidSchema load(const char* filename)
    {
        std::ifstream ifs(filename);
        avro::ValidSchema result;
        avro::compileJsonSchema(ifs, result);
        return result;
    }

    int
    main(int argc, char ** argv)
    {
        avro::ValidSchema sch = load(argv[1]); // load a schema

        avro::GenericDatum metaDatum( sch );
        std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( argv[2], 1 ); // write the result to a specified file
        avro::EncoderPtr en = avro::jsonEncoder( sch );
        en->init ( *out );
        avro::encode ( *en, metaDatum );
        en->flush();

        return 0;
    }

Case in point:

    [pnip =avro_rhel6= my_avro]$ cat union.schema
    [ "bytes", "long" ]
    [pnip =avro_rhel6= my_avro]$ ./schemaTest union.schema /tmp/result
    terminate called after throwing an instance of 'avro::Exception'
      what():  Not that many names
    Aborted

It seems to work for other schema types (I have not checked them all yet), but it fails for a union schema.
Is it a legal Avro schema to have a name tied to different types in different records?
Hi all,

Is the following a legal schema?

    {
        {
            "metadata" : {
                "schema" : {
                    "family" : "search",
                    "version" : "v1",
                    "attrs" : [ "srch" ]
                }
            }
        }
        {
            "metadata" : {
                "schema" : {
                    "family" : "UDB",
                    "version" : "v1",
                    "attrs" : [ "login", "reg" ]
                }
            }
        }
    }

Note that metadata appears twice, in two different records, with a different definition in each.

Thanks,
Patrick
How to update a union field generically
I am trying to use the C++ generic interface to update the following record:

    {
        "type": "record",
        "namespace": "com.abc.v1",
        "name": "def",
        "fields": [
            { "name": "id", "type": [ "null", "bytes" ] }
        ]
    }

I am trying to do the update through the GenericDatum, as shown below. The code compiles, but at runtime it throws an exception:

    uncaught exception of type N4avro9ExceptionE
    - Not that many names

The code:

    std::vector<uint8_t> postdata;
    uint8_t idarray[] = { 35, 36, 37, 38 };
    std::copy( idarray, idarray + 3, std::back_inserter( postdata ));

    avro::GenericDatum dat( schema );   // dat was filled with null by default
    dat.setFieldAt(0, postdata);        // trying to replace the null with bytes

    std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
    avro::EncoderPtr en = avro::jsonEncoder( schema );
    en->init ( *out );
    avro::encode ( *en, dat );
Re: How to update a union field generically
Complete code sample:

    #include "attributes.hh"
    #include <avro/Encoder.hh>
    #include <avro/Decoder.hh>
    #include <boost/algorithm/string.hpp>
    #include <boost/typeof/typeof.hpp>
    #include <iomanip>
    #include <vector>
    #include <fstream>
    #include <avro/Compiler.hh>
    #include <avro/Generic.hh>
    #include <avro/Schema.hh>

    int main(int argc, char ** argv)
    {
        avro::ValidSchema schema;
        std::ifstream ifs("unit/id.json");
        avro::compileJsonSchema(ifs, schema);

        std::vector<uint8_t> postdata;
        uint8_t idarray[] = { 35, 36, 37, 38 };
        std::copy( idarray, idarray + 3, std::back_inserter( postdata ));

        avro::GenericDatum dat( schema );   // dat was filled with null by default
        avro::GenericRecord datr = dat.value<avro::GenericRecord>();
        datr.setFieldAt(0, postdata);       // trying to replace the null with bytes

        std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
        avro::EncoderPtr en = avro::jsonEncoder( schema );
        en->init ( *out );
        avro::encode ( *en, dat );
        return 0;
    }

Result:

    [pnip =avro_rhel6= avro]$ ./schemaTest
    terminate called after throwing an instance of 'avro::Exception'
      what():  Invalid operation. Expected: Bytes got Union
    Aborted
    [pnip =avro_rhel6= avro]$

From: Yahoo <p...@yahoo-inc.com>
Reply-To: "user@avro.apache.org" <user@avro.apache.org>
Date: Sunday, September 14, 2014 at 9:45 PM
To: "user@avro.apache.org" <user@avro.apache.org>
Subject: How to update a union field generically

> I am trying to use the C++ generic interface to update the following record:
>
>     {
>         "type": "record",
>         "namespace": "com.abc.v1",
>         "name": "def",
>         "fields": [
>             { "name": "id", "type": [ "null", "bytes" ] }
>         ]
>     }
>
> I am trying to do the update through the GenericDatum, as shown below.
> The code compiles, but at runtime it throws an exception:
>
>     uncaught exception of type N4avro9ExceptionE
>     - Not that many names
>
> The code:
>
>     std::vector<uint8_t> postdata;
>     uint8_t idarray[] = { 35, 36, 37, 38 };
>     std::copy( idarray, idarray + 3, std::back_inserter( postdata ));
>
>     avro::GenericDatum dat( schema );   // dat was filled with null by default
>     dat.setFieldAt(0, postdata);        // trying to replace the null with bytes
>
>     std::auto_ptr<avro::OutputStream> out = avro::fileOutputStream( "/tmp/fout" );
>     avro::EncoderPtr en = avro::jsonEncoder( schema );
>     en->init ( *out );
>     avro::encode ( *en, dat );
Can avro cpp support multiple records?
More specifically, if I have the payload shown at the end of this message, with two records (metadata and data), can the avro-cpp library parse each record successively? If it can, could you share a code sample showing how that is done?

I know the Java library's BinaryDecoder has no problem dealing with this. You just construct a different GenericDatumReader for each schema and read from the same decoder, e.g.:

    new GenericDatumReader<GenericRecord>(metaDataSchema).read(null, decoder);
    new GenericDatumReader<GenericRecord>(dataSchema).read(null, decoder);

Is it possible to do the same thing in C++?

    { "metadata" : { ... }, "data" : { ... } }