I found out that the extra ‘\’ character before universal character names is 
added by the doEncodeString method of the JsonGenerator class (JsonIO.hh).

Should I manually replace “\\u” with “\u”, or is it possible to prevent this 
escaping somehow?
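
For reference, the manual replacement could look roughly like the sketch below. 
It assumes the encoder output has already been copied into a std::string; 
collapseDoubledUnicodeEscapes is a hypothetical helper name, not something 
provided by Avro. Note that it cannot tell a doubled escape from a genuine 
escaped backslash followed by the letter u, so it is only safe if the data 
never contains the literal text "\u".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Avro): wherever the input contains
// two backslashes followed by 'u', drop the first backslash so that
// "\\u042a" becomes "\u042a". Everything else is copied unchanged.
std::string collapseDoubledUnicodeEscapes(const std::string& json) {
    std::string result;
    result.reserve(json.size());
    for (std::size_t i = 0; i < json.size(); ++i) {
        const bool doubledEscape = json[i] == '\\'
            && i + 2 < json.size()
            && json[i + 1] == '\\'
            && json[i + 2] == 'u';
        if (doubledEscape) {
            continue;  // skip the extra backslash, keep the second one
        }
        result.push_back(json[i]);
    }
    return result;
}
```

The result still contains \uXXXX escapes, which a JSON parser would be 
expected to decode on its own.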

 

From: Anton [mailto:[email protected]] 
Sent: Tuesday, October 19, 2021 8:59 PM
To: '[email protected]' <[email protected]>
Subject: RE: How to use avro::jsonEncoder with unicode symbols?

 

Hi Martin,

 

I’m using C++. Here is the fragment that encodes a GenericDatum to JSON:

    // Encode the datum as JSON into an in-memory output stream.
    avro::EncoderPtr d = avro::jsonEncoder(validSchema);
    std::unique_ptr<avro::OutputStream> out = avro::memoryOutputStream();
    d->init(*out);
    avro::encode(*d, datum);
    out->flush();

    // Read the encoded bytes back into a raw buffer.
    std::unique_ptr<avro::InputStream> in = avro::memoryInputStream(*out);
    avro::StreamReader r(*in);
    size_t bc = out->byteCount();
    uint8_t* jsonBytes = new uint8_t[bc];
    r.readBytes(jsonBytes,bc);

 

 

From: Martin Grigorov [mailto:[email protected]] 
Sent: Tuesday, October 19, 2021 8:49 PM
To: [email protected] <mailto:[email protected]> 
Subject: Re: How to use avro::jsonEncoder with unicode symbols?

 

Hi Anton,

 

Which Avro module do you use? Java, Python, ...?

Please show us your code!

 

On Tue, Oct 19, 2021 at 6:03 PM Anton <[email protected] 
<mailto:[email protected]> > wrote:

Hello,

 

I’m trying to deserialize Avro data to JSON, but I can’t get non-ASCII symbols 
out of the encoder properly. In the encoder’s OutputStream I see Unicode escape 
codes like \\u042a in place of the non-ASCII symbols.

How do I properly translate this data into Unicode strings?
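
To make the question concrete, here is a minimal sketch of the translation 
meant above: turning \uXXXX escapes in an already extracted string into UTF-8 
bytes. It is not an Avro API. decodeUnicodeEscapes and the two helpers are 
hypothetical names, and the sketch handles only \u escapes (including 
surrogate pairs), copying every other escape through untouched.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one Unicode code point to out, encoded as UTF-8.
static void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Tries to parse "\uXXXX" starting at position i; on success stores the
// 16-bit value in unit and returns true.
static bool parseEscape(const std::string& s, std::size_t i, std::uint32_t& unit) {
    if (i + 6 > s.size() || s[i] != '\\' || s[i + 1] != 'u') return false;
    unit = 0;
    for (std::size_t k = i + 2; k < i + 6; ++k) {
        const char c = s[k];
        std::uint32_t digit;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return false;
        unit = (unit << 4) | digit;
    }
    return true;
}

// Hypothetical helper: replaces \uXXXX escapes with UTF-8 bytes.
std::string decodeUnicodeEscapes(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        std::uint32_t hi = 0, lo = 0;
        if (parseEscape(in, i, hi)) {
            i += 6;
            // Combine a high surrogate with the low surrogate that follows it.
            if (hi >= 0xD800 && hi <= 0xDBFF && parseEscape(in, i, lo)
                && lo >= 0xDC00 && lo <= 0xDFFF) {
                hi = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
                i += 6;
            }
            appendUtf8(out, hi);
        } else if (in[i] == '\\' && i + 1 < in.size()) {
            // Any other escape (such as an escaped backslash) is copied as is.
            out.push_back(in[i]);
            out.push_back(in[i + 1]);
            i += 2;
        } else {
            out.push_back(in[i++]);
        }
    }
    return out;
}
```

With this sketch, the six characters \u042a become the two UTF-8 bytes 0xD0 
0xAA, which is the Cyrillic letter Ъ.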
