Roman Mitasov created AVRO-3966:
-----------------------------------
Summary: toString generates poorly formatted defaults for bytes
Key: AVRO-3966
URL: https://issues.apache.org/jira/browse/AVRO-3966
Project: Apache Avro
Issue Type: Bug
Reporter: Roman Mitasov
Schema#toString and Protocol#toString both generate default values for "bytes"
and "fixed" types.
According to the docs:
{quote}Default values for bytes and fixed fields are JSON strings, where
Unicode code points 0-255 are mapped to unsigned 8-bit byte values 0-255. Avro
encodes a field even if its value is equal to its default.
{quote}
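For illustration, the mapping described in the quote behaves like an ISO-8859-1 round trip. Below is a minimal sketch using only the JDK; the class and method names are hypothetical and this is not Avro's own code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesDefaultMapping {
    // Decode a JSON-string default into raw bytes: each code point 0-255
    // becomes one byte. Characters above 255 cannot be represented and
    // would be replaced by '?' by this charset encoder.
    public static byte[] toBytes(String jsonDefault) {
        return jsonDefault.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Inverse direction: each unsigned byte value becomes one code point.
    public static String toDefaultString(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Code points 9, 32 and 255, written without unicode escapes.
        String defaultValue = new String(new char[] {9, 32, 255});
        byte[] bytes = toBytes(defaultValue);
        System.out.println(Arrays.toString(bytes)); // [9, 32, -1]
    }
}
```

The last byte prints as -1 because Java bytes are signed; it is the unsigned value 255.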
Take the following schema:
{code:json}
{
  "type" : "record",
  "name" : "TestRecord",
  "fields" : [ {
    "name" : "testFixed",
    "type" : {
      "type" : "fixed",
      "name" : "Code",
      "size" : 3
    },
    "default" : "\u0009\u0020\u00FF"
  } ]
}{code}
If this schema is parsed and then encoded to JSON again, the "default" value
becomes {{"\t ÿ"}}.
This happens because the `toString` implementations use `JsonGenerator` with its
default escaping configuration, which leaves non-ASCII characters unescaped.
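As a rough sketch of the output I would expect instead, the hypothetical helper below quotes a default so that everything outside printable ASCII is written as a six-character unicode escape. It uses only the JDK and is neither Avro's nor Jackson's implementation; the class and method names are made up for illustration.

```java
public class AsciiJsonEscaper {
    // Quote a string as a JSON string literal. Quotes and backslashes get a
    // backslash prefix; every character outside printable ASCII (below 0x20
    // or above 0x7E) is written as a backslash, 'u' and four hex digits.
    public static String quote(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20 || c > 0x7E) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        // Same three code points as the schema default: 9, 32 and 255.
        System.out.println(quote(new String(new char[] {9, 32, 255})));
    }
}
```

For the default from the schema above this prints {{"\u0009 \u00FF"}}: the space stays literal because it is printable ASCII. Jackson also has a {{JsonGenerator.Feature.ESCAPE_NON_ASCII}} switch that might cover the non-ASCII part; I have not checked whether it fits the `toString` code paths.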