I am noticing a difference between Java and C# versions of Avro when I call 
Schema.ToString().

There are two differences.  First, the C# version adds the namespace to each 
named schema.  Second, the order of the attributes in the output differs (for 
example, "doc" is written before "type").  I would expect every language 
implementation to output the same JSON string.

To produce the output below, I took a schema JSON string and called 
Schema.Parse(string json) in C# and 
Schema.parse(String jsonSchema, boolean validate) in Java.

Original Schema string
{"type":"record","name":"TestRecord","namespace":"test.namespace","fields":[{"name":"testName","type":{"type":"record","name":"TestData","fields":[{"name":"version","type":"float","doc":"version number of this schema"}]}}]}

C# output of Schema.ToString()
{"type":"record","name":"TestRecord","namespace":"test.namespace","fields":[{"name":"testName","type":{"type":"record","name":"TestData","namespace":"test.namespace","fields":[{"name":"version","doc":"version number of this schema","type":"float"}]}}]}

Java output of Schema.toString
{"type":"record","name":"TestRecord","namespace":"test.namespace","fields":[{"name":"testName","type":{"type":"record","name":"TestData","fields":[{"name":"version","type":"float","doc":"version number of this schema"}]}}]}
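
For illustration, the two rules the Java output appears to follow in this 
example (omit a nested schema's namespace when it equals the enclosing one, and 
write field attributes as name, type, doc) can be sketched as below.  This is a 
hypothetical, self-contained Java sketch, not Avro's actual serializer: the 
Record and Field classes and the toJson and sample methods are invented for 
this example, and JSON string escaping is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class SchemaJsonSketch {

    /** Hypothetical stand-in for a record schema; not an Avro class. */
    public static final class Record {
        final String name;
        final String namespace;
        final List<Field> fields = new ArrayList<>();

        Record(String name, String namespace) {
            this.name = name;
            this.namespace = namespace;
        }
    }

    /** Hypothetical field: either a primitive type name or a nested record. */
    public static final class Field {
        final String name;
        final String primitive; // e.g. "float"; null when record is set
        final Record record;    // nested record; null when primitive is set
        final String doc;       // optional, may be null

        Field(String name, String primitive, Record record, String doc) {
            this.name = name;
            this.primitive = primitive;
            this.record = record;
            this.doc = doc;
        }
    }

    /** Writes a record, skipping a namespace equal to the enclosing one. */
    public static String toJson(Record r, String enclosingNamespace) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"type\":\"record\",\"name\":\"").append(r.name).append('"');
        if (r.namespace != null && !r.namespace.equals(enclosingNamespace)) {
            sb.append(",\"namespace\":\"").append(r.namespace).append('"');
        }
        sb.append(",\"fields\":[");
        for (int i = 0; i < r.fields.size(); i++) {
            Field f = r.fields.get(i);
            if (i > 0) {
                sb.append(',');
            }
            // Attribute order: name, then type, then doc.
            sb.append("{\"name\":\"").append(f.name).append("\",\"type\":");
            if (f.record != null) {
                sb.append(toJson(f.record, r.namespace));
            } else {
                sb.append('"').append(f.primitive).append('"');
            }
            if (f.doc != null) {
                sb.append(",\"doc\":\"").append(f.doc).append('"');
            }
            sb.append('}');
        }
        return sb.append("]}").toString();
    }

    /** Builds the TestRecord/TestData schema from this email. */
    public static Record sample() {
        Record inner = new Record("TestData", "test.namespace");
        inner.fields.add(new Field("version", "float", null,
                "version number of this schema"));
        Record outer = new Record("TestRecord", "test.namespace");
        outer.fields.add(new Field("testName", null, inner, null));
        return outer;
    }

    public static void main(String[] args) {
        System.out.println(toJson(sample(), null));
    }
}
```

Run against the sample schema, this writes the namespace once, on TestRecord, 
and leaves it off the nested TestData record.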

It is not overly complicated to make the C# version match the Java version, but 
to maintain backwards compatibility while supporting the new output, we will 
need to create a Schema.ToJsonString method and update the WriteJson* methods 
to support the new flow.  Ideally we mark ToString() obsolete with a message to 
use the ToJsonString method, and eventually point ToString() at the 
ToJsonString method.

While this work is not complicated, it is a lot of work and testing.  I 
personally see value in having the output be the same (I work in a mixed 
technology environment), but I want to address any concerns with this sort of 
change.

Thanks,
Kyle T. Schoonover
