[ https://issues.apache.org/jira/browse/AVRO-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17880610#comment-17880610 ]
Santiago Fraire Willemoes commented on AVRO-4055:
-------------------------------------------------
I will update the issue; sorry, my mind has been all over the place.
The main issue is the *nesting of "records"*.
A normal "record" looks like this (notice that "type" and "fields" are on the
same depth):
{code}
{
"type": "record",
"name": "LongList",
"aliases": ["LinkedLongs"], // old name for this
"fields" : [
{"name": "value", "type": "long"}, // each element has a long
{"name": "next", "type": ["null", "LongList"]} // optional next element
]
}
{code}
But when nesting records, it's different (notice that this time the field's
"type" is an object that contains its own "type" and "fields"):
{code}
{
"type": "record",
"name": "SampleSchema",
"fields": [
{
"name": "order",
"type": {
"type": "record",
"name": "Order",
"fields": [
{
"name": "order_number",
"type": ["null", "string"],
"default": null
},
{ "name": "order_date", "type": "string" }
]
}
}
]
}
{code}
I couldn't find any explanation in the spec as to why this is the case. In the
ticket I opened, the faulty schema has "type" and "fields" at the same depth
inside the field.
Now, before proceeding with a PR, I think the bug should be confirmed: either
this is a valid schema that Rust parses correctly and Java does not, or it's a
bug in Rust. In any case, IMO the libraries should behave the same.
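To make the difference between the two shapes concrete, here is a minimal sketch in Rust of how I understand the name resolution that the Java error message suggests. `TypeExpr`, `Field`, `check`, `nested_schema` and `flat_schema` are hypothetical names made up for illustration; none of this is the apache_avro or Java API. It assumes that a bare string in a field's "type" is read as either a primitive or a reference to an already defined named type, so "record" on its own resolves to nothing.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the JSON forms a field's "type" can take.
/// Hypothetical model for illustration only; not the apache_avro API.
#[derive(Debug)]
enum TypeExpr {
    /// A bare string: a primitive name or a reference to a named type.
    Name(String),
    /// A JSON array: a union of type expressions.
    Union(Vec<TypeExpr>),
    /// A JSON object of the form {"type": "record", "name": ..., "fields": [...]}.
    Record { name: String, fields: Vec<Field> },
}

#[derive(Debug)]
struct Field {
    name: String,
    ty: TypeExpr,
}

const PRIMITIVES: [&str; 8] = [
    "null", "boolean", "int", "long", "float", "double", "bytes", "string",
];

/// Fails when a bare name is neither a primitive nor an already defined named type.
fn check(ty: &TypeExpr, field_name: &str, defined: &mut HashSet<String>) -> Result<(), String> {
    match ty {
        TypeExpr::Name(n) => {
            if PRIMITIVES.contains(&n.as_str()) || defined.contains(n) {
                Ok(())
            } else {
                Err(format!("\"{}\" is not a defined name (field \"{}\")", n, field_name))
            }
        }
        TypeExpr::Union(branches) => {
            for branch in branches {
                check(branch, field_name, defined)?;
            }
            Ok(())
        }
        TypeExpr::Record { name, fields } => {
            // Register the name first so recursive references (like LongList) resolve.
            defined.insert(name.clone());
            for field in fields {
                check(&field.ty, &field.name, defined)?;
            }
            Ok(())
        }
    }
}

/// The two fields of "Order", as in the nested sample.
fn order_fields() -> Vec<Field> {
    vec![
        Field {
            name: "order_number".into(),
            ty: TypeExpr::Union(vec![
                TypeExpr::Name("null".into()),
                TypeExpr::Name("string".into()),
            ]),
        },
        Field { name: "order_date".into(), ty: TypeExpr::Name("string".into()) },
    ]
}

/// Nested shape: the field's "type" is itself a record object.
fn nested_schema() -> TypeExpr {
    TypeExpr::Record {
        name: "SampleSchema".into(),
        fields: vec![Field {
            name: "order".into(),
            ty: TypeExpr::Record { name: "Order".into(), fields: order_fields() },
        }],
    }
}

/// Shape from the ticket: "type": "record" is a bare string on the field.
/// The sibling "fields" key has no place in this model, so it is left out.
fn flat_schema() -> TypeExpr {
    TypeExpr::Record {
        name: "SampleSchema".into(),
        fields: vec![Field { name: "order".into(), ty: TypeExpr::Name("record".into()) }],
    }
}
```

Under this reading, calling `check` on `nested_schema()` with an empty set succeeds, while `flat_schema()` fails on the "order" field because "record" is neither a primitive nor a defined name. Whether the Rust library should apply the same rule is exactly what needs confirming.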
Regarding the avro-tools error, my user got this:
{code}
Exception in thread "main" org.apache.avro.SchemaParseException: "record" is
not a defined name. The type of the "order" field must be a defined name or a
{"type": ...} expression.
at org.apache.avro.Schema.parse(Schema.java:1734)
at org.apache.avro.Schema$Parser.parse(Schema.java:1471)
at org.apache.avro.Schema$Parser.parse(Schema.java:1433)
at org.apache.avro.tool.SpecificCompilerTool.run(SpecificCompilerTool.java:154)
at org.apache.avro.tool.Main.run(Main.java:67)
at org.apache.avro.tool.Main.main(Main.java:56)
{code}
> [rust] schema parsing invalid
> -----------------------------
>
> Key: AVRO-4055
> URL: https://issues.apache.org/jira/browse/AVRO-4055
> Project: Apache Avro
> Issue Type: Bug
> Components: rust
> Reporter: Santiago Fraire Willemoes
> Priority: Major
> Labels: rust
>
> A schema like this should be rejected as invalid, but the Rust library parses
> it without raising errors:
> {code}
> {
> "type": "record",
> "name": "SampleSchema",
> "fields": [
> {
> "name": "order",
> "type": "record",
> "fields": [
> {
> "name": "order_number",
> "type": ["null", "string"],
> "default": null
> },
> { "name": "order_date", "type": "string" }
> ]
> }
> ]
> }
> {code}
> Sample code:
> {code}
> use apache_avro::Schema;
>
> fn main() {
>     let raw_schema = r#"
>     {
>       "type": "record",
>       "name": "SampleSchema",
>       "fields": [
>         {
>           "name": "order",
>           "type": "record",
>           "fields": [
>             {
>               "name": "order_number",
>               "type": ["null", "string"],
>               "default": null
>             },
>             { "name": "order_date", "type": "string" }
>           ]
>         }
>       ]
>     }
>     "#;
>     // if the schema is not valid, this function will return an error
>     let schema = Schema::parse_str(raw_schema).unwrap();
>     // schemas can be printed for debugging
>     println!("{:?}", schema);
> }
> {code}
> Why is this important? Other tools, such as the Java ones, are not able to
> parse this schema, which makes compatibility between languages harder.
> We've had issues using `avro-tools` to build the jars.
> I can try to fix it, let me know if you want me to send a PR.
> Regards