PookieBuns commented on issue #383:
URL: https://github.com/apache/avro-rs/issues/383#issuecomment-4147374006
Hi guys. I noticed another issue, specifically with unions during schema
resolution:
```
fn incorrect_schema_resolution_with_reader() {
    use apache_avro::{
        Schema, from_avro_datum, schema_compatibility::SchemaCompatibility, to_avro_datum,
        types::Value,
    };
    use std::io::Cursor;

    // Union of two records: every field of `foo` has a default, `bar` has none.
    let schema_str = r#"
    [
        {
            "type": "record",
            "name": "foo",
            "fields": [
                {"name": "a", "type": "string", "default": "a"},
                {"name": "b", "type": "string", "default": "b"},
                {"name": "c", "type": "string", "default": "c"}
            ]
        },
        {
            "type": "record",
            "name": "bar",
            "fields": [
                {"name": "d", "type": "string"}
            ]
        }
    ]
    "#;
    let schema = Schema::parse_str(schema_str).unwrap();
    SchemaCompatibility::can_read(&schema, &schema).unwrap();

    // A `bar` record, i.e. the second branch (index 1) of the union.
    let value = Value::Union(
        1,
        Box::new(Value::Record(vec![(
            "d".to_string(),
            Value::String("d".to_string()),
        )])),
    );
    let encoded = to_avro_datum(&schema, value.clone()).unwrap();

    // Without a reader schema the value round-trips unchanged.
    let without_reader =
        from_avro_datum(&schema, &mut Cursor::new(encoded.clone()), None).unwrap();
    assert_eq!(value, without_reader);

    // With the same schema as reader schema, the value is resolved against
    // `foo` instead of `bar`, so this assertion fails.
    let with_reader =
        from_avro_datum(&schema, &mut Cursor::new(encoded), Some(&schema)).unwrap();
    assert_eq!(value, with_reader);
}
```
Here, if you pass a reader's schema, the decoded value ends up converted into
record `foo` because of how union resolution is currently implemented:
https://github.com/apache/avro-rs/blob/48dfe5b3576944b7c6936981759df58e779d93f7/avro/src/schema/union.rs#L87
This function currently succeeds immediately on the first branch for any
`Value`, because every field of `foo` has a default. I haven't looked deeply
into how Java avoids this (maybe somewhere in
https://github.com/apache/avro/blob/main/lang/java/avro/src/main/java/org/apache/avro/Resolver.java#L722),
but I will take a deeper dive tomorrow and see whether the solution proposed
above (attaching the writer schema during schema resolution) can address it.
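
To make the failure mode concrete, here is a toy sketch of the two strategies using only the standard library. It is not the avro-rs implementation: `RecordSchema`, `resolve_record`, `resolve_first_match`, `resolve_with_writer` and `union_schema` are all hypothetical names, and records are simplified to string fields. It assumes the Avro specification's rule that record schemas match by name, so knowing the writer's branch lets the reader pick `bar` instead of the first branch that happens to resolve.

```rust
use std::collections::BTreeMap;

/// Toy record schema (hypothetical): a name plus (field name, optional default).
struct RecordSchema {
    name: &'static str,
    fields: Vec<(&'static str, Option<&'static str>)>,
}

type RecordValue = BTreeMap<String, String>;

/// Resolve a value against one record schema: missing fields are filled
/// from defaults, fields unknown to the schema are dropped.
/// Returns `None` when a field is missing and has no default.
fn resolve_record(schema: &RecordSchema, value: &RecordValue) -> Option<RecordValue> {
    let mut out = RecordValue::new();
    for (field, default) in &schema.fields {
        let resolved = value
            .get(*field)
            .cloned()
            .or_else(|| default.map(|d| d.to_string()))?;
        out.insert(field.to_string(), resolved);
    }
    Some(out)
}

/// Value-only strategy: take the first branch the value resolves against.
fn resolve_first_match(
    branches: &[RecordSchema],
    value: &RecordValue,
) -> Option<(usize, RecordValue)> {
    branches
        .iter()
        .enumerate()
        .find_map(|(i, branch)| resolve_record(branch, value).map(|v| (i, v)))
}

/// Writer-aware strategy: prefer the reader branch whose name matches the
/// writer's record name, and only then fall back to first match.
fn resolve_with_writer(
    branches: &[RecordSchema],
    writer_name: &str,
    value: &RecordValue,
) -> Option<(usize, RecordValue)> {
    match branches.iter().position(|b| b.name == writer_name) {
        Some(i) => resolve_record(&branches[i], value).map(|v| (i, v)),
        None => resolve_first_match(branches, value),
    }
}

/// The union from the reproduction above, in toy form.
fn union_schema() -> Vec<RecordSchema> {
    vec![
        RecordSchema {
            name: "foo",
            fields: vec![("a", Some("a")), ("b", Some("b")), ("c", Some("c"))],
        },
        RecordSchema {
            name: "bar",
            fields: vec![("d", None)],
        },
    ]
}
```

For a value with the single field `d`, `resolve_first_match` picks branch 0 and returns a `foo` built entirely from defaults, while `resolve_with_writer(.., "bar", ..)` picks branch 1 and returns the value unchanged.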