antonsmetanin commented on PR #366: URL: https://github.com/apache/avro-rs/pull/366#issuecomment-3677395843
Thanks! I might have misunderstood how schema resolution is supposed to work, so the changes to `types.rs` could be wrong, but the rest should still hold up. I'll look into it a bit later, but as far as I can tell there are two steps to finding the correct type in the schema based on the Avro datum:

1. The tag byte from the data is used to index the union in the *writer's* schema. For example, if the writer's schema for a field is defined as `[ "null", "A", "B", "string", "C" ]` and the tag is 2, it must pick type `B`.
2. The type from the first step is used to find a corresponding type in the *reader's* schema. So if the reader's schema is `[ "null", "string", "D", "C", "B", "A" ]`, it will find `B`, and the resulting index after resolution becomes 4.

For primitive types the comparison is trivial, but complex types should match by name (or alias) first and then by structure, since the first matching condition for records in the spec states:

> To match, one of the following must hold:
> both schemas are records with the same (unqualified) name

So according to this, the current implementation is still not correct, because it ignores the tag.

What also confuses me is how the `resolve` function is supposed to be used in practice. I would expect it to accept both the writer's and the reader's schemas, but it only accepts one, and [here in schema registry converter](https://github.com/gklijs/schema_registry_converter/blob/d2708eae106964c3d92207a59841fafd810c8e2c/src/avro_common.rs#L150) it's used with the writer's schema when encoding the value.
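To make the two steps concrete, here is a minimal sketch of the union-resolution logic described above. This is not the `apache-avro` API: schemas are modeled as plain name strings and matching is by name only, whereas a real implementation would also compare structure and consult aliases. The function name `resolve_union_index` is made up for illustration.

```rust
/// Step 1: the tag byte from the datum indexes the *writer's* union.
/// Step 2: the selected type is looked up in the *reader's* union,
/// yielding the index to use after resolution.
fn resolve_union_index(
    writer_union: &[&str],
    reader_union: &[&str],
    tag: usize,
) -> Option<usize> {
    // Step 1: pick the written type from the writer's union by tag.
    let written_type = writer_union.get(tag)?;
    // Step 2: find the matching type in the reader's union by name.
    reader_union.iter().position(|t| t == written_type)
}

fn main() {
    let writer = ["null", "A", "B", "string", "C"];
    let reader = ["null", "string", "D", "C", "B", "A"];
    // Tag 2 selects "B" from the writer's union (step 1),
    // which sits at index 4 in the reader's union (step 2).
    assert_eq!(resolve_union_index(&writer, &reader, 2), Some(4));
}
```

The current implementation, as I read it, effectively skips step 1 and matches against the reader's union directly, which is why ignoring the tag gives wrong results when the two unions order their branches differently.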
