antonsmetanin commented on PR #366:
URL: https://github.com/apache/avro-rs/pull/366#issuecomment-3677395843

   Thanks! I might have misunderstood how schema resolution is supposed to 
work, so the changes to `types.rs` could be wrong, but the rest should still 
hold up. I'll look into it a bit later, but as far as I can tell there are two 
steps to finding the correct type in the schema for an Avro datum:
   1. The tag byte from the datum is used to index into the union in the 
writer's schema. For example, if the writer's schema for a field is `[ "null", 
"A", "B", "string", "C" ]` and the tag is 2, it must pick type `B`.
   2. The type from the first step is used to find the corresponding type in 
the reader's schema. So if the reader's schema is `[ "null", "string", "D", 
"C", "B", "A" ]`, it will find `B` and the resolved index becomes 4. For 
primitive types the comparison is trivial, but complex types should match by 
name (or alias) first and then by structure, since the first matching 
condition for records in the spec states:
   > To match, one of the following must hold:
   >    both schemas are records with the same (unqualified) name
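
   The two steps above can be sketched like this. This is only a minimal 
illustration that compares union branches by name; real resolution would also 
have to check aliases and structure for complex types, and 
`resolve_union_index` is a hypothetical helper, not an existing avro-rs API:

   ```rust
   /// Hypothetical sketch of two-step union resolution.
   /// Step 1: the tag byte from the datum indexes the writer's union.
   /// Step 2: the selected writer branch is looked up in the reader's union.
   /// Returns the resolved index into the reader's union, if any.
   fn resolve_union_index(
       writer_union: &[&str],
       reader_union: &[&str],
       tag: usize,
   ) -> Option<usize> {
       // Step 1: pick the branch the writer actually encoded.
       let written_type = writer_union.get(tag)?;
       // Step 2: find that branch (here, by name only) in the reader's union.
       reader_union.iter().position(|t| t == written_type)
   }

   fn main() {
       let writer = ["null", "A", "B", "string", "C"];
       let reader = ["null", "string", "D", "C", "B", "A"];
       // Tag 2 selects "B" in the writer's union; "B" sits at index 4
       // in the reader's union, so resolution yields 4.
       assert_eq!(resolve_union_index(&writer, &reader, 2), Some(4));
   }
   ```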
   
   So according to this, the current implementation is still not correct, 
because it ignores the tag. What also confuses me is how the `resolve` function 
is supposed to be used in practice. I would expect it to accept both writer's 
and reader's schemas, but it only accepts one and [here in schema registry 
converter](https://github.com/gklijs/schema_registry_converter/blob/d2708eae106964c3d92207a59841fafd810c8e2c/src/avro_common.rs#L150)
 it's used with the writer's schema when encoding the value.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
