KitFieldhouse commented on PR #467: URL: https://github.com/apache/avro-rs/pull/467#issuecomment-4014871627
A few concerns about this change: **1. `default` field in map and array example was removed in the spec** [This jira issue ](https://issues.apache.org/jira/browse/AVRO-3155?) points out that the `default` field was added to map and array examples in the specification starting from 1.10. The [associated PR](https://github.com/apache/avro/pull/3046) indicates that this is a mistake in the specification and fixes it for versions starting at 1.12.1. **2. Interoperability regression** The Avro specification states "Attributes not defined in this document are permitted as metadata, but must not affect the format of serialized data". The other implementations I have tested (Java, Python) treat `"default"` on array and map type definitions as an unrecognized attribute — they store it as opaque metadata and do not validate it against the schema. This PR promotes `"default"` to a validated, parsed field on `ArraySchema` and `MapSchema`. Because it validates the value against the item/value schema, schemata that parse successfully in other implementations will now fail to parse in Rust. For example, in the python implementation, the below parses successfully and shows "default" as being considered an "other_prop". ```python import avro.schema import json # Array with a "default" that wouldn't even validate against the items schema array_schema = avro.schema.parse(json.dumps({ "type": "array", "items": "string", "default": 12345 # intentionally invalid — not an array of strings })) print("Array props: ", dict(array_schema.props)) print("Array other_props: ", dict(array_schema.other_props)) # Map with a default map_schema = avro.schema.parse(json.dumps({ "type": "map", "values": "long", "default": {"key": "not-a-long"} # also intentionally invalid })) print("Map props: ", dict(map_schema.props)) print("Map other_props: ", dict(map_schema.other_props)) ``` I believe this is an edge case, but it does cause a schism between this implementation and the others. **3. No semantic effect** This PR does not appear to modify the encoding or decoding paths and has no effect on data serialization or schema resolution. A user who sets a default on an array type might reasonably expect it to behave like record field defaults during schema evolution, but it does not. The spec has never defined semantics for type-level defaults on arrays or maps. The `"default"` examples that appeared in the 1.10.0–1.12.0 spec were, as mentioned above, identified as a documentation error (AVRO-3155) and removed. **Suggested alternative** I believe the pre-PR version actually was the correct implementation given the behavior of other implementations and the spec, so I would vote to revert this change, or at minimum stop validating the default value against the schema so that it passes through. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
