wmoustafa commented on code in PR #4242:
URL: https://github.com/apache/iceberg/pull/4242#discussion_r877311220
##########
core/src/main/java/org/apache/iceberg/avro/AvroSchemaWithTypeVisitor.java:
##########
@@ -79,11 +79,29 @@ private static <T> T visitRecord(Types.StructType struct,
Schema record, AvroSch
private static <T> T visitUnion(Type type, Schema union,
AvroSchemaWithTypeVisitor<T> visitor) {
List<Schema> types = union.getTypes();
List<T> options = Lists.newArrayListWithExpectedSize(types.size());
- for (Schema branch : types) {
- if (branch.getType() == Schema.Type.NULL) {
- options.add(visit((Type) null, branch, visitor));
- } else {
- options.add(visit(type, branch, visitor));
+
+ // simple union case
+ if (AvroSchemaUtil.isOptionSchema(union)) {
+ for (Schema branch : types) {
+ if (branch.getType() == Schema.Type.NULL) {
+ options.add(visit((Type) null, branch, visitor));
+ } else {
+ options.add(visit(type, branch, visitor));
+ }
+ }
+ } else { // complex union case
+ Preconditions.checkArgument(type instanceof Types.StructType,
+ "Cannot visit invalid Iceberg type: %s for Avro complex union type:
%s", type, union);
Review Comment:
What about the second option, where we expect them to be in the same order?
(see this [PR](https://github.com/linkedin/iceberg/pull/108) to support missing
fields in the case of projection pruning)? This approach will also make sure
that the deep types also match.
For the above suggestion, I am a bit worried about using only the top level
types since it will fail in unexpected places and could lead to cryptic error
messages if things go wrong. Also, can we list the whole type hierarchy
instead?
That said, does it need to be retrofit into name mapping? I feel we could
implement it without extending the name mapping as long as we have functions to
deeply compare types.
I think as a starting point, we could assume they must align in terms of
order (but do not necessarily be the same, as we could skip some in the
struct), and later we can implement deep type comparisons, which will relax the
ordering expectation (which is also forward compatible with the type-based
check).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]