ahmedahamid commented on a change in pull request #2792: [GOBBLIN-933] add support for array of unions in json schema URL: https://github.com/apache/incubator-gobblin/pull/2792#discussion_r341779189
########## File path: gobblin-core/src/test/resources/converter/jsonToAvroRecord.json ########## @@ -1,4 +1,6 @@ -{"nullableField": null, +{ + "arrayFieldWithUnion": ["arrU1", "arrU2", "arrU3"], Review comment: Just following up. I tried adding a `null` value to this array but the `JsonRecordAvroSchemaToAvroConverterTest.testConverter` failed: ```java Could not convert field arrayFieldWithUnion org.apache.gobblin.converter.DataConversionException: Could not convert field arrayFieldWithUnion at org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverter.convertNestedRecord(JsonRecordAvroSchemaToAvroConverter.java:127) at org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverter.convertRecord(JsonRecordAvroSchemaToAvroConverter.java:71) at org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverterTest.testConverter(JsonRecordAvroSchemaToAvroConverterTest.java:69) ... Caused by: java.lang.RuntimeException: Field: arrayFieldWithUnion is not nullable and contains a null value at org.apache.gobblin.converter.avro.JsonElementConversionFactory$JsonElementConverter.convert(JsonElementConversionFactory.java:278) at org.apache.gobblin.converter.avro.JsonElementConversionWithAvroSchemaFactory$ArrayConverter.convertField(JsonElementConversionWithAvroSchemaFactory.java:98) at org.apache.gobblin.converter.avro.JsonElementConversionFactory$JsonElementConverter.convert(JsonElementConversionFactory.java:280) at org.apache.gobblin.converter.avro.JsonRecordAvroSchemaToAvroConverter.convertNestedRecord(JsonRecordAvroSchemaToAvroConverter.java:125) ... 51 more ``` I think the key issue here is that `UnionConverter` does not determine its nullability correctly; it uses the nullability it receives from `JsonElementConversionWithAvroSchemaFactory.getConvertor()` to decide its own nullability. This is incorrect because `arrayFieldWithUnion` is a non-nullable array. However, the union itself (i.e. `UnionConverter`) would return `false` if you invoke `isNullable()` on it even though `["string", "null"]` _is_ nullable (because it is construing `arrayFieldWithUnion`'s non-nullability as its own (see the values `UnionConverter` provides to `super`). One possible fix could be to have `UnionConverter` override `isNullable()` so it can decide its own nullability based on its schemas, e.g. ```java @Override public boolean isNullable() { return this.firstSchema.getType() == Schema.Type.NULL || this.secondSchema.getType() == Schema.Type.NULL; } ``` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services