[
https://issues.apache.org/jira/browse/GOBBLIN-1726?focusedWorklogId=817839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-817839
]
ASF GitHub Bot logged work on GOBBLIN-1726:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 17/Oct/22 22:09
Start Date: 17/Oct/22 22:09
Worklog Time Spent: 10m
Work Description: srramach commented on code in PR #3581:
URL: https://github.com/apache/gobblin/pull/3581#discussion_r997451873
##########
gobblin-core-base/src/main/java/org/apache/gobblin/converter/filter/AvroSchemaFieldRemover.java:
##########
@@ -119,14 +121,17 @@ private Schema removeFieldsFromRecords(Schema schema,
Map<String, Schema> schema
if (!this.shouldRemove(field)) {
Field newField;
if (this.children.containsKey(field.name())) {
- newField = new Field(field.name(),
this.children.get(field.name()).removeFields(field.schema(), schemaMap),
- field.doc(), field.defaultValue());
+ newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+ this.children.get(field.name()).removeFields(field.schema(),
schemaMap),
+ field.doc(), AvroUtils.getCompatibleDefaultValue(field));
} else {
- newField = new Field(field.name(),
DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap), field.doc(),
- field.defaultValue());
+ newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+ DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap),
field.doc(),
+ AvroUtils.getCompatibleDefaultValue(field));
}
- for (Map.Entry<String, JsonNode> stringJsonNodeEntry :
field.getJsonProps().entrySet()) {
- newField.addProp(stringJsonNodeEntry.getKey(),
stringJsonNodeEntry.getValue());
+ // Avro 1.9 compatible change - replaced deprecated public api
getJsonProps with getObjectProps
+ for (Map.Entry<String, Object> objectEntry :
field.getObjectProps().entrySet()) {
Review Comment:
getObjectProps() doesn't exist in Avro 1.7.7, so this is not Avro version
agnostic. We did add a new API in AvroCompatibilityHelper to getAllPropNames(),
so you should be able to use the AvroCompatibilityHelper here too.
##########
gobblin-core/src/test/java/org/apache/gobblin/recordaccess/AvroGenericRecordAccessorTest.java:
##########
@@ -139,6 +139,11 @@ public void testGetStringArrayUtf8() throws IOException {
@Test
public void testGetMultiConvertsStrings() throws IOException {
+ // The below error is due to invalid avro data. Type with "null" union
must have "null" first and then
Review Comment:
This comment is not strictly accurate (as far as Avro is concerned). A union
with "null" doesn't _have_ to have null as the first type. It's usually
intended to signify optional-ness, and usually people do want to put null
first, so that they can set "null" as the default value, but it's not required.
What _is_ required (as per Avro) is that the default value must have the
same type as the first entry in the union. So, for example, this is correct:
```
"type": ["null", "string"],
"default": null
```
Whereas this is wrong:
```
// default is null, but must be a string, since that's the first type in the
union
"type": ["string", "null"],
"default": null
```
This is also wrong:
```
// default is a string ("null"), not null, which is the first type in the
union
"type": ["null", "string"],
"default": "null"
```
And this is totally correct, though usually rare to see:
```
"type": ["string", "null"]
"default": "foobar"
```
So what you really want to say in this comment is that since the default
value is `null`, you want to ensure that that's the first type in the union.
##########
gobblin-core-base/src/main/java/org/apache/gobblin/converter/filter/AvroSchemaFieldRemover.java:
##########
@@ -119,14 +121,17 @@ private Schema removeFieldsFromRecords(Schema schema,
Map<String, Schema> schema
if (!this.shouldRemove(field)) {
Field newField;
if (this.children.containsKey(field.name())) {
- newField = new Field(field.name(),
this.children.get(field.name()).removeFields(field.schema(), schemaMap),
- field.doc(), field.defaultValue());
+ newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+ this.children.get(field.name()).removeFields(field.schema(),
schemaMap),
+ field.doc(), AvroUtils.getCompatibleDefaultValue(field));
} else {
- newField = new Field(field.name(),
DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap), field.doc(),
- field.defaultValue());
+ newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+ DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap),
field.doc(),
+ AvroUtils.getCompatibleDefaultValue(field));
}
- for (Map.Entry<String, JsonNode> stringJsonNodeEntry :
field.getJsonProps().entrySet()) {
- newField.addProp(stringJsonNodeEntry.getKey(),
stringJsonNodeEntry.getValue());
+ // Avro 1.9 compatible change - replaced deprecated public api
getJsonProps with getObjectProps
+ for (Map.Entry<String, Object> objectEntry :
field.getObjectProps().entrySet()) {
+ newField.addProp(objectEntry.getKey(), objectEntry.getValue());
Review Comment:
Same for addProp(). Use the AvroCompatibilityHelper instead.
Issue Time Tracking
-------------------
Worklog Id: (was: 817839)
Time Spent: 20m (was: 10m)
> Upgrade Avro version supported in Gobblin to 1.9.2
> --------------------------------------------------
>
> Key: GOBBLIN-1726
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1726
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: William Lo
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Bumps avro version to 1.9.2
> Relies on a LinkedIn version of Hive in order to utilize
> AvroCompatibilityHelper as the version used in Gobblin is very out of date.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)