tpalfy commented on code in PR #6018:
URL: https://github.com/apache/nifi/pull/6018#discussion_r865854290
##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeReader.java:
##########
@@ -95,6 +98,18 @@ public class JsonTreeReader extends SchemaRegistryService
implements RecordReade
.dependsOn(STARTING_FIELD_STRATEGY,
StartingFieldStrategy.NESTED_FIELD.name())
.build();
+ public static final PropertyDescriptor SCHEMA_APPLICATION_STRATEGY = new
PropertyDescriptor.Builder()
+ .name("schema-application-strategy")
+ .displayName("Schema Application Strategy")
+ .description("Specifies whether the schema is defined for the
whole JSON or for the selected part starting from \"Starting Field Name\".")
+ .required(true)
+ .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
+ .defaultValue(SchemaApplicationStrategy.SELECTED_PART.getValue())
+ .dependsOn(STARTING_FIELD_STRATEGY,
StartingFieldStrategy.NESTED_FIELD.name())
+ .dependsOn(SCHEMA_ACCESS_STRATEGY, SCHEMA_NAME_PROPERTY,
SCHEMA_TEXT_PROPERTY, HWX_SCHEMA_REF_ATTRIBUTES, HWX_CONTENT_ENCODED_SCHEMA,
CONFLUENT_ENCODED_SCHEMA)
Review Comment:
Why do we need to depend on SCHEMA_ACCESS_STRATEGY?
##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeReader.java:
##########
@@ -95,6 +98,18 @@ public class JsonTreeReader extends SchemaRegistryService
implements RecordReade
.dependsOn(STARTING_FIELD_STRATEGY,
StartingFieldStrategy.NESTED_FIELD.name())
.build();
+ public static final PropertyDescriptor SCHEMA_APPLICATION_STRATEGY = new
PropertyDescriptor.Builder()
Review Comment:
Suggestion: I would call this SCHEMA_SCOPE. I feel "Application Strategy" is
a bit too technical.
##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/json/TestJsonTreeRowRecordReader.java:
##########
@@ -1128,29 +1140,46 @@ void testStartFromNestedFieldThenStartObject() throws
IOException, MalformedReco
}})
);
- testReadRecords(jsonPath, expectedRecordSchema, expected,
StartingFieldStrategy.NESTED_FIELD, "accounts");
+ testReadRecords(jsonPath, expectedRecordSchema, expected,
StartingFieldStrategy.NESTED_FIELD,
+ "accounts", SchemaApplicationStrategy.SELECTED_PART);
+ }
+
+ @Test
+ void testStartsFromNestedObjectWithWholeJsonSchemaApplicationStrategy()
throws IOException, MalformedRecordException {
+ String jsonPath = "src/test/resources/json/single-element-nested.json";
+
+ RecordSchema recordSchema = getSchema();
+
+ RecordSchema expectedRecordSchema = getAccountSchema();
+
+ List<Object> expected = Collections.singletonList(
+ new MapRecord(expectedRecordSchema, new HashMap<String,
Object>() {{
+ put("id", 42);
+ put("balance", 4750.89);
+ }})
+ );
+
+ testReadRecords(jsonPath, recordSchema, expected,
StartingFieldStrategy.NESTED_FIELD,
+ "account", SchemaApplicationStrategy.WHOLE_JSON);
+
}
Review Comment:
Unless the test data creation is really complicated, tests are easier to
understand if the data creation is inlined (not done via extracted methods).
Doing that, one can check the input, the metadata (schema in this case) and
the expected output at one place.
(Input is actually already extracted to file but I guess we can leave that.)
It also helps avoiding making errors/confusions like in this case: the full
schema is created with an "account" field and a "balance" field - even though
_there is no_ balance field in the data at the root level.
I would suggest this method done like this:
```java
@Test
void testStartsFromNestedObjectWithWholeJsonSchemaScope() throws
IOException, MalformedRecordException {
String jsonPath =
"src/test/resources/json/single-element-nested.json";
RecordSchema accountSchema = new SimpleRecordSchema(Arrays.asList(
new RecordField("id", RecordFieldType.INT.getDataType()),
new RecordField("balance", RecordFieldType.DOUBLE.getDataType())
));
RecordSchema recordSchema = new SimpleRecordSchema(Arrays.asList(
new RecordField("account",
RecordFieldType.RECORD.getRecordDataType(accountSchema))
));
RecordSchema expectedRecordSchema = accountSchema;
List<Object> expected = Collections.singletonList(
new MapRecord(expectedRecordSchema, new HashMap<String,
Object>() {{
put("id", 42);
put("balance", 4750.89);
}})
);
testReadRecords(jsonPath, recordSchema, expected,
StartingFieldStrategy.NESTED_FIELD,
"account", SchemaScope.WHOLE_JSON);
}
```
Also I would add a case where the nested field is an array:
```java
@Test
void testStartsFromNestedArrayWithWholeJsonSchemaScope() throws
IOException, MalformedRecordException {
String jsonPath =
"src/test/resources/json/single-element-nested-array.json";
RecordSchema accountSchema = new SimpleRecordSchema(Arrays.asList(
new RecordField("id", RecordFieldType.INT.getDataType()),
new RecordField("balance", RecordFieldType.DOUBLE.getDataType())
));
RecordSchema recordSchema = new SimpleRecordSchema(Arrays.asList(
new RecordField("accounts",
RecordFieldType.ARRAY.getArrayDataType(RecordFieldType.RECORD.getRecordDataType(accountSchema)))
));
RecordSchema expectedRecordSchema = accountSchema;
List<Object> expected = Arrays.asList(
new MapRecord(expectedRecordSchema, new HashMap<String,
Object>() {{
put("id", 42);
put("balance", 4750.89);
}}),
new MapRecord(expectedRecordSchema, new HashMap<String,
Object>() {{
put("id", 43);
put("balance", 48212.38);
}})
);
testReadRecords(jsonPath, recordSchema, expected,
StartingFieldStrategy.NESTED_FIELD,
"accounts", SchemaScope.WHOLE_JSON);
}
```
##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/SchemaApplicationStrategy.java:
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.json;
+
+import org.apache.nifi.components.DescribedValue;
+
+public enum SchemaApplicationStrategy implements DescribedValue {
+ WHOLE_JSON {
+ @Override
+ public String getDisplayName() {
+ return "Whole JSON";
+ }
+
+ @Override
+ public String getDescription() {
+ return "Applies the schema for the whole JSON.";
+ }
+ },
+ SELECTED_PART {
+ @Override
+ public String getDisplayName() {
+ return "Selected Part";
+ }
+
+ @Override
+ public String getDescription() {
+ return "Applies the schema for the selected part starting from the
\"Starting Field Name\".";
+ }
+ };
+
+ @Override
+ public String getValue() {
+ return name();
+ }
+}
Review Comment:
```suggestion
public enum SchemaApplicationStrategy implements DescribedValue {
WHOLE_JSON(
"Whole JSON",
"Applies the schema for the whole JSON."
),
SELECTED_PART(
"Selected Part",
"Applies the schema for the selected part starting from the
\"Starting Field Name\"."
);
private final String displayName;
private final String description;
SchemaApplicationStrategy(String displayName, String description) {
this.displayName = displayName;
this.description = description;
}
@Override
public String getDisplayName() {
return displayName;
}
@Override
public String getDescription() {
return description;
}
@Override
public String getValue() {
return name();
}
}
```
##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeRowRecordReader.java:
##########
@@ -48,26 +48,48 @@ public class JsonTreeRowRecordReader extends
AbstractJsonRowRecordReader {
public JsonTreeRowRecordReader(final InputStream in, final ComponentLog
logger, final RecordSchema schema,
Review Comment:
I see with [NIFI-9862](https://issues.apache.org/jira/browse/NIFI-9862) we
added independent constructors for both _JsonTreeReader_ and
_AbstractJsonRowRecordReader_.
That is ill-advised. A constructor hierarchy should have as little branching
as possible, preferably none.
I recommend the following changes:
**JsonTreeRowRecordReader**
```java
public JsonTreeRowRecordReader(final InputStream in, final ComponentLog
logger, final RecordSchema schema,
final String dateFormat, final String
timeFormat, final String timestampFormat) throws IOException,
MalformedRecordException {
this(in, logger, schema, dateFormat, timeFormat, timestampFormat,
null, null, null);
}
```
instead of
```java
public JsonTreeRowRecordReader(final InputStream in, final ComponentLog
logger, final RecordSchema schema,
final String dateFormat, final String
timeFormat, final String timestampFormat) throws IOException,
MalformedRecordException {
super(in, logger, dateFormat, timeFormat, timestampFormat);
this.schema = schema;
}
```
**AbstractJsonRowRecordReader**
```java
protected AbstractJsonRowRecordReader(final InputStream in, final
ComponentLog logger, final String dateFormat, final String timeFormat, final
String timestampFormat)
throws IOException, MalformedRecordException {
this(in, logger, dateFormat, timeFormat, timestampFormat, null,
null);
}
```
instead of
```java
protected AbstractJsonRowRecordReader(final InputStream in, final
ComponentLog logger, final String dateFormat, final String timeFormat, final
String timestampFormat)
throws IOException, MalformedRecordException {
this(logger, dateFormat, timeFormat, timestampFormat);
try {
jsonParser = jsonFactory.createParser(in);
jsonParser.setCodec(codec);
JsonToken token = jsonParser.nextToken();
if (token == JsonToken.START_ARRAY) {
token = jsonParser.nextToken(); // advance to START_OBJECT
token
}
if (token == JsonToken.START_OBJECT) { // could be END_ARRAY also
firstJsonNode = jsonParser.readValueAsTree();
} else {
firstJsonNode = null;
}
} catch (final JsonParseException e) {
throw new MalformedRecordException("Could not parse data as
JSON", e);
}
}
```
##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/StartingFieldStrategy.java:
##########
@@ -16,23 +16,33 @@
*/
package org.apache.nifi.json;
-public enum StartingFieldStrategy {
- ROOT_NODE("Root Node", "Begins processing from the root node."),
- NESTED_FIELD("Nested Field", "Skips forward to the given nested JSON field
(array or object) to begin processing.");
+import org.apache.nifi.components.DescribedValue;
- private final String displayName;
- private final String description;
+public enum StartingFieldStrategy implements DescribedValue {
Review Comment:
```suggestion
public enum StartingFieldStrategy implements DescribedValue {
ROOT_NODE("Root Node", "Begins processing from the root node."),
NESTED_FIELD("Nested Field", "Skips forward to the given nested JSON
field (array or object) to begin processing.");
private final String displayName;
private final String description;
StartingFieldStrategy(final String displayName, final String
description) {
this.displayName = displayName;
this.description = description;
}
public String getDisplayName() {
return displayName;
}
public String getDescription() {
return description;
}
@Override
public String getValue() {
return name();
}
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]