markap14 commented on a change in pull request #4282:
URL: https://github.com/apache/nifi/pull/4282#discussion_r427525012



##########
File path: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/queryrecord/FlowFileTable.java
##########
@@ -223,12 +225,69 @@ private RelDataType getRelDataType(final DataType fieldType, final JavaTypeFacto
             case BIGINT:
                 return typeFactory.createJavaType(BigInteger.class);
             case CHOICE:
+                final ChoiceDataType choiceDataType = (ChoiceDataType) fieldType;
+                DataType widestDataType = choiceDataType.getPossibleSubTypes().get(0);
+                for (final DataType possibleType : choiceDataType.getPossibleSubTypes()) {
+                    if (possibleType == widestDataType) {
+                        continue;
+                    }
+                    if (possibleType.getFieldType().isWiderThan(widestDataType.getFieldType())) {
+                        widestDataType = possibleType;
+                        continue;
+                    }
+                    if (widestDataType.getFieldType().isWiderThan(possibleType.getFieldType())) {
+                        continue;
+                    }
+
+                    // Neither is wider than the other.
+                    widestDataType = null;
+                    break;
+                }
+
+                // If one of the CHOICE data types is the widest, use it.
+                if (widestDataType != null) {
+                    return getRelDataType(widestDataType, typeFactory);
+                }
+
+                // None of the data types is strictly the widest. Check if all data types are numeric.
+                // This would happen, for instance, if the data type is a choice between float and integer.
+                // If that is the case, we can use a String type for the table schema because all values will fit
+                // into a String. This will still allow for casting, etc. if the query requires it.
+                boolean allNumeric = true;
+                for (final DataType possibleType : choiceDataType.getPossibleSubTypes()) {
+                    if (!isNumeric(possibleType)) {
+                        allNumeric = false;
+                        break;
+                    }
+                }
+
+                if (allNumeric) {

Review comment:
       As an example, consider a CSV like:
   ```
   name, other
   markap14, 48
   pcgrenier, computer
   ```
   In this case, the schema has a field named 'name' with type String. But the
   'other' field is a CHOICE[INT, STRING]. So what best represents that field in
   terms of Java objects? I'd say `Object.class`. The `String.class` being used
   here is honestly a bit of a hack, because Calcite doesn't give us a better way
   to represent Number - or, at least, not to my knowledge :)
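
To make the distinction concrete, here is a minimal, self-contained sketch of the fallback being discussed. It is not the code in this PR and does not use NiFi or Calcite types: `ChoiceTypeSketch` and `javaClassForChoice` are hypothetical names, and plain `java.lang.Class` objects stand in for the CHOICE sub-types. It assumes the "widest type" check has already failed to find a single winner.

```java
import java.util.List;

public class ChoiceTypeSketch {

    /**
     * Hypothetical helper: picks a Java class to represent a CHOICE whose
     * sub-types have no single widest member.
     */
    public static Class<?> javaClassForChoice(final List<? extends Class<?>> subTypes) {
        if (subTypes.isEmpty()) {
            // Nothing is known about the field, so use the most general type.
            return Object.class;
        }

        boolean allNumeric = true;
        for (final Class<?> subType : subTypes) {
            if (!Number.class.isAssignableFrom(subType)) {
                allNumeric = false;
                break;
            }
        }

        // All-numeric choices (e.g. int vs. float) fall back to String, since
        // every numeric value fits into a String and can still be cast by the query.
        // Anything else (e.g. int vs. String) is best described as Object.
        return allNumeric ? String.class : Object.class;
    }

    public static void main(final String[] args) {
        // The 'other' column from the CSV above: CHOICE[INT, STRING]
        System.out.println(javaClassForChoice(List.of(Integer.class, String.class)));
        // A purely numeric choice: CHOICE[INT, FLOAT]
        System.out.println(javaClassForChoice(List.of(Integer.class, Float.class)));
    }
}
```

With these stand-in types, the CHOICE[INT, STRING] case resolves to `Object.class` and the CHOICE[INT, FLOAT] case to `String.class`.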




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

