paul-rogers commented on a change in pull request #1954: DRILL-7509: Incorrect
TupleSchema is created for DICT column when querying Parquet files
URL: https://github.com/apache/drill/pull/1954#discussion_r366597636
##########
File path:
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/AbstractColumnMetadata.java
##########
@@ -306,7 +306,7 @@ public String columnString() {
builder.append(typeString());
// Drill does not have nullability notion for complex types
- if (!isNullable() && !isArray() && !isMap()) {
+ if (!isNullable() && !isArray() && !isMap() && !isDict()) {
Review comment:
Bad "code smell". The base class is enumerating all the ways that something
can be not null. Not even sure this makes sense. Repeated items (arrays) in
Drill are non-nullable. But, LISTs, oddly, are nullable. Is a List an Array? A
Map is not nullable, but a union with a map element is nullable. A Dict is not
nullable, but UNION or LIST of Dict is.
So, why is it that we have to check both if the type is not nullable and
that it is not an array (or map or DICT?) It should be that, if we say that a
DICT, say, is not nullable, then `isNullable()` should always be false for that
type, or we have a logic error.
Note: a while back I tried adding unit tests to this class (still need to
issue that PR) and found that the semantics are getting pretty muddy: very hard
to figure out what we're trying to do. Not in scope for this PR, but we really
need to clean up our type semantics.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services