Daniel Becker created IMPALA-12926:
--------------------------------------

             Summary: Remove AuxColumnType and add 'is_binary' field to 
ColumnType
                 Key: IMPALA-12926
                 URL: https://issues.apache.org/jira/browse/IMPALA-12926
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Daniel Becker
            Assignee: Daniel Becker


Currently the STRING and BINARY types are not distinguished in most of the 
backend. In contrast to the frontend, PrimitiveType::TYPE_BINARY is not used 
there at all, TYPE_STRING being used instead. This is to ensure that everything 
that works for STRING also works for BINARY. So far only file readers and 
writers have had to handle them differently, and they have access to 
ColumnDescriptors which contain AuxColumnType fields that differentiate these 
two types.

However, only top-level columns have ColumnDescriptors. Adding support for 
BINARYs within complex types (see IMPALA-11491 and IMPALA-12651) necessitates 
adding type information about STRING vs BINARY to embedded fields as well.

Using PrimitiveType::TYPE_BINARY would probably be the cleanest solution, but 
it would affect huge parts of the code: TYPE_BINARY would have to be added to 
hundreds of switch statements, which would be error-prone.

Instead, we should introduce a new field in ColumnType: 'is_binary', which is 
true if the type is a BINARY and false otherwise. We keep using TYPE_STRING as 
the PrimitiveType of the ColumnType for BINARYs. This way full type information 
is present in ColumnType but code that does not differentiate between STRING 
and BINARY will continue to work for BINARY.
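
A minimal sketch of the idea follows. It uses simplified stand-in types, not 
Impala's actual definitions: the reduced PrimitiveType enum, the constructor, 
the String()/Binary()/Struct() factories and the helper functions IsVarLen, 
FileTypeName and ContainsBinary are all hypothetical names chosen for 
illustration. Only the 'is_binary' field and the continued use of TYPE_STRING 
for BINARY come from the proposal above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Reduced stand-in for the backend's PrimitiveType (illustration only).
// Note that there is deliberately no TYPE_BINARY value.
enum PrimitiveType { TYPE_INT, TYPE_STRING, TYPE_STRUCT };

// Simplified stand-in for ColumnType, carrying the proposed 'is_binary' flag.
struct ColumnType {
  PrimitiveType type;
  bool is_binary;                    // true only for BINARY
  std::vector<ColumnType> children;  // embedded fields of a complex type

  ColumnType(PrimitiveType t, bool binary = false)
      : type(t), is_binary(binary) {
    // BINARY is always represented as TYPE_STRING plus the flag.
    assert(!is_binary || type == TYPE_STRING);
  }

  static ColumnType String() { return ColumnType(TYPE_STRING, false); }
  static ColumnType Binary() { return ColumnType(TYPE_STRING, true); }
  static ColumnType Struct(std::vector<ColumnType> fields) {
    ColumnType t(TYPE_STRUCT);
    t.children = std::move(fields);
    return t;
  }
};

// Code that does not care about STRING vs BINARY: the switch needs no new
// case, because BINARY still arrives as TYPE_STRING.
bool IsVarLen(const ColumnType& t) {
  switch (t.type) {
    case TYPE_STRING: return true;
    case TYPE_INT:    return false;
    case TYPE_STRUCT: return false;
  }
  return false;
}

// Code that does care (for example a file writer) consults the flag.
std::string FileTypeName(const ColumnType& t) {
  switch (t.type) {
    case TYPE_STRING: return t.is_binary ? "BINARY" : "STRING";
    case TYPE_INT:    return "INT";
    case TYPE_STRUCT: return "STRUCT";
  }
  return "UNKNOWN";
}

// The flag travels with embedded fields, so nested BINARYs are visible too.
bool ContainsBinary(const ColumnType& t) {
  if (t.is_binary) return true;
  for (const ColumnType& child : t.children) {
    if (ContainsBinary(child)) return true;
  }
  return false;
}
```

In this sketch, ColumnType::Binary() passes through IsVarLen exactly like a 
STRING, while FileTypeName and ContainsBinary can still tell the two apart, 
including for a BINARY field embedded in a struct.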

With this change, AuxColumnType is no longer needed and should be removed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
