Paul Rogers created DRILL-7318:
----------------------------------

             Summary: Unify type-to-string implementations
                 Key: DRILL-7318
                 URL: https://issues.apache.org/jira/browse/DRILL-7318
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.16.0
            Reporter: Paul Rogers


Drill has many places that perform type-to-string conversions. Unfortunately, 
these multiple implementations are inconsistent in subtle ways. The suggestion 
here is to unify them around an Arrow-like style (as in 
[{{PrimitiveColumnMetadata.typeString()}}|https://github.com/apache/drill/blob/master/exec/vector/src/main/java/org/apache/drill/exec/record/metadata/PrimitiveColumnMetadata.java#L186])
 but using SQL type names (as in 
[{{Types.getBaseSqlTypeName()}}|https://github.com/apache/drill/blob/master/common/src/main/java/org/apache/drill/common/types/Types.java#L140]).

Some of the many places where we do type-to-string conversions are:

* {{Types.java}} - This is supposed to be the definitive location, though the 
{{getExtendedSqlTypeName()}} method does not properly handle the {{VARDECIMAL}} 
type or the optional width for {{VARCHAR}}, etc.
* {{MaterializedField.toString()}} - Uses internal type names. Was handling 
precision incorrectly.
* {{AbstractColumnMetadata.toString()}} - Uses internal type names. Was 
handling precision incorrectly.
* {{PrimitiveColumnMetadata.typeString()}} - Uses an ad-hoc solution for some 
SQL names and internal names for other types. Does not correctly ignore 
precision for types for which precision is not valid. (Assumes precision will 
be zero in those cases.)
* {{WebUserConnection.sendData()}} - Uses an ad-hoc type-to-string 
implementation that uses internal names and makes incorrect use of 
{{hasPrecision()}} to detect whether the precision is non-zero (DRILL-7308).
* The {{typeOf()}} and {{sqlTypeOf()}} SQL functions.

There are probably others. The suggestion is:

* For internal use (e.g. {{toString()}}), use internal names: the {{MinorType}} 
names.
* For user-visible use, use the SQL type names from {{Types}}.
* Define a method in {{Types}} to state whether a type takes a precision.
* For Decimal, always include the precision. For VarChar, etc., include the 
precision (where it represents the width) only when non-zero.
* Define a method in {{Types}} to state whether a type takes a scale. (Only the 
decimal types do.)
* Include scale only for the types which accept them. (For Decimal, include the 
scale even if it is zero.)
* Use the Arrow-like "ARRAY<...>" syntax for repeated types, as in the new 
schema file.
* Use the SQL "NOT NULL" syntax for user-visible strings for the {{REQUIRED}} 
cardinality. Use just the type itself for the {{OPTIONAL}} (nullable) 
cardinality.
* Use the {{DataMode}} enums in internal strings.

In general, user-visible strings should be in the form that could be used in a 
SQL {{CREATE TABLE}} statement.
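
To make the rules above concrete, here is a minimal sketch of what a unified formatter might look like. It is not Drill code: {{TypeStringSketch}}, {{SketchType}}, {{SketchMode}} and all method names are hypothetical stand-ins for {{Types}}, {{MinorType}} and {{DataMode}}, and the SQL names in the enum are illustrative rather than the exact strings {{Types}} returns. It also assumes that "NOT NULL" marks the {{REQUIRED}} (non-nullable) mode, as in standard SQL.

```java
/** Hypothetical sketch of the proposed rules; none of these names exist in Drill. */
final class TypeStringSketch {

  /** Stand-ins for a few MinorType values, each paired with an illustrative SQL name. */
  enum SketchType {
    INT("INTEGER"), VARCHAR("VARCHAR"), VARDECIMAL("DECIMAL");

    final String sqlName;

    SketchType(String sqlName) { this.sqlName = sqlName; }
  }

  /** Stand-in for DataMode. */
  enum SketchMode { OPTIONAL, REQUIRED, REPEATED }

  /** Whether the type takes a precision (a width, in the VARCHAR case). */
  static boolean takesPrecision(SketchType type) {
    return type == SketchType.VARCHAR || type == SketchType.VARDECIMAL;
  }

  /** Whether the type takes a scale. Only the decimal types do. */
  static boolean takesScale(SketchType type) {
    return type == SketchType.VARDECIMAL;
  }

  /** User-visible form: SQL names, in a shape usable in CREATE TABLE. */
  static String sqlTypeString(SketchType type, SketchMode mode, int precision, int scale) {
    StringBuilder buf = new StringBuilder(type.sqlName);
    if (takesScale(type)) {
      // Decimal: always show precision and scale, even when the scale is zero.
      buf.append('(').append(precision).append(", ").append(scale).append(')');
    } else if (takesPrecision(type) && precision > 0) {
      // VARCHAR and similar: show the width only when it is non-zero.
      buf.append('(').append(precision).append(')');
    }
    // Types that take no precision ignore whatever value was passed in.
    switch (mode) {
      case REPEATED:
        return "ARRAY<" + buf + ">";
      case REQUIRED:
        return buf + " NOT NULL";
      default:
        return buf.toString();
    }
  }

  /** Internal form (e.g. for toString()): enum names; the layout is an assumption. */
  static String internalTypeString(SketchType type, SketchMode mode) {
    return mode.name() + " " + type.name();
  }
}
```

Under these assumptions, {{sqlTypeString(VARDECIMAL, REQUIRED, 10, 0)}} yields {{DECIMAL(10, 0) NOT NULL}}, {{sqlTypeString(VARCHAR, OPTIONAL, 0, 0)}} yields {{VARCHAR}}, and {{sqlTypeString(INT, REPEATED, 5, 0)}} yields {{ARRAY<INTEGER>}}, since the stray precision on {{INT}} is ignored rather than assumed to be zero.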



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
