[ 
https://issues.apache.org/jira/browse/DRILL-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986931#comment-15986931
 ] 

ASF GitHub Bot commented on DRILL-5419:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/819#discussion_r113606842
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveUtilities.java ---
    @@ -294,10 +296,21 @@ public static MajorType getMajorTypeFromHiveTypeInfo(final TypeInfo typeInfo, fi
             MajorType.Builder typeBuilder = MajorType.newBuilder().setMinorType(minorType)
                 .setMode(DataMode.OPTIONAL); // Hive columns (both regular and partition) could have null values
     
    -        if (primitiveTypeInfo.getPrimitiveCategory() == PrimitiveCategory.DECIMAL) {
    -          DecimalTypeInfo decimalTypeInfo = (DecimalTypeInfo) primitiveTypeInfo;
    -          typeBuilder.setPrecision(decimalTypeInfo.precision())
    -              .setScale(decimalTypeInfo.scale()).build();
    +        switch (primitiveTypeInfo.getPrimitiveCategory()) {
    +          case CHAR:
    +          case VARCHAR:
    +            BaseCharTypeInfo baseCharTypeInfo = (BaseCharTypeInfo) primitiveTypeInfo;
    +            typeBuilder.setPrecision(baseCharTypeInfo.getLength());
    +            break;
    +          case STRING:
    +            typeBuilder.setPrecision(HiveVarchar.MAX_VARCHAR_LENGTH);
    +            break;
    +          case DECIMAL:
    +            DecimalTypeInfo decimalTypeInfo = (DecimalTypeInfo) primitiveTypeInfo;
    +            typeBuilder.setPrecision(decimalTypeInfo.getPrecision()).setScale(decimalTypeInfo.getScale());
    --- End diff --
    
    We are now considering decimal precision and scale. The precision
    determines which Decimal minor type must be used. Shouldn't the
    Hive-to-Drill type conversion consider the precision and scale earlier?
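
To illustrate why precision matters for choosing the minor type, here is a minimal Java sketch. It is not Drill's actual code: the class name, the enum, and the thresholds of 9, 18, 28 and 38 digits are all assumptions made for illustration.

```java
// Illustrative sketch only; names are hypothetical and this is not Drill's actual code.
final class DecimalTypeSketch {

  // Stand-ins for decimal minor types of increasing width (assumed set).
  enum DecimalMinorType { DECIMAL9, DECIMAL18, DECIMAL28SPARSE, DECIMAL38SPARSE }

  // Picks the narrowest decimal type able to hold `precision` digits.
  // The thresholds 9, 18, 28 and 38 are assumptions for illustration.
  static DecimalMinorType fromPrecision(int precision) {
    if (precision < 1 || precision > 38) {
      throw new IllegalArgumentException("Unsupported decimal precision: " + precision);
    }
    if (precision <= 9) {
      return DecimalMinorType.DECIMAL9;
    } else if (precision <= 18) {
      return DecimalMinorType.DECIMAL18;
    } else if (precision <= 28) {
      return DecimalMinorType.DECIMAL28SPARSE;
    }
    return DecimalMinorType.DECIMAL38SPARSE;
  }
}
```

If the minor type is fixed before the precision is read, a mapping like this cannot influence it, which is the ordering concern raised in the comment.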


> Calculate return string length for literals & some string functions
> -------------------------------------------------------------------
>
>                 Key: DRILL-5419
>                 URL: https://issues.apache.org/jira/browse/DRILL-5419
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>         Attachments: version_with_cast.JPG
>
>
> Although Drill is schema-less and cannot determine column lengths in 
> advance, it should return the correct column length when a query specifies 
> an explicit type or length.
> For example, the JDBC / ODBC driver ALWAYS returns 64K as the length of a 
> varchar or char column, even when casts are applied.
> Changes:
> *LITERALS*
> The return length of a string literal is the literal's actual length.
> Example: for 'aaa' the return length is 3.
> *CAST*
> Return length is the one indicated in the cast expression. This also applies 
> when the user has created a view in which each string column was cast to a 
> varchar with a specific length.
> This length is returned to the user without the need to apply the cast 
> again. The functions listed below can leverage the underlying varchar 
> length to calculate the return length.
> *LOWER, UPPER, INITCAP, REVERSE, FIRST_VALUE, LAST_VALUE* 
> Return length is the underlying column length, i.e. if the column length is 
> known, the same length is returned.
> Example:
> lower(cast(col as varchar(30))) will return 30.
> lower(col) will return the max varchar length, since the actual column 
> length is unknown.
> *LAG, LEAD*
> Return length is the underlying column length, but the column type will be nullable.
> *LPAD, RPAD*
> Pads the string to the length specified. Return length is this specified 
> length. 
> *CONCAT, CONCAT OPERATOR (||)*
> Return length is the sum of the underlying column lengths. If the sum is 
> greater than the varchar max length, the varchar max length is returned.
> *SUBSTR, SUBSTRING, LEFT, RIGHT*
> Calculates return length according to each function's substring rules, for 
> example, taking into account how many characters should be subtracted.
> *IF EXPRESSIONS (CASE STATEMENT, COALESCE), UNION OPERATOR*
> When combining string columns with different lengths, the return length is 
> the max of the source column lengths.
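
A few of the rules above (literals, CONCAT, and CASE / COALESCE / UNION) can be sketched in Java as follows. This is only a rough illustration under stated assumptions, not the implementation from the pull request: the class and method names are hypothetical, and the 65535 maximum is assumed from the "64K" figure mentioned in the description.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class ReturnLengthSketch {

  // Assumed varchar maximum, based on the "64K" mentioned in the description.
  static final int MAX_VARCHAR_LENGTH = 65535;

  // LITERALS: the return length is the literal's own length
  // (String.length() counts UTF-16 code units here).
  static int literal(String value) {
    return value.length();
  }

  // CONCAT / ||: sum of the input lengths, capped at the varchar maximum.
  static int concat(int... lengths) {
    long sum = 0; // long, so that adding many large lengths cannot overflow
    for (int length : lengths) {
      sum += length;
    }
    return (int) Math.min(sum, MAX_VARCHAR_LENGTH);
  }

  // CASE, COALESCE, UNION: the widest of the source column lengths.
  static int widest(int... lengths) {
    int max = 0;
    for (int length : lengths) {
      max = Math.max(max, length);
    }
    return max;
  }
}
```

For instance, `ReturnLengthSketch.concat(30, 20)` yields 50, while concatenating a max-length column with anything else stays at the assumed maximum. The substring rules are left out because they differ per function.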



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
