[ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---------------------------------------

    Status: Open  (was: Patch Available)

> casting from decimal to tinyint,smallint, int and bigint generates different 
> result when vectorization is on
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6511
>                 URL: https://issues.apache.org/jira/browse/HIVE-6511
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch
>
>
> select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from 
> vectortab10korc limit 20 generates the following result when vectorization is 
> enabled:
> {code}
> 4619756289662.078125  -1628520834     -16770  126
> 1553532646710.316406  -1245514442     -2762   54
> 3367942487288.360352  688127224       -776    -8
> 4386447830839.337891  1286221623      12087   55
> -3234165331139.458008 -54957251       27453   61
> -488378613475.326172  1247658269      -16099  29
> -493942492598.691406  -21253559       -19895  73
> 3101852523586.039062  886135874       23618   66
> 2544105595941.381836  1484956709      -23515  37
> -3997512403067.0625   1102149509      30597   -123
> -1183754978977.589355 1655994718      31070   94
> 1408783849655.676758  34576568        -26440  -72
> -2993175106993.426758 417098319       27215   79
> 3004723551798.100586  -1753555402     -8650   54
> 1103792083527.786133  -14511544       -28088  72
> 469767055288.485352   1615620024      26552   -72
> -1263700791098.294434 -980406074      12486   -58
> -4244889766496.484375 -1462078048     30112   -96
> -3962729491139.782715 1525323068      -27332  60
> NULL  NULL    NULL    NULL
> {code}
> When vectorization is disabled, the result looks like this:
> {code}
> 4619756289662.078125  -1628520834     -16770  126
> 1553532646710.316406  -1245514442     -2762   54
> 3367942487288.360352  688127224       -776    -8
> 4386447830839.337891  1286221623      12087   55
> -3234165331139.458008 -54957251       27453   61
> -488378613475.326172  1247658269      -16099  29
> -493942492598.691406  -21253558       -19894  74
> 3101852523586.039062  886135874       23618   66
> 2544105595941.381836  1484956709      -23515  37
> -3997512403067.0625   1102149509      30597   -123
> -1183754978977.589355 1655994719      31071   95
> 1408783849655.676758  34576567        -26441  -73
> -2993175106993.426758 417098319       27215   79
> 3004723551798.100586  -1753555402     -8650   54
> 1103792083527.786133  -14511545       -28089  71
> 469767055288.485352   1615620024      26552   -72
> -1263700791098.294434 -980406074      12486   -58
> -4244889766496.484375 -1462078048     30112   -96
> -3962729491139.782715 1525323069      -27331  61
> NULL  NULL    NULL    NULL
> {code}
> This issue is visible only for certain decimal values. In the above example, 
> rows 7, 11, 12, 15, and 19 generate different results.
> vectortab10korc table schema:
> {code}
> t                     tinyint                 from deserializer   
> si                    smallint                from deserializer   
> i                     int                     from deserializer   
> b                     bigint                  from deserializer   
> f                     float                   from deserializer   
> d                     double                  from deserializer   
> dc                    decimal(38,18)          from deserializer   
> bo                    boolean                 from deserializer   
> s                     string                  from deserializer   
> s2                    string                  from deserializer   
> ts                    timestamp               from deserializer   
>                
> # Detailed Table Information           
> Database:             default                  
> Owner:                xyz                      
> CreateTime:           Tue Feb 25 21:54:28 UTC 2014     
> LastAccessTime:       UNKNOWN                  
> Protect Mode:         None                     
> Retention:            0                        
> Location:             hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
> Table Type:           MANAGED_TABLE            
> Table Parameters:              
>       COLUMN_STATS_ACCURATE   true                
>       numFiles                1                   
>       numRows                 10000               
>       rawDataSize             0                   
>       totalSize               344748              
>       transient_lastDdlTime   1393365281          
>                
> # Storage Information          
> SerDe Library:        org.apache.hadoop.hive.ql.io.orc.OrcSerde        
> InputFormat:          org.apache.hadoop.hive.ql.io.orc.OrcInputFormat  
> OutputFormat:         org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat        
>  
> Compressed:           No                       
> Num Buckets:          -1                       
> Bucket Columns:       []                       
> Sort Columns:         []                       
> Storage Desc Params:           
>       serialization.format    1                   
> Time taken: 0.196 seconds, Fetched: 41 row(s)
> {code}
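
One way to read the two result sets above: every row that differs has a fractional part of at least 0.5, and the numbers in both listings can be reproduced by taking the integer part of dc and keeping only its low 32, 16, or 8 bits. The non-vectorized values match truncation toward zero before the wrap, and the vectorized values match rounding to the nearest integer. The Python sketch below only checks that arithmetic against rows 1, 7, and 12. It is an inference from the data in this report, not a description of the Hive code; the helper names wrap_signed and cast_decimal are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def wrap_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits of value and reinterpret them as two's complement."""
    low = value & ((1 << bits) - 1)
    return low - (1 << bits) if low >= (1 << (bits - 1)) else low


def cast_decimal(text: str, bits: int, rounding: str) -> int:
    """Reduce a decimal literal to an integer with `rounding`, then wrap to `bits`."""
    whole = int(Decimal(text).quantize(Decimal(1), rounding=rounding))
    return wrap_signed(whole, bits)


# (dc, non-vectorized int/smallint/tinyint, vectorized int/smallint/tinyint),
# copied from rows 1, 7, and 12 of the listings in this report.
ROWS = [
    ("4619756289662.078125", (-1628520834, -16770, 126), (-1628520834, -16770, 126)),
    ("-493942492598.691406", (-21253558, -19894, 74), (-21253559, -19895, 73)),
    ("1408783849655.676758", (34576567, -26441, -73), (34576568, -26440, -72)),
]

for dc, plain, vectorized in ROWS:
    truncated = tuple(cast_decimal(dc, bits, ROUND_DOWN) for bits in (32, 16, 8))
    rounded = tuple(cast_decimal(dc, bits, ROUND_HALF_UP) for bits in (32, 16, 8))
    print(dc, truncated == plain, rounded == vectorized)
```

If this reading is right, the two code paths agree whenever the fractional part is below 0.5, which would explain why only some rows are affected.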



--
This message was sent by Atlassian JIRA
(v6.2#6252)
