[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-09 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Affects Version/s: 0.13.0

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, 
 HIVE-6511.4.patch


 The query select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint)
 from vectortab10korc limit 20 generates the following result when
 vectorization is enabled:
 {code}
 4619756289662.078125  -1628520834 -16770  126
 1553532646710.316406  -1245514442 -2762   54
 3367942487288.360352  688127224   -776    -8
 4386447830839.337891  1286221623  12087   55
 -3234165331139.458008 -54957251   27453   61
 -488378613475.326172  1247658269  -16099  29
 -493942492598.691406  -21253559   -19895  73
 3101852523586.039062  886135874   23618   66
 2544105595941.381836  1484956709  -23515  37
 -3997512403067.0625   1102149509  30597   -123
 -1183754978977.589355 1655994718  31070   94
 1408783849655.676758  34576568    -26440  -72
 -2993175106993.426758 417098319   27215   79
 3004723551798.100586  -1753555402 -8650   54
 1103792083527.786133  -14511544   -28088  72
 469767055288.485352   1615620024  26552   -72
 -1263700791098.294434 -980406074  12486   -58
 -4244889766496.484375 -1462078048 30112   -96
 -3962729491139.782715 1525323068  -27332  60
 NULL                  NULL        NULL    NULL
 {code}
 When vectorization is disabled, the result looks like this:
 {code}
 4619756289662.078125  -1628520834 -16770  126
 1553532646710.316406  -1245514442 -2762   54
 3367942487288.360352  688127224   -776    -8
 4386447830839.337891  1286221623  12087   55
 -3234165331139.458008 -54957251   27453   61
 -488378613475.326172  1247658269  -16099  29
 -493942492598.691406  -21253558   -19894  74
 3101852523586.039062  886135874   23618   66
 2544105595941.381836  1484956709  -23515  37
 -3997512403067.0625   1102149509  30597   -123
 -1183754978977.589355 1655994719  31071   95
 1408783849655.676758  34576567    -26441  -73
 -2993175106993.426758 417098319   27215   79
 3004723551798.100586  -1753555402 -8650   54
 1103792083527.786133  -14511545   -28089  71
 469767055288.485352   1615620024  26552   -72
 -1263700791098.294434 -980406074  12486   -58
 -4244889766496.484375 -1462078048 30112   -96
 -3962729491139.782715 1525323069  -27331  61
 NULL                  NULL        NULL    NULL
 {code}
 This issue is visible only for certain decimal values. In the above example,
 rows 7, 11, 12, 15, and 19 generate different results.
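
 In the two listings above, every row whose int, smallint and tinyint values
 differ has a fractional part of 0.5 or more. The vectorized values match
 rounding to the nearest integer, and the non-vectorized values match
 truncation toward zero, in both cases followed by keeping the low bits of the
 integer. The issue text does not state the root cause, so the following is
 only a sketch of arithmetic that reproduces both columns, not Hive's actual
 code. The helper names wrap_signed and cast_decimal are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def wrap_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits of value, read as a two's-complement integer."""
    low = value & ((1 << bits) - 1)
    return low - (1 << bits) if low >= (1 << (bits - 1)) else low


def cast_decimal(text: str, bits: int, rounding: str) -> int:
    """Drop the fraction with the given rounding mode, then narrow to `bits` bits.

    Assumption for illustration only: ROUND_DOWN stands in for the
    non-vectorized result and ROUND_HALF_UP for the vectorized one.
    """
    whole = int(Decimal(text).quantize(Decimal("1"), rounding=rounding))
    return wrap_signed(whole, bits)


if __name__ == "__main__":
    row7 = "-493942492598.691406"
    for name, bits in (("int", 32), ("smallint", 16), ("tinyint", 8)):
        truncated = cast_decimal(row7, bits, ROUND_DOWN)
        rounded = cast_decimal(row7, bits, ROUND_HALF_UP)
        print(name, "truncate:", truncated, "round:", rounded)
```

 For row 7 the truncating variant yields the values from the non-vectorized
 listing and the rounding variant yields those from the vectorized listing.
 Rows with a fraction below 0.5 come out the same either way.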
 vectortab10korc table schema:
 {code}
 t     tinyint          from deserializer
 si    smallint         from deserializer
 i     int              from deserializer
 b     bigint           from deserializer
 f     float            from deserializer
 d     double           from deserializer
 dc    decimal(38,18)   from deserializer
 bo    boolean          from deserializer
 s     string           from deserializer
 s2    string           from deserializer
 ts    timestamp        from deserializer

 # Detailed Table Information
 Database:         default
 Owner:            xyz
 CreateTime:       Tue Feb 25 21:54:28 UTC 2014
 LastAccessTime:   UNKNOWN
 Protect Mode:     None
 Retention:        0
 Location:         hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
 Table Type:       MANAGED_TABLE
 Table Parameters:
   COLUMN_STATS_ACCURATE   true
   numFiles                1
   numRows                 1
   rawDataSize             0
   totalSize               344748
   transient_lastDdlTime   1393365281

 # Storage Information
 SerDe Library:    org.apache.hadoop.hive.ql.io.orc.OrcSerde
 InputFormat:      org.apache.hadoop.hive.ql.io.orc.OrcInputFormat

[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Status: Patch Available  (was: Open)

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, 
 HIVE-6511.4.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Attachment: HIVE-6511.4.patch

Updated patch addresses review comments.

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, 
 HIVE-6511.4.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Status: Open  (was: Patch Available)

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, 
 HIVE-6511.4.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-05 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Attachment: HIVE-6511.2.patch

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-05 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Status: Patch Available  (was: Open)

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-05 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Attachment: HIVE-6511.3.patch

Review board: https://reviews.apache.org/r/18808/

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-05 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Status: Patch Available  (was: Open)

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch



[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-05 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Status: Open  (was: Patch Available)

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch


[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-02 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Attachment: HIVE-6511.1.patch

The longValue function in Decimal128 rounds the value, whereas HiveDecimal 
simply discards the fractional part. This patch adds a method to Decimal128 
that discards the fractional part and uses it in the CastDecimalToLong 
expression.
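
The difference described in the comment above can be sketched in plain Java. This is only an illustration using java.math.BigDecimal, not the Decimal128 or HiveDecimal code from the patch; the class name DecimalCastDemo, the two helper methods, and the choice of RoundingMode.HALF_UP as a stand-in for the rounding done by Decimal128.longValue are all assumptions. The narrowing casts also assume that the smaller integer types keep the low-order bits of the long, which is what the reported output suggests.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DecimalCastDemo {
    // Drops the fractional part (rounds toward zero), the behavior the
    // comment attributes to HiveDecimal.
    static long truncateToLong(BigDecimal d) {
        return d.setScale(0, RoundingMode.DOWN).longValue();
    }

    // Rounds to the nearest integer. HALF_UP is an assumption standing in
    // for the rounding applied by Decimal128.longValue.
    static long roundToLong(BigDecimal d) {
        return d.setScale(0, RoundingMode.HALF_UP).longValue();
    }

    public static void main(String[] args) {
        // Row 12 of the sample output in the issue description.
        BigDecimal dc = new BigDecimal("1408783849655.676758");
        long truncated = truncateToLong(dc);
        long rounded = roundToLong(dc);
        // Java narrowing casts keep only the low-order bits of the long.
        System.out.println((int) truncated + " " + (short) truncated + " " + (byte) truncated);
        System.out.println((int) rounded + " " + (short) rounded + " " + (byte) rounded);
    }
}
```

For row 12 the first line prints 34576567 -26441 -73, matching the non-vectorized output, and the second prints 34576568 -26440 -72, matching the vectorized output. Only values whose fractional part is at least 0.5 are affected, which is consistent with the rows that differ in the description.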

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch


[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-02-26 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6511:
---

Description: 
select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from 
vectortab10korc limit 20 generates the following result when vectorization is 
enabled:
{code}
4619756289662.078125  -1628520834 -16770  126
1553532646710.316406  -1245514442 -2762   54
3367942487288.360352  688127224   -776    -8
4386447830839.337891  1286221623  12087   55
-3234165331139.458008 -54957251   27453   61
-488378613475.326172  1247658269  -16099  29
-493942492598.691406  -21253559   -19895  73
3101852523586.039062  886135874   23618   66
2544105595941.381836  1484956709  -23515  37
-3997512403067.0625   1102149509  30597   -123
-1183754978977.589355 1655994718  31070   94
1408783849655.676758  34576568    -26440  -72
-2993175106993.426758 417098319   27215   79
3004723551798.100586  -1753555402 -8650   54
1103792083527.786133  -14511544   -28088  72
469767055288.485352   1615620024  26552   -72
-1263700791098.294434 -980406074  12486   -58
-4244889766496.484375 -1462078048 30112   -96
-3962729491139.782715 1525323068  -27332  60
NULL                  NULL        NULL    NULL
{code}

When vectorization is disabled, the result looks like this:
{code}
4619756289662.078125  -1628520834 -16770  126
1553532646710.316406  -1245514442 -2762   54
3367942487288.360352  688127224   -776    -8
4386447830839.337891  1286221623  12087   55
-3234165331139.458008 -54957251   27453   61
-488378613475.326172  1247658269  -16099  29
-493942492598.691406  -21253558   -19894  74
3101852523586.039062  886135874   23618   66
2544105595941.381836  1484956709  -23515  37
-3997512403067.0625   1102149509  30597   -123
-1183754978977.589355 1655994719  31071   95
1408783849655.676758  34576567    -26441  -73
-2993175106993.426758 417098319   27215   79
3004723551798.100586  -1753555402 -8650   54
1103792083527.786133  -14511545   -28089  71
469767055288.485352   1615620024  26552   -72
-1263700791098.294434 -980406074  12486   -58
-4244889766496.484375 -1462078048 30112   -96
-3962729491139.782715 1525323069  -27331  61
NULL                  NULL        NULL    NULL
{code}

This issue is visible only for certain decimal values. In the above example, 
rows 7, 11, 12, 15, and 19 generate different results.

vectortab10korc table schema:
{code}
t    tinyint          from deserializer
si   smallint         from deserializer
i    int              from deserializer
b    bigint           from deserializer
f    float            from deserializer
d    double           from deserializer
dc   decimal(38,18)   from deserializer
bo   boolean          from deserializer
s    string           from deserializer
s2   string           from deserializer
ts   timestamp        from deserializer

# Detailed Table Information
Database:         default
Owner:            xyz
CreateTime:       Tue Feb 25 21:54:28 UTC 2014
LastAccessTime:   UNKNOWN
Protect Mode:     None
Retention:        0
Location:         hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:       MANAGED_TABLE
Table Parameters:
  COLUMN_STATS_ACCURATE   true
  numFiles                1
  numRows                 1
  rawDataSize             0
  totalSize               344748
  transient_lastDdlTime   1393365281

# Storage Information
SerDe Library:    org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:      org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:     org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat

Compressed:       No
Num Buckets:      -1
Bucket Columns:   []
Sort Columns:     []
Storage Desc Params:
  serialization.format    1
Time taken: 0.196 seconds, Fetched: 41 row(s)
{code}





  was:
select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from 
vectortab10korc limit