[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Affects Version/s: 0.13.0

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Fix For: 0.13.0, 0.14.0
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, HIVE-6511.4.patch

select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:
{code}
4619756289662.078125    -1628520834   -16770   126
1553532646710.316406    -1245514442   -2762    54
3367942487288.360352    688127224     -776     -8
4386447830839.337891    1286221623    12087    55
-3234165331139.458008   -54957251     27453    61
-488378613475.326172    1247658269    -16099   29
-493942492598.691406    -21253559     -19895   73
3101852523586.039062    886135874     23618    66
2544105595941.381836    1484956709    -23515   37
-3997512403067.0625     1102149509    30597    -123
-1183754978977.589355   1655994718    31070    94
1408783849655.676758    34576568      -26440   -72
-2993175106993.426758   417098319     27215    79
3004723551798.100586    -1753555402   -8650    54
1103792083527.786133    -14511544     -28088   72
469767055288.485352     1615620024    26552    -72
-1263700791098.294434   -980406074    12486    -58
-4244889766496.484375   -1462078048   30112    -96
-3962729491139.782715   1525323068    -27332   60
NULL                    NULL          NULL     NULL
{code}
When vectorization is disabled, the result looks like this:
{code}
4619756289662.078125    -1628520834   -16770   126
1553532646710.316406    -1245514442   -2762    54
3367942487288.360352    688127224     -776     -8
4386447830839.337891    1286221623    12087    55
-3234165331139.458008   -54957251     27453    61
-488378613475.326172    1247658269    -16099   29
-493942492598.691406    -21253558     -19894   74
3101852523586.039062    886135874     23618    66
2544105595941.381836    1484956709    -23515   37
-3997512403067.0625     1102149509    30597    -123
-1183754978977.589355   1655994719    31071    95
1408783849655.676758    34576567      -26441   -73
-2993175106993.426758   417098319     27215    79
3004723551798.100586    -1753555402   -8650    54
1103792083527.786133    -14511545     -28089   71
469767055288.485352     1615620024    26552    -72
-1263700791098.294434   -980406074    12486    -58
-4244889766496.484375   -1462078048   30112    -96
-3962729491139.782715   1525323069    -27331   61
NULL                    NULL          NULL     NULL
{code}
This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, 15, and 19 generate different results.

vectortab10korc table schema:
{code}
t     tinyint          from deserializer
si    smallint         from deserializer
i     int              from deserializer
b     bigint           from deserializer
f     float            from deserializer
d     double           from deserializer
dc    decimal(38,18)   from deserializer
bo    boolean          from deserializer
s     string           from deserializer
s2    string           from deserializer
ts    timestamp        from deserializer

# Detailed Table Information
Database:         default
Owner:            xyz
CreateTime:       Tue Feb 25 21:54:28 UTC 2014
LastAccessTime:   UNKNOWN
Protect Mode:     None
Retention:        0
Location:         hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:       MANAGED_TABLE
Table Parameters:
    COLUMN_STATS_ACCURATE   true
    numFiles                1
    numRows                 1
    rawDataSize             0
    totalSize               344748
    transient_lastDdlTime   1393365281

# Storage Information
SerDe Library:    org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Status: Patch Available (was: Open)

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, HIVE-6511.4.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Attachment: HIVE-6511.4.patch

Updated patch addresses review comments.

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, HIVE-6511.4.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Status: Open (was: Patch Available)

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, HIVE-6511.4.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Attachment: HIVE-6511.2.patch

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Status: Patch Available (was: Open)

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Attachment: HIVE-6511.3.patch

Review board: https://reviews.apache.org/r/18808/

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Status: Patch Available (was: Open)

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Status: Open (was: Patch Available)

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Attachment: HIVE-6511.1.patch

The longValue function in Decimal128 rounds the value, whereas HiveDecimal just discards the fractional part. This patch adds another method to Decimal128 that discards the fractional part and uses it in the CastDecimalToLong expression.

casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

Key: HIVE-6511
URL: https://issues.apache.org/jira/browse/HIVE-6511
Project: Hive
Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-6511.1.patch

select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:
{code}
4619756289662.078125    -1628520834  -16770  126
1553532646710.316406    -1245514442  -2762   54
3367942487288.360352    688127224    -776    -8
4386447830839.337891    1286221623   12087   55
-3234165331139.458008   -54957251    27453   61
-488378613475.326172    1247658269   -16099  29
-493942492598.691406    -21253559    -19895  73
3101852523586.039062    886135874    23618   66
2544105595941.381836    1484956709   -23515  37
-3997512403067.0625     1102149509   30597   -123
-1183754978977.589355   1655994718   31070   94
1408783849655.676758    34576568     -26440  -72
-2993175106993.426758   417098319    27215   79
3004723551798.100586    -1753555402  -8650   54
1103792083527.786133    -14511544    -28088  72
469767055288.485352     1615620024   26552   -72
-1263700791098.294434   -980406074   12486   -58
-4244889766496.484375   -1462078048  30112   -96
-3962729491139.782715   1525323068   -27332  60
NULL                    NULL         NULL    NULL
{code}
When vectorization is disabled, the result looks like this:
{code}
4619756289662.078125    -1628520834  -16770  126
1553532646710.316406    -1245514442  -2762   54
3367942487288.360352    688127224    -776    -8
4386447830839.337891    1286221623   12087   55
-3234165331139.458008   -54957251    27453   61
-488378613475.326172    1247658269   -16099  29
-493942492598.691406    -21253558    -19894  74
3101852523586.039062    886135874    23618   66
2544105595941.381836    1484956709   -23515  37
-3997512403067.0625     1102149509   30597   -123
-1183754978977.589355   1655994719   31071   95
1408783849655.676758    34576567     -26441  -73
-2993175106993.426758   417098319    27215   79
3004723551798.100586    -1753555402  -8650   54
1103792083527.786133    -14511545    -28089  71
469767055288.485352     1615620024   26552   -72
-1263700791098.294434   -980406074   12486   -58
-4244889766496.484375   -1462078048  30112   -96
-3962729491139.782715   1525323069   -27331  61
NULL                    NULL         NULL    NULL
{code}
This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, 15, and 19 produce different results.

vectortab10korc table schema:
{code}
t     tinyint          from deserializer
si    smallint         from deserializer
i     int              from deserializer
b     bigint           from deserializer
f     float            from deserializer
d     double           from deserializer
dc    decimal(38,18)   from deserializer
bo    boolean          from deserializer
s     string           from deserializer
s2    string           from deserializer
ts    timestamp        from deserializer

# Detailed Table Information
Database:         default
Owner:            xyz
CreateTime:       Tue Feb 25 21:54:28 UTC 2014
LastAccessTime:   UNKNOWN
Protect Mode:     None
Retention:        0
Location:         hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:       MANAGED_TABLE
Table Parameters:
    COLUMN_STATS_ACCURATE   true
    numFiles                1
    numRows                 1
    rawDataSize             0
    totalSize               344748
    transient_lastDdlTime   1393365281

# Storage
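
The patch comment at the top of this message attributes the mismatch to rounding versus discarding the fractional part. A minimal sketch of that difference follows, using the dc value from row 7. It relies on java.math.BigDecimal as a stand-in and is not Hive's Decimal128 or HiveDecimal code; the class and method names are hypothetical, and HALF_UP is an assumed rounding mode.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative stand-in only: plain java.math.BigDecimal, not Hive's
// Decimal128 or HiveDecimal. All names here are hypothetical.
class DecimalCastSketch {

    // Discards the fractional part (rounds toward zero), which is what the
    // comment says HiveDecimal does on the non-vectorized path.
    static long truncateToLong(BigDecimal value) {
        return value.setScale(0, RoundingMode.DOWN).longValue();
    }

    // Rounds to the nearest integer. HALF_UP is an assumption; the comment
    // only says that Decimal128.longValue "rounds the value".
    static long roundToLong(BigDecimal value) {
        return value.setScale(0, RoundingMode.HALF_UP).longValue();
    }

    public static void main(String[] args) {
        BigDecimal dc = new BigDecimal("-493942492598.691406"); // row 7

        long truncated = truncateToLong(dc);
        long rounded = roundToLong(dc);

        // Java narrowing casts keep only the low-order bits of the long.
        System.out.println(truncated + " " + (int) truncated + " "
                + (short) truncated + " " + (byte) truncated);
        System.out.println(rounded + " " + (int) rounded + " "
                + (short) rounded + " " + (byte) rounded);
    }
}
```

Running main prints `-493942492598 -21253558 -19894 74` followed by `-493942492599 -21253559 -19895 73`. The int, smallint, and tinyint values on the first line match row 7 with vectorization disabled, and those on the second line match row 7 with vectorization enabled.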
[jira] [Updated] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6511:
---
Description:
select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:
{code}
4619756289662.078125    -1628520834  -16770  126
1553532646710.316406    -1245514442  -2762   54
3367942487288.360352    688127224    -776    -8
4386447830839.337891    1286221623   12087   55
-3234165331139.458008   -54957251    27453   61
-488378613475.326172    1247658269   -16099  29
-493942492598.691406    -21253559    -19895  73
3101852523586.039062    886135874    23618   66
2544105595941.381836    1484956709   -23515  37
-3997512403067.0625     1102149509   30597   -123
-1183754978977.589355   1655994718   31070   94
1408783849655.676758    34576568     -26440  -72
-2993175106993.426758   417098319    27215   79
3004723551798.100586    -1753555402  -8650   54
1103792083527.786133    -14511544    -28088  72
469767055288.485352     1615620024   26552   -72
-1263700791098.294434   -980406074   12486   -58
-4244889766496.484375   -1462078048  30112   -96
-3962729491139.782715   1525323068   -27332  60
NULL                    NULL         NULL    NULL
{code}
When vectorization is disabled, the result looks like this:
{code}
4619756289662.078125    -1628520834  -16770  126
1553532646710.316406    -1245514442  -2762   54
3367942487288.360352    688127224    -776    -8
4386447830839.337891    1286221623   12087   55
-3234165331139.458008   -54957251    27453   61
-488378613475.326172    1247658269   -16099  29
-493942492598.691406    -21253558    -19894  74
3101852523586.039062    886135874    23618   66
2544105595941.381836    1484956709   -23515  37
-3997512403067.0625     1102149509   30597   -123
-1183754978977.589355   1655994719   31071   95
1408783849655.676758    34576567     -26441  -73
-2993175106993.426758   417098319    27215   79
3004723551798.100586    -1753555402  -8650   54
1103792083527.786133    -14511545    -28089  71
469767055288.485352     1615620024   26552   -72
-1263700791098.294434   -980406074   12486   -58
-4244889766496.484375   -1462078048  30112   -96
-3962729491139.782715   1525323069   -27331  61
NULL                    NULL         NULL    NULL
{code}
This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, 15, and 19 produce different results.

vectortab10korc table schema:
{code}
t     tinyint          from deserializer
si    smallint         from deserializer
i     int              from deserializer
b     bigint           from deserializer
f     float            from deserializer
d     double           from deserializer
dc    decimal(38,18)   from deserializer
bo    boolean          from deserializer
s     string           from deserializer
s2    string           from deserializer
ts    timestamp        from deserializer

# Detailed Table Information
Database:         default
Owner:            xyz
CreateTime:       Tue Feb 25 21:54:28 UTC 2014
LastAccessTime:   UNKNOWN
Protect Mode:     None
Retention:        0
Location:         hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:       MANAGED_TABLE
Table Parameters:
    COLUMN_STATS_ACCURATE   true
    numFiles                1
    numRows                 1
    rawDataSize             0
    totalSize               344748
    transient_lastDdlTime   1393365281

# Storage Information
SerDe Library:    org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:      org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:     org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:       No
Num Buckets:      -1
Bucket Columns:   []
Sort Columns:     []
Storage Desc Params:
    serialization.format    1
Time taken: 0.196 seconds, Fetched: 41 row(s)
{code}

was:
select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from vectortab10korc limit