[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java: http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@1472 PS13, Line 1472: Type.DOUBLE, ScalarType.createDecimalType(2, 1)); > I think the decimal output type parameters are orthogonal to the purpose of Yeah, I didn't look at which test case we're in here. I think TestDecimalArithmetic() could use some additional cases. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 13 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 05 Oct 2017 23:20:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java: http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@1472 PS13, Line 1472: Type.DOUBLE, ScalarType.createDecimalType(2, 1)); > actually, are there any other cases that we should add here now that we've I think the decimal output type parameters are orthogonal to the purpose of this specific test, which is testing whether the output is resolved to *INT/DECIMAL/DOUBLE with various combinations of input types and literals/non-literals. I don't think Taras' change would have affected that. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 13 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 19:02:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java: http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@1472 PS13, Line 1472: Type.DOUBLE, ScalarType.createDecimalType(2, 1)); actually, are there any other cases that we should add here now that we've changed decimal_v2 multiplication rule? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 13 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 16:50:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Reviewed-on: http://gerrit.cloudera.org:8080/7438 Reviewed-by: Taras Bobrovytsky Tested-by: Impala Public Jenkins --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 5 files changed, 333 insertions(+), 62 deletions(-) Approvals: Taras Bobrovytsky: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 13 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 12 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 09:37:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1297/ -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 12 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 05:27:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 12: Code-Review+2 Fixed the uninitialized variable bug in expr-test.cc. Forwarding the +2. Thanks Jim! -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 12 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 05:26:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 5 files changed, 333 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/12 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 12 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/11/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/7438/11/be/src/exprs/expr-test.cc@1544 PS11, Line 1544: DCHECK(false) << "Unexpected character."; This is causing a clang-tidy error in your last GVD: See the last lines of https://jenkins.impala.io/job/gerrit-verify-dryrun/1292/console ERROR: There were some failed jobs: [https://jenkins.impala.io/job/clang-tidy/1606/, https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/374/] Then the last lines of the URL it points to, https://jenkins.impala.io/job/clang-tidy/1606/console: /home/ubuntu/Impala/be/src/exprs/expr-test.cc:1543:7: warning: variable 'digit' is used uninitialized whenever switch default is taken [clang-diagnostic-sometimes-uninitialized] -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 11 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 03:05:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1295/ -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 11 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 02:50:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 11: Code-Review+2 Updated FE tests. Forwarding the +2. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 11 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 02:49:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 5 files changed, 333 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/11 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 11 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 10: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1292/ -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 10 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 04 Oct 2017 01:59:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1292/ -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 10 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 21:57:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 10: Code-Review+2 Rebased, forwarding the +2 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 10 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 21:51:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 330 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/10 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 10 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 330 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/9 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 9 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 8: Did you mean to upload a new diff? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 21:39:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1672 PS8, Line 1672: not require int256. > if result does not require int256, why does it overflow (return null)? Added comment. http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1675 PS8, Line 1675: { true, 0, 38, 0 }}}, > when both V1 and V2 give the same answer, you don't have to include both re Done http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1680 PS8, Line 1680: { "cast(18446744073709551615 as decimal(38,0)) * cast(9223372036854775808 as decimal(38,0))", > It might be helpful if you mentioned how these values where computed in the Added comment. http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1702 PS8, Line 1702: 85070591730234615865843651857942052864 > how did you verify these results? I used Python to make sure I get the same results. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 21:26:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1680 PS8, Line 1680: { "cast(18446744073709551615 as decimal(38,0)) * cast(9223372036854775808 as decimal(38,0))", It might be helpful if you mentioned how these values where computed in the comments, e.g. 2 ^ 64 - 1 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 17:56:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 8: Code-Review+2 (3 comments) Changes seem okay to me, but might be good to have Tim also take a look. http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1672 PS8, Line 1672: not require int256. if result does not require int256, why does it overflow (return null)? http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1675 PS8, Line 1675: { true, 0, 38, 0 }}}, when both V1 and V2 give the same answer, you don't have to include both results http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1702 PS8, Line 1702: 85070591730234615865843651857942052864 how did you verify these results? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 17:11:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 8: I made some additional changes in patch 8 as a result of thinking about your comment. Dan, can you take a look one more time? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 02:55:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 332 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/8 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h@249 PS7, Line 249: } else { > maybe a quick comment: Yes, your understanding is correct. Added comment. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 7 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 02:53:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 7: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG@67 PS6, Line 67: after: 126.17s > No, not always. Look at line 73 in this file. Yeah, but that's "just" 2x, as opposed to 10x. http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h@249 PS7, Line 249: } else { maybe a quick comment: // We've verified the intermediate value will fit in int128. is my understanding correct? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 7 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 02 Oct 2017 22:29:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 292 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/7 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 7 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: (7 comments) mostly just some questions. http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG@67 PS6, Line 67: after: 126.17s is it always the case that this regression is in cases that would have (incorrectly) returned null before this change? http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@a253 PS6, Line 253: great to see that go away http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@245 PS6, Line 245: overflow instead of saying "we will overflow", I think "we indicate overflow" or "return overflow" or something like that. I was trying to figure out where the overflow occurs until I continued reading. http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@244 PS6, Line 244: If delta scale is 0, we need to do a more precise check if converting to 256 bits : // is necessary I don't understand that comment, given that line 240-241 says the first check was conservative. http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@273 PS6, Line 273: // with a delta_scale 39 does not fit into int128. so normally the (38,38) * (38,38) case would be handled in the needs_int256 case, is that right? is this case just when the values are so small that they turn into 0? http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@284 PS6, Line 284: nit: let's eliminate this blank line (and maybe others. this function already barely fits on one screen which is important for readability. http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/util/bit-util.h File be/src/util/bit-util.h: http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/util/bit-util.h@292 PS6, Line 292: __int128 shifted = v >> 64; : if (shifted > 0) { given that 'v' is unsigned, won't the shift be unsigned, meaning 'shifted' will always get zeros in the top 64, meaning shifted is always >= 0? oh, so is this really just checking "shifted != 0"? If so, that seems clearer, especially since the else-clause only makes sense for 0, not for negative. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 29 Sep 2017 23:23:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 29 Sep 2017 22:04:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: The algorithm mentioned here, on page 17 seems quite useful: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.111.9736&rep=rep1&type=pdf -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 28 Sep 2017 17:28:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: > > I totally agree about doing anything fancy as a followup. In > > particular, I did not find anything at all useful in AVX to help. > > There is no vectored multiply large enough for this, and FMA > > operations won't help either as they basically only help with > > precision loss on floating point types. > > > > Considering the perf difference is so extreme in some cases, I > > think we should strongly consider either living with the broken > > behavior until we can come up with a solution, or else verifying > > with users making use of the DECIMAL_V2 feature that this will > not > > be a problem. > > AVX2 has 32x8 times 32x8 multiplication. I have elsewhere simulated > 64-bit multiplication with this, combines with additions and > shifts, and received a speedup on 64x4 times 64x4 operations. Oh, and AVX512 has vpmullq for multiplying vectors of 64-bit values. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 28 Sep 2017 01:56:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: > I totally agree about doing anything fancy as a followup. In > particular, I did not find anything at all useful in AVX to help. > There is no vectored multiply large enough for this, and FMA > operations won't help either as they basically only help with > precision loss on floating point types. > > Considering the perf difference is so extreme in some cases, I > think we should strongly consider either living with the broken > behavior until we can come up with a solution, or else verifying > with users making use of the DECIMAL_V2 feature that this will not > be a problem. AVX2 has 32x8 times 32x8 multiplication. I have elsewhere simulated 64-bit multiplication with this, combines with additions and shifts, and received a speedup on 64x4 times 64x4 operations. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 27 Sep 2017 20:31:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: I totally agree about doing anything fancy as a followup. In particular, I did not find anything at all useful in AVX to help. There is no vectored multiply large enough for this, and FMA operations won't help either as they basically only help with precision loss on floating point types. Considering the perf difference is so extreme in some cases, I think we should strongly consider either living with the broken behavior until we can come up with a solution, or else verifying with users making use of the DECIMAL_V2 feature that this will not be a problem. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 27 Sep 2017 03:00:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: I think we should explore your suggestion to try SEE/AVX in a follow on patch. (to optimze both multiplication and division if it's possible) -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 27 Sep 2017 01:24:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 289 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/6 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Zach Amsden has posted comments on this change. Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 5: Have we ever tried using SSE/AVX for multiplication? It should be possible to avoid using int256_t altogether if we can do four 64-bit multiplies in parallel. Are FMA instructions available for integer? (I have the manual in front of me but have to run and no time to look it up). We should be able to do the division the same way with another round of multiplication. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/5/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: Line 203: RESULT_T result = value / multiplier; > Another idea: there are a finite number of possible delta_scale values. It After trying different approaches, this one produced the best results: http://github.mtv.cloudera.com/Taras/Impala/commit/950892a48e555ba24dcca772851d251ae3ecb17f Before: 66.13s After: 55.63s So, a small improvement. Still not that great: without the multiplication patch, it took 8.26s -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/5/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: Line 203: RESULT_T result = value / multiplier; Another idea: there are a finite number of possible delta_scale values. It may be faster to promote delta_scale to a template parameter and switch on delta_scale to select an optimised implementation. The compiler may be able to optimise the division if it compiles the division separately for each specific multiplier values, e.g. generating special code for /10, /100, /1000, etc. E.g. there are plenty of tricks in this area that compilers can incorporate: https://stackoverflow.com/questions/2033210/c-fast-division-mod-by-10x -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/7438/4//COMMIT_MSG Commit Message: Line 75: DECIMAL_V2 disabled: 4.25s > Running perf top showed that most of the time is spent in ScaleDownAndRound Yeah that seems likely - the faster everything else becomes the more this would be a bottleneck. I don't see an obvious way to speed this up (except maybe avoiding the redundant divide/mod calls?) and I guess it only makes a difference if a lot of expressions are overflowing. Maybe someone with deeper knowledge of decimal might see a way to improve it. http://gerrit.cloudera.org:8080/#/c/7438/4/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: Line 204: if (round) { It's unfortunate that we're not computing the quotient and remainder at the same time. It looks like boost eventually calls divide_unsigned_helper in multiprecision/cpp_int/divide.hpp with the same arguments in both cases but throws away either the result or remainder. If avoiding two divide calls would help performance we could consider calling divide_unsigned_helper() directly. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#5). Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute an int256 intermediate value, which we then attempt to scale down and round to int128. Benchmarks: In the following benchmarks, we are selecting from lineitem_big, which has about 100 times as many rows as the normal lineitem. Query: select sum(l_quantity * l_tax) + sum(l_extendedprice * l_discount) from lineitem_big; Before: DECIMAL_V2 disabled: 2.24s DECIMAL_V2 enabled : 2.14 After: DECIMAL_V2 disabled: 2.66s DECIMAL_V2 enabled : 2.25s Query: select sum(l_quantity * l_tax * cast(1 as decimal(38, 0))) + sum(l_extendedprice * l_discount * cast(1 as decimal(38, 0))) from lineitem_big Before: DECIMAL_V2 disabled: 2.34s DECIMAL_V2 enabled : 2.36s After: DECIMAL_V2 disabled: 4.25s DECIMAL_V2 enabled : 4.15s Query: select sum(l_quantity * l_tax * cast(1 as decimal(38, 37))) + sum(l_extendedprice * l_discount * cast(1 as decimal(38, 37))) from lineitem_big Before: DECIMAL_V2 disabled: 2.84s DECIMAL_V2 enabled : 8.26s After: DECIMAL_V2 disabled: 69.16s DECIMAL_V2 enabled : 66.13s Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 282 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/5 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#5). Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute an int256 intermediate value, which we then attempt to scale down and round to int128. Benchmarks: In the following benchmarks, we are selecting from lineitem_big, which has about 100 times as many rows as the normal lineitem. Query: select sum(l_quantity * l_tax) + sum(l_extendedprice * l_discount) from lineitem_big; Before: DECIMAL_V2 disabled: 2.24s DECIMAL_V2 enabled : 2.14 After: DECIMAL_V2 disabled: 2.66s DECIMAL_V2 enabled : 2.25s Query: select sum(l_quantity * l_tax * cast(1 as decimal(38, 0))) + sum(l_extendedprice * l_discount * cast(1 as decimal(38, 0))) from lineitem_big Before: DECIMAL_V2 disabled: 2.34s DECIMAL_V2 enabled : 2.36s After: DECIMAL_V2 disabled: 4.25s DECIMAL_V2 enabled : 4.15s Query: select sum(l_quantity * l_tax * cast(1 as decimal(38, 37))) + sum(l_extendedprice * l_discount * cast(1 as decimal(38, 37))) from lineitem_big Before: DECIMAL_V2 disabled: 2.84s DECIMAL_V2 enabled : 8.26s After: DECIMAL_V2 disabled: 69.16s DECIMAL_V2 enabled : 66.13s Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 282 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/5 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden