[Impala-ASF-CR] IMPALA-11504:Specializing DecimalUtil::GetScaleMultiplier().

Daniel Becker (Code Review) Wed, 28 Sep 2022 03:58:13 -0700

Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18861 )


Change subject: IMPALA-11504:Specializing 
DecimalUtil::GetScaleMultiplier<int256_t>().
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/18861/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18861/3//COMMIT_MSG@15
PS3, Line 15: - Add decimal-util-benchmark.
> done, but ran in a docker container, will that be a problem?
It's ok if it is in a Docker container.
I would mention in the commit msg how big the speedup is.


http://gerrit.cloudera.org:8080/#/c/18861/3/be/src/benchmarks/decimal-util-benchmark.cc
File be/src/benchmarks/decimal-util-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/18861/3/be/src/benchmarks/decimal-util-benchmark.cc@51
PS3, Line 51: }
> done.
I experimented a bit with it and a simple global static volatile variable may 
be enough. I couldn't get it to work with int256_t but using an int64_t I got 
an even greater difference, around 6000x.
This time I simply assigned the result of the function to the global volatile 
int64_t, didn't add anything.

Interestingly if the global volatile variable is not static, the difference is 
only 1400x...
I don't know the cause of the difference but I think we should try the global, 
static volatile int64_t variable that we assign the result to.


http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/benchmarks/decimal-util-benchmark.cc
File be/src/benchmarks/decimal-util-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/benchmarks/decimal-util-benchmark.cc@56
PS4, Line 56: int256_t
If we stay with summing the results, I'm not sure int256_t is OK, it may have 
undefined behaviour on overflow.


http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/runtime/decimal-test.cc
File be/src/runtime/decimal-test.cc:

http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/runtime/decimal-test.cc@1012
PS4, Line 1012:       DCHECK(expect * 10 >= expect);
This is not really a good check because if a signed type (such as int32_t etc.) 
overflows it is undefined behaviour so the value could be anything.
In this test I think we can tolerate not checking overflow.


http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/runtime/decimal-test.cc@1013
PS4, Line 1013:       expect *= 10;
We don't need to prepare a fresh 'expected' in each iteration, we can start 
with 1 outside the loop and multiply it with 10 at the end of each iteration.


http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/runtime/decimal-test.cc@1022
PS4, Line 1022:   
TestGetScaleMultiplier<int32_t>(DecimalUtil::INT32_SCALE_UPPER_BOUND);
We could also test with double, I saw this function used with double in the 
code base.


http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/util/decimal-util.h
File be/src/util/decimal-util.h:

http://gerrit.cloudera.org:8080/#/c/18861/4/be/src/util/decimal-util.h@37
PS4, Line 37:   // The scale upper bound for GetScaleMultiplier<int32_t>()
            :   static constexpr int INT32_SCALE_UPPER_BOUND = 
ColumnType::MAX_DECIMAL4_PRECISION + 1;
            :   // The scale upper bound for GetScaleMultiplier<int64_t>()
            :   static constexpr int INT64_SCALE_UPPER_BOUND = 
ColumnType::MAX_DECIMAL8_PRECISION + 1;
            :   // The scale upper bound for GetScaleMultiplier<int128_t>()
            :   static constexpr int INT128_SCALE_UPPER_BOUND = 
ColumnType::MAX_PRECISION + 1;
            :   // The scale upper bound for GetScaleMultiplier<int256_t>()
            :   static constexpr int INT256_SCALE_UPPER_BOUND = 77;
I don't think these should be public class-level constants, they should be in 
the respective functions.


http://gerrit.cloudera.org:8080/#/c/18861/3/be/src/util/decimal-util.h
File be/src/util/decimal-util.h:

http://gerrit.cloudera.org:8080/#/c/18861/3/be/src/util/decimal-util.h@342
PS3, Line 342:  * i10e18 * i10e18 * 1000000000ll
> use 'num_values' instead of "sizeof(values) / sizeof(int256_t)"? it seems t
I agree now this assert statement is not needed, we can remove it.



--
To view, visit http://gerrit.cloudera.org:8080/18861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I969e2977d51313e738f72c8246db003ae43a3782
Gerrit-Change-Number: 18861
Gerrit-PatchSet: 4
Gerrit-Owner: Xiang Yang <yx91...@126.com>
Gerrit-Reviewer: Anonymous Coward <lipeng...@sensorsdata.cn>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Xiang Yang <yx91...@126.com>
Gerrit-Comment-Date: Wed, 28 Sep 2022 10:53:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11504:Specializing DecimalUtil::GetScaleMultiplier().

Reply via email to