Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11183 )
Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions ...................................................................... Patch Set 16: (20 comments) Thanks for the review and sorry for the large number of patches. Please look at patch set 17-20 as one commit. http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG@10 PS16, Line 10: do > typo: to Done http://gerrit.cloudera.org:8080/#/c/11183/16//COMMIT_MSG@51 PS16, Line 51: > Maybe you could mention your results about the speedup. I have added this information to line 26. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@254 PS16, Line 254: v1 > nit: use better names, e.g.: Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@592 PS16, Line 592: (uint64_t)MICROS_PER_SEC > nit: please use static_cast Is there a reason to use static_cast for primitive numeric types? IMO C style casts add less noise and expresses the intent with the same clarity (pointers/references are totally different stories...). http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@604 PS16, Line 604: && nanos == other.nanos > If nanos are more variable in the benchmarks then maybe it'd be worth to ch I have just realized that this is not needed anymore. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@627 PS16, Line 627: / ONE_BILLIONTH > * NANOS_PER_SEC ? The results are actually not always the same due to rounding. I left it this way in TimestampValue to avoid changing external behavior, so I also left it like this in the benchmark. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@634 PS16, Line 634: / ONE_BILLIONTH > * NANOS_PER_SEC ? See last answer. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/exprs/expr-test.cc@6508 PS16, Line 6508: ScopedLocalUnixTimestampConversionOverride > I just noticed that this class has a bug. FLAGS_use_local_tz_for_unix_timestamp_conversions should be false by default in tests, but I agree that restoring the original value fits the goal of the class better. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241 PS16, Line 241: (double) seconds + (double) > nit: static_cast See my comment in convert-timezone-benchmark.cc. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241 PS16, Line 241: 1000 > nit: MILLIS_PER_SEC? Attila mentioned this too. I think that 1000 is easier to read than MILLIS_PER_SEC. (hmm, maybe if both of you have mentioned it than I am wrong here :) http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@250 PS16, Line 250: 8 > nit: please give it a name, e.g. MARGIN_OF_ERROR Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@261 PS16, Line 261: 1000 > nit: MILLIS_PER_SEC from gutil/walltime.h? Same as for line 241. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@266 PS16, Line 266: 1000 > nit: MICROS_PER_MILLI from walltime.h? Same as for line 241. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@280 PS16, Line 280: 60*60 > it appears twice Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@188 PS16, Line 188: (int64_t)1000*1000*1000 > nit: static_cast I was not happy to write this here, but walltime.h constants are not available in this file. I did not want to include walltime.h, because if timestamp-value.h is included (indirectly) from a very large number of files, and I did not want to make compilation slower. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@318 PS16, Line 318: int64_t SplitTime(int64_t* ticks) > Maybe it would be cleaner to change the signature to void SplitTime(const i I see the point, and I trust the compiler that it will eliminate the variabled, but I think that this would make most calling functions a bit more complicated. For example FromUnixTimeNanos would need +2 variables and for unix_time += SplitTime<NANOS_PER_SEC>(&nanos); http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@317 PS16, Line 317: template <int64_t GRANULARITY> : inline static int64_t SplitTime(int64_t* ticks) { : int64_t result = *ticks / GRANULARITY; : *ticks %= GRANULARITY; : if (*ticks < 0) { : result--; : *ticks += GRANULARITY; : } : return result; : } > We have this function here and in the benchmark. Can we place it to some co I have made it public. I think that it will be actually useful in other functions in the future. http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.inline.h File be/src/runtime/timestamp-value.inline.h: http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.inline.h@39 PS16, Line 39: EPOCH + boost::gregorian::date_duration(days), > nit: it fits to the previous line Done http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.inline.h@62 PS16, Line 62: / ONE_BILLIONTH > * NANOS_PER_SEC ? I think it expresses the intention better. See my explanation in convert-timestamp-benchmark.cc http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.inline.h@70 PS16, Line 70: [0,1) > [0, NANOS_PER_SEC) Done -- To view, visit http://gerrit.cloudera.org:8080/11183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa Gerrit-Change-Number: 11183 Gerrit-PatchSet: 16 Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Wed, 05 Sep 2018 17:23:52 +0000 Gerrit-HasComments: Yes