[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled
for every call to trunc(), which added a lot of unnecessary codegen
time. This patch avoids the problem by moving the implementation out
of the cross-compiled code.

Testing:
Ran expr-test. I ran the repro query from IMPALA-5347 and verified that
codegen time was significantly reduced from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:

  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >= '1998-01-01'

The end-to-end query latency was reduced to 0.52s from 0.72s on average.
The time spent in the scanner increased slightly from around 390ms to
around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Reviewed-on: http://gerrit.cloudera.org:8080/7081
Reviewed-by: Tim Armstrong
Tested-by: Impala Public Jenkins
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 280 insertions(+), 229 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong
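For readers unfamiliar with the pattern the commit message describes, here is a minimal sketch of a trunc() that dispatches on a string argument. It is not Impala's code: `Date`, `TruncUnit`, `ParseTruncUnit`, `TruncImpl` and `Trunc` are hypothetical names, the timestamp is reduced to a plain calendar date, and only the 'yy' unit is taken from the benchmark query ('mm' and 'dd' are illustrative). The point is the shape: one function body holds every implementation, so anything that compiles that body per call site pays for all branches, while a thin entry point that merely calls an already-compiled function pays almost nothing.

```cpp
#include <string>

// Hypothetical, simplified stand-ins; none of these names come from Impala.
struct Date {
  int year;
  int month;
  int day;
};

enum class TruncUnit { kInvalid, kYear, kMonth, kDay };

// Maps the unit string to an enum value. Only "yy" appears in the benchmark
// query; "mm" and "dd" are illustrative additions for this sketch.
TruncUnit ParseTruncUnit(const std::string& unit) {
  if (unit == "yy") return TruncUnit::kYear;
  if (unit == "mm") return TruncUnit::kMonth;
  if (unit == "dd") return TruncUnit::kDay;
  return TruncUnit::kInvalid;
}

// The heavy part: a single function containing every implementation.
// Assumption for this sketch: this is the kind of body that is cheaper to
// compile once ahead of time than to recompile for every trunc() call.
Date TruncImpl(const Date& d, TruncUnit unit) {
  switch (unit) {
    case TruncUnit::kYear:
      return Date{d.year, 1, 1};
    case TruncUnit::kMonth:
      return Date{d.year, d.month, 1};
    case TruncUnit::kDay:
      return d;
    case TruncUnit::kInvalid:
      break;
  }
  return d;  // Unknown unit: this sketch returns the input unchanged.
}

// Thin entry point that only forwards to the out-of-line implementation.
Date Trunc(const Date& d, const std::string& unit) {
  return TruncImpl(d, ParseTruncUnit(unit));
}
```

With these definitions, `Trunc(Date{1998, 9, 2}, "yy")` yields the date 1998-01-01. The cost of keeping the body out of line is an ordinary function call per row, which would be consistent with the small increase in scanner time reported in the commit message, although the sketch itself does not measure anything.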
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 3: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/680/ -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 3: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7081/2/be/src/exprs/udf-builtins.cc File be/src/exprs/udf-builtins.cc: Line 16: // under the License. > mention somewhere that these functions should specifically not get cross-co Done -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Hello Marcel Kornacker, Michael Ho,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7081

to look at the new patch set (#3).

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled
for every call to trunc(), which added a lot of unnecessary codegen
time. This patch avoids the problem by moving the implementation out
of the cross-compiled code.

Testing:
Ran expr-test. I ran the repro query from IMPALA-5347 and verified that
codegen time was significantly reduced from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:

  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >= '1998-01-01'

The end-to-end query latency was reduced to 0.52s from 0.72s on average.
The time spent in the scanner increased slightly from around 390ms to
around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 280 insertions(+), 229 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7081/3

--
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/7081/2/be/src/exprs/udf-builtins.cc File be/src/exprs/udf-builtins.cc: Line 16: // under the License. mention somewhere that these functions should specifically not get cross-compiled (otherwise the next person might decide there's something to be gained from ...). -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 2: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins.cc File be/src/exprs/udf-builtins.cc: Line 171: // TODO: it would be nice to resolve the branch before codegen so we can optimise I put a TODO here -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Michael Ho has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

PS1, Line 242:
> Yeah I agree it would be nice, I don't think we have the infrastructure now
I concur. Maybe a TODO?

--
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Hello Michael Ho,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7081

to look at the new patch set (#2).

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled
for every call to trunc(), which added a lot of unnecessary codegen
time. This patch avoids the problem by moving the implementation out
of the cross-compiled code.

Testing:
Ran expr-test. I ran the repro query from IMPALA-5347 and verified that
codegen time was significantly reduced from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:

  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >= '1998-01-01'

The end-to-end query latency was reduced to 0.52s from 0.72s on average.
The time spent in the scanner increased slightly from around 390ms to
around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 277 insertions(+), 229 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7081/2

--
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

PS1, Line 242:
> May make sense to codegen and constant propagate in this case.
Yeah, I agree it would be nice. I don't think we have the infrastructure
now to do this in a generic way, though, given that the dispatch logic to
map a string to an implementation is non-trivial. I didn't want to get
sidetracked implementing a special-case optimisation here.

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins.h
File be/src/exprs/udf-builtins.h:

PS1, Line 67: //
> nit:///
Done

--
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
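The alternative raised in this exchange (resolving the string-to-implementation mapping up front when the unit argument is a constant) is explicitly not what the patch does. Purely as an illustration of the idea, and under the assumption that the unit is known before any row is evaluated, it might look like the sketch below. All names (`Date`, `TruncFn`, `TruncToYear`, `TruncToMonth`, `ResolveTruncFn`) are hypothetical, the timestamp is simplified to a calendar date, and 'mm' is an illustrative unit; only 'yy' comes from the benchmark query in the commit message.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a timestamp value.
struct Date {
  int year;
  int month;
  int day;
};

// One small function per unit, so a caller can bind to exactly one of them.
using TruncFn = Date (*)(const Date&);

Date TruncToYear(const Date& d) { return Date{d.year, 1, 1}; }
Date TruncToMonth(const Date& d) { return Date{d.year, d.month, 1}; }

// Resolves the unit string to a single implementation. Intended to run once,
// before per-row evaluation, when the unit argument is a constant.
// Returns nullptr for units this sketch does not know.
TruncFn ResolveTruncFn(const std::string& unit) {
  if (unit == "yy") return &TruncToYear;
  if (unit == "mm") return &TruncToMonth;
  return nullptr;
}
```

A caller would invoke `ResolveTruncFn("yy")` once and then apply the returned pointer to each row, so the per-row path contains no string comparison and only one implementation. As the reply above notes, doing this generically inside a code generator is the hard part; the sketch only shows the end state.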
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Michael Ho has posted comments on this change. Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc() .. Patch Set 1: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins-ir.cc File be/src/exprs/udf-builtins-ir.cc: PS1, Line 242: May make sense to codegen and constant propagate in this case. http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins.h File be/src/exprs/udf-builtins.h: PS1, Line 67: // nit:/// -- To view, visit http://gerrit.cloudera.org:8080/7081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7081

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled
for every call to trunc(), which added a lot of unnecessary codegen
time. This patch avoids the problem by moving the implementation out
of the cross-compiled code.

Testing:
Ran expr-test. I ran the repro query from IMPALA-5347 and verified that
codegen time was significantly reduced from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:

  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >= '1998-01-01'

The end-to-end query latency was reduced to 0.52s from 0.72s on average.
The time spent in the scanner increased slightly from around 390ms to
around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 275 insertions(+), 227 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7081/1

--
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong