[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled for
every call to trunc(), which added a lot of unnecessary codegen time.

This patch avoids the problem by moving the implementation out of the
cross-compiled code.
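
The dispatch shape described above can be pictured roughly as follows. This is a simplified sketch, not Impala's code: SimpleTimestamp, Unit, ParseUnit and Trunc are hypothetical names, and only a handful of unit strings are handled.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified timestamp; Impala's real timestamp type differs.
struct SimpleTimestamp {
  int year, month, day, hour, minute;
};

enum class Unit { kYear, kMonth, kDay, kHour, kInvalid };

// Maps the unit string to an enum value. Only a few spellings are handled.
Unit ParseUnit(const std::string& s) {
  if (s == "yy" || s == "yyyy" || s == "year") return Unit::kYear;
  if (s == "mm" || s == "month") return Unit::kMonth;
  if (s == "dd") return Unit::kDay;
  if (s == "hh") return Unit::kHour;
  return Unit::kInvalid;
}

// Single entry point that switches between the per-unit implementations.
// Any one call uses only one branch, but the function body carries all of them.
SimpleTimestamp Trunc(SimpleTimestamp ts, const std::string& unit_str) {
  // Debug-only sanity check on the input.
  assert(ts.month >= 1 && ts.month <= 12);
  switch (ParseUnit(unit_str)) {
    case Unit::kYear:  return {ts.year, 1, 1, 0, 0};
    case Unit::kMonth: return {ts.year, ts.month, 1, 0, 0};
    case Unit::kDay:   return {ts.year, ts.month, ts.day, 0, 0};
    case Unit::kHour:  return {ts.year, ts.month, ts.day, ts.hour, 0};
    case Unit::kInvalid: break;
  }
  // Unknown unit: return the input unchanged (a choice made for this sketch).
  return ts;
}
```

If a function shaped like this lives in cross-compiled code, every case body is pulled into codegen along with it. Keeping it in an ordinary precompiled source file means the generated code only needs to call it.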

Testing:
Ran expr-test.

I ran the repro query from IMPALA-5347 and verified that codegen time
dropped significantly, from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:
  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >=
  '1998-01-01'

The end-to-end query latency dropped from 0.72s to 0.52s on
average. The time spent in the scanner increased slightly, from
around 390ms to around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Reviewed-on: http://gerrit.cloudera.org:8080/7081
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 280 insertions(+), 229 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved




Gerrit-MessageType: merged
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/680/


Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 3: Code-Review+2

Carry +2


Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7081/2/be/src/exprs/udf-builtins.cc
File be/src/exprs/udf-builtins.cc:

Line 16: // under the License.
> mention somewhere that these functions should specifically not get cross-co
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Hello Marcel Kornacker, Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7081

to look at the new patch set (#3).

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled for
every call to trunc(), which added a lot of unnecessary codegen time.

This patch avoids the problem by moving the implementation out of the
cross-compiled code.

Testing:
Ran expr-test.

I ran the repro query from IMPALA-5347 and verified that codegen time
dropped significantly, from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:
  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >=
  '1998-01-01'

The end-to-end query latency dropped from 0.72s to 0.52s on
average. The time spent in the scanner increased slightly, from
around 390ms to around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 280 insertions(+), 229 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7081/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7081/2/be/src/exprs/udf-builtins.cc
File be/src/exprs/udf-builtins.cc:

Line 16: // under the License.
mention somewhere that these functions should specifically not get 
cross-compiled (otherwise the next person might decide there's something to be 
gained from ...).



Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 2: Code-Review+1

Carry +1


Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins.cc
File be/src/exprs/udf-builtins.cc:

Line 171:   // TODO: it would be nice to resolve the branch before codegen so we can optimise
I put a TODO here



Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

PS1, Line 242: 
> Yeah I agree it would be nice, I don't think we have the infrastructure now
I concur. Maybe a TODO?



Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7081

to look at the new patch set (#2).

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled for
every call to trunc(), which added a lot of unnecessary codegen time.

This patch avoids the problem by moving the implementation out of the
cross-compiled code.

Testing:
Ran expr-test.

I ran the repro query from IMPALA-5347 and verified that codegen time
dropped significantly, from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:
  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >=
  '1998-01-01'

The end-to-end query latency dropped from 0.72s to 0.52s on
average. The time spent in the scanner increased slightly, from
around 390ms to around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 277 insertions(+), 229 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7081/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

PS1, Line 242: 
> May make sense to codegen and constant propagate in this case.
Yeah, I agree it would be nice. I don't think we have the infrastructure to
do this in a generic way right now, though, given that the dispatch logic to
map a string to an implementation is non-trivial. I didn't want to get
sidetracked implementing a special-case optimisation here.
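
For illustration only, the idea under discussion (resolving the unit once when it is a constant, instead of re-dispatching on the string for every row) might look something like the sketch below. This is not what the patch does. Ts, TruncFn, Resolve and TruncAll are hypothetical names, and the sketch says nothing about how this would be wired into codegen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, date-only timestamp used just for this sketch.
struct Ts {
  int year, month, day;
};

using TruncFn = Ts (*)(Ts);

Ts TruncYear(Ts t) { return {t.year, 1, 1}; }
Ts TruncMonth(Ts t) { return {t.year, t.month, 1}; }
Ts Identity(Ts t) { return t; }

// Resolves the unit string to one implementation. Unknown units map to
// Identity, which is a choice made for this sketch only.
TruncFn Resolve(const std::string& unit) {
  if (unit == "yy") return &TruncYear;
  if (unit == "mm") return &TruncMonth;
  return &Identity;
}

// The string is examined once, before the loop; each row then pays only for
// an indirect call to the already-selected implementation.
std::vector<Ts> TruncAll(const std::vector<Ts>& rows, const std::string& unit) {
  TruncFn fn = Resolve(unit);
  assert(fn != nullptr);
  std::vector<Ts> out;
  out.reserve(rows.size());
  for (const Ts& r : rows) out.push_back(fn(r));
  return out;
}
```

With the implementation known up front, a code generator could in principle call or inline just that one function, which is the constant-propagation benefit mentioned in the quoted comment.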


http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins.h
File be/src/exprs/udf-builtins.h:

PS1, Line 67: //
> nit: ///
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..


Patch Set 1: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

PS1, Line 242: 
May make sense to codegen and constant propagate in this case.


http://gerrit.cloudera.org:8080/#/c/7081/1/be/src/exprs/udf-builtins.h
File be/src/exprs/udf-builtins.h:

PS1, Line 67: //
nit: ///



Gerrit-MessageType: comment
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5347: reduce codegen overhead of timestamp trunc()

2017-06-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7081

Change subject: IMPALA-5347: reduce codegen overhead of timestamp trunc()
..

IMPALA-5347: reduce codegen overhead of timestamp trunc()

Trunc has many implementations that are switched between based on a
string argument. Before this patch all implementations were compiled for
every call to trunc(), which added a lot of unnecessary codegen time.

This patch avoids the problem by moving the implementation out of the
cross-compiled code.

Testing:
Ran expr-test.

I ran the repro query from IMPALA-5347 and verified that codegen time
dropped significantly, from ~1.4s to ~0.35s.

Perf:
I ran the following targeted benchmark:
  set num_nodes=1;
  set num_scanner_threads=1;
  select count(*) from lineitem where trunc(l_shipdate, 'yy') >=
  '1998-01-01'

The end-to-end query latency dropped from 0.72s to 0.52s on
average. The time spent in the scanner increased slightly, from
around 390ms to around 410ms. This seems like a good tradeoff.

Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/udf-builtins-ir.cc
A be/src/exprs/udf-builtins.cc
M be/src/exprs/udf-builtins.h
4 files changed, 275 insertions(+), 227 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7081/1

Gerrit-MessageType: newchange
Gerrit-Change-Id: I58f51b2093a38929df847fdb5d25bb9aafc3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong