Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19875 )

Change subject: IMPALA-12134: Optimize row materialization time
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/19875/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19875/1//COMMIT_MSG@9
PS1, Line 9:  improves row materialization time by
The improvements are Beeswax specific at the moment if I understand correctly. 
Can you mention the effect on Beeswax/HS2 separately?


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/benchmarks/date-benchmark.cc
File be/src/benchmarks/date-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/benchmarks/date-benchmark.cc@41
PS1, Line 41: // ToYearMonthDay:            Function  iters/ms   10%ile   
50%ile   90%ile     10%ile     50%ile     90%ile
            : //                                                                
          (relative) (relative) (relative)
            : // 
---------------------------------------------------------------------------------------------------------
            : //              TestCctzToYearMonthDay               23.1     
23.4     23.7         1X         1X         1X
            : //                  TestToYearMonthDay                 68     
69.6     70.7      2.95X      2.98X      2.98X
            : //                          TestToYear                443      
446      448      19.2X      19.1X      18.9X
            : //                        TestToString               9.02     
9.04     9.06     0.391X     0.386X     0.382X
            : //           TestToString_stringstream               2.04     
2.04     2.08    0.0883X    0.0871X    0.0875X
Can you update this with your new results?


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/benchmarks/date-benchmark.cc@84
PS1, Line 84:       
timestamp_to_char_buf_result_[i].reserve(SimpleDateFormatTokenizer::DEFAULT_DATE_TIME_FMT_LEN);
nit: AFAIK  we try to limit to 90 columns in benchmark files too (the style 
check didn't complain though, maybe it doesn't cover be/src/benchmarks?)


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/date-parse-util.cc
File be/src/runtime/date-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/date-parse-util.cc@134
PS1, Line 134:
nit: extra line


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/date-value.cc
File be/src/runtime/date-value.cc:

http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/date-value.cc@431
PS1, Line 431: ToString
ToString could also use the optimized default formatter.

AFAIK the only major users of << are Beeswax protocol and complex types, while 
ToString() is used by HS2 and cast functions.
https://github.com/apache/impala/blob/63d13a35f35874822daf167d763ed683f1ec48ef/be/src/exprs/cast-functions-ir.cc#L361


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/timestamp-parse-util.cc@235
PS1, Line 235: const date& d, const time_duration& t
The year() / hours() ... etc functions will throw an exception  if the 
date/time are not valid. See HasDate() 
https://github.com/apache/impala/blob/63d13a35f35874822daf167d763ed683f1ec48ef/be/src/runtime/timestamp-value.h#L183

This could be handled at the call site and FormatDefault() could have a DCHECK 
that d/t are not special.


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/timestamp-value.cc@182
PS1, Line 182: operator
ToString() could also use the optimized default formatter, see my notes in 
DateValue


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/runtime/timestamp-value.cc@183
PS1, Line 183: fractional_seconds
Calling fractional_seconds() will throw an exception if the timestamp is not 
valid.

In most cases we can assume that TimestampValues are valid (invalid values are 
converted to NULL), but I think that in << we should be defensive and return 
early if !HasDateAndTime()


http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/service/query-result-set.cc
File be/src/service/query-result-set.cc:

http://gerrit.cloudera.org:8080/#/c/19875/1/be/src/service/query-result-set.cc@96
PS1, Line 96: medata
typo



--
To view, visit http://gerrit.cloudera.org:8080/19875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1ef5e4137fa6c2d0a5f08b430e01e3fb7de86330
Gerrit-Change-Number: 19875
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Comment-Date: Fri, 12 May 2023 06:27:46 +0000
Gerrit-HasComments: Yes

Reply via email to