Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16869 )

Change subject: IMPALA-9922: A better approach to deal with date's sub-second 
fractions
......................................................................


Patch Set 3:

(10 comments)

Thanks for this change!

http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@427
PS3, Line 427:       if (tok.type == FRACTION){
can fit into one line


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@433
PS3, Line 433:         if (!ParseAndValidate(tok_val, tok_len, 0, 9999, 
&dt_result->year)){
This and the ifs below fit into a single line, no need to wrap them.


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@483
PS3, Line 483: if(
nit: add a space after 'if'


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@483
PS3, Line 483: {
nit: leave a space before '{'


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@483
PS3, Line 483:         if(tok.pos + tok.len > str_len){
I think this doesn't work as desired when there is a timezone offset after the 
fraction part. I'm not sure, but please add some test coverage for this to be 
on the safe side.

Giving this a second though the timezone offset might have been trimmed from 
the end, but still it is feasible that you have additional tokens after the 
fractions.

Souldn't you instead treat fractions the same way as single-character tokens 
that could accept any length of input?


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@487
PS3, Line 487: {
nit: leave a space before '{'


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@628
PS3, Line 628:     // Test sub-second value with insufficient length.
As you also changed iso-sql-format-parser please add similar tests to 
tests/query_test/test_cast_with_format.py
(This iso-sql-format is used for CAST(... FORMAT...) kind of queries)


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@632
PS3, Line 632:     (TimestampTC("yyyy-MM-dd HH:mm:ss:SSS", "2020-10-01 
10:54:00:1345",false, true, true,
Would it truncate if there are tokens after SSS?
E.g.:
TimestampTC("yyyy-MM-dd mm:ss:SSS:HH", "2020-10-01 54:00:1345:10"


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@635
PS3, Line 635: ,false
nit: space after ','


http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@638
PS3, Line 638:     (TimestampTC("yyyy-MM-dd HH:mm:ss:S", "2020-10-01 
10:54:00:1345",false, true, true,
Again, I'd add a test where there are tokens after S



--
To view, visit http://gerrit.cloudera.org:8080/16869
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e870bb8ad8fd14d388f37dfc5175589ecf9a5a7
Gerrit-Change-Number: 16869
Gerrit-PatchSet: 3
Gerrit-Owner: fifteencai <fifteen...@tencent.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: fifteencai <fifteen...@tencent.com>
Gerrit-Comment-Date: Tue, 15 Dec 2020 13:24:38 +0000
Gerrit-HasComments: Yes

Reply via email to