Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16869 )
Change subject: IMPALA-9922: A better approach to deal with date's sub-second fractions ...................................................................... Patch Set 3: (10 comments) Thanks for this change! http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc File be/src/runtime/datetime-simple-date-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@427 PS3, Line 427: if (tok.type == FRACTION){ can fit into one line http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@433 PS3, Line 433: if (!ParseAndValidate(tok_val, tok_len, 0, 9999, &dt_result->year)){ This and the ifs below fit into a single line, no need to wrap them. http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@483 PS3, Line 483: if( nit: add a space after 'if' http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@483 PS3, Line 483: { nit: leave a space before '{' http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@483 PS3, Line 483: if(tok.pos + tok.len > str_len){ I think this doesn't work as desired when there is a timezone offset after the fraction part. I'm not sure, but please add some test coverage for this to be on the safe side. Giving this a second though the timezone offset might have been trimmed from the end, but still it is feasible that you have additional tokens after the fractions. Souldn't you instead treat fractions the same way as single-character tokens that could accept any length of input? http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/datetime-simple-date-format-parser.cc@487 PS3, Line 487: { nit: leave a space before '{' http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@628 PS3, Line 628: // Test sub-second value with insufficient length. As you also changed iso-sql-format-parser please add similar tests to tests/query_test/test_cast_with_format.py (This iso-sql-format is used for CAST(... FORMAT...) kind of queries) http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@632 PS3, Line 632: (TimestampTC("yyyy-MM-dd HH:mm:ss:SSS", "2020-10-01 10:54:00:1345",false, true, true, Would it truncate if there are tokens after SSS? E.g.: TimestampTC("yyyy-MM-dd mm:ss:SSS:HH", "2020-10-01 54:00:1345:10" http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@635 PS3, Line 635: ,false nit: space after ',' http://gerrit.cloudera.org:8080/#/c/16869/3/be/src/runtime/timestamp-test.cc@638 PS3, Line 638: (TimestampTC("yyyy-MM-dd HH:mm:ss:S", "2020-10-01 10:54:00:1345",false, true, true, Again, I'd add a test where there are tokens after S -- To view, visit http://gerrit.cloudera.org:8080/16869 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8e870bb8ad8fd14d388f37dfc5175589ecf9a5a7 Gerrit-Change-Number: 16869 Gerrit-PatchSet: 3 Gerrit-Owner: fifteencai <fifteen...@tencent.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: fifteencai <fifteen...@tencent.com> Gerrit-Comment-Date: Tue, 15 Dec 2020 13:24:38 +0000 Gerrit-HasComments: Yes