Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14714 )
Change subject: IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3 ...................................................................... Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-iso-sql-format-tokenizer.cc File be/src/runtime/datetime-iso-sql-format-tokenizer.cc: http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@200 PS3, Line 200: (IsUsedToken("MM") && (IsUsedToken("MONTH") || IsUsedToken("MON"))) || : (IsUsedToken("MONTH") && IsUsedToken("MON")) Probably it would be simpler to do this check with a counter, similarly to L215-219. http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h File be/src/runtime/datetime-parser-common.h: http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@22 PS3, Line 22: #include <unordered_set> : #include <unordered_map> : #include <vector> nit: order includes alphabetically. http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@75 PS3, Line 75: TUE Closing apostrophe is missing. http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@315 PS3, Line 315: month or day nit: month or day name http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.cc File be/src/runtime/datetime-parser-common.cc: http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.cc@223 PS3, Line 223: // If it's ambiguous where the end of the month token is in the input then try to find : // the end of the month token by probing the input string. : int search_len = token_len; : while (search_len >= 3) { : const auto iter = MONTH_NAME_TO_INDEX.find(buff.substr(0, search_len)); : if (iter != MONTH_NAME_TO_INDEX.end()) { : *month = iter->second; : *token_end = token_start + search_len; : return true; : } : --search_len; : } I think, we could do this faster by taking advantage of the fact that checking the first 3 letters of the month is enough to decide what month to look for. const unordered_map<string, string> month_prefix_to_suffix = { {"jan", "uary"}, {"feb", "ruary"}, .. {"dec", "ember"} }; if (token_len >= 3) { const auto& prefix = buff.substr(0, 3); const auto it = month_prefix_to_suffix.find(prefix); if (it != month_prefix_to_suffix.end()) { const auto& suffix = it->second; // // Test that 'suffix' matches the rest of 'buff'. // .. } } return false; What do you think? -- To view, visit http://gerrit.cloudera.org:8080/14714 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic797f19a1311b54e5d00d01d0a7afe1f0f21fb8f Gerrit-Change-Number: 14714 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Comment-Date: Tue, 19 Nov 2019 16:02:57 +0000 Gerrit-HasComments: Yes