Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14714 )

Change subject: IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@200
PS3, Line 200: (IsUsedToken("MM") && (IsUsedToken("MONTH") || 
IsUsedToken("MON"))) ||
             :       (IsUsedToken("MONTH") && IsUsedToken("MON"))
Probably it would be simpler to do this check with a counter, similarly to 
L215-219.


http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h
File be/src/runtime/datetime-parser-common.h:

http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@22
PS3, Line 22: #include <unordered_set>
            : #include <unordered_map>
            : #include <vector>
nit: order includes alphabetically.


http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@75
PS3, Line 75: TUE
Closing apostrophe is missing.


http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@315
PS3, Line 315: month or day
nit: month or day name


http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.cc
File be/src/runtime/datetime-parser-common.cc:

http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.cc@223
PS3, Line 223:   // If it's ambiguous where the end of the month token is in 
the input then try to find
             :   // the end of the month token by probing the input string.
             :   int search_len = token_len;
             :   while (search_len >= 3) {
             :     const auto iter = MONTH_NAME_TO_INDEX.find(buff.substr(0, 
search_len));
             :     if (iter != MONTH_NAME_TO_INDEX.end()) {
             :       *month = iter->second;
             :       *token_end = token_start + search_len;
             :       return true;
             :     }
             :     --search_len;
             :   }
I think, we could do this faster by taking advantage of the fact that checking 
the first 3 letters of the month is enough to decide what month to look for.

const unordered_map<string, string> month_prefix_to_suffix = {
  {"jan", "uary"},
  {"feb", "ruary"},
..
  {"dec", "ember"}
};

if (token_len >= 3) {
  const auto& prefix = buff.substr(0, 3);
  const auto it = month_prefix_to_suffix.find(prefix);
  if (it != month_prefix_to_suffix.end()) {
    const auto& suffix = it->second;
    //
    // Test that 'suffix' matches the rest of 'buff'.
    // ..
  }
}

return false;

What do you think?



--
To view, visit http://gerrit.cloudera.org:8080/14714
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic797f19a1311b54e5d00d01d0a7afe1f0f21fb8f
Gerrit-Change-Number: 14714
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Nov 2019 16:02:57 +0000
Gerrit-HasComments: Yes

Reply via email to