Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. ......................................................................
Patch Set 6: (32 comments) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: specified direction(s) of a STRING value. > Then change the name of the new function away from btrim to something else. Greetings, Jim. This old function name "btrim" has been discarded. And it has been replaced using the new one "trim". Thank you. http://gerrit.cloudera.org:8080/#/c/4474/6//COMMIT_MSG Commit Message: PS6, Line 14: Syntax #3 confirms the standard SQL syntax (Core SQL feature ID > All of them are part of the standard, as is simply TRIM(string_to_be_trimme Done PS6, Line 18: | > just use regular prose here: Valid options are "leading", "trailing", and " Done PS6, Line 22: NULL > How is a NULL passed? Is it literally just "trim(NULL from 'foo')"? Greetings, Jim. Yes, it is exact as you have said. But I have to say: "trim(NULL from 'foo')" is invalid. The correct calling form is like: trim(NULL 'f' from 'foo'); PS6, Line 22: empty argument > What does "empty argument" mean in this case? Greetings, Jim. "empty argument" means an empty string like: "" or ''. Extra explaination has been added. Thank you for pointing out this. :) PS6, Line 23: returns > not just "returns", "TRIM returns". Here and below. Greetings, Jim. Thank you for pointing out that. :) PS6, Line 25: "trim_char_set": Case-sensitive characters to be removed. This > Please separate paragraphs by a blank line. Done PS6, Line 29: " " > Is the standard space or all whitespace, including tabs, newlines, etc.? Greetings, Jim. I mean the standard space here. This comment has been fixed. Thank you. :) PS6, Line 32: Blank > How is a "blank" argument different than an "empty argument" above? Greetings, Jim. No, "Blank" means nothing provided. For example: select trim(both " " from ); This is the case of "blank" argument. As you can see here, this definitely causes parsing error. Thank you. PS6, Line 33: ote: For Syntax #1, since no "characters" is specified, it trims : " "(space) by default. For Syntax #2, since no "where" is specified, : the option both is implied by default. > Please split this up and put it above in the text on each option. Done PS6, Line 44: @&@&@&@ > This is hard to read. Can you stick to alphanumeric characters in these exa Greetings, Jim. Thank you for pointing out that. Corresponding tests will be modified according to your guideline. http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 1977: TestStringValue("trim(leading FROM ' &@&127+ &@ ')", "&@&127+ &@ "); > Please try to have each test case test one thing different from every other Done http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 788: static inline bitset<256>* TrimInternalPrepare(FunctionContext* ctx, > Please use unnamed namespaces; see https://google.github.io/styleguide/cppg Done PS6, Line 805: bitset > const Done PS6, Line 806: & > Out parameters, when used, are not references. https://google.github.io/sty Done PS6, Line 806: begin > prefer to return, not pass out parameters. Done PS6, Line 807: static_cast<int> > What type is str.ptr[begin]? What type does test take? Do you need this cas Greetings, Jim. The type of str.ptr[begin] is uint8_t. I believe we don't need this case in this case. PS6, Line 807: test > Please use [], not test - we don't need the bounds checking. Done PS6, Line 813: int & > begin is not modified. Done PS6, Line 822: bitset > const Done PS6, Line 824: int32_t > Don't use int32_t and int interchangeably without a specific reason Done PS6, Line 835: str > The original functions made a new string val, rather than returning this on Greetings, Jim. I guess it is related to the dynamic memory management. Am I right? :) Line 848: option.ptr[i] = std::tolower(option.ptr[i]); > Though awkward, please insert the required casts here. Done PS6, Line 851: TrimOption > This should be done at parsing/analysis time, for efficiency reasons Done PS6, Line 855: INVALID > LEADING = 1, TRAILING = 2, BOTH = LEADING | TRAILING, INVALID = ~BOTH. The Done PS6, Line 858: opt_ > https://google.github.io/styleguide/cppguide.html#Variable_Names Done PS6, Line 861: ( > Please use clang-format to format your code. Done PS6, Line 861: 7 > try to avoid repeating literals, like 7. Maybe set static constexpr char LE Done PS6, Line 863: both > bad copy Done PS6, Line 866: both > bad copy Done http://gerrit.cloudera.org:8080/#/c/4474/1/be/src/exprs/string-functions.h File be/src/exprs/string-functions.h: PS1, Line 123: Base6 > And while you're changing the name, no need to call this BTrim. Greetings, Jim. I mean no offense and no harm. But it seems you are looking at some old change I uploaded previously. The latest change is here: https://gerrit.cloudera.org/#/c/4474/6/be/src/exprs/string-functions.h Sincerely hope this link is helpful for your review. Thank you. http://gerrit.cloudera.org:8080/#/c/4474/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: PS1, Line 478: ezza > This is the string you should change. Change it to something else, like sql Greetings, Jim. I mean no offense and no harm. But it seems you are looking at some old change I uploaded previously. The latest change is here: https://gerrit.cloudera.org/#/c/4474/6/common/function-registry/impala_functions.py To be straightforward, the latest function entry is like: [['trim'], 'STRING', ['STRING', 'STRING', 'STRING'], 'impala::StringFunctions::TrimStringWithOption', '_ZN6impala15StringFunctions12BTrimPrepareEPN10impala_udf15FunctionContextENS2_18FunctionStateScopeE', '_ZN6impala15StringFunctions10BTrimCloseEPN10impala_udf15FunctionContextENS2_18FunctionStateScopeE'], I don't think we can put this entry in the invisible part since doing this could make this function entry lose connection to the front-end call. Please feel free to point out my mistake if possible. Thank you. -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang <youwei.a.w...@intel.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Jim Apple <jbap...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com> Gerrit-Reviewer: Youwei Wang <youwei.a.w...@intel.com> Gerrit-HasComments: Yes