[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Jim Apple has abandoned this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Jim Apple has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 8: (24 comments) http://gerrit.cloudera.org:8080/#/c/4474/8//COMMIT_MSG Commit Message: Line 14: All syntaxes confirm the standard SQL syntax (Core SQL feature ID Please describe the existing syntax, too, and say that it already works. PS8, Line 18: trailing Please capitalize these. PS8, Line 18: leading Please try to match our existing formatting choices. This includes breaking lines at the 70-character mark, not arbitrarily early as in line 17. PS8, Line 19: Do not indent lines after the first one in a paragraph. Please use git log to study what previous commit messages looked like. PS8, Line 23: option this word is not needed. PS8, Line 22: empty : argument("" or '') just say "the empty string" PS8, Line 24: Syntax Why is this word capitalized, here and below PS8, Line 24: target "target" was not used earlier. Do you mean string_to_be_trimmed? PS8, Line 31: (standard space) You don't have to say "standard space" here or below. I was asking about what the standard says. Are you sure it just says " "? What do postgres and mysql do? PS8, Line 43: abcdefg Make the parameter strings even shorter PS8, Line 46: Syntax #2: Please separate these sections by blank lines. PS8, Line 51: " What is this doing here? Same above. Please try to be more consciencious about these things. http://gerrit.cloudera.org:8080/#/c/4474/8/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: PS8, Line 1977: Make the test cases even smaller. http://gerrit.cloudera.org:8080/#/c/4474/8/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: PS8, Line 799: static static in an unnamed namespace is redundant. PS8, Line 817: static Put this in the unnamed namespace PS8, Line 817: * & PS8, Line 818: begin Is this always 0? Here and below PS8, Line 838: int begin = 0; : begin = Make this one line PS8, Line 849: StringVal Does this do the right thing is is_null is true but ptr and len are set? PS8, Line 862: option.ptr[i] incorrect argument type PS8, Line 865: trim_option This is still not done at parsing/analysis time. That work happens in the front-end. Line 893: if (trim_option == TrimOption::LEADING || trim_option == TrimOption::BOTH) { trim_option | TrimOption::LEADING http://gerrit.cloudera.org:8080/#/c/4474/8/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: Line 429: [['trim'], 'STRING', ['STRING'], 'impala::StringFunctions::Trim'], Is this line still needed, or is it covered by the new grammar rules? Line 430: [['trim'], 'STRING', ['STRING', 'STRING', 'STRING'], 'impala::StringFunctions::TrimStringWithOption', Did you try changing this name and making it invisible? What happens? -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
Re: [Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
There are more than 40 other code reviews pending right now. https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF This is not an urgent bug fix, so it's lower priority, and it's already on patch set 8, and it's not that close to being +1ed. Please be patient. On Sun, Oct 9, 2016 at 6:25 PM, Youwei Wang (Code Review) wrote: > Youwei Wang has posted comments on this change. > > Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() > function. > .. > > > Patch Set 8: > > Greetings everyone. The review status of this patch has paused for a few > days. If possible, would you please move this a bit forward? Thank you. :) > > -- > To view, visit http://gerrit.cloudera.org:8080/4474 > To unsubscribe, visit http://gerrit.cloudera.org:8080/settings > > Gerrit-MessageType: comment > Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 > Gerrit-PatchSet: 8 > Gerrit-Project: Impala-ASF > Gerrit-Branch: master > Gerrit-Owner: Youwei Wang > Gerrit-Reviewer: Alex Behm > Gerrit-Reviewer: Jim Apple > Gerrit-Reviewer: Mostafa Mokhtar > Gerrit-Reviewer: Youwei Wang > Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 8: Greetings everyone. The review status of this patch has paused for a few days. If possible, would you please move this a bit forward? Thank you. :) -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has uploaded a new patch set (#8). Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. IMPALA-889: Add support for an ISO-SQL compliant trim() function. Purpose: Removes all instances of one or more characters from the specified direction(s) of a STRING value. Syntax #1 TRIM(where FROM string_to_be_trimmed); Syntax #2 TRIM(trim_char_set FROM string_to_be_trimmed); Syntax #3 TRIM(where trim_char_set FROM string_to_be_trimmed); All syntaxes confirm the standard SQL syntax (Core SQL feature ID E021-09). "where": Case-insensitive trim direction. Valid options are leading, trailing, and both. leading means trimming characters from the start; trailing means trimming characters from the end; both means trimming characters from both sides. These options should be provided without single/double quotation marks since they are not string literals but identifiers. NULL or empty argument("" or '') implies "both" option. If an invalid option is specified, TRIM returns target untouched. For Syntax #2, since no "where" is specified, the option both is implied by default. "trim_char_set": Case-sensitive characters to be removed. This argument is regarded as a character set going to be removed. So the occurrence order of each character doesn't matter and duplicated instance of same character will be ignored. NULL argument implies " "(standard space) by default. Empty argument("" or '') makes TRIM return "string_to_be_trimmed" untouched. For Syntax #1, since no "trim_char_set" is specified, it trims " "(standard space) by default. "target": Case-sensitive target string to trim. This argument can be NULL. Blank argument causes parsing error. Return type: string Examples: Syntax #1: trim(both 'abfg' from 'abcdefg'); Result: 'cde'; trim(leading FROM ' 123 '); Result: '123 '; trim(trailing FROM ' 123 '); Result: ' 123'; Syntax #2: trim('xyz' FROM 'zyxabczyx')"; Result: "abc"; Syntax #3: trim(leading 'ab' from 'abcdefg'); Result: 'cdefg'; trim(trailing 'bfg%' from 'abcdefg'); Result: 'abcde'; trim(both 'abfg' from 'abcdefg')"; Result "cde"; Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M common/function-registry/impala_functions.py M common/thrift/Exprs.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/TrimExpr.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 9 files changed, 406 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4474/8 -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 6: (32 comments) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: specified direction(s) of a STRING value. > Then change the name of the new function away from btrim to something else. Greetings, Jim. This old function name "btrim" has been discarded. And it has been replaced using the new one "trim". Thank you. http://gerrit.cloudera.org:8080/#/c/4474/6//COMMIT_MSG Commit Message: PS6, Line 14: Syntax #3 confirms the standard SQL syntax (Core SQL feature ID > All of them are part of the standard, as is simply TRIM(string_to_be_trimme Done PS6, Line 18: | > just use regular prose here: Valid options are "leading", "trailing", and " Done PS6, Line 22: NULL > How is a NULL passed? Is it literally just "trim(NULL from 'foo')"? Greetings, Jim. Yes, it is exact as you have said. But I have to say: "trim(NULL from 'foo')" is invalid. The correct calling form is like: trim(NULL 'f' from 'foo'); PS6, Line 22: empty argument > What does "empty argument" mean in this case? Greetings, Jim. "empty argument" means an empty string like: "" or ''. Extra explaination has been added. Thank you for pointing out this. :) PS6, Line 23: returns > not just "returns", "TRIM returns". Here and below. Greetings, Jim. Thank you for pointing out that. :) PS6, Line 25: "trim_char_set": Case-sensitive characters to be removed. This > Please separate paragraphs by a blank line. Done PS6, Line 29: " " > Is the standard space or all whitespace, including tabs, newlines, etc.? Greetings, Jim. I mean the standard space here. This comment has been fixed. Thank you. :) PS6, Line 32: Blank > How is a "blank" argument different than an "empty argument" above? Greetings, Jim. No, "Blank" means nothing provided. For example: select trim(both " " from ); This is the case of "blank" argument. As you can see here, this definitely causes parsing error. Thank you. PS6, Line 33: ote: For Syntax #1, since no "characters" is specified, it trims : " "(space) by default. For Syntax #2, since no "where" is specified, : the option both is implied by default. > Please split this up and put it above in the text on each option. Done PS6, Line 44: @&@&@&@ > This is hard to read. Can you stick to alphanumeric characters in these exa Greetings, Jim. Thank you for pointing out that. Corresponding tests will be modified according to your guideline. http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 1977: TestStringValue("trim(leading FROM ' &@&127+ &@ ')", "&@&127+ &@ "); > Please try to have each test case test one thing different from every other Done http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 788: static inline bitset<256>* TrimInternalPrepare(FunctionContext* ctx, > Please use unnamed namespaces; see https://google.github.io/styleguide/cppg Done PS6, Line 805: bitset > const Done PS6, Line 806: & > Out parameters, when used, are not references. https://google.github.io/sty Done PS6, Line 806: begin > prefer to return, not pass out parameters. Done PS6, Line 807: static_cast > What type is str.ptr[begin]? What type does test take? Do you need this cas Greetings, Jim. The type of str.ptr[begin] is uint8_t. I believe we don't need this case in this case. PS6, Line 807: test > Please use [], not test - we don't need the bounds checking. Done PS6, Line 813: int & > begin is not modified. Done PS6, Line 822: bitset > const Done PS6, Line 824: int32_t > Don't use int32_t and int interchangeably without a specific reason Done PS6, Line 835: str > The original functions made a new string val, rather than returning this on Greetings, Jim. I guess it is related to the dynamic memory management. Am I right? :) Line 848: option.ptr[i] = std::tolower(option.ptr[i]); > Though awkward, please insert the required casts here. Done PS6, Line 851: TrimOption > This should be done at parsing/analysis time, for efficiency reasons Done PS6, Line 855: INVALID > LEADING = 1, TRAILING = 2, BOTH = LEADING | TRAILING, INVALID = ~BOTH. The Done PS6, Line 858: opt_ > https://google.github.io/styleguide/cppguide.html#Variable_Names Done PS6, Line 861: ( > Please use clang-format to format your code. Done PS6, Line 861: 7 > try to avoid repeating literals, like 7. Maybe set static constexpr char LE Done PS6, Line 863: both > bad copy Done PS6, Line 866: both > bad copy Done http://gerrit.cloudera.org:8080/#/c/4474/1/be/src/exprs/string-functions.h File be/src/exprs/string-functions.h: PS1, Line 123: Base6 > And while you're changing the name, no need to call this BTrim. Greetings, Jim. I mean no o
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has uploaded a new patch set (#7). Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. IMPALA-889: Add support for an ISO-SQL compliant trim() function. Purpose: Removes all instances of one or more characters from the specified direction(s) of a STRING value. Syntax #1 TRIM(where FROM string_to_be_trimmed); Syntax #2 TRIM(trim_char_set FROM string_to_be_trimmed); Syntax #3 TRIM(where trim_char_set FROM string_to_be_trimmed); All syntaxes confirm the standard SQL syntax (Core SQL feature ID E021-09). "where": Case-insensitive trim direction. Valid options are leading, trailing, and both. leading means trimming characters from the start; trailing means trimming characters from the end; both means trimming characters from both sides. These options should be provided without single/double quotation marks since they are not string literals but identifiers. NULL or empty argument("" or '') implies "both" option. If an invalid option is specified, TRIM returns target untouched. For Syntax #2, since no "where" is specified, the option both is implied by default. "trim_char_set": Case-sensitive characters to be removed. This argument is regarded as a character set going to be removed. So the occurrence order of each character doesn't matter and duplicated instance of same character will be ignored. NULL argument implies " "(standard space) by default. Empty argument("" or '') makes TRIM return "string_to_be_trimmed" untouched. For Syntax #1, since no "trim_char_set" is specified, it trims " "(standard space) by default. "target": Case-sensitive target string to trim. This argument can be NULL. Blank argument causes parsing error. Return type: string Examples: Syntax #1: trim(both 'abfg' from 'abcdefg'); Result: 'cde'; trim(leading FROM ' 123 '); Result: '123 '; trim(trailing FROM ' 123 '); Result: ' 123'; Syntax #2: trim('xyz' FROM 'zyxabczyx')"; Result: "abc"; Syntax #3: trim(leading 'ab' from 'abcdefg'); Result: 'cdefg'; trim(trailing 'bfg%' from 'abcdefg'); Result: 'abcde'; trim(both 'abfg' from 'abcdefg')"; Result "cde"; Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M common/function-registry/impala_functions.py M common/thrift/Exprs.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/TrimExpr.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 9 files changed, 406 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4474/7 -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Jim Apple has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 6: (32 comments) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: specified direction(s) of a STRING value. > Greetings, Jim. Then change the name of the new function away from btrim to something else. http://gerrit.cloudera.org:8080/#/c/4474/6//COMMIT_MSG Commit Message: PS6, Line 14: Syntax #3 confirms the standard SQL syntax (Core SQL feature ID All of them are part of the standard, as is simply TRIM(string_to_be_trimmed). http://savage.net.au/SQL/sql-92.bnf.html PS6, Line 18: | just use regular prose here: Valid options are "leading", "trailing", and "both" PS6, Line 22: empty argument What does "empty argument" mean in this case? PS6, Line 22: NULL How is a NULL passed? Is it literally just "trim(NULL from 'foo')"? PS6, Line 23: returns not just "returns", "TRIM returns". Here and below. PS6, Line 25: "trim_char_set": Case-sensitive characters to be removed. This Please separate paragraphs by a blank line. PS6, Line 29: " " Is the standard space or all whitespace, including tabs, newlines, etc.? PS6, Line 32: Blank How is a "blank" argument different than an "empty argument" above? PS6, Line 33: ote: For Syntax #1, since no "characters" is specified, it trims : " "(space) by default. For Syntax #2, since no "where" is specified, : the option both is implied by default. Please split this up and put it above in the text on each option. PS6, Line 44: @&@&@&@ This is hard to read. Can you stick to alphanumeric characters in these examples, please? The smaller the input that illustrates the point, the clearer it is. This applies to tests, as well. http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 1977: TestStringValue("trim(leading FROM ' &@&127+ &@ ')", "&@&127+ &@ "); Please try to have each test case test one thing different from every other test case. What a case is testing should be clear by looking at it. The smaller the test case is, the better. As an example, would this test case be better as "(trim leading FROM ' a b ')"? http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 788: static inline bitset<256>* TrimInternalPrepare(FunctionContext* ctx, Please use unnamed namespaces; see https://google.github.io/styleguide/cppguide.html PS6, Line 805: bitset const PS6, Line 806: & Out parameters, when used, are not references. https://google.github.io/styleguide/cppguide.html#Reference_Arguments PS6, Line 806: begin prefer to return, not pass out parameters. PS6, Line 807: test Please use [], not test - we don't need the bounds checking. PS6, Line 807: static_cast What type is str.ptr[begin]? What type does test take? Do you need this cast? Why or why not? PS6, Line 813: int & begin is not modified. PS6, Line 822: bitset const PS6, Line 824: int32_t Don't use int32_t and int interchangeably without a specific reason PS6, Line 835: str The original functions made a new string val, rather than returning this one. Do you know why? Line 848: option.ptr[i] = std::tolower(option.ptr[i]); Though awkward, please insert the required casts here. PS6, Line 851: TrimOption This should be done at parsing/analysis time, for efficiency reasons PS6, Line 855: INVALID LEADING = 1, TRAILING = 2, BOTH = LEADING | TRAILING, INVALID = ~BOTH. The comparison on line 886 gets simplified. PS6, Line 858: opt_ https://google.github.io/styleguide/cppguide.html#Variable_Names PS6, Line 861: ( Please use clang-format to format your code. PS6, Line 861: 7 try to avoid repeating literals, like 7. Maybe set static constexpr char LEADING[] = "leading", then check if option.len == sizeof(LEADING)-1? See how that looks. PS6, Line 863: both bad copy PS6, Line 866: both bad copy http://gerrit.cloudera.org:8080/#/c/4474/1/be/src/exprs/string-functions.h File be/src/exprs/string-functions.h: PS1, Line 123: Base6 And while you're changing the name, no need to call this BTrim. http://gerrit.cloudera.org:8080/#/c/4474/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: PS1, Line 478: ezza This is the string you should change. Change it to something else, like sql92trim, then make that invisible. -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/4474/3//COMMIT_MSG Commit Message: Line 9: Purpose: Removes all instances of one or more characters from the > We should definitely not invent a new function name and stick to trim(). Th Done Line 32: NULL. Blank argument causes parsing error. > The 'where' and 'trim_char_set' clauses are optional, so the following are Done http://gerrit.cloudera.org:8080/#/c/4474/3/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: Line 2418: {: RESULT = c; :} > Not clear what this rule adds. The copy+paste nature of it means an increas Done Line 2449: > My understanding is that several of the trim() clauses are optional, e.g., Done -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has uploaded a new patch set (#6). Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. IMPALA-889: Add support for an ISO-SQL compliant trim() function. Purpose: Removes all instances of one or more characters from the specified direction(s) of a STRING value. Syntax #1 TRIM(where FROM string_to_be_trimmed); Syntax #2 TRIM(trim_char_set FROM string_to_be_trimmed); Syntax #3 TRIM(where trim_char_set FROM string_to_be_trimmed); Syntax #3 confirms the standard SQL syntax (Core SQL feature ID E021-09). "where": Case-insensitive trim direction. Valid options are: leading|trailing|both. leading means trimming characters from the start; trailing means trimming characters from the end; both means trimming characters from both sides. These options should be provided without single/double quotation marks since they are not string literals but identifiers. NULL or empty argument implies "both" option. If an invalid option is specified, returns target untouched. "trim_char_set": Case-sensitive characters to be removed. This argument is regarded as a character set going to be removed. So the occurrence order of each character doesn't matter and duplicated instance of same character will be ignored. NULL argument implies " "(space) by default. Empty argument return string_to_be_trimmed untouched. "target": Case-sensitive target string to trim. This argument can be NULL. Blank argument causes parsing error. Note: For Syntax #1, since no "characters" is specified, it trims " "(space) by default. For Syntax #2, since no "where" is specified, the option both is implied by default. Return type: string Examples: Syntax #1: trim(both 'abfg%' from 'abc%%defg%'); returns 'c%%de'; trim(leading FROM ' 123 '); returns '123 '; trim(trailing FROM ' 123 '); returns ' 123'; Syntax #2: trim('&@' FROM '&@&@&@&127+1-@&@&@&@')"; returns "127+1-"; Syntax #3: trim(leading 'ab%' from 'abc%%defg%'); returns 'c%%defg%'; trim(trailing 'bfg%' from 'abc%%defg%'); returns 'abc%%de'; trim(both 'abfg%' from 'abc%%defg%')"; returns "c%%de"; Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M common/function-registry/impala_functions.py M common/thrift/Exprs.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/TrimExpr.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 9 files changed, 427 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4474/6 -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has uploaded a new patch set (#5). Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. IMPALA-889: Add support for an ISO-SQL compliant trim() function. Purpose: Removes all instances of one or more characters from the specified direction(s) of a STRING value. Syntax #1 TRIM(where FROM string_to_be_trimmed); Syntax #2 TRIM(trim_char_set FROM string_to_be_trimmed); Syntax #3 TRIM(where trim_char_set FROM string_to_be_trimmed); Syntax #3 confirms the standard SQL syntax (Core SQL feature ID E021-09). "where": Case-insensitive trim direction. Valid options are: leading|trailing|both. leading means trimming characters from the start; trailing means trimming characters from the end; both means trimming characters from both sides. These options should be provided without single/double quotation marks since they are not string literals but identifiers. NULL or empty argument implies "both" option. If an invalid option is specified, returns target untouched. "trim_char_set": Case-sensitive characters to be removed. This argument is regarded as a character set going to be removed. So the occurrence order of each character doesn't matter and duplicated instance of same character will be ignored. NULL argument implies " "(space) by default. Empty argument return string_to_be_trimmed untouched. "target": Case-sensitive target string to trim. This argument can be NULL. Blank argument causes parsing error. Note: For Syntax #1, since no "characters" is specified, it trims " "(space) by default. For Syntax #2, since no "where" is specified, the option both is implied by default. Return type: string Examples: Syntax #1: trim(both 'abfg%' from 'abc%%defg%'); returns 'c%%de'; trim(leading FROM ' 123 '); returns '123 '; trim(trailing FROM ' 123 '); returns ' 123'; Syntax #2: trim('&@' FROM '&@&@&@&127+1-@&@&@&@')"; returns "127+1-"; Syntax #3: trim(leading 'ab%' from 'abc%%defg%'); returns 'c%%defg%'; trim(trailing 'bfg%' from 'abc%%defg%'); returns 'abc%%de'; trim(both 'abfg%' from 'abc%%defg%')"; returns "c%%de"; Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M common/function-registry/impala_functions.py M common/thrift/Exprs.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/com/cloudera/impala/analysis/TrimExpr.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeExprsTest.java 9 files changed, 429 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4474/5 -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has uploaded a new patch set (#4). Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. IMPALA-889: Add support for an ISO-SQL compliant trim() function. Purpose: Removes all instances of one or more characters from the specified direction(s) of a STRING value. Syntax #1 TRIM(where FROM string_to_be_trimmed); Syntax #2 TRIM(trim_char_set FROM string_to_be_trimmed); Syntax #3 TRIM(where trim_char_set FROM string_to_be_trimmed); Syntax #3 confirms the standard SQL syntax (Core SQL feature ID E021-09). "where": Case-insensitive trim direction. Valid options are: leading|trailing|both. leading means trimming characters from the start; trailing means trimming characters from the end; both means trimming characters from both sides. These options should be provided without single/double quotation marks since they are not string literals but identifiers. NULL or empty argument implies "both" option. If an invalid option is specified, returns target untouched. "trim_char_set": Case-sensitive characters to be removed. This argument is regarded as a character set going to be removed. So the occurrence order of each character doesn't matter and duplicated instance of same character will be ignored. NULL argument implies " "(space) by default. Empty argument return string_to_be_trimmed untouched. "target": Case-sensitive target string to trim. This argument can be NULL. Blank argument causes parsing error. Note: For Syntax #1, since no "characters" is specified, it trims " "(space) by default. For Syntax #2, since no "where" is specified, the option both is implied by default. Return type: string Examples: Syntax #1: trim(both 'abfg%' from 'abc%%defg%'); returns 'c%%de'; trim(leading FROM ' 123 '); returns '123 '; trim(trailing FROM ' 123 '); returns ' 123'; Syntax #2: trim('&@' FROM '&@&@&@&127+1-@&@&@&@')"; returns "127+1-"; Syntax #3: trim(leading 'ab%' from 'abc%%defg%'); returns 'c%%defg%'; trim(trailing 'bfg%' from 'abc%%defg%'); returns 'abc%%de'; trim(both 'abfg%' from 'abc%%defg%')"; returns "c%%de"; Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M common/function-registry/impala_functions.py A common/function-registry/string-functions-ir.cc M common/thrift/Exprs.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/com/cloudera/impala/analysis/TrimExpr.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeExprsTest.java 10 files changed, 1,439 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4474/4 -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Alex Behm has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 3: (4 comments) Missing parser tests as well as TrimExpr.java http://gerrit.cloudera.org:8080/#/c/4474/3//COMMIT_MSG Commit Message: Line 9: Syntax #1: BTRIM(string where, string trim_char_set, string target); We should definitely not invent a new function name and stick to trim(). The existing trim() is a special case of the more general ISO-SQL trim(), so we should fold the old functionality into the new more general expr. Line 32: btrim('left', 'a%', 'abc%%defg%'); returns 'bc%%defg%'; The 'where' and 'trim_char_set' clauses are optional, so the following are also legal uses of trim(): trim(' abcde '); trim(right from 'abcde '); trim('z' from 'zzzabczzz'); trim(left 'xz' from 'abcxz'); http://gerrit.cloudera.org:8080/#/c/4474/3/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: Line 2418: string_expr ::= Not clear what this rule adds. The copy+paste nature of it means an increased maintenance burden. Line 2449: // Below is a special case for BTRIM. Idents are used to avoid adding new keywords. My understanding is that several of the trim() clauses are optional, e.g., the location and the characters to trim. The expr seems complicated enough that we should make TRIM a keyword and add a separate production "trim_expr ::=" to recognize it (keep the optional clauses in mind). Most other DBs seem to have TRIM as a keyword also, so we'd not be alone here. -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: Purpose: Removes all instances of one or more characters from the > Please look at invisible_functions, line 672 of that file. Greetings, Jim. Thank you so much for providing this great idea. However, I am afraid I must say this approach doesn't work as it is intended to be. If I move the btrim entry to the invisible_functions part, both two syntaxes are unavailable now for these two syntaxes share the same entry. I have verified this by experiments. Please feel free to point out my mistake if you think I am wrong. Thank you. :) -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Jim Apple has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: Purpose: Removes all instances of one or more characters from the > Greetings, Jim. Please look at invisible_functions, line 672 of that file. -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: Purpose: Removes all instances of one or more characters from the > "Pros" means the reasons for something, while "cons" means the reasons agai Greetings, Jim. Please be kind to allow me to give some explanations from the start: 1. In order to support the syntax like: btrim(heading|trailing|both char_set from string), it is necessary to register an entry in the common/function-registry/impala_functions.py file. Actually, it is also mapped to the entry point of the backend logic. 2. Such entry will be encoded in the frontend built-in UDF category for parsing usage and it will be called when users input such query: “select btrim(heading|trailing|both char_set from string)”; 3. Once such entry is ready and compiled, the syntax of the form #1 will be automatically supported, no matter you like it or not. This is exactly the same case as this function “extract(timestamp, string unit) || extract(unit FROM timestamp)” shown in this link: http://www.cloudera.com/documentation/archive/impala/2-x/2-1-x/topics/impala_datetime_functions.html It is easy for me to conceal the description of the form 1. But I can’t stop users from using it if they have some hacking skills and dig deeply into the impala_functions.py file. So in my opinion, it would be more friendly to point out the existence of such syntax - maybe there is someone whoever wants it. I am not sure whether my explanation can answer your question that “why there should be two different syntaxes for the same function.”If you don’t think my explanation is right. Please feel free to point out, thank you. :) PS1, Line 12: "where": Case-insensitive trim direction. Valid options are: > I don't think you mean "an enumerate value" -- "enumerate" is, as far as I Done PS1, Line 13: > This line is still 3 characters over the red line in patch set 2. Please do Done Line 19: "trim_char_set": Case-sensitive characters to be removed. This > I noted "don't just answer those particular questions", but you did exactly Done -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Youwei Wang has uploaded a new patch set (#3). Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. .. IMPALA-889: Add support for an ISO-SQL compliant trim() function. Syntax #1: BTRIM(string where, string trim_char_set, string target); Purpose: Removes all instances of one or more characters from the specified direction(s) of a STRING value. "where": Case-insensitive trim direction. Valid options are: 'left|leading|right|trailing|both'. 'left|leading' means trimming characters from the start; 'right|trailing' means trimming characters from the end; 'both' means trimming characters from both sides. If this argument is NULL or non-NULL but none of the valid options, the function returns the "target" argument untouched. Blank argument causes parsing error. "trim_char_set": Case-sensitive characters to be removed. This argument is regarded as a character set going to be removed. So the occurrence order of each character doesn't matter and duplicated instance of same character will be ignored. If this argument is NULL, the function returns the target string untouched or it removes all occurrences of characters in the "trim_char_set" string according to the "where" option. Blank argument causes parsing error. "target": Case-sensitive target string to trim. Blank argument causes parsing error. Return type: string Examples: btrim('left', 'a%', 'abc%%defg%'); returns 'bc%%defg%'; btrim('right', 'fg%', 'abc%%defg%'); returns 'abc%%de'; btrim('leading', 'ab%', 'abc%%defg%'); returns 'c%%defg%'; btrim('trailing', 'bfg%', 'abc%%defg%'); returns 'abc%%de'; btrim('both', 'abfg%', 'abc%%defg%'); returns 'c%%de'; Syntax #2: BTRIM(where string FROM string); Purpose: Identical as Form #1 except this form confirms the standard SQL syntax (Core SQL feature ID E021-09). "where": Case-insensitive trim direction, can be one of leading, trailing, or both. These options should be provided without single/double quotation marks since they are not string literals but identifiers. For left and right are Impala keywords, they are not available in this syntax. The second string argument corresponds to the "trim_char_set" in Syntax #1 and the third string argument corresponds to the "target" in Syntax #1. Same argument meaning and restriction from Syntax #1 are applied. Return type: string Examples: btrim(leading 'ab%' from 'abc%%defg%'); returns 'c%%defg%'; btrim(trailing 'bfg%' from 'abc%%defg%'); returns 'abc%%de'; btrim(both 'abfg%' from 'abc%%defg%'); returns 'c%%de'; Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M common/function-registry/impala_functions.py M common/thrift/Exprs.thrift M fe/src/main/cup/sql-parser.cup M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeExprsTest.java 7 files changed, 198 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4474/3 -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Youwei Wang