[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8508 ) Change subject: IMPALA-5237: Support a quoted string in date/time format .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc@5747 PS2, Line 5747: '\\'Epoch time: \\'|MM|dd HH|mm|ss')", TYPE_BIGINT, 0); > Please wrap lines at 90 characters. Done http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h File be/src/runtime/timestamp-parse-util.h: http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h@79 PS2, Line 79: // User can specify a quoted string at date/time format. For example, > Please wrap lines at 90 characters (here and below). Done http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc@170 PS2, Line 170: int len = str - string_literal - 1; : // Keep compatibility with Hive which represents a single quote in the format : // e.g. from_unixtime(0, '\'\'') => ' : if (len == 0) len = 1; : DCHECK(len >= 0); : const int pos = str - str_begin - len; : DCHECK(pos >= 0); : c > What if (str - string_literal == 1) ? I missed support the feature: escaping a single quote -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 14 Nov 2017 23:51:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Hello Attila Jeges, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8508 to look at the new patch set (#3). Change subject: IMPALA-5237: Support a quoted string in date/time format .. IMPALA-5237: Support a quoted string in date/time format Impala does not represent a quoted string at date/time format. For example, addtional quoted string between the patterns (e.g. HH\'H\' => 11H). Hive supports this feature, so user wants to get a same result from Impala. By the way, Impala returns a different result as below. * Hive > select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\''); 08H 39M 24S * Impala > select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\''); 08'8' 39'4' 24'0' The change affects the format pattern for from_unixtime/from_timestamp/unix_timestamp. In unix_timestamp, user can also specify a quoted string like this. > select unix_timestamp('\'Epoch time: \'19700101', > '\'Epoch time: \'MMdd'); 0 By the way, the quoted strings between input and format should be exactly same and internally string comparison is case-sensitive. There is one intentional difference between Hive and Impala. In Impala, the format string should have any date or time patten as below. Throwing the error/warning is better if Impala cannot optimize the expression. User must rewrite the query and don't pay the function call. * Hive > select from_unixtime(0, '\'Hello world!\''); Hello world! * Impala > select from_unixtime(0, '\'Hello world!\''); Bad date/time conversion format: 'Hello world!' Testing: Add unit tests for working and nonworking cases Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 --- M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h 3 files changed, 92 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8508/3 -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/8508 ) Change subject: IMPALA-5237: Support a quoted string in date/time format .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/8508/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8508/2//COMMIT_MSG@9 PS2, Line 9: Impala does not represent a quoted string at date/time format. For example, Please wrap commit message lines at 70 characters (here and below). http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc@5747 PS2, Line 5747: '\\'day\\'dd,\\'month\\'MMM,\\'year\\',\\'hour\\'HH,\\'minute\\'mm,\\'second\\'ss')", Please wrap lines at 90 characters. http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h File be/src/runtime/timestamp-parse-util.h: http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h@79 PS2, Line 79: // User can specify a quoted string at date/time format. For example, addtional quoted string Please wrap lines at 90 characters (here and below). http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc@170 PS2, Line 170: if (str - string_literal != 1) { : const int len = str - string_literal - 1; : DCHECK(len >= 0); : const int pos = str - str_begin - len; : DCHECK(pos >= 0); : const char* val = string_literal + 1; : dt_ctx->toks.push_back(DateTimeFormatToken(SEPARATOR, pos, len, val)); : } What if (str - string_literal == 1) ? Note, that in Hive you get an apostrophe: > select from_unixtime(1492677561, 'H\'\''); 10' -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 2 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 14 Nov 2017 18:22:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8508 ) Change subject: IMPALA-5237: Support a quoted string in date/time format .. Patch Set 2: @Tim, it would be great if you also review my comments in the Jira ticket when you are available. I don't find you from the watchers of the ticket, so I'd inform you of this. -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 2 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 10 Nov 2017 00:04:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8508 ) Change subject: IMPALA-5237: Support a quoted string in date/time format .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc@5724 PS1, Line 5724: TestValue("unix_timestamp('1970|01|01 00|00|00', '|MM|dd HH|mm|ss')", TYPE_BIGINT, > I think we also need to add tests for parsing with format strings including Thanks for the information. A quoted text does not work In the first patch set due to missing implementation. Done. http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc@5746 PS1, Line 5746: TestError("from_unixtime(0, 'HHH\\'')"); > Can you add a test for an unmatched '\. Or if one of the above tests covers 5745 and 5746 are testing for unterminated quote http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.h File be/src/runtime/timestamp-parse-util.h: http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.h@77 PS1, Line 77: /// The following features are not supported: > Can you update this comment to include an example of the feature you added? Done http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@160 PS1, Line 160: if ('\'' == *str) { > nit: *str == '\'' to match the convention below Done http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@162 PS1, Line 162: do { > Can you add a comment here explaining that we're matching a string literal, Done http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@164 PS1, Line 164: // Meet end of string > This comment isn't to informative. Maybe: Done http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@165 PS1, Line 165: if (str == str_end) { > Nit: 1 line: Done http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@168 PS1, Line 168: } while ('\'' != *str); > nit: *str != '\'' to match the convention below Done -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 1 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 09 Nov 2017 06:28:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8508 to look at the new patch set (#2). Change subject: IMPALA-5237: Support a quoted string in date/time format .. IMPALA-5237: Support a quoted string in date/time format Impala does not represent a quoted string at date/time format. For example, addtional quoted string between the patterns(e.g. HH\'H\' => 11H) Hive supports this feature, so user wants to get a same result from Impala. By the way, Impala returns a different result as below. * Hive > select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\''); 08H 39M 24S * Impala > select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\''); 08'8' 39'4' 24'0' The change affects the format pattern for from_unixtime/from_timestamp/unix_timestamp. In unix_timestamp, user can also specify a quoted string like this. > select unix_timestamp('\'Epoch time: \'19700101', '\'Epoch time: \'MMdd'); 0 By the way, the quoted strings between input and format should be exactly same and internally string comparison is case-sensitive. There is one intentional difference between Hive and Impala. In Impala, the format string should have any date or time patten as below. Throwing the error/warning is better if Impala cannot optimize the expression. User must rewrite the query and don't pay the function call. * Hive > select from_unixtime(0, '\'Hello world!\''); Hello world! * Impala > select from_unixtime(0, '\'Hello world!\''); Bad date/time conversion format: 'Hello world!' Testing: Add unit tests for working and nonworking cases Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 --- M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h 3 files changed, 81 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8508/2 -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 2 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong