[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-14 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: Support a quoted string in date/time format
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc@5747
PS2, Line 5747:  '\\'Epoch time: \\'|MM|dd HH|mm|ss')", TYPE_BIGINT, 0);
> Please wrap lines at 90 characters.
Done


http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h@79
PS2, Line 79: //   User can specify a quoted string at date/time format. For 
example,
> Please wrap lines at 90 characters (here and below).
Done


http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc@170
PS2, Line 170:   int len = str - string_literal - 1;
 :   // Keep compatibility with Hive which represents a single 
quote in the format
 :   // e.g. from_unixtime(0, '\'\'') => '
 :   if (len == 0) len = 1;
 :   DCHECK(len >= 0);
 :   const int pos = str - str_begin - len;
 :   DCHECK(pos >= 0);
 :   c
> What if (str - string_literal == 1) ?
I missed support the feature: escaping a single quote



--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 14 Nov 2017 23:51:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-14 Thread Kim Jin Chul (Code Review)
Hello Attila Jeges, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8508

to look at the new patch set (#3).

Change subject: IMPALA-5237: Support a quoted string in date/time format
..

IMPALA-5237: Support a quoted string in date/time format

Impala does not represent a quoted string at date/time format.
For example, addtional quoted string between the patterns
(e.g. HH\'H\' => 11H). Hive supports this feature, so user wants
to get a same result from Impala. By the way, Impala returns
a different result as below.

* Hive
> select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\'');
08H 39M 24S

* Impala
> select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\'');
08'8' 39'4' 24'0'

The change affects the format pattern for
from_unixtime/from_timestamp/unix_timestamp.

In unix_timestamp, user can also specify a quoted string like this.

> select unix_timestamp('\'Epoch time: \'19700101',
> '\'Epoch time: \'MMdd');
0

By the way, the quoted strings between input and format should be
exactly same and internally string comparison is case-sensitive.

There is one intentional difference between Hive and Impala.
In Impala, the format string should have any date or time patten
as below. Throwing the error/warning is better if Impala cannot
optimize the expression. User must rewrite the query and don't pay
the function call.

* Hive
> select from_unixtime(0, '\'Hello world!\'');
Hello world!

* Impala
> select from_unixtime(0, '\'Hello world!\'');
Bad date/time conversion format: 'Hello world!'

Testing:
Add unit tests for working and nonworking cases

Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
3 files changed, 92 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8508/3
--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-14 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: Support a quoted string in date/time format
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8508/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8508/2//COMMIT_MSG@9
PS2, Line 9: Impala does not represent a quoted string at date/time format. For 
example,
Please wrap commit message lines at 70 characters (here and below).


http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/exprs/expr-test.cc@5747
PS2, Line 5747:  
'\\'day\\'dd,\\'month\\'MMM,\\'year\\',\\'hour\\'HH,\\'minute\\'mm,\\'second\\'ss')",
Please wrap lines at 90 characters.


http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.h@79
PS2, Line 79: //   User can specify a quoted string at date/time format. For 
example, addtional quoted string
Please wrap lines at 90 characters (here and below).


http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/8508/2/be/src/runtime/timestamp-parse-util.cc@170
PS2, Line 170:   if (str - string_literal != 1) {
 : const int len = str - string_literal - 1;
 : DCHECK(len >= 0);
 : const int pos = str - str_begin - len;
 : DCHECK(pos >= 0);
 : const char* val = string_literal + 1;
 : dt_ctx->toks.push_back(DateTimeFormatToken(SEPARATOR, 
pos, len, val));
 :   }
What if (str - string_literal == 1) ?

Note, that in Hive you get an apostrophe:
> select from_unixtime(1492677561, 'H\'\'');
10'



--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 14 Nov 2017 18:22:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-09 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: Support a quoted string in date/time format
..


Patch Set 2:

@Tim, it would be great if you also review my comments in the Jira ticket when 
you are available. I don't find you from the watchers of the ticket, so I'd 
inform you of this.


--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 10 Nov 2017 00:04:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-08 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: Support a quoted string in date/time format
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc@5724
PS1, Line 5724:   TestValue("unix_timestamp('1970|01|01 00|00|00', '|MM|dd 
HH|mm|ss')", TYPE_BIGINT,
> I think we also need to add tests for parsing with format strings including
Thanks for the information.  A quoted text does not work In the first patch set 
due to missing implementation. Done.


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc@5746
PS1, Line 5746:   TestError("from_unixtime(0, 'HHH\\'')");
> Can you add a test for an unmatched '\. Or if one of the above tests covers
5745 and 5746 are testing for unterminated quote


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.h@77
PS1, Line 77: /// The following features are not supported:
> Can you update this comment to include an example of the feature you added?
Done


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@160
PS1, Line 160: if ('\'' == *str) {
> nit: *str == '\'' to match the convention below
Done


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@162
PS1, Line 162:   do {
> Can you add a comment here explaining that we're matching a string literal,
Done


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@164
PS1, Line 164: // Meet end of string
> This comment isn't to informative. Maybe:
Done


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@165
PS1, Line 165: if (str == str_end) {
> Nit: 1 line:
Done


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@168
PS1, Line 168:   } while ('\'' != *str);
> nit: *str != '\'' to match the convention below
Done



--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 1
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 06:28:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2017-11-08 Thread Kim Jin Chul (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8508

to look at the new patch set (#2).

Change subject: IMPALA-5237: Support a quoted string in date/time format
..

IMPALA-5237: Support a quoted string in date/time format

Impala does not represent a quoted string at date/time format. For example,
addtional quoted string between the patterns(e.g. HH\'H\' => 11H)
Hive supports this feature, so user wants to get a same result
from Impala. By the way, Impala returns a different result as below.

* Hive
> select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\'');
08H 39M 24S

* Impala
> select from_unixtime(1492677564, 'HH\'H\' mm\'M\' ss\'S\'');
08'8' 39'4' 24'0'

The change affects the format pattern for 
from_unixtime/from_timestamp/unix_timestamp.

In unix_timestamp, user can also specify a quoted string like this.

> select unix_timestamp('\'Epoch time: \'19700101', '\'Epoch time: \'MMdd');
0

By the way, the quoted strings between input and format should be exactly same
and internally string comparison is case-sensitive.

There is one intentional difference between Hive and Impala. In Impala, the 
format
string should have any date or time patten as below. Throwing the error/warning 
is
better if Impala cannot optimize the expression. User must rewrite the query and
don't pay the function call.

* Hive
> select from_unixtime(0, '\'Hello world!\'');
Hello world!

* Impala
> select from_unixtime(0, '\'Hello world!\'');
Bad date/time conversion format: 'Hello world!'

Testing:
Add unit tests for working and nonworking cases

Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
3 files changed, 81 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8508/2
--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong