Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14291 )

Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 6:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-parser.cc@225
PS6, Line 225:  switch (**format) {
             :     case 'b': return '\b';
             :     case 'n': return '\n';
             :     case 'r': return '\r';
             :     case 't': return '\t';
             :   }
> Please check what escape sequences are supported by ANSI SQL standard.
Well, I'm not convinced that ANSI SQL is relevant here as according to this 
page it doesn't allow escaping quotes and double quotes.
https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.esqlc.doc/ids_esqlc_0015.htm

This doc about MySql is kind of in line with my implementation:
https://www.oreilly.com/library/view/mysql-cookbook/0596001452/ch04s02.html
It also says that 2 sequential double quotes works as an escaped double quote 
same as with a backslash. However, I don't feel the need for this.
Additionally MySql has \0 as a special char (ANSII NULL) but again, don't find 
it important.


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@109
PS6, Line 109: *current_pos != nullptr
> nit: current_pos != nullptr && *current_pos != nullptr
Done


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@112
PS6, Line 112: *current_pos < str_end
> nit: str_begin <= *current_pos && *current_pos < str_end
Done


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@240
PS6, Line 240: strncmp(current_pos, "\\\"", 2) == 0)
> This is safe only if 'current_pos' is zero-terminated, but I don't think it
It is zero-terminated


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@273
PS6, Line 273:  strncmp(*current_pos, "\\\"", 2)
> same as L240.
see above


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@277
PS6, Line 277: strncmp(*current_pos, "\\\\\\\"", 4)
> same as L240.
see above


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@282
PS6, Line 282: strncmp(*current_pos, "\\\\", 2)
> same as L240.
see above


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc
File be/src/runtime/datetime-parser-common.cc:

http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc@201
PS6, Line 201: continue
> nit: can be removed, here and below.
Done


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc@202
PS6, Line 202:    } else if (!tok.is_double_escaped && strncmp(text_it, "\\\"", 
2) == 0) {
             :       result.append("\"");
             :       ++text_it;
             :       continue;
             :     } else if (strncmp(text_it, "\\\\", 2) == 0) {
             :       result.append("\\");
             :       ++text_it;
             :       continue;
             :     } else if (strncmp(text_it, "\\b", 2) == 0) {
             :       result.append("\b");
             :       ++text_it;
             :       continue;
             :     } else if (strncmp(text_it, "\\n", 2) == 0) {
             :       result.append("\n");
             :       ++text_it;
             :       continue;
             :     } else if (strncmp(text_it, "\\r", 2) == 0) {
             :       result.append("\r");
             :       ++text_it;
             :       continue;
             :     } else if (strncmp(text_it, "\\t", 2) == 0) {
             :       result.append("\t");
             :       ++text_it;
             :       continue;
             :     }
> Are these the only escape sequences supported by ANSI SQL?
See the answer in datetime-iso-sql-format-parser.cc


http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py
File tests/query_test/test_cast_with_format.py:

http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@663
PS6, Line 663:    # Format part is surrounded by double quotes so the quotes 
indicating the start and
             :     # end of the text token has to be escaped.
> I did some testing around when the format string is surrounded by ' and the
If the format is surrounded by single quotes but the text double quotes 
surrounding the text token is escaped is a valid scenario. From the code it's 
not possible to decide whether a string is surrounded by single or double 
quotes I so won't be able to prevent that scenario.

The ones where the opening double quote is escaped but the closing isn't should 
result an error. This is a bug, I'll take care of it.
Done


http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@768
PS6, Line 768: covered
> surrounded
Don't see the difference. This seems nit of the nits :D
Done



--
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Wed, 16 Oct 2019 14:03:09 +0000
Gerrit-HasComments: Yes

Reply via email to