Jim Apple has posted comments on this change.

Change subject: IMPALA-2428: Support multiple-character string as the field 
delimiter
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3314/4//COMMIT_MSG
Commit Message:

Line 15: 2. tuple delimiter could not be the first byte of field delimiter
What happens today if tuple delimiters and field delimiters are the same 
character?

Same question about escape character.


Line 17: 4. terminators could not contains '\0'
Today, are terminators allowed to contain '\0'?


http://gerrit.cloudera.org:8080/#/c/3314/4/be/src/exec/delimited-text-parser-test.cc
File be/src/exec/delimited-text-parser-test.cc:

Line 173:   Validate(&single_null_no_escape_parser, string("a\0b|c\0d|e\0f", 
11), 4, TUPLE_DELIM, 1, 3);
long line (here and elsewhere). Consider using clang-format to help you find 
and fix these, though please do no clang format code you are not changing.


http://gerrit.cloudera.org:8080/#/c/3314/1/be/src/exec/delimited-text-parser.cc
File be/src/exec/delimited-text-parser.cc:

Line 46:   DCHECK(tuple_delim != field_delim[0]);
> size of field_delim could not be 0. We have checked that in sql-parser.y an
I see you have taken that out of the parser now. Why is that?


http://gerrit.cloudera.org:8080/#/c/3314/1/fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java:

Line 268:     String fieldDelim = 
StringEscapeUtils.unescapeJava(rowFormat_.getFieldDelimiter());
> Just as parseDelim(), it turns escaped string into un-escaped string.For ex
Why isn't it already used elsewhere?


http://gerrit.cloudera.org:8080/#/c/3314/1/fe/src/main/java/com/cloudera/impala/catalog/HdfsStorageDescriptor.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsStorageDescriptor.java:

Line 171:     if (escapeChar == (byte)fieldDelim.charAt(0) ||
> Done
I do not see that fix in this code.


http://gerrit.cloudera.org:8080/#/c/3314/4/tests/query_test/test_delimited_text.py
File tests/query_test/test_delimited_text.py:

Line 56:   def test_create_table_rowformat(self, vector, unique_database):
Please name this more appropriately - it only tests failure modes right now, so 
please name it as such.


Line 61:       4. terminator could not contain \0."""
Please add these notes on each test so we know what is being tested in each 
place.


-- 
To view, visit http://gerrit.cloudera.org:8080/3314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1437ca35dc4fdc58a7db1c2c70d4da30adf0c3e
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Yuanhao Luo <luoyuan...@software.ict.ac.cn>
Gerrit-Reviewer: Jim Apple <jbap...@cloudera.com>
Gerrit-Reviewer: Yuanhao Luo <luoyuan...@software.ict.ac.cn>
Gerrit-HasComments: Yes

Reply via email to