[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-10-18 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has abandoned this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Abandoned

Abandoning the review until I can work on it again.

-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-10-04 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76: return 
Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident;
> I think we can still make progress here, I'm not sure I completely follow y
If I remember correctly, I encountered a problem with the 
com.cloudera.impala.analysis.TableName class. It stores the table name as a 
String, but the table name itself can be qualified in case of complex types, 
thus it can not be quoted by simply putting it between backticks.

It also seems that com.cloudera.impala.analysis.InsertStmt and 
com.cloudera.impala.catalog.View constructors take a List parameter for 
columns and store it in the object.

It seems to me that in order to properly quote these, we would need to refactor 
the code to use a data structure capable of representing hierarchical names 
unambiguously to pass the table and column references around and to store them. 
I thought that the ongoing effort you mentioned is about addressing this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-29 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76: return 
Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident;
> Those are two separate issues. In your example, we are guarding against a l
I added splitting because when I just put the whole string between backticks 
then Impala ended up escaping qualified identifiers of complex types 
incorrectly. I noticed that these references were already in a string form by 
the time getIdentSql gets called, so there is no way to properly quote them 
without refactoring to pass them in a structured format to the affected 
functions.

The "Invalid column/field name" error message mislead me to believe that dots 
are not allowed in any identifier, which lead me to come up with my approach 
which would work if identifiers really couldn't contain dots. Since this turned 
out to be a false assumption, it seems that I have to discard this change as 
this task can only be implemented properly after IMPALA-2287 and probably some 
other refactorings.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> I agree that it's slighly less pleasing to the eye but I think correctness/
For SHOW CREATE TABLE and CREATE VIEW it's fine to quote all identifiers, but I 
don't think we should needlessly pollute error messages, e.g., in 
AnalysisExceptions. To the best of my knowledge, Hive does not quote 
identifiers in error reports, and neither should we.


Line 76: return 
Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident;
> Hmmm. Interesting. On the other hand:
Those are two separate issues. In your example, we are guarding against a 
limitation in the Hive Metastore that disallows certain column names. 
Generally, the Hive Metastore is pretty restrictive.

However, within an Impala query you can use more or less arbitrary identifiers, 
for better of for worse. There's an argument to be made that we should not 
allow that, but that's the state of the world today.

Ambiguities are a reality that we cannot escape, even if we disallowed dots in 
quoted identifiers. If you are curious, you can look at 
AnalyzeStmts#TestSlotRefPathAmbiguity() and the other *Ambiguity tests to get a 
flavor.

I still don't see how the specific problem we are discussing is not solvable. 
This function should accept a single identifier and quote it. It's up to the 
caller (who has more knowledge about the identifier) to pass in the right value.

For the struct case there is an existing JIRA to change the behavior: 
IMPALA-2287

My basic question here is: Why did you add the splitting?


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-28 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> So we are in agreement then?
I agree that it's slighly less pleasing to the eye but I think 
correctness/unambiguity is more important than aesthetics. Also Hive does the 
same, so I think people are used to it.


Line 76: return 
Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident;
> select `a.b.c`.`x.y.z` from (select 1 `x.y.z`) `a.b.c`
Hmmm. Interesting. On the other hand:

> create view test as select `a.b.c`.`x.y.z` from (select 1 `x.y.z`) `a.b.c`;
ERROR: AnalysisException: Invalid column/field name: x.y.z

This is a serious problem, because it means that identifier names are ambigous. 
I checked that in case of a struct, functions get "column_name.field_name" as 
the column name. I haven't checked your example yet, but my guess is that in 
this case the column name parameter will become "x.y.z". Still, the former 
should be quoted as `column_name`.`field_name` and latter as `x.y.z`. It seems 
that it's too late to fix this problem by the time getIdentSql gets called.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-28 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> The main issue with this approach is that all invocations of toSql() will n
Initially I was also worried that identifiers become "ugly" and considered 
making this behavior optional. I was thinking of a query option that the user 
can control using the SET statement. Then I checked how Hive works and I found 
that it already quotes every identifier (I checked the table and view 
definitions if I remember correctly), so I came to the conclusion that quoting 
everything should be okay. Of course I expected some suggestions to how it 
should work, that's why I didn't fix the styling issues in the test.


Line 76: return 
Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident;
> I don't think splitting here is the right fix. An identifier itself could c
I checked that identifiers can't contain dots, or at least users can't create 
such identifiers. If you try specifying `a.b` as an identifier, Impala will 
complain that dots are not allowed in identifier names. Do you know of a code 
piece that gets around this mechanism and creates identifiers with dots in some 
other way?

Qualified table or column references are exactly the reason why I only made 
this version public and the other one private, as only this is safe to use to 
quote table or column references. In case of complex types, dots can become a 
part of either the table reference or the column reference.

In case of a struct, a column reference might look like column_name.field_name, 
which should be quoted as `column_name`.`field_name`. In case of a map or 
array, table references might look like table_name.map_or_array_name, which 
should be quoted `table_name`.`map_or_array_name`.

This is my understanding based on the short amount of time I spent on Impala, 
so please correct me if I'm wrong.


Line 88: if (ident.equals("*")) {
> This doesn't seem right or necessary. First, "*" is not an identifier (it i
Without this check a SELECT * became SELECT `*` which is invalid.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
File fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java:

Line 112:   sb.append(" " + type_.toString());
toSql()


Line 114:   sb.append(" " + typeDef_.toString());
toSql()


http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
The main issue with this approach is that all invocations of toSql() will not 
have quoted identifiers. Unfortunately, we also use toSql() for error 
reporting. For example, take a look at AnalyticExpr.java or any other Expr. If 
analyze() fails we typically we print the offending expr with toSql().

Error messages will now become unnecessarily weird.

I think the callers need to decide the quoting policy.

We should have two versions of toSql():
1. Only adds quotes if necessary for Impala (no need to consider Hive as before)
2. Always quote identifiers

Imo, the default toSql() should have the first behavior, and we may add another 
version that always quotes identifiers. For example, we could have a setup like 
this:

private String toSql(boolean quoteAllIdents);
private String toSql() { return toSql(false); }

You can imagine other variants. Happy to discuss.


Line 76: return 
Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident;
I don't think splitting here is the right fix. An identifier itself could 
contain a dot, even if unlikely. Also, qualified table or column references 
consist of multiple identifiers and not a single identifier, so the premise of 
this fix is kind of wrong (based on your comment).


Line 88: if (ident.equals("*")) {
This doesn't seem right or necessary. First, "*" is not an identifier (it is a 
terminal), and second those clauses that could contain "*" should handle it 
specially in toSql().

For my understanding, what use case is this change addressing?


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-26 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has uploaded a new patch set (#2).

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..

IMPALA-784: Use `-s in SHOW CREATE TABLE output

Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
---
M fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
M fe/src/main/java/com/cloudera/impala/analysis/DeleteStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/InsertStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/TableName.java
M fe/src/main/java/com/cloudera/impala/analysis/TableRef.java
M fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
M fe/src/main/java/com/cloudera/impala/analysis/UpdateStmt.java
M fe/src/test/java/com/cloudera/impala/analysis/ToSqlTest.java
8 files changed, 339 insertions(+), 338 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4527/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 1:

(2 comments)

This seems reasonable to me. I think we should get Alex Behm to confirm, since 
he was the one who originally wrote some of this code. 

How did you determine all the places that needed to be changed? We should also 
fix this for "SHOW CREATE FUNCTION" and any other similar statements that 
output SQL for consistency.

The CREATE TABLE expected output is probably coming from .test files. E.g. 
testdata/workloads/functional-query/queries/QueryTest/kudu-show-create.test

http://gerrit.cloudera.org:8080/#/c/4527/1/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

PS1, Line 72: reassamble
reassemble


Line 73: // This is safe, because periods are not allowed inside identifier 
names for
Thanks for the explanation.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-23 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..


Patch Set 1:

The functional tests still fail with false positives like
- CREATE TABLE functional_kudu.dimtbl (id BIGINT, name STRING, zip INT)
+ CREATE TABLE `functional_kudu`.`dimtbl` (`id` BIGINT, `name` STRING, `zip` 
INT)

I haven't fixed these yet, because I still have to figure out where the 
expected strings come from, but first I wanted to make sure that we agree on 
the approach to be taken.

The same applies to the overly long lines in the unit test. That test is fixed 
functionally but not stylistically, as I didn't want to spend too much time on 
it until I get feedback about whether this is the right approach.

Thanks,

Zoltan

-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

2016-09-23 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4527

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
..

IMPALA-784: Use `-s in SHOW CREATE TABLE output

Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
---
M fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
M fe/src/main/java/com/cloudera/impala/analysis/DeleteStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/InsertStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/TableName.java
M fe/src/main/java/com/cloudera/impala/analysis/TableRef.java
M fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
M fe/src/main/java/com/cloudera/impala/analysis/UpdateStmt.java
M fe/src/test/java/com/cloudera/impala/analysis/ToSqlTest.java
8 files changed, 339 insertions(+), 338 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4527/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi