[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..


Patch Set 6:

(1 comment)

I have modified comit message.

http://gerrit.cloudera.org:8080/#/c/7187/5//COMMIT_MSG
Commit Message:

PS5, Line 9: 
> Prose lines in commit messages  should neither begin nor end with spaces.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#6).

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..

IMPALA-5513: Fix display message exception when using invalid KEYVAL

Function print_to_stderr() has a syntax error when an error message is
displayed.

I solved this problem by exchanging the position of variable and the
subsequent strings in function print_to_stderr().

Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
---
M shell/impala_shell.py
M tests/shell/test_shell_commandline.py
2 files changed, 8 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7187/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-05 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 12:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/7223/12/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

PS12, Line 192: have_async_build_thread_token_
do we still need this? i guess the idea is to acquire the thread resource in 
Open() as well? Or is there another reason?


http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

PS11, Line 111: or
for


Line 119:   /// SendBuildInputToSink is called to allocate resources for this 
ExecNode.
once we do true multithreading, does this go away? since we'll be able to open 
the build child inside of this Open(), right? Maybe leave a todo about that to 
clarify why this is needed.


http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS11, Line 87:  o
nit extra space


http://gerrit.cloudera.org:8080/#/c/7223/12/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

PS12, Line 395:   // Total of minimum memory reservations for a host in bytes. 
Higher than
  :   // per_host_min_reservation if not all reservations are used 
concurrently.
I can't figure out what this means from just reading this comment.  Are there 
more than one per_host_min_reservation for each host? And why would this be 
higher if not all reservations are used concurrently (seems like that condition 
would mean the needed reservation is lower).

are one of these values related to all the fragment's instances? or are they 
really only about the fragment (i.e. what single instance would consume)?

Also, in the comment before, it says "buffer reservation" but here we say 
"memory reservation" - are these the same?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

PS12, Line 651: build
I guess here you mean build as a noun rather than a verb. i.e. this is not the 
memory used when performing the build. maybe reword to clarify as I had to read 
the code to understand the comment.

Either the next comment is sufficient, or you could reword this to explain the 
case of when no additional resources are needed.

Alternatively, can't we think of this as the resources of this node (but not 
its children) during probing? I.e. call this probePhaseProfile and rename 
probePhaseProfile to execProfile below. (See comment in 
ExecPhaseResourceProfiles about naming).


PS12, Line 654: !BackendConfig.INSTANCE.isPartitionedHashJoinEnabled()
why is that? the old ones don't consume bufferpool reservations.
but also, i think we override this flag for certain build types that aren't 
supported by the old join.


PS12, Line 667: .
... because of the async thread. ?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java:

PS12, Line 85: joinIds
where is that used?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

PS12, Line 108: plan fragment per host
is that for all instances of this fragment running on a host? or something else?


PS12, Line 139: nodes
isn't that always empty? if so, how about just allocating it here rather than 
taking it as a param?


PS12, Line 222: join build sinks
but what about the resources of the fragment itself?


PS12, Line 233: peakResources
maybe call that fInstanceResources or instanceResources or 
perFInstanceResources, etc.


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

PS12, Line 637: between Open()
is that inclusive of Open()? if so, the name postOpenProfile seems a bit 
misleading. Maybe duringOpenProfile should just be called openProfile and 
postOpenProfile should be execProfile?


PS12, Line 645: computeResourceProfile
computeNodeResourceProfile


PS12, Line 658: child is open
Maybe say "until after the child's Open() returns."  It ultimately means the 
same thing, but it took me a few minutes to follow what this was trying to tell 
me about the equation below.


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 62:   // estimates of zero, even if the contained PlanNodes have estimates 
of zero.
how is this chosen and why do we need it?


PS12, Line 353: Peak
it seems like we're using peak and sum to mean the same thing in this function. 
or is there a subtle distinction?


Line 354: // Total of per-host min

[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..


Patch Set 6:

(1 comment)

I hava added description about underscores. Than you.

http://gerrit.cloudera.org:8080/#/c/7179/5/shell/option_parser.py
File shell/option_parser.py:

Line 164:  " contains alphanumeric characters or 
underscores.")
> This actually is incorrect - it can also contain underscores.
Sorry for missing underline.


-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#6).

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..

IMPALA-5507: Add clear description to help information of KEYVAL option

Help information of KEYVAL option in impala-shell is not clear enough.

I fix this issue by adding clear description to help information of
KEYVAL option.

Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
---
M shell/option_parser.py
1 file changed, 7 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7179/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used 
without LDAP auth
..


Patch Set 3:

(3 comments)

Thank you for the suggestion. I have modified the code.

http://gerrit.cloudera.org:8080/#/c/7188/2//COMMIT_MSG
Commit Message:

PS2, Line 7: Throw an error when --ldap_password_cmd is used with
> Sorry for nitpicking, I don't think judgement is the right word here. How a
Done


PS2, Line 11: 
> nit: Please remove trailing spaces (here and below)
Done


http://gerrit.cloudera.org:8080/#/c/7188/2/shell/impala_shell.py
File shell/impala_shell.py:

PS2, Line 1333: Option --ldap_password_cmd requires using LDAP authentication m
> IMO something like "Option --ldap_password_cmd requires using LDAP authenti
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#3).

Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used 
without LDAP auth
..

IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth

When only with ldap_password_cmd option, impala-shell runs successfully.

I solved this problem by throwing an error when --ldap_password_cmd is
used without LDAP auth, that is, ldap_password_cmd option will only
take effect if ldap option presents.

Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
---
M shell/impala_shell.py
1 file changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7188/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5333: [DOCS] Document Impala ADLS support

2017-07-05 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#2).

Change subject: IMPALA-5333: [DOCS] Document Impala ADLS support
..

IMPALA-5333: [DOCS] Document Impala ADLS support

Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14
---
M docs/impala.ditamap
M docs/shared/impala_common.xml
A docs/topics/impala_adls.xml
M docs/topics/impala_parquet_file_size.xml
4 files changed, 721 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/7175/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..


Patch Set 5:

(1 comment)

Bharath, if you're OK with the test, then once my comment is addressed I will 
be OK with this change.

http://gerrit.cloudera.org:8080/#/c/7187/5//COMMIT_MSG
Commit Message:

PS5, Line 9:  
Prose lines in commit messages  should neither begin nor end with spaces.


-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..


Patch Set 5:

(4 comments)

I have modified the code according to your opinion and added the test case.

http://gerrit.cloudera.org:8080/#/c/7187/4//COMMIT_MSG
Commit Message:

PS4, Line 9: print_to_stderr
> Please fix as you did below.
Done


PS4, Line 10: d
> The usual commit message style is that prose lines do not begin with spaces
Done


http://gerrit.cloudera.org:8080/#/c/7187/4/shell/impala_shell.py
File shell/impala_shell.py:

PS4, Line 1252: 
> Don't think this is needed.
Done


Line 1253: 'It must follow the pattern "KEY=VALUE".')
> Mind adding a quick test in test_var_replacement() to cover this?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#5).

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..

IMPALA-5513: Fix display message exception when using invalid KEYVAL

Function print_to_stderr() has a syntax error when an error message is 
displayed.

I solved this problem by exchanging the position of variable and the 
subsequent strings in function print_to_stderr().

Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
---
M shell/impala_shell.py
M tests/shell/test_shell_commandline.py
2 files changed, 8 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7187/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-05 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7179/5/shell/option_parser.py
File shell/option_parser.py:

Line 164:  " contains only alphanumeric characters.")
This actually is incorrect - it can also contain underscores.


-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5333: [DOCS] Document Impala ADLS support

2017-07-05 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-5333: [DOCS] Document Impala ADLS support
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7175/1/docs/topics/impala_adls.xml
File docs/topics/impala_adls.xml:

Line 154:   
> L154 has trailing white space. It's best to remove trailing white space fro
Absolutely. I've gotten spoiled by a pre-commit hook that auto-removes any 
trailing spaces. Wow, looking at the hook I see that check & fixup has been 
commented out for some time now! I see that a few more have gotten into other 
files but I'll fix those separately.


PS1, Line 156: As an alternative, specify the credentials in 
environment variables before starting the impalad
 : daemon.
> Is it possible to list a mapping between core-site.xml settings above and e
Yes. Need to consult with Sailesh or I think there is some CDH doc for ADLS I 
can adapt.


Line 219:   
> This is still empty. :)
Need to consult with Sailesh. I haven't had hands-on experience with ADLS yet 
the way I have with S3.


PS1, Line 261: You point
> Maybe just an imperative "Point" ?
Almost. I'll word it as "To X, do Y". (Although now the wording on the S3 page 
is slightly different. Too much trouble to track down every instance in 
upstream and downstream docs to reconcile them all.)


PS1, Line 279: For example,
> Because the paragraph starting at L269 mentions a database with a LOCATION,
Done


PS1, Line 313: !??? ls adl://impala-demo/dir1/dir2/dir3 --recursive;
What's the ADLS command line syntax to do this? (I'm basing this on an S3 
example with 'aws s3' commands.


PS1, Line 329: !??? ls adl://impala-demo/dir1/dir2/dir3 --recursive;
Same question as on line 313.


-- 
To view, visit http://gerrit.cloudera.org:8080/7175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..


Patch Set 5:

(3 comments)

I have solved the suggestion you have mentioned.

http://gerrit.cloudera.org:8080/#/c/7179/4/shell/option_parser.py
File shell/option_parser.py:

PS4, Line 88: If the argument to -f is "-", then queries are read from stdin"
:  " and terminated with ctrl-d
> Try "If the argument to -f is "-", then queries are read from stdin and ter
Done


PS4, Line 163: starts with an alphabetic character and"
> "starts with an alphabetic character"
Done


PS4, Line 164: only alphanumeric characters.")
> "only alphanumeric characters."
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-05 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#5).

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..

IMPALA-5507: Add clear description to help information of KEYVAL option

Help information of KEYVAL option in impala-shell is not clear enough.

I fix this issue by adding clear description to help information of
KEYVAL option.

Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
---
M shell/option_parser.py
1 file changed, 7 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7179/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 12:

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#12).

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..

IMPALA-4862: make resource profile consistent with backend behaviour

This moves away from the PipelinedPlanNodeSet approach of enumerating
sets of concurrently-executing nodes because unions would force
creating many overlapping sets of nodes. The new approach computes
the peak resources during Open() and the peak resources between Open()
and Close() (i.e. while calling GetNext()) bottom-up for each plan node
in a fragment. The fragment resources are then combined to produce the
query resources.

The basic assumptions for the new resource estimates are:
* resources are acquired during or after the first call to Open()
  and released in Close().
* Blocking nodes call Open() on their child before acquiring
  their own resources (this required some backend changes).
* Blocking nodes call Close() on their children before returning
  from Open().
* The peak resource consumption of the query is the sum of the
  independent fragments (except for the parallel join build plans
  where we can assume there will be synchronisation). This is
  conservative but we don't synchronise fragment Open() and Close()
  across exchanges so can't make stronger assumptions in general.

Also compute the sum of minimum reservations. This will be useful
in the backend to determine exactly when all of the initial
reservations have been claimed from a shared pool of initial reservations.

Testing:
* Updated planner tests to reflect behavioural changes.
* Added extra resource requirement planner tests for unions, subplans,
  pipelines of blocking operators, and bushy join plans.
* Added single-node plans to resource-requirements tests. These have
  more complex plan trees inside a single fragment, which is useful
  for testing the peak resource requirement logic.

Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/common/TreeNode.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tabl

[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 11: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5036: Parquet count star optimization
..


IMPALA-5036: Parquet count star optimization

Instead of materializing empty rows when computing count star, we use
the data stored in the Parquet RowGroup.num_rows field. The Parquet
scanner tuple is modified to have one slot into which we will write the
num rows statistic. The aggregate function is changed from count to a
special sum function that gets initialized to 0. We also add a rewrite
rule so that count() is rewritten to count(*) in order to make
sure that this optimization is applied in all cases.

Testing:
- Added functional and planner tests

Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Reviewed-on: http://gerrit.cloudera.org:8080/6812
Reviewed-by: Taras Bobrovytsky 
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test
R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_parquet_stats.py
27 files changed, 873 insertions(+), 82 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Taras Bobrovytsky: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5618: buffered-tuple-stream-v2 fixes

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-5618: buffered-tuple-stream-v2 fixes
..

IMPALA-5618: buffered-tuple-stream-v2 fixes

This fixes two issues:
* AddRowCustom() caused a performance regression when the function
  was heap-allocated. This is solved by splitting the API into two
  separate calls. This imposes an additional burden on the caller
  but it is easier to reason about its performance.
* Allow re-reading streams with 'delete_on_read_' set so long as no rows
  were read from the stream. This is necessary for some spilling ExecNodes
  that prepare the stream for reading in order to acquire the buffer,
  but then need to spill the stream to free memory before they actually
  are able to read the stream.

Change-Id: Ibab0d774f66be632f17376a56abf302821cca047
---
M be/src/runtime/buffered-tuple-stream-v2-test.cc
M be/src/runtime/buffered-tuple-stream-v2.cc
M be/src/runtime/buffered-tuple-stream-v2.h
M be/src/runtime/buffered-tuple-stream-v2.inline.h
4 files changed, 81 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/7358/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibab0d774f66be632f17376a56abf302821cca047
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

Line 150:   DCHECK(status != nullptr);
> DCHECK(have_async_build_trhead_);
Done


Line 198:   RETURN_IF_ERROR(AcquireResourcesForBuild(state));
> why do we do the build Open() here, rather than keep it in ProcessBuildInpu
Done. The no-sink case will go away with the old aggs and joins fortunately.


http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

PS10, Line 69: it
> nit: period
Done


PS10, Line 70: it it is started,
> garbled
Done


PS10, Line 109: can be safely closed early
> additionally, aren't we now requiring that it be closed early (i.e. the fro
Yeah the planner depends on this. I updated the comment to mention that and 
hopefully be a bit clearer.


Line 121:   /// Processes the build-side input.
> let's also say that the build side should already have been opened (though 
Done


Line 141:   Status SendBuildInputToSink(RuntimeState* state, DataSink* 
build_sink);
> shouldn't this be private? i.e. we don't expect a derived class to call it 
Done


Line 222:   /// is used for the build. Otherwise, ProcessBuildInput() is called 
on the subclass.
> explain that this opens the build
Done


http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS10, Line 87: in general
> why? are there cases this isn't true?
I think the only exception is subplans. Updated.


http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

PS10, Line 181:  right away
> what do we mean by "right away". It seemed like we were acquiring them righ
The important this is that it's before the build. Updated the comment 
accordingly.


-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#11).

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..

IMPALA-4862: make resource profile consistent with backend behaviour

This moves away from the PipelinedPlanNodeSet approach of enumerating
sets of concurrently-executing nodes because unions would force
creating many overlapping sets of nodes. The new approach computes
the peak resources during Open() and the peak resources between Open()
and Close() (i.e. while calling GetNext()) bottom-up for each plan node
in a fragment. The fragment resources are then combined to produce the
query resources.

The basic assumptions for the new resource estimates are:
* resources are acquired during or after the first call to Open()
  and released in Close().
* Blocking nodes call Open() on their child before acquiring
  their own resources (this required some backend changes).
* Blocking nodes call Close() on their children before returning
  from Open().
* The peak resource consumption of the query is the sum of the
  independent fragments (except for the parallel join build plans
  where we can assume there will be synchronisation). This is
  conservative but we don't synchronise fragment Open() and Close()
  across exchanges so can't make stronger assumptions in general.

Also compute the sum of minimum reservations. This will be useful
in the backend to determine exactly when all of the initial
reservations have been claimed from a shared pool of initial reservations.

Testing:
* Updated planner tests to reflect behavioural changes.
* Added extra resource requirement planner tests for unions, subplans,
  pipelines of blocking operators, and bushy join plans.
* Added single-node plans to resource-requirements tests. These have
  more complex plan trees inside a single fragment, which is useful
  for testing the peak resource requirement logic.

Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/common/TreeNode.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tabl

[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-05 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#2).

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..

IMPALA-5583: [DOCS] Document default_join_distribution_mode query option

New page for the query option.

Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
---
M docs/impala.ditamap
M docs/impala_keydefs.ditamap
A docs/topics/impala_default_join_distribution_mode.xml
3 files changed, 132 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/7300/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: John Russell 


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-05 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7300/1/docs/impala.ditamap
File docs/impala.ditamap:

Line 179:   
> Why mention IMPALA-5583 also?
In the past I've referred both to the "code implementation" JIRA and "document 
the new feature" JIRA in this kind of context. Just for ease of future 
maintenance and tracing if something is wrong or missing on the doc side. I 
guess that's less important when the doc one is a subtask of the code one. I'll 
take it out.


http://gerrit.cloudera.org:8080/#/c/7300/1/docs/topics/impala_default_join_distribution_mode.xml
File docs/topics/impala_default_join_distribution_mode.xml:

Line 40:   This option determines the join strategy that Impala uses when 
any of the tables
> We deliberately did not use "join strategy" in the option name because stra
Can you elaborate a little on the meaning of "join distribution mode" then? 
That's not terminology we've used elsewhere in the docs.


Line 47:   Hive ANALYZE TABLE statement.
> Sure you want to keep the ANALYZE TABLE part? In most situations we cannot 
Done


Line 48:   By default, when a table involved in the join query does not 
have statistics,
> Accuracy could be improved. What if both tables do not have stats? Clarify 
What is the answer if both tables are missing stats? Does Impala make a 
deduction about which is smaller and that one gets broadcast while the other 
doesn't?


Line 58:   might be missing statistics due to the overhead involved in 
calculating them,
> I wouldn't suppose a particular reason for not having stats.
Done


Line 61:   of a table involved in a join query and only transmits a portion 
of the table
> Not very accurate, both tables are transferred across the network. Not sure
I'd prefer to prepare and fine-tune a brief explanation so I could reuse that 
wording in places where such terminology is mentioned to a reader that might 
not have seen it before. Anyone who needs detailed background info can follow 
the "related info" links at the end of the page.


Line 67:   recommended when setting up and deploying new clusters. This 
setting is
> We should mention why we recommend this. SHUFFLE is generally a safer optio
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-5612: join inversion should factor in parallelism
..

IMPALA-5612: join inversion should factor in parallelism

The join inversion optimisation did not factor in the degree of
parallelism that the join executed with after inversion. In some cases
this lead to bad decisions, e.g. executing a join on a single node
instead of 20 nodes.

This patch adds a more sophisticated cost model that factors degree
of parallelism into the join inversion decision.

The behaviour is unchanged if inversion does not change the degree of
parallelism.

Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
---
M fe/src/main/java/org/apache/impala/planner/Planner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
8 files changed, 944 insertions(+), 850 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/7351/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism

2017-07-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5612: join inversion should factor in parallelism
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7351/1/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 425: // Invert the join if doing so reduces the estimated time to 
execute the join.
> Possibly move to a separate method?
Done


Line 434: // * it costs the same to process each build and probe row.
> The hash table build is a two pass while the probe is one. 
Done


Line 472: perRowCost = rhsCard * rhsAvgRowSize;
> Should check for overflows whenever multiplication is used.
It's floating point arithmetic so it will just overflow to "infinity", which is 
ok.


-- 
To view, visit http://gerrit.cloudera.org:8080/7351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.

2017-07-05 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Complete guide to important environment variables for build, 
test, and mini-cluster operations.
..


Patch Set 1:

> (2 comments)
 > 
 > > > I personally abhor Google docs as they just seem to get lost in
 > > the
 > > > Wiki-sea over time.  If we are going to take time to document
 > the
 > > > various set of environment variables controlling important
 > > aspects
 > > > of the build, why not make that accessible to both the internal
 > > and
 > > > external developers by including it in the codebase.
 > > >
 > > > This is my first draft of the standard build settings.
 > >
 > > What about the Wiki, like this page: 
 > > https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch
 > 
 > Our wikis are pretty nice, and for things like ASAN, totally worth
 > documenting separately, but for non-major features, keeping this up
 > to date still requires updating things in two places.  In
 > experience, this isn't something people will do reliably unless
 > forced.  I think we could force at least minimal documentation by
 > keeping a white-list of exported build variables and enforcing it
 > with a pre-checkin lint rule.

Do you plan to add that lint rule to this patch?

-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.

2017-07-05 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Complete guide to important environment variables for build, 
test, and mini-cluster operations.
..


Patch Set 1:

(2 comments)

> > I personally abhor Google docs as they just seem to get lost in
 > the
 > > Wiki-sea over time.  If we are going to take time to document the
 > > various set of environment variables controlling important
 > aspects
 > > of the build, why not make that accessible to both the internal
 > and
 > > external developers by including it in the codebase.
 > >
 > > This is my first draft of the standard build settings.
 > 
 > What about the Wiki, like this page: 
 > https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch

Our wikis are pretty nice, and for things like ASAN, totally worth documenting 
separately, but for non-major features, keeping this up to date still requires 
updating things in two places.  In experience, this isn't something people will 
do reliably unless forced.  I think we could force at least minimal 
documentation by keeping a white-list of exported build variables and enforcing 
it with a pre-checkin lint rule.

http://gerrit.cloudera.org:8080/#/c/7350/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 440
> Speaking confidently as the only team member with a fine arts education, I'
Glad to know that my alternative career as the artist Deth Leprikon remains a 
well kept secret.


http://gerrit.cloudera.org:8080/#/c/7350/1/docs/Developer-Guide/Build-Environment.txt
File docs/Developer-Guide/Build-Environment.txt:

Line 1: Environment Variables:
> How many of these are set in impala-config.sh, and how many are set by othe
Additional docs will be forthcoming


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 11:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/824/

-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 11:

> The build was failing because the return type of some new getters
 > was a "const bool" (instead of "bool"), which caused a compiler
 > warning. Strangely the private build that uses internal Jenkins
 > passed successfully. Forwarding the +2.

I think clang-tidy usually trips over the "const bool" return types.

-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 11: Code-Review+2

The build was failing because the return type of some new getters was a "const 
bool" (instead of "bool"), which caused a compiler warning. Strangely the 
private build that uses internal Jenkins passed successfully. Forwarding the +2.

-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#11).

Change subject: IMPALA-5036: Parquet count star optimization
..

IMPALA-5036: Parquet count star optimization

Instead of materializing empty rows when computing count star, we use
the data stored in the Parquet RowGroup.num_rows field. The Parquet
scanner tuple is modified to have one slot into which we will write the
num rows statistic. The aggregate function is changed from count to a
special sum function that gets initialized to 0. We also add a rewrite
rule so that count() is rewritten to count(*) in order to make
sure that this optimization is applied in all cases.

Testing:
- Added functional and planner tests

Change-Id: I536b85c014821296aed68a0c68faadae96005e62
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test
R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_parquet_stats.py
27 files changed, 873 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Taras Bobrovytsky (Code Review)
Hello Impala Public Jenkins, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6812

to look at the new patch set (#11).

Change subject: IMPALA-5036: Parquet count star optimization
..

IMPALA-5036: Parquet count star optimization

Instead of materializing empty rows when computing count star, we use
the data stored in the Parquet RowGroup.num_rows field. The Parquet
scanner tuple is modified to have one slot into which we will write the
num rows statistic. The aggregate function is changed from count to a
special sum function that gets initialized to 0. We also add a rewrite
rule so that count() is rewritten to count(*) in order to make
sure that this optimization is applied in all cases.

Testing:
- Added functional and planner tests

Change-Id: I536b85c014821296aed68a0c68faadae96005e62
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test
R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_parquet_stats.py
27 files changed, 873 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-05 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 10:

(10 comments)

I need to read through the frontend code again before commenting on that, but 
here's some comments on the backend part.

http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

Line 150:   DCHECK(status != nullptr);
DCHECK(have_async_build_trhead_);


Line 198:   RETURN_IF_ERROR(AcquireResourcesForBuild(state));
why do we do the build Open() here, rather than keep it in 
ProcessBuildInputAndOpenProbe() like the other cases? between async/sync and 
sink/no-sink and subplan/no-subplan, there are a lot of cases to understand.


http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

PS10, Line 69: it
nit: period


PS10, Line 70: it it is started,
garbled


PS10, Line 109: can be safely closed early
additionally, aren't we now requiring that it be closed early (i.e. the 
frontend assumes this)? hmm, or is this just an optimization in the async case?


Line 121:   /// Processes the build-side input.
let's also say that the build side should already have been opened (though this 
code is going away soon).


Line 141:   Status SendBuildInputToSink(RuntimeState* state, DataSink* 
build_sink);
shouldn't this be private? i.e. we don't expect a derived class to call it 
directly, right?


Line 222:   /// is used for the build. Otherwise, ProcessBuildInput() is called 
on the subclass.
explain that this opens the build


http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS10, Line 87: in general
why? are there cases this isn't true?


http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

PS10, Line 181:  right away
what do we mean by "right away". It seemed like we were acquiring them right 
away before (but too soon), no?


-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5514: Add the joint judgment of ldap password cmd and ldap

2017-07-05 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5514: Add the joint judgment of ldap_password_cmd and 
ldap
..


Patch Set 2:

(3 comments)

Thanks for raising the jira and submitting a quick fix. The fix looks good 
mostly apart from a few nits. Mind fixing them?

http://gerrit.cloudera.org:8080/#/c/7188/2//COMMIT_MSG
Commit Message:

PS2, Line 7: Add the joint judgment of ldap_password_cmd and ldap
Sorry for nitpicking, I don't think judgement is the right word here. How 
about, "Throw an error when --ldap_password_cmd is used without LDAP auth" ? 
(Also please update the description below accordingly).


PS2, Line 11:  
nit: Please remove trailing spaces (here and below)


http://gerrit.cloudera.org:8080/#/c/7188/2/shell/impala_shell.py
File shell/impala_shell.py:

PS2, Line 1333: Must use LDAP authentication while retrieving the LDAP password
IMO something like "Option --ldap_password_cmd requires using LDAP 
authentication mechanism (-l)" gives a more detailed description to the end 
user as to what the actual problem is. What do you think?


-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily

2017-07-05 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7346/4/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

Line 482:   RETURN_IF_ERROR(state->CheckQueryState());
> Done
thanks, looks good to me.  we really have to clean up how memory management 
works with UDFs to make it less brittle, but this is good for now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..


Patch Set 4:

(2 comments)

Couple more minor comments in addition to Jim's review.

http://gerrit.cloudera.org:8080/#/c/7187/4/shell/impala_shell.py
File shell/impala_shell.py:

PS4, Line 1252:  \
Don't think this is needed.


Line 1253: 'It must follow the pattern "KEY=VALUE".')
Mind adding a quick test in test_var_replacement() to cover this?


-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily

2017-07-05 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7346/4/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

Line 482:   RETURN_IF_ERROR(state->CheckQueryState());
> this is fine, but any reason to not just hoist it from the if-stmt to here,
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-05 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7179/4/shell/option_parser.py
File shell/option_parser.py:

PS4, Line 88: Queries read from stdin and end with ctrl+d "
:  "if the argument to -f is -.
Try "If the argument to -f is "-", then queries are read from stdin and 
terminated with ctrl-d."


PS4, Line 163: has character of a-z or A-Z at the beginning
"starts with an alphabetic character"


PS4, Line 164: characters of a-z or A-Z and/or numbers.
"only alphanumeric characters."


-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily

2017-07-05 Thread Thomas Tauber-Marshall (Code Review)
Hello Matthew Jacobs, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7346

to look at the new patch set (#5).

Change subject: IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily
..

IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily

IMPALA-3742 introduced KuduPartitionExpr, which takes a row and passes
it to the Kudu client to determine what partitionit belongs to.

The DataStreamSender never frees the local allocations for the Kudu
partition exprs causing it to hang on to memory longer than it needs to.

This patch also fixes two other related issues:
- DataStreamSender was dropping the Status from AddRow in the Kudu
  branch. Adds 'RETURN_IF_ERROR' and 'WARN_UNUSED_RESULT'
- Changes the HASH case in DataStreamSender to call FreeLocalAllocations
  on a per-batch basis, instead of a per-row basis.

Testing:
- Added an e2e test that runs a large insert with a mem limit that
  failed with oom previously.

Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e
---
M be/src/runtime/data-stream-sender.cc
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
2 files changed, 13 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7346/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL

2017-07-05 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5513: Fix display message exception when using invalid 
KEYVAL
..


Patch Set 4:

(2 comments)

> (2 comments)
 > 
 > I have modified the code according to your opinion.
 > I think your advice is fine, but how do i know who is familiar with
 > that file?

git log $FILENAME

http://gerrit.cloudera.org:8080/#/c/7187/4//COMMIT_MSG
Commit Message:

PS4, Line 9: Print_to_stderr
Please fix as you did below.


PS4, Line 10:  
The usual commit message style is that prose lines do not begin with spaces.


-- 
To view, visit http://gerrit.cloudera.org:8080/7187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.

2017-07-05 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: Complete guide to important environment variables for build, 
test, and mini-cluster operations.
..


Patch Set 1:

(2 comments)

Whether it's a .txt file in the code base, or a wiki entry, how can we make 
sure that this doc stays in sync with impala-config.sh (and/or any other place 
where env vars get set)? We'd have multiple sources of truth, wouldn't we?

http://gerrit.cloudera.org:8080/#/c/7350/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 440
Speaking confidently as the only team member with a fine arts education, I'm 
shocked that I did not know this setting existed. :-)


http://gerrit.cloudera.org:8080/#/c/7350/1/docs/Developer-Guide/Build-Environment.txt
File docs/Developer-Guide/Build-Environment.txt:

Line 1: Environment Variables:
How many of these are set in impala-config.sh, and how many are set by other 
means?


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-07-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 10:

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/823/

-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No