[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration

2019-05-16 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13248 )

Change subject: IMPALA-8503: add option to start Kudu cluster with HMS 
integration
..


Patch Set 3:

(4 comments)

> (6 comments)
 >
 > I don't have enough context for high-level review of this
 > changelist, but it seems the comment that Thomas added makes sense.
 >
 > Also, what is the way to configure the way how services started via
 > these scripts?  I know that in case of UNIX init.d scripts there
 > are configuration files in /etc that are picked up by the scripts.
 > Using environment variables makes it look like something similar to
 > that approach.

Ok, since both of you think using env is better, I updated the patch 
accordingly. Thanks!

http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin
File testdata/cluster/admin:

http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@425
PS1, Line 425:
> drop
Done


http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@427
PS1, Line 427:
> drop
Done


http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@428
PS1, Line 428: M
> drop
Done


http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/admin@497
PS1, Line 497: function restart {
 :   if [[ $# -ne 1 ]]; then
 : echo restart must be called with a single argument -- the 
service to restart. 1>&2
 : exit 1
 :   fi
 :   local SERVICE=$1
 :   stop $SERVICE
 :   start $SERVICE
 : }
 :
 : function delete_data {
 :   # Delete namenode, datanode and KMS data while preserving 
directory structure.
 :   rm -rf "$NODES_DIR/$NODE_PREFIX"*/data/dfs/{nn,dn}/*
 :   rm -f "$NODES_DIR/$NODE_PREFIX"*/data/kms.keystore
 :   delete_kudu_data
 : }
> Did you verify this works as expected?  I'm a bit confused about check at L
Yeah, I verified. What is the confusing point?



--
To view, visit http://gerrit.cloudera.org:8080/13248
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2
Gerrit-Change-Number: 13248
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Fri, 17 May 2019 05:57:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13318 )

Change subject: IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS 
integration
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13318/2/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
File fe/src/test/java/org/apache/impala/analysis/AuditingTest.java:

http://gerrit.cloudera.org:8080/#/c/13318/2/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java@65
PS2, Line 65: new TAccessEvent("_impala_builtins", 
TCatalogObjectType.DATABASE, "VIEW_METADATA")));
line too long (93 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I465673d749221bd5f3772814b1c22c2673a53f5c
Gerrit-Change-Number: 13318
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Fri, 17 May 2019 05:56:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration

2019-05-16 Thread Hao Hao (Code Review)
Hello Thomas Marshall, Mike Percy, Alexey Serbin, Andrew Wong, Adar Dembo, 
Grant Henke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13318

to look at the new patch set (#2).

Change subject: IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS 
integration
..

IMPALA-8504: Support CREATE TABLE statement with Kudu/HMS integration

This commits support the syntax of CREATE TABLE (and CTAS) statements for
managed Kudu tables with Kudu/HMS integration. A follow up patch will
address the actaul handling of CREATE TABLE statement with Kudu/HMS integration.

For managed table the syntax remains the same. However, the detailed changes
includes:
1) Kudu table will always be created with the new Kudu storage handler
   'org.apache.kudu.hive.KuduStorageHandler' even when Kudu/HMS integration
   is disabled. The legacy storage handler will be eventually deprecated.
2) When Kudu/HMS integration is enabled, the Kudu table underneath the managed
   HMS table will follow the naming convention 'db_name.table_name' instead of
   'impala::db_name.table_name'.
3) Add 'kudu.table_id' table property to be used with  Kudu/HMS integration.

This commits also extracts Kudu related DDL parsing and analyzing tests,
so that they can be run with or without Kudu/HMS integration enabled.

Change-Id: I465673d749221bd5f3772814b1c22c2673a53f5c
---
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
A fe/src/test/java/org/apache/impala/analysis/AuditingKuduTest.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
A fe/src/test/java/org/apache/impala/analysis/KuduHMSIntegrationTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
13 files changed, 825 insertions(+), 572 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/13318/2
--
To view, visit http://gerrit.cloudera.org:8080/13318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I465673d749221bd5f3772814b1c22c2673a53f5c
Gerrit-Change-Number: 13318
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8504: Introduce the new Kudu storage handler

2019-05-16 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13358


Change subject: IMPALA-8504: Introduce the new Kudu storage handler
..

IMPALA-8504: Introduce the new Kudu storage handler

This commits introduces the new Kudu storage handler
'org.apache.kudu.hive.KuduStorageHandler' that is used in Kudu/HMS
integration. Tables with either the legacy storage handler
'com.cloudera.kudu.hive.KuduStorageHandler' or the new one are now
considered as Kudu tables. A series of follow-up patches will be rely
on the new storage handler to support Kudu/HMS integration in Impala.

Change-Id: I75bcd5246005f4e35251aef9219f4d07eeb87dc6
---
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
2 files changed, 16 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/13358/1
--
To view, visit http://gerrit.cloudera.org:8080/13358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I75bcd5246005f4e35251aef9219f4d07eeb87dc6
Gerrit-Change-Number: 13358
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13302 )

Change subject: IMPALA-8121: part 3: invalidate on memory pressure
..

IMPALA-8121: part 3: invalidate on memory pressure

Enable --invalidate_tables_on_memory_pressure=true
on catalogd so that catalog can't hit out-of-memory.

Testing:
Ran core tests.

Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969
Reviewed-on: http://gerrit.cloudera.org:8080/13302
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M docker/catalogd/Dockerfile
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13302
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969
Gerrit-Change-Number: 13302
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13302 )

Change subject: IMPALA-8121: part 3: invalidate on memory pressure
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13302
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969
Gerrit-Change-Number: 13302
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 17 May 2019 04:54:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13355 )

Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4274/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e
Gerrit-Change-Number: 13355
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 17 May 2019 04:46:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available

2019-05-16 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1198
PS8, Line 1198:   // estimated row count by ESTIMATED_COMPRESSION_FACTOR.
Should we estimate the compression factor differently depending on the file 
format? eg text I would guess would have a much lower compression factor than 
parquet. Might be interesting to look at some emprical data here since changing 
it later could change plans, and seems not too hard to do a little research



--
To view, visit http://gerrit.cloudera.org:8080/12974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic414121c8df0d5222e4aeea096b5365beb04568a
Gerrit-Change-Number: 12974
Gerrit-PatchSet: 8
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 17 May 2019 04:44:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint

2019-05-16 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13299 )

Change subject: HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint
..


Patch Set 2:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@21
PS2, Line 21:  Before this patch is committed, the intention is
: to submit the changes to those files that are shown in this 
review as
: a patch on Impala's native-toolchain Thrift.
It seems to me like it would be easier to just copy-paste this, since the 
Thrift Protocol/Transport interfaces are typically pretty stable across Thrift 
versions. Or, contribute the changes upstream to Thrift itself. Carrying a 
largeish patch in the context of native-toolchain seems kinda like worst of 
both worlds to me.


http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@28
PS2, Line 28: sends a huge payload that can potentially crash the server.
yea, assuming this is meant to be exposed to the internet, we might want to 
consider fuzz testing the pre-authenticated endpoint in an ASAN build.


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@502
PS2, Line 502: LOG(ERROR) << "Failed to determine buffer length to decode 
base64 auth string.";
for all of the log messages in this function, can we include the remote socket 
address?


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@505
PS2, Line 505:   char out[out_max];
stack allocating here seems quite dangerous without constraining the length (eg 
to 1kb or something)


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@507
PS2, Line 507:   if (!Base64Decode(base64, in_len, out_max, out, _len)) {
Looking at Base64Decode, it doesn't seem to null-terminate the output, but down 
a few lines you're using 'out' as if it's a null-terminated C string.

I'd feel safer about all this code if we used C++ strings. gutil has this 
function:

bool Base64Unescape(const char* src, int slen, string* dest);

which will take care of all the buffer sizing stuff for you


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@521
PS2, Line 521:   size_t pass_len = out_len - user_len - 1;
again I'd feel safer about the above code if we used C++ strings, like:

pair u_p = strings::Split(up_string, 
strings::delimiter::Limit(":", 1));

(I can't necessarily spot a bug in your C string manipulation, but I also know 
that the above won't have a bug)


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@957
PS2, Line 957: SetConnectionUsername
this function is a bit weirdly named, since in the HTTP case, it isn't setting 
the connection username a tall, but rather setting up a hook that will later 
set a connection username. Maybe a moot point based on my comment elsewhere 
that this needs to be per-request state rather than per-connection state for 
HTTP


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@977
PS2, Line 977:   const string& err = Substitute("Bad transport type: $0", 
underlying_transport_type);
can we just LOG(FATAL) here since it would be a coding bug?


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@998
PS2, Line 998:   return Status::Expected(err);
same


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc@2424
PS2, Line 2424: if (hs2_http_port > 0) {
Maybe we should use '-1' to mean disable? port 0 usually means "use an 
ephemeral port'


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h
File be/src/transport/THttpServer.h:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@71
PS2, Line 71: std::vector
that's a bit of an odd choice of type instead of std::string


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@80
PS2, Line 80:   THttpServerTransportFactory(bool requireBasicAuth = false)
explicit


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@68
PS2, Line 68: strncmp
probably needs to be made case-insensitive (odd that x-forwarded-for is not 
using THRIFT_strncasecmp).


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@69
PS2, Line 69: 7
probably need to also check that sz >= 7, otherwise we might read past the end 
of value

(one of the advantages of pulling THttpServer into Impala code itself instead 
of trying to patch Thrift woudl be we could use nice convenience functions like 

[Impala-ASF-CR] Add USE CDP HIVE=true case to build-all-flag-combinations.sh

2019-05-16 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13335 )

Change subject: Add USE_CDP_HIVE=true case to build-all-flag-combinations.sh
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f8e689242e20efb37fbadf7c04764ea8ffb9a9f
Gerrit-Change-Number: 13335
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 17 May 2019 03:38:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3

2019-05-16 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13355 )

Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e
Gerrit-Change-Number: 13355
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 17 May 2019 03:36:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..

Allow data cache to be enabled optionally when running tests

This change adds two environment variables ${DATA_CACHE_DIR}
and ${DATA_CACHE_SIZE} which specify the root directory path
of the data cache and the data cache size respectively. If both
are set, Impala cluster will be started with data cache enabled
when running E2E tests. When running with HDFS, it will also
set the config option --always_use_data_cache to true to force
use of the data cache all the time.

Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Reviewed-on: http://gerrit.cloudera.org:8080/13330
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M bin/run-all-tests.sh
1 file changed, 15 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Fri, 17 May 2019 01:43:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/334/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 01:00:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
..


Patch Set 10:

(15 comments)

Thanks for your improvements - I think this is good. I really like the test, 
it'll be valuable in future.

http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc
File be/src/scheduling/cluster-membership-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@35
PS10, Line 35: /// handling code in 4 subsequently more sophisticated ways:
Maybe it's worth mentioning that it's effectively simulating the statestore 
except updates are done serially


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@48
PS10, Line 48: rng_.Reset(seed);
I don't know about the pros/cons of Kudu's RNG but we have some test utilities 
that handle seeding and allow overriding with an env variable - 
RandTestUtil::SeedRng(). I'd be inclined to keep things consistent within our 
test code.


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@64
PS10, Line 64:   /// Backends that are created but offline, i.e. they have no 
ClusterMembershipMgr and
It might be helpful to document the states that a backend can be in (and how 
that's represented in terms of membership in these lists).


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@86
PS10, Line 86: pull
poll?


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@168
PS10, Line 168: // Broadcast to all other backends
I was trying to think if there were any interesting cases where different CMMs 
saw deltas with different timing, e.g. two executors quiesce simultaneously so 
don't see the other's quiescence before they mark themselves as quiesced. The 
test framework I think is simulating serial updates but the statestore polls 
for updates in parallel.

I think this is unimportant since state of other backends doesn't affect the 
deltas pushed out in any way and each key is single writer, but just wanted to 
double check that you agreed with my thinking.


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@174
PS10, Line 174: send
sent


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@178
PS10, Line 178: // TODO Make these a function
?


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@202
PS10, Line 202: /// statestore directly.
I'm not sure what "driving the interface with the statestore" really means.


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@358
PS10, Line 358:
const or constexpr


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@359
PS10, Line 359: p_add
These are constants, right, so "const double P_ADD"


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@363
PS10, Line 363:   //
empty comment. Actually would be good to document that these are cumulative 
counts.


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@384
PS10, Line 384: if (p < p_delete && !quiescing_.empty()) {
I feel like we should also be deleting non-quiescing backends, to simulate 
non-graceful failures. I don't think it adds much complexity (famous last 
words).


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@79
PS8, Line 79:   // implicitly-defined default and copy constructors.
> I'm not sure I'm following your thoughts behind documenting it. Should we d
I think both - i think they're pretty tightly coupled anyway. Implicit copy 
constructors for complex structs are a bit of a code smell - I'd be suspicious 
if I came across this code and think that maybe it was accidentally copying 
things. I can also see devs accidentally breaking copy construction if they 
didn't realise it was meant to be used that way.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@188
PS8, Line 188: ckend explicit
> I went with 1. Do you think we should expose the fact that we rely on a hos
I guess I don't see a reason why we need to change the interface, so this is 
fine.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@299
PS8, Line 299: LOG(WARNING) << "Error updating frontend 

[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/334/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 00:50:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13357

to look at the new patch set (#4).

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..

IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
---
M docs/topics/impala_scalability.xml
1 file changed, 109 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/4
--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml
File docs/topics/impala_scalability.xml:

http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1022
PS3, Line 1022: One scalability aspect that affects heavily loaded 
clusters is the number of calls made
> I think we want to explicitly mention the HDFS NameNode here as NameNode sc
Done


http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1032
PS3, Line 1032: You can reduce the number of calls made to your files 
system by enabling the file handle
> "made to your files system" --> "made to your file system's metadata layer"
Done



--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 00:50:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml
File docs/topics/impala_scalability.xml:

http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1022
PS3, Line 1022: One scalability aspect that affects heavily loaded 
clusters is the number of calls made
I think we want to explicitly mention the HDFS NameNode here as NameNode 
scalability issues are a big issue for Impala.

Would revise to something like: "Once scalability aspect that affects heavily 
loaded clusters is the load on the metadata layer from looking up the details 
as each file is opened. On HDFS that can lead to increased load on the 
NameNode, and on S3 this can lead to an excessive number of S3 metadata 
requests."


http://gerrit.cloudera.org:8080/#/c/13357/3/docs/topics/impala_scalability.xml@1032
PS3, Line 1032: You can reduce the number of calls made to your files 
system by enabling the file handle
"made to your files system" --> "made to your file system's metadata layer"



--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 00:33:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
..


Patch Set 10:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3258/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 May 2019 00:27:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/333/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 00:22:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml
File docs/topics/impala_scalability.xml:

http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml@1118
PS2, Line 1118:   
> For this section, and the ones below, just we just reference that the this
I combined the 2 sections into one.



--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 00:16:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/333/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 17 May 2019 00:15:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13357

to look at the new patch set (#3).

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..

IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
---
M docs/topics/impala_scalability.xml
1 file changed, 108 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/3
--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/331/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 16 May 2019 23:57:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/332/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 23:51:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/332/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 16 May 2019 23:59:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml
File docs/topics/impala_scalability.xml:

http://gerrit.cloudera.org:8080/#/c/13357/2/docs/topics/impala_scalability.xml@1118
PS2, Line 1118:   
For this section, and the ones below, just we just reference that the this is 
the same file handle cache used for HDFS files? Seems better than duplicating 
these sections, especially since I don't think these sections include anything 
unique to S3.



--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 16 May 2019 23:57:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 2:

Sorry for the messy diffs!
The actual changes start from L1027.
Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 16 May 2019 23:52:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13357

to look at the new patch set (#2).

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..

IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
---
M docs/topics/impala_scalability.xml
1 file changed, 116 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/2
--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13357 )

Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/331/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 23:49:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

2019-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13357


Change subject: IMPALA-8490: [DOCS] Describe the S3 file handle caching feature
..

IMPALA-8490: [DOCS] Describe the S3 file handle caching feature

Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
---
M docs/topics/impala_scalability.xml
1 file changed, 499 insertions(+), 736 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13357/1
--
To view, visit http://gerrit.cloudera.org:8080/13357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I304a0a033475f2289d8a620448d70b90447e4ee1
Gerrit-Change-Number: 13357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-16 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
..


Patch Set 10:

(11 comments)

Thanks for the comments. Please see PS10.

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@67
PS8, Line 67:   typedef std::shared_ptr 
BackendDescriptorPtr;
> I think I would prefer if there was something in the name of this and Snaps
Would BackendDescriptorSPtr work for you? I'd like to keep it from getting 
out-of-hand long, Maybe BeDescSharedPtr would work, too?


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@70
PS8, Line 70:   typedef std::unordered_map 
BackendIdMap;
> nit: std::string in headers here and elsewhere. I guess a "using" is in a h
Done


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@79
PS8, Line 79:   // implicitly-defined default and copy constructors.
> Probably worth documenting that we expect this to be copy-constructed.
I'm not sure I'm following your thoughts behind documenting it. Should we 
document it for users of the struct to mark it as the intended behavior? Or for 
future developers who want to make changes and need to keep copy-construction 
supported? For now I assumed the former and added some documentation in comment 
and the code.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@121
PS8, Line 121:   /// steps.
> Can you document thread safety for these functions and the basics of the ca
Done


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@133
PS8, Line 133:   /// membership. This callback will only be called when 
backends are deleted from the
> Valid to call any time before Init()?
Done


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@39
PS8, Line 39:   LOG(INFO) << "Starting cluster membership manager";
> Consider DCHECK(TestInfo::is_test());
Thx, TIL.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@188
PS8, Line 188: ckend explicit
> Yeah I like 1) since it makes the comparison more explicit. We should add a
I went with 1. Do you think we should expose the fact that we rely on a 
host:port combination through the interface? I.e., should if be 
RemoveExecutor(string krpc_address) or similar instead of passing the full be 
desc?


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@235
PS8, Line 235: AddLo
> nit: unnecessary std::
Done


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@299
PS8, Line 299: LOG(WARNING) << "Error updating frontend membership 
snapshot: " << status.GetDetail();
> This method lacks a check to compare the local be desc inside current_backe
Done


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@313
PS8, Line 313:   auto it = state.current_backends.find(local_backend_id_);
> Would this consistency check have detected the bug you found? Do we need to
Yes, that's how I found it, a host trying to quiesce itself will hit the 
CheckConsistency DCHECK after trying to remove its own backend descriptor.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.h
File be/src/scheduling/executor-group.h:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.h@41
PS8, Line 41: /// host/IP address.
> Worth making explicit that this can be copy-constructed. Either by document
Done



--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 23:37:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-16 Thread Lars Volker (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13207

to look at the new patch set (#10).

Change subject: IMPALA-8460: Simplify cluster membership management
..

IMPALA-8460: Simplify cluster membership management

This change adds a class to track cluster membership called
ClusterMembershipMgr. It replaces the logic that was partially
duplicated between the ImpalaServer and the Coordinator and makes sure
that the local backend descriptor is consistent (IMPALA-8469).

The ClusterMembershipMgr maintains a view of the cluster membership and
incorporates incoming updates from the statestore. It also registers the
local backend with the statestore after startup. Clients can obtain a
consistent, immutable snapshot of the current cluster membership from
the ClusterMembershipMgr. Additionally, callbacks can be registered to
receive notifications of cluster membership changes. The ImpalaServer
and Frontend use this mechanism.

This change also unifies the naming of executor-related classes, in
particular it renames "BackendConfig" to "ExecutorGroup". In
anticipation of a subsequent change (IMPALA-8484), it adds maps to store
multiple executor groups.

This change also disables the generation of default operators from the
thrift files and instead adds explicit implementations for the ones that
we rely on. This forces us to explicitly specify comparators when
manipulating containers of thrift structs and will help prevent
accidental bugs.

Testing: This change adds a backend unit test for the new cluster
membership manager. The observable behavior of Impala does not change,
and the existing scheduler unit test and end to end tests should make
sure of that.

Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
---
M be/src/benchmarks/scheduler-benchmark.cc
M be/src/common/logging.h
M be/src/gutil/strings/split.cc
M be/src/gutil/strings/split.h
M be/src/rpc/thrift-util.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/query-state.h
M be/src/scheduling/CMakeLists.txt
D be/src/scheduling/backend-config-test.cc
D be/src/scheduling/backend-config.cc
D be/src/scheduling/backend-config.h
A be/src/scheduling/cluster-membership-mgr-test.cc
A be/src/scheduling/cluster-membership-mgr.cc
A be/src/scheduling/cluster-membership-mgr.h
A be/src/scheduling/cluster-membership-test-util.cc
A be/src/scheduling/cluster-membership-test-util.h
A be/src/scheduling/executor-group-test.cc
A be/src/scheduling/executor-group.cc
A be/src/scheduling/executor-group.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
M be/src/util/network-util.cc
M be/src/util/network-util.h
M be/src/util/uid-util-test.cc
M common/thrift/CMakeLists.txt
M common/thrift/Frontend.thrift
M common/thrift/StatestoreService.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_coordinators.py
44 files changed, 1,831 insertions(+), 1,008 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/13207/10
--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc
File be/src/scheduling/cluster-membership-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@86
PS10, Line 86: // The empty delta is used to pull subscribers for updates 
without sending new changes.
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@180
PS10, Line 180: bool is_quiescing = find(quiescing_.begin(), 
quiescing_.end(), be) != quiescing_.end();
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/13207/10/be/src/scheduling/cluster-membership-mgr-test.cc@388
PS10, Line 388:   quiescing_.erase(remove(quiescing_.begin(), 
quiescing_.end(), be), quiescing_.end());
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 23:37:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13302 )

Change subject: IMPALA-8121: part 3: invalidate on memory pressure
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4273/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13302
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969
Gerrit-Change-Number: 13302
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 23:34:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13302 )

Change subject: IMPALA-8121: part 3: invalidate on memory pressure
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13302
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969
Gerrit-Change-Number: 13302
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 23:34:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8121: part 3: invalidate on memory pressure

2019-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13302 )

Change subject: IMPALA-8121: part 3: invalidate on memory pressure
..


Patch Set 6:

Test failure was a flake


--
To view, visit http://gerrit.cloudera.org:8080/13302
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d55ef0058abcf70f75b10ae9d89a0274859969
Gerrit-Change-Number: 13302
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 23:34:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 7:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3257/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 16 May 2019 22:22:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
A fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHook.java
A fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/test/java/org/apache/impala/hooks/QueryExecHookManagerTest.java
8 files changed, 436 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/7
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 


[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13355 )

Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3256/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e
Gerrit-Change-Number: 13355
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 21:42:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 7:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/13352/6/be/src/service/frontend.h@190
PS6, Line 190: PostExecHook
> nit: PostQueryExecHook is probably a better name?
Done


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java
File fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java:

http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@36
PS2, Line 36:
> It is dangerous to let plugin fail Impala in this way. Could impala catches
This is during impala daemon start-up; is it not preferred to prevent start-up 
if there is a configuration problem?


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@47
PS2, Line 47:
> "when a (qualifying) Impala query" should it be "after a (qualifying) Impal
"has executed" already communicates that the point of time in question is in 
the past, so "after" is redundant.

But I think it's clearer, so I'll change it to "after"


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@48
PS2, Line 48:
> should "has executed" be "has been executed"?
This is just a difference between active voice and passive voice.  No 
difference in meaning


http://gerrit.cloudera.org:8080/#/c/13352/2/fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java@63
PS2, Line 63:
> why don't you have cleanup function, which will be called by Impala when Im
Because of the way impala shuts down, it will never be guaranteed to execute.  
This is covered and commented on in the design doc


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java
File fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java:

http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@28
PS6, Line 28: lineageGraph
> nit: use _ suffix naming convention for member varaibles, i.e. lineageGraph
kk


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/PostQueryHookContext.java@31
PS6, Line 31: Objects.requireNonNull(lineageGraph).deepCopy();
> I don't think we want to do a deep copy each time.
Each time it's constructed?  Or called?

Because a single context object is passed to all the hook implementations


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@33
PS6, Line 33: singleton
> i'm not 100% convinced why this needs to be a singleton. Can Frontend owns
I agree.  But what about JniFrontend owning it?


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@78
PS6, Line 78: private static final int NUM_HOOK_THREADS = 3;
> should this be configurable?
Probably.  I'd like to get the hook configuration sorted first before I tackle 
this though, since they should probably use the same config mechanism


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@97
PS6, Line 97: Configuration config
> I don't think passing hadoop configuration is correct here. Impala doesn't
Yeah that sounds better.  Using a backend flag


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@98
PS6, Line 98: LOG.info("impalaStartup hook invoked with config: {}", config);
> config can get pretty big, we don't want to print it on info log level
The `toString` of the config object actually only prints the file names, not 
their content.


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@152
PS6, Line 152:   @Override
 :   protected void finalize() throws Throwable {
 : this.cleanUp();
 : super.finalize();
 :   }
> i don't know if it's a good idea do this in finalize(), maybe use Runtime.a
Will do


http://gerrit.cloudera.org:8080/#/c/13352/6/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@175
PS6, Line 175: void
> can we return a List instead of void in case we want to do somethin
Do you intend on passing the futures back to the backend?  I fully support at 
least storing the futures here in case we decide to take action later on, but 
since this is the "manager", i think that any course of action 

[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13352/7/fe/src/main/java/org/apache/impala/hooks/QueryExecHookManager.java@55
PS7, Line 55:  * is performed asynchronously during {@link 
#executePostQueryHooks(PostQueryHookContext)}.
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 7
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 16 May 2019 21:35:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and 
partition
..

IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

This happens when tables load metadata from HMS.
Add MetastoreShim functions to support HMS3 only functions.
Add validwriteIdlists to query profile through timeline.

Tests:
Manually tests HMS2 and HMS3, using log files to check
Unit tests against HMS3

ToDo:
WriteId and valid writeIds can be fetched in other time, need
more study on that.

Profile example:
Query Compilation: 5s057ms
   - Metadata load started: 63.006ms (63.006ms)
   - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms)
   - Loaded ValidWriteIdLists:
   acid.insert_only_no_partitions:6:9223372036854775807::
   acid.insert_only_with_partitions:3:9223372036854775807::
 : 4s921ms (120.580ms)
   - Analysis finished: 4s929ms (8.013ms)
Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Reviewed-on: http://gerrit.cloudera.org:8080/13215
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M common/thrift/CatalogObjects.thrift
A fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
15 files changed, 348 insertions(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 19
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and 
partition
..


Patch Set 18: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 18
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 16 May 2019 21:12:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3

2019-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13355 )

Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e
Gerrit-Change-Number: 13355
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 20:59:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8560: Prometheus metrics support in Impala

2019-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13345 )

Change subject: IMPALA-8560: Prometheus metrics support in Impala
..


Patch Set 2:

(13 comments)

Some more style nits and I some requests for additional test coverage. You 
probably have a better understanding of the prometheus format than I do, so if 
you can think of more edge cases that I didn't ask for tests for, please add 
them.

http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h
File be/src/util/collection-metrics.h:

http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h@80
PS2, Line 80: +
<<

You can also just combine these lines, i.e.

  *val << name << " " << s;


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h@170
PS2, Line 170: string
nit: std::string


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/collection-metrics.h@173
PS2, Line 173: *val << name;
You can combine these lines by chaining <<. If you do that here and below it 
will be a lot more concise.


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/histogram-metric.h@75
PS2, Line 75: string
nit: std::string


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/histogram-metric.h@80
PS2, Line 80:   *value << "{le=0.2} ";
Same comment about chaining << to get this into fewer lines. If you look at 
some of the LOG statements in the code we even have multi-line chains in a lot 
of places.


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc
File be/src/util/metrics-test.cc:

http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@470
PS2, Line 470: +
nit: << instead of + here and in below lines


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@477
PS2, Line 477: TEST_F(MetricsTest, CountersPrometheus) {
Can you add a BYTE unit counter?


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@488
PS2, Line 488:   AddMetricDef("gauge", TMetricKind::GAUGE, TUnit::NONE);
Can you add tests for the supported unit types:

  $ grep unit common/thrift/metrics.json  | sort | uniq
"units": "BYTES",
"units": "NONE",
"units": "TIME_MS",
"units": "TIME_NS",
"units": "TIME_S",
"units": "UNIT",


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@495
PS2, Line 495: TEST_F(MetricsTest, PropertiesPrometheus) {
Can you test some more edge cases with the strings, e.g. what happens if 
there's a line break in the string? It looks like the prometheus format 
requires strings to be escaped: 
https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@513
PS2, Line 513:   AssertPrometheus(stats_val, "stats_metric",
Do we have test coverage for all the floating point formats in the spec: 
https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@514
PS2, Line 514:   "stats_metric_count 2\nstats_metric_last 
20.00\nstats_metric_min "
Can you format this in a way that makes it easier to see the lines, i.e.

  "stats_metric_count 2\n"
  "stats_metric_last 20.00\n"
  "stats_metric_min 10\n"
...


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@533
PS2, Line 533:   "histogram-metric{le=0.2} 2500\nhistogram-metric{le=0.5} "
Same comment on formatting.


http://gerrit.cloudera.org:8080/#/c/13345/2/be/src/util/metrics-test.cc@543
PS2, Line 543:   exp_val << "# HELP description\n# TYPE counter1 
COUNTER\ncounter1 2048\n# HELP "
same comment on formatting.



--
To view, visit http://gerrit.cloudera.org:8080/13345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08
Gerrit-Change-Number: 13345
Gerrit-PatchSet: 2
Gerrit-Owner: Harshil 
Gerrit-Reviewer: Harshil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 20:57:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8560: Prometheus metrics support in Impala

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13345 )

Change subject: IMPALA-8560: Prometheus metrics support in Impala
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3255/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08
Gerrit-Change-Number: 13345
Gerrit-PatchSet: 2
Gerrit-Owner: Harshil 
Gerrit-Reviewer: Harshil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 20:50:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3

2019-05-16 Thread Laszlo Gaal (Code Review)
Hello Todd Lipcon, Tim Armstrong, Joe McDonnell,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13355

to review the following change.


Change subject: Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3
..

Bump toolchain version to 35-4c4c185a57 to pick up CMake 3.14.3

Fixes in the new CMake version include:
- improved rule generation for ninja
- forbidding the use of select() during CMake runs, which caused hangs
  in the past.

Private builds have passed in the following configurations:
- on Ubuntu 14.04 using ninja
- on CentOS 7.4 and CnetOS 6.4 using GNU make
- in a Docker container (using docker/test-with-docker.py) using
  GNU make on Ubuntu 16.04 and Ubuntu 18.04

Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/13355/1
--
To view, visit http://gerrit.cloudera.org:8080/13355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I251f1cd5e0f02abf5275f2f5aa37fac32532315e
Gerrit-Change-Number: 13355
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8344: Add support for running the minicluster with S3Guard

2019-05-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13020 )

Change subject: IMPALA-8344: Add support for running the minicluster with 
S3Guard
..


Patch Set 4:

(6 comments)

mostly minor comments, overall LGTM pending a few comments

http://gerrit.cloudera.org:8080/#/c/13020/4/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/13020/4/bin/impala-config.sh@341
PS4, Line 341:   export 
S3GUARD_METADATASTORE_IMPL="org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore"
Is this necessary? I think the default impl uses the NullMetadataStore already, 
right?


http://gerrit.cloudera.org:8080/#/c/13020/4/testdata/bin/load-test-warehouse-snapshot.sh
File testdata/bin/load-test-warehouse-snapshot.sh:

http://gerrit.cloudera.org:8080/#/c/13020/4/testdata/bin/load-test-warehouse-snapshot.sh@67
PS4, Line 67:   hadoop s3guard prune -seconds 1 -meta 
"dynamodb://${S3GUARD_DYNAMODB_TABLE}" \
is it necessary to prune a newly initialized S3Guard table?


http://gerrit.cloudera.org:8080/#/c/13020/4/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/13020/4/tests/common/impala_test_suite.py@a175
PS4, Line 175:
Should we delete the S3Client as well? It doesn't look like it is used anywhere 
else, right? We can remove the boto3 dependency as well, which decreases the 
number of Python dependencies we have.


http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py
File tests/util/hdfs_util.py:

http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py@158
PS4, Line 158:   def __init__(self, filesystem_type="HDFS"):
the filesystem_type is just purely for debug messages right? might be useful to 
mention that


http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py@163
PS4, Line 163: hadoop_command = ['hadoop', 'fs'] + command
nit: hdfs instead of hadoop


http://gerrit.cloudera.org:8080/#/c/13020/4/tests/util/hdfs_util.py@170
PS4, Line 170:   def create_file(self, path, file_data, overwrite=True):
nit: would be nice to have some docs for these methods, e.g. mention that 
create_file actually writes to a temp file on the local fs first and then 
uploads the file to the target fs



--
To view, visit http://gerrit.cloudera.org:8080/13020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c748529a494bb6e70fec96dc031523ff79bf61d
Gerrit-Change-Number: 13020
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 16 May 2019 20:41:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4272/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 16 May 2019 20:25:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 16 May 2019 20:25:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3254/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 16 May 2019 20:20:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3253/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 16 May 2019 20:20:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8560: Prometheus metrics support in Impala

2019-05-16 Thread Harshil (Code Review)
Harshil has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/13345 )

Change subject: IMPALA-8560: Prometheus metrics support in Impala
..

IMPALA-8560: Prometheus metrics support in Impala

-- This change adds Prometheus text explosion format metric generation.
-- More details can be found below:
 -- https://prometheus.io/docs/instrumenting/exposition_formats
-- Added unit test to test this change

Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08
---
M be/src/util/collection-metrics.h
M be/src/util/histogram-metric.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
5 files changed, 280 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13345/2
--
To view, visit http://gerrit.cloudera.org:8080/13345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08
Gerrit-Change-Number: 13345
Gerrit-PatchSet: 2
Gerrit-Owner: Harshil 
Gerrit-Reviewer: Harshil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config

2019-05-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13313 )

Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py@260
PS1, Line 260: help=SUPPRESS_HELP
> One potential use case that maybe useful is is to specify a different locat
This is one of the ways we thought of while coming up with this solution, so I 
am completely on board if you think this use case will be common. The only 
reason I am usually skeptical about adding env variables is that, if you are 
sourcing a bunch of them in and the naming is ambiguous then there can be some 
conflicting ones and it can get difficult to manage. We will have to be super 
careful about naming this so its super obvious what it is used for, how about 
IMPALA_SHELL_GLOBAL_CONFIG_FILE. Also if we want users to set this then we 
might want to expose this somewhere in the help text and make sure we update 
the impala docs.



--
To view, visit http://gerrit.cloudera.org:8080/13313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
Gerrit-Change-Number: 13313
Gerrit-PatchSet: 4
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 19:56:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow running backend tests sharded and in parallel

2019-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13290 )

Change subject: Allow running backend tests sharded and in parallel
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/13290/2/be/CMakeLists.txt@675
PS2, Line 675: RUN_SERIAL
I haven't use this before. If we have -j 8 on our ctest invocation, and 
expr-test and decompress-test are sharded, expr-test shards may overlap with 
decompress-test shards. Is that right? (I don't mean to imply any problem with 
that.)



--
To view, visit http://gerrit.cloudera.org:8080/13290
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d44475e3f5a45c806f00d5003eeadf59683b009
Gerrit-Change-Number: 13290
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 19:58:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config

2019-05-16 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13313 )

Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/13313/1/shell/option_parser.py@260
PS1, Line 260: help=SUPPRESS_HELP
> This is one of the ways we thought of while coming up with this solution, s
+1 on IMPALA_SHELL_GLOBAL_CONFIG_FILE since this is an Impala shell specific 
config.



--
To view, visit http://gerrit.cloudera.org:8080/13313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
Gerrit-Change-Number: 13313
Gerrit-PatchSet: 4
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 19:58:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13352/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13352/3/be/src/service/impala-server.cc@496
PS3, Line 496:   //
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 19:23:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13352/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13352/5/be/src/service/impala-server.cc@496
PS5, Line 496:   //
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 19:34:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java
File fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java:

http://gerrit.cloudera.org:8080/#/c/13352/5/fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java@31
PS5, Line 31: this.lineageGraph = 
Objects.requireNonNull(lineageGraph).deepCopy();
Lot of defensive copies because `TLineageGraph` is mutable, and context objects 
are given by reference to potentially multiple hooks.  This initial copy may 
effect performance since it occurs even if no hooks are registered.  This could 
be avoided if we provide an additional flag to disable hooks, independent of 
hook registration.  Then we could avoid the JNI call and this constructor 
altogether.

Feedback?



--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 16 May 2019 19:34:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaQueryExecHook.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
7 files changed, 361 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/5
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 5
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has removed Todd Lipcon from this change.  ( 
http://gerrit.cloudera.org:8080/13352 )

Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..


Removed reviewer Todd Lipcon.
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] [WIP] IMPALA-8473: publish lineage info via hook

2019-05-16 Thread radford nguyen (Code Review)
radford nguyen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13352


Change subject: [WIP] IMPALA-8473: publish lineage info via hook
..

[WIP] IMPALA-8473: publish lineage info via hook

This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

Hooks can be consumed by downstream consumers (i.e.
runtime dependencies) at supported places during Impala
execution:

- impalad startup
- post-query execution

The consumers are to be frontend Java dependencies
intiated at runtime.  The hadoop property
`impala.post.exec.hooks` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  IOW, a
query may complete and return to the user before any
or all hooks have completed executing.

Tests:

TBD

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHook.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaExecHookManager.java
A fe/src/main/java/org/apache/impala/hooks/ImpalaHookContext.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
7 files changed, 361 insertions(+), 25 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13352/3
--
To view, visit http://gerrit.cloudera.org:8080/13352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Gerrit-Change-Number: 13352
Gerrit-PatchSet: 3
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3252/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 16 May 2019 19:12:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3251/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 16 May 2019 19:11:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] DWX-124: Prometheus metrics support in Impala

2019-05-16 Thread Harshil (Code Review)
Harshil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13345 )

Change subject: DWX-124: Prometheus metrics support in Impala
..


Patch Set 1:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/13345/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13345/1//COMMIT_MSG@9
PS1, Line 9: -- This change adds Prometheus text explosion format metric 
generation in impala, more details
> Can you wrap lines to 72 chars in commit messages.
Done


http://gerrit.cloudera.org:8080/#/c/13345/1//COMMIT_MSG@13
PS1, Line 13: -- Doc that talk about this approach is here: 
https://docs.google.com/document/d/1KeOuFowH83q9l7iGHFMXL2mAejt4EZCjn3XfKDu7sho/edit?usp=sharing
> This doc isn't publicly accesssible.
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h
File be/src/util/collection-metrics.h:

http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@170
PS1, Line 170: string
> Use std::string in header files (we have a policy not to import std::string
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@170
PS1, Line 170:   string name, std::stringstream* val, std::stringstream* 
metric_kind) {
> It's annoying that the original author of this file didn't move the ToJson(
ok thanks


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@172
PS1, Line 172: string collect_data;
> We should just append to val instead of adding an intermediate string.
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@175
PS1, Line 175: tatic_cast(
> Why are the casts needed? this seems suspicious to me - like it might not w
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@178
PS1, Line 178:   collect_data += name + "_last " + std::to_string(value_) + 
"\n";
> For string concatenation we either prefer using << to append to a stringstr
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@200
PS1, Line 200: sqrt
> std::sqrt
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/collection-metrics.h@204
PS1, Line 204: *metric_kind << "# TYPE " + name + " counter";
> use << to append to the stream instead of string concatenation.
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/histogram-metric.h@81
PS1, Line 81:   histogram_data +=
> Same comments about using << to append directly to value
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/histogram-metric.h@99
PS1, Line 99: *metric_kind << "# TYPE " + name + " histogram";
> Same comment about <<
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.h@171
PS1, Line 171: string
> std::string in headers
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.h@173
PS1, Line 173: *metric_kind << "# TYPE " + name + " " + 
PrintThriftEnum(kind()).c_str();
> Use << instead of +
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc
File be/src/util/metrics.cc:

http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@190
PS1, Line 190: return;
> return not needed
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@206
PS1, Line 206:
> Don't need to add vertical whitespace
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@220
PS1, Line 220: void MetricGroup::ToPrometheus(bool include_children, 
std::stringstream* out_val) {
> Same comments apply about using << instead of strings.
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@220
PS1, Line 220: std::
> shouldn't need to use std:: prefix in .cc files.
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/be/src/util/metrics.cc@245
PS1, Line 245:
> unnecessary vertical whitespace and return
Done


http://gerrit.cloudera.org:8080/#/c/13345/1/bin/distcc/distcc_server_setup.sh
File bin/distcc/distcc_server_setup.sh:

http://gerrit.cloudera.org:8080/#/c/13345/1/bin/distcc/distcc_server_setup.sh@58
PS1, Line 58:   if ! [[ $LSB_VERSION == 14.04 || $LSB_VERSION == 16.04 || 
$LSB_VERSION == 18.04 ]]; then
> Great if this works - but can you do this in a separate change? I also want
I am using ubuntu 18.04, so I tested this by running it on host that I have.



--
To view, visit http://gerrit.cloudera.org:8080/13345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5349085a2007b568cb97f9b8130804ea64d7bb08
Gerrit-Change-Number: 13345
Gerrit-PatchSet: 1
Gerrit-Owner: Harshil 
Gerrit-Reviewer: Harshil 
Gerrit-Reviewer: Impala Public Jenkins 

[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 4: Code-Review+2

This looks good to me.


--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 16 May 2019 19:12:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13277 )

Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/330/ : Doc tests passed.


-- 
To view, visit http://gerrit.cloudera.org:8080/13277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb
Gerrit-Change-Number: 13277
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 19:00:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13313 )

Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3250/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
Gerrit-Change-Number: 13313
Gerrit-PatchSet: 4
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 18:52:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits

2019-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13277 )

Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml
File docs/topics/impala_scaling_limits.xml:

http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@77
PS3, Line 77:   Number of Impalad Coordinators: 1 coordinator for every 
50 executors
> for at most every 50 executors (could be misinterpreted as-is)
Done


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@148
PS3, Line 148:   Number of views
> Also total, per database?
Done


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@152
PS3, Line 152:   Number of user-defined functions
> Also total, per database?
Done


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@167
PS3, Line 167:   Number of block per file
> HDFS blocks?
Done


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@171
PS3, Line 171:   Footer size
> Footer size isn't so much under user control - it's a function of the amoun
Removed


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@282
PS3, Line 282: fragment
> fragments
Done



--
To view, visit http://gerrit.cloudera.org:8080/13277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb
Gerrit-Change-Number: 13277
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 18:48:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits

2019-05-16 Thread Alex Rodoni (Code Review)
Hello Lars Volker, Fredy Wijaya, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13277

to look at the new patch set (#4).

Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
..

IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits

- Listed the known/tested SCALING Limits.
- Unknown limits are marked hidden for now. When the numbers
are available, will remove the hidden tag.

Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb
---
M docs/impala.ditamap
A docs/topics/impala_scaling_limits.xml
2 files changed, 365 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/13277/4
--
To view, visit http://gerrit.cloudera.org:8080/13277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb
Gerrit-Change-Number: 13277
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13277 )

Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/330/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb
Gerrit-Change-Number: 13277
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 18:49:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7608: Estimate row count from file size when no stats available

2019-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12974 )

Change subject: IMPALA-7608: Estimate row count from file size when no stats 
available
..


Patch Set 8:

(20 comments)

Another round of thoughts on the test infrastructure. I think the product code 
looks fine, just want to make sure the tests are more maintainable and easier 
to work on in the future.

http://gerrit.cloudera.org:8080/#/c/12974/8/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/12974/8/be/src/service/query-options.h@48
PS8, Line 48: // The Debug webpage won't be able to be connected once the 
DCHECK fails.
Remove the last line, no need to describe symptoms of a DCHECK here.


http://gerrit.cloudera.org:8080/#/c/12974/8/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/12974/8/common/thrift/ImpalaService.thrift@391
PS8, Line 391: DISABLE_HDFS_NUM_ROWS_EST
EST->ESTIMATE. No real need to abbreviate in a safety valve argument that will 
be rarely used.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@285
PS8, Line 285:   // Added this field to prevent the case in the method 
getStatsNumRows when
Can you rewrite the comment to just describe what the value is. No need to 
include the story about why you added it here.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@288
PS8, Line 288:   private double defaultRowWidth_ = 1.0;
Should be a constant, not a variable, i.e. private static double 
DEFAULT_ROW_WIDTH; Maybe also rename to indicate that it's an estimate, not the 
real row width, i.e. DEFAULT_ROW_WIDTH_ESTIMATE.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1148
PS8, Line 1148:   private long getStatsNumRows(Analyzer analyzer) {
I would prefer if you just passed in the query options to this methods. 
Otherwise when reading code one might think that it's using the analyzer in a 
more complex way.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1190
PS8, Line 1190: // In the case when there is no Column defined, we use 
an ultimate
Do we have a test case for this?


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

PS8:
We should also test the nodes with the new estimate functionality disabled in 
these unit tests. Just to make sure that all branches in the code that handle 
missing stats are covered.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@234
PS8, Line 234: // The cardinality of an AggregateNode is always 1 no matter
Not true if there's a group by - we should test that too.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@262
PS8, Line 262: path.add(0);
Can we use something cleaner for constant lists like: Arrays.asList(0, 1, 0);


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@454
PS8, Line 454: GENERATE_DISTRIBUTED_PLAN
I think this new planner test option is awkward since the planner tests already 
have a different mechanism to test single-node plans. Also the default for 
planner tests is to generate a distributed plan, which makes it more confusing.

I'd propose that we instead implement it purely within this file instead - i.e. 
just add an argument to verifyCardinality.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@482
PS8, Line 482:   protected void verifyCardinality(String query, long expected) {
For the tests that rely on file-size-based estimates, I think we want some 
margin of error for the cardinality estimates.I think Paul added something for 
the planner tests.

The problem is that file sizes may vary a little bit, e.g. because of changes 
to the file writers or random variation, so cardinality estimates based on file 
size can vary too.


http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/12974/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@253
PS8, Line 253:   public void testFkPkJoinDetectionWithHDFSNumRowsEstEnabled() {
Let's call the ones with the default options testFkPkJoinDetection() - no need 
to embed the fact that it's 

[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Michael Ho (Code Review)
Hello Lars Volker, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13330

to look at the new patch set (#4).

Change subject: Allow data cache to be enabled optionally when running tests
..

Allow data cache to be enabled optionally when running tests

This change adds two environment variables ${DATA_CACHE_DIR}
and ${DATA_CACHE_SIZE} which specify the root directory path
of the data cache and the data cache size respectively. If both
are set, Impala cluster will be started with data cache enabled
when running E2E tests. When running with HDFS, it will also
set the config option --always_use_data_cache to true to force
use of the data cache all the time.

Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
---
M bin/run-all-tests.sh
1 file changed, 15 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/13330/4
--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13330 )

Change subject: Allow data cache to be enabled optionally when running tests
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13330/2/bin/run-all-tests.sh
File bin/run-all-tests.sh:

http://gerrit.cloudera.org:8080/#/c/13330/2/bin/run-all-tests.sh@64
PS2, Line 64:
> Do we know if the choice of filesystem matters for the cache? For example,
It may matter if the filesystem doesn't support hole punching. I leave it as 
empty in the new patch.


http://gerrit.cloudera.org:8080/#/c/13330/2/bin/run-all-tests.sh@78
PS2, Line 78: _START_CLUSTER_ARGS="${TEST_S
> I think the data caching is orthogonal to the filesystem, and we might want
Done



--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 16 May 2019 18:11:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow data cache to be enabled optionally when running tests

2019-05-16 Thread Michael Ho (Code Review)
Hello Lars Volker, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13330

to look at the new patch set (#3).

Change subject: Allow data cache to be enabled optionally when running tests
..

Allow data cache to be enabled optionally when running tests

This change adds two environment variables ${DATA_CACHE_DIR}
and ${DATA_CACHE_SIZE} which specify the root directory path
of the data cache and the data cache size respectively. If both
are set, Impala cluster will be started with data cache enabled
when running E2E tests. When running with HDFS, it will also
set the config option --always_use_data_cache to true to force
use of the data cache all the time.

Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
---
M bin/run-all-tests.sh
1 file changed, 15 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/13330/3
--
To view, visit http://gerrit.cloudera.org:8080/13330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I09117ab289c2355408212a5fc6493ab751f4853b
Gerrit-Change-Number: 13330
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config

2019-05-16 Thread Ethan Xue (Code Review)
Ethan Xue has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13313 )

Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13313/1/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/13313/1/shell/impala_shell.py@1628
PS1, Line 1628:   test_global_config = os.path.expanduser(options.global_config)
  :   if os.path.isfile(test_global_config):
  : # Always output the source of the global config if verbose
  : if options.verbose:
  :   print_to_stderr(
  : "Loading in options from global config file: %s \n" % 
test_global_config)
  : if test_global_config != default_global_config:
  :   default_global_config = test_global_config
  :   elif test_global_config != default_global_config:
  : print_to_stderr('%s not found.\n' % test_global_config)
  : sys.exit(1)
  :
  :   # use path to custom file specified by user in config_file 
option
  :   input_config = os.path.expanduser(options.config_file)
  :   # verify input_config, if found
  :   if input_config != default_user_config:
  : if os.path.isfile(input_config):
  :   if options.verbose:
  : print_to_stderr("Loading in options from config file: 
%s \n" % input_config)
  :   # command
> One confusion I had earlier was particularly this line: input_config != use
Ah I see what you're getting at, and I agree. Thanks!


http://gerrit.cloudera.org:8080/#/c/13313/1/shell/impala_shell.py@1656
PS1, Line 1656:
> nit: formatting is a bit odd. In Python, we usually indent parser.option_li
Done



--
To view, visit http://gerrit.cloudera.org:8080/13313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
Gerrit-Change-Number: 13313
Gerrit-PatchSet: 4
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 17:56:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6042: Allow Impala shell to use a global impalarc config

2019-05-16 Thread Ethan Xue (Code Review)
Hello Fredy Wijaya, Todd Lipcon, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13313

to look at the new patch set (#4).

Change subject: IMPALA-6042: Allow Impala shell to use a global impalarc config
..

IMPALA-6042: Allow Impala shell to use a global impalarc config

Currently, impalarc files can be specified on a per-user basis
(stored in ~/.impalarc), and they aren't created by default. The
Impala shell should pick up /etc/impalarc as well, in addition
to the user-specific configurations.

The intent here is to allow a "global" configuration of the shell
by a system administrator.

Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
---
M shell/impala_shell.py
M shell/impala_shell_config_defaults.py
M shell/option_parser.py
A tests/shell/good_impalarc2
M tests/shell/test_shell_commandline.py
5 files changed, 78 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/13313/4
--
To view, visit http://gerrit.cloudera.org:8080/13313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
Gerrit-Change-Number: 13313
Gerrit-PatchSet: 4
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits

2019-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13277 )

Change subject: IMPALA-8116: [DOCS] A new doc for Impala Scaling Limits
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml
File docs/topics/impala_scaling_limits.xml:

http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@77
PS3, Line 77:   Number of Impalad Coordinators: 1 coordinator for every 
50 executors
for at most every 50 executors (could be misinterpreted as-is)


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@148
PS3, Line 148:   Number of views
Also total, per database?


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@152
PS3, Line 152:   Number of user-defined functions
Also total, per database?


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@167
PS3, Line 167:   Number of block per file
HDFS blocks?


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@171
PS3, Line 171:   Footer size
Footer size isn't so much under user control - it's a function of the amount of 
metadata per file, which is mostly a funciton of the number of columns and row 
groups


http://gerrit.cloudera.org:8080/#/c/13277/3/docs/topics/impala_scaling_limits.xml@282
PS3, Line 282: fragment
fragments



--
To view, visit http://gerrit.cloudera.org:8080/13277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6df672e5de1fb2d34f6b78524e8f20e85ea34fb
Gerrit-Change-Number: 13277
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 May 2019 17:45:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13353 )

Change subject: IMPALA-8443: Record time spent in authorization in the runtime 
profile
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3249/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae
Gerrit-Change-Number: 13353
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 17:24:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8443: Record time spent in authorization in the runtime profile

2019-05-16 Thread Tamas Mate (Code Review)
Tamas Mate has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13353


Change subject: IMPALA-8443: Record time spent in authorization in the runtime 
profile
..

IMPALA-8443: Record time spent in authorization in the runtime profile

The analysis and authorization is handled together as authorization
depends on the results of the analysis. I have moved the events inside
the analyzeAndAuthorize method and split them, this will help locating
which part of the analysis was slow.

Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
2 files changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/13353/2
--
To view, visit http://gerrit.cloudera.org:8080/13353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5bb85e57fcc75d41f3eb2911e6d375e0da6f82ae
Gerrit-Change-Number: 13353
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 


[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and 
partition
..


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4271/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 18
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 16 May 2019 15:51:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and 
partition
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 18
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 16 May 2019 15:51:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13339 )

Change subject: IMPALA-8369 : Skip test_owner_privileges test when running 
against Hive-3
..

IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3

Currently, when running with USE_CDP_HIVE=true, Sentry service's sync
with HMS is very slow. This is most likely due to the fact that in HMS-3
the notification events are generated using the JSONMessageFactory
provided by Metastore, unlike in case of HMS-2 setup. When running
against HMS-2, Sentry provides its own MessageFactory implementation
which has its limitations and cannot be used in HMS-3. In order to fix
this Sentry should add support for the out-of-box message factory
available in Hive-3 (See SENTRY-2518).

Due to these additional delays from Sentry test_owner_privileges fails
due to race conditions between the cached information in catalog and
Sentry server (See IMPALA-8550). This patch disables this
test when running against HMS-3 until we fix the issues both on
the Sentry and Impala side.

Testing done:
1. Confirmed the test is skipped when using USE_CDP_HIVE=true
2. Confirmed the test is not skipped when using USE_CDP_HIVE=false

Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3
Reviewed-on: http://gerrit.cloudera.org:8080/13339
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/authorization/test_owner_privileges.py
M tests/common/skip.py
2 files changed, 8 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3
Gerrit-Change-Number: 13339
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13339 )

Change subject: IMPALA-8369 : Skip test_owner_privileges test when running 
against Hive-3
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3
Gerrit-Change-Number: 13339
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 15:34:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8476: Replace Sentry admin check workaround with proper Sentry API

2019-05-16 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13346 )

Change subject: IMPALA-8476: Replace Sentry admin check workaround with proper 
Sentry API
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13346/4/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/13346/4/fe/src/main/java/org/apache/impala/service/JniCatalog.java@309
PS4, Line 309: // When there is no user having user name as user.getName() in 
the
 :   // underlying OS, the method isAdmin() in 
SentryPolicyServiceClient.class
 :   // would throw a SentryUserException. In this case, we 
also consider this
 :   // requesting user a non Sentry administrator.
reword: When a user is not defined in Sentry, isAdmin() will throw 
SentryUserException, we will consider this requesting user as a non-Sentry 
administrator.



--
To view, visit http://gerrit.cloudera.org:8080/13346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a27140d401494bc372ad0da96ada57bda94cd82
Gerrit-Change-Number: 13346
Gerrit-PatchSet: 4
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 14:07:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8541: Remove existing roles in order to not interfere with a test.

2019-05-16 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13349 )

Change subject: IMPALA-8541: Remove existing roles in order to not interfere 
with a test.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13349/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13349/1//COMMIT_MSG@7
PS1, Line 7: Remove existing roles in order to not interfere with a test.
reword: Remove existing roles prior to executing the tests in AuthorizationTest


http://gerrit.cloudera.org:8080/#/c/13349/1//COMMIT_MSG@8
PS1, Line 8:
   : The test TestShortUsernameUsed() in AuthorizationTest.java would 
fail if the
   : current user is associated with some role granted the privilege to 
access some
   : databases, e.g., the database "functional".
   :
   : Specifically, for the SQL statement "select * from alltypes", if 
the role is
   : already granted the privilege to access the database "functional", 
only
   : AnalysisException will be thrown. To address this issue, we remove 
all the
   : lingering roles before running this test.
The issue isn't specific to TestShortUsernameUsed(). We can explain the issue 
in a more generic fashion.


http://gerrit.cloudera.org:8080/#/c/13349/1/fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java
File fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java:

http://gerrit.cloudera.org:8080/#/c/13349/1/fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java@438
PS1, Line 438: // Remove existing roles in order to not interfere with this 
test.
 : for (TSentryRole role : sentryService_.listAllRoles(USER)) {
 :   AUTHZ_CATALOG.removeRole(role.getRoleName());
 : }
We should do that in the @BeforeClass. Also, we don't actually need to remove 
the roles in Sentry. We can just remove the roles from the catalog, e.g.

AUTHZ_CATALOG.getAuthPolicy().getAllRoles().forEach(
role -> AUTHZ_CATALOG.getAuthPolicy().removeRole(role.getName()));



--
To view, visit http://gerrit.cloudera.org:8080/13349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7122ee0cc90eaf7d45c5637c28c9cdb59502967b
Gerrit-Change-Number: 13349
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2019 14:00:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13339 )

Change subject: IMPALA-8369 : Skip test_owner_privileges test when running 
against Hive-3
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3
Gerrit-Change-Number: 13339
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 10:05:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 : Skip test owner privileges test when running against Hive-3

2019-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13339 )

Change subject: IMPALA-8369 : Skip test_owner_privileges test when running 
against Hive-3
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4270/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3
Gerrit-Change-Number: 13339
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 16 May 2019 10:05:32 +
Gerrit-HasComments: No