[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 24:

(3 comments)

nits/clarifications, otherwise lgtm.

http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1017
PS24, Line 1017:* the privilege, it will be removed on the next refresh. 
ALTER DATABASE SET OWNER
might be worthwhile to state that while hms state will correctly reflect the 
owner and the catalogd catalog will reflect the hms state, since sentry is 
updated async by hms, sentry's state may not reflect the state in hms. when an 
out-of-date syntry updates the catalogd, it will use sentries incorrect state. 
consequently, the state in catalogd may not always match the state in hms.

there will be some period that we could wait for sentry and hms to be in sync. 
perhaps we wait for their state to settle. if it doesn't, we roll-back the hms 
state? not sure if hms would give you the xact to effectively roll-back. I'm 
fine with not doing this now, but if its something that may be of interest, 
perhaps a todo is useful.


http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/test/resources/sentry-site.xml.template
File fe/src/test/resources/sentry-site.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/24/fe/src/test/resources/sentry-site.xml.template@81
PS24, Line 81:   
nit: remove the spacing


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@156
PS23, Line 156:
> Can't use that since I want to create the database after the cluster starts
create table if not exists 'foo' ... ?



--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 24
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 06:18:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-06 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@225
PS4, Line 225:
> We don't know how much memory is released until a GC.
ok, I was looking at our own memory estimate of tables (it gets computed when 
serializing to thrift-- that will not happen with v2) as a proxy for how much 
we expect to reclaim. no need to change this for this round.


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@148
PS7, Line 148:  
fe_.getImpaladTableUsageTracker().recordTableUsage(loadedTbls_.keySet());
nit: formatting


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175
PS4, Line 175: icted because o
> The call site of this function is guarded with a try clause so I think it's
if we know it can be hit, is the main reason not to guard it for readability?


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java:

http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@76
PS7, Line 76:* invalidation is disabled.
nit: , but in that case, it will be a no-op.


http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py
File tests/custom_cluster/test_automatic_invalidation.py:

http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py@35
PS7, Line 35: query = "select count(*) from functional.alltypes"
add a brief description of what each test is trying to do. I expected that the 
same test is applied to v1 and v2 but was surprised to see two fairly different 
approaches when validating the state of catalogd (which is the same for both)-- 
why the difference?


http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py@57
PS7, Line 57: mns (list) = list

[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file

2018-09-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11227 )

Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per 
file
..


Patch Set 2:

Looks like the failed test is related to this patch. Will wait for it to be 
fixed before +2'ing


--
To view, visit http://gerrit.cloudera.org:8080/11227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
Gerrit-Change-Number: 11227
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 05:45:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(7 comments)

Some high-level comments.

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@26
PS3, Line 26:
Can you add a sample output here.


http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32
PS3, Line 32:   breaking down by category?
> still hoping to get some feedback on the above
Yep, just looking at the code, I have a feeling it could be chatty and most 
users don't probably need that information. But if you can paste some sample 
profiles somewhere (especially with a mix of cache hits + misses, with/without 
partitions, multiple partial RPCs etc.) that'd be great.


http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@33
PS3, Line 33: - worth trying to include the byte sizes of metadata fetched?
Could be useful from incremental stats / large partitioned tables.


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@309
PS3, Line 309:   ret = FeSupport.GetPartialCatalogObject(new 
TSerializer().serialize(req));
Should we track num rpcs?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@349
PS3, Line 349: Stopwatch sw = new Stopwatch().start();
Does it help to implement a ScopedTimerWithStats (try-with-resources way)?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@363
PS3, Line 363: /*numHits=*/hit.getRef() ? 1 : 0,
 :   /*numMisses=*/hit.getRef() ? 0 : 1
this is kinda weird. Maybe overload this method that takes a single bool which 
calls the helper?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> Just felt like it's good practice to treat scopes as nestable and "restore"
Ya I think we should have some guard that prevents nested scopes. I doubt if 
anyone will use it, but still :-)



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 05:27:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11227 )

Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per 
file
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3128/


--
To view, visit http://gerrit.cloudera.org:8080/11227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
Gerrit-Change-Number: 11227
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 05:16:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 24:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11314/24/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/24/tests/authorization/test_owner_privileges.py@34
PS24, Line 34: from tests.common.impala_test_suite import ImpalaTestSuite
flake8: F401 'tests.common.impala_test_suite.ImpalaTestSuite' imported but 
unused



--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 24
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 04:48:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 24:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/612/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 24
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 04:46:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Adam Holley (Code Review)
Adam Holley has uploaded a new patch set (#24). ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..

IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

This patch adds calls to automatically create or remove owner
privileges in the catalog based on the statement.  This is similar to
the existing pattern where after privileges are granted in Sentry,
they are created in the catalog directly instead of pulled from
Sentry.

The sentry-site.xml and hive-site.xml template files have been updated
to enable usage of object ownership by default for easier development
with a secured impala.

When object ownership is enabled:
CREATE DATABASE will grant the user OWNER privileges to that database.
ALTER DATABASE SET OWNER will transfer the OWNER privileges to the
new owner.
DROP DATABASE will revoke the OWNER privileges from the owner.
This will apply to DATABASE, TABLE, and VIEW.

Example:
If ownership is enabled, when a table is created, the creator is the
owner, and Sentry will create owner privileges for the created table so
the user can continue working with it without waiting for Sentry
refresh.  Inserts will be available immediately.

Testing:
- Created new custom cluster tests for object ownership

Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
---
M bin/create-test-configuration.sh
M bin/impala-config.sh
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java
M fe/src/main/java/org/apache/impala/util/SentryProxy.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M fe/src/test/resources/sentry-site.xml.template
A fe/src/test/resources/sentry-site_no_oo.xml.template
A fe/src/test/resources/sentry-site_oo_nogrant.xml.template
M testdata/bin/run-sentry-service.sh
A tests/authorization/test_owner_privileges.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_test_suite.py
31 files changed, 1,044 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/24
--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 24
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 23:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java@54
PS23, Line 54: authorization is enabled
> I got confused by what's supposed to happen if its enabled, but isObjectOwn
Done


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1002
PS23, Line 1002: updateDatabasePrivileges(newDb.getName(), null, 
params.server_name, null, null,
> with so many args, it would be easier to read if the params were commented,
Done


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1840
PS23, Line 1840: if (user == null) {
> looks like there's a race here (two operations see that owner doesn't exist
Done


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2878
PS23, Line 2878:   .removePrivilege(privilege);
> is there a race here as well?
Duplicate removes should basically noop the second one.


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3196
PS23, Line 3196: isObjectOwnershipEnabled
> after seeing this method, I'm wondering if this can be pulled up to analysi
Moved to updateDatabasePrivileges


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template
File fe/src/test/resources/sentry-site_no_oo.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template@76
PS23, Line 76: enable owner
> in this case, disable?
Done


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template
File fe/src/test/resources/sentry-site_oo_nogrant.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template@22
PS23, Line 22: sentry.service.security.mode
> add comments to highlight which of these settings pertain to "oo" and which
Done


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@70
PS23, Line 70: ImpalaTestSuite
> is this needed since CustomClusterTestSuite inherits already from ImpalaTes
Done


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@81
PS23, Line 81: Using user 'root
> I missed this.. where is 'root' explicitly used?
Adding a comment at the top to address.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@87
PS23, Line 87: self.execute_query("create role owner_priv_test_ROOT")
> can things be simplified if you create a unique id per test in the test mat
Adding a comment at the top to address.  Basically need "root" because every 
system the tests run on should have a user root with group root.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@109
PS23, Line 109: self.execute_query("create role owner_priv_admin")
  :
> why are these added when cleanup is supposed to remove things?
We create a role and grant it privileges so we can clean up otherwise we might 
not have access.  We don't want to rely on roles being in a state at the end of 
tests.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@156
PS23, Line 156:   def test_owner_privileges_with_grant(self, vector):
> if you use the unique_database fixture here, all objects (db, table, view)
Can't use that since I want to create the database after the cluster starts up. 
 The unique_database creates the database before the test so owner privileges 
aren't created.  I tried adding 'if not exists' to the create and didn't work.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@172
PS23, Line 172: self.root_impalad_client = self.create_impala_client()
  : s
> how do these two differ?
Adding a comment at the top to address.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@248
PS23, Line 248: ser="root"
> maybe the comment about 'root' up on L81 is relevant here?
Addi

[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11380 )

Change subject: IMPALA-5031: Allow UBSAN to be set on codegen
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/611/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560
Gerrit-Change-Number: 11380
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Sep 2018 03:50:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11380 )

Change subject: IMPALA-5031: Allow UBSAN to be set on codegen
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11380/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/11380/2/be/src/codegen/llvm-codegen.h@77
PS2, Line 77: // find be/src -type f -execdir sed -i s/DCHECK_EQ\(replaced,\ 
/DCHECK_REPLACE_COUNT\(replaced,\ /g {} \;
line too long (105 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560
Gerrit-Change-Number: 11380
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Sep 2018 03:49:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog

2018-09-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11158 )

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
..


Patch Set 2:

Forgot to add, can we add some fe tests to check these fields are getting 
interned.


--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 07 Sep 2018 03:26:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen

2018-09-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11380 )

Change subject: IMPALA-5031: Allow UBSAN to be set on codegen
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG@12
PS1, Line 12: slows down testing significantly, it is not made the default when
> Might be worth mentioning that it only works on C++ code cross-compiled to
Done


http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h
File be/src/common/logging.h:

http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h@49
PS1, Line 49: /// Define verbose logging levels.  Per-row logging is more 
verbase than per-file /
> This might belong better in a codegen-related header? It feels very specifi
Done



--
To view, visit http://gerrit.cloudera.org:8080/11380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560
Gerrit-Change-Number: 11380
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Sep 2018 03:12:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen

2018-09-06 Thread Jim Apple (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11380

to look at the new patch set (#2).

Change subject: IMPALA-5031: Allow UBSAN to be set on codegen
..

IMPALA-5031: Allow UBSAN to be set on codegen

This adds a new buildall options, -full_ubsan, that is stronger than
just -ubsan, in that it forces code generated by cross compilation to
LLVM IR to use the undefined behavior sanitizer as well. Because this
slows down testing significantly, it is not made the default when
using -ubsan.

Change-Id: I054e78dd172ee140f2095a523595ff030494e560
---
M be/CMakeLists.txt
M be/src/codegen/llvm-codegen.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/select-node.cc
M be/src/exec/topn-node.cc
M be/src/exec/union-node.cc
M bin/make_impala.sh
M buildall.sh
13 files changed, 76 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11380/2
--
To view, visit http://gerrit.cloudera.org:8080/11380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560
Gerrit-Change-Number: 11380
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog

2018-09-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11158 )

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
..


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java
File fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@62
PS2, Line 62:   public static void internFieldsInPlace(
nit: for each method, should we list the set of fields that we selectively 
intern? (here and below internFieldsInPlace()?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@88
PS2, Line 88: internFieldsInPlace(table.getClustering_columns());
null check for table


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@113
PS2, Line 113: if (sd.isSetCols()) {
null check for sd.


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@131
PS2, Line 131: if (fs.isSetName()) {
null check


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@155
PS2, Line 155: "-1".equals(val) ||
 :   "0".equals(val) ||
 :   val.isEmpty() ||
 :   "true".equalsIgnoreCase(val) ||
 :   "false".equalsIgnoreCase(val) ||
 :   "TASK".equals(val) ||
 :   val.startsWith("impala_"))
Can we create a static HashSet out of these (lowercased) and use something like

if (hashSet.contains(val.lower(

Easier to extend I think


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@172
PS2, Line 172: e.getKey().intern()
Maybe move everything to guava interner to be consistent? Native interning is 
slow anyway (as per the blog post)


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@174
PS2, Line 174: return ret;
Preconditions.checkState(ret.size() == parameters.size())?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@188
PS2, Line 188: internColumnFamily
can we rename it to something like internString() that we can reuse for all 
Strings?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@198
PS2, Line 198: // but it's likely someone would trip over one of these.
This looks risky if someone updates the thrift def for TNetworkAddress :-)


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java
File fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java@35
PS2, Line 35: columnQualifier
just curious, any reason why not this?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@853
PS2, Line 853: hmsParameters_ = CatalogInterners.internParameters(
 :   msPartition.getParameters());
Shouldn't we do it after we remove the incremental stats ? (or) blacklist 
incremental stats from getting interned.


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1203
PS2, Line 1203: intern
just curious, why not intern everything via Guava?



--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 07 Sep 2018 03:11:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11227 )

Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per 
file
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/610/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
Gerrit-Change-Number: 11227
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 02:35:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32
PS3, Line 32:   breaking down by category?
still hoping to get some feedback on the above


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189
PS3, Line 189:   private static final String DB_LIST_STATS_CATEGORY = 
"DatabaseList";
> Is it worth having some sort of enum for these keys? Maybe it will make the
yea I think it'll add quite some boiler plate without any real gains in safety 
since the profiles store this stuff in a string map anyway


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> Can you explain what is going on here? Is there a case where oldThreadLocal
Just felt like it's good practice to treat scopes as nestable and "restore" the 
prior state after the scope is popped out of. Currently we have a pretty strict 
1:1 query:thread kind of mapping but maybe in the future that might not be the 
case. I can add a Preconditions.checkState that it starts as null prior to 
entering the scope if you think that's cleaner until we have some fancier use 
cases?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 02:22:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file

2018-09-06 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11227

to look at the new patch set (#2).

Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per 
file
..

IMPALA-7047. Refreshing partitions should not make an RPC per file

The code to handle REFRESH of a single partition was incorrectly
ignoring the previously-known file descriptors. This meant that, instead
of only calling 'getFileBlockLocations' on the files that had changed
since the prior load, it was instead calling it on every file.

In addition to refresh of single partitions this also affected refresh
of unpartitioned tables (which is implemented as a refresh of its single
"default" partition).

This patch fixes the behavior by copying over the existing file
descriptor list into the re-created partition before refreshing it. A
new unit test uses FS statistics to verify the change. The new
assertions act as a regression test and fail if I comment out the fix.

I also tested this by pointing my dev box at a remote filesystem that
was approximately 60ms away. The initial load of an unpartitioned table
with approximately 45000 files takes around 23 seconds in this setup.
Without the patch in place, REFRESH was taking upwards of 35 minutes (I
got tired and gave up at this point). Multiplying the 60ms round trip by
45000 files estimates 45 minutes. With the fix in place, REFRESH of the
same table took around 4.5 seconds.

Clearly, in typical setups where catalogd and HDFS are on a shared local
network, the gains won't be so dramatic. But, even with a 1ms round trip
(plausible when including fixed RPC overhead and potentially congested
datacenter networks) this would save 45 seconds on this example table
with 45000 files.

Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
2 files changed, 44 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/11227/2
--
To view, visit http://gerrit.cloudera.org:8080/11227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
Gerrit-Change-Number: 11227
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11227 )

Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per 
file
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3128/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
Gerrit-Change-Number: 11227
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 02:06:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7047. Refreshing partitions should not make an RPC per file

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11227 )

Change subject: IMPALA-7047. Refreshing partitions should not make an RPC per 
file
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1256
PS1, Line 1256:* Updates the file metadata of an unpartitioned HdfsTable.
> Maybe we should add that this is optimized for incremental load. (Full load
Done


http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1264
PS1, Line 1264: resetPartitions();
  : org.apache.hadoop.hive.metastore.api.Table msTbl = 
getMetaStoreTable();
  : Preconditions.checkNotNull(msTbl);
  : setPrototypePartition(msTbl.getSd());
  : HdfsPartition part = createPartition(msTbl.getSd(), null, 
new FsPermissionCache());
> ok. Can you add a comment for this?
Done


http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1269
PS1, Line 1269:
> ...
Done


http://gerrit.cloudera.org:8080/#/c/11227/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2068
PS1, Line 2068: 
refreshedPartition.setFileDescriptors(oldPartition.getFileDescriptors());
> same here, can't we directly call refreshpartitionFileMetadata(oldPartition
added same comment as above



--
To view, visit http://gerrit.cloudera.org:8080/11227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2051b96599206164aaa06ecbdf64374c46eda956
Gerrit-Change-Number: 11227
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 01:58:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 7:

(8 comments)

Just a few small notes remaining from me.

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175
PS4, Line 175: icted because o
> The call site of this function is guarded with a try clause so I think it's
I don't think relying on catching NullPointerException is the right approach, 
though -- that would be very concerning for an operator to see "Unexpected 
exception: NullPointerException" in the logs.

Under what circumstance does it return null? Perhaps we should return false 
with a warning if it does?


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java:

http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@75
PS7, Line 75:   AtomicLong triggerCount_ = new AtomicLong();
probably worth a javadoc on this because it wasn't entirely clear to me what it 
meant. When I first saw it, I thought it was the number of times a table had 
been invalidated due to this class, but actually it looks like it's just the 
number of times the background thread woke up and did a scan. Maybe 
'scanCount_' would be better?


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@150
PS7, Line 150:   if (!(gcbean instanceof NotificationListener)) {
I think this is now misplaced -- shouldn't this be inside the loop after the 
check for 'Old'? (but above the setting of member vars, as Vuk caught)


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@225
PS7, Line 225: "Table " + table.getFullName() + " is invalidated
nit: I think "Invalidated table foo.bar" is clearer than "foo.bar is 
invalidated" (or as you had it before with no "is")


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@273
PS7, Line 273: ly" +
nit: missing space


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@274
PS7, Line 274: w
nit: capitalize.


http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java:

http://gerrit.cloudera.org:8080/#/c/11224/7/fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java@55
PS7, Line 55: atalog_
: .invalidateTable(new TTableName(dbName, tblName),
: tblWasRemoved, dbWasAdded);
nit: can now reformat this code in a less strange way


http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py
File tests/custom_cluster/test_automatic_invalidation.py:

http://gerrit.cloudera.org:8080/#/c/11224/7/tests/custom_cluster/test_automatic_invalidation.py@45
PS7, Line 45:   if "Table functional.alltypes is invalidated due to 
inactivity" in \
We have self.assert_impalad_log_contains which does this in a bit nicer way. 
Also, I think this loop should eventually give up rather than causing a test 
timeout if it never shows up. Check out test_restart_catalogd in 
test_local_catalog.py for an example.



--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 01:35:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 07 Sep 2018 01:00:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 7:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/609/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 23:57:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-06 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..

IMPALA-7448: Invalidate recently unused tables from catalogd

This patch implements an automatic invalidation mechanism in catalogd.
There are two invalidation strategies:
1. Periodically the HDFS tables that are not used in a configured
   period "invalidate_tables_timeout_s" is invalidated from catalogd.
2. If the old GC generation is almost full, a certain percentage of LRU
   tables are invalidated. This can be enabled by backend flag
   "invalidate_tables_on_memory_pressure".

The table usage is reported by impalad to catalogd when the tables are
used during planning.
Tests on time-based invalidation are added. It is manually verified that
the GC callback is called if strings are randomly stuffed into catalogd.

Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/common/global-flags.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/fe-support.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
A fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
A tests/custom_cluster/test_automatic_invalidation.py
28 files changed, 804 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/11224/7
--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-06 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@19
PS4, Line 19: is
> nit: are
Done


http://gerrit.cloudera.org:8080/#/c/11224/4//COMMIT_MSG@20
PS4, Line 20: randomly stuffed into catalogd
> what does this mean?
I wrote a thread that kept creating stringbuilders. The notification listener 
is called and the memory pool names are as expected.


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@210
PS4, Line 210: query
> only queries?
changed to SQL statements


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@211
PS4, Line 211: evict
> just clarifying that this knob and the memory ones are independent. in othe
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@212
PS4, Line 212: unused
> nit: remove
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@225
PS4, Line 225: invalidate_tables_fraction_on_memory_pressure
> I understand this knob to establish a target lower bound for an eviction ev
We don't know how much memory is released until a GC.


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/common/global-flags.cc@226
PS4, Line 226: the numbers of
> remove
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h
File be/src/exec/catalog-op-executor.h:

http://gerrit.cloudera.org:8080/#/c/11224/4/be/src/exec/catalog-op-executor.h@72
PS4, Line 72: table names
> nit: simplify to just tables
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2000
PS4, Line 2000:* Set the last used time of specified tables to now.
> Because of the thrift spec, I was expecting to see usage count being used s
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java
File fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@46
PS4, Line 46: unused_tbl_ttl_sec
> stale?
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@62
PS4, Line 62: shrinking
> eviction
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@89
PS4, Line 89:   private long lastInvalidationTime;
> add an "_".
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@116
PS4, Line 116: 
BackendConfig.INSTANCE.getInvalidateTablesGcOldGenFullThreshold(),
 :   B
> are these two validated anywhere (e.g., [0,1])?
added a check


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@140
PS4, Line 140: poolName
> is there somewhat stable site to link to where these names are given, perha
I can't find any.


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@146
PS4, Line 146:   if (!(gcbean instanceof NotificationListener)) {
> why not do this check before setting various class members above?
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/CatalogdTableInvalidator.java@175
PS4, Line 175: getLastGcInfo()
> just starting here with the doc for this method, I see that null can be ret
The call site of this function is guarded with a try clause so I think it's safe


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java@94
PS4, Line 94: Random random = new Random();
> put this outside of the while loop.
Done


http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11224/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@656
PS4, Line 656: load
> what rule are you using to pair the load with

[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11355 )

Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and 
PARQUET_DICTIONARY_FILTERING
..


Patch Set 5: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/68/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/11355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0
Gerrit-Change-Number: 11355
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 06 Sep 2018 23:00:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189
PS3, Line 189:   private static final String DB_LIST_STATS_CATEGORY = 
"DatabaseList";
Is it worth having some sort of enum for these keys? Maybe it will make the 
code more untidy?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
Can you explain what is going on here? Is there a case where 
oldThreadLocalValue_ is not null?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 22:54:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING

2018-09-06 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11355 )

Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and 
PARQUET_DICTIONARY_FILTERING
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml
File docs/topics/impala_parquet_dictionary_filtering.xml:

http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_dictionary_filtering.xml@87
PS4, Line 87:   The NumDictFilteredRowGroups field in the 
query profile output shows the
> We should mention somewhere that row groups can be filtered out by parquet
Done


http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml
File docs/topics/impala_parquet_read_statistics.xml:

http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@49
PS4, Line 49:   Parquet stores min/max stats which can be used to skip 
reading blocks if they don't
> The stats are (currently) used to skip row groups. "qualify a predicate" so
Updated with a different wording


http://gerrit.cloudera.org:8080/#/c/11355/4/docs/topics/impala_parquet_read_statistics.xml@61
PS4, Line 61: Of the numerical types: bool, integer, floating point
> Impala also supports other data types, but there are restrictions around th
Can I have the version info for the old and new statistics?



--
To view, visit http://gerrit.cloudera.org:8080/11355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0
Gerrit-Change-Number: 11355
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 06 Sep 2018 22:53:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11355 )

Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and 
PARQUET_DICTIONARY_FILTERING
..


Patch Set 5:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/68/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0
Gerrit-Change-Number: 11355
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 06 Sep 2018 22:53:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5937: [DOCS] PARQUET READ STATISTICS and PARQUET DICTIONARY FILTERING

2018-09-06 Thread Alex Rodoni (Code Review)
Hello Lars Volker, Fredy Wijaya, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11355

to look at the new patch set (#5).

Change subject: IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and 
PARQUET_DICTIONARY_FILTERING
..

IMPALA-5937: [DOCS] PARQUET_READ_STATISTICS and PARQUET_DICTIONARY_FILTERING

Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0
---
M docs/impala.ditamap
M docs/topics/impala_parquet.xml
A docs/topics/impala_parquet_dictionary_filtering.xml
A docs/topics/impala_parquet_read_statistics.xml
4 files changed, 269 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/11355/5
--
To view, visit http://gerrit.cloudera.org:8080/11355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I88fa8c4a64560711251076c50e1695f7f032f9c0
Gerrit-Change-Number: 11355
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11393 )

Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1194
PS1, Line 1194: initMetrics();
> Doesn't this reset the metrics even for incremental loads? Also doesn't it
ah, indeed you're right, I think I need to find a different way to accomplish 
this patch. I didn't run tests before posting it for review :)



--
To view, visit http://gerrit.cloudera.org:8080/11393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92
Gerrit-Change-Number: 11393
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 06 Sep 2018 22:36:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables

2018-09-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11393 )

Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11393/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1194
PS1, Line 1194: initMetrics();
Doesn't this reset the metrics even for incremental loads? Also doesn't it hit 
the Preconditions check?



--
To view, visit http://gerrit.cloudera.org:8080/11393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92
Gerrit-Change-Number: 11393
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 21:48:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 23:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java@54
PS23, Line 54: authorization is enabled
I got confused by what's supposed to happen if its enabled, but 
isObjectOwnershipEnabled is false. What will be the end-state of the various 
systems under the four combinations here?


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1002
PS23, Line 1002: updateDatabasePrivileges(newDb.getName(), null, 
params.server_name, null, null,
with so many args, it would be easier to read if the params were commented, 
e.g., (..., /*tableName*/null, ...), in particular for nulls and booleans.


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1840
PS23, Line 1840: if (user == null) {
looks like there's a race here (two operations see that owner doesn't exist, 
one will go first and then the other will overwrite)... I see that addUser will 
overwrite so the perhaps the race is a no-op. however, it differs from the 
intent here so just wondering if anything unintended crops up here? for 
example, two operations for the same user will succeed so externally, will it 
look like the same user was added twice?

if there is an assumption about locking from the caller, pls state it.


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2878
PS23, Line 2878:   .removePrivilege(privilege);
is there a race here as well?


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3196
PS23, Line 3196: isObjectOwnershipEnabled
after seeing this method, I'm wondering if this can be pulled up to analysis? 
why let one of those alter commands get to this point when sentry object 
ownership is not enabled?

same question for the method below as well.

if the main explanation is that something is optional here-- is that spelled 
out somewhere?


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template
File fe/src/test/resources/sentry-site_no_oo.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_no_oo.xml.template@76
PS23, Line 76: enable owner
in this case, disable?


http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template
File fe/src/test/resources/sentry-site_oo_nogrant.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/23/fe/src/test/resources/sentry-site_oo_nogrant.xml.template@22
PS23, Line 22: sentry.service.security.mode
add comments to highlight which of these settings pertain to "oo" and which 
pertain to "nogrant".

pls add the same to the just "oo" sentry file as well. I have no idea which of 
these specify the intent of the file.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@70
PS23, Line 70: ImpalaTestSuite
is this needed since CustomClusterTestSuite inherits already from 
ImpalaTestSuite?


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@81
PS23, Line 81: Using user 'root
I missed this.. where is 'root' explicitly used?


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@87
PS23, Line 87: self.execute_query("create role owner_priv_test_ROOT")
can things be simplified if you create a unique id per test in the test matrix? 
I see that all tests are marked to run serially, but in any event, I'm worried 
about leakage to other tests, or within this test.


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@109
PS23, Line 109: self.execute_query("create role owner_priv_admin")
  :
why are these added when cleanup is supposed to remove things?


http://gerrit.cloudera.org:8080/#/c/11314/23/tests/authorization/test_owner_privileges.py@156
PS23, Line 156:   def test_owner_privileges_with_grant(self, vector):
if you use the unique_database fixture here, all objects (db, table, view) will 
be cleaned up. The name can be pa

[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3126/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 06 Sep 2018 21:41:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11158 )

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 21:11:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7335: Fix a race in HdfsScanNode

2018-09-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11337 )

Change subject: IMPALA-7335: Fix a race in HdfsScanNode
..


Patch Set 5:

We should go ahead with the previous fix if the refactor is more complex than 
expected - sorry to send you down the wrong path there.


--
To view, visit http://gerrit.cloudera.org:8080/11337
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id470818846a5c69ad28a6ff695069702982aa793
Gerrit-Change-Number: 11337
Gerrit-PatchSet: 5
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Sep 2018 20:40:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7178: Add the possibility to reduce logging for common data errors

2018-09-06 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10793 )

Change subject: WIP IMPALA-7178: Add the possibility to reduce logging for 
common data errors
..


Patch Set 1:

Csaba, any updates here?


--
To view, visit http://gerrit.cloudera.org:8080/10793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3b7c1fd020a7ba5e0d9c619e1b67236dce198aa
Gerrit-Change-Number: 10793
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 06 Sep 2018 20:21:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py
File tests/custom_cluster/test_rpc_timeout.py:

http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py@42
PS12, Line 42:
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py@42
PS12, Line 42:
flake8: E251 unexpected spaces around keyword / parameter equals



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 06 Sep 2018 18:26:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/608/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 06 Sep 2018 18:23:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7540. Intern most repetitive strings and network addresses in catalog

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11158 )

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3125/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 18:04:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7522: Fix overflow in milliseconds add()

2018-09-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11390 )

Change subject: IMPALA-7522: Fix overflow in milliseconds_add()
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I491389257d6560873d942ac4f9a0fd836cc216da
Gerrit-Change-Number: 11390
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Sep 2018 18:04:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11393 )

Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3123/


--
To view, visit http://gerrit.cloudera.org:8080/11393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92
Gerrit-Change-Number: 11393
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 18:04:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Allow UBSAN to be set on codegen

2018-09-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11380 )

Change subject: IMPALA-5031: Allow UBSAN to be set on codegen
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11380/1//COMMIT_MSG@12
PS1, Line 12: is not made the default when using -ubsan.
Might be worth mentioning that it only works on C++ code cross-compiled to LLVM 
IR. I.e. not on IrBuilder generated IR. You can sort of infer this since 
IrBuilder code isn't C++ so the concept doesn't apply, but good to be explicit.


http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h
File be/src/common/logging.h:

http://gerrit.cloudera.org:8080/#/c/11380/1/be/src/common/logging.h@49
PS1, Line 49: // The number of function calls replaced is not knowable when 
UBSAN is enabled, since it
This might belong better in a codegen-related header? It feels very specific to 
that use case.

llvm-codegen.h?



--
To view, visit http://gerrit.cloudera.org:8080/11380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I054e78dd172ee140f2095a523595ff030494e560
Gerrit-Change-Number: 11380
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Sep 2018 17:49:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-09-06 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 12:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h
File be/src/exec/hdfs-parquet-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h@199
PS10, Line 199:
> nit: should #include the appropriate .pb.h here ("include-what-you-use")
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@60
PS10, Line 60: const Coordinator&
> I think this can probably change back to being const if you take the sugges
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@176
PS10, Line 176: last_report_ti
> nit: I think the term "sequence number" is more usual here -- "version" to
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@220
PS10, Line 220:   /// Backend exec params, owned by the QuerySchedule and has 
query lifetime.
> This "back pointer" still seems error-prone to me. I think the object lifet
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@267
PS10, Line 267:   return num_remaining_instances_ == 0 || !status_.ok();
> I think a VLOG_QUERY about the skipped RPC is probably useful
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@294
PS10, Line 294: DCHECK(!instance_stats->done_);
> nit: why not:
The ctor was marked explicit so not sure it's allowed:
  "explicit Status(const StatusPB& status);"


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@287
PS10, Line 287: atus = report
> Ah, I missed that we join on the reporter thread first.
Good idea about using DFAKE_MUTEX(). Also switched to using a non-atomic.

Also simplified the logic in Coordinator::BackendState::ApplyExecStatusReport() 
as we can rely purely on the sequence number as you suggested.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@375
PS10, Line 375: ReportExecStatusResponsePB resp;
> should we have a failure injection point on the RPC itself? I only saw fail
Please find the tests in test_rpc_timeout.py which:

1. inject delays in the RPC handler to induce timeout
2. run with a very short service queue to emulate a busy server.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@379
PS10, Line 379: reak;
> should we backoff?
I will refrain from changing the logic here too much. There will be a follow up 
patch after IMPALA-4063 which will change the retry logic. TODO added.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc@202
PS10, Line 202:   }
> the method doc says that new_errors is cleared, but it's actually written i
This was lost after refactoring this function. Fixed now.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/service/control-service.cc
File be/src/service/control-service.cc:

PS10:
> This is a general krpc-in-Impala question: I can't find where you set up au
Very good point. This is definitely a bug and it's now fixed in this commit 
here 
(https://github.com/apache/impala/commit/5c541b960491ba91533712144599fb3b6d99521d)


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h
File be/src/util/uid-util.h:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h@79
PS10, Line 79:   DCHECK(uid_pb.IsInitialized());
> worth DCHECKs here that the fields are set by calling uid_pb.IsInitialized(
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh@562
PS10, Line 562: export HBASE_CONF_DIR="$IMPALA_FE_DIR/src/test/resources"
> why's this necessary? Can we change cmake to invoke it from the full path i
FindProtobuf should have set PROTOBUF_PROTOC_EXECUTABLE. Not sure why I needed 
to set it before.


http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py
File tests/custom_cluster/test_rpc_exception.py:

http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py@97
PS10, Line 97:
> can we change this flag to be in millis instead of seconds? Or do we advert
I don't 

[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-09-06 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#12).

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..

IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

This patch also fixes IMPALA-7241 which may lead to duplicated
dml stats being applied. The fix is by adding a monotonically
increasing version number for fragment instances' reports. The
coordinator will ignore any report smaller than or equal to the
version in the last report.

Testing done:
1. Exhaustive build.
2. Added some targeted test cases for profile serialization failure and RPC 
retries/timeout.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/common/global-flags.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_timeout.py
59 files changed, 1,155 insertions(+), 695 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/12
-- 
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7516: Fix query location accounting

2018-09-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11372 )

Change subject: IMPALA-7516: Fix query location accounting
..


Patch Set 1: Code-Review+2

(1 comment)

Thanks for fixing. Looks to to submit after the flake warning and my comment 
are addressed.

http://gerrit.cloudera.org:8080/#/c/11372/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/11372/1/tests/custom_cluster/test_admission_controller.py@674
PS1, Line 674: admission results that can affect it."""
Maybe reference the JIRA that this is a regression test?



-- 
To view, visit http://gerrit.cloudera.org:8080/11372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I66c8e59299747df57c9f39db1cb1f46ff6bbab7e
Gerrit-Change-Number: 11372
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Sep 2018 17:40:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7519: Support elliptic curve ssl ciphers

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11376 )

Change subject: IMPALA-7519: Support elliptic curve ssl ciphers
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/607/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6
Gerrit-Change-Number: 11376
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 17:39:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 8:

Looks like the compile-only job didn't trigger for this. I'm confident it's 
just whitespace so I'll just commit.


--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 17:27:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..

IMPALA-7203. Support UDFs in CatalogdMetaProvider

This adds support to fetch the list of function names within a database,
and then to fetch the list of overloads for a given function name. These
items are cached on the coordinator and invalidated when minimal
function objects are seen on the minimal catalog topic stream.

Aside from the straight-forward plumbing of this new RPC, it's worth
noting that this patch changes the MetaProvider interface to provide
Impala Functions directly instead of HMS Function objects. This means
that we will now fully support all the types of functions supported in
legacy catalogs. As such, test_udfs now passes.

Making this change simplified the code in LocalDb but also means that
DirectMetaProvider no longer supports fetching functions. When we move
to trying to eliminate catalogd altogether at some point, we can revive
this code. For now, I just throw an exception for the function-related
code in DirectMetaProvider, as it's unused.

The one other notable thing here is that TFunction now has optional
fields where it used to have required ones. This was necessary in order
to send invalidations as TCatalogObjects. Given that TFunction is an
internal-only construct, this shouldn't raise compatibility issues.

Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Reviewed-on: http://gerrit.cloudera.org:8080/11359
Reviewed-by: Todd Lipcon 
Tested-by: Todd Lipcon 
---
M common/thrift/CatalogService.thrift
M common/thrift/Types.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
9 files changed, 232 insertions(+), 170 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 9
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7519: Support elliptic curve ssl ciphers

2018-09-06 Thread Thomas Marshall (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11376

to look at the new patch set (#2).

Change subject: IMPALA-7519: Support elliptic curve ssl ciphers
..

IMPALA-7519: Support elliptic curve ssl ciphers

Thrift's SSLSocketFactory class does not support setting ciphers that
use ecdh. This patch modifies our existing subclass of
SSLSocketFactory to override the ciphers() method and enable ECDH.

The code for this was taken from be/src/kudu/security/tls_context.cc

Testing:
- Added a custom cluster test that verifies that a cluster with only
  ECDH ciphers enabled works.

Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6
---
M be/src/rpc/thrift-server.cc
M tests/custom_cluster/test_client_ssl.py
2 files changed, 49 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/11376/2
--
To view, visit http://gerrit.cloudera.org:8080/11376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6
Gerrit-Change-Number: 11376
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7203. Support UDFs in CatalogdMetaProvider

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11359 )

Change subject: IMPALA-7203. Support UDFs in CatalogdMetaProvider
..


Patch Set 8: Verified+1 Code-Review+2

Checked diff between r7 and r8 and it's just docs changes and some 
whitespace/comment diff from a revised parent patch. I'll commit this once the 
build-only check comes back.


--
To view, visit http://gerrit.cloudera.org:8080/11359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef8ece9f214dca9441833b00f65c7c152d0ab53
Gerrit-Change-Number: 11359
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 16:54:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 22:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11314/22/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/22/tests/authorization/test_owner_privileges.py@249
PS22, Line 249: "
flake8: E501 line too long (92 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 22
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 16:51:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7541. Avoid initializing Metrics for IncompleteTables

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11393 )

Change subject: IMPALA-7541. Avoid initializing Metrics for IncompleteTables
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3123/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0bcaa9c45a8cf75266d4b7c16cc14f0fd669b92
Gerrit-Change-Number: 11393
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Sep 2018 16:28:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/606/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 22
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 16:21:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 23:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/605/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 23
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 16:11:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7542: find-fragment-instances misses to find the "root threads"

2018-09-06 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11396 )

Change subject: IMPALA-7542: find-fragment-instances misses to find the "root 
threads"
..


Patch Set 1: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11396/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11396/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-7542: find-fragment-instances misses to find the "root 
threads"
nit: Please make the short description about what this change does, not what 
the bug was, e.g. "fix find-fragment-instances to find all root threads".


http://gerrit.cloudera.org:8080/#/c/11396/1//COMMIT_MSG@19
PS1, Line 19: I tested it locally on a core file.
I think it would be great to have a test for this script, both to show how to 
use it and because it's somewhat hard to test. We don't need to add one in this 
change though, but I'll file a subsequent jira.


http://gerrit.cloudera.org:8080/#/c/11396/1/lib/python/impala_py_lib/gdb/impala-gdb.py
File lib/python/impala_py_lib/gdb/impala-gdb.py:

http://gerrit.cloudera.org:8080/#/c/11396/1/lib/python/impala_py_lib/gdb/impala-gdb.py@47
PS1, Line 47: v
nit: Can you rename this variable to something more descriptive, maybe tdi? I 
can't think of what "v" means.



--
To view, visit http://gerrit.cloudera.org:8080/11396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6
Gerrit-Change-Number: 11396
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 06 Sep 2018 16:03:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Adam Holley (Code Review)
Adam Holley has uploaded a new patch set (#23). ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..

IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

This patch adds calls to automatically create or remove owner
privileges in the catalog based on the statement.  This is similar to
the existing pattern where after privileges are granted in Sentry,
they are created in the catalog directly instead of pulled from
Sentry.

The sentry-site.xml and hive-site.xml template files have been updated
to enable usage of object ownership by default for easier development
with a secured impala.

When object ownership is enabled:
CREATE DATABASE will grant the user OWNER privileges to that database.
ALTER DATABASE SET OWNER will transfer the OWNER privileges to the
new owner.
DROP DATABASE will revoke the OWNER privileges from the owner.
This will apply to DATABASE, TABLE, and VIEW.

Example:
If ownership is enabled, when a table is created, the creator is the
owner, and Sentry will create owner privileges for the created table so
the user can continue working with it without waiting for Sentry
refresh.  Inserts will be available immediately.

Testing:
- Created new custom cluster tests for object ownership

Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
---
M bin/create-test-configuration.sh
M bin/impala-config.sh
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java
M fe/src/main/java/org/apache/impala/util/SentryProxy.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M fe/src/test/resources/sentry-site.xml.template
A fe/src/test/resources/sentry-site_no_oo.xml.template
A fe/src/test/resources/sentry-site_oo_nogrant.xml.template
M testdata/bin/run-sentry-service.sh
A tests/authorization/test_owner_privileges.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_test_suite.py
31 files changed, 1,038 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/23
--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 23
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Adam Holley (Code Review)
Adam Holley has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..

IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

This patch adds calls to automatically create or remove owner
privileges in the catalog based on the statement.  This is similar to
the existing pattern where after privileges are granted in Sentry,
they are created in the catalog directly instead of pulled from
Sentry.

The sentry-site.xml and hive-site.xml template files have been updated
to enable usage of object ownership by default for easier development
with a secured impala.

When object ownership is enabled:
CREATE DATABASE will grant the user OWNER privileges to that database.
ALTER DATABASE SET OWNER will transfer the OWNER privileges to the
new owner.
DROP DATABASE will revoke the OWNER privileges from the owner.
This will apply to DATABASE, TABLE, and VIEW.

Example:
If ownership is enabled, when a table is created, the creator is the
owner, and Sentry will create owner privileges for the created table so
the user can continue working with it without waiting for Sentry
refresh.  Inserts will be available immediately.

Testing:
- Created new custom cluster tests for object ownership

Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
---
M bin/create-test-configuration.sh
M bin/impala-config.sh
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java
M fe/src/main/java/org/apache/impala/util/SentryProxy.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M fe/src/test/resources/sentry-site.xml.template
A fe/src/test/resources/sentry-site_no_oo.xml.template
A fe/src/test/resources/sentry-site_oo_nogrant.xml.template
M testdata/bin/run-sentry-service.sh
A tests/authorization/test_owner_privileges.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_test_suite.py
31 files changed, 1,040 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/11314/22
--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 22
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-06 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11314 )

Change subject: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET 
OWNER
..


Patch Set 20:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11314/22/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/11314/22/common/thrift/JniCatalog.thrift@77
PS22, Line 77:   // The server name for security privileges when authorization 
enabled.
> nit: is
Done


http://gerrit.cloudera.org:8080/#/c/11314/22/common/thrift/JniCatalog.thrift@131
PS22, Line 131:   // The server name for security privileges when authorization 
enabled.
> nit: is (several more below)
Done


http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2660
PS22, Line 2660: Checks for authorization being enabled and returning the 
appropriate value for
   :* server name.
> simplify: Returns the server name if authorization is enabled. Returns null
Done.  Yes.


http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1005
PS22, Line 1005:
> nit: use /** comments.
Done


http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/service/Frontend.java@488
PS22, Line 488:   == TPrincipalType.USER) {
> pull up to prev. line
Makes the line length 92.


http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/util/SentryPolicyService.java
File fe/src/main/java/org/apache/impala/util/SentryPolicyService.java:

http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/main/java/org/apache/impala/util/SentryPolicyService.java@485
PS22, Line 485: entry
> Sentry? .. what's 'entry'?
Done


http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/test/resources/mysql-hive-site.xml.template
File fe/src/test/resources/mysql-hive-site.xml.template:

http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/test/resources/mysql-hive-site.xml.template@a116
PS22, Line 116:
> I was going to comment on why I don't see this file removed but I don't see
It's been unused as far as I know in development.   It appears in the 
documentation so may be used in production.


http://gerrit.cloudera.org:8080/#/c/11314/22/fe/src/test/resources/mysql-hive-site.xml.template@115
PS22, Line 115: hive.sentry.conf.url
> why this name change in this patch?
Because hive.access.conf.url doesn't work.
impalad.INFO:W0906 10:27:26.005672 25891 HiveConf.java:4043] HiveConf of name 
hive.access.conf.url does not exist


http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py
File tests/authorization/test_owner_privileges.py:

http://gerrit.cloudera.org:8080/#/c/11314/20/tests/authorization/test_owner_privileges.py@250
PS20, Line 250: __root_query_fail
> The change is quite different than what I had in mind.
Done



--
To view, visit http://gerrit.cloudera.org:8080/11314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e09332e007ed5aa6a0840683c879a8295c3d2b0
Gerrit-Change-Number: 11314
Gerrit-PatchSet: 20
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 15:33:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

2018-09-06 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11183 )

Change subject: IMPALA-7521: Speed up sub-second unix time->TimestampValue 
conversions
..


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc
File be/src/benchmarks/convert-timestamp-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/benchmarks/convert-timestamp-benchmark.cc@592
PS16, Line 592: (uint64_t)MICROS_PER_SEC
> Is there a reason to use static_cast for primitive numeric types? IMO C sty
AFAIK we are following the google style guide here:
https://google.github.io/styleguide/cppguide.html#Casting

The problem with C-style casting is that it is too strong, e.g. it can cast 
pointers to integers.

However, the google style guide allows brace-initialization, which is as brief 
as a C-style cast and only does conversion, not casting.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-test.cc@241
PS16, Line 241: 1000
> Attila mentioned this too. I think that 1000 is easier to read than MILLIS_
It's usually better to use named constants because they express the intent 
better.
Here it is trivial why we need 1000 so I don't feel strong about it.


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@188
PS16, Line 188: (int64_t)1000*1000*1000
> I was not happy to write this here, but walltime.h constants are not availa
nit: I see, btw you could use C++14 digit separators: 1'000'000'000


http://gerrit.cloudera.org:8080/#/c/11183/16/be/src/runtime/timestamp-value.h@318
PS16, Line 318: int64_t SplitTime(int64_t* ticks)
> I see the point, and I trust the compiler that it will eliminate the variab
Using different variables lets you use more names to express the calculation 
better, i.e. to make the code easier to read.

E.g. in this example 'nanos' means something different before and after the 
function call.

That's my view, but I'm OK if you don't agree with it and keep it as it is.



--
To view, visit http://gerrit.cloudera.org:8080/11183
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa
Gerrit-Change-Number: 11183
Gerrit-PatchSet: 16
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 06 Sep 2018 12:29:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11379 )

Change subject: IMPALA-6442: Misleading file offset reporting in error messages.
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3122/


--
To view, visit http://gerrit.cloudera.org:8080/11379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
Gerrit-Change-Number: 11379
Gerrit-PatchSet: 2
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Thu, 06 Sep 2018 11:50:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7542: find-fragment-instances misses to find the "root threads"

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11396 )

Change subject: IMPALA-7542: find-fragment-instances misses to find the "root 
threads"
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/604/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6
Gerrit-Change-Number: 11396
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 06 Sep 2018 11:12:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7542: find-fragment-instances misses to find the "root threads"

2018-09-06 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11396


Change subject: IMPALA-7542: find-fragment-instances misses to find the "root 
threads"
..

IMPALA-7542: find-fragment-instances misses to find the "root threads"

find-fragment-instances didn't show all the threads
that worked on some fragment instance. It was because
it checked the fragment instance of the parent thread
instead of the actual thread, therefore missing the
"root threads" that started working on the fragment
instance.

I modified the get_fragment_instances() function to
check the local ThreadDebugInfo object of the threads.

I tested it locally on a core file.

Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6
---
M lib/python/impala_py_lib/gdb/impala-gdb.py
1 file changed, 4 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/11396/1
--
To view, visit http://gerrit.cloudera.org:8080/11396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6
Gerrit-Change-Number: 11396
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11379 )

Change subject: IMPALA-6442: Misleading file offset reporting in error messages.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/603/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
Gerrit-Change-Number: 11379
Gerrit-PatchSet: 2
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Thu, 06 Sep 2018 08:48:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11379 )

Change subject: IMPALA-6442: Misleading file offset reporting in error messages.
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11379/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/11379/2/be/src/exec/hdfs-parquet-scanner.cc@1215
PS2, Line 1215:   int64_t metadata_start = file_len - sizeof(int32_t) - 
sizeof(PARQUET_VERSION_NUMBER) -
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@406
PS2, Line 406:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@406
PS2, Line 406: """
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@407
PS2, Line 407: ;
flake8: E703 statement ends with a semicolon


http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@408
PS2, Line 408: ;
flake8: E703 statement ends with a semicolon


http://gerrit.cloudera.org:8080/#/c/11379/2/tests/query_test/test_scanners.py@409
PS2, Line 409: ;
flake8: E703 statement ends with a semicolon



--
To view, visit http://gerrit.cloudera.org:8080/11379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
Gerrit-Change-Number: 11379
Gerrit-PatchSet: 2
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Thu, 06 Sep 2018 08:45:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11379 )

Change subject: IMPALA-6442: Misleading file offset reporting in error messages.
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3122/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
Gerrit-Change-Number: 11379
Gerrit-PatchSet: 2
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Thu, 06 Sep 2018 08:19:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-06 Thread Yongjun Zhang (Code Review)
Yongjun Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11379 )

Change subject: IMPALA-6442: Misleading file offset reporting in error messages.
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1165
PS1, Line 1165: sca
> Now that there are two variables storing some kind of length it would be go
Done in new rev.


http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1214
PS1, Line 1214: // file_len - 4-byte footer length field - 4-byte version 
number field - metadata size
  :   int64_t metadata_start = file_len - sizeof(int32_t) -
> not your change but can you please fix the line wrapping while you're here?
Done in new rev.


http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1228
PS1, Line 1228: const HdfsFileDesc* file_desc = 
scan_node_->GetFileDesc(partition_id, filename());
> If both are NULL we won't catch it now. Are we sure that cannot happen? Sho
Thanks a lot for the review Lars.

The assumption is that stream_->file_desc() is not null, if it's null, the call 
to get file length before this branch would have failed already. If we want, we 
could add a DCHECK in the beginning of the function.

I added the DCHECK here for multiple reasons:
1. only the logic in the existing code here tries to find file_desc.
2. this branch is an "unlikely" branch. Code out side this branch has no need 
to find file_desc.
3. the partition_id variable is only used in this branch.

Another version of my code did move these few lines to before this branch, but 
I chose to move it here because otherwise it's going to be some additional work 
for other code that doesn't need to reach this "unlikely" branch and doesn't 
use the partition_id variable.

What do you think?


http://gerrit.cloudera.org:8080/#/c/11379/1/be/src/exec/hdfs-parquet-scanner.cc@1267
PS1, Line 1267: "at file of
> nit: wrap after file_length
done in new rev.


http://gerrit.cloudera.org:8080/#/c/11379/1/testdata/data/corrupt_footer_len_decr.parquet
File testdata/data/corrupt_footer_len_decr.parquet:

http://gerrit.cloudera.org:8080/#/c/11379/1/testdata/data/corrupt_footer_len_decr.parquet@1
PS1, Line 1: PAR1LÒ,
&n%c^f&6&6(ÒÒ,,Hschema%c%$&n%c^f&6&6(ÒÒ,f(Oimpala
 version 2.11.0-SNAPSHOT (build 
26b2cc85d2084e9e464669b2a67a8015ede173e1)½PAR1
> Add a description for both test files to the README file in the same folder
done in new rev.


http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py@401
PS1, Line 401:   def corrupt_footer_len_common(self, vector, unique_database, 
testname_postfix):
> The function should have a comment. Please find a more descriptive name for
Done in new rev.


http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py@412
PS1, Line 412: create_table_and_copy_files(self.client,
> nit: single line
Done in new rev.


http://gerrit.cloudera.org:8080/#/c/11379/1/tests/query_test/test_scanners.py@420
PS1, Line 420:
> nit: capitalization
Done in new rev.



--
To view, visit http://gerrit.cloudera.org:8080/11379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
Gerrit-Change-Number: 11379
Gerrit-PatchSet: 2
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Thu, 06 Sep 2018 08:11:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-06 Thread Yongjun Zhang (Code Review)
Yongjun Zhang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11379


Change subject: IMPALA-6442: Misleading file offset reporting in error messages.
..

IMPALA-6442: Misleading file offset reporting in error messages.

The error message described in IMPALA-6442 incorrectly reported the file offset 
where the
Parquet footer starts, as if the offset is counted from the file end instead of 
from the
file beginning. The fix changed the reported file offset to be counted from the 
beginning
of the Parquet file.

Testing:
Create a small table that contains one row of data with a single column that's 
bigint and
store it as Parquet. Manually changed the footer size field to be
  1) smaller than the original footer size by 1, to trigger the error message 
fixed by
this jira to be printed, to verify that the fix functions correctly;
  2) bigger than the file size, thus to trigger another related error message 
to be
printed.

Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
---
M be/src/exec/hdfs-parquet-scanner.cc
M testdata/data/README
A testdata/data/corrupt_footer_len_decr.parquet
A testdata/data/corrupt_footer_len_incr.parquet
M tests/query_test/test_scanners.py
5 files changed, 65 insertions(+), 17 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/11379/2
--
To view, visit http://gerrit.cloudera.org:8080/11379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I35235e99ea9ceb0d31961dd3b8069f7194f5a2de
Gerrit-Change-Number: 11379
Gerrit-PatchSet: 2
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong