[Impala-ASF-CR] IMPALA-10596: De-flake TestAdmissionControllerStress

2021-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17272 )

Change subject: IMPALA-10596: De-flake TestAdmissionControllerStress
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8503/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1594a82f507db8dc094d4441dd8739d037f974ff
Gerrit-Change-Number: 17272
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 06 Apr 2021 03:53:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10596: De-flake TestAdmissionControllerStress

2021-04-05 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17272


Change subject: IMPALA-10596: De-flake TestAdmissionControllerStress
..

IMPALA-10596: De-flake TestAdmissionControllerStress

Currently some tests in TestAdmissionControllerStress fail because
they rely on queries holding onto resources on backends till they
are explicitly closed. This was done by keeping the query alive by
fetching very limited rows in intervals. After results spooling was
turned on by default, the queries started to finish early and
released their resources on other backends which let queued queries
to get admitted. This patch turns off result spooling for the queries
to avoid this situation.

Testing:
Ran test in a loop on my local dev machine.

Change-Id: I1594a82f507db8dc094d4441dd8739d037f974ff
---
M tests/custom_cluster/test_admission_controller.py
1 file changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/17272/1
--
To view, visit http://gerrit.cloudera.org:8080/17272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1594a82f507db8dc094d4441dd8739d037f974ff
Gerrit-Change-Number: 17272
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
> Unfortunately, I don't think this approach would work. maxSkippedUpdatesLoc
Ah, I see. After thinking this more clear, I think my idea should be changing 
'maxNumAttempts' from 5 to 5 * (maxSkippedUpdatesLockContention_ + 1). Please 
verify my understanding. Let's say there are 6 concurrent DDLs (ddl1, ddl2, 
..., ddl6) on table 'foo' and both with sync_ddl=true. And let's say they 
finishes in this order:

t0: ddl1 done, go into waitForSyncDdlVersion()
t1-a: catalog topic update thread start collecting updates
t1-b: ddl2 done, bump the table version, go into waitForSyncDdlVersion()
t1-c: catalog topic update thread finish collecting updates and skip the table 
since its version is out of scope
t2-a: catalog topic update thread start
t2-b: ddl3 done, bump the table version, go into waitForSyncDdlVersion()
t2-c: catalog topic update thread finish collecting updates and skip the table 
since its version is out of scope
...
t5-a: catalog topic update thread start
t5-b: ddl6 done, bump the table version, go into waitForSyncDdlVersion()
t5-c: catalog topic update thread finish collecting updates and skip the table 
since its version is out of scope
t6: waitForSyncDdlVersion() for ddl1 returns since 'maxNumAttempts' is reached.

This is the expected behavior before IMPALA-6671. Now when skipping locked 
table is enabled, more rounds of topic updates could be sent in the time range 
between t1-a and t1-b. So 'maxNumAttempts' is reached quickly if it's still 5. 
There are at most maxSkippedUpdatesLockContention_ updates can be sent between 
t1-a and t1-b, the same for t2, t3, ..., t5. So I'm thinking setting 
maxNumAttempts to 5 * (maxSkippedUpdatesLockContention_ + 1) can achieve the 
pre-IMPALA-6671 purpose.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 06 Apr 2021 02:24:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog

2021-04-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613: Standup HMS thrift server in Catalog
..


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java:

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91
PS4, Line 91: FALLBACK_FILE_MD_TIME_WARN_THRESHOLD
nit: usually we append _MS if the time units is ms. It is done for the external 
config options but it is a good convention for internal constants as well.


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91
PS4, Line 91: 100
I am curious about this value..should this depend on the number of files for 
which the metadata needs to be loaded ?


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@126
PS4, Line 126:   checkCondition(getTableRequest.getEngine() != null, 
"Column stats are requested "
The comment above says to check the processor engine is set to Impala but this 
is only checking that engine is non-null.


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@135
PS4, Line 135: if (getTableRequest.isSetValidWriteIdList()) {
For transactional tables, should we have a precondition that the 
ValidWriteIdList is non-null ?


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@205
PS4, Line 205:* @param catalog The catalog instance which caches 
the table.
nit: the defaultCatalog is missing in the params.


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java:

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java@50
PS4, Line 50: the
nit: remove 'the'


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java:

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@57
PS4, Line 57: //TODO add
It says TODO here but if --enable_catalogd_hms_cache is true, this patch does 
serve certain requests from the cache, right ?


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@250
PS4, Line 250: metaserver
nit: At other places you have used 'metastore server' or 'metastore service'.  
Would be good to use the same here.


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@146
PS4, Line 146: return 
super.get_partitions_by_names_req(getPartitionsByNamesRequest);
This doesn't do any logging of the request (unlike the get_partition_by_expr) ? 
Would be good to have the  INFO log entry.



--
To view, visit http://gerrit.cloudera.org:8080/17244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Gerrit-Change-Number: 17244
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 06 Apr 2021 00:53:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog

2021-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613: Standup HMS thrift server in Catalog
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8502/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Gerrit-Change-Number: 17244
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 22:55:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog

2021-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613: Standup HMS thrift server in Catalog
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1342
PS4, Line 1342: 
CatalogHmsAPIHelper.loadAndSetFileMetadataFromFs(validTxnList, 
validWriteIdList, result);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java:

http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java@25
PS4, Line 25: import static 
org.apache.impala.catalog.metastore.CatalogHmsFileMetadataTest.assertFdsAreSame;
line too long (94 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Gerrit-Change-Number: 17244
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 22:35:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog

2021-04-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613: Standup HMS thrift server in Catalog
..

IMPALA-10613: Standup HMS thrift server in Catalog

This change adds the basic infrastructure to start the HMS server in
Catalog. It introduces a new configuration (--start_hms_server) along with a
config for the port and starts a HMS thrift server in the CatalogServiceCatalog
instance. Currently, all the HMS APIs are "pass-through" to the backing HMS
service. Except for the following 3 HMS APIs which can be used to request
a table and its partitions.

Additionally, there is another flag (--enable_catalogd_hms_cache) which
can be used to disable the usage of catalogd for providing the table
and partition metadata. This contribution was done by Kishen Das.

1. get_table_req
2. get_partitions_by_expr
3. get_partitions_by_names

In case of get_partitions_by_expr we need the hive-exec jar to be
present in the classpath since it needs to load the PartitionExpressionProxy
to push down the partition predicates to the HMS database. In case of
get_table_req if column statistics are requested, we return the
table level statistics.

Additionally, this patch adds a new configuration
fallback_to_hms_on_errors for the catalog which is used to determine
if the Catalog falls back to HMS service in case of errors while
executing the API. This is useful for testing purposes.

In order to expose the file-metadata for the tables and partitions,
HMS API changes were made to add the filemetadata fields to table
and partitions. In case of transactional tables, the file-metadata
which is returned is consistent with the provided ValidWriteIdList
in the API call.

There are a few TODOs which will be done in follow up tasks:
1. Add support for SASL support.
2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs
in the hive branch doesn't break Catalog's HMS service.

Testing:
1. Added a new end-to-end test which starts the HMS service in Catalog and runs
some basic HMS APIs against it.
2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and
confirmed most tests are working. There were some test failures but they are
unrelated since the test assumes an empty warehouse whereas we run against the
actual HMS service running in the mini-cluster.

Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
---
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A 
fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
A 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
A 
fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java
A 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
A 
fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A 
fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/common/impala_test_suite.py
A tests/custom_cluster/test_metastore_service.py
23 files changed, 5,313 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/4
--
To view, visit http://gerrit.cloudera.org:8080/17244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Gerrit-Change-Number: 17244
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog

2021-04-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog
..


Patch Set 3:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/17244/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17244/3//COMMIT_MSG@7
PS3, Line 7:
> nit: no whitespace
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/catalog/catalog-server.cc@94
PS3, Line 94: hms_port_number
> hms_port
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/17244/3/be/src/common/global-flags.cc@352
PS3, Line 352: " it will be redirected to HMS.");
> nit: add a blank line here
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java
File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java:

http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@44
PS3, Line 44: import org.apache.hadoop.hive.common.ValidReaderWriteIdList;
> nit: unused import
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@87
PS3, Line 87: CatalogHMSAPIHelper
> nit: I think our naming rules prefer "CatalogHmsApiHelper"
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@96
PS3, Line 96: maxNonHdfsPartsParallelLoad
> Should this be maxHdfsPartsParallelLoad(), or we just choose this because i
Yes, I thought of just reusing the larger size of the 2 pools since this is a 
static pool which is shared by all the API invocations in case of cache misses. 
I have a TODO here to have it separately configurable. Perhaps I should just 
use a constant of 20 to avoid confusion. Any thoughts?


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@121
PS3, Line 121: GetPartialCatalogObjectRequestBuilder reqBuilder = new 
GetPartialCatalogObjectRequestBuilder()
> line too long (98 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@138
PS3, Line 138: CDPD-14901
> File an upstream JIRA for this?
Thanks for spotting this. Updated the comment with the latest observations 
about this.


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141
PS3, Line 141: deepCopy
> Could you add a comment for deepCopy() here? We already deep copy the hms t
Thanks for catching this. Yes, it doesn't look like this is necessary. I was 
making a copy a bit pessimistically since that would mean that we are using 
implementation knowledge of the API. The copy was necessary since the table was 
being modified later down below. I added a comment explaining the same but I 
would be okay to remove the deepCopy too if you think that is an overkill.


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@225
PS3, Line 225: GetPartialCatalogObjectRequestBuilder catalogReq = new 
GetPartialCatalogObjectRequestBuilder()
> line too long (98 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@519
PS3, Line 519: + "fallback path. Time taken: {} msec", 
getPartsResult.getPartitionsSize(),
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@523
PS3, Line 523: + "fallback path. Time taken: {} msec", 
getPartsResult.getPartitionsSize(),
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java:

http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHMSClientUtils.java@46
PS3, Line 46: CatalogHMSClientUtils
> nit: CatalogHmsClientUtils?
Done


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@141
PS3, Line 141: import 
org.apache.hadoop.hive.metastore.api.InvalidPartitionException;
> nit: unused import
My IDE tells me that this is not a unused import. Some of the methods below 
throw this exception 

[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8501/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 21:19:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..

IMPALA-6671: Change wait for sync ddl timeout

When skip locked tables from topic updates is enabled
(topic_update_tbl_max_wait_time_ms > 0), it is possible that a thread
waiting for a topic update during a sync ddl execution to terminate
earlier. This is because the waitForSyncDdlVersion currently statically
counts the number of instances of topic updates against a maximum
and bails out when the maximum is reached. With topic update thread
skipping locked tables, this number of instances of topic updates
is more likely to hit the maximum attempt limit.

This commit changes the logic so that when locked tables are skipped from
topic updates, the sync ddl operation waits until a configurable
timeout. This timeout value is specified in seconds using the configuration
max_wait_time_for_sync_ddl_s. The default value of this configuration is 0
which means it waits indefinitely. This makes sure that only the sync ddl
queries on a table which has been locked wait for extended durations while
the other unrelated sync ddl queries can finish appropriately.

Also this commit changes the default values of following flags
which were introduced in the earlier patch for IMPALA-6671.

The current default value of topic_update_tbl_max_wait_time_ms
of 500ms is too low and may skip the locked tables more
aggressively than needed. The new defaults were set based on
analysis of a real world deployment.

topic_update_tbl_max_wait_time_ms = 12
catalog_max_lock_skipped_topic_updates = 3

Testing:
1. Ran existing test test_topic_update_frequency.
2. Added a new test for the newly added flag.

Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M tests/custom_cluster/test_topic_update_frequency.py
6 files changed, 88 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17253/5
--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-6671: Change wait for sync ddl timeout

2021-04-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17253 )

Change subject: IMPALA-6671: Change wait for sync ddl timeout
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@92
PS4, Line 92: DEFINE_int32(max_wait_time_for_sync_ddl_s, 0, "Maximum time (in 
seconds) until "
> Do you have a recommendation on what this should be when topic_update_tbl_m
Do you mean what should the recommendation for the least value of it? It is 
hard to predict. Ideally, the minimum time required for this would be the 
2*(time required for one topic update delta calculation).

The default value of 0 is good for most cases when 
topic_update_tbl_max_wait_time_ms is enabled, unless there are any issues and 
user has to override it manually.


http://gerrit.cloudera.org:8080/#/c/17253/4/be/src/catalog/catalog-server.cc@94
PS4, Line 94: will
> nit: will wait?
Done


http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17253/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3094
PS4, Line 3094: long maxNumAttempts = 5;
> I think another solution is increasing maxNumAttempts with maxSkippedUpdate
Unfortunately, I don't think this approach would work. 
maxSkippedUpdatesLockContention_ is not just simply counts the number of topic 
updates due to the lock, but number of distinct lock operations which are 
skipped by topic update thread. Otherwise, it would be not work since we would 
just be delaying the onset of the problem. Hence we cannot rely on 
maxSkippedUpdatesLockContention_ since the topic updates skipped will almost 
always be much higher than maxSkippedUpdatesLockContention_. Hope that makes 
sense.



--
To view, visit http://gerrit.cloudera.org:8080/17253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79e64cdec0e6aa7b597a47851b4b5c5441ca5528
Gerrit-Change-Number: 17253
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 20:59:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10613 : Standup HMS thrift server in Catalog

2021-04-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17244 )

Change subject: IMPALA-10613 : Standup HMS thrift server in Catalog
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java
File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java:

http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@44
PS3, Line 44: import org.apache.hadoop.hive.common.ValidReaderWriteIdList;
nit: unused import


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141
PS3, Line 141: deepCopy
Could you add a comment for deepCopy() here? We already deep copy the hms table 
object in Table.getPartialInfo(): 
https://github.com/apache/impala/blob/22d1f8c7fe14e636d9186f69cf156851879f37b3/fe/src/main/java/org/apache/impala/catalog/Table.java#L661

Maybe we don't need this?


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@141
PS3, Line 141: import 
org.apache.hadoop.hive.metastore.api.InvalidPartitionException;
nit: unused import


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@222
PS3, Line 222: import 
org.apache.hadoop.hive.metastore.api.UnknownPartitionException;
nit: unused import


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@264
PS3, Line 264: import org.apache.impala.catalog.CatalogException;
nit: unused import


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@751
PS3, Line 751: info
nit: Will this too verbose? Consider "trace"?


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@759
PS3, Line 759:   Table tbl = result.getTable();
 :   boolean isTransactional = tbl.getParameters() != null && 
AcidUtils
 :   .isTransactionalTable(tbl.getParameters());
nit: move these after the following if-clause


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@778
PS3, Line 778: setFileMetadata
At the first glance, I thought this is setting the file metadata using those 
cached in catalogd. However, it always triggers file metadata loading. Can we 
rename this to "loadFileMetadata" or "loadAndSetFileMetadata"?


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1256
PS3, Line 1256: If the hive-exec jar is not present in the classpath, we 
fall-back to HMS since
  :* Catalog has no way to deserialize the expression sent over 
by the client.
It seems we just forward the call. Are we missing some codes here?

EDIT: Seems copied from 
CatalogMetastoreServiceHandler.get_partitions_by_expr(). Maybe we can remove 
these comments.


http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1300
PS3, Line 1300: if (fallBackToHMSOnErrors_) {
  :   return;
  : }
nit: Many simple if-clauses in this patch are not in our code style. I think we 
prefer

 if (fallBackToHMSOnErrors_) return;

It'd be better to keep the same code style. But if these codes are copied from 
Hive, please ignore this.



--
To view, visit http://gerrit.cloudera.org:8080/17244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Gerrit-Change-Number: 17244
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 05 Apr 2021 10:14:14 +
Gerrit-HasComments: Yes