[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17771 )

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..


Patch Set 10:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/9852/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 30 Nov 2021 07:44:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17771 )

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17771/10/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/17771/10/bin/bootstrap_toolchain.py@469
PS10, Line 469: )
flake8: E501 line too long (91 > 90 characters)
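
The check reported above can be approximated locally with a minimal sketch like the one below. The 90-character limit is taken from the flake8 message; the helper name find_long_lines and the sample text are hypothetical, and this is not how flake8 itself is implemented.

```python
def find_long_lines(text, max_len=90):
    """Return (line_number, length) for every line longer than max_len."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > max_len
    ]


# Hypothetical sample: line 2 is one character over the limit,
# line 3 is exactly at the limit and therefore allowed.
sample = "short line\n" + "x" * 91 + "\n" + "y" * 90
long_lines = find_long_lines(sample)
```

A line of exactly 90 characters passes; only 91 or more is reported, which matches the "91 > 90" wording of the message.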



--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 30 Nov 2021 07:22:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-29 Thread Anonymous Coward (Code Review)
Hello Quanlong Huang, Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17771

to look at the new patch set (#10).

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..

WiP: IMPALA-10798 : Prototype for JSON reader

This prototype allows users to create a table stored as jsonfile and
query it.
Steps to test:
- create a json table with a schema that uses only eligible datatypes
(int8/16/32/64/float/double/string/varchar/char/timestamp/boolean)
- add your json file (with eligible datatypes and the same column names as
 the schema specified in the create command) to an hdfs location
- set this 'location' on your table
- run a select statement
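
A test file for the steps above could be generated with a sketch like the following. It assumes the prototype reads newline-delimited JSON (one object per line), which the commit message does not state; the column names, values, and file name are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical rows; the keys must match the column names in the table schema.
rows = [
    {"id": 1, "name": "alice", "score": 9.5, "active": True},
    {"id": 2, "name": "bob", "score": 7.25, "active": False},
]


def write_json_lines(path, records):
    """Write one JSON object per line (assumed input layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


path = os.path.join(tempfile.mkdtemp(), "sample.json")
write_json_lines(path, rows)

# Read the file back to confirm that every line parses on its own.
with open(path, encoding="utf-8") as f:
    read_back = [json.loads(line) for line in f]
```

The resulting file would then be copied to the hdfs location used by the table.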

Fix:
- arrow library is included wherever required
- json format is added to scan node base class.
- json scanner files are added; they implement methods to read the
 json file from the specified file location

Change-Id: If79364a421d862d0d837f9be694911e388d4d629
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-json-scanner.cc
A be/src/exec/hdfs-json-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindArrow.cmake
9 files changed, 580 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/17771/10
--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7681/


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 06:43:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7680/


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 05:47:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9851/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 05:12:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7682/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 04:58:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after
Compaction

After compaction happens in Hive (Hive ACID tables), queries made in
Impala may fail with a FileNotFoundException if the files have already
been removed by the Hive cleaner.

In IMPALA-10801, catalogd checks the latest compaction id before serving
metadata. However, coordinators don't take advantage of that.
Coordinators have their own local cache, so we have to do the same
check for coordinators as well. In addition, we need to attach the
writeIdList to requests that fetch file metadata. Since this check
adds overhead to queries, we introduce a flag, auto_check_compaction,
and set it to false by default for now. We will look for more efficient
ways to do compaction checking in the future.
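
The idea described above can be illustrated with a toy model. This is only a sketch: the class LocalCatalogCache, its callbacks, and the string metadata are hypothetical and do not mirror the actual CatalogdMetaProvider code; only the flag name auto_check_compaction comes from the commit message.

```python
class LocalCatalogCache:
    """Toy model of a coordinator-side metadata cache (hypothetical)."""

    def __init__(self, fetch_latest_compaction_id, load_metadata,
                 auto_check_compaction=False):
        self._fetch_latest = fetch_latest_compaction_id
        self._load = load_metadata
        self._auto_check = auto_check_compaction
        self._entries = {}  # table name -> (compaction id, metadata)

    def get(self, table):
        cached = self._entries.get(table)
        if cached is not None and not self._auto_check:
            return cached[1]  # flag off: trust the cache as-is
        latest_id = self._fetch_latest(table)  # the extra per-table request
        if cached is not None and cached[0] == latest_id:
            return cached[1]  # no compaction since the entry was cached
        metadata = self._load(table)  # stale or missing: reload
        self._entries[table] = (latest_id, metadata)
        return metadata


# Hypothetical stand-ins for the metastore and the metadata loader.
compaction_ids = {"t": 1}
loads = []


def fetch_latest(table):
    return compaction_ids[table]


def load(table):
    loads.append(table)
    return "files@compaction-%d" % compaction_ids[table]


checked = LocalCatalogCache(fetch_latest, load, auto_check_compaction=True)
first = checked.get("t")
compaction_ids["t"] = 2  # a compaction happens in Hive
second = checked.get("t")  # the check notices it and reloads
```

With the flag left at its default of false, the second lookup would return the cached entry without contacting the metastore, which avoids the overhead but can serve file metadata that the cleaner has already invalidated.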

Tests:
Added unit tests to CatalogdMetaProviderTest

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
12 files changed, 356 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/5
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 35: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 01:17:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9850/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 00:27:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11021: Fix bug when query contains illegal predicate hints

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18040 )

Change subject: IMPALA-11021: Fix bug when query contains illegal predicate 
hints
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id719bc4280c811456333eb4b4ec5bc9cb8bae128
Gerrit-Change-Number: 18040
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 30 Nov 2021 00:23:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11021: Fix bug when query contains illegal predicate hints

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/18040 )

Change subject: IMPALA-11021: Fix bug when query contains illegal predicate 
hints
..

IMPALA-11021: Fix bug when query contains illegal predicate hints

Currently Impala supports the predicate hint ALWAYS_TRUE, which can be
used after the WHERE keyword. If an illegal hint is used, the query
throws an IllegalStateException, which is not expected. The query should
return normal results with a warning instead of an exception. This is
due to the condition check in Analyzer.addWarning().
After creating the TExecRequest and initializing it, Impala gets the
warnings from 'GlobalState.warnings', and 'GlobalState.warningsRetrieved'
is then set to 'true'. After this, however, Impala substitutes the
predicate via clone() and analyzes the new predicate in a later phase.
Analyzing the new predicate tries to add the hint warning to
'GlobalState.warnings', but this fails with an IllegalStateException
because the check that 'globalState_.warningsRetrieved' is 'false' fails.
This check was added in IMPALA-4166. I removed the original condition
check and added a new check to ensure that all warnings for
new/substituted predicates already exist in 'globalState_.warnings'.
This also avoids the exception caused by illegal hints.
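
The behavior change described above can be modeled with a small sketch. It is a simplified model, not the actual Analyzer.java code: the class and function names are hypothetical, RuntimeError stands in for IllegalStateException, and the warning text is made up.

```python
class GlobalState:
    """Hypothetical stand-in for the analyzer's global state."""

    def __init__(self):
        self.warnings = []
        self.warnings_retrieved = False


def get_warnings(state):
    """Retrieve the warnings and mark them as retrieved."""
    state.warnings_retrieved = True
    return list(state.warnings)


def add_warning_old(state, msg):
    """Old check: any warning after retrieval is an error."""
    if state.warnings_retrieved:
        raise RuntimeError("warnings already retrieved")
    if msg not in state.warnings:
        state.warnings.append(msg)


def add_warning_new(state, msg):
    """New check: after retrieval, only already-known warnings are accepted."""
    if state.warnings_retrieved:
        if msg not in state.warnings:
            raise RuntimeError("new warning after retrieval: " + msg)
        return
    if msg not in state.warnings:
        state.warnings.append(msg)


state = GlobalState()
add_warning_new(state, "illegal hint ignored")  # first analysis
retrieved = get_warnings(state)
add_warning_new(state, "illegal hint ignored")  # re-analysis of the clone
```

Under the old check the last call would raise, because the substituted predicate repeats a warning after the warnings were retrieved.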

Testing:
- Added new fe tests in 'AnalyzeStmtsTest'

Change-Id: Id719bc4280c811456333eb4b4ec5bc9cb8bae128
Reviewed-on: http://gerrit.cloudera.org:8080/18040
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
2 files changed, 51 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/18040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id719bc4280c811456333eb4b4ec5bc9cb8bae128
Gerrit-Change-Number: 18040
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7681/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 00:17:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 4: Code-Review+2

The patch looks good to me. I think we should create separate followups to 
look into how we can fix this for cases where the compactor cleans up the 
files just after the query is compiled. Since the cleaner runs after a 
configurable delay following compaction, queries that run for longer than 
that delay could still be affected by this issue.


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 00:16:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@348
PS3, Line 348: conduct
> nit, conducted
Ack


http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: m
> move to previous line?
Ack


http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: ala makes "
 : "additional RPCs to hive metastore for each table
> suggest you to change it to more generic since end users may not understand
Ack



--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 30 Nov 2021 00:03:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after
Compaction

After compaction happens in Hive (Hive ACID tables), queries made in
Impala may fail with a FileNotFoundException if the files have already
been removed by the Hive cleaner.

In IMPALA-10801, catalogd checks the latest compaction id before serving
metadata. However, coordinators don't take advantage of that.
Coordinators have their own local cache, so we have to do the same
check for coordinators as well. In addition, we need to attach the
writeIdList to requests that fetch file metadata. Since this check
adds overhead to queries, we introduce a flag, auto_check_compaction,
and set it to false by default for now. We will look for more efficient
ways to do compaction checking in the future.

Tests:
Added unit tests to CatalogdMetaProviderTest

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
12 files changed, 337 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/4
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9849/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 29 Nov 2021 23:54:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@348
PS3, Line 348: conduct
nit, conducted


http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: .
move to previous line?


http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: sends one "
 : "getLatestCompactionInfo request per table to HMS.
I suggest making this more generic, since end users may not understand 
these details. Something like "... because Impala makes additional RPCs to hive 
metastore for each table in a query during the query compilation"


http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@530
PS3, Line 530: checkLatestCompaction
Logging the time taken for this method will be helpful.


http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1755
PS3, Line 1755: if (BackendConfig.INSTANCE.isAutoCheckCompaction() && analysisResult.isQueryStmt()
  :   && !queryCtx.isSetTransaction_id()) {
  : long txnId = openTransaction(queryCtx);
  : timeline.markEvent("Transaction opened (" + String.valueOf(txnId) + ")");
  :   }
If we open a transaction here then we must make sure that we also commit it 
when the query completes. I suggest you do it in a followup, since I don't think 
the backend calls back into the frontend when the query completes, so it is not 
as straightforward.
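
The pairing requirement can be illustrated with a generic sketch. It only shows the open/commit discipline within a single process; as noted above, in Impala the query completes in the backend, so this pattern does not apply directly. The transaction helper, the callbacks, and the transaction id 42 are all hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def transaction(open_txn, commit_txn, abort_txn):
    """Open a transaction and guarantee it is committed or aborted."""
    txn_id = open_txn()
    try:
        yield txn_id
    except Exception:
        abort_txn(txn_id)  # the work failed: do not leave the txn open
        raise
    else:
        commit_txn(txn_id)  # the work finished: close the txn


# Hypothetical callbacks that just record what happened.
events = []


def open_txn():
    events.append("open")
    return 42


def commit_txn(txn_id):
    events.append("commit %d" % txn_id)


def abort_txn(txn_id):
    events.append("abort %d" % txn_id)


with transaction(open_txn, commit_txn, abort_txn) as txn_id:
    events.append("run query in txn %d" % txn_id)
```

Every opened transaction ends in exactly one commit or abort, which is the property the comment asks for.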



--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 29 Nov 2021 23:48:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-29 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after
Compaction

After compaction happens in Hive (Hive ACID tables), queries made in
Impala may fail with a FileNotFoundException if the files have already
been removed by the Hive cleaner.

In IMPALA-10801, catalogd checks the latest compaction id before serving
metadata. However, coordinators don't take advantage of that.
Coordinators have their own local cache, so we have to do the same
check for coordinators as well. In addition, we need to attach the
writeIdList to requests that fetch file metadata. Since this check
adds overhead to queries, we introduce a flag, auto_check_compaction,
and set it to false by default for now. We will look for more efficient
ways to do compaction checking in the future.

Tests:
Added unit tests to CatalogdMetaProviderTest

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
13 files changed, 335 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/3
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 35:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7679/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:49:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 35:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9848/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:13:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 34:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9847/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 34
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:07:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11021: Fix bug when query contains illegal predicate hints

2021-11-29 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18040 )

Change subject: IMPALA-11021: Fix bug when query contains illegal predicate 
hints
..


Patch Set 6: Code-Review+2

Looks great! Thanks for fixing this!


--
To view, visit http://gerrit.cloudera.org:8080/18040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id719bc4280c811456333eb4b4ec5bc9cb8bae128
Gerrit-Change-Number: 18040
Gerrit-PatchSet: 6
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:02:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11021: Fix bug when query contains illegal predicate hints

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18040 )

Change subject: IMPALA-11021: Fix bug when query contains illegal predicate 
hints
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7678/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/18040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id719bc4280c811456333eb4b4ec5bc9cb8bae128
Gerrit-Change-Number: 18040
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:03:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11021: Fix bug when query contains illegal predicate hints

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18040 )

Change subject: IMPALA-11021: Fix bug when query contains illegal predicate 
hints
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id719bc4280c811456333eb4b4ec5bc9cb8bae128
Gerrit-Change-Number: 18040
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:03:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Sourabh Goyal (Code Review)
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#35).

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place, whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and applies them asynchronously. These two streams of updates cause
consistency issues. For example, consider the following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1, say another Impala cluster
2. alter table t1 from source s2, say another Hive cluster
3. alter table t1 from the local Impala cluster

The alter table DDL in #3 is reflected in the local cache
immediately. Later, however, the event processor processes the
events from #1 and #2 above and tries to alter the table. Ideally,
these alters should have been applied before #3, i.e. in the same
order as they appear in the HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track, in the Table object,
of the last event id that the catalogd has synced to for a given
table (eventId). The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both the catalog
HMS endpoint and the Impala shell) takes the following steps to
update the event id for the table:

1. Acquire the write lock on the table
2. Perform the DDL operation in HMS
3. Sync the table up to the latest event id (as per HMS), starting
   from its last synced event id

The above steps ensure that any concurrent updates applied to the same
db/table from multiple sources such as Hive, Impala or multiple
Impala clusters are reflected in the local catalogd cache in the
same order as they appear in HMS, thus removing any inconsistencies.
The solution also relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). Database objects get a similar eventId, which
represents the last event on the database object (CREATE, DROP, ALTER
database) that the catalogd has synced to.
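
The skip-and-sync idea described above can be illustrated with a small,
self-contained sketch. This is not code from the patch: EventSyncSketch,
TrackedTable, Event, applyIfNewer() and syncToLatest() are hypothetical
names, and a plain ReentrantLock stands in for the catalogd table lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: class and method names are illustrative and are not
// taken from the patch.
public class EventSyncSketch {

  /** A minimal stand-in for an HMS notification event. */
  static final class Event {
    final long id;
    final String change;
    Event(long id, String change) { this.id = id; this.change = change; }
  }

  /** A stand-in for a catalog table that remembers the last synced event id. */
  static final class TrackedTable {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> applied = new ArrayList<>();
    private long lastSyncedEventId = -1;

    /** Events-processor path: skip events the table has already synced to. */
    boolean applyIfNewer(Event e) {
      lock.lock();
      try {
        if (e.id <= lastSyncedEventId) return false;
        applied.add(e.change);
        lastSyncedEventId = e.id;  // updated before the lock is released
        return true;
      } finally {
        lock.unlock();
      }
    }

    /** DDL path: under the table lock, replay pending events in HMS order. */
    void syncToLatest(List<Event> eventsSinceLastSync) {
      lock.lock();
      try {
        for (Event e : eventsSinceLastSync) applyIfNewer(e);  // lock is reentrant
      } finally {
        lock.unlock();
      }
    }

    long lastSyncedEventId() {
      lock.lock();
      try { return lastSyncedEventId; } finally { lock.unlock(); }
    }

    List<String> applied() {
      lock.lock();
      try { return new ArrayList<>(applied); } finally { lock.unlock(); }
    }
  }

  public static void main(String[] args) {
    TrackedTable t1 = new TrackedTable();
    List<Event> log = new ArrayList<>();
    log.add(new Event(1, "alter from s1"));
    log.add(new Event(2, "alter from s2"));
    log.add(new Event(3, "alter from local Impala"));

    // The local DDL (event 3) syncs the table through events 1..3 in order.
    t1.syncToLatest(log);
    // The events processor later polls the same events and skips all of them.
    for (Event e : log) {
      System.out.println("event " + e.id + " applied again: " + t1.applyIfNewer(e));
    }
    System.out.println("applied order: " + t1.applied());
  }
}
```

In this sketch the DDL path and the events-processor path share one
comparison against the stored event id, so whichever path reaches an
event first applies it and the other path skips it.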

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If the flag in #1 is enabled then, apart from the Impala shell and
   MetastoreEventProcessor, the cache is also updated for DDLs
   executed via catalog HMS endpoints. While executing a DDL, the
   db/table is synced up to the latest event id.
3. The event processor skips processing an event if the db/table is
   already synced up to that event id. It sets that event id in the
   db/table if the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in the db/table before skipping the processing of the event.
5. A full table refresh sets the last event processed in the table cache.

Future Work:
1. Sync the db/table to the latest event id for DDLs executed from the
   Impala shell (execDdlRequest() in CatalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with the flag turned both on and off

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A 

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 34:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/34/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/34/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1299
PS34, Line 1299: MetastoreEventsProcessor.getNextMetastoreEventsInBatches(catalog_, currentEventId,
line too long (94 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 34
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 29 Nov 2021 17:45:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-29 Thread Sourabh Goyal (Code Review)
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#34).

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place, whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and applies them asynchronously. These two streams of updates cause
consistency issues. For example, consider the following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1, say another Impala cluster
2. alter table t1 from source s2, say another Hive cluster
3. alter table t1 from the local Impala cluster

The alter table DDL in #3 is reflected in the local cache
immediately. Later, however, the event processor processes the
events from #1 and #2 above and tries to alter the table. Ideally,
these alters should have been applied before #3, i.e. in the same
order as they appear in the HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track, in the Table object,
of the last event id that the catalogd has synced to for a given
table (eventId). The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both the catalog
HMS endpoint and the Impala shell) takes the following steps to
update the event id for the table:

1. Acquire the write lock on the table
2. Perform the DDL operation in HMS
3. Sync the table up to the latest event id (as per HMS), starting
   from its last synced event id

The above steps ensure that any concurrent updates applied to the same
db/table from multiple sources such as Hive, Impala or multiple
Impala clusters are reflected in the local catalogd cache in the
same order as they appear in HMS, thus removing any inconsistencies.
The solution also relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). Database objects get a similar eventId, which
represents the last event on the database object (CREATE, DROP, ALTER
database) that the catalogd has synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If the flag in #1 is enabled then, apart from the Impala shell and
   MetastoreEventProcessor, the cache is also updated for DDLs
   executed via catalog HMS endpoints. While executing a DDL, the
   db/table is synced up to the latest event id.
3. The event processor skips processing an event if the db/table is
   already synced up to that event id. It sets that event id in the
   db/table if the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in the db/table before skipping the processing of the event.
5. A full table refresh sets the last event processed in the table cache.

Future Work:
1. Sync the db/table to the latest event id for DDLs executed from the
   Impala shell (execDdlRequest() in CatalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with the flag turned both on and off

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A 

[Impala-ASF-CR] IMPALA-11035: Make x-forwarded-for http header case-insensitive

2021-11-29 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18048 )

Change subject: IMPALA-11035: Make x-forwarded-for http header case-insensitive
..


Patch Set 4:

Looks like the pre-commit test is hitting IMPALA-10999


--
To view, visit http://gerrit.cloudera.org:8080/18048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9c4070a4a2d5ad9decb186a9219957d8c26a7d7
Gerrit-Change-Number: 18048
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Nov 2021 15:00:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11035: Make x-forwarded-for http header case-insensitive

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18048 )

Change subject: IMPALA-11035: Make x-forwarded-for http header case-insensitive
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7677/


--
To view, visit http://gerrit.cloudera.org:8080/18048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9c4070a4a2d5ad9decb186a9219957d8c26a7d7
Gerrit-Change-Number: 18048
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Nov 2021 14:53:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10771: Add Tencent COS support

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17503 )

Change subject: IMPALA-10771: Add Tencent COS support
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9846/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idce135a7591d1b4c74425e365525be3086a39821
Gerrit-Change-Number: 17503
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 29 Nov 2021 11:18:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10771: Add Tencent COS support

2021-11-29 Thread Fucun Chu (Code Review)
Fucun Chu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17503 )

Change subject: IMPALA-10771: Add Tencent COS support
..


Patch Set 5:

The hadoop-cos project has added a license file and is now under the MIT license. 
https://github.com/tencentyun/hadoop-cos/issues/35


--
To view, visit http://gerrit.cloudera.org:8080/17503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idce135a7591d1b4c74425e365525be3086a39821
Gerrit-Change-Number: 17503
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 29 Nov 2021 11:00:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10771: Add Tencent COS support

2021-11-29 Thread Fucun Chu (Code Review)
Fucun Chu has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17503 )

Change subject: IMPALA-10771: Add Tencent COS support
..

IMPALA-10771: Add Tencent COS support

This patch adds support for COS (Cloud Object Storage). It uses
hadoop-cos, so the implementation is similar to that of other remote
FileSystems.

New flags for COS:
- num_cos_io_threads: Number of COS I/O threads. Defaults to 16.

Follow-up:
- Support for caching COS file handles will be addressed in
   IMPALA-10772.
- test_concurrent_inserts and test_failing_inserts in
   test_acid_stress.py are skipped due to slow file listing on
   COS (IMPALA-10773).

Tests:
 - Upload hdfs test data to a COS bucket. Modify all locations in HMS
   DB to point to the COS bucket. Remove some hdfs caching params.
   Run CORE tests.

Change-Id: Idce135a7591d1b4c74425e365525be3086a39821
---
M be/src/exec/hdfs-table-sink.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M java/executor-deps/pom.xml
M java/pom.xml
M testdata/bin/create-load-data.sh
M testdata/bin/run-all.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py
M tests/authorization/test_ranger.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_coordinators.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/custom_cluster/test_hive_parquet_codec_interop.py
M tests/custom_cluster/test_hive_text_codec_interop.py
M tests/custom_cluster/test_insert_behaviour.py
M tests/custom_cluster/test_lineage.py
M tests/custom_cluster/test_local_catalog.py
M tests/custom_cluster/test_local_tz_conversion.py
M tests/custom_cluster/test_metadata_replicas.py
M tests/custom_cluster/test_metastore_service.py
M tests/custom_cluster/test_parquet_max_page_header.py
M tests/custom_cluster/test_permanent_udfs.py
M tests/custom_cluster/test_query_retries.py
M tests/custom_cluster/test_restart_services.py
M tests/custom_cluster/test_topic_update_frequency.py
M tests/data_errors/test_data_errors.py
M tests/failure/test_failpoints.py
M tests/metadata/test_catalogd_debug_actions.py
M tests/metadata/test_compute_stats.py
M tests/metadata/test_ddl.py
M tests/metadata/test_hdfs_encryption.py
M tests/metadata/test_hdfs_permissions.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_metadata_query_statements.py
M tests/metadata/test_partition_metadata.py
M tests/metadata/test_refresh_partition.py
M tests/metadata/test_views_compatibility.py
M tests/query_test/test_acid.py
M tests/query_test/test_date_queries.py
M tests/query_test/test_hbase_queries.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert_behaviour.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_join_queries.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_observability.py
M tests/query_test/test_partitioning.py
M tests/query_test/test_resource_limits.py
M tests/query_test/test_scanners.py
M tests/stress/test_acid_stress.py
M tests/stress/test_ddl_stress.py
M tests/util/filesystem_utils.py
60 files changed, 275 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/17503/5
--
To view, visit http://gerrit.cloudera.org:8080/17503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idce135a7591d1b4c74425e365525be3086a39821
Gerrit-Change-Number: 17503
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11035: Make x-forwarded-for http header case-insensitive

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18048 )

Change subject: IMPALA-11035: Make x-forwarded-for http header case-insensitive
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9c4070a4a2d5ad9decb186a9219957d8c26a7d7
Gerrit-Change-Number: 18048
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Nov 2021 08:15:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11035: Make x-forwarded-for http header case-insensitive

2021-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18048 )

Change subject: IMPALA-11035: Make x-forwarded-for http header case-insensitive
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7677/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/18048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9c4070a4a2d5ad9decb186a9219957d8c26a7d7
Gerrit-Change-Number: 18048
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 29 Nov 2021 08:15:59 +
Gerrit-HasComments: No