[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9820/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 20 Nov 2021 06:14:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7652/ 
DRY_RUN=true


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 20 Nov 2021 05:52:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after
Compaction

After a compaction happens in Hive (on a Hive ACID table), queries in
Impala may fail with a FileNotFoundException if the files have already
been removed by the Hive cleaner.

In IMPALA-10801, catalogd checks the latest compaction id before serving
metadata. However, coordinators don't take advantage of that.
Coordinators have their own local cache, so we will have to do the
same check for coordinators as well. Besides, we also need to attach
writeIdList to requests that need to fetch file metadata.
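
The coordinator-side check described above can be sketched roughly as follows. This is only an illustration in Python, not the Java change itself: `LocalTableCache`, `CachedFiles`, `fetch_from_catalogd` and `latest_compaction_id` are hypothetical names, and the writeIdList handling is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CachedFiles:
    """File metadata tagged with the compaction id seen when it was loaded."""
    compaction_id: int
    files: List[str]


@dataclass
class LocalTableCache:
    """Hypothetical stand-in for a coordinator's local metadata cache."""
    fetch_from_catalogd: Callable[[str], CachedFiles]
    entries: Dict[str, CachedFiles] = field(default_factory=dict)

    def get_files(self, table: str, latest_compaction_id: int) -> List[str]:
        cached = self.entries.get(table)
        # Reload when nothing is cached yet, or when a newer compaction has
        # run since the entry was loaded (its files may already be cleaned).
        if cached is None or cached.compaction_id < latest_compaction_id:
            cached = self.fetch_from_catalogd(table)
            self.entries[table] = cached
        return cached.files
```

The point of the sketch is that the cached entry is only trusted while its compaction id is still the latest one; otherwise the file list is fetched again.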

Tests:
Added unit tests to CatalogdMetaProviderTest

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
8 files changed, 308 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9819/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 15
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Sat, 20 Nov 2021 03:07:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9818/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Sat, 20 Nov 2021 02:59:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 15:

Fix formatting errors.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 15
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Sat, 20 Nov 2021 02:45:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#15). ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..

[WIP] IMPALA-10992 Planner changes for estimate peak memory

This patch provides planner support to deal with both the current
default and a new large query executor group. Per a preset threshold,
the planner delivers a suitable plan that runs on the default query
executor group if the total estimated per-host memory is no more than
the threshold. Otherwise, it delivers a different plan suitable to
run on the large query executor group.

A new query option 'max_per_host_memory_threshold_for_regular_queries'
is added to define a threshold T. A query whose estimated per-host
memory is <= T runs in the default query executor group. Otherwise, the
query runs in the large query executor group.

A new command-line option 'large_executor_group_size' is added for
command start-impala-cluster.py. It specifies the number of executor
nodes in the large group when the back end starts. Its value must be
smaller than that of the 'cluster_size' parameter. If omitted, the
start script assumes the large executor group does not exist and
every query runs in the default group.
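
The selection rule described above can be sketched as below. This is a minimal illustration, not the planner code: the function name, the group labels and the `large_group_exists` parameter are assumptions made for the example.

```python
DEFAULT_GROUP = "default"
LARGE_GROUP = "large"


def choose_executor_group(estimated_per_host_mem_bytes: int,
                          threshold_bytes: int,
                          large_group_exists: bool = True) -> str:
    """Pick an executor group from a query's estimated per-host memory."""
    # Without a large group, every query runs in the default group.
    if not large_group_exists:
        return DEFAULT_GROUP
    # Estimates at or below the threshold T stay in the default group.
    if estimated_per_host_mem_bytes <= threshold_bytes:
        return DEFAULT_GROUP
    return LARGE_GROUP
```

Note that the comparison is inclusive, so a query whose estimate equals T still runs in the default group.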

Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
---
M be/src/runtime/exec-env.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/backend-gflag-util.cc
M bin/start-impala-cluster.py
M common/protobuf/statestore_service.proto
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/common/IdGenerator.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/ClassUtil.java
M fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java
M fe/src/test/java/org/apache/impala/planner/ClusterSizeTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
32 files changed, 520 insertions(+), 189 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/17994/15
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 15
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 14:

1. Looked into the differences in the number of runtime filters and found that 
the query options set during the normal planning phase get reused in the 
re-planning phase. Fixed it by making a copy and restoring the copy before the 
re-planning phase.

2. Investigated the row_size difference for the UNNEST node (0B expected vs. 13B 
actual): found a bug in UnnestNode.init() where the memory layout is computed 
after computeStats() is called from PlanNode.init(). Fixed the bug and updated 
tpch-nested.test accordingly.
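
The copy-and-restore fix in item 1 follows a pattern along these lines. The sketch is illustrative only: `plan_with_replanning`, `plan` and `needs_replan` are hypothetical names, and query options are modelled as a plain dict.

```python
import copy


def plan_with_replanning(query_options: dict, plan, needs_replan):
    """Plan once; if a re-plan is needed, start again from pristine options."""
    # Snapshot the options first, because planning may modify them in place.
    original_options = copy.deepcopy(query_options)
    first_plan = plan(query_options)
    if not needs_replan(first_plan):
        return first_plan
    # Restore the snapshot so first-pass changes do not leak into the re-plan.
    query_options.clear()
    query_options.update(copy.deepcopy(original_options))
    return plan(query_options)
```

Without the restore step, the second planning pass would see whatever the first pass left behind in the options.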


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Sat, 20 Nov 2021 02:40:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..

[WIP] IMPALA-10992 Planner changes for estimate peak memory

This patch provides planner support to deal with both the current
default and a new large query executor group. Per a preset threshold,
the planner delivers a suitable plan that runs on the default query
executor group if the total estimated per-host memory is no more than
the threshold. Otherwise, it delivers a different plan suitable to
run on the large query executor group.

A new query option 'max_per_host_memory_threshold_for_regular_queries'
is added to define a threshold T. A query whose estimated per-host
memory is <= T runs in the default query executor group. Otherwise, the
query runs in the large query executor group.

A new command-line option 'large_executor_group_size' is added for
command start-impala-cluster.py. It specifies the number of executor
nodes in the large group when the back end starts. Its value must be
smaller than that of the 'cluster_size' parameter. If omitted, the
start script assumes the large executor group does not exist and
every query runs in the default group.

Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
---
M be/src/runtime/exec-env.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/backend-gflag-util.cc
M bin/start-impala-cluster.py
M common/protobuf/statestore_service.proto
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/common/IdGenerator.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/ClassUtil.java
M fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java
M fe/src/test/java/org/apache/impala/planner/ClusterSizeTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
32 files changed, 520 insertions(+), 189 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/17994/14
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17994 )

Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory
..


Patch Set 14:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17994/14/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/17994/14/fe/src/main/java/org/apache/impala/planner/PlanNode.java@612
PS14, Line 612: builder.append("PlanNode[" + getId().asInt()+ "]: Totle # 
of tuples=" + tupleIds_.size() + "\n");
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/17994/14/fe/src/main/java/org/apache/impala/planner/PlanNode.java@615
PS14, Line 615:   builder.append("TupelId=" + tid + ", 
getAvgSerializedSize()=" + desc.getAvgSerializedSize() + "\n");
line too long (106 > 90)


http://gerrit.cloudera.org:8080/#/c/17994/14/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/17994/14/fe/src/main/java/org/apache/impala/service/Frontend.java@1567
PS14, Line 1567: LOG.error("<<< createPartialExecRequestForExecutorGroup() 
for " + kind + ", sql=" + planner.getPlannerCtx().getQueryStmt().toSql());
line too long (136 > 90)



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb
Gerrit-Change-Number: 17994
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Sat, 20 Nov 2021 02:38:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7651/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 20 Nov 2021 00:50:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17983 )

Change subject: IMPALA-10920: Zipping unnest for arrays
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9817/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Number: 17983
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 22:47:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-19 Thread Gabor Kaszab (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17983

to look at the new patch set (#9).

Change subject: IMPALA-10920: Zipping unnest for arrays
..

IMPALA-10920: Zipping unnest for arrays

This patch provides an unnest implementation for arrays where unnesting
multiple arrays in one query results in the items of the arrays being
zipped together instead of joined. Two different syntaxes are
introduced for this purpose:

1: ISO SQL:2016 compliant syntax:
SELECT a1.item, a2.item
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);

2: Postgres compatible syntax:
SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;

Let me show the expected behaviour through the following example:
Inputs: arr1: {1,2,3}, arr2: {11, 12}
After running any of the above queries we expect the following output:
===============
| arr1 | arr2 |
===============
| 1    | 11   |
| 2    | 12   |
| 3    | NULL |
===============

Expected behaviour:
 - When unnesting multiple arrays with zipping unnest then the 'i'th
   item of one array will be put next to the 'i'th item of the other
   arrays in the results.
 - In case the size of the arrays is not the same then the shorter
   arrays will be filled with NULL values up to the size of the longest
   array.
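
The zipping behaviour described above matches what Python's `itertools.zip_longest` does, so a small sketch may help. It only models the semantics (with `None` standing in for NULL) and says nothing about how the UNNEST node is implemented; the function names are made up for the example.

```python
from itertools import product, zip_longest


def zipping_unnest(*arrays):
    """Pair the i-th items of each array, padding shorter arrays with None."""
    return list(zip_longest(*arrays, fillvalue=None))


def joining_unnest(*arrays):
    """For contrast: a joining unnest combines every item with every other."""
    return list(product(*arrays))


arr1, arr2 = [1, 2, 3], [11, 12]
zipped = zipping_unnest(arr1, arr2)  # [(1, 11), (2, 12), (3, None)]
joined = joining_unnest(arr1, arr2)  # 3 * 2 = 6 combinations
```

With the inputs from the example, the zipped result has as many rows as the longest array, while the joined result has one row per combination.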

As a side note, UNNEST is added to Impala's SQL language as a new
keyword. This might interfere with use cases where a resource (db,
table, column, etc.) is named "UNNEST".

Restrictions:
 - It is not allowed to have WHERE filters on an unnested item of an
   array in the same SELECT query. E.g. this is not allowed:
   SELECT arr1.item
   FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5;

   Note that it is allowed to have an outer SELECT around the one
   doing unnests and have a filter there on the unnested items.
 - If there is an outer SELECT filtering on the unnested array's items
   from the inner SELECT, then these predicates won't be pushed down to
   the SCAN node. Instead, they are evaluated in the UNNEST node to
   guarantee result correctness after unnesting.
   Note that this restriction applies only when there are multiple arrays
   being unnested, or in other words when zipping unnest logic is
   required to produce results.
 - It's not allowed to do a zipping and a (traditional) joining unnest
   together in one SELECT query.
 - It's not allowed to perform zipping unnests on arrays from different
   tables.

Testing:
 - Added a bunch of E2E tests to the test suite to cover both syntaxes.
 - Did a manual test run on a table with 1000 rows, 3 array columns
   with size of around 5000 items in each array. I did an unnest on all
   three arrays in one query to see if there are any crashes or
   suspicious slowness when running on this scale.

Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
---
M be/src/exec/unnest-node.cc
M be/src/exec/unnest-node.h
M common/thrift/PlanNodes.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
A fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
A testdata/ComplexTypesTbl/arrays.orc
A testdata/ComplexTypesTbl/arrays.parq
M testdata/data/README
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-from-clause.test
A testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
M tests/query_test/test_nested_types.py
30 files changed, 1,408 insertions(+), 136 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/17983/9
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset

[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18041 )

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..


Patch Set 2:

(5 comments)

Nice work!

http://gerrit.cloudera.org:8080/#/c/18041/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18041/2//COMMIT_MSG@12
PS2, Line 12: but the hosts are only referred by indexes. In the Coordinator's 
local
nit: referred to


http://gerrit.cloudera.org:8080/#/c/18041/2//COMMIT_MSG@17
PS2, Line 17: This patch translates the host index to the coordinators host 
list, so
nit properly


http://gerrit.cloudera.org:8080/#/c/18041/2//COMMIT_MSG@21
PS2, Line 21: manually
nit. I wonder if it is feasible to have a query test to protect this piece of 
work.


http://gerrit.cloudera.org:8080/#/c/18041/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18041/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@458
PS2, Line 458: fd.
may need to check fd is not null.


http://gerrit.cloudera.org:8080/#/c/18041/2/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18041/2/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@111
PS2, Line 111: snapshot
Should we check snapshot is null?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 19 Nov 2021 20:48:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9816/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 18:46:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7651/ 
DRY_RUN=true


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 18:24:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

2021-11-19 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18043 )


Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog 
after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after
Compaction

After a compaction happens in Hive (on a Hive ACID table), queries in
Impala may fail with a FileNotFoundException if the files have already
been removed by the Hive cleaner.

In IMPALA-10801, catalogd checks the latest compaction id before serving
metadata. However, coordinators don't take advantage of that.
Coordinators have their own local cache, so we will have to do the
same check for coordinators as well. Besides, we also need to attach
writeIdList to requests that need to fetch file metadata.

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
8 files changed, 303 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 31:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7650/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Fri, 19 Nov 2021 18:20:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupMapping

2021-11-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18019 )

Change subject: IMPALA-11027: Adding flag to enable support for 
ShellBasedUnixGroupMapping
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18019/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18019/2//COMMIT_MSG@7
PS2, Line 7: ShellBasedUnixGroupMapping
nit: ShellBasedUnixGroupsMapping (missing 's' for Groups)


http://gerrit.cloudera.org:8080/#/c/18019/2/be/src/service/frontend.cc
File be/src/service/frontend.cc:

http://gerrit.cloudera.org:8080/#/c/18019/2/be/src/service/frontend.cc@77
PS2, Line 77: group
nit: maybe 'groups' here as well?


http://gerrit.cloudera.org:8080/#/c/18019/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/18019/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@826
PS2, Line 826: &&
 : BackendConfig.INSTANCE.isShellBasedGroupMappingEnabled()
Shouldn't it be

 !BackendConfig.INSTANCE.isShellBasedGroupMappingEnabled()?

IIUC we want to raise an error when the flag is false.
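
The polarity the reviewer is asking for can be illustrated with a small, self-contained sketch. This is not the code from JniFrontend.java: the thread does not show the surrounding condition, so the class name, the method, and the error text below are hypothetical, and the boolean parameter merely stands in for BackendConfig.INSTANCE.isShellBasedGroupMappingEnabled(). It assumes the reviewer's reading is right, i.e. that an error should be raised only when shell-based group mapping is configured while the flag is false.

```java
// Hypothetical stand-in for the startup check discussed above; the real
// condition in JniFrontend.java is not shown in this thread.
class GroupMappingCheckSketch {
  // Returns an error message when shell-based group mapping is configured
  // but the enabling flag is false; returns an empty string otherwise.
  static String check(String groupMappingClass, boolean shellBasedFlagEnabled) {
    boolean isShellBased =
        groupMappingClass.endsWith("ShellBasedUnixGroupsMapping");
    // Note the negation: the error path is taken when the flag is NOT set.
    if (isShellBased && !shellBasedFlagEnabled) {
      return "ShellBasedUnixGroupsMapping is configured but not enabled";
    }
    return "";
  }

  public static void main(String[] args) {
    String shell = "org.apache.hadoop.security.ShellBasedUnixGroupsMapping";
    System.out.println(check(shell, false).isEmpty()); // flag off: error expected
    System.out.println(check(shell, true).isEmpty());  // flag on: no error
  }
}
```

Without the negation, the error would fire exactly when the operator has opted in, which is the opposite of what the flag name suggests.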



--
To view, visit http://gerrit.cloudera.org:8080/18019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470
Gerrit-Change-Number: 18019
Gerrit-PatchSet: 2
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 19 Nov 2021 17:00:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11031: Listmap.getIndex() name is misleading

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18042 )

Change subject: IMPALA-11031: Listmap.getIndex() name is misleading
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9815/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I689dfb67e1a9104812489d6299ed43446d2fcae8
Gerrit-Change-Number: 18042
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 16:57:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18041 )

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9814/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 19 Nov 2021 16:51:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11031: Listmap.getIndex() name is misleading

2021-11-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18042 )


Change subject: IMPALA-11031: Listmap.getIndex() name is misleading
..

IMPALA-11031: Listmap.getIndex() name is misleading

ListMap.getIndex(t) modifies the ListMap object when there is
no mapping for 't'. Hence its name is very misleading, as
the reader wouldn't expect modifications from a simple getter.

This patch renames it to getOrAddIndex().

Change-Id: I689dfb67e1a9104812489d6299ed43446d2fcae8
---
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
7 files changed, 11 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/18042/1
--
To view, visit http://gerrit.cloudera.org:8080/18042
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I689dfb67e1a9104812489d6299ed43446d2fcae8
Gerrit-Change-Number: 18042
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
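
To make the motivation for the IMPALA-11031 rename concrete, here is a minimal sketch of a list/map pair whose lookup method appends on a miss. It is a simplified stand-in written for illustration, not the actual source of org.apache.impala.util.ListMap; the class name ListMapSketch and the helper methods getEntry() and size() are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration only; not the Impala ListMap implementation.
class ListMapSketch<T> {
  private final List<T> list = new ArrayList<>();
  private final Map<T, Integer> index = new HashMap<>();

  // Returns the index of 't'. If 't' has no mapping yet, it is appended
  // first, so this call can modify the object. The name states that side
  // effect, which a plain "getIndex" would hide.
  int getOrAddIndex(T t) {
    Integer existing = index.get(t);
    if (existing != null) return existing;
    list.add(t);
    int newIndex = list.size() - 1;
    index.put(t, newIndex);
    return newIndex;
  }

  T getEntry(int i) { return list.get(i); }

  int size() { return list.size(); }
}

class ListMapSketchDemo {
  public static void main(String[] args) {
    ListMapSketch<String> hosts = new ListMapSketch<>();
    System.out.println(hosts.size());                  // nothing stored yet
    System.out.println(hosts.getOrAddIndex("host-a")); // miss: appends the entry
    System.out.println(hosts.getOrAddIndex("host-a")); // hit: no modification
    System.out.println(hosts.size());
  }
}
```

A caller who only reads the old name would not expect the first lookup to grow the list, which is the confusion the rename is meant to remove.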


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18041 )

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..


Patch Set 2:

(5 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/18041/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18041/1//COMMIT_MSG@18
PS1, Line 18: block locations remain consistent.
> t
Done


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@455
PS1, Line 455: } else {
> precondition about hostIndex != null?
Done


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1014
PS1, Line 1014: Map pathToFds =
> code in comment
Done


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@500
PS1, Line 500:   new TScanRangeLocation(analyzer.getHostIndex().getIndex(networkAddress)));
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/util/ListMap.java
File fe/src/main/java/org/apache/impala/util/ListMap.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/util/ListMap.java@56
PS1, Line 56:   public int getIndex(T t) {
> can you add a comment about this rename in the comment, or possibly move it
Opened IMPALA-11031



--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 19 Nov 2021 16:31:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Zoltan Borok-Nagy (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18041

to look at the new patch set (#2).

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..

IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local 
catalog mode

When local catalog mode is used, Impala retrieves the Iceberg
snapshot from CatalogD. The response contains a map of the file
descriptors. The file descriptors contain block location information,
but the hosts are only referred to by index. In the Coordinator's local
catalog the host indexes might refer to different hosts than in
CatalogD. This might lead to unnecessary remote reads as scan ranges
are scheduled to random hosts.

This patch translates the host index to the coordinator's host list, so
block locations remain consistent.

Testing:
 * tested manually on a 6-node cluster, and verified that the file
   locations are consistent with HDFS

Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
6 files changed, 64 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/18041/2
--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
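
The index mismatch described in the IMPALA-11022 commit message above can be sketched in a few lines. This is an illustration under stated assumptions, not the patch itself: the class and method names are hypothetical, hosts are plain strings rather than network address objects, and the choice to append a host the coordinator has not seen yet is an assumption. Objects.requireNonNull stands in for the precondition on the host index that was requested in review.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of re-expressing catalogd host indexes in terms of
// the coordinator's own host list.
class HostIndexTranslationSketch {
  static List<Integer> translate(List<Integer> catalogdIndexes,
      List<String> catalogdHosts, List<String> coordinatorHosts) {
    Objects.requireNonNull(coordinatorHosts, "coordinator host list is null");
    List<Integer> result = new ArrayList<>();
    for (int idx : catalogdIndexes) {
      // Resolve the index against the list it was created for.
      String host = catalogdHosts.get(idx);
      // Look the same host up in the coordinator's list.
      int local = coordinatorHosts.indexOf(host);
      if (local < 0) {
        // Assumption: unknown hosts are appended to the coordinator's list.
        coordinatorHosts.add(host);
        local = coordinatorHosts.size() - 1;
      }
      result.add(local);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> catalogdHosts = Arrays.asList("host-a", "host-b", "host-c");
    List<String> coordinatorHosts =
        new ArrayList<>(Arrays.asList("host-b", "host-a"));
    // Index 0 means host-a in catalogd but host-b on the coordinator.
    System.out.println(
        translate(Arrays.asList(0, 2), catalogdHosts, coordinatorHosts));
  }
}
```

Using the untranslated index 0 on the coordinator would point at host-b instead of host-a, which is how scan ranges could end up scheduled away from the hosts that actually hold the blocks.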


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18041 )

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18041/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18041/1//COMMIT_MSG@18
PS1, Line 18: block locations remain consisten.
t


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@455
PS1, Line 455: } else {
precondition about hostIndex != null?


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1014
PS1, Line 1014: //return resp.table_info.getIceberg_snapshot();
code in comment


http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/util/ListMap.java
File fe/src/main/java/org/apache/impala/util/ListMap.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/util/ListMap.java@56
PS1, Line 56:   public int getOrAddIndex(T t) {
can you add a comment about this rename in the comment, or possibly move it to 
a different patch? I agree with the rename as the old name was misleading, but 
it adds some extra noise to the commit



--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 16:14:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18041 )

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9813/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 14:33:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18041 )

Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18041/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@500
PS1, Line 500:   new TScanRangeLocation(analyzer.getHostIndex().getOrAddIndex(networkAddress)));
line too long (93 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 14:12:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local catalog mode

2021-11-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18041 )


Change subject: IMPALA-11022: Impala uses wrong file descriptors for Iceberg 
tables in local catalog mode
..

IMPALA-11022: Impala uses wrong file descriptors for Iceberg tables in local 
catalog mode

When local catalog mode is used, Impala retrieves the Iceberg
snapshot from CatalogD. The response contains a map of the file
descriptors. The file descriptors contain block location information,
but the hosts are only referred to by index. In the Coordinator's local
catalog the host indexes might refer to different hosts than in
CatalogD. This might lead to unnecessary remote reads as scan ranges
are scheduled to random hosts.

This patch translates the host index to the coordinator's host list, so
block locations remain consisten.

Testing:
 * tested manually on a 6-node cluster, and verified that the file
   locations are consistent with HDFS

Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
13 files changed, 74 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/18041/1
--
To view, visit http://gerrit.cloudera.org:8080/18041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I253b505846e1cf4d1be445c0d06b2552dc4ba1f8
Gerrit-Change-Number: 18041
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 31:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7650/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Fri, 19 Nov 2021 12:07:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last synced event id for incremental reload

2021-11-19 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has abandoned this change. ( 
http://gerrit.cloudera.org:8080/18006 )

Change subject: [DO NOT MERGE]: IMPALA-10926: Test patchset: 25 - set last 
synced event id for incremental reload
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/18006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: If0f52186a14fb1097f26b88674c7e7cede4f68b4
Gerrit-Change-Number: 18006
Gerrit-PatchSet: 6
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10920: Zipping unnest for arrays

2021-11-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17983 )

Change subject: IMPALA-10920: Zipping unnest for arrays
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17983/8/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
File fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/17983/8/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@753
PS8, Line 753: id", "int_struct_col
This doesn't look good, as none of the columns is int_array_col, so the 
query shouldn't be allowed.


http://gerrit.cloudera.org:8080/#/c/17983/8/testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test
File testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test:

http://gerrit.cloudera.org:8080/#/c/17983/8/testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test@174
PS8, Line 174:  QUERY
Is it valid to add multiple  QUERY sections in a test case? Is it possible 
that only one will be executed?



--
To view, visit http://gerrit.cloudera.org:8080/17983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Gerrit-Change-Number: 17983
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 09:14:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11028: Table loading can fail when events are cleaned up

2021-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18038 )

Change subject: IMPALA-11028: Table loading can fail when events are cleaned up
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95e5e20e1a2086688a92abdfb28e89177e996a1a
Gerrit-Change-Number: 18038
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Nov 2021 09:11:00 +
Gerrit-HasComments: No