[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 07 May 2024 01:56:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.
See details in CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta().

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Reviewed-on: http://gerrit.cloudera.org:8080/21326
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 262 insertions(+), 54 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 7:

> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10606/

The failure is due to IMPALA-9441. Retrigger the job.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 20:43:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10610/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 20:44:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 20:44:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10606/


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 15:50:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 10:41:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10606/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 10:43:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 06 May 2024 10:43:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-29 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 6: Code-Review+1

(1 comment)

LGTM

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14
PS5, Line 14: When catalogd collects the next
: catalog update, they will be collected. HdfsTable then clears the 
set.
> These are about CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta(). Ad
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 29 Apr 2024 22:39:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16045/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sun, 28 Apr 2024 07:47:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-28 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14
PS5, Line 14: When catalogd collects the next
: catalog update, they will be collected. HdfsTable then clears the 
set.
> nit: can you make it more clear?
These are about CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta(). Added 
in the commit message. Is it better now?


http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@30
PS5, Line 30:
> nit: unused import
Done



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sun, 28 Apr 2024 07:24:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-28 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, k.venureddy2...@gmail.com, Sai Hemanth Gantasala, Joe 
McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21326

to look at the new patch set (#6).

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.
See details in CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta().

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 262 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21326/6
--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-26 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14
PS5, Line 14: When catalogd collects the next
: catalog update, they will be collected. HdfsTable then clears the 
set.
nit: can you make it more clear?


http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@30
PS5, Line 30: import java.util.stream.Collectors;
nit: unused import



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 26 Apr 2024 21:26:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-25 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1125
PS5, Line 1125: collected from a new version
> > If the partition was readded, shouldn't that operation also remove it fro
Thanks for the explanation!



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 26 Apr 2024 05:57:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-23 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1125
PS5, Line 1125: collected from a new version
> If the partition was readded, shouldn't that operation also remove it from 
> dropped_partitions?

I think you mean 'droppedPartitions' of HdfsTable instead of 
'dropped_partitions' of THdfsTable which never changes when it's added to the 
deleteLog. For 'droppedPartitions' of HdfsTable, we haven't done that yet. 
Currently, it only adds new items in HdfsTable#dropPartition()
https://github.com/apache/impala/blob/9b05a205fec397fa1e19ae467b1cc406ca43d948/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1146
We can update it in HdfsTable#addPartitionNoThrow() when a partition is 
re-added. But that only helps when dropping and re-adding a partition on the 
same HdfsTable object. That comes to the other question.

> How could the catalog collect the new version of the partition before 
> collecting the deletion of the partition?

An example is the following sequence:
#1 DropPartition addes the partition to 'droppedPartitions' of HdfsTable
#2 InvalidateTable replaces the HdfsTable with an IncompleteTable and adds the 
THdfsTable object into the deleteLog. The 'dropped_partitions' of this 
THdfsTable object will have a THdfsPartition object representing this partition.
https://github.com/apache/impala/blob/9b05a205fec397fa1e19ae467b1cc406ca43d948/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2363
#3 The table is loaded again so the IncompleteTable is replaced with a new 
HdfsTable object.
#4 AddPartition adds a new HdfsPartition instance (but the same partition name) 
to the new HdfsTable object.

If all these happens in a catalog update cycle, i.e. catalogd collects last 
round of catalog updates before #1, catalogd will first collect both the table 
and partition updates at L1013, then collects deletions based on the deleteLog 
at L1039 and come here.

PS5 adds a test case (Test 2) for this: 
https://gerrit.cloudera.org/c/21326/4..5/tests/custom_cluster/test_partition.py



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 23 Apr 2024 07:42:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1125
PS5, Line 1125: collected from a new version
How could the catalog collect the new version of the partition before 
collecting the deletion of the partition? If the partition was readded, 
shouldn't that operation also remove it from dropped_partitions?



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 23 Apr 2024 06:14:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 22 Apr 2024 06:51:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10571/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 22 Apr 2024 01:50:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15969/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 20 Apr 2024 08:23:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-20 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21326

to look at the new patch set (#5).

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 263 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21326/5
--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 20 Apr 2024 04:00:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10564/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Apr 2024 22:57:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15958/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Apr 2024 12:17:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-19 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21326/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1048
PS3, Line 1048: TCatalogObject catalog =
Extracted this part into a method.


http://gerrit.cloudera.org:8080/#/c/21326/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1082
PS3, Line 1082:   return;
We should only add the deletion if it's not in the updates (same as L1065), 
i.e. if there is a new version of the partition collected, don't add the 
deletion.

When a partition is updated, the old version is added to the droppedPartitions 
set, while the new version is in partitionsMap. When the table is invalidated 
and then reloaded, the old HdfsTable instance will be added to the deleteLog. 
If these all happen in a catalog update cycle, catalogd will collect updates of 
the partition. We should ignore its old version which is inside the 
droppedPartitions of the old HdfsTable instance.

This causes tests like 
metadata/test_recover_partitions.py::TestRecoverPartitions::test_post_invalidate
 hanging forever.



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Apr 2024 11:53:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-19 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21326

to look at the new patch set (#4).

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 219 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21326/4
--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10557/


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 23:27:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10557/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 13:26:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15939/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 11:13:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21326

to look at the new patch set (#3).

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 151 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21326/3
--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 2:

(1 comment)

Thanks for the quick review!

http://gerrit.cloudera.org:8080/#/c/21326/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1075
PS2, Line 1075: if (addedPartNames.contains(part.partition_name)) 
continue;
> What does this case means? The partition was dropped, but was readded later
Yeah, if a partition is dropped and then re-added, the droppedPartitions will 
have the old instance and the partitionMap will have the new instance. When the 
table is dropped/invalidated, partitions from the partitionMap are collected in 
the for-loop at L1057. Some of them could have the same partition name as those 
in the dropped_partitions.

Renamed 'addedPartNames' to 'collectedPartNames' to avoid confusion.



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 10:49:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21326/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1075
PS2, Line 1075: if (addedPartNames.contains(part.partition_name)) 
continue;
What does this case means? The partition was dropped, but was readded later?



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 10:17:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15938/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 09:52:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21326/1/tests/custom_cluster/test_partition.py
File tests/custom_cluster/test_partition.py:

http://gerrit.cloudera.org:8080/#/c/21326/1/tests/custom_cluster/test_partition.py@93
PS1, Line 93: T
> flake8: F821 undefined name 'TestPartitionMetadata'
Done


http://gerrit.cloudera.org:8080/#/c/21326/1/tests/custom_cluster/test_partition.py@98
PS1, Line 98:
> flake8: W504 line break after binary operator
Done



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 09:48:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21326

to look at the new patch set (#2).

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 148 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21326/2
--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15937/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Apr 2024 09:25:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21326


Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..

IMPALA-13009: Fix catalogd not sending deletion updates for some dropped 
partitions

*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is not
reused anymore. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key will be reused then
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.

Removes the Precondition check in coordinator and just reports the stale
partitions since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_partition.py
M tests/metadata/test_recover_partitions.py
8 files changed, 148 insertions(+), 19 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21326/1
--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21326/1/tests/custom_cluster/test_partition.py
File tests/custom_cluster/test_partition.py:

http://gerrit.cloudera.org:8080/#/c/21326/1/tests/custom_cluster/test_partition.py@93
PS1, Line 93: T
flake8: F821 undefined name 'TestPartitionMetadata'


http://gerrit.cloudera.org:8080/#/c/21326/1/tests/custom_cluster/test_partition.py@98
PS1, Line 98: a
flake8: W504 line break after binary operator



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Apr 2024 09:01:29 +
Gerrit-HasComments: Yes