[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Reviewed-on: http://gerrit.cloudera.org:8080/21019 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 186 insertions(+), 19 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 15 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 14: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 14 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 27 Mar 2024 08:27:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 14 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 27 Mar 2024 03:22:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10437/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 14 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 27 Mar 2024 03:22:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: Code-Review+2 Carry +1 from Csaba -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 27 Mar 2024 03:22:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 08:04:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 05:00:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10430/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 03:04:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15669/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 01:20:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 01:07:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1966 PS12, Line 1966: // Check if the trivial SD properties are changed during the alter statement. > nit: Please add a comment saying callers should make sure 'beforeSd' and 'a Done http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1969 PS12, Line 1969: StorageDescriptor afterSd) { : Preconditions.checkNotNull(beforeSd, > nit: please add error messages like Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 00:54:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#13). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 186 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/13 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 12: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1966 PS12, Line 1966: // Check if the trivial SD properties are changed during the alter statement nit: Please add a comment saying callers should make sure 'beforeSd' and 'afterSd' are not equal. http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1969 PS12, Line 1969: Preconditions.checkNotNull(beforeSd); : Preconditions.checkNotNull(afterSd); nit: please add error messages like Preconditions.checkNotNull(beforeSd, "beforeSd is null"); Preconditions.checkNotNull(afterSd, "afterSd is null"); -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 23:15:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10426/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 20:00:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15660/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 19:59:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 12: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1903 PS11, Line 1903: : } else if (!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable, : > nit: This hasn't been removed yet. My bad!! Will remove it once and for all. http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971 PS11, Line 1971: rageDescriptor previousSD = normalize > It seems this can be a Precondition check. Done http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1981 PS11, Line 1981: return true; > Can we add a log saying file metadata reload can be skipped? Done http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@202 PS11, Line 202: private static > nit: add "final" Done http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3164 PS11, Line 3164: unse > nit: "unset" Ack -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 19:41:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#12). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 185 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/12 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 11: Code-Review+1 (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1903 PS11, Line 1903: else { : skipFileMetadata = false; : } nit: This hasn't been removed yet. http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971 PS11, Line 1971: (beforeSd != null && afterSd != null) It seems this can be a Precondition check. http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1981 PS11, Line 1981: return true; Can we add a log saying file metadata reload can be skipped? http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@202 PS11, Line 202: private static nit: add "final" http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3164 PS11, Line 3164: true nit: "unset" -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 12:37:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15617/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 19:55:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899 PS8, Line 1899: ipFileMetadata = true; > Nice catch! Please also add test case for this. Ack http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1901 PS10, Line 1901: f (isTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd())) { : skipFileMetadata = true; : } > nit: don't need this else-branch since 'skipFileMetadata' is already false. Done http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1975 PS10, Line 1975: i > nit: add a space after comma Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 19:30:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#11). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 185 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/11 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 10: Code-Review+1 lgtm besides Quanlong's comments -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 08:10:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899 PS8, Line 1899: (isTrivialSdPropsChang > Good point! This is what I'm missing. We should check for SD changes only i Nice catch! Please also add test case for this. http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1901 PS10, Line 1901: else { : skipFileMetadata = false; : } nit: don't need this else-branch since 'skipFileMetadata' is already false. http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1975 PS10, Line 1975: , nit: add a space after comma -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 06:53:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15598/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 00:34:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#10). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 173 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/10 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 9: (6 comments) http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899 PS8, Line 1899: (isTrivialSdPropsChang > Can't this return true when the SD didn't change at all but there are other Good point! This is what I'm missing. We should check for SD changes only if we notice any changes in SD. http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1963 PS8, Line 1963: > Is it possible that on is null while the other is not? Shouldn't the functi I have addressed this at L#1899 http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1965 PS8, Line 1965: // Check if the trivial SD properties are changed during the alter statement > nit: this looks a bit unusual in Impala, can you split it to to two lines? Ack http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1967 PS8, Line 1967: StorageDescriptor afterSd) { > non-trivial could be added to the log message Done http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1977 PS8, Line 1977: > I think that it would make the intention of the function clearer to just st Ack http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1987 PS8, Line 1987: sd.unsetCols(); > Can you add a comment for this? The previous lines just unset the propertie Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 00:12:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15594/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 00:12:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#9). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 172 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/9 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/21019/9/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/9/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1904 PS9, Line 1904: } else if (!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable, afterTable)) { line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 20 Mar 2024 23:50:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: (6 comments) http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899 PS8, Line 1899: isTrivialSdPropsChanged Can't this return true when the SD didn't change at all but there are other changes that would warrant a file metadata reload? Or the Sd property won't be set in that case? http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1963 PS8, Line 1963: if (beforeSd != null && afterSd != null) { Is it possible that on is null while the other is not? Shouldn't the function return false in that case? http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1965 PS8, Line 1965: currentSD = normalizeStorageDescriptor(afterSd.deepCopy()); nit: this looks a bit unusual in Impala, can you split it to to two lines? http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1967 PS8, Line 1967: infoLog("Change in table storage descriptor (SD) detected for table {}.{}. " + non-trivial could be added to the log message http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1977 PS8, Line 1977: unsetting I think that it would make the intention of the function clearer to just state that it unsets properties where the change is considered trivial. http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1987 PS8, Line 1987: if (!sd.isSetStoredAsSubDirectories()) { Can you add a comment for this? The previous lines just unset the properties, while in this case we want null and 'false' to be considered the same, but 'true' will be considered different. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 20 Mar 2024 13:46:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 20 Mar 2024 09:01:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10392/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 20 Mar 2024 04:14:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10385/ -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 19 Mar 2024 23:26:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10385/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 19 Mar 2024 18:34:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15564/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 19 Mar 2024 18:14:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1922 PS7, Line 1922: infoLog("Change in table schema detected for table {}.{} from {} ({}) " + : "to {} ({}). So file metadata reload can be skipped.", dbName_, tblName_, : beforeCols.get(i).getName(), beforeCols.get(i).getType(), : afterCols.get(i).getName(), afterCols.get(i).getType()); : return true; > nit: might be a bit clearer to refactor this Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 19 Mar 2024 17:51:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#8). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 161 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/8 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 7: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1922 PS7, Line 1922: infoLog("Change in table schema detected for table {}.{} from {} to {} ", : tblName_, dbName_, beforeCols.get(i).getName() + " (" + : beforeCols.get(i).getType() +")", afterCols.get(i).getName() + " (" + : afterCols.get(i).getType() + "). So file metadata reload can be " + : "skipped."); nit: might be a bit clearer to refactor this infoLog("Change in table schema detected for table {}.{} from {} ({}) " + "to {} ({}). So file metadata reload can be skipped.", tblName_, dbName_, beforeCols.get(i).getName(), beforeCols.get(i).getType(), afterCols.get(i).getName(), afterCols.get(i).getType()); -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 18 Mar 2024 12:30:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15526/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 15 Mar 2024 05:34:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@14 PS6, Line 14: . The > nit: remove "=true" ? Done http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@15 PS6, Line 15: : > nit: this is stale now Ack http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1936 PS6, Line 1936: "newOwner: {}. So file metadata reload can be skipped.", > nit: let's also add "So file metadata should be reloaded" here. Done http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971 PS6, Line 1971: return false; > So we return false for other unknown cases. In the future if a new field is In the latest patch, I have addressed this comment. Please let me know if that looks good. http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3050 PS6, Line 3050: assertNotEquals(fileMetadataLoadAfter, fileMetadataLoadBefore); > Why do we change this? fileMetadataLoadAfter >= fileMetadataLoadBefore is a issue with reverting back to old code. Will do it properly in the next patch set. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 15 Mar 2024 05:10:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#7). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 159 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/7 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@14 PS6, Line 14: =true nit: remove "=true" ? http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@15 PS6, Line 15: Also introduced a small : optimization to skip reloading of table schema nit: this is stale now http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1936 PS6, Line 1936: "whitelisted config: {}, value before: {}, value after: {}", dbName_, nit: let's also add "So file metadata should be reloaded" here. http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971 PS6, Line 1971: return false; So we return false for other unknown cases. In the future if a new field is added to the definition of SD (e.g. in a higher Hive version) and the change of it is non-trivial (requires file metadata reloading), it will be skipped here (considered as trivial change). This is the case mentioned by Csaba that we need to address. As this method is only used as !isNonTrivialSdPropsChanged() at line 1886, we can convert it to isTrivialSdPropsChanged() and only return true for known cases (i.e. changes on location, input/output format, serde, storedAsSubDirectories). The difference is that canSkipFileMetadataReload() will only return true for known cases. http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3050 PS6, Line 3050: assertNotEquals(fileMetadataLoadAfter + 1, fileMetadataLoadBefore); Why do we change this? fileMetadataLoadAfter >= fileMetadataLoadBefore is always true, so fileMetadataLoadAfter + 1 != fileMetadataLoadBefore is always true. The assertion becomes meaningless. Do you want to use assertEquals(fileMetadataLoadAfter, fileMetadataLoadBefore + 1) instead? -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 11:13:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15494/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:52:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785 PS4, Line 1785: skipFileMetadata > nit: "skipFileMetadataReload" sounds better Done http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792 PS4, Line 1792: if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd( { : skipFileMetadata = true; : } else { : skipFileMetadata = false; : } > This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of w I'm reverting back to patchset 2's logic of skipping file metadata. Reason being, it is not common to change two different things in the same alter table query. I have tried different combinations of tblproperties/Sd properties/schema change and owner change in the same alter table query and I cannot seem to find the right logic that always works. So I think users would have to set file_metadata_reload="" if two options are changing in single alter table statement. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860 PS4, Line 1860: file metadata reload can be skipped > Shouldn't this be "file metadata should be reloaded"? Sorry my bad. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873 PS4, Line 1873: table schema reload can be skipped > nit: "file metadata should be reloaded" Ack http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035 PS4, Line 3035: > Why do we remove existing tests in testAlterTableNoFileMetadataReload()? Good catch. This is not intentional. I'll revert it back. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:29:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#6). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 149 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/6 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785 PS4, Line 1785: skipFileMetadata nit: "skipFileMetadataReload" sounds better http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792 PS4, Line 1792: if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd( { : skipFileMetadata = true; : } else { : skipFileMetadata = false; : } This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of what its previous value is. When the table owner changed and there are only trivial SD props changed, shouldn't we have skipFileMetadata = true? The current code set skipFileMetadata to false finally. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860 PS4, Line 1860: file metadata reload can be skipped Shouldn't this be "file metadata should be reloaded"? http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873 PS4, Line 1873: table schema reload can be skipped nit: "file metadata should be reloaded" http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035 PS4, Line 3035: Why do we remove existing tests in testAlterTableNoFileMetadataReload()? -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 08:09:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15477/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 11 Mar 2024 23:20:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (1 comment) > When testing IMPALA-10976, did you find any test failure if we don't skip > reloading table schema? I have seen some failures but I don't remember now. I think what we can do for now is to keep the skipping file metadata reload logic and remove the table schema skipping reload logic and we'll address that in IMPALA-10976. Is that ok with you? http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788 PS1, Line 1788: } > Thanks for pointing to the existing codes. I found it's an existing issue t Thanks for pointing out this issue. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 11 Mar 2024 22:56:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#4). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 156 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/4 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 3: (1 comment) > Patch Set 2: > > > Patch Set 2: > > > > (2 comments) > > > > Skip reloading file metadata looks good to me. But I'm uncertain whether > > skip reloading table schema is safe. Maybe we should compare the whole > > msTable object instead of just checking several fields. > > Comparing the whole msTable object is not ideal because certain fields like > change columns, change owner, may require required table metadata reload but > certain other properties in serde like change location, and change row/file > format don't need table metadata reload. This is much more evident when we > use this patch in conjunction with IMPALA-10976. I found a bug that changing the fileformat to AVRO does require table schema reload. Details in the comments. I think the main purpose for this JIRA is to skip reloading file metadata which is expensive. Skip reloading the table schema is a minor optimization and brings risks. I'm not sure if there are other bugs like IMPALA-12889. So keeps reloading the table schema seems safer to me. When testing IMPALA-10976, did you find any test failure if we don't skip reloading table schema? http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788 PS1, Line 1788: !Objects.equals(beforeSd.getSerdeInfo(), afterSd.getSerdeInfo())) { > This behavior is consistent with the CatalogOpExecutor (i.e., command execu Thanks for pointing to the existing codes. I found it's an existing issue that avro schema is not updated correctly when changing file format to AVRO. Filed IMPALA-12889. The following scenario works before this patch: 1. Create a non-avro table with 'avro.schema.url' set to a valid schema file. impala> create table my_part_tbl(i int) partitioned by (p int) stored as parquet; impala> alter table my_part_tbl set tblproperties('avro.schema.url'='hdfs:test-warehouse/avro_schemas/functional/alltypes.json'); 2. In HIVE, change the file format to AVRO hive> alter table my_part_tbl set fileformat avro; 3. After Impala applies the ALTER_TABLE event, the schema should be the same: hive> describe my_part_tbl; +--++--+ | col_name | data_type | comment | +--++--+ | id | int| | | bool_col | boolean| | | tinyint_col | int| | | smallint_col | int| | | int_col | int| | | bigint_col | bigint | | | float_col| float | | | double_col | double | | | date_string_col | string | | | string_col | string | | | timestamp_col| string | | | p| int| | | | NULL | NULL | | # Partition Information | NULL | NULL | | # col_name | data_type | comment | | p| int| | +--++--+ If I apply the current patch, Impala will keep using the old schema even after applying the ALTER_TABLE event. impala> describe my_part_tbl +--+--+-+ | name | type | comment | +--+--+-+ | i| int | | | p| int | | +--+--+-+ -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 11 Mar 2024 12:27:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15468/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 23:02:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1890 PS2, Line 1890: AsSubDirectories(), > How can we make sure 'SetStoredAsSubDirectories' is the only change? It see At least from end-to-end we cannot change schema and this property in a single statement. so we can rule out the end-user use case. There is a possibility that the client can do this. I addressed this in the latest patch set. Please let me know if you are ok with that. http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3104 PS2, Line 3104: > From unset to false is also an important case we want to fix. Can we add it Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 22:36:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#3). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 197 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/3 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 2: > Patch Set 2: > > (2 comments) > > Skip reloading file metadata looks good to me. But I'm uncertain whether skip > reloading table schema is safe. Maybe we should compare the whole msTable > object instead of just checking several fields. Comparing the whole msTable object is not ideal because certain fields like change columns, change owner, may require required table metadata reload but certain other properties in serde like change location, and change row/file format don't need table metadata reload. This is much more evident when we use this patch in conjunction with IMPALA-10976. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 22:34:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 2: (2 comments) Skip reloading file metadata looks good to me. But I'm uncertain whether skip reloading table schema is safe. Maybe we should compare the whole msTable object instead of just checking several fields. http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1890 PS2, Line 1890: So table schema reload can be skipped. How can we make sure 'SetStoredAsSubDirectories' is the only change? It seems HMS API allows giving an arbitrary Table object. Is it possible that a client modifies the schema and this field in one RPC thus generates one ALTER_TABLE event? http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3104 PS2, Line 3104: } >From unset to false is also an important case we want to fix. Can we add it as >well? -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 08:57:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15258/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 21 Feb 2024 00:28:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#2). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 167 insertions(+), 53 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/2 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15 PS1, Line 15: skipped for all other changes in S > Could you address this comment? The current patch sets skipFileMetadataRelo Addressed in the latest patch. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788 PS1, Line 1788: !Objects.equals(beforeSd.getSerdeInfo(), afterSd.getSerdeInfo())) { > Is it correct to skip schema reload for avro tables? In HdfsTable#load(), ' This behavior is consistent with the CatalogOpExecutor (i.e., command executed from impala) path. https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1193-L1204. We set boolean reloadTableSchema = false; and We won't reload table schema if file format, row format or location is changed. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1889 PS1, Line 1889: return true; > Please add a log for this. Ack http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3060 PS1, Line 3060: // Test 2: set rowformat > It'd be helpful to add logs between different cases, e.g. Ack http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3080 PS1, Line 3080: hmsTbl.getSd().setStoredAsSubDirectories(false); > I added some debug logs and found this field is already set to false before Ack http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3084 PS1, Line 3084: Test 4 > nit: "Test 5". Please add more test cases like from true to false/unset. Ack -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 21 Feb 2024 00:02:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15 PS1, Line 15: skipped for all other changes in S > yeah, it seems very unlikely that a new field will need file metadata reloa Could you address this comment? The current patch sets skipFileMetadataReload_ to false in known cases like location changes, inputFormat changes, storedAsSubDirectories non-trivial changes. For other unknown cases in SD changes, skipFileMetadataReload_ will be true, which is unsafe. What we suggested is setting skipFileMetadataReload_ to true in known cases of SD changes, e.g. trivial changes in storedAsSubDirectories. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788 PS1, Line 1788: !Objects.equals(beforeSd.getSerdeInfo(), afterSd.getSerdeInfo())) { Is it correct to skip schema reload for avro tables? In HdfsTable#load(), 'loadTableSchema' is used in two places: if (loadTableSchema) { // set nullPartitionKeyValue from the hive conf. nullPartitionKeyValue_ = MetaStoreUtil.getNullPartitionKeyValue(client).intern(); loadSchema(msTbl); loadAllColumnStats(client); loadConstraintsInfo(client, msTbl); } if (loadTableSchema) setAvroSchema(client, msTbl); The second usage is for avro tables and it uses the InputFormat and SerdeInfo. https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1302 https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1819 http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1889 PS1, Line 1889: return true; Please add a log for this. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3060 PS1, Line 3060: // Test 2: set rowformat It'd be helpful to add logs between different cases, e.g. LOG.info("Test changes in rowformat"); So when analyzing logs/fe_tests/FeSupport.INFO, we get the log boundaries easily. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3080 PS1, Line 3080: hmsTbl.getSd().setStoredAsSubDirectories(false); I added some debug logs and found this field is already set to false before this. So this changes nothing. We'd better use unsetStoredAsSubDirectories() to actually make changes. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3084 PS1, Line 3084: Test 4 nit: "Test 5". Please add more test cases like from true to false/unset. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 20 Feb 2024 07:32:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2705 PS1, Line 2705: tbl.setLastSyncedEventId(currentHmsEventId); > wasn't this a bug before this change? Yeah, correct. I have seen a couple of test failures when syncToLatestEventId is enabled (while testing IMPALA-10976) and thought this was the right place to fix it. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 13 Feb 2024 19:56:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@8 PS1, Line 8: trivial changes in StorageDescriptor > Fair thought!! I don't have an answer for this. ack, if storedAsSubDirectories can change between unert and false frequently, then this seems useful http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15 PS1, Line 15: skipped for all other changes in S > Agree with you. It is very unlikely that a new field added in SD would requ yeah, it seems very unlikely that a new field will need file metadata reload, but it also unlikely that it will be changed frequently. I would opt for safety here and allow difference only in selected members, if it's not very hard to implement it. http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2705 PS1, Line 2705: tbl.setLastSyncedEventId(currentHmsEventId); wasn't this a bug before this change? what happened if isSkipFileMetadataReload was true syncToLatestEventId was also true? In this case we shouldn't skip till the current HMS event, as this may skip file modifying events -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 13 Feb 2024 16:23:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 12 Feb 2024 08:18:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@8 PS1, Line 8: trivial changes in StorageDescriptor > high level design question (which I should have asked in the Jira earlier): Fair thought!! I don't have an answer for this. Maybe Quanlong might have some thoughts about this. My thought is that maybe we can hide this optimization behind a config. http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15 PS1, Line 15: skipped for all other changes in S > It may be safer to have a set of members that are safe to ignore. What happ Agree with you. It is very unlikely that a new field added in SD would require file metadata reload. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 12 Feb 2024 03:46:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10267/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 12 Feb 2024 03:47:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@8 PS1, Line 8: trivial changes in StorageDescriptor > high level design question (which I should have asked in the Jira earlier): We saw storedAsSubDirectories changes from unset to "false" or "false" to unset in a customer env. It seems some Spark daily jobs could modify this parameter. http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15 PS1, Line 15: skipped for all other changes in S > It may be safer to have a set of members that are safe to ignore. What happ +1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Sat, 10 Feb 2024 13:37:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@8 PS1, Line 8: trivial changes in StorageDescriptor high level design question (which I should have asked in the Jira earlier): what changes do we expect in StorageDescriptor do happen frequently? If these changes are expected to be very rare, then I am not sure that it worth the added complexity and risk to optimize them. IMPALA-11534 was critical as some table properties can change on every insert, which can be very frequent. Is there a reason why StorageDescriptor should change regularly, e.g. every day? http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15 PS1, Line 15: skipped for all other changes in S It may be safer to have a set of members that are safe to ignore. What happens if a member is added to StorageDescriptor in Hive? It sounds unlikely to add a new member that changes the expected directory structure, but I also can't exclude it completely. It seems better to reload files for unknown members, and later add them to the ignore list if it turns out that they don't need reload. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 08 Feb 2024 10:37:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15208/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 08 Feb 2024 02:30:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 1: Can you please review my patch? Thanks. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 08 Feb 2024 02:06:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21019 Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 146 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/1 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala