Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 )
Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15
PS1, Line 15: skipped for all other changes in S
> yeah, it seems very unlikely that a new field will need file metadata reloa
Could you address this comment? The current patch sets skipFileMetadataReload_ to false only in known cases: location changes, inputFormat changes, and non-trivial changes in storedAsSubDirectories. For any other, unknown SD change, skipFileMetadataReload_ will be true, which is unsafe. What we suggested is the opposite: set skipFileMetadataReload_ to true only in known cases of SD changes, e.g. trivial changes in storedAsSubDirectories.


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788
PS1, Line 1788: !Objects.equals(beforeSd.getSerdeInfo(), afterSd.getSerdeInfo())) {
Is it correct to skip schema reload for avro tables? In HdfsTable#load(), 'loadTableSchema' is used in two places:

      if (loadTableSchema) {
        // set nullPartitionKeyValue from the hive conf.
        nullPartitionKeyValue_ =
            MetaStoreUtil.getNullPartitionKeyValue(client).intern();
        loadSchema(msTbl);
        loadAllColumnStats(client);
        loadConstraintsInfo(client, msTbl);
      }

      if (loadTableSchema) setAvroSchema(client, msTbl);

The second usage is for avro tables and it uses the InputFormat and SerdeInfo.
https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1302
https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1819


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1889
PS1, Line 1889: return true;
Please add a log for this.


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3060
PS1, Line 3060: // Test 2: set rowformat
It'd be helpful to add logs between different cases, e.g.

  LOG.info("Test changes in rowformat");

So when analyzing logs/fe_tests/FeSupport.INFO, we get the log boundaries easily.


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3080
PS1, Line 3080: hmsTbl.getSd().setStoredAsSubDirectories(false);
I added some debug logs and found this field is already set to false before this. So this changes nothing. We'd better use unsetStoredAsSubDirectories() to actually make changes.


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3084
PS1, Line 3084: Test 4
nit: "Test 5". Please add more test cases like from true to false/unset.

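To illustrate the suggestion on the commit message (reload by default, skip only for SD changes known to be trivial), here is a minimal sketch. It is not the patch's code: SdSketch, SkipReloadSketch and their fields are hypothetical stand-ins for StorageDescriptor, and treating "unset" and "false" as equivalent for storedAsSubDirectories is an assumption about what counts as a trivial change.

```java
import java.util.Objects;

// Hypothetical stand-in for the StorageDescriptor fields discussed in this review.
final class SdSketch {
  final String location;
  final String inputFormat;
  // null models the "unset" state of the optional field (assumption).
  final Boolean storedAsSubDirectories;
  // Stands for any SD field the check does not know about explicitly.
  final String otherField;

  SdSketch(String location, String inputFormat, Boolean storedAsSubDirectories,
      String otherField) {
    this.location = location;
    this.inputFormat = inputFormat;
    this.storedAsSubDirectories = storedAsSubDirectories;
    this.otherField = otherField;
  }
}

final class SkipReloadSketch {
  // Assumption: unset and false mean the same thing, so moving between them
  // is trivial. Any transition involving true is not.
  static boolean isTrivialSubDirChange(Boolean before, Boolean after) {
    if (Objects.equals(before, after)) return true;
    return !Boolean.TRUE.equals(before) && !Boolean.TRUE.equals(after);
  }

  // Safe default: return true (skip the reload) only when every other field
  // is unchanged and the storedAsSubDirectories change is known to be trivial.
  // Any unknown difference falls through to "reload".
  static boolean canSkipFileMetadataReload(SdSketch before, SdSketch after) {
    boolean restUnchanged = Objects.equals(before.location, after.location)
        && Objects.equals(before.inputFormat, after.inputFormat)
        && Objects.equals(before.otherField, after.otherField);
    if (!restUnchanged) return false;
    return isTrivialSubDirChange(
        before.storedAsSubDirectories, after.storedAsSubDirectories);
  }
}
```

With this shape, a new or unrecognized SD difference leads to a reload instead of silently skipping one, and the true to false/unset transitions requested for the tests map to the non-trivial branch.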
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Feb 2024 07:32:39 +0000
Gerrit-HasComments: Yes