Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21019/1//COMMIT_MSG@15
PS1, Line 15: skipped for all other changes in S
> yeah, it seems very unlikely that a new field will need file metadata reloa
Could you address this comment? The current patch sets skipFileMetadataReload_
to false in known cases such as location changes, inputFormat changes, and
non-trivial changes to storedAsSubDirectories. For any other SD change that we
haven't anticipated, skipFileMetadataReload_ will be true, which is unsafe.

What we suggested is the opposite: set skipFileMetadataReload_ to true only in
known cases of SD changes, e.g. trivial changes in storedAsSubDirectories.
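
To make the suggestion concrete, here is a rough sketch of the allowlist idea.
It is not the patch's code: Sd is a hypothetical, trimmed-down stand-in for the
HMS StorageDescriptor, canSkipFileMetadataReload is an illustrative name, and
treating "unset" and "false" as equivalent is an assumption made for the
example. Differences known to be trivial are erased first, and everything else
must be equal, so an unanticipated change falls through to a reload.

```java
import java.util.Objects;

// Sketch only: Sd is a hypothetical, trimmed-down stand-in for the HMS
// StorageDescriptor, not the real Thrift-generated class.
final class SdReloadCheck {
  static final class Sd {
    final String location;
    final String inputFormat;
    final String serdeLib;
    // null models "unset"; the real class tracks this with an isSet flag.
    final Boolean storedAsSubDirectories;

    Sd(String location, String inputFormat, String serdeLib,
        Boolean storedAsSubDirectories) {
      this.location = location;
      this.inputFormat = inputFormat;
      this.serdeLib = serdeLib;
      this.storedAsSubDirectories = storedAsSubDirectories;
    }

    // Returns a copy in which differences known to be trivial are erased.
    // Assumption for this sketch: unset and false mean the same thing.
    Sd normalized() {
      return new Sd(location, inputFormat, serdeLib,
          Boolean.TRUE.equals(storedAsSubDirectories));
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Sd)) return false;
      Sd other = (Sd) o;
      return Objects.equals(location, other.location)
          && Objects.equals(inputFormat, other.inputFormat)
          && Objects.equals(serdeLib, other.serdeLib)
          && Objects.equals(storedAsSubDirectories,
              other.storedAsSubDirectories);
    }

    @Override
    public int hashCode() {
      return Objects.hash(location, inputFormat, serdeLib,
          storedAsSubDirectories);
    }
  }

  // Skip the reload only when every difference is on the allowlist. Any other
  // difference, including one in a field nobody thought about, forces a reload.
  static boolean canSkipFileMetadataReload(Sd before, Sd after) {
    return before.normalized().equals(after.normalized());
  }
}
```

With this shape, a field added to the descriptor later only needs to take part
in equals() to be handled safely by default.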


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788
PS1, Line 1788:             !Objects.equals(beforeSd.getSerdeInfo(), 
afterSd.getSerdeInfo())) {
Is it correct to skip schema reload for Avro tables? In HdfsTable#load(),
'loadTableSchema' is used in two places:

        if (loadTableSchema) {
          // set nullPartitionKeyValue from the hive conf.
          nullPartitionKeyValue_ =
              MetaStoreUtil.getNullPartitionKeyValue(client).intern();
          loadSchema(msTbl);
          loadAllColumnStats(client);
          loadConstraintsInfo(client, msTbl);
        }

        if (loadTableSchema) setAvroSchema(client, msTbl);

The second usage is for Avro tables, and it uses the InputFormat and SerdeInfo.
https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1302
https://github.com/apache/impala/blob/085b1806da6a1941200288a2f9a243e389e10820/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1819


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1889
PS1, Line 1889:           return true;
Please add a log for this.


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3060
PS1, Line 3060:     // Test 2: set rowformat
It'd be helpful to add logs between different cases, e.g.

  LOG.info("Test changes in rowformat");

That way, when analyzing logs/fe_tests/FeSupport.INFO, we can easily find the
boundaries between test cases.


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3080
PS1, Line 3080:     hmsTbl.getSd().setStoredAsSubDirectories(false);
I added some debug logs and found that this field is already set to false
before this call, so the call changes nothing. We should use
unsetStoredAsSubDirectories() so the test actually makes a change.
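
For illustration, the difference can be sketched with a small stand-in for an
optional boolean field. OptionalBool below is hypothetical and only mimics how
I understand Thrift-generated classes to treat optional fields (a value plus an
"is set" flag that takes part in equals()); it is not the real
StorageDescriptor.

```java
// Sketch only: OptionalBool is a hypothetical model of an optional boolean
// field with an "is set" flag, similar in spirit to Thrift-generated code.
final class OptionalBool {
  private boolean value;
  private boolean isSet;

  void set(boolean v) {
    value = v;
    isSet = true;
  }

  // Clears the flag; the field now counts as absent.
  void unset() {
    isSet = false;
  }

  boolean isSet() {
    return isSet;
  }

  OptionalBool copy() {
    OptionalBool c = new OptionalBool();
    c.value = value;
    c.isSet = isSet;
    return c;
  }

  // Two fields are equal if both are unset, or both are set to the same value.
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof OptionalBool)) return false;
    OptionalBool other = (OptionalBool) o;
    if (isSet != other.isSet) return false;
    return !isSet || value == other.value;
  }

  @Override
  public int hashCode() {
    return isSet ? (value ? 2 : 1) : 0;
  }
}
```

In this model, calling set(false) on a field that is already false leaves the
before and after objects equal, while unset() makes them differ, which is the
kind of change the test needs to produce.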


http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3084
PS1, Line 3084: Test 4
nit: this should be "Test 5". Please also add more test cases, e.g. changing
the field from true to false or to unset.



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Feb 2024 07:32:39 +0000
Gerrit-HasComments: Yes
