zhangbutao commented on code in PR #4372:
URL: https://github.com/apache/hive/pull/4372#discussion_r1223788240


##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -6724,6 +6725,15 @@ public void alterTableBranchOperation(Table table, 
AlterTableBranchSpec createBr
     }
   }
 
+  public void alterTableTagOperation(Table table, AlterTableTagSpec 
createTagSpec) throws HiveException {
+    try {
+      HiveStorageHandler storageHandler = 
createStorageHandler(table.getTTable());

Review Comment:
   Good suggestion. I tried to use Caffeine cache to get storage handler, but 
it failed unfortunately.
   Here it is what i have done:
   
   ```
   // Define storage handler Caffine cache.
     private static final Cache<String, HiveStorageHandler> storageHandlerCache 
= Caffeine.newBuilder()
         .expireAfterAccess(10, TimeUnit.MINUTES).build();
   
   // Get storage handler from cache.
     private HiveStorageHandler 
getCachedStorageHandler(org.apache.hadoop.hive.metastore.api.Table tbl) {
      return storageHandlerCache.get(new 
Table(tbl).getFullTableName().toString(), k -> {
         try {
           // create a new storage handler if it is not in cache.
           return createStorageHandler(tbl);
         } catch (MetaException ex) {
           LOG.error("Failed to get storage handler", ex);
           throw new RuntimeException("Failed to get storage handler:  " + 
ex.getMessage());
         }
       });
     }
   ```
   
   And then i replaced the invoked method 
`HiveIcebergStorageHandler::createStorageHandler` with 
`HiveIcebergStorageHandler::getCachedStorageHandler`, like this:
   ```
   @@ -5699,7 +5704,7 @@ private IMetaStoreClient createMetaStoreClient(boolean 
allowEmbedded) throws Met
          public HiveMetaHook getHook(
                  org.apache.hadoop.hive.metastore.api.Table tbl)
                  throws MetaException {
   -        HiveStorageHandler storageHandler = createStorageHandler(tbl);
   +        HiveStorageHandler storageHandler = getCachedStorageHandler(tbl);
            return storageHandler == null ? null : storageHandler.getMetaHook();
          }
        };
   ```
   
   Then I analyzed the reason for the cached storage handler. It is mostly 
because `HiveIcebergStorageHandler ` has a class-level `conf ` which can be 
used to load a iceberg table and the `conf`  storage the latest `METADATA 
`location, and if the iceberg table is modified, we should use the modified 
`conf ` to get a new  `METADATA `location.  If we always use the cached 
storageHandler, we maybe get a old iceberg table whose latest snapshot is fake.
   
   // Here it is a class-level `conf`
   
https://github.com/apache/hive/blob/e0bf9609585d671b936c95ef5ccd4dcadc9be71a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L192
   
   Here we use method `IcebergTableUtil.getTable(conf properties)` and `conf  
`to get the iceberg table:
   
https://github.com/apache/hive/blob/e0bf9609585d671b936c95ef5ccd4dcadc9be71a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L751
   
   // Here we use the `conf `to get the latest `METADATA `location
   
https://github.com/apache/hive/blob/e0bf9609585d671b936c95ef5ccd4dcadc9be71a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java#L88
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to