rdblue commented on a change in pull request #2129:
URL: https://github.com/apache/iceberg/pull/2129#discussion_r572300491



##########
File path: 
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalogs.java
##########
@@ -36,4 +38,11 @@ public static HiveCatalog loadCatalog(Configuration conf) {
     String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname, 
"");
     return CATALOG_CACHE.get(metastoreUri, uri -> new HiveCatalog(conf));
   }
+
+  public static HiveCatalog loadCatalog(String catalogName, Map<String, 
String> properties, Configuration conf) {
+    // metastore URI can be null in local mode
+    String metastoreUri = conf.get(HiveConf.ConfVars.METASTOREURIS.varname, 
"");
+    return CATALOG_CACHE.get(metastoreUri, uri -> (HiveCatalog) 
CatalogUtil.loadCatalog(HiveCatalog.class.getName(),
+            catalogName, properties, conf));
+  }

Review comment:
       > For writes TezAM (for Tez) or YARN Application Master (for MR) calls 
the commitJob which still needs to create the Catalog.
   
   This is later in the job, though. I wonder if we should introduce a thread 
to close connections if they are not used for some period of time. Garbage 
collecting connections could help if we think that the extra connection from 
the HS2 side is a problem. The AM connection probably wouldn't last long though?
   
   > I think sharing HMS Clients between the Catalogs is not really a good 
practice.
   
   Sounds reasonable.
   
   > I think the best solution would be a different HiveCatalog implementation 
reusing the HMS Client from the Hive session.
   
   This sounds okay, but should probably be done in a follow-up instead of 
here. It also doesn't affect name-based caching. The only catalog that could 
share a connection with the Hive session is the unnamed catalog used when there 
is no catalog set. We cannot share connections to named catalogs because there 
is no reason to think those are hitting the same metastore.
   
   > ll that said, I think the name based caching is flawed if the 
configuration can be changed
   
   I don't think that configuration can be changed, for the reason you cite. 
Once a catalog is created, it is configured and that can't be changed. We 
_could_ look up settings each time they are used but that seems like a lot of 
work for a problem we fundamentally don't need to solve.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to