hudi-bot opened a new issue, #14648:
URL: https://github.com/apache/hudi/issues/14648

   Currently, HoodieHiveClient can perform Hive operations in three ways: 
through Hive JDBC, through the Hive Metastore API, or through the Hive 
Driver.
    
    The parameter {{hoodie.datasource.hive_sync.use_jdbc}} controls whether 
Hive JDBC is used. However, this parameter does not accurately describe what 
actually happens.
   
   
    With the current logic, when {{use_jdbc}} is set to true, most methods in 
HoodieHiveClient use JDBC and a few use the Hive Metastore API.
    When {{use_jdbc}} is set to false, most methods in HoodieHiveClient use 
the Hive Driver and a few use the Hive Metastore API.
   
   The following table shows which backend each method actually uses when 
use_jdbc is set to true or false.

   |Method|use_jdbc=true|use_jdbc=false|
   |---|---|---|
   |{{addPartitionsToTable}}|JDBC|Hive Driver|
   |{{updatePartitionsToTable}}|JDBC|Hive Driver|
   |{{scanTablePartitions}}|Metastore API|Metastore API|
   |{{updateTableDefinition}}|JDBC|Hive Driver|
   |{{createTable}}|JDBC|Hive Driver|
   |{{getTableSchema}}|JDBC|Metastore API|
   |{{doesTableExist}}|Metastore API|Metastore API|
   |{{getLastCommitTimeSynced}}|Metastore API|Metastore API|
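
   To make the mismatch concrete, the table above can be encoded as a small 
lookup. This is only an illustrative sketch: the class name 
HiveSyncBackendTable, the Backend enum, and backendFor are hypothetical and 
are not part of Hudi. Only the method names and the backend per row come from 
the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Hudi code: encodes the table above as data.
public class HiveSyncBackendTable {
    enum Backend { JDBC, HIVE_DRIVER, METASTORE_API }

    // Each row holds {backend when use_jdbc=true, backend when use_jdbc=false}.
    private static final Map<String, Backend[]> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("addPartitionsToTable", new Backend[] {Backend.JDBC, Backend.HIVE_DRIVER});
        TABLE.put("updatePartitionsToTable", new Backend[] {Backend.JDBC, Backend.HIVE_DRIVER});
        TABLE.put("scanTablePartitions", new Backend[] {Backend.METASTORE_API, Backend.METASTORE_API});
        TABLE.put("updateTableDefinition", new Backend[] {Backend.JDBC, Backend.HIVE_DRIVER});
        TABLE.put("createTable", new Backend[] {Backend.JDBC, Backend.HIVE_DRIVER});
        TABLE.put("getTableSchema", new Backend[] {Backend.JDBC, Backend.METASTORE_API});
        TABLE.put("doesTableExist", new Backend[] {Backend.METASTORE_API, Backend.METASTORE_API});
        TABLE.put("getLastCommitTimeSynced", new Backend[] {Backend.METASTORE_API, Backend.METASTORE_API});
    }

    // Returns the backend a method ends up using for a given use_jdbc value.
    static Backend backendFor(String method, boolean useJdbc) {
        Backend[] row = TABLE.get(method);
        if (row == null) {
            throw new IllegalArgumentException("Unknown method: " + method);
        }
        return useJdbc ? row[0] : row[1];
    }

    public static void main(String[] args) {
        // Print every row so the effect of the flag is visible per method.
        for (String method : TABLE.keySet()) {
            System.out.println(method + ": true=" + backendFor(method, true)
                    + ", false=" + backendFor(method, false));
        }
    }
}
```

   Note that three of the eight methods use the Metastore API regardless of 
the flag, which is why a boolean named use_jdbc does not describe the 
behavior well.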
   
   [~bschell] and I developed Metastore API implementations of 
{{createTable}}, {{addPartitionsToTable}}, {{updatePartitionsToTable}}, and 
{{updateTableDefinition}}, which will help with several issues, e.g. 
resolving the null partition Hive sync issue and supporting ALTER_TABLE 
cascade with the AWS Glue catalog.
   
   However, it seems hard to organize three implementations under the current 
config, so we plan to split HoodieHiveClient into three classes:

   1. HoodieHiveClient, which implements all the APIs through the Metastore 
API.
   2. HoodieHiveJDBCClient, which extends HoodieHiveClient and overrides 
several of the APIs to use Hive JDBC.
   3. HoodieHiveDriverClient, which extends HoodieHiveClient and overrides 
several of the APIs to use the Hive Driver.
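
   The proposed split could look roughly like the sketch below. The three 
class names come from the proposal; everything else is an assumption made for 
illustration. In particular, the methods here only return a label for the 
backend they would use, and the choice of createTable as the overridden 
method is an example, since the proposal does not list which APIs each 
subclass overrides.

```java
// Illustrative sketch only, not the actual Hudi classes.
public class HiveClientHierarchySketch {

    // Base class: every operation goes through the Metastore API.
    static class HoodieHiveClient {
        String createTable() { return "Metastore API"; }
        String doesTableExist() { return "Metastore API"; }
    }

    // Overrides some operations to use Hive JDBC (which ones is an assumption).
    static class HoodieHiveJDBCClient extends HoodieHiveClient {
        @Override
        String createTable() { return "JDBC"; }
    }

    // Overrides some operations to use the Hive Driver (which ones is an assumption).
    static class HoodieHiveDriverClient extends HoodieHiveClient {
        @Override
        String createTable() { return "Hive Driver"; }
    }

    public static void main(String[] args) {
        HoodieHiveClient[] clients = {
            new HoodieHiveClient(), new HoodieHiveJDBCClient(), new HoodieHiveDriverClient()
        };
        for (HoodieHiveClient client : clients) {
            // Methods that are not overridden fall back to the Metastore API.
            System.out.println(client.getClass().getSimpleName()
                    + ": createTable=" + client.createTable()
                    + ", doesTableExist=" + client.doesTableExist());
        }
    }
}
```

   With this layout, anything a subclass does not override falls back to the 
Metastore API implementation in the base class.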
   
   We also introduce a new parameter, 
{{hoodie.datasource.hive_sync.hive_client_class}}, which lets you choose 
which Hive client class to use.
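
   One way such a parameter could be consumed is by loading the named class 
reflectively. The sketch below is an assumption about how that might work, 
not the actual implementation: SyncClient, MetastoreClient, JdbcClient, and 
create are hypothetical stand-ins, and defaulting to the Metastore-based 
client when the key is absent is also an assumption. Only the property key 
comes from the proposal.

```java
import java.util.Properties;

// Illustrative sketch only: selects a client class from a config property.
public class HiveClientSelection {
    static final String CLIENT_CLASS_KEY = "hoodie.datasource.hive_sync.hive_client_class";

    // Hypothetical stand-ins for the proposed client classes.
    public interface SyncClient { String backend(); }

    public static class MetastoreClient implements SyncClient {
        public String backend() { return "Metastore API"; }
    }

    public static class JdbcClient extends MetastoreClient {
        @Override
        public String backend() { return "JDBC"; }
    }

    // Instantiates the class named by the property; the default is an assumption.
    static SyncClient create(Properties props) {
        String className = props.getProperty(CLIENT_CLASS_KEY, MetastoreClient.class.getName());
        try {
            return (SyncClient) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot create Hive client: " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(CLIENT_CLASS_KEY, JdbcClient.class.getName());
        System.out.println(create(props).backend());
    }
}
```

   A class-name parameter keeps the choice open-ended: a further client 
implementation can be plugged in without adding another boolean flag.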
   
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-1194
   - Type: Task
   - Epic: https://issues.apache.org/jira/browse/HUDI-2519
   - Fix version(s):
     - 0.16.0
     - 1.1.0
   
   
   ---
   
   
   ## Comments
   
   17/Feb/22 02:39;xushiyan;[~wenningd] from the PR, looks like this is covered 
in HUDI-1848 ? can we close this one?;;;

