[ 
https://issues.apache.org/jira/browse/HIVE-20398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuchang updated HIVE-20398:
---------------------------
    Description: 
During a Hive upgrade, we have the following use case:

We want to sync operations between two metastore servers (A and B) through the 
Thrift API, but both of them are backed by the same HDFS. For operations such 
as *drop_partitions*, *drop_table*, *insert_overwrite*, and *create_table*, 
which modify data in HDFS, the HDFS data modification would be executed twice, 
which is not what we want. Instead, we want it to be executed only by Metastore 
Server A. Metastore Server B should be configured to change only its metadata 
and skip the HDFS data modification.

So we need a switch to control this, such as:
{code:java}
hive.metastore.skip.hdfs=false{code}
The default value is *false*. When the value is *true*, the metastore server 
performs only the metadata modification and skips the HDFS data 
modification.
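
The intended effect of the switch can be sketched in Java as follows. This is 
only an illustration under stated assumptions: SharedWarehouse, Metastore, and 
the skipHdfs field are hypothetical stand-ins, not Hive classes, and the real 
metastore code paths are not shown here. The sketch only demonstrates that, 
with the flag set on Server B, replaying the same operations on both servers 
touches the shared storage once.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these classes are Hive APIs.
class SkipHdfsSketch {

    // Stand-in for the HDFS warehouse shared by both metastore servers.
    static class SharedWarehouse {
        final Set<String> dirs = new HashSet<>();
        int deleteCalls = 0;

        void delete(String path) {
            deleteCalls++;
            dirs.remove(path);
        }
    }

    // Stand-in for one metastore server with its own metadata store.
    static class Metastore {
        final Map<String, String> tables = new HashMap<>(); // table -> location
        final SharedWarehouse warehouse;
        final boolean skipHdfs; // assumed value of hive.metastore.skip.hdfs

        Metastore(SharedWarehouse warehouse, boolean skipHdfs) {
            this.warehouse = warehouse;
            this.skipHdfs = skipHdfs;
        }

        void createTable(String name, String location) {
            tables.put(name, location);       // metadata change always happens
            if (!skipHdfs) {
                warehouse.dirs.add(location); // data change only when not skipped
            }
        }

        void dropTable(String name) {
            String location = tables.remove(name); // metadata change always happens
            if (location != null && !skipHdfs) {
                warehouse.delete(location);        // data change only when not skipped
            }
        }
    }

    public static void main(String[] args) {
        SharedWarehouse hdfs = new SharedWarehouse();
        Metastore a = new Metastore(hdfs, false); // Server A: default behavior
        Metastore b = new Metastore(hdfs, true);  // Server B: metadata only

        // The same operations are replayed on both servers.
        a.createTable("t", "/warehouse/t");
        b.createTable("t", "/warehouse/t");
        a.dropTable("t");
        b.dropTable("t");

        System.out.println("delete calls: " + hdfs.deleteCalls); // prints "delete calls: 1"
    }
}
```

Both servers end up with the same metadata, while only Server A issues the 
delete against the shared warehouse.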

 

  was:
During a Hive upgrade, we have the following use case:

We want to sync operations between two metastore servers (A and B) through the 
Thrift API, but both of them are backed by the same HDFS. For operations such 
as *drop_partitions*, *drop_table*, *insert_overwrite*, and *create_table*, 
which modify data in HDFS, we want the modification to be executed only by 
Metastore Server A. Metastore Server B would change only its metadata and 
would not perform the corresponding HDFS file operations.

So we need a switch to control this, such as:
{code:java}
hive.metastore.skip.hdfs{code}
whose default value is *false*, which matches the current behavior.

When its value is *true*, the metastore server performs only the metadata 
modification and skips the HDFS data modification.

 


> [Hive Metastore] Add a Configuration Item for Metastore Server to Skip the 
> HDFS Data Modification
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20398
>                 URL: https://issues.apache.org/jira/browse/HIVE-20398
>             Project: Hive
>          Issue Type: Task
>          Components: Metastore
>    Affects Versions: 2.3.2
>            Reporter: wuchang
>            Assignee: wuchang
>            Priority: Major
>
> During a Hive upgrade, we have the following use case:
> We want to sync operations between two metastore servers (A and B) through 
> the Thrift API, but both of them are backed by the same HDFS. For operations 
> such as *drop_partitions*, *drop_table*, *insert_overwrite*, and 
> *create_table*, which modify data in HDFS, the HDFS data modification would 
> be executed twice, which is not what we want. Instead, we want it to be 
> executed only by Metastore Server A. Metastore Server B should be 
> configured to change only its metadata and skip the HDFS data 
> modification.
> So we need a switch to control this, such as:
> {code:java}
> hive.metastore.skip.hdfs=false{code}
> The default value is *false*. When the value is *true*, the metastore 
> server performs only the metadata modification and skips the HDFS data 
> modification.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
