[jira] [Commented] (HIVE-14065) Provide an API for making Hive read-only for a short period
[ https://issues.apache.org/jira/browse/HIVE-14065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464330#comment-16464330 ]

Alexander Kolbasov commented on HIVE-14065:
-------------------------------------------

Apache Sentry no longer needs this API.

> Provide an API for making Hive read-only for a short period
> -----------------------------------------------------------
>
>                 Key: HIVE-14065
>                 URL: https://issues.apache.org/jira/browse/HIVE-14065
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Colin P. McCabe
>            Assignee: Colin P. McCabe
>            Priority: Major
>
> HIVE-7973 added a notification log which allows clients to do incremental
> replication of the Hive metastore. However, it is a challenge to get the
> initial state of the Hive database. Using existing APIs may give us an
> inconsistent state. For example, if a Hive table is renamed while we're
> loading all tables, we may miss that information.
>
> The easiest way to fix this would be to provide an API for making Hive
> read-only for a short period. This locking API would come with a timeout so
> that if the locker failed, the system would not stay down. It would return
> an ID which uniquely identified the lock instance. The read-only lock itself
> could be implemented by taking all the ZooKeeper locks. The RPC for removing
> the lock would return back a status indicating whether the lock had timed out
> before being removed or not. If it had timed out, we could retry our
> snapshot loading process with a longer timeout period.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
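The protocol proposed in the issue description above (a read-only lock with a timeout, a unique lock ID, an unlock call that reports whether the lock had already timed out, and a retry with a longer timeout) can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: `ReadOnlyLockSketch`, `ReadOnlyLockManager`, `UnlockStatus`, `lockReadOnly`, `unlock`, and `snapshotWithRetry` are names invented for illustration and are not part of Hive or the metastore API. The sketch also does not use ZooKeeper, and doubling the timeout on retry is an assumption (the issue only says "a longer timeout period").

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch of the proposed protocol; none of these names exist in Hive. */
final class ReadOnlyLockSketch {

    /** Status returned when the lock is removed. */
    enum UnlockStatus { RELEASED, TIMED_OUT }

    /** In-memory stand-in for the server-side lock state. */
    static final class ReadOnlyLockManager {
        private final LongSupplier clockMillis; // injectable clock, so the sketch is testable
        private long nextId = 1;
        private long activeId = 0;              // 0 means no lock instance is outstanding
        private long expiresAtMillis = 0;

        ReadOnlyLockManager(LongSupplier clockMillis) {
            this.clockMillis = clockMillis;
        }

        /** Makes the store read-only for at most timeoutMillis and returns a unique lock ID. */
        synchronized long lockReadOnly(long timeoutMillis) {
            if (isReadOnly()) {
                throw new IllegalStateException("read-only lock already held");
            }
            activeId = nextId++;
            expiresAtMillis = clockMillis.getAsLong() + timeoutMillis;
            return activeId;
        }

        /** True while an unexpired lock is outstanding; writers would check this. */
        synchronized boolean isReadOnly() {
            return activeId != 0 && clockMillis.getAsLong() < expiresAtMillis;
        }

        /** Removes the lock and reports whether it had already timed out. */
        synchronized UnlockStatus unlock(long lockId) {
            if (lockId != activeId) {
                // A fuller version would remember expired and released IDs.
                throw new IllegalArgumentException("unknown lock ID: " + lockId);
            }
            boolean expired = clockMillis.getAsLong() >= expiresAtMillis;
            activeId = 0;
            return expired ? UnlockStatus.TIMED_OUT : UnlockStatus.RELEASED;
        }
    }

    /**
     * Loads a snapshot under the lock. If the lock timed out before it was
     * removed, the snapshot may be inconsistent, so retry with a doubled timeout.
     */
    static boolean snapshotWithRetry(ReadOnlyLockManager manager, Runnable loadSnapshot,
                                     long initialTimeoutMillis, int maxAttempts) {
        long timeout = initialTimeoutMillis;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long lockId = manager.lockReadOnly(timeout);
            loadSnapshot.run();
            if (manager.unlock(lockId) == UnlockStatus.RELEASED) {
                return true;
            }
            timeout *= 2;
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicLong fakeNow = new AtomicLong(0);
        ReadOnlyLockManager manager = new ReadOnlyLockManager(fakeNow::get);
        // Pretend each snapshot load takes 150 ms of (fake) time.
        boolean ok = snapshotWithRetry(manager, () -> fakeNow.addAndGet(150), 100, 3);
        System.out.println("snapshot consistent: " + ok);
    }
}
```

The timeout is enforced on the manager's side, so a client that crashes while holding the lock cannot leave the store read-only indefinitely, which is the failure mode the issue description is concerned with.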
[jira] [Commented] (HIVE-14065) Provide an API for making Hive read-only for a short period
[ https://issues.apache.org/jira/browse/HIVE-14065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15340856#comment-15340856 ]

Mohit Sabharwal commented on HIVE-14065:
----------------------------------------

[~alangates], I think Colin is requesting an API that effectively takes a (shared?) lock at the metastore level, disallowing all writes that currently each need an exclusive Zk lock.
[jira] [Commented] (HIVE-14065) Provide an API for making Hive read-only for a short period
[ https://issues.apache.org/jira/browse/HIVE-14065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15340696#comment-15340696 ]

Alan Gates commented on HIVE-14065:
-----------------------------------

I'm unclear what you mean by "taking all the ZooKeeper locks". Can you elaborate?