[jira] [Commented] (ACCUMULO-3842) [UMBRELLA] Remove non-transient data from ZooKeeper

Josh Elser (JIRA) Sat, 23 May 2015 12:20:06 -0700

    [ https://issues.apache.org/jira/browse/ACCUMULO-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557486#comment-14557486 ]


Josh Elser commented on ACCUMULO-3842:
--------------------------------------

Caught up on some procv2 HBase stuff.

HBASE-13571 deals with schema updates. This is done by re-opening every region. 
That isn't relevant for what we're talking about here.

HBASE-13687 and HBASE-13688 mention that there is a "missing piece" that might 
be relevant, but the information is lacking.

In the original design docs, I see:

{quote}
Multi-Machine Procedures and Timeouts
Operations like Snapshots or ACLs cache updates requires a bit of coordination 
across multiple machine. To do that the procedure will send a message (may be 
done as poll via heartbeat) to each machine required by the procedure and will 
wait until each one respond. The procedure can have a timeout that will trigger 
a failure of the procedure causing the rollback.
{quote}

This doesn't seem novel; adopting procv2 doesn't look like it would gain us 
anything that we couldn't already do with FATE. I'm happy to entertain a 
conversation if I missed something but, from what I've read so far, I don't see 
a reason why we'd want to adopt procv2 presently.
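
As a rough illustration of the pattern the quoted passage describes (message each machine, wait for every response, roll back on timeout), here is a minimal sketch in plain Java. It is only an assumption-laden example: the class name, method name, server names, and the rollback callback are all hypothetical, and it uses neither FATE nor procv2 APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical illustration only; not FATE or procv2 code.
class MultiMachineStep {

    /**
     * Sends a request to every server and waits up to timeoutMillis for all
     * of them to acknowledge. Returns true if every server answered in time;
     * otherwise runs the rollback callback and returns false.
     */
    static boolean runWithTimeout(List<String> servers,
                                  Function<String, CompletableFuture<Void>> send,
                                  Runnable rollback,
                                  long timeoutMillis) throws InterruptedException {
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (String server : servers) {
            pending.add(send.apply(server));
        }
        CompletableFuture<Void> all =
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
        try {
            all.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            // A server failed or did not answer in time: undo the procedure.
            pending.forEach(f -> f.cancel(true));
            rollback.run();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> servers = Arrays.asList("tserver1", "tserver2", "tserver3");

        // Case 1: every server acknowledges immediately.
        boolean ok = runWithTimeout(servers,
            s -> CompletableFuture.<Void>completedFuture(null),
            () -> System.out.println("rolling back"), 100);
        System.out.println("all acked: " + ok);

        // Case 2: tserver2 never responds, so the timeout triggers the rollback.
        boolean ok2 = runWithTimeout(servers,
            s -> s.equals("tserver2")
                ? new CompletableFuture<Void>()
                : CompletableFuture.<Void>completedFuture(null),
            () -> System.out.println("rolling back"), 100);
        System.out.println("all acked: " + ok2);
    }
}
```

The sketch keeps the timeout on the coordinating side, which matches the quoted description; how the message actually reaches each machine (direct RPC or a poll via heartbeat) is left abstract behind the `send` function.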

> [UMBRELLA] Remove non-transient data from ZooKeeper
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3842
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3842
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, tserver
>            Reporter: Josh Elser
>             Fix For: 1.8.0
>
>
> Wanted to start brainstorming about this.
> We store a lot of persistent data in ZooKeeper that would be better stored in 
> something backed by HDFS. ZooKeeper can be a very convenient place to store 
> persisted data so that it's available to all nodes, but it comes at a price 
> and often must be asynchronously accessed to achieve good performance.
> * Table/Namespace configuration
> * Users/Authorizations
> * Problem reports (maybe?)
> * System configuration overrides (maybe?)
> Some benefits we'd see from this:
> * Loss of ZooKeeper doesn't lose table configuration and users.
> * Greatly reduce ZooKeeper watchers (assume 
> watchers=50*num_tables*num_tservers)
> * Consistent updates of table constraints and all other table properties
> The last note is the most important one IMO. The number of test issues alone 
> that we've had with constraints not being seen on all servers suggests this is 
> bound to affect users.
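
To give a sense of scale for the watcher estimate in the description above, here is a quick sketch of the arithmetic. The factor of 50 is the description's own assumption; the table and tserver counts used in `main` are made-up inputs, and the class and method names are hypothetical.

```java
// Hypothetical helper for the estimate watchers = 50 * num_tables * num_tservers.
class WatcherEstimate {

    // 50 is the per-table, per-tserver factor assumed in the issue description.
    static long estimateWatchers(long numTables, long numTservers) {
        return 50L * numTables * numTservers;
    }

    public static void main(String[] args) {
        // Made-up cluster size: 100 tables on 200 tablet servers.
        System.out.println(estimateWatchers(100, 200)); // prints 1000000
    }
}
```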



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
