[ 
https://issues.apache.org/jira/browse/HBASE-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904425#comment-13904425
 ] 

Enis Soztutar commented on HBASE-10544:
---------------------------------------

This has been discussed in the context of master redesign (HBASE-5487, 
https://issues.apache.org/jira/browse/HBASE-5487?focusedCommentId=13797368&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13797368).
 

Most of the master RPC requests are "half-async": the request is sent and 
some processing is done (fully synchronously in some cases, only partly in 
others), but the client does not get a handle it can use to ask for the 
status of the operation. Instead it waits for the "side effects" of the 
operation to appear. For example, createTable() is sent to the master, some 
processing is done there (checking whether a table with the same name 
already exists, etc.), and the client waits for the regions to appear in 
META. If anything goes wrong, the client is stuck waiting. Also, since the 
request lives only in the master's memory, a master failover cannot recover 
running operations, and the table is left in an inconsistent state. 
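
To make the "side effects" pattern concrete, here is a rough sketch of what 
the client-side wait amounts to. SideEffectWaiter and its parameters are 
hypothetical names, not HBase client code, and the timeout is added only so 
the sketch terminates; the point is that a false return cannot distinguish a 
failed operation from a slow one. 

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the HBase client API.
final class SideEffectWaiter {
    // Polls until the side effect (e.g. "regions are visible in META")
    // is observed or the timeout elapses. A false result does not say
    // whether the operation failed or is merely slow.
    static boolean await(BooleanSupplier sideEffectVisible,
                         long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (deadline - System.nanoTime() > 0) {
            if (sideEffectVisible.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // One last check after the deadline has passed.
        return sideEffectVisible.getAsBoolean();
    }
}
```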

I think the way forward here is to implement persistent operations in the 
master (persisted via zk, a WAL, a system table, etc.). When a client issues 
a request, it gets a handle that it can use to ask for the status later. 
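
As a minimal sketch of that idea, assuming an in-memory map standing in for 
the durable store: every name below (OpState, OperationStore, MasterSketch) 
is hypothetical and none of it is an existing HBase API. The operation is 
recorded before any work starts, so a second master built over the same 
store can still answer status queries for a handle issued by the first. 

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical types for illustration only.
enum OpState { RUNNING, SUCCEEDED, FAILED }

// Stand-in for a durable store; in practice this would be backed by
// zk, a WAL, or a system table rather than a map in memory.
final class OperationStore {
    private final Map<Long, OpState> states = new ConcurrentHashMap<>();
    private final Map<Long, String> descriptions = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    long create(String description) {
        long id = nextId.getAndIncrement();
        descriptions.put(id, description);
        states.put(id, OpState.RUNNING);
        return id;
    }

    void update(long id, OpState state) { states.put(id, state); }

    OpState get(long id) { return states.get(id); }

    String describe(long id) { return descriptions.get(id); }
}

final class MasterSketch {
    private final OperationStore store;

    MasterSketch(OperationStore store) { this.store = store; }

    // Persists the operation first, then returns the handle to the client.
    long submit(String description) { return store.create(description); }

    // Called when the work finishes, possibly by a different master
    // instance than the one that accepted the request.
    void complete(long handle, boolean ok) {
        store.update(handle, ok ? OpState.SUCCEEDED : OpState.FAILED);
    }

    // Returns null for an unknown handle.
    OpState status(long handle) { return store.get(handle); }
}
```

A client would call submit() once and then poll status() with the returned 
handle, instead of watching META for regions to appear. 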

> Surface completion state of global administrative actions
> ---------------------------------------------------------
>
>                 Key: HBASE-10544
>                 URL: https://issues.apache.org/jira/browse/HBASE-10544
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>             Fix For: 0.98.1, 0.99.0
>
>
> When issuing requests for global administrative actions, such as major 
> compaction, users have to look for indirect evidence the action has 
> completed, and cannot really be sure of the final state. 
> Hat tip to [~jdcryans] and [~stack].
> We can approach this a couple of ways. We could add a per regionserver metric 
> for percentage of admin requests complete, maybe also aggregated by the 
> master. This would provide a single point of reference. However, if we also 
> want to ensure 100% completion even in the presence of node failures, or 
> provide separate completion feedback for each request, I think we need to 
> redo flush and compaction requests as Procedures. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
