[jira] [Commented] (HBASE-10277) refactor AsyncProcess

Sergey Shelukhin (JIRA) Wed, 22 Jan 2014 15:16:31 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879329#comment-13879329
 ]


Sergey Shelukhin commented on HBASE-10277:
------------------------------------------

A callback would still need to be maintained separately for each put call, 
and if you need results (as with get) you'd still need a place to store them. 
The current patch skips creating the Object[] when results are not needed, 
e.g. for streaming puts. The previous code did the same thing, but the 
Object[] for get was still created inside the callback object.
One problem with the current callback is that as soon as you have multiple 
submits, the index argument to the callback becomes ambiguous, so you no 
longer know which action an error or result belongs to. So the HCM batch code 
creates an AP for each single submit; that way it knows the index always 
comes from that submit when the callback populates the array. HTable just 
doesn't use the callback for streaming puts because it doesn't need the 
results. If HTable were to use the current callback for error management (or 
for streaming gets, where you need results and there are multiple submit 
calls), this would become a real problem.
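
To make the index ambiguity concrete, here is a minimal sketch. It is not the 
real AsyncProcess code: IndexedCallback, SharedCallbackProcessor and 
SharedCallbackDemo are hypothetical names, and submit() completes 
synchronously only to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a callback fixed at construction time;
// this is NOT the real HBase callback interface.
interface IndexedCallback {
    void onResult(int originalIndex, String result);
}

// Hypothetical, heavily simplified processor: one callback shared by every submit call.
class SharedCallbackProcessor {
    private final IndexedCallback callback;

    SharedCallbackProcessor(IndexedCallback callback) {
        this.callback = callback;
    }

    // Completes synchronously to keep the sketch deterministic; real code would be async.
    void submit(List<String> actions) {
        for (int i = 0; i < actions.size(); i++) {
            // The index is relative to THIS submit's list, but the callback cannot tell which submit.
            callback.onResult(i, "done:" + actions.get(i));
        }
    }
}

class SharedCallbackDemo {
    // Records everything the shared callback sees across two submit calls.
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        SharedCallbackProcessor ap = new SharedCallbackProcessor(
            (index, result) -> seen.add(index + "=" + result));
        ap.submit(List.of("putA", "putB"));
        ap.submit(List.of("putC"));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Index 0 is reported twice, once per submit, and nothing in the callback 
arguments says which submit it came from. With real asynchronous completion 
the two calls could also interleave.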

We can instead add a /per-call/ callback in the context. It's a hybrid between 
#3 and #4: AP can avoid global error support; we can add an async call with a 
callback to HTable, which would use the "regular" path; and the current 
streaming put can keep the same semantics but maintain the contexts in HTable 
rather than in AP. Let me think more about the latter case.
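
A rough sketch of what a per-call callback held in a context could look like 
follows. SubmitContext and PerCallProcessor are hypothetical names and not 
part of the patch. The sketch assumes each submit returns its own context, 
which owns the callback and allocates the Object[] only when results are 
requested. Completion is again synchronous for simplicity.

```java
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical per-submit context: owns the callback and, if requested, the results of ONE call.
class SubmitContext {
    final Object[] results; // stays null when results are not needed, e.g. streaming puts
    private final BiConsumer<Integer, String> callback;

    SubmitContext(int actionCount, boolean needResults, BiConsumer<Integer, String> callback) {
        this.results = needResults ? new Object[actionCount] : null;
        this.callback = callback;
    }

    // The index is unambiguous here because it can only refer to this context's actions.
    void complete(int index, String result) {
        if (results != null) {
            results[index] = result;
        }
        if (callback != null) {
            callback.accept(index, result);
        }
    }
}

// Hypothetical processor with no callback of its own; each submit carries one in its context.
class PerCallProcessor {
    // Completes synchronously to keep the sketch deterministic; real code would be async.
    SubmitContext submit(List<String> actions, boolean needResults,
                         BiConsumer<Integer, String> callback) {
        SubmitContext ctx = new SubmitContext(actions.size(), needResults, callback);
        for (int i = 0; i < actions.size(); i++) {
            ctx.complete(i, "done:" + actions.get(i));
        }
        return ctx;
    }
}

class PerCallDemo {
    public static void main(String[] args) {
        PerCallProcessor ap = new PerCallProcessor();
        SubmitContext gets = ap.submit(List.of("getA", "getB"), true,
            (i, r) -> System.out.println("gets[" + i + "] -> " + r));
        SubmitContext puts = ap.submit(List.of("putC"), false,
            (i, r) -> System.out.println("puts[" + i + "] -> " + r));
        System.out.println("gets has results array: " + (gets.results != null));
        System.out.println("puts has results array: " + (puts.results != null));
    }
}
```

In this shape the caller (for example HTable) keeps the contexts it cares 
about, so the processor itself needs no global error or result state.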

> refactor AsyncProcess
> ---------------------
>
>                 Key: HBASE-10277
>                 URL: https://issues.apache.org/jira/browse/HBASE-10277
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-10277.01.patch, HBASE-10277.patch
>
>
> AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
> callback and with reuse, and one from HCM/HTable batch call, with callback 
> and w/o reuse. In the former case (but not the latter), it also does some 
> throttling of actions on initial submit call, limiting the number of 
> outstanding actions per server.
> The latter case is relatively straightforward. The former appears to be error 
> prone due to reuse - if, as javadoc claims should be safe, multiple submit 
> calls are performed without waiting for the async part of the previous call 
> to finish, fields like hasError become ambiguous and can be used for the 
> wrong call; callback for success/failure is called based on "original index" 
> of an action in submitted list, but with only one callback supplied to AP in 
> ctor it's not clear to which submit call the index belongs, if several are 
> outstanding.
> I was going to add support for HBASE-10070 to AP, and found that it might be 
> difficult to do cleanly.
> It would be nice to normalize AP usage patterns; in particular, separate the 
> "global" part (load tracking) from per-submit-call part.
> Per-submit part can more conveniently track stuff like initialActions, 
> mapping of indexes and retry information, that is currently passed around the 
> method calls.
> -I am not sure yet, but maybe sending of the original index to server in 
> "ClientProtos.MultiAction" can also be avoided.- Cannot be avoided because 
> the API to server doesn't have one-to-one correspondence between requests and 
> responses in an individual call to multi (retries/rearrangement have nothing 
> to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
