[jira] [Commented] (HBASE-10277) refactor AsyncProcess

Nicolas Liochon (JIRA) Mon, 20 Jan 2014 07:41:23 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876520#comment-13876520
 ]


Nicolas Liochon commented on HBASE-10277:
-----------------------------------------

It's a difficult patch to read:
1) Some changes are cosmetics: some protected becomes private, some "this." are 
removed. I'm not against these changes, but it makes the real meat more 
difficult to find.
2) The javadoc has not been updated, so when the code differs from the javadoc, 
the reader has to sort out himself if it's just that the javadoc is now out 
dated or if there is a regression.
I haven't yet reviewed it globally, but here is a set of questions/comments.

AsyncProcess#submit. Why does it take a tableName? Does it mean that an 
AsyncProcess can now be shared between Tables?

AsyncRequestSet#waitUntilDone
Same responsibility as AsyncProcess#waitUntilDone, but less features (no logs. 
These logs are useful).
bq. It would be nice to normalize AP usage patterns; in particular, separate 
the "global" part (load tracking) from per-submit-call part.
This part should go in HConnection, as we should manage the load tracking 
globally and not only for a single call. Iit would be a change in bahavior 
compared to the 0.94, but I think we should do it. Would it make you life 
easier here?

bq. I ran some perf test using YCSB and table with write-dropping coproc
Great, really. We should do that much more often...

bq. Probably this perf difference will not be noticeable on real requests 
(remains to be tested).
Let me be more pessimistic here :-).

bq. Also got rid of callback that was mostly used for tests, tests can check 
results without it.
I'm not a big fan of this part of the change. Callbacks can be reused in 
different context (for example to have a different policy, such as ignoring 
errors as in HTableMultiplexer). As well, we now have a hardRetryLimit, but 
this attribute is used only in tests.


More globally, This patch allows to reuse a single AsyncProcess between 
independant streams of writes. Would that be necessary if it was cheaper to 
create ? The cost is reading the configuration, as when we do a HTable#get and 
create a RegionServerCallable. The problem is that with this patch, we still 
create a AsyncProcess in some cases, for example on the batchCallback path...

> refactor AsyncProcess
> ---------------------
>
>                 Key: HBASE-10277
>                 URL: https://issues.apache.org/jira/browse/HBASE-10277
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-10277.patch
>
>
> AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
> callback and with reuse, and one from HCM/HTable batch call, with callback 
> and w/o reuse. In the former case (but not the latter), it also does some 
> throttling of actions on initial submit call, limiting the number of 
> outstanding actions per server.
> The latter case is relatively straightforward. The former appears to be error 
> prone due to reuse - if, as javadoc claims should be safe, multiple submit 
> calls are performed without waiting for the async part of the previous call 
> to finish, fields like hasError become ambiguous and can be used for the 
> wrong call; callback for success/failure is called based on "original index" 
> of an action in submitted list, but with only one callback supplied to AP in 
> ctor it's not clear to which submit call the index belongs, if several are 
> outstanding.
> I was going to add support for HBASE-10070 to AP, and found that it might be 
> difficult to do cleanly.
> It would be nice to normalize AP usage patterns; in particular, separate the 
> "global" part (load tracking) from per-submit-call part.
> Per-submit part can more conveniently track stuff like initialActions, 
> mapping of indexes and retry information, that is currently passed around the 
> method calls.
> -I am not sure yet, but maybe sending of the original index to server in 
> "ClientProtos.MultiAction" can also be avoided.- Cannot be avoided because 
> the API to server doesn't have one-to-one correspondence between requests and 
> responses in an individual call to multi (retries/rearrangement have nothing 
> to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

Reply via email to