[ 
https://issues.apache.org/jira/browse/HBASE-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133668#comment-15133668
 ] 

Junegunn Choi commented on HBASE-15213:
---------------------------------------

@[~chenheng]

Since any handler can remove an entry from the queue, we should break out of 
the loop when the entry we are waiting on is not found in the queue. So I 
needed fast containment check (if (!writeQueue.contains(w)) break) which 
LinkedHashSet provides.

O(N^2) comes from the following reasoning, take this for example.
- Let's say we have 10 concurrent increment transactions from T1 to T10, each 
with WriteEntry W1 to W10 in the queue.
- Assume that T1 finishes later than the other transactions, T2 ~ T10 go into 
wait-loop.
- T1 finishes, removes W1 from the queue, and notifies the other waiting 
transactions.
- W2 can only be removed if the handler for T2 wakes up and checks the head of 
the queue, but in the worst case, it can be the last one to wake up.
- So for example, T3 wakes up but W3 is not at the head so waits again, then 
T10 wakes up to no avail, T9, T7, .... finally T2
- Which is N context switches just to remove an entry. And we have N entries, 
so it's roughly n (n  + 1) / 2 = O(n^2)

> Fix increment performance regression caused by HBASE-8763 on branch-1.0
> -----------------------------------------------------------------------
>
>                 Key: HBASE-15213
>                 URL: https://issues.apache.org/jira/browse/HBASE-15213
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>         Attachments: HBASE-15213.branch-1.0.patch
>
>
> This is an attempt to fix the increment performance regression caused by 
> HBASE-8763 on branch-1.0.
> I'm aware that hbase.increment.fast.but.narrow.consistency was added to 
> branch-1.0 (HBASE-15031) to address the issue and a separate work is ongoing 
> on master branch, but anyway, this is my take on the problem.
> I read through HBASE-14460 and HBASE-8763 but it wasn't clear to me what 
> caused the slowdown but I could indeed reproduce the performance regression.
> Test setup:
> - Server: 4-core Xeon 2.4GHz Linux server running mini cluster (100 handlers, 
> JDK 1.7)
> - Client: Another box of the same spec
> - Increments on random 10k records on a single-region table, recreated every 
> time
> Increment throughput (TPS):
> || Num threads || Before HBASE-8763 (d6cc2fb) || branch-1.0 || branch-1.0 
> (narrow-consistency) ||
> || 1            | 2661                         | 2486        | 2359  |
> || 2            | 5048                         | 5064        | 4867  |
> || 4            | 7503                         | 8071        | 8690  |
> || 8            | 10471                        | 10886       | 13980 |
> || 16           | 15515                        | 9418        | 18601 |
> || 32           | 17699                        | 5421        | 20540 |
> || 64           | 20601                        | 4038        | 25591 |
> || 96           | 19177                        | 3891        | 26017 |
> We can clearly observe that the throughtput degrades as we increase the 
> number of concurrent requests, which led me to believe that there's severe 
> context switching overhead and I could indirectly confirm that suspicion with 
> cs entry in vmstat output. branch-1.0 shows a much higher number of context 
> switches even with much lower throughput.
> Here are the observations:
> - WriteEntry in the writeQueue can only be removed by the very handler that 
> put it, only when it is at the front of the queue and marked complete.
> - Since a WriteEntry is marked complete after the wait-loop, only one entry 
> can be removed at a time.
> - This stringent condition causes O(N^2) context switches where n is the 
> number of concurrent handlers processing requests.
> So what I tried here is to mark WriteEntry complete before we go into 
> wait-loop. With the change, multiple WriteEntries can be shifted at a time 
> without context switches. I changed writeQueue to LinkedHashSet since fast 
> containment check is needed as WriteEntry can be removed by any handler.
> The numbers look good, it's virtually identical to pre-HBASE-8763 era.
> || Num threads || branch-1.0 with fix ||
> || 1            | 2459                 |
> || 2            | 4976                 |
> || 4            | 8033                 |
> || 8            | 12292                |
> || 16           | 15234                |
> || 32           | 16601                |
> || 64           | 19994                |
> || 96           | 20052                |
> So what do you think about it? Please let me know if I'm missing anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to