[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15359168#comment-15359168 ]

stack commented on HBASE-14479:
-------------------------------

I tried this again with a total random-read workload (all from cache). The Readers
are here at the safe point:

{code}
2449 "RpcServer.reader=0,bindAddress=ve0528.halxg.cloudera.com,port=16020" #34 
daemon prio=5 os_prio=0 tid=0x00007fb669c7f1e0 nid=0x1c7e8 waiting on condition 
[0x00007fae4d244000]
2450    java.lang.Thread.State: WAITING (parking)
2451   at sun.misc.Unsafe.park(Native Method)
2452   - parking to wait for  <0x00007faf661d4c00> (a 
java.util.concurrent.Semaphore$NonfairSync)
2453   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
2454   at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
2455   at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
2456   at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
2457   at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
2458   at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:688)
2459   at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:669)
2460   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2461   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2462   at java.lang.Thread.run(Thread.java:745)
{code}

...i.e. at the new semaphore. Throughput is way down: 150k ops/s with the patch vs 380k
ops/s without.
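
For reference, here is a minimal sketch of the kind of Leader/Followers reader loop the trace suggests: a Semaphore permit decides which Reader gets to lead (i.e. select), and the other Readers park in acquire(), which is the frame the dump shows. Only doRunLoop() and leading() are names taken from the trace/profile above; everything else is an assumption, not the actual patch code.

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.Semaphore;

/**
 * Hypothetical Leader/Followers reader sketch, not the HBASE-14479 patch.
 * One permit in the semaphore means only one Reader "leads" (selects) at a
 * time; the rest park in acquire(), as in the thread dump above.
 */
class LeaderFollowersReader implements Runnable {
  private final Semaphore leaderPermit; // one permit => one leader at a time
  private final Selector selector;      // shared selector the readers take turns on

  LeaderFollowersReader(Semaphore leaderPermit, Selector selector) {
    this.leaderPermit = leaderPermit;
    this.selector = selector;
  }

  @Override
  public void run() {
    try {
      doRunLoop();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } catch (IOException e) {
      // a real server would log and keep going; a sketch just exits
    }
  }

  private void doRunLoop() throws InterruptedException, IOException {
    while (!Thread.currentThread().isInterrupted()) {
      // Followers park here; this is the Semaphore.acquire() frame in the dump.
      leaderPermit.acquire();
      SelectionKey key;
      try {
        key = leading(); // leader role: block in select(), pick one ready key
      } finally {
        leaderPermit.release(); // promote the next waiting reader to leader
      }
      if (key != null) {
        processRead(key); // follower role: read the request off the wire
      }
    }
  }

  // Leader role. A real implementation would also adjust the key's interest
  // ops before releasing leadership so the next leader does not re-select it;
  // that bookkeeping is omitted here.
  private SelectionKey leading() throws IOException {
    selector.select();
    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
    if (!it.hasNext()) {
      return null;
    }
    SelectionKey key = it.next();
    it.remove();
    return key;
  }

  private void processRead(SelectionKey key) throws IOException {
    SocketChannel channel = (SocketChannel) key.channel();
    ByteBuffer buf = ByteBuffer.allocate(8192);
    channel.read(buf); // a real reader would decode and dispatch the call here
  }
}
{code}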

Looking w/ honest profiler, the call stacks are way different, w/ current branch-1
spending most of its time responding:

Current branch-1
{code}
Tree Profile:
 (t 100.0,s  5.2) org.apache.hadoop.hbase.ipc.RpcServer$Responder::run
  (t 94.8,s  0.0) org.apache.hadoop.hbase.ipc.RpcServer$Responder::doRunLoop
   (t 81.0,s  0.6) org.apache.hadoop.hbase.ipc.RpcServer$Responder::doAsyncWrite
    (t 79.9,s  1.1) org.apache.hadoop.hbase.ipc.RpcServer$Responder::processAllResponses
     (t 76.4,s  0.6) org.apache.hadoop.hbase.ipc.RpcServer$Responder::processResponse
      (t 75.9,s  0.0) org.apache.hadoop.hbase.ipc.RpcServer::channelWrite
       (t 73.6,s  0.0) org.apache.hadoop.hbase.ipc.BufferChain::write
        (t 72.4,s  2.3) sun.nio.ch.SocketChannelImpl::write
         (t 67.8,s  0.6) sun.nio.ch.IOUtil::write
          (t 62.1,s  0.0) sun.nio.ch.SocketDispatcher::writev
           (t 62.1,s 62.1) sun.nio.ch.FileDispatcherImpl::writev0
          (t  2.3,s  0.6) sun.nio.ch.Util::getTemporaryDirectBuffer
           (t  1.7,s  0.0) java.lang.ThreadLocal::get
            (t  1.7,s  0.0) java.lang.ThreadLocal$ThreadLocalMap::access$000
             (t  1.7,s  1.7) java.lang.ThreadLocal$ThreadLocalMap::getEntry
          (t  0.6,s  0.0) sun.nio.ch.IOVecWrapper::get
           (t  0.6,s  0.0) java.lang.ThreadLocal::get
            (t  0.6,s  0.0) java.lang.ThreadLocal$ThreadLocalMap::access$000
             (t  0.6,s  0.6) java.lang.ThreadLocal$ThreadLocalMap::getEntry
          (t  0.6,s  0.6) sun.nio.ch.Util::offerLastTemporaryDirectBuffer
          (t  0.6,s  0.0) java.nio.DirectByteBuffer::put
           (t  0.6,s  0.6) java.nio.Buffer::limit
          (t  0.6,s  0.6) java.nio.Buffer::position
          (t  0.6,s  0.0) sun.nio.ch.IOVecWrapper::putLen
           (t  0.6,s  0.6) sun.nio.ch.NativeObject::putLong
         (t  1.1,s  0.0) java.nio.channels.spi.AbstractInterruptibleChannel::begin
          (t  1.1,s  0.0) java.nio.channels.spi.AbstractInterruptibleChannel::blockedOn
           (t  1.1,s  0.0) java.lang.System$2::blockedOn
            (t  1.1,s  1.1) java.lang.Thread::blockedOn
         (t  1.1,s  1.1) sun.nio.ch.SocketChannelImpl::writerCleanup
        (t  1.1,s  1.1) java.nio.Buffer::hasRemaining
...
{code}

With patch:

{code}
Tree Profile:
 (t 100.0,s  2.2) java.lang.Thread::run
  (t 97.8,s  0.0) java.util.concurrent.ThreadPoolExecutor$Worker::run
   (t 97.8,s  0.0) java.util.concurrent.ThreadPoolExecutor::runWorker
    (t 97.8,s  0.1) org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader::run
     (t 97.7,s  0.2) org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader::doRunLoop
      (t 63.9,s  0.9) org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader::leading
       (t 59.1,s  0.0) sun.nio.ch.SelectorImpl::select
        (t 59.1,s  0.0) sun.nio.ch.SelectorImpl::select
         (t 59.1,s  0.0) sun.nio.ch.SelectorImpl::lockAndDoSelect
          (t 59.1,s  0.1) sun.nio.ch.EPollSelectorImpl::doSelect
           (t 49.2,s  0.0) sun.nio.ch.EPollArrayWrapper::poll
            (t 43.2,s  0.9) sun.nio.ch.EPollArrayWrapper::updateRegistrations
             (t 42.0,s 42.0) sun.nio.ch.EPollArrayWrapper::epollCtl
             (t  0.4,s  0.2) java.util.BitSet::get
              (t  0.1,s  0.1) java.util.BitSet::wordIndex
            (t  6.0,s  6.0) sun.nio.ch.EPollArrayWrapper::epollWait
           (t  9.1,s  1.2) sun.nio.ch.EPollSelectorImpl::updateSelectedKeys
            (t  5.3,s  0.0) java.util.HashMap::get
             (t  5.3,s  3.9) java.util.HashMap::getNode
              (t  1.3,s  1.3) java.lang.Integer::equals
            (t  1.0,s  0.0) sun.nio.ch.SocketChannelImpl::translateAndSetReadyOps
             (t  1.0,s  1.0) sun.nio.ch.SocketChannelImpl::translateReadyOps
            (t  0.9,s  0.0) java.util.HashSet::add
             (t  0.9,s  0.0) java.util.HashMap::put
              (t  0.9,s  0.5) java.util.HashMap::putVal
               (t  0.4,s  0.4) java.util.HashMap::newNode
            (t  0.6,s  0.0) java.util.HashSet::contains
             (t  0.6,s  0.0) java.util.HashMap::containsKey
              (t  0.5,s  0.5) java.util.HashMap::getNode
              (t  0.1,s  0.1) java.util.HashMap::hash
            (t  0.1,s  0.1) sun.nio.ch.EPollArrayWrapper::getDescriptor
           (t  0.6,s  0.6) sun.nio.ch.IOUtil::drain
           (t  0.1,s  0.0) java.nio.channels.spi.AbstractSelector::end
            (t  0.1,s  0.1) java.nio.channels.spi.AbstractInterruptibleChannel::blockedOn
       (t  1.5,s  0.0) sun.nio.ch.SelectionKeyImpl::interestOps
        (t  1.5,s  0.6) sun.nio.ch.SelectionKeyImpl::nioInterestOps
         (t  0.9,s  0.0) sun.nio.ch.SocketChannelImpl::translateAndSetInterestOps
          (t  0.9,s  0.0) sun.nio.ch.EPollSelectorImpl::putEventOps
           (t  0.9,s  0.9) sun.nio.ch.EPollArrayWrapper::setInterest
       (t  1.2,s  0.0) java.util.concurrent.ConcurrentLinkedQueue::add
        (t  1.2,s  0.9) java.util.concurrent.ConcurrentLinkedQueue::offer
         (t  0.2,s  0.2) java.util.concurrent.ConcurrentLinkedQueue$Node::casNext
         (t  0.1,s  0.1) java.util.concurrent.ConcurrentLinkedQueue$Node::<init>
       (t  0.7,s  0.1) java.util.HashMap$KeyIterator::next
...
{code}






> Apply the Leader/Followers pattern to RpcServer's Reader
> --------------------------------------------------------
>
>                 Key: HBASE-14479
>                 URL: https://issues.apache.org/jira/browse/HBASE-14479
>             Project: HBase
>          Issue Type: Improvement
>          Components: IPC/RPC, Performance
>            Reporter: Hiroshi Ikeda
>            Assignee: Hiroshi Ikeda
>            Priority: Minor
>         Attachments: HBASE-14479-V2 (1).patch, HBASE-14479-V2.patch, 
> HBASE-14479-V2.patch, HBASE-14479.patch, flamegraph-19152.svg, 
> flamegraph-32667.svg, gc.png, gets.png, io.png, median.png
>
>
> {{RpcServer}} uses multiple selectors to read data for load distribution, but 
> the distribution is just done by round-robin. It is uncertain, especially over 
> a long run, whether load is divided equally and resources are used without 
> being wasted.
> Moreover, multiple selectors may cause excessive context switches, which 
> favor low latency (while we just add the requests to queues) but can reduce 
> the throughput of the whole server.
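
For illustration, the round-robin handoff the description refers to looks roughly like the sketch below; class and field names are paraphrased, not the exact branch-1 code.

{code}
import java.nio.channels.SocketChannel;

/**
 * Rough sketch of round-robin connection distribution: the listener assigns
 * each accepted connection to the next Reader in a fixed array, regardless of
 * how busy that reader currently is. Paraphrased, not the actual HBase code.
 */
class RoundRobinListener {
  private final Reader[] readers;
  private int currentReader = 0;

  RoundRobinListener(int readerCount) {
    readers = new Reader[readerCount];
    for (int i = 0; i < readerCount; i++) {
      readers[i] = new Reader();
    }
  }

  /** Pick the next reader in strict rotation. */
  Reader getReader() {
    currentReader = (currentReader + 1) % readers.length;
    return readers[currentReader];
  }

  void accept(SocketChannel connection) {
    // Each connection is pinned to one reader's selector for its lifetime,
    // so a few hot connections can overload a single reader.
    getReader().register(connection);
  }

  static class Reader {
    void register(SocketChannel connection) {
      // register the connection with this reader's selector (omitted in sketch)
    }
  }
}
{code}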


