[
https://issues.apache.org/jira/browse/HBASE-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417809#comment-16417809
]
stack commented on HBASE-20236:
-------------------------------
[~eshcar] Thank you. Funny. I was looking at this yesterday. Sorry the doc is
poor. I was trying to read what was there yesterday and came to the same
conclusion about my hackery. The class comment tries to give the high-level
notion of what is going on here; i.e. if a Handler is available when a Call is
queued, skip the queue and pass the Call directly to that Handler so it runs
immediately, avoiding queuing and coordination costs; otherwise, add the Call
to the queue for processing later when a Handler is free. See HBASE-16023 for
measurements and some interesting commentary by [~ikeda], a contributor who
knows concurrency well.
The Semaphore is meant to coordinate the Reader's hand-off of the Call to the
Handler. The Handler may be occupied running an existing Call; if you remove
the Semaphore, I'm afraid a Handler could be asked to run a Call while in the
midst of running another. Yesterday I spent five minutes trying to figure out
whether there was a way we could purge the Semaphore. BTW, this hand-off from
Readers to Handlers is our most expensive operation when the workload is
read-only and served mostly from cache; having the Readers run the Call
themselves rather than pass it to a Handler at least doubles our throughput
(but means there is no scheduling).
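To make the shape of it concrete, here is a stripped-down sketch of the idea.
It is NOT the actual FastPathBalancedQueueRpcExecutor code; the class and
method names (FastPathSketch, HandlerSketch, dispatch, loadCall, getCall) are
made up for illustration:
{code}
// Illustrative sketch only, not the real FastPathBalancedQueueRpcExecutor.
import java.util.Deque;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

public class FastPathSketch {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  // Handlers with nothing to do push themselves here and park on their Semaphore.
  private final Deque<HandlerSketch> idleHandlers = new ConcurrentLinkedDeque<>();

  /** Reader side: hand the call straight to an idle handler if there is one. */
  public boolean dispatch(Runnable call) {
    HandlerSketch idle = idleHandlers.poll();
    if (idle != null) {
      // Fast path: skip the queue and its coordination costs.
      return idle.loadCall(call);
    }
    // Slow path: no free handler; queue the call for pick-up later.
    return queue.offer(call);
  }

  class HandlerSketch extends Thread {
    // Zero-permit Semaphore coordinates the hand-off: the handler parks here
    // until a Reader has loaded a call for it. A handler is only on the idle
    // stack when it is not running anything, so it can never be handed a new
    // call while in the midst of running another.
    private final Semaphore loaded = new Semaphore(0);
    private Runnable loadedCall;

    boolean loadCall(Runnable call) {
      this.loadedCall = call;
      loaded.release();       // wake the parked, known-idle handler
      return true;
    }

    private Runnable getCall() throws InterruptedException {
      Runnable call = queue.poll();
      if (call != null) {
        return call;          // drain queued work first
      }
      idleHandlers.push(this); // advertise ourselves as free
      loaded.acquire();        // park until a Reader loads a call for us
      return loadedCall;
    }

    @Override
    public void run() {
      try {
        while (true) {
          getCall().run();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }
}
{code}
The LIFO stack of idle handlers plus the zero-permit Semaphore are the
essential bits; the parked acquire in the sketch's getCall corresponds to the
Semaphore$NonfairSync park under FastPathHandler.getCallRunner that dominates
the profile quoted below.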
I used to be intimate with the machinations here but would need to spend time
to refresh (and improve the doc). Any input would be much appreciated (I could
even try stuff if you want me to -- I'm doing perf stuff generally these
days... Currently making the report you asked for over on HBASE-20188).
> [locking] Write-time worst offenders
> ------------------------------------
>
> Key: HBASE-20236
> URL: https://issues.apache.org/jira/browse/HBASE-20236
> Project: HBase
> Issue Type: Sub-task
> Components: Performance
> Affects Versions: 2.0.0-beta-2
> Reporter: stack
> Priority: Major
>
> Messing w/ my new toy, here are the worst locking offenders; they must be bad
> if they show up in this sampling profiler:
> {code}
> Total: 769321884622 (99.24%) samples: 2965
>   [ 0] java.util.concurrent.Semaphore$NonfairSync
>   [ 1] sun.misc.Unsafe.park
>   [ 2] java.util.concurrent.locks.LockSupport.park
>   [ 3] java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
>   [ 4] java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly
>   [ 5] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly
>   [ 6] java.util.concurrent.Semaphore.acquire
>   [ 7] org.apache.hadoop.hbase.ipc.FastPathBalancedQueueRpcExecutor$FastPathHandler.getCallRunner
>   [ 8] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
>
> Total: 4284274263 (0.55%) samples: 23543
>   [ 0] org.apache.hadoop.hbase.regionserver.MutableSegment
>   [ 1] org.apache.hadoop.hbase.ByteBufferKeyValue.getSequenceId
>   [ 2] org.apache.hadoop.hbase.regionserver.Segment.updateMetaInfo
>   [ 3] org.apache.hadoop.hbase.regionserver.Segment.internalAdd
>   [ 4] org.apache.hadoop.hbase.regionserver.MutableSegment.add
>   [ 5] org.apache.hadoop.hbase.regionserver.AbstractMemStore.internalAdd
>   [ 6] org.apache.hadoop.hbase.regionserver.AbstractMemStore.add
>   [ 7] org.apache.hadoop.hbase.regionserver.AbstractMemStore.add
>   [ 8] org.apache.hadoop.hbase.regionserver.HStore.add
>   [ 9] org.apache.hadoop.hbase.regionserver.HRegion.applyToMemStore
>   [10] org.apache.hadoop.hbase.regionserver.HRegion.access$600
>   [11] org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.applyFamilyMapToMemStore
>   [12] org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.lambda$writeMiniBatchOperationsToMemStore$0
>   [13] org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation$$Lambda$442.1445825895.visit
>   [14] org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.visitBatchOperations
>   [15] org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.writeMiniBatchOperationsToMemStore
>   [16] org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.writeMiniBatchOperationsToMemStore
>   [17] org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate
>   [18] org.apache.hadoop.hbase.regionserver.HRegion.batchMutate
>   [19] org.apache.hadoop.hbase.regionserver.HRegion.batchMutate
>   [20] org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp
>   [21] org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp
>   [22] org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation
>   [23] org.apache.hadoop.hbase.regionserver.RSRpcServices.multi
>   [24] org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod
>   [25] org.apache.hadoop.hbase.ipc.RpcServer.call
>   [26] org.apache.hadoop.hbase.ipc.CallRunner.run
>   [27] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
>   [28] org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run
>
> Total: 717708856 (0.09%) samples: 214
>   [ 0] java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync
>   [ 1] sun.misc.Unsafe.park
>   [ 2] java.util.concurrent.locks.LockSupport.park
>   [ 3] java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
>   [ 4] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued
>   [ 5] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire
>   [ 6] java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock
>   [ 7] org.apache.hadoop.hbase.regionserver.HRegion.blockUpdates
>   [ 8] org.apache.hadoop.hbase.regionserver.RegionServicesForStores.blockUpdates
>   [ 9] org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory
>   [10] org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run
>   [11] java.util.concurrent.ThreadPoolExecutor.runWorker
>   [12] java.util.concurrent.ThreadPoolExecutor$Worker.run
>   [13] java.lang.Thread.run
> ...
> {code}