[ 
https://issues.apache.org/jira/browse/IGNITE-10418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Chudov reassigned IGNITE-10418:
-------------------------------------

    Assignee:     (was: Denis Chudov)

> Implement lightweight profiling of messages processing
> ------------------------------------------------------
>
>                 Key: IGNITE-10418
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10418
>             Project: Ignite
>          Issue Type: New Feature
>            Reporter: Alexey Scherbakov
>            Priority: Major
>              Labels: IEP-35
>
> Currently it is hard to identify bottlenecks without extensive 
> profiling on the server and client side (JFR recording, sampling profilers, 
> regular thread dumps, etc.), which is not always possible. Even when 
> profiling data is available, it is not always helpful for identifying some 
> types of bottlenecks, for example, contention on a single key/partition.
> Lightweight message profiling will make it possible to track the execution 
> of each message, to collect execution statistics in executors for each grid 
> node and for all nodes, and to collect histograms of waiting/execution time 
> for each type of message.
> We need to implement:
>  # Histogram metrics for message execution time, queue waiting time, and 
> queue size at the moments of enqueueing and execution start, broken down by 
> message type;
>  # Dumping of a message if its execution/waiting time exceeds some threshold 
> timeout, e.g.
> {code:java}
> Slow message: *enqueueTs*=2018-11-27 15:10:22.241, *waitTime*=0.048, 
> *procTime*=305.186, *messageId*=3a3064a9, *queueSzBefore*=0, 
> *headMessageId*=null, *queueSzAfter*=0, *message*=GridNearTxFinishRequest 
> [miniId=1, mvccSnapshot=null, super=GridDistributedTxFinishRequest 
> [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], 
> futId=199a3155761-f379f312-ad4b-4181-acc5-0aacb3391f07, threadId=296, 
> commitVer=null, invalidate=false, commit=true, baseVer=null, txSize=0, 
> sys=false, plc=2, subjId=dda703a0-69ee-47cf-9b9a-bf3dc9309feb, 
> taskNameHash=0, flags=32, syncMode=FULL_SYNC, txState=IgniteTxStateImpl 
> [activeCacheIds=[644280847], recovery=false, mvccEnabled=false, txMap=HashSet 
> [IgniteTxEntry [key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true], 
> cacheId=644280847, txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=8, val=8, 
> hasValBytes=true], cacheId=644280847], val=[op=READ, val=null], 
> prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null], 
> entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, 
> explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], 
> filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry 
> [key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true], val=null, 
> ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=8, 
> extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion 
> [topVer=2147483647, order=0, nodeOrder=0]], flags=2]GridDistributedCacheEntry 
> [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=8, super=], prepared=0, 
> locked=false, nodeId=null, locMapped=false, expiryPlc=null, 
> transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, 
> xidVer=GridCacheVersion{code}
>  # JMX tools and a command-line interface to get these metrics and print a 
> statistics view.
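
The first two requirements above (per-message-type histograms and dumping of slow messages) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Ignite implementation: the class MessageProfiler, its methods, the bucket bounds and the threshold are all hypothetical names and values chosen for the example, and queue-size metrics are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch of lightweight message profiling; not part of the Ignite API. */
public class MessageProfiler {
    /** Ascending upper bounds (ms) of the histogram buckets; one overflow bucket is added. */
    private final long[] boundsMs;

    /** Wait or processing time (ms) above which a message is reported as slow. */
    private final long slowThresholdMs;

    private final Map<String, AtomicLongArray> waitHist = new ConcurrentHashMap<>();
    private final Map<String, AtomicLongArray> procHist = new ConcurrentHashMap<>();
    private final List<String> slowLog = new CopyOnWriteArrayList<>();

    public MessageProfiler(long[] boundsMs, long slowThresholdMs) {
        this.boundsMs = boundsMs.clone();
        this.slowThresholdMs = slowThresholdMs;
    }

    /** Records one processed message of the given type. */
    public void record(String msgType, long waitMs, long procMs) {
        bump(waitHist, msgType, waitMs);
        bump(procHist, msgType, procMs);

        if (waitMs > slowThresholdMs || procMs > slowThresholdMs)
            slowLog.add("Slow message: waitTime=" + waitMs + ", procTime=" + procMs
                + ", message=" + msgType);
    }

    /** Wraps a handler at enqueue time so that wait and processing time are measured. */
    public Runnable wrap(String msgType, Runnable handler) {
        final long enqueueNanos = System.nanoTime();

        return () -> {
            long startNanos = System.nanoTime();
            try {
                handler.run();
            }
            finally {
                long endNanos = System.nanoTime();
                record(msgType,
                    TimeUnit.NANOSECONDS.toMillis(startNanos - enqueueNanos),
                    TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos));
            }
        };
    }

    /** Returns a snapshot of the processing-time histogram for a message type. */
    public long[] procHistogram(String msgType) {
        return snapshot(procHist.get(msgType));
    }

    /** Returns a snapshot of the queue-wait histogram for a message type. */
    public long[] waitHistogram(String msgType) {
        return snapshot(waitHist.get(msgType));
    }

    /** Returns a copy of the slow-message reports collected so far. */
    public List<String> slowMessages() {
        return new ArrayList<>(slowLog);
    }

    private void bump(Map<String, AtomicLongArray> hist, String msgType, long valMs) {
        hist.computeIfAbsent(msgType, t -> new AtomicLongArray(boundsMs.length + 1))
            .incrementAndGet(bucketIndex(valMs));
    }

    /** Index of the first bucket whose upper bound is >= valMs, or the overflow bucket. */
    private int bucketIndex(long valMs) {
        for (int i = 0; i < boundsMs.length; i++) {
            if (valMs <= boundsMs[i])
                return i;
        }
        return boundsMs.length;
    }

    private long[] snapshot(AtomicLongArray buckets) {
        long[] res = new long[boundsMs.length + 1];
        if (buckets != null) {
            for (int i = 0; i < res.length; i++)
                res[i] = buckets.get(i);
        }
        return res;
    }
}
```

Under these assumptions, a message handler would be submitted as executor.execute(profiler.wrap(msgType, handler)), so the enqueue timestamp is taken when the task is created and the histograms are updated when it finishes. Counters are kept in AtomicLongArray instances to keep the per-message overhead small, which is the point of a lightweight profiler.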



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
