[ 
https://issues.apache.org/jira/browse/CASSANDRA-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571162#comment-16571162
 ] 

Chris Lohfink commented on CASSANDRA-14436:
-------------------------------------------

While I think we could do something like create a concurrent set of Samplers 
for each SamplerType, tie it to a Sampler session, and flag them all to start 
at the same time, I don't think it's necessary. The current top partitions 
implementation has never had a reported issue with people trying to run 
profiling sessions concurrently, so it can be a new feature added in another 
ticket at some point, but I don't think it's needed enough here.

In the meantime I added a strict restriction of a single session at a time, 
raising an exception if someone tries to kick off a second one. The sampling 
will also time out at the end of the duration, so if finish is never called 
it won't spin forever.
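
To illustrate the single-session restriction and the timeout, here is a 
minimal sketch using only the JDK. It is not the code from the patch: the 
class name SingleSamplingSession and the methods begin(), finish() and 
isActive() are hypothetical, and the real sampler does more than flip a flag. 
The session id is there so that a stale timeout from an earlier session 
cannot end a later one.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical guard, not the class from the patch: allows one sampling
// session at a time and expires it automatically after its duration.
class SingleSamplingSession {
    private static final long IDLE = 0L;

    private final AtomicLong nextId = new AtomicLong();
    private final AtomicLong current = new AtomicLong(IDLE);

    // Daemon timer thread so an abandoned session never keeps the JVM alive.
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "sampler-timeout");
            t.setDaemon(true);
            return t;
        });

    // Starts a session, or throws if one is already running.
    void begin(long durationMillis) {
        long id = nextId.incrementAndGet();
        if (!current.compareAndSet(IDLE, id))
            throw new IllegalStateException("a sampling session is already running");
        // Expire only this session; a newer session has a different id.
        timer.schedule(() -> { current.compareAndSet(id, IDLE); },
                       durationMillis, TimeUnit.MILLISECONDS);
    }

    // Ends the running session; returns false if none was active.
    boolean finish() {
        return current.getAndSet(IDLE) != IDLE;
    }

    boolean isActive() {
        return current.get() != IDLE;
    }
}
```

A second begin() while a session is active fails fast with an exception, and 
a caller that never calls finish() still has the session released once the 
duration elapses.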

I did write some basic JMH benchmarks, but I didn't want to make insert() 
accessible, and the {{.*microbench.*}} in build.xml makes default visibility 
not an option, so... yeah. Ultimately (when on) it's just a 
ThreadPoolExecutor.submit() of the addSample in the read/write path, so the 
cost is pretty straightforwardly limited by contention on the queue; I saw 
roughly 100-300 nanoseconds.

Going into the actual guts, the frequency sampler is a wrapper around the 
addthis StreamSummary. There might be something better out there now, but it 
has seemed to do fine so far. In some worst-case JMH benchmarks I was able to 
see this hit 3us or so, which could conceivably fall behind the write rate 
and cause a backup.

The MaxSampler uses MinMaxPriorityQueue, which can be replaced with something 
more performant once PriorityQueue(comparator) becomes available (post 
Java 8), but it rarely breaks a microsecond even with top 1024.

Just in case, as a catch-all I added the same thing the trace executor has: 
throwaway load shedding in case the sampler executor does get backed up. This 
includes some plumbing so it's reported appropriately in metrics.
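
As a rough illustration of the throwaway load shedding, the sketch below uses 
a plain JDK ThreadPoolExecutor with a bounded queue and a rejection handler 
that drops the sample and bumps a counter. It is an assumption about the 
general shape, not the patch itself: the class name 
LoadSheddingSamplerExecutor, the queue capacity parameter and the 
droppedSamples() counter are hypothetical, and Cassandra's own executor and 
metrics classes are not used here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a single-threaded sampler executor that sheds load
// instead of blocking the read/write path when its queue is full.
class LoadSheddingSamplerExecutor {
    // Stand-in for a real metric; counts samples that were thrown away.
    private final AtomicLong dropped = new AtomicLong();
    private final ThreadPoolExecutor executor;

    LoadSheddingSamplerExecutor(int queueCapacity) {
        executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            r -> {
                Thread t = new Thread(r, "sampler");
                t.setDaemon(true);
                return t;
            },
            // Rejection handler: discard the sample and record the drop.
            (task, pool) -> dropped.incrementAndGet());
    }

    // Called from the hot path; never blocks, even when backed up.
    void addSample(Runnable sample) {
        executor.execute(sample);
    }

    long droppedSamples() {
        return dropped.get();
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

With this shape a slow sampler costs lost samples rather than stalled reads 
and writes, and the drop counter gives something to surface in metrics.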

> Add sampler for query time and expose with nodetool
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14436
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14436
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Major
>
> Create a new {{nodetool profileload}} that functions just like toppartitions 
> but with more data, returning the slowest local reads and writes on the host 
> during a given duration and the most frequently touched partitions (same as 
> {{nodetool toppartitions}}). A refactor is included to extend the sampler to 
> uses outside of top frequency (max instead of total sample values).
> Future work is to include top cpu and allocations by query and possibly 
> tasks/cpu/allocations by stage during the time window.


