[ https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828304#comment-13828304 ]
Jason Brown commented on CASSANDRA-1632:
----------------------------------------

On a happier note, though, by simply switching OTC to batch reads from its LBQ, I scored a 10% improvement in coordinator throughput (latencies remained unaffected). I'll clean up that patch (https://github.com/jasobrown/cassandra/tree/1632_batchDispatch), and actually put in error handling :). I'll also apply the same technique elsewhere in the code, although PeriodicCommitLogExecutorService seems like the only other interesting place for it.

> Thread workflow and cpu affinity
> --------------------------------
>
>                 Key: CASSANDRA-1632
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1632
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Jason Brown
>              Labels: performance
>         Attachments: threadAff_reads.txt, threadAff_writes.txt
>
> Here are some thoughts I wanted to write down; we need to run some serious
> benchmarks to see the benefits:
> 1) All thread pools for our stages use a shared queue per stage. For some
> stages we could move to a model where each thread has its own queue. This
> would reduce lock contention on the shared queue. This workload only suits
> stages that have no variance, else you run into thread starvation. One
> stage where this might work: ROW-MUTATION.
> 2) Set cpu affinity for each thread in each stage. If we can pin threads to
> specific cores, and control the workflow of a message from Thrift down through
> each stage, we should see improvements from reduced L1 cache misses. We would
> need to build a JNI extension (to set cpu affinity), as I could not find
> anywhere in the JDK where it is exposed.
> 3) Batching the delivery of requests across stage boundaries. Peter Schuller
> hasn't looked deeply enough into the JDK yet, but he thinks there may be
> significant improvements to be had there, especially in high-throughput
> situations: on each consumption you would consume everything in the queue,
> rather than imposing a synchronization point between each request.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
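The batch-dispatch idea described in the comment can be sketched roughly as follows: instead of `take()`-ing one message per loop iteration (one lock handshake per message), the consumer blocks for the first message and then `drainTo()`s whatever else is already queued, amortizing the queue synchronization across the whole batch. This is a minimal illustrative sketch, not the actual OTC patch; the class and method names (`BatchDrain`, `takeBatch`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchDrain {
    // Blocks until at least one element is available, then drains the rest
    // of the queue in a single pass, so the lock is taken once per batch
    // rather than once per message.
    static <T> List<T> takeBatch(LinkedBlockingQueue<T> queue) throws InterruptedException {
        List<T> batch = new ArrayList<>();
        batch.add(queue.take());   // block for the first message
        queue.drainTo(batch);      // grab everything else without further blocking
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        queue.add("c");
        List<String> batch = takeBatch(queue);
        System.out.println(batch.size() + " messages drained in one pass");  // → 3
    }
}
```

Under load, the consumer still makes one blocking call per batch, but each `drainTo()` picks up every message that arrived while the previous batch was being processed, which is where the throughput win comes from.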