[ 
https://issues.apache.org/jira/browse/CASSANDRA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962362#comment-13962362
 ] 

Benedict commented on CASSANDRA-6995:
-------------------------------------

Did you test this with CQL3 + native client connectivity?

I have a patch lying around that does exactly this, but I didn't try too hard 
to integrate it because it only helps a very specific use case (thrift 
queries with smart routing, I found, and since thrift doesn't natively support 
smart routing it's a limited win), and the patch is a little more invasive than 
this one. 

If we want to do it, we should probably obey the concurrent_reads property in 
the yaml, which means modifying the executors to permit stealing one of the 
work permits (and, if none are available, dropping the read onto the queue as 
normal). 
In the nearish future it may be possible to speculatively execute on the 
assumption the data is in memory, and fall back to the executor stage only if 
it turns out not to be, but until then we really need to stick to the yaml.
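
To make the permit-stealing idea concrete, here is a minimal sketch, not the 
actual patch. It assumes a stage that guards reads with a semaphore sized from 
the yaml setting. Every name in it (PermitStealingStage, 
maybeExecuteImmediately, runWithPermit) is hypothetical and does not come from 
the Cassandra code base.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: none of these names exist in Cassandra.
final class PermitStealingStage {
    // One permit per read allowed to run at once (the yaml limit).
    private final Semaphore permits;
    // Stand-in for the read stage; excess tasks wait in the pool's queue.
    private final ExecutorService workers;

    PermitStealingStage(int concurrentReads) {
        this.permits = new Semaphore(concurrentReads);
        this.workers = Executors.newFixedThreadPool(concurrentReads);
    }

    /**
     * Tries to steal a permit and run the read on the calling thread.
     * Returns true if it ran inline, false if it was queued as normal.
     */
    boolean maybeExecuteImmediately(Runnable read) {
        if (permits.tryAcquire()) {
            try {
                read.run();
            } finally {
                permits.release();
            }
            return true;
        }
        // No permit free: fall back to the queue.
        workers.execute(() -> runWithPermit(read));
        return false;
    }

    // Worker threads also take a permit, so inline and queued reads
    // together never exceed the configured limit.
    private void runWithPermit(Runnable read) {
        permits.acquireUninterruptibly();
        try {
            read.run();
        } finally {
            permits.release();
        }
    }

    void shutdown() {
        workers.shutdown();
    }
}
```

The point of routing both paths through the same semaphore is that a read 
executed on the request thread still counts against the limit, so the yaml 
value keeps its meaning.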

Anyway, I can rebase my old patch and post it if we want something to compare 
with that does this.

> Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to 
> read stage
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6995
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.0.7
>
>         Attachments: 6995-v1.diff, syncread-stress.txt
>
>
> When performing a read local to a coordinator node, AbstractReadExecutor will 
> create a new SP.LocalReadRunnable and drop it into the read stage for 
> asynchronous execution. If you are using a client that intelligently routes 
> read requests to a node holding the data for a given request, and are using 
> CL.ONE/LOCAL_ONE, enqueuing the SP.LocalReadRunnable and waiting for the 
> context switches (and possible NUMA misses) adds unnecessary latency. We can 
> reduce that latency and improve throughput by avoiding the queueing and 
> thread context switching and simply executing the SP.LocalReadRunnable 
> synchronously in the request thread. Testing on a three-node cluster (each 
> node with 32 CPUs and 132 GB RAM) yields a ~10% improvement in throughput and 
> a ~20% speedup on the avg/95/99 percentiles (the 99.9th percentile improved 
> by about 5-10%).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
