[ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644556#comment-14644556 ]
Ariel Weisberg commented on CASSANDRA-7392:
-------------------------------------------

bq. The core issue is how to monitor tasks. The existing approach should have a cost linear to the number of in-progress operations, since we cancel the future when the operation completes. This should be the same as the number of threads. I thought about a single thread with a queue as well, but opted for scheduling a task per operation due to the cost of keeping a shared array or queue. It did not occur to me to consider using a thread local, which is an excellent idea.

It appears that by default cancel doesn't remove the task from the priority queue, and that STPE has a setRemoveOnCancelPolicy(boolean) method that lets you choose what happens. You are right that the cost in space could be linear to the number of threads, but each read still has to push an element onto the global concurrent queue and then remove it from the queue as part of canceling the future. So the cost in terms of accesses to global mutable shared state is still linear to the number of operations. I am a big believer in getting the number of accesses to global mutable shared state on a per-operation basis as close to 0 as possible so we don't get whacked by Amdahl's law. I prefer to push harder on scale-up than single-threaded performance because the ROI is greater.

bq. To make sure I understand your suggestion correctly: an entry is inserted when the thread local is created, meaning when the thread local initial value is created, so the COW array is copied once per thread. Then, when a new operation starts, we need to atomically modify the state in the thread local. Correct? So, this gets rid of the priority queue contention as well as the insert and remove logarithmic costs, which is nice. The downside is the complexity of atomically swapping in a new operation, which shouldn't be much.

I think you have it. Code wise it is a little more complex because you are implementing behaviors that are handled by STPE now.
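The scheme we're discussing could look roughly like the following. This is a minimal sketch, not actual Cassandra code: the names (TimeoutMonitor, OperationSlot, begin/complete) and the scan interval are my own, and the abort action is left as a placeholder. It shows the COW registry copied once per thread and a single monitor thread scanning per-thread slots, so workers never touch global mutable shared state per operation.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class TimeoutMonitor
{
    /** One slot per worker thread: the deadline of the in-flight operation, or null when idle. */
    static final class OperationSlot
    {
        // Written only by the owning worker, read only by the monitor thread, so the
        // cache line can stay in the exclusive state between monitor scans.
        final AtomicReference<Long> deadlineNanos = new AtomicReference<>();
    }

    // Copy-on-write registry: the array is copied once per thread creation, not per operation.
    private final List<OperationSlot> slots = new CopyOnWriteArrayList<>();

    private final ThreadLocal<OperationSlot> localSlot = ThreadLocal.withInitial(() -> {
        OperationSlot slot = new OperationSlot();
        slots.add(slot); // the only COW copy this thread ever triggers
        return slot;
    });

    TimeoutMonitor(long scanIntervalMillis)
    {
        ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "timeout-monitor");
            t.setDaemon(true);
            return t;
        });
        monitor.scheduleAtFixedRate(this::expiredCount,
                                    scanIntervalMillis, scanIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Called by a worker when an operation starts; touches only this thread's slot. */
    void begin(long timeoutNanos)
    {
        localSlot.get().deadlineNanos.lazySet(System.nanoTime() + timeoutNanos);
    }

    /** Called by a worker when an operation completes; lazySet may suffice since only the monitor reads it. */
    void complete()
    {
        localSlot.get().deadlineNanos.lazySet(null);
    }

    /** Monitor-side scan; real code would abort/log each timed-out operation here. */
    int expiredCount()
    {
        long now = System.nanoTime();
        int expired = 0;
        for (OperationSlot slot : slots)
        {
            Long deadline = slot.deadlineNanos.get();
            if (deadline != null && now - deadline > 0)
                expired++;
        }
        return expired;
    }
}
```

Whether lazySet (rather than a volatile set) is actually safe for publishing the deadline to the monitor thread is exactly the question to put to Benedict; the sketch assumes the monitor tolerates seeing a slightly stale slot.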
Execution wise it is a win (hard to know how big) because the cache line holding the currently running operation for a thread can stay in the exclusive state across multiple operations. It will only enter the shared state when the timeout thread goes to check it. I'm not sure if this field will be a candidate for lazySet; you should ask Benedict.

bq. It should send no response, this is the same behavior for queries that are dropped before they are started, see the top of MessagingDeliveryTask.run(). We could also send a timeout error but I think the coordinator times out anyway on its own, and I prefer not to change the coordinator code.

Sounds good to me. I like keeping scope small.

> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>             Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because the node
> is overloaded) but not queries that time out while being processed.
> (Particularly common for index queries on data that shouldn't be indexed.)
> Adding the latter and logging when we have to interrupt one gets us a poor
> man's "slow query log" for free.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)