I'm not thinking about counters specifically here, and assuming you are sending 
batch mutations of the same size…

The mutations (inserts, counter increments) for a row are turned into a single 
task server-side, and are then processed serially. If you send a batch that 
mutates two rows, it is turned into two tasks, which can then be processed in 
parallel. 
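
Conceptually it is something like the sketch below (this is not Cassandra's 
actual code; the class names, task wiring, and the pool size of 32 are 
illustrative assumptions):

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Illustrative sketch only -- not Cassandra's real classes or pool wiring.
    public class MutationSketch {
        public static void main(String[] args) {
            ExecutorService mutationPool = Executors.newFixedThreadPool(32);

            // A batch touching two rows becomes two tasks.
            List<List<Runnable>> mutationsPerRow = List.of(
                List.<Runnable>of(() -> System.out.println("row1: insert"),
                                  () -> System.out.println("row1: counter +1")),
                List.<Runnable>of(() -> System.out.println("row2: insert")));

            for (List<Runnable> rowMutations : mutationsPerRow) {
                // Tasks for different rows can run in parallel...
                mutationPool.submit(() -> {
                    // ...but the mutations within one row apply serially.
                    for (Runnable m : rowMutations) {
                        m.run();
                    }
                });
            }
            mutationPool.shutdown();
        }
    }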

There is a point of diminishing returns here. Each row you write to or read 
from becomes a task: if you write to 1,000 rows at once you will put 1,000 
tasks into a thread pool which typically has 32 concurrent threads. This can 
block or add latency to other requests. It's more of an issue with reads than 
writes. 
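
As a back-of-the-envelope illustration of that queuing effect (the per-task 
time here is an assumed number for the example, not a measurement):

    // Rough illustration of how one large batch occupies the mutation pool.
    public class PoolSaturation {
        public static void main(String[] args) {
            int rows = 1_000;        // rows touched by one batch
            int threads = 32;        // concurrent threads in the pool
            double taskMillis = 0.5; // assumed time to apply one row task

            // The pool drains the batch in "waves" of at most `threads`
            // tasks at a time, so it stays busy for roughly
            // waves * taskMillis, and unrelated requests queue behind it.
            double waves = Math.ceil(rows / (double) threads); // 32 waves
            System.out.printf("pool occupied for ~%.0f ms%n", waves * taskMillis);
        }
    }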

Does that apply to your situation? 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/02/2012, at 1:19 AM, Amit Chavan wrote:

> 
> Hi,
> 
> In our use case, we maintain minute-wise roll-ups for different metrics. 
> These are stored in a counter column family where the row key is a composite 
> containing the timestamp rounded down to the last minute and an integer 
> between 0 and 9 (this integer is the MD5 hash of the metric, mod 10). The 
> column names are the metrics we wish to track. Typically, each row has about 
> 100,000 counters.
> 
> We tested two scenarios. The first is as described above. In this case we 
> saw a per-write latency of about 80 to 100 microseconds.
> 
> In the other scenario, we calculated the integer in the row key as mod 100. 
> In this case we observed a per-write latency of 50 to 70 microseconds.
> 
> I wish to understand why updates to counters were faster when they were 
> spread across more rows.
> 
> Cluster summary: 4 nodes running Cassandra 1.0.5, each with 8 cores, 32G 
> RAM, and a 10G Cassandra heap. We are using a replication factor of 2.
> 
> 
> -- 
> Thanks!
> Amit Chavan
> 
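
(For reference, the row-key scheme described in the quoted mail looks roughly 
like the sketch below. The key encoding and names are assumptions for 
illustration; the point is that mod 100 spreads the same counters over ten 
times as many rows as mod 10, so a batch touching them produces ten times as 
many per-row tasks that can proceed in parallel.)

    import java.math.BigInteger;
    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;

    // Sketch of the row-key scheme from the quoted mail (encoding assumed):
    // key = <timestamp rounded down to the minute>:<md5(metric) mod shards>
    public class RowKeySketch {
        static String rowKey(long epochMillis, String metric, int shards)
                throws Exception {
            long minute = epochMillis / 60_000 * 60_000; // round down to the minute
            byte[] md5 = MessageDigest.getInstance("MD5")
                                      .digest(metric.getBytes(StandardCharsets.UTF_8));
            int shard = new BigInteger(1, md5).mod(BigInteger.valueOf(shards)).intValue();
            return minute + ":" + shard;
        }

        public static void main(String[] args) throws Exception {
            long now = System.currentTimeMillis();
            // mod 10 vs mod 100: same counters spread over 10x more rows.
            System.out.println(rowKey(now, "api.requests", 10));
            System.out.println(rowKey(now, "api.requests", 100));
        }
    }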
