[ 
https://issues.apache.org/jira/browse/CASSANDRA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919464#action_12919464
 ] 

zhu han commented on CASSANDRA-1546:
------------------------------------

Kelvin,

Thanks for the clarification. It is a good idea to have the lead replica 
aggregate counters only in the MT (memtable) at CL.ONE. IIRC, what makes 
things more complicated is CL.QUORUM or CL.ALL; #1546 is intended to support 
these consistency levels. 

CL.QUORUM and CL.ALL require the lead replica to propagate the sum of its 
local columns to the other replicas, which in the current implementation needs 
an iteration over all SSTables and the MT. As Sylvain pointed out:
{quote}
So, to answer the question about serializing the writes, there is no need (and
I believe it's a good thing performance-wise). When a leader receives an
update, it doesn't read-then-write. It writes-then-read. And as parts of the
read, the newly inserted LocalCounterColumn will be 'merged' with the other,
already present LocalCounterColumn and yield the actual value of the column,
without the risk of losing an increment.
{quote}
The read here is the summing of local columns on the write path, not the 
normal read path.
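
To make the write-then-read order concrete, here is a minimal sketch. It is 
a toy model, not Cassandra code: the class LeaderReplica, its dict-based 
memtable and SSTables, and all method names are assumptions made for 
illustration only.

```python
class LeaderReplica:
    """Toy model of one lead replica (hypothetical, not Cassandra's classes)."""

    def __init__(self, sstables=None):
        self.memtable = {}              # counter name -> local sum held in the MT
        self.sstables = sstables or []  # flushed dicts: counter name -> partial sum

    def flush(self):
        # Move the current MT contents into a new SSTable.
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def local_sum(self, name):
        # The "read" half: merge the MT value with every SSTable fragment.
        flushed = sum(s.get(name, 0) for s in self.sstables)
        return self.memtable.get(name, 0) + flushed

    def increment(self, name, delta):
        # Write first, so the increment is already stored ...
        self.memtable[name] = self.memtable.get(name, 0) + delta
        # ... then read, to get the total that is sent to the other replicas.
        return self.local_sum(name)
```

Note that every increment walks all SSTables in local_sum, which is the cost 
the solutions below try to avoid.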

There are several possible solutions:
1) Maintain a cache of the sums of local columns. We can reuse the row cache 
mechanism or build a dedicated cache for counters, as Sylvain said.

2) Take the great idea from #1072 and do read-then-write: aggregate all local 
columns over the SSTables only when the local column is absent from the MT. 
Otherwise, aggregate with the local column already present in the MT. This 
way, the write path hits the SSTables only when the local column is absent 
from the MT, and the UUID marker is disabled. 

3) As in #1072, on the write path, propagate only the sum of the local columns 
in the MT to the remote replicas, and give the remote replicas a different 
column name each time the local column is absent from the MT. The column name 
is recorded in the MT as well. Otherwise, read the column name from the MT and 
give it to the remote replicas again. On the read path, iterate over all 
SSTables to get the latest value of the local column and the remote columns, 
and add them all up. During compaction, we can merge the remote columns from 
the same replica into one column to limit the number of remote columns.
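
A rough sketch of how solution 2) might look, under the same toy model 
(dicts standing in for the MT and the SSTables). The class name 
ReadThenWriteLeader and the sstable_scans field are hypothetical and only 
there to show when the SSTables are touched. Flushing, and making the 
check-and-update atomic, are left out.

```python
class ReadThenWriteLeader:
    """Toy model of solution 2 (hypothetical names, not Cassandra code)."""

    def __init__(self, sstables):
        self.sstables = sstables  # list of dicts: counter name -> partial sum
        self.memtable = {}        # counter name -> full local total once seeded
        self.sstable_scans = 0    # how often the write path hit the SSTables

    def increment(self, name, delta):
        if name not in self.memtable:
            # Absent from the MT: one scan over the SSTables seeds the total.
            self.sstable_scans += 1
            self.memtable[name] = sum(s.get(name, 0) for s in self.sstables)
        # Present in the MT (or just seeded): aggregate in memory only.
        self.memtable[name] += delta
        return self.memtable[name]
```

Only the first increment after the column disappears from the MT pays for 
the SSTable scan; later increments stay in memory.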

3) should be as cheap as #1072 and the normal Cassandra write path. UUID 
marker columns can be aggregated in the same way as in 3). But it is not easy 
to implement and is a bit ugly...
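
For what it's worth, the write side of 3) could be sketched roughly as 
follows for a single counter. Everything here is an assumption for 
illustration: the classes MtOnlyLeader and RemoteReplica, the 
"replica:sequence" naming scheme, and in-order delivery of updates. The 
flushed data and the compaction of remote columns are not modelled.

```python
import itertools


class MtOnlyLeader:
    """Toy leader for one counter; it never reads SSTables on the write path."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.mt_entry = None            # [column_name, sum in this MT] or None
        self._ids = itertools.count(1)  # source of fresh column names

    def increment(self, delta):
        if self.mt_entry is None:
            # Absent from the MT: pick a new column name and record it in the MT.
            name = f"{self.replica_id}:{next(self._ids)}"
            self.mt_entry = [name, 0]
        self.mt_entry[1] += delta
        # Only the MT sum is propagated, under the recorded column name.
        return tuple(self.mt_entry)

    def flush(self):
        # After a flush the column is absent from the MT again.
        self.mt_entry = None


class RemoteReplica:
    """Toy remote replica holding the remote columns of one counter."""

    def __init__(self):
        self.columns = {}  # column name -> latest value received

    def apply(self, column_name, value):
        # A later value under the same name replaces the earlier one.
        self.columns[column_name] = value

    def read(self):
        # Read path: add up the latest value of every remote column.
        return sum(self.columns.values())
```

Each flush on the leader starts a new remote column, which is why the number 
of remote columns grows until compaction merges them.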



> (Yet another) approach to counting
> ----------------------------------
>
>                 Key: CASSANDRA-1546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1546
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 0.7.0
>
>         Attachments: 0001-Remove-IClock-from-internals.patch, 
> 0001-v2-Remove-IClock-from-internals.patch, 
> 0001-v3-Remove-IClock-from-internals.txt, 0002-Counters.patch, 
> 0002-v2-Counters.patch, 0002-v3-Counters.txt, 
> 0003-Generated-thrift-files-changes.patch, 0003-v2-Thrift-changes.patch, 
> 0003-v3-Thrift-changes.txt, marker_idea.txt
>
>
> This could be described as a mix between CASSANDRA-1072 without clocks and 
> CASSANDRA-1421.
> More details in the comment below.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
