[ 
https://issues.apache.org/jira/browse/CASSANDRA-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871690#comment-13871690
 ] 

Aleksey Yeschenko commented on CASSANDRA-6504:
----------------------------------------------

Pushed v2 to https://github.com/iamaleksey/cassandra/commits/6504-v2

Just the difference between (trunk with merge in 6505-v2) and 6504-v2: 
https://github.com/iamaleksey/cassandra/compare/6505-trunk...6504-v2

I am writing more unit tests now (for counter cache and counter mutation), but 
that shouldn't be blocking the review (the tests will be there before 6504 gets 
properly committed).

bq. I don't think we need the CounterId renewal stuff on cleanup anymore. So 1) 
we can remove it from Cleanup and remove a bunch of stuff from CounterId

Done. We *could* also move the counter id to system.local, since we don't care 
about history anymore, but I'm keeping it as is for now.

bq. 2) this means a node shouldn't change it's counterId at all anymore. So, 
since the counter cache stores only old local shards, we can skip storing the 
counterId in each cache key (we'd want to save the counterId at the start of 
the cache file or something just to assert it hasn't changed at reload time 
"just in case" but that should be enough).

True. Done. Also moved counter cache loading to SS.initServer(), after the 
potential counter id renewal (the only place where it can happen) and commit 
log replay.

Minor changes not originating from the review:
- CFPropDefs (CQL2 and CQL3) to inline KW_REPLICATEONWRITE for obsoleteKeywords
- CassandraServer.doInsert() to use MIN(mutations' timeouts) instead of MAX
- addition of the missing counter cache nodetool stuff

CounterContext stuff from v1 of 6505 that didn't get into 6505-v2, but did get 
into 6404-v2 branch:
- reuse of writeElement() for copyTo() - will go away with 6506
- changes to diff() - will go away with 6506
- inlining of IContext

Not changed, or done differently:

bq. Nit: in the yaml, it'd be more consistent to keep counter_cache_size_in_mb 
commented out to mean "default" (rather than adding it but with an empty 
value). We could then simplify the comment to something like "Default to 
min(2.5% of the heap, 50MB). Set to 0 to disable."

It's blank, but not commented out, for consistency with key_cache_size_in_mb 
and row_cache_size_in_mb. Should comment out all of them or leave as is.

bq. Given that it's a rather hot path, might be worth using a 2 values long[] 
instead of a Pair<Long, Long>?

Created a named class instead - ClockAndCount, that's also reused by the 
counter cache (used in place of the old CounterCacheEntry) instead. And 
switched currentValues to an array from an ArrayList, since generics are no 
longer involved.

All the other issues and nits have been fixed. Sorry for doing it in a rebase. 
If v3 ever happens, it's going to be a separate commit on top of the v2 branch.



> counters++
> ----------
>
>                 Key: CASSANDRA-6504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6504
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1
>
>
> Continuing CASSANDRA-4775 here.
> We are changing counter write path to explicitly 
> lock-read-modify-unlock-replicate, thus getting rid of the previously used 
> 'local' (deltas) and 'remote' shards distinction. Unfortunately, we can't 
> simply start using 'remote' shards exclusively, since shard merge rules 
> prioritize the 'local' shards. Which is why we are introducing the third 
> shard type - 'global', the only shard type to be used in 2.1+.
> The updated merge rules are going to look like this:
> global + global = keep the shard with the highest logical clock
> global + local or remote = keep the global one
> local + local = sum counts (and logical clock)
> local + remote = keep the local one
> remote + remote = keep the shard with highest logical clock
> This is required for backward compatibility with pre-2.1 counters. To make 
> 2.0-2.1 live upgrade possible, 'global' shard merge logic will have to be 
> back ported to 2.0. 2.0 will not produce them, but will be able to understand 
> the global shards coming from the 2.1 nodes during the live upgrade. See 
> CASSANDRA-6505.
> Other changes introduced in this issue:
> 1. replicate_on_write is gone. From now on we only avoid replication at RF 1.
> 2. REPLICATE_ON_WRITE stage is gone
> 3. counter mutations are running in their own COUNTER_MUTATION stage now
> 4. counter mutations have a separate counter_write_request_timeout setting
> 5. mergeAndRemoveOldShards() code is gone, for now, until/unless a better 
> solution is found
> 6. we only replicate the fresh global shard now, not the complete 
> (potentially quite large) counter context
> 7. to help with concurrency and reduce lock contention, we cache node's 
> global shards in a new counter cache ({cf id, partition key, cell name} -> 
> {count, clock}). The cache is only used by counter writes, to help with 'hot' 
> counters being simultaneously updated.
> Improvements to be handled by separate JIRA issues:
> 1. Split counter context into separate cells - one shard per cell. See 
> CASSANDRA-6506. This goes into either 2.1 or 3.0.
> Potential improvements still being debated:
> 1. Coalesce the mutations in COUNTER_MUTATION stage if they share the same 
> partition key, and apply them together, to improve the locking situation when 
> updating different counter cells in one partition. See CASSANDRA-6508. Will 
> to into 2.1 or 3.0, if deemed beneficial.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to