[ 
https://issues.apache.org/jira/browse/CASSANDRA-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869621#comment-13869621
 ] 

Sylvain Lebresne commented on CASSANDRA-6504:
---------------------------------------------

The whole approach looks good to me. I went over the parts of CASSANDRA-6505 
quickly but it'd be nice to rebase once the latter is committed.
Otherwise, a bunch of comments on the code itself:
* In StorageService, in the drainOnShutdown callback and in drain(), 
s/Stage.MUTATION/Stage.COUNTER_MUTATION/.
* I don't think we need the CounterId renewal stuff on cleanup anymore. So 1) 
we can remove it from Cleanup and remove a bunch of stuff from CounterId, but 
2) this means a node shouldn't change its counterId at all anymore. So, since 
the counter cache stores only old local shards, we can skip storing the 
counterId in each cache key (we'd want to save the counterId at the start of 
the cache file or something, just to assert it hasn't changed at reload time 
"just in case", but that should be enough).
* In CFS, for the striped locks sizing, shouldn't this use 
getConcurrentCounterWriters (rather than getConcurrentWriters)?
* In CounterMutation:
  ** In applyWithResult, we shouldn't ignore the tryLock result. Nit: I'd 
rather throw a timeout directly in the method than return null (I'm thinking 
of InterruptedException, but also of handling a false result from tryLock). 
Even if returning null does trigger a timeout in the end, the intent is easier 
to follow if the timeout is thrown directly.
  ** In processModifications, I'd rather test for 'cell instanceof 
CounterUpdateCell' than check explicitly for tombstones or the row marker. 
It's shorter and a bit more future-proof (besides, I personally find it good 
hygiene to check instanceof when you're going to cast on the next line :)).
  ** Given that it's a rather hot path, might be worth using a two-value 
long[] instead of a Pair<Long, Long>?
  ** Nit: we could skip getCurrentValuesFromCache entirely if there is no 
cache, if only for symmetry with updateCounterCache.
* Nit: while we're at it, I'd add a isCounter() shortcut in CFMetaData.
* Nit: in the yaml, it'd be more consistent to keep counter_cache_size_in_mb 
commented out to mean "default" (rather than adding it with an empty value). 
We could then simplify the comment to something like "Defaults to min(2.5% of 
the heap, 50MB). Set to 0 to disable."
* Nit: the documented default for counter_cache_save_period doesn't match the 
value actually set just after (to be precise, the documented value is the 
default from Config.java, but it's extra weird to override said default with 
some other value in the default config file).
* Nit: there's an import added to FBUtilities but no code, so I think it's a 
leftover of something.
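On the striped-lock sizing point, a minimal plain-Java sketch of the idea (Cassandra itself uses Guava's Striped; class and field names here are illustrative only) — the stripe count should be derived from the counter-writer concurrency, not the general writer concurrency:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class StripedLocksSketch {
    private final Lock[] stripes;

    // Size the stripes from getConcurrentCounterWriters(), not
    // getConcurrentWriters(): counter writes are the only users of these locks.
    StripedLocksSketch(int concurrentCounterWriters) {
        // A generous multiple of the concurrency keeps contention low;
        // the factor 1024 is illustrative, not what Cassandra uses.
        stripes = new Lock[concurrentCounterWriters * 1024];
        for (int i = 0; i < stripes.length; i++)
            stripes[i] = new ReentrantLock();
    }

    // Map a cell key to its stripe; equal keys always get the same lock.
    Lock lockFor(Object key) {
        int idx = Math.floorMod(key.hashCode(), stripes.length);
        return stripes[idx];
    }
}
```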
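And a rough sketch of the tryLock handling suggested for applyWithResult (the exception type and method names are hypothetical stand-ins, not Cassandra's actual API): throw a timeout directly on a false tryLock result or an interrupt, instead of returning null.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

public class TryLockSketch {
    // Hypothetical stand-in for Cassandra's write timeout exception.
    static class CounterWriteTimeoutException extends RuntimeException {
        CounterWriteTimeoutException(String msg) { super(msg); }
    }

    // Acquire the lock with a deadline; on failure, throw the timeout
    // right here rather than returning null to the caller.
    static void applyWithLock(Lock lock, long timeoutMillis, Runnable body) {
        boolean acquired = false;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new CounterWriteTimeoutException("interrupted waiting for counter lock");
        }
        if (!acquired)
            throw new CounterWriteTimeoutException("timed out waiting for counter lock");
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }
}
```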


> counters++
> ----------
>
>                 Key: CASSANDRA-6504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6504
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1
>
>
> Continuing CASSANDRA-4775 here.
> We are changing counter write path to explicitly 
> lock-read-modify-unlock-replicate, thus getting rid of the previously used 
> 'local' (deltas) and 'remote' shards distinction. Unfortunately, we can't 
> simply start using 'remote' shards exclusively, since shard merge rules 
> prioritize the 'local' shards. Which is why we are introducing the third 
> shard type - 'global', the only shard type to be used in 2.1+.
> The updated merge rules are going to look like this:
> global + global = keep the shard with the highest logical clock
> global + local or remote = keep the global one
> local + local = sum counts (and logical clock)
> local + remote = keep the local one
> remote + remote = keep the shard with highest logical clock
> This is required for backward compatibility with pre-2.1 counters. To make 
> 2.0-2.1 live upgrade possible, 'global' shard merge logic will have to be 
> back ported to 2.0. 2.0 will not produce them, but will be able to understand 
> the global shards coming from the 2.1 nodes during the live upgrade. See 
> CASSANDRA-6505.
> Other changes introduced in this issue:
> 1. replicate_on_write is gone. From now on we only avoid replication at RF 1.
> 2. REPLICATE_ON_WRITE stage is gone
> 3. counter mutations are running in their own COUNTER_MUTATION stage now
> 4. counter mutations have a separate counter_write_request_timeout setting
> 5. mergeAndRemoveOldShards() code is gone, for now, until/unless a better 
> solution is found
> 6. we only replicate the fresh global shard now, not the complete 
> (potentially quite large) counter context
> 7. to help with concurrency and reduce lock contention, we cache node's 
> global shards in a new counter cache ({cf id, partition key, cell name} -> 
> {count, clock}). The cache is only used by counter writes, to help with 'hot' 
> counters being simultaneously updated.
> Improvements to be handled by separate JIRA issues:
> 1. Split counter context into separate cells - one shard per cell. See 
> CASSANDRA-6506. This goes into either 2.1 or 3.0.
> Potential improvements still being debated:
> 1. Coalesce the mutations in COUNTER_MUTATION stage if they share the same 
> partition key, and apply them together, to improve the locking situation when 
> updating different counter cells in one partition. See CASSANDRA-6508. Will 
> go into 2.1 or 3.0, if deemed beneficial.
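The merge rules quoted above can be sketched as follows (a rough illustration with hypothetical type and field names; the actual 2.1 CounterContext code is structured differently):

```java
public class ShardMergeSketch {
    enum Type { GLOBAL, LOCAL, REMOTE }

    static class Shard {
        final Type type;
        final long clock;  // logical clock
        final long count;
        Shard(Type type, long clock, long count) {
            this.type = type; this.clock = clock; this.count = count;
        }
    }

    // Merge two shards belonging to the same counter id, per the quoted rules.
    static Shard merge(Shard a, Shard b) {
        if (a.type == b.type) {
            // local + local = sum counts (and logical clocks)
            if (a.type == Type.LOCAL)
                return new Shard(Type.LOCAL, a.clock + b.clock, a.count + b.count);
            // global + global, remote + remote: keep the highest logical clock
            return a.clock >= b.clock ? a : b;
        }
        // global + (local or remote): keep the global one
        if (a.type == Type.GLOBAL) return a;
        if (b.type == Type.GLOBAL) return b;
        // local + remote: keep the local one
        return a.type == Type.LOCAL ? a : b;
    }
}
```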



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
