[ https://issues.apache.org/jira/browse/CASSANDRA-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693681#comment-13693681 ]
Colin B. commented on CASSANDRA-4775:
-------------------------------------

Would anyone be interested in a type of counter that is about 99% correct, but not exact?

The HyperLogLog cardinality estimation algorithm would be fairly straightforward to implement inside Cassandra. It estimates the number of distinct elements in a set. One way to use it as a counter is to have two "sets" of timeuuids, A and R. Each time you want to increment the counter, add a timeuuid to the A set; each time you want to decrement, add a timeuuid to the R set. The count is the estimated size of A minus the estimated size of R.

Re-adding the same item (timeuuid) to a "set" is idempotent. A read would need to access a constant amount of internal data, and the internal data is a good fit for Cassandra's method of merging distributed writes.

A description of the HyperLogLog algorithm is available here:
http://blog.aggregateknowledge.com/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure/

> Counters 2.0
> ------------
>
>                 Key: CASSANDRA-4775
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4775
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Arya Goudarzi
>            Assignee: Aleksey Yeschenko
>              Labels: counters
>             Fix For: 2.1
>
>
> The existing partitioned counters remain a source of frustration for most
> users almost two years after being introduced. The remaining problems are
> inherent in the design, not something that can be fixed given enough
> time/eyeballs.
> Ideally a solution would give us
> - similar performance
> - less special cases in the code
> - potential for a retry mechanism
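
For illustration only, here is a rough sketch of the two-set idea from the comment above. It is written in Python rather than Cassandra's Java, and nothing in it exists in Cassandra: the class names (`HyperLogLog`, `ApproxCounter`), the use of SHA-1 as the hash, and the choice of 2^12 registers are all assumptions made for the example.

```python
import hashlib
import math
import uuid


class HyperLogLog:
    """Minimal HyperLogLog sketch (hypothetical; not Cassandra code)."""

    def __init__(self, p=12):
        self.p = p                     # number of index bits (assumed value)
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item; SHA-1 is an arbitrary choice for the sketch.
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1 bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        # Taking the max makes re-adding the same item a no-op.
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Register-wise max: commutative, associative, and idempotent.
        self.registers = [max(a, b)
                          for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)      # constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


class ApproxCounter:
    """Counter built from two sketches: A (increments) and R (decrements)."""

    def __init__(self):
        self.added = HyperLogLog()     # the "A" set
        self.removed = HyperLogLog()   # the "R" set

    def increment(self, op_id=None):
        self.added.add(op_id if op_id is not None else uuid.uuid1())

    def decrement(self, op_id=None):
        self.removed.add(op_id if op_id is not None else uuid.uuid1())

    def merge(self, other):
        self.added.merge(other.added)
        self.removed.merge(other.removed)

    def value(self):
        return round(self.added.estimate() - self.removed.estimate())


counter = ApproxCounter()
op = uuid.uuid1()
counter.increment(op)
counter.increment(op)   # retry of the same write; registers are unchanged
print(counter.value())
```

Because a merge is just a register-wise max, replicas can apply the same update more than once or in any order and still converge, which is what would allow retries. One thing to keep in mind is that the estimation error scales with the sizes of A and R rather than with their difference, so a counter that is incremented and decremented heavily may be less accurate than its net value suggests.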