[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495843#comment-14495843 ]

Rob Emery commented on CASSANDRA-2103:

I also agree with Nikolay, Amol and Marco. It seems clunky not to have any way of expiring counters. In our (presumably fairly common) use case we store statistics at different granularities in time buckets; after the first day, the minute buckets become pointless. Currently the only way for us to dispose of the unused data would be to hack it with a cron job that deletes the buckets for the previous day, which feels really unpleasant compared with the elegance of using TTLs on columns. I concur with the desired behaviour of setting the TTL on the first upsert and then ignoring attempts to set it on subsequent updates.

> expiring counter columns
>
> Key: CASSANDRA-2103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Affects Versions: 0.8 beta 1
> Reporter: Kelvin Kakugawa
> Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch
>
> add ttl functionality to counter columns.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302044#comment-14302044 ]

Marco Palladino commented on CASSANDRA-2103:

I also agree with Nikolay and Amol. My use case is storing analytics information and then deleting the data once it gets too old and is no longer incremented or used by the application. I am no expert, but maybe another option would be to honour the TTL set when creating the table using {{default_time_to_live}} (as opposed to setting the TTL when incrementing the counter for the first time, which is also a nice option to have). The application itself could then control the rotation of data by storing/duplicating counters in "hot" or "frozen" tables. This would require some more planning when creating the data model; as such, it would be totally fine to only allow the TTL to be set when the table is first created, and to prevent it from being set when altering the table.
[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961035#comment-13961035 ]

Amol Fasale commented on CASSANDRA-2103:

Agreed with Nikolay: counters should have TTL values, but the TTL should be settable only once, when the counter is initialized, and should not be updatable. Whenever the counter is incremented afterwards, the TTL should be ignored. A use case: when we maintain real-time counters such as daily page views, they become useless the next day, and there is no point in keeping this unnecessary data and increasing overhead.
[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571345#comment-13571345 ]

Nikolay commented on CASSANDRA-2103:

What about the following use case: the first time you increment the counter (i.e. the counter does not exist yet), you supply a TTL, and the counter is created with this TTL. After that, the TTL is ignored for as long as the counter is "alive". This means that if you create it with a TTL of 1 week, the counter will be "alive" for a week; then it expires and is created again, and assuming you still supply a TTL of 1 week, the counter effectively resets. In this case I believe the implementation will not be that hard, since all replicas will have a consolidated expiry date?
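The semantics Nikolay proposes can be sketched as a toy, single-node model. This is only an illustration of the proposed behaviour, not Cassandra code: the class `ExpiringCounter` and its methods are hypothetical names, time is an integer supplied by the caller, and replication is not modelled at all.

```python
class ExpiringCounter:
    """Toy model: the TTL is honoured only when the counter is (re)created."""

    def __init__(self):
        self.value = 0
        self.expires_at = None  # None means the counter does not exist yet

    def _alive(self, now):
        return self.expires_at is not None and now < self.expires_at

    def increment(self, delta, ttl, now):
        if not self._alive(now):
            # First increment, or the previous incarnation has expired:
            # start from zero and fix the expiry date once.
            self.value = 0
            self.expires_at = now + ttl
        # While the counter is alive, the supplied ttl is ignored.
        self.value += delta

    def read(self, now):
        return self.value if self._alive(now) else 0


counter = ExpiringCounter()
counter.increment(1, ttl=10, now=0)   # creates the counter, expires at t=10
counter.increment(1, ttl=10, now=5)   # ttl ignored, expiry stays at t=10
print(counter.read(now=9))            # 2
print(counter.read(now=10))           # 0 (expired)
counter.increment(1, ttl=10, now=12)  # recreated, expires at t=22
print(counter.read(now=12))           # 1
```

Because the expiry date is fixed at creation and never extended, it does not depend on when later increments happen to be merged, which is the property the comment relies on.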
[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051329#comment-13051329 ]

Yang Yang commented on CASSANDRA-2103:

There could be a problem with relying on forcing compaction order: if you base the intended order on the max timestamp of each sstable, that timestamp is not trustworthy, because a single malicious client request can bump its timestamp into the future and arbitrarily change the order of compaction, thus rendering the approach in 2735 useless.
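Yang Yang's concern can be illustrated with a small sketch. It assumes, as the comment does, that sstables are ordered by the largest client-supplied timestamp they contain; how CASSANDRA-2735 actually orders compactions is not described in this thread. The function `compaction_order` and the sstable names are hypothetical.

```python
def compaction_order(sstables):
    """Order sstable names by the largest client-supplied timestamp each contains."""
    return sorted(sstables, key=lambda name: max(sstables[name]))


# Three sstables, each holding the timestamps of the writes it contains.
honest = {"a": [100, 110], "b": [200, 210], "c": [300, 310]}

# One write with a timestamp far in the future lands in the oldest sstable.
tampered = {**honest, "a": honest["a"] + [10**12]}

print(compaction_order(honest))    # ['a', 'b', 'c']
print(compaction_order(tampered))  # ['b', 'c', 'a']
```

A single write is enough to move the oldest sstable to the end of the order, because the sort key depends on a value the client controls.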
[jira] [Commented] (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046772#comment-13046772 ]

Jonathan Ellis commented on CASSANDRA-2103:

We're going to address this with pluggable compaction strategies instead, specifically CASSANDRA-2735.
[jira] Commented: (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990651#comment-12990651 ]

Kelvin Kakugawa commented on CASSANDRA-2103:

Yes, I agree w/ you. Our use case is very narrow. We basically want fine-grained deletes, w/o having to periodically sweep the data store for irrelevant keys. We're not interested in reading the data when it's at the end of the ttl bound. A non-ideal solution to the above situation would be to never extend localExpirationTime once it's been set. It would still work for our case, but it would lose flexibility, i.e. if you make a mistake, you can't fix it.

> expiring counter columns
>
> Key: CASSANDRA-2103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Affects Versions: 0.8
> Reporter: Kelvin Kakugawa
> Assignee: Kelvin Kakugawa
> Fix For: 0.8
> Attachments: 0001-CASSANDRA-2103-expiring-counters-logic-tests.patch
>
> add ttl functionality to counter columns.

-- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990529#comment-12990529 ]

Sylvain Lebresne commented on CASSANDRA-2103:

bq. I think your above case mixes two orthogonal uses for ttls. Typically, a column will only have ttls and they'll be updated in a fixed period w/in the ttl, which is very useful for our case. Mixing a ttl and non-ttl use w/in the same column is bound to produce probs.

The problem has nothing to do with mixing ttl with non-ttl. But on that point, a non-ttled column is after all just a ttled column with an arbitrarily long ttl, so the fact that there is a problem with mixing ttl with non-ttl should be a strong hint that something is fishy.

The problem is that when you send an increment with a ttl, you can't know what the actual lifetime of this increment will be. If the CounterColumn corresponding to this increment is never merged with another, more recent, CounterColumn, then its lifetime is the ttl. But if it is merged (during its lifetime), then its lifetime is extended to the ttl of the new column. All well and good except, and that is the problem, you just don't know when a column will be merged. Just because a new update has been *issued* during the lifetime of a preceding one does not mean that those two updates will be *merged* during the lifetime of the preceding one. It all depends on when compactions kick in (which in turn is random from the point of view of the client).

Let's take an example. Say you have a counter column family and the only update you ever do to this CF is an increment by 1 with a ttl of 1 week. So there are only columns with a ttl. The idea is to have counters that reset themselves if not incremented for a week (that is the only thing that would make sense to me). Now say that you happen to increment one of the counters in this CF regularly, every hour. After x days, you expect the value of this counter to be 24 * x (if you disagree on this, you really have to explain to me what you'd expect here). I guarantee you that this is not what you will get.

Maybe for some days it will look like it works, because every new insert will get merged with the old ones in time, extending the ttl of the whole count. But some day (which depends on seemingly random things like the load of the CF, the memtable thresholds and other compaction thresholds), a compaction will create an sstable containing some amount of the total count that is big enough that it won't get compacted for a week. For a week, you will still get the expected result. But after that specific week, you will (definitely) lose the part of the counter that hasn't been compacted for a week. From then on you won't get the expected value.

This patch can't work because its observed behavior depends on when compactions are done which, from a client point of view, is a random event.
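The hourly-increment example above can be replayed with a toy simulation. It is a sketch under simplifying assumptions, not a model of Cassandra's storage engine: time is counted in hours, the counter is split into at most two fragments, and the hypothetical parameter `frozen_at` is the hour at which everything written so far ends up in one large sstable that is never compacted again.

```python
WEEK = 7 * 24  # ttl of every increment, in hours


def read_counter(total_hours, frozen_at=None):
    """Value read after `total_hours` hourly increments of 1, each with a one-week ttl."""
    frozen = (0, 0)  # (count, expires_at): big sstable that is never merged again
    young = (0, 0)   # (count, expires_at): data that keeps being merged in time
    for hour in range(total_hours):
        count, expires_at = young
        if hour >= expires_at:
            count = 0  # nothing live to merge with
        # Merging with the new increment extends the lifetime of the whole count.
        young = (count + 1, hour + WEEK)
        if frozen_at is not None and hour + 1 == frozen_at:
            frozen, young = young, (0, 0)
    now = total_hours
    return sum(count for count, expires_at in (frozen, young) if now < expires_at)


print(read_counter(24 * 14))               # 336: every merge happened in time
print(read_counter(24 * 9, frozen_at=72))  # 216: still looks correct
print(read_counter(24 * 10, frozen_at=72)) # 168: the frozen 72 increments expired
```

With `frozen_at=72` the first three days of increments sit in a fragment whose expiry is never extended, so the read on day 10 returns 168 where the client expects 24 * 10 = 240.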
[jira] Commented: (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990201#comment-12990201 ]

Kelvin Kakugawa commented on CASSANDRA-2103:

If you look at the unit tests, they expect the above case, i.e. the last column w/ or w/o a ttl "wins", so to speak. If two columns w/ equal timestamps are reconciled, the non-ttl wins. I think your above case mixes two orthogonal uses for ttls. Typically, a column will only have ttls and they'll be updated in a fixed period w/in the ttl, which is very useful for our case. Mixing a ttl and non-ttl use w/in the same column is bound to produce probs, as you rightly noted.
[jira] Commented: (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990105#comment-12990105 ]

Sylvain Lebresne commented on CASSANDRA-2103:

Sorry, but I'm not sure expiring counters work. First, there is the problem I've described in CASSANDRA-2101, which is still valid here since expiring counters become tombstones when expired. So we will have unpredictable results.

But this unpredictability is even worse here, I believe. Say you do a first increment with a ttl of x seconds, and let's call the resulting expiring column c1. And say that later (but *well before* c1 expires from the client's standpoint) you do another increment without a ttl (or with one, that doesn't really matter actually), and let's call this c2. Then the result will depend entirely on when compaction happens. If c1 and c2 are compacted (or resolved, if we're talking memtable) quickly (say c1 is still in the memtable when c2 arrives, for instance), then we'll end up with a value of c1.value() + c2.value(), with no ttl (or c2's ttl if it had one). But if c1 and c2 are compacted after c1 has expired (which can easily happen, even if c2 was issued well before c1 expires), we'll end up with only c2 (since during reconciliation c1 will now be a tombstone with a lower timestamp than c2).

In my opinion this is far too unpredictable and dependent on internal events. So for now, and unless I'm missing something obvious, I'm -1 on this.
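The c1/c2 scenario above can be written down as a toy reconciliation function. This is a hedged sketch of the behaviour the comment describes, not the code in the attached patch: `CounterColumn` and `reconcile` here are illustrative names, timestamps are in seconds, and only the case where the older column may have expired is handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CounterColumn:
    value: int
    timestamp: int       # write time, in seconds
    ttl: Optional[int]   # seconds to live; None means the column never expires

    def expired(self, now: int) -> bool:
        return self.ttl is not None and now >= self.timestamp + self.ttl


def reconcile(older: CounterColumn, newer: CounterColumn, now: int) -> CounterColumn:
    """Merge two versions of a counter at compaction time `now` (toy model)."""
    if older.expired(now):
        # The expired column behaves like a tombstone with a lower timestamp,
        # so its part of the count is dropped.
        return newer
    # Both are live: the counts add up and the newer column's ttl applies.
    return CounterColumn(older.value + newer.value, newer.timestamp, newer.ttl)


c1 = CounterColumn(value=5, timestamp=0, ttl=60)     # expires at t=60
c2 = CounterColumn(value=3, timestamp=10, ttl=None)  # issued well before c1 expires

print(reconcile(c1, c2, now=20).value)   # 8: merged while c1 was still live
print(reconcile(c1, c2, now=100).value)  # 3: merged after c1 expired
```

The two calls use identical client writes; only the moment of reconciliation differs, and that moment is outside the client's control.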
[jira] Commented: (CASSANDRA-2103) expiring counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989919#comment-12989919 ]

Kelvin Kakugawa commented on CASSANDRA-2103:

material refactor to db.ColumnSerializer

> expiring counter columns
>
> Key: CASSANDRA-2103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2103
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Affects Versions: 0.8
> Reporter: Kelvin Kakugawa
> Assignee: Kelvin Kakugawa
> Fix For: 0.8
> Attachments: 0001-CASSANDRA-2103-add-expiring-counter-columns.patch
>
> add ttl functionality to counter columns.