[ https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348101#comment-16348101 ]
Paulo Motta edited comment on CASSANDRA-14092 at 2/1/18 6:45 AM: ----------------------------------------------------------------- After a second thought I agree that trying to automatically recover lost data is bad not only because users may have reacted to the absence of data, but also because due to the negative {{localDeletionTime}}, the lost data will likely be purged instantly during compaction, so not much data will be left to recover after a while and the auto-recovery behavior will be pretty much undefined. Furthermore, we will need to keep the recovery logic in the storage engine at least until 4.0 with limited benefit. The only way a user that set TTL 20 years ahead can get away with reliably recovering its data is if it has {{incremental_backups}} enabled, and in this case, it can easily find out which SSTables have a negative {{minLocalDeletionTime}} and run scrub on them to fix the overflow, so I'm strongly leaning towards my previous approach of providing a way for scrub to fix negative timestamps, rather then embedding this recovery logic in the storage engine itself - which would be arguably useful in a handful of cases. Regarding what to do with TTLs exceeding the maximum expiry date until we have a permanent solution to the problem, I think it's fair to expose the cap vs reject trade-off as flag to the operator, so I introduced a {{-Dcassandra.expiration_date_overflow_policy={REJECT, CAP_NOWARN, CAP_WARN}} option, which defaults to the conservative REJECT behavior and has the following meaning explained on NEWS.txt: {noformat} - REJECT: this is the default policy and will reject any requests with expiration date timestamp after 2038-01-19T03:14:06+00:00. - CAP_NOWARN: any insert with TTL expiring after 2038-01-19T03:14:06+00:00 will expire on 2038-01-19T03:14:06+00:00. - CAP_WARN: same as previous, but will log a warning message when inserted data is capped. {noformat} Any feedback and suggestions of alternative more meaningful naming is welcome. One downside of defaulting to REJECT I see is that applications will start to break as time progresses and TTLs start reaching the maximum date more often, but given that we plan to fix the underlying issue soon in upcoming versions and users have a way to change the policy I think it shouldn't be a big problem, but I don't feel strongly about it so if anyone does please chime in. The patch below implements and tests the expiration_date_overflow_policy, but also adds a new flag {{cassandra.recover_overflowed_expiration_best_effort}} to perform the auto-recovery, which I intend to remove due to the reasons explained above and replace it with the [scrub-based recovery|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092] (unless someone objects). I plan to cleanup these details later today, but if anyone can take a look and give early feedback that would be nice, as I'm hoping to get this in as soon as we can to start a vote. The 2.1 and 3.0 are functionally equivalent, except that the 2.1 patch adds a missing 20-years cap for the default_time_to_live option, and that the 3.0 patch adds the to-be-removed {{cassandra.recover_overflowed_expiration_best_effort}} option. The versions for the other branches are pretty much derived from those so I will do that after the final review round. ||2.1||3.0||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-14092-v3]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092-v3]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14092]| was (Author: pauloricardomg): After a second thought I agree that trying to automatically recover lost data is bad not only because users may have reacted to the absence of data, but also because due to the negative {{localDeletionTime}}, the lost data will likely be purged instantly during compaction, so not much data will be left after a while and the auto-recovery behavior will be pretty much undefined. Furthermore, we will need to keep the recovery logic in the storage engine at least until 4.0 with limited benefit. The only way a user that set TTL 20 years ahead can get away with reliably recovering its data is if it has {{incremental_backups}} enabled, and in this case, it can easily find out which SSTables have a negative {{minLocalDeletionTime}} and run scrub on them to fix the overflow, so I'm strongly leaning towards my previous approach of providing a way for scrub to fix negative timestamps, rather then embedding this recovery logic in the storage engine itself - which would be arguably useful in a handful of cases. Regarding what to do with TTLs exceeding the maximum expiry date until we have a permanent solution to the problem, I think it's fair to expose the cap vs reject trade-off as flag to the operator, so I introduced a {{-Dcassandra.expiration_date_overflow_policy={REJECT, CAP_NOWARN, CAP_WARN}} option, which defaults to the conservative REJECT behavior and has the following meaning explained on NEWS.txt: {noformat} - REJECT: this is the default policy and will reject any requests with expiration date timestamp after 2038-01-19T03:14:06+00:00. - CAP_NOWARN: any insert with TTL expiring after 2038-01-19T03:14:06+00:00 will expire on 2038-01-19T03:14:06+00:00. - CAP_WARN: same as previous, but will log a warning message when inserted data is capped. {noformat} Any feedback and suggestions of alternative more meaningful naming is welcome. One downside of defaulting to REJECT I see is that applications will start to break as time progresses and TTLs start reaching the maximum date more often, but given that we plan to fix the underlying issue soon in upcoming versions and users have a way to change the policy I think it shouldn't be a big problem, but I don't feel strongly about it so if anyone does please chime in. The patch below implements and tests the expiration_date_overflow_policy, but also adds a new flag {{cassandra.recover_overflowed_expiration_best_effort}} to perform the auto-recovery, which I intend to remove due to the reasons explained above and replace it with the [scrub-based recovery|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092] (unless someone objects). I plan to cleanup these details later today, but if anyone can take a look and give early feedback that would be nice, as I'm hoping to get this in as soon as we can to start a vote. The 2.1 and 3.0 are functionally equivalent, except that the 2.1 patch adds a missing 20-years cap for the default_time_to_live option, and that the 3.0 patch adds the to-be-removed {{cassandra.recover_overflowed_expiration_best_effort}} option. The versions for the other branches are pretty much derived from those so I will do that after the final review round. ||2.1||3.0||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-14092-v3]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092-v3]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14092]| > Max ttl of 20 years will overflow localDeletionTime > --------------------------------------------------- > > Key: CASSANDRA-14092 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14092 > Project: Cassandra > Issue Type: Bug > Components: Core > Reporter: Paulo Motta > Assignee: Paulo Motta > Priority: Blocker > Fix For: 2.1.20, 2.2.12, 3.0.16, 3.11.2 > > > CASSANDRA-4771 added a max value of 20 years for ttl to protect against [year > 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for > {{localDeletionTime}}. > It turns out that next year the {{localDeletionTime}} will start overflowing > with the maximum ttl of 20 years ({{System.currentTimeMillis() + ttl(20 > years) > Integer.MAX_VALUE}}), so we should remove this limitation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org