[ 
https://issues.apache.org/jira/browse/CASSANDRA-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349824#comment-16349824
 ] 

Paulo Motta commented on CASSANDRA-14092:
-----------------------------------------

Thanks for the quick turnaround [~beobal]! See follow-up below:
{quote}The wording of the NEWS.txt entry is good, I do wonder if we should 
maybe place it right at the top of the file rather than just in the 3.0.16 
section for extra emphasis. Any thoughts on that?
{quote}
Good idea. I did this and also updated the text to cover the possibility of 
data loss before this patch and how to recover from it with scrub:
{noformat}
MAXIMUM TTL EXPIRATION DATE NOTICE
-----------------------------------

The maximum expiration timestamp that can be represented by the storage engine
is 2038-01-19T03:14:06+00:00, which means that inserts with TTL that expire
after this date are not currently supported.

Prior to 3.0.16 in the 3.0.X series and 3.11.2 in the 3.11 series, there was no
protection against INSERTS with TTL expiring after the maximum supported date,
causing the expiration time field to overflow and the records to expire
immediately. Records that expired due to overflow may have been removed
permanently after a compaction. The 2.1.X and 2.2.X series are not subject to
data loss due to this issue if assertions are enabled, since an AssertionError
is thrown during INSERT when the expiration time field overflows on these
versions.

In practice this issue will affect only users that use very large TTLs, close
to the maximum allowed value of 630720000 seconds (20 years), starting from
2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL will
be gradually reduced as the maximum expiration date approaches. For instance, a
user on an affected version on 2028-01-19T03:14:06 with a TTL of 10 years will
be affected by this bug, so we urge users of very large TTLs to upgrade to a
version where this issue is addressed as soon as possible.

Potentially affected users should inspect their SSTables and search for
negative min local deletion times to detect this issue. SSTables in this state
must be backed up immediately, as they are subject to data loss during
auto-compactions, and may be recovered by running the sstablescrub tool from
versions 3.0.16+ and/or 3.11.2+.

The Cassandra project plans to fix this limitation in newer versions, but while
the fix is not available, operators can decide which policy to apply when
dealing with inserts with TTL exceeding the maximum supported expiration date:
  - REJECT: this is the default policy and will reject any requests with
    expiration date timestamp after 2038-01-19T03:14:06+00:00.
  - CAP: any insert with TTL expiring after 2038-01-19T03:14:06+00:00 will
    expire on 2038-01-19T03:14:06+00:00 and the client will receive a warning.
  - CAP_NOWARN: same as previous, except that the client warning will not be
    emitted.

These policies may be specified via the
-Dcassandra.expiration_date_overflow_policy=POLICY startup option which can be
set in the jvm.options file.

See CASSANDRA-14092 for more details about this issue.
{noformat}
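To make the arithmetic in the notice concrete, here is a minimal Java sketch of how adding a TTL to the current time in a 32-bit seconds field wraps to a negative value, and how the three policies could handle that case. All class, method and constant names are hypothetical and are not taken from the patch; the client warning is stood in for by a line on stderr.

```java
import java.time.Instant;

// Illustrative only: hypothetical names, not Cassandra's actual code.
final class ExpirationOverflowSketch {
    enum Policy { REJECT, CAP, CAP_NOWARN }

    // 2038-01-19T03:14:06Z, the maximum expiration quoted in the notice.
    static final int MAX_DELETION_TIME = Integer.MAX_VALUE - 1;
    // Maximum allowed TTL quoted in the notice (20 years of 365 days).
    static final int MAX_TTL = 630_720_000;

    // Unprotected computation: the int sum silently wraps on overflow.
    static int naiveLocalDeletionTime(int nowInSec, int ttlSec) {
        return nowInSec + ttlSec;
    }

    // Largest TTL that still fits at a given time; it shrinks as 2038 approaches.
    static int maxSupportedTtl(int nowInSec) {
        return (int) Math.min((long) MAX_TTL, (long) MAX_DELETION_TIME - nowInSec);
    }

    // Policy-aware computation; long arithmetic makes the overflow detectable.
    static int localDeletionTime(int nowInSec, int ttlSec, Policy policy) {
        long expiration = (long) nowInSec + ttlSec;
        if (expiration <= MAX_DELETION_TIME)
            return (int) expiration;
        switch (policy) {
            case CAP:
                // Stand-in for the client warning described in the notice.
                System.err.println("warning: TTL capped at maximum expiration date");
                return MAX_DELETION_TIME;
            case CAP_NOWARN:
                return MAX_DELETION_TIME;
            default: // REJECT
                throw new IllegalArgumentException(
                    "TTL expires after the maximum supported expiration date");
        }
    }

    public static void main(String[] args) {
        int now = (int) Instant.parse("2030-01-01T00:00:00Z").getEpochSecond();
        int tenYears = 10 * 365 * 24 * 3600;
        // Wraps to a negative int, i.e. an expiration time in the past.
        System.out.println(naiveLocalDeletionTime(now, tenYears));
        System.out.println(Instant.ofEpochSecond(localDeletionTime(now, tenYears, Policy.CAP)));
        System.out.println(maxSupportedTtl(now));
    }
}
```

Since 630720000 seconds is 20 years of 365 days, a threshold computed this way lands a few days later than a date obtained by subtracting 20 calendar years, because of the intervening leap days.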
Please let me know what you think of the updated text. We should probably 
also publish this text (or a subset of it) in the release announcement 
e-mail.

While writing the text above, I realized that there is also a remote 
possibility of data loss in 2.1/2.2 if assertions are disabled. I didn't 
backport the scrub recovery, since it was not a straightforward backport and I 
didn't think it was worth the effort right now. We can always do that later if 
necessary; the most important thing right now is to ship the policies. To 
reflect this, I updated the 4th paragraph of the notice on 2.1 and 2.2 to:
{noformat}
2.1.X / 2.2.X users in the conditions above should not be subject to data loss
unless assertions are disabled, in which case the suspect SSTables must be
backed up immediately and manually recovered, as they are subject to data loss
during auto-compaction.
{noformat}
 
{quote}I also have one piece of feedback on the policies; I don't see any 
benefit in being able to turn off logging of capped expirations (especially 
since we're using NoSpamLogger) but I do think the client warning is useful.
{quote}
I agree and updated the patch with this suggestion, but at the same time I 
think advanced operators may want to control the periodicity of the logging, so 
I created a property 
{{cassandra.expiration_overflow_warning_interval_minutes=5}} to control this.
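
As a rough illustration of interval-limited logging driven by such a property, the sketch below emits at most one warning per configured interval. It is not the NoSpamLogger implementation; the class and method names are hypothetical, and treating 5 minutes as the default when the property is unset is an assumption based on the value shown above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: hypothetical class, not Cassandra's NoSpamLogger.
final class RateLimitedWarningSketch {
    private static final long NEVER = Long.MIN_VALUE; // sentinel: nothing logged yet

    private final long intervalNanos;
    private final AtomicLong lastLogged = new AtomicLong(NEVER);

    RateLimitedWarningSketch(long intervalMinutes) {
        this.intervalNanos = TimeUnit.MINUTES.toNanos(intervalMinutes);
    }

    // Reads the interval from the system property; the fallback of 5 is an assumption.
    static RateLimitedWarningSketch fromSystemProperty() {
        int minutes = Integer.getInteger(
            "cassandra.expiration_overflow_warning_interval_minutes", 5);
        return new RateLimitedWarningSketch(minutes);
    }

    // Returns true if the caller should log now; at most one caller wins per interval.
    boolean shouldLog(long nowNanos) {
        long last = lastLogged.get();
        if (last != NEVER && nowNanos - last < intervalNanos)
            return false;
        return lastLogged.compareAndSet(last, nowNanos);
    }

    void warn(String message) {
        if (shouldLog(System.nanoTime()))
            System.err.println("WARN " + message);
    }
}
```

Passing the clock value into `shouldLog` keeps the interval logic testable without sleeping.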
  
{quote}I also noticed that the logging of a parse error/invalid value for the 
policy sysprop is at DEBUG in the current patches, but it might be sensible to 
draw a bit more attention to that if it happens.
{quote}
Agreed, changed the logging to WARN.
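
For illustration, parsing the policy system property with a visible warning on bad input could look roughly like the sketch below. The class and method names are hypothetical, and both the fallback to REJECT on an invalid value and the case-insensitive matching are assumptions rather than behavior confirmed by the patch; the WARN log line is stood in for by stderr.

```java
import java.util.Locale;

// Illustrative only: hypothetical names, not the code in the patch.
final class PolicyPropertySketch {
    enum Policy { REJECT, CAP, CAP_NOWARN }

    static final String PROPERTY = "cassandra.expiration_date_overflow_policy";

    // Parses a raw property value; unset or invalid values fall back to REJECT (assumed).
    static Policy parse(String raw) {
        if (raw == null)
            return Policy.REJECT;
        try {
            return Policy.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Stand-in for a WARN-level log entry so the misconfiguration is noticed.
            System.err.println("WARN invalid value '" + raw + "' for " + PROPERTY
                               + ", using " + Policy.REJECT);
            return Policy.REJECT;
        }
    }

    static Policy fromSystemProperty() {
        return parse(System.getProperty(PROPERTY));
    }
}
```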

I finished cleaning up the patch and prepared a version for every branch. The 
2.1 and 2.2 versions are nearly identical, as are the 3.0, 3.11 and trunk 
versions, apart from some minor conflicts. Please find below a short summary 
of the changes per branch:
 * 2.1:
 ** Add REJECT and CAP expiration date overflow policies and tests
 ** Cap max default TTL at 20 years and tests
 ** Add NEWS.txt entry
 * 2.2:
 ** Same as 2.1, with a few minor import conflicts
 * 3.0
 ** Add REJECT, CAP and CAP_NOWARN expiration date overflow policies and tests
 ** Add ability to scrub to fix negative localDeletionTime and tests with broken SSTables
 ** Add ability to sstablemetadata to show minLocalDeletionTime
 ** Add expiration date overflow policies to jvm.options file
 ** Add NEWS.txt entry
 * 3.11
 ** Same as 3.0, with a few minor conflicts during merge
 * trunk
 ** Same as 3.11, with a few minor conflicts during merge
 ** Removed ability of scrub to fix sstables with negative localDeletionTime and tests
 * dtest
 ** Test all policies on CQL for default and user supplied TTL
 ** Test cap policy on thrift for default and user supplied TTL
 ** Check that offline scrub recovers sstable with negative localDeletionTime

I submitted a preliminary round of CI with the non-cleaned up patch and the 
results looked good. I will submit again for all the branches below and post 
the results here when they are ready.
||2.1||2.2||3.0||3.11||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-14092-v5]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-14092-v5]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14092-v5]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:3.11-14092-v5]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14092-v5]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14092-v5]|

> Max ttl of 20 years will overflow localDeletionTime
> ---------------------------------------------------
>
>                 Key: CASSANDRA-14092
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14092
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Blocker
>             Fix For: 2.1.20, 2.2.12, 3.0.16, 3.11.2
>
>
> CASSANDRA-4771 added a max value of 20 years for ttl to protect against [year 
> 2038 overflow bug|https://en.wikipedia.org/wiki/Year_2038_problem] for 
> {{localDeletionTime}}.
> It turns out that next year the {{localDeletionTime}} will start overflowing 
> with the maximum ttl of 20 years ({{System.currentTimeMillis() + ttl(20 
> years) > Integer.MAX_VALUE}}), so we should remove this limitation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
