[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610728#comment-14610728 ]

Jonathan Ellis commented on CASSANDRA-7872:
-------------------------------------------

Ping [~krummas]

> ensure compacted obsolete sstables are not open on node restart and nodetool
> refresh, even on sstable reference miscounting or deletion tasks are failed.
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7872
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>             Fix For: 2.0.x
>
>         Attachments: 7872-v2.0-NoPhQ.txt, 7872-v2.0-bugdetector.txt,
> 7872-v2.0-robustness.txt, EnsureNoObsoleteSSTables-7872-v2.0.txt
>
>
> Since CASSANDRA-4436, compacted sstables are no longer marked with a
> COMPACTED_MARKER file. Instead, after they are compacted, DataTracker calls
> SSTableReader.markObsolete(), but the actual deletion happens later, in
> SSTableReader.releaseReference().
> This reference counting is very fragile: it is very easy to introduce a
> rare, hard-to-catch bug that prevents the reference count from ever
> reaching 0 (CASSANDRA-6503, for example).
> This means that, very rarely, obsolete sstable files are not removed from
> disk (although Cassandra no longer uses them to read data).
> If more than gc grace time has passed since the sstable file should have
> been removed, and an operator either issues nodetool refresh or just
> reboots the node, these obsolete files are discovered and opened for read
> by the node. Deleted data is then resurrected and quickly spread by RR to
> the whole cluster.
> Because the consequences are very serious (even a single obsolete sstable
> file that was not removed could render your data useless), this patch makes
> sure no obsolete sstable file can be opened for read by:
> 1. Removing sstables on CFS init by analyzing sstable generations (an
> sstable is removed if another sstable lists it as an ancestor).
> 2. Reimplementing the COMPACTED_MARKER file for sstables. This marker is
> created as soon as markObsolete is called. This is necessary because
> generation info can be lost (when sstables compact to none).
> 3. To remove sstables sooner than restart, reimplementing the good old GC
> phantom reference queue as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
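Step 1 of the description above (remove an sstable when another sstable on disk lists it as an ancestor) can be sketched roughly as follows. This is only an illustration, not the code from the attached patches: the class name, the method name, and the plain map of generation to ancestor generations are all assumptions made for this example.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not taken from the attached patches.
final class ObsoleteByAncestry {
    /**
     * @param ancestorsByGeneration for every sstable generation found on disk,
     *        the generations it was compacted from (assumed to be read from
     *        sstable metadata by the caller)
     * @return generations that are on disk and are listed as an ancestor by
     *         some other on-disk sstable, i.e. candidates for removal
     */
    static Set<Integer> findObsolete(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        Set<Integer> obsolete = new TreeSet<>();
        for (Map.Entry<Integer, Set<Integer>> entry : ancestorsByGeneration.entrySet()) {
            for (Integer ancestor : entry.getValue()) {
                // Only ancestors that still exist on disk matter; an sstable
                // never makes itself obsolete.
                if (!ancestor.equals(entry.getKey()) && ancestorsByGeneration.containsKey(ancestor))
                    obsolete.add(ancestor);
            }
        }
        return obsolete;
    }
}
```

Note that this check alone cannot detect an obsolete sstable whose compaction produced no output, because no surviving sstable lists it as an ancestor. That is the gap the marker file in step 2 is meant to cover.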
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164823#comment-14164823 ]

Pavel Yaskevich commented on CASSANDRA-7872:
--------------------------------------------

bq. Could we fix this by just waiting until the file is removed, before removing the log entry?

I think that would simplify the startup logic a bit, because we wouldn't have to care about refresh. In general, though, we get ancestor information from the SSTable files, so keeping the info in the compaction log wouldn't make much difference.

bq. If we were to add phantom refs I'd rather have it as an assert to flag a refcount bug, rather than trying to paper over bugs. (As an assert ISTM that it doesn't matter a great deal if behavior differs across JVMs.)

Exactly my point, flagging it would be fine with me.

> ensure compacted obsolete sstables are not open on node restart and nodetool
> refresh, even on sstable reference miscounting or deletion tasks are failed.
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7872
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>             Fix For: 2.0.11
>
>         Attachments: EnsureNoObsoleteSSTables-7872-v2.0.txt
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163727#comment-14163727 ]

Jonathan Ellis commented on CASSANDRA-7872:
-------------------------------------------

If we were to add phantom refs, I'd rather have them as an assert to flag a refcount bug than try to paper over bugs. (As an assert, ISTM that it doesn't matter a great deal if behavior differs across JVMs.)

That said, I'm not sure how much complexity we're signing up for there, so I'd definitely like to break it out as a separate patch.
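The assert-style use of phantom references suggested in the comment above could look roughly like the sketch below. It only reports a reader that was garbage collected before its reference count reached zero; it does not delete anything. All names here are hypothetical and chosen for this example, and when (or whether) the collector enqueues a phantom reference is left to the JVM, so the check is best effort.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical leak detector for illustration; not taken from the attached patches.
final class RefCountLeakDetector {
    /** Phantom reference that remembers a name and whether the count reached zero. */
    private static final class Tracked extends PhantomReference<Object> {
        final String name;
        final AtomicBoolean released;

        Tracked(Object referent, String name, AtomicBoolean released, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.name = name;
            this.released = released;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Phantom references must stay strongly reachable themselves to be enqueued.
    private final Set<Tracked> live = ConcurrentHashMap.newKeySet();

    /**
     * Starts tracking a reader. The caller is expected to set the returned flag
     * to true when the reader's reference count reaches zero.
     */
    AtomicBoolean track(Object reader, String name) {
        AtomicBoolean released = new AtomicBoolean(false);
        live.add(new Tracked(reader, name, released, queue));
        return released;
    }

    /** Drains the queue and returns how many readers were collected unreleased. */
    int drainAndCountLeaks() {
        int leaks = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Tracked tracked = (Tracked) ref;
            live.remove(tracked);
            if (!tracked.released.get()) {
                leaks++;
                System.err.println("refcount bug: " + tracked.name
                        + " was garbage collected before its reference count reached zero");
            }
        }
        return leaks;
    }
}
```

A background thread or a periodic task could call `drainAndCountLeaks()` and turn a non-zero result into a logged error or a failed assertion. The released flag is kept outside the tracked object on purpose: a phantom reference never gives access to its referent, so any state needed after collection has to live elsewhere.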
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163719#comment-14163719 ]

Jonathan Ellis commented on CASSANDRA-7872:
-------------------------------------------

Back to the non-GC functionality:

bq. this fixes a bit different situation when compaction succeeds ("compaction log" entry is removed) file is not removed right away but queued for removal

Could we fix this by just waiting until the file is removed before removing the log entry?
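The ordering proposed in the comment above (delete the obsolete files first, and remove the log entry only afterwards) can be sketched as below. This is a simplified illustration under loud assumptions: the "log entry" is modelled as a plain file, deletion is done synchronously, and the class and method names are invented for this example. None of this is Cassandra's actual compaction log code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; the "log entry" is modelled as a plain file for illustration.
final class DeleteBeforeLogRemoval {
    /**
     * Deletes every obsolete file and only then removes the log entry.
     * If any delete fails, or the process dies part-way through, the log entry
     * is still present, so a later startup can see the unfinished cleanup.
     */
    static void finishCompaction(List<Path> obsoleteFiles, Path logEntry) {
        try {
            for (Path file : obsoleteFiles)
                Files.deleteIfExists(file); // throws if the file exists but cannot be deleted
            // Reached only after all obsolete files are gone.
            Files.deleteIfExists(logEntry);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the ordering is that the log entry outlives the files it describes, so there is no window in which the files are still on disk but nothing records that they are obsolete.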
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160086#comment-14160086 ]

Pavel Yaskevich commented on CASSANDRA-7872:
--------------------------------------------

I am not sure why you are trying to twist my words here. I never claimed that Full GC never happens with C*, rather that we are trying our best to reduce its frequency to a minimum.

My concern here is, as already stated, that #3 is intended as a safety-net feature, but we are trying to rely on an implementation detail, which is never good design. Even though it currently works this way, that doesn't mean it's here to stay or that it works the same across platforms and garbage collectors. It's always better to address the problem instead of trying to fix its symptoms.

That said, if other people think it's a good idea, I will comply.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160039#comment-14160039 ]

Oleg Anastasyev commented on CASSANDRA-7872:
--------------------------------------------

Pavel, this does not just happen to work; this is how CMS has worked since the very beginning. The weak reference processing stage is explained, for example, in this old Oracle blog post from back in 2006:
https://blogs.oracle.com/poonam/entry/understanding_cms_gc_logs
If you google, you can find more. If you check out the HotSpot code, you can find it there. If you're curious, you can find this stage in the G1 mixed cycle as well. I can hardly believe that Oracle will ever remove weak refs processing from concurrent GC cycles. That would be meaningless.

Just because you could not make PhRefs work for some purpose in the past does not mean they cannot be used for any purpose ever. In the case you are arguing from, it could have been a JVM with a bug, or you may just have had a bug in your code that prevented PhRefs from working as intended. Until you specify the exact version of the JDK you were using, it is hard to tell whether it was a bug, the behavior of that exact JVM, or just a bug in your code.

By the way, claiming that "FullGC never happens on C*" as an argument against #3 is not valid either, because you cannot point to some "spec like JMM" saying that Full GC never happens on CMS. Moreover, it is well known that it does happen on any CMS instance, regardless of how hard you try. So even in the worst case, using that JDK whose version you cannot recall now, with possibly broken standard JVM behavior, it will delete the files from disk earlier than a restart of the node. And this is the exact purpose of #3: just to remove them earlier than restart.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159709#comment-14159709 ]

Pavel Yaskevich commented on CASSANDRA-7872:
--------------------------------------------

The thing is, it's not guaranteed; it just *happens* to work this way, which is pretty much what I have been saying all along.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159705#comment-14159705 ]

Aleksey Yeschenko commented on CASSANDRA-7872:
----------------------------------------------

I'd go for some more explicit way, if possible, but ultimately I don't see an issue with using a feature that's guaranteed to work well on Oracle and Open JVMs, and to work somehow elsewhere (after all, we don't have an issue w/ using Unsafe).

FWIW I feel like CASSANDRA-7705 is also relevant to this issue.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159494#comment-14159494 ]

Pavel Yaskevich commented on CASSANDRA-7872:
--------------------------------------------

bq. Having Plan B is better for reliability, than not having it at all. This whole ticket is about Plan B. C* is recommended to run on Oracle's Java, it would not likely run on anything else (just b/c it uses undocumented classes, like sun.misc.Cleaner for example). Oracle's Java has contract for PhRefs and the way how it is implemented.

Having a Plan B that could potentially never trigger and gives people a false perspective on things is worse than not having one at all. Also, recommended doesn't mean required (and OpenJDK has sun.misc.Cleaner for that matter, but agent support is weaker compared to Oracle JDK).

You claim that there is a "contract", but the reality is that you have failed to provide any proof that such a contract exists, even in Oracle JDK, that explicitly states how/when a phantom or any other type of reference is supposed to be cleaned up, except "If the garbage collector determines at a certain point in time", which is a broad definition and could mean anything, e.g. only at Full GC time. As that behavior is left as an implementation detail by the JVM, there is no guarantee that it's not going to change across releases, or even that all garbage collector implementations are going to yield the same behavior, so things can go south and crash way before Plan B would trigger.

I have made my opinion known and am pretty much done arguing about this without any real arguments to support #3, so I will leave this to [~jbellis] to tie-break and we'll go from there.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159485#comment-14159485 ]

Oleg Anastasyev commented on CASSANDRA-7872:
--------------------------------------------

I cannot agree with you. Having a Plan B is better for reliability than not having one at all. This whole ticket is about Plan B.

C* is recommended to run on Oracle's Java; it would not likely run on anything else (just b/c it uses undocumented classes, like sun.misc.Cleaner for example). Oracle's Java has a contract for PhRefs and for how they are implemented. This behavior is stable and will be supported by the JDK team in the future so as not to break backward compatibility; this is how they do JDK development. It fits the purpose of cleaning up things that cannot be cleaned up any other way.

So not using a JDK feature and leaving garbage on disk, because of the possible existence of some abstract JDK out there with different behavior (on which C* would not likely work anyway), just sacrifices reliability for nothing.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159477#comment-14159477 ]

Pavel Yaskevich commented on CASSANDRA-7872:

I'm not trying to mix terms here; I just gave a concrete example for reference, but we have multiple things. My point is that there is no guaranteed phase or time when phantom (or any other) reference cleanup is going to happen. If #3 is intended as a "Plan B" in case things go wrong with SSTable reference counting, then by relying on phantom references we rely purely on an implementation detail: in some JVMs cleanup happens on a CMS or minor collection, which would not be true for all GC collector and/or JVM implementations. By doing so we create a false sense of safety for everybody, which is a bad thing, so I'm strongly -1 on doing so.

> ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
>
>                 Key: CASSANDRA-7872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7872
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>             Fix For: 2.0.11
>         Attachments: EnsureNoObsoleteSSTables-7872-v2.0.txt
>
> Since CASSANDRA-4436, compacted sstables are no longer marked with a COMPACTED_MARKER file. Instead, after they are compacted, DataTracker calls SSTableReader.markObsolete(), but the actual deletion happens later, on SSTableReader.releaseReference().
> This reference counting is very fragile: it is very easy to introduce a hard-to-catch and rare bug so that the reference count never reaches 0 (like CASSANDRA-6503, for example).
> This means that, very rarely, obsolete sstable files are not removed from disk (but are not used anymore by cassandra to read data).
> If more than gc grace time has passed since the sstable file was not removed from disk and the operator either issues nodetool refresh or just reboots a node, these obsolete files are discovered and opened for read by the node. So deleted data is resurrected and quickly spread by RR to the whole cluster.
> Because the consequences are very serious (even a single not removed obsolete sstable file could render your data useless), this patch makes sure no obsolete sstable file can be opened for read by:
> 1. Removing sstables on CFS init by analyzing sstable generations (an sstable is removed if there is another sstable listing it as an ancestor).
> 2. Reimplementing the COMPACTED_MARKER file for sstables. This marker is created as soon as markObsolete is called. This is necessary because generation info can be lost (when sstables compact to none).
> 3. To remove sstables sooner than restart, reimplementing the good old GC phantom reference queue as well.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
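Editorial illustration of step 1 in the quoted description: a minimal sketch, not the code from the attached patch. It assumes that, on startup, each sstable found on disk can report its own generation number and the generations it lists as ancestors. The class name `ObsoleteSSTableSketch`, the method `obsoleteByAncestors`, and the map shape are hypothetical and chosen only for this example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Cassandra: illustrates the ancestor check only.
final class ObsoleteSSTableSketch {

    /**
     * Input: generation of every sstable found on disk -> generations it lists as ancestors.
     * Output: generations that are still on disk although a newer sstable already
     * claims them as ancestors, i.e. candidates for removal before anything is opened.
     */
    static Set<Integer> obsoleteByAncestors(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        Set<Integer> obsolete = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values()) {
            for (Integer ancestor : ancestors) {
                // Only generations that are actually present on disk matter here.
                if (ancestorsByGeneration.containsKey(ancestor)) {
                    obsolete.add(ancestor);
                }
            }
        }
        return obsolete;
    }
}
```

Note that this check alone cannot catch sstables that "compact to none", since no surviving sstable lists them as an ancestor; that is the gap the description's step 2 (the marker file) is meant to close.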
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159475#comment-14159475 ]

Oleg Anastasyev commented on CASSANDRA-7872:

Aren't you mixing up the Cleaner of MappedByteBuffer, which relies on finalizers, with the phantom ref queue? You may have observed that MappedByteBuffers don't release their mapped memory to the OS until a Full GC; this is a well-known, 7-year-old "bug-o-feature" of the JDK. The source of this behavior is how finalize() methods are processed in the JDK. But PhantomRef processing is a different story: there are no finalizers there, so their processing happens on each CMS cycle.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159474#comment-14159474 ]

Pavel Yaskevich commented on CASSANDRA-7872:

That's exactly my point: "certain point in time" could be only when a Full GC is performed as a fallback for CMS (exactly what we have observed previously), and that point could take weeks to happen, which means that the system will most likely run out of disk space even before phantom references are determined to be cleaned. That brings us back to the point about relying on an implementation detail (in our case, that it's going to happen on a minor collection or CMS) with no execution guarantees, which gives everybody a false impression of safety where there is none.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159472#comment-14159472 ]

Oleg Anastasyev commented on CASSANDRA-7872:

Well, this is part of the PhRef contract.

http://docs.oracle.com/javase/6/docs/api/java/lang/ref/PhantomReference.html
{quote}
If the garbage collector determines at a certain point in time that the referent of a phantom reference is phantom reachable, then at that time or at some later time it will enqueue the reference
{quote}

The key point here is not about CMS, but about some certain time while running the program. This is enough for cleaning up resources which cannot be cleaned up any other way due to bugs, etc.
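Editorial illustration of the mechanism under discussion: a minimal sketch of a phantom reference queue, assuming each tracked reader object is associated with a file path to clean up. All names here (`PhantomCleanupSketch`, `PathRef`, `track`, `drain`) are hypothetical and are not taken from the attached patch. The sketch only uses the standard `java.lang.ref` API; when the collector actually enqueues a reference is exactly the point the thread disagrees about, and the code makes no promise about it.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Cassandra code.
final class PhantomCleanupSketch {

    /** Phantom reference carrying the path to clean up; it must not hold the referent itself. */
    static final class PathRef extends PhantomReference<Object> {
        final String path;

        PathRef(Object referent, ReferenceQueue<Object> queue, String path) {
            super(referent, queue);
            this.path = path;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The reference objects themselves must stay strongly reachable,
    // otherwise they can be collected without ever being enqueued.
    private final Set<PathRef> live = ConcurrentHashMap.newKeySet();

    /** Registers a reader; the returned reference is enqueued once the reader is phantom reachable. */
    PathRef track(Object reader, String path) {
        PathRef ref = new PathRef(reader, queue, path);
        live.add(ref);
        return ref;
    }

    /** Non-blocking drain: returns the paths whose readers have been enqueued so far. */
    List<String> drain() {
        List<String> reclaimed = new ArrayList<>();
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            PathRef ref = (PathRef) r;
            live.remove(ref);
            reclaimed.add(ref.path); // a real fallback would log the miscount and delete files here
        }
        return reclaimed;
    }
}
```

In a long-running process `drain()` would typically be called periodically from a background thread. How soon a leaked reader shows up in the queue depends on the collector and JVM in use, which is the objection raised in the replies.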
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159470#comment-14159470 ]

Pavel Yaskevich commented on CASSANDRA-7872:

bq. Having it deleted on next CMS gc (i.e. at some not exactly predictable point in near future) is IMO better than having them never deleted.

If you can back the quoted statement up with any document (JMM or other) that explicitly guarantees the cleanup behavior you described (that is, that it's guaranteed to happen on a CMS or minor collection for phantom references), I would be fine with adding #3. Meanwhile, there is no real point in arguing about adding code that relies on an implementation detail and may never actually trigger.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159467#comment-14159467 ]

Oleg Anastasyev commented on CASSANDRA-7872:

I take it you're talking not about STW but about the fact that the moment of cleanup cannot be predicted or guaranteed to happen. Yes, if you build your entire critical resource cleanup system on ref queues, that is a problem. But that is not the case in this particular ticket. Deletion of most resources happens normally when the sstable ref count reaches 0, i.e. as predictably as before. Cleanup on the ref queue is used only for those sstables which miscount their refs and so would never be deleted until restart. Having it deleted on next CMS gc (i.e. at some not exactly predictable point in near future) is IMO better than having them never deleted.

Some of our C* clusters have been running for several months and even years with no reboot, so not cleaning up resources until restart is an operational pain.

As far as I can see, CASSANDRA-7705 is planned for 3.0, which is far from being ready. So I suggest it is better to have some resource cleanup code now, in 2.0 and 2.1. It could be removed later, in the 3.0 release.

Another benefit of having #3 is that it could help catch reference-miscount bugs that leave resources uncleaned, because it writes to the logs when it detects a miscount. Without #3, miscounts pass unnoticed. This could be used for debugging CASSANDRA-7705 as well.
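Editorial illustration of the normal deletion path mentioned in this comment: a minimal sketch of reference-counted deletion, assuming the tracker holds one reference of its own and that deletion is triggered by the release that brings the count to 0 on an obsolete sstable. The method names `markObsolete()` and `releaseReference()` follow the ticket description; `acquireReference()`, `isDeleted()`, and the class itself are assumptions made for this example, and a boolean flag stands in for the actual file deletion.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the real SSTableReader.
final class RefCountedSSTableSketch {
    private final AtomicInteger references = new AtomicInteger(1); // the tracker's own reference
    private final AtomicBoolean obsolete = new AtomicBoolean(false);
    private final AtomicBoolean deleted = new AtomicBoolean(false);

    /** Takes a reference for a reader; fails once the count has already dropped to 0. */
    boolean acquireReference() {
        while (true) {
            int n = references.get();
            if (n <= 0) {
                return false;
            }
            if (references.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    /** Called after compaction: the sstable should go away once nobody uses it. */
    void markObsolete() {
        obsolete.set(true);
    }

    /** Deletion happens only on the release that brings the count to 0. */
    void releaseReference() {
        if (references.decrementAndGet() == 0 && obsolete.get()) {
            deleted.set(true); // stand-in for deleting the files on disk
        }
    }

    boolean isDeleted() {
        return deleted.get();
    }
}
```

If any caller forgets a single `releaseReference()`, the count never reaches 0 and the obsolete files stay on disk. That is the miscount case the ref queue in #3 is meant to catch and log.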
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159465#comment-14159465 ]

Pavel Yaskevich commented on CASSANDRA-7872:

I'm not saying that STW is required to clean up weak refs, but rather that it was (and probably still is) not guaranteed to happen at any given phase and is an implementation detail. I think we tried multiple versions of OpenJDK and some versions of Oracle JDK for 1.7 last year, although I can't recall which ones. We tried to make it work for the memory-mapped mode a couple of times before, and I'm coming from the experience that it never reliably performed as expected; that's why we call Cleaner explicitly once the reference count for an SSTable reaches 0 instead of relying on weak ref cleanup. Personally, I don't think it's a good idea to rely on this, especially as we have CASSANDRA-7705 for future versions, so I would be very happy to have #1 and #2 but not #3.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159459#comment-14159459 ]

Oleg Anastasyev commented on CASSANDRA-7872:

No, phantom ref processing does not require an STW Full GC. On Java 6, 7, and 8 it gets processed during the CMS remark phase. If you enable PrintGCDetails you'll see the timing of this stage in the logs, for example:
{code}
[Rescan (parallel) , 0.0159280 secs]
[weak refs processing, 0.0032600 secs] <--- here
[class unloading, 0.0082150 secs][scrub symbol table, 0.0078220 secs][scrub string table, 0.0013500 secs] [1 CMS-remark: 4480383K(24117248K)] 4506007K(28835840K), 0.1030490 secs] [Times: user=1.00 sys=0.01, real=0.10 secs]
{code}
This is how it has worked since 1.6_24 and in all more modern JDKs, including all of Java 7 and on. JDK 6_24 is 3 years old now, so this way of working can be considered stable.

Could you provide details on which Java version or build number it did not work this way?
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159285#comment-14159285 ]

Pavel Yaskevich commented on CASSANDRA-7872:

Yes, STW Full GC. What I am trying to say is that cleanup behavior is different across available JVMs, so I am not really a fan of including something which would work only on certain implementations.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159188#comment-14159188 ]

Oleg Anastasyev commented on CASSANDRA-7872:

Do you mean a stop-the-world full GC? The phantom queue is cleaned up at the CMS cycle on Java 6. Since Java 7 (I am not sure since what exact build number, but at least on 7_60 it works) it is cleaned on either a ParNew or a CMS cycle, as soon as the reference is GCed.

Here is a log snippet from production demonstrating it on ParNew, for example:
{code}
04:45:40,505 org.apache.cassandra.db.Memtable$FlushRunnable Completed flushing /mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489409-Data.db (42 bytes) for commitlog position ReplayPosition(segmentId=1410945220011, position=86495573)
04:45:40,670 org.apache.cassandra.db.Memtable$FlushRunnable Completed flushing /mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489410-Data.db (152 bytes) for commitlog position ReplayPosition(segmentId=1410945220011, position=86503582)
04:45:40,672 org.apache.cassandra.db.compaction.CompactionTask Compacting [SSTableReader(path='/mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489409-Data.db'), SSTableReader(path='/mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489410-Data.db'), SSTableReader(path='/mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489407-Data.db'), SSTableReader(path='/mnt/db1/system/compactions_in_progress/system-compactions_in_progress-jb-1489408-Data.db')]
Oct 4 04:45:40 [GC[ParNew
Oct 4 04:45:40 Desired survivor size 268435456 bytes, new threshold 3 (max 3)
Oct 4 04:45:40 - age 1: 12869048 bytes, 12869048 total
Oct 4 04:45:40 - age 2: 3157328 bytes, 16026376 total
Oct 4 04:45:40 - age 3: 1676208 bytes, 17702584 total
Oct 4 04:45:40 : 4206407K->21161K(4718592K), 0.0428280 secs] 7744224K->3560919K(28835840K), 0.0462120 secs] [Times: user=0.55 sys=0.00, real=0.05 secs]
04:45:40,786 org.apache.cassandra.io.sstable.SSTableDeletingQueue$SSTableDeletingReference$1 Obsolete SSTable /mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489409 not used anymore. Its reference counting were broken or its deletion task is stuck. Forcing its remove now
04:45:40,787 org.apache.cassandra.io.sstable.SSTable Deleted /mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489409
04:45:40,788 org.apache.cassandra.io.sstable.SSTableDeletingQueue$SSTableDeletingReference$1 Obsolete SSTable /mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489410 not used anymore. Its reference counting were broken or its deletion task is stuck. Forcing its remove now
04:45:40,788 org.apache.cassandra.io.sstable.SSTable Deleted /mnt/db3/system/compactions_in_progress/system-compactions_in_progress-jb-1489410
{code}
The compactions_in_progress sstable reader was created and dereferenced between two ParNew collections, so its ref is cleaned on ParNew. If an sstable reader reference survives the new gen collection, its phantom will be cleaned on the closest CMS cycle that cleans its reference from the heap.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159050#comment-14159050 ]

Pavel Yaskevich commented on CASSANDRA-7872:

I understand the intention behind #3, and we have already tried to do a similar thing before, but as I mentioned, it doesn't really help: we are trying our best to make sure that FullGC happens very, very rarely, so with high probability #3 will never actually trigger, and there is no real point in adding dead code.

> ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
>
> Key: CASSANDRA-7872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7872
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Oleg Anastasyev
> Assignee: Oleg Anastasyev
> Fix For: 2.0.11
>
> Attachments: EnsureNoObsoleteSSTables-7872-v2.0.txt
>
> Since CASSANDRA-4436, compacted sstables are no longer marked with a COMPACTED_MARKER file. Instead, after they are compacted, DataTracker calls SSTableReader.markObsolete(), but the actual deletion happens later, on SSTableReader.releaseReference().
> This reference counting is very fragile: it is very easy to introduce a rare, hard-to-catch bug that keeps the reference count from ever reaching 0 (CASSANDRA-6503, for example).
> This means that, very rarely, obsolete sstable files are not removed from disk (although Cassandra no longer uses them to read data).
> If more than gc grace time has passed since the sstable file should have been removed, and the operator either issues nodetool refresh or just reboots a node, these obsolete files are discovered and opened for read by the node. Deleted data is then resurrected and quickly spread by RR to the whole cluster.
> Because the consequences are very serious (even a single obsolete sstable file that was not removed could render your data useless), this patch makes sure no obsolete sstable file can be opened for read by:
> 1. Removing sstables on CFS init by analyzing sstable generations (an sstable is removed if there is another sstable listing it as an ancestor).
> 2. Reimplementing the COMPACTED_MARKER file for sstables. This marker is created as soon as markObsolete is called. This is necessary because generation info can be lost (when sstables compact to none).
> 3. To remove sstables sooner than restart, reimplementing the good old GC phantom reference queue as well.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
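Item 2 of the description (a marker file written as soon as an sstable is marked obsolete) can be illustrated with a minimal sketch. This is not the attached patch: the class name, the "-Data.db" and "-Compacted" suffixes, and the method names below are assumptions made for the example only.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a "compacted marker" scheme. File naming and method
// names are assumptions for illustration, not Cassandra's actual code.
final class CompactedMarkerSketch {
    static final String DATA_SUFFIX = "-Data.db";
    static final String MARKER_SUFFIX = "-Compacted";

    // Maps ".../ks-cf-1-Data.db" to ".../ks-cf-1-Compacted".
    static Path markerFor(Path dataFile) {
        String name = dataFile.getFileName().toString();
        String base = name.substring(0, name.length() - DATA_SUFFIX.length());
        return dataFile.resolveSibling(base + MARKER_SUFFIX);
    }

    // Called at the moment an sstable becomes obsolete, before any deferred
    // deletion runs, so the fact survives a crash or a leaked reference.
    static void markObsolete(Path dataFile) throws IOException {
        Path marker = markerFor(dataFile);
        if (!Files.exists(marker)) {
            Files.createFile(marker);
        }
    }

    // Called on startup or refresh: returns the data files that are safe to
    // open and deletes the ones that carry a marker.
    static List<Path> loadableSSTables(Path dir) throws IOException {
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*" + DATA_SUFFIX)) {
            for (Path data : stream) {
                candidates.add(data);
            }
        }
        List<Path> live = new ArrayList<>();
        for (Path data : candidates) {
            Path marker = markerFor(data);
            if (Files.exists(marker)) {
                // Delete the data first: if we crash in between, the marker
                // is still there and the next scan finishes the job.
                Files.deleteIfExists(data);
                Files.deleteIfExists(marker);
            } else {
                live.add(data);
            }
        }
        return live;
    }
}
```

The point of the design is that the decision "this file is obsolete" is persisted on disk immediately, so it does not depend on a reference count ever reaching zero.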
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159005#comment-14159005 ]

Oleg Anastasyev commented on CASSANDRA-7872:

#3 is not necessary, of course, but it helps free disk space earlier if reference counting fails. This could be essential if nodes are not rebooted for a long time. And if reference counting has not failed, cleaning this ref queue is basically a no-op.
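The phantom reference queue discussed as #3 can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: PhantomCleanupSketch, ReaderRef, track, released, and drain are hypothetical names, and a real reader object is stood in for by a plain Object. When and whether the collector enqueues a reference depends on the JVM and its GC settings, which is the concern raised about FullGC in this thread.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch of a phantom-reference safety net for leaked readers.
final class PhantomCleanupSketch {
    // Remembers which file to delete once its reader object is collected.
    static final class ReaderRef extends PhantomReference<Object> {
        final String path;

        ReaderRef(Object reader, String path, ReferenceQueue<Object> queue) {
            super(reader, queue);
            this.path = path;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The references themselves must stay strongly reachable, otherwise they
    // could be collected before the GC gets a chance to enqueue them.
    private final Set<ReaderRef> pending = ConcurrentHashMap.newKeySet();

    ReaderRef track(Object reader, String path) {
        ReaderRef ref = new ReaderRef(reader, path, queue);
        pending.add(ref);
        return ref;
    }

    // Normal path: reference counting worked and the file is already gone.
    void released(ReaderRef ref) {
        pending.remove(ref);
        ref.clear();
    }

    // Polls the queue without blocking. If reference counting never failed,
    // nothing is pending and this does no deletions. Returns the number of
    // paths handed to the deleter.
    int drain(Consumer<String> deleter) {
        int deleted = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            ReaderRef ref = (ReaderRef) r;
            if (pending.remove(ref)) {
                deleter.accept(ref.path);
                deleted++;
            }
        }
        return deleted;
    }
}
```

A periodic task could call drain with a deleter that removes the files on disk. The sketch only shows why the call is cheap when no reader has leaked; it does not make the cleanup any more timely than the garbage collector allows.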
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158935#comment-14158935 ]

Pavel Yaskevich commented on CASSANDRA-7872:

In case you are talking about the "compaction log" CF: this fixes a slightly different situation. When compaction succeeds (the "compaction log" entry is removed), the file is not removed right away but queued for removal, so in case of failure the old "compacted" files are still left on disk, and there is no indication that those files should be removed except the new sstable's metadata, which points to them as its ancestors. The attached patch uses that information to determine whether all of the sstables present in the data directories should be loaded, discarding the ones which are compacted but for some reason are still there.
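The ancestor check described in the comment above can be sketched as a small filter over sstable metadata. This is a simplified illustration, not the attached patch: it assumes the metadata has already been read into a map from each sstable's generation number to the generations it lists as ancestors, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: decide which on-disk sstables to load by looking at
// the ancestors recorded in the metadata of the other sstables.
final class AncestorFilterSketch {
    // An sstable is obsolete if it is still on disk and some other sstable
    // on disk lists its generation as an ancestor.
    static Set<Integer> obsoleteGenerations(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        Set<Integer> obsolete = new TreeSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values()) {
            for (Integer ancestor : ancestors) {
                if (ancestorsByGeneration.containsKey(ancestor)) {
                    obsolete.add(ancestor);
                }
            }
        }
        return obsolete;
    }

    // Everything on disk minus the obsolete generations.
    static Set<Integer> loadableGenerations(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        Set<Integer> loadable = new TreeSet<>(ancestorsByGeneration.keySet());
        loadable.removeAll(obsoleteGenerations(ancestorsByGeneration));
        return loadable;
    }
}
```

For example, if generations 1, 2, 3, and 4 are on disk and generation 3 lists 1 and 2 as ancestors, only 3 and 4 would be loaded. As the issue description notes, this check cannot help when the compaction produced no output sstable, because then nothing on disk records the ancestors.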
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158918#comment-14158918 ]

Jonathan Ellis commented on CASSANDRA-7872:

Doesn't the compaction information recorded in the system table already ensure that 1. either only new sstables are kept (if compaction completed) or only old ones are retained (if it did not), and 2. old ones that are not deleted will be deleted on startup?
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158833#comment-14158833 ]

Pavel Yaskevich commented on CASSANDRA-7872:

I can agree with #1 and #2 from the description (with minor code refactoring in the v2.0 patch), but I would rather not include #3, as phantom references are tricky business and are only cleaned up by FullGC.
[jira] [Commented] (CASSANDRA-7872) ensure compacted obsolete sstables are not open on node restart and nodetool refresh, even on sstable reference miscounting or deletion tasks are failed.
[ https://issues.apache.org/jira/browse/CASSANDRA-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139954#comment-14139954 ]

Benedict commented on CASSANDRA-7872:

See also CASSANDRA-7705. Ensuring deletion on startup is still not a bad thing, but we have more robust management coming in 2.1 or 3.0, so we won't want to merge all of this upstream.