[jira] [Updated] (CASSANDRA-14543) Hinted handoff to replay purgeable tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-14543:
---
    Resolution: Won't Fix
        Status: Resolved  (was: Awaiting Feedback)

> Hinted handoff to replay purgeable tombstones
> --
>
> Key: CASSANDRA-14543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14543
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jay Zhuang
> Priority: Minor
>
> Hinted-handoff currently only dispatches and applies the mutations that are
> within GCGS:
> [{{Hint.java:97}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/Hint.java#L97].
> Which is to make sure it won't resurrect any deleted data.
> But replaying tombstones should be safe, it could reduce the chance to have
> [un-repairable inconsistent
> data|https://lists.apache.org/thread.html/2d3d39d960143d4d2146ed2530821504ff855e832713dec7d0afd8ac@%3Cdev.cassandra.apache.org%3E].
> Here is the user scenario it tries to fix:
> {noformat}
> 1. Create a 3 nodes cluster
> 2. Create a table with small gc_grace_seconds (for reproducing purpose):
> CREATE KEYSPACE foo WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 3};
> CREATE TABLE foo.bar (
> id int PRIMARY KEY,
> name text
> ) WITH gc_grace_seconds=30;
> 3. Insert data with consistency all:
> INSERT INTO foo.bar (id, name) VALUES(1, 'cstar');
> 4. stop 1 node
> $ ccm node2 stop
> 5. Delete the data with consistency quorum:
> DELETE FROM foo.bar WHERE id=1;
> 6. Wait 30 seconds and then start node2:
> $ ccm node2 start
> {noformat}
> Now, node2 has the data, node1/node3 have the purgeable tombstone. It
> triggers RR every time which sends data from node2 to node1/node3 but repairs
> nothing.
> With purgeable tombstones hints handoff, it at least will dispatch the
> tombstone and delete the data on node2. It won't fix the root cause but
> reduce the chance to have this issue.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
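For readers following the ticket, the GCGS cutoff it describes can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code behind the Hint.java link: the class `HintLivenessSketch`, the method `isLive` and its parameters are hypothetical names, and the real check may consider more inputs than a single gc_grace_seconds value.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the GCGS cutoff discussed in the ticket;
// this is NOT Cassandra's actual Hint class.
final class HintLivenessSketch
{
    /**
     * A hint is treated as live (safe to dispatch and apply) only while
     * creationTime + gc_grace_seconds is still strictly in the future.
     */
    static boolean isLive(long creationTimeMillis, int gcGraceSeconds, long nowMillis)
    {
        long expirationMillis = creationTimeMillis + TimeUnit.SECONDS.toMillis(gcGraceSeconds);
        return expirationMillis > nowMillis;
    }

    public static void main(String[] args)
    {
        long created = 0L;
        // With gc_grace_seconds=30, as in the reproduction steps, a hint replayed
        // 10 s after creation is still live, while one replayed after 31 s is not.
        System.out.println(isLive(created, 30, TimeUnit.SECONDS.toMillis(10)));
        System.out.println(isLive(created, 30, TimeUnit.SECONDS.toMillis(31)));
    }
}
```

In the scenario from the ticket, node2 comes back after the 30 second window, so under a check like this the hinted delete is no longer applied and node2 keeps the row.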
[jira] [Commented] (CASSANDRA-14543) Hinted handoff to replay purgeable tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528468#comment-16528468 ]

Jay Zhuang commented on CASSANDRA-14543:
---

{quote}
[~iamaleksey]: Replaying just the tombstones might be safe-ish, but it’s only helping with your issue in a very narrow time window. And there will be a price to pay for this: hint dispatch will have to become less efficient if we end up inspecting and filtering out every mutation.
{quote}

Makes sense to me. Mostly it's for the use case where {{GCGS < max_hint_window_in_ms}}, which always seems like a bad idea. Tombstones may hit this issue; for normal data, hints older than GCGS ({{>GCGS}}) are not dispatched even within the hint window.

Should we consider logging a warning or error message when {{GCGS < max_hint_window_in_ms}}?

(BTW: please review this minor patch to improve the hints handoff performance: CASSANDRA-14536 :) ).
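The warning proposed in the comment could look roughly like the sketch below. It is only an illustration of the comparison {{GCGS < max_hint_window_in_ms}}: the class `GcGraceHintWindowCheck`, the method `warningFor` and the message text are hypothetical, and the three-hour hint window used in `main` is an example value chosen for the demonstration.

```java
import java.util.Optional;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the proposed sanity check; not an existing Cassandra API.
final class GcGraceHintWindowCheck
{
    /**
     * Returns a warning message when gc_grace_seconds is shorter than the
     * hint window, i.e. when hints can outlive the period in which they are
     * still applied. Returns empty when the configuration looks consistent.
     */
    static Optional<String> warningFor(String table, int gcGraceSeconds, long maxHintWindowMillis)
    {
        long gcGraceMillis = TimeUnit.SECONDS.toMillis(gcGraceSeconds);
        if (gcGraceMillis >= maxHintWindowMillis)
            return Optional.empty();

        return Optional.of(String.format(
            "gc_grace_seconds (%d s) of %s is shorter than max_hint_window_in_ms (%d ms); "
            + "hints older than gc_grace_seconds will not be applied",
            gcGraceSeconds, table, maxHintWindowMillis));
    }

    public static void main(String[] args)
    {
        long exampleHintWindowMillis = TimeUnit.HOURS.toMillis(3); // example value only
        // The table from the reproduction steps uses gc_grace_seconds=30.
        warningFor("foo.bar", 30, exampleHintWindowMillis).ifPresent(System.out::println);
    }
}
```

Returning an `Optional` keeps the check free of any logging dependency, so the caller decides whether the result becomes a warning or an error.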
[jira] [Updated] (CASSANDRA-14551) ReplicationAwareTokenAllocator should block bootstrap if no replication number is set
[ https://issues.apache.org/jira/browse/CASSANDRA-14551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-14551:
---
    Status: Patch Available  (was: Open)

> ReplicationAwareTokenAllocator should block bootstrap if no replication
> number is set
> -
>
> Key: CASSANDRA-14551
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14551
> Project: Cassandra
> Issue Type: Bug
> Components: Configuration
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Minor
>
> We're using
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm].
> When bootstrapping a new DC, the tokens are not well distributed. The
> problem is because the replication number is not set for the new DC before
> the bootstrap.
> I would suggest blocking the bootstrap if replication number is not set. It's
> unsafe to assume the default replicas is 1. Which also causes the following
> invalid stats:
> {noformat}
> WARN [main] 2018-06-29 17:30:55,696 TokenAllocation.java:69 - Replicated
> node load in datacenter before allocation max NaN min NaN stddev NaN
> WARN [main] 2018-06-29 17:30:55,696 TokenAllocation.java:70 - Replicated
> node load in datacenter after allocation max NaN min NaN stddev NaN
> {noformat}
[jira] [Commented] (CASSANDRA-14551) ReplicationAwareTokenAllocator should block bootstrap if no replication number is set
[ https://issues.apache.org/jira/browse/CASSANDRA-14551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528420#comment-16528420 ]

Jay Zhuang commented on CASSANDRA-14551:
---

Here is the patch, please review:
| Branch | uTest | dTest |
| [14551-trunk|https://github.com/cooldoger/cassandra/tree/14551-trunk] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14551-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14551-trunk] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/586/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/586/] |

cc. [~dikanggu]
[jira] [Created] (CASSANDRA-14551) ReplicationAwareTokenAllocator should block bootstrap if no replication number is set
Jay Zhuang created CASSANDRA-14551:
--

Summary: ReplicationAwareTokenAllocator should block bootstrap if no replication number is set
Key: CASSANDRA-14551
URL: https://issues.apache.org/jira/browse/CASSANDRA-14551
Project: Cassandra
Issue Type: Bug
Components: Configuration
Reporter: Jay Zhuang
Assignee: Jay Zhuang

We're using [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm].
When bootstrapping a new DC, the tokens are not well distributed. The problem is because the replication number is not set for the new DC before the bootstrap.
I would suggest blocking the bootstrap if replication number is not set. It's unsafe to assume the default replicas is 1. Which also causes the following invalid stats:
{noformat}
WARN [main] 2018-06-29 17:30:55,696 TokenAllocation.java:69 - Replicated node load in datacenter before allocation max NaN min NaN stddev NaN
WARN [main] 2018-06-29 17:30:55,696 TokenAllocation.java:70 - Replicated node load in datacenter after allocation max NaN min NaN stddev NaN
{noformat}
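The suggestion to block bootstrap can be pictured with the sketch below. It is not the attached patch: `ReplicationGuardSketch`, `requireReplicationFactor` and `mean` are hypothetical names, and the idea that the NaN values come from averaging over an empty set of replicated loads is an assumption made for illustration, not something the ticket states.

```java
import java.util.Map;

// Hypothetical sketch of "fail fast instead of assuming RF=1";
// not the actual TokenAllocation code.
final class ReplicationGuardSketch
{
    /**
     * Returns the replication factor configured for the datacenter, or throws
     * if none is set, rather than silently falling back to a default of 1.
     */
    static int requireReplicationFactor(Map<String, Integer> replicationByDc, String dc)
    {
        Integer rf = replicationByDc.get(dc);
        if (rf == null || rf <= 0)
            throw new IllegalStateException(
                "No replication factor configured for datacenter " + dc + "; refusing to allocate tokens");
        return rf;
    }

    /**
     * Arithmetic mean. For an empty array this evaluates 0.0 / 0, which is NaN
     * in Java floating-point arithmetic: one plausible source of "max NaN min NaN".
     */
    static double mean(double[] loads)
    {
        double sum = 0.0;
        for (double load : loads)
            sum += load;
        return sum / loads.length;
    }

    public static void main(String[] args)
    {
        System.out.println(mean(new double[0]));
        System.out.println(requireReplicationFactor(Map.of("dc1", 3), "dc1"));
    }
}
```

Throwing early surfaces the misconfiguration at bootstrap time, before any tokens have been picked for the new DC.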
[jira] [Commented] (CASSANDRA-9608) Support Java 11
[ https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528106#comment-16528106 ]

Jason Brown commented on CASSANDRA-9608:
---

I've taken a first pass through the scripts and build.xml parts (all the non-code stuff), and on the whole it's looking pretty good. I've made a few minor comments on the PR, but I have these points as well:

- You have the Java version check code copied across several scripts, and it looks like clients.in.sh is used by many other scripts (so I guess this is the 'canonical' location?). Should we have the main bin/cassandra.in.sh use this? Maybe the one in tools/bin as well? Also, I don't think (I may be wrong) the cassandra.in.sh in the debian/redhat packaging can call the clients.in.sh script. wdyt?
- The changes you made to cassandra-env.sh need to be made to cassandra-env.ps1 (wrt Java version checking). There are also some Windows {{.bat}} files in the code base. Can you check whether they need updates as well?
- Can we add a simple note to conf/jvm8-clients.options, with something like this: "this file is intentionally blank"? Or should we just get rid of it, and add it when we actually need it?

I suspect the code will be easier to review, so hopefully I can knock that out in short order.

> Support Java 11
> ---
>
> Key: CASSANDRA-9608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9608
> Project: Cassandra
> Issue Type: Task
> Reporter: Robert Stupp
> Assignee: Robert Stupp
> Priority: Minor
> Fix For: 4.x
>
> Attachments: jdk_9_10.patch
>
>
> This ticket is intended to group all issues found to support Java 9 in the
> future.
> From what I've found out so far:
> * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved.
> It can be easily solved using this patch:
> {code}
> - artifactId="cobertura"/>
> + artifactId="cobertura">
> +
> +
> {code}
> * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods
> {{monitorEnter}} + {{monitorExit}}. These methods are used by
> {{o.a.c.utils.concurrent.Locks}} which is only used by
> {{o.a.c.db.AtomicBTreeColumns}}.
> I don't mind to start working on this yet since Java 9 is in a too early
> development phase.
[jira] [Commented] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
[ https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528103#comment-16528103 ] ASF GitHub Bot commented on CASSANDRA-6541: --- Github user jasobrown commented on a diff in the pull request: https://github.com/apache/cassandra/pull/236#discussion_r199253772 --- Diff: conf/jvm11.options --- @@ -0,0 +1,89 @@ +### +#jvm11.options# +# # +# See jvm.options. This file is specific for Java 11 and newer. # +### + +# +# GC SETTINGS # +# + + + +### CMS Settings +#-XX:+UseParNewGC +#-XX:+UseConcMarkSweepGC +#-XX:+CMSParallelRemarkEnabled +#-XX:SurvivorRatio=8 +#-XX:MaxTenuringThreshold=1 +#-XX:CMSInitiatingOccupancyFraction=75 +#-XX:+UseCMSInitiatingOccupancyOnly +#-XX:CMSWaitDuration=1 +#-XX:+CMSParallelInitialMarkEnabled +#-XX:+CMSEdenChunksRecordAlways +## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541 +#-XX:+CMSClassUnloadingEnabled + + + +### G1 Settings +## Use the Hotspot garbage-first collector. +-XX:+UseG1GC +-XX:+ParallelRefProcEnabled + +# +## Have the JVM do less remembered set work during STW, instead +## preferring concurrent GC. Reduces p99.9 latency. +-XX:G1RSetUpdatingPauseTimePercent=5 +# +## Main G1GC tunable: lowering the pause target will lower throughput and vise versa. +## 200ms is the JVM default and lowest viable setting +## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml. +-XX:MaxGCPauseMillis=500 + +## Optional G1 Settings +# Save CPU time on large (>= 16GB) heaps by delaying region scanning +# until the heap is 70% full. The default in Hotspot 8u40 is 40%. +#-XX:InitiatingHeapOccupancyPercent=70 + +# For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores. +# Otherwise equal to the number of cores when 8 or less. +# Machines with > 10 cores should try setting these to <= full cores. 
+#-XX:ParallelGCThreads=16 +# By default, ConcGCThreads is 1/4 of ParallelGCThreads. +# Setting both to the same value can reduce STW durations. +#-XX:ConcGCThreads=16 + + +### JPMS + +-Djdk.attach.allowAttachSelf=true +--add-exports java.base/jdk.internal.misc=ALL-UNNAMED +--add-opens java.base/jdk.internal.module=ALL-UNNAMED +--add-exports java.base/jdk.internal.ref=ALL-UNNAMED +--add-exports java.base/sun.nio.ch=ALL-UNNAMED +--add-exports java.management.rmi/com.sun.jmx.remote.internal.rmi=ALL-UNNAMED +--add-exports java.rmi/sun.rmi.registry=ALL-UNNAMED +--add-exports java.rmi/sun.rmi.server=ALL-UNNAMED +--add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED + + +### GC logging options -- uncomment to enable + +# Java 11 (and newer) GC logging options: +# See description of https://bugs.openjdk.java.net/browse/JDK-8046148 for details about the syntax +# The following is the equivalent to -XX:+PrintGCDetails -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M +#-Xlog:gc=info,heap*=trace,age*=debug,safepoint=info,promotion*=trace:file=/var/log/cassandra/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10240 --- End diff -- minor nit: `filesize=10240` is the file size in bytes. change to `10485760` if you actually want 10MB files. > New versions of Hotspot create new Class objects on every JMX connection > causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set. > - > > Key: CASSANDRA-6541 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6541 > Project: Cassandra > Issue Type: Bug > Components: Configuration >Reporter: jonathan lacefield >Assignee: Brandon Williams >Priority: Minor > Fix For: 1.2.16, 2.0.6, 2.1 beta2 > > Attachments: dse_systemlog > > > Newer versions of Oracle's Hotspot JVM , post 6u43 (maybe earlier) and 7u25 > (maybe earlier), are experiencing issues with GC and JMX where heap slowly > fills up overtime until OOM or a full GC event occurs, specifically when CMS > is leveraged. 
Adding: > {noformat} > JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled" > {noformat} > The th
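On the `filesize` nit in the review comment above: taking the reviewer's statement that the value is read as a byte count, the arithmetic behind the suggested `10485760` can be checked with a few lines. The class and method names below are hypothetical and exist only for this illustration.

```java
// Hypothetical helper that only verifies the unit conversion from the review comment.
final class GcLogFileSizeSketch
{
    /** Converts a size in binary megabytes (1 MB = 1024 * 1024 bytes) to bytes. */
    static long megabytesToBytes(long megabytes)
    {
        return megabytes * 1024L * 1024L;
    }

    public static void main(String[] args)
    {
        // 10 MB expressed in bytes, the value the reviewer suggests.
        System.out.println(megabytesToBytes(10));
        // 10240 bytes is only 10 KB, which is what the option in the diff amounts to.
        System.out.println(10240L / 1024L);
    }
}
```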
[jira] [Comment Edited] (CASSANDRA-14549) Transient Replication: support logged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-14549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527800#comment-16527800 ]

Ariel Weisberg edited comment on CASSANDRA-14549 at 6/29/18 3:38 PM:
-

I think this is pretty important to have in 4.0, but not as important as having minimal PAXOS support. So if we have to let it slip we should make sure it works correctly but just fails to implement the cheap quorum optimization and document the caveat.

was (Author: aweisberg):
I think this is pretty important to have in 4.0, but not as important as having minimal PAXOS support. So if we have to let it slip we should make sure it fails correctly if you have transient replication enabled and it is documented as a caveat.

> Transient Replication: support logged batches
> -
>
> Key: CASSANDRA-14549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14549
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Blake Eggleston
> Priority: Major
>
[jira] [Commented] (CASSANDRA-14547) Transient Replication: Support paxos
[ https://issues.apache.org/jira/browse/CASSANDRA-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527805#comment-16527805 ]

Ariel Weisberg commented on CASSANDRA-14547:
---

I think we should try and get PAXOS commit to use transient replication. Hopefully we can live with the other phases of PAXOS not using transient replication under the assumption that LWT are a smaller portion of the workload and document the caveat.

There is work here at every step because PAXOS doesn't use the regular read/write path for each phase so it doesn't automatically pick up transient replication. The current code will send writes to all replicas so we don't get the benefit of the cheap quorum optimization.

> Transient Replication: Support paxos
> ---
>
> Key: CASSANDRA-14547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14547
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Blake Eggleston
> Priority: Major
> Fix For: 4.0
>
[jira] [Commented] (CASSANDRA-14549) Transient Replication: support logged batches
[ https://issues.apache.org/jira/browse/CASSANDRA-14549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527800#comment-16527800 ]

Ariel Weisberg commented on CASSANDRA-14549:
---

I think this is pretty important to have in 4.0, but not as important as having minimal PAXOS support. So if we have to let it slip we should make sure it fails correctly if you have transient replication enabled and it is documented as a caveat.

> Transient Replication: support logged batches
> -
>
> Key: CASSANDRA-14549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14549
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Blake Eggleston
> Priority: Major
>
[jira] [Commented] (CASSANDRA-14548) Transient Replication: support counters
[ https://issues.apache.org/jira/browse/CASSANDRA-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527796#comment-16527796 ]

Ariel Weisberg commented on CASSANDRA-14548:
---

For 4.0 I think we should forbid mixing transient replication and counters.

> Transient Replication: support counters
> ---
>
> Key: CASSANDRA-14548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14548
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Blake Eggleston
> Priority: Major
>
[jira] [Assigned] (CASSANDRA-13543) Cassandra SASI index gives unexpected number of results
[ https://issues.apache.org/jira/browse/CASSANDRA-13543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan West reassigned CASSANDRA-13543:
---
    Assignee: Jordan West  (was: Alex Petrov)

> Cassandra SASI index gives unexpected number of results
> ---
>
> Key: CASSANDRA-13543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13543
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Reporter: Alexander Nabatchikov
> Assignee: Jordan West
> Priority: Major
>
> I've faced the issue with LIKE query to the column indexed by SASI index.
> Cassandra can return different number of rows when the data stays immutable.
> {code}
> CREATE TABLE idx_test
> (
> id int,
> str text,
> i int,
> PRIMARY KEY (id)
> );
> CREATE CUSTOM INDEX idx_test_idx ON idx_test (str)
> USING 'org.apache.cassandra.index.sasi.SASIIndex'
> WITH OPTIONS = {
> 'mode': 'CONTAINS',
> 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
> 'tokenization_enable_stemming': 'true',
> 'tokenization_normalize_lowercase': 'true'
> };
> INSERT INTO idx_test (id, str, i) VALUES (1, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (2, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (3, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (4, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (5, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (6, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (7, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (8, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (9, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (10, 'a b c d', 10);
> {code}
> Query:
> {code}
> SELECT * FROM idx_test WHERE str LIKE 'b'
> AND i = 10
> ALLOW FILTERING;
> {code}
> This query mostly returns 0 rows, but sometimes 1 row appears in result row
> set as:
> {code}
> id | i | str
> 10 | 10 | a b c d
> {code}
[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted
[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14423: Resolution: Fixed Reviewers: Marcus Eriksson Reproduced In: 3.11.2, 3.11.0 (was: 3.11.0, 3.11.2) Status: Resolved (was: Ready to Commit) committed as f8912ce9329a8bc360e93cf61e56814135fbab39 > SSTables stop being compacted > - > > Key: CASSANDRA-14423 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14423 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Kurt Greaves >Assignee: Kurt Greaves >Priority: Blocker > Fix For: 2.2.13, 3.0.17, 3.11.3 > > > So seeing a problem in 3.11.0 where SSTables are being lost from the view and > not being included in compactions/as candidates for compaction. It seems to > get progressively worse until there's only 1-2 SSTables in the view which > happen to be the most recent SSTables and thus compactions completely stop > for that table. > The SSTables seem to still be included in reads, just not compactions. > The issue can be fixed by restarting C*, as it will reload all SSTables into > the view, but this is only a temporary fix. User defined/major compactions > still work - not clear if they include the result back in the view but is not > a good work around. > This also results in a discrepancy between SSTable count and SSTables in > levels for any table using LCS. > {code:java} > Keyspace : xxx > Read Count: 57761088 > Read Latency: 0.10527088681224288 ms. > Write Count: 2513164 > Write Latency: 0.018211106398149903 ms. 
> Pending Flushes: 0 > Table: xxx > SSTable count: 10 > SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0] > Space used (live): 894498746 > Space used (total): 894498746 > Space used by snapshots (total): 0 > Off heap memory used (total): 11576197 > SSTable Compression Ratio: 0.6956629530569777 > Number of keys (estimate): 3562207 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 87 > Local read count: 57761088 > Local read latency: 0.108 ms > Local write count: 2513164 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 86.33 > Bloom filter false positives: 43 > Bloom filter false ratio: 0.0 > Bloom filter space used: 8046104 > Bloom filter off heap memory used: 8046024 > Index summary off heap memory used: 3449005 > Compression metadata off heap memory used: 81168 > Compacted partition minimum bytes: 104 > Compacted partition maximum bytes: 5722 > Compacted partition mean bytes: 175 > Average live cells per slice (last five minutes): 1.0 > Maximum live cells per slice (last five minutes): 1 > Average tombstones per slice (last five minutes): 1.0 > Maximum tombstones per slice (last five minutes): 1 > Dropped Mutations: 0 > {code} > Also for STCS we've confirmed that SSTable count will be different to the > number of SSTables reported in the Compaction Bucket's. In the below example > there's only 3 SSTables in a single bucket - no more are listed for this > table. Compaction thresholds haven't been modified for this table and it's a > very basic KV schema. > {code:java} > Keyspace : yyy > Read Count: 30485 > Read Latency: 0.06708991307200263 ms. > Write Count: 57044 > Write Latency: 0.02204061776873992 ms. 
> Pending Flushes: 0 > Table: yyy > SSTable count: 19 > Space used (live): 18195482 > Space used (total): 18195482 > Space used by snapshots (total): 0 > Off heap memory used (total): 747376 > SSTable Compression Ratio: 0.7607394576769735 > Number of keys (estimate): 116074 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 39 > Local read count: 30485 > Local read latency: NaN ms > Local write count: 57044 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 79.76 > Bloom filter false positives: 0 > Bloom filter false ratio: 0.0 > Bloom filter space used: 690912 > Bloom filter off heap memory used: 690760 > Index summary off heap memory used: 54736 > Compression metadata off heap memory used: 1880 > Compacted partition minimum bytes: 73 > Compacted partition maximum bytes: 124 > Compacted partition mean bytes: 96 > Average live cells per slice (last five minutes): NaN > Maximum live cells per slice (last five minutes): 0 > Average tombstones per slice (last five minutes): NaN > Maximum tombstones per slice (last five minutes): 0 > Dropped Mutations: 0 > {code} > {code:java} > Apr 27 03:10:39 cassandra[9263]: TRACE o.a
[jira] [Commented] (CASSANDRA-7282) Faster Memtable map
[ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527305#comment-16527305 ]

Michael Burman commented on CASSANDRA-7282:
---

I've updated this work with the following changes to get this process restarted:

* Ported to the newer storage format and rebased to current master
* Removed the visibility optimization from the resize operation, as I could trigger an NPE through a race condition when large resizes happen within a short period
** I tested different resize algorithms, such as tracking the read and write path chains and resizing if certain criteria are met. This is the strategy used by the Linux kernel's RCU, but I did not manage to make any notable performance gains using that method so I left the original resize algorithm.
* Each node token range is now backed by a separate index.
** Reduces the contention on the index updates, resizes and size parameter update (which was a contention point for all threads previously).
** Apply a normalization for token range hash updates to improve the distribution of hashes and that way the coverage of the index
*** Center the hash range around 0 first and then apply a scaling multiplier (proportional size of the token range compared to the available hash range [Long.MIN_VALUE, Long.MAX_VALUE])
** Allows different growth sizes for each token range
** For system tables and cases where a node might see writes that were not known during the initialization, an overflow index with range [Long.MIN_VALUE, Long.MAX_VALUE] is used (works like the original patch)
** For lookup of the correct index I used a balanced BST. If there's a better way, I'm all ears. Linear search would be as fast if the number of vnodes is small, but this also scales to larger numbers of vnodes.

I've tried to run different workloads against it and also tried to use different hashing methods, including the 32-bit hash from MurmurHash to trigger a bad hash distribution, but I couldn't see any scenario with performance degradation compared to CSLM. With a hash attack (which Murmur is vulnerable to) this can be done of course, but those will break Cassandra node distribution also.

Link to the branch: [https://github.com/burmanm/cassandra/tree/7282-trunk-2]

> Faster Memtable map
> ---
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Benedict
> Priority: Major
> Labels: performance
> Fix For: 4.x
>
> Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg,
> run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast
> majority of users use a hash partitioner, it occurs to me we could maintain a
> hybrid ordered list / hash map. The list would impose the normal order on the
> collection, but a hash index would live alongside as part of the same data
> structure, simply mapping into the list and permitting O(1) lookups and
> inserts.
> I've chosen to implement this initial version as a linked-list node per item,
> but we can optimise this in future by storing fatter nodes that permit a
> cache-line's worth of hashes to be checked at once, further reducing the
> constant factor costs for lookups.
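Two of the ideas in the comment above, the balanced-BST lookup of the per-range index and the "center, then scale" hash normalization, can be sketched as follows. This is a guess at the general shape, not code from the linked branch: `TokenRangeIndexSketch` and all of its members are hypothetical, ranges are treated as inclusive on both ends for simplicity, and the normalization uses `double` arithmetic, which loses precision for large tokens and is only good enough to show the idea.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from the 7282 branch.
final class TokenRangeIndexSketch
{
    // Range start -> range end (both inclusive here). TreeMap is a red-black tree,
    // i.e. a balanced BST, so a lookup is O(log n) in the number of token ranges.
    private final TreeMap<Long, Long> rangeEndByStart = new TreeMap<>();

    void addRange(long start, long end)
    {
        if (end <= start)
            throw new IllegalArgumentException("end must be greater than start");
        rangeEndByStart.put(start, end);
    }

    /**
     * Finds the start of the local range containing the token. An empty result
     * means no range matched and the caller would fall back to the overflow index.
     */
    OptionalLong findRangeStart(long token)
    {
        Map.Entry<Long, Long> candidate = rangeEndByStart.floorEntry(token);
        if (candidate != null && token <= candidate.getValue())
            return OptionalLong.of(candidate.getKey());
        return OptionalLong.empty();
    }

    /**
     * Maps a token from [start, end] onto roughly the full long range:
     * first shift so the range midpoint sits at 0, then multiply by
     * 2^64 / width. The final cast saturates at Long.MIN_VALUE / Long.MAX_VALUE.
     */
    static long normalize(long token, long start, long end)
    {
        if (end <= start)
            throw new IllegalArgumentException("end must be greater than start");
        double width = (double) end - (double) start;
        double center = (double) start + width / 2.0;
        double scale = Math.pow(2.0, 64) / width;
        return (long) (((double) token - center) * scale);
    }

    public static void main(String[] args)
    {
        TokenRangeIndexSketch index = new TokenRangeIndexSketch();
        index.addRange(0L, 1000L);
        index.addRange(2000L, 3000L);
        System.out.println(index.findRangeStart(2500L)); // token inside the second range
        System.out.println(index.findRangeStart(1500L)); // token in a gap: overflow case
        System.out.println(normalize(500L, 0L, 1000L));  // midpoint of the range
    }
}
```

The point of the normalization is that a narrow token range still spreads its keys over the whole hash space of its own index, instead of clustering them in a small slice of it.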
[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bba0d03e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bba0d03e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bba0d03e

Branch: refs/heads/cassandra-3.11
Commit: bba0d03e9c5e62c222734839a9adc83f1aec6f95
Parents: ea62d88 489c2f6
Author: Mick Semb Wever
Authored: Fri Jun 29 16:58:26 2018 +1000
Committer: Mick Semb Wever
Committed: Fri Jun 29 17:00:02 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  67 +++-
 .../db/compaction/AntiCompactionTest.java       | 109 ++-
 3 files changed, 147 insertions(+), 30 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bba0d03e/CHANGES.txt
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bba0d03e/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index f0a4de5,f033bf2..fa6b03e
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -478,115 -474,17 +477,126 @@@ public class CompactionManager implemen
          }, jobs, OperationType.CLEANUP);
      }
  
 +    public AllSSTableOpStatus performGarbageCollection(final ColumnFamilyStore cfStore, TombstoneOption tombstoneOption, int jobs) throws InterruptedException, ExecutionException
 +    {
 +        assert !cfStore.isIndex();
 +
 +        return parallelAllSSTableOperation(cfStore, new OneSSTableOperation()
 +        {
 +            @Override
 +            public Iterable<SSTableReader> filterSSTables(LifecycleTransaction transaction)
 +            {
 +                Iterable<SSTableReader> originals = transaction.originals();
 +                if (cfStore.getCompactionStrategyManager().onlyPurgeRepairedTombstones())
 +                    originals = Iterables.filter(originals, SSTableReader::isRepaired);
 +                List<SSTableReader> sortedSSTables = Lists.newArrayList(originals);
 +                Collections.sort(sortedSSTables, SSTableReader.maxTimestampComparator);
 +                return sortedSSTables;
 +            }
 +
 +            @Override
 +            public void execute(LifecycleTransaction txn) throws IOException
 +            {
 +                logger.debug("Garbage collecting {}", txn.originals());
 +                CompactionTask task = new CompactionTask(cfStore, txn, getDefaultGcBefore(cfStore, FBUtilities.nowInSeconds()))
 +                {
 +                    @Override
 +                    protected CompactionController getCompactionController(Set<SSTableReader> toCompact)
 +                    {
 +                        return new CompactionController(cfStore, toCompact, gcBefore, null, tombstoneOption);
 +                    }
 +                };
 +                task.setUserDefined(true);
 +                task.setCompactionType(OperationType.GARBAGE_COLLECT);
 +                task.execute(metrics);
 +            }
 +        }, jobs, OperationType.GARBAGE_COLLECT);
 +    }
 +
 +    public AllSSTableOpStatus relocateSSTables(final ColumnFamilyStore cfs, int jobs) throws ExecutionException, InterruptedException
 +    {
 +        if (!cfs.getPartitioner().splitter().isPresent())
 +        {
 +            logger.info("Partitioner does not support splitting");
 +            return AllSSTableOpStatus.ABORTED;
 +        }
 +        final Collection<Range<Token>> r = StorageService.instance.getLocalRanges(cfs.keyspace.getName());
 +
 +        if (r.isEmpty())
 +        {
 +            logger.info("Relocate cannot run before a node has joined the ring");
 +            return AllSSTableOpStatus.ABORTED;
 +        }
 +
 +        final DiskBoundaries diskBoundaries = cfs.getDiskBoundaries();
 +
 +        return parallelAllSSTableOperation(cfs, new OneSSTableOperation()
 +        {
 +            @Override
 +            public Iterable<SSTableReader> filterSSTables(LifecycleTransaction transaction)
 +            {
 +                Set<SSTableReader> originals = Sets.newHashSet(transaction.originals());
 +                Set<SSTableReader> needsRelocation = originals.stream().filter(s -> !inCorrectLocation(s)).collect(Collectors.toSet());
 +                transaction.cancel(Sets.difference(originals, needsRelocation));
 +
 +                Map<Integer, List<SSTableReader>> groupedByDisk = groupByDiskIndex(needsRelocation);
 +
 +                int maxSize = 0;
 +                for (List<SSTableReader> diskSSTables : groupedByDisk.values())
 +                    maxSize = Math.max(maxSize, diskSSTables.size());
 +
 +                List<SSTableReader> mixedSSTable
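The filterSSTables override in performGarbageCollection above amounts to an optional filter followed by a sort. A minimal standalone sketch of that selection step follows. It assumes a hypothetical Table class in place of Cassandra's SSTableReader, and it assumes ascending order, since the diff does not show which direction maxTimestampComparator sorts in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GcSelectionSketch
{
    /** Hypothetical stand-in for an SSTable; not a Cassandra class. */
    static final class Table
    {
        final String name;
        final long maxTimestamp;
        final boolean repaired;

        Table(String name, long maxTimestamp, boolean repaired)
        {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.repaired = repaired;
        }
    }

    /**
     * Optionally keep only repaired tables, then order by max timestamp.
     * Ascending order is an assumption made for this sketch.
     */
    static List<String> select(List<Table> originals, boolean onlyPurgeRepairedTombstones)
    {
        List<Table> selected = new ArrayList<>();
        for (Table t : originals)
            if (!onlyPurgeRepairedTombstones || t.repaired)
                selected.add(t);
        selected.sort(Comparator.comparingLong((Table t) -> t.maxTimestamp));

        // Return names only, to keep the result easy to inspect.
        List<String> names = new ArrayList<>();
        for (Table t : selected)
            names.add(t.name);
        return names;
    }
}
```

With onlyPurgeRepairedTombstones set, unrepaired tables never reach the garbage-collection task, which matches the guard in the diff.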
[04/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs
Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for CASSANDRA-14423

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/trunk
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:49:53 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  70 ++-
 .../db/compaction/AntiCompactionTest.java       | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 7b1089e..9d6a9ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.13
+ * Fix bug that prevented compaction of SSTables after full repairs (CASSANDRA-14423)
  * Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
  * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
  * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 419f66e..013fc04 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -460,6 +460,16 @@ public class CompactionManager implements CompactionManagerMBean
         }, jobs, OperationType.CLEANUP);
     }
 
+    /**
+     * Submit anti-compactions for a collection of SSTables over a set of repaired ranges and marks corresponding SSTables
+     * as repaired.
+     *
+     * @param cfs Column family for anti-compaction
+     * @param ranges Repaired ranges to be anti-compacted into separate SSTables.
+     * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+     * @param repairedAt Unix timestamp of when repair was completed.
+     * @return Futures executing anti-compaction.
+     */
     public ListenableFuture<?> submitAntiCompaction(final ColumnFamilyStore cfs,
                                           final Collection<Range<Token>> ranges,
                                           final Refs<SSTableReader> sstables,
@@ -475,6 +485,8 @@ public class CompactionManager implements CompactionManagerMBean
             {
                 for (SSTableReader compactingSSTable : cfs.getTracker().getCompacting())
                     sstables.releaseIfHolds(compactingSSTable);
+                // We don't anti-compact any SSTable that has been compacted during repair as it may have been compacted
+                // with unrepaired data.
                 Set<SSTableReader> compactedSSTables = new HashSet<>();
                 for (SSTableReader sstable : sstables)
                     if (sstable.isMarkedCompacted())
@@ -504,9 +516,17 @@ public class CompactionManager implements CompactionManagerMBean
      *
      * Caller must reference the validatedForRepair sstables (via ParentRepairSession.getActiveRepairedSSTableRefs(..)).
      *
+     * NOTE: Repairs can take place on both unrepaired (incremental + full) and repaired (full) data.
+     * Although anti-compaction could work on repaired sstables as well and would result in having more accurate
+     * repairedAt values for these, we avoid anti-compacting already repaired sstables, as we currently don't
+     * make use of any actual repairedAt value and splitting up sstables just for that is not worth it. However, we will
+     * still update repairedAt if the SSTable is fully contained within the repaired ranges, as this does not require
+     * anticompaction.
+     *
      * @param cfs
      * @param ranges Ranges that the repair was carried out on
      * @param validatedForRepair SSTables containing the repaired ranges. Should be referenced before passing them.
+     * @param txn Transaction across all SSTables that were repaired.
      * @throws InterruptedException
      * @throws IOException
      */
@@ -519,13 +539,7 @@ public class CompactionManager implements CompactionManagerMBean
         logger.info("Starting anticompaction for {}.{} on {}/
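The NOTE added to the Javadoc above describes a three-way decision per SSTable. The diff is truncated before the method body, so the following is only a sketch of one reading of that NOTE, not the committed implementation. The class, the enum, and the closed-interval token model are all hypothetical, and wrap-around ranges are ignored.

```java
final class AntiCompactionDecisionSketch
{
    enum Action { UPDATE_REPAIRED_AT, ANTICOMPACT, SKIP }

    /**
     * Decide what to do with one SSTable covering tokens [first, last].
     * Each repaired range is a closed interval {left, right}; this is a
     * simplification of Cassandra's token ranges made for the sketch.
     */
    static Action decide(long first, long last, boolean alreadyRepaired, long[][] repairedRanges)
    {
        boolean intersects = false;
        for (long[] range : repairedRanges)
        {
            // Fully contained: only the repairedAt metadata needs updating,
            // whether or not the SSTable was repaired before.
            if (range[0] <= first && last <= range[1])
                return Action.UPDATE_REPAIRED_AT;
            if (range[0] <= last && first <= range[1])
                intersects = true;
        }
        if (!intersects)
            return Action.SKIP;
        // Partially covered: splitting an already repaired SSTable just to
        // refine repairedAt is described as not worth it.
        return alreadyRepaired ? Action.SKIP : Action.ANTICOMPACT;
    }
}
```

Only the partially covered, previously unrepaired case pays the cost of rewriting data, which is the trade-off the NOTE argues for.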
[10/10] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5cc68a87
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5cc68a87
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5cc68a87

Branch: refs/heads/trunk
Commit: 5cc68a87359dd02412bdb70a52dfcd718d44a5ba
Parents: 4cb83cb bba0d03
Author: Mick Semb Wever
Authored: Fri Jun 29 17:00:24 2018 +1000
Committer: Mick Semb Wever
Committed: Fri Jun 29 17:00:24 2018 +1000

----------------------------------------------------------------------
----------------------------------------------------------------------
[02/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs
Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for CASSANDRA-14423

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/cassandra-3.0
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:49:53 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  70 ++-
 .../db/compaction/AntiCompactionTest.java       | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
----------------------------------------------------------------------
[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/489c2f69
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/489c2f69
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/489c2f69

Branch: refs/heads/cassandra-3.0
Commit: 489c2f69510b001770d9a59e55ba5d5175019050
Parents: 4e23c9e f8912ce
Author: Mick Semb Wever
Authored: Fri Jun 29 16:53:36 2018 +1000
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:57:34 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  66 ++-
 .../db/compaction/AntiCompactionTest.java       | 109 ++-
 3 files changed, 147 insertions(+), 29 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index aeeb0ae,9d6a9ea..d694f3b
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,34 -1,5 +1,35 @@@
 -2.2.13
 +3.0.17
 + * Always close RT markers returned by ReadCommand#executeLocally() (CASSANDRA-14515)
 + * Reverse order queries with range tombstones can cause data loss (CASSANDRA-14513)
 + * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
 + * Add Missing dependencies in pom-all (CASSANDRA-14422)
 + * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)
 + * Fix deprecated repair error notifications from 3.x clusters to legacy JMX clients (CASSANDRA-13121)
 + * Cassandra not starting when using enhanced startup scripts in windows (CASSANDRA-14418)
 + * Fix progress stats and units in compactionstats (CASSANDRA-12244)
 + * Better handle missing partition columns in system_schema.columns (CASSANDRA-14379)
 + * Delay hints store excise by write timeout to avoid race with decommission (CASSANDRA-13740)
 + * Deprecate background repair and probablistic read_repair_chance table options
 +   (CASSANDRA-13910)
 + * Add missed CQL keywords to documentation (CASSANDRA-14359)
 + * Fix unbounded validation compactions on repair / revert CASSANDRA-13797 (CASSANDRA-14332)
 + * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
 + * Handle all exceptions when opening sstables (CASSANDRA-14202)
 + * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
 + * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)
 + * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
 + * Respect max hint window when hinting for LWT (CASSANDRA-14215)
 + * Adding missing WriteType enum values to v3, v4, and v5 spec (CASSANDRA-13697)
 + * Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)
 + * Fix NPE when performing comparison against a null frozen in LWT (CASSANDRA-14087)
 + * Log when SSTables are deleted (CASSANDRA-14302)
 + * Fix batch commitlog sync regression (CASSANDRA-14292)
 + * Write to pending endpoint when view replica is also base replica (CASSANDRA-14251)
 + * Chain commit log marker potential performance regression in batch commit mode (CASSANDRA-14194)
 + * Fully utilise specified compaction threads (CASSANDRA-14210)
 + * Pre-create deletion log records to finish compactions quicker (CASSANDRA-12763)
 +Merged from 2.2:
+  * Fix bug that prevented compaction of SSTables after full repairs (CASSANDRA-14423)
   * Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
   * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
   * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index ab363e0,013fc04..f033bf2
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -474,6 -460,16 +474,17 @@@ public class CompactionManager implemen
          }, jobs, OperationType.CLEANUP);
      }
  
+     /**
+      * Submit anti-compactions for a collection of SSTables over a set of repaired ranges and marks corresponding SSTables
+      * as repaired.
+      *
+      * @param cfs Column family for anti-compaction
+      * @param ranges Repaired ranges to be anti-compacted into separate SSTables.
+      * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+      * @param repairedAt Unix timestamp of when repair was completed.
++     * @param parentRepairSession Corresponding repair session
+      * @return
[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bba0d03e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bba0d03e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bba0d03e

Branch: refs/heads/trunk
Commit: bba0d03e9c5e62c222734839a9adc83f1aec6f95
Parents: ea62d88 489c2f6
Author: Mick Semb Wever
Authored: Fri Jun 29 16:58:26 2018 +1000
Committer: Mick Semb Wever
Committed: Fri Jun 29 17:00:02 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  67 +++-
 .../db/compaction/AntiCompactionTest.java       | 109 ++-
 3 files changed, 147 insertions(+), 30 deletions(-)
----------------------------------------------------------------------
[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/489c2f69
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/489c2f69
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/489c2f69

Branch: refs/heads/cassandra-3.11
Commit: 489c2f69510b001770d9a59e55ba5d5175019050
Parents: 4e23c9e f8912ce
Author: Mick Semb Wever
Authored: Fri Jun 29 16:53:36 2018 +1000
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:57:34 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  66 ++-
 .../db/compaction/AntiCompactionTest.java       | 109 ++-
 3 files changed, 147 insertions(+), 29 deletions(-)
----------------------------------------------------------------------
[01/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 1143bc113 -> f8912ce93
  refs/heads/cassandra-3.0 4e23c9e4d -> 489c2f695
  refs/heads/cassandra-3.11 ea62d8862 -> bba0d03e9
  refs/heads/trunk 4cb83cb81 -> 5cc68a873

Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for CASSANDRA-14423

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/cassandra-2.2
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:49:53 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/CompactionManager.java        |  70 ++-
 .../db/compaction/AntiCompactionTest.java       | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
----------------------------------------------------------------------
[03/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs
Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for CASSANDRA-14423

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/cassandra-3.11
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:49:53 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                 |   1 +
 .../db/compaction/CompactionManager.java    |  70 ++-
 .../db/compaction/AntiCompactionTest.java   | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 7b1089e..9d6a9ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.13
+ * Fix bug that prevented compaction of SSTables after full repairs (CASSANDRA-14423)
  * Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
  * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
  * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 419f66e..013fc04 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -460,6 +460,16 @@ public class CompactionManager implements CompactionManagerMBean
         }, jobs, OperationType.CLEANUP);
     }
 
+    /**
+     * Submit anti-compactions for a collection of SSTables over a set of repaired ranges and marks corresponding SSTables
+     * as repaired.
+     *
+     * @param cfs Column family for anti-compaction
+     * @param ranges Repaired ranges to be anti-compacted into separate SSTables.
+     * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+     * @param repairedAt Unix timestamp of when repair was completed.
+     * @return Futures executing anti-compaction.
+     */
     public ListenableFuture<?> submitAntiCompaction(final ColumnFamilyStore cfs,
                                           final Collection<Range<Token>> ranges,
                                           final Refs<SSTableReader> sstables,
@@ -475,6 +485,8 @@ public class CompactionManager implements CompactionManagerMBean
             {
                 for (SSTableReader compactingSSTable : cfs.getTracker().getCompacting())
                     sstables.releaseIfHolds(compactingSSTable);
+                // We don't anti-compact any SSTable that has been compacted during repair as it may have been compacted
+                // with unrepaired data.
                 Set<SSTableReader> compactedSSTables = new HashSet<>();
                 for (SSTableReader sstable : sstables)
                     if (sstable.isMarkedCompacted())
@@ -504,9 +516,17 @@ public class CompactionManager implements CompactionManagerMBean
      *
      * Caller must reference the validatedForRepair sstables (via ParentRepairSession.getActiveRepairedSSTableRefs(..)).
      *
+     * NOTE: Repairs can take place on both unrepaired (incremental + full) and repaired (full) data.
+     * Although anti-compaction could work on repaired sstables as well and would result in having more accurate
+     * repairedAt values for these, we avoid anti-compacting already repaired sstables, as we currently don't
+     * make use of any actual repairedAt value and splitting up sstables just for that is not worth it. However, we will
+     * still update repairedAt if the SSTable is fully contained within the repaired ranges, as this does not require
+     * anticompaction.
+     *
      * @param cfs
      * @param ranges Ranges that the repair was carried out on
      * @param validatedForRepair SSTables containing the repaired ranges. Should be referenced before passing them.
+     * @param txn Transaction across all SSTables that were repaired.
      * @throws InterruptedException
      * @throws IOException
      */
@@ -519,13 +539,7 @@ public class CompactionManager implements CompactionManagerMBean
         logger.info("Starting anticompaction for {}.
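
The NOTE in the Javadoc above describes three outcomes for an SSTable after a repair. As a rough illustration only (this is not the CompactionManager code), the decision can be sketched as follows. The class name, the `classify` helper, the use of plain `long` tokens, and the non-wrapping `(left, right]` ranges are all assumptions made for this sketch.

```java
final class AntiCompactionSketch {
    enum Action { UPDATE_REPAIRED_AT, ANTI_COMPACT, SKIP }

    // Hypothetical range model: each range is {left, right}, meaning (left, right], non-wrapping.
    static boolean contains(long[] range, long token) {
        return token > range[0] && token <= range[1];
    }

    // The SSTable covers the inclusive token bounds [first, last].
    static boolean intersects(long[] range, long first, long last) {
        return first <= range[1] && last > range[0];
    }

    static Action classify(long first, long last, boolean alreadyRepaired, long[][] repairedRanges) {
        boolean intersectsAny = false;
        for (long[] range : repairedRanges) {
            // Fully contained in one repaired range: no split needed, only repairedAt changes.
            if (contains(range, first) && contains(range, last))
                return Action.UPDATE_REPAIRED_AT;
            intersectsAny |= intersects(range, first, last);
        }
        // Partially covered: split only unrepaired SSTables; already repaired ones are left alone.
        if (intersectsAny && !alreadyRepaired)
            return Action.ANTI_COMPACT;
        return Action.SKIP;
    }

    public static void main(String[] args) {
        long[][] repaired = { { 0L, 100L } };
        System.out.println(classify(10L, 50L, true, repaired));
        System.out.println(classify(50L, 150L, false, repaired));
        System.out.println(classify(50L, 150L, true, repaired));
    }
}
```

The sketch returns early for the fully contained case so that an already repaired SSTable still gets its repairedAt refreshed, which is the behaviour the NOTE calls out.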
[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/489c2f69
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/489c2f69
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/489c2f69

Branch: refs/heads/trunk
Commit: 489c2f69510b001770d9a59e55ba5d5175019050
Parents: 4e23c9e f8912ce
Author: Mick Semb Wever
Authored: Fri Jun 29 16:53:36 2018 +1000
Committer: Mick Semb Wever
Committed: Fri Jun 29 16:57:34 2018 +1000

----------------------------------------------------------------------
 CHANGES.txt                                 |   1 +
 .../db/compaction/CompactionManager.java    |  66 ++-
 .../db/compaction/AntiCompactionTest.java   | 109 ++-
 3 files changed, 147 insertions(+), 29 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index aeeb0ae,9d6a9ea..d694f3b
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,34 -1,5 +1,35 @@@
 -2.2.13
 +3.0.17
 + * Always close RT markers returned by ReadCommand#executeLocally() (CASSANDRA-14515)
 + * Reverse order queries with range tombstones can cause data loss (CASSANDRA-14513)
 + * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
 + * Add Missing dependencies in pom-all (CASSANDRA-14422)
 + * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)
 + * Fix deprecated repair error notifications from 3.x clusters to legacy JMX clients (CASSANDRA-13121)
 + * Cassandra not starting when using enhanced startup scripts in windows (CASSANDRA-14418)
 + * Fix progress stats and units in compactionstats (CASSANDRA-12244)
 + * Better handle missing partition columns in system_schema.columns (CASSANDRA-14379)
 + * Delay hints store excise by write timeout to avoid race with decommission (CASSANDRA-13740)
 + * Deprecate background repair and probablistic read_repair_chance table options
 +   (CASSANDRA-13910)
 + * Add missed CQL keywords to documentation (CASSANDRA-14359)
 + * Fix unbounded validation compactions on repair / revert CASSANDRA-13797 (CASSANDRA-14332)
 + * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
 + * Handle all exceptions when opening sstables (CASSANDRA-14202)
 + * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
 + * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)
 + * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
 + * Respect max hint window when hinting for LWT (CASSANDRA-14215)
 + * Adding missing WriteType enum values to v3, v4, and v5 spec (CASSANDRA-13697)
 + * Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)
 + * Fix NPE when performing comparison against a null frozen in LWT (CASSANDRA-14087)
 + * Log when SSTables are deleted (CASSANDRA-14302)
 + * Fix batch commitlog sync regression (CASSANDRA-14292)
 + * Write to pending endpoint when view replica is also base replica (CASSANDRA-14251)
 + * Chain commit log marker potential performance regression in batch commit mode (CASSANDRA-14194)
 + * Fully utilise specified compaction threads (CASSANDRA-14210)
 + * Pre-create deletion log records to finish compactions quicker (CASSANDRA-12763)
 +Merged from 2.2:
+  * Fix bug that prevented compaction of SSTables after full repairs (CASSANDRA-14423)
   * Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
   * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
   * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index ab363e0,013fc04..f033bf2
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -474,6 -460,16 +474,17 @@@ public class CompactionManager implemen
          }, jobs, OperationType.CLEANUP);
      }
  
+     /**
+      * Submit anti-compactions for a collection of SSTables over a set of repaired ranges and marks corresponding SSTables
+      * as repaired.
+      *
+      * @param cfs Column family for anti-compaction
+      * @param ranges Repaired ranges to be anti-compacted into separate SSTables.
+      * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+      * @param repairedAt Unix timestamp of when repair was completed.
++     * @param parentRepairSession Corresponding repair session
+      * @return Futures
[jira] [Commented] (CASSANDRA-10540) RangeAwareCompaction
[ https://issues.apache.org/jira/browse/CASSANDRA-10540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527299#comment-16527299 ]

Marcus Eriksson commented on CASSANDRA-10540:
---------------------------------------------

Hey [~Lerh Low], thanks so much for the testing. I will look into the results soon; I assume you didn't find any issues with the patch?

> RangeAwareCompaction
> --------------------
>
>                 Key: CASSANDRA-10540
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10540
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Major
>              Labels: compaction, lcs, vnodes
>             Fix For: 4.x
>
>
> Broken out from CASSANDRA-6696, we should split sstables based on ranges
> during compaction.
> Requirements:
> * don't create tiny sstables - keep them bunched together until a single vnode
> is big enough (configurable how big that is)
> * make it possible to run existing compaction strategies on the per-range
> sstables
> We should probably add a global compaction strategy parameter that states
> whether this should be enabled or not.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
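
One possible reading of the first requirement in the quoted description (keep data bunched together until a single vnode is big enough) can be sketched as a simple size threshold per vnode. This is only an illustration and not the patch under review: the class, the method `vnodesToSplitOut`, and the parameter `minRangeSSTableBytes` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeAwareSplitSketch {
    /**
     * Returns the indexes of vnodes whose data is at least minRangeSSTableBytes
     * (a hypothetical, configurable threshold). Those vnodes would get their own
     * per-range sstables; all other vnodes stay bunched together.
     */
    static List<Integer> vnodesToSplitOut(long[] bytesPerVnode, long minRangeSSTableBytes) {
        List<Integer> dedicated = new ArrayList<>();
        for (int vnode = 0; vnode < bytesPerVnode.length; vnode++) {
            if (bytesPerVnode[vnode] >= minRangeSSTableBytes)
                dedicated.add(vnode);
        }
        return dedicated;
    }

    public static void main(String[] args) {
        long[] bytesPerVnode = { 10L, 500L, 90L, 1000L };
        // With a threshold of 100 bytes, only vnodes 1 and 3 are large enough.
        System.out.println(vnodesToSplitOut(bytesPerVnode, 100L));
    }
}
```

Under this reading, an existing compaction strategy could then be run separately on each dedicated vnode's sstables, which is what the second requirement asks for.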
[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted
[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527284#comment-16527284 ]

mck commented on CASSANDRA-14423:
---------------------------------

Jumping on this to commit. Two (other) reviewers have +1'd this now, and tests have passed.

Repeating for clarity, the patches and their tests were…

||Branch||uTest||dTest||
|[cassandra-2.2_14423|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14423-2.2]|[!https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-2.2.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-2.2]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/583/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/583]|
|[cassandra-3.0_14423|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14423-3.0]|[!https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.0.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.0]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/581/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/581]|
|[cassandra-3.11_14423|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14423-3.11]|[!https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.11.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.11]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/580/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/580]|

> SSTables stop being compacted
> -----------------------------
>
>                 Key: CASSANDRA-14423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Blocker
>             Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> Seeing a problem in 3.11.0 where SSTables are being lost from the view and
> not being included in compactions/as candidates for compaction. It seems to
> get progressively worse until there are only 1-2 SSTables in the view, which
> happen to be the most recent SSTables, and thus compactions completely stop
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into
> the view, but this is only a temporary fix. User-defined/major compactions
> still work (it is not clear whether they put the result back in the view),
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
>     Read Count: 57761088
>     Read Latency: 0.10527088681224288 ms.
>     Write Count: 2513164
>     Write Latency: 0.018211106398149903 ms.
>     Pending Flushes: 0
>         Table: xxx
>         SSTable count: 10
>         SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
>         Space used (live): 894498746
>         Space used (total): 894498746
>         Space used by snapshots (total): 0
>         Off heap memory used (total): 11576197
>         SSTable Compression Ratio: 0.6956629530569777
>         Number of keys (estimate): 3562207
>         Memtable cell count: 0
>         Memtable data size: 0
>         Memtable off heap memory used: 0
>         Memtable switch count: 87
>         Local read count: 57761088
>         Local read latency: 0.108 ms
>         Local write count: 2513164
>         Local write latency: NaN ms
>         Pending flushes: 0
>         Percent repaired: 86.33
>         Bloom filter false positives: 43
>         Bloom filter false ratio: 0.0
>         Bloom filter space used: 8046104
>         Bloom filter off heap memory used: 8046024
>         Index summary off heap memory used: 3449005
>         Compression metadata off heap memory used: 81168
>         Compacted partition minimum bytes: 104
>         Compacted partition maximum bytes: 5722
>         Compacted partition mean bytes: 175
>         Average live cells per slice (last five minutes): 1.0
>         Maximum live cells per slice (last five minutes): 1
>         Average tombstones per slice (last five minutes): 1.0
>         Maximum tombstones per slice (last five minutes): 1
>         Dropped Mutations: 0
> {code}
> Also for STCS we've confirmed that SSTable count will be different to the
> number of SSTables reported in the compaction buckets. In the below example
> there are only 3 SSTables in a single bucket - no more are listed for this
> table. Compaction thresholds haven't been modified for this table and it's a
> very basic KV schema.
> {code:java}
> Keyspace : yyy
>     Read Count: 30485
>     Read Latency: 0.06708991307200263 ms.
>     Write Count: 57044
>     Write Latency: 0.02204061776873992 ms.
>     Pending Flushes: 0
>         Table: yyy
>         SSTable count: 19
>         Space used (live): 18195482
>         Space used (total): 18195482
[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted
[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Podkowinski updated CASSANDRA-14423:
-------------------------------------------
    Status: Ready to Commit  (was: Patch Available)

> SSTables stop being compacted
> -----------------------------
>
>                 Key: CASSANDRA-14423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Blocker
>             Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> Seeing a problem in 3.11.0 where SSTables are being lost from the view and
> not being included in compactions/as candidates for compaction. It seems to
> get progressively worse until there are only 1-2 SSTables in the view, which
> happen to be the most recent SSTables, and thus compactions completely stop
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into
> the view, but this is only a temporary fix. User-defined/major compactions
> still work (it is not clear whether they put the result back in the view),
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
>     Read Count: 57761088
>     Read Latency: 0.10527088681224288 ms.
>     Write Count: 2513164
>     Write Latency: 0.018211106398149903 ms.
>     Pending Flushes: 0
>         Table: xxx
>         SSTable count: 10
>         SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
>         Space used (live): 894498746
>         Space used (total): 894498746
>         Space used by snapshots (total): 0
>         Off heap memory used (total): 11576197
>         SSTable Compression Ratio: 0.6956629530569777
>         Number of keys (estimate): 3562207
>         Memtable cell count: 0
>         Memtable data size: 0
>         Memtable off heap memory used: 0
>         Memtable switch count: 87
>         Local read count: 57761088
>         Local read latency: 0.108 ms
>         Local write count: 2513164
>         Local write latency: NaN ms
>         Pending flushes: 0
>         Percent repaired: 86.33
>         Bloom filter false positives: 43
>         Bloom filter false ratio: 0.0
>         Bloom filter space used: 8046104
>         Bloom filter off heap memory used: 8046024
>         Index summary off heap memory used: 3449005
>         Compression metadata off heap memory used: 81168
>         Compacted partition minimum bytes: 104
>         Compacted partition maximum bytes: 5722
>         Compacted partition mean bytes: 175
>         Average live cells per slice (last five minutes): 1.0
>         Maximum live cells per slice (last five minutes): 1
>         Average tombstones per slice (last five minutes): 1.0
>         Maximum tombstones per slice (last five minutes): 1
>         Dropped Mutations: 0
> {code}
> Also for STCS we've confirmed that SSTable count will be different to the
> number of SSTables reported in the compaction buckets. In the below example
> there are only 3 SSTables in a single bucket - no more are listed for this
> table. Compaction thresholds haven't been modified for this table and it's a
> very basic KV schema.
> {code:java}
> Keyspace : yyy
>     Read Count: 30485
>     Read Latency: 0.06708991307200263 ms.
>     Write Count: 57044
>     Write Latency: 0.02204061776873992 ms.
>     Pending Flushes: 0
>         Table: yyy
>         SSTable count: 19
>         Space used (live): 18195482
>         Space used (total): 18195482
>         Space used by snapshots (total): 0
>         Off heap memory used (total): 747376
>         SSTable Compression Ratio: 0.7607394576769735
>         Number of keys (estimate): 116074
>         Memtable cell count: 0
>         Memtable data size: 0
>         Memtable off heap memory used: 0
>         Memtable switch count: 39
>         Local read count: 30485
>         Local read latency: NaN ms
>         Local write count: 57044
>         Local write latency: NaN ms
>         Pending flushes: 0
>         Percent repaired: 79.76
>         Bloom filter false positives: 0
>         Bloom filter false ratio: 0.0
>         Bloom filter space used: 690912
>         Bloom filter off heap memory used: 690760
>         Index summary off heap memory used: 54736
>         Compression metadata off heap memory used: 1880
>         Compacted partition minimum bytes: 73
>         Compacted partition maximum bytes: 124
>         Compacted partition mean bytes: 96
>         Average live cells per slice (last five minutes): NaN
>         Maximum live cells per slice (last five minutes): 0
>         Average tombstones per slice (last five minutes): NaN
>         Maximum tombstones per slice (last five minutes): 0
>         Dropped Mutations: 0
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy
> Compaction buckets are
> [[BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd