[jira] [Updated] (CASSANDRA-14543) Hinted handoff to replay purgeable tombstones

2018-06-29 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14543:
---
Resolution: Won't Fix
Status: Resolved  (was: Awaiting Feedback)

> Hinted handoff to replay purgeable tombstones 
> --
>
> Key: CASSANDRA-14543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14543
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jay Zhuang
>Priority: Minor
>
> Hinted handoff currently dispatches and applies only the mutations that are 
> within GCGS: 
> [{{Hint.java:97}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/Hint.java#L97].
>  This ensures it won't resurrect any deleted data.
> But replaying tombstones should be safe, and it could reduce the chance of having 
> [un-repairable inconsistent 
> data|https://lists.apache.org/thread.html/2d3d39d960143d4d2146ed2530821504ff855e832713dec7d0afd8ac@%3Cdev.cassandra.apache.org%3E].
> Here is the user scenario it tries to fix:
> {noformat}
> 1. Create a 3 nodes cluster
> 2. Create a table with small gc_grace_seconds (for reproducing purpose):
> CREATE KEYSPACE foo WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 3};
> CREATE TABLE foo.bar (
> id int PRIMARY KEY,
> name text
> ) WITH gc_grace_seconds=30;
> 3. Insert data with consistency all:
> INSERT INTO foo.bar (id, name) VALUES(1, 'cstar');
> 4. stop 1 node
> $ ccm node2 stop
> 5. Delete the data with consistency quorum:
> DELETE FROM foo.bar WHERE id=1;
> 6. Wait 30 seconds and then start node2:
> $ ccm node2 start
> {noformat}
> Now node2 has the data, while node1/node3 have the purgeable tombstone. Every 
> read triggers RR, which sends data from node2 to node1/node3 but repairs 
> nothing.
> With hinted handoff of purgeable tombstones, the tombstone would at least be 
> dispatched and the data on node2 deleted. This won't fix the root cause, but 
> it reduces the chance of hitting this issue.
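
The GCGS window described above can be sketched as follows. This is only an illustration of the rule "a hint is applied while it is younger than gc_grace_seconds"; the class and method names are hypothetical and are not taken from Cassandra's Hint.java.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not Cassandra's actual Hint class.
final class HintWindowSketch
{
    /** Returns true while the hint is still younger than gc_grace_seconds. */
    static boolean isWithinGcGrace(long hintCreationMillis, int gcGraceSeconds, long nowMillis)
    {
        return hintCreationMillis + TimeUnit.SECONDS.toMillis(gcGraceSeconds) > nowMillis;
    }

    public static void main(String[] args)
    {
        long created = 0L; // assumed time at which the hint for node2 was written
        int gcgs = 30;     // gc_grace_seconds from the foo.bar example

        // node2 comes back after 10 seconds: the hint is still inside the window
        System.out.println(isWithinGcGrace(created, gcgs, 10_000L)); // prints true
        // node2 comes back after 31 seconds: the hint is past GCGS and is skipped
        System.out.println(isWithinGcGrace(created, gcgs, 31_000L)); // prints false
    }
}
```

With the 30-second GCGS from the scenario, waiting just over 30 seconds before restarting node2 is enough for the tombstone hint to be dropped, which matches step 6 of the reproduction.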



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14543) Hinted handoff to replay purgeable tombstones

2018-06-29 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528468#comment-16528468
 ] 

Jay Zhuang commented on CASSANDRA-14543:


{quote}
[~iamaleksey]:
Replaying just the tombstones might be safe-ish, but it’s only helping with 
your issue in a very narrow time window. And there will be a price to pay for 
this: hint dispatch will have to become less efficient if we end up inspecting 
and filtering out every mutation.
{quote}
Makes sense to me. This mostly matters for the case where {{GCGS < 
max_hint_window_in_ms}}, which always seems like a bad idea. Tombstones may 
hit this issue; for normal data, hints older than GCGS ({{>GCGS}}) are not 
dispatched even within the hint window.
Should we consider logging a warning or error message when {{GCGS < 
max_hint_window_in_ms}}?

(BTW: please review this minor patch to improve the hints handoff performance: 
CASSANDRA-14536 :) ).
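
A minimal sketch of the suggested warning, assuming the check compares a table's gc_grace_seconds (in seconds) against max_hint_window_in_ms (in milliseconds). The class, method, and message text are hypothetical and not part of Cassandra.

```java
import java.util.Optional;
import java.util.concurrent.TimeUnit;

// Hypothetical validation helper; names and message are illustrative, not Cassandra APIs.
final class GcGraceHintWindowCheck
{
    /** Returns a warning message when GCGS is shorter than the hint window, otherwise empty. */
    static Optional<String> warningFor(String table, int gcGraceSeconds, long maxHintWindowMillis)
    {
        long gcGraceMillis = TimeUnit.SECONDS.toMillis(gcGraceSeconds);
        if (gcGraceMillis >= maxHintWindowMillis)
            return Optional.empty();

        return Optional.of(String.format(
            "gc_grace_seconds of %s (%d s) is shorter than max_hint_window_in_ms (%d ms); "
            + "hints older than gc_grace_seconds will not be replayed",
            table, gcGraceSeconds, maxHintWindowMillis));
    }

    public static void main(String[] args)
    {
        long threeHoursMillis = TimeUnit.HOURS.toMillis(3); // example hint window

        // The foo.bar table from the ticket uses gc_grace_seconds=30, so a warning is produced.
        warningFor("foo.bar", 30, threeHoursMillis).ifPresent(System.out::println);
        // Ten days of GCGS is longer than the window, so nothing is printed.
        warningFor("foo.baz", 864_000, threeHoursMillis).ifPresent(System.out::println);
    }
}
```

Returning an Optional rather than logging directly keeps the sketch independent of any logging framework; a real check would presumably log at schema-change or startup time.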




[jira] [Updated] (CASSANDRA-14551) ReplicationAwareTokenAllocator should block bootstrap if no replication number is set

2018-06-29 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14551:
---
Status: Patch Available  (was: Open)

> ReplicationAwareTokenAllocator should block bootstrap if no replication 
> number is set
> -
>
> Key: CASSANDRA-14551
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14551
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> We're using 
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm].
>  When bootstrapping a new DC, the tokens are not well distributed. This 
> happens because the replication number is not set for the new DC before 
> the bootstrap.
> I would suggest blocking the bootstrap if the replication number is not set. 
> It's unsafe to assume a default of 1 replica, which also causes the following 
> invalid stats:
> {noformat}
> WARN  [main] 2018-06-29 17:30:55,696 TokenAllocation.java:69 - Replicated 
> node load in datacenter before allocation max NaN min NaN stddev NaN
> WARN  [main] 2018-06-29 17:30:55,696 TokenAllocation.java:70 - Replicated 
> node load in datacenter after allocation max NaN min NaN stddev NaN
> {noformat}
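
The proposed behavior (refuse to bootstrap instead of silently assuming one replica) could look roughly like the sketch below. The class name, the map of per-datacenter replica counts, and the exception type are assumptions made for illustration; they do not reflect the actual patch.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical pre-bootstrap check; not the code from the 14551 patch.
final class BootstrapReplicationCheck
{
    /** Returns the configured replica count for the DC, or fails instead of defaulting to 1. */
    static int requireReplicas(Map<String, Integer> replicasPerDc, String localDc)
    {
        Integer replicas = replicasPerDc.get(localDc);
        if (replicas == null || replicas <= 0)
            throw new IllegalStateException(
                "No replication is set for datacenter " + localDc + "; refusing to allocate tokens");
        return replicas;
    }

    public static void main(String[] args)
    {
        Map<String, Integer> replicasPerDc = Collections.singletonMap("dc1", 3);

        System.out.println(requireReplicas(replicasPerDc, "dc1")); // prints 3

        try
        {
            requireReplicas(replicasPerDc, "dc2"); // new DC without replication configured
        }
        catch (IllegalStateException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```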






[jira] [Commented] (CASSANDRA-14551) ReplicationAwareTokenAllocator should block bootstrap if no replication number is set

2018-06-29 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528420#comment-16528420
 ] 

Jay Zhuang commented on CASSANDRA-14551:


Here is the patch, please review:
| Branch | uTest | dTest |
| [14551-trunk|https://github.com/cooldoger/cassandra/tree/14551-trunk] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14551-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14551-trunk]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/586/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/586/]
 |

cc. [~dikanggu]




[jira] [Created] (CASSANDRA-14551) ReplicationAwareTokenAllocator should block bootstrap if no replication number is set

2018-06-29 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14551:
--

 Summary: ReplicationAwareTokenAllocator should block bootstrap if 
no replication number is set
 Key: CASSANDRA-14551
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14551
 Project: Cassandra
  Issue Type: Bug
  Components: Configuration
Reporter: Jay Zhuang
Assignee: Jay Zhuang


We're using 
[ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm].
 When bootstrapping a new DC, the tokens are not well distributed. This happens 
because the replication number is not set for the new DC before the 
bootstrap.
I would suggest blocking the bootstrap if the replication number is not set. 
It's unsafe to assume a default of 1 replica, which also causes the following 
invalid stats:
{noformat}
WARN  [main] 2018-06-29 17:30:55,696 TokenAllocation.java:69 - Replicated node 
load in datacenter before allocation max NaN min NaN stddev NaN
WARN  [main] 2018-06-29 17:30:55,696 TokenAllocation.java:70 - Replicated node 
load in datacenter after allocation max NaN min NaN stddev NaN
{noformat}






[jira] [Commented] (CASSANDRA-9608) Support Java 11

2018-06-29 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528106#comment-16528106
 ] 

Jason Brown commented on CASSANDRA-9608:


I've taken a first pass through the scripts and build.xml parts (all the 
non-code stuff), and on the whole it's looking pretty good. I've made a few 
minor comments on the PR, but I have these points, as well:
 - you have the java version check code copied across several scripts, and it 
looks like clients.in.sh is used by many other scripts (so I guess this is the 
'canonical' location?). Should we have the main bin/cassandra.in.sh use this? 
Maybe the one in tools/bin, as well? Also, I don't think (I may be wrong) the 
cassandra.in.sh in the debian/redhat can call the clients.in.sh script. wdyt?
 - the changes you made to cassandra-env.sh need to be made to 
cassandra-env.ps1 (wrt java version checking). There are also some Windows 
{{.bat}} files in the code base. Can you check to see if they need updates, as 
well?
 - can we add a simple note to conf/jvm8-clients.options, with something like 
this: "this file is intentionally blank"? Or should we just get rid of it, and 
add it when we actually need it?

I suspect the code will be easier to review, so hopefully I can knock that out 
in short order.

> Support Java 11
> ---
>
> Key: CASSANDRA-9608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9608
> Project: Cassandra
>  Issue Type: Task
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
> Attachments: jdk_9_10.patch
>
>
> This ticket is intended to group all issues found to support Java 9 in the 
> future.
> From what I've found out so far:
> * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. 
> It can be easily solved using this patch:
> {code}
> - artifactId="cobertura"/>
> + artifactId="cobertura">
> +  
> +
> {code}
> * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods 
> {{monitorEnter}} + {{monitorExit}}. These methods are used by 
> {{o.a.c.utils.concurrent.Locks}} which is only used by 
> {{o.a.c.db.AtomicBTreeColumns}}.
> I don't mind to start working on this yet since Java 9 is in a too early 
> development phase.






[jira] [Commented] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2018-06-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528103#comment-16528103
 ] 

ASF GitHub Bot commented on CASSANDRA-6541:
---

Github user jasobrown commented on a diff in the pull request:

https://github.com/apache/cassandra/pull/236#discussion_r199253772
  
--- Diff: conf/jvm11.options ---
@@ -0,0 +1,89 @@
+###
+#jvm11.options#
+# #
+# See jvm.options. This file is specific for Java 11 and newer.   #
+###
+
+#
+#  GC SETTINGS  #
+#
+
+
+
+### CMS Settings
+#-XX:+UseParNewGC
+#-XX:+UseConcMarkSweepGC
+#-XX:+CMSParallelRemarkEnabled
+#-XX:SurvivorRatio=8
+#-XX:MaxTenuringThreshold=1
+#-XX:CMSInitiatingOccupancyFraction=75
+#-XX:+UseCMSInitiatingOccupancyOnly
+#-XX:CMSWaitDuration=1
+#-XX:+CMSParallelInitialMarkEnabled
+#-XX:+CMSEdenChunksRecordAlways
+## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
+#-XX:+CMSClassUnloadingEnabled
+
+
+
+### G1 Settings
+## Use the Hotspot garbage-first collector.
+-XX:+UseG1GC
+-XX:+ParallelRefProcEnabled
+
+#
+## Have the JVM do less remembered set work during STW, instead
+## preferring concurrent GC. Reduces p99.9 latency.
+-XX:G1RSetUpdatingPauseTimePercent=5
+#
+## Main G1GC tunable: lowering the pause target will lower throughput and vise versa.
+## 200ms is the JVM default and lowest viable setting
+## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
+-XX:MaxGCPauseMillis=500
+
+## Optional G1 Settings
+# Save CPU time on large (>= 16GB) heaps by delaying region scanning
+# until the heap is 70% full. The default in Hotspot 8u40 is 40%.
+#-XX:InitiatingHeapOccupancyPercent=70
+
+# For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
+# Otherwise equal to the number of cores when 8 or less.
+# Machines with > 10 cores should try setting these to <= full cores.
+#-XX:ParallelGCThreads=16
+# By default, ConcGCThreads is 1/4 of ParallelGCThreads.
+# Setting both to the same value can reduce STW durations.
+#-XX:ConcGCThreads=16
+
+
+### JPMS
+
+-Djdk.attach.allowAttachSelf=true
+--add-exports java.base/jdk.internal.misc=ALL-UNNAMED
+--add-opens java.base/jdk.internal.module=ALL-UNNAMED
+--add-exports java.base/jdk.internal.ref=ALL-UNNAMED
+--add-exports java.base/sun.nio.ch=ALL-UNNAMED
+--add-exports java.management.rmi/com.sun.jmx.remote.internal.rmi=ALL-UNNAMED
+--add-exports java.rmi/sun.rmi.registry=ALL-UNNAMED
+--add-exports java.rmi/sun.rmi.server=ALL-UNNAMED
+--add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED
+
+
+### GC logging options -- uncomment to enable
+
+# Java 11 (and newer) GC logging options:
+# See description of https://bugs.openjdk.java.net/browse/JDK-8046148 for details about the syntax
+# The following is the equivalent to -XX:+PrintGCDetails -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
+#-Xlog:gc=info,heap*=trace,age*=debug,safepoint=info,promotion*=trace:file=/var/log/cassandra/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10240
--- End diff --

minor nit: `filesize=10240` is the file size in bytes. change to `10485760` 
if you actually want 10MB files.
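
The arithmetic behind the nit, as a small sketch (the class and method names are made up for illustration): 10 MB expressed in bytes is 10 * 1024 * 1024, whereas 10240 bytes is only 10 KB.

```java
// Illustrative arithmetic only; names are hypothetical.
final class GcLogFileSize
{
    /** Converts a size in mebibytes to bytes. */
    static long mebibytesToBytes(long mebibytes)
    {
        return mebibytes * 1024L * 1024L;
    }

    public static void main(String[] args)
    {
        System.out.println(mebibytesToBytes(10)); // prints 10485760
        System.out.println(10240 / 1024);         // prints 10, i.e. 10240 bytes is 10 KB
    }
}
```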


> New versions of Hotspot create new Class objects on every JMX connection 
> causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
> -
>
> Key: CASSANDRA-6541
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: jonathan lacefield
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 1.2.16, 2.0.6, 2.1 beta2
>
> Attachments: dse_systemlog
>
>
> Newer versions of Oracle's Hotspot JVM , post 6u43 (maybe earlier) and 7u25 
> (maybe earlier), are experiencing issues with GC and JMX where heap slowly 
> fills up overtime until OOM or a full GC event occurs, specifically when CMS 
> is leveraged.  Adding:
> {noformat}
> JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled"
> {noformat}
> The th

[jira] [Comment Edited] (CASSANDRA-14549) Transient Replication: support logged batches

2018-06-29 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527800#comment-16527800
 ] 

Ariel Weisberg edited comment on CASSANDRA-14549 at 6/29/18 3:38 PM:
-

I think this is pretty important to have in 4.0, but not as important as having 
minimal PAXOS support. So if we have to let it slip we should make sure it 
works correctly but just fails to implement the cheap quorum optimization and 
document the caveat.


was (Author: aweisberg):
I think this is pretty important to have in 4.0, but not as important as having 
minimal PAXOS support. So if we have to let it slip we should make sure it 
fails correctly if you have transient replication enabled and it is documented 
as a caveat.

> Transient Replication: support logged batches
> -
>
> Key: CASSANDRA-14549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14549
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Blake Eggleston
>Priority: Major
>







[jira] [Commented] (CASSANDRA-14547) Transient Replication: Support paxos

2018-06-29 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527805#comment-16527805
 ] 

Ariel Weisberg commented on CASSANDRA-14547:


I think we should try and get PAXOS commit to use transient replication. 
Hopefully we can live with the other phases of PAXOS not using transient 
replication under the assumption that LWT are a smaller portion of the workload 
and document the caveat.

There is work here at every step because PAXOS doesn't use the regular 
read/write path for each phase so it doesn't automatically pick up transient 
replication. The current code will send writes to all replicas so we don't get 
the benefit of the cheap quorum optimization.

> Transient Replication: Support paxos
> 
>
> Key: CASSANDRA-14547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14547
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>







[jira] [Commented] (CASSANDRA-14549) Transient Replication: support logged batches

2018-06-29 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527800#comment-16527800
 ] 

Ariel Weisberg commented on CASSANDRA-14549:


I think this is pretty important to have in 4.0, but not as important as having 
minimal PAXOS support. So if we have to let it slip we should make sure it 
fails correctly if you have transient replication enabled and it is documented 
as a caveat.

> Transient Replication: support logged batches
> -
>
> Key: CASSANDRA-14549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14549
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Blake Eggleston
>Priority: Major
>







[jira] [Commented] (CASSANDRA-14548) Transient Replication: support counters

2018-06-29 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527796#comment-16527796
 ] 

Ariel Weisberg commented on CASSANDRA-14548:


For 4.0 I think we should forbid mixing transient replication and counters.

> Transient Replication: support counters
> ---
>
> Key: CASSANDRA-14548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14548
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Blake Eggleston
>Priority: Major
>







[jira] [Assigned] (CASSANDRA-13543) Cassandra SASI index gives unexpected number of results

2018-06-29 Thread Jordan West (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West reassigned CASSANDRA-13543:
---

Assignee: Jordan West  (was: Alex Petrov)

> Cassandra SASI index gives unexpected number of results
> ---
>
> Key: CASSANDRA-13543
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13543
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Alexander Nabatchikov
>Assignee: Jordan West
>Priority: Major
>
> I've hit an issue with a LIKE query on a column indexed by a SASI index. 
> Cassandra can return a different number of rows even though the data is unchanged.
> {code}
> CREATE TABLE idx_test
> (
>   id int,
>   str text,
>   i int,
>   PRIMARY KEY (id)
> );
> CREATE CUSTOM INDEX idx_test_idx ON idx_test (str)
> USING 'org.apache.cassandra.index.sasi.SASIIndex'
> WITH OPTIONS = { 
>   'mode': 'CONTAINS',
>   'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>   'tokenization_enable_stemming': 'true',
>   'tokenization_normalize_lowercase': 'true'
> };
> INSERT INTO idx_test (id, str, i) VALUES (1, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (2, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (3, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (4, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (5, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (6, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (7, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (8, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (9, 'a b c d', 10);
> INSERT INTO idx_test (id, str, i) VALUES (10, 'a b c d', 10);
> {code}
> Query:
> {code}
> SELECT * FROM idx_test WHERE str LIKE 'b' 
> AND i = 10
> ALLOW FILTERING;
> {code}
> This query mostly returns 0 rows, but sometimes 1 row appears in the result 
> set:
> {code}
> id |  i  |  str
> 10 |  10 |  a b c d
> {code}






[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted

2018-06-29 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14423:

   Resolution: Fixed
Reviewers: Marcus Eriksson
Reproduced In: 3.11.2, 3.11.0  (was: 3.11.0, 3.11.2)
   Status: Resolved  (was: Ready to Commit)

committed as  f8912ce9329a8bc360e93cf61e56814135fbab39

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions/as candidates for compaction. It seems to 
> get progressively worse until there's only 1-2 SSTables in the view which 
> happen to be the most recent SSTables and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User-defined/major compactions 
> still work; it's not clear whether they include the result back in the view, 
> but either way this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also, for STCS we've confirmed that the SSTable count differs from the 
> number of SSTables reported in the compaction buckets. In the example below 
> there are only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a

[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

2018-06-29 Thread Michael Burman (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527305#comment-16527305
 ] 

Michael Burman commented on CASSANDRA-7282:
---

I've updated this work with the following changes to get the process restarted:
 * Ported to the newer storage format and rebased onto current master
 * Removed the visibility optimization from the resize operation, as I could 
trigger an NPE through a race condition when large resizes happen within a short time
 ** I tested different resize algorithms, such as tracking the read and write 
path chains and resizing when certain criteria are met. This is the strategy used 
by the Linux kernel's RCU, but I did not manage to make any notable performance 
gains using that method, so I kept the original resize algorithm.
 * Each node token range is now backed by a separate index.
 ** Reduces the contention on the index updates, resizes and size parameter 
update (which was a contention point for all threads previously).
 ** Apply a normalization for token range hash updates to improve the 
distribution of hashes and that way the coverage of the index
 *** Center the hash range to around 0 first and then apply a scaling 
multiplier (proportional size of the token range compared to the available hash 
range [Long.MIN_VALUE, Long.MAX_VALUE])
 ** Allows different growth sizes for each token range
 ** For system tables and cases where the node might see writes that were not known 
during initialization, an overflow index with range [Long.MIN_VALUE, 
Long.MAX_VALUE] is used (works like the original patch)
 ** For lookup of the correct index I used a balanced BST. If there's a better 
way, I'm all ears. Linear search would be as fast if the number of vnodes is 
small, but this also scales to a larger number of vnodes.
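
The per-range index lookup and the hash normalization described in the list above can be sketched as follows. This is not code from the linked branch: the class, the use of TreeMap (a red-black tree) as the balanced BST, the String stand-in for an index, and the BigInteger arithmetic are all assumptions chosen to keep the example short and overflow-free. It also assumes non-wrapping ranges with start < end.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; not taken from the 7282-trunk-2 branch.
final class TokenRangeIndexSketch
{
    private static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(64); // size of the long range
    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /**
     * Centers the token around 0 within [start, end], then scales it by
     * (available hash range / width of the token range), clamping to the long range.
     */
    static long normalize(long token, long start, long end)
    {
        BigInteger rangeStart = BigInteger.valueOf(start);
        BigInteger width = BigInteger.valueOf(end).subtract(rangeStart);
        BigInteger center = rangeStart.add(width.shiftRight(1));
        BigInteger centered = BigInteger.valueOf(token).subtract(center);
        BigInteger scaled = centered.multiply(HASH_SPACE).divide(width);
        return scaled.max(MIN).min(MAX).longValue();
    }

    /**
     * Picks the index whose range start is the greatest one <= token, falling back
     * to the overflow index. Simplification: ranges are assumed contiguous, so only
     * the range start is checked.
     */
    static String indexFor(TreeMap<Long, String> indexesByRangeStart, long token, String overflowIndex)
    {
        Map.Entry<Long, String> entry = indexesByRangeStart.floorEntry(token);
        return entry == null ? overflowIndex : entry.getValue();
    }

    public static void main(String[] args)
    {
        TreeMap<Long, String> indexes = new TreeMap<>();
        indexes.put(0L, "index-A");   // hypothetical range starting at token 0
        indexes.put(100L, "index-B"); // hypothetical range starting at token 100

        System.out.println(indexFor(indexes, 75L, "overflow")); // prints index-A
        System.out.println(indexFor(indexes, -5L, "overflow")); // prints overflow
        System.out.println(normalize(50L, 0L, 100L));           // prints 0 (middle of the range)
    }
}
```

TreeMap.floorEntry gives O(log n) lookups in the number of ranges, which is the trade-off mentioned above against a linear scan for small vnode counts.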

I've tried to run different workloads against it and also tried 
different hashing methods, including the 32-bit hash from MurmurHash to trigger 
a bad hash distribution, but I couldn't find any scenario with 
performance degradation compared to CSLM. With a hash attack (which Murmur is 
vulnerable to) this can be done of course, but such an attack would break 
Cassandra's node distribution as well.

Link to the branch: [https://github.com/burmanm/cassandra/tree/7282-trunk-2]

> Faster Memtable map
> ---
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
> Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, 
> run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once,  further reducing the 
> constant factor costs for lookups.
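
The hybrid structure described in the quoted issue can be pictured with the 
following single-threaded sketch. It is an assumption-laden illustration, not 
the patch: the real memtable map has to be concurrent and resizable, while 
this version uses a fixed number of buckets and hypothetical names. Each 
bucket owns a sentinel node that is permanently linked into one token-ordered 
list, so a lookup or insert hashes to a sentinel and then walks only the few 
nodes of that bucket.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a token-ordered linked list with a hash index into it. */
final class HashIndexedOrderedList<V>
{
    private static final class Node<V>
    {
        final long token;
        final boolean sentinel;
        V value;
        Node<V> next;

        Node(long token, V value, boolean sentinel)
        {
            this.token = token;
            this.value = value;
            this.sentinel = sentinel;
        }
    }

    private final int bits;
    private final Node<V>[] buckets; // buckets[i] is the sentinel that starts bucket i

    @SuppressWarnings("unchecked")
    HashIndexedOrderedList(int bits)
    {
        if (bits < 1 || bits > 30)
            throw new IllegalArgumentException("bits must be in [1, 30]");
        this.bits = bits;
        buckets = (Node<V>[]) new Node[1 << bits];
        // Link all sentinels up front so the list is always in bucket (and so token) order.
        for (int i = buckets.length - 1; i >= 0; i--)
        {
            buckets[i] = new Node<>(0L, null, true);
            if (i + 1 < buckets.length)
                buckets[i].next = buckets[i + 1];
        }
    }

    /** Last node of the token's bucket whose token is smaller than the given one. */
    private Node<V> predecessor(long token)
    {
        // Flipping the sign bit makes the top bits increase with signed token order.
        int bucket = (int) ((token ^ Long.MIN_VALUE) >>> (64 - bits));
        Node<V> prev = buckets[bucket];
        while (prev.next != null && !prev.next.sentinel && prev.next.token < token)
            prev = prev.next;
        return prev;
    }

    V get(long token)
    {
        Node<V> n = predecessor(token).next;
        return (n != null && !n.sentinel && n.token == token) ? n.value : null;
    }

    void put(long token, V value)
    {
        Node<V> prev = predecessor(token);
        Node<V> n = prev.next;
        if (n != null && !n.sentinel && n.token == token)
        {
            n.value = value; // overwrite an existing entry
            return;
        }
        Node<V> added = new Node<>(token, value, false);
        added.next = n;
        prev.next = added;
    }

    /** Walks the list once, skipping sentinels, so values come back in token order. */
    List<V> valuesInTokenOrder()
    {
        List<V> out = new ArrayList<>();
        for (Node<V> n = buckets[0]; n != null; n = n.next)
            if (!n.sentinel)
                out.add(n.value);
        return out;
    }
}
```

With evenly distributed tokens and roughly as many buckets as entries, each 
walk touches a constant number of nodes on average, which is where the O(1) 
expectation comes from; a skewed hash degrades it towards a linear scan.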



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-06-29 Thread mck
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bba0d03e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bba0d03e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bba0d03e

Branch: refs/heads/cassandra-3.11
Commit: bba0d03e9c5e62c222734839a9adc83f1aec6f95
Parents: ea62d88 489c2f6
Author: Mick Semb Wever 
Authored: Fri Jun 29 16:58:26 2018 +1000
Committer: Mick Semb Wever 
Committed: Fri Jun 29 17:00:02 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  67 +++-
 .../db/compaction/AntiCompactionTest.java   | 109 ++-
 3 files changed, 147 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bba0d03e/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bba0d03e/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index f0a4de5,f033bf2..fa6b03e
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -478,115 -474,17 +477,126 @@@ public class CompactionManager implemen
  }, jobs, OperationType.CLEANUP);
  }
  
 +public AllSSTableOpStatus performGarbageCollection(final 
ColumnFamilyStore cfStore, TombstoneOption tombstoneOption, int jobs) throws 
InterruptedException, ExecutionException
 +{
 +assert !cfStore.isIndex();
 +
 +return parallelAllSSTableOperation(cfStore, new OneSSTableOperation()
 +{
 +@Override
 +public Iterable<SSTableReader> 
filterSSTables(LifecycleTransaction transaction)
 +{
 +Iterable<SSTableReader> originals = transaction.originals();
 +if 
(cfStore.getCompactionStrategyManager().onlyPurgeRepairedTombstones())
 +originals = Iterables.filter(originals, 
SSTableReader::isRepaired);
 +List<SSTableReader> sortedSSTables = 
Lists.newArrayList(originals);
 +Collections.sort(sortedSSTables, 
SSTableReader.maxTimestampComparator);
 +return sortedSSTables;
 +}
 +
 +@Override
 +public void execute(LifecycleTransaction txn) throws IOException
 +{
 +logger.debug("Garbage collecting {}", txn.originals());
 +CompactionTask task = new CompactionTask(cfStore, txn, 
getDefaultGcBefore(cfStore, FBUtilities.nowInSeconds()))
 +{
 +@Override
 +protected CompactionController 
getCompactionController(Set<SSTableReader> toCompact)
 +{
 +return new CompactionController(cfStore, toCompact, 
gcBefore, null, tombstoneOption);
 +}
 +};
 +task.setUserDefined(true);
 +task.setCompactionType(OperationType.GARBAGE_COLLECT);
 +task.execute(metrics);
 +}
 +}, jobs, OperationType.GARBAGE_COLLECT);
 +}
 +
 +public AllSSTableOpStatus relocateSSTables(final ColumnFamilyStore cfs, 
int jobs) throws ExecutionException, InterruptedException
 +{
 +if (!cfs.getPartitioner().splitter().isPresent())
 +{
 +logger.info("Partitioner does not support splitting");
 +return AllSSTableOpStatus.ABORTED;
 +}
 +final Collection<Range<Token>> r = 
StorageService.instance.getLocalRanges(cfs.keyspace.getName());
 +
 +if (r.isEmpty())
 +{
 +logger.info("Relocate cannot run before a node has joined the 
ring");
 +return AllSSTableOpStatus.ABORTED;
 +}
 +
 +final DiskBoundaries diskBoundaries = cfs.getDiskBoundaries();
 +
 +return parallelAllSSTableOperation(cfs, new OneSSTableOperation()
 +{
 +@Override
 +public Iterable<SSTableReader> 
filterSSTables(LifecycleTransaction transaction)
 +{
 +Set<SSTableReader> originals = 
Sets.newHashSet(transaction.originals());
 +Set<SSTableReader> needsRelocation = 
originals.stream().filter(s -> 
!inCorrectLocation(s)).collect(Collectors.toSet());
 +transaction.cancel(Sets.difference(originals, 
needsRelocation));
 +
 +Map<Integer, List<SSTableReader>> groupedByDisk = 
groupByDiskIndex(needsRelocation);
 +
 +int maxSize = 0;
 +for (List<SSTableReader> diskSSTables : 
groupedByDisk.values())
 +maxSize = Math.max(maxSize, diskSSTables.size());
 +
 +List mixedSSTable

[04/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs

2018-06-29 Thread mck
Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for 
CASSANDRA-14423


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/trunk
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt 
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:49:53 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  70 ++-
 .../db/compaction/AntiCompactionTest.java   | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7b1089e..9d6a9ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.13
+ * Fix bug that prevented compaction of SSTables after full repairs 
(CASSANDRA-14423)
  * Incorrect counting of pending messages in OutboundTcpConnection 
(CASSANDRA-11551)
  * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
  * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 419f66e..013fc04 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -460,6 +460,16 @@ public class CompactionManager implements 
CompactionManagerMBean
 }, jobs, OperationType.CLEANUP);
 }
 
+/**
+ * Submit anti-compactions for a collection of SSTables over a set of 
repaired ranges and marks corresponding SSTables
+ * as repaired.
+ *
+ * @param cfs Column family for anti-compaction
+ * @param ranges Repaired ranges to be anti-compacted into separate 
SSTables.
+ * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+ * @param repairedAt Unix timestamp of when repair was completed.
+ * @return Futures executing anti-compaction.
+ */
 public ListenableFuture<?> submitAntiCompaction(final ColumnFamilyStore 
cfs,
   final Collection<Range<Token>> 
ranges,
   final Refs<SSTableReader> sstables,
@@ -475,6 +485,8 @@ public class CompactionManager implements 
CompactionManagerMBean
 {
 for (SSTableReader compactingSSTable : 
cfs.getTracker().getCompacting())
 sstables.releaseIfHolds(compactingSSTable);
+// We don't anti-compact any SSTable that has been 
compacted during repair as it may have been compacted
+// with unrepaired data.
 Set<SSTableReader> compactedSSTables = new HashSet<>();
 for (SSTableReader sstable : sstables)
 if (sstable.isMarkedCompacted())
@@ -504,9 +516,17 @@ public class CompactionManager implements 
CompactionManagerMBean
  *
  * Caller must reference the validatedForRepair sstables (via 
ParentRepairSession.getActiveRepairedSSTableRefs(..)).
  *
+ * NOTE: Repairs can take place on both unrepaired (incremental + full) 
and repaired (full) data.
+ * Although anti-compaction could work on repaired sstables as well and 
would result in having more accurate
+ * repairedAt values for these, we avoid anti-compacting already repaired 
sstables, as we currently don't
+ * make use of any actual repairedAt value and splitting up sstables just 
for that is not worth it. However, we will
+ * still update repairedAt if the SSTable is fully contained within the 
repaired ranges, as this does not require
+ * anticompaction.
+ *
  * @param cfs
  * @param ranges Ranges that the repair was carried out on
  * @param validatedForRepair SSTables containing the repaired ranges. 
Should be referenced before passing them.
+ * @param txn Transaction across all SSTables that were repaired.
  * @throws InterruptedException
  * @throws IOException
  */
@@ -519,13 +539,7 @@ public class CompactionManager implements 
CompactionManagerMBean
 logger.info("Starting anticompaction for {}.{} on {}/

[10/10] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2018-06-29 Thread mck
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5cc68a87
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5cc68a87
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5cc68a87

Branch: refs/heads/trunk
Commit: 5cc68a87359dd02412bdb70a52dfcd718d44a5ba
Parents: 4cb83cb bba0d03
Author: Mick Semb Wever 
Authored: Fri Jun 29 17:00:24 2018 +1000
Committer: Mick Semb Wever 
Committed: Fri Jun 29 17:00:24 2018 +1000

--

--






[02/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs

2018-06-29 Thread mck
Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for 
CASSANDRA-14423


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/cassandra-3.0
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt 
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:49:53 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  70 ++-
 .../db/compaction/AntiCompactionTest.java   | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7b1089e..9d6a9ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.13
+ * Fix bug that prevented compaction of SSTables after full repairs 
(CASSANDRA-14423)
  * Incorrect counting of pending messages in OutboundTcpConnection 
(CASSANDRA-11551)
  * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
  * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 419f66e..013fc04 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -460,6 +460,16 @@ public class CompactionManager implements 
CompactionManagerMBean
 }, jobs, OperationType.CLEANUP);
 }
 
+/**
+ * Submit anti-compactions for a collection of SSTables over a set of 
repaired ranges and marks corresponding SSTables
+ * as repaired.
+ *
+ * @param cfs Column family for anti-compaction
+ * @param ranges Repaired ranges to be anti-compacted into separate 
SSTables.
+ * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+ * @param repairedAt Unix timestamp of when repair was completed.
+ * @return Futures executing anti-compaction.
+ */
 public ListenableFuture<?> submitAntiCompaction(final ColumnFamilyStore 
cfs,
   final Collection<Range<Token>> 
ranges,
   final Refs<SSTableReader> sstables,
@@ -475,6 +485,8 @@ public class CompactionManager implements 
CompactionManagerMBean
 {
 for (SSTableReader compactingSSTable : 
cfs.getTracker().getCompacting())
 sstables.releaseIfHolds(compactingSSTable);
+// We don't anti-compact any SSTable that has been 
compacted during repair as it may have been compacted
+// with unrepaired data.
 Set<SSTableReader> compactedSSTables = new HashSet<>();
 for (SSTableReader sstable : sstables)
 if (sstable.isMarkedCompacted())
@@ -504,9 +516,17 @@ public class CompactionManager implements 
CompactionManagerMBean
  *
  * Caller must reference the validatedForRepair sstables (via 
ParentRepairSession.getActiveRepairedSSTableRefs(..)).
  *
+ * NOTE: Repairs can take place on both unrepaired (incremental + full) 
and repaired (full) data.
+ * Although anti-compaction could work on repaired sstables as well and 
would result in having more accurate
+ * repairedAt values for these, we avoid anti-compacting already repaired 
sstables, as we currently don't
+ * make use of any actual repairedAt value and splitting up sstables just 
for that is not worth it. However, we will
+ * still update repairedAt if the SSTable is fully contained within the 
repaired ranges, as this does not require
+ * anticompaction.
+ *
  * @param cfs
  * @param ranges Ranges that the repair was carried out on
  * @param validatedForRepair SSTables containing the repaired ranges. 
Should be referenced before passing them.
+ * @param txn Transaction across all SSTables that were repaired.
  * @throws InterruptedException
  * @throws IOException
  */
@@ -519,13 +539,7 @@ public class CompactionManager implements 
CompactionManagerMBean
 logger.info("Starting anticompaction for {}.{

[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2018-06-29 Thread mck
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/489c2f69
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/489c2f69
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/489c2f69

Branch: refs/heads/cassandra-3.0
Commit: 489c2f69510b001770d9a59e55ba5d5175019050
Parents: 4e23c9e f8912ce
Author: Mick Semb Wever 
Authored: Fri Jun 29 16:53:36 2018 +1000
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:57:34 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  66 ++-
 .../db/compaction/AntiCompactionTest.java   | 109 ++-
 3 files changed, 147 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/CHANGES.txt
--
diff --cc CHANGES.txt
index aeeb0ae,9d6a9ea..d694f3b
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,34 -1,5 +1,35 @@@
 -2.2.13
 +3.0.17
 + * Always close RT markers returned by ReadCommand#executeLocally() 
(CASSANDRA-14515)
 + * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
 + * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
 + * Add Missing dependencies in pom-all (CASSANDRA-14422)
 + * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)
 + * Fix deprecated repair error notifications from 3.x clusters to legacy JMX 
clients (CASSANDRA-13121)
 + * Cassandra not starting when using enhanced startup scripts in windows 
(CASSANDRA-14418)
 + * Fix progress stats and units in compactionstats (CASSANDRA-12244)
 + * Better handle missing partition columns in system_schema.columns 
(CASSANDRA-14379)
 + * Delay hints store excise by write timeout to avoid race with decommission 
(CASSANDRA-13740)
 + * Deprecate background repair and probablistic read_repair_chance table 
options
 +   (CASSANDRA-13910)
 + * Add missed CQL keywords to documentation (CASSANDRA-14359)
 + * Fix unbounded validation compactions on repair / revert CASSANDRA-13797 
(CASSANDRA-14332)
 + * Avoid deadlock when running nodetool refresh before node is fully up 
(CASSANDRA-14310)
 + * Handle all exceptions when opening sstables (CASSANDRA-14202)
 + * Handle incompletely written hint descriptors during startup 
(CASSANDRA-14080)
 + * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)
 + * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
 + * Respect max hint window when hinting for LWT (CASSANDRA-14215)
 + * Adding missing WriteType enum values to v3, v4, and v5 spec 
(CASSANDRA-13697)
 + * Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)
 + * Fix NPE when performing comparison against a null frozen in LWT 
(CASSANDRA-14087)
 + * Log when SSTables are deleted (CASSANDRA-14302)
 + * Fix batch commitlog sync regression (CASSANDRA-14292)
 + * Write to pending endpoint when view replica is also base replica 
(CASSANDRA-14251)
 + * Chain commit log marker potential performance regression in batch commit 
mode (CASSANDRA-14194)
 + * Fully utilise specified compaction threads (CASSANDRA-14210)
 + * Pre-create deletion log records to finish compactions quicker 
(CASSANDRA-12763)
 +Merged from 2.2:
+  * Fix bug that prevented compaction of SSTables after full repairs 
(CASSANDRA-14423)
   * Incorrect counting of pending messages in OutboundTcpConnection 
(CASSANDRA-11551)
   * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
   * Use Bounds instead of Range for sstables in anticompaction 
(CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index ab363e0,013fc04..f033bf2
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -474,6 -460,16 +474,17 @@@ public class CompactionManager implemen
  }, jobs, OperationType.CLEANUP);
  }
  
+ /**
+  * Submit anti-compactions for a collection of SSTables over a set of 
repaired ranges and marks corresponding SSTables
+  * as repaired.
+  *
+  * @param cfs Column family for anti-compaction
+  * @param ranges Repaired ranges to be anti-compacted into separate 
SSTables.
+  * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+  * @param repairedAt Unix timestamp of when repair was completed.
++ * @param parentRepairSession Corresponding repair session
+  * @return 

[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-06-29 Thread mck
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bba0d03e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bba0d03e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bba0d03e

Branch: refs/heads/trunk
Commit: bba0d03e9c5e62c222734839a9adc83f1aec6f95
Parents: ea62d88 489c2f6
Author: Mick Semb Wever 
Authored: Fri Jun 29 16:58:26 2018 +1000
Committer: Mick Semb Wever 
Committed: Fri Jun 29 17:00:02 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  67 +++-
 .../db/compaction/AntiCompactionTest.java   | 109 ++-
 3 files changed, 147 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bba0d03e/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bba0d03e/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index f0a4de5,f033bf2..fa6b03e
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -478,115 -474,17 +477,126 @@@ public class CompactionManager implemen
  }, jobs, OperationType.CLEANUP);
  }
  
 +public AllSSTableOpStatus performGarbageCollection(final 
ColumnFamilyStore cfStore, TombstoneOption tombstoneOption, int jobs) throws 
InterruptedException, ExecutionException
 +{
 +assert !cfStore.isIndex();
 +
 +return parallelAllSSTableOperation(cfStore, new OneSSTableOperation()
 +{
 +@Override
 +public Iterable<SSTableReader> 
filterSSTables(LifecycleTransaction transaction)
 +{
 +Iterable<SSTableReader> originals = transaction.originals();
 +if 
(cfStore.getCompactionStrategyManager().onlyPurgeRepairedTombstones())
 +originals = Iterables.filter(originals, 
SSTableReader::isRepaired);
 +List<SSTableReader> sortedSSTables = 
Lists.newArrayList(originals);
 +Collections.sort(sortedSSTables, 
SSTableReader.maxTimestampComparator);
 +return sortedSSTables;
 +}
 +
 +@Override
 +public void execute(LifecycleTransaction txn) throws IOException
 +{
 +logger.debug("Garbage collecting {}", txn.originals());
 +CompactionTask task = new CompactionTask(cfStore, txn, 
getDefaultGcBefore(cfStore, FBUtilities.nowInSeconds()))
 +{
 +@Override
 +protected CompactionController 
getCompactionController(Set<SSTableReader> toCompact)
 +{
 +return new CompactionController(cfStore, toCompact, 
gcBefore, null, tombstoneOption);
 +}
 +};
 +task.setUserDefined(true);
 +task.setCompactionType(OperationType.GARBAGE_COLLECT);
 +task.execute(metrics);
 +}
 +}, jobs, OperationType.GARBAGE_COLLECT);
 +}
 +
 +public AllSSTableOpStatus relocateSSTables(final ColumnFamilyStore cfs, 
int jobs) throws ExecutionException, InterruptedException
 +{
 +if (!cfs.getPartitioner().splitter().isPresent())
 +{
 +logger.info("Partitioner does not support splitting");
 +return AllSSTableOpStatus.ABORTED;
 +}
 +final Collection<Range<Token>> r = 
StorageService.instance.getLocalRanges(cfs.keyspace.getName());
 +
 +if (r.isEmpty())
 +{
 +logger.info("Relocate cannot run before a node has joined the 
ring");
 +return AllSSTableOpStatus.ABORTED;
 +}
 +
 +final DiskBoundaries diskBoundaries = cfs.getDiskBoundaries();
 +
 +return parallelAllSSTableOperation(cfs, new OneSSTableOperation()
 +{
 +@Override
 +public Iterable<SSTableReader> 
filterSSTables(LifecycleTransaction transaction)
 +{
 +Set<SSTableReader> originals = 
Sets.newHashSet(transaction.originals());
 +Set<SSTableReader> needsRelocation = 
originals.stream().filter(s -> 
!inCorrectLocation(s)).collect(Collectors.toSet());
 +transaction.cancel(Sets.difference(originals, 
needsRelocation));
 +
 +Map<Integer, List<SSTableReader>> groupedByDisk = 
groupByDiskIndex(needsRelocation);
 +
 +int maxSize = 0;
 +for (List<SSTableReader> diskSSTables : 
groupedByDisk.values())
 +maxSize = Math.max(maxSize, diskSSTables.size());
 +
 +List mixedSSTables = new A

[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2018-06-29 Thread mck
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/489c2f69
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/489c2f69
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/489c2f69

Branch: refs/heads/cassandra-3.11
Commit: 489c2f69510b001770d9a59e55ba5d5175019050
Parents: 4e23c9e f8912ce
Author: Mick Semb Wever 
Authored: Fri Jun 29 16:53:36 2018 +1000
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:57:34 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  66 ++-
 .../db/compaction/AntiCompactionTest.java   | 109 ++-
 3 files changed, 147 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/CHANGES.txt
--
diff --cc CHANGES.txt
index aeeb0ae,9d6a9ea..d694f3b
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,34 -1,5 +1,35 @@@
 -2.2.13
 +3.0.17
 + * Always close RT markers returned by ReadCommand#executeLocally() 
(CASSANDRA-14515)
 + * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
 + * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
 + * Add Missing dependencies in pom-all (CASSANDRA-14422)
 + * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)
 + * Fix deprecated repair error notifications from 3.x clusters to legacy JMX 
clients (CASSANDRA-13121)
 + * Cassandra not starting when using enhanced startup scripts in windows 
(CASSANDRA-14418)
 + * Fix progress stats and units in compactionstats (CASSANDRA-12244)
 + * Better handle missing partition columns in system_schema.columns 
(CASSANDRA-14379)
 + * Delay hints store excise by write timeout to avoid race with decommission 
(CASSANDRA-13740)
 + * Deprecate background repair and probablistic read_repair_chance table 
options
 +   (CASSANDRA-13910)
 + * Add missed CQL keywords to documentation (CASSANDRA-14359)
 + * Fix unbounded validation compactions on repair / revert CASSANDRA-13797 
(CASSANDRA-14332)
 + * Avoid deadlock when running nodetool refresh before node is fully up 
(CASSANDRA-14310)
 + * Handle all exceptions when opening sstables (CASSANDRA-14202)
 + * Handle incompletely written hint descriptors during startup 
(CASSANDRA-14080)
 + * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)
 + * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
 + * Respect max hint window when hinting for LWT (CASSANDRA-14215)
 + * Adding missing WriteType enum values to v3, v4, and v5 spec 
(CASSANDRA-13697)
 + * Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)
 + * Fix NPE when performing comparison against a null frozen in LWT 
(CASSANDRA-14087)
 + * Log when SSTables are deleted (CASSANDRA-14302)
 + * Fix batch commitlog sync regression (CASSANDRA-14292)
 + * Write to pending endpoint when view replica is also base replica 
(CASSANDRA-14251)
 + * Chain commit log marker potential performance regression in batch commit 
mode (CASSANDRA-14194)
 + * Fully utilise specified compaction threads (CASSANDRA-14210)
 + * Pre-create deletion log records to finish compactions quicker 
(CASSANDRA-12763)
 +Merged from 2.2:
+  * Fix bug that prevented compaction of SSTables after full repairs 
(CASSANDRA-14423)
   * Incorrect counting of pending messages in OutboundTcpConnection 
(CASSANDRA-11551)
   * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
   * Use Bounds instead of Range for sstables in anticompaction 
(CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index ab363e0,013fc04..f033bf2
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -474,6 -460,16 +474,17 @@@ public class CompactionManager implemen
  }, jobs, OperationType.CLEANUP);
  }
  
+ /**
+  * Submit anti-compactions for a collection of SSTables over a set of 
repaired ranges and marks corresponding SSTables
+  * as repaired.
+  *
+  * @param cfs Column family for anti-compaction
+  * @param ranges Repaired ranges to be anti-compacted into separate 
SSTables.
+  * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+  * @param repairedAt Unix timestamp of when repair was completed.
++ * @param parentRepairSession Corresponding repair session
+  * @return

[01/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs

2018-06-29 Thread mck
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 1143bc113 -> f8912ce93
  refs/heads/cassandra-3.0 4e23c9e4d -> 489c2f695
  refs/heads/cassandra-3.11 ea62d8862 -> bba0d03e9
  refs/heads/trunk 4cb83cb81 -> 5cc68a873


Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for 
CASSANDRA-14423


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/cassandra-2.2
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt 
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:49:53 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  70 ++-
 .../db/compaction/AntiCompactionTest.java   | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7b1089e..9d6a9ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.13
+ * Fix bug that prevented compaction of SSTables after full repairs 
(CASSANDRA-14423)
  * Incorrect counting of pending messages in OutboundTcpConnection 
(CASSANDRA-11551)
  * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
  * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 419f66e..013fc04 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -460,6 +460,16 @@ public class CompactionManager implements 
CompactionManagerMBean
 }, jobs, OperationType.CLEANUP);
 }
 
+/**
+ * Submit anti-compactions for a collection of SSTables over a set of 
repaired ranges and marks corresponding SSTables
+ * as repaired.
+ *
+ * @param cfs Column family for anti-compaction
+ * @param ranges Repaired ranges to be anti-compacted into separate 
SSTables.
+ * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+ * @param repairedAt Unix timestamp of when repair was completed.
+ * @return Futures executing anti-compaction.
+ */
 public ListenableFuture<?> submitAntiCompaction(final ColumnFamilyStore 
cfs,
   final Collection<Range<Token>> 
ranges,
   final Refs<SSTableReader> sstables,
@@ -475,6 +485,8 @@ public class CompactionManager implements 
CompactionManagerMBean
 {
 for (SSTableReader compactingSSTable : 
cfs.getTracker().getCompacting())
 sstables.releaseIfHolds(compactingSSTable);
+// We don't anti-compact any SSTable that has been 
compacted during repair as it may have been compacted
+// with unrepaired data.
 Set<SSTableReader> compactedSSTables = new HashSet<>();
 for (SSTableReader sstable : sstables)
 if (sstable.isMarkedCompacted())
@@ -504,9 +516,17 @@ public class CompactionManager implements 
CompactionManagerMBean
  *
  * Caller must reference the validatedForRepair sstables (via 
ParentRepairSession.getActiveRepairedSSTableRefs(..)).
  *
+ * NOTE: Repairs can take place on both unrepaired (incremental + full) 
and repaired (full) data.
+ * Although anti-compaction could work on repaired sstables as well and 
would result in having more accurate
+ * repairedAt values for these, we avoid anti-compacting already repaired 
sstables, as we currently don't
+ * make use of any actual repairedAt value and splitting up sstables just 
for that is not worth it. However, we will
+ * still update repairedAt if the SSTable is fully contained within the 
repaired ranges, as this does not require
+ * anticompaction.
+ *
  * @param cfs
  * @param ranges Ranges that the repair was carried out on
  * @param validatedForRepair SSTables containing the repaired ranges. 
Should be referenced before passing them.
+ * @param txn Transaction across all SSTables 

[03/10] cassandra git commit: Stop SSTables being lost from compaction strategy after full repairs

2018-06-29 Thread mck
Stop SSTables being lost from compaction strategy after full repairs

patch by Kurt Greaves; reviewed by Stefan Podkowinski, Marcus Eriksson, for 
CASSANDRA-14423


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f8912ce9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f8912ce9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f8912ce9

Branch: refs/heads/cassandra-3.11
Commit: f8912ce9329a8bc360e93cf61e56814135fbab39
Parents: 1143bc1
Author: kurt 
Authored: Thu Jun 14 10:59:19 2018 +
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:49:53 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  70 ++-
 .../db/compaction/AntiCompactionTest.java   | 120 ++-
 3 files changed, 156 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7b1089e..9d6a9ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.13
+ * Fix bug that prevented compaction of SSTables after full repairs (CASSANDRA-14423)
  * Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
  * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
  * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f8912ce9/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 419f66e..013fc04 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -460,6 +460,16 @@ public class CompactionManager implements CompactionManagerMBean
 }, jobs, OperationType.CLEANUP);
 }
 
+/**
+ * Submit anti-compactions for a collection of SSTables over a set of repaired ranges and marks corresponding SSTables
+ * as repaired.
+ *
+ * @param cfs Column family for anti-compaction
+ * @param ranges Repaired ranges to be anti-compacted into separate SSTables.
+ * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+ * @param repairedAt Unix timestamp of when repair was completed.
+ * @return Futures executing anti-compaction.
+ */
 public ListenableFuture<?> submitAntiCompaction(final ColumnFamilyStore cfs,
   final Collection<Range<Token>> ranges,
   final Refs<SSTableReader> sstables,
@@ -475,6 +485,8 @@ public class CompactionManager implements CompactionManagerMBean
 {
 for (SSTableReader compactingSSTable : cfs.getTracker().getCompacting())
 sstables.releaseIfHolds(compactingSSTable);
+// We don't anti-compact any SSTable that has been compacted during repair as it may have been compacted
+// with unrepaired data.
 Set<SSTableReader> compactedSSTables = new HashSet<>();
 for (SSTableReader sstable : sstables)
 if (sstable.isMarkedCompacted())
@@ -504,9 +516,17 @@ public class CompactionManager implements CompactionManagerMBean
  *
  * Caller must reference the validatedForRepair sstables (via ParentRepairSession.getActiveRepairedSSTableRefs(..)).
  *
+ * NOTE: Repairs can take place on both unrepaired (incremental + full) and repaired (full) data.
+ * Although anti-compaction could work on repaired sstables as well and would result in having more accurate
+ * repairedAt values for these, we avoid anti-compacting already repaired sstables, as we currently don't
+ * make use of any actual repairedAt value and splitting up sstables just for that is not worth it. However, we will
+ * still update repairedAt if the SSTable is fully contained within the repaired ranges, as this does not require
+ * anticompaction.
+ *
  * @param cfs
  * @param ranges Ranges that the repair was carried out on
  * @param validatedForRepair SSTables containing the repaired ranges. Should be referenced before passing them.
+ * @param txn Transaction across all SSTables that were repaired.
  * @throws InterruptedException
  * @throws IOException
  */
@@ -519,13 +539,7 @@ public class CompactionManager implements CompactionManagerMBean
 logger.info("Starting anticompaction for {}.
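
The NOTE in the Javadoc above describes a three-way choice per SSTable: update repairedAt only, anti-compact, or leave alone. Below is a minimal sketch of that choice, not the actual CompactionManager code. `TokenRange`, `SSTableInfo`, `Action` and `decide` are hypothetical names introduced for illustration, and tokens are simplified to longs in non-wrapping ranges.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model: tokens are plain longs and ranges never wrap around.
final class TokenRange
{
    final long left;  // exclusive start
    final long right; // inclusive end

    TokenRange(long left, long right)
    {
        this.left = left;
        this.right = right;
    }

    // True if the whole span [first, last] lies inside (left, right].
    boolean contains(long first, long last)
    {
        return first > left && last <= right;
    }

    // True if [first, last] overlaps (left, right] at all.
    boolean intersects(long first, long last)
    {
        return last > left && first <= right;
    }
}

// Hypothetical stand-in for the SSTable metadata the decision needs.
final class SSTableInfo
{
    final long firstToken;
    final long lastToken;
    final boolean repaired;

    SSTableInfo(long firstToken, long lastToken, boolean repaired)
    {
        this.firstToken = firstToken;
        this.lastToken = lastToken;
        this.repaired = repaired;
    }
}

enum Action { MUTATE_REPAIRED_AT, ANTI_COMPACT, SKIP }

final class AntiCompactionSketch
{
    static Action decide(SSTableInfo sstable, List<TokenRange> repairedRanges)
    {
        boolean intersects = false;
        for (TokenRange range : repairedRanges)
        {
            // Fully contained: only the repairedAt metadata changes, no rewrite needed.
            if (range.contains(sstable.firstToken, sstable.lastToken))
                return Action.MUTATE_REPAIRED_AT;
            if (range.intersects(sstable.firstToken, sstable.lastToken))
                intersects = true;
        }
        // Partially covered: split only if the SSTable is still unrepaired.
        return (intersects && !sstable.repaired) ? Action.ANTI_COMPACT : Action.SKIP;
    }

    public static void main(String[] args)
    {
        List<TokenRange> repaired = Arrays.asList(new TokenRange(0, 100));
        System.out.println(decide(new SSTableInfo(10, 50, true), repaired));
        System.out.println(decide(new SSTableInfo(50, 150, false), repaired));
        System.out.println(decide(new SSTableInfo(50, 150, true), repaired));
    }
}
```

In this model an already repaired SSTable that only partially overlaps the repaired ranges is skipped rather than split, which is the trade-off the NOTE argues for.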

[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2018-06-29 Thread mck
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/489c2f69
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/489c2f69
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/489c2f69

Branch: refs/heads/trunk
Commit: 489c2f69510b001770d9a59e55ba5d5175019050
Parents: 4e23c9e f8912ce
Author: Mick Semb Wever 
Authored: Fri Jun 29 16:53:36 2018 +1000
Committer: Mick Semb Wever 
Committed: Fri Jun 29 16:57:34 2018 +1000

--
 CHANGES.txt |   1 +
 .../db/compaction/CompactionManager.java|  66 ++-
 .../db/compaction/AntiCompactionTest.java   | 109 ++-
 3 files changed, 147 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/CHANGES.txt
--
diff --cc CHANGES.txt
index aeeb0ae,9d6a9ea..d694f3b
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,34 -1,5 +1,35 @@@
 -2.2.13
 +3.0.17
 + * Always close RT markers returned by ReadCommand#executeLocally() (CASSANDRA-14515)
 + * Reverse order queries with range tombstones can cause data loss (CASSANDRA-14513)
 + * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
 + * Add Missing dependencies in pom-all (CASSANDRA-14422)
 + * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)
 + * Fix deprecated repair error notifications from 3.x clusters to legacy JMX clients (CASSANDRA-13121)
 + * Cassandra not starting when using enhanced startup scripts in windows (CASSANDRA-14418)
 + * Fix progress stats and units in compactionstats (CASSANDRA-12244)
 + * Better handle missing partition columns in system_schema.columns (CASSANDRA-14379)
 + * Delay hints store excise by write timeout to avoid race with decommission (CASSANDRA-13740)
 + * Deprecate background repair and probablistic read_repair_chance table options
 +   (CASSANDRA-13910)
 + * Add missed CQL keywords to documentation (CASSANDRA-14359)
 + * Fix unbounded validation compactions on repair / revert CASSANDRA-13797 (CASSANDRA-14332)
 + * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
 + * Handle all exceptions when opening sstables (CASSANDRA-14202)
 + * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
 + * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)
 + * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
 + * Respect max hint window when hinting for LWT (CASSANDRA-14215)
 + * Adding missing WriteType enum values to v3, v4, and v5 spec (CASSANDRA-13697)
 + * Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)
 + * Fix NPE when performing comparison against a null frozen in LWT (CASSANDRA-14087)
 + * Log when SSTables are deleted (CASSANDRA-14302)
 + * Fix batch commitlog sync regression (CASSANDRA-14292)
 + * Write to pending endpoint when view replica is also base replica (CASSANDRA-14251)
 + * Chain commit log marker potential performance regression in batch commit mode (CASSANDRA-14194)
 + * Fully utilise specified compaction threads (CASSANDRA-14210)
 + * Pre-create deletion log records to finish compactions quicker (CASSANDRA-12763)
 +Merged from 2.2:
+  * Fix bug that prevented compaction of SSTables after full repairs (CASSANDRA-14423)
   * Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
   * Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)
   * Use Bounds instead of Range for sstables in anticompaction (CASSANDRA-14411)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/489c2f69/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index ab363e0,013fc04..f033bf2
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@@ -474,6 -460,16 +474,17 @@@ public class CompactionManager implemen
  }, jobs, OperationType.CLEANUP);
  }
  
+ /**
+  * Submit anti-compactions for a collection of SSTables over a set of repaired ranges and marks corresponding SSTables
+  * as repaired.
+  *
+  * @param cfs Column family for anti-compaction
+  * @param ranges Repaired ranges to be anti-compacted into separate SSTables.
+  * @param sstables {@link Refs} of SSTables within CF to anti-compact.
+  * @param repairedAt Unix timestamp of when repair was completed.
++ * @param parentRepairSession Corresponding repair session
+  * @return Futures 

[jira] [Commented] (CASSANDRA-10540) RangeAwareCompaction

2018-06-29 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527299#comment-16527299
 ] 

Marcus Eriksson commented on CASSANDRA-10540:
-

Hey [~Lerh Low], thanks so much for the testing. I will look into the results 
soon. I assume you didn't find any issues with the patch?

> RangeAwareCompaction
> 
>
> Key: CASSANDRA-10540
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10540
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
>  Labels: compaction, lcs, vnodes
> Fix For: 4.x
>
>
> Broken out from CASSANDRA-6696, we should split sstables based on ranges 
> during compaction.
> Requirements:
> * don't create tiny sstables - keep them bunched together until a single vnode 
> is big enough (how big is configurable)
> * make it possible to run existing compaction strategies on the per-range 
> sstables
> We should probably add a global compaction strategy parameter that states 
> whether this should be enabled or not.
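
The first requirement quoted above (keep small vnodes bunched together, give a vnode its own sstable only once it is big enough) can be sketched as follows. This is an illustration of the stated requirement only, not the patch attached to the ticket. `RangeAwareSplitSketch`, `plan`, `minSplitBytes` and the output names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RangeAwareSplitSketch
{
    // Maps a (hypothetical) output sstable name to the vnode indexes written into it.
    // bytesPerVnode[i] is the estimated data size owned by vnode i;
    // minSplitBytes is the configurable "big enough" threshold.
    static Map<String, List<Integer>> plan(long[] bytesPerVnode, long minSplitBytes)
    {
        Map<String, List<Integer>> outputs = new LinkedHashMap<>();
        List<Integer> bunched = new ArrayList<>();
        for (int vnode = 0; vnode < bytesPerVnode.length; vnode++)
        {
            if (bytesPerVnode[vnode] >= minSplitBytes)
                outputs.put("vnode-" + vnode, Collections.singletonList(vnode)); // gets its own sstable
            else
                bunched.add(vnode); // too small: stays together with other small vnodes
        }
        if (!bunched.isEmpty())
            outputs.put("bunched", bunched);
        return outputs;
    }

    public static void main(String[] args)
    {
        long[] sizes = { 10, 500, 20, 900 };
        System.out.println(plan(sizes, 100));
    }
}
```

An existing compaction strategy could then be run separately over each per-vnode output, which is the second requirement in the list.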



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-06-29 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527284#comment-16527284
 ] 

mck commented on CASSANDRA-14423:
-

Jumping on this to commit. Two (other) reviewers have +1'd this now, and tests 
have passed.

Repeating for clarity, the patches and their tests were…
||Branch||uTest||dTest||
|[cassandra-2.2_14423|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14423-2.2]|[!https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-2.2.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-2.2]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/583/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/583]|
|[cassandra-3.0_14423|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14423-3.0]|[!https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.0.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.0]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/581/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/581]|
|[cassandra-3.11_14423|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14423-3.11]|[!https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.11.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/CASSANDRA-14423-3.11]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/580/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/580]|


[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted

2018-06-29 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-14423:
---
Status: Ready to Commit  (was: Patch Available)

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> We're seeing a problem in 3.11.0 where SSTables are being lost from the view 
> and not being included in compactions or as candidates for compaction. It 
> seems to get progressively worse until there are only 1-2 SSTables in the 
> view, which happen to be the most recent SSTables, and thus compactions 
> completely stop for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User defined/major compactions 
> still work. It is not clear whether they include the result back in the view, 
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
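
The LCS discrepancy in the output above (10 SSTables counted, only 2 listed in levels) can be checked mechanically. Below is a minimal sketch that reads only the two relevant lines of such output. `TableStatsCheckSketch` and `missingFromLevels` are hypothetical names, and tolerating a `/n` suffix on a level entry is an assumption about the output format, not something shown in this report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TableStatsCheckSketch
{
    private static final Pattern COUNT = Pattern.compile("SSTable count:\\s*(\\d+)");
    private static final Pattern LEVELS = Pattern.compile("SSTables in each level:\\s*\\[([^\\]]*)\\]");

    // Returns (SSTable count) minus (sum of per-level counts),
    // or 0 if either line is missing from the given text.
    static int missingFromLevels(String tableStats)
    {
        Matcher count = COUNT.matcher(tableStats);
        Matcher levels = LEVELS.matcher(tableStats);
        if (!count.find() || !levels.find())
            return 0;
        int inLevels = 0;
        for (String part : levels.group(1).split(","))
        {
            // Assumption: an entry may carry a "/n" suffix; only the leading count is used.
            String n = part.trim().split("/")[0];
            if (!n.isEmpty())
                inLevels += Integer.parseInt(n);
        }
        return Integer.parseInt(count.group(1)) - inLevels;
    }

    public static void main(String[] args)
    {
        String stats = "SSTable count: 10\n"
                     + "SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]\n";
        System.out.println(missingFromLevels(stats));
    }
}
```

A non-zero result for an LCS table would match the symptom described in this report. It does not by itself identify the cause.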
> Also, for STCS we've confirmed that SSTable count will differ from the 
> number of SSTables reported in the compaction buckets. In the example below 
> there are only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy 
> Compaction buckets are 
> [[BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd