[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-09-01 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14145:

Status: Ready to Commit  (was: Patch Available)

>  Detecting data resurrection during read
> 
>
> Key: CASSANDRA-14145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14145
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Sam Tunnicliffe
>Priority: Minor
> Fix For: 4.x
>
>
> We have seen several bugs in which deleted data gets resurrected. We should 
> try to see if we can detect this on the read path and possibly fix it. Here 
> are a few examples which brought back data:
> A replica lost an sstable on startup, which caused that replica to lose the 
> tombstone but not the data. The tombstone was past gc grace, which means it 
> could resurrect data. We can detect such invalid states by looking at other 
> replicas. 
> If we are running incremental repair, Cassandra will keep repaired and 
> non-repaired data separate. Every time incremental repair runs, it moves 
> data from the non-repaired set to the repaired set. Repaired data across all 
> replicas should be 100% consistent. 
> Here is an example of how we can detect and mitigate the issue in most cases. 
> Say we have 3 machines, A, B and C. All of these machines will have data split 
> between repaired and non-repaired. 
> 1. Machine A, due to some bug, brings back data D. This data D is in the 
> repaired dataset. All other replicas will have data D and tombstone T. 
> 2. A read for data D comes from the application, involving replicas A and B. 
> The data being read is in the repaired set. A will respond to the 
> coordinator with data D, and B will send nothing as the tombstone is past 
> gc grace. This will cause a digest mismatch. 
> 3. This patch will only kick in when there is a digest mismatch. The 
> coordinator will ask both replicas to send back all data, like we do today, 
> but with this patch each replica will indicate whether the data it returns 
> comes from the repaired or the non-repaired set. If the data coming from the 
> repaired set does not match, we know something is wrong. At this point the 
> coordinator cannot determine whether replica A has resurrected some data or 
> replica B has lost some data, but we can still log an error saying we hit an 
> invalid state.
> 4. Beyond logging, we can take this further and even correct the response to 
> the query. After logging the invalid state, we can ask replicas A and B (and 
> also C, if alive) to send back all data for this read, including gcable 
> tombstones. If any machine returns a tombstone newer than this data, we know 
> we cannot return the data. This way we can avoid returning data which has 
> been deleted. 
> Some challenges with this: 
> 1. When data is moved from non-repaired to repaired, there could be a race. 
> We can look at which incremental repairs have promoted data on which replica 
> to avoid false positives.  
> 2. If the third replica is down and the live replica does not have any 
> tombstone, we won't be able to break the tie in deciding whether data was 
> actually deleted or resurrected. 
> 3. If the read is for the latest data only, we won't be able to detect it, 
> as the read will be served from non-repaired data. 
> 4. If the replica where we lose a tombstone is the last replica to compact 
> the tombstone, we won't be able to decide whether data is coming back or the 
> rest of the replicas have lost that data. But we will still detect that 
> something is wrong. 
> 5. We won't affect 99.9% of read queries, as we only do extra work on a 
> digest mismatch.
> 6. CL.ONE reads will not be able to detect this. 
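
The digest comparison in step 3 could be sketched roughly like this (an illustrative sketch only; {{ReplicaResponse}} and its fields are invented names, not the actual patch):

```java
import java.util.Arrays;

// Illustrative sketch of step 3: on a digest mismatch, each replica reports
// separate digests for the repaired and non-repaired portions of its response.
// Repaired data must be identical on all replicas, so a mismatch there signals
// resurrection or loss. ReplicaResponse and its fields are invented names.
public class RepairedDataCheck
{
    public static final class ReplicaResponse
    {
        final String replica;
        final byte[] repairedDigest;    // digest over data read from repaired sstables
        final byte[] unrepairedDigest;  // digest over data read from non-repaired sstables

        public ReplicaResponse(String replica, byte[] repairedDigest, byte[] unrepairedDigest)
        {
            this.replica = replica;
            this.repairedDigest = repairedDigest;
            this.unrepairedDigest = unrepairedDigest;
        }
    }

    // Non-repaired digests may legitimately differ (data not yet repaired),
    // but differing repaired digests mean an invalid state worth logging.
    public static boolean repairedDataMismatch(ReplicaResponse a, ReplicaResponse b)
    {
        return !Arrays.equals(a.repairedDigest, b.repairedDigest);
    }
}
```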



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read

2018-09-01 Thread Sam Tunnicliffe (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599806#comment-16599806
 ] 

Sam Tunnicliffe commented on CASSANDRA-14145:
-

This has two +1s and CI looks good (the dtests are running a branch which 
enables this globally), but unfortunately this conflicts quite heavily with 
CASSANDRA-14408 which landed yesterday. So I'm going to mark it Ready To Commit 
and take a look at the rebase on Monday. If I get through that in a reasonable 
time (a day or two) and CI still looks good, I'll commit then if nobody objects 
too much.

||utests||dtests||
|[utests|https://circleci.com/gh/beobal/cassandra/390]|[vnodes|https://circleci.com/gh/beobal/cassandra/389]
 / [no vnodes|https://circleci.com/gh/beobal/cassandra/388]|








[jira] [Updated] (CASSANDRA-13304) Add checksumming to the native protocol

2018-09-01 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-13304:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

bq. The issue is that the checksum over the lengths is only calculated over the 
least-significant byte of the compressed and uncompressed lengths
Great catch. I've fixed it to calculate over the entirety of the lengths and 
pulled in your test. I had to add a couple of checks to it though, as it 
generated some false positives. I can imagine there are better ways to achieve 
what I did via the generators, but this was the simplest thing I could think of.
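
To make the fix concrete, here is a standalone illustration of the difference, using {{java.util.zip.CRC32}} (a simplification, not the actual ChecksummingTransformer code):

```java
import java.util.zip.CRC32;

// Illustration of the lengths-checksum bug described above (simplified, not
// the actual ChecksummingTransformer code): checksumming only the
// least-significant byte of each length lets corruption in the high bytes
// go undetected.
public class LengthChecksum
{
    // Buggy: only the low byte of each length feeds the CRC
    static long checksumLsbOnly(int compressedLength, int uncompressedLength)
    {
        CRC32 crc = new CRC32();
        crc.update(compressedLength & 0xFF);
        crc.update(uncompressedLength & 0xFF);
        return crc.getValue();
    }

    // Fixed: all four bytes of each length are covered
    static long checksumFull(int compressedLength, int uncompressedLength)
    {
        CRC32 crc = new CRC32();
        for (int len : new int[] { compressedLength, uncompressedLength })
            for (int shift = 24; shift >= 0; shift -= 8)
                crc.update((len >>> shift) & 0xFF);
        return crc.getValue();
    }
}
```

With the buggy version, two frames whose lengths differ only above the low byte produce identical checksums, so the corruption is invisible.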

bq. EDIT (added this comment this morning): In the test failure below we also 
hit the buffer resize condition, ret.writableBytes() < (CHUNK_HEADER_OVERHEAD + 
toWrite). This is because we don’t account for CHUNK_HEADER_OVERHEAD when 
allocating the buffer and by the time we check, we have already written some 
parts of the chunk header. While this test is an edge case (single byte input) 
it does lead to us allocating a buffer 3.6x larger than we need. 
Dinesh Joshi and I were reviewing the bug above before I posted and we were 
thinking it would be nice to refactor 
ChecksummingTransformer#transformInbound/transformOutbound. They are a bit 
large/unwieldy right now. We can open a JIRA to address this later if you 
prefer.

Hmm, yes. It is a bit of an edge case, but we should fix it. Let's open a follow 
up for that and the potential refactoring.

bq. Re: the roundTripZeroLength property. This is mostly covered by the 
property I already added although this makes it more likely to generate a few 
cases. If you want to keep it I would recommend setting withExamples and using 
something small like 10 or 20 examples (since the state space is small).
I added it because my understanding was that the initial roundtrip test is 
somewhat probabilistic, so I wanted to make sure we covered that edge case. I've 
left it in for now, with the low {{withExamples}}, but I'm happy to remove it 
if you think it's not necessary.

bq. The System.out.println I added in roundTripSafetyProperty should be removed
I did remove it, didn't I? (I may be going a bit snow blind, but I can't see it.)


Tests are all passing now (modulo things I *think* we're expecting to fail 
right now), so committing this to trunk in 
{{65fb17a88bd096b1e952ccca31ad709759644a1b}}, thanks all!

||utests||dtests||
|[utests|https://circleci.com/gh/beobal/cassandra/380]|[vnodes|https://circleci.com/gh/beobal/cassandra/379]
 / [no vnodes|https://circleci.com/gh/beobal/cassandra/378]|


> Add checksumming to the native protocol
> ---
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Michael Kjellman
>Assignee: Sam Tunnicliffe
>Priority: Blocker
>  Labels: client-impacting
> Fix For: 4.0
>
> Attachments: 13304_v1.diff, boxplot-read-throughput.png, 
> boxplot-write-throughput.png
>
>
> The native binary transport implementation doesn't include checksums. This 
> makes it highly susceptible to silently inserting corrupted data either due 
> to hardware issues causing bit flips on the sender/client side, C*/receiver 
> side, or network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both 
> client and server know about a protocol version that supports checksums) -- 
> and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
>  *  1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
>  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Number of Compressed Chunks  | Compressed Length (e1)/
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * /  Compressed Length cont. (e1) |Uncompressed Length (e1)   /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Checksum of Lengths cont. (e1)|Compressed Bytes (e1)+//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (e1) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |Compressed Length (e2) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  
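
As far as the (truncated) diagram above is readable, serializing the single-chunk header fields could be sketched like this (field widths follow the diagram; this is a simplified illustration, not the attached implementation):

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Simplified sketch of the chunk header from the diagram: a 16-bit chunk
// count, two 32-bit lengths, then a CRC32 computed over all 8 bytes of the
// two lengths. Not the attached implementation.
public class ChunkHeaderSketch
{
    static ByteBuffer writeHeader(short numChunks, int compressedLength, int uncompressedLength)
    {
        // 2 bytes chunk count + 4 + 4 bytes of lengths + 4 bytes CRC of lengths
        ByteBuffer header = ByteBuffer.allocate(2 + 4 + 4 + 4);
        header.putShort(numChunks);
        header.putInt(compressedLength);
        header.putInt(uncompressedLength);

        // the checksum must cover every byte of both lengths
        CRC32 crc = new CRC32();
        crc.update(header.array(), 2, 8);
        header.putInt((int) crc.getValue());
        header.flip();
        return header;
    }
}
```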

[1/2] cassandra git commit: Add checksumming to the native protocol

2018-09-01 Thread samt
Repository: cassandra
Updated Branches:
  refs/heads/trunk 960174da6 -> 65fb17a88


http://git-wip-us.apache.org/repos/asf/cassandra/blob/65fb17a8/test/unit/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformerTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformerTest.java
 
b/test/unit/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformerTest.java
new file mode 100644
index 000..82401b0
--- /dev/null
+++ 
b/test/unit/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformerTest.java
@@ -0,0 +1,224 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.transport.frame.checksum;
+
+import java.io.IOException;
+import java.util.EnumSet;
+import java.util.Random;
+
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.Unpooled;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.transport.Frame;
+import org.apache.cassandra.transport.ProtocolException;
+import org.apache.cassandra.transport.frame.compress.Compressor;
+import org.apache.cassandra.transport.frame.compress.LZ4Compressor;
+import org.apache.cassandra.transport.frame.compress.SnappyCompressor;
+import org.apache.cassandra.utils.ChecksumType;
+import org.apache.cassandra.utils.Pair;
+import org.quicktheories.core.Gen;
+
+import static org.quicktheories.QuickTheory.qt;
+import static org.quicktheories.generators.SourceDSL.*;
+
+public class ChecksummingTransformerTest
+{
+private static final int DEFAULT_BLOCK_SIZE = 1 << 15;
+private static final int MAX_INPUT_SIZE = 1 << 18;
+private static final EnumSet<Frame.Header.Flag> FLAGS = EnumSet.of(Frame.Header.Flag.COMPRESSED, Frame.Header.Flag.CHECKSUMMED);
+
+@BeforeClass
+public static void init()
+{
+// required as static ChecksummingTransformer instances read default 
block size from config
+DatabaseDescriptor.clientInitialization();
+}
+
+@Test
+public void roundTripSafetyProperty()
+{
+qt().withExamples(500)
+.forAll(inputs(),
+compressors(),
+checksumTypes(),
+blocksizes())
+.checkAssert(this::roundTrip);
+}
+
+@Test
+public void roundTripZeroLengthInput()
+{
+qt().withExamples(20)
+.forAll(zeroLengthInputs(),
+compressors(),
+checksumTypes(),
+blocksizes())
+.checkAssert(this::roundTrip);
+}
+
+@Test
+public void corruptionCausesFailure()
+{
+qt().withExamples(500)
+.forAll(inputWithCorruptablePosition(),
+integers().between(0, 
Byte.MAX_VALUE).map(Integer::byteValue),
+compressors(),
+checksumTypes())
+.checkAssert(this::roundTripWithCorruption);
+}
+
+private void roundTripWithCorruption(Pair<String, Integer> inputAndCorruptablePosition,
+ byte corruptionValue,
+ Compressor compressor,
+ ChecksumType checksum)
+{
+String input = inputAndCorruptablePosition.left;
+ByteBuf expectedBuf = Unpooled.wrappedBuffer(input.getBytes());
+int byteToCorrupt = inputAndCorruptablePosition.right;
+ChecksummingTransformer transformer = new 
ChecksummingTransformer(checksum, DEFAULT_BLOCK_SIZE, compressor);
+ByteBuf outbound = transformer.transformOutbound(expectedBuf);
+
+// make sure we're actually expecting to produce some corruption
+if (outbound.getByte(byteToCorrupt) == corruptionValue)
+return;
+
+if (byteToCorrupt >= outbound.writerIndex())
+return;
+
+try
+{
+int oldIndex = outbound.writerIndex();
+outbound.writerIndex(byteToCorrupt);
+outbound.writeByte(corruptionValue);
+outbound.writerIndex(oldIndex);
+ByteBuf 

[2/2] cassandra git commit: Add checksumming to the native protocol

2018-09-01 Thread samt
Add checksumming to the native protocol

Patch by Michael Kjellman and Sam Tunnicliffe; reviewed by Dinesh Joshi
and Jordan West for CASSANDRA-13304


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/65fb17a8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/65fb17a8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/65fb17a8

Branch: refs/heads/trunk
Commit: 65fb17a88bd096b1e952ccca31ad709759644a1b
Parents: 960174d
Author: Sam Tunnicliffe 
Authored: Fri Mar 10 15:18:33 2017 +
Committer: Sam Tunnicliffe 
Committed: Sat Sep 1 22:41:37 2018 +0100

--
 CHANGES.txt |   1 +
 bin/debug-cql   |   2 +-
 build.xml   |   2 +
 conf/cassandra.yaml |   4 +
 .../org/apache/cassandra/config/Config.java |   1 +
 .../cassandra/config/DatabaseDescriptor.java|   5 +
 .../org/apache/cassandra/transport/CBUtil.java  |   6 +
 .../org/apache/cassandra/transport/Client.java  |  57 ++-
 .../apache/cassandra/transport/Connection.java  |  11 +-
 .../org/apache/cassandra/transport/Frame.java   |  42 ++-
 .../cassandra/transport/FrameCompressor.java| 211 ---
 .../cassandra/transport/ProtocolVersion.java|   5 +
 .../org/apache/cassandra/transport/Server.java  |   8 +-
 .../cassandra/transport/SimpleClient.java   |  29 +-
 .../transport/frame/FrameBodyTransformer.java   |  57 +++
 .../frame/checksum/ChecksummingTransformer.java | 361 +++
 .../frame/compress/CompressingTransformer.java  | 164 +
 .../transport/frame/compress/Compressor.java|  62 
 .../transport/frame/compress/LZ4Compressor.java |  68 
 .../frame/compress/SnappyCompressor.java|  79 
 .../transport/messages/OptionsMessage.java  |  20 +-
 .../transport/messages/StartupMessage.java  |  70 +++-
 .../org/apache/cassandra/cql3/CQLTester.java|   4 +-
 .../cassandra/cql3/PreparedStatementsTest.java  |   2 +-
 .../cassandra/service/ClientWarningsTest.java   |   8 +-
 .../service/ProtocolBetaVersionTest.java|   4 +-
 .../cassandra/transport/MessagePayloadTest.java |   6 +-
 .../checksum/ChecksummingTransformerTest.java   | 224 
 .../stress/settings/StressSettings.java |   2 +-
 29 files changed, 1234 insertions(+), 281 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/65fb17a8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index aca31fe..301f97f 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add checksumming to the native protocol (CASSANDRA-13304)
  * Make AuthCache more easily extendable (CASSANDRA-14662)
  * Extend RolesCache to include detailed role info (CASSANDRA-14497)
  * Add fqltool compare (CASSANDRA-14619)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/65fb17a8/bin/debug-cql
--
diff --git a/bin/debug-cql b/bin/debug-cql
index c184df9..9550ddf 100755
--- a/bin/debug-cql
+++ b/bin/debug-cql
@@ -46,7 +46,7 @@ esac
 
 class="org.apache.cassandra.transport.Client"
 cassandra_parms="-Dlogback.configurationFile=logback-tools.xml"
-"$JAVA" $JVM_OPTS $cassandra_parms  -cp "$CLASSPATH" "$class" $1 $2
+"$JAVA" $JVM_OPTS $cassandra_parms  -cp "$CLASSPATH" "$class" $@
 
 exit $?
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/65fb17a8/build.xml
--
diff --git a/build.xml b/build.xml
index 1c957d3..86462f7 100644
--- a/build.xml
+++ b/build.xml
@@ -435,6 +435,7 @@
 
   
   
+  
   
  
   
@@ -560,6 +561,7 @@
 artifactId="cassandra-parent"
 version="${version}"/>
 
+
 
 
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/65fb17a8/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 503a0fa..995a520 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -656,6 +656,10 @@ native_transport_port: 9042
 # you may want to adjust max_value_size_in_mb accordingly. This should be 
positive and less than 2048.
 # native_transport_max_frame_size_in_mb: 256
 
+# If checksumming is enabled as a protocol option, denotes the size of the chunks
+# into which frame bodies will be broken and checksummed.
+# native_transport_frame_block_size_in_kb: 32
+
 # The maximum number of concurrent client connections.
 # The default is -1, which means unlimited.
 # native_transport_max_concurrent_connections: -1


[jira] [Updated] (CASSANDRA-14662) Refactor AuthCache

2018-09-01 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14662:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

Thanks [~KurtG], committed in {{960174da67eb6008c73340e61700ea34ec550a12}}

> Refactor AuthCache
> --
>
> Key: CASSANDRA-14662
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14662
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
>  Labels: security
> Fix For: 4.0
>
>
> When building an LDAP IAuthenticator plugin I ran into a few issues when 
> trying to reuse the AuthCache similar to how PasswordAuthenticator implements 
> it. Most of the problems stemmed from the underlying cache being inaccessible 
> and not being able to override {{initCache}} properly.
> Anyway, I've had a stab at refactoring AuthCache with the following 
> improvements:
> # Make it possible to extend and override all necessary methods (initCache, 
> init, validate)
> # Make it possible to specify a {{CacheLoader}} rather than just a 
> {{Function}}, allowing you to have a get/load that throws exceptions.
> # Use AuthCache on its own rather than extending it for each use case 
> ({{invalidate(K)}} moved to be part of MBean)
> # Provided a builder that uses sane defaults so we don't have unnecessary 
> repeated code everywhere
> The refactor made all the extensions of AuthCache unnecessary, so I've 
> simplified those cases to use AuthCache and removed any classes extending 
> AuthCache. I also removed some noop compatibility classes that were marked to 
> be removed in 4.0.
> Also added some tests in AuthCacheTest.
> |[trunk|https://github.com/apache/cassandra/compare/trunk...kgreav:authcache]|
> |[utests|https://circleci.com/gh/kgreav/cassandra/206]|
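
Point 2 above can be illustrated with a minimal stand-in (plain Java with invented names, not Cassandra's AuthCache or Caffeine's API): a {{Function}} load path cannot throw checked exceptions, while a {{CacheLoader}}-style interface can, so {{get()}} can surface lookup failures instead of wrapping them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of point 2 above (illustrative only, not Cassandra's AuthCache):
// a Function<K, V> cannot throw checked exceptions from its load path,
// while a CacheLoader-style interface can, letting callers of get()
// surface auth lookup failures instead of wrapping them.
public class LoaderSketch
{
    // Hypothetical loader interface analogous to a throwing CacheLoader
    interface ThrowingLoader<K, V>
    {
        V load(K key) throws Exception;
    }

    static class MiniCache<K, V>
    {
        private final Map<K, V> map = new ConcurrentHashMap<>();
        private final ThrowingLoader<K, V> loader;

        MiniCache(ThrowingLoader<K, V> loader) { this.loader = loader; }

        V get(K key) throws Exception
        {
            V v = map.get(key);
            if (v == null)
            {
                v = loader.load(key);   // may throw, unlike Function.apply
                map.put(key, v);
            }
            return v;
        }
    }
}
```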






cassandra git commit: Make AuthCache easier to subclass

2018-09-01 Thread samt
Repository: cassandra
Updated Branches:
  refs/heads/trunk cc12665bb -> 960174da6


Make AuthCache easier to subclass

Patch by Kurt Greaves; reviewed by Sam Tunnicliffe for CASSANDRA-14662


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/960174da
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/960174da
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/960174da

Branch: refs/heads/trunk
Commit: 960174da67eb6008c73340e61700ea34ec550a12
Parents: cc12665
Author: kurt 
Authored: Sat Sep 1 21:43:58 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Sat Sep 1 21:46:11 2018 +0100

--
 CHANGES.txt |   1 +
 .../org/apache/cassandra/auth/AuthCache.java| 115 +++-
 .../apache/cassandra/auth/AuthCacheMBean.java   |   4 +-
 .../apache/cassandra/auth/NetworkAuthCache.java |   2 +-
 .../cassandra/auth/PasswordAuthenticator.java   |   2 +-
 .../apache/cassandra/auth/PermissionsCache.java |   2 +-
 .../cassandra/auth/PermissionsCacheMBean.java   |  26 
 .../org/apache/cassandra/auth/RolesCache.java   |   2 +-
 .../apache/cassandra/auth/RolesCacheMBean.java  |  26 
 .../apache/cassandra/auth/AuthCacheTest.java| 137 +++
 10 files changed, 229 insertions(+), 88 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/960174da/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a7468f4..aca31fe 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Make AuthCache more easily extendable (CASSANDRA-14662)
  * Extend RolesCache to include detailed role info (CASSANDRA-14497)
  * Add fqltool compare (CASSANDRA-14619)
  * Add fqltool replay (CASSANDRA-14618)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/960174da/src/java/org/apache/cassandra/auth/AuthCache.java
--
diff --git a/src/java/org/apache/cassandra/auth/AuthCache.java 
b/src/java/org/apache/cassandra/auth/AuthCache.java
index d6ff0b0..4f36a63 100644
--- a/src/java/org/apache/cassandra/auth/AuthCache.java
+++ b/src/java/org/apache/cassandra/auth/AuthCache.java
@@ -35,24 +35,40 @@ import com.github.benmanes.caffeine.cache.LoadingCache;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-public class AuthCache<K, V> implements AuthCacheMBean
+import static com.google.common.base.Preconditions.checkNotNull;
+
+public class AuthCache<K, V> implements AuthCacheMBean
 {
 private static final Logger logger = 
LoggerFactory.getLogger(AuthCache.class);
 
 private static final String MBEAN_NAME_BASE = 
"org.apache.cassandra.auth:type=";
 
-private volatile LoadingCache<K, V> cache;
-
-private final String name;
-private final IntConsumer setValidityDelegate;
-private final IntSupplier getValidityDelegate;
-private final IntConsumer setUpdateIntervalDelegate;
-private final IntSupplier getUpdateIntervalDelegate;
-private final IntConsumer setMaxEntriesDelegate;
-private final IntSupplier getMaxEntriesDelegate;
-private final Function<K, V> loadFunction;
-private final BooleanSupplier enableCache;
-
+/**
+ * Underlying cache. LoadingCache will call underlying load function on 
{@link #get} if key is not present
+ */
+protected volatile LoadingCache<K, V> cache;
+
+private String name;
+private IntConsumer setValidityDelegate;
+private IntSupplier getValidityDelegate;
+private IntConsumer setUpdateIntervalDelegate;
+private IntSupplier getUpdateIntervalDelegate;
+private IntConsumer setMaxEntriesDelegate;
+private IntSupplier getMaxEntriesDelegate;
+private Function<K, V> loadFunction;
+private BooleanSupplier enableCache;
+
+/**
+ * @param name Used for MBean
+ * @param setValidityDelegate Used to set cache validity period. See 
{@link Policy#expireAfterWrite()}
+ * @param getValidityDelegate Getter for validity period
+ * @param setUpdateIntervalDelegate Used to set cache update interval. See 
{@link Policy#refreshAfterWrite()}
+ * @param getUpdateIntervalDelegate Getter for update interval
+ * @param setMaxEntriesDelegate Used to set max # entries in cache. See 
{@link com.github.benmanes.caffeine.cache.Policy.Eviction#setMaximum(long)}
+ * @param getMaxEntriesDelegate Getter for max entries.
+ * @param loadFunction Function to load the cache. Called on {@link 
#get(Object)}
+ * @param cacheEnabledDelegate Used to determine if cache is enabled.
+ */
 protected AuthCache(String name,
 IntConsumer setValidityDelegate,
 IntSupplier getValidityDelegate,
@@ -61,23 +77,26 @@ public class AuthCache implements AuthCacheMBean
 

[jira] [Updated] (CASSANDRA-14497) Add Role login cache

2018-09-01 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14497:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

CI looks good (only a couple of dtest failures, previously failing on trunk)
||utests||dtests||
|[utests|https://circleci.com/gh/beobal/cassandra/375]|[vnodes|https://circleci.com/gh/beobal/cassandra/376]
 / [no vnodes|https://circleci.com/gh/beobal/cassandra/374]|

committed to trunk as {{cc12665bb7645d17ba70edcf952ee6a1ea63127b}}

> Add Role login cache
> 
>
> Key: CASSANDRA-14497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Jay Zhuang
>Assignee: Sam Tunnicliffe
>Priority: Major
>  Labels: security
> Fix For: 4.0
>
>
> The 
> [{{ClientState.login()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ClientState.java#L313]
>  function is used for all auth message: 
> [{{AuthResponse.java:82}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/AuthResponse.java#L82].
>  But the 
> [{{role.canLogin}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L521]
>  information is not cached. So it hits the database every time: 
> [{{CassandraRoleManager.java:407}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L407].
>  For a cluster with lots of new connections, it's causing a performance issue. 
> The mitigation for us is to increase the {{system_auth}} replication factor 
> to match the number of nodes, so 
> [{{local_one}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L488]
>  would be very cheap. The P99 dropped immediately, but I don't think it is 
> a good solution.
> I would propose to add {{Role.canLogin}} to the RolesCache to improve the 
> auth performance.
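
A minimal sketch of the proposed caching (invented names, not Cassandra's API): resolving {{canLogin}} through a cache means repeated logins for the same role hit the role store only once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative-only sketch of the proposal: memoize canLogin alongside other
// role attributes so login does not query the role store on every connection.
// RoleInfo and fetchRole are hypothetical names, not Cassandra's API.
public class RoleLoginCacheSketch
{
    static class RoleInfo
    {
        final boolean canLogin;
        RoleInfo(boolean canLogin) { this.canLogin = canLogin; }
    }

    private final Map<String, RoleInfo> cache = new ConcurrentHashMap<>();
    private final Function<String, RoleInfo> fetchRole; // the expensive DB lookup
    final AtomicInteger dbHits = new AtomicInteger();   // counts actual lookups

    RoleLoginCacheSketch(Function<String, RoleInfo> fetchRole)
    {
        this.fetchRole = fetchRole;
    }

    boolean canLogin(String role)
    {
        // only the first lookup per role reaches the underlying store
        return cache.computeIfAbsent(role, r -> {
            dbHits.incrementAndGet();
            return fetchRole.apply(r);
        }).canLogin;
    }
}
```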






cassandra git commit: Improve RolesCache to include detailed role info

2018-09-01 Thread samt
Repository: cassandra
Updated Branches:
  refs/heads/trunk f83bd5ac2 -> cc12665bb


Improve RolesCache to include detailed role info

Patch by Sam Tunnicliffe; reviewed by Jay Zhuang


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cc12665b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cc12665b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cc12665b

Branch: refs/heads/trunk
Commit: cc12665bb7645d17ba70edcf952ee6a1ea63127b
Parents: f83bd5a
Author: Sam Tunnicliffe 
Authored: Wed Apr 25 12:11:34 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Sat Sep 1 21:12:52 2018 +0100

--
 CHANGES.txt |   1 +
 .../org/apache/cassandra/auth/AuthCache.java|  13 ++
 .../cassandra/auth/AuthenticatedUser.java   |  31 ++-
 .../cassandra/auth/CassandraAuthorizer.java |   7 +-
 .../cassandra/auth/CassandraRoleManager.java| 230 ---
 .../org/apache/cassandra/auth/IRoleManager.java |  18 ++
 src/java/org/apache/cassandra/auth/Role.java|  72 ++
 src/java/org/apache/cassandra/auth/Roles.java   | 132 ++-
 .../org/apache/cassandra/auth/RolesCache.java   |  34 ++-
 .../apache/cassandra/service/ClientState.java   |   5 +-
 .../auth/CassandraNetworkAuthorizerTest.java|  36 +--
 .../auth/CassandraRoleManagerTest.java  |  88 +++
 .../apache/cassandra/auth/RoleTestUtils.java|  85 +++
 .../org/apache/cassandra/auth/RolesTest.java|  95 
 14 files changed, 669 insertions(+), 178 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc12665b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 1ba9975..a7468f4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Extend RolesCache to include detailed role info (CASSANDRA-14497)
  * Add fqltool compare (CASSANDRA-14619)
  * Add fqltool replay (CASSANDRA-14618)
  * Log keyspace in full query log (CASSANDRA-14656)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc12665b/src/java/org/apache/cassandra/auth/AuthCache.java
--
diff --git a/src/java/org/apache/cassandra/auth/AuthCache.java 
b/src/java/org/apache/cassandra/auth/AuthCache.java
index 3954230..d6ff0b0 100644
--- a/src/java/org/apache/cassandra/auth/AuthCache.java
+++ b/src/java/org/apache/cassandra/auth/AuthCache.java
@@ -89,6 +89,19 @@ public class AuthCache implements AuthCacheMBean
 }
 }
 
+protected void unregisterMBean()
+{
+try
+{
+MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
+mbs.unregisterMBean(getObjectName());
+}
+catch (Exception e)
+{
+logger.warn("Error unregistering {} cache mbean", name, e);
+}
+}
+
 protected ObjectName getObjectName() throws MalformedObjectNameException
 {
 return new ObjectName(MBEAN_NAME_BASE + name);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc12665b/src/java/org/apache/cassandra/auth/AuthenticatedUser.java
--
diff --git a/src/java/org/apache/cassandra/auth/AuthenticatedUser.java 
b/src/java/org/apache/cassandra/auth/AuthenticatedUser.java
index 3d7c078..9f22bea 100644
--- a/src/java/org/apache/cassandra/auth/AuthenticatedUser.java
+++ b/src/java/org/apache/cassandra/auth/AuthenticatedUser.java
@@ -94,18 +94,46 @@ public class AuthenticatedUser
 /**
  * Get the roles that have been granted to the user via the IRoleManager
  *
- * @return a list of roles that have been granted to the user
+ * @return a set of identifiers for the roles that have been granted to 
the user
  */
 public Set<RoleResource> getRoles()
 {
 return Roles.getRoles(role);
 }
 
+/**
+ * Get the detailed info on roles granted to the user via IRoleManager
+ *
+ * @return a set of Role objects detailing the roles granted to the user
+ */
+public Set<Role> getRoleDetails()
+{
+   return Roles.getRoleDetails(role);
+}
+
 public Set<Permission> getPermissions(IResource resource)
 {
 return permissionsCache.getPermissions(this, resource);
 }
 
+/**
+ * Check whether this user has login privileges.
+ * LOGIN is not inherited from granted roles, so must be directly granted 
to the primary role for this user
+ *
+ * @return true if the user is permitted to login, false otherwise.
+ */
+public boolean canLogin()
+{
+return Roles.canLogin(getPrimaryRole());
+}
+
+/**
+ * Verify that there is no DC level restriction on this user accessing 
this node.
+ * Further extends 

[jira] [Comment Edited] (CASSANDRA-13304) Add checksumming to the native protocol

2018-09-01 Thread Jordan West (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599314#comment-16599314
 ] 

Jordan West edited comment on CASSANDRA-13304 at 9/1/18 3:22 PM:
-

[~beobal] I'm still going over the changes, but I wanted to post this bug now 
before I finish. The issue is that the checksum over the lengths is only 
calculated over the least-significant byte of the compressed and uncompressed 
lengths. This means that if we introduce corruption in the three most 
significant bytes we won't catch it, which leads to a host of different bugs 
(index out of bounds exceptions, LZ4 decompression issues, etc.). I pushed a 
patch 
[here|https://github.com/jrwest/cassandra/commit/e57a2508c26f05efb826a7f4342964fa6d6691bd]
 with a new test that catches the issue. I left in the fixed seed for now so 
it's easy for you to run and see the failing example in a debugger (it's the 
second example that fails). I've pasted a stack trace of the failure with that 
seed below. The generated example has a single-byte input and introduces 
corruption into the 4th byte in the stream (the second most significant byte of 
the first length in the stream). This leads to a case where the checksums match, 
but when we go to read the data we read past the total length of the buffer.
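The failure mode can be illustrated in isolation. The sketch below is not the actual ChecksummingTransformer code; CRC32 stands in for the protocol's configurable checksum (the falsifying example below happened to use Adler32), and the corrupted value {{16711681}} ({{0x00FF0001}}) is exactly a length of 1 with a high byte flipped, as in the stack trace.

```java
import java.util.zip.CRC32;

// Sketch of the length-checksum bug: checksumming only the low byte of
// each length lets corruption in the three high bytes pass undetected.
public class LengthChecksumDemo {
    // Buggy: only the least-significant byte of each length is checksummed.
    static long buggyChecksum(int compressedLen, int uncompressedLen) {
        CRC32 crc = new CRC32();
        crc.update(compressedLen & 0xFF);
        crc.update(uncompressedLen & 0xFF);
        return crc.getValue();
    }

    // Fixed: all four bytes of each length are checksummed.
    static long fullChecksum(int compressedLen, int uncompressedLen) {
        CRC32 crc = new CRC32();
        for (int len : new int[] { compressedLen, uncompressedLen })
            for (int shift = 24; shift >= 0; shift -= 8)
                crc.update((len >>> shift) & 0xFF);
        return crc.getValue();
    }

    public static void main(String[] args) {
        int length = 1;                        // single-byte payload
        int corrupted = length | (0xFF << 16); // 16711681: a high byte flipped
        // Buggy checksum cannot distinguish the corrupted length...
        System.out.println(buggyChecksum(length, length) == buggyChecksum(corrupted, length)); // true
        // ...while checksumming all four bytes catches the corruption.
        System.out.println(fullChecksum(length, length) != fullChecksum(corrupted, length));   // true
    }
}
```

So the reader trusts the matching checksum and then tries to read 16711681 bytes from a 15-byte buffer, producing the IndexOutOfBoundsException.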

A couple other comments while I'm here. All minor:
 * EDIT (added this comment this morning): In the test failure below we also 
hit the buffer resize condition, {{ret.writableBytes() < (CHUNK_HEADER_OVERHEAD 
+ toWrite)}}. This is because we don’t account for {{CHUNK_HEADER_OVERHEAD}} 
when allocating the buffer, and by the time we check, we have already written 
some parts of the chunk header. While this test is an edge case (single-byte 
input), it does lead to us allocating a buffer 3.6x larger than we need. 
 * [~djoshi3] and I were reviewing the bug above before I posted, and we were 
thinking it would be nice to refactor 
{{ChecksummingTransformer#transformInbound/transformOutbound}}. They are a bit 
large/unwieldy right now. We can open a JIRA to address this later if you 
prefer.
 * Re: the {{roundTripZeroLength}} property. This is mostly covered by the 
property I already added, although this one makes it more likely to generate a few 
cases. If you want to keep it, I would recommend setting {{withExamples}} and 
using something small, like 10 or 20 examples (since the state space is small).
 * The {{System.out.println}} I added in {{roundTripSafetyProperty}} should be 
removed.
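The allocation issue in the first bullet amounts to reserving the header bytes up front. A hedged sketch, with an illustrative header layout (two lengths plus two checksums) rather than the actual constant:

```java
// Hypothetical sketch: size the per-chunk output buffer to include the
// chunk header, so the resize branch is never taken for a single chunk.
public class ChunkBufferSizing {
    // Illustrative layout: compressed + uncompressed length, plus a
    // checksum over the lengths and a checksum over the payload.
    static final int CHUNK_HEADER_OVERHEAD = Integer.BYTES * 4;

    // Reserving header + payload at allocation time means
    // writableBytes() < CHUNK_HEADER_OVERHEAD + toWrite cannot trigger
    // mid-chunk, even for tiny single-byte inputs.
    static int bufferSizeFor(int toWrite) {
        return CHUNK_HEADER_OVERHEAD + toWrite;
    }
}
```

For the single-byte case this allocates 17 bytes once, instead of allocating a payload-sized buffer and then resizing after part of the header has already been written.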

Stack Trace:
{code:java}
java.lang.AssertionError: Property falsified after 2 example(s) 
Smallest found falsifying value(s) :- \{(c,3), 0, null, Adler32}

Cause was :-
 java.lang.IndexOutOfBoundsException: readerIndex(10) + length(16711681) 
exceeds writerIndex(15): UnpooledHeapByteBuf(ridx: 10, widx: 15, cap: 54/54)
 at 
io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1401)
 at 
io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1388)
 at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:870)
 at 
org.apache.cassandra.transport.frame.checksum.ChecksummingTransformer.transformInbound(ChecksummingTransformer.java:289)
 at 
org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest.roundTripWithCorruption(ChecksummingTransformerTest.java:106)
 at 
org.quicktheories.dsl.TheoryBuilder4.lambda$checkAssert$9(TheoryBuilder4.java:163)
 at org.quicktheories.dsl.TheoryBuilder4.lambda$check$8(TheoryBuilder4.java:151)
 at org.quicktheories.impl.Property.tryFalsification(Property.java:23)
 at org.quicktheories.impl.Core.shrink(Core.java:111)
 at org.quicktheories.impl.Core.run(Core.java:39)
 at org.quicktheories.impl.TheoryRunner.check(TheoryRunner.java:35)
 at org.quicktheories.dsl.TheoryBuilder4.check(TheoryBuilder4.java:150)
 at org.quicktheories.dsl.TheoryBuilder4.checkAssert(TheoryBuilder4.java:162)
 at 
org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest.corruptionCausesFailure(ChecksummingTransformerTest.java:87)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 

[jira] [Updated] (CASSANDRA-14672) After deleting data in 3.11.3, reads fail with "open marker and close marker have different deletion times"

2018-09-01 Thread Spiros Ioannou (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Spiros Ioannou updated CASSANDRA-14672:
---
Summary: After deleting data in 3.11.3, reads fail with "open marker and 
close marker have different deletion times"  (was: After deleting data in 
3.11.3, reads fail: "open marker and close marker have different deletion 
times")

> After deleting data in 3.11.3, reads fail with "open marker and close marker 
> have different deletion times"
> ---
>
> Key: CASSANDRA-14672
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14672
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 7, GCE, 9 nodes, 4TB disk/~2TB full each, level 
> compaction, timeseries data
>Reporter: Spiros Ioannou
>Priority: Critical
>
> We were on 3.11.0, then upgraded to 3.11.3 last week. We routinely perform 
> deletions such as the one described below. After upgrading, we ran the following 
> deletion query:
>  
> {code:java}
> DELETE FROM measurement_events_dbl WHERE measurement_source_id IN ( 
> 9df798a2-6337-11e8-b52b-42010afa015a,  9df7717e-6337-11e8-b52b-42010afa015a, 
> a08b8042-6337-11e8-b52b-42010afa015a, a08e52cc-6337-11e8-b52b-42010afa015a, 
> a08e6654-6337-11e8-b52b-42010afa015a, a08e6104-6337-11e8-b52b-42010afa015a, 
> a08e6c76-6337-11e8-b52b-42010afa015a, a08e5a9c-6337-11e8-b52b-42010afa015a, 
> a08bcc50-6337-11e8-b52b-42010afa015a) AND year IN (2018) AND measurement_time 
> >= '2018-07-19 04:00:00'{code}
>  
> Immediately after that, trying to read the last value produces an error:
> {code:java}
> select * FROM measurement_events_dbl WHERE measurement_source_id = 
> a08b8042-6337-11e8-b52b-42010afa015a AND year IN (2018) order by 
> measurement_time desc limit 1;
> ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] 
> message="Operation failed - received 0 responses and 2 failures" 
> info={'failures': 2, 'received_responses': 0, 'required_responses': 1, 
> 'consistency': 'ONE'}{code}
>  
> And the following exception: 
> {noformat}
> WARN [ReadStage-4] 2018-08-29 06:59:53,505 
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
> Thread[ReadStage-4,5,main]: {}
> java.lang.RuntimeException: java.lang.IllegalStateException: 
> UnfilteredRowIterator for pvpms_mevents.measurement_events_dbl has an illegal 
> RT bounds sequence: open marker and close marker have different deletion times
>  at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2601)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_181]
>  at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  [apache-cassandra-3.11.3.jar:3.11.3]
>  at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.11.3.jar:3.11.3]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: java.lang.IllegalStateException: UnfilteredRowIterator for 
> pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence: open 
> marker and close marker have different deletion times
>  at 
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.applyToMarker(RTBoundValidator.java:81)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:148) 
> ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:136)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:187)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:180)
>  

[jira] [Updated] (CASSANDRA-14672) After deleting data in 3.11.3, reads fail with "open marker and close marker have different deletion times"

2018-09-01 Thread Spiros Ioannou (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Spiros Ioannou updated CASSANDRA-14672:
---
Priority: Blocker  (was: Critical)

> After deleting data in 3.11.3, reads fail with "open marker and close marker 
> have different deletion times"
> ---
>
> Key: CASSANDRA-14672
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14672
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 7, GCE, 9 nodes, 4TB disk/~2TB full each, level 
> compaction, timeseries data
>Reporter: Spiros Ioannou
>Priority: Blocker
>
> We had 3.11.0, then we upgraded to 3.11.3 last week. We routinely perform 
> deletions as the one described below. After upgrading we run the following 
> deletion query:
>  
> {code:java}
> DELETE FROM measurement_events_dbl WHERE measurement_source_id IN ( 
> 9df798a2-6337-11e8-b52b-42010afa015a,  9df7717e-6337-11e8-b52b-42010afa015a, 
> a08b8042-6337-11e8-b52b-42010afa015a, a08e52cc-6337-11e8-b52b-42010afa015a, 
> a08e6654-6337-11e8-b52b-42010afa015a, a08e6104-6337-11e8-b52b-42010afa015a, 
> a08e6c76-6337-11e8-b52b-42010afa015a, a08e5a9c-6337-11e8-b52b-42010afa015a, 
> a08bcc50-6337-11e8-b52b-42010afa015a) AND year IN (2018) AND measurement_time 
> >= '2018-07-19 04:00:00'{code}
>  
> Immediately after that, trying to read the last value produces an error:
> {code:java}
> select * FROM measurement_events_dbl WHERE measurement_source_id = 
> a08b8042-6337-11e8-b52b-42010afa015a AND year IN (2018) order by 
> measurement_time desc limit 1;
> ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] 
> message="Operation failed - received 0 responses and 2 failures" 
> info={'failures': 2, 'received_responses': 0, 'required_responses': 1, 
> 'consistency': 'ONE'}{code}
>  
> And the following exception: 
> {noformat}
> WARN [ReadStage-4] 2018-08-29 06:59:53,505 
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
> Thread[ReadStage-4,5,main]: {}
> java.lang.RuntimeException: java.lang.IllegalStateException: 
> UnfilteredRowIterator for pvpms_mevents.measurement_events_dbl has an illegal 
> RT bounds sequence: open marker and close marker have different deletion times
>  at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2601)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_181]
>  at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  [apache-cassandra-3.11.3.jar:3.11.3]
>  at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.11.3.jar:3.11.3]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: java.lang.IllegalStateException: UnfilteredRowIterator for 
> pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence: open 
> marker and close marker have different deletion times
>  at 
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.applyToMarker(RTBoundValidator.java:81)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:148) 
> ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:136)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:187)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:180)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:176)
>  ~[apache-cassandra-3.11.3.jar:3.11.3]
>  at 
> 

[jira] [Commented] (CASSANDRA-14655) Upgrade C* to use latest guava (26.0)

2018-09-01 Thread Sumanth Pasupuleti (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599593#comment-16599593
 ] 

Sumanth Pasupuleti commented on CASSANDRA-14655:


Thanks for the suggestions [~andrew.tolbert]. All UTs pass now, with a change 
to AuditLogManager, which I need to get reviewed (along with other changes).

[Github|https://github.com/sumanth-pasupuleti/cassandra/tree/guava_26_trunk_1]
 [CircleCI Build|https://circleci.com/gh/sumanth-pasupuleti/cassandra/108]
 [CircleCI UTs|https://circleci.com/gh/sumanth-pasupuleti/cassandra/109]

> Upgrade C* to use latest guava (26.0)
> -
>
> Key: CASSANDRA-14655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14655
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Minor
> Fix For: 4.x
>
>
> C* currently uses guava 23.3. This JIRA is about changing C* to use latest 
> guava (26.0). Originated from a discussion in the mailing list.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14408) Transient Replication: Incremental & Validation repair handling of transient replicas

2018-09-01 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599581#comment-16599581
 ] 

Marcus Eriksson commented on CASSANDRA-14408:
-

+1

> Transient Replication: Incremental & Validation repair handling of transient 
> replicas
> -
>
> Key: CASSANDRA-14408
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14408
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Repair
>Reporter: Ariel Weisberg
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>
> At transient replicas anti-compaction shouldn't output any data for transient 
> ranges as the data will be dropped after repair.
> Transient replicas should also never have data streamed to them.






[jira] [Updated] (CASSANDRA-14619) Create fqltool compare command

2018-09-01 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14619:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

fixed the PR comments and the int32/int16 issue, thanks for noticing!

this patch also moves everything fqltool-related into tools/ to avoid adding 
new java driver dependencies to the main code base

committed as {{f83bd5ac2bbc6755213a6ad0675e7e5400c79670}}
test results (includes CASSANDRA-14656 and CASSANDRA-14618)
https://circleci.com/gh/krummas/cassandra/tree/marcuse%2Ffql_squash
thanks!

> Create fqltool compare command
> --
>
> Key: CASSANDRA-14619
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14619
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
>  Labels: fqltool
> Fix For: 4.x
>
>
> We need a {{fqltool compare}} command that can take the recorded runs from 
> CASSANDRA-14618 and compare them; it should output any differences and 
> potentially all queries against the mismatching partition up until the 
> mismatch.
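The row-by-row comparison described in the issue can be sketched roughly as follows. This is an illustrative, self-contained Java sketch, not the actual fqltool implementation: rows are modelled as plain strings here, whereas the real tool compares `ResultHandler.ComparableRow` objects read back from stored result files.

```java
import java.util.List;
import java.util.Objects;

// Illustrative sketch of comparing two recorded runs row by row.
public class CompareSketch
{
    /** @return index of the first mismatching row, or -1 if the runs agree */
    public static int firstMismatch(List<String> run1, List<String> run2)
    {
        int common = Math.min(run1.size(), run2.size());
        for (int i = 0; i < common; i++)
            if (!Objects.equals(run1.get(i), run2.get(i)))
                return i;
        // a length difference is a mismatch at the end of the shorter run
        return run1.size() == run2.size() ? -1 : common;
    }

    public static void main(String[] args)
    {
        // prints 1: the runs diverge at the second row
        System.out.println(firstMismatch(List.of("a", "b", "c"), List.of("a", "x", "c")));
    }
}
```

Once the first mismatch is located, the tool can then report all queries against the mismatching partition up to that point, as the description suggests.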






[jira] [Updated] (CASSANDRA-14656) Full query log needs to log the keyspace

2018-09-01 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14656:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

committed as {{46c33f324e5f3373d85838f364aece7ca6a6189c}}
test results (includes CASSANDRA-14618 and CASSANDRA-14619)
https://circleci.com/gh/krummas/cassandra/tree/marcuse%2Ffql_squash

thanks!

> Full query log needs to log the keyspace
> 
>
> Key: CASSANDRA-14656
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14656
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
>  Labels: fqltool
> Fix For: 4.0
>
>
> If the full query log is enabled and a set of clients have already executed 
> "USE <keyspace>", we can't figure out which keyspace the following queries are 
> executed against.
> We need this for CASSANDRA-14618






[jira] [Updated] (CASSANDRA-14618) Create fqltool replay command

2018-09-01 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14618:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

committed as {{62ffb7723917768c38c9e012710c6dce509191c1}}
test results (includes CASSANDRA-14656 and CASSANDRA-14619)
https://circleci.com/gh/krummas/cassandra/tree/marcuse%2Ffql_squash

thanks!

> Create fqltool replay command
> -
>
> Key: CASSANDRA-14618
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14618
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
>  Labels: fqltool
> Fix For: 4.0
>
>
> Make it possible to replay the full query logs from CASSANDRA-13983 against 
> one or several clusters. The goal is to be able to compare different runs of 
> production traffic against different versions/configurations of Cassandra.
> * It should be possible to take logs from several machines and replay them in 
> "order" by the timestamps recorded
> * Record the results from each run to be able to compare different runs 
> (against different clusters/versions/etc)
> * If {{fqltool replay}} is run against 2 or more clusters, the results should 
> be compared as we go
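Replaying logs from several machines "in order" boils down to a k-way merge of per-node logs by recorded timestamp. Below is a minimal stand-alone Java sketch of just that ordering step; the names (`LoggedQuery`, `merge`) are illustrative, and the real implementation instead merges `FQLQueryIterator`s over Chronicle queues with Cassandra's `MergeIterator`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch: merge per-node query logs (each already sorted by
// timestamp) into one globally ordered stream using a heap of cursors.
public class MergeByTimestamp
{
    public record LoggedQuery(long timestampMillis, String query) {}

    public static List<LoggedQuery> merge(List<List<LoggedQuery>> perNodeLogs)
    {
        // one cursor per non-empty log, ordered by the timestamp it points at
        record Cursor(List<LoggedQuery> log, int pos) {}
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparingLong((Cursor c) -> c.log().get(c.pos()).timestampMillis()));
        for (List<LoggedQuery> log : perNodeLogs)
            if (!log.isEmpty())
                heap.add(new Cursor(log, 0));

        List<LoggedQuery> merged = new ArrayList<>();
        while (!heap.isEmpty())
        {
            Cursor c = heap.poll();
            merged.add(c.log().get(c.pos()));
            if (c.pos() + 1 < c.log().size())
                heap.add(new Cursor(c.log(), c.pos() + 1));
        }
        return merged;
    }
}
```

Each poll takes the globally earliest unreplayed query, so the merged stream interleaves the per-machine logs by timestamp, which is what replaying "in order" requires.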






[2/7] cassandra git commit: Add fqltool replay

2018-09-01 Thread marcuse
http://git-wip-us.apache.org/repos/asf/cassandra/blob/62ffb772/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java
--
diff --git a/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java 
b/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java
new file mode 100644
index 000..043ead8
--- /dev/null
+++ b/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.tools.fqltool.commands;
+
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.function.Predicate;
+import java.util.stream.Collectors;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import io.airlift.airline.Arguments;
+import io.airlift.airline.Command;
+import io.airlift.airline.Option;
+import net.openhft.chronicle.core.io.Closeable;
+import net.openhft.chronicle.queue.ChronicleQueue;
+import net.openhft.chronicle.queue.ChronicleQueueBuilder;
+
+import org.apache.cassandra.tools.fqltool.FQLQuery;
+import org.apache.cassandra.tools.fqltool.FQLQueryIterator;
+import org.apache.cassandra.tools.fqltool.QueryReplayer;
+import org.apache.cassandra.utils.AbstractIterator;
+import org.apache.cassandra.utils.MergeIterator;
+
+/**
+ * replay the contents of a list of paths containing full query logs
+ */
+@Command(name = "replay", description = "Replay full query logs")
+public class Replay implements Runnable
+{
+@Arguments(usage = " [...]", description = "Paths 
containing the full query logs to replay.", required = true)
+private List<String> arguments = new ArrayList<>();
+
+@Option(title = "target", name = {"--target"}, description = "Hosts to 
replay the logs to, can be repeated to replay to more hosts.")
+private List<String> targetHosts;
+
+@Option(title = "results", name = { "--results"}, description = "Where to 
store the results of the queries, this should be a directory. Leave this option 
out to avoid storing results.")
+private String resultPath;
+
+@Option(title = "keyspace", name = { "--keyspace"}, description = "Only 
replay queries against this keyspace and queries without keyspace set.")
+private String keyspace;
+
+@Option(title = "debug", name = {"--debug"}, description = "Debug mode, 
print all queries executed.")
+private boolean debug;
+
+@Option(title = "store_queries", name = {"--store-queries"}, description = 
"Path to store the queries executed. Stores queries in the same order as the 
result sets are in the result files. Requires --results")
+private String queryStorePath;
+
+@Override
+public void run()
+{
+try
+{
+List<File> resultPaths = null;
+if (resultPath != null)
+{
+File basePath = new File(resultPath);
+if (!basePath.exists() || !basePath.isDirectory())
+{
+System.err.println("The results path (" + basePath + ") 
should be an existing directory");
+System.exit(1);
+}
+resultPaths = targetHosts.stream().map(target -> new 
File(basePath, target)).collect(Collectors.toList());
+resultPaths.forEach(File::mkdir);
+}
+if (targetHosts.size() < 1)
+{
+System.err.println("You need to state at least one --target 
host to replay the query against");
+System.exit(1);
+}
+replay(keyspace, arguments, targetHosts, resultPaths, 
queryStorePath, debug);
+}
+catch (Exception e)
+{
+throw new RuntimeException(e);
+}
+}
+
+public static void replay(String keyspace, List<String> arguments, 
List<String> targetHosts, List<File> resultPaths, String queryStorePath, 
boolean debug)
+{
+int readAhead = 200; // how many fql queries should we read in to 
memory to be able to sort them?
+List<ChronicleQueue> readQueues = null;
+List<FQLQueryIterator> iterators = null;
+List<Predicate<FQLQuery>> filters = new ArrayList<>();
+
+if (keyspace != null)
+filters.add(fqlQuery -> 

[4/7] cassandra git commit: Add fqltool compare

2018-09-01 Thread marcuse
http://git-wip-us.apache.org/repos/asf/cassandra/blob/f83bd5ac/tools/fqltool/test/unit/org/apache/cassandra/fqltool/FQLReplayTest.java
--
diff --git 
a/tools/fqltool/test/unit/org/apache/cassandra/fqltool/FQLReplayTest.java 
b/tools/fqltool/test/unit/org/apache/cassandra/fqltool/FQLReplayTest.java
new file mode 100644
index 000..61c8aa0
--- /dev/null
+++ b/tools/fqltool/test/unit/org/apache/cassandra/fqltool/FQLReplayTest.java
@@ -0,0 +1,675 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.fqltool;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Objects;
+import java.util.Random;
+import java.util.stream.Collectors;
+
+import com.google.common.collect.Lists;
+import org.junit.Test;
+
+import com.datastax.driver.core.CodecRegistry;
+import com.datastax.driver.core.SimpleStatement;
+import com.datastax.driver.core.Statement;
+import net.openhft.chronicle.queue.ChronicleQueue;
+import net.openhft.chronicle.queue.ChronicleQueueBuilder;
+import net.openhft.chronicle.queue.ExcerptAppender;
+import net.openhft.chronicle.queue.ExcerptTailer;
+import org.apache.cassandra.audit.FullQueryLogger;
+import org.apache.cassandra.cql3.QueryOptions;
+import org.apache.cassandra.cql3.statements.BatchStatement;
+import org.apache.cassandra.fqltool.commands.Compare;
+import org.apache.cassandra.fqltool.commands.Replay;
+import org.apache.cassandra.service.ClientState;
+import org.apache.cassandra.service.QueryState;
+import org.apache.cassandra.tools.Util;
+import org.apache.cassandra.utils.ByteBufferUtil;
+import org.apache.cassandra.utils.MergeIterator;
+import org.apache.cassandra.utils.Pair;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+
+public class FQLReplayTest
+{
+public FQLReplayTest()
+{
+Util.initDatabaseDescriptor();
+}
+
+@Test
+public void testOrderedReplay() throws IOException
+{
+File f = generateQueries(100, true);
+int queryCount = 0;
+try (ChronicleQueue queue = ChronicleQueueBuilder.single(f).build();
+ FQLQueryIterator iter = new 
FQLQueryIterator(queue.createTailer(), 101))
+{
+long last = -1;
+while (iter.hasNext())
+{
+FQLQuery q = iter.next();
+assertTrue(q.queryStartTime >= last);
+last = q.queryStartTime;
+queryCount++;
+}
+}
+assertEquals(100, queryCount);
+}
+
+@Test
+public void testQueryIterator() throws IOException
+{
+File f = generateQueries(100, false);
+int queryCount = 0;
+try (ChronicleQueue queue = ChronicleQueueBuilder.single(f).build();
+ FQLQueryIterator iter = new 
FQLQueryIterator(queue.createTailer(), 1))
+{
+long last = -1;
+while (iter.hasNext())
+{
+FQLQuery q = iter.next();
+assertTrue(q.queryStartTime >= last);
+last = q.queryStartTime;
+queryCount++;
+}
+}
+assertEquals(100, queryCount);
+}
+
+@Test
+public void testMergingIterator() throws IOException
+{
+File f = generateQueries(100, false);
+File f2 = generateQueries(100, false);
+int queryCount = 0;
+try (ChronicleQueue queue = ChronicleQueueBuilder.single(f).build();
+ ChronicleQueue queue2 = ChronicleQueueBuilder.single(f2).build();
+ FQLQueryIterator iter = new 
FQLQueryIterator(queue.createTailer(), 101);
+ FQLQueryIterator iter2 = new 
FQLQueryIterator(queue2.createTailer(), 101);
+ MergeIterator<FQLQuery, List<FQLQuery>> merger = 
MergeIterator.get(Lists.newArrayList(iter, iter2), FQLQuery::compareTo, new 

[6/7] cassandra git commit: Add fqltool compare

2018-09-01 Thread marcuse
http://git-wip-us.apache.org/repos/asf/cassandra/blob/f83bd5ac/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java
--
diff --git a/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java 
b/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java
deleted file mode 100644
index 043ead8..000
--- a/src/java/org/apache/cassandra/tools/fqltool/commands/Replay.java
+++ /dev/null
@@ -1,148 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.cassandra.tools.fqltool.commands;
-
-
-import java.io.File;
-import java.util.ArrayList;
-import java.util.List;
-import java.util.function.Predicate;
-import java.util.stream.Collectors;
-
-import com.google.common.annotations.VisibleForTesting;
-
-import io.airlift.airline.Arguments;
-import io.airlift.airline.Command;
-import io.airlift.airline.Option;
-import net.openhft.chronicle.core.io.Closeable;
-import net.openhft.chronicle.queue.ChronicleQueue;
-import net.openhft.chronicle.queue.ChronicleQueueBuilder;
-
-import org.apache.cassandra.tools.fqltool.FQLQuery;
-import org.apache.cassandra.tools.fqltool.FQLQueryIterator;
-import org.apache.cassandra.tools.fqltool.QueryReplayer;
-import org.apache.cassandra.utils.AbstractIterator;
-import org.apache.cassandra.utils.MergeIterator;
-
-/**
- * replay the contents of a list of paths containing full query logs
- */
-@Command(name = "replay", description = "Replay full query logs")
-public class Replay implements Runnable
-{
-@Arguments(usage = " [...]", description = "Paths 
containing the full query logs to replay.", required = true)
-private List<String> arguments = new ArrayList<>();
-
-@Option(title = "target", name = {"--target"}, description = "Hosts to 
replay the logs to, can be repeated to replay to more hosts.")
-private List<String> targetHosts;
-
-@Option(title = "results", name = { "--results"}, description = "Where to 
store the results of the queries, this should be a directory. Leave this option 
out to avoid storing results.")
-private String resultPath;
-
-@Option(title = "keyspace", name = { "--keyspace"}, description = "Only 
replay queries against this keyspace and queries without keyspace set.")
-private String keyspace;
-
-@Option(title = "debug", name = {"--debug"}, description = "Debug mode, 
print all queries executed.")
-private boolean debug;
-
-@Option(title = "store_queries", name = {"--store-queries"}, description = 
"Path to store the queries executed. Stores queries in the same order as the 
result sets are in the result files. Requires --results")
-private String queryStorePath;
-
-@Override
-public void run()
-{
-try
-{
-List<File> resultPaths = null;
-if (resultPath != null)
-{
-File basePath = new File(resultPath);
-if (!basePath.exists() || !basePath.isDirectory())
-{
-System.err.println("The results path (" + basePath + ") 
should be an existing directory");
-System.exit(1);
-}
-resultPaths = targetHosts.stream().map(target -> new 
File(basePath, target)).collect(Collectors.toList());
-resultPaths.forEach(File::mkdir);
-}
-if (targetHosts.size() < 1)
-{
-System.err.println("You need to state at least one --target 
host to replay the query against");
-System.exit(1);
-}
-replay(keyspace, arguments, targetHosts, resultPaths, 
queryStorePath, debug);
-}
-catch (Exception e)
-{
-throw new RuntimeException(e);
-}
-}
-
-public static void replay(String keyspace, List<String> arguments, 
List<String> targetHosts, List<File> resultPaths, String queryStorePath, 
boolean debug)
-{
-int readAhead = 200; // how many fql queries should we read in to 
memory to be able to sort them?
-List<ChronicleQueue> readQueues = null;
-List<FQLQueryIterator> iterators = null;
-List<Predicate<FQLQuery>> filters = new ArrayList<>();
-
-if (keyspace != null)
-filters.add(fqlQuery -> 

[7/7] cassandra git commit: Add fqltool compare

2018-09-01 Thread marcuse
Add fqltool compare

Patch by marcuse; reviewed by Jason Brown and Dinesh Joshi for CASSANDRA-14619


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f83bd5ac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f83bd5ac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f83bd5ac

Branch: refs/heads/trunk
Commit: f83bd5ac2bbc6755213a6ad0675e7e5400c79670
Parents: 62ffb77
Author: Marcus Eriksson 
Authored: Fri Aug 24 14:41:09 2018 +0200
Committer: Marcus Eriksson 
Committed: Sat Sep 1 09:59:21 2018 +0200

--
 CHANGES.txt |   1 +
 bin/fqltool |  76 --
 bin/fqltool.bat |  36 -
 build.xml   |  67 +-
 ide/idea-iml-file.xml   |   2 +
 ide/idea/workspace.xml  |   1 +
 .../cassandra/tools/FullQueryLogTool.java   |  95 ---
 .../tools/fqltool/DriverResultSet.java  | 241 --
 .../cassandra/tools/fqltool/FQLQuery.java   | 278 ---
 .../tools/fqltool/FQLQueryIterator.java |  72 --
 .../cassandra/tools/fqltool/FQLQueryReader.java | 116 ---
 .../cassandra/tools/fqltool/QueryReplayer.java  | 167 
 .../tools/fqltool/ResultComparator.java | 116 ---
 .../cassandra/tools/fqltool/ResultHandler.java  | 124 ---
 .../cassandra/tools/fqltool/ResultStore.java| 142 
 .../cassandra/tools/fqltool/commands/Dump.java  | 325 
 .../tools/fqltool/commands/Replay.java  | 148 
 .../cassandra/tools/fqltool/FQLReplayTest.java  | 760 ---
 tools/bin/cassandra.in.bat  |   2 +-
 tools/bin/cassandra.in.sh   |   2 +-
 tools/bin/fqltool   |  76 ++
 tools/bin/fqltool.bat   |  36 +
 .../cassandra/fqltool/DriverResultSet.java  | 248 ++
 .../org/apache/cassandra/fqltool/FQLQuery.java  | 265 +++
 .../cassandra/fqltool/FQLQueryIterator.java |  72 ++
 .../cassandra/fqltool/FQLQueryReader.java   | 116 +++
 .../cassandra/fqltool/FullQueryLogTool.java |  99 +++
 .../apache/cassandra/fqltool/QueryReplayer.java | 172 +
 .../cassandra/fqltool/ResultComparator.java | 116 +++
 .../apache/cassandra/fqltool/ResultHandler.java | 133 
 .../apache/cassandra/fqltool/ResultStore.java   | 291 +++
 .../cassandra/fqltool/StoredResultSet.java  | 292 +++
 .../cassandra/fqltool/commands/Compare.java | 120 +++
 .../apache/cassandra/fqltool/commands/Dump.java | 325 
 .../cassandra/fqltool/commands/Replay.java  | 148 
 .../cassandra/fqltool/FQLCompareTest.java   | 131 
 .../apache/cassandra/fqltool/FQLReplayTest.java | 675 
 37 files changed, 3384 insertions(+), 2702 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f83bd5ac/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 1227337..1ba9975 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add fqltool compare (CASSANDRA-14619)
  * Add fqltool replay (CASSANDRA-14618)
  * Log keyspace in full query log (CASSANDRA-14656)
  * Transient Replication and Cheap Quorums (CASSANDRA-14404)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f83bd5ac/bin/fqltool
--
diff --git a/bin/fqltool b/bin/fqltool
deleted file mode 100755
index 15a0b20..000
--- a/bin/fqltool
+++ /dev/null
@@ -1,76 +0,0 @@
-#!/bin/sh
-
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
-# Locations (in order) to use when searching for an include file.
-for include in "`dirname "$0"`/cassandra.in.sh" \
-   "$HOME/.cassandra.in.sh" \
-   /usr/share/cassandra/cassandra.in.sh \
-   /usr/local/share/cassandra/cassandra.in.sh \
-   /opt/cassandra/cassandra.in.sh; 

[3/7] cassandra git commit: Add fqltool replay

2018-09-01 Thread marcuse
Add fqltool replay

Patch by marcuse; reviewed by Jason Brown and Dinesh Joshi for CASSANDRA-14618


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/62ffb772
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/62ffb772
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/62ffb772

Branch: refs/heads/trunk
Commit: 62ffb7723917768c38c9e012710c6dce509191c1
Parents: 46c33f3
Author: Marcus Eriksson 
Authored: Mon Aug 6 16:32:27 2018 +0200
Committer: Marcus Eriksson 
Committed: Sat Sep 1 08:35:54 2018 +0200

--
 CHANGES.txt |   1 +
 .../apache/cassandra/audit/FullQueryLogger.java |   5 +-
 .../apache/cassandra/service/QueryState.java|   8 +
 .../cassandra/tools/FullQueryLogTool.java   |   6 +-
 .../tools/fqltool/DriverResultSet.java  | 241 ++
 .../apache/cassandra/tools/fqltool/Dump.java| 325 
 .../cassandra/tools/fqltool/FQLQuery.java   | 278 +++
 .../tools/fqltool/FQLQueryIterator.java |  72 ++
 .../cassandra/tools/fqltool/FQLQueryReader.java | 116 +++
 .../cassandra/tools/fqltool/QueryReplayer.java  | 167 
 .../tools/fqltool/ResultComparator.java | 116 +++
 .../cassandra/tools/fqltool/ResultHandler.java  | 124 +++
 .../cassandra/tools/fqltool/ResultStore.java| 142 
 .../cassandra/tools/fqltool/commands/Dump.java  | 325 
 .../tools/fqltool/commands/Replay.java  | 148 
 .../cassandra/tools/fqltool/FQLReplayTest.java  | 760 +++
 16 files changed, 2505 insertions(+), 329 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/62ffb772/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cd2a14a..1227337 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add fqltool replay (CASSANDRA-14618)
  * Log keyspace in full query log (CASSANDRA-14656)
  * Transient Replication and Cheap Quorums (CASSANDRA-14404)
  * Log server-generated timestamp and nowInSeconds used by queries in FQL 
(CASSANDRA-14675)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62ffb772/src/java/org/apache/cassandra/audit/FullQueryLogger.java
--
diff --git a/src/java/org/apache/cassandra/audit/FullQueryLogger.java 
b/src/java/org/apache/cassandra/audit/FullQueryLogger.java
index c9f8447..9c1f472 100644
--- a/src/java/org/apache/cassandra/audit/FullQueryLogger.java
+++ b/src/java/org/apache/cassandra/audit/FullQueryLogger.java
@@ -23,6 +23,7 @@ import java.util.List;
 
 import javax.annotation.Nullable;
 
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Preconditions;
 import com.google.common.primitives.Ints;
 
@@ -151,7 +152,7 @@ public class FullQueryLogger extends BinLogAuditLogger 
implements IAuditLogger
 logRecord(wrappedQuery, binLog);
 }
 
-static class Query extends AbstractLogEntry
+public static class Query extends AbstractLogEntry
 {
 private final String query;
 
@@ -181,7 +182,7 @@ public class FullQueryLogger extends BinLogAuditLogger 
implements IAuditLogger
 }
 }
 
-static class Batch extends AbstractLogEntry
+public static class Batch extends AbstractLogEntry
 {
 private final int weight;
 private final BatchStatement.Type batchType;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62ffb772/src/java/org/apache/cassandra/service/QueryState.java
--
diff --git a/src/java/org/apache/cassandra/service/QueryState.java 
b/src/java/org/apache/cassandra/service/QueryState.java
index 2bd07ab..26f58bf 100644
--- a/src/java/org/apache/cassandra/service/QueryState.java
+++ b/src/java/org/apache/cassandra/service/QueryState.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.service;
 
 import java.net.InetAddress;
 
+import org.apache.cassandra.transport.ClientStat;
 import org.apache.cassandra.utils.FBUtilities;
 
 /**
@@ -39,6 +40,13 @@ public class QueryState
 this.clientState = clientState;
 }
 
+public QueryState(ClientState clientState, long timestamp, int 
nowInSeconds)
+{
+this(clientState);
+this.timestamp = timestamp;
+this.nowInSeconds = nowInSeconds;
+}
+
 /**
  * @return a QueryState object for internal C* calls (not limited by any 
kind of auth).
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62ffb772/src/java/org/apache/cassandra/tools/FullQueryLogTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/FullQueryLogTool.java 

[5/7] cassandra git commit: Add fqltool compare

2018-09-01 Thread marcuse
http://git-wip-us.apache.org/repos/asf/cassandra/blob/f83bd5ac/tools/fqltool/src/org/apache/cassandra/fqltool/ResultComparator.java
--
diff --git 
a/tools/fqltool/src/org/apache/cassandra/fqltool/ResultComparator.java 
b/tools/fqltool/src/org/apache/cassandra/fqltool/ResultComparator.java
new file mode 100644
index 000..d8d419a
--- /dev/null
+++ b/tools/fqltool/src/org/apache/cassandra/fqltool/ResultComparator.java
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.fqltool;
+
+
+import java.util.List;
+import java.util.Objects;
+import java.util.stream.Collectors;
+
+import com.google.common.collect.Streams;
+
+public class ResultComparator
+{
+/**
+ * Compares the rows in rows
+ * the row at position x in rows will have come from host at position x in 
targetHosts
+ */
+public boolean compareRows(List<String> targetHosts, FQLQuery query, 
List<ResultHandler.ComparableRow> rows)
+{
+if (rows.size() < 2 || rows.stream().allMatch(Objects::isNull))
+return true;
+
+if (rows.stream().anyMatch(Objects::isNull))
+{
+handleMismatch(targetHosts, query, rows);
+return false;
+}
+
+ResultHandler.ComparableRow ref = rows.get(0);
+boolean equal = true;
+for (int i = 1; i < rows.size(); i++)
+{
+ResultHandler.ComparableRow compare = rows.get(i);
+if (!ref.equals(compare))
+equal = false;
+}
+if (!equal)
+handleMismatch(targetHosts, query, rows);
+return equal;
+}
+
+/**
+ * Compares the column definitions
+ *
+ * the column definitions at position x in cds will have come from host at 
position x in targetHosts
+ */
+public boolean compareColumnDefinitions(List targetHosts, FQLQuery 
query, List cds)
+{
+if (cds.size() < 2)
+return true;
+
+boolean equal = true;
+List refDefs = cds.get(0).asList();
+for (int i = 1; i < cds.size(); i++)
+{
+List toCompare = 
cds.get(i).asList();
+if (!refDefs.equals(toCompare))
+equal = false;
+}
+if (!equal)
+handleColumnDefMismatch(targetHosts, query, cds);
+return equal;
+}
+
+private void handleMismatch(List targetHosts, FQLQuery query, 
List rows)
+{
+System.out.println("MISMATCH:");
+System.out.println("Query = " + query);
+System.out.println("Results:");
+System.out.println(Streams.zip(rows.stream(), targetHosts.stream(), 
(r, host) -> String.format("%s: %s%n", host, r == null ? "null" : 
r)).collect(Collectors.joining()));
+}
+
+private void handleColumnDefMismatch(List targetHosts, FQLQuery 
query, List cds)
+{
+System.out.println("COLUMN DEFINITION MISMATCH:");
+System.out.println("Query = " + query);
+System.out.println("Results: ");
+System.out.println(Streams.zip(cds.stream(), targetHosts.stream(), 
(cd, host) -> String.format("%s: %s%n", host, 
columnDefinitionsString(cd))).collect(Collectors.joining()));
+}
+
+private String 
columnDefinitionsString(ResultHandler.ComparableColumnDefinitions cd)
+{
+StringBuilder sb = new StringBuilder();
+if (cd == null)
+sb.append("NULL");
+else if (cd.wasFailed())
+sb.append("FAILED");
+else
+{
+for (ResultHandler.ComparableDefinition def : cd)
+{
+sb.append(def.toString());
+}
+}
+return sb.toString();
+}
+
+
+
+}
\ No newline at end of file
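The comparison contract in compareRows above can be summarized as: fewer than two results, or all-null results, count as equal; a mix of null and non-null results is a mismatch; otherwise every row must equal the first. Here is a minimal, hypothetical sketch of that logic with rows modeled as plain Strings instead of ResultHandler.ComparableRow (the class name RowCompareSketch is invented for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class RowCompareSketch
{
    // Mirrors the compareRows semantics of ResultComparator, simplified.
    public static boolean compareRows(List<String> rows)
    {
        // Nothing to compare, or every host returned nothing: treat as equal.
        if (rows.size() < 2 || rows.stream().allMatch(Objects::isNull))
            return true;
        // Some hosts returned a row and some did not: mismatch.
        if (rows.stream().anyMatch(Objects::isNull))
            return false;
        // All rows present: equal iff they all match the first one.
        String ref = rows.get(0);
        return rows.stream().allMatch(ref::equals);
    }

    public static void main(String[] args)
    {
        System.out.println(compareRows(Arrays.asList("a", "a", "a")));      // true
        System.out.println(compareRows(Arrays.asList("a", null)));          // false
        System.out.println(compareRows(Arrays.asList((String) null, null))); // true
    }
}
```

Note that the real tool compares row by row to report which hosts diverged; the sketch only reproduces the equality decision.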

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f83bd5ac/tools/fqltool/src/org/apache/cassandra/fqltool/ResultHandler.java
--
diff --git a/tools/fqltool/src/org/apache/cassandra/fqltool/ResultHandler.java b/tools/fqltool/src/org/apache/cassandra/fqltool/ResultHandler.java
new file mode 100644
index 0000000..8c4c018
--- /dev/null
+++ b/tools/fqltool/src/org/apache/cassandra/fqltool/ResultHandler.java

[1/7] cassandra git commit: Log keyspace in full query log

2018-09-01 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk f7431b432 -> f83bd5ac2


Log keyspace in full query log

Patch by marcuse; reviewed by Dinesh Joshi for CASSANDRA-14656


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/46c33f32
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/46c33f32
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/46c33f32

Branch: refs/heads/trunk
Commit: 46c33f324e5f3373d85838f364aece7ca6a6189c
Parents: f7431b4
Author: Marcus Eriksson 
Authored: Mon Aug 20 18:30:21 2018 +0200
Committer: Marcus Eriksson 
Committed: Sat Sep 1 08:23:54 2018 +0200

--
 CHANGES.txt |  1 +
 .../apache/cassandra/audit/FullQueryLogger.java | 13 +++-
 .../cassandra/audit/FullQueryLoggerTest.java| 72 +++-
 3 files changed, 82 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/46c33f32/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index b53b986..cd2a14a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Log keyspace in full query log (CASSANDRA-14656)
  * Transient Replication and Cheap Quorums (CASSANDRA-14404)
 * Log server-generated timestamp and nowInSeconds used by queries in FQL (CASSANDRA-14675)
  * Add diagnostic events for read repairs (CASSANDRA-14668)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/46c33f32/src/java/org/apache/cassandra/audit/FullQueryLogger.java
--
diff --git a/src/java/org/apache/cassandra/audit/FullQueryLogger.java b/src/java/org/apache/cassandra/audit/FullQueryLogger.java
index 8cd8f4a..c9f8447 100644
--- a/src/java/org/apache/cassandra/audit/FullQueryLogger.java
+++ b/src/java/org/apache/cassandra/audit/FullQueryLogger.java
@@ -21,6 +21,8 @@ import java.nio.ByteBuffer;
 import java.util.ArrayList;
 import java.util.List;
 
+import javax.annotation.Nullable;
+
 import com.google.common.base.Preconditions;
 import com.google.common.primitives.Ints;
 
@@ -53,6 +55,7 @@ public class FullQueryLogger extends BinLogAuditLogger implements IAuditLogger
 
     public static final String GENERATED_TIMESTAMP = "generated-timestamp";
     public static final String GENERATED_NOW_IN_SECONDS = "generated-now-in-seconds";
+    public static final String KEYSPACE = "keyspace";
 
     public static final String BATCH = "batch";
     public static final String SINGLE_QUERY = "single-query";
@@ -262,6 +265,8 @@ public class FullQueryLogger extends BinLogAuditLogger implements IAuditLogger
 
         private final long generatedTimestamp;
         private final int generatedNowInSeconds;
+        @Nullable
+        private final String keyspace;
 
         AbstractLogEntry(QueryOptions queryOptions, QueryState queryState, long queryStartTime)
         {
@@ -273,6 +278,7 @@ public class FullQueryLogger extends BinLogAuditLogger implements IAuditLogger
 
             this.generatedTimestamp = queryState.generatedTimestamp();
             this.generatedNowInSeconds = queryState.generatedNowInSeconds();
+            this.keyspace = queryState.getClientState().getRawKeyspace();
 
             /*
              * Struggled with what tradeoff to make in terms of query options which is potentially large and complicated
@@ -309,6 +315,8 @@ public class FullQueryLogger extends BinLogAuditLogger implements IAuditLogger
 
             wire.write(GENERATED_TIMESTAMP).int64(generatedTimestamp);
             wire.write(GENERATED_NOW_IN_SECONDS).int32(generatedNowInSeconds);
+
+            wire.write(KEYSPACE).text(keyspace);
         }
 
         @Override
@@ -325,7 +333,10 @@ public class FullQueryLogger extends BinLogAuditLogger implements IAuditLogger
                    + 4                                                  // protocolVersion
                    + EMPTY_BYTEBUF_SIZE + queryOptionsBuffer.capacity() // queryOptionsBuffer
                    + 8                                                  // generatedTimestamp
-                   + 4;                                                 // generatedNowInSeconds
+                   + 4                                                  // generatedNowInSeconds
+                   + (keyspace != null
+                      ? Ints.checkedCast(ObjectSizes.sizeOf(keyspace))  // keyspace
+                      : OBJECT_REFERENCE_SIZE);                        // null
         }
 
         protected abstract String type();
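The weight change in the diff accounts for the new nullable field: when the keyspace is set, its (approximate) deep size is added; when it is null, only a reference-sized constant is. A minimal sketch of that null-safe accounting, with invented constants and a rough string-size estimate (not Cassandra's actual ObjectSizes values):

```java
public class WeightSketch
{
    static final int OBJECT_REFERENCE_SIZE = 8; // assumed, for illustration

    // Rough stand-in for ObjectSizes.sizeOf(String): header + 2 bytes/char.
    static long sizeOfString(String s)
    {
        return 40 + 2L * s.length();
    }

    // Fixed fields plus the nullable keyspace, as in the diff above.
    static long entryWeight(String keyspace)
    {
        return 8                            // generatedTimestamp
             + 4                            // generatedNowInSeconds
             + (keyspace != null
                ? sizeOfString(keyspace)    // keyspace
                : OBJECT_REFERENCE_SIZE);   // null
    }

    public static void main(String[] args)
    {
        System.out.println(entryWeight(null)); // 20
        System.out.println(entryWeight("ks")); // 56
    }
}
```

The weight feeds the bin log's memory accounting, so undercounting a null field by a few bytes is harmless while forgetting a large string would not be.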

http://git-wip-us.apache.org/repos/asf/cassandra/blob/46c33f32/test/unit/org/apache/cassandra/audit/FullQueryLoggerTest.java
--
diff --git 

[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read

2018-09-01 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599575#comment-16599575
 ] 

Marcus Eriksson commented on CASSANDRA-14145:
-

I like the new InputCollector approach; it makes things really clean.

I pushed a small (totally untested) update 
[here|https://github.com/krummas/cassandra/commits/sam/14145] - only allocates 
the repairedSSTables set if needed and avoids iterating all sstables unless we 
track repaired status. Also makes InputCollector static.

We should benchmark this as we still do a few more allocations (I think). But 
I'm fine punting that until later.

+1 on the patch, but we need the dtest changes in as well (enabling globally 
etc)

>  Detecting data resurrection during read
> 
>
> Key: CASSANDRA-14145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14145
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Sam Tunnicliffe
>Priority: Minor
> Fix For: 4.x
>
>
> We have seen several bugs in which deleted data gets resurrected. We should 
> try to see if we can detect this on the read path and possibly fix it. Here 
> are a few examples which brought back data:
> A replica lost an sstable on startup, which caused that replica to lose the 
> tombstone but not the data. The tombstone was past gc grace, which means this 
> could resurrect data. We can detect such invalid states by looking at other 
> replicas. 
> If we are running incremental repair, Cassandra will keep repaired and 
> non-repaired data separate. Every time incremental repair runs, it moves 
> data from non-repaired to repaired. Repaired data across all replicas 
> should be 100% consistent. 
> Here is an example of how we can detect and mitigate the issue in most cases. 
> Say we have 3 machines, A, B and C. All these machines will have data split 
> between repaired and non-repaired. 
> 1. Machine A, due to some bug, brings back data D. This data D is in the 
> repaired dataset. All other replicas will have data D and tombstone T. 
> 2. A read for data D comes from the application and involves replicas A and 
> B. The data being read is in the repaired state. A will respond to the 
> co-ordinator with data D, and B will send nothing as the tombstone is past 
> gc grace. This will cause a digest mismatch. 
> 3. This patch will only kick in when there is a digest mismatch. The 
> co-ordinator will ask both replicas to send back all data like we do today, 
> but with this patch each replica will indicate whether the data it returns 
> comes from the repaired or non-repaired set. If the data coming from 
> repaired does not match, we know something is wrong. At this point the 
> co-ordinator cannot determine whether replica A has resurrected some data 
> or replica B has lost some data, but we can still log an error saying we 
> hit an invalid state.
> 4. Besides the log, we can take this further and even correct the response 
> to the query. After logging the invalid state, we can ask replicas A and B 
> (and also C if alive) to send back all data for this read, including gcable 
> tombstones. If any machine returns a tombstone which is newer than this 
> data, we know we cannot return the data. This way we avoid returning data 
> which has been deleted. 
> Some challenges with this: 
> 1. When data is moved from non-repaired to repaired, there could be a race 
> here. We can look at which incremental repairs have promoted things on 
> which replica to avoid false positives.  
> 2. If the third replica is down and the live replica does not have any 
> tombstone, we won't be able to break the tie in deciding whether data was 
> actually deleted or resurrected. 
> 3. If the read is for the latest data only, we won't be able to detect it, 
> as the read will be served from non-repaired data. 
> 4. If the replica where we lose a tombstone is the last replica to compact 
> the tombstone, we won't be able to decide if data is coming back or if the 
> rest of the replicas have lost that data. But we will still detect that 
> something is wrong. 
> 5. We won't affect 99.9% of read queries, as we only do extra work during a 
> digest mismatch.
> 6. CL.ONE reads will not be able to detect this. 
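The detection idea in steps 2-3 above reduces to a simple check at the co-ordinator: repaired data is supposed to be identical on every replica, so if the digests computed over only the repaired portion of each replica's response diverge, something was resurrected or lost. A hypothetical sketch of that check (all names invented for illustration; this is not Cassandra's actual API):

```java
import java.util.List;

public class RepairedMismatchSketch
{
    /**
     * Given one digest per replica, each computed over only that replica's
     * repaired data, report whether any two replicas disagree. Divergence
     * here is always an invalid state, unlike ordinary digest mismatches
     * which unrepaired writes can legitimately cause.
     */
    public static boolean repairedDataDiverges(List<String> repairedDigests)
    {
        return repairedDigests.stream().distinct().count() > 1;
    }

    public static void main(String[] args)
    {
        // Replica A resurrected data D into its repaired set; B and C agree.
        System.out.println(repairedDataDiverges(
            List.of("digest-with-D", "digest-clean", "digest-clean"))); // true
    }
}
```

As the ticket notes, this only tells the co-ordinator that the replicas are inconsistent, not which one is wrong; deciding that requires the follow-up full-data exchange of step 4.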



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org