[jira] [Commented] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption
[ https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516682#comment-16516682 ]

Dinesh Joshi commented on CASSANDRA-10735:
------------------------------------------

`user@ ML` is the users mailing list. See: http://cassandra.apache.org/community/

> Support netty openssl (netty-tcnative) for client encryption
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-10735
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10735
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Andy Tolbert
>            Assignee: Jason Brown
>            Priority: Major
>             Fix For: 4.0
>
>         Attachments: netty-ssl-trunk.tgz, nettyssl-bench.tgz, nettysslbench.png, nettysslbench_small.png, sslbench12-03.png
>
> The java-driver recently added support for using netty openssl via [netty-tcnative|http://netty.io/wiki/forked-tomcat-native.html] in [JAVA-841|https://datastax-oss.atlassian.net/browse/JAVA-841]; this shows a very measurable improvement (numbers incoming on that ticket). It seems likely that this can offer an improvement if implemented on the C* side as well.
> Since netty-tcnative has platform-specific requirements, this should not be made the default, but rather be an option that one can use.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
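The "opt-in, never default" point in the description can be sketched as a guard on both an explicit flag and native-library availability. This is a minimal, self-contained illustration; the enum and method names are hypothetical, not Netty's or Cassandra's actual API:

```java
public class SslProviderSelection {
    enum SslProvider { JDK, OPENSSL }

    // Pick the native provider only when explicitly enabled AND the
    // platform-specific netty-tcnative library actually loaded; otherwise
    // fall back to the portable JDK implementation.
    static SslProvider choose(boolean openSslRequested, boolean nativeAvailable) {
        return (openSslRequested && nativeAvailable) ? SslProvider.OPENSSL : SslProvider.JDK;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true));   // OPENSSL
        System.out.println(choose(true, false));  // JDK: native lib missing
        System.out.println(choose(false, true));  // JDK: not requested
    }
}
```

Keeping the JDK provider as the unconditional fallback is what makes a platform-specific dependency safe to ship as an option.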
[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516669#comment-16516669 ]

Kurt Greaves commented on CASSANDRA-14525:
------------------------------------------

Thanks [~chovatia.jayd...@gmail.com], I agree. No fault on your part; it's more a problem with the consistent lack of reviewers we have who can prioritise review work. It's just unfortunate that it has wasted more time than necessary for everyone. I think [~VincentWhite] would appreciate the acknowledgement (especially after such a long time), so FCFS makes sense to me, but there's no use doing the work twice; just take the two patches' slight discrepancies into account when reviewing, I guess.

> streaming failure during bootstrap makes new node into inconsistent state
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14525
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jaydeepkumar Chovatia
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Major
>             Fix For: 4.0, 2.2.x, 3.0.x
>
> If bootstrap fails for a newly joining node (the most common reason being a streaming failure), then Cassandra remains in the {{joining}} state, which is fine, but Cassandra also enables the native transport, which makes the overall state inconsistent. This further causes a NullPointerException if auth is enabled on the new node; please find reproducible steps here.
> For example, if bootstrap fails due to streaming errors like:
> {quote}
> java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
>     at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-18.0.jar:na]
>     at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-18.0.jar:na]
>     at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-18.0.jar:na]
>     at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) [apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) [apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) [apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) [apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) [apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) [apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) [apache-cassandra-3.0.16.jar:3.0.16]
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>     at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) ~[guava-18.0.jar:na]
>     at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) ~[guava-18.0.jar:na]
>     at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-18.0.jar:na]
>     at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-18.0.jar:na]
>     at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-18.0.jar:na]
>     at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) ~[apache-cassandra-3.0.16.jar:3.0.16]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {quote}
> then the variable [StorageService.java::dataAvailable|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
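The inconsistency the report describes — a node left in {{joining}} yet serving client traffic — amounts to a missing guard on native-transport startup. A minimal sketch of that guard, with hypothetical names (this is not the actual StorageService API):

```java
public class NativeTransportGate {
    enum OperationMode { JOINING, NORMAL }

    // Only open the client (native transport) port once bootstrap completed;
    // a node stuck in JOINING after a streaming failure should stay
    // unreachable to clients instead of answering with partial data or NPEs.
    static boolean shouldStartNativeTransport(OperationMode mode, boolean dataAvailable) {
        return mode == OperationMode.NORMAL && dataAvailable;
    }

    public static void main(String[] args) {
        System.out.println(shouldStartNativeTransport(OperationMode.JOINING, false)); // false
        System.out.println(shouldStartNativeTransport(OperationMode.NORMAL, true));   // true
    }
}
```

Gating on both the mode and a data-availability flag mirrors the `dataAvailable` variable the report points at.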
[jira] [Updated] (CASSANDRA-14515) Short read protection in presence of almost-purgeable range tombstones may cause permanent data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-14515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-14515:
----------------------------
    Priority: Blocker  (was: Major)

> Short read protection in presence of almost-purgeable range tombstones may cause permanent data loss
> ----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14515
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14515
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>            Priority: Blocker
>             Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Because read responses don't necessarily close their open RT bounds, it's possible to lose data during short read protection, if a closing bound is compacted away between two adjacent reads from a node.
[jira] [Commented] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption
[ https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516631#comment-16516631 ]

jahar commented on CASSANDRA-10735:
-----------------------------------

Thanks Jason for your response. Can you please elaborate on what _*user@ ML*_ is?
[jira] [Updated] (CASSANDRA-14444) Got NPE when querying Cassandra 3.11.2
[ https://issues.apache.org/jira/browse/CASSANDRA-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-14444:
----------------------------
    Reproduced In: 3.11.2
    Since Version: 3.11.2

> Got NPE when querying Cassandra 3.11.2
> --------------------------------------
>
>                 Key: CASSANDRA-14444
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14444
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>         Environment: Ubuntu 14.04, JDK 1.8.0_171, Cassandra 3.11.2
>            Reporter: Xiaodong Xie
>            Priority: Blocker
>
> We just upgraded our Cassandra cluster from 2.2.6 to 3.11.2.
> After upgrading, we immediately got exceptions in Cassandra like this one:
> {code}
> ERROR [Native-Transport-Requests-1] 2018-05-11 17:10:21,994 QueryMessage.java:129 - Unexpected error during query
> java.lang.NullPointerException: null
>     at org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:248) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:92) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:268) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:475) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:288) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:118) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:255) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:240) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) ~[apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517) [apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.2.jar:3.11.2]
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_171]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.2.jar:3.11.2]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.2.jar:3.11.2]
>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
> {code}
>
> The table schema is like:
> {code}
> CREATE TABLE example.example_table (
>     id bigint,
>     hash text,
>     json text,
>     PRIMARY KEY (id, hash)
> ) WITH COMPACT STORAGE
> {code}
>
> The query is something like:
> {code}
> "select * from example.example_table;" // (We do know this is bad practise, and we are trying to fix that right now)
> {code}
> with a fetch-size of 200, using the DataStax Java driver. This table contains about 20k rows.
>
> Actually, the fix is quite simple:
> {code}
> --- a/src/java/org/apache/cassandra/service/pager/PagingState.java
> +++ b/src/java/org/apache/cassandra/service/pager/PagingState.java
> @@ -46,7 +46,7 @@ public class PagingState
>      public PagingState(ByteBuffer partitionKey, RowMark rowMark, int remaining, int remainingInPartition)
>      {
> -        this.partitionKey = partitionKey;
> +        this.partitionKey = partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER : partitionKey;
>          this.rowMark = rowMark;
>          this.remaining = remaining;
>          this.remainingInPartition = remainingInPartition;
> {code}
>
> "partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER : partit
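The null-coalescing idea in the proposed patch can be shown in isolation. This sketch uses plain `java.nio.ByteBuffer` (the class and field names mimic, but are not, Cassandra's `PagingState`):

```java
import java.nio.ByteBuffer;

public class PagingStateSketch {
    // Stand-in for ByteBufferUtil.EMPTY_BYTE_BUFFER.
    static final ByteBuffer EMPTY_BYTE_BUFFER = ByteBuffer.allocate(0);

    final ByteBuffer partitionKey;

    // Store an empty buffer instead of null, so later calls such as
    // partitioner.decorateKey(key) never dereference a null key.
    PagingStateSketch(ByteBuffer partitionKey) {
        this.partitionKey = partitionKey == null ? EMPTY_BYTE_BUFFER : partitionKey;
    }

    public static void main(String[] args) {
        System.out.println(new PagingStateSketch(null).partitionKey.remaining()); // 0
    }
}
```

Normalizing the null at construction time keeps the invariant in one place rather than null-checking at every consumer.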
[jira] [Updated] (CASSANDRA-14529) nodetool import row cache invalidation races with adding sstables to tracker
[ https://issues.apache.org/jira/browse/CASSANDRA-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan West updated CASSANDRA-14529:
------------------------------------
    Status: Patch Available  (was: Open)

Made the cache invalidation run after the files are added to the tracker. This is similar to [streaming|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/streaming/CassandraStreamReceiver.java#L207-L210]. There is still a race condition, but the worst case is only invalidation of a cached copy of the newly added data.

Branch: [https://github.com/jrwest/cassandra/commits/14529-trunk]
Tests: [https://circleci.com/gh/jrwest/cassandra/tree/14529-trunk]

> nodetool import row cache invalidation races with adding sstables to tracker
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14529
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14529
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jordan West
>            Assignee: Jordan West
>            Priority: Major
>
> CASSANDRA-6719 introduced {{nodetool import}} with row cache invalidation, which [occurs before adding new sstables to the tracker|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/SSTableImporter.java#L137-L178]. Stale reads will result after a read is interleaved with the read row's invalidation and adding the containing file to the tracker.
[jira] [Created] (CASSANDRA-14529) nodetool import row cache invalidation races with adding sstables to tracker
Jordan West created CASSANDRA-14529:
---------------------------------------

             Summary: nodetool import row cache invalidation races with adding sstables to tracker
                 Key: CASSANDRA-14529
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14529
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jordan West
            Assignee: Jordan West

CASSANDRA-6719 introduced {{nodetool import}} with row cache invalidation, which [occurs before adding new sstables to the tracker|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/SSTableImporter.java#L137-L178]. Stale reads will result after a read is interleaved with the read row's invalidation and adding the containing file to the tracker.
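The ordering at the heart of this race can be reduced to a toy model: a map standing in for the row cache and a list standing in for the tracker (both hypothetical stand-ins, not Cassandra's actual types):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class ImportOrderingSketch {
    final Map<String, String> rowCache = new ConcurrentHashMap<>();
    final List<String> tracker = new CopyOnWriteArrayList<>();

    // Buggy order: invalidate first, then add. A concurrent read landing
    // between the two steps re-populates the cache from the OLD data, so the
    // stale row survives. Adding to the tracker first (as streaming does)
    // bounds the worst case to invalidating a fresh copy of the NEW data.
    void importSstable(String sstable, String invalidatedKey) {
        tracker.add(sstable);            // 1. make new data visible to reads
        rowCache.remove(invalidatedKey); // 2. then drop any stale cached row
    }

    public static void main(String[] args) {
        ImportOrderingSketch s = new ImportOrderingSketch();
        s.rowCache.put("k1", "old-value");
        s.importSstable("sstable-1", "k1");
        System.out.println(s.tracker.contains("sstable-1")); // true
        System.out.println(s.rowCache.containsKey("k1"));    // false
    }
}
```

The fix does not eliminate the race; it changes which side of it is harmless.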
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516182#comment-16516182 ]

Christian Spriegel commented on CASSANDRA-14480:
------------------------------------------------

The fact that it will be 4.0 only is indeed a hard pill to swallow. :(

> Digest mismatch requires all replicas to be responsive
> ------------------------------------------------------
>
>                 Key: CASSANDRA-14480
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14480
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Christian Spriegel
>            Priority: Major
>         Attachments: Reader.java, Writer.java, schema_14480.cql
>
> I ran across a scenario where a digest mismatch causes a read-repair that requires all up nodes to be able to respond. If one of these nodes is not responding, then the read-repair is reported to the client as a ReadTimeoutException.
>
> My expectation would be that CL=QUORUM will always succeed as long as 2 nodes are responding. But unfortunately the third node being "up" in the ring, yet not able to respond, does lead to an RTE.
>
> I came up with a scenario that reproduces the issue:
> # set up a 3 node cluster using ccm
> # increase the phi_convict_threshold to 16, so that nodes are permanently reported as up
> # create the attached schema
> # run the attached reader & writer (which only connect to node1 & node2). This should already produce digest mismatches
> # do a "ccm node3 pause"
> # The reader will report a read timeout with consistency QUORUM (2 responses were required but only 1 replica responded). Within the DigestMismatchException catch block it can be seen that the repairHandler is waiting for 3 responses, even though the exception says that 2 responses are required.
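The discrepancy the repro surfaces is between two response counts. A sketch of the consistency-level arithmetic (the class and method names here are illustrative, not Cassandra's):

```java
public class ReadRepairResponses {
    // Standard quorum math: QUORUM over RF replicas needs floor(RF/2) + 1.
    static int quorumBlockFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        int rf = 3;
        // What the ReadTimeoutException reports as required:
        System.out.println(quorumBlockFor(rf)); // 2
        // What the repro shows the repair handler actually waiting for:
        // every replica that was "up" and therefore sent a repair mutation.
        int upReplicas = 3;
        System.out.println(upReplicas); // 3
    }
}
```

With RF=3 a client reasonably expects 2 responses to suffice at QUORUM, while the repair handler in the repro blocks on all 3 contacted replicas, hence the timeout when node3 is paused but still marked up.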
[jira] [Resolved] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel resolved CASSANDRA-14480.
--------------------------------------------
    Resolution: Duplicate
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516134#comment-16516134 ]

Jeff Jirsa commented on CASSANDRA-14480:
----------------------------------------

If it's a dupe (and it looks like it may be), then you have good news and bad news. The good news is that 10726 is patch-available. The bad news is that it's a major refactor that won't land until 4.0.

If you're satisfied it's a dupe, please feel free to relate+close it.
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516124#comment-16516124 ]

Christian Spriegel commented on CASSANDRA-14480:
------------------------------------------------

[~jjirsa]: It sounds like my ticket is a duplicate.
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516015#comment-16516015 ]

Jeff Jirsa commented on CASSANDRA-14480:
----------------------------------------

Is this different from CASSANDRA-10726?
[jira] [Updated] (CASSANDRA-14504) fqltool should open chronicle queue read only and a GC bug
[ https://issues.apache.org/jira/browse/CASSANDRA-14504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ariel Weisberg updated CASSANDRA-14504:
---------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Ready to Commit)

Committed as [c6570fac180b6f816efb47cbd9b7fe30c771835d|https://github.com/apache/cassandra/commit/c6570fac180b6f816efb47cbd9b7fe30c771835d]. Thanks.

> fqltool should open chronicle queue read only and a GC bug
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-14504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14504
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>            Priority: Major
>             Fix For: 4.0
>
> There are two issues with fqltool.
> The first is that it doesn't open the chronicle queue read only, so it won't work if it doesn't have write permissions, and it's not clear whether it's safe to open the queue for writing while the server is still appending.
> The next issue is that NativeBytesStore.toTemporaryDirectByteBuffer() returns a ByteBuffer that doesn't strongly reference the memory it refers to, resulting in it sometimes being reclaimed and containing the wrong data when we go to read from it. At least that is the theory. The simple solution is to use toByteArray(), and that seems to make it work consistently.
cassandra git commit: fqltool should open chronicle queue read only and a GC bug
Repository: cassandra
Updated Branches:
  refs/heads/trunk 717c10837 -> c6570fac1

fqltool should open chronicle queue read only and a GC bug

Patch by Ariel Weisberg; Reviewed by Sam Tunnicliffe for CASSANDRA-14504

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c6570fac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c6570fac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c6570fac

Branch: refs/heads/trunk
Commit: c6570fac180b6f816efb47cbd9b7fe30c771835d
Parents: 717c108
Author: Ariel Weisberg
Authored: Thu Jun 7 14:22:43 2018 -0400
Committer: Ariel Weisberg
Committed: Mon Jun 18 12:35:06 2018 -0400

----------------------------------------------------------------------
 src/java/org/apache/cassandra/tools/fqltool/Dump.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

diff --git a/src/java/org/apache/cassandra/tools/fqltool/Dump.java b/src/java/org/apache/cassandra/tools/fqltool/Dump.java
index 6a748bc..52eadb5 100644
--- a/src/java/org/apache/cassandra/tools/fqltool/Dump.java
+++ b/src/java/org/apache/cassandra/tools/fqltool/Dump.java
@@ -81,7 +81,7 @@ public class Dump implements Runnable
 {
     int protocolVersion = wireIn.read("protocol-version").int32();
     sb.append("Protocol version: ").append(protocolVersion).append(System.lineSeparator());
-    QueryOptions options = QueryOptions.codec.decode(Unpooled.wrappedBuffer(wireIn.read("query-options").bytesStore().toTemporaryDirectByteBuffer()), ProtocolVersion.decode(protocolVersion));
+    QueryOptions options = QueryOptions.codec.decode(Unpooled.wrappedBuffer(wireIn.read("query-options").bytes()), ProtocolVersion.decode(protocolVersion));
     sb.append("Query time: ").append(wireIn.read("query-time").int64()).append(System.lineSeparator());
     if (type.equals("single"))
@@ -126,7 +126,7 @@
     // Backoff strategy for spinning on the queue, not aggressive at all as this doesn't need to be low latency
     Pauser pauser = Pauser.millis(100);
-    List queues = arguments.stream().distinct().map(path -> ChronicleQueueBuilder.single(new File(path)).rollCycle(RollCycles.valueOf(rollCycle)).build()).collect(Collectors.toList());
+    List queues = arguments.stream().distinct().map(path -> ChronicleQueueBuilder.single(new File(path)).readOnly(true).rollCycle(RollCycles.valueOf(rollCycle)).build()).collect(Collectors.toList());
     List tailers = queues.stream().map(ChronicleQueue::createTailer).collect(Collectors.toList());
     boolean hadWork = true;
     while (hadWork)
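The GC half of this fix, switching from `toTemporaryDirectByteBuffer()` to `bytes()`, boils down to making an owned copy instead of holding a "temporary" view over memory the buffer does not keep alive. A generic illustration with plain `java.nio` (this is not Chronicle's API):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class DefensiveCopySketch {
    // A temporary direct-buffer view can be reclaimed or reused underneath
    // us. Copying the bytes into a heap array we own (the effect of using
    // bytes()/toByteArray()) keeps a strong reference to stable data.
    static byte[] ownedCopy(ByteBuffer view) {
        byte[] out = new byte[view.remaining()];
        view.duplicate().get(out); // duplicate() leaves the caller's position intact
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer view = ByteBuffer.wrap(new byte[] { 1, 2, 3 });
        System.out.println(Arrays.toString(ownedCopy(view))); // [1, 2, 3]
        System.out.println(view.remaining());                 // 3, position untouched
    }
}
```

The cost is one array allocation per read, which a dump tool can easily afford in exchange for data that cannot change under it.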
[jira] [Commented] (CASSANDRA-14527) Real time Bad query logging framework
[ https://issues.apache.org/jira/browse/CASSANDRA-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516003#comment-16516003 ] Jaydeepkumar Chovatia commented on CASSANDRA-14527: --- {quote}I like the idea of making operational issues more visible to the user. But the intention to create a common framework for logging certain messages, doesn't sound very convincing to me. At such point, I'm always asking myself, what kind of problem we precisely want to solve. What's the value in adding abstraction layers around producing log messages for very specific, predefined conditions? {quote} Thanks for the review [~spo...@gmail.com]. Here are some of the reason behind this intention: 1. As described in the doc, CASSANDRA-12403 [large partition warning |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L208] [tombstone warning |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ReadCommand.java#L490] are already having some sort of this logic to inform user about some uncommon behavior but they are very specific for certain type of problems and with many limitations like changing thresholds require restart, user cannot consume in different ways, etc. Also adding yet another type of problem will require duplication of work. 2. Improve C* operational aspect 3. Having a common way of detecting/reporting often encourages people to add more anti-patterns and reduces duplicate code {quote} I'd also recommend to first take any such ideas to the dev mailing list before spending time implementing such changes. The Google docs hosted proposal gives a nice overview on the ideas behind this and would have been a good way to start a discussion and get some early feedback. {quote} I agree, will send this to dev mailing list. 
> Real time Bad query logging framework > - > > Key: CASSANDRA-14527 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14527 > Project: Cassandra > Issue Type: New Feature > Components: Observability >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Major > Fix For: 4.x > > > If Cassandra is not used in right way then it can create adverse effect to > the application. There are lots of bad queries when using Cassandra, but > major problem is end user don’t know where and what exactly is the problem. > Most of the times end user is ready to take actions on bad queries provided > Cassandra gives all detailed which could have potential impact on cluster > performance. > There has been already lots of work done as part of CASSANDRA-12403, proposal > as part of this JIRA is let’s have some common way of detecting and logging > different problems in Cassandra cluster which could have potential impact on > Cassandra cluster performance. > Please visit this document which has details like what is currently > available, motivation behind developing this common framework, architecture, > samples, etc. > [https://docs.google.com/document/d/1D0HNjC3a7gnuKnR_iDXLI5mvn1zQxtV7tloMaLYIENE/edit?usp=sharing] > Here is the patch with this feature: > ||trunk|| > |[!https://circleci.com/gh/jaydeepkumar1984/cassandra/tree/bqr.svg?style=svg! > |https://circleci.com/gh/jaydeepkumar1984/cassandra/82]| > |[patch > |https://github.com/apache/cassandra/compare/trunk...jaydeepkumar1984:bqr]| > Please review this doc and the patch, and provide your opinion and feedback > about this effort. > Thank you! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
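The "common framework" argument in the comment above can be made concrete with a small sketch. This is purely illustrative: none of these names exist in Cassandra, and the real proposal lives in the linked design doc. The sketch only shows the shape of a shared detector whose threshold is re-read on every check, addressing the restart limitation mentioned in point 1.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch only: these names do not exist in Cassandra. One
// threshold check that any anti-pattern detector could share, with the
// threshold supplied dynamically so changing it needs no restart.
public class BadQueryDetector {
    private final String category;
    private final LongSupplier threshold; // re-read each time -> hot-reloadable

    public BadQueryDetector(String category, LongSupplier threshold) {
        this.category = category;
        this.threshold = threshold;
    }

    /** Returns a report line if the observed value crosses the threshold, else null. */
    public String check(long observed, String context) {
        long limit = threshold.getAsLong();
        if (observed > limit)
            return String.format("[%s] %s: %d exceeds threshold %d", category, context, observed, limit);
        return null;
    }
}
```

A large-partition detector and a tombstone detector would then differ only in the category name and the value they feed into {{check}}, rather than each carrying its own logging and threshold plumbing.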
[jira] [Commented] (CASSANDRA-14504) fqltool should open chronicle queue read only and a GC bug
[ https://issues.apache.org/jira/browse/CASSANDRA-14504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515955#comment-16515955 ] Sam Tunnicliffe commented on CASSANDRA-14504: - LGTM (minus the circle yaml change ofc) > fqltool should open chronicle queue read only and a GC bug > -- > > Key: CASSANDRA-14504 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14504 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg >Priority: Major > Fix For: 4.0 > > > There are two issues with fqltool. > The first is that it doesn't open the chronicle queue read only so it won't > work if it doesn't have write permissions and it's not clear if it's safe to > open the queue to write if the server is also still appending. > The next issue is that NativeBytesStore.toTemporaryDirectByteBuffer() returns > a ByteBuffer that doesn't strongly reference the memory it refers to > resulting it in sometimes being reclaimed and containing the wrong data when > we go to read from it. At least that is the theory. Simple solution is to use > toByteArray() and that seems to make it work consistently. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
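The second bug has a plain-JDK analogue worth spelling out (this sketch deliberately avoids Chronicle's API): a ByteBuffer that merely views a region of memory changes silently when that memory is rewritten or reclaimed, whereas copying the bytes out — what the toByteArray() fix amounts to — yields an array whose contents the GC keeps stable.

```java
import java.nio.ByteBuffer;

// Plain-JDK analogue of the GC bug described above (not Chronicle's API):
// a duplicated ByteBuffer is only a *view* of the same memory, so it changes
// when the backing memory changes; snapshot() copies the bytes into an array,
// which the GC then keeps alive independently.
public class BufferCopySketch {
    static byte[] snapshot(ByteBuffer src) {
        byte[] out = new byte[src.remaining()];
        src.duplicate().get(out); // duplicate so the caller's position is untouched
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer backing = ByteBuffer.allocateDirect(4);
        backing.put(new byte[] {1, 2, 3, 4}).flip();

        ByteBuffer view = backing.duplicate(); // shares the memory
        byte[] copy = snapshot(backing);       // independent of the memory

        backing.put(0, (byte) 9); // simulate the memory being rewritten under us

        System.out.println(view.get(0)); // 9 - the view saw the rewrite
        System.out.println(copy[0]);     // 1 - the copy did not
    }
}
```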
[jira] [Updated] (CASSANDRA-14504) fqltool should open chronicle queue read only and a GC bug
[ https://issues.apache.org/jira/browse/CASSANDRA-14504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-14504: Status: Ready to Commit (was: Patch Available)
[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515904#comment-16515904 ] Jaydeepkumar Chovatia commented on CASSANDRA-14525: --- I see [~KurtG], sorry I missed it. In my opinion this is a bug and needs to be fixed (CASSANDRA-14063 should be considered as per FCFS priority). We also need to fix dtest along with this so CASSANDRA-14526 also needs to be landed. > streaming failure during bootstrap makes new node into inconsistent state > - > > Key: CASSANDRA-14525 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14525 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Major > Fix For: 4.0, 2.2.x, 3.0.x > > > If bootstrap fails for newly joining node (most common reason is due to > streaming failure) then Cassandra state remains in {{joining}} state which is > fine but Cassandra also enables Native transport which makes overall state > inconsistent. 
This further creates NullPointer exception if auth is enabled > on the new node, please find reproducible steps here: > For example if bootstrap fails due to streaming errors like > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) > [apache-cassandra-3.0.16.jar:3.0.16] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > ~[guava-18.0.jar:na] > at > 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121] > {quote} > then variable [StorageService.java::dataAvailable > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892] > will be {{false}}. Since {{dataAvailable}} is {{false}} hence it will not > call [StorageService.java::finishJoiningRing > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.j
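The fix direction implied by the report can be sketched as a simple gate (all names here are hypothetical, not Cassandra's actual StorageService code): the native transport should only start once bootstrap streaming actually delivered the data, so a node stuck in the joining state stays invisible to clients.

```java
// Hypothetical gate, not Cassandra's actual code: only expose the node to
// clients once bootstrap streaming succeeded.
public class BootstrapGate {
    private boolean nativeTransportRunning = false;

    /** dataAvailable mirrors the flag the report points at: did streaming finish? */
    public void completeJoin(boolean dataAvailable) {
        if (dataAvailable) {
            nativeTransportRunning = true;  // finishJoiningRing() would run here
        } else {
            nativeTransportRunning = false; // keep clients away from an inconsistent node
        }
    }

    public boolean isNativeTransportRunning() {
        return nativeTransportRunning;
    }
}
```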
[jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
[ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515889#comment-16515889 ] Christian Spriegel commented on CASSANDRA-14480: I just saw this happening in a production system: {noformat} Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (8 responses were required but only 7 replica responded){noformat} Our queries use LOCAL_QUORUM, but we have RTEs happening due to read-repair. read_repair_chance = 0.1 is set, so it's going cross-DC :( > Digest mismatch requires all replicas to be responsive > -- > > Key: CASSANDRA-14480 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14480 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Christian Spriegel >Priority: Major > Attachments: Reader.java, Writer.java, schema_14480.cql > > > I ran across a scenario where a digest mismatch causes a read-repair that > requires all up nodes to be able to respond. If one of these nodes is not > responding, then the read-repair is being reported to the client as > ReadTimeoutException. > > My expectation would be that a CL=QUORUM will always succeed as long as 2 nodes > are responding. But unfortunately the third node being "up" in the ring, but > not being able to respond does lead to an RTE. > > > I came up with a scenario that reproduces the issue: > # set up a 3 node cluster using ccm > # increase the phi_convict_threshold to 16, so that nodes are permanently > reported as up > # create attached schema > # run attached reader&writer (which only connects to node1&2). This should > already produce digest mismatches > # do a "ccm node3 pause" > # The reader will report a read-timeout with consistency QUORUM (2 responses > were required but only 1 replica responded). 
Within the > DigestMismatchException catch-block it can be seen that the repairHandler is > waiting for 3 responses, even though the exception says that 2 responses are > required.
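The arithmetic behind the report can be modeled in a few lines (assumed semantics, not Cassandra's actual read path): QUORUM on RF=3 should block for 2 responses, but after a digest mismatch the repair handler was observed awaiting all 3 contacted replicas, so a single paused node is enough to time the read out.

```java
// Toy model of the consistency arithmetic in this report (assumed semantics,
// not Cassandra's ReadCallback). blockFor() is the number of responses a
// quorum read should wait for; the bug is that the repair path awaits the
// number of *contacted* replicas instead.
public class QuorumMath {
    static int blockFor(int replicationFactor) {
        return replicationFactor / 2 + 1; // simple majority
    }

    static boolean satisfied(int responses, int awaited) {
        return responses >= awaited;
    }
}
```

With RF=3 and one paused node, 2 responses satisfy {{blockFor(3)}} but not an awaited count of 3 — which matches the discrepancy between the exception text and the repairHandler's wait described above.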
[jira] [Commented] (CASSANDRA-14470) Repair validation failed/unable to create merkle tree
[ https://issues.apache.org/jira/browse/CASSANDRA-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515810#comment-16515810 ] Harry Hough commented on CASSANDRA-14470: - Sounds good, worse case I'll open a new one after 3.11.3 is released and I have tested it. Thank you for your help. > Repair validation failed/unable to create merkle tree > - > > Key: CASSANDRA-14470 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14470 > Project: Cassandra > Issue Type: Bug >Reporter: Harry Hough >Priority: Major > > I had trouble repairing with a full repair across all nodes and keyspaces so > I swapped to doing table by table. This table will not repair even after > scrub/restart of all nodes. I am using command: > {code:java} > nodetool repair -full -seq keyspace table > {code} > {code:java} > [2018-05-25 19:26:36,525] Repair session 0198ee50-6050-11e8-a3b7-9d0793eab507 > for range [(165598500763544933,166800441975877433], > (-5455068259072262254,-5445777107512274819], > (-4614366950466274594,-4609359222424798148], > (3417371506258365094,3421921915575816226], > (5221788898381458942,5222846663270250559], > (3421921915575816226,3429175540277204991], > (3276484330153091115,3282213186258578546], > (-3306169730424140596,-3303439264231406101], > (5228704360821395206,5242415853745535023], > (5808045095951939338,5808562658315740708], > (-3303439264231406101,-3302592736123212969]] finished (progress: 1%) > [2018-05-25 19:27:23,848] Repair session 0180f980-6050-11e8-a3b7-9d0793eab507 > for range [(-8495158945319933291,-8482949618583319581], > (1803296697741516342,1805330812863783941], > (8633191319643427141,8637771071728131257], > (2214097236323810344,2218253238829661319], > (8637771071728131257,8639627594735133685], > (2195525904029414718,2214097236323810344], > (-8500127431270773970,-8495158945319933291], > (7151693083782264341,7152162989417914407], > (-8482949618583319581,-8481973749935314249]] finished (progress: 1%) > [2018-05-25 
19:30:32,590] Repair session 01ac9d62-6050-11e8-a3b7-9d0793eab507 > for range [(7887346492105510731,7893062759268864220], > (-153277717939330979,-151986584968539220], > (-6351665356961460262,-6336288442758847669], > (7881942012672602731,7887346492105510731], > (-5884528383037906783,-5878097817437987368], > (6054625594262089428,6060773114960761336], > (-6354401100436622515,-6351665356961460262], > (3358411934943460772,336336663817876], > (6255644242745576360,6278718135193665575], > (-6321106762570843270,-6316788220143151823], > (1754319239259058661,1759314644652031521], > (7893062759268864220,7894890594190784729], > (-8012293411840276426,-8011781808288431224]] failed with error [repair > #01ac9d62-6050-11e8-a3b7-9d0793eab507 on keyspace/table, > [(7887346492105510731,7893062759268864220], > (-153277717939330979,-151986584968539220], > (-6351665356961460262,-6336288442758847669], > (7881942012672602731,7887346492105510731], > (-5884528383037906783,-5878097817437987368], > (6054625594262089428,6060773114960761336], > (-6354401100436622515,-6351665356961460262], > (3358411934943460772,336336663817876], > (6255644242745576360,6278718135193665575], > (-6321106762570843270,-6316788220143151823], > (1754319239259058661,1759314644652031521], > (7893062759268864220,7894890594190784729], > (-8012293411840276426,-8011781808288431224]]] Validation failed in > /192.168.8.64 (progress: 1%) > [2018-05-25 19:30:38,744] Repair session 01ab16c1-6050-11e8-a3b7-9d0793eab507 > for range [(4474598255414218354,4477186372547790770], > (-8368931070988054567,-8367389908801757978], > (4445104759712094068,4445123832517144036], > (6749641233379918040,6749879473217708908], > (717627050679001698,729408043324000761], > (8984622403893999385,8990662643404904110], > (4457612694557846994,4474598255414218354], > (5589049422573545528,5593079877787783784], > (3609693317839644945,3613727999875360405], > (8499016262183246473,8504603366117127178], > (-5421277973540712245,-5417725796037372830], > 
(5586405751301680690,5589049422573545528], > (-2611069890590917549,-2603911539353128123], > (2424772330724108233,2427564448454334730], > (3172651438220766183,3175226710613527829], > (4445123832517144036,4457612694557846994], > (-6827531712183440570,-6800863837312326365], > (5593079877787783784,5596020904874304252], > (716705770783505310,717627050679001698], > (115377252345874298,119626359210683992], > (239394377432130766,240250561347730054]] failed with error [repair > #01ab16c1-6050-11e8-a3b7-9d0793eab507 on keyspace/table, > [(4474598255414218354,4477186372547790770], > (-8368931070988054567,-8367389908801757978], > (4445104759712094068,4445123832517144036], > (6749641233379918040,6749879473217708908], > (717627050679001698,7294080433
[jira] [Commented] (CASSANDRA-14527) Real time Bad query logging framework
[ https://issues.apache.org/jira/browse/CASSANDRA-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515781#comment-16515781 ] Stefan Podkowinski commented on CASSANDRA-14527: I like the idea of making operational issues more visible to the user. But the intention to create a common framework for logging certain messages, doesn't sound very convincing to me. At such point, I'm always asking myself, what kind of problem we precisely want to solve. What's the value in adding abstraction layers around producing log messages for very specific, predefined conditions? I'd also recommend to first take any such ideas to the dev mailing list before spending time implementing such changes. The Google docs hosted proposal gives a nice overview on the ideas behind this and would have been a good way to start a discussion and get some early feedback.
[jira] [Updated] (CASSANDRA-14527) Real time Bad query logging framework
[ https://issues.apache.org/jira/browse/CASSANDRA-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-14527: --- Fix Version/s: (was: 4.0) 4.x
[jira] [Commented] (CASSANDRA-14471) Manage audit whitelists with CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515729#comment-16515729 ] Per Otterström commented on CASSANDRA-14471: Thanks for your feedback. bq. Trying to treat audit logging in a similar way to the roles/permissions seems wrong to me. The 2 have completely different logic and should be clearly separated at the code level. To me there is a clear connection since Auditing (or Accounting) is often related to Authentication and Authorization - ([AAA|https://en.wikipedia.org/wiki/AAA_(computer_security)]). However, to make sure it is working as a stand-alone feature for other use cases we could persist things in a separate keyspace, and keep it separate at the code level. bq. The MUTE/UNMUTE syntax is in my opinion confusing. I'm not really happy with {{MUTE/UNMUTE}} either, but it's what I've come up with so far. I'm open to better suggestions. bq. It is also limited because it only allow you to do a white list approach. Actually, that was intentional. IMO it is hard to get an overview of the result when setting up filters using the current mix of includes/excludes on keyspaces/categories/users. Having everything included by default, and then whitelisting selected parts, seems much more straightforward. Also this fits the primary purpose of audits well. IMO everything should be accounted for, unless explicitly excluded. bq. The approach that you would like to use for persisting the whitelists is risky. If one of your node cannot get the data from the configuration table it might be unable to start (this problem is an existing one with the current auth framework). There is no reason to prevent it from starting up just because data is not accessible at startup. At worst the filter component would not be able to load whitelists from the database, which would lead to unwanted audit records. From an AAA perspective this is a reasonable fallback solution. 
Considering the caching solution we have in place for the auth framework, is this a problem for users in production? bq. I would rather prefer a syntax similar to what Transact-SQL is doing. Interesting reference! So from what I can understand the administrator can do {{CREATE SERVER AUDIT}}, which roughly corresponds to the settings we have for {{logger}} and {{audit_logs_dir}} in the yaml file, and then {{CREATE DATABASE AUDIT SPECIFICATION}}, which corresponds to the filtering part that we're addressing in this ticket. Here is a simple but expressive example I've found (slightly modified):
{{CREATE DATABASE AUDIT SPECIFICATION Audit_Pay_Tables}}
{{FOR SERVER AUDIT Payrole_Security_Audit}}
{{ ADD (SELECT , INSERT ON HumanResources.EmployeePayHistory BY dbo )}}
{{ ,ADD (SELECT , INSERT ON HumanResources.EmployeeSalary BY dbo )}}
{{WITH (STATE = ON) ;}}
Basically, this syntax makes it possible to group conditions together under a label and associate that with a "backend". One thing I don't like with this approach is the fact that everything is exempt from audit logging until the administrator explicitly creates an audit specification for the operation/resource/role combination. > Manage audit whitelists with CQL > > > Key: CASSANDRA-14471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14471 > Project: Cassandra > Issue Type: Improvement >Reporter: Per Otterström >Priority: Major > Labels: audit, security > Fix For: 4.0 > > > Since CASSANDRA-12151 is merged we have support for audit logs in Cassandra. > With this ticket I want to explore the idea of managing audit whitelists > using CQL. > I can think of a few different benefits compared to current yaml-based > whitelist/blacklist approach. > * Nodes would always be aligned - no risk that node configuration go out of > sync as tables are added and whitelists updated. > * Easier to manage whitelists in large clusters - change in one place and > apply cluster wide. 
> * Changes to the whitelists would be in the audit log itself.
[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted
[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515644#comment-16515644 ] Stefan Podkowinski commented on CASSANDRA-14423: I'd like to add for interested readers that full repairs on subranges (e.g. using reaper) will not be affected by this issue. In this case, "Not a global repair, will not do anticompaction" will occur in your logs. > SSTables stop being compacted > - > > Key: CASSANDRA-14423 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14423 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Kurt Greaves >Assignee: Kurt Greaves >Priority: Major > Fix For: 2.2.13, 3.0.17, 3.11.3 > > > So seeing a problem in 3.11.0 where SSTables are being lost from the view and > not being included in compactions/as candidates for compaction. It seems to > get progressively worse until there's only 1-2 SSTables in the view which > happen to be the most recent SSTables and thus compactions completely stop > for that table. > The SSTables seem to still be included in reads, just not compactions. > The issue can be fixed by restarting C*, as it will reload all SSTables into > the view, but this is only a temporary fix. User defined/major compactions > still work - not clear if they include the result back in the view but is not > a good work around. > This also results in a discrepancy between SSTable count and SSTables in > levels for any table using LCS. > {code:java} > Keyspace : xxx > Read Count: 57761088 > Read Latency: 0.10527088681224288 ms. > Write Count: 2513164 > Write Latency: 0.018211106398149903 ms. 
> Pending Flushes: 0 > Table: xxx > SSTable count: 10 > SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0] > Space used (live): 894498746 > Space used (total): 894498746 > Space used by snapshots (total): 0 > Off heap memory used (total): 11576197 > SSTable Compression Ratio: 0.6956629530569777 > Number of keys (estimate): 3562207 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 87 > Local read count: 57761088 > Local read latency: 0.108 ms > Local write count: 2513164 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 86.33 > Bloom filter false positives: 43 > Bloom filter false ratio: 0.0 > Bloom filter space used: 8046104 > Bloom filter off heap memory used: 8046024 > Index summary off heap memory used: 3449005 > Compression metadata off heap memory used: 81168 > Compacted partition minimum bytes: 104 > Compacted partition maximum bytes: 5722 > Compacted partition mean bytes: 175 > Average live cells per slice (last five minutes): 1.0 > Maximum live cells per slice (last five minutes): 1 > Average tombstones per slice (last five minutes): 1.0 > Maximum tombstones per slice (last five minutes): 1 > Dropped Mutations: 0 > {code} > Also for STCS we've confirmed that SSTable count will be different to the > number of SSTables reported in the Compaction Bucket's. In the below example > there's only 3 SSTables in a single bucket - no more are listed for this > table. Compaction thresholds haven't been modified for this table and it's a > very basic KV schema. > {code:java} > Keyspace : yyy > Read Count: 30485 > Read Latency: 0.06708991307200263 ms. > Write Count: 57044 > Write Latency: 0.02204061776873992 ms. 
> Pending Flushes: 0 > Table: yyy > SSTable count: 19 > Space used (live): 18195482 > Space used (total): 18195482 > Space used by snapshots (total): 0 > Off heap memory used (total): 747376 > SSTable Compression Ratio: 0.7607394576769735 > Number of keys (estimate): 116074 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 39 > Local read count: 30485 > Local read latency: NaN ms > Local write count: 57044 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 79.76 > Bloom filter false positives: 0 > Bloom filter false ratio: 0.0 > Bloom filter space used: 690912 > Bloom filter off heap memory used: 690760 > Index summary off heap memory used: 54736 > Compression metadata off heap memory used: 1880 > Compacted partition minimum bytes: 73 > Compacted partition maximum bytes: 124 > Compacted partition mean bytes: 96 > Average live cells per slice (last five minutes): NaN > Maximum live cells per slice (last five minutes): 0 > Average tombstones per slice (last five minutes): NaN > Maximum tombstones per slice (last five minutes): 0 > Dropped Mut
[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted
[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515628#comment-16515628 ] Stefan Podkowinski commented on CASSANDRA-14423: Can we move the status check into {{performAnticompaction}} by adding already repaired sstables to {{nonAnticompacting}}? I think filtering there would be more coherent, given we also create a corresponding log message and use the same code path for canceling/releasing such sstables. We also keep updating repairedAt this way, in case of fully contained sstables (and the triggered event notification related to that). > SSTables stop being compacted > - > > Key: CASSANDRA-14423 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14423 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Kurt Greaves >Assignee: Kurt Greaves >Priority: Major > Fix For: 2.2.13, 3.0.17, 3.11.3 > > > So seeing a problem in 3.11.0 where SSTables are being lost from the view and > not being included in compactions/as candidates for compaction. It seems to > get progressively worse until there's only 1-2 SSTables in the view which > happen to be the most recent SSTables and thus compactions completely stop > for that table. > The SSTables seem to still be included in reads, just not compactions. > The issue can be fixed by restarting C*, as it will reload all SSTables into > the view, but this is only a temporary fix. User defined/major compactions > still work - not clear if they include the result back in the view but is not > a good work around. > This also results in a discrepancy between SSTable count and SSTables in > levels for any table using LCS. > {code:java} > Keyspace : xxx > Read Count: 57761088 > Read Latency: 0.10527088681224288 ms. > Write Count: 2513164 > Write Latency: 0.018211106398149903 ms. 
> Pending Flushes: 0 > Table: xxx > SSTable count: 10 > SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0] > Space used (live): 894498746 > Space used (total): 894498746 > Space used by snapshots (total): 0 > Off heap memory used (total): 11576197 > SSTable Compression Ratio: 0.6956629530569777 > Number of keys (estimate): 3562207 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 87 > Local read count: 57761088 > Local read latency: 0.108 ms > Local write count: 2513164 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 86.33 > Bloom filter false positives: 43 > Bloom filter false ratio: 0.0 > Bloom filter space used: 8046104 > Bloom filter off heap memory used: 8046024 > Index summary off heap memory used: 3449005 > Compression metadata off heap memory used: 81168 > Compacted partition minimum bytes: 104 > Compacted partition maximum bytes: 5722 > Compacted partition mean bytes: 175 > Average live cells per slice (last five minutes): 1.0 > Maximum live cells per slice (last five minutes): 1 > Average tombstones per slice (last five minutes): 1.0 > Maximum tombstones per slice (last five minutes): 1 > Dropped Mutations: 0 > {code} > Also for STCS we've confirmed that SSTable count will be different to the > number of SSTables reported in the Compaction Bucket's. In the below example > there's only 3 SSTables in a single bucket - no more are listed for this > table. Compaction thresholds haven't been modified for this table and it's a > very basic KV schema. > {code:java} > Keyspace : yyy > Read Count: 30485 > Read Latency: 0.06708991307200263 ms. > Write Count: 57044 > Write Latency: 0.02204061776873992 ms. 
> Pending Flushes: 0 > Table: yyy > SSTable count: 19 > Space used (live): 18195482 > Space used (total): 18195482 > Space used by snapshots (total): 0 > Off heap memory used (total): 747376 > SSTable Compression Ratio: 0.7607394576769735 > Number of keys (estimate): 116074 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 39 > Local read count: 30485 > Local read latency: NaN ms > Local write count: 57044 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 79.76 > Bloom filter false positives: 0 > Bloom filter false ratio: 0.0 > Bloom filter space used: 690912 > Bloom filter off heap memory used: 690760 > Index summary off heap memory used: 54736 > Compression metadata off heap memory used: 1880 > Compacted partition minimum bytes: 73 > Compacted partition maximum bytes: 124 > Compacted partition mean bytes: 96 > Average live cells per slice (last five minutes): NaN >
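The filtering Stefan proposes above — diverting already-repaired sstables into {{nonAnticompacting}} inside {{performAnticompaction}} — can be sketched roughly like this. Note this is a deliberately simplified, hypothetical model for illustration only: the `SSTable` class, its `repairedAt` field, and the method names below are stand-ins, not Cassandra's actual compaction API.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for an sstable carrying repair metadata.
class SSTable {
    final String name;
    final long repairedAt; // 0 means unrepaired

    SSTable(String name, long repairedAt) {
        this.name = name;
        this.repairedAt = repairedAt;
    }

    boolean isRepaired() {
        return repairedAt > 0;
    }
}

public class AnticompactionFilter {
    /**
     * Partition candidates the way the comment suggests performAnticompaction
     * could: sstables already marked repaired are diverted into
     * nonAnticompacting (to be logged and released), and only the rest
     * proceed to anticompaction.
     */
    public static List<SSTable> filterAlreadyRepaired(List<SSTable> candidates,
                                                      List<SSTable> nonAnticompacting) {
        List<SSTable> toAnticompact = new ArrayList<>();
        for (SSTable s : candidates) {
            if (s.isRepaired())
                nonAnticompacting.add(s); // skip: already repaired
            else
                toAnticompact.add(s);
        }
        return toAnticompact;
    }

    public static void main(String[] args) {
        List<SSTable> nonAnticompacting = new ArrayList<>();
        List<SSTable> candidates = List.of(new SSTable("a", 0), new SSTable("b", 12345));
        List<SSTable> result = filterAlreadyRepaired(candidates, nonAnticompacting);
        System.out.println(result.size() + " to anticompact, "
                + nonAnticompacting.size() + " skipped as already repaired");
    }
}
```

Filtering at this single point would, as the comment argues, keep the log message, the release of the sstables, and the repairedAt bookkeeping on one code path.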
[jira] [Comment Edited] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515592#comment-16515592 ] Kurt Greaves edited comment on CASSANDRA-14525 at 6/18/18 11:15 AM: We've already had a ticket (and a _very_ similar patch) for this since November last year... CASSANDRA-14063 was (Author: kurtg): We've already had a ticket (and a _very_ similar patch( for this since November last year... CASSANDRA-14063 > streaming failure during bootstrap makes new node into inconsistent state > - > > Key: CASSANDRA-14525 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14525 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Major > Fix For: 4.0, 2.2.x, 3.0.x > > > If bootstrap fails for newly joining node (most common reason is due to > streaming failure) then Cassandra state remains in {{joining}} state which is > fine but Cassandra also enables Native transport which makes overall state > inconsistent. 
This further creates NullPointer exception if auth is enabled > on the new node, please find reproducible steps here: > For example if bootstrap fails due to streaming errors like > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) > [apache-cassandra-3.0.16.jar:3.0.16] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > ~[guava-18.0.jar:na] > at > 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121] > {quote} > then variable [StorageService.java::dataAvailable > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892] > will be {{false}}. Since {{dataAvailable}} is {{false}} hence it will not > call [StorageService.java::finishJoiningRing > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/o
[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515592#comment-16515592 ] Kurt Greaves commented on CASSANDRA-14525: -- We've already had a ticket (and a _very_ similar patch) for this since November last year... CASSANDRA-14063 > streaming failure during bootstrap makes new node into inconsistent state > - > > Key: CASSANDRA-14525 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14525 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Major > Fix For: 4.0, 2.2.x, 3.0.x > > > If bootstrap fails for newly joining node (most common reason is due to > streaming failure) then Cassandra state remains in {{joining}} state which is > fine but Cassandra also enables Native transport which makes overall state > inconsistent. This further creates NullPointer exception if auth is enabled > on the new node, please find reproducible steps here: > For example if bootstrap fails due to streaming errors like > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) > [apache-cassandra-3.0.16.jar:3.0.16] > at > 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) > [apache-cassandra-3.0.16.jar:3.0.16] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121] > {quote} > then variable [StorageService.java::dataAvailable > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892] > will be {{false}}. Since {{dataAvailable}} is {{false}} hence it will not > call [StorageService.java::finishJoiningRing > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933] > and as a result > [StorageService.java::doAuthSetup|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/
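The failure mode the ticket describes can be reduced to a small, self-contained control-flow sketch. This is a deliberate simplification for illustration: the real {{joinTokenRing}} does far more, and the flag and method names below merely mirror the ones the description links to.

```java
public class BootstrapFlow {
    boolean joined = false;
    boolean authSetup = false;
    boolean nativeTransportEnabled = false;

    /**
     * Simplified model of joinTokenRing: when streaming fails, dataAvailable
     * is false, so finishJoiningRing (and with it doAuthSetup) is skipped --
     * yet native transport is enabled regardless, which is the inconsistent
     * state this ticket reports.
     */
    void joinTokenRing(boolean dataAvailable) {
        if (dataAvailable)
            finishJoiningRing();
        // The bug: transport comes up even when the node never finished joining.
        nativeTransportEnabled = true;
    }

    void finishJoiningRing() {
        joined = true;
        doAuthSetup();
    }

    void doAuthSetup() {
        authSetup = true;
    }

    public static void main(String[] args) {
        BootstrapFlow node = new BootstrapFlow();
        node.joinTokenRing(false); // streaming failed -> dataAvailable == false
        System.out.println("joined=" + node.joined
                + " authSetup=" + node.authSetup
                + " nativeTransport=" + node.nativeTransportEnabled);
    }
}
```

With auth never set up but the native transport accepting connections, any client request needing auth hits uninitialized state, which matches the NullPointerException the reporter mentions.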
[jira] [Commented] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption
[ https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515586#comment-16515586 ] Jason Brown commented on CASSANDRA-10735: - [~jahar.tyagi] I think you are having a client-side problem, and not on the server. This ticket describes functionality going into the server-side database for 4.0. You should probably contact the user@ ML for help. > Support netty openssl (netty-tcnative) for client encryption > > > Key: CASSANDRA-10735 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10735 > Project: Cassandra > Issue Type: Improvement >Reporter: Andy Tolbert >Assignee: Jason Brown >Priority: Major > Fix For: 4.0 > > Attachments: netty-ssl-trunk.tgz, nettyssl-bench.tgz, > nettysslbench.png, nettysslbench_small.png, sslbench12-03.png > > > The java-driver recently added support for using netty openssl via > [netty-tcnative|http://netty.io/wiki/forked-tomcat-native.html] in > [JAVA-841|https://datastax-oss.atlassian.net/browse/JAVA-841], this shows a > very measured improvement (numbers incoming on that ticket). It seems > likely that this can offer improvement if implemented C* side as well. > Since netty-tcnative has platform specific requirements, this should not be > made the default, but rather be an option that one can use. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14356) LWTs keep failing in trunk after immutable refactor
[ https://issues.apache.org/jira/browse/CASSANDRA-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515556#comment-16515556 ] mck commented on CASSANDRA-14356: - Committed as 717c108374 > LWTs keep failing in trunk after immutable refactor > --- > > Key: CASSANDRA-14356 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14356 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: OpenJDK Runtime Environment (build 1.8.0_161-b14), > Cassandra 4.0 commit c22ee2bd451d030e99cfb65be839bbc735a5352f (29.3.2018 > 14:01) >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Labels: LWT > Fix For: 4.0 > > Attachments: CASSANDRA-14356.diff > > > In the PaxosState, the original assert check is in the form of: > assert promised.update.metadata() == accepted.update.metadata() && > accepted.update.metadata() == mostRecentCommit.update.metadata(); > However, after the change to make TableMetadata immutable this no longer > works as these instances are not necessarily the same (if ever). This causes > the LWTs to fail although they're still correctly targeting the same table. > From IRC: > It's a bug alright. Though really, the assertion should be on the > metadata ids, cause TableMetadata#equals does more than what we want. > That is, replacing by .equals() is not ok. That would throw > on any change to a table metadata, while the spirit of the assumption was to > sanity check both updates were on the same table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Fix assertions in PaxosState and PrepareResponse after TableMetadata was made immutable
Repository: cassandra Updated Branches: refs/heads/trunk 255242237 -> 717c10837 Fix assertions in PaxosState and PrepareResponse after TableMetadata was made immutable Patch by Michael Burman; reviewed by Mick Semb Wever for CASSANDRA-14356 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/717c1083 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/717c1083 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/717c1083 Branch: refs/heads/trunk Commit: 717c108374a56897d10fcad41fe82b43e2192648 Parents: 2552422 Author: Mick Semb Wever Authored: Sun Jun 17 14:29:00 2018 +1000 Committer: Mick Semb Wever Committed: Mon Jun 18 20:03:27 2018 +1000 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/service/paxos/PaxosState.java | 2 +- src/java/org/apache/cassandra/service/paxos/PrepareResponse.java | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/717c1083/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 4ea32c9..fd236a2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Fix assertions in LWTs after TableMetadata was made immutable (CASSANDRA-14356) * Abort compactions quicker (CASSANDRA-14397) * Support light-weight transactions in cassandra-stress (CASSANDRA-13529) * Make AsyncOneResponse use the correct timeout (CASSANDRA-14509) http://git-wip-us.apache.org/repos/asf/cassandra/blob/717c1083/src/java/org/apache/cassandra/service/paxos/PaxosState.java -- diff --git a/src/java/org/apache/cassandra/service/paxos/PaxosState.java b/src/java/org/apache/cassandra/service/paxos/PaxosState.java index 7d59374..6e02435 100644 --- a/src/java/org/apache/cassandra/service/paxos/PaxosState.java +++ b/src/java/org/apache/cassandra/service/paxos/PaxosState.java @@ -46,7 +46,7 @@ public class PaxosState public PaxosState(Commit promised, Commit accepted, Commit mostRecentCommit) { assert 
promised.update.partitionKey().equals(accepted.update.partitionKey()) && accepted.update.partitionKey().equals(mostRecentCommit.update.partitionKey()); -assert promised.update.metadata() == accepted.update.metadata() && accepted.update.metadata() == mostRecentCommit.update.metadata(); +assert promised.update.metadata().id.equals(accepted.update.metadata().id) && accepted.update.metadata().id.equals(mostRecentCommit.update.metadata().id); this.promised = promised; this.accepted = accepted; http://git-wip-us.apache.org/repos/asf/cassandra/blob/717c1083/src/java/org/apache/cassandra/service/paxos/PrepareResponse.java -- diff --git a/src/java/org/apache/cassandra/service/paxos/PrepareResponse.java b/src/java/org/apache/cassandra/service/paxos/PrepareResponse.java index 2110dd7..4c7becc 100644 --- a/src/java/org/apache/cassandra/service/paxos/PrepareResponse.java +++ b/src/java/org/apache/cassandra/service/paxos/PrepareResponse.java @@ -45,7 +45,7 @@ public class PrepareResponse public PrepareResponse(boolean promised, Commit inProgressCommit, Commit mostRecentCommit) { assert inProgressCommit.update.partitionKey().equals(mostRecentCommit.update.partitionKey()); -assert inProgressCommit.update.metadata() == mostRecentCommit.update.metadata(); +assert inProgressCommit.update.metadata().id.equals(mostRecentCommit.update.metadata().id); this.promised = promised; this.mostRecentCommit = mostRecentCommit; - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
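The one-line fixes above swap reference equality for id equality. A stand-alone illustration of why that matters after the immutable refactor — note this `TableMetadata` is a simplified stand-in for Cassandra's class, not the real one:

```java
import java.util.UUID;

// Stand-in for an immutable table descriptor identified by a stable id.
final class TableMetadata {
    final UUID id;
    final String comment; // an attribute that can differ between snapshots

    TableMetadata(UUID id, String comment) {
        this.id = id;
        this.comment = comment;
    }
}

public class MetadataAssertDemo {
    public static void main(String[] args) {
        UUID tableId = UUID.randomUUID();
        // Two snapshots of the same table, e.g. captured before and after an ALTER:
        // immutability means each schema change produces a fresh instance.
        TableMetadata promised = new TableMetadata(tableId, "v1");
        TableMetadata accepted = new TableMetadata(tableId, "v2");

        // Reference equality fails: distinct immutable instances.
        System.out.println("same object: " + (promised == accepted));          // false
        // Comparing stable ids still verifies both updates target one table,
        // without rejecting updates that merely saw different schema versions.
        System.out.println("same table:  " + promised.id.equals(accepted.id)); // true
    }
}
```

This is also why, per the IRC quote in the ticket, a full `.equals()` on the metadata would be too strict: it would trip the assertion on any schema change, while the id check only verifies both commits refer to the same table.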
[jira] [Updated] (CASSANDRA-14356) LWTs keep failing in trunk after immutable refactor
[ https://issues.apache.org/jira/browse/CASSANDRA-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14356: Resolution: Fixed Status: Resolved (was: Testing) > LWTs keep failing in trunk after immutable refactor > --- > > Key: CASSANDRA-14356 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14356 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: OpenJDK Runtime Environment (build 1.8.0_161-b14), > Cassandra 4.0 commit c22ee2bd451d030e99cfb65be839bbc735a5352f (29.3.2018 > 14:01) >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Labels: LWT > Fix For: 4.0 > > Attachments: CASSANDRA-14356.diff > > > In the PaxosState, the original assert check is in the form of: > assert promised.update.metadata() == accepted.update.metadata() && > accepted.update.metadata() == mostRecentCommit.update.metadata(); > However, after the change to make TableMetadata immutable this no longer > works as these instances are not necessarily the same (if ever). This causes > the LWTs to fail although they're still correctly targeting the same table. > From IRC: > It's a bug alright. Though really, the assertion should be on the > metadata ids, cause TableMetadata#equals does more than what we want. > That is, replacing by .equals() is not ok. That would throw > on any change to a table metadata, while the spirit of the assumption was to > sanity check both updates were on the same table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14356) LWTs keep failing in trunk after immutable refactor
[ https://issues.apache.org/jira/browse/CASSANDRA-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515536#comment-16515536 ] Aleksey Yeschenko commented on CASSANDRA-14356: --- [~michaelsembwever] Change LGTM, ship it (: > LWTs keep failing in trunk after immutable refactor > --- > > Key: CASSANDRA-14356 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14356 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: OpenJDK Runtime Environment (build 1.8.0_161-b14), > Cassandra 4.0 commit c22ee2bd451d030e99cfb65be839bbc735a5352f (29.3.2018 > 14:01) >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Labels: LWT > Fix For: 4.0 > > Attachments: CASSANDRA-14356.diff > > > In the PaxosState, the original assert check is in the form of: > assert promised.update.metadata() == accepted.update.metadata() && > accepted.update.metadata() == mostRecentCommit.update.metadata(); > However, after the change to make TableMetadata immutable this no longer > works as these instances are not necessarily the same (if ever). This causes > the LWTs to fail although they're still correctly targeting the same table. > From IRC: > It's a bug alright. Though really, the assertion should be on the > metadata ids, cause TableMetadata#equals does more than what we want. > That is, replacing by .equals() is not ok. That would throw > on any change to a table metadata, while the spirit of the assumption was to > sanity check both updates were on the same table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted
[ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-14423: --- Reproduced In: 3.11.2, 3.11.0 (was: 3.11.0, 3.11.2) Reviewer: Stefan Podkowinski > SSTables stop being compacted > - > > Key: CASSANDRA-14423 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14423 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Kurt Greaves >Assignee: Kurt Greaves >Priority: Major > Fix For: 2.2.13, 3.0.17, 3.11.3 > > > So seeing a problem in 3.11.0 where SSTables are being lost from the view and > not being included in compactions/as candidates for compaction. It seems to > get progressively worse until there's only 1-2 SSTables in the view which > happen to be the most recent SSTables and thus compactions completely stop > for that table. > The SSTables seem to still be included in reads, just not compactions. > The issue can be fixed by restarting C*, as it will reload all SSTables into > the view, but this is only a temporary fix. User defined/major compactions > still work - not clear if they include the result back in the view but is not > a good work around. > This also results in a discrepancy between SSTable count and SSTables in > levels for any table using LCS. > {code:java} > Keyspace : xxx > Read Count: 57761088 > Read Latency: 0.10527088681224288 ms. > Write Count: 2513164 > Write Latency: 0.018211106398149903 ms. 
> Pending Flushes: 0 > Table: xxx > SSTable count: 10 > SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0] > Space used (live): 894498746 > Space used (total): 894498746 > Space used by snapshots (total): 0 > Off heap memory used (total): 11576197 > SSTable Compression Ratio: 0.6956629530569777 > Number of keys (estimate): 3562207 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 87 > Local read count: 57761088 > Local read latency: 0.108 ms > Local write count: 2513164 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 86.33 > Bloom filter false positives: 43 > Bloom filter false ratio: 0.0 > Bloom filter space used: 8046104 > Bloom filter off heap memory used: 8046024 > Index summary off heap memory used: 3449005 > Compression metadata off heap memory used: 81168 > Compacted partition minimum bytes: 104 > Compacted partition maximum bytes: 5722 > Compacted partition mean bytes: 175 > Average live cells per slice (last five minutes): 1.0 > Maximum live cells per slice (last five minutes): 1 > Average tombstones per slice (last five minutes): 1.0 > Maximum tombstones per slice (last five minutes): 1 > Dropped Mutations: 0 > {code} > Also for STCS we've confirmed that SSTable count will be different to the > number of SSTables reported in the Compaction Bucket's. In the below example > there's only 3 SSTables in a single bucket - no more are listed for this > table. Compaction thresholds haven't been modified for this table and it's a > very basic KV schema. > {code:java} > Keyspace : yyy > Read Count: 30485 > Read Latency: 0.06708991307200263 ms. > Write Count: 57044 > Write Latency: 0.02204061776873992 ms. 
> Pending Flushes: 0 > Table: yyy > SSTable count: 19 > Space used (live): 18195482 > Space used (total): 18195482 > Space used by snapshots (total): 0 > Off heap memory used (total): 747376 > SSTable Compression Ratio: 0.7607394576769735 > Number of keys (estimate): 116074 > Memtable cell count: 0 > Memtable data size: 0 > Memtable off heap memory used: 0 > Memtable switch count: 39 > Local read count: 30485 > Local read latency: NaN ms > Local write count: 57044 > Local write latency: NaN ms > Pending flushes: 0 > Percent repaired: 79.76 > Bloom filter false positives: 0 > Bloom filter false ratio: 0.0 > Bloom filter space used: 690912 > Bloom filter off heap memory used: 690760 > Index summary off heap memory used: 54736 > Compression metadata off heap memory used: 1880 > Compacted partition minimum bytes: 73 > Compacted partition maximum bytes: 124 > Compacted partition mean bytes: 96 > Average live cells per slice (last five minutes): NaN > Maximum live cells per slice (last five minutes): 0 > Average tombstones per slice (last five minutes): NaN > Maximum tombstones per slice (last five minutes): 0 > Dropped Mutations: 0 > {code} > {code:java} > Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy > Compaction buckets are > [[BigTableReader(path='/var/lib/cassa
[jira] [Created] (CASSANDRA-14528) Provide stacktraces for various error logs
Stefan Podkowinski created CASSANDRA-14528: -- Summary: Provide stacktraces for various error logs Key: CASSANDRA-14528 URL: https://issues.apache.org/jira/browse/CASSANDRA-14528 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Assignee: Stefan Podkowinski Fix For: 4.x We should reintroduce some stack traces that have gone missing since CASSANDRA-13723 (ba87ab4e954ad2). The cleanest way would probably be to use {{String.format}} for any custom messages, e.g. {{logger.error(String.format("Error using param %s", param), e)}}, so we make this more implicit and robust for upcoming API changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
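A note on the pattern proposed above: {{String.format}} takes {{%s}}/{{%d}} specifiers (SLF4J-style {{{}}} placeholders are not interpreted by it), and the throwable must be passed as a separate, final argument for the logger to print the stack trace. A minimal sketch of the idea using {{java.util.logging}} as a stand-in for Cassandra's actual logging setup:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class StacktraceLogging {
    private static final Logger logger = Logger.getLogger(StacktraceLogging.class.getName());

    /** Build the custom message eagerly with String.format (%s specifiers),
     *  keeping the throwable out of the message entirely. */
    static String format(String template, Object param) {
        return String.format(template, param);
    }

    public static void main(String[] args) {
        Exception e = new IllegalStateException("boom");
        String msg = format("Error using param %s", "some_setting"); // hypothetical param name
        // Passing the throwable as its own argument (not baked into the message)
        // is what makes the logger emit the full stack trace.
        logger.log(Level.SEVERE, msg, e);
    }
}
```

Formatting the message separately from the throwable keeps the two concerns independent, which is what makes the pattern robust to future logging-API changes.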
[jira] [Commented] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption
[ https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515426#comment-16515426 ] jahar commented on CASSANDRA-10735: --- Hi, I just followed the instructions given on [https://docs.datastax.com/en/developer/java-driver/3.0/manual/ssl/] to use NettySSLOptions, but am getting _com.datastax.driver.core.exceptions.NoHostAvailableException_. My .crt, private key, and certificates are fine, as I have verified them using OpenSSL. I have tried a lot but am not able to find the root cause. JdkSSLOptions works fine, but when I use the Netty SSLOptions it fails. This is what I am using in code:

{code:java}
KeyStore ks = KeyStore.getInstance("JKS");
trustStore = new FileInputStream(theTrustStorePath);
ks.load(trustStore, theTrustStorePassword.toCharArray());
TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
tmf.init(ks);
SslContextBuilder builder =
    SslContextBuilder.forClient()
                     .sslProvider(SslProvider.OPENSSL)
                     .trustManager(tmf)
                     .ciphers(theCipherSuites)
                     .keyManager(new File("mycert.pem"), new File("mykey.pem"));
SSLOptions sslOptions = new NettySSLOptions(builder.build());
return sslOptions;
{code}

The exception is thrown at:

{code:java}
mySession = myCluster.connect();
{code}

Any ideas or suggestions, please? 
> Support netty openssl (netty-tcnative) for client encryption > > > Key: CASSANDRA-10735 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10735 > Project: Cassandra > Issue Type: Improvement >Reporter: Andy Tolbert >Assignee: Jason Brown >Priority: Major > Fix For: 4.0 > > Attachments: netty-ssl-trunk.tgz, nettyssl-bench.tgz, > nettysslbench.png, nettysslbench_small.png, sslbench12-03.png > > > The java-driver recently added support for using netty openssl via > [netty-tcnative|http://netty.io/wiki/forked-tomcat-native.html] in > [JAVA-841|https://datastax-oss.atlassian.net/browse/JAVA-841], this shows a > very measured improvement (numbers incoming on that ticket). It seems > likely that this can offer improvement if implemented C* side as well. > Since netty-tcnative has platform specific requirements, this should not be > made the default, but rather be an option that one can use. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
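For anyone debugging a setup like the one in the comment above: the driver-specific pieces (NettySSLOptions, SslContextBuilder, the OPENSSL provider) need the DataStax driver plus netty-tcnative on the classpath and a reachable cluster, but the plain JSSE trust-manager portion of the snippet can be exercised standalone to rule it out. A minimal sketch — the empty in-memory keystore here is a stand-in for a real truststore file:

```java
import java.security.KeyStore;
import javax.net.ssl.TrustManagerFactory;

public class TrustSetup {
    /** Initialise a TrustManagerFactory from an already-loaded KeyStore,
     *  mirroring the JSSE half of the snippet in the comment above. */
    static TrustManagerFactory trustManagersFor(KeyStore ks) throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(ks);
        return tmf;
    }

    public static void main(String[] args) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        ks.load(null, null); // empty in-memory truststore, stands in for the real .jks file
        TrustManagerFactory tmf = trustManagersFor(ks);
        System.out.println("trust manager algorithm: " + tmf.getAlgorithm());
    }
}
```

If this part works against the real truststore, the problem is more likely in the netty-tcnative native library loading or the PEM key/cert pair than in the truststore itself.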