[jira] [Assigned] (CASSANDRA-12454) Unable to start on IPv6-only node with local JMX

2016-09-08 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-12454:
---

Assignee: Sam Tunnicliffe  (was: Aaron Ploetz)

> Unable to start on IPv6-only node with local JMX
> 
>
> Key: CASSANDRA-12454
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12454
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu Trusty, Oracle JDK 1.8.0_102-b14, IPv6-only host
>Reporter: Vadim Tsesko
>Assignee: Sam Tunnicliffe
> Fix For: 3.x
>
>
> A Cassandra node using the *default* configuration is unable to start on 
> an *IPv6-only* machine with the following error message:
> {code}
> ERROR [main] 2016-08-13 14:38:07,309 CassandraDaemon.java:731 - Bad URL path: 
> :0:0:0:0:0:1/jndi/rmi://0:0:0:0:0:0:0:1:7199/jmxrmi
> {code}
> The problem might be located in {{JMXServerUtils.createJMXServer()}} (I am 
> not sure, because there is no stack trace in {{system.log}}):
> {code:java}
> String urlTemplate = "service:jmx:rmi://%1$s/jndi/rmi://%1$s:%2$d/jmxrmi";
> ...
> String url = String.format(urlTemplate, (serverAddress != null ? 
> serverAddress.getHostAddress() : "0.0.0.0"), port);
> {code}
> IPv6 addresses must be surrounded by square brackets when passed to 
> {{JMXServiceURL}}.
> Disabling {{LOCAL_JMX}} mode in {{cassandra-env.sh}} (and enabling JMX 
> authentication) helps.
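
The bracket requirement described in the report can be illustrated with a minimal sketch. This is not the patch that resolved the ticket: the class `JmxUrlSketch` and its methods are hypothetical names, and only the URL template is taken from the snippet quoted above.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import javax.management.remote.JMXServiceURL;

public class JmxUrlSketch {
    // Template as quoted in the report.
    static final String URL_TEMPLATE = "service:jmx:rmi://%1$s/jndi/rmi://%1$s:%2$d/jmxrmi";

    // Wrap IPv6 literals in square brackets so the colons inside the
    // address cannot be mistaken for the host/port separator.
    static String hostForUrl(InetAddress address) {
        String host = address.getHostAddress();
        return (address instanceof Inet6Address) ? "[" + host + "]" : host;
    }

    // Hypothetical helper: accepts an IP literal such as "::1" or "127.0.0.1".
    static String buildUrl(String literalAddress, int port) {
        try {
            InetAddress address = InetAddress.getByName(literalAddress);
            return String.format(URL_TEMPLATE, hostForUrl(address), port);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a usable address: " + literalAddress, e);
        }
    }

    public static void main(String[] args) throws Exception {
        String url = buildUrl("::1", 7199);
        System.out.println(url);
        // Parsing the bracketed form should succeed, whereas the unbracketed
        // form is what produced "Bad URL path" in the log above.
        System.out.println(new JMXServiceURL(url));
    }
}
```

Checking `instanceof Inet6Address` rather than searching the string for a colon keeps the IPv4 path unchanged, which is the behaviour the existing template already relies on.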



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-12454) Unable to start on IPv6-only node with local JMX

2016-09-08 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-12454:
---

Assignee: Aaron Ploetz  (was: Sam Tunnicliffe)

> Unable to start on IPv6-only node with local JMX
> 
>
> Key: CASSANDRA-12454
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12454
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu Trusty, Oracle JDK 1.8.0_102-b14, IPv6-only host
>Reporter: Vadim Tsesko
>Assignee: Aaron Ploetz
> Fix For: 3.x
>
>
> A Cassandra node using the *default* configuration is unable to start on 
> an *IPv6-only* machine with the following error message:
> {code}
> ERROR [main] 2016-08-13 14:38:07,309 CassandraDaemon.java:731 - Bad URL path: 
> :0:0:0:0:0:1/jndi/rmi://0:0:0:0:0:0:0:1:7199/jmxrmi
> {code}
> The problem might be located in {{JMXServerUtils.createJMXServer()}} (I am 
> not sure, because there is no stack trace in {{system.log}}):
> {code:java}
> String urlTemplate = "service:jmx:rmi://%1$s/jndi/rmi://%1$s:%2$d/jmxrmi";
> ...
> String url = String.format(urlTemplate, (serverAddress != null ? 
> serverAddress.getHostAddress() : "0.0.0.0"), port);
> {code}
> IPv6 addresses must be surrounded by square brackets when passed to 
> {{JMXServiceURL}}.
> Disabling {{LOCAL_JMX}} mode in {{cassandra-env.sh}} (and enabling JMX 
> authentication) helps.





[jira] [Created] (CASSANDRA-12622) Snap package of Cassandra

2016-09-08 Thread Evan (JIRA)
Evan created CASSANDRA-12622:


 Summary: Snap package of Cassandra
 Key: CASSANDRA-12622
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12622
 Project: Cassandra
  Issue Type: New Feature
  Components: Packaging
Reporter: Evan
Priority: Minor
 Fix For: 3.10


Picking up the conversation from [1], I'd like to propose that Cassandra 
publish snap packages (http://snapcraft.io).

I've put together a patch:
https://github.com/apache/cassandra/compare/trunk...evandandrea:snap

This could be used to build and publish a snap on every commit to trunk [2, 3], 
or as a quicker way for developers to one-off build something more lightweight 
than a container for testing.

Alternatively, you could keep snap publication to released versions of 
Cassandra. Dependencies are bundled, so you would get to decide Oracle vs 
OpenJDK and the exact version. For the end user it would mean confidence that 
Cassandra with this bundled set of dependencies had been tested by the project. 
Uploads would instantly reach all of Ubuntu and a fair few other distributions 
without any changes [4], hopefully simplifying install instructions.

I couldn't find where the machinery for driving the Cassandra release process 
lives, but if someone can point me in the right direction I'd be happy to 
submit a patch for that.

1: https://www.mail-archive.com/dev@cassandra.apache.org/msg09216.html
2: Builds of trunk would be best published to the edge channel:
http://snapcraft.io/#snapcraft_home_using-snaps_channels
3: What automatic building and publishing could look like using Travis:
https://travis-ci.org/evandandrea/cassandra-snap/builds/158449135#L3937
4: http://snapcraft.io/docs/core/install





cassandra git commit: use HashCodeBuilder.toHashCode(), not HashCodeBuilder.hashCode()

2016-09-08 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk ea77d00bf -> f0b229afb


use HashCodeBuilder.toHashCode(), not HashCodeBuilder.hashCode()


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f0b229af
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f0b229af
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f0b229af

Branch: refs/heads/trunk
Commit: f0b229afbb8992c13055432db8146166e4c6a2ca
Parents: ea77d00
Author: Dave Brosius 
Authored: Thu Sep 8 21:08:37 2016 -0400
Committer: Dave Brosius 
Committed: Thu Sep 8 21:08:37 2016 -0400

--
 src/java/org/apache/cassandra/index/sasi/disk/RowKey.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f0b229af/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
--
diff --git a/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java b/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
index c0139d6..fc5a2c0 100644
--- a/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
+++ b/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
@@ -60,7 +60,7 @@ public class RowKey implements Comparable
 
 public int hashCode()
 {
-return new HashCodeBuilder().append(decoratedKey).append(clustering).hashCode();
+return new HashCodeBuilder().append(decoratedKey).append(clustering).toHashCode();
 }
 
 public int compareTo(RowKey other)



cassandra git commit: Cassandra uses commons.lang3, not commons.lang

2016-09-08 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk e73633cd8 -> ea77d00bf


Cassandra uses commons.lang3, not commons.lang


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ea77d00b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ea77d00b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ea77d00b

Branch: refs/heads/trunk
Commit: ea77d00bf05682972b0a60ed5f16ebc7a93ee649
Parents: e73633c
Author: Dave Brosius 
Authored: Thu Sep 8 19:34:41 2016 -0400
Committer: Dave Brosius 
Committed: Thu Sep 8 19:34:41 2016 -0400

--
 src/java/org/apache/cassandra/index/sasi/disk/RowKey.java | 2 +-
 src/java/org/apache/cassandra/index/sasi/disk/Token.java  | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ea77d00b/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
--
diff --git a/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java b/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
index 518ad27..c0139d6 100644
--- a/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
+++ b/src/java/org/apache/cassandra/index/sasi/disk/RowKey.java
@@ -21,7 +21,7 @@ package org.apache.cassandra.index.sasi.disk;
 import java.util.*;
 import java.util.stream.*;
 
-import org.apache.commons.lang.builder.HashCodeBuilder;
+import org.apache.commons.lang3.builder.HashCodeBuilder;
 
 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.*;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ea77d00b/src/java/org/apache/cassandra/index/sasi/disk/Token.java
--
diff --git a/src/java/org/apache/cassandra/index/sasi/disk/Token.java b/src/java/org/apache/cassandra/index/sasi/disk/Token.java
index 2412477..8ea864f 100644
--- a/src/java/org/apache/cassandra/index/sasi/disk/Token.java
+++ b/src/java/org/apache/cassandra/index/sasi/disk/Token.java
@@ -18,7 +18,6 @@
 package org.apache.cassandra.index.sasi.disk;
 
 import com.google.common.primitives.Longs;
-import org.apache.commons.lang.builder.HashCodeBuilder;
 
 import org.apache.cassandra.index.sasi.utils.*;
 



[jira] [Commented] (CASSANDRA-11031) Allow filtering on partition key columns for queries without secondary indexes

2016-09-08 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475002#comment-15475002
 ] 

Alex Petrov commented on CASSANDRA-11031:
-

Thank you for noticing that. I've removed one more lambda usage and rebased. 
I've put it into a separate commit to make it easier for you to review. Should 
I squash the commits together before it is committed?

> Allow filtering on partition key columns for queries without secondary indexes
> --
>
> Key: CASSANDRA-11031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11031
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Minor
> Fix For: 3.x
>
>
> Currently, ALLOW FILTERING only works for secondary index columns or 
> clustering columns. It is also slow, because Cassandra reads all data from 
> the SSTables on disk into memory in order to filter it.
> But we could support ALLOW FILTERING on the partition key: as far as I know, 
> partition keys are held in memory, so we can easily filter them and then read 
> only the required data from the SSTables.
> This would be similar to "SELECT * FROM table", which scans the entire cluster.
> CREATE TABLE multi_tenant_table (
>   tenant_id text,
>   pk2 text,
>   c1 text,
>   c2 text,
>   v1 text,
>   v2 text,
>   PRIMARY KEY ((tenant_id,pk2),c1,c2)
> ) ;
> Select * from multi_tenant_table where tenant_id = 'datastax' allow filtering;





[jira] [Commented] (CASSANDRA-11195) paging may returns incomplete results on small page size

2016-09-08 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474943#comment-15474943
 ] 

Sylvain Lebresne commented on CASSANDRA-11195:
--

+1

> paging may returns incomplete results on small page size
> 
>
> Key: CASSANDRA-11195
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11195
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>Assignee: Benjamin Lerer
>  Labels: dtest
> Attachments: allfiles.tar.gz, node1.log, node1_debug.log, node2.log, 
> node2_debug.log
>
>
> This was found through a flapping test, and running that test is still the 
> easiest way to repro the issue. On CI we're seeing a 40-50% failure rate, but 
> locally this test fails much less frequently.
> If I attach a python debugger and re-query the "bad" query, it continues to 
> return incomplete data indefinitely. If I go directly to cqlsh I can see all 
> rows just fine.





[jira] [Comment Edited] (CASSANDRA-12300) Disallow unset memtable_cleanup_threshold when flush writers is set

2016-09-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474551#comment-15474551
 ] 

Benedict edited comment on CASSANDRA-12300 at 9/8/16 8:42 PM:
--

Heh, funny how many places that advice has been declared awful (I apparently 
did so 15 months ago in CASSANDRA-9274) and yet it persists

... I guess I'm ultimately to blame though, since I reviewed the patch :(


was (Author: benedict):
Heh, funny how many places that advice has been declared awful (I apparently 
did so 15 months ago in CASSANDRA-9274) and yet it persists

> Disallow unset memtable_cleanup_threshold when flush writers is set
> ---
>
> Key: CASSANDRA-12300
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12300
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brandon Williams
>
> Many times I see flush writers set, and mct unset, leading to a very small 
> mct, which causes unneeded frequent flushing, and then of course compaction.  
> I also think the default is a bit conservative, typically ending up at 0.11, 
> where I'd say the majority of use cases only have one or two hot tables and 
> are much better served at 0.7 or 0.8.
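
For context, the 0.11 figure is consistent with the default documented in cassandra.yaml, 1 / (memtable_flush_writers + 1), when eight flush writers are in effect. The sketch below only reproduces that arithmetic; the class and method names are illustrative and are not Cassandra code.

```java
import java.util.Locale;

public class CleanupThresholdSketch {
    // Default memtable_cleanup_threshold as described in cassandra.yaml:
    // 1 / (memtable_flush_writers + 1).
    static double defaultCleanupThreshold(int flushWriters) {
        if (flushWriters < 1) {
            throw new IllegalArgumentException("flushWriters must be >= 1");
        }
        return 1.0 / (flushWriters + 1);
    }

    public static void main(String[] args) {
        // More flush writers means a smaller threshold, hence more frequent flushes.
        for (int writers : new int[] {1, 2, 4, 8}) {
            System.out.println(String.format(Locale.ROOT,
                    "memtable_flush_writers=%d -> threshold=%.2f",
                    writers, defaultCleanupThreshold(writers)));
        }
    }
}
```

With a single writer the derived threshold is 0.5, which is much closer to the 0.7 or 0.8 suggested above than the value obtained with eight writers.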





[jira] [Comment Edited] (CASSANDRA-8779) Add type code to binary query parameters in QUERY messages

2016-09-08 Thread Sandeep Tamhankar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474683#comment-15474683
 ] 

Sandeep Tamhankar edited comment on CASSANDRA-8779 at 9/8/16 6:47 PM:
--

This ticket has been around for a while; any timeframe/priority for when it 
will be completed? There is a Ruby driver ticket for the client side of this: 
https://datastax-oss.atlassian.net/browse/RUBY-112. If "not soon", we'll move 
the Ruby driver ticket out of the next release.


was (Author: stamhankar999):
This ticket has been around for a while; any timeframe/priority for when it 
will be completed? There is a Ruby driver ticket for the client side of this: 
https://datastax-oss.atlassian.net/browse/RUBY-112

> Add type code to binary query parameters in QUERY messages
> --
>
> Key: CASSANDRA-8779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8779
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
> Environment: Linux Mint 64-bit | ruby-driver 2.1 | java-driver 2.1 | 
> C* 2.1.2
>Reporter: Kishan Karunaratne
>  Labels: client-impacting, protocolv5
> Fix For: 3.x
>
>
> If I insert a tuple using an extra pair of ()'s, C* will let me do the 
> insert, but (incorrectly) creates a nested tuple as the first tuple value. 
> Upon doing a select statement, the result is jumbled and has weird binary in 
> it (which I wasn't able to copy into here).
> Example using ruby-driver:
> {noformat}
> session.execute("CREATE TABLE mytable (a int PRIMARY KEY, b 
> frozen>)")
> complete = Cassandra::Tuple.new('foo', 123, true)
> session.execute("INSERT INTO mytable (a, b) VALUES (0, (?))", arguments: 
> [complete])# extra ()'s here
> result = session.execute("SELECT b FROM mytable WHERE a=0").first
> p result['b']
> {noformat}
> Output:
> {noformat}
> #
> {noformat}
> Bug also confirmed using java-driver. 
> Example using java-driver:
> {noformat}
> session.execute("CREATE TABLE mytable (a int PRIMARY KEY, b 
> frozen>)");
> TupleType t = TupleType.of(DataType.ascii(), DataType.cint(), 
> DataType.cboolean());
> TupleValue complete = t.newValue("foo", 123, true);
> session.execute("INSERT INTO mytable (a, b) VALUES (0, (?))", complete); // 
> extra ()'s here
> TupleValue r = session.execute("SELECT b FROM mytable WHERE 
> a=0").one().getTupleValue("b");
> System.out.println(r);
> {noformat}
> Output:
> {noformat}
> ('foo{', null, null)
> {noformat}





[jira] [Commented] (CASSANDRA-8779) Add type code to binary query parameters in QUERY messages

2016-09-08 Thread Sandeep Tamhankar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474683#comment-15474683
 ] 

Sandeep Tamhankar commented on CASSANDRA-8779:
--

This ticket has been around for a while; any timeframe/priority for when it 
will be completed? There is a Ruby driver ticket for the client side of this: 
https://datastax-oss.atlassian.net/browse/RUBY-112

> Add type code to binary query parameters in QUERY messages
> --
>
> Key: CASSANDRA-8779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8779
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
> Environment: Linux Mint 64-bit | ruby-driver 2.1 | java-driver 2.1 | 
> C* 2.1.2
>Reporter: Kishan Karunaratne
>  Labels: client-impacting, protocolv5
> Fix For: 3.x
>
>
> If I insert a tuple using an extra pair of ()'s, C* will let me do the 
> insert, but (incorrectly) creates a nested tuple as the first tuple value. 
> Upon doing a select statement, the result is jumbled and has weird binary in 
> it (which I wasn't able to copy into here).
> Example using ruby-driver:
> {noformat}
> session.execute("CREATE TABLE mytable (a int PRIMARY KEY, b 
> frozen>)")
> complete = Cassandra::Tuple.new('foo', 123, true)
> session.execute("INSERT INTO mytable (a, b) VALUES (0, (?))", arguments: 
> [complete])# extra ()'s here
> result = session.execute("SELECT b FROM mytable WHERE a=0").first
> p result['b']
> {noformat}
> Output:
> {noformat}
> #
> {noformat}
> Bug also confirmed using java-driver. 
> Example using java-driver:
> {noformat}
> session.execute("CREATE TABLE mytable (a int PRIMARY KEY, b 
> frozen>)");
> TupleType t = TupleType.of(DataType.ascii(), DataType.cint(), 
> DataType.cboolean());
> TupleValue complete = t.newValue("foo", 123, true);
> session.execute("INSERT INTO mytable (a, b) VALUES (0, (?))", complete); // 
> extra ()'s here
> TupleValue r = session.execute("SELECT b FROM mytable WHERE 
> a=0").one().getTupleValue("b");
> System.out.println(r);
> {noformat}
> Output:
> {noformat}
> ('foo{', null, null)
> {noformat}





[jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

2016-09-08 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474662#comment-15474662
 ] 

Michael Kjellman commented on CASSANDRA-9754:
-

Over the past few days I've made some really great progress. I have the 2.1 
based implementation (as found at 
https://github.com/mkjellman/cassandra/commits/CASSANDRA-9754-2.1) in a 
temporary performance cluster running stably against cassandra-stress.

I found a few issues that I've been fixing as I find them while running the 
code under load:
 * Fix reading of non-birch indexes from SSTableScanner
 * Force un-mmapping of the current mmapped buffer from a PageAlignedReader 
before mmapping a new region
 * Fix alignTo() issues when using anything other than 4096 padding for indexes 
(e.g. 2048)
 * Make Birch/PageAligned Format padding length configurable 
(sstable_index_segment_padding_in_kb)
 * Fix signing issue when serializing and deserializing an unsigned short
 * Use a reusable buffer in PageAlignedWriter
 * Fix an issue where the index of the current subsegment was being used when 
the index of the current segment should have been used
 * Other minor cleanup, spelling nits, etc
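
Two of the fixes in the list above (alignment with a padding other than 4096, and the signing issue with unsigned shorts) can be sketched in isolation. This is a minimal illustration under assumptions, not the code from the CASSANDRA-9754 branch: `PageAlignSketch` and its method signatures are hypothetical, and the real `alignTo()` may look different.

```java
import java.nio.ByteBuffer;

public class PageAlignSketch {
    // Round a file position up to the next multiple of `boundary`.
    // Uses a remainder rather than a bit mask, so it does not depend on
    // the boundary being 4096 or any other particular value.
    static long alignTo(long position, int boundary) {
        if (boundary <= 0) {
            throw new IllegalArgumentException("boundary must be positive");
        }
        long remainder = position % boundary;
        return remainder == 0 ? position : position + (boundary - remainder);
    }

    // Store a value in the range 0..65535 in two bytes.
    static void writeUnsignedShort(ByteBuffer buffer, int value) {
        if (value < 0 || value > 0xFFFF) {
            throw new IllegalArgumentException("out of unsigned short range: " + value);
        }
        buffer.putShort((short) value);
    }

    // getShort() returns a signed short; masking with 0xFFFF recovers the
    // original non-negative value for anything above 32767.
    static int readUnsignedShort(ByteBuffer buffer) {
        return buffer.getShort() & 0xFFFF;
    }

    static int roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocate(2);
        writeUnsignedShort(buffer, value);
        buffer.flip();
        return readUnsignedShort(buffer);
    }

    public static void main(String[] args) {
        System.out.println(alignTo(4097, 4096));
        System.out.println(alignTo(1, 2048));
        System.out.println(roundTrip(40000));
    }
}
```

Without the `& 0xFFFF` mask, a value such as 40000 would be read back as a negative number, which is the usual shape of a signing bug in this kind of serializer.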

I've observed a bug where a java.nio.BufferUnderflowException is sometimes 
thrown under load from a ValidationExecutor thread while doing a repair. I've 
put some temporary logging in to dump the state of the reader when the 
exception happens but I'm still not sure how it gets into that state. Wondering 
if there is some kind of concurrency problem somewhere?

Also (although obvious in hindsight), the page alignment that keeps segments 
aligned on 4kb boundaries causes unacceptable write amplification in the 
size of the index file for workloads with small row keys and < 64kb of data in 
the row (a.k.a. no index). I've been discussing with a few people the various 
options we have and the tradeoffs of each. I'm hoping to formalize 
those thoughts and implement something today or tomorrow.
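
To give a rough sense of the scale involved, the sketch below computes the padding overhead under a simplifying assumption: each small index entry ends up occupying its own 4096-byte aligned segment. The 32-byte entry size is a made-up figure for illustration, and `PaddingOverheadSketch` is a hypothetical helper, not part of the patch.

```java
public class PaddingOverheadSketch {
    // Bytes occupied on disk once the payload is padded to whole pages.
    static long paddedSize(long payloadBytes, int pageSize) {
        if (payloadBytes <= 0 || pageSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long pages = (payloadBytes + pageSize - 1) / pageSize; // ceiling division
        return pages * pageSize;
    }

    // Ratio of bytes written to bytes of useful payload.
    static double amplification(long payloadBytes, int pageSize) {
        return (double) paddedSize(payloadBytes, pageSize) / payloadBytes;
    }

    public static void main(String[] args) {
        // Assumed 32-byte entry padded to a 4 KB boundary.
        System.out.println(paddedSize(32, 4096));
        System.out.println(amplification(32, 4096));
    }
}
```

Under these assumptions the index file would be over a hundred times larger than the entries it holds, which is why rows too small to need an index are the problematic case.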

So, all in all, the 2.1 based implementation is really stabilizing and initial 
performance tests are looking very encouraging!

> Make index info heap friendly for large CQL partitions
> --
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Michael Kjellman
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?
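
The figures in the description follow from simple arithmetic, sketched here under two assumptions: one IndexInfo entry per 64 KB of partition data (the default column index granularity) and two ByteBuffers per entry (first and last name). Decimal units are used so the numbers match the round figures above; `IndexInfoCountSketch` is an illustrative name only.

```java
public class IndexInfoCountSketch {
    // One IndexInfo per index interval, rounded up.
    static long indexInfoCount(long partitionBytes, long indexIntervalBytes) {
        if (partitionBytes < 0 || indexIntervalBytes <= 0) {
            throw new IllegalArgumentException("invalid sizes");
        }
        return (partitionBytes + indexIntervalBytes - 1) / indexIntervalBytes;
    }

    // Assumption: each IndexInfo holds a first-name and a last-name buffer.
    static long byteBufferCount(long indexInfoCount) {
        return 2 * indexInfoCount;
    }

    public static void main(String[] args) {
        long partitionBytes = 6_400_000_000L; // 6.4 GB, decimal
        long intervalBytes = 64_000L;         // 64 KB, decimal
        long entries = indexInfoCount(partitionBytes, intervalBytes);
        System.out.println(entries);
        System.out.println(byteBufferCount(entries));
    }
}
```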





[jira] [Comment Edited] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

2016-09-08 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451066#comment-15451066
 ] 

Michael Kjellman edited comment on CASSANDRA-9754 at 9/8/16 6:37 PM:
-

I pushed a rebased commit that addresses many additional comments by 
[~jasobrown] from review, adds additional unit tests, and has many further 
improvements to documentation. This is still 2.1 based, however the review and 
improvements made in the org.apache.cassandra.db.index.birch package are 
agnostic to a trunk or 2.1 based patch.

https://github.com/mkjellman/cassandra/tree/CASSANDRA-9754-2.1

Some Highlights:
 * Fix a bug in KeyIterator identified by [~jjirsa] that would cause the 
iterator to return nothing when the backing SegmentedFile contains exactly 1 
key/segment.
 * Add unit tests for KeyIterator
 * Add SSTable version ka support to LegacySSTableTest. Actually test something 
in LegacySSTableTest.
 * Add additional unit tests around PageAlignedReader, PageAlignedWriter, 
BirchWriter, and BirchReader
 * Remove word lists and refactor all unit tests to use 
TimeUUIDTreeSerializableIterator instead
 * Improve documentation and fix documentation as required to properly parse 
and format during javadoc creation
 * Remove reset() functionality from BirchReader.BirchIterator
 * Fix many other nits


was (Author: mkjellman):
I pushed a rebased commit that addresses many additional comments by 
[~jasobrown] from review, adds additional unit tests, and has many further 
improvements to documentation. This is still 2.1 based, however the review and 
improvements made in the org.apache.cassandra.db.index.birch package is 
agnostic to a trunk or 2.1 based patch.

https://github.com/mkjellman/cassandra/commit/3d686799a0e79c23d86881bb041b5408dcfda014
https://github.com/mkjellman/cassandra/tree/CASSANDRA-9754-2.1

Some Highlights:
 * Fix a bug in KeyIterator identified by [~jjirsa] that would cause the 
iterator to return nothing when the backing SegmentedFile contains exactly 1 
key/segment.
 * Add unit tests for KeyIterator
 * Add SSTable version ka support to LegacySSTableTest. Actually test something 
in LegacySSTableTest.
 * Add additional unit tests around PageAlignedReader, PageAlignedWriter, 
BirchWriter, and BirchReader
 * Remove word lists and refactor all unit tests to use 
TimeUUIDTreeSerializableIterator instead
 * Improve documentation and fix documentation as required to properly parse 
and format during javadoc creation
 * Remove reset() functionality from BirchReader.BirchIterator
 * Fix many other nits

> Make index info heap friendly for large CQL partitions
> --
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Michael Kjellman
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?





[jira] [Commented] (CASSANDRA-12300) Disallow unset memtable_cleanup_threshold when flush writers is set

2016-09-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474551#comment-15474551
 ] 

Benedict commented on CASSANDRA-12300:
--

Heh, funny how many places that advice has been declared awful (I apparently 
did so 15 months ago in CASSANDRA-9274) and yet it persists

> Disallow unset memtable_cleanup_threshold when flush writers is set
> ---
>
> Key: CASSANDRA-12300
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12300
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brandon Williams
>
> Many times I see flush writers set, and mct unset, leading to a very small 
> mct, which causes unneeded frequent flushing, and then of course compaction.  
> I also think the default is a bit conservative, typically ending up at 0.11, 
> where I'd say the majority of use cases only have one or two hot tables and 
> are much better served at 0.7 or 0.8.





[jira] [Commented] (CASSANDRA-12372) Remove deprecated memtable_cleanup_threshold for 4.0

2016-09-08 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474550#comment-15474550
 ] 

Benedict commented on CASSANDRA-12372:
--

Erm yeah.  See my user-list rant for the complexities of MCT, and why it 
probably isn't a good idea to preclude specifying it.

It's a very important property to be able to specify; perhaps the most 
important.



> Remove deprecated memtable_cleanup_threshold for 4.0
> 
>
> Key: CASSANDRA-12372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12372
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 4.x
>
>
> This is going to be deprecated in 3.10 since it doesn't make sense to specify 
> a value. It only makes sense to calculate it based on memtable_flush_writers.





[jira] [Updated] (CASSANDRA-12349) Adding some new features to cqlsh

2016-09-08 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-12349:

Reviewer: Philip Thompson  (was: Marcus Eriksson)

> Adding some new features to cqlsh
> -
>
> Key: CASSANDRA-12349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12349
> Project: Cassandra
>  Issue Type: New Feature
> Environment: All
>Reporter: vin01
>Priority: Minor
>  Labels: CQLSH
>
> I would like to have the following features in cqlsh; I have made a patch to 
> enable them as well.
> 1. Aliases.
> 2. Safe mode (prompt on delete,update,truncate,drop if safe_mode is true).
> 3. Press q to exit.
> It's also shared here -> 
> https://github.com/vineet01/cassandra/blob/trunk/new_features.txt
> Example for aliases :-
> cassandra@cqlsh> show 
>  ;  ALIASES  HOST SESSION  VERSION  
> cassandra@cqlsh> show ALIASES ;
> Aliases :> {'dk': 'desc keyspaces;', 'sl': 'select * from'}
> now if you type dk and press  it will auto complete it to "desc 
> keyspace".
> Adding an alias from shell :-
> cassandra@cqlsh> alias slu=select * from login.user ;
> Alias added successfully - sle:select * login.user ;
> cassandra@cqlsh> show ALIASES ;
> Aliases :> {'slu': 'select * from login.user ;', 'dk': 'desc keyspaces;', 
> 'sl': 'select * from'}
> cassandra@cqlsh> sle
> Expanded alias to> select * from login.user ;
>  username | blacklisted | lastlogin | password   
> Adding an alias directly in file :-
> aliases will be kept in same cqlshrc file.
> [aliases]
> dk = desc keyspaces;
> sl = select * from
> sle = select * from login.user ;
> now if we type just "sle" it will autocomplete rest of it and show next 
> options.
> Example of safe mode :-
> cassandra@cqlsh> truncate login.user ;
> Are you sure you want to do this? (y/n) > n
> Not performing any action.
> cassandra@cqlsh> updatee login.user set password=null;
> Are you sure you want to do this? (y/n) > 
> Not performing any action.
> Initial commit :- 
> https://github.com/vineet01/cassandra/commit/0bfce2ccfc610021a74a1f82ed24aa63e1b72fec
> Current version :- 
> https://github.com/vineet01/cassandra/blob/trunk/bin/cqlsh.py
> Please review and suggest any improvements.





[jira] [Updated] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-08 Thread Arunkumar M (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arunkumar M updated CASSANDRA-12573:

Assignee: (was: Arunkumar M)

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mikhail Krupitskiy
>Priority: Critical
>
> We use Cassandra 3.7 and have encountered strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with a SASI index.
> Below are a few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: the search criterion is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, asdqwe.
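
For reference, the expected results quoted in the experiments above follow from SQL-style LIKE semantics, where '%' matches any (possibly empty) run of characters. The Python sketch below is a minimal reference model of those semantics only; it says nothing about how SASI evaluates the query internally, and the helper name `like_match` is hypothetical.

```python
import re

def like_match(pattern: str, value: str) -> bool:
    """Reference LIKE matcher: '%' matches any (possibly empty) run of characters."""
    # Escape each literal fragment, then join the fragments with '.*'
    # so that every '%' in the pattern becomes a wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

# c2 values inserted in the experiments above
values = ["qwe", "qweasd", "qwea1", "1qwe", "asdqwe"]

print([v for v in values if like_match("%w%a%", v)])    # ['qweasd', 'qwea1']
print([v for v in values if like_match("%w22%a%", v)])  # []
```

Under this model 'asdqwe' is rejected by '%w%a%' because no 'a' follows the 'w', which is why the report treats its appearance in the actual results as incorrect.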





[jira] [Created] (CASSANDRA-12621) How to query '%' character using LIKE operator in Cassandra 3.7?

2016-09-08 Thread Mikhail Krupitskiy (JIRA)
Mikhail Krupitskiy created CASSANDRA-12621:
--

 Summary: How to query '%' character using LIKE operator in 
Cassandra 3.7?
 Key: CASSANDRA-12621
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12621
 Project: Cassandra
  Issue Type: Bug
Reporter: Mikhail Krupitskiy
Priority: Critical


I use Cassandra 3.7 and have a text column with a SASI index. Suppose I want to 
find column values that contain a '%' character somewhere in the middle. The 
problem is that '%' is the wildcard character in LIKE clauses. How can I escape 
the '%' character in a query such as LIKE '%%%'?

Here is a test script:

DROP keyspace if exists kmv;
CREATE keyspace if not exists kmv WITH REPLICATION = { 'class' : 
'SimpleStrategy', 'replication_factor':'1'} ;
USE kmv;
CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, c1));
CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
'analyzed' : 'true',
'analyzer_class' : 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
'case_sensitive' : 'false', 
'mode' : 'CONTAINS'
};

INSERT into kmv (id, c1, c2) values (1, 'f22', 'qwe%asd');

SELECT c2 from kmv.kmv where c2 like '%$$%$$%';

The select query returns nothing.
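
To make the question concrete: under SQL-style semantics every '%' in a LIKE pattern is a wildcard, so '%%%' is equivalent to '%' and matches any value. The Python sketch below models the behaviour being asked for, using a hypothetical escape character (a backslash, in the spirit of SQL's ESCAPE clause). It only illustrates the desired semantics; the function names and the escape character are assumptions, and nothing here implies that Cassandra 3.7 accepts such syntax.

```python
import re

def like_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a LIKE pattern to a regex; `escape` followed by a char is a literal."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # Escaped character: match it literally, even if it is '%'.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "%":
            # Unescaped '%': wildcard for any run of characters.
            out.append(".*")
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)

def like_match(pattern: str, value: str, escape: str = "\\") -> bool:
    return re.fullmatch(like_to_regex(pattern, escape), value, flags=re.DOTALL) is not None

print(like_match("%%%", "qweasd"))      # True: all three '%' are wildcards
print(like_match(r"%\%%", "qwe%asd"))   # True: the middle '%' is literal
print(like_match(r"%\%%", "qweasd"))    # False: the value contains no '%'
```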





[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-08 Thread Maxim Podkolzine (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473404#comment-15473404
 ] 

Maxim Podkolzine commented on CASSANDRA-12573:
--

Can anyone please take a look at this?

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mikhail Krupitskiy
>Assignee: Arunkumar M
>Priority: Critical
>
> We use Cassandra 3.7 and have encountered strange behaviour in SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with a SASI index.


