[jira] [Commented] (CASSANDRA-12253) Fix exceptions when enabling gossip on proxy nodes.

2016-09-15 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495414#comment-15495414
 ] 

Dikang Gu commented on CASSANDRA-12253:
---

[~jkni] yes, they all look good to me, thanks!

> Fix exceptions when enabling gossip on proxy nodes.
> ---
>
> Key: CASSANDRA-12253
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12253
> Project: Cassandra
> Issue Type: Bug
> Reporter: Dikang Gu
> Assignee: Dikang Gu
> Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 0001-for-proxy-node-not-set-gossip-tokens.patch, 
> 0002-for-proxy-node-not-set-gossip-tokens.patch, 
> 0003-for-proxy-node-not-set-gossip-tokens.patch
>
>
> We have a tier of Cassandra nodes running with the join_ring=false flag, which 
> we call proxy nodes, and they will never join the ring.
> The problem is that sometimes we need to disable and enable gossip on 
> those nodes, and `nodetool enablegossip` throws exceptions when we do that:
> {code}
> java.lang.AssertionError
> at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2213)
> at org.apache.cassandra.service.StorageService.startGossiping(StorageService.java:371)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
> at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
> at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
> at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
> at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
> at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
> at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
> at sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
> at sun.rmi.transport.Transport$1.run(Transport.java:177)
> at sun.rmi.transport.Transport$1.run(Transport.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12652) Failure in SASIIndexTest.testStaticIndex-compression

2016-09-15 Thread Joel Knighton (JIRA)
Joel Knighton created CASSANDRA-12652:
-

 Summary: Failure in SASIIndexTest.testStaticIndex-compression
 Key: CASSANDRA-12652
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12652
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Joel Knighton


Stacktrace:
{code}
junit.framework.AssertionFailedError: expected:<1> but was:<0>
at org.apache.cassandra.index.sasi.SASIIndexTest.testStaticIndex(SASIIndexTest.java:1839)
at org.apache.cassandra.index.sasi.SASIIndexTest.testStaticIndex(SASIIndexTest.java:1786)
{code}

Example failure:
http://cassci.datastax.com/job/trunk_testall/1176/testReport/org.apache.cassandra.index.sasi/SASIIndexTest/testStaticIndex_compression/





[jira] [Created] (CASSANDRA-12651) Failure in SecondaryIndexTest.testAllowFilteringOnPartitionKeyWithSecondaryIndex

2016-09-15 Thread Joel Knighton (JIRA)
Joel Knighton created CASSANDRA-12651:
-

 Summary: Failure in 
SecondaryIndexTest.testAllowFilteringOnPartitionKeyWithSecondaryIndex
 Key: CASSANDRA-12651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12651
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Joel Knighton


This has failed with/without compression.

Stacktrace:
{code}
junit.framework.AssertionFailedError: Got less rows than expected. Expected 2 but got 0
at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:909)
at org.apache.cassandra.cql3.validation.entities.SecondaryIndexTest.lambda$testAllowFilteringOnPartitionKeyWithSecondaryIndex$78(SecondaryIndexTest.java:1228)
at org.apache.cassandra.cql3.validation.entities.SecondaryIndexTest$$Lambda$293/218688965.apply(Unknown Source)
at org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:1215)
at org.apache.cassandra.cql3.validation.entities.SecondaryIndexTest.testAllowFilteringOnPartitionKeyWithSecondaryIndex(SecondaryIndexTest.java:1218)
{code}

Examples:
http://cassci.datastax.com/job/trunk_testall/1176/testReport/org.apache.cassandra.cql3.validation.entities/SecondaryIndexTest/testAllowFilteringOnPartitionKeyWithSecondaryIndex/
http://cassci.datastax.com/job/trunk_testall/1176/testReport/org.apache.cassandra.cql3.validation.entities/SecondaryIndexTest/testAllowFilteringOnPartitionKeyWithSecondaryIndex_compression/





[jira] [Created] (CASSANDRA-12650) Failure in KeyCacheCqlTest.test2iKeyCachePathsShallowIndexEntry

2016-09-15 Thread Joel Knighton (JIRA)
Joel Knighton created CASSANDRA-12650:
-

 Summary: Failure in 
KeyCacheCqlTest.test2iKeyCachePathsShallowIndexEntry
 Key: CASSANDRA-12650
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12650
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Joel Knighton


This test has failed both with and without compression.

Stacktrace:
{code}
junit.framework.AssertionFailedError: expected:<0> but was:<1>
at org.apache.cassandra.cql3.KeyCacheCqlTest.test2iKeyCachePaths(KeyCacheCqlTest.java:268)
at org.apache.cassandra.cql3.KeyCacheCqlTest.test2iKeyCachePathsShallowIndexEntry(KeyCacheCqlTest.java:185)
{code}

Example failures:
http://cassci.datastax.com/job/trunk_testall/1176/testReport/org.apache.cassandra.cql3/KeyCacheCqlTest/test2iKeyCachePathsShallowIndexEntry/
http://cassci.datastax.com/job/trunk_testall/1176/testReport/org.apache.cassandra.cql3/KeyCacheCqlTest/test2iKeyCachePathsShallowIndexEntry_compression/





[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-15 Thread Geoffrey Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495387#comment-15495387
 ] 

Geoffrey Yu commented on CASSANDRA-12367:
-

Thanks for the first pass [~slebresne]! I added another commit to address your 
comments 
[here|https://github.com/geoffxy/cassandra/commit/a71968ebba8b67591b88cafd2daf3b37e17fec52].
 I added {{rowCount()}} to the {{Partition}} interface to be able to pass in a 
{{rowEstimate}} to {{UnfilteredRowIteratorSerializer.serializedSize()}} since 
all the implementing classes already had that method available. Please let me 
know how it looks now!

{quote}
Wonders if it wouldn't be more user friendly to return 0 if the key is not 
hosted on that replica (which will simply happen if we don't check anything). 
Genuine question though, I could see both options having advantages, so 
mentioning it for the sake of discussion.
{quote}

I don't feel strongly either way since I also agree that both options have 
merit. I've left the check in for now but I have no objection to removing it if 
others feel strongly.

> Add an API to request the size of a CQL partition
> -
>
> Key: CASSANDRA-12367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12367
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 3.x
>
> Attachments: 12367-trunk-v2.txt, 12367-trunk.txt
>
>
> It would be useful to have an API that we could use to get the total 
> serialized size of a CQL partition, scoped by keyspace and table, on disk.





[jira] [Commented] (CASSANDRA-12253) Fix exceptions when enabling gossip on proxy nodes.

2016-09-15 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495287#comment-15495287
 ] 

Joel Knighton commented on CASSANDRA-12253:
---

Thanks for the patch - it looks pretty good overall.

A few minor comments:
- Patches should include a CHANGES.txt entry and a last line in the commit 
message of the form "patch by X; reviewed by Y for CASSANDRA-Z".
- Project code style says to avoid braces for single line if statements.
- A few comments and error messages should be updated for the changes.

These are covered at 
http://cassandra.apache.org/doc/latest/development/patches.html and 
http://cassandra.apache.org/doc/latest/development/code_style.html.
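
As a small illustration of the brace convention mentioned above, here is a minimal sketch. The class and method names are made up for the example and do not come from the patch; the brace placement follows what I understand the project's style to be.

```java
// Hypothetical example; StyleExample and clamp are illustrative names only.
class StyleExample
{
    static int clamp(int value, int max)
    {
        // Single-line if statement: no braces around the body.
        if (value > max)
            return max;

        return value;
    }
}
```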

I've pushed a branch with these nits fixed at 
https://github.com/jkni/cassandra/commits/dikang/12253-2.2. I started CI for 
2.2 - unit tests looked good but dtests failed due to unrelated problems with 
the test environment. Tomorrow, I'll rerun 2.2 CI as well as CI for 3.0 and 
trunk with the patch merged. If tests look good and you agree to the small 
things I fixed in my review branch, I'll give this a +1.


[jira] [Created] (CASSANDRA-12649) Add BATCH metrics

2016-09-15 Thread Alwyn Davis (JIRA)
Alwyn Davis created CASSANDRA-12649:
---

 Summary: Add BATCH metrics
 Key: CASSANDRA-12649
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12649
 Project: Cassandra
  Issue Type: Wish
Reporter: Alwyn Davis
Priority: Minor
 Fix For: 3.x


To identify causes of load on a cluster, it would be useful to have some 
additional metrics:
* *Mutation size distribution:* I believe this would be relevant when tracking 
the performance of unlogged batches.
* *Logged / Unlogged Partitions per batch distribution:* This would also give a 
count of batch types processed. Multiple distinct tables in batch would just be 
considered as separate partitions.






[jira] [Created] (CASSANDRA-12648) cqlsh - get trace of query without showing results

2016-09-15 Thread Jon Haddad (JIRA)
Jon Haddad created CASSANDRA-12648:
--

 Summary: cqlsh - get trace of query without showing results
 Key: CASSANDRA-12648
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12648
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jon Haddad


There are some circumstances (for example, when dealing with sensitive 
information) where it's helpful to show the trace of a query without showing 
the query results.





[jira] [Commented] (CASSANDRA-12643) Estimated histograms tend to overflow

2016-09-15 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494465#comment-15494465
 ] 

Edward Capriolo commented on CASSANDRA-12643:
-

Also, I wanted to point something out. The EstimatedHistogram is used for 
non-reporting cases. If you do a usage search, there are some internal 
structures that are sized based on it. While in practice it may not be a 
problem, I am struggling with an "estimator" that throws a RuntimeException. 
After all, it is an estimate. None of the callers check for this exception or 
do anything about it. Theoretically this could cause a process to never 
complete. I am thinking about two methods: one that always returns data, so 
that exceptions from subclasses do not bubble up into the reporter, and 
possibly a second that throws a checked/unchecked exception so that callers 
can implement fallback logic.
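
The two-method idea could look roughly like the sketch below. This is not Cassandra's EstimatedHistogram: the class, the bucket layout, and every method name are assumptions made up for illustration, and a checked exception is chosen only to show the stricter of the two options.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch; this is NOT Cassandra's EstimatedHistogram.
class SketchHistogram
{
    /** Checked exception, so callers are forced to decide on a fallback. */
    static class OverflowException extends Exception
    {
        OverflowException(String message)
        {
            super(message);
        }
    }

    private final long[] upperBounds;     // inclusive upper bound of each regular bucket
    private final AtomicLongArray counts; // the last slot counts values above every bound

    SketchHistogram(long... upperBounds)
    {
        if (upperBounds.length == 0)
            throw new IllegalArgumentException("at least one bucket bound is required");

        this.upperBounds = upperBounds.clone();
        this.counts = new AtomicLongArray(upperBounds.length + 1);
    }

    void add(long value)
    {
        int i = 0;
        while (i < upperBounds.length && value > upperBounds[i])
            i++;
        counts.incrementAndGet(i);
    }

    boolean isOverflowed()
    {
        return counts.get(upperBounds.length) > 0;
    }

    /** Never throws: overflowed samples are reported as the largest known bound. */
    long maxEstimate()
    {
        for (int i = upperBounds.length; i >= 0; i--)
        {
            if (counts.get(i) > 0)
                return upperBounds[Math.min(i, upperBounds.length - 1)];
        }
        return 0;
    }

    /** Strict variant: signals overflow so that the caller can apply its own fallback. */
    long maxStrict() throws OverflowException
    {
        if (isOverflowed())
            throw new OverflowException("values exceeded the largest bucket bound");

        return maxEstimate();
    }
}
```

Code that sizes internal structures could call the non-throwing `maxEstimate()`, while a reporter that wants to flag overflow could call `maxStrict()` and handle the exception.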

> Estimated histograms tend to overflow
> -
>
> Key: CASSANDRA-12643
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12643
> Project: Cassandra
> Issue Type: Bug
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
>






[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494341#comment-15494341
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

OK, I have figured out the issue with %w%a%.

The CQL parser first interprets this as LIKE CONTAINS with searched term = w%a.

And then things get complicated:

1) If you're using NonTokenizingAnalyzer or NoOpAnalyzer, everything is fine: 
the % in 'w%a' is interpreted as a simple literal and not as a wildcard character.

2) If you're using StandardAnalyzer, it's an entirely different story. During 
the parsing of the search predicates by the query planner, the term 'w%a' is 
passed to the analyzer (StandardAnalyzer here):  
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/plan/Operation.java#L303-L323

The StandardAnalyzer tokenizes the search term, so 'w%a' becomes 2 distinct 
tokens, 'w' OR 'a'. Why does it ignore the %? Because according to the Unicode 
line breaking rules, % is a separator; read here: 
http://www.unicode.org/Public/UNIDATA/LineBreak.txt

Nowhere in the source code can we see this; in fact, you'll need to look into 
the JFlex grammar file 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/analyzer/StandardTokenizerImpl.jflex
 to see a reference to Unicode word breaking rules ...

So indeed, when using StandardAnalyzer, any % character will be interpreted as 
a separator, so our LIKE '%w%a%' is transformed into LIKE '%w%' OR LIKE 
'%a%', i.e. all words containing 'w' OR 'a', irrespective of their relative 
position to each other ...
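
To make the tokenization step concrete, here is a rough sketch. It is a simplified stand-in and not SASI's code: the real StandardAnalyzer uses a JFlex-generated tokenizer, whereas this example assumes that every character that is not a letter or a digit is a separator. The class and method names are hypothetical. It covers only how the search term is split and how a literal (non-tokenized) match behaves; how the resulting tokens are combined into predicates is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for illustration; not the SASI implementation.
class TokenizeSketch
{
    /** Splits on every run of characters that are not letters or digits, so '%' is a separator. */
    static List<String> tokenize(String term)
    {
        List<String> tokens = new ArrayList<>();
        for (String part : term.split("[^\\p{L}\\p{N}]+"))
        {
            if (!part.isEmpty())
                tokens.add(part);
        }
        return tokens;
    }

    /** Non-tokenizing behaviour: the whole term, '%' included, is matched literally. */
    static List<String> literalContains(List<String> values, String term)
    {
        List<String> hits = new ArrayList<>();
        for (String value : values)
        {
            if (value.contains(term))
                hits.add(value);
        }
        return hits;
    }
}
```

With the five values from the experiments (qwe, qweasd, qwea1, 1qwe, asdqwe), `tokenize("w%a")` yields the two tokens w and a, while `literalContains(values, "w%a")` finds nothing, because no value contains the literal text w%a.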

Why is it an OR predicate and not an AND predicate ? The answer is a comment in 
the source code here: 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/plan/Operation.java#L290-L295

Experiment 1 returns 0 rows because it uses NonTokenizingAnalyzer: CORRECT
Experiment 2 returns 3 rows (asdqwe, qweasd, qwea1) because it uses 
StandardAnalyzer and all those words contain 'w' OR 'a': CORRECT

Same remark for experiments 3 & 4.

Indeed it is not really a bug, it is because you're using the StandardAnalyzer 
with tokenization ...


> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
> Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with a SASI index.
> Below are a few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if 

[jira] [Updated] (CASSANDRA-12647) Improve vnode allocation on RandomPartitioner

2016-09-15 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-12647:
--
Status: Patch Available  (was: Open)

Here is the patch for trunk, 
https://github.com/DikangGu/cassandra/commit/926eca5085706e6d4b4e8b745959c2cd6e69f158

> Improve vnode allocation on RandomPartitioner
> -
>
> Key: CASSANDRA-12647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12647
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Dikang Gu
> Assignee: Dikang Gu
> Fix For: 3.0.x, 3.x
>
>
> CASSANDRA-7032 introduced the improved vnode allocation algorithm, but it 
> only supports Murmur3Partitioner for now. 
> We use RandomPartitioner in most of our clusters, so I'd like to add support 
> for RandomPartitioner as well.





[cassandra] Git Push Summary

2016-09-15 Thread jake
Repository: cassandra
Updated Tags:  refs/tags/3.0.9-tentative [created] d600f51ee


[jira] [Updated] (CASSANDRA-10364) Improve test coverage for schema tables and document 3.0 changes to schema tables

2016-09-15 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-10364:

Reviewer:   (was: Tyler Hobbs)

> Improve test coverage for schema tables and document 3.0 changes to schema 
> tables
> -
>
> Key: CASSANDRA-10364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10364
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Aleksey Yeschenko
> Assignee: Aleksey Yeschenko
> Fix For: 3.0.x
>
>
> In particular,
> {{LegacySchemaMigratorTest.java}}:
> Needed test coverage:
> Legacy schema tables are removed
> New schema tables are written to with the correct timestamp
> Legacy schema tables don't exist in new schema tables
> Migrating tables in general, especially COMPACT ones
> Null values for any optional fields
> Maybe UDTs that refer to other UDTs?
> NTS keyspaces
> Similar coverage is also important to have for {{SchemaKeyspace}}, with no 
> migration involved.





[jira] [Commented] (CASSANDRA-12554) updateJobs in PendingRangeCalculatorService should be decremented in finally block

2016-09-15 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494188#comment-15494188
 ] 

Tyler Hobbs commented on CASSANDRA-12554:
-

Tests look good except for the trunk dtest, which had some test runner 
problems.  I've restarted that, and if it looks good, I'll commit.

> updateJobs in PendingRangeCalculatorService should be decremented in finally 
> block
> --
>
> Key: CASSANDRA-12554
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12554
> Project: Cassandra
> Issue Type: Bug
> Reporter: sankalp kohli
> Assignee: sankalp kohli
> Priority: Minor
> Attachments: CASSANDRA_12554_3.0.txt
>
>
> We fixed an issue in CASSANDRA-7390 with MoveTests by adding a count for 
> running jobs. While looking at the code, I can see that the decrement of this 
> counter should be done in a finally block. 
> Also, we don't need to change the setRejectedExecutionHandler in 
> CASSANDRA-7390, as we can change the order of calling increment. 
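
The pattern described in the issue can be sketched as follows. This is a hypothetical illustration, not the code of PendingRangeCalculatorService or of the attached patch: the class, its methods, and the way a rejected submission is handled are all assumptions. The point it shows is that the running-jobs counter is decremented in a finally block, so a job that throws cannot leave the counter stuck above zero.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not taken from Cassandra.
class JobCounterSketch
{
    private final AtomicInteger updateJobs = new AtomicInteger(0);
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    void submit(Runnable work)
    {
        // Count the job before handing it to the executor.
        updateJobs.incrementAndGet();
        try
        {
            executor.execute(() -> {
                try
                {
                    work.run();
                }
                finally
                {
                    // Always release the count, even if work.run() throws.
                    updateJobs.decrementAndGet();
                }
            });
        }
        catch (RuntimeException e)
        {
            // Assumption: a rejected task never runs, so undo the increment here.
            updateJobs.decrementAndGet();
            throw e;
        }
    }

    int pendingJobs()
    {
        return updateJobs.get();
    }

    void shutdownAndWait() throws InterruptedException
    {
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Anything that waits for `pendingJobs()` to reach zero would otherwise block forever after a single failing job.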





[jira] [Comment Edited] (CASSANDRA-10145) Change protocol to allow sending key space independent of query string

2016-09-15 Thread Sandeep Tamhankar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494152#comment-15494152
 ] 

Sandeep Tamhankar edited comment on CASSANDRA-10145 at 9/15/16 6:28 PM:


poc.patch (with diffs from trunk) adds an optional 'keyspace' argument to the 
QUERY message. I have verified that it behaves properly in my environment with 
a modified version of the Ruby driver, but the intent is really to get an 
initial code review and address questions/concerns I discuss in comments / text 
annotated with "sandman:". The two main concerns are:
1. What to do with PREPARE messages since they don't currently have a flags 
byte for optional arguments.
2. Should 0x80 in flags be defined to say "there is another byte of 
extra-flags; look there for additional optional parameters to this message".

You need not bother running this code; this patch is a partial implementation 
of this feature and is meant as a proof of concept. Based on feedback and 
discussion, I'll flesh it out, add tests, etc., and then submit a new patch.
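
To make the second question more concrete, here is a minimal sketch of how a reader might handle an extension bit. Only the idea that 0x80 means "another flags byte follows" comes from the comment above; the class, the bit assigned to the keyspace in the extra byte, and the way both bytes are packed into one int are assumptions for illustration and are not part of the native protocol or of poc.patch.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; only the 0x80 "more flags follow" idea comes from the discussion.
class FlagsSketch
{
    static final int MORE_FLAGS = 0x80;   // proposed meaning: another flags byte follows
    static final int EXT_KEYSPACE = 0x01; // assumed bit in the extra byte, illustration only

    /** Reads one flags byte, plus a second one if MORE_FLAGS is set; returns (extra << 8) | base. */
    static int readFlags(ByteBuffer body)
    {
        int base = body.get() & 0xFF;
        if ((base & MORE_FLAGS) == 0)
            return base;

        int extra = body.get() & 0xFF;
        return (extra << 8) | base;
    }

    /** True if the (assumed) keyspace bit is set in the extra flags byte. */
    static boolean hasKeyspace(int flags)
    {
        return ((flags >> 8) & EXT_KEYSPACE) != 0;
    }
}
```

A message without the extension bit costs one byte, as today, and only messages that use the new optional parameters pay for the second byte.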


was (Author: stamhankar999):
poc.patch adds an optional 'keyspace' argument to the QUERY message. I have 
verified that it behaves properly in my environment with a modified version of 
the Ruby driver, but the intent is really to get an initial code review and 
address questions/concerns I discuss in comments / text annotated with 
"sandman:". The two main concerns are:
1. What to do with PREPARE messages since they don't currently have a flags 
byte for optional arguments.
2. Should 0x80 in flags be defined to say "there is another byte of 
extra-flags; look there for additional optional parameters to this message".

You need not bother running this code; this patch is a partial implementation 
of this feature and is meant as a proof of concept. Based on feedback and 
discussion, I'll flesh it out, add tests, etc., and then submit a new patch.

> Change protocol to allow sending key space independent of query string
> --
>
> Key: CASSANDRA-10145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10145
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Vishy Kasar
> Assignee: Sandeep Tamhankar
> Fix For: 3.x
>
> Attachments: poc.patch
>
>
> Currently the keyspace is either embedded in the query string or set through 
> "use keyspace" on a connection by the client driver. 
> There are practical use cases where the client user has the query and the 
> keyspace independently. For that scenario to work, they have to create 
> one client session per keyspace or resort to some string-replace 
> hackery.
> It would be nice if the protocol allowed sending the keyspace separately from 
> the query. 





[jira] [Updated] (CASSANDRA-10145) Change protocol to allow sending key space independent of query string

2016-09-15 Thread Sandeep Tamhankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Tamhankar updated CASSANDRA-10145:
--
Attachment: poc.patch






[jira] [Comment Edited] (CASSANDRA-10145) Change protocol to allow sending key space independent of query string

2016-09-15 Thread Sandeep Tamhankar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494152#comment-15494152
 ] 

Sandeep Tamhankar edited comment on CASSANDRA-10145 at 9/15/16 6:28 PM:


poc.patch adds an optional 'keyspace' argument to the QUERY message. I have 
verified that it behaves properly in my environment with a modified version of 
the Ruby driver, but the intent is really to get an initial code review and 
address questions/concerns I discuss in comments / text annotated with 
"sandman:". The two main concerns are:
1. What to do with PREPARE messages since they don't currently have a flags 
byte for optional arguments.
2. Should 0x80 in flags be defined to say "there is another byte of 
extra-flags; look there for additional optional parameters to this message".

You need not bother running this code; this patch is a partial implementation 
of this feature and is meant as a proof of concept. Based on feedback and 
discussion, I'll flesh it out, add tests, etc., and then submit a new patch.


was (Author: stamhankar999):
This patch adds an optional 'keyspace' argument to the QUERY message. I have 
verified that it behaves properly in my environment with a modified version of 
the Ruby driver, but the intent is really to get an initial code review and 
address questions/concerns I discuss in comments / text annotated with 
"sandman:". The two main concerns are:
1. What to do with PREPARE messages since they don't currently have a flags 
byte for optional arguments.
2. Should 0x80 in flags be defined to say "there is another byte of 
extra-flags; look there for additional optional parameters to this message".

You need not bother running this code; this patch is a partial implementation 
of this feature and is meant as a proof of concept. Based on feedback and 
discussion, I'll flesh it out, add tests, etc., and then submit a new patch.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10145) Change protocol to allow sending key space independent of query string

2016-09-15 Thread Sandeep Tamhankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Tamhankar updated CASSANDRA-10145:
--
Fix Version/s: 3.x
   Status: Patch Available  (was: In Progress)

This patch adds an optional 'keyspace' argument to the QUERY message. I have 
verified that it behaves properly in my environment with a modified version of 
the Ruby driver, but the intent is really to get an initial code review and 
address questions/concerns I discuss in comments / text annotated with 
"sandman:". The two main concerns are:
1. What to do with PREPARE messages since they don't currently have a flags 
byte for optional arguments.
2. Should 0x80 in flags be defined to say "there is another byte of 
extra-flags; look there for additional optional parameters to this message"?

You need not bother running this code; this patch is a partial implementation 
of this feature and is meant as a proof of concept. Based on feedback and 
discussion, I'll flesh it out, add tests, etc., and then submit a new patch.






[jira] [Commented] (CASSANDRA-12141) dtest failure in consistency_test.TestConsistency.short_read_reversed_test

2016-09-15 Thread Sean McCarthy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494026#comment-15494026
 ] 

Sean McCarthy commented on CASSANDRA-12141:
---

Still seeing this failure: 
http://cassci.datastax.com/job/cassandra-3.9_novnode_dtest/49/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_reading_max_insert_errors/

> dtest failure in consistency_test.TestConsistency.short_read_reversed_test
> --
>
> Key: CASSANDRA-12141
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12141
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sean McCarthy
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/280/testReport/consistency_test/TestConsistency/short_read_reversed_test
> Failed on CassCI build trunk_offheap_dtest #280
> {code}
> Standard Output
> Unexpected error in node2 log, error: 
> ERROR [epollEventLoopGroup-2-5] 2016-06-27 19:14:54,412 Slf4JLogger.java:176 
> - LEAK: ByteBuf.release() was not called before it's garbage-collected. 
> Enable advanced leak reporting to find out where the leak occurred. To enable 
> advanced leak reporting, specify the JVM option 
> '-Dio.netty.leakDetection.level=advanced' or call 
> ResourceLeakDetector.setLevel() See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> {code}





[jira] [Created] (CASSANDRA-12647) Improve vnode allocation on RandomPartitioner

2016-09-15 Thread Dikang Gu (JIRA)
Dikang Gu created CASSANDRA-12647:
-

 Summary: Improve vnode allocation on RandomPartitioner
 Key: CASSANDRA-12647
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12647
 Project: Cassandra
  Issue Type: Improvement
Reporter: Dikang Gu
Assignee: Dikang Gu
 Fix For: 3.0.x, 3.x


CASSANDRA-7032 introduced the improved vnode allocation algorithm, but it only 
supports Murmur3Partitioner for now. 

We use RandomPartitioner in most of our clusters, so I'd like to add support 
for RandomPartitioner as well.





[jira] [Commented] (CASSANDRA-8616) sstable tools may result in commit log segments be written

2016-09-15 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493908#comment-15493908
 ] 

Tyler Hobbs commented on CASSANDRA-8616:


The patches seem fine to me, but there are some problems in the tests.  It 
seems like the 2.2 testall failures around SSTableRewriter could be related to 
this, since they don't show up in normal 2.2 test failures.  The 3.0 and 3.3 
dtests are erroring out when trying to copy test results at the end, but if I 
remember correctly, this can be caused by some tests not terminating, so we 
should investigate that.  The trunk RemoveTest failure in testall could also be 
related to this patch.

> sstable tools may result in commit log segments be written
> --
>
> Key: CASSANDRA-8616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tyler Hobbs
>Assignee: Yuki Morishita
> Fix For: 2.2.x, 3.0.x, 3.x
>
> Attachments: 8161-2.0.txt
>
>
> There was a report of sstable2json causing commitlog segments to be written 
> out when run.  I haven't attempted to reproduce this yet, so that's all I 
> know for now.  Since sstable2json loads the conf and schema, I'm thinking 
> that it may inadvertently be triggering the commitlog code.
> sstablescrub, sstableverify, and other sstable tools have the same issue.





[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2016-09-15 Thread Sergio Bossa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493898#comment-15493898
 ] 

Sergio Bossa commented on CASSANDRA-9318:
-

[~Stefania],

I fixed the tests related to the new {{DatabaseDescriptor}} initialization 
methods.

I've also addressed [~slebresne]'s concerns and modified the back-pressure 
algorithm to always observe the write timeout, and if the rate limit causes it 
to be exceeded, rather than observe the rate limit, just pause up to the 
timeout _minus_ the current response time from the replica with the lower rate: 
this is to avoid client timeouts and also give enough time to replicas to 
actually acknowledge the mutations (at the expense of having more inflight 
mutations than the rate limit, but I believe this is the right tradeoff).
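
The pause rule described above can be sketched as follows. This is one reading of the comment, not the code in the patch: the function name and the millisecond parameters are assumptions made for illustration.

```python
def backpressure_pause_ms(rate_limit_wait_ms: float,
                          write_timeout_ms: float,
                          slowest_replica_response_ms: float) -> float:
    """Return how long to pause a write, never exceeding the write timeout.

    rate_limit_wait_ms: wait the rate limiter would impose (assumed input).
    slowest_replica_response_ms: current response time of the replica
    with the lower rate (assumed input).
    """
    if rate_limit_wait_ms <= write_timeout_ms:
        # Observing the rate limit keeps us within the timeout.
        return rate_limit_wait_ms
    # Otherwise ignore the rate limit and leave the replica enough time
    # to acknowledge the mutation before the client would time out.
    return max(0.0, write_timeout_ms - slowest_replica_response_ms)
```

For example, with a 2000 ms write timeout and a replica answering in 300 ms, a 5000 ms rate-limit wait would be cut down to a 1700 ms pause.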

I've run several rounds of tests and dtests: tests are always green, but some 
dtests fail intermittently; those failures do not seem related to this 
issue, but someone more familiar with the failing dtests might want to 
have a look.

Finally, I've re-run some manual stress tests on an overloaded 4-node RF=3 
cluster, and here are the results of inserting 1M rows at CL.ONE:
\\
\\
* SLOW back-pressure.
||Node||Dropped Mutations||Dropped Hints||
|1|18143|0|
|2|10|0|
|3|0|0|
|4|0|0|
Timeouts: 39
Total runtime: 20 mins

* No back-pressure
||Node||Dropped Mutations||Dropped Hints||
|1|471751|248403|
|2|70996|13571|
|3|640|0|
|4|75318|24801|
Timeouts: 6
Total runtime: 5 mins

At CL.QUORUM:
\\
\\
* SLOW back-pressure.
||Node||Dropped Mutations||Dropped Hints||
|1|27781|8584|
|2|4650|0|
|3|0|0|
|4|0|0|
Timeouts: 37
Total runtime: 17 mins

* No back-pressure
||Node||Dropped Mutations||Dropped Hints||
|1|353972|133429|
|2|258776|81981|
|3|636|0|
|4|13870|1710|
Timeouts: 74
Total runtime: 6 mins
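
To compare the runs at a glance, the per-node dropped-mutation counts from the tables above can be summed. This is only a convenience sketch: the dictionary keys and function names are made up here, and the numbers are copied from the tables.

```python
# Dropped mutations per node (nodes 1-4), copied from the tables above.
DROPPED_MUTATIONS = {
    ("ONE", "slow"): [18143, 10, 0, 0],
    ("ONE", "none"): [471751, 70996, 640, 75318],
    ("QUORUM", "slow"): [27781, 4650, 0, 0],
    ("QUORUM", "none"): [353972, 258776, 636, 13870],
}


def total_dropped(consistency: str, mode: str) -> int:
    """Sum dropped mutations across all four nodes for one run."""
    return sum(DROPPED_MUTATIONS[(consistency, mode)])


def reduction_factor(consistency: str) -> float:
    """How many times fewer mutations were dropped with SLOW back-pressure."""
    return total_dropped(consistency, "none") / total_dropped(consistency, "slow")


for cl in ("ONE", "QUORUM"):
    print(cl, total_dropped(cl, "slow"), total_dropped(cl, "none"),
          round(reduction_factor(cl), 1))
```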

> Bound the number of in-flight requests at the coordinator
> -
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Streaming and Messaging
>Reporter: Ariel Weisberg
>Assignee: Sergio Bossa
> Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
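
The high/low watermark idea in the description can be sketched as a small state machine. This is a simplified illustration tracking bytes only; the class and method names are hypothetical and do not correspond to Cassandra classes.

```python
class InflightLimiter:
    """Disable client reads at a high watermark, re-enable below a low one."""

    def __init__(self, high_watermark_bytes: int, low_watermark_bytes: int):
        if low_watermark_bytes >= high_watermark_bytes:
            raise ValueError("low watermark must be below high watermark")
        self.high = high_watermark_bytes
        self.low = low_watermark_bytes
        self.inflight_bytes = 0
        self.reads_enabled = True

    def on_request(self, nbytes: int) -> None:
        # A request was accepted from a client connection.
        self.inflight_bytes += nbytes
        if self.inflight_bytes >= self.high:
            self.reads_enabled = False

    def on_response(self, nbytes: int) -> None:
        # A request completed; its bytes are no longer outstanding.
        self.inflight_bytes -= nbytes
        if not self.reads_enabled and self.inflight_bytes < self.low:
            self.reads_enabled = True
```

Using two thresholds rather than one avoids toggling reads on and off for every request when the load hovers around the limit.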





[jira] [Issue Comment Deleted] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DOAN DuyHai updated CASSANDRA-12573:

Comment: was deleted

(was: Right, the escaping issue does not matter here. What we want to 
understand is how SASI interprets the {{%}} in the middle of the term.

Please note that you're using C* 3.7. I have contributed a bug fix (that was 
scheduled for 3.9 and is in trunk) about stop-word skipping being applied after 
stemming when it should be applied before. I'm not sure if it is relevant to 
the current data set here, but it rings a bell when you get weird 
behaviors only when using StandardAnalyzer.)


[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493798#comment-15493798
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

Right, the escaping issue does not matter here. What we want to understand is 
how SASI interprets the {{%}} in the middle of the term.

Please note that you're using C* 3.7. I have contributed a bug fix (that was 
scheduled for 3.9 and is in trunk) about stop-word skipping being applied after 
stemming when it should be applied before. I'm not sure if it is relevant to 
the current data set here, but it rings a bell when you get weird 
behaviors only when using StandardAnalyzer.
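
Why the order of stop-word filtering and stemming matters can be shown with a toy pipeline. The stop-word set and the one-line "stemmer" below are deliberately naive stand-ins invented for this sketch; they are not what StandardAnalyzer uses.

```python
STOP_WORDS = {"this", "the", "is"}  # toy list, for illustration only


def toy_stem(token: str) -> str:
    # Deliberately naive: strip one trailing "s". Not a real stemmer.
    return token[:-1] if token.endswith("s") and len(token) > 2 else token


def stop_then_stem(tokens):
    """Filter stop words first, then stem what is left."""
    return [toy_stem(t) for t in tokens if t not in STOP_WORDS]


def stem_then_stop(tokens):
    """Stem first, then filter: stemmed stop words can slip through."""
    return [s for s in (toy_stem(t) for t in tokens) if s not in STOP_WORDS]


tokens = ["this", "is", "cassandra", "indexes"]
print(stop_then_stem(tokens))
print(stem_then_stop(tokens))
```

In the second ordering "this" is stemmed to "thi", no longer matches the stop-word list, and ends up in the output.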


[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493799#comment-15493799
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

Right, the escaping issue does not matter here. What we want to understand is 
how SASI interprets the {{%}} in the middle of the term.

Please note that you're using C* 3.7. I have contributed a bug fix (that was 
scheduled for 3.9 and is in trunk) about stop-word skipping being applied after 
stemming when it should be applied before. I'm not sure if it is relevant to 
the current data set here, but it rings a bell when you get weird 
behaviors only when using StandardAnalyzer.


[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread Mikhail Krupitskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493766#comment-15493766
 ] 

Mikhail Krupitskiy commented on CASSANDRA-12573:


Let's try to clarify things.
As I understand it, there are two different issues:
1) Incorrect processing of escaped '%'.
2) Incorrect processing of %foo%bar% patterns without any escaping.

This issue (12573) is not about escaping, and all requests from the experiments 
intentionally avoid any escaping.


[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493749#comment-15493749
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

I'm going to try reproducing the issue. But anyway, right now there is indeed 
*no escaping* of {{%}}, whether at the first character, the last character, or 
in the middle of the term.

I'm attempting escaping for the first and last characters. 

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mikhail Krupitskiy
>Assignee: Alex Petrov
>Priority: Critical
>  Labels: sasi
>
> We use Cassandra 3.7 and have encountered strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with a SASI index.
> Below are a few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, asdqwe.



--
This message was sent by Atlassian JIRA

[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493727#comment-15493727
 ] 

Alex Petrov commented on CASSANDRA-12573:
-

What is most likely meant is why the {{%w%a%}} search term matches {{qwe, 
qweasd, qwea1, 1qwe, asdqwe}}, although only when an analyzer is used.

From what I've seen in the code, {{%}} is only meaningful as the very first and 
very last character of the search term. In the middle it bears the same 
semantic meaning as any other character.
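
The rule stated in this comment can be sketched as follows. This is only an illustration in Python, assuming (per the comment, not verified against the SASI source) that a leading and a trailing {{%}} are the only wildcards; the function name `classify_like` and the mode labels are hypothetical and are not Cassandra APIs.

```python
def classify_like(pattern):
    """Split a LIKE pattern into (mode, literal term).

    Assumption: only a '%' in the first or last position is a wildcard;
    any '%' in between is kept as an ordinary character of the term.
    """
    leading = pattern.startswith("%")
    trailing = len(pattern) > 1 and pattern.endswith("%")

    start = 1 if leading else 0
    end = len(pattern) - 1 if trailing else len(pattern)
    literal = pattern[start:end]

    if leading and trailing:
        mode = "CONTAINS"
    elif trailing:
        mode = "PREFIX"
    elif leading:
        mode = "SUFFIX"
    else:
        mode = "EQUALS"
    return mode, literal


print(classify_like("%w%a%"))    # ('CONTAINS', 'w%a')
print(classify_like("%w22%a%"))  # ('CONTAINS', 'w22%a')
```

Under this reading, the term actually searched for in the experiments is {{w%a}} (or {{w22%a}}), with the inner {{%}} taken literally.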

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: 

[jira] [Commented] (CASSANDRA-12590) Segfault reading secondary index

2016-09-15 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493711#comment-15493711
 ] 

Sam Tunnicliffe commented on CASSANDRA-12590:
-

I spent some time hacking on a unit test using byteman to inject 
synchronisation points so that we can pause the flush/reclaim threads at just 
the right points. Doing so, I was able to effectively stop the flush process at 
the point that the base table has been flushed/reclaimed, but the index table 
is still in the flushing state. 

Accessing the contents of the index memtable at this point does not seem to 
cause a problem, whilst reading from the (now reclaimed) base memtable results 
in a segfault as expected. 

One piece of good news for this is that the switching and reclaiming have been 
re-ordered in trunk by CASSANDRA-12358, such that no reclamation is performed 
until all memtables have been switched. 

[~cam1982], as we're not any closer to a repro/test right now, do you have a 
dev/test cluster where you can try running your workload on latest trunk?
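
The reordering mentioned above can be pictured with a toy model. This is a sketch only: the event names, the function names and the "interleaved" ordering are illustrative assumptions, not taken from the Cassandra source or from CASSANDRA-12358.

```python
def interleaved(names):
    """Hypothetical ordering: each memtable is switched and reclaimed in turn."""
    events = []
    for name in names:
        events.append(("switch", name))
        events.append(("reclaim", name))
    return events


def switch_all_then_reclaim(names):
    """Ordering described in the comment: reclaim only after every switch."""
    events = [("switch", name) for name in names]
    events += [("reclaim", name) for name in names]
    return events


def reclaims_before_all_switched(events, names):
    """True if some reclaim happens while a memtable is still unswitched."""
    switched = set()
    for kind, name in events:
        if kind == "switch":
            switched.add(name)
        elif switched != set(names):
            return True
    return False


tables = ["base", "index"]
print(reclaims_before_all_switched(interleaved(tables), tables))              # True
print(reclaims_before_all_switched(switch_all_then_reclaim(tables), tables))  # False
```

In the first ordering the base memtable is reclaimed while the index memtable has not yet been switched, which is the kind of window the comment is concerned with; the second ordering has no such window.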


> Segfault reading secondary index
> 
>
> Key: CASSANDRA-12590
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12590
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
> Environment: Occurs on Cassandra 3.5 and 3.7
> Reporter: Cameron Zemek
> Assignee: Sam Tunnicliffe
>
> Getting segfaults when reading secondary index as follows:
> J 9272 C2 
> org.apache.cassandra.dht.LocalPartitioner$LocalToken.compareTo(Lorg/apache/cassandra/dht/Token;)I
>  (53 bytes) @ 0x7fd7354749b7 [0x7fd735474840+0x177]
> J 5661 C2 org.apache.cassandra.db.DecoratedKey.compareTo(Ljava/lang/Object;)I 
> (9 bytes) @ 0x7fd7351b35b8 [0x7fd7351b3440+0x178]
> J 14205 C2 
> java.util.concurrent.ConcurrentSkipListMap.doGet(Ljava/lang/Object;)Ljava/lang/Object;
>  (142 bytes) @ 0x7fd736404dd8 [0x7fd736404cc0+0x118]
> J 17764 C2 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(Lorg/apache/cassandra/db/ColumnFamilyStore;)Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;
>  (635 bytes) @ 0x7fd736e09638 [0x7fd736e08720+0xf18]
> J 17808 C2 
> org.apache.cassandra.index.internal.CassandraIndexSearcher.search(Lorg/apache/cassandra/db/ReadExecutionController;)Lorg/apache/cassandra/db/partitions/UnfilteredPartitionIterator;
>  (68 bytes) @ 0x7fd736e01a48 [0x7fd736e012a0+0x7a8]
> J 14217 C2 
> org.apache.cassandra.db.ReadCommand.executeLocally(Lorg/apache/cassandra/db/ReadExecutionController;)Lorg/apache/cassandra/db/partitions/UnfilteredPartitionIterator;
>  (219 bytes) @ 0x7fd736417c1c [0x7fd736416fa0+0xc7c]
> J 14585 C2 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow()V 
> (337 bytes) @ 0x7fd736541e6c [0x7fd736541d60+0x10c]
> J 14584 C2 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run()V 
> (48 bytes) @ 0x7fd7357957b4 [0x7fd735795760+0x54]
> J 9648% C2 org.apache.cassandra.concurrent.SEPWorker.run()V (253 bytes) @ 
> 0x7fd735938d8c [0x7fd7359356e0+0x36ac]
> Which I have translated to the codepath:
> org.apache.cassandra.dht.LocalPartitioner (Line 139)
> org.apache.cassandra.db.DecoratedKey (Line 85)
> java.util.concurrent.ConcurrentSkipListMap (Line 794)
> org.apache.cassandra.db.SinglePartitionReadCommand (Line 498)
> org.apache.cassandra.index.internal.CassandraIndexSearcher (Line 60)
> org.apache.cassandra.db.ReadCommand (Line 367)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493699#comment-15493699
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

Experiments 2, 3 and 4 also contain a {{%}} in the middle of the searched term ...

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, asdqwe.





[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread Mikhail Krupitskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493690#comment-15493690
 ] 

Mikhail Krupitskiy commented on CASSANDRA-12573:


Yes, but in experiments 2, 3 and 4 we still get non-empty results.

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, asdqwe.





[jira] [Comment Edited] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493680#comment-15493680
 ] 

DOAN DuyHai edited comment on CASSANDRA-12573 at 9/15/16 3:33 PM:
--

In your data set, there is no row containing the substring {{w%a}}. The {{%}} 
is interpreted as a literal % character and not as a wildcard for *all characters*.
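
To make this concrete, here is a small Python sketch that runs both readings of the pattern over the five {{c2}} values from the description. It only models the two interpretations; the helper names are hypothetical and nothing here calls Cassandra.

```python
import re

# c2 values inserted in the experiments from the issue description
ROWS = ["qwe", "qweasd", "qwea1", "1qwe", "asdqwe"]


def literal_contains(values, pattern):
    """Reading from this comment: only the outer '%' pair is a wildcard,
    so '%w%a%' searches for the literal substring 'w%a'."""
    term = pattern[1:-1]  # assumes the pattern starts and ends with '%'
    return [v for v in values if term in v]


def sql_style_like(values, pattern):
    """Reading the reporter expected: every '%' matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return [v for v in values if re.fullmatch(regex, v, flags=re.DOTALL)]


print(literal_contains(ROWS, "%w%a%"))  # []
print(sql_style_like(ROWS, "%w%a%"))    # ['qweasd', 'qwea1']
```

The literal reading gives no rows, as in Experiment 1, while the every-{{%}}-is-a-wildcard reading gives the result listed as expected in the description.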


was (Author: doanduyhai):
In your data set, there is no row containing the substring '%w%a%'

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, 

[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493680#comment-15493680
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

In your data set, there is no row containing the substring '%w%a%'

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, asdqwe.





[jira] [Updated] (CASSANDRA-12644) CREATE OR ALTER TABLE

2016-09-15 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12644:

Issue Type: Improvement  (was: Bug)

> CREATE OR ALTER TABLE
> -
>
> Key: CASSANDRA-12644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12644
> Project: Cassandra
>  Issue Type: Improvement
> Reporter: Jon Haddad
>
> Similar to how tools like Puppet & Chef allow you to specify what you want 
> rather than how you want it done, it would be nice to be able to give 
> Cassandra this:
> {code}CREATE OR ALTER TABLE stuff ( 
> id int primary key,
> name text,
> city text,
> state text);{code}
> and it would look at the existing schema and work out that it needed to add 
> fields that are missing.  This should only work in a non-destructive fashion, 
> that is, it should not remove fields, indexes, etc.  If a user attempts to 
> change a table and the action would be destructive, they should get an error 
> that they have to apply those changes explicitly.
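
The behaviour requested above could be sketched roughly as follows. This is a Python illustration of the idea only, not a proposed implementation: the function `plan_create_or_alter` and its dict-based schema representation are hypothetical, and primary keys, indexes and table options are ignored.

```python
def plan_create_or_alter(table, existing, desired):
    """Return the ALTER statements needed to reach `desired` from `existing`.

    Both arguments map column name -> CQL type. Only additions are planned;
    anything that would drop or retype a column raises an error instead,
    so that destructive changes have to be applied explicitly.
    """
    removed = sorted(set(existing) - set(desired))
    retyped = sorted(
        col for col in existing
        if col in desired and existing[col] != desired[col]
    )
    if removed or retyped:
        raise ValueError(
            f"destructive change on {table}: "
            f"removed={removed}, retyped={retyped}"
        )
    return [
        f"ALTER TABLE {table} ADD {col} {cql_type};"
        for col, cql_type in desired.items()
        if col not in existing
    ]


current = {"id": "int", "name": "text"}
wanted = {"id": "int", "name": "text", "city": "text", "state": "text"}
for statement in plan_create_or_alter("stuff", current, wanted):
    print(statement)
# ALTER TABLE stuff ADD city text;
# ALTER TABLE stuff ADD state text;
```

If `wanted` omitted `name`, the function would raise rather than plan a drop, which mirrors the "error that they have to apply those changes explicitly" part of the request.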





[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread Mikhail Krupitskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493518#comment-15493518
 ] 

Mikhail Krupitskiy commented on CASSANDRA-12573:


As far as I can see, that doesn't explain the results of the experiments in the description.
E.g. a request for '%w%a%' returns several rows that contain no '%' at all.

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Mikhail Krupitskiy
> Assignee: Alex Petrov
> Priority: Critical
> Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, asdqwe.





[jira] [Commented] (CASSANDRA-12573) SASI index. Incorrect results for '%foo%bar%'-like search pattern.

2016-09-15 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493422#comment-15493422
 ] 

DOAN DuyHai commented on CASSANDRA-12573:
-

Currently SASI only understands {{%}} at the beginning (suffix search) or 
ending (prefix search) position. Any {{%}} in the middle of an expression, 
as in {{%w%a%}}, will *not* be interpreted by SASI as a wildcard.

{{%w%a%}} will translate into "Give me all results containing {{w%a}}".
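
To make the two readings of {{%w%a%}} concrete, here is a small stand-alone sketch. It is not SASI code: the class {{LikeSemantics}} and both method names are made up for this example, and it assumes the pattern starts and ends with {{%}}. The first method treats every {{%}} as a wildcard (what the reporter expects); the second treats only the outer {{%}} as wildcards and the rest as literal text (the reading described above).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of Cassandra or SASI.
final class LikeSemantics {
    /** SQL-style reading: every '%' matches any run of characters. */
    static List<String> wildcardMatches(String likePattern, List<String> values) {
        StringBuilder regex = new StringBuilder();
        String[] literals = likePattern.split("%", -1); // keep empty leading/trailing parts
        for (int i = 0; i < literals.length; i++) {
            if (i > 0)
                regex.append(".*");
            regex.append(Pattern.quote(literals[i]));
        }
        Pattern compiled = Pattern.compile(regex.toString(), Pattern.DOTALL);
        List<String> matches = new ArrayList<>();
        for (String value : values)
            if (compiled.matcher(value).matches())
                matches.add(value);
        return matches;
    }

    /** Reading described in the comment: only the outer '%' are wildcards. */
    static List<String> literalContainsMatches(String likePattern, List<String> values) {
        // Assumes the pattern has the form %...%; the inner text is searched literally.
        String needle = likePattern.substring(1, likePattern.length() - 1);
        List<String> matches = new ArrayList<>();
        for (String value : values)
            if (value.contains(needle))
                matches.add(value);
        return matches;
    }
}
```

On the five {{c2}} values from the experiments, the first reading selects {{qweasd}} and {{qwea1}}, while the second selects nothing because no value contains the literal text {{w%a}}.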

> SASI index. Incorrect results for '%foo%bar%'-like search pattern. 
> ---
>
> Key: CASSANDRA-12573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12573
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mikhail Krupitskiy
>Assignee: Alex Petrov
>Priority: Critical
>  Labels: sasi
>
> We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests 
> with "LIKE '%foo%bar%'" constraints on a column with SASI index.
> Below are few experiments that show this behaviour.
> Experiment 1:
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: no rows.
> Experiment 2 (NOTE: definition of index is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: asdqwe, qweasd, qwea1.
> Experiment 3 (NOTE: primary key is compound now and inserted data was 
> changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w%a%';
> {noformat}
> Expected result: qweasd, qwea1.
> Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.
> Experiment 4 (NOTE: search criteria is changed):
> {noformat}
> drop keyspace if exists kmv;
> create keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> use kmv;
> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, 
> c1));
> CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>  'mode': 'CONTAINS',
>  'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
>  'analyzed': 'true'
> };
> insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
> insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
> insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
> insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
> select c2 from kmv.kmv where c2 like '%w22%a%';
> {noformat}
> Expected result: no rows.
> Actual result: qweasd, qwea1, 

[jira] [Comment Edited] (CASSANDRA-12060) Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns

2016-09-15 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493125#comment-15493125
 ] 

Alex Petrov edited comment on CASSANDRA-12060 at 9/15/16 12:09 PM:
---

In this case I think 3.x behaviour is correct: when using {{UPDATE ... IF}}, it 
returns the {{[applied]}} column and existing values that caused the 
transaction to fail, whereas 2.x returns just {{[applied]}} with {{false}}. 

We can leave the 2.x behaviour unchanged and introduce a special case in the dtests, 
or make a 2-line fix for 2.x like 

{code}
for (Composite prefix : conditions.keySet())
{
    if (prefix.isStatic() && conditions.size() == 1)
        slices[i++] = new ColumnSlice(Composites.EMPTY, Composites.EMPTY);
    else
        slices[i++] = prefix.slice();
}
{code}



was (Author: ifesdjeen):
In this case I think 3.x behaviour is correct: when using {{UPDATE ... IF }}, 
it returns the {{[applied]}} column and existing values that caused the 
transaction to fail, whereas 2.x returns just {{[applied]}} with {{false}}. 

We can leave the 2.x behaviour unchanged and introduce special-case in dtests 
or make a 2-line fix for 2.x like 

{code}
for (Composite prefix : conditions.keySet())
{
if (prefix.isStatic() && conditions.size() == 1)
slices[i++] = new ColumnSlice(Composites.EMPTY, Composites.EMPTY);
else
slices[i++] = prefix.slice();
}
{code}



> Establish consistent distinction between non-existing partition and NULL 
> value for LWTs on static columns
> -
>
> Key: CASSANDRA-12060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>
> When executing following CQL commands: 
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'datacenter1': '1' };
> USE test;
> CREATE TABLE testtable (a int, b int, s1 int static, s2 int static, v int, 
> PRIMARY KEY (a, b));
> INSERT INTO testtable (a,b,s1,s2,v) VALUES (2,2,2,null,2);
> DELETE s1 FROM testtable WHERE a = 2 IF s2 IN (10,20,30);
> {code}
> The output is different between {{2.x}} and {{3.x}}:
> 2.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied] | s2
> ---+--
>  False | null
> {code}
> 3.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> {{2.x}} would, however, return the same result if executed on a partition that 
> does not exist at all:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 5 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> It _might_ be related to static column LWTs, as I could not reproduce the same 
> behaviour with non-static column LWTs. The most recent change was 
> [CASSANDRA-10532], which enabled LWT operations on static columns with 
> partition keys only. -Another possible relation is [CASSANDRA-9842], which 
> removed the distinction between a {{null}} column and a non-existing row.- (struck 
> through since the same happens on pre-[CASSANDRA-9842] code).





[jira] [Commented] (CASSANDRA-12060) Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns

2016-09-15 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493125#comment-15493125
 ] 

Alex Petrov commented on CASSANDRA-12060:
-

In this case I think 3.x behaviour is correct: when using {{UPDATE ... IF }}, 
it returns the {{[applied]}} column and existing values that caused the 
transaction to fail, whereas 2.x returns just {{[applied]}} with {{false}}. 

We can leave the 2.x behaviour unchanged and introduce a special case in the dtests, 
or make a 2-line fix for 2.x like 

{code}
for (Composite prefix : conditions.keySet())
{
    if (prefix.isStatic() && conditions.size() == 1)
        slices[i++] = new ColumnSlice(Composites.EMPTY, Composites.EMPTY);
    else
        slices[i++] = prefix.slice();
}
{code}



> Establish consistent distinction between non-existing partition and NULL 
> value for LWTs on static columns
> -
>
> Key: CASSANDRA-12060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>
> When executing following CQL commands: 
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'datacenter1': '1' };
> USE test;
> CREATE TABLE testtable (a int, b int, s1 int static, s2 int static, v int, 
> PRIMARY KEY (a, b));
> INSERT INTO testtable (a,b,s1,s2,v) VALUES (2,2,2,null,2);
> DELETE s1 FROM testtable WHERE a = 2 IF s2 IN (10,20,30);
> {code}
> The output is different between {{2.x}} and {{3.x}}:
> 2.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied] | s2
> ---+--
>  False | null
> {code}
> 3.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> {{2.x}} would, however, return the same result if executed on a partition that 
> does not exist at all:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 5 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> It _might_ be related to static column LWTs, as I could not reproduce the same 
> behaviour with non-static column LWTs. The most recent change was 
> [CASSANDRA-10532], which enabled LWT operations on static columns with 
> partition keys only. -Another possible relation is [CASSANDRA-9842], which 
> removed the distinction between a {{null}} column and a non-existing row.- (struck 
> through since the same happens on pre-[CASSANDRA-9842] code).





[jira] [Commented] (CASSANDRA-12060) Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns

2016-09-15 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493095#comment-15493095
 ] 

Sylvain Lebresne commented on CASSANDRA-12060:
--

I really don't think we should modify anything in 2.1 and 2.2 here. And 3.x 
should behave as 2.x as much as possible, unless there is a good reason to 
think the 2.x behavior is broken and even then we should probably only consider 
fixing it in 3.x at this point.

> Establish consistent distinction between non-existing partition and NULL 
> value for LWTs on static columns
> -
>
> Key: CASSANDRA-12060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>
> When executing following CQL commands: 
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'datacenter1': '1' };
> USE test;
> CREATE TABLE testtable (a int, b int, s1 int static, s2 int static, v int, 
> PRIMARY KEY (a, b));
> INSERT INTO testtable (a,b,s1,s2,v) VALUES (2,2,2,null,2);
> DELETE s1 FROM testtable WHERE a = 2 IF s2 IN (10,20,30);
> {code}
> The output is different between {{2.x}} and {{3.x}}:
> 2.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied] | s2
> ---+--
>  False | null
> {code}
> 3.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> {{2.x}} would, however, return the same result if executed on a partition that 
> does not exist at all:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 5 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> It _might_ be related to static column LWTs, as I could not reproduce the same 
> behaviour with non-static column LWTs. The most recent change was 
> [CASSANDRA-10532], which enabled LWT operations on static columns with 
> partition keys only. -Another possible relation is [CASSANDRA-9842], which 
> removed the distinction between a {{null}} column and a non-existing row.- (struck 
> through since the same happens on pre-[CASSANDRA-9842] code).





[jira] [Commented] (CASSANDRA-12060) Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns

2016-09-15 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493049#comment-15493049
 ] 

Alex Petrov commented on CASSANDRA-12060:
-

I've re-added the test for non-existing values and added a patch for 2.1 and 
2.2 to match outputs on empty static columns (I also separated static and regular 
column conditions, similar to what you've done in the 3.x patch): 

|[2.1|https://github.com/ifesdjeen/cassandra/tree/12060-2.1-v2]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-2.1-v2-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-2.1-v2-dtest/]|
|[2.2|https://github.com/ifesdjeen/cassandra/tree/12060-2.2-v2]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-2.2-v2-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-2.2-v2-dtest/]|
|[3.0|https://github.com/ifesdjeen/cassandra/tree/12060-3.0-v2]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-3.0-v2-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-3.0-v2-dtest/]|
|[trunk|https://github.com/ifesdjeen/cassandra/tree/12060-trunk-v2]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-trunk-v2-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12060-trunk-v2-dtest/]|
||[dtest patch|https://github.com/ifesdjeen/cassandra-dtest/tree/12060-trunk-v2]||

I've run the tests only locally; CI is pending.


> Establish consistent distinction between non-existing partition and NULL 
> value for LWTs on static columns
> -
>
> Key: CASSANDRA-12060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>
> When executing following CQL commands: 
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'datacenter1': '1' };
> USE test;
> CREATE TABLE testtable (a int, b int, s1 int static, s2 int static, v int, 
> PRIMARY KEY (a, b));
> INSERT INTO testtable (a,b,s1,s2,v) VALUES (2,2,2,null,2);
> DELETE s1 FROM testtable WHERE a = 2 IF s2 IN (10,20,30);
> {code}
> The output is different between {{2.x}} and {{3.x}}:
> 2.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied] | s2
> ---+--
>  False | null
> {code}
> 3.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> {{2.x}} would, however, return the same result if executed on a partition that 
> does not exist at all:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 5 IF s2 = 5;
>  [applied]
> ---
>  False
> {code}
> It _might_ be related to static column LWTs, as I could not reproduce the same 
> behaviour with non-static column LWTs. The most recent change was 
> [CASSANDRA-10532], which enabled LWT operations on static columns with 
> partition keys only. -Another possible relation is [CASSANDRA-9842], which 
> removed the distinction between a {{null}} column and a non-existing row.- (struck 
> through since the same happens on pre-[CASSANDRA-9842] code).





[jira] [Updated] (CASSANDRA-12646) nodetool stopdaemon errors out on stopdaemon

2016-09-15 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-12646:
-
Status: Patch Available  (was: Open)

> nodetool stopdaemon errors out on stopdaemon
> 
>
> Key: CASSANDRA-12646
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12646
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.0.x
>
>
> {{nodetool stopdaemon}} works, but it prints a {{java.net.ConnectException: 
> Connection refused}} error message in {{NodeProbe.close()}} - which is 
> expected.
> Attached patch prevents that error message (i.e. it expects {{close()}} to 
> fail for {{stopdaemon}}).
> Additionally, on trunk a call to {{DD.clientInit()}} has been added, because 
> {{JVMStabilityInspector.inspectThrowable}} implicitly requires this.





[jira] [Commented] (CASSANDRA-12646) nodetool stopdaemon errors out on stopdaemon

2016-09-15 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492962#comment-15492962
 ] 

Robert Stupp commented on CASSANDRA-12646:
--

CI pending:

||cassandra-3.0|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...snazy:12646-nodetool-shutdown-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-12646-nodetool-shutdown-3.0-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-12646-nodetool-shutdown-3.0-dtest/lastSuccessfulBuild/]
||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:12646-nodetool-shutdown-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-12646-nodetool-shutdown-trunk-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-12646-nodetool-shutdown-trunk-dtest/lastSuccessfulBuild/]


> nodetool stopdaemon errors out on stopdaemon
> 
>
> Key: CASSANDRA-12646
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12646
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.0.x
>
>
> {{nodetool stopdaemon}} works, but it prints a {{java.net.ConnectException: 
> Connection refused}} error message in {{NodeProbe.close()}} - which is 
> expected.
> Attached patch prevents that error message (i.e. it expects {{close()}} to 
> fail for {{stopdaemon}}).
> Additionally, on trunk a call to {{DD.clientInit()}} has been added, because 
> {{JVMStabilityInspector.inspectThrowable}} implicitly requires this.





[jira] [Created] (CASSANDRA-12646) nodetool stopdaemon errors out on stopdaemon

2016-09-15 Thread Robert Stupp (JIRA)
Robert Stupp created CASSANDRA-12646:


 Summary: nodetool stopdaemon errors out on stopdaemon
 Key: CASSANDRA-12646
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12646
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Minor
 Fix For: 3.0.x


{{nodetool stopdaemon}} works, but it prints a {{java.net.ConnectException: 
Connection refused}} error message in {{NodeProbe.close()}} - which is expected.

Attached patch prevents that error message (i.e. it expects {{close()}} to fail 
for {{stopdaemon}}).

Additionally, on trunk a call to {{DD.clientInit()}} has been added, because 
{{JVMStabilityInspector.inspectThrowable}} implicitly requires this.





[jira] [Commented] (CASSANDRA-12417) Built-in AVG aggregate is much less useful than it should be

2016-09-15 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492876#comment-15492876
 ] 

Branimir Lambov commented on CASSANDRA-12417:
-

Yes, Kahan's should do if you want to avoid the division. Make sure it's not 
optimized away (try averaging 1e20, 1, -1e20).

> Built-in AVG aggregate is much less useful than it should be
> 
>
> Key: CASSANDRA-12417
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12417
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Branimir Lambov
>Assignee: Alex Petrov
>
> For fixed-size integer types overflow is all but guaranteed to happen, 
> yielding an incorrect result. While for sum this is somewhat acceptable, as the 
> result cannot fit the type, this is not the case for average.
> As the result of average is always within the range of the source type, 
> failing to produce it only signifies a bad implementation. Yes, one can solve 
> this by type-casting, but do we really want to always have to be telling 
> people that the correct spelling of the average function is 
> {{cast(avg(cast(value as bigint)) as int)}}, especially if this is so 
> trivial to fix?
> Additionally, the straightforward addition we use for floating point versions 
> is not a good choice numerically for larger numbers of values. We should 
> switch to a more stable version, e.g. iterative mean using {{avg = avg + 
> (value - avg) / count}}.
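
The overflow problem and the iterative mean from the description can be illustrated with a short stand-alone sketch. This is not the Cassandra aggregate implementation: the class {{IterativeMean}} and its methods are hypothetical, and keeping the running mean in a {{double}} is an assumption made to keep the example small.

```java
// Hypothetical illustration, not the Cassandra AVG implementation.
final class IterativeMean {
    /** Sum-then-divide in the source type: the int sum can overflow. */
    static int naiveIntAverage(int[] values) {
        int sum = 0;
        for (int value : values)
            sum += value; // silently wraps around on overflow
        return sum / values.length;
    }

    /** Running mean: avg = avg + (value - avg) / count, never forms the full sum. */
    static double iterativeMean(int[] values) {
        double avg = 0.0;
        long count = 0;
        for (int value : values) {
            count++;
            avg = avg + (value - avg) / count;
        }
        return avg;
    }
}
```

For two values of {{Integer.MAX_VALUE}}, the naive version wraps around and returns -1, whereas the running mean stays at {{Integer.MAX_VALUE}}, which fits the source type.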





[jira] [Commented] (CASSANDRA-12237) Cassandra stress graphing is broken

2016-09-15 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492839#comment-15492839
 ] 

Sylvain Lebresne commented on CASSANDRA-12237:
--

Are you sure you didn't forget to push the rebase to GitHub? The link above 
(https://github.com/chbatey/cassandra-1/tree/stress-graph-logging) still shows 
the commit being based on trunk from the beginning of the month.

> Cassandra stress graphing is broken
> ---
>
> Key: CASSANDRA-12237
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12237
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Christopher Batey
>Assignee: Christopher Batey
> Fix For: 3.x
>
>
> Cassandra stress relies on a tmp file with the stress output so it can parse 
> it and put it into the graph html.
> However, the contents of this file are now broken:
> {code}
> Sleeping 2s...Sleeping 2s...
> Sleeping 2s...
> Warming up WRITE with 5 iterations...Warming up WRITE with 5 
> iterations...
> Warming up WRITE with 5 iterations...
> Running WRITE with 500 threads 10 secondsRunning WRITE with 500 threads 10 
> seconds
> Running WRITE with 500 threads 10 seconds
> ...
> {code}
> This is because we create a {{MultiPrintStream}} that inherits from 
> {{PrintWriter}} and then delegates each call to super as well as to a list 
> of other PrintWriters.
> The call to super for println comes back into our print method, so every line 
> gets logged multiple times as we run the for (PrintStream s: newStreams) loop many 
> times.
> We can change this to use composition and our own interface if we want to 
> use a composite for logging the results.
> The duplicated lines mean that parsing of this file does not quite work and the 
> aggregate stats do not work in the produced graphs.
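
The composition approach suggested in the description might look roughly like the sketch below. It is not the actual patch: the interface {{ResultLogger}} and the class {{CompositeResultLogger}} are names invented for this example. Because the composite does not extend a JDK stream class, there is no inherited {{println}} that can call back into an overridden {{print}}, so each target receives every line exactly once.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the stress tool would log through this
// instead of through a PrintStream subclass.
interface ResultLogger {
    void println(String line);
}

// Hypothetical composite: fans each line out to all registered streams.
final class CompositeResultLogger implements ResultLogger {
    private final List<PrintStream> targets = new ArrayList<>();

    void addStream(PrintStream target) {
        targets.add(target);
    }

    @Override
    public void println(String line) {
        // Plain delegation: no call to super, so no re-entry and no duplicates.
        for (PrintStream target : targets)
            target.println(line);
    }
}
```

A caller would register {{System.out}} and the stream backing the tmp file, then log only through the {{ResultLogger}} interface.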





[jira] [Commented] (CASSANDRA-12417) Built-in AVG aggregate is much less useful than it should be

2016-09-15 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492835#comment-15492835
 ] 

Benjamin Lerer commented on CASSANDRA-12417:


[~blambov] For summing floating point numbers, do you think it would make sense 
to use the [Kahan summation 
algorithm|https://en.wikipedia.org/wiki/Kahan_summation_algorithm]?
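
For reference, a minimal sketch of Kahan (compensated) summation next to plain summation. The class {{KahanSum}} and its method names are made up for this example and are not taken from any patch on this ticket.

```java
// Hypothetical illustration of Kahan (compensated) summation.
final class KahanSum {
    /** Plain left-to-right summation. */
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double value : values)
            sum += value;
        return sum;
    }

    /** Kahan summation: carries the rounding error of each addition forward. */
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // low-order bits lost so far
        for (double value : values) {
            double corrected = value - compensation;
            double next = sum + corrected;
            // (next - sum) is what was actually added; the difference to
            // 'corrected' is the rounding error of this step.
            compensation = (next - sum) - corrected;
            sum = next;
        }
        return sum;
    }
}
```

As an example, adding 1.0 followed by a million values of 1e-16 leaves the plain sum at exactly 1.0, because each small addend is rounded away, while the compensated sum ends up close to 1.0000000001.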

> Built-in AVG aggregate is much less useful than it should be
> 
>
> Key: CASSANDRA-12417
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12417
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Branimir Lambov
>Assignee: Alex Petrov
>
> For fixed-size integer types overflow is all but guaranteed to happen, 
> yielding an incorrect result. While for sum this is somewhat acceptable, as the 
> result cannot fit the type, this is not the case for average.
> As the result of average is always within the range of the source type, 
> failing to produce it only signifies a bad implementation. Yes, one can solve 
> this by type-casting, but do we really want to always have to be telling 
> people that the correct spelling of the average function is 
> {{cast(avg(cast(value as bigint)) as int)}}, especially if this is so 
> trivial to fix?
> Additionally, the straightforward addition we use for floating point versions 
> is not a good choice numerically for larger numbers of values. We should 
> switch to a more stable version, e.g. iterative mean using {{avg = avg + 
> (value - avg) / count}}.





[jira] [Commented] (CASSANDRA-8780) cassandra-stress should support multiple table operations

2016-09-15 Thread Ben Slater (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492826#comment-15492826
 ] 

Ben Slater commented on CASSANDRA-8780:
---

Just thought I'd give this a bump to see if I could get a review before I 
forget what I did :-)


> cassandra-stress should support multiple table operations
> -
>
> Key: CASSANDRA-8780
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8780
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Benedict
>Assignee: Ben Slater
>  Labels: stress
> Fix For: 3.x
>
> Attachments: 8780-trunk.patch
>
>



