[ https://issues.apache.org/jira/browse/CASSANDRA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15501293#comment-15501293 ]
Maxim Podkolzine commented on CASSANDRA-12573: ---------------------------------------------- > SASI initially support multiple predicates, something like : WHERE ((col1=xxx > OR col2=yyy) AND (col3 LIKE '%zzz')) but it is not merged yet into the 3.x > trunk That's good news. When do you plan to merge it? > Wrong, a bug is something that does not work as expected e.g that does not > work as documented. As a customer I have a slightly different view on this. My expectations are based on prior experience and common sense. I understand when certain features that are usual in other products are not implemented by design. This is obviously not the case. My current impression is that this feature is half-baked and not well tested. But it's just my opinion. I think I have a stronger argument that this is a bug. I have created a DB and filled it with some data from my disk: ``` CREATE KEYSPACE Excelsior WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }; use excelsior; create table demo (id text primary key, name text, content text); CREATE CUSTOM INDEX name_index ON demo (name) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'mode': 'CONTAINS', 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer', 'analyzed': 'true' }; ``` After that I run the queries with '%' inside. As you can see multi-patterns are handled by AND: ``` cqlsh:excelsior> select id, name from demo where name like '%RevisionDiff%'; id | name --------------------------------------+---------------------------- 93dce11a-cfdd-4c16-b3b3-7537c7af03ec | RevisionDiffType.java 6586058f-bd57-4fc7-ae12-e6d8ddcd2ceb | RevisionDiffItem.java d16dff53-002b-4fe6-9a10-bb32425360e0 | RevisionDiffItemDTO.java bb20981e-714f-4eac-802f-6191dba5a301 | GetRevisionDiff.java 1c53574b-2eea-46f8-bcbc-5e295ef9c70a | RevisionDiffItemDTO.java 7366f852-d63c-4d07-86b3-18a3bf47e79b | RevisionDiffItemDTO.java 7f18accb-9832-4303-8227-43aa89534cde | RevisionDiffItemImpl.java (7 rows) cqlsh:excelsior> select id, name from demo where name like '%ItemImpl%'; id | name --------------------------------------+--------------------------- 603c1d12-4871-4244-896a-54ddb76dbd3b | FastTreeItemImpl.java 7f18accb-9832-4303-8227-43aa89534cde | RevisionDiffItemImpl.java (2 rows) cqlsh:excelsior> select id, name from demo where name like '%RevisionDiff%ItemImpl%'; id | name --------------------------------------+-------------------------- 7f18accb-9832-4303-8227-43aa89534cde | RevisionDiffItemImpl.java (1 rows) ``` > SASI index. Incorrect results for '%foo%bar%'-like search pattern. > ------------------------------------------------------------------- > > Key: CASSANDRA-12573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12573 > Project: Cassandra > Issue Type: Bug > Reporter: Mikhail Krupitskiy > Priority: Critical > Labels: sasi > > We use Cassandra 3.7 and have faced a strange behaviour of SELECT requests > with "LIKE '%foo%bar%'" constraints on a column with SASI index. > Below are few experiments that show this behaviour. > Experiment 1: > {noformat} > drop keyspace if exists kmv; > create keyspace if not exists kmv WITH REPLICATION = { 'class' : > 'SimpleStrategy', 'replication_factor':'1'} ; > use kmv; > CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text); > CREATE CUSTOM INDEX ON kmv.kmv ( c2 ) USING > 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { > 'mode': 'CONTAINS' > }; > insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ; > insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ; > insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ; > insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ; > insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ; > select c2 from kmv.kmv where c2 like '%w%a%'; > {noformat} > Expected result: qweasd, qwea1. > Actual result: no rows. > Experiment 2 (NOTE: definition of index is changed): > {noformat} > drop keyspace if exists kmv; > create keyspace if not exists kmv WITH REPLICATION = { 'class' : > 'SimpleStrategy', 'replication_factor':'1'} ; > use kmv; > CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text); > CREATE CUSTOM INDEX ON kmv.kmv ( c2 ) USING > 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { > 'mode': 'CONTAINS', > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer', > 'analyzed': 'true' > }; > insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ; > insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ; > insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ; > insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ; > insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ; > select c2 from kmv.kmv where c2 like '%w%a%'; > {noformat} > Expected result: qweasd, qwea1. > Actual result: asdqwe, qweasd, qwea1. > Experiment 3 (NOTE: primary key is compound now and inserted data was > changed): > {noformat} > drop keyspace if exists kmv; > create keyspace if not exists kmv WITH REPLICATION = { 'class' : > 'SimpleStrategy', 'replication_factor':'1'} ; > use kmv; > CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, > c1)); > CREATE CUSTOM INDEX ON kmv.kmv ( c2 ) USING > 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { > 'mode': 'CONTAINS', > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer', > 'analyzed': 'true' > }; > insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ; > insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ; > insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ; > insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ; > insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ; > select c2 from kmv.kmv where c2 like '%w%a%'; > {noformat} > Expected result: qweasd, qwea1. > Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe. > Experiment 4 (NOTE: search criteria is changed): > {noformat} > drop keyspace if exists kmv; > create keyspace if not exists kmv WITH REPLICATION = { 'class' : > 'SimpleStrategy', 'replication_factor':'1'} ; > use kmv; > CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, > c1)); > CREATE CUSTOM INDEX ON kmv.kmv ( c2 ) USING > 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { > 'mode': 'CONTAINS', > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer', > 'analyzed': 'true' > }; > insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ; > insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ; > insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ; > insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ; > insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ; > select c2 from kmv.kmv where c2 like '%w22%a%'; > {noformat} > Expected result: no rows. > Actual result: qweasd, qwea1, asdqwe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)