Ok you're right, I get your point LIKE '%%esc%' --> startWith('%esc')
LIKE 'escape%%' --> = 'escape%' What I strongly suspect is that in the source code of SASI, we parse the % xxx % expression BEFORE applying escape. That will explain the observed behavior. E.g: LIKE '%%esc%' parsed as %xxx% where xxx = %esc LIKE 'escape%%' parsed as xxx% where xxx =escape% Let me check in the source code and try to reproduce the issue On Tue, Sep 13, 2016 at 7:24 PM, Mikhail Krupitskiy < mikhail.krupits...@jetbrains.com> wrote: > Looks like we have different understanding of what results are expected. > I based my understanding on http://docs.datastax.com/ > en/cql/3.3/cql/cql_using/useSASIIndex.html > According to the doc ‘esc’ is a pattern for exact match and I guess that > there is no semantical difference between two LIKE patterns (both of > patterns should be treated as ‘exact match'): ‘%%esc’ and ‘esc’. > > SELECT * FROM escape WHERE val LIKE '%%esc%'; --> Give all results > *containing* '%esc' so *%esc*apeme is a possible match and also escape > *%esc* > > Why ‘containing’? I expect that it should be ’starting’.. > > > SELECT * FROM escape WHERE val LIKE 'escape%%' --> Give all results > *starting* with 'escape%' so *escape%*me is a valid result and also > *escape%*esc > > Why ’starting’? I expect that it should be ‘exact matching’. > > Also I expect that “ LIKE ‘%s%sc%’ ” will return ‘escape%esc’ but it > returns nothing (CASSANDRA-12573). > > What I’m missing? > > Thanks, > Mikhail > > On 13 Sep 2016, at 19:31, DuyHai Doan <doanduy...@gmail.com> wrote: > > CREATE CUSTOM INDEX ON test.escape(val) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', > 'case_sensitive': 'false'}; > > I don't see any problem in the results you got > > SELECT * FROM escape WHERE val LIKE '%%esc%'; --> Give all results > *containing* '%esc' so *%esc*apeme is a possible match and also escape > *%esc* > > Why ‘containing’? I expect that it should be ’starting’.. > > > SELECT * FROM escape WHERE val LIKE 'escape%%' --> Give all results > *starting* with 'escape%' so *escape%*me is a valid result and also > *escape%*esc > > Why ’starting’? I expect that it should be ‘exact matching’. > > > On Tue, Sep 13, 2016 at 5:58 PM, Mikhail Krupitskiy < > mikhail.krupits...@jetbrains.com> wrote: > >> Thanks for the reply. >> Could you please provide what index definition did you use? >> With the index from my script I get the following results: >> >> cqlsh:test> select * from escape; >> >> id | val >> ----+----------- >> 1 | %escapeme >> 2 | escape%me >> * 3 | escape%esc* >> >> Contains search >> >> cqlsh:test> SELECT * FROM escape WHERE val LIKE '%%esc%'; >> >> id | val >> ----+----------- >> 1 | %escapeme >> 3 >> * | escape%esc*(2 rows) >> >> >> Prefix search >> >> cqlsh:test> SELECT * FROM escape WHERE val LIKE 'escape%%'; >> >> id | val >> ----+----------- >> 2 | escape%me >> 3 >> * | escape%esc* >> >> Thanks, >> Mikhail >> >> On 13 Sep 2016, at 18:16, DuyHai Doan <doanduy...@gmail.com> wrote: >> >> Use % to escape % >> >> cqlsh:test> select * from escape; >> >> id | val >> ----+----------- >> 1 | %escapeme >> 2 | escape%me >> >> >> Contains search >> >> cqlsh:test> SELECT * FROM escape WHERE val LIKE '%%esc%'; >> >> id | val >> ----+----------- >> 1 | %escapeme >> >> (1 rows) >> >> >> Prefix search >> >> cqlsh:test> SELECT * FROM escape WHERE val LIKE 'escape%%'; >> >> id | val >> ----+----------- >> 2 | escape%me >> >> On Tue, Sep 13, 2016 at 5:06 PM, Mikhail Krupitskiy < >> mikhail.krupits...@jetbrains.com> wrote: >> >>> Hi Cassandra guys, >>> >>> I use Cassandra 3.7 and wondering how to use ‘%’ as a simple char in a >>> search pattern. >>> Here is my test script: >>> >>> DROP keyspace if exists kmv; >>> CREATE keyspace if not exists kmv WITH REPLICATION = { 'class' : >>> 'SimpleStrategy', 'replication_factor':'1'} ; >>> USE kmv; >>> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY >>> KEY(id, c1)); >>> CREATE CUSTOM INDEX ON kmv.kmv ( c2 ) USING ' >>> org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { >>> 'analyzed' : 'true', >>> 'analyzer_class' : 'org.apache.cassandra.index.sa >>> si.analyzer.NonTokenizingAnalyzer', >>> 'case_sensitive' : 'false', >>> 'mode' : 'CONTAINS' >>> }; >>> >>> INSERT into kmv (id, c1, c2) values (1, 'f22', 'qwe%asd'); >>> INSERT into kmv (id, c1, c2) values (2, 'f22', '%asd'); >>> INSERT into kmv (id, c1, c2) values (3, 'f22', 'asd%'); >>> INSERT into kmv (id, c1, c2) values (4, 'f22', 'asd%1'); >>> INSERT into kmv (id, c1, c2) values (5, 'f22', 'qweasd'); >>> >>> SELECT c2 from kmv.kmv where c2 like ‘_pattern_'; >>> >>> _pattern_ '%%%' finds all columns that contain %. >>> How to find columns that start form ‘%’ or ‘%a’? >>> How to find columns that end with ‘%’? >>> What about more complex patterns: '%qwe%a%sd%’? How to differentiate ‘%’ >>> char form % as a command symbol? (Also there is a related issue >>> CASSANDRA-12573). >>> >>> >>> Thanks, >>> Mikhail >> >> >> >> > >