[ 
https://issues.apache.org/jira/browse/CASSANDRA-11067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131534#comment-15131534
 ] 

Pavel Yaskevich edited comment on CASSANDRA-11067 at 2/4/16 1:46 AM:
---------------------------------------------------------------------

If you don't add an analyzer that does stemming and tokenization to the column, 
it works exactly as you describe: "distributing" returns 0 results and the 
whole string returns 1. It's the tokenization feature that makes it behave the 
way it does in the example, because after tokenization every term of that 
string is a separate entity. Moreover, in the case of "distributing" only its 
stem, "distribut", is saved, which is why matching "distributing" against the 
original value "distribution" produces results. Making that work requires 
multiple additional SASI options; by default the index does none of that and 
behaves the way you describe.
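
To illustrate the difference, here is a minimal Python sketch of the two 
behaviours. It is not SASI code: the suffix-stripping stem() is a toy stand-in 
for a real stemmer, and the names (stem, tokenize, index_terms, matches) and 
the sample value "distribution of data" are made up for the example.

```python
import re

# Toy suffix list for illustration only; NOT the stemmer SASI actually uses.
SUFFIXES = ("ing", "ion", "ed", "s")


def stem(term):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_terms(value, analyzed):
    """Return the set of terms that would be stored for a column value."""
    if not analyzed:
        # No analyzer: the whole string is the single indexed term.
        return {value}
    # Analyzer with tokenization and stemming: one stemmed term per token.
    return {stem(token) for token in tokenize(value)}


def matches(query, value, analyzed):
    """Check whether a query matches a value under the chosen mode."""
    terms = index_terms(value, analyzed)
    if not analyzed:
        return query in terms
    query_terms = [stem(token) for token in tokenize(query)]
    return bool(query_terms) and all(t in terms for t in query_terms)


value = "distribution of data"  # hypothetical column value

# Default (no analyzer): only the whole string matches.
print(matches("distributing", value, analyzed=False))
print(matches("distribution of data", value, analyzed=False))

# With tokenization and stemming: both words reduce to "distribut".
print(matches("distributing", value, analyzed=True))
```

With the analyzer off, only the exact whole string matches; with it on, the 
query and the stored value are both reduced to the stem "distribut", so 
"distributing" matches a value that contains "distribution".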


was (Author: xedin):
If you don't add an analyzer to the column which does stemming and tokenization 
it would work exactly how you describe - "distributing" would return 0 results 
and whole string would be 1, it's tokenization feature which makes it work the 
way it does in the example because after tokenization every term in the of that 
string is a separate entity and even more in case of "distributing" only it's 
stem is going to be saved which is "distribut" that's why matching 
"distributing" vs. "distribution" which is an original value is going to 
produce results, but to make it work there are multiple additional SASI options 
needed, by default it's not going to do any of that. 

> Improve SASI syntax
> -------------------
>
>                 Key: CASSANDRA-11067
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11067
>             Project: Cassandra
>          Issue Type: Task
>          Components: CQL
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 3.4
>
>
> I think everyone agrees that a LIKE operator would be ideal, but that's 
> probably not in scope for an initial 3.4 release.
> Still, I'm uncomfortable with the initial approach of overloading = to mean 
> "satisfies index expression."  The problem is that it will be very difficult 
> to back out of this behavior once people are using it.
> I propose adding a new operator in the interim instead.  Call it MATCHES, 
> maybe.  With the exact same behavior that SASI currently exposes, just with a 
> separate operator rather than being rolled into =.



