[ 
https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076044#comment-14076044
 ] 

Sergio Bossa commented on CASSANDRA-7575:
-----------------------------------------

[~adelapena], following the review of your patch, I believe that while it works 
in practice, pulling the index searchers in SelectStatement#getRangeCommand and 
validating them that way is a bit odd, more specifically:
* SelectStatement#getRangeCommand may be called even if a 2i query is not 
present, so enforcing 2i validation there is a bit misleading and unexpected.
* SecondaryIndexSearcher#validate is called with the whole list of index 
expressions, which means each searcher implementation will have to go through 
the list to inspect each expression and decide if that specific expression was 
targeted for it and is wrong, or was just for another searcher.

I'd rather rework the patch in the following way:
* Add a SecondaryIndexManager#validateIndexSearchersForQuery method that works 
similarly to getIndexSearchersForQuery, but rather than just getting the index 
by each column, it also validates it against the proper column/expression by 
calling SecondaryIndexSearcher#validate(IndexExpression).
* Call SecondaryIndexManager#validateIndexSearchersForQuery from 
SelectStatement#RawStatement#validateSecondaryIndexSelections

That should improve encapsulation and responsibility placement and provide 
better 2i APIs.

Finally, I would add a few tests.

> Custom 2i validation
> --------------------
>
>                 Key: CASSANDRA-7575
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Andrés de la Peña
>            Assignee: Andrés de la Peña
>            Priority: Minor
>              Labels: 2i, cql3, secondaryIndex, secondary_index, select
>             Fix For: 2.1.0, 3.0
>
>         Attachments: 2i_validation.patch
>
>
> There are several projects using custom secondary indexes as an extension 
> point to integrate C* with other systems such as Solr or Lucene. The usual 
> approach is to embed third party indexing queries in CQL clauses. 
> For example, [DSE 
> Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise]
>  embeds Solr syntax this way:
> {code}
> SELECT title FROM solr WHERE solr_query='title:natio*';
> {code}
> [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom 
> JSON syntax for searching in Lucene indexes:
> {code}
> SELECT * FROM tweets WHERE lucene='{
>     filter : {
>         type: "range",
>         field: "time",
>         lower: "2014/04/25",
>         upper: "2014/04/1"
>     },
>     query  : {
>         type: "phrase", 
>         field: "body", 
>         values: ["big", "data"]
>     },
>     sort  : {fields: [ {field:"time", reverse:true} ] }
> }';
> {code}
> Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses the 
> Stratio's open source JSON syntax:
> {code}
> SELECT name,company FROM PERSON WHERE stargate ='{
>     filter: {
>         type: "range",
>         field: "company",
>         lower: "a",
>         upper: "p"
>     },
>     sort:{
>        fields: [{field:"name",reverse:true}]
>     }
> }';
> {code}
> These syntaxes are validated by the corresponding 2i implementation. This 
> validation is done behind the StorageProxy command distribution. So, far as I 
> know, there is no way to give rich feedback about syntax errors to CQL users.
> I'm uploading a patch with some changes trying to improve this. I propose 
> adding an empty validation method to SecondaryIndexSearcher that can be 
> overridden by custom 2i implementations:
> {code}
> public void validate(List<IndexExpression> clause) {}
> {code}
> And call it from SelectStatement#getRangeCommand:
> {code}
> ColumnFamilyStore cfs = 
> Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
>         for (SecondaryIndexSearcher searcher : 
> cfs.indexManager.getIndexSearchersForQuery(expressions))
>         {
>             try
>             {
>                 searcher.validate(expressions);
>             }
>             catch (RuntimeException e)
>             {
>                 String exceptionMessage = e.getMessage();
>                 if (exceptionMessage != null 
>                         && !exceptionMessage.trim().isEmpty())
>                     throw new InvalidRequestException(
>                             "Invalid index expression: " + e.getMessage());
>                 else
>                     throw new InvalidRequestException(
>                             "Invalid index expression");
>             }
>         }
> {code}
> In this way C* allows custom 2i implementations to give feedback about syntax 
> errors.
> We are currently using these changes in a fork with no problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to