[ 
https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717039#comment-17717039
 ] 

Benjamin Lerer commented on CASSANDRA-15803:
--------------------------------------------

I believe that we should be careful here. The way {{ALLOW FILTERING}} works 
today is relatively simple. It is required every time somebody does some 
filtering. It does not matter at which level (across partitions, across rows or 
within rows). Potentially each of them can be bad. Now, I agree that people 
should decide what they are fine with: multi-partition scan, partition scan 
or row scan. The question is what do you do with index queries that filter 
across multiple rows? Do you consider them equivalent to a partition scan?
Trying to extend the {{ALLOW FILTERING}} syntax will just make the language 
more complicated. Most people already fail to understand it. It makes sense to 
me to remove it and simply rely on a capability limitation framework, which 
would let operators decide at a more granular level what to authorize and for 
which keyspace or table.
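
To make the idea of a capability limitation framework more concrete, here is a minimal Python sketch of a per-keyspace and per-table policy. Everything in it is hypothetical and for illustration only: the names FilterScope, POLICY, max_allowed_scope and is_authorized, and the "shop" keyspace, do not exist in Cassandra. It also assumes the three levels are ordered, so that authorizing a broader scan implies authorizing the narrower ones. It does not answer the question of how index queries should be classified.

```python
from enum import IntEnum


class FilterScope(IntEnum):
    """Filtering levels from the comment, ordered narrowest to broadest (assumed ordering)."""
    NONE = 0
    ROW = 1              # filtering within rows
    PARTITION = 2        # filtering across the rows of one partition
    MULTI_PARTITION = 3  # filtering across partitions


# Hypothetical operator policy: broadest scope allowed per (keyspace, table).
# A table of None stands for the keyspace-wide default.
POLICY = {
    ("shop", None): FilterScope.ROW,
    ("shop", "orders"): FilterScope.PARTITION,
}


def max_allowed_scope(keyspace, table, policy=POLICY):
    """A table-level entry wins over the keyspace default; no entry means no filtering."""
    if (keyspace, table) in policy:
        return policy[(keyspace, table)]
    return policy.get((keyspace, None), FilterScope.NONE)


def is_authorized(keyspace, table, required, policy=POLICY):
    """True if the scope a query requires is within what the operator allowed."""
    return required <= max_allowed_scope(keyspace, table, policy)
```

With this policy, a partition scan on shop.orders would be authorized, while the same scan on any other table of the shop keyspace would be rejected, without any change to the query syntax.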

> Separate out allow filtering scanning through a partition versus scanning 
> over the table
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15803
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15803
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL/Syntax
>            Reporter: Jeremy Hanna
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>
> Currently allow filtering can mean two things in the spirit of "avoid 
> operations that don't seek to a specific row or sequential rows of data."  
> First, it can mean scanning across the entire table to meet the criteria of 
> the query.  That's almost always a bad thing and should be discouraged or 
> disabled (see CASSANDRA-8303).  Second, it can mean filtering within a 
> specific partition.  For example, in a query you could specify the full 
> partition key and if you specify a criterion on a non-key field, it requires 
> allow filtering.
> The second case is significantly less work, since it only has to 
> scan through a partition.  It is still extra work over seeking to a specific 
> row and getting N sequential rows though.  So while an application developer 
> and/or operator needs to be cautious about this second type, it's not 
> necessarily a bad thing, depending on the table and the use case.
> I propose that we separate the way to specify allow filtering across an 
> entire table from specifying allow filtering across a partition in a 
> backwards compatible way.  One idea that was brought up in Slack in the 
> cassandra-dev room was to have allow filtering mean the superset - scanning 
> across the table.  Then if you want to specify that you *only* want to scan 
> within a partition you would use something like
> {{ALLOW FILTERING [WITHIN PARTITION]}}
> So it will succeed if you specify non-key criteria within a single partition, 
> but fail, with a message saying that it requires the full allow filtering, 
> if the query would scan the whole table.  This 
> would allow for a backwards compatible full allow filtering while allowing a 
> user to specify that they want to just scan within a partition, but error out 
> if trying to scan a full table.
> This is potentially also related to the capability limitation framework by 
> which operators could more granularly specify what features are allowed or 
> disallowed per user, discussed in CASSANDRA-8303.  This way an operator could 
> disallow the more general allow filtering while allowing the partition scan 
> (or disallow them both at their discretion).
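
The behaviour proposed in the description can be sketched in Python as follows. This is a simplified model, not Cassandra's actual validation logic: it assumes a query counts as a partition scan when every partition key column is restricted by equality, and it ignores clustering columns, indexes and IN restrictions. The names filtering_scope, validate and FilteringError, and the column names in the usage note, are hypothetical.

```python
class FilteringError(Exception):
    """Raised when a query needs a broader permission than it declared."""


def filtering_scope(partition_key, equality_columns, filtered_columns):
    """Classify a query as needing no filtering, a partition scan or a table scan."""
    if not filtered_columns:
        return "none"
    # Full partition key given: filtering stays within a single partition.
    if set(partition_key) <= set(equality_columns):
        return "partition"
    return "table"


def validate(scope, allow=None):
    """allow is None, "FILTERING" or "FILTERING WITHIN PARTITION" (proposed syntax)."""
    if scope == "none":
        return
    if allow is None:
        raise FilteringError("this query requires ALLOW FILTERING")
    if scope == "table" and allow == "FILTERING WITHIN PARTITION":
        raise FilteringError(
            "this query scans the whole table and requires the full ALLOW FILTERING"
        )
```

For a table with partition key user_id, a query restricting user_id and a non-key column status is classified as "partition" and passes with either form. The same query without the user_id restriction is classified as "table" and passes only with the full ALLOW FILTERING, which keeps existing queries working.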


