+1 to the proposal from a CQL perspective

*However*, whether we do this in the context of a simple partition
restriction, a global index query, or a partition-restricted index query,
the NOT operator is most likely to be useful only in a post-filtering
capacity (e.g. WHERE indexed_set CONTAINS 'foo' AND indexed_set NOT
CONTAINS 'bar').

Using Lucene as an example, you might remember that it doesn't (at least
IIRC) allow single-predicate NOT queries (see
https://stackoverflow.com/questions/3604771/not-query-in-lucene). It's easy
for an inverted index to find matches efficiently, but not so easy for it
to find non-matches. This is similar to, but even less straightforward
than, the issue you have with boolean queries when you query the less
selective of the two possible values. You can create an accompanying
"negated" index, but that's not free, of course.
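
To make the asymmetry concrete, here is a minimal sketch of a toy inverted
index in Python. It is not how Lucene or any Cassandra index is
implemented; the class and method names (InvertedIndex, contains,
contains_and_not_contains, not_contains) are hypothetical and exist only
for illustration.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: term -> ids of rows whose set column contains it.

    Illustrative only; all names here are hypothetical.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # term -> row ids
        self.rows = {}                    # row id -> full set of values

    def add(self, row_id, values):
        self.rows[row_id] = set(values)
        for value in values:
            self.postings[value].add(row_id)

    def contains(self, term):
        # Positive predicate: a single posting-list lookup.
        return set(self.postings.get(term, ()))

    def contains_and_not_contains(self, wanted, unwanted):
        # The positive predicate drives the lookup; the NOT predicate is
        # applied only as a post-filter over the (hopefully small) candidates.
        candidates = self.contains(wanted)
        return {r for r in candidates if unwanted not in self.rows[r]}

    def not_contains(self, term):
        # A NOT predicate on its own has no posting list to read, so this
        # sketch has to enumerate every row and subtract the matches.
        return set(self.rows) - self.contains(term)


idx = InvertedIndex()
idx.add(1, {"foo", "bar"})
idx.add(2, {"foo"})
idx.add(3, {"baz"})

print(idx.contains_and_not_contains("foo", "bar"))
print(idx.not_contains("bar"))
```

In this sketch the combined query only touches the rows returned for
'foo', while the standalone NOT has to walk the full row set, which is
the cost a "negated" index would trade storage and write work to avoid.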

Again, this is not necessarily a problem with the CEP, but I want to call
out the potential complication.

On Thu, Apr 6, 2023 at 4:01 PM Jeremy Hanna <jeremy.hanna1...@gmail.com>
wrote:

> Considering all of the examples require using ALLOW FILTERING with the
> partition key specified, I think it's appropriate to consider separating
> out use of ALLOW FILTERING within a partition versus ALLOW FILTERING across
> the whole table.  A few years back we had a discussion about this in ASF
> slack in the context of capability restrictions and it seems relevant
> here.  That is, we don't want people to get comfortable using ALLOW
> FILTERING across the whole table.  However, there are times when ALLOW
> FILTERING within a partition is reasonable.
>
> Ticket to discuss separating them out:
> https://issues.apache.org/jira/browse/CASSANDRA-15803
> Summary: Perhaps add an optional [WITHIN PARTITION] or something similar
> to make it backwards compatible and indicate that this is purely within the
> specified partition.
>
> This also gives us the ability to disallow table scan types of ALLOW
> FILTERING from a guard rail perspective, because the intent is explicit.
> That way, operators could disallow ALLOW FILTERING but allow ALLOW FILTERING
> WITHIN PARTITION, or whatever is decided.
>
> I do NOT want to hijack a good discussion but I thought this separation
> could be useful within this context.
>
> Jeremy
>
> On Apr 6, 2023, at 3:00 PM, Patrick McFadin <pmcfa...@gmail.com> wrote:
>
> I love that this is finally coming to Cassandra. Absolutely hate that,
> once again, we'll be endorsing the use of ALLOW FILTERING. This is an
> anti-pattern that keeps getting legitimized.
>
> Hot take: Should we just not do Milestones 1 and 2 and wait for an
> index-only Milestone 3?
>
> Patrick
>
> On Thu, Apr 6, 2023 at 10:04 AM David Capwell <dcapw...@apple.com> wrote:
>
>> Overall I welcome this feature; I was trying to use this around 1-2 months
>> back and found we didn’t support it, so glad to see it coming!
>>
>> From a testing point of view, I think we would want to have good fuzz
>> testing covering complex types (frozen/non-frozen collections, tuples, UDTs,
>> etc.) and reverse ordering; both areas tend to cause the most problems
>> for new features (and existing ones).
>>
>> We will also want a way to disable this feature, and optionally to disable
>> individual parts of it (such as m2’s NOT IN for partition keys).
>>
>> > On Apr 4, 2023, at 2:28 AM, Piotr Kołaczkowski <pkola...@datastax.com>
>> wrote:
>> >
>> > Hi everyone!
>> >
>> > I created a new CEP for adding NOT support to the query language and
>> > want to start discussion around it:
>> >
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+NOT+operator
>> >
>> > Happy to get your feedback.
>> > --
>> > Piotr
>>
>>
>
