Thanks, Mike.

Are there any other concerns we should address before we move to a vote?

On Wed, Feb 16, 2022 at 5:25 AM Mike Adamson <madam...@datastax.com> wrote:

> I have updated the CEP to reflect the recent discussions.
>
> OR support has been moved out of the version 1 scope. Index versioning and
> virtual table support are now covered in the Addenda.
>
> MikeA
>
> On 14 Feb 2022, at 15:35, Caleb Rackliffe <calebrackli...@gmail.com>
> wrote:
>
> Agreed there’s no reason to pull it out. I was just wondering what state
> it was in, given I didn’t see it mentioned in the CEP.
>
> On Feb 14, 2022, at 8:12 AM, Mike Adamson <madam...@datastax.com> wrote:
>
> > We don't need a whole "codec framework" for V1, but we're still
> > embedding some versioning information in the column index on-disk
> > structures, right?
>
> I’m not sure why we would want to pull the versioning code only to have to
> put it back in as soon as we need to change the on-disk format. We also
> need to consider whether the legacy format used by DSE is supported in OSS.
> I’m not sure of the policy on this although I strongly suspect that the
> answer is that it won’t be supported. Either way, it would seem to be a lot
> of work to pull the versioning code out at this point since it formed part
> of a major refactor of the SAI framework and plumbing.
>
> MikeA
>
> On 11 Feb 2022, at 18:47, Caleb Rackliffe <calebrackli...@gmail.com>
> wrote:
>
> Just finished reading the latest version of the CEP. Here are my thoughts:
>
> - We've already talked about OR queries, so I won't rehash that, but
> tokenization support seems like it might be another one of those places
> where we can cut scope if we want to get V1 out the door. It shouldn't be
> that hard to detangle from the rest of the code.
> - We mention the JMX metric ecosystem in the CEP, but not the related
> virtual tables. This isn't a big issue, and doesn't mean we need to change
> the CEP, but it might be helpful for those not familiar with the existing
> prototype to know they exist :)
> - It's probably below the line for CEP discussion, but the text and
> numeric index formats will probably change over time. We don't need a whole
> "codec framework" for V1, but we're still embedding some versioning
> information in the column index on-disk structures, right?
>
> To offset my obvious partiality around this CEP, I've already made an
> effort to raise some of the issues that may come up to challenge us from a
> macro perspective. It seems like the prevailing opinion here is that they
> are either surmountable or simply basic conceptual difficulties w/
> distributed secondary indexing.
>
> tl;dr I'm +1 on bringing this to a vote and starting to put together all
> the pieces for CASSANDRA-16052
> <https://issues.apache.org/jira/browse/CASSANDRA-16052> :)
>
> On Thu, Feb 10, 2022 at 11:26 AM Mike Adamson <madam...@datastax.com>
> wrote:
>
>> > I'd be interested to hear from Mike/Jason on the OR support topic, of
>> course.
>>
>> The support for OR within SAI is fairly minimal and will not work without
>> the non-SAI changes needed. Since the non-SAI OR changes are extensive it
>> would be better to bring those in under their own CEP.
>>
>> I’d leave the decision of whether to put the rest of SAI behind an
>> experimental flag to others. My preference would be to not do so because
>> the non-OR implementation has been tested and used in production for over a
>> year now.
>>
>> MikeA
>>
>> On 9 Feb 2022, at 13:06, bened...@apache.org wrote:
>>
>> > Is there some mechanism such as experimental flags, which would allow
>> the SAI-only OR support to be merged into trunk
>>
>> FWIW, I’m OK with this merging to trunk, either hidden behind a CI-only
>> flag or exposed to the user via some experimental flag (and a suitable
>> NEWS.txt). We’ve discussed the need to periodically merge feature branches
>> with trunk before they are complete. If the work is logically complete for
>> SAI, and we’re only pending work to make OR consistent between SAI and
>> non-SAI queries, I think that more than meets this criterion.
>>
>>
>>
>> *From: *Henrik Ingo <henrik.i...@datastax.com>
>> *Date: *Monday, 7 February 2022 at 12:03
>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Subject: *Re: [DISCUSS] CEP-7 Storage Attached Index
>> Thanks Benjamin for reviewing and raising this.
>>
>> While I don't speak for the CEP authors, just some thoughts from me:
>>
>> On Mon, Feb 7, 2022 at 11:18 AM Benjamin Lerer <ble...@apache.org> wrote:
>>
>> I would like to raise 2 points regarding the current CEP proposal:
>>
>> 1. There are mentions of some target versions and of the removal of SASI
>>
>> At this point, we have not agreed on any version numbers and I do not
>> feel that removing SASI should be part of the proposal for now.
>> It seems to me that we should first see how widely SAI is adopted
>> before talking about deprecating other solutions.
>>
>>
>>
>> This seems rather uncontroversial. I think the CEP template and previous
>> CEPs invite discussion on whether the new feature will or may replace
>> an existing feature. But at the same time, that's of course out of scope for
>> the work at hand. I have no opinion one way or the other myself.
>>
>>
>>
>> 2. OR queries
>>
>> It is unclear to me if the proposal is about adding OR support only for
>> SAI indexes or for other types of queries too.
>> In the past, we had the nasty habit in CQL of providing only partially
>> implemented features, which resulted in a bad user experience.
>> Some examples are:
>> * LIKE restrictions, which were introduced for the needs of SASI and were
>> never supported for other types of queries
>> * IS NOT NULL restrictions for MATERIALIZED VIEWS that are not supported
>> elsewhere
>> * != operator only supported for conditional inserts or updates
>> And there are unfortunately many more.
>>
>> We are currently slowly trying to fix those issues and make CQL a more
>> mature language. Consequently, I would like us to change our way of
>> doing things. If we introduce support for OR, it should also cover all the
>> other types of queries and be fully tested.
>> I also believe that it is a feature that due to its complexity fully
>> deserves its own CEP.
>>
>>
>>
>> The current code, which would be submitted for review after the CEP is
>> adopted, contains OR support beyond just SAI indexes. An initial
>> implementation targeted only queries where all columns in a WHERE clause
>> using OR were backed by an SAI index. This has since been extended to
>> support ALLOW FILTERING mode as well as OR with clustering key columns.
>> The current implementation is by no means perfect as general-purpose OR
>> support; the focus throughout was on implementing OR support in SAI. I'll
>> leave it to others to enumerate the exact limitations of the current
>> implementation.
>>
>> Seeing that Benedict also supports your point of view, I would steer the
>> conversation toward a project management perspective:
>> * How can we advance CEP-7 so that the bulk of the SAI code can still be
>> added to Cassandra, so that users can benefit from this new index type,
>> albeit without OR?
>> * This is also an important question because this is
>> a large block of code that will inevitably diverge if it's not in trunk.
>> Also, merging it to trunk will allow future enhancements, including the OR
>> syntax btw, to happen against trunk (aka upstream first).
>> * Since OR support is nevertheless a feature of SAI, it needs to be at
>> least unit tested, but ideally it would even be exposed so that it is possible
>> to test on the CQL level. Is there some mechanism such as experimental
>> flags, which would allow the SAI-only OR support to be merged into trunk,
>> while a separate CEP is focused on implementing "proper" general purpose OR
>> support? I should note that there is no guarantee that the OR CEP would be
>> implemented in time for the next release. So the answer to this point needs
>> to be something that doesn't violate the desire for good user experience.
>>
>> henrik
>>
>>
>>
>>
>>
>
>
