Thanks, Mike. Are there any other concerns we should address before we move to a vote?
On Wed, Feb 16, 2022 at 5:25 AM Mike Adamson <madam...@datastax.com> wrote:
> I have updated the CEP to reflect the recent discussions.
>
> OR support has moved out of version 1 support. Index versioning and
> virtual table support are now covered in the Addenda.
>
> MikeA
>
> On 14 Feb 2022, at 15:35, Caleb Rackliffe <calebrackli...@gmail.com>
> wrote:
>
> Agreed there’s no reason to pull it out. I was just wondering what state
> it was in, given I didn’t see it mentioned in the CEP.
>
> On Feb 14, 2022, at 8:12 AM, Mike Adamson <madam...@datastax.com> wrote:
>
> > We don't need a whole "codec framework" for V1, but we're still
> > embedding some versioning information in the column index on-disk
> > structures, right?
>
> I’m not sure why we would want to pull the versioning code out only to
> have to put it back in as soon as we need to change the on-disk format.
> We also need to consider whether the legacy format used by DSE is
> supported in OSS. I’m not sure of the policy on this, although I strongly
> suspect that the answer is that it won’t be supported. Either way, it
> would seem to be a lot of work to pull the versioning code out at this
> point, since it formed part of a major refactor of the SAI framework and
> plumbing.
>
> MikeA
>
> On 11 Feb 2022, at 18:47, Caleb Rackliffe <calebrackli...@gmail.com>
> wrote:
>
> Just finished reading the latest version of the CEP. Here are my thoughts:
>
> - We've already talked about OR queries, so I won't rehash that, but
>   tokenization support seems like it might be another one of those places
>   where we can cut scope if we want to get V1 out the door. It shouldn't
>   be that hard to detangle from the rest of the code.
> - We mention the JMX metric ecosystem in the CEP, but not the related
>   virtual tables. This isn't a big issue, and doesn't mean we need to
>   change the CEP, but it might be helpful for those not familiar with the
>   existing prototype to know they exist :)
> - It's probably below the line for CEP discussion, but the text and
>   numeric index formats will probably change over time. We don't need a
>   whole "codec framework" for V1, but we're still embedding some
>   versioning information in the column index on-disk structures, right?
>
> To offset my obvious partiality around this CEP, I've already made an
> effort to raise some of the issues that may come up to challenge us from a
> macro perspective. It seems like the prevailing opinion here is that they
> are either surmountable or simply basic conceptual difficulties w/
> distributed secondary indexing.
>
> tl;dr I'm +1 on bringing this to a vote and starting to put together all
> the pieces for CASSANDRA-16052
> <https://issues.apache.org/jira/browse/CASSANDRA-16052> :)
>
> On Thu, Feb 10, 2022 at 11:26 AM Mike Adamson <madam...@datastax.com>
> wrote:
>
>> > I'd be interested to hear from Mike/Jason on the OR support topic, of
>> > course.
>>
>> The support for OR within SAI is fairly minimal and will not work without
>> the non-SAI changes needed. Since the non-SAI OR changes are extensive, it
>> would be better to bring those in under their own CEP.
>>
>> I’d leave the decision of whether to put the rest of SAI behind an
>> experimental flag to others. My preference would be to not do so because
>> the non-OR implementation has been tested and used in production for over
>> a year now.
>>
>> MikeA
>>
>> On 9 Feb 2022, at 13:06, bened...@apache.org wrote:
>>
>> > Is there some mechanism such as experimental flags, which would allow
>> > the SAI-only OR support to be merged into trunk
>>
>> FWIW, I’m OK with this merging to trunk, either hidden behind a CI-only
>> flag or exposed to the user via some experimental flag (and a suitable
>> NEWS.txt). We’ve discussed the need to periodically merge feature branches
>> with trunk before they are complete. If the work is logically complete for
>> SAI, and we’re only pending work to make OR consistent between SAI and
>> non-SAI queries, I think that more than meets this criterion.
>>
>> *From: *Henrik Ingo <henrik.i...@datastax.com>
>> *Date: *Monday, 7 February 2022 at 12:03
>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Subject: *Re: [DISCUSS] CEP-7 Storage Attached Index
>>
>> Thanks Benjamin for reviewing and raising this.
>>
>> While I don't speak for the CEP authors, just some thoughts from me:
>>
>> On Mon, Feb 7, 2022 at 11:18 AM Benjamin Lerer <ble...@apache.org> wrote:
>>
>> I would like to raise 2 points regarding the current CEP proposal:
>>
>> 1. There are mentions of some target versions and of the removal of SASI
>>
>> At this point, we have not agreed on any version numbers and I do not
>> feel that removing SASI should be part of the proposal for now.
>> It seems to me that we should first see the adoption surrounding SAI
>> before talking about deprecating other solutions.
>>
>> This seems rather uncontroversial. I think the CEP template and previous
>> CEPs invite the discussion on whether the new feature will or may replace
>> an existing feature. But at the same time that's of course out of scope
>> for the work at hand. I have no opinion one way or the other myself.
>>
>> 2. OR queries
>>
>> It is unclear to me if the proposal is about adding OR support only for
>> SAI indexes or for other types of queries too.
>> In the past, we had the nasty habit for CQL to provide only partially
>> implemented features, which resulted in a bad user experience.
>> Some examples are:
>> * LIKE restrictions which were introduced for the need of SASI and were
>>   never supported for other types of queries
>> * IS NOT NULL restrictions for MATERIALIZED VIEWS that are not supported
>>   elsewhere
>> * != operator only supported for conditional inserts or updates
>> And there are unfortunately many more.
>>
>> We are currently slowly trying to fix those issues and make CQL a more
>> mature language. As a consequence, I would like us to change our way of
>> doing things. If we introduce support for OR, it should also cover all
>> the other types of queries and be fully tested.
>> I also believe that it is a feature that, due to its complexity, fully
>> deserves its own CEP.
>>
>> The current code that would be submitted for review after the CEP is
>> adopted contains OR support beyond just SAI indexes. An initial
>> implementation first targeted only such queries where all columns in a
>> WHERE clause using OR needed to be backed by an SAI index. This has since
>> been extended to also support ALLOW FILTERING mode as well as OR with
>> clustering key columns. The current implementation is by no means perfect
>> as a general purpose OR support; the focus all the time was on
>> implementing OR support in SAI. I'll leave it to others to enumerate
>> exactly the limitations of the current implementation.
>>
>> Seeing that Benedict also supports your point of view, I would steer the
>> conversation more into a project management perspective:
>> * How can we advance CEP-7 so that the bulk of the SAI code can still be
>>   added to Cassandra, so that users can benefit from this new index type,
>>   albeit without OR?
>> * This is also an important question from the point of view that this is
>>   a large block of code that will inevitably diverge if it's not in
>>   trunk. Also, merging it to trunk will allow future enhancements,
>>   including the OR syntax btw, to happen against trunk (aka upstream
>>   first).
>> * Since OR support nevertheless is a feature of SAI, it needs to be at
>>   least unit tested, but ideally would even be exposed so that it is
>>   possible to test on the CQL level. Is there some mechanism such as
>>   experimental flags, which would allow the SAI-only OR support to be
>>   merged into trunk, while a separate CEP is focused on implementing
>>   "proper" general purpose OR support? I should note that there is no
>>   guarantee that the OR CEP would be implemented in time for the next
>>   release. So the answer to this point needs to be something that doesn't
>>   violate the desire for good user experience.
>>
>> henrik