Re: [DISCUSS] Stream Pipelines on hot paths

2024-05-30 Thread Caleb Rackliffe
+1 On Thu, May 30, 2024 at 11:29 AM Benedict wrote: > Since it’s related to the logging discussion we’re already having, I have > seen stream pipelines showing up in a lot of traces recently. I am > surprised; I thought it was understood that they shouldn’t be used on hot > paths as they are
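
For readers unfamiliar with the term, a stream pipeline and a plain-loop equivalent might look like the following Java sketch. `HotPathSketch` and both method names are hypothetical, not taken from the Cassandra codebase, and the sketch makes no measurement claim; it only shows the two styles computing the same result.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not code from Cassandra.
final class HotPathSketch {
    // Stream pipeline: filter, map, and reduce expressed as chained stages.
    static long sumPositiveStream(List<Integer> values) {
        return values.stream()
                     .filter(v -> v > 0)
                     .mapToLong(Integer::longValue)
                     .sum();
    }

    // Plain loop: the same computation without building a pipeline.
    static long sumPositiveLoop(List<Integer> values) {
        long total = 0;
        for (int i = 0, n = values.size(); i < n; i++) {
            int v = values.get(i);
            if (v > 0) {
                total += v;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, -1, 4);
        // Both calls compute the sum of the positive elements.
        System.out.println(sumPositiveStream(values) == sumPositiveLoop(values));
    }
}
```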

Re: [DISCUSS] Adding experimental vtables and rules around them

2024-05-30 Thread Caleb Rackliffe
The two-part proposal of 1.) table-level self-identification of experimental status and 2.) a global config flag that determines what to do when querying those might work. I guess the only thing you can't do there is ignore warnings from a specific experimental table, since that's controlled in

Re: [EXTERNAL] Re: [Discuss] Generic Purpose Rate Limiter in Cassandra

2024-05-03 Thread Caleb Rackliffe
s thread >>> into a draft seems like a solid next step. >>> >>> On Wed, Feb 7, 2024, at 12:31 PM, Jaydeep Chovatia wrote: >>> >>> I see a lot of great ideas being discussed or proposed in the past to >>> cover the most common rate limiter candidat

Re: [DISCUSS] Donating easy-cass-stress to the project

2024-04-25 Thread Caleb Rackliffe
I do have some familiarity w/ the codebase, and I could help support it in a minor capacity. (Reviews, small fixes, etc.) Probably not something I could spend hours on every week. On Apr 25, 2024, at 5:11 PM, Jon Haddad wrote: I should probably have noted - since TLP is no more, I renamed

Re: [DISCUSS] NULL handling and the unfrozen collection issue

2024-04-04 Thread Caleb Rackliffe
The easiest way to check out how Accord uses IS NULL and IS NOT NULL is to look at the examples in the cep-15-accord branch: https://github.com/apache/cassandra/blob/cep-15-accord/test/distributed/org/apache/cassandra/distributed/test/accord/AccordCQLTestBase.java tl;dr We did indeed try to go

Re: [DISCUSSION] Replace the Config class instance with the tree-based framework

2024-03-18 Thread Caleb Rackliffe
> I kinda feel this should be separated as we can and do do this today but the reason we have not grouped is not due to the framework we use but more “what makes sense to group” I think that's more or less correct. The critical path thing here is getting to consensus (in CASSANDRA-17292

Re: [DISCUSS] Cassandra 5.0 support for RHEL 7

2024-03-12 Thread Caleb Rackliffe
Just created CASSANDRA-19467 <https://issues.apache.org/jira/browse/CASSANDRA-19467> On Tue, Mar 12, 2024 at 1:45 PM Caleb Rackliffe wrote: > Alright, so there has been a little conversation in ASF Slack here: > https://the-asf.slack.com/archives/CK23JSY2K/p1710255088441369 >

Re: [DISCUSS] Cassandra 5.0 support for RHEL 7

2024-03-12 Thread Caleb Rackliffe
does run cqlsh w/ those versions that they are EOL, support may be removed in a future C* release, and they may be used on an "as is" basis. I'll get a Jira up shortly... On Mon, Mar 11, 2024 at 8:51 PM Caleb Rackliffe wrote: > I did a quick experiment to revert all the bits tha

Re: [DISCUSS] Cassandra 5.0 support for RHEL 7

2024-03-11 Thread Caleb Rackliffe
I did a quick experiment to revert all the bits that require 3.8+ in the server codebase (while leaving 3.29.0 in place), and I don't see anything breaking in the tests on trunk.

Re: [DISCUSS] Cassandra 5.0 support for RHEL 7

2024-03-11 Thread Caleb Rackliffe
s the sole reason > we've bumped to 3.7 and 3.8 to support that python driver. That correct > Andres / Brandon? > > On Mon, Mar 11, 2024, at 1:22 PM, Caleb Rackliffe wrote: > > The vector issues itself was a simple error message change: > https://github.com/datastax/python-driver/comm

Re: [DISCUSS] Cassandra 5.0 support for RHEL 7

2024-03-11 Thread Caleb Rackliffe
The vector issue itself was a simple error message change: https://github.com/datastax/python-driver/commit/e90c0f5d71f4cac94ed80ed72c8789c0818e11d0 Was there something else in 3.29.0 that actually necessitated the move to a floor of Python 3.8? Do we generally change runtime requirements in

Re: [DISCUSS] What SHOULD we do when we index an inet type that is ipv4?

2024-03-07 Thread Caleb Rackliffe
> if an inet type column is a partition key, can I write to it in IPv4 and then query it with IPv6 and find the record? You can't...however... Especially when the original/existing behavior here was possibly not all that well-conceived, I think it would at least be a good idea to maintain an

Re: [DISCUSS] What SHOULD we do when we index an inet type that is ipv4?

2024-03-07 Thread Caleb Rackliffe
Yeah, what we have with inet is much like if we had a type like "numeric" that allowed you to write both ints and doubles. If we had actual "inet4" and "inet6" types, SAI would have been able to index them as fixed length values without doing the 4 -> 16 byte conversion. Given SAI could easily
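
To make the 4 -> 16 byte conversion mentioned above concrete, here is a minimal Java sketch. It assumes the widening uses the IPv4-mapped IPv6 layout (::ffff:a.b.c.d); the message does not say which layout SAI uses, and `InetWidthSketch` and `toFixedWidth` are hypothetical names, not Cassandra APIs.

```java
// Hypothetical illustration only; not Cassandra's SAI code.
final class InetWidthSketch {
    // Widen a raw 4-byte IPv4 address to 16 bytes so every indexed value has
    // the same length. ASSUMPTION: IPv4-mapped IPv6 layout, i.e. ten zero
    // bytes, then 0xff 0xff, then the four IPv4 bytes.
    static byte[] toFixedWidth(byte[] raw) {
        if (raw.length == 16) {
            return raw.clone(); // already IPv6-sized; nothing to convert
        }
        if (raw.length != 4) {
            throw new IllegalArgumentException("expected 4 or 16 bytes, got " + raw.length);
        }
        byte[] widened = new byte[16];
        widened[10] = (byte) 0xff;
        widened[11] = (byte) 0xff;
        System.arraycopy(raw, 0, widened, 12, 4);
        return widened;
    }

    public static void main(String[] args) {
        byte[] v4 = {127, 0, 0, 1};
        System.out.println(v4.length + " -> " + toFixedWidth(v4).length); // prints "4 -> 16"
    }
}
```

With separate fixed-length "inet4" and "inet6" types, as the message suggests, a step like this would not be needed.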

Re: [EXTERNAL] Re: [Discuss] Generic Purpose Rate Limiter in Cassandra

2024-02-02 Thread Caleb Rackliffe
> (client -> coordinator, coordinator -> replica, internode with other > operations, etc) at surprising times and should be considered more > holistically? > > On Tue, Jan 30, 2024, at 12:31 AM, Caleb Rackliffe wrote: > > I almost forgot CASSANDRA-15817, which introdu

Re: [EXTERNAL] Re: [Discuss] Generic Purpose Rate Limiter in Cassandra

2024-01-29 Thread Caleb Rackliffe
I almost forgot CASSANDRA-15817, which introduced reject_repair_compaction_threshold, which provides a mechanism to stop repairs while compaction is underwater. On Jan 26, 2024, at 6:22 PM, Caleb Rackliffe wrote: Hey all, I'm a bit late to the discussion. I see that we've already discussed

Re: [EXTERNAL] Re: [Discuss] Generic Purpose Rate Limiter in Cassandra

2024-01-26 Thread Caleb Rackliffe
Hey all, I'm a bit late to the discussion. I see that we've already discussed CASSANDRA-15013 and CASSANDRA-16663 at least in passing. Having written the latter, I'd be the first to

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-21 Thread Caleb Rackliffe
I think I hinted at this in my first response, but just to clarify, I would be interested to see this work broken up as much as possible into a.) the set of things we can do without coordinator involvement (statistical optimization for index and filtering queries) and b.) the set of things where

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-21 Thread Caleb Rackliffe
What would the relationship between our present query tracing apparatus and EXPLAIN ANALYZE look like? On Thu, Dec 21, 2023 at 4:24 PM Caleb Rackliffe wrote: > > We are also currently working on some SAI features that need cost based > optimization. > > I don't even think we have

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-21 Thread Caleb Rackliffe
> We are also currently working on some SAI features that need cost based optimization. I don't even think we have to think about *new* SAI features to see where it will benefit from further *local* optimization, and I'm sympathetic to that happening in the context of a larger framework, as long

Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

2023-12-21 Thread Caleb Rackliffe
+1 Agree w/ all the justifications mentioned above. As a reviewer on CASSANDRA-19210, my goals were to a.) look at the directory, naming, and package structure of the ported code, b.) make sure IDE integration was working, and c.) make sure

Re: Welcome Mike Adamson as Cassandra committer

2023-12-08 Thread Caleb Rackliffe
Congratulations! Very well deserved. On Fri, Dec 8, 2023 at 10:33 AM Mick Semb Wever wrote: > Congrats Mike !! > >>

Re: [DISCUSS] CASSANDRA-18940 SAI post-filtering reads don't update local table latency metrics

2023-12-01 Thread Caleb Rackliffe
> On Dec 1, 2023, at 12:50 PM, Caleb Rackliffe > wrote: > > > So the plan would be to have local "Read" and "Range" remain unchanged in > TableMetrics, but have a third "SAIRead" (?) just for SAI post-filtering > read SinglePartitionReadCommands?

Re: [DISCUSS] CASSANDRA-18940 SAI post-filtering reads don't update local table latency metrics

2023-12-01 Thread Caleb Rackliffe
ics, not > regular reads. So grouping them into the regular read metrics at the lower > level seems confusing to me in that sense as well. > > As an operator I want to know how my SAI reads and normal reads are > performing latency wise separately. > > -Jeremiah > > On Dec

Re: [DISCUSS] CASSANDRA-18940 SAI post-filtering reads don't update local table latency metrics

2023-12-01 Thread Caleb Rackliffe
Option 1 would be my preference. Seems both useful to have a single metric for read load against the table and a way to break out SAI reads specifically. On Fri, Dec 1, 2023 at 11:00 AM Mike Adamson wrote: > Hi, > > We are looking at adding SAI post-filtering reads to the local table > metrics

Re: [VOTE] Release Apache Cassandra 5.0-beta1 (take2)

2023-12-01 Thread Caleb Rackliffe
+1 (nb) > On Dec 1, 2023, at 7:32 AM, Mick Semb Wever wrote: > >  > > Proposing the test build of Cassandra 5.0-beta1 for release. > > sha1: 87fd1fa88a0c859cc32d9f569ad09ad0b345e465 > Git: https://github.com/apache/cassandra/tree/5.0-beta1-tentative > Maven Artifacts: >

Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-29 Thread Caleb Rackliffe
> So in the context of this thread, if I want to try out SAI for example, I don't care as much about consistency edge cases around coordinators or replicas or read repair. That would apply to 19018, not 19011, which is a critical functionality issue. On Wed, Nov 29, 2023 at 12:49 PM Jeremy Hanna

Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-28 Thread Caleb Rackliffe
If the consensus is that meaningful community testing will occur in the week between "beta1 but SAI is broken, friends" and "ok, beta2, it's fixed now, go for it"...then go for it. On Tue, Nov 28, 2023 at 12:40 PM Patrick McFadin wrote: > JD, that wasn't my point. It feels like we are treating

Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-28 Thread Caleb Rackliffe
I'm fine w/ alpha2 now and beta1 once we resolve 19011. On Tue, Nov 28, 2023 at 12:36 PM Benjamin Lerer wrote: > -1 based on the problems raised by Caleb. > > I would be fine with releasing that version as an alpha as Jeremiah > proposed. > > As of this time, I'm also not aware of a user of the

Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-28 Thread Caleb Rackliffe
Just to update this thread, Mike has identified the root cause of 19011 and in the process uncovered 2 additional very closely related issues. (Fantastic job fuzzing there!) Fixes for all three issues are in hand, and he continues to test. After some conversation w/ Mick, Alex, and Mike, I feel

Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-11-15 Thread Caleb Rackliffe
Added one nit to the PR. Otherwise, this is awesome :) On Wed, Nov 15, 2023 at 11:01 AM Jordan West wrote: > I would also like to back this proposal. We change this default because > several incidents have occurred by leaving the default of auto. There are > rare cases where auto/mmap is the

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-23 Thread Caleb Rackliffe
...or like the end of January. Either way, feel free to ignore the "aside" :) On Mon, Oct 23, 2023 at 12:53 PM Caleb Rackliffe wrote: > Kind of in the same place as Benedict/Aleksey. > > If we release a 5.1 in, let's say...March of next year, the number of 5.0 > use

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-23 Thread Caleb Rackliffe
Kind of in the same place as Benedict/Aleksey. If we release a 5.1 in, let's say...March of next year, the number of 5.0 users is going to be very minimal. Nobody is going to upgrade anything important from now through the first half of January anyway, right? They're going to be making sure their

Re: [VOTE] Accept java-driver

2023-10-03 Thread Caleb Rackliffe
+1 On Tue, Oct 3, 2023 at 2:49 PM Sylvain Lebresne wrote: > +1 > -- > Sylvain > > > On Tue, Oct 3, 2023 at 8:43 PM Jon Haddad > wrote: > >> +1 >> >> >> On 2023/10/03 04:52:47 Mick Semb Wever wrote: >> > The donation of the java-driver is ready for its IP Clearance vote. >> >

Re: [DISCUSS] Backport CASSANDRA-18816 to 5.0? Add support for repair coordinator to retry messages that timeout

2023-09-20 Thread Caleb Rackliffe
+1 on a 5.0 backport On Wed, Sep 20, 2023 at 2:26 PM Brandon Williams wrote: > I think it could be argued that not retrying messages is a bug, I am > +1 on including this in 5.0. > > Kind Regards, > Brandon > > On Tue, Sep 19, 2023 at 1:16 PM David Capwell wrote: > > > > To try to get repair

Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Caleb Rackliffe
+100 to this We'd have to come up w/ a pretty compelling counterexample to NOT switch the default to mmap_index_only at this point. On Wed, Sep 6, 2023 at 11:40 AM Brandon Williams wrote: > Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it > makes sense. At the least I

Re: Tokenization and SAI query syntax

2023-08-13 Thread Caleb Rackliffe
ok like or the shape it's going to take for awhile, >> so backing ourselves into any of the 3 corners above right now feels very >> premature to me. >> >> So I'm coming around to the expr / method call approach to preserve that >> flexibility. It's maximally expl

Re: [DISCUSS] CASSANDRA-18743 Deprecation of metrics-reporter-config

2023-08-11 Thread Caleb Rackliffe
+1 > On Aug 11, 2023, at 8:10 AM, Brandon Williams wrote: > > +1 > > Kind Regards, > Brandon > >> On Fri, Aug 11, 2023 at 8:08 AM Ekaterina Dimitrova >> wrote: >> >> >> “ The rationale for this proposed deprecation is that the upcoming 5.0 >> release is a good time to evaluate

Re: Tokenization and SAI query syntax

2023-08-07 Thread Caleb Rackliffe
ing how differing behavior for the same syntax can lead to >>> issues. Imo the best case scenario results in the user not even noticing >>> their indexes have changed. >>> >>> An (maybe better?) alternative is to add a flag to the index >>

Re: Tokenization and SAI query syntax

2023-08-07 Thread Caleb Rackliffe
> >> An (maybe better?) alternative is to add a flag to the index >> configuration for "compatibility mode", which might address the concerns >> around using an equality operator when it actually is a partial match. >> >> For what it's worth, I'm in a

Re: Tokenization and SAI query syntax

2023-08-02 Thread Caleb Rackliffe
me syntax, even > if it means there's two ways of writing the same query. To use Caleb's > example, this would mean supporting both LIKE and the `expr` column. > >> > >> Jon > >> > >>>> On 2023/08/01 19:17:11 Caleb Rackliffe wrote: > >>> Here are

Re: Tokenization and SAI query syntax

2023-08-01 Thread Caleb Rackliffe
Here are some additional bits of prior art, if anyone finds them useful: The Stratio Lucene Index - https://github.com/Stratio/cassandra-lucene-index#examples Stratio was the reason C* added the "expr" functionality. They embedded something similar to ElasticSearch JSON, which probably isn't my

Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-26 Thread Caleb Rackliffe
t 6:07 AM Jeremy Hanna wrote: > Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else > involved with the SAI implementation! > > On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe > wrote: > >  > Just a quick update... > > With CASSANDRA-18670 > <h

Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-25 Thread Caleb Rackliffe
rebase on the current trunk and J11 and J17 test runs. On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe wrote: > Hello there! > > After much toil, the first phase of CEP-7 is nearing completion (see > CASSANDRA-16052 <https://issues.apache.org/jira/browse/CASSANDRA-16052>). > The

Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-18 Thread Caleb Rackliffe
Hello there! After much toil, the first phase of CEP-7 is nearing completion (see CASSANDRA-16052). There are presently two issues to resolve before we'd like to merge the cep-7-sai feature branch and all its goodness to trunk:

Re: [DISCUSS] The future of CREATE INDEX

2023-06-20 Thread Caleb Rackliffe
For everyone previously following this, just created https://issues.apache.org/jira/browse/CASSANDRA-18615 :) On Fri, May 19, 2023 at 1:34 PM Caleb Rackliffe wrote: > Posted on ASF Slack to see if we can get more responses, but so far the > leaders seem to be... > > [POLL] Central

Re: [VOTE] CEP-8 Datastax Drivers Donation

2023-06-13 Thread Caleb Rackliffe
+1 On Tue, Jun 13, 2023 at 11:25 AM Francisco Guerrero wrote: > +1 (nb) > > On 2023/06/13 16:22:55 Andrés de la Peña wrote: > > +1 > > > > On Tue, 13 Jun 2023 at 16:40, Yifan Cai wrote: > > > > > +1 > > > -- > > > *From:* David Capwell > > > *Sent:* Tuesday, June

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Caleb Rackliffe
not something we will develop in tandem with C* releases, and we will want improvements to be applied across all branches. So it seems a natural fit for submodules to me. On 24 May 2023, at 21:09, Caleb Rackliffe wrote: > Submodules do have their own overhead and edge cases, so I am mostly in fa

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Caleb Rackliffe
> Submodules do have their own overhead and edge cases, so I am mostly in favor of using for cases where the code must live outside of tree (such as jvm-dtest that lives out of tree as all branches need the same interfaces) Agreed. Basically where I've ended up on this topic. > We could go over

Re: [DISCUSS] The future of CREATE INDEX

2023-05-19 Thread Caleb Rackliffe
> 6. MySQL allows the DBA to determine the default engine. This seems to > work well. If the user doesn't care, they don't care, if they do, they use > the explicit syntax. > > henrik > > > On Wed, May 10, 2023 at 12:45 AM Caleb Rackliffe <calebrackli...@gmail.com>

Re: [DISCUSS] Feature branch version hygiene

2023-05-17 Thread Caleb Rackliffe
...otherwise I'm fine w/ just the CEP name, like "CEP-7" for SAI, etc. On Wed, May 17, 2023 at 11:24 PM Caleb Rackliffe wrote: > So when a CEP slips, do we have to create a 5.1-cep-N? Could we just have > a version that's "NextMajorRelease" or something like that? I

Re: [DISCUSS] Feature branch version hygiene

2023-05-17 Thread Caleb Rackliffe
So when a CEP slips, do we have to create a 5.1-cep-N? Could we just have a version that's "NextMajorRelease" or something like that? It should still be pretty easy to bulk replace if we have something else to filter on, like belonging to an epic? On Wed, May 17, 2023 at 6:42 PM Mick Semb Wever

Re: [DISCUSS] The future of CREATE INDEX

2023-05-17 Thread Caleb Rackliffe
me nodes but not others? Maybe what we need is a jira ticket > to enforce that certain sections of the config must not differ? > > 5. That said, the default index type could also be a property of the > keyspace > > 6. MySQL allows the DBA to determine the default engine. This seems to

Re: [DISCUSS] Feature branch version hygiene

2023-05-16 Thread Caleb Rackliffe
...but that and "What do we do with things that might be in 5.0 and might not?" are different questions. A version that denotes "next major release" until merged (at which point it could be given a real version) could be helpful... On Tue, May 16, 2023 at 3:28 PM Caleb Ra

Re: [DISCUSS] Feature branch version hygiene

2023-05-16 Thread Caleb Rackliffe
I like the NA approach for sub-tasks that roll up to a parent/epic ticket. When that lands, it gets a real version, and any sub-task is assumed to also have that version. Not that it has to be called "NA", but there should be something to denote "inherits from parent". On Tue, May 16, 2023 at

Re: [DISCUSS] The future of CREATE INDEX

2023-05-16 Thread Caleb Rackliffe
an Ellis wrote on Tue, May 16, 2023 at 07:18: > >> On Fri, May 12, 2023 at 1:39 PM Caleb Rackliffe >> wrote: >> >>> [POLL] Centralize existing syntax or create new syntax? >>> >> >> 1 (Existing) >> >> [POLL] Should there be a default? (YES/NO) >>> >> >> YES >> >> >>> [POLL] What to do with the default? >>> >> >> 1 (Default SAI) >> >> > > > -- > you are the apple of my eye ! >

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
, until we have sufficient data to discuss that I’m going to put a hard veto on that on technical grounds. On 12 May 2023, at 19:41, Caleb Rackliffe wrote: ...and to clarify, answers should be what you'd like to see for 5.0 specifically On Fri, May 12, 2023 at 1:36 PM Caleb Rackliffe <calebrac

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
...and to clarify, answers should be what you'd like to see for 5.0 specifically On Fri, May 12, 2023 at 1:36 PM Caleb Rackliffe wrote: > [POLL] Centralize existing syntax or create new syntax? > > 1.) CREATE INDEX ... USING ... WITH OPTIONS... > 2.) CREATE LOCAL IND

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
[POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING ... WITH OPTIONS... 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds LOCAL keyword for clarity and separation from future GLOBAL indexes) (In both cases, we deprecate w/ client warnings

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
If at some point in the glorious future we have global indexes, I'm sure we can add GLOBAL to the syntax...sry, working on an ugly poll... On Fri, May 12, 2023 at 1:24 PM Benedict wrote: > If folk should be reading up on the index type, doesn’t that conflict with > your support of a default? >

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
he 5.0 release that we have enough information to >> change the default (#1), we can change it in a matter of minutes. >> >> >> I am strongly against this… SAI is new for 5.0 so should be disabled by >> default; else we disrespect the idea that new features are disable

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
y > default; else we disrespect the idea that new features are disabled by > default. I am cool with our docs recommending if we do find it's better in > most cases, but we should not change the default in the same release it > lands in. > > On May 12, 2023, at 10:10

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
feels...not so polished. (The backend for this already exists w/ CREATE CUSTOM INDEX.) 3.) Leave in place but deprecate (client warnings could work?) CREATE CUSTOM INDEX. Support the syntax for the foreseeable future. Any objections to that? On Fri, May 12, 2023 at 12:10 PM Caleb Rackliffe wrote

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
n once the evidence is available for > everyone to consider? > > If not, then we probably can’t do the hard cutover and so the answer is > still pretty simple? > > On 12 May 2023, at 18:04, Caleb Rackliffe > wrote: > >  > I don't particularly like the YAML solution e

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
to > reify the distinction in the syntax. > > On 12 May 2023, at 17:29, Caleb Rackliffe > wrote: > >  > We don't need to know everything about SAI's performance profile to plan > and execute some small, reasonable things now for 5.0. I'm going to try to > summarize the least cont

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
...and if we decide before the 5.0 release that we have enough information to change the default (#1), we can change it in a matter of minutes. On Fri, May 12, 2023 at 11:28 AM Caleb Rackliffe wrote: > We don't need to know everything about SAI's performance profile to plan > and execut

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
ly opposite), I would want to see a very high quality of > evidence to support the claim. > > I don’t think we can resolve this conversation effectively until this > question is settled. > > On 12 May 2023, at 16:19, Caleb Rackliffe > wrote: > >  > > This creates h

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
> This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile. We wouldn't have even advanced it to this point if we didn't have copious amounts of (not all public, I know,

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Caleb Rackliffe
tl;dr If you take my original proposal and change only the fact that CREATE INDEX retains a configurable default, I think we get to the same place? (Then it's just a matter of what we do in 5.0 vs. after 5.0...) On Wed, May 10, 2023 at 11:00 AM Caleb Rackliffe wrote: > I see a broad des

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Caleb Rackliffe
>> >> On Tue, May 9, 2023 at 5:20 PM Jeremiah D Jordan < >> jeremiah.jor...@gmail.com> wrote: >> >>> If the consensus is that SAI is the right default index, then we should >>> just change CREATE INDEX to be SAI, and legacy 2i to be a CUSTOM INDEX

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Caleb Rackliffe
g extra ceremony. >> >> On Tue, May 9, 2023 at 5:20 PM Jeremiah D Jordan < >> jeremiah.jor...@gmail.com> wrote: >> >>> If the consensus is that SAI is the right default index, then we should >>> just change CREATE INDEX to be SAI, and legacy 2i to be a CU

[DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Caleb Rackliffe
Earlier today, Mick started a thread on the future of our index creation DDL on Slack: https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019 At the moment, there are two ways to create a secondary index. *1.) CREATE INDEX [IF NOT EXISTS] [name] ON <table> (<column>)* This creates an optionally

Re: CEP-30: Approximate Nearest Neighbor(ANN) Vector Search via Storage-Attached Indexes

2023-05-09 Thread Caleb Rackliffe
Anyone on this ML who still remembers DSE Search (or has experience w/ Elastic or SolrCloud) probably also knows that there are some significant pieces of an optimized scatter/gather apparatus for IR (even without sorting, which also doesn't exist yet) that do not exist in C* or its range query

Re: [VOTE] CEP-29 CQL NOT Operator

2023-05-09 Thread Caleb Rackliffe
+1 On Tue, May 9, 2023 at 12:04 PM Piotr Kołaczkowski wrote: > Let's vote. > > > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+NOT+operator > > Piotr Kołaczkowski > e. pkola...@datastax.com > w. www.datastax.com >

Re: [POLL] Vector type for ML

2023-05-05 Thread Caleb Rackliffe
> [4] https://github.com/pgvector/pgvector >>> [5] https://weaviate.io/developers/weaviate/config-refs/datatypes >>> >>> On Fri, May 5, 2023 at 6:07 AM Mike Adamson >>> wrote: >>> >>> Then we can have the indexing apparatus only accept *frozen*

Re: [POLL] Vector type for ML

2023-05-04 Thread Caleb Rackliffe
Even in the ML case, sparse can just mean zeros rather than nulls, and they should compress similarly anyway. If we really want null values, I'd rather leave that in collections space. On Thu, May 4, 2023 at 8:59 PM Caleb Rackliffe wrote: > I actually still prefer *type[dimension]*, becaus

Re: [POLL] Vector type for ML

2023-05-04 Thread Caleb Rackliffe
I actually still prefer *type[dimension]*, because I think I intuitively read this as a primitive (meaning no null elements) array. Then we can have the indexing apparatus only accept *frozen* for the HNSW case. If that isn't intuitive to anyone else, I don't really have a strong

Re: [POLL] Vector type for ML

2023-05-03 Thread Caleb Rackliffe
To be clear, I support the general agreement David and Jonathan seem to have reached. On Wed, May 3, 2023 at 3:05 PM Caleb Rackliffe wrote: > Did we agree on a CQL syntax? > > On Wed, May 3, 2023 at 2:06 PM Rahul Xavier Singh < > rahul.xavier.si...@gmail.com> wrote: >

Re: [POLL] Vector type for ML

2023-05-03 Thread Caleb Rackliffe
Did we agree on a CQL syntax? On Wed, May 3, 2023 at 2:06 PM Rahul Xavier Singh < rahul.xavier.si...@gmail.com> wrote: > I like this approach. Thank you for those working on this vector search > initiative. > > Here's the feedback from my "user" hat for someone who is looking at > databases /

Re: [DISCUSS] New data type for vector search

2023-04-27 Thread Caleb Rackliffe
I don’t have a lot to add here, other than to say I’m broadly in agreement w/ David on syntax preference, element selectability, and making this a new type that roughly corresponds to a primitive (non-null-allowing) array. On Apr 27, 2023, at 9:18 PM, Anthony Grasso wrote: It would be strange for

Re: [DISCUSS] Next release date

2023-04-18 Thread Caleb Rackliffe
> Caleb, you appear to be the only one objecting, and it does not appear that you have made any compromises in this thread. All I'm really objecting to is making special exceptions for particular CEPs in relation to our freeze date. In other words, let's not have a pseudo-freeze date and a "real"

Re: [DISCUSS] Next release date

2023-04-17 Thread Caleb Rackliffe
> My personal .02: I think we should consider branching 5.0 September 1st. That gives us basically 12 weeks for folks to do their testing and for us to stabilize anything that's flaky in circle or regressed in ASF CI. WFM, if that means we branch there and anything not already merged has to wait

Re: [DISCUSS] Next release date

2023-04-17 Thread Caleb Rackliffe
of CEP-15 / CEP-21 after branch, we risk > needing a fast-follow release and don't have functional precedent for the > snapshots we earlier agreed upon doing. > > Does that distill it and match everyone else's understanding? > > On Mon, Apr 17, 2023, at 2:20 PM, Mick Semb Wever

Re: [DISCUSS] Next release date

2023-04-17 Thread Caleb Rackliffe
...or just cutting a 5.0 branch when CEP-21 is ready. There's nothing stopping us from testing JDK17 and TTL bits in trunk before that. On Mon, Apr 17, 2023 at 11:25 AM Caleb Rackliffe wrote: > > Once all CEPs except CEP-21 and CEP-15 land we branch cassandra-5.0 > > For the

Re: [DISCUSS] Next release date

2023-04-17 Thread Caleb Rackliffe
> Once all CEPs except CEP-21 and CEP-15 land we branch cassandra-5.0 For the record, I'm not convinced this is necessarily better than just cutting a cassandra-5.0 branch on 1 October. On Mon, Apr 17, 2023 at 2:30 AM Mick Semb Wever wrote: > 2. When CEP-15 lands we cut alpha1, >> 2a. The

Re: [DISCUSS] CEP-29 CQL NOT Operator

2023-04-11 Thread Caleb Rackliffe
+1 to the proposal from a CQL perspective. *However*, whether we do this in the context of simple partition restriction, a global index query, or a partition-restricted index query, the NOT operator is most likely to be useful only in a post-filtering capacity. (ex. WHERE indexed_set CONTAINS {

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-05 Thread Caleb Rackliffe
KEYSPACE isn’t a terrible name for a namespace that also configures how keys are replicated. NAMESPACE is accurate but not comprehensive. DATABASE doesn’t seem to have the advantages of either. I’m neutral on NAMESPACE and slightly -1 on DATABASE. It’s hard for me to believe KEYSPACE is really a

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-27 Thread Caleb Rackliffe
) delay, then that's fine and there's no need to do additional work just to get some preview release out earlier. henrik On Sat, Mar 25, 2023 at 4:17 AM Caleb Rackliffe <calebrackli...@gmail.com> wrote: I agree there’s little point in litigating right now, given test stability (or lack thereof)

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread Caleb Rackliffe
opinion on how long we think this would take? Seems like that'd help clarify whether or not there's contributors with the bandwidth and desire to even do that or whether everyone depending on cep-21 is our option. On Fri, Mar 24, 2023, at 1:30 PM, Caleb Rackliffe wrote: I actually did a dry run rebase

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread Caleb Rackliffe
an informed opinion on how long we think this would take? Seems like that'd help clarify whether or not there's contributors with the bandwidth and desire to even do that or whether everyone depending on cep-21 is our option. On Fri, Mar 24, 2023, at 1:30 PM, Caleb Rackliffe wrote: I actually did a dry

Re: [EXTERNAL] [DISCUSS] Next release date

2023-03-24 Thread Caleb Rackliffe
that it will change so much. Am I > missing something? > > On Fri, Mar 24, 2023 at 19:16, Caleb Rackliffe > wrote: > >> > I worry about the labor involved with having very large work like this >> target a frozen branch and then also needing to pull it

Re: [EXTERNAL] [DISCUSS] Next release date

2023-03-24 Thread Caleb Rackliffe
> I worry about the labor involved with having very large work like this target a frozen branch and then also needing to pull it up to trunk. That doesn't sound fun. > I for one do not like to have release branches cut months before their expected release. > CEP-15 is mostly “net new stuff” and

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread Caleb Rackliffe
a prototype of Accord to trunk probably has marketing > value. (Don't laugh, many popular databases have had "atomic transactions, > except if anyone executes DDL simultaneously".) > > On Tue, Mar 14, 2023 at 8:39 PM Caleb Rackliffe > wrote: > > We've already talked a bit

[DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-14 Thread Caleb Rackliffe
We've already talked a bit about how and when the current Accord feature branch should merge to trunk. Earlier today, the cep-21-tcm branch was created

Re: Merging CEP-15 to trunk

2023-02-01 Thread Caleb Rackliffe
Just an FYI, the Accord feature flag has landed in the cep-15-accord branch: https://github.com/apache/cassandra/commit/2e680a33c03ce66d4b1358e1a1cc11cf4ee0189f (btw, it implicitly fixes some of the dtests around the new Accord system keyspace, because Accord is now disabled by default.) On Tue,

Re: Merging CEP-15 to trunk

2023-01-24 Thread Caleb Rackliffe
wear our Apache > Hats here, and if the debate is between work like this happening in a > feature branch affording contributors increased efficiency and locality vs. > all that happening on trunk and repeatedly colliding with everyone > everywhere, feature branches are a clear win

Re: Merging CEP-15 to trunk

2023-01-24 Thread Caleb Rackliffe
t ninja commit >> unless it's a comment fix, typo, forgotten git add, or something along >> those lines. For any commit that doesn't qualify it should go through the >> review process. >> >> And a final note - Ekaterina alluded to something valuable in her email >> e

Re: Merging CEP-15 to trunk

2023-01-24 Thread Caleb Rackliffe
Just FYI, I'm going to be posting a Jira (which will have some dependencies as well) to track this merge, hopefully some time today... On Tue, Jan 24, 2023 at 12:26 PM Ekaterina Dimitrova wrote: > I actually see people all the time making a final check before merge as > part of the review. And

Re: Cassandra CI Status 2023-01-07

2023-01-23 Thread Caleb Rackliffe
New failures from Build Lead Week 4: *** CASSANDRA-18188 - Test failure in upgrade_tests.cql_tests.cls.test_limit_ranges - trunk - AttributeError: module 'py' has no attribute 'io' *** CASSANDRA-18189 - Test failure in cqlsh_tests.test_cqlsh_copy.TestCqlshCopy.test_bulk_round_trip_with_timeouts

Re: [DISCUSSION] New dependencies for SAI CEP-7

2022-12-13 Thread Caleb Rackliffe
We need random generators no matter what for these tests, so I think what we need to decide is whether to continue to use Carrot or migrate those to QuickTheories, along the lines of what we have now in org.apache.cassandra.utils.Generators. When it comes to a library like this, the thing I would

Re: Naming conventions for CQL native functions

2022-11-10 Thread Caleb Rackliffe
+100 on snake case for built-in functions given I think MySQL and Postgres use that convention as well. ex. https://www.postgresql.org/docs/9.2/functions-string.html On Thu, Nov 10, 2022 at 7:51 AM Brandon Williams wrote: > I too meant snake case and need coffee. > > On Thu, Nov 10, 2022,
