Re: Project Status Update: 90-day catch-up edition [2023-10-27]

2023-10-27 Thread Patrick McFadin
Sent you an invite Sam. Welcome to the community!

On Fri, Oct 27, 2023 at 10:31 AM Sam wrote:

> Please can I have an invite to the Slack workspace on this email. I'd like
> to take a look through some of the items for first time contributors :-)
>
> Thanks!
>
> On Fri, 27 Oct 2023 at 18:10, Josh McKenzie wrote:
>
>> In case you're keeping score on how frequently these are coming out: *please
>> stop*. ;)
>>
>> Silver lining - looks like we have a lot to discuss this round! Last
>> update was late July and we've been churning through the 5.0 freeze and
>> stabilization phase.
>>
>>
>>
>> *[New Contributors Getting Started]*
>> Check out https://the-asf.slack.com, channel #cassandra-dev. Reply
>> directly to me on this email if you need an invite for your account, and
>> reach out to the @cassandra_mentors alias in the channel if you need to get
>> oriented.
>>
>> We have a list of curated "getting started" tickets you can find here,
>> filtered to "ToDo" (i.e. not yet worked):
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484=2160=2162=2652
>> .
>>
>> *Helpful links:*
>> - Getting Started with Development on C*:
>> https://cassandra.apache.org/_/development/gettingstarted.html
>> - Building and IDE integration (worktrees are your friend; msg me on
>> slack if you need pointers):
>> https://cassandra.apache.org/_/development/ide.html
>> - Code Style: https://cassandra.apache.org/_/development/code_style.html
>>
>>
>>
>> *[Dev mailing list]*
>>
>> https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27
>> :
>>
>> My last email of shame was 35 threads. Drumroll for this one...
>> 91. *Yeesh*. Let me stick to highlights.
>>
>> Ekaterina pushed through dropping JDK8 support and adding JDK17
>> support... back in July. If you didn't know about it by now, consider
>> yourself doubly notified. :)
>> https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k I think
>> I can speak on behalf of all of us when I say: *Thank You Ekaterina.*
>>
>> This came up recently on another thread about when to branch 5.1, but we
>> discussed our freeze plans and exception rules for TCM and Accord here:
>> https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw. Mick
>> was essentially looking for a similar waiver for Vector search since it was
>> well abstracted, depended on SAI and external libs, and in general
>> shouldn't be too big of a disruption to get into 5.0. General consensus at
>> the time was "sure", and the work has since been completed. But here's the
>> reminder and link for posterity (and in case you missed it).
>>
>> Jaydeep reached out about a potential short-term solution to detecting
>> token-ownership mismatch while we don't yet have TCM; this seems more
>> pressing now as we're looking at a 5.0 without yet having TCM in it. The
>> dev ML thread is here:
>> https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and he
>> created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing
>> with the topic. There's a relatively modest (7 files, just over 300 lines)
>> PR available here: https://github.com/apache/cassandra/pull/2595/files;
>> I haven't looked into it, but it might be worth considering getting this
>> into 5.0 since it looks like we're moving to cutting w/out TCM. Any
>> thoughts?
>>
>> We had a pretty good discussion about automated repair scheduling,
>> discussing whether it should live in the DB proper vs. in the sidecar, pros
>> and cons, pressures, etc. Not sure if things moved beyond that; I know
>> there are at least a few implementations out there that haven't yet made
>> their way back to the ASF project proper. Thread:
>> https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My
>> hope is we can avoid the gridlock we hit for a long time with the sidecar
>> where there are multiple implementations with different tradeoffs and
>> everyone's disincentivized from accepting a solution different from their
>> own in-house one since it'd theoretically require re-tooling. Tough problem
>> with no easy solutions, but would love to see this become a first class
>> citizen in the ecosystem.
>>
>> Paulo brought up a discussion about moving to disk_access_mode =
>> mmap_index_only on 5.0. Seemed to be a consensus there but I'm not sure we
>> actually changed that in the 5.0 branch? Thread:
>> https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw. Just
>> pulled on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() ==
>> .mmap rather than .mmap_index_only.
>>
>> David Capwell worked on adding some retries to repair messages when
>> they're failing to make the process more robust:
>> https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1.
>> Reception was positive enough that he went so far as to back-port it and
>> also work on some for IR. Looks like he could use a reviewer here:
>> https://issues.apache.org/jira/browse/CASSANDRA-18962 - the ticket is
>> in Patch Available status.
>>

Re: Project Status Update: 90-day catch-up edition [2023-10-27]

2023-10-27 Thread Sam
Please can I have an invite to the Slack workspace on this email. I'd like
to take a look through some of the items for first time contributors :-)

Thanks!

On Fri, 27 Oct 2023 at 18:10, Josh McKenzie wrote:

> In case you're keeping score on how frequently these are coming out: *please
> stop*. ;)
>
> Silver lining - looks like we have a lot to discuss this round! Last
> update was late July and we've been churning through the 5.0 freeze and
> stabilization phase.
>
>
>
> *[New Contributors Getting Started]*
> Check out https://the-asf.slack.com, channel #cassandra-dev. Reply
> directly to me on this email if you need an invite for your account, and
> reach out to the @cassandra_mentors alias in the channel if you need to get
> oriented.
>
> We have a list of curated "getting started" tickets you can find here,
> filtered to "ToDo" (i.e. not yet worked):
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484=2160=2162=2652
> .
>
> *Helpful links:*
> - Getting Started with Development on C*:
> https://cassandra.apache.org/_/development/gettingstarted.html
> - Building and IDE integration (worktrees are your friend; msg me on slack
> if you need pointers): https://cassandra.apache.org/_/development/ide.html
> - Code Style: https://cassandra.apache.org/_/development/code_style.html
>
>
>
> *[Dev mailing list]*
>
> https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27
> :
>
> My last email of shame was 35 threads. Drumroll for this one...
> 91. *Yeesh*. Let me stick to highlights.
>
> Ekaterina pushed through dropping JDK8 support and adding JDK17 support...
> back in July. If you didn't know about it by now, consider yourself doubly
> notified. :)
> https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k I think
> I can speak on behalf of all of us when I say: *Thank You Ekaterina.*
>
> This came up recently on another thread about when to branch 5.1, but we
> discussed our freeze plans and exception rules for TCM and Accord here:
> https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw. Mick
> was essentially looking for a similar waiver for Vector search since it was
> well abstracted, depended on SAI and external libs, and in general
> shouldn't be too big of a disruption to get into 5.0. General consensus at
> the time was "sure", and the work has since been completed. But here's the
> reminder and link for posterity (and in case you missed it).
>
> Jaydeep reached out about a potential short-term solution to detecting
> token-ownership mismatch while we don't yet have TCM; this seems more
> pressing now as we're looking at a 5.0 without yet having TCM in it. The
> dev ML thread is here:
> https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and he
> created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing
> with the topic. There's a relatively modest (7 files, just over 300 lines)
> PR available here: https://github.com/apache/cassandra/pull/2595/files; I
> haven't looked into it, but it might be worth considering getting this into
> 5.0 since it looks like we're moving to cutting w/out TCM. Any thoughts?
>
> We had a pretty good discussion about automated repair scheduling,
> discussing whether it should live in the DB proper vs. in the sidecar, pros
> and cons, pressures, etc. Not sure if things moved beyond that; I know
> there are at least a few implementations out there that haven't yet made
> their way back to the ASF project proper. Thread:
> https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My hope
> is we can avoid the gridlock we hit for a long time with the sidecar where
> there are multiple implementations with different tradeoffs and everyone's
> disincentivized from accepting a solution different from their own in-house
> one since it'd theoretically require re-tooling. Tough problem with no easy
> solutions, but would love to see this become a first class citizen in the
> ecosystem.
>
> Paulo brought up a discussion about moving to disk_access_mode =
> mmap_index_only on 5.0. Seemed to be a consensus there but I'm not sure we
> actually changed that in the 5.0 branch? Thread:
> https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw. Just
> pulled on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() ==
> .mmap rather than .mmap_index_only.
>
> David Capwell worked on adding some retries to repair messages when
> they're failing to make the process more robust:
> https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1.
> Reception was positive enough that he went so far as to back-port it and
> also work on some for IR. Looks like he could use a reviewer here:
> https://issues.apache.org/jira/browse/CASSANDRA-18962 - the ticket is in
> Patch Available status.
>
> Mike Adamson reached out about adding / taking a dependency on jvector:
> https://lists.apache.org/thread/zkqg7mk9hp35zn0cf1tvywc2m3l63jrn. The
> general gist of it was "looks good, written by