Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-27 Thread Jon Haddad
Very nice! I'll kick the tires a bit and add an SAI test to tlp-stress.

On 2023/07/26 18:56:29 Caleb Rackliffe wrote:
> Alright, the cep-7-sai branch is now merged to trunk!
> 
> Now we move to addressing the most urgent items from "Phase 2" (
> CASSANDRA-18473 )
> before (and in the case of some testing after) the 5.0 freeze...
> 
> On Wed, Jul 26, 2023 at 6:07 AM Jeremy Hanna 
> wrote:
> 
> > Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else
> > involved with the SAI implementation!
> >
> > On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe 
> > wrote:
> >
> > 
> > Just a quick update...
> >
> > With CASSANDRA-18670
> >  complete, and all
> > remaining items in the category of performance optimizations and further
> > testing, the process of merging to trunk will likely start today, beginning
> > with a final rebase on the current trunk and J11 and J17 test runs.
> >
> > On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe 
> > wrote:
> >
> >> Hello there!
> >>
> >> After much toil, the first phase of CEP-7 is nearing completion (see
> >> CASSANDRA-16052 ).
> >> There are presently two issues to resolve before we'd like to merge the
> >> cep-7-sai feature branch and all its goodness to trunk:
> >>
> >> CASSANDRA-18670 
> >> - Importer should build SSTable indexes successfully before making new
> >> SSTables readable (in review)
> >>
> >> CASSANDRA-18673 
> >> - Reduce size of per-SSTable index components (in progress)
> >>
> >> (We've been getting clean CircleCI runs for a while now, and have been
> >> using the multiplexer to sniff out as much flakiness as possible up front.)
> >>
> >> Once merged to trunk, the next steps are:
> >>
> >> 1.) Finish a Harry model that we can use to further fuzz test SAI before
> >> 5.0 releases (see CASSANDRA-18275
> >> ). We've done a
> >> fair amount of fuzz/randomized testing at the component level, but I'd
> >> still consider Harry (at least around single-partition query use-cases) a
> >> critical item for us to have confidence before release.
> >>
> >> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
> >> CASSANDRA-18473 )
> >>
> >> A reminder, SAI is a secondary index, and therefore is by definition an
> >> opt-in feature, and has no explicit "feature flag". However, its
> >> availability to users is still subject to the secondary_indexes_enabled
> >> guardrail, which currently defaults to allowing creation.
> >>
> >> Any thoughts, questions, or comments on the pre-merge plan here?
> >>
> >
> 


Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-27 Thread Lorina Poland
I have to say how excited I am that I get to emphasize SAI as an indexing
method for Apache C*, rather than the much less capable 2i! I have been
waiting for this day for years! Thanks Caleb and all who worked to make
this day a reality!

Lorina

On Wed, Jul 26, 2023 at 10:08 PM Berenguer Blasi 
wrote:

> Nice one!
> On 26/7/23 21:11, Ekaterina Dimitrova wrote:
>
> Thanks Caleb!
> Great  job everyone! 🚀👏🏻
>
> On Wed, 26 Jul 2023 at 15:07, J. D. Jordan 
> wrote:
>
>> Thanks for all the work here!
>>
>> On Jul 26, 2023, at 1:57 PM, Caleb Rackliffe 
>> wrote:
>>
>> 
>>
>> Alright, the cep-7-sai branch is now merged to trunk!
>>
>> Now we move to addressing the most urgent items from "Phase 2" (
>> CASSANDRA-18473 )
>> before (and in the case of some testing after) the 5.0 freeze...
>>
>> On Wed, Jul 26, 2023 at 6:07 AM Jeremy Hanna 
>> wrote:
>>
>>> Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else
>>> involved with the SAI implementation!
>>>
>>> On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe 
>>> wrote:
>>>
>>> 
>>> Just a quick update...
>>>
>>> With CASSANDRA-18670
>>>  complete, and
>>> all remaining items in the category of performance optimizations and
>>> further testing, the process of merging to trunk will likely start today,
>>> beginning with a final rebase on the current trunk and J11 and J17 test
>>> runs.
>>>
>>> On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe <
>>> calebrackli...@gmail.com> wrote:
>>>
>>>> Hello there!
>>>>
>>>> After much toil, the first phase of CEP-7 is nearing completion (see
>>>> CASSANDRA-16052).
>>>> There are presently two issues to resolve before we'd like to merge the
>>>> cep-7-sai feature branch and all its goodness to trunk:
>>>>
>>>> CASSANDRA-18670
>>>> - Importer should build SSTable indexes successfully before making new
>>>> SSTables readable (in review)
>>>>
>>>> CASSANDRA-18673
>>>> - Reduce size of per-SSTable index components (in progress)
>>>>
>>>> (We've been getting clean CircleCI runs for a while now, and have been
>>>> using the multiplexer to sniff out as much flakiness as possible up front.)
>>>>
>>>> Once merged to trunk, the next steps are:
>>>>
>>>> 1.) Finish a Harry model that we can use to further fuzz test SAI
>>>> before 5.0 releases (see CASSANDRA-18275). We've done a
>>>> fair amount of fuzz/randomized testing at the component level, but I'd
>>>> still consider Harry (at least around single-partition query use-cases) a
>>>> critical item for us to have confidence before release.
>>>>
>>>> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
>>>> CASSANDRA-18473)
>>>>
>>>> A reminder, SAI is a secondary index, and therefore is by definition an
>>>> opt-in feature, and has no explicit "feature flag". However, its
>>>> availability to users is still subject to the secondary_indexes_enabled
>>>> guardrail, which currently defaults to allowing creation.
>>>>
>>>> Any thoughts, questions, or comments on the pre-merge plan here?

>>>


Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-26 Thread Berenguer Blasi

Nice one!

On 26/7/23 21:11, Ekaterina Dimitrova wrote:

Thanks Caleb!
Great  job everyone! 🚀👏🏻

On Wed, 26 Jul 2023 at 15:07, J. D. Jordan  
wrote:


Thanks for all the work here!


On Jul 26, 2023, at 1:57 PM, Caleb Rackliffe
 wrote:


Alright, the cep-7-sai branch is now merged to trunk!

Now we move to addressing the most urgent items from "Phase 2"
(CASSANDRA-18473) before (and in the case of some testing after)
the 5.0 freeze...

On Wed, Jul 26, 2023 at 6:07 AM Jeremy Hanna
 wrote:

Thanks Caleb and Mike and Zhao and Andres and Piotr and
everyone else involved with the SAI implementation!


On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe
 wrote:


Just a quick update...

With CASSANDRA-18670 complete, and all remaining items in the
category of performance
optimizations and further testing, the process of merging to
trunk will likely start today, beginning with a final rebase
on the current trunk and J11 and J17 test runs.

On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe
 wrote:

Hello there!

After much toil, the first phase of CEP-7 is nearing
completion (see CASSANDRA-16052).
There are presently two issues to resolve before we'd
like to merge the cep-7-sai feature branch and all its
goodness to trunk:

CASSANDRA-18670 - Importer should build SSTable indexes successfully
before making new SSTables readable (in review)

CASSANDRA-18673 - Reduce size of per-SSTable index components (in progress)

(We've been getting clean CircleCI runs for a while now,
and have been using the multiplexer to sniff out as much
flakiness as possible up front.)

Once merged to trunk, the next steps are:

1.) Finish a Harry model that we can use to further fuzz
test SAI before 5.0 releases (see CASSANDRA-18275).
We've done a fair amount of fuzz/randomized testing at
the component level, but I'd still consider Harry (at
least around single-partition query use-cases) a
critical item for us to have confidence before release.

2.) Start pursuing Phase 2 items as time and our needs
allow. (see CASSANDRA-18473)

A reminder, SAI is a secondary index, and therefore is
by definition an opt-in feature, and has no explicit
"feature flag". However, its availability to users is
still subject to the secondary_indexes_enabled
guardrail, which currently defaults to allowing creation.

Any thoughts, questions, or comments on the pre-merge
plan here?


Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-26 Thread Ekaterina Dimitrova
Thanks Caleb!
Great  job everyone! 🚀👏🏻

On Wed, 26 Jul 2023 at 15:07, J. D. Jordan 
wrote:

> Thanks for all the work here!
>
> On Jul 26, 2023, at 1:57 PM, Caleb Rackliffe 
> wrote:
>
> 
>
> Alright, the cep-7-sai branch is now merged to trunk!
>
> Now we move to addressing the most urgent items from "Phase 2" (
> CASSANDRA-18473 )
> before (and in the case of some testing after) the 5.0 freeze...
>
> On Wed, Jul 26, 2023 at 6:07 AM Jeremy Hanna 
> wrote:
>
>> Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else
>> involved with the SAI implementation!
>>
>> On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe 
>> wrote:
>>
>> 
>> Just a quick update...
>>
>> With CASSANDRA-18670
>>  complete, and
>> all remaining items in the category of performance optimizations and
>> further testing, the process of merging to trunk will likely start today,
>> beginning with a final rebase on the current trunk and J11 and J17 test
>> runs.
>>
>> On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe 
>> wrote:
>>
>>> Hello there!
>>>
>>> After much toil, the first phase of CEP-7 is nearing completion (see
>>> CASSANDRA-16052 ).
>>> There are presently two issues to resolve before we'd like to merge the
>>> cep-7-sai feature branch and all its goodness to trunk:
>>>
>>> CASSANDRA-18670 
>>> - Importer should build SSTable indexes successfully before making new
>>> SSTables readable (in review)
>>>
>>> CASSANDRA-18673 
>>> - Reduce size of per-SSTable index components (in progress)
>>>
>>> (We've been getting clean CircleCI runs for a while now, and have been
>>> using the multiplexer to sniff out as much flakiness as possible up front.)
>>>
>>> Once merged to trunk, the next steps are:
>>>
>>> 1.) Finish a Harry model that we can use to further fuzz test SAI before
>>> 5.0 releases (see CASSANDRA-18275
>>> ). We've done a
>>> fair amount of fuzz/randomized testing at the component level, but I'd
>>> still consider Harry (at least around single-partition query use-cases) a
>>> critical item for us to have confidence before release.
>>>
>>> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
>>> CASSANDRA-18473 )
>>>
>>> A reminder, SAI is a secondary index, and therefore is by definition an
>>> opt-in feature, and has no explicit "feature flag". However, its
>>> availability to users is still subject to the secondary_indexes_enabled
>>> guardrail, which currently defaults to allowing creation.
>>>
>>> Any thoughts, questions, or comments on the pre-merge plan here?
>>>
>>


Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-26 Thread J. D. Jordan
Thanks for all the work here!

On Jul 26, 2023, at 1:57 PM, Caleb Rackliffe wrote:

> Alright, the cep-7-sai branch is now merged to trunk!
>
> Now we move to addressing the most urgent items from "Phase 2"
> (CASSANDRA-18473) before (and in the case of some testing after) the 5.0
> freeze...
>
> On Wed, Jul 26, 2023 at 6:07 AM Jeremy Hanna wrote:
>
>> Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else
>> involved with the SAI implementation!
>>
>> On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe wrote:
>>
>>> Just a quick update...
>>>
>>> With CASSANDRA-18670 complete, and all remaining items in the category
>>> of performance optimizations and further testing, the process of merging
>>> to trunk will likely start today, beginning with a final rebase on the
>>> current trunk and J11 and J17 test runs.
>>>
>>> On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe wrote:
>>>
>>>> Hello there!
>>>>
>>>> After much toil, the first phase of CEP-7 is nearing completion (see
>>>> CASSANDRA-16052). There are presently two issues to resolve before we'd
>>>> like to merge the cep-7-sai feature branch and all its goodness to
>>>> trunk:
>>>>
>>>> CASSANDRA-18670 - Importer should build SSTable indexes successfully
>>>> before making new SSTables readable (in review)
>>>>
>>>> CASSANDRA-18673 - Reduce size of per-SSTable index components (in
>>>> progress)
>>>>
>>>> (We've been getting clean CircleCI runs for a while now, and have been
>>>> using the multiplexer to sniff out as much flakiness as possible up
>>>> front.)
>>>>
>>>> Once merged to trunk, the next steps are:
>>>>
>>>> 1.) Finish a Harry model that we can use to further fuzz test SAI
>>>> before 5.0 releases (see CASSANDRA-18275). We've done a fair amount of
>>>> fuzz/randomized testing at the component level, but I'd still consider
>>>> Harry (at least around single-partition query use-cases) a critical
>>>> item for us to have confidence before release.
>>>>
>>>> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
>>>> CASSANDRA-18473)
>>>>
>>>> A reminder, SAI is a secondary index, and therefore is by definition an
>>>> opt-in feature, and has no explicit "feature flag". However, its
>>>> availability to users is still subject to the secondary_indexes_enabled
>>>> guardrail, which currently defaults to allowing creation.
>>>>
>>>> Any thoughts, questions, or comments on the pre-merge plan here?




Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-26 Thread Caleb Rackliffe
Alright, the cep-7-sai branch is now merged to trunk!

Now we move to addressing the most urgent items from "Phase 2"
(CASSANDRA-18473) before (and in the case of some testing after) the 5.0
freeze...

On Wed, Jul 26, 2023 at 6:07 AM Jeremy Hanna 
wrote:

> Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else
> involved with the SAI implementation!
>
> On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe 
> wrote:
>
> 
> Just a quick update...
>
> With CASSANDRA-18670
>  complete, and all
> remaining items in the category of performance optimizations and further
> testing, the process of merging to trunk will likely start today, beginning
> with a final rebase on the current trunk and J11 and J17 test runs.
>
> On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe 
> wrote:
>
>> Hello there!
>>
>> After much toil, the first phase of CEP-7 is nearing completion (see
>> CASSANDRA-16052 ).
>> There are presently two issues to resolve before we'd like to merge the
>> cep-7-sai feature branch and all its goodness to trunk:
>>
>> CASSANDRA-18670 
>> - Importer should build SSTable indexes successfully before making new
>> SSTables readable (in review)
>>
>> CASSANDRA-18673 
>> - Reduce size of per-SSTable index components (in progress)
>>
>> (We've been getting clean CircleCI runs for a while now, and have been
>> using the multiplexer to sniff out as much flakiness as possible up front.)
>>
>> Once merged to trunk, the next steps are:
>>
>> 1.) Finish a Harry model that we can use to further fuzz test SAI before
>> 5.0 releases (see CASSANDRA-18275
>> ). We've done a
>> fair amount of fuzz/randomized testing at the component level, but I'd
>> still consider Harry (at least around single-partition query use-cases) a
>> critical item for us to have confidence before release.
>>
>> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
>> CASSANDRA-18473 )
>>
>> A reminder, SAI is a secondary index, and therefore is by definition an
>> opt-in feature, and has no explicit "feature flag". However, its
>> availability to users is still subject to the secondary_indexes_enabled
>> guardrail, which currently defaults to allowing creation.
>>
>> Any thoughts, questions, or comments on the pre-merge plan here?
>>
>


Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-26 Thread Jeremy Hanna
Thanks Caleb and Mike and Zhao and Andres and Piotr and everyone else
involved with the SAI implementation!

On Jul 25, 2023, at 3:01 PM, Caleb Rackliffe wrote:

> Just a quick update...
>
> With CASSANDRA-18670 complete, and all remaining items in the category of
> performance optimizations and further testing, the process of merging to
> trunk will likely start today, beginning with a final rebase on the
> current trunk and J11 and J17 test runs.
>
> On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe wrote:
>
>> Hello there!
>>
>> After much toil, the first phase of CEP-7 is nearing completion (see
>> CASSANDRA-16052). There are presently two issues to resolve before we'd
>> like to merge the cep-7-sai feature branch and all its goodness to trunk:
>>
>> CASSANDRA-18670 - Importer should build SSTable indexes successfully
>> before making new SSTables readable (in review)
>>
>> CASSANDRA-18673 - Reduce size of per-SSTable index components (in
>> progress)
>>
>> (We've been getting clean CircleCI runs for a while now, and have been
>> using the multiplexer to sniff out as much flakiness as possible up
>> front.)
>>
>> Once merged to trunk, the next steps are:
>>
>> 1.) Finish a Harry model that we can use to further fuzz test SAI before
>> 5.0 releases (see CASSANDRA-18275). We've done a fair amount of
>> fuzz/randomized testing at the component level, but I'd still consider
>> Harry (at least around single-partition query use-cases) a critical item
>> for us to have confidence before release.
>>
>> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
>> CASSANDRA-18473)
>>
>> A reminder, SAI is a secondary index, and therefore is by definition an
>> opt-in feature, and has no explicit "feature flag". However, its
>> availability to users is still subject to the secondary_indexes_enabled
>> guardrail, which currently defaults to allowing creation.
>>
>> Any thoughts, questions, or comments on the pre-merge plan here?



Re: Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-25 Thread Caleb Rackliffe
Just a quick update...

With CASSANDRA-18670 complete, and all remaining items in the category of
performance optimizations and further testing, the process of merging to
trunk will likely start today, beginning with a final rebase on the
current trunk and J11 and J17 test runs.

On Tue, Jul 18, 2023 at 3:47 PM Caleb Rackliffe 
wrote:

> Hello there!
>
> After much toil, the first phase of CEP-7 is nearing completion (see
> CASSANDRA-16052 ).
> There are presently two issues to resolve before we'd like to merge the
> cep-7-sai feature branch and all its goodness to trunk:
>
> CASSANDRA-18670  -
> Importer should build SSTable indexes successfully before making new
> SSTables readable (in review)
>
> CASSANDRA-18673  -
> Reduce size of per-SSTable index components (in progress)
>
> (We've been getting clean CircleCI runs for a while now, and have been
> using the multiplexer to sniff out as much flakiness as possible up front.)
>
> Once merged to trunk, the next steps are:
>
> 1.) Finish a Harry model that we can use to further fuzz test SAI before
> 5.0 releases (see CASSANDRA-18275
> ). We've done a
> fair amount of fuzz/randomized testing at the component level, but I'd
> still consider Harry (at least around single-partition query use-cases) a
> critical item for us to have confidence before release.
>
> 2.) Start pursuing Phase 2 items as time and our needs allow. (see
> CASSANDRA-18473 )
>
> A reminder, SAI is a secondary index, and therefore is by definition an
> opt-in feature, and has no explicit "feature flag". However, its
> availability to users is still subject to the secondary_indexes_enabled
> guardrail, which currently defaults to allowing creation.
>
> Any thoughts, questions, or comments on the pre-merge plan here?
>


Status Update on CEP-7 Storage Attached Indexes (SAI)

2023-07-18 Thread Caleb Rackliffe
Hello there!

After much toil, the first phase of CEP-7 is nearing completion (see
CASSANDRA-16052).
There are presently two issues to resolve before we'd like to merge the
cep-7-sai feature branch and all its goodness to trunk:

CASSANDRA-18670 - Importer should build SSTable indexes successfully
before making new SSTables readable (in review)

CASSANDRA-18673 - Reduce size of per-SSTable index components (in progress)

(We've been getting clean CircleCI runs for a while now, and have been
using the multiplexer to sniff out as much flakiness as possible up front.)

Once merged to trunk, the next steps are:

1.) Finish a Harry model that we can use to further fuzz test SAI before
5.0 releases (see CASSANDRA-18275). We've done a fair amount of
fuzz/randomized testing at the component level, but I'd still consider
Harry (at least around single-partition query use-cases) a critical item
for us to have confidence before release.

2.) Start pursuing Phase 2 items as time and our needs allow. (see
CASSANDRA-18473)

A reminder, SAI is a secondary index, and therefore is by definition an
opt-in feature, and has no explicit "feature flag". However, its
availability to users is still subject to the secondary_indexes_enabled
guardrail, which currently defaults to allowing creation.

Any thoughts, questions, or comments on the pre-merge plan here?
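
To make the opt-in point above concrete: an index only exists once someone
issues a CREATE statement for it, and that statement is what the
secondary_indexes_enabled guardrail gates. Below is a minimal Python sketch
that only builds such a statement as a string. It assumes the
CREATE CUSTOM INDEX ... USING 'StorageAttachedIndex' form of CQL; the helper
name, keyspace, table, and column are hypothetical, and nothing here talks to
a cluster.

```python
import re

# Conservative pattern for unquoted CQL identifiers (an assumption made for
# this sketch; real CQL also allows quoted identifiers).
_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def sai_index_cql(keyspace: str, table: str, column: str, index_name: str) -> str:
    """Build (but do not execute) a CQL statement creating an SAI index on one column."""
    for ident in (keyspace, table, column, index_name):
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"not a simple CQL identifier: {ident!r}")
    return (
        f"CREATE CUSTOM INDEX IF NOT EXISTS {index_name} "
        f"ON {keyspace}.{table} ({column}) "
        f"USING 'StorageAttachedIndex';"
    )


if __name__ == "__main__":
    # Hypothetical schema names, for illustration only.
    print(sai_index_cql("app", "users", "email", "users_email_sai"))
```

With the guardrail set to disallow creation, a statement like this would be
expected to be rejected by the server instead of building an index.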