> I think the project needs to conclude the discussions that keep being
> started around the "definition of done" before determining what sufficient
> quality assurance looks like for this feature.

Looking forward to the Test/QA guideline. Thanks for bringing this up.


> the CEP process suggests a wiki page

Added CEP-7 SAI cwiki:
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-7%3A+Storage+Attached+Index

On Sat, 22 Aug 2020 at 01:01, Jason Rutherglen <jason.rutherg...@gmail.com>
wrote:

> > About space efficiency, one of the biggest drawbacks of SASI was the huge
> > space required for the index structure when using CONTAINS logic because of
> > the decomposition of text columns into n-grams. Will SAI suffer from the
> > same issue in future iterations?
>
> SAI does not have specific ngram support atm, though that may be added
> with tokenizers. Ngrams do indeed grow the index; it is the user's
> decision whether to trade more disk space for faster queries.
>
> On Tue, Aug 18, 2020 at 6:05 AM DuyHai Doan <doanduy...@gmail.com> wrote:
> >
> > Thank you Zhao Yang for starting this topic
> >
> > After reading the short design doc, I have a few questions
> >
> > 1) SASI was pretty inefficient at indexing wide partitions because the index
> > structure only retains the partition token, not the clustering columns. As
> > per the design doc, SAI has a row id mapping to partition offset, so can we
> > hope that indexing wide partitions will be more efficient with SAI? One
> > detail that worries me is that at the beginning of the design doc, it is
> > said that the matching rows are post-filtered while scanning the partition.
> > Can you confirm or deny that SAI is efficient with wide partitions and
> > provides the partition offsets to the matching rows?
> >
> > 2) About space efficiency, one of the biggest drawbacks of SASI was the
> > huge space required for the index structure when using CONTAINS logic
> > because of the decomposition of text columns into n-grams. Will SAI suffer
> > from the same issue in future iterations? I'm anticipating a bit.
> >
> > 3) If I'm querying using SAI and providing the complete partition key, will
> > it be more efficient than querying without the partition key? In other
> > words, does SAI provide any optimisation when the partition key is specified?
> >
> > Regards
> >
> > Duy Hai DOAN
> >
> > On Tue, 18 Aug 2020 at 11:39, Mick Semb Wever <m...@apache.org> wrote:
> >
> > > >
> > > > We are looking forward to the community's feedback and suggestions.
> > > >
> > >
> > >
> > > What comes immediately to mind is testing requirements. It has been
> > > mentioned already that the project's testability and QA guidelines are
> > > inadequate to successfully introduce new features and refactorings to the
> > > codebase. During the 4.0 beta phase this was intended to be addressed, i.e.
> > > defining more specific QA guidelines for 4.0-rc. This would be an important
> > > step towards QA guidelines for all changes and CEPs post-4.0.
> > >
> > > Questions from me
> > >  - How will this be tested, how will its QA status and lifecycle be
> > > defined? (per above)
> > >  - With existing C* code needing to be changed, what is the proposed plan
> > > for making those changes while ensuring QA is maintained, e.g. are there
> > > separate QA cycles planned for altering the SPI before adding a new SPI
> > > implementation?
> > >  - Despite being out of scope, it would be nice to have some idea from the
> > > CEP author of when users might still choose afresh 2i or SASI over SAI.
> > >  - Who fills the roles involved? Who are the contributors in this DataStax
> > > team? Who is the shepherd? Are there other stakeholders willing to be
> > > involved?
> > >  - Is there a preference to use gdoc instead of the project's wiki, and
> > > why? (the CEP process suggests a wiki page, and feedback on why another
> > > approach is considered better helps evolve the CEP process itself)
> > >
> > > cheers,
> > > Mick
> > >
>
