Hi Branimir, thank you for sharing the draft. I like the idea: I am
currently tuning memory allocation for the write flow, and I see limits in
the current implementation that cannot be overcome without changing the
design (although there are still plenty of places that can be addressed by
local improvements).
At the same time, it is quite a challenging plan, taking into account the
amount of affected code :-).
A few questions about the CEP and the possible design for the ideas
mentioned in it:
1) Am I right that you plan to put cell content into the trie itself as
well? How do you plan to deal with values that change? The current trie
implementation manages primary keys, which are immutable, and that is
easier to implement than the mutable case.
2) Do you already have a plan for how to provide consistent reads for the
trie as a memtable? Currently we rely on BTree copy-on-write structures in
partition data to get a consistent partition read, plus Java GC as a way to
recycle old versions on the heap (except cell values, which are allocated
in slabs).
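
To illustrate the copy-on-write pattern referred to in question 2, here is
a minimal sketch. It is not Cassandra's actual BTree code: CowPartition,
snapshot() and put() are hypothetical names, and a copied TreeMap stands in
for a real persistent BTree (which would share unchanged nodes instead of
copying everything). A reader takes one immutable version and keeps a
consistent view of it, while a writer publishes a new version atomically;
superseded versions are reclaimed by the GC once no reader holds them.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not taken from the Cassandra code base.
final class CowPartition {
    // Always points at an immutable version of the partition contents.
    private final AtomicReference<NavigableMap<String, String>> current =
        new AtomicReference<>(
            Collections.unmodifiableNavigableMap(new TreeMap<String, String>()));

    // Readers grab one version and see a consistent view of it,
    // regardless of any writes that happen afterwards.
    NavigableMap<String, String> snapshot() {
        return current.get();
    }

    // Writers copy the current version, modify the copy, and publish it
    // with a compare-and-set; on contention the loop retries.
    void put(String key, String value) {
        while (true) {
            NavigableMap<String, String> old = current.get();
            TreeMap<String, String> copy = new TreeMap<>(old);
            copy.put(key, value);
            if (current.compareAndSet(old, Collections.unmodifiableNavigableMap(copy))) {
                return; // the old version becomes garbage once readers drop it
            }
        }
    }
}
```

With in-place mutation inside a trie, neither the immutable snapshot nor
the GC-based recycling comes for free, which is what the question is about.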

Regards,
Dmitry

On Wed, 15 Oct 2025 at 22:28, Josh McKenzie <[email protected]> wrote:

> This is a pretty large proposal that has components solving a variety of
> problems of Cassandra.
>
> Indeed. :)
>
> Wanted to chime in too and say: will be digesting this for a bit (though
> it does look familiar...)
>
> Since CEPs are broadly about getting up-front collaboration and alignment
> on bigger architectural changes, given this is all closely related I think
> it'd be fine to have the CEP as one big block and do something like an Epic
> for the implementation w/issues that fall out from there (or whatever else
> works for you). That way you keep locality on the architectural part of it
> and can deliver it however best suits you.
>
> Thanks for putting this out there - clearly a lot of thought and effort
> has gone into it and the community would broadly benefit from a lot of the
> things you're trying to solve with this!
>
> On Wed, Oct 15, 2025, at 10:37 AM, Branimir Lambov via dev wrote:
>
> Hello everyone,
>
> I would like to open CEP-57: Flat keys and trie interfaces
> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-57%3A+Flat+keys+and+trie+interfaces>
> for discussion.
>
> This is a pretty large proposal that has components solving a variety of
> problems of Cassandra. The original work started as an attempt to improve
> local node performance, but it enables improvements beyond that.
>
> The core of the proposal is movement away from data representation as
> hierarchies of structures, towards a simple byte-comparable key to value
> store, where tries and trie cursors are used to efficiently store and
> access data, and key prefixes are used to define the distribution of data
> among nodes.
>
> This enables the flexibility in data distribution that we know we need to
> solve a variety of problems that make Cassandra difficult to use. In
> addition, a change in the internal representation of data is also an
> opportunity to redesign tombstone storage and processing, which in turn
> makes it possible to solve the problems associated with tombstone handling.
>
> Please let me know what you think. Will this project benefit from being
> split into multiple CEPs? Are there clarifications that need to be made?
> Any problems or opportunities I've missed? Would you support this kind of
> transformation for the core engine?
>
> Regards,
> Branimir
>
>
>

-- 
Dmitry Konstantinov
