Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

Jacek Lewandowski Tue, 22 Nov 2022 01:20:44 -0800

+1 for the proposal!

Btw, regarding tests: perhaps we will have to let the Python DTests run with
either the new or the old format.


thanks
- - -- --- ----- -------- -------------
Jacek Lewandowski


On Mon, Nov 21, 2022 at 3:06 PM Benedict <bened...@apache.org> wrote:

> Yes of course, this was absolutely just a query and not a precondition for
> this work. It stands on its own in my view, and I’m already ready to +1 the
> proposal.
>
> On 21 Nov 2022, at 13:55, Branimir Lambov <blam...@apache.org> wrote:
>
> I see. This does make a lot of sense for full row indexing, and also if
> one can specify sub-kb granularity (at the current default we just won't
> have an index in these cases). How does opening a ticket to do these two*
> after the current code is committed sound?
>
> * embedded index for sub-X-byte partitions + granularity in bytes
>
> On Mon, Nov 21, 2022 at 3:38 PM Benedict <bened...@apache.org> wrote:
>
>> Buffering on write up to at most one page seems fine? Once you are past a
>> single page it’s fine to write either to the end of the partition or to a
>> separate file, as there’s nothing much to be gained; but especially for small
>> partitions there’s likely significant value in prepending it?
>>
>> It might be preferable to retain the separate index for those that
>> overflow this buffer, and simply encode in the partition index whether the
>> row index is inline or in the separate file.
>>
>> On 21 Nov 2022, at 13:29, Branimir Lambov <blam...@apache.org> wrote:
>>
>> There is no intention to introduce any new versions of the format
>> specifically for DSE. If there are any further changes to the format, they
>> will be OSS-first. In other words, this support only extends to preexisting
>> versions of the format.
>>
>> Inline row index in the data file is not something we have implemented,
>> and it's currently not in any plans. I personally am not sure how it can be
>> done to provide a benefit: if we place it at the end of a partition, it
>> does not help much compared to a separate file; if we place it in front, we
>> have to buffer the partition content, which will affect write performance.
>> In either case it may be harder to cache. Do you have something different
>> in mind?
>>
>> Regards,
>> Branimir
>>
>> On Mon, Nov 21, 2022 at 3:01 PM Benedict <bened...@apache.org> wrote:
>>
>>> Personally very pleased to see this proposal, and I’m not opposed to
>>> easing your migration by maintaining some light support for internal file
>>> versions - though I would prefer the support have some version limit after
>>> which it can be excised (maybe for one minor version bump?)
>>>
>>> One implementation question: are there any plans to support inline row
>>> index in the big sstable format files? Is this something DSE supports, and
>>> on the roadmap just not for initial work, or currently not envisioned?
>>>
>>> I would anticipate significant advantage to this for many workloads, and
>>> no downside (except for streaming - which could be resolved fairly easily
>>> by skipping over these sections when streaming to an old node, but since we
>>> don’t generally stream between versions I don’t see any major issue anyway).
>>>
>>>
>>> On 21 Nov 2022, at 12:43, Branimir Lambov <blam...@apache.org> wrote:
>>>
>>> Hi everyone,
>>>
>>> We would like to put CEP-25 up for discussion.
>>>
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format
>>>
>>> The proposal describes DSE's Big Trie-indexed SSTable format, which
>>> replaces the primary index with on-disk tries to improve lookup performance
>>> and index size, better handle wide partitions, and remove the need to
>>> manage key caching and index summaries.
>>>
>>> We would like to discuss this proposal with you.
>>>
>>> One of the questions that we want to ask is whether anyone objects to
>>> maintaining full compatibility with existing files created by DataStax
>>> Enterprise.
>>>
>>> Regards,
>>> Branimir
>>>
>>>
>>
>>
>>
