Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

Benedict Mon, 21 Nov 2022 06:06:27 -0800

Yes of course, this was absolutely just a query and not a precondition for this 
work. It stands on its own in my view, and I’m already ready to +1 the proposal.


> On 21 Nov 2022, at 13:55, Branimir Lambov <[email protected]> wrote:
> 
> 
> I see. This does make a lot of sense for full row indexing, and also if one 
> can specify sub-kb granularity (at the current default we just won't have an 
> index in these cases). How does opening a ticket to do these two* after the 
> current code is committed sound?
> 
> * embedded index for sub-X-byte partitions + granularity in bytes
> 
>> On Mon, Nov 21, 2022 at 3:38 PM Benedict <[email protected]> wrote:
>> Buffering on write up to at most one page seems fine? Once you are past a 
>> single page it’s fine to write either to the end of the partition or to a 
>> separate file, since there’s not much to be gained there, but especially for 
>> small partitions there’s likely significant value in prepending it?
>> 
>> It might be preferable to retain the separate index for those that overflow 
>> this buffer, and simply encode in the partition index whether the row index 
>> is inline or in the separate file.
>> 
>>>> On 21 Nov 2022, at 13:29, Branimir Lambov <[email protected]> wrote:
>>>> 
>>> 
>>> There is no intention to introduce any new versions of the format 
>>> specifically for DSE. If there are any further changes to the format, they 
>>> will be OSS-first. In other words this support only extends to preexisting 
>>> versions of the format.
>>> 
>>> Inline row index in the data file is not something we have implemented, and 
>>> it's currently not in any plans. I personally am not sure how it can be 
>>> done to provide a benefit: if we place it at the end of a partition, it 
>>> does not help much compared to a separate file; if we place it in front, we 
>>> have to buffer the partition content, which will affect write performance. 
>>> In either case it may be harder to cache. Do you have something different 
>>> in mind?
>>> 
>>> Regards,
>>> Branimir
>>> 
>>>> On Mon, Nov 21, 2022 at 3:01 PM Benedict <[email protected]> wrote:
>>>> Personally very pleased to see this proposal, and I’m not opposed to 
>>>> easing your migration by maintaining some light support for internal file 
>>>> versions - though would prefer the support have some version limit where 
>>>> it can be excised (maybe for one minor version bump?)
>>>> 
>>>> One implementation question: are there any plans to support inline row 
>>>> index in the big sstable format files? Is this something DSE supports, and 
>>>> on the roadmap just not for initial work, or currently not envisioned?
>>>> 
>>>> I would anticipate significant advantage to this for many workloads, and 
>>>> no downside (except for streaming - which could be resolved fairly easily 
>>>> by skipping over these sections when streaming to an old node, but since 
>>>> we don’t generally stream between versions I don’t see any major issue 
>>>> anyway).
>>>> 
>>>> 
>>>>>> On 21 Nov 2022, at 12:43, Branimir Lambov <[email protected]> wrote:
>>>>>> 
>>>>> 
>>>>> Hi everyone,
>>>>> 
>>>>> We would like to put CEP-25 for discussion.
>>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format
>>>>> 
>>>>> The proposal describes DSE's Big Trie-indexed SSTable format, which 
>>>>> replaces the primary index with on-disk tries to improve lookup 
>>>>> performance and index size, better handle wide partitions, and remove the 
>>>>> need to manage key caching and index summaries.
>>>>> 
>>>>> We would like to discuss this proposal with you.
>>>>> 
>>>>> One of the questions that we want to ask is whether anyone objects to 
>>>>> maintaining full compatibility with existing files created by DataStax 
>>>>> Enterprise.
>>>>> 
>>>>> Regards,
>>>>> Branimir
>>> 
>>> 
>>>
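
The buffering idea discussed in this thread (hold at most one page of partition content on write, prepend the row index if the partition fits, otherwise fall back to the separate index file and record which case applies in the partition index) might be sketched roughly as below. This is only an illustration and not Cassandra or DSE code: the 4096-byte page size, the toy index encoding, and every name (`SketchWriter`, `PartitionEntry`, `encode_row_index`) are assumptions made up for the example.

```python
import struct
from dataclasses import dataclass, field
from typing import Dict, List, Optional

PAGE_SIZE = 4096  # assumed page size in bytes; not specified in the thread


@dataclass
class PartitionEntry:
    """Hypothetical partition index entry."""
    data_position: int                 # where the partition starts in the data file
    inline: bool                       # True if the row index precedes the rows inline
    row_index_position: Optional[int]  # offset in the separate index file, if not inline


def encode_row_index(offsets: List[int]) -> bytes:
    # Toy encoding: little-endian row count followed by one offset per row.
    return struct.pack(f"<I{len(offsets)}I", len(offsets), *offsets)


@dataclass
class SketchWriter:
    """In-memory stand-in for the data file, row index file and partition index."""
    data_file: bytearray = field(default_factory=bytearray)
    row_index_file: bytearray = field(default_factory=bytearray)
    partition_index: Dict[str, PartitionEntry] = field(default_factory=dict)

    def write_partition(self, key: str, rows: List[bytes]) -> None:
        start = len(self.data_file)
        buffered = bytearray()
        offsets: List[int] = []  # row offsets relative to the start of the row data
        size = 0
        overflowed = False
        for row in rows:
            offsets.append(size)
            size += len(row)
            if not overflowed and size <= PAGE_SIZE:
                buffered += row  # still within one page: keep buffering
                continue
            if not overflowed:
                # Past one page: flush the buffer and stop buffering.
                self.data_file += buffered
                buffered.clear()
                overflowed = True
            self.data_file += row
        index_bytes = encode_row_index(offsets)
        if overflowed:
            # Large partition: the row index goes to the separate file.
            entry = PartitionEntry(start, False, len(self.row_index_file))
            self.row_index_file += index_bytes
        else:
            # Small partition: prepend the row index to the buffered rows.
            self.data_file += index_bytes + buffered
            entry = PartitionEntry(start, True, None)
        self.partition_index[key] = entry


writer = SketchWriter()
writer.write_partition("small", [b"a" * 100, b"b" * 200])    # fits in one page
writer.write_partition("large", [b"c" * 3000, b"d" * 3000])  # overflows the buffer
```

In this sketch the write path never holds more than one page in memory, and a reader would consult the `inline` flag in the partition index entry to decide whether to parse the row index at `data_position` or look it up in the separate file.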
