Hi Yubiao,
Thanks for your attention and your questions. My answers to the three
questions are below:
1.
> Does the merge take place in memory or in BK?
The snapshots will be merged in BK. For details, see the
*### Merge snapshot* section.
2.
> How do we ensure the atomicity of the two writes? I suggest adding a check
We do not guarantee atomicity across the two writes. The position of a
snapshot generally does not change, so the previous index remains valid.
If the index write fails after a snapshot has been written, the only
consequence is that this snapshot attempt fails. Nothing worse happens,
and compression introduces no dirty data.
3.
> Clean up unused aborts data
Snapshot cleanup is described in *#### Take snapshot*, under *##### How*.
The index is cleaned up automatically by the compressor. I will add this
to the *### Snapshot index topic* section.
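
As a rough illustration of point 2, here is a minimal Python sketch of the
write ordering. It is not Pulsar code: `InMemoryTopic`, `take_snapshot`, and
`recover` are hypothetical names, and ignoring a snapshot that is missing
from the index is an assumption that follows the check you suggested.

```python
class InMemoryTopic:
    """Hypothetical stand-in for a topic; not a Pulsar API."""

    def __init__(self):
        self.entries = []
        self.fail_next_write = False  # set to True to simulate one failed write

    def append(self, value):
        if self.fail_next_write:
            self.fail_next_write = False
            raise IOError("simulated write failure")
        self.entries.append(value)
        return len(self.entries) - 1  # position of the new entry


def take_snapshot(snapshot_topic, index_topic, aborted_txn_ids):
    """Write the snapshot first, then the index entry that points to it.

    The two writes are not atomic. If the index write fails, the earlier
    index entries are untouched and this snapshot attempt counts as failed.
    """
    position = snapshot_topic.append(list(aborted_txn_ids))
    try:
        index_topic.append(position)
    except IOError:
        return False
    return True


def recover(snapshot_topic, index_topic):
    """Rebuild the aborted-transaction set from indexed snapshots only."""
    aborted = set()
    for position in index_topic.entries:
        aborted.update(snapshot_topic.entries[position])
    return aborted


snapshots, index = InMemoryTopic(), InMemoryTopic()
take_snapshot(snapshots, index, [1, 2])   # both writes succeed
index.fail_next_write = True
take_snapshot(snapshots, index, [3])      # snapshot written, index write fails
print(recover(snapshots, index))          # {1, 2}
```

In this sketch the second snapshot stays in the snapshot topic but is never
read, because nothing in the index points to it.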

Yours sincerely,
Xiangying Meng




On Mon, Aug 15, 2022 at 3:56 PM Yubiao Feng
<yubiao.f...@streamnative.io.invalid> wrote:

> Hi Xiangying
>
> I think Multiple-snapshots for TB is a good idea. And I have these
> questions:
>
>
> > The number of the transactions in a snapshot can be configured, and we
> > hope it is small, then we can merge the small snapshots into a large
> > snapshot when it reaches a configured number.
>
> Does the merge take place in memory or in BK?
>
> - If we merge small-snapshot in memory, can we just use large-snapshot?
> - If we merge small-snapshot in BK, how to do it?
>
>
>
> > The index is written after each multiple-snapshot is written.
>
> Snapshot and index are stored in different topics, right?
>
> How do we ensure the atomicity of the two writes? I suggest adding a check
> mechanism so that a snapshot not recorded in the index is treated as invalid.
>
>
>
> > #### Clean up unused aborts data
>
> Currently, this section only has instructions for clearing snapshots.
> I think we should also describe how to delete or override the index data.
>
> Thanks
> Yubiao Feng
>
> On Thu, Aug 4, 2022 at 10:27 AM Xiangying Meng <xiangy...@apache.org>
> wrote:
>
> > Hi, Pulsar community,
> > I'd like to start a discussion about transaction multiple-snapshot.
> > In order to get rid of the capacity limitation of the BookKeeper entry,
> > we plan to use multiple snapshots. More details can be found here
> > <https://github.com/apache/pulsar/issues/16913>.
> >
> > Yours sincerely,
> > Xiangying Meng
> >
>
