On Tue, Nov 7, 2017 at 10:47 AM, Istvan Soos <[email protected]> wrote:

> On the website [0] I gather that data compaction is mostly about
> cleaning up after we delete a ledger. Is there a feature or plan to
> implement entry-level compaction, e.g. to have an ID that uniquely
> identifies an entity, and if there are two events for that entity,
> only retain the last one?
>
> [0]: https://bookkeeper.apache.org/docs/latest/getting-started/concepts/


Currently we don't have an open item for supporting this "log compaction"
feature.
But I would like to learn more about your use case and see how we can
support you.


>
>
> Or do you implement it by using different ledgers, migrating from one
> to another?


In the Pulsar community, we are actually discussing a similar "log compaction"
feature.
Pulsar is a pub/sub messaging system built on Apache BookKeeper.
The idea is almost the same as what you described: it would compact the
messages/entries based on some keys and write the compacted messages to a
separate ledger.
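
To make the idea concrete, here is a minimal sketch in Python. It is not
Pulsar's or BookKeeper's actual implementation: ledgers are modeled as plain
in-memory lists of (key, value) tuples, and the name compact_entries is
hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

# Assumption: an entry is a (key, payload) pair; a real ledger entry
# would carry its key inside the payload or in application metadata.
Entry = Tuple[Hashable, bytes]


def compact_entries(entries: List[Entry]) -> List[Entry]:
    """Keep only the last entry for each key, preserving relative order."""
    # First pass: remember the position of the last entry seen per key.
    last_index: Dict[Hashable, int] = {}
    for i, (key, _) in enumerate(entries):
        last_index[key] = i
    # Second pass: keep an entry only if it is the last one for its key.
    return [e for i, e in enumerate(entries) if last_index[e[0]] == i]


# The source "ledger" holds two events for key "a"; only the later survives.
source_ledger = [("a", b"1"), ("b", b"2"), ("a", b"3")]
compacted_ledger = compact_entries(source_ledger)
```

In a real system the compacted entries would be written to a newly created
ledger rather than returned as a list.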



> How does it work out with handovers of what is considered
> the main ledger to write to or read from?
>

You need some sort of metadata to track the list of ledgers and update the
metadata once a compacted ledger is generated.
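
A rough sketch of that handover, again in Python and under stated
assumptions: the metadata is a versioned list of ledger IDs kept in an
in-memory stand-in for whatever metadata store you use, and the update is a
compare-and-set so that a concurrent change is not overwritten. All class and
function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class LogMetadata:
    version: int                  # bumped on every successful update
    ledger_ids: Tuple[int, ...]   # ledgers that make up the log, in order


class InMemoryMetadataStore:
    """Hypothetical stand-in for a real versioned metadata store."""

    def __init__(self, ledger_ids: Iterable[int]) -> None:
        self._meta = LogMetadata(0, tuple(ledger_ids))

    def read(self) -> LogMetadata:
        return self._meta

    def compare_and_set(self, expected_version: int,
                        new_ledger_ids: Iterable[int]) -> bool:
        # Reject the update if someone else changed the metadata first.
        if self._meta.version != expected_version:
            return False
        self._meta = LogMetadata(expected_version + 1, tuple(new_ledger_ids))
        return True


def swap_in_compacted(store: InMemoryMetadataStore,
                      compacted_sources: Iterable[int],
                      compacted_ledger_id: int) -> bool:
    """Replace the compacted source ledgers with the new compacted ledger."""
    meta = store.read()
    sources = set(compacted_sources)
    if not sources.issubset(meta.ledger_ids):
        return False  # the ledger list no longer contains the sources
    new_ids = []
    inserted = False
    for ledger_id in meta.ledger_ids:
        if ledger_id in sources:
            # Put the compacted ledger where the first source ledger was.
            if not inserted:
                new_ids.append(compacted_ledger_id)
                inserted = True
        else:
            new_ids.append(ledger_id)
    return store.compare_and_set(meta.version, new_ids)


store = InMemoryMetadataStore([1, 2, 3])
ok = swap_in_compacted(store, compacted_sources=[1, 2], compacted_ledger_id=10)
```

Readers that always resolve the ledger list through the metadata see either
the old ledgers or the compacted one, never a mix; the old ledgers can be
deleted once no reader still references them.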

Hope this answers your questions. Would love to chat more about your use
case.


>
> Thanks,
>   Istvan
>
