BookKeeper's ledger is a distributed WAL with random-read support.
ledger.readAsync(long firstEntry, long lastEntry) works when
firstEntry == lastEntry.
With that, one can pipeline reads of as many entries in parallel as
needed; no BK API changes are needed for such a niche case.
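
A rough sketch of that pipelining idea, in case it helps. EntryReader
below is a hypothetical stand-in I made up so the example compiles on
its own; it only mirrors the (firstEntry, lastEntry) shape of readAsync
and simplifies the payload to byte[]. BackwardPipeline and
readBackwards are likewise illustrative names, not BK or Pulsar APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a ledger read handle; not the real BookKeeper type. */
interface EntryReader {
    // Same (firstEntry, lastEntry) shape as discussed above; payload simplified to byte[].
    CompletableFuture<byte[]> readAsync(long firstEntry, long lastEntry);
}

final class BackwardPipeline {
    private BackwardPipeline() {}

    /**
     * Issues one single-entry read per entry id, newest first, without waiting
     * for earlier reads to complete, then collects the results in that same
     * newest-to-oldest order.
     */
    static CompletableFuture<List<byte[]>> readBackwards(EntryReader reader, long newest, long oldest) {
        if (oldest < 0 || newest < oldest) {
            throw new IllegalArgumentException("expected 0 <= oldest <= newest");
        }
        List<CompletableFuture<byte[]>> inFlight = new ArrayList<>();
        for (long id = newest; id >= oldest; id--) {
            // firstEntry == lastEntry, so every call is a valid forward range of length 1.
            inFlight.add(reader.readAsync(id, id));
        }
        return CompletableFuture
                .allOf(inFlight.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    List<byte[]> out = new ArrayList<>(inFlight.size());
                    for (CompletableFuture<byte[]> f : inFlight) {
                        out.add(f.join()); // already complete at this point
                    }
                    return out;
                });
    }
}
```

This version puts every read in flight at once; for a long range you
would probably want to cap the number of outstanding reads (for
example, by calling it on a sliding window of entry ids).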

Reading backwards may hurt performance: BK has a caching layer, and the
OS pages data into memory in anticipation of sequential reads.
You can experiment with OS tuning and some BK parameters; with modern
SSDs, backward reads should show minimal performance difference.

Most of your changes are in Pulsar. I haven't looked closely, but my
best guess is that they will affect topic truncation, which relies on
subscriptions moving forward.

FWIW, the consumer has a seek() API:
https://github.com/apache/pulsar/blob/e5a833a2dcb7ce13ada4ca94714cc045a02de276/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/Consumer.java#L482
You can seek N messages back from the last offset, read that batch
forward sequentially into memory, reverse it, handle it, and repeat.
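
The seek/read/reverse/repeat loop could look roughly like the sketch
below. ForwardSource is a hypothetical abstraction I made up to keep
the example self-contained: it assumes positions are plain long
indexes, which is a simplification (Pulsar addresses messages by
MessageId, so a real version would have to track the MessageId to seek
to for each chunk). ChunkedReverse and forEachNewestFirst are
illustrative names as well.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer; // java.util's functional interface, not Pulsar's Consumer

/** Hypothetical forward-only source: position at an index, then read sequentially. */
interface ForwardSource<T> {
    long size();                                // number of messages currently in the log
    List<T> readForward(long from, int count);  // messages in [from, from + count)
}

final class ChunkedReverse {
    private ChunkedReverse() {}

    /**
     * Walks the log newest-first while only ever reading forward:
     * seek back by one chunk, read the chunk forward, reverse it in memory,
     * hand each message to the handler, then repeat with the previous chunk.
     */
    static <T> void forEachNewestFirst(ForwardSource<T> source, int chunk, Consumer<? super T> handler) {
        if (chunk <= 0) {
            throw new IllegalArgumentException("chunk must be positive");
        }
        long end = source.size(); // exclusive upper bound of the next chunk
        while (end > 0) {
            long start = Math.max(0, end - chunk);
            // Copy so that reversing does not touch the source's own list.
            List<T> batch = new ArrayList<>(source.readForward(start, (int) (end - start)));
            Collections.reverse(batch);
            batch.forEach(handler);
            end = start;
        }
    }
}
```

Memory use is bounded by the chunk size, and each chunk is still a
sequential forward read, so caching and read-ahead keep working in
your favor.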

As for the PR and idea overall, I'd suggest calling into one of the Pulsar
community meetings to get faster feedback.

On Mon, Mar 6, 2023 at 7:37 AM Alexandre DUVAL <kanna...@gmail.com> wrote:

> Hi,
>
> I'm wondering if it is possible to introduce a new feature in Pulsar
> that would let users read a topic backwards, from a given MessageId
> through the previous messages to the beginning of the topic.
>
> I tried to use Pulsar SQL but it requires so much RAM even for little
> queries (due to Presto design).
>
> Currently, every read in Pulsar is expected to go forward, so it
> might be a bit tricky to prevent every odd behavior when introducing
> the feature.
>
> I'm currently trying to build an MVP/POC by introducing a readReverse
> field in CommandSubscribe, which is used by the Reader API, and I'm
> looking to add a getFirstMessageId() to ManagedLedger
> (https://github.com/CleverCloud/pulsar/pull/3). I also removed the
> startPosition < endPosition sanity checks in BookKeeper locally
> (https://github.com/CleverCloud/bookkeeper/pull/2).
>
> We would definitely prefer readPrevious() and
> hasPreviousMessageAvailable() methods in the Reader API.
>
> I'm not familiar with internals such as NonDurableCursor,
> RangeEntryCache, and ManagedCursor, so it's a bit tricky.
>
> So I'm looking for someone to help/guide me, or even to take over the
> subject (or the discussion) directly.
>
> Regards,
>
> Kannar
>
>
>

-- 
Andrey Yegorov
