I'm looking to use Pulsar to store some timeseries data with infinite 
retention. For a simplistic example, let's say it's a bunch of order lifecycle 
events (created, shipped, received, cancelled, etc). I want to be able to 
retrieve the state of the open orders at a given point in time, and then all 
order events up to a second point in time.

At first, compaction seems like what I want here: for the first point in 
time, I don't care about past orders that are already closed, so compaction 
would keep me from having to replay all events from the beginning of time. 
However, since I want to be able to retrieve any period of time, I would need 
multiple compaction points so I could start from the one closest to my start 
time.

Is this possible?
From my understanding reading through the documentation 
(https://pulsar.apache.org/docs/en/concepts-topic-compaction/), it looks like 
there's only a single compaction point. So if I performed a compaction today, 
but I wanted to retrieve last month's data, I'd have to replay from the 
beginning of time.
Can Pulsar handle this, or will I have to create some manual method of 
snapshotting and storing the state at periodic intervals?
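
To make the manual option concrete, here is a minimal sketch in plain Python 
of the kind of periodic snapshotting I have in mind. It makes no Pulsar client 
calls; the events are just an in-memory list sorted by timestamp. The function 
names, the (timestamp, order_id, kind) tuple layout, and the choice of 
"received" and "cancelled" as the events that close an order are all 
assumptions for illustration only.

```python
from bisect import bisect_right

# Assumption: these event kinds close an order; everything else keeps it open.
CLOSING = {"received", "cancelled"}

def apply(state, event):
    """Fold one (timestamp, order_id, kind) event into the open-orders dict."""
    _, order_id, kind = event
    if kind in CLOSING:
        state.pop(order_id, None)
    else:
        state[order_id] = kind

def build_snapshots(events, interval):
    """Snapshot the open orders every `interval` time units.

    Each snapshot is (cut_time, index, state): `state` reflects every event
    with timestamp < cut_time, and `index` is the first event not yet applied.
    """
    snapshots, state, next_cut = [], {}, None
    for i, ev in enumerate(events):
        ts = ev[0]
        if next_cut is None:
            next_cut = ts + interval
        while ts >= next_cut:
            snapshots.append((next_cut, i, dict(state)))
            next_cut += interval
        apply(state, ev)
    return snapshots

def open_orders_at(events, snapshots, t):
    """Open orders after all events with timestamp <= t, starting from the
    latest snapshot whose cut time is <= t instead of the beginning."""
    cuts = [s[0] for s in snapshots]
    k = bisect_right(cuts, t) - 1
    if k >= 0:
        _, start, saved = snapshots[k]
        state = dict(saved)
    else:
        start, state = 0, {}
    for ev in events[start:]:
        if ev[0] > t:
            break
        apply(state, ev)
    return state

def events_between(events, t1, t2):
    """All events with t1 < timestamp <= t2."""
    return [ev for ev in events if t1 < ev[0] <= t2]

events = [
    (1, "A", "created"), (2, "B", "created"), (3, "A", "shipped"),
    (12, "A", "received"), (15, "C", "created"), (25, "B", "cancelled"),
]
snapshots = build_snapshots(events, interval=10)
start_state = open_orders_at(events, snapshots, 16)
later_events = events_between(events, 16, 30)
```

In this sketch each snapshot plays the role of one "compaction point", so a 
query only replays the events between the nearest earlier snapshot and the 
requested start time.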

Thanks

-Patrick
