Re: [Discuss] PIP-248: Add backlog eviction metric

PengHui Li Wed, 01 Mar 2023 17:46:18 -0800

Ah, I forgot this one "pulsar_storage_backlog_quota_limit"
As Asaf said, users can just divide the two to get a percentage.
I think we don't need to expose more metrics for the size-based backlog
quota. And only exposing the topic-level metrics looks good to me.
Users can get the alert and then check which subscription has a large
backlog via Pulsar Admin.
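
A rough sketch of that division in Python. The two inputs are the values of
the existing topic-level metrics; the function name and the sample numbers
are made up for illustration, and treating a non-positive limit as "no quota
configured" is an assumption.

```python
from typing import Optional


def quota_used_percent(backlog_size: float,
                       quota_limit: float) -> Optional[float]:
    """Percentage of the size-based backlog quota currently in use.

    backlog_size: value of pulsar_storage_backlog_size for a topic.
    quota_limit:  value of pulsar_storage_backlog_quota_limit for that topic.
    Returns None when the limit is not positive (assumed: no quota set).
    """
    if quota_limit <= 0:
        return None
    return 100.0 * backlog_size / quota_limit


# Made-up sample: 750 MiB of backlog against a 1 GiB limit.
print(quota_used_percent(750 * 1024**2, 1024**3))
```

An alert rule can apply the same ratio and fire once it crosses a chosen
threshold.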


As for the estimated backlog size, it should be OK, right? The backlog quota
policy is also evaluated against the estimated backlog size.
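
To make the estimate concrete, here is a minimal sketch of the definition
Asaf gives in the quoted text below (sum the sizes of all ledgers from the
one holding the oldest unacknowledged message up to the current one). The
`LedgerInfo` tuple here is a simplified, hypothetical stand-in and not the
real protobuf message, and the sketch assumes ledger ids increase over time.

```python
from typing import List, NamedTuple


class LedgerInfo(NamedTuple):
    # Hypothetical, simplified view of a ledger's metadata.
    ledger_id: int
    size_bytes: int


def estimated_backlog_size(ledgers: List[LedgerInfo],
                           oldest_unacked_ledger_id: int) -> int:
    """Sum the sizes of every ledger from the one that contains the oldest
    unacknowledged message up to the newest one. Uses metadata only, so it
    reads no entries and over-counts messages that are already acknowledged
    inside those ledgers."""
    return sum(ledger.size_bytes
               for ledger in ledgers
               if ledger.ledger_id >= oldest_unacked_ledger_id)
```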

> I'm afraid of that.
> Today the quota is checked periodically, right? So that's how the operator
> knows the cost in terms of I/O is limited.
> Now you are adding one additional I/O per collection, every 1 min by
> default. That's a lot perhaps. How long is the check interval today?
>
> Perhaps in the backlog quota check, you can persist the check result, and
> use it? Persist the age that is.

I think yes, we don't need to add any extra cost here. The broker already
performs the backlog check when the backlog quota is enabled, so we can just
record the last checked value on the topic.
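
A minimal sketch of that idea, not the broker's actual code: the periodic
quota check stores what it computed, and metrics collection only reads the
stored value. The class and method names are hypothetical.

```python
import threading
import time
from typing import Optional


class BacklogAgeCache:
    """Holds the result of the most recent backlog quota check."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._age_seconds: Optional[float] = None
        self._checked_at: Optional[float] = None

    def record(self, age_seconds: float,
               now: Optional[float] = None) -> None:
        # Called by the periodic quota check, which has already paid
        # whatever cost was needed to find the oldest message's age.
        with self._lock:
            self._age_seconds = age_seconds
            self._checked_at = time.time() if now is None else now

    def last_age_seconds(self) -> Optional[float]:
        # Called on metrics collection: no I/O, just the stored value.
        # None means no check has run yet.
        with self._lock:
            return self._age_seconds

    def checked_at(self) -> Optional[float]:
        # Timestamp of the last check, so readers can judge staleness.
        with self._lock:
            return self._checked_at
```

The trade-off is staleness: the exposed value is only as fresh as the last
quota check.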

In the same way, we can just expose the time-based lag metric, so that
users can divide the two to get a percentage.

> Regarding "slowest_subscription"
> I think the cost is too high, because the subscriptions will keep
> alternating, which can generate so many unique time series. Since
> Prometheus flushes only every 2 hours, or any other TSDB, it will cost you
> too much.
>
> I suggest exposing the name via the topic stats. This way they can issue a
> REST call to grab that subscription name only when the alert fires.

Yes, I totally agree. And now we already have the information.
Just get the subscription with max backlog size.
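
A small sketch of that selection. The input is a plain mapping from
subscription name to backlog size in bytes; how that mapping is pulled out of
the topic stats response is left out, and the function name is hypothetical.

```python
from typing import Dict, Optional


def subscription_with_max_backlog(
        backlog_by_subscription: Dict[str, int]) -> Optional[str]:
    """Name of the subscription with the largest backlog size,
    or None when the topic has no subscriptions."""
    if not backlog_by_subscription:
        return None
    return max(backlog_by_subscription, key=backlog_by_subscription.get)
```

Because the name is only looked up when an alert fires, it never becomes a
metric label and adds no extra time series.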

@jiuming I think you'd better copy the context that Asaf provided into the
proposal.
It will help reviewers understand which problems we want to solve,
and it will give more people the opportunity to join the discussion.

Regards
Penghui

On Wed, Mar 1, 2023 at 11:42 PM Asaf Mesika <[email protected]> wrote:

> >
> > Pulsar has 2 configurations for the backlog eviction
> > <
> https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas
> >
> > : backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > By default, backlog eviction is disabled, and also, there is a field
> named
> > backlogQuotaMap in TopicPolicies
> > <
> https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45
> >
> > /NamespaceSpacePolicies
> > <
> https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41>
> assists
> > in controlling Topic/Namespace level backlog quota.
> >
> > If the topic backlog reaches either threshold, backlog eviction will
> > be triggered, and Pulsar will move the subscription's cursor to skip
> > unacknowledged messages.
> >
> > Before backlog eviction happens, we don't have a metric to monitor
> > how long it will take to reach the threshold.
> >
>
> I think you should fix this explanation:
>
> In Pulsar, a subscription tracks which messages have been acknowledged. A
> subscription backlog is the set of messages which are unacknowledged.
> A subscription backlog size is the sum of the sizes of the unacknowledged
> messages (in bytes).
> A topic can have many subscriptions.
> A topic backlog is defined as the backlog size of the subscription which
> has the oldest unacknowledged message. Since acknowledged messages can be
> interleaved with unacknowledged messages, calculating the exact size of
> that subscription's backlog can be expensive, as it requires I/O
> operations to read the messages from the ledgers.
> For that reason, the topic backlog is actually defined to be the estimated
> backlog size of that subscription. It is estimated by summing the sizes of
> all the ledgers, starting from the current active one, back to the ledger
> which contains the oldest unacknowledged message (There is actually a
> faster way to calculate it, but this is the definition of the estimation).
>
> A topic backlog age is the age of the oldest unacknowledged message (in any
> subscription). If that message was written 30 minutes ago, its age is 30
> minutes.
>
> Pulsar has a feature called backlog quota (place link). It allows the user
> to define a quota - in effect, a limit - which limits the topic backlog.
> There are two types of quotas:
> * Size based: The limit is for the topic backlog size (as we defined
> above).
> * Time based: The limit is for the topic's backlog age (as we defined
> above).
>
> Once a topic backlog exceeds either one of those limits, an action is taken
> upon messages written to the topic:
> * The producer write is placed on hold for a certain amount of time before
> failing.
> * The producer write fails.
> * The subscription's oldest unacknowledged messages are acknowledged in
> order until both the topic backlog size and age fall within the limit
> (quota). This process is called backlog eviction (it happens every
> interval).
>
> The quotas can be defined as a default value for any topic, by using the
> following broker configuration keys: backlogQuotaDefaultLimitBytes,
> backlogQuotaDefaultLimitSecond. They can also be specified directly for all
> topics in a given namespace using the namespace policy, or for a specific
> topic using a topic policy.
>
> The user today can calculate the quota used for the size-based limit, since
> two metrics are exposed today at the topic level:
> "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You
> can just divide the two to get a percentage.
> For the time-based limit, the only metric exposed today is the quota itself,
> "pulsar_storage_backlog_quota_limit_time".
>
> ------------
>
> I would create two metrics:
>
> `pulsar_backlog_size_quota_used_percentage`
> `pulsar_backlog_time_quota_used_percentage`
>
> You would like to know what triggered the alert, hence two.
> It's not the quota percentage, it's the quota used percentage.
>
> ----------
>
> It checks if the backlog size exceeds the threshold(
> > backlogQuotaDefaultLimitBytes), and it gets the current backlog size by
> > calculating LedgerInfo
> > <
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> >,
> > it will not lead to I/O.
>
> This is not correct.
> It checks against the topic / namespace policy, and if it doesn't exist, it
> falls back on the default configuration key mentioned above.
>
> It checks if the backlog time exceeds the threshold(
> > backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck is
> > set to be true, it will read an entry from Bookkeeper, but the default
> > value is false, which means it gets the backlog time by calculating
> > LedgerInfo
> > <
> https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54
> >.
> > So in general, we don't need to worry about it will lead to I/O.
>
>
> I'm afraid of that.
> Today the quota is checked periodically, right? So that's how the operator
> knows the cost in terms of I/O is limited.
> Now you are adding one additional I/O per collection, every 1 min by
> default. That's a lot perhaps. How long is the check interval today?
>
> Perhaps in the backlog quota check, you can persist the check result, and
> use it? Persist the age that is.
>
>
> ------
>
> Regarding "slowest_subscription"
> I think the cost is too high, because the subscriptions will keep
> alternating, which can generate so many unique time series. Since
> Prometheus flushes only every 2 hours, or any other TSDB, it will cost you
> too much.
>
> I suggest exposing the name via the topic stats. This way they can issue a
> REST call to grab that subscription name only when the alert fires.
>
> Thanks,
>
> Asaf
>
>
>
>
>
> On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <[email protected]> wrote:
>
> > Hi Asaf,
> > I've updated the PIP, PTAL
> >
> > Thanks,
> > Tao Jiuming
> >
> > On Sun, Feb 26, 2023 at 23:03, Asaf Mesika <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > Pulsar has 2 configurations for the backlog eviction:
> > > > backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond, if
> > > > topic backlog reaches the threshold of any item, backlog eviction
> will
> > be
> > > > triggered.
> > >
> > > This seems like default values, not the actual values. Can you please
> > > provide an explanation in the PIP and link to read more:
> > > 1. Where do you define the backlog quota exactly? What is the
> granularity
> > > (subscription?)
> > > 2.  Is the backlog quota on by default? If so, what are the default
> > values?
> > >
> > >
> > >
> > > *Notes*
> > > 1. When the backlog quota limit is defined in Bytes, and you wish to
> know
> > > how close a subscription is to its bytes limit, you need to calculate
> the
> > > backlog size in bytes. From my understanding, there is an accurate
> > > calculation (which is costly in terms of I/O) and there is an estimate
> of
> > > it. I presume you would want to use the estimated one, is that correct?
> > > Does the backlog quota itself use the accurate or the estimated size
> > > when it starts evicting entries (i.e. marking them as acknowledged)?
> > >
> > > 2. For the backlog limit specifying in time units, there is no
> estimate,
> > as
> > > it must be calculated all the time (earliest unacknowledged message
> > > distance from now). How do you plan to calculate the current age of the
> > > earliest message without bearing that I/O cost on each metric
> > calculation?
> > >
> > > 3. In the Goal section, you specify that your goal is to add a
> > "proximity"
> > > metric.
> > > a) You must define that - what is proximity metric exactly? What are
> its
> > > units? How are you planning to calculate it?
> > > b) Proximity is not a good term IMO. I personally have never seen this
> > term
> > > used in software systems, unless it's in the aviation/space industry.
> > Once
> > > you explain (a) I hope I can help provide alternative names.
> > >
> > > 4. Maybe we should provide the used quota percentage for each limit,
> > > instead of one for both, since it's easier to act upon the alert when
> > > you know which one triggered it.
> > >
> > > 5. I didn't understand the "slowest_subscription" label used when
> > > describing the metric label. Can you please provide an explanation?
> > >
> > > 6. I suggest writing a "High Level Design" section, and add everything
> > you
> > > need to know for this proposal, so I don't need to read the
> > > implementation details below (code).
> > >
> > > Thanks,
> > >
> > > Asaf
> > >
> > >
> > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > > >
> > > > ### Motivation:
> > > >
> > > > Pulsar has 2 configurations for the backlog eviction:
> > > > `backlogQuotaDefaultLimitBytes` and `backlogQuotaDefaultLimitSecond`,
> > if
> > > > topic backlog reaches the threshold of any item, backlog eviction
> will
> > be
> > > > triggered.
> > > >
> > > > Before backlog eviction happens, we don't have a metric to monitor
> > > > how long it will take to reach the threshold.
> > > >
> > > > We can provide a progress bar metric to tell users that some topics
> > > > are about to trigger backlog eviction. Users can then subscribe to
> > > > the alert to schedule consumers.
> > > >
> > > > For more details, please read the PIP at
> > > > https://github.com/apache/pulsar/issues/19601
> > > >
> > > > Thanks,
> > > > Tao Jiuming
> > > >
> > >
> >
>
