Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic

Asaf Mesika Tue, 14 Mar 2023 08:01:17 -0700

On Sun, Mar 12, 2023 at 7:21 PM Yan Zhao <[email protected]> wrote:


> > > In the LedgerDeletionService start, it will create a producer for
> sending
> > > pending delete ledger. When deleting a ledger, the producer sends
> > > PendingDeleteLedgerInfo to the system-topic. If the sent succeeds,
> delete
> > > the ledger id from the whole ledgers and update the ledgers into the
> meta
> > > store. If the sent fails, do nothing.
> >
> > You don't delete the ledger ID from the whole ledgers, then update the
> > ledgers in the metadata store. We described it above already; we remove
> the
> > pending delete ledger ID from the list of ledger IDs, publish the
> message,
> > and only at the end update the metadata store.
> Yes, correct it.
>
> > > In the LedgerDeletionService start, it will create a producer for
> sending
> > > pending delete ledger. When deleting a ledger, the producer sends
> > > PendingDeleteLedgerInfo to the system-topic. If the sent succeeds,
> delete
> > > the ledger id from the whole ledgers and update the ledgers into the
> meta
> > > store. If the sent fails, do nothing.
>
> > After that section, you write the methods of DeletionService and the
> > pseudo-code, but it's hard to understand what goes where: What exactly
> > deletion service does, and where the pseudo code will actually go into.
> It's just a summary, more detailed information on the later sections.
>
> > Also, let's concentrate on the implementation details in the
> Implementation
> > section High-Level Design. Let the readers understand the concept end to
> > end in the High-Level Design, and then in the Implementation, start
> diving
> > into the nitty gritty details.
>
>
> > > The consumer uses pulsar admin to send PendingDeleteLedgerInfo by the
> > > broker restful API.
>
> > Please explain here that you will introduce a new Admin API for that.
> Also,
> > explain that you will issue that API against the broker who owns the
> topic
> > (The schema registry has a ledger per topic?)
> correct it.
>
> > I still don't understand which methods or behavior you're changing
> exactly
> > for the managed ledger, cursor, and schema.
> > Why am I asking?
> > - Ledger deletion can happen because:
> > * Background truncation since ledgers are allowed to be deleted based on
> > expiration policy.
> > * Truncate request by an admin API.
> > * More?
> All the ways you mention do the same work: trigger
> ManagedLedgerImpl#internalTrimLedgers.
> We change the implementation approach in the internalTrimLedgers.
>
> > If you find the ledger ID is still in the meta-store, you return a
> failure,
> > and the consumer acks the message, right?
> > If it's background deletion, all good since it will kick in again in a
> few
> > seconds and retry.
> > But what happens if a user runs a REST admin API command to truncate the
> > topic?
> Using the REST admin API command to truncate the topic is not different
> from the background task. It's happening in the first metadata deletion
> phase, we didn't check whether the ledger is in the metadata or not, the
> first phase only picks all consumed ledgers, then try to delete them.
>
> > > check if the ledger is still in use,
>
> > You mean, check if the ledger ID still exists in the ledger ID list of
> the
> > topic in the metadata store, right?
> >
> >  then check the PendingDeleteLedgerInfo ledgerType.
> >
> > Check for what?
> see 3.1, 3.2.
>
>
> > > 2.2 If the topic not exists, the consumer checks the
> > > PendingDeleteLedgerInfo ledgerType.
> > > 3.1 If the ledgerType is Ledger, delete the data at the bookkeeper
> side.
> > > 3.2 If the ledgerType is Offload-Ledger, delete the data at the
> > > tiered-storage side.
>
> > Something doesn't look good in the numbering and how you structure it.
> 3.1
> > is probably an indent inside 2.2, no?
> Sorry, it may confuse the user. I add the flow hint in the pip.
>
>
> > 1. What happens if the topic doesn't exist when you are in the broker in
> > the delete pending ledger API implementation?
> The broker will respond 404 Topic not found an error to the consumer, the
> consumer knows the topic already does not exist, it will delete the ledger
> locally.
>
> You need to write that in your PIP. In your pseudo code you make it appear
as if the failure is boolean, but in fact the response should be an error
code.



> > 2. Say the ledger ID is still in the metadata store. If the background
> > truncation of the managed ledge was the trigger, there is no real point
> in
> > retrying, right? Once you start over, another message will be sent to the
> > system topic.
> In the storage deletion phase, If the ledger id is still in the metadata
> store, we think it will be used, we didn't delete it and ack the message.
> Due to the ledger is still in the metadata store, it will be deleted at
> the next background truncation task. So the ledger will be deleted finally.
>
>
> > 3. Can you explain exactly the cases in which it is actually worth doing
> > the retry?
> > I mean the obvious: you failed to delete it from BK due to a transient
> > error. But the previous one of checking the list? worth the retry?
> If the ledger is still in the meta store at the second storage deletion,
> we won't retry. we will ack the message to avoid useless retry.
> We retry deletion only on the storage delete failed case.
>
You need to make it clear.
Your response to new REST ADMIN API should be an error code:
TOPIC_NOT_EXISTS
LEDGER_STILL_IN_METADATA_STORE
STORAGE_DELETION_FAILED

then retry only the last type of error.


>
>
> > then respond to the error msg "ledger still in use" to the consumer.
>
> > You need to have those error codes in your pseudo code.
> > and your consumer pseudo code should check and act accordingly
> Fine. correct it.
>
I haven't seen it.
Also you need to make it clear also in the interface method definition of
the REST API of delete pending ledger.



>
> > > For security, the hacker may fake a PendingDeleteLedgerInfo and send
> it to
> > > the system-topic. So we need to do some checks works before deleting
> the
> > > data.
> > >
> > I think we’re over-engineering this or overprotecting.
> > Today, anybody with access to Pulsar Admin (who has a token with a role)
> > can do a lot of harm - delete an entire topic, and truncate the topic.
> Why
> > do you want to protect yourself from something like that? It’s too much,
> > IMO.
> Ok, we need more discussion about this case. I will ack for more eyes.
>
> > > Build an offloader according to the
> PendingDeleteLedgerInfo.offloadContext
> > >
> > >  Isn’t that expensive?
> We must use offloaded to delete the offload ledger. We can use the cache
> to avoid building the same offloader.
>
>
> > > There are some errors but we think it was successful, and avoid
> retrying
> > > it again.
> > >
> > >    - The Ledger is Still in use
> > >    - The PendingDeleteLedgerInfo is not matched with the bookkeeper
> > >    ledger metadata
> > >    - The data is not exist in the bookkeeper or tiered-storage
> > >    - The PendingDeleteLedgerInfo param is not incorrect, miss some
> > >    essential property
> > >
> > >
> > Maybe specify for each your resolution:
> > * For ledger ID that still exists in the metadata store - don't retry,
> ack
> > the message because ...
> > * ...
> Correct it.
>
> >
> > > Add new description API in pulsarAdmin.BrokerStats for getting it.
> >
> > What?
> Could you describe it detailed?
>

What is a description API in BrokerStats?


>
> >
> > So LedgerDeletionService.asyncDelete*() calls
> > ManagedLedger.asyncDeleteLedger()?
> No, it invokes bookkeeper.asyncDeleteLedger or offloader.deleteOffloaded
>
> >
> > > Consumer use pulsar.admin().topics().asyncDeleteLedger() to send
> request
> > > to the broker
>
> >  So, I presume we don't really want users to have that ability, right?
> > It should be "private" for the users.
>
> > But once we expose it via AdminAPI, it's out there, ready to be used by
> > anyone.
> Yes. So now, we check if the bookkeeper metadata is whether matches the
> param, to avoid mistake operations.
>
>
> > 1. Perhaps we should introduce a new concept of private AdminAPI which
> can
> > only be consumed internally by other brokers.
> > 2. What's the needed authorization to make such a delete command of a
> > ledger?
> Sounds good, we can draft a new pip to discuss it.
>

Agree, but I wouldn't continue with this PIP before you have that private
admin API.

Penghui suggested instead of communicating via Admin API we use the same
methodology used in the new load balancer - using a compacted topic as a
queue:
The system topic you create will have compaction enabled.
You write the pending ledger delete, and the key is ledger-ID and
ledger-type (BK, offloaded).
All brokers will consume this compacted topic as table-view. Basically upon
restart, you read all messages into memory.
When you reading you only keep the messages related to topic the broker
owns, others you discard (table view have filter feature).
For each message in table-view, you delete the ledgers, then write a
message to the topic with same key, null message, meaning it will be
deleted.

This way you don't need to invent a new private Admin API.

WDYT?

Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic

Reply via email to