Only enabled `AppendBrokerTimestampMetadataInterceptor` is not enough.

We should also improve the TTL:
When client publish time > Ledger create time + Ledger rollover time, 
and `brokerPublishTime` is not set,
we can make Ledger TTL time =  Ledger create time + Ledger rollover time.

This change can make sure entry expired when client clock is not right.

On 2024/01/17 04:31:16 PengHui Li wrote:
> > User disabling `AppendBrokerTimestampMetadataInterceptor` does not mean
> that they allow this bug.
> This is a configuration, not an API. It is difficult to use documentation
> to regulate user behavior.
> 
> Actually, they need to know why they want to disable `
> AppendBrokerTimestampMetadataInterceptor`.
> Otherwise, why do they want to disable `AppendBrokerTimestampMetadata
> Interceptor`?
> 
> Pulsar's default configuration tries to provide a best practice for most of
> the cases. To avoid the potential
> problem, Pulsar enables  `AppendBrokerTimestampMetadataInterceptor` by
> default. But it doesn't mean all the cases
> should enable `AppendBrokerTimestampMetadataInterceptor`. If users don't
> use TTL, the producers are
> well-maintained. Is there any reason to have at least 16 bytes append to
> each entry? The default configuration
> behavior change also needs to be highlighted in the release note. Users
> need to know if they don't disable the
> `AppendBrokerTimestampMetadataInterceptor` manually on a newer version, the
> retained data size will increase.
> 
> BTW, we need to explain each interceptor in the `broker.conf` and why `
> AppendBrokerTimestampMetadataInterceptor`
> is enabled by default. If users don't want to read it, change anything in
> `broker.conf` they don't really know what it is.
> How can they know what they expect?
> 
> Regards,
> Penghui
> 
> 
> On Mon, Jan 15, 2024 at 11:09 PM Lin Lin <lin...@apache.org> wrote:
> 
> > User disabling `AppendBrokerTimestampMetadataInterceptor` does not mean
> > that they allow this bug.
> > This is a configuration, not an API. It is difficult to use documentation
> > to regulate user behavior.
> >
> > Maybe we can add a new field (BrokerTimestamp) to save the timestamp on
> > the Broker side.
> > The time priority for trimming Ledger is as follows:
> >
> > BrokerPublishTime > BrokerTimestamp
> >
> > If `BrokerPublishTime` exists, `BrokerReceiveTime` is not set.
> > If not, we set `BrokerReceiveTime`  and is no longer affected by client
> > time.
> >
> > On 2024/01/15 02:15:17 PengHui Li wrote:
> > > IMO, we should enable `AppendBrokerTimestampMetadataInterceptor` by
> > default.
> > > Users can still have a way to disable it if they care about the
> > additional
> > > metadata stored
> > > in each entry.
> > >
> > > For the `hasBrokerPublishTime` API. The topic might also have historical
> > > data without
> > > broker publish time. So, it should be fine to keep this API because we
> > > don't know how
> > > long users will retain their data.
> > >
> > > Regards,
> > > Penghui
> > >
> > > On Sat, Jan 6, 2024 at 10:35 PM linlin <lin...@apache.org> wrote:
> > >
> > > > Now, if the message's metadata does not set a broker side timestamp,
> > the
> > > > ledger expiration check is based on the client's publish time.
> > > >
> > > > When the client machine's clock is incorrect (eg: set to 1 year later)
> > ,
> > > > the ledger can not be cleaned up. Issue
> > > > https://github.com/apache/pulsar/issues/21347
> > > >
> > > > `AppendBrokerTimestampMetadataInterceptor` can set timestamp for
> > messages
> > > > on the broker side, but we can not ensure that the
> > > > `AppendBrokerTimestampMetadataInterceptor` is always enable
> > > >
> > > > Therefore, I open this PR(https://github.com/apache/pulsar/pull/21835)
> > to
> > > > always set the broker timestamp for messages on the broker side.
> > > >
> > > > With this change , firstly we should deprecate
> > > > AppendBrokerTimestampMetadataInterceptor.
> > > > It no longer needs to exist
> > > >
> > > > Secondly, we should deprecate `hasBrokerPublishTime` in interface
> > Message.
> > > > It always returns true.
> > > > This API is created in PR (https://github.com/apache/pulsar/pull/11553
> > )
> > > > This PR is for the client to obtain BrokerPublishTime, so the
> > > > `hasBrokerPublishTime` API is not necessary.
> > > >
> > >
> >
> 

Reply via email to