Hi Zixuan:

My main concern is whether this feature will break backward
compatibility. How will it work with historical data or old clusters?

--
Thanks
Xiaolong Ran

Zixuan Liu <node...@gmail.com> wrote on Mon, Mar 7, 2022 at 15:14:

> Hi everyone,
>
> Good catch! I have updated my proposal at
> https://github.com/apache/pulsar/issues/14529 and added a compatibility
> section:
>
> 1. Compression is disabled by default.
> 2. We need to consider how to migrate the old data once compression has
> been enabled. If the cursor data header is in the compressed format, we
> parse the bytes using the compressed format; otherwise we parse the
> cursor data directly in the original way.
>
> Zixuan Liu <node...@gmail.com> wrote on Mon, Mar 7, 2022 at 15:11:
>
> > Hi PengHui,
> >
> > Sorry, the correct URL: https://github.com/apache/pulsar/issues/14529.
> >
> > :( Because of a subscription problem, the emails in this thread are
> > very confusing.
> >
> >
> > PengHui Li <peng...@apache.org> wrote on Mon, Mar 7, 2022 at 12:39:
> >
> >> Hi Zixuan,
> >>
> >> Looks like you have added the wrong link for the proposal?
> >> https://github.com/apache/pulsar/issues/14395 is for PIP-44
> >>
> >> Penghui
> >>
> >> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li <peng...@apache.org> wrote:
> >>
> >> > > This is a global setting now. But I wonder if we should compress it
> >> > > only if the size is over a threshold?
> >> >
> >> > +1
> >> >
> >> > Penghui
> >> >
> >> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli <eolive...@gmail.com>
> >> > wrote:
> >> >
> >> >> On Sun, Mar 6, 2022 at 05:04, Haiting Jiang <jianghait...@apache.org>
> >> >> wrote:
> >> >>
> >> >> > This is a global setting now. But I wonder if we should compress it
> >> >> > only if the size is over a threshold?
> >> >>
> >> >>
> >> >> Good idea
> >> >>
> >> >> Enrico
> >> >>
> >> >>
> >> >> > Because:
> >> >> > 1. It's not easy for us to notice in advance that some managed cursor
> >> >> > info is too large; normally it is found only once it has an actual
> >> >> > impact. But if we enable this compression in advance, it will take
> >> >> > some extra computing resources.
> >> >> > 2. It doesn't seem to be a common case that the managed cursor info is
> >> >> > too large (only if there are a lot of individualDeletedMessages and
> >> >> > batchedEntryDeletionIndexInfo). So it is not really necessary to
> >> >> > compress all managed cursor info.
> >> >> >
> >> >> > Regards,
> >> >> > Haiting
> >> >> >
> >> >> >
> >> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> >> >> > > Hi Pulsar Community,
> >> >> > >
> >> >> > >
> >> >> > > I created a proposal to support ManagedCursorInfo compression.
> >> >> > >
> >> >> > > The proposal can be found at:
> >> >> > > https://github.com/apache/pulsar/issues/14395
> >> >> > >
> >> >> > >
> >> >> > > Motivation
> >> >> > >
> >> >> > > Cursor data is managed by the ZooKeeper/etcd metadata store. As the
> >> >> > > amount of cursor data grows, its size increases and it takes a long
> >> >> > > time to pull the data. Therefore, it is necessary to add compression
> >> >> > > for the cursor, which reduces both the size of the data and the time
> >> >> > > needed to pull it.
> >> >> > >
> >> >> > > Goal
> >> >> > >
> >> >> > > Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the ManagedCursorInfo.
> >> >> > >
> >> >> > > Implementation
> >> >> > >
> >> >> > >    - Cursor compression format
> >> >> > >    [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >> >> > >    [MANAGED_CURSOR_INFO_PAYLOAD]
> >> >> > >
> >> >> > >    - MAGIC_NUMBER
> >> >> > >    0x4779
> >> >> > >
> >> >> > >    - METADATA
> >> >> > >    Add a message named ManagedCursorInfoMetadata to MLDataFormats.proto:
> >> >> > >    message ManagedCursorInfoMetadata {
> >> >> > >       required CompressionType compressionType = 1;
> >> >> > >       required int32 uncompressedSize = 2;
> >> >> > >    }
> >> >> > >
> >> >> > > These compression types are already supported, so we only need to
> >> >> > > handle compression and decompression of the ManagedCursorInfo data:
> >> >> > >
> >> >> > >    - Get CursorInfo from the metadata store
> >> >> > >    We check the cursor data header. If it is compressed, we parse the
> >> >> > >    bytes using the compressed format; otherwise we parse them in the
> >> >> > >    original way.
> >> >> > >
> >> >> > >    - Add/Update CursorInfo to the metadata store
> >> >> > >    Compression is used by default if a compression type is specified.
> >> >> > >
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Zixuan
> >> >> > >
> >> >> >
> >> >>
> >> >
> >>
> >
>
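
The header check discussed in this thread (parse as compressed when the
data starts with the magic number, otherwise fall back to the original
format) and the size-threshold suggestion can be sketched roughly as
follows. This is only an illustration in Python, not the Pulsar
implementation: the function names, the 2-byte magic and 4-byte size field
widths, the fixed-width metadata encoding (the proposal uses a protobuf
message), the threshold parameter, and the use of ZLIB alone are all
assumptions.

```python
import struct
import zlib

MAGIC_NUMBER = 0x4779  # magic number given in the proposal

# Hypothetical stand-ins for the CompressionType enum; only ZLIB is wired up.
COMPRESSION_NONE = 0
COMPRESSION_ZLIB = 1

# Assumed layout: 2-byte magic + 4-byte metadata size, big-endian.
HEADER = struct.Struct(">HI")
# Assumed metadata encoding: compressionType (int32) + uncompressedSize (int32).
# The proposal uses a protobuf message here instead.
METADATA = struct.Struct(">ii")


def encode_cursor_info(payload: bytes,
                       compression: int = COMPRESSION_NONE,
                       threshold: int = 0) -> bytes:
    """Return the bytes to store for a serialized cursor info payload."""
    # Compression disabled (the default) or payload below the threshold:
    # store the bytes in the original format, with no header at all.
    if compression != COMPRESSION_ZLIB or len(payload) < threshold:
        return payload
    metadata = METADATA.pack(compression, len(payload))
    header = HEADER.pack(MAGIC_NUMBER, len(metadata))
    return header + metadata + zlib.compress(payload)


def decode_cursor_info(data: bytes) -> bytes:
    """Return the serialized cursor info, decompressing it if needed."""
    if len(data) >= HEADER.size:
        magic, metadata_size = HEADER.unpack_from(data, 0)
        if magic == MAGIC_NUMBER:
            compression, uncompressed_size = METADATA.unpack_from(
                data, HEADER.size)
            body = data[HEADER.size + metadata_size:]
            if compression != COMPRESSION_ZLIB:
                raise ValueError("unsupported compression type in this sketch")
            payload = zlib.decompress(body)
            if len(payload) != uncompressed_size:
                raise ValueError("uncompressed size does not match metadata")
            return payload
    # No magic number: old, uncompressed data; return it unchanged.
    return data
```

Note that the fallback relies on legacy payloads never starting with the
magic bytes, which a real implementation would need to confirm.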
