Hi Zixuan,

My main concern is whether this feature will break backward compatibility. How would it be used with historical data or on old clusters?
--
Thanks
Xiaolong Ran

Zixuan Liu <node...@gmail.com> wrote on Mon, Mar 7, 2022 at 15:14:

> Hi everyone,
>
> Good catch! I have updated my proposal at
> https://github.com/apache/pulsar/issues/14529, and a compatibility
> section has been appended:
>
> 1. Compression is disabled by default.
> 2. We need to consider how to migrate the old data once compression
> has been enabled. If the cursor data header is in the compressed format,
> we will parse the bytes using the compressed format; otherwise we will
> parse the cursor data directly in the original way.
>
> Zixuan Liu <node...@gmail.com> wrote on Mon, Mar 7, 2022 at 15:11:
>
> > Hi PengHui,
> >
> > Sorry, the correct URL is https://github.com/apache/pulsar/issues/14529.
> >
> > :( Because of a subscription problem, the emails here are very
> > confusing.
> >
> >
> > PengHui Li <peng...@apache.org> wrote on Mon, Mar 7, 2022 at 12:39:
> >
> >> Hi Zixuan,
> >>
> >> Looks like you have added the wrong link for the proposal?
> >> https://github.com/apache/pulsar/issues/14395 is for PIP-44
> >>
> >> Penghui
> >>
> >> On Mon, Mar 7, 2022 at 12:37 PM PengHui Li <peng...@apache.org> wrote:
> >>
> >> > > This is a global setting now. But I wonder if we should compress it
> >> > > only if the size is over a threshold?
> >> >
> >> > +1
> >> >
> >> > Penghui
> >> >
> >> > On Sun, Mar 6, 2022 at 6:57 PM Enrico Olivelli <eolive...@gmail.com>
> >> > wrote:
> >> >
> >> >> On Sun, Mar 6, 2022 at 05:04 Haiting Jiang <jianghait...@apache.org>
> >> >> wrote:
> >> >>
> >> >> > This is a global setting now. But I wonder if we should compress it
> >> >> > only if the size is over a threshold?
> >> >>
> >> >>
> >> >> Good idea
> >> >>
> >> >> Enrico
> >> >>
> >> >>
> >> >> > Because:
> >> >> > 1. It's not easy for us to notice in advance that some managed
> >> >> > cursor info is too large; normally it is found only once it has an
> >> >> > actual impact. But if we enable this compression in advance, it
> >> >> > will take some extra computing resources.
> >> >> > 2. It seems that it won't be a common case that the managed cursor
> >> >> > info is too large (only if there are a lot of
> >> >> > individualDeletedMessages and batchedEntryDeletionIndexInfo). So it
> >> >> > is not really necessary to compress all managed cursor info.
> >> >> >
> >> >> > Regards,
> >> >> > Haiting
> >> >> >
> >> >> >
> >> >> > On 2022/03/02 04:41:16 Zixuan Liu wrote:
> >> >> > > Hi Pulsar Community,
> >> >> > >
> >> >> > >
> >> >> > > I have created a proposal to support ManagedCursorInfo compression.
> >> >> > >
> >> >> > > The proposal can be found at:
> >> >> > > https://github.com/apache/pulsar/issues/14395
> >> >> > >
> >> >> > >
> >> >> > > Motivation
> >> >> > >
> >> >> > > The cursor data is managed by the ZooKeeper/etcd metadata store.
> >> >> > > As cursor data accumulates, its size increases and pulling the
> >> >> > > data takes a lot of time. Therefore, it is necessary to add
> >> >> > > compression for the cursor, which can reduce both the size of the
> >> >> > > data and the time needed to pull it.
> >> >> > >
> >> >> > > Goal
> >> >> > >
> >> >> > > Support using LZ4/ZLIB/ZSTD/SNAPPY to compress the
> >> >> > > ManagedCursorInfo.
> >> >> > >
> >> >> > > Implementation
> >> >> > >
> >> >> > >    - Cursor compression format
> >> >> > >    [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] +
> >> >> > >    [MANAGED_CURSOR_INFO_PAYLOAD]
> >> >> > >
> >> >> > >    - MAGIC_NUMBER
> >> >> > >    0x4779
> >> >> > >
> >> >> > >    - METADATA
> >> >> > >    Add a message named ManagedCursorInfoMetadata to
> >> >> > >    MLDataFormats.proto:
> >> >> > >    message ManagedCursorInfoMetadata {
> >> >> > >        required CompressionType compressionType = 1;
> >> >> > >        required int32 uncompressedSize = 2;
> >> >> > >    }
> >> >> > >
> >> >> > > These compression types are already supported, so we only need to
> >> >> > > deal with compression and decompression of the ManagedCursorInfo
> >> >> > > data:
> >> >> > >
> >> >> > >    - Get CursorInfo from the metadata store
> >> >> > >    We will check the cursor data header. If it is compressed, we
> >> >> > >    will parse the bytes using the compressed format, otherwise in
> >> >> > >    the original way.
> >> >> > >
> >> >> > >    - Add/Update CursorInfo to the metadata store
> >> >> > >    Compression is used by default if the compression type is
> >> >> > >    specified.
> >> >> > >
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Zixuan
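
To make the compatibility question concrete, the header check described in the proposal (parse as the compressed format when the magic number is present, otherwise parse the original way) could be sketched roughly as below. This is only an illustration under several assumptions, not the Pulsar implementation: the 2-byte magic and 4-byte size widths, the big-endian byte order, the fixed-size struct standing in for the protobuf ManagedCursorInfoMetadata, the numeric compression codes, the function names, and the optional threshold parameter (taken from the idea raised in this thread) are all hypothetical. Only the magic value 0x4779 and the field order come from the proposal.

```python
import struct
import zlib

MAGIC_NUMBER = 0x4779   # value from the proposal
COMPRESSION_NONE = 0    # hypothetical type codes, for this sketch only
COMPRESSION_ZLIB = 1

_MAGIC = struct.Struct(">H")  # assumed width: 2-byte big-endian magic number
_SIZE = struct.Struct(">I")   # assumed width: 4-byte big-endian metadata size
# Stand-in for the protobuf ManagedCursorInfoMetadata message:
# compression type (1 byte) followed by uncompressed size (4 bytes).
_META = struct.Struct(">BI")


def encode_cursor_info(raw: bytes, compression: int = COMPRESSION_NONE,
                       threshold: int = 0) -> bytes:
    """Return bytes to store: legacy layout unless compression applies."""
    # Compression is off by default; the hypothetical threshold skips
    # payloads that are too small to be worth compressing.
    if compression == COMPRESSION_NONE or len(raw) < threshold:
        return raw
    if compression != COMPRESSION_ZLIB:
        raise ValueError("this sketch only implements zlib")
    metadata = _META.pack(compression, len(raw))
    # [MAGIC_NUMBER] + [METADATA_SIZE] + [METADATA_PAYLOAD] + [PAYLOAD]
    return b"".join([
        _MAGIC.pack(MAGIC_NUMBER),
        _SIZE.pack(len(metadata)),
        metadata,
        zlib.compress(raw),
    ])


def decode_cursor_info(data: bytes) -> bytes:
    """Return the uncompressed cursor bytes for old and new layouts."""
    # No magic number in the header: data written before the feature
    # existed, or with compression disabled. Hand it back untouched.
    if len(data) < _MAGIC.size or _MAGIC.unpack_from(data)[0] != MAGIC_NUMBER:
        return data
    (meta_size,) = _SIZE.unpack_from(data, _MAGIC.size)
    meta_start = _MAGIC.size + _SIZE.size
    compression, uncompressed_size = _META.unpack_from(data, meta_start)
    if compression != COMPRESSION_ZLIB:
        raise ValueError("this sketch only implements zlib")
    raw = zlib.decompress(data[meta_start + meta_size:])
    if len(raw) != uncompressed_size:
        raise ValueError("uncompressed size does not match metadata")
    return raw
```

With this shape, data written by an old broker passes through decode_cursor_info unchanged, and a new broker with compression disabled keeps writing the legacy layout. What the sketch does not cover is an old broker reading data that a newer broker has already compressed, which is the old-cluster case asked about above.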