Retitled the thread to conform to the common format.

On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ning2008w...@gmail.com> wrote:

> Hello Henry,
>
> This is a very interesting proposal.
> https://issues.apache.org/jira/browse/KAFKA-10728 reflects a similar
> concern about re-compressing data in MirrorMaker.
>
> One thing that probably needs clarifying: how would "shallow" mirroring be
> limited to the MirrorMaker use case if the changes need to be made in the
> generic consumer and producer (e.g. by adding `fetch.raw.bytes` and
> `send.raw.bytes` to the consumer and producer configs)?
>
> On 2021/02/05 00:59:57, Henry Cai <h...@pinterest.com.INVALID> wrote:
> > Dear Community members,
> >
> > We are proposing a new feature to improve the performance of Kafka mirror
> > maker:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> >
> > The current Kafka MirrorMaker process (with the underlying Consumer and
> > Producer library) uses significant CPU cycles and memory to
> > decompress/recompress and deserialize/re-serialize messages, and to copy
> > message bytes multiple times along the mirroring/replication stages.
> >
> > The KIP proposes a *shallow mirror* feature which brings back the shallow
> > iterator concept to the mirror process and also proposes to skip the
> > unnecessary message decompression and recompression steps.  We argue that in
> > many cases users just want a simple replication pipeline that replicates
> > messages as-is from the source cluster to the destination cluster.  In
> > many cases the messages in the source cluster are already compressed and
> > properly batched; users just need an identical copy of the message bytes
> > through the mirroring, without any transformation or repartitioning.
> >
> > We have a prototype implementation in house with MirrorMaker v1 and
> > observed *CPU usage dropped from 50% to 15%* for some mirror pipelines.
> >
> > We name this feature *shallow mirroring* since it has some resemblance to
> > the old Kafka 0.7 namesake feature, but the implementations are not quite
> > the same.  ‘*Shallow*’ means: 1. we *shallowly* iterate RecordBatches inside
> > the MemoryRecords structure instead of deep iterating records inside
> > RecordBatch; 2. we *shallowly* copy (share) pointers inside ByteBuffer
> > instead of deep copying and deserializing bytes into objects.
> >
> > Please share discussion/feedback on this email thread.
> >
>


-- 

Thanks!
--Vahid
