Thanks for the KIP. I have a few high-level comments:

1. Like Tom, I'm not convinced about the proposal to make this change to
MirrorMaker 1 if we intend to deprecate and remove it. I would rather we
focus our efforts on the implementation we intend to support going forward.
2. The producer/consumer configs seem pretty dangerous for general usage,
but the KIP doesn't address the potential downsides.
3. How does the ProduceRequest change impact exactly-once (if at all)? The
change we are reverting was done as part of KIP-98. Have we considered the
original reasons for the change?

Thanks,
Ismael

On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <vahid.hashem...@gmail.com>
wrote:

> Retitled the thread to conform to the common format.
>
> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ning2008w...@gmail.com> wrote:
>
> > Hello Henry,
> >
> > This is a very interesting proposal.
> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects a similar
> > concern about recompressing data in MirrorMaker.
> >
> > One thing that may need clarifying: how is "shallow" mirroring restricted
> > to the MirrorMaker use case if the changes need to be made in the generic
> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
> > `send.raw.bytes` to the consumer and producer configs)?
> >
> > On 2021/02/05 00:59:57, Henry Cai <h...@pinterest.com.INVALID> wrote:
> > > Dear Community members,
> > >
> > > > We are proposing a new feature to improve the performance of Kafka
> > > > MirrorMaker:
> > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> > >
> > > > The current Kafka MirrorMaker process (with the underlying Consumer and
> > > > Producer libraries) uses significant CPU cycles and memory to
> > > > decompress/recompress and deserialize/re-serialize messages, and copies
> > > > message bytes multiple times along the mirroring/replication stages.
> > >
> > > > The KIP proposes a *shallow mirror* feature which brings back the
> > > > shallow iterator concept to the mirror process and also proposes to
> > > > skip the unnecessary message decompression and recompression steps. We
> > > > argue that in many cases users just want a simple replication pipeline
> > > > that replicates messages as-is from the source cluster to the
> > > > destination cluster. In many cases the messages in the source cluster
> > > > are already compressed and properly batched, and users just need an
> > > > identical copy of the message bytes through the mirroring, without any
> > > > transformation or repartitioning.
> > >
> > > > We have a prototype implementation in-house with MirrorMaker v1 and
> > > > observed that *CPU usage dropped from 50% to 15%* for some mirror
> > > > pipelines.
> > >
> > > > We name this feature *shallow mirroring* since it has some resemblance
> > > > to the old Kafka 0.7 namesake feature, but the implementations are not
> > > > quite the same. ‘*Shallow*’ means: 1. we *shallowly* iterate
> > > > RecordBatches inside the MemoryRecords structure instead of deep
> > > > iterating records inside each RecordBatch; 2. we *shallowly* copy
> > > > (share) pointers inside the ByteBuffer instead of deep copying and
> > > > deserializing bytes into objects.
> > >
> > > > Please share discussion and feedback on this email thread.
> > >
> >
>
>
> --
>
> Thanks!
> --Vahid
>
