You can try reducing batch.num.messages first and see whether throughput
is affected.
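
For illustration, a minimal sketch of the knobs involved (the broker list
and the value 50 are placeholders, not recommendations; with MirrorMaker the
same keys go into the properties file you pass via --producer.config):

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.ProducerConfig;

    Properties props = new Properties();
    props.put("metadata.broker.list", "target-broker:9092"); // placeholder
    props.put("producer.type", "async");
    props.put("compression.codec", "gzip");
    // Default is 200; a smaller batch keeps the compressed wrapper
    // message under the broker's limit, possibly at some throughput cost.
    props.put("batch.num.messages", "50");
    Producer<byte[], byte[]> producer =
        new Producer<byte[], byte[]>(new ProducerConfig(props));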

For your second question, we do not have a good solution, since the
offsets are not consistent across data centers, as you said. One approach we
have used is to run consumers in both data centers, but have the slave data
center's consumers consume without processing (a no-op); once the master
data center is completely gone, turn the knob so that the slave data
center's consumers both consume and process. This still does not give you
consistency, since you may see duplicates or lose data, but it gives you a
resumption point relatively close to where the previous consumers left off.
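
As a sketch of that pattern (all names here are made up for illustration;
this is not an existing tool), using the 0.8 high-level consumer:

    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.atomic.AtomicBoolean;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class GatedConsumer {
        // The "knob": flipped to true when the master DC is declared dead.
        static final AtomicBoolean processingEnabled = new AtomicBoolean(false);

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "local-zk:2181"); // this DC's ZK
            props.put("group.id", "my-app");
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            KafkaStream<byte[], byte[]> stream = connector
                .createMessageStreams(Collections.singletonMap("benchmark1", 1))
                .get("benchmark1").get(0);

            for (ConsumerIterator<byte[], byte[]> it = stream.iterator(); it.hasNext(); ) {
                byte[] msg = it.next().message();
                // Always consume, so offsets keep advancing in this DC's
                // ZooKeeper; only process once the knob has been turned.
                if (processingEnabled.get()) {
                    process(msg);
                }
            }
        }

        static void process(byte[] msg) { /* application logic goes here */ }
    }

How the knob gets flipped (a config watch, an admin endpoint, etc.) is up
to you; the point is that the slave's offsets stay roughly current, so
failover resumes close to where the master's consumers stopped.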

Guozhang


On Mon, Jun 9, 2014 at 2:08 PM, Kane Kane <kane.ist...@gmail.com> wrote:

> What would you recommend in this case? Bump up max.message.bytes, use
> a sync producer, or lower batch.num.messages? Also, does it matter on
> which side MirrorMaker is located, the source or the target?
>
> And another related question: as I understand it, offsets might
> potentially differ between source and target, and if the whole datacenter
> is lost, you probably lose the offsets in ZooKeeper as well. What is your
> strategy for switching producers and consumers over to the new cluster,
> given that the consumer should pick up from the same point where it
> dropped off in the other datacenter?
>
> Thanks.
>
> On Sat, Jun 7, 2014 at 3:49 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > In the old producer yes; in the new producer (available in 0.8.1.1) the
> > batch size is in bytes instead of number of messages, which gives you
> > better control.
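> >
> > For example (a sketch against the new producer API; config key names as
> > in the released clients, so adjust for your exact version, and the 64KB
> > value is just illustrative):
> >
> >     import java.util.Properties;
> >     import org.apache.kafka.clients.producer.KafkaProducer;
> >
> >     Properties props = new Properties();
> >     props.put("bootstrap.servers", "target-broker:9092"); // placeholder
> >     props.put("key.serializer",
> >         "org.apache.kafka.common.serialization.ByteArraySerializer");
> >     props.put("value.serializer",
> >         "org.apache.kafka.common.serialization.ByteArraySerializer");
> >     // batch.size is in bytes, so the batch is capped directly:
> >     props.put("batch.size", "65536");
> >     props.put("compression.type", "gzip");
> >     KafkaProducer<byte[], byte[]> producer =
> >         new KafkaProducer<byte[], byte[]>(props);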
> >
> > Guozhang
> >
> >
> > On Sat, Jun 7, 2014 at 2:48 PM, Kane Kane <kane.ist...@gmail.com> wrote:
> >
> >> Ah, that makes sense then. As I understand it, right now there is
> >> nothing like queue.buffering.max.size to control the batch size?
> >>
> >> On Sat, Jun 7, 2014 at 2:32 PM, Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >> > With compression, the batch of messages is compressed into a "single"
> >> > wrapper message and sent to the broker, and the broker will reject
> >> > the request if this single message's size is larger than 1MB. So you
> >> > need to either raise the max message size on the broker or reduce
> >> > your producer batch size.
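> >> >
> >> > For example, to raise the limit on the broker side (server.properties;
> >> > the 2MB value is just illustrative, and replica.fetch.max.bytes must
> >> > be at least as large so replication can still copy the bigger wrapper
> >> > messages):
> >> >
> >> >     message.max.bytes=2000000
> >> >     replica.fetch.max.bytes=2000000
> >> >
> >> > The alternative is to shrink batch.num.messages on the producer until
> >> > the compressed wrapper stays under the limit.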
> >> >
> >> > Guozhang
> >> >
> >> >
> >> > On Sat, Jun 7, 2014 at 2:09 PM, Kane Kane <kane.ist...@gmail.com>
> >> > wrote:
> >> >
> >> >> Yes, the messages were compressed with gzip, and I've enabled the
> >> >> same compression in the MirrorMaker producer.
> >> >>
> >> >> On Sat, Jun 7, 2014 at 12:56 PM, Guozhang Wang <wangg...@gmail.com>
> >> >> wrote:
> >> >> > Kane,
> >> >> >
> >> >> > Did you use any compression method?
> >> >> >
> >> >> > Guozhang
> >> >> >
> >> >> >
> >> >> > On Fri, Jun 6, 2014 at 2:15 PM, Kane Kane <kane.ist...@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> I've tried to run the MirrorMaker tool in async mode and I get:
> >> >> >> WARN Produce request with correlation id 263 failed due to
> >> >> >> [benchmark1,30]: kafka.common.MessageSizeTooLargeException
> >> >> >> (kafka.producer.async.DefaultEventHandler)
> >> >> >>
> >> >> I don't get this error in sync mode. My message.max.bytes is the
> >> >> default (1000000). As I understand it, the difference between sync
> >> >> and async mode is that async collects batches before sending. Does
> >> >> that mean the whole batch should be less than message.max.bytes, or
> >> >> that any single message within the batch should be?
> >> >> >>
> >> >> >> Thanks.
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > -- Guozhang
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > -- Guozhang
> >>
> >
> >
> >
> > --
> > -- Guozhang
>



-- 
-- Guozhang
