Mathias,

What's the ack mode you used in the producer? Could you share the command
you used to run kafka-producer-perf-test.sh?

Thanks,

Jun

On Thu, Feb 12, 2015 at 1:17 PM, Mathias Söderberg <
mathias.soederb...@gmail.com> wrote:

> Jun,
>
> Pardon the radio silence. I booted up a new broker, created a topic with
> three (3) partitions and replication factor one (1) and used the
> *kafka-producer-perf-test.sh* script to generate load (using messages of
> roughly the same size as ours).
> There was a slight increase in CPU usage (~5-10%) on 0.8.2.0-rc2 compared
> to 0.8.1.1, but that was about it.
>
> I upgraded our staging cluster to 0.8.2.0 earlier this week or so, and had
> to add an additional broker due to increased load after the upgrade (note
> that the incoming load on the cluster has been pretty much consistent).
> Since the upgrade we've been seeing a 2-3x increase in latency as well.
> I'm considering downgrading to 0.8.1.1 again to see if it resolves our
> issues.
>
> Best regards,
> Mathias
>
> On Tue Feb 03 2015 at 6:44:36 PM Jun Rao <j...@confluent.io> wrote:
>
> > Mathias,
> >
> > The new hprof doesn't reveal anything new to me. We did fix the logic in
> > using Purgatory in 0.8.2, which could potentially drive up the CPU usage
> > a bit. To verify that, could you do your test on a single broker (with
> > replication factor 1) between 0.8.1 and 0.8.2 and see if there is any
> > significant difference in CPU usage?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Feb 3, 2015 at 5:09 AM, Mathias Söderberg <
> > mathias.soederb...@gmail.com> wrote:
> >
> > > Jun,
> > >
> > > I re-ran the hprof test, for about 30 minutes again, for 0.8.2.0-rc2
> > > with the same version of snappy that 0.8.1.1 used. Attached the logs.
> > > Unfortunately there wasn't any improvement as the node running
> > > 0.8.2.0-rc2 still had a higher load and CPU usage.
> > >
> > > Best regards,
> > > Mathias
> > >
> > > On Tue Feb 03 2015 at 4:40:31 AM Jaikiran Pai <jai.forums2...@gmail.com>
> > > wrote:
> > >
> > >> On Monday 02 February 2015 11:03 PM, Jun Rao wrote:
> > >> > Jaikiran,
> > >> >
> > >> > The fix you provided is probably unnecessary. The channel that we use in
> > >> > SimpleConsumer (BlockingChannel) is configured to be blocking. So even
> > >> > though the read from the socket is in a loop, each read blocks if there
> > >> > are no bytes received from the broker. So, that shouldn't cause extra CPU
> > >> > consumption.
> > >> Hi Jun,
> > >>
> > >> Of course, you are right! I forgot that while reading the thread dump in
> > >> hprof output, one has to be aware that the thread state isn't shown and
> > >> the thread need not necessarily be doing any CPU activity.
> > >>
> > >> -Jaikiran
> > >>
> > >>
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jun
> > >> >
> > >> > On Mon, Jan 26, 2015 at 10:05 AM, Mathias Söderberg <
> > >> > mathias.soederb...@gmail.com> wrote:
> > >> >
> > >> >> Hi Neha,
> > >> >>
> > >> >> I sent an e-mail earlier today, but noticed now that it didn't actually go
> > >> >> through.
> > >> >>
> > >> >> Anyhow, I've attached two files, one with output from a 10 minute run and
> > >> >> one with output from a 30 minute run. Realized that maybe I should've done
> > >> >> one or two runs with 0.8.1.1 as well, but nevertheless.
> > >> >>
> > >> >> I upgraded our staging cluster to 0.8.2.0-rc2, and I'm seeing the same CPU
> > >> >> usage as with the beta version (basically pegging all cores). If I manage
> > >> >> to find the time I'll do another run with hprof on the rc2 version later
> > >> >> today.
> > >> >>
> > >> >> Best regards,
> > >> >> Mathias
> > >> >>
> > >> >> On Tue Dec 09 2014 at 10:08:21 PM Neha Narkhede <n...@confluent.io> wrote:
> > >> >>
> > >> >>> The following should be sufficient:
> > >> >>>
> > >> >>> java -agentlib:hprof=cpu=samples,depth=100,interval=20,lineno=y,thread=y,file=kafka.hprof <classname>
> > >> >>>
> > >> >>> You would need to start the Kafka server with the settings above and run
> > >> >>> it for some time, until you observe the problem.
> > >> >>>
> > >> >>> On Tue, Dec 9, 2014 at 3:47 AM, Mathias Söderberg <
> > >> >>> mathias.soederb...@gmail.com> wrote:
> > >> >>>
> > >> >>>> Hi Neha,
> > >> >>>>
> > >> >>>> Yeah sure. I'm not familiar with hprof, so any particular options I should
> > >> >>>> include or just run with defaults?
> > >> >>>>
> > >> >>>> Best regards,
> > >> >>>> Mathias
> > >> >>>>
> > >> >>>> On Mon Dec 08 2014 at 7:41:32 PM Neha Narkhede <n...@confluent.io> wrote:
> > >> >>>>> Thanks for reporting the issue. Would you mind running hprof and sending
> > >> >>>>> the output?
> > >> >>>>>
> > >> >>>>> On Mon, Dec 8, 2014 at 1:25 AM, Mathias Söderberg <
> > >> >>>>> mathias.soederb...@gmail.com> wrote:
> > >> >>>>>
> > >> >>>>>> Good day,
> > >> >>>>>>
> > >> >>>>>> I upgraded a Kafka cluster from v0.8.1.1 to v0.8.2-beta and noticed that
> > >> >>>>>> the CPU usage on the broker machines went up by roughly 40 percentage
> > >> >>>>>> points, from ~60% to ~100%, and am wondering if anyone else has
> > >> >>>>>> experienced something similar? The load average also went up by 2x-3x.
> > >> >>>>>>
> > >> >>>>>> We're running on EC2 and the cluster currently consists of four m1.xlarge,
> > >> >>>>>> with roughly 1100 topics / 4000 partitions. Using Java 7 (1.7.0_65 to be
> > >> >>>>>> exact) and Scala 2.9.2. Configurations can be found over here:
> > >> >>>>>> https://gist.github.com/mthssdrbrg/7df34a795e07eef10262.
> > >> >>>>>>
> > >> >>>>>> I'm assuming that this is not expected behaviour for 0.8.2-beta?
> > >> >>>>>>
> > >> >>>>>> Best regards,
> > >> >>>>>> Mathias
> > >> >>>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>> --
> > >> >>>>> Thanks,
> > >> >>>>> Neha
> > >> >>>>>
> > >> >>>
> > >> >>>
> > >> >>> --
> > >> >>> Thanks,
> > >> >>> Neha
> > >> >>>
> > >>
> > >>
> >
>
