Hmm, produce "msg/sec in rate" seems to be per broker and "produce msg/per
sec" should also be per broker and thus be related. The problem is that for
a time period the graphs indicated that 1) messages where only produced to
one broker 2) messages where produces to two brokers.

When I restarted the brokers everything looked normal again. I made no changes
to the parts that were collecting the metrics during this time. This is of
course anecdotal since I can't reproduce it, but at least the graphs support
the view that something is strange.

I agree that the values look OK for most ("all") of the time, but I suspect
that there might be issues here.

/svante

2015-01-17 0:19 GMT+01:00 Jun Rao <j...@confluent.io>:

> I did some quick tests and the mbean values look reasonable. On the
> producer side, produce msg/sec is actually for all brokers.
>
> Thanks,
>
> Jun
>
> On Fri, Jan 16, 2015 at 12:09 PM, svante karlsson <s...@csi.se> wrote:
>
> > Disregard the previous message, it was sent accidentally.
> >
> > Jun,
> >
> > I don't know whether it was an issue with graphite or the mbean, and I have
> > not seen it since - and we have tried several failover cases.
> >
> > That said, I have the feeling that it was a kafka issue and I'm a bit
> > suspicious about the new mbeans.
> >
> > I attach a screenshot from the grafana dashboard and if you look at the
> > first graph (top left) at ~10% it shows the startup after upgrade.
> >
> > This is a 3-node cluster with a topic of 2 partitions. When we start up a
> > single producer it produces messages to both partitions without message
> > loss. I know that all messages are acked.
> >
> >  If you look at the "produce msg/sec" graph it seems to hit 2 servers
> (it's
> > per broker) but "messages in rate" & "byte in rate",& "byte out rate"
> (all
> > from the new mbeans) look as if the data only hits one broker. (those are
> > also per broker)
> >
> > At 70% I restarted the two brokers one after the other. After that point
> > all three graphs look fine.
> >
> > I'm not at work now and can't dig into the graphite data, but I now see
> > that the "fetch follower" graph also looks strange.
> >
> > I can't file a bug report since I can't reproduce it, but I have a
> > distinct feeling that either I can't trust the new mbeans or I have to
> > find another explanation.
> >
> > Regard it as an observation, in case someone else reports similar issues.
> >
> >
> > thanks,
> >
> > svante
> >
> > 2015-01-16 20:56 GMT+01:00 svante karlsson <s...@csi.se>:
> >
> > > Jun,
> > >
> > > I don't know whether it was an issue with graphite or the mbean, but I
> > > have not seen it since - we have tried several failover cases and this
> > > problem has only been seen once.
> > >
> > > That said, I have the feeling that it was a kafka issue and I'm a bit
> > > suspicious about the new mbeans.
> > >
> > > I attach a screenshot from the grafana dashboard and if you look at the
> > > first graph (top left) at ~10% it shows the startup after upgrade.
> > >
> > > This is a 3-node cluster with a topic of 2 partitions. When we start up
> > > a single producer it produces messages to both partitions without
> > > message loss. I know that all messages are acked.
> > >
> > >  If you look at the "produce message msg/sec" graph it seems to hit 2
> > > servers (it's per broker)
> > >
> > >
> > > Bad picture but
> > >
> > > 2015-01-16 18:05 GMT+01:00 Jun Rao <j...@confluent.io>:
> > >
> > >> Svante,
> > >>
> > >> I tested this out locally and the mbeans for those metrics do show up
> > >> on startup. Can you reproduce the issue reliably? Also, is what you saw
> > >> an issue with the mbean itself or graphite?
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Fri, Jan 16, 2015 at 4:38 AM, svante karlsson <s...@csi.se> wrote:
> > >>
> > >> > I upgraded two small test clusters and I had two small issues, but
> > >> > I'm not clear yet whether those were caused by us using ansible to
> > >> > configure and deploy the clusters.
> > >> >
> > >> > The first issue could be us doing something wrong when distributing
> > >> > the update (I updated, did not reinstall), but it should be easy for
> > >> > you to rule out since it seems so trivial.
> > >> >
> > >> > We replace kafka-server-start.sh with something else, but we kept the
> > >> > line
> > >> >
> > >> > EXTRA_ARGS="-name kafkaServer -loggc"
> > >> >
> > >> > With it, kafka-run-class.sh exits without starting the VM and
> > >> > complains about unknown options (both -name and -loggc) - once we
> > >> > removed the EXTRA_ARGS everything starts.
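> > >> >
> > >> > For reference, here is a minimal sketch of the kind of leading-option
> > >> > handling the 0.8.2 kafka-run-class.sh is expected to do for these two
> > >> > flags. The flag names come from the shipped start script; the loop
> > >> > itself is illustrative, not the actual script:
> > >> >
> > >> > # consume the options kafka-server-start.sh passes via EXTRA_ARGS
> > >> > # before the main class name (illustrative sketch only)
> > >> > while [ $# -gt 0 ]; do
> > >> >   case "$1" in
> > >> >     -name)  DAEMON_NAME="$2"; shift 2 ;;    # names the log files
> > >> >     -loggc) GC_LOG_ENABLED="true"; shift ;; # enables GC logging
> > >> >     *) break ;;                             # first non-option arg
> > >> >   esac
> > >> > done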
> > >> >
> > >> > As I said - everyone should be seeing this issue if it was a real
> > >> > problem...
> > >> >
> > >> >
> > >> > The second thing is regarding the jmx beans. I reconfigured our
> > >> > graphite monitoring and noticed that the following metrics stopped
> > >> > working on one broker:
> > >> > - server.BrokerTopicMetrics.MessagesInPerSec.OneMinuteRate,
> > >> > - server.BrokerTopicMetrics.ByteInPerSec.OneMinuteRate,
> > >> > - server.BrokerTopicMetrics.ByteOutPerSec.OneMinuteRate
> > >> >
> > >> > I had graphs running and it looked like the traffic was dropping on
> > >> > those metrics, but our producers were working without problems and the
> > >> > metric network.RequestMetrics.Produce.RequestsPerSec.OneMinuteRate
> > >> > confirmed that on all brokers.
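> > >> >
> > >> > If anyone wants to check the mbean directly (bypassing graphite),
> > >> > Kafka's bundled JmxTool can poll it. A sketch, assuming the broker
> > >> > exposes JMX on port 9999 and assuming my reading of the new 0.8.2
> > >> > object-name scheme is right:
> > >> >
> > >> > # poll the per-broker message-in rate straight from JMX every 5s;
> > >> > # adjust host, port and object name to your setup
> > >> > bin/kafka-run-class.sh kafka.tools.JmxTool \
> > >> >   --jmx-url service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi \
> > >> >   --object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec' \
> > >> >   --reporting-interval 5000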
> > >> >
> > >> > A restart of the offending broker brought the metrics online again.
> > >> >
> > >> > /svante
> > >> >
> > >> > 2015-01-16 3:42 GMT+01:00 Gwen Shapira <gshap...@cloudera.com>:
> > >> >
> > >> > > It would make sense to enable it after we have the authorization
> > >> > > feature and admins can control who can delete what.
> > >> > >
> > >> > > On Thu, Jan 15, 2015 at 6:32 PM, Jun Rao <j...@confluent.io> wrote:
> > >> > > > Yes, I agree it's probably better not to enable
> > >> > > > "delete.topic.enable" by default.
> > >> > > >
> > >> > > > Thanks,
> > >> > > >
> > >> > > > Jun
> > >> > > >
> > >> > > > On Thu, Jan 15, 2015 at 6:29 PM, Joe Stein <joe.st...@stealth.ly>
> > >> > > > wrote:
> > >> > > >
> > >> > > >> I think that is a change of behavior that organizations may get
> > >> > > >> burned by. Right now there is no delete-data feature. If an
> > >> > > >> operations team upgrades to 0.8.2 and someone decides to delete a
> > >> > > >> topic, then there will be data loss. The organization may not
> > >> > > >> have wanted that to happen. I would argue not to have a way to
> > >> > > >> delete data "by default". There is something actionable about
> > >> > > >> consciously turning on a feature that allows anyone with access
> > >> > > >> to kafka-topics (or zookeeper, for that matter) to delete Kafka
> > >> > > >> data. If folks want that feature, then they can flip the switch
> > >> > > >> prior to the upgrade, or after it with a rolling restart, and
> > >> > > >> have at it. By not setting it as the default, they will know they
> > >> > > >> have to turn it on and figure out what they need to do from a
> > >> > > >> security perspective (until Kafka gives them that) to protect
> > >> > > >> their data (through network or other types of measures).
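> > >> > > >>
> > >> > > >> (For anyone who does want deletion on: a sketch of flipping the
> > >> > > >> switch, assuming the 0.8.2 property and tool names; the host and
> > >> > > >> topic names are made up.)
> > >> > > >>
> > >> > > >> # on every broker, enable deletion and then do a rolling restart
> > >> > > >> echo "delete.topic.enable=true" >> config/server.properties
> > >> > > >>
> > >> > > >> # afterwards anyone with access to kafka-topics can delete data:
> > >> > > >> bin/kafka-topics.sh --zookeeper zk1:2181 --delete --topic mytopic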
> > >> > > >>
> > >> > > >> On Thu, Jan 15, 2015 at 8:24 PM, Manikumar Reddy
> > >> > > >> <ku...@nmsworks.co.in> wrote:
> > >> > > >>
> > >> > > >> > Also, can we remove the "delete.topic.enable" config property
> > >> > > >> > and enable topic deletion by default?
> > >> > > >> > On Jan 15, 2015 10:07 PM, "Jun Rao" <j...@confluent.io> wrote:
> > >> > > >> >
> > >> > > >> > > Thanks for reporting this. I will remove that option in RC2.
> > >> > > >> > >
> > >> > > >> > > Jun
> > >> > > >> > >
> > >> > > >> > > On Thu, Jan 15, 2015 at 5:21 AM, Jaikiran Pai
> > >> > > >> > > <jai.forums2...@gmail.com> wrote:
> > >> > > >> > >
> > >> > > >> > > > I just downloaded the Kafka binary and am trying this on
> > >> > > >> > > > my 32-bit JVM (Java 7). Trying to start Zookeeper or the
> > >> > > >> > > > Kafka server keeps failing with "Unrecognized VM option
> > >> > > >> > > > 'UseCompressedOops'":
> > >> > > >> > > >
> > >> > > >> > > > ./zookeeper-server-start.sh ../config/zookeeper.properties
> > >> > > >> > > > Unrecognized VM option 'UseCompressedOops'
> > >> > > >> > > > Error: Could not create the Java Virtual Machine.
> > >> > > >> > > > Error: A fatal exception has occurred. Program will exit.
> > >> > > >> > > >
> > >> > > >> > > > Same with the Kafka server startup scripts. My Java
> > >> > > >> > > > version is:
> > >> > > >> > > >
> > >> > > >> > > > java version "1.7.0_71"
> > >> > > >> > > > Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
> > >> > > >> > > > Java HotSpot(TM) Server VM (build 24.71-b01, mixed mode)
> > >> > > >> > > >
> > >> > > >> > > > Should there be a check in the script before adding this
> > >> > > >> > > > option?
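> > >> > > >> > > >
> > >> > > >> > > > Something along these lines might work - just a sketch,
> > >> > > >> > > > assuming the scripts' KAFKA_JVM_PERFORMANCE_OPTS variable
> > >> > > >> > > > and detecting the data model from the java -version banner;
> > >> > > >> > > > not tested:
> > >> > > >> > > >
> > >> > > >> > > > # only add -XX:+UseCompressedOops on 64-bit JVMs, since
> > >> > > >> > > > # 32-bit VMs reject the option outright (illustrative only)
> > >> > > >> > > > if $JAVA -version 2>&1 | grep -q '64-Bit'; then
> > >> > > >> > > >   KAFKA_JVM_PERFORMANCE_OPTS="$KAFKA_JVM_PERFORMANCE_OPTS -XX:+UseCompressedOops"
> > >> > > >> > > > fi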
> > >> > > >> > > >
> > >> > > >> > > > -Jaikiran
> > >> > > >> > > >
> > >> > > >> > > > On Wednesday 14 January 2015 10:08 PM, Jun Rao wrote:
> > >> > > >> > > >
> > >> > > >> > > >> + users mailing list. It would be great if people can
> > >> > > >> > > >> test this out and report any blocker issues.
> > >> > > >> > > >>
> > >> > > >> > > >> Thanks,
> > >> > > >> > > >>
> > >> > > >> > > >> Jun
> > >> > > >> > > >>
> > >> > > >> > > >> On Tue, Jan 13, 2015 at 7:16 PM, Jun Rao <j...@confluent.io>
> > >> > > >> > > >> wrote:
> > >> > > >> > > >>
> > >> > > >> > > >>> This is the first candidate for release of Apache Kafka
> > >> > > >> > > >>> 0.8.2.0. There have been some changes since the 0.8.2
> > >> > > >> > > >>> beta release, especially in the new java producer api and
> > >> > > >> > > >>> jmx mbean names. It would be great if people can test
> > >> > > >> > > >>> this out thoroughly. We are giving people 10 days for
> > >> > > >> > > >>> testing and voting.
> > >> > > >> > > >>>
> > >> > > >> > > >>> Release Notes for the 0.8.2.0 release
> > >> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/RELEASE_NOTES.html
> > >> > > >> > > >>>
> > >> > > >> > > >>> *** Please download, test and vote by Friday, Jan 23rd, 7pm PT
> > >> > > >> > > >>>
> > >> > > >> > > >>> Kafka's KEYS file containing PGP keys we use to sign the release:
> > >> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/KEYS
> > >> > > >> > > >>> in addition to the md5, sha1 and sha2 (SHA256) checksums.
> > >> > > >> > > >>>
> > >> > > >> > > >>> * Release artifacts to be voted upon (source and binary):
> > >> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/
> > >> > > >> > > >>>
> > >> > > >> > > >>> * Maven artifacts to be voted upon prior to release:
> > >> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/maven_staging/
> > >> > > >> > > >>>
> > >> > > >> > > >>> * scala-doc
> > >> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/scaladoc/#package
> > >> > > >> > > >>>
> > >> > > >> > > >>> * java-doc
> > >> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/javadoc/
> > >> > > >> > > >>>
> > >> > > >> > > >>> * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2.0 tag
> > >> > > >> > > >>> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=b0c7d579f8aeb5750573008040a42b7377a651d5
> > >> > > >> > > >>> /*******************************************
> > >> > > >> > > >>>
> > >> > > >> > > >>> Thanks,
> > >> > > >> > > >>>
> > >> > > >> > > >>> Jun
> > >> > > >> > > >>>
> > >> > > >> > > >>>
> > >> > > >> > > >
> > >> > > >> > >
> > >> > > >> >
> > >> > > >>
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
