Disregard the previous message, it was sent accidentally.

Jun,
I don't know if it was an issue with graphite or the mbean, and I have not seen it since - and we have tried several cases of failover. That said, I have the feeling that it was a kafka issue and I'm a bit suspicious about the new mbeans.

I attach a screenshot from the grafana dashboard. If you look at the first graph (top left), the point at ~10% shows the startup after the upgrade.

This is a 3 node cluster with a topic of 2 partitions. When we start up a single producer it produces messages to both partitions without message loss. I know that all messages are acked.

If you look at the "produce msg/sec" graph it seems to hit 2 servers (it's per broker), but "messages in rate", "byte in rate" & "byte out rate" (all from the new mbeans, also per broker) look as if the data only hits one broker.

At ~70% I restarted the two brokers one after the other. After that point all three graphs look fine.

I'm not at work now and can't dig into the graphite data, but I now see that the "fetch follower" graph also looks strange. I can't file this as a bug report since I can't reproduce it, but I have a distinct feeling that either I can't trust the new mbeans or I have to find another explanation. Regard it as an observation, in case someone else reports similar issues.

thanks,
svante

2015-01-16 20:56 GMT+01:00 svante karlsson <s...@csi.se>:

> Jun,
>
> I don't know if it was an issue with graphite or the mbean, but I have not
> seen it since - and we have tried several cases of failover and this
> problem has only been seen once.
>
> That said, I have the feeling that it was a kafka issue and I'm a bit
> suspicious about the new mbeans.
>
> I attach a screenshot from the grafana dashboard. If you look at the
> first graph (top left), the point at ~10% shows the startup after the
> upgrade.
>
> This is a 3 node cluster with a topic of 2 partitions. When we start up a
> single producer it produces messages to both partitions without message
> loss. I know that all messages are acked.
>
> If you look at the "produce msg/sec" graph it seems to hit 2
> servers (it's per broker)
>
> Bad picture but
>
> 2015-01-16 18:05 GMT+01:00 Jun Rao <j...@confluent.io>:
>
>> Svante,
>>
>> I tested this out locally and the mbeans for those metrics do show up on
>> startup. Can you reproduce the issue reliably? Also, is what you saw an
>> issue with the mbean itself or graphite?
>>
>> Thanks,
>>
>> Jun
>>
>> On Fri, Jan 16, 2015 at 4:38 AM, svante karlsson <s...@csi.se> wrote:
>>
>> > I upgraded two small test clusters and had two small issues, but I'm
>> > not yet clear whether those were caused by us using ansible to
>> > configure and deploy the cluster.
>> >
>> > The first issue could be us doing something bad when distributing the
>> > update (I updated, not reinstalled), but it should be easy for you to
>> > disregard since it seems so trivial.
>> >
>> > We replace kafka-server-start.sh with something of our own, but we
>> > kept the line
>> >
>> > EXTRA_ARGS="-name kafkaServer -loggc"
>> >
>> > and then kafka-run-class.sh exits without starting the VM and
>> > complains about unknown options (both -name and -loggc). Once we
>> > removed the EXTRA_ARGS everything starts.
>> >
>> > As I said - everyone should have this issue if it was a problem...
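>> >
>> > A guard in the start script would sidestep this either way - a minimal
>> > sketch (untested; it just assumes that a kafka-run-class.sh which
>> > understands the flags also contains the string -loggc somewhere):
>> >
>> > # pass -name/-loggc only if the local kafka-run-class.sh knows them
>> > base_dir=$(dirname $0)
>> > if grep -q -- '-loggc' "$base_dir/kafka-run-class.sh"; then
>> >     EXTRA_ARGS="-name kafkaServer -loggc"
>> > else
>> >     EXTRA_ARGS=""
>> > fi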
>> >
>> > The second thing is regarding the jmx beans. I reconfigured our
>> > graphite monitoring and noticed that the following metrics stopped
>> > working on one broker:
>> >
>> > - server.BrokerTopicMetrics.MessagesInPerSec.OneMinuteRate
>> > - server.BrokerTopicMetrics.ByteInPerSec.OneMinuteRate
>> > - server.BrokerTopicMetrics.ByteOutPerSec.OneMinuteRate
>> >
>> > I had graphs running and it looked like the traffic was dropping on
>> > those metrics, but our producers were working without problems, and
>> > the metric network.RequestMetrics.Produce.RequestsPerSec.OneMinuteRate
>> > confirmed that on all brokers.
>> >
>> > A restart of the offending broker brought the metrics online again.
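>> >
>> > One way to tell the mbean and graphite apart next time would be to
>> > read the bean directly over JMX, e.g. with the bundled
>> > kafka.tools.JmxTool - a sketch, assuming the broker runs with
>> > JMX_PORT=9999 (hostname and port below are placeholders) and that I
>> > have the new 0.8.2 bean name right:
>> >
>> > bin/kafka-run-class.sh kafka.tools.JmxTool \
>> >   --jmx-url service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi \
>> >   --object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec'
>> >
>> > If OneMinuteRate moves there while the graphite graph stays flat, the
>> > problem is on the reporting side; if it is stuck there too, it's the
>> > mbean.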
>> >
>> > /svante
>> >
>> > 2015-01-16 3:42 GMT+01:00 Gwen Shapira <gshap...@cloudera.com>:
>> >
>> > > Would make sense to enable it after we have the authorization
>> > > feature and admins can control who can delete what.
>> > >
>> > > On Thu, Jan 15, 2015 at 6:32 PM, Jun Rao <j...@confluent.io> wrote:
>> > > > Yes, I agree it's probably better not to enable
>> > > > "delete.topic.enable" by default.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Jun
>> > > >
>> > > > On Thu, Jan 15, 2015 at 6:29 PM, Joe Stein <joe.st...@stealth.ly>
>> > > > wrote:
>> > > >
>> > > >> I think that is a change of behavior that organizations may get
>> > > >> burned on. Right now there is no delete data feature. If an
>> > > >> operations team upgrades to 0.8.2 and someone decides to delete a
>> > > >> topic then there will be data loss. The organization may not have
>> > > >> wanted that to happen. I would argue not to have a way to delete
>> > > >> data "by default". There is something actionable about
>> > > >> consciously turning on a feature that allows anyone with access
>> > > >> to kafka-topics (or zookeeper for that matter) to delete Kafka
>> > > >> data. If folks want that feature then flip the switch prior to
>> > > >> upgrade, or after with a rolling restart, and have at it. By not
>> > > >> setting it as default they will know they have to turn it on and
>> > > >> figure out what they need to do from a security perspective
>> > > >> (until Kafka gives them that) to protect their data (through
>> > > >> network or other types of measures).
>> > > >>
>> > > >> On Thu, Jan 15, 2015 at 8:24 PM, Manikumar Reddy
>> > > >> <ku...@nmsworks.co.in> wrote:
>> > > >>
>> > > >> > Also, can we remove the "delete.topic.enable" config property
>> > > >> > and enable topic deletion by default?
>> > > >> > On Jan 15, 2015 10:07 PM, "Jun Rao" <j...@confluent.io> wrote:
>> > > >> >
>> > > >> > > Thanks for reporting this. I will remove that option in RC2.
>> > > >> > >
>> > > >> > > Jun
>> > > >> > >
>> > > >> > > On Thu, Jan 15, 2015 at 5:21 AM, Jaikiran Pai
>> > > >> > > <jai.forums2...@gmail.com> wrote:
>> > > >> > >
>> > > >> > > > I just downloaded the Kafka binary and am trying this on my
>> > > >> > > > 32 bit JVM (Java 7). Trying to start Zookeeper or the Kafka
>> > > >> > > > server keeps failing with "Unrecognized VM option
>> > > >> > > > 'UseCompressedOops'":
>> > > >> > > >
>> > > >> > > > ./zookeeper-server-start.sh ../config/zookeeper.properties
>> > > >> > > > Unrecognized VM option 'UseCompressedOops'
>> > > >> > > > Error: Could not create the Java Virtual Machine.
>> > > >> > > > Error: A fatal exception has occurred. Program will exit.
>> > > >> > > >
>> > > >> > > > Same with the Kafka server startup scripts. My Java version
>> > > >> > > > is:
>> > > >> > > >
>> > > >> > > > java version "1.7.0_71"
>> > > >> > > > Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
>> > > >> > > > Java HotSpot(TM) Server VM (build 24.71-b01, mixed mode)
>> > > >> > > >
>> > > >> > > > Should there be a check in the script, before adding this
>> > > >> > > > option?
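>> > > >> > > >
>> > > >> > > > For example, something along these lines near the top of
>> > > >> > > > kafka-run-class.sh might do it - a sketch (untested),
>> > > >> > > > assuming the flag lives in KAFKA_JVM_PERFORMANCE_OPTS:
>> > > >> > > >
>> > > >> > > > # probe whether this JVM accepts the flag before using it
>> > > >> > > > if java -XX:+UseCompressedOops -version >/dev/null 2>&1; then
>> > > >> > > >   KAFKA_JVM_PERFORMANCE_OPTS="$KAFKA_JVM_PERFORMANCE_OPTS -XX:+UseCompressedOops"
>> > > >> > > > fi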
>> > > >> > > >
>> > > >> > > > -Jaikiran
>> > > >> > > >
>> > > >> > > > On Wednesday 14 January 2015 10:08 PM, Jun Rao wrote:
>> > > >> > > >
>> > > >> > > >> + users mailing list. It would be great if people can test
>> > > >> > > >> this out and report any blocker issues.
>> > > >> > > >>
>> > > >> > > >> Thanks,
>> > > >> > > >>
>> > > >> > > >> Jun
>> > > >> > > >>
>> > > >> > > >> On Tue, Jan 13, 2015 at 7:16 PM, Jun Rao <j...@confluent.io>
>> > > >> > > >> wrote:
>> > > >> > > >>
>> > > >> > > >>> This is the first candidate for release of Apache Kafka
>> > > >> > > >>> 0.8.2.0. There have been some changes since the 0.8.2
>> > > >> > > >>> beta release, especially in the new java producer api and
>> > > >> > > >>> jmx mbean names. It would be great if people can test
>> > > >> > > >>> this out thoroughly. We are giving people 10 days for
>> > > >> > > >>> testing and voting.
>> > > >> > > >>>
>> > > >> > > >>> Release Notes for the 0.8.2.0 release
>> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/RELEASE_NOTES.html
>> > > >> > > >>>
>> > > >> > > >>> *** Please download, test and vote by Friday, Jan 23rd,
>> > > >> > > >>> 7pm PT
>> > > >> > > >>>
>> > > >> > > >>> Kafka's KEYS file containing PGP keys we use to sign the
>> > > >> > > >>> release:
>> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/KEYS
>> > > >> > > >>> in addition to the md5, sha1 and sha2 (SHA256) checksum.
>> > > >> > > >>>
>> > > >> > > >>> * Release artifacts to be voted upon (source and binary):
>> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/
>> > > >> > > >>>
>> > > >> > > >>> * Maven artifacts to be voted upon prior to release:
>> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/maven_staging/
>> > > >> > > >>>
>> > > >> > > >>> * scala-doc
>> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/scaladoc/#package
>> > > >> > > >>>
>> > > >> > > >>> * java-doc
>> > > >> > > >>> https://people.apache.org/~junrao/kafka-0.8.2.0-candidate1/javadoc/
>> > > >> > > >>>
>> > > >> > > >>> * The tag to be voted upon (off the 0.8.2 branch) is the
>> > > >> > > >>> 0.8.2.0 tag
>> > > >> > > >>> https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=b0c7d579f8aeb5750573008040a42b7377a651d5
>> > > >> > > >>>
>> > > >> > > >>> /*******************************************
>> > > >> > > >>>
>> > > >> > > >>> Thanks,
>> > > >> > > >>>
>> > > >> > > >>> Jun