Re: Not found NewShinyProducer sync performance metrics

2015-02-05 Thread Xinyi Su
Hi, I am trying to use JConsole to connect to a remote Kafka broker running behind a firewall, but the connection is blocked. I can set the JMX registry port with JMX_PORT=, which the firewall allows, but I cannot specify the ephemeral RMI port, which is always chosen randomly at startup.
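
A minimal sketch of one way around this: pin the RMI server port to the same well-known port as the registry, so the firewall only needs one hole. This assumes a JVM new enough to support com.sun.management.jmxremote.rmi.port (JDK 7u25+); the port number and hostname below are placeholders.

    # Pin both the JMX registry port and the RMI server port to 9999
    export JMX_PORT=9999
    export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote \
      -Dcom.sun.management.jmxremote.authenticate=false \
      -Dcom.sun.management.jmxremote.ssl=false \
      -Dcom.sun.management.jmxremote.rmi.port=9999 \
      -Djava.rmi.server.hostname=broker.example.com"
    bin/kafka-server-start.sh config/server.properties

(kafka-run-class.sh appends -Dcom.sun.management.jmxremote.port=$JMX_PORT when JMX_PORT is set, so the registry port itself comes from JMX_PORT.)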

Kafka broker core dump

2015-02-05 Thread Xinyi Su
Hi, I encountered a Kafka broker core dump today: one broker process aborted during a test. I have attached the core dump file. The Kafka version I deployed is 2.9.2-0.8.2.0. Thanks. Best regards. Xinyi

Re: Kafka Architecture diagram

2015-02-05 Thread Joe Stein
Ankur, There is more you can check out in papers and presentations, too: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations if you haven't already. - Joestein On Thu, Feb 5, 2015 at 12:57 PM, Conikee coni...@gmail.com wrote: Michael Noll's blog posting might

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Jason Rosenberg
On Thu, Feb 5, 2015 at 9:52 PM, Joel Koshy jjkosh...@gmail.com wrote: Ok, so it looks like the default settings are: offset.storage = zookeeper dual.commit.enabled = true The doc for 'dual.commit.enabled' seems to imply (but doesn't clearly state) that it will only apply if

Kafka Architecture diagram

2015-02-05 Thread Ankur Jain
Hi Team, I am looking for high- and low-level architecture diagrams of Kafka with ZooKeeper, but haven't found a good one showing concepts like replication, high availability, etc. Please do let me know if there is any... Thank you Ankur

Re: Get Latest Offset for Specific Topic for All Partition

2015-02-05 Thread Gwen Shapira
You can use the metrics Kafka publishes. I think the relevant metrics are: Log.LogEndOffset Log.LogStartOffset Log.size Gwen On Thu, Feb 5, 2015 at 11:54 AM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote: HI All, I just need to get the latest offset # for topic (not for consumer group).

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Jason Rosenberg
Ok, so it looks like the default settings are: offset.storage = zookeeper dual.commit.enabled = true The doc for 'dual.commit.enabled' seems to imply (but doesn't clearly state) that it will only apply if offset.storage = kafka. Is that right? (I'm guessing not) *If you are using kafka* as

Re: kafka.server.ReplicaManager error

2015-02-05 Thread svante karlsson
I believe I've had the same problem on the 0.8.2 rc2. We had an idle test cluster with unknown health status, and I applied rc3 without checking whether everything was OK beforehand. Since that cluster had been doing nothing for a couple of days and the retention time was 48 hours, it's reasonable to assume

Re: Logstash to Kafka

2015-02-05 Thread Vineet Mishra
Yury, Thanks for sharing the insight into Kafka partition distribution. I am more concerned about the throughput that Kafka and Storm can collectively deliver for event processing. Currently I have a roughly 30 GB file with around 0.2 billion events, and this number is soon going to rise

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Joel Koshy
This is documented in the official docs: http://kafka.apache.org/documentation.html#distributionimpl On Thu, Feb 05, 2015 at 01:23:01PM -0500, Jason Rosenberg wrote: What are the defaults for those settings (I assume it will be to continue using only zookeeper by default)? Also, if I have a

Re: How to fetch old messages from kafka

2015-02-05 Thread Mayuresh Gharat
If you see the code for getOffsetsBefore():

    /**
     * Get a list of valid offsets (up to maxSize) before the given time.
     *
     * @param request a [[kafka.javaapi.OffsetRequest]] object.
     * @return a [[kafka.javaapi.OffsetResponse]] object.
     */
    def getOffsetsBefore(request: OffsetRequest): OffsetResponse

Re: kafka out-of-box monitoring system

2015-02-05 Thread Otis Gospodnetic
Hi, Not sure if you are asking if anyone has bundled all these together or if you are asking for an alternative. If the latter, see SPM for Kafka http://sematext.com/spm/ (works with 0.8.2 metrics) (or some other more complete monitoring tool) which will give you everything but 5. (but 5. is in

How to delete defunct topics

2015-02-05 Thread Jagbir Hooda
First I would like to take this opportunity to thank this group for releasing 0.8.2.0. It's a major milestone with a rich set of features. Kudos to all the contributors! We are still running 0.8.1.2 and are planning to upgrade to 0.8.2.0. While planning this upgrade we discovered many topics that

Re: [ANNOUNCEMENT] Apache Kafka 0.8.2.0 Released

2015-02-05 Thread Otis Gospodnetic
Big thanks to Jun and everyone else involved! We're on 0.8.2 as of today. :) Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Feb 3, 2015 at 8:37 PM, Jun Rao j...@confluent.io wrote: The Apache Kafka

Re: Not found NewShinyProducer sync performance metrics

2015-02-05 Thread Otis Gospodnetic
Not announced yet, but http://sematext.com/spm should be showing you all the new shiny Kafka (new producer) metrics out of the box. If you don't see them, please shout (I know we have a bit more tweaking to do over the coming few days). If you want to just dump MBeans from JMX manually and

Re: fetchrequest and assigned replica issue

2015-02-05 Thread Karts
Hey Jun, It's kafka-2.9.2-0.8.1.5. Great Kafka presentations on YouTube, btw! -Kart

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Surendranauth Hiraman
This is what I've found so far. https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka The high-level consumer just worked for me by setting offsets.storage = kafka. Scroll down to the offsets.* config params.

kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
I have a 9-node Kafka cluster, and all of the brokers just started spouting the following error: ERROR [Replica Manager on Broker 1]: Error when processing fetch request for partition [mytopic,57] offset 0 from follower with correlation id 58166. Possible cause: Request for offset 0 but we only

question about new consumer offset management in 0.8.2

2015-02-05 Thread Jason Rosenberg
Hi, For 0.8.2, one of the features listed is: - Kafka-based offset storage. Is there documentation on this (I've heard discussion of it of course)? Also, is it something that will be used by existing consumers when they migrate up to 0.8.2? What is the migration process? Thanks, Jason

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Jon Bringhurst
There should probably be a wiki page started for this so we have the details in one place. The same question was asked on Freenode IRC a few minutes ago. :) A summary of the migration procedure is: 1) Upgrade your brokers and set dual.commit.enabled=false and offsets.storage=zookeeper (Commit
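
A minimal sketch of how the two consumer knobs progress across the rolling bounces (step 1 matches the snippet above; the later steps follow the FAQ procedure referenced later in this thread; values are illustrative):

    # Step 1: upgrade, still committing to ZooKeeper only
    offsets.storage=zookeeper
    dual.commit.enabled=false

    # Step 2: commit to both Kafka and ZooKeeper during the transition
    offsets.storage=kafka
    dual.commit.enabled=true

    # Step 3: once every consumer in the group runs the new config,
    # stop double-committing
    offsets.storage=kafka
    dual.commit.enabled=false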

Re: error handling with high-level consumer

2015-02-05 Thread Steven Wu
Jun, we are already past the retention period, so we can't go back and do a DumpLogSegment. Plus, there are other factors that make this exercise difficult: 1) this topic has very high traffic volume; 2) we don't know the offset of the corrupted message. Anyhow, it doesn't happen often, but can you advise

Re: Topic migration mentionning one removed server

2015-02-05 Thread Anthony Pastor
We're using Kafka 0.8.1.1 on Debian 7.7 - Logs when I migrate a specific topic (~20GB) from kafka5 to kafka2 (no problem that way): - controller.log: No logs. - Logs when I migrate the same specific topic from kafka2 to kafka5 (same problems as in my original mail): - controller.log:

Not found NewShinyProducer sync performance metrics

2015-02-05 Thread Xinyi Su
Hi, I am using kafka-producer-perf-test.sh to study NewShinyProducer *sync* performance. I have not found any CSV output or metrics collector for NewShinyProducer sync performance. Could you share how to collect NewShinyProducer metrics? Thanks. Best regards. Xinyi

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Joel Koshy
Ok, so it looks like the default settings are: offset.storage = zookeeper dual.commit.enabled = true The doc for 'dual.commit.enabled' seems to imply (but doesn't clearly state) that it will only apply if offset.storage = kafka. Is that right? (I'm guessing not) dual.commit.enabled

Re: [ANNOUNCEMENT] Apache Kafka 0.8.2.0 Released

2015-02-05 Thread Steve Morin
Congrats team, it's a big accomplishment. On Feb 5, 2015, at 14:22, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Big thanks to Jun and everyone else involved! We're on 0.8.2 as of today. :) Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr

Re: Issue with topic deletion

2015-02-05 Thread Joel Koshy
Thanks, will take a look... On Wed, Feb 04, 2015 at 11:33:03PM -0800, Sumit Rangwala wrote: Any idea why you have session expirations? This is typically due to GC and/or flaky network. Regardless, we should be handling that scenario as well. However, your logs seem incomplete. Can you redo

Re: How to delete defunct topics

2015-02-05 Thread Joel Koshy
There are mbeans (http://kafka.apache.org/documentation.html#monitoring) that you can poke for incoming message rate - if you look at those over a period of time you can figure out which of those are likely to be defunct and then delete those topics. On Thu, Feb 05, 2015 at 02:38:27PM -0800,
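
For example, a minimal sketch of polling the per-topic incoming-message-rate mbean over JMX (the mbean name follows the 0.8.2 naming scheme; host, port, and topic are placeholders):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class TopicRateCheck {
        public static void main(String[] args) throws Exception {
            // Connect to the broker's JMX endpoint (host/port are placeholders)
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker.example.com:9999/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();
                // Per-topic incoming message rate (0.8.2-style mbean name)
                ObjectName name = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=mytopic");
                Object rate = mbsc.getAttribute(name, "OneMinuteRate");
                System.out.println("mytopic MessagesInPerSec (1-min rate): " + rate);
            }
        }
    }

A rate that stays at zero across several polls is a good candidate for deletion.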

Re: How to fetch old messages from kafka

2015-02-05 Thread Joel Koshy
We can reset the offset and get the first 10 messages, but we need to go back in reverse sequence. Suppose a user has consumed messages up to offset 100 and currently only the last 10 messages, from 100 to 90, are visible; now I want to retrieve messages from 80 to 90. How can we do that? I'm
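
Since any offset still within retention can be fetched directly, a minimal sketch with the 0.8 SimpleConsumer, assuming offset 80 has not yet been deleted (host, topic, and partition are placeholders):

    import kafka.api.FetchRequest;
    import kafka.api.FetchRequestBuilder;
    import kafka.javaapi.FetchResponse;
    import kafka.javaapi.consumer.SimpleConsumer;
    import kafka.message.MessageAndOffset;

    public class RefetchRange {
        public static void main(String[] args) {
            SimpleConsumer consumer = new SimpleConsumer(
                "broker.example.com", 9092, 100000, 64 * 1024, "refetch-client");
            // Ask for data starting at offset 80; Kafka lets you fetch from any
            // offset that is still within the log's retention window.
            FetchRequest req = new FetchRequestBuilder()
                .clientId("refetch-client")
                .addFetch("mytopic", 0, 80L, 100000)
                .build();
            FetchResponse resp = consumer.fetch(req);
            for (MessageAndOffset mo : resp.messageSet("mytopic", 0)) {
                if (mo.offset() >= 90) break; // stop once past the range we want
                System.out.println("offset " + mo.offset());
            }
            consumer.close();
        }
    }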

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Gwen Shapira
Thanks Jon. I updated the FAQ with your procedure: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdowemigratetocommittingoffsetstoKafka(ratherthanZookeeper)in0.8.2 ? On Thu, Feb 5, 2015 at 9:16 AM, Jon Bringhurst jbringhu...@linkedin.com.invalid wrote: There should probably be a

Re: Kafka Architecture diagram

2015-02-05 Thread Conikee
Michael Noll's blog posting might serve your purpose as well http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/ Sent from my iPhone On Feb 5, 2015, at 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: The Kafka documentation has

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
Digging in a bit more, it appears that the down broker had likely partially failed. Thus, it was still attempting to fetch offsets that no longer exist. Does this make sense as an explanation of the above-mentioned behavior? On Thu, Feb 5, 2015 at 10:58 AM, Kyle Banker kyleban...@gmail.com

Re: Not found NewShinyProducer sync performance metrics

2015-02-05 Thread Manikumar Reddy
The new producer uses Kafka's own metrics API. Currently, metrics are reported using JMX, so any JMX monitoring tool (e.g., JConsole) can be used for monitoring. On Feb 5, 2015 3:56 PM, Xinyi Su xiny...@gmail.com wrote: Hi, I am using kafka-producer-perf-test.sh to study NewShinyProducer *sync* performance.
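
Besides JMX, the new producer exposes the same metrics programmatically. A minimal sketch using KafkaProducer.metrics() (broker address and topic are placeholders):

    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.Metric;
    import org.apache.kafka.common.MetricName;

    public class ProducerMetricsDump {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker.example.com:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
            KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
            producer.send(new ProducerRecord<byte[], byte[]>("mytopic", "hello".getBytes()));
            // Dump every metric the new producer maintains (also visible over JMX)
            for (Map.Entry<MetricName, ? extends Metric> e : producer.metrics().entrySet()) {
                System.out.println(e.getKey().name() + " = " + e.getValue().value());
            }
            producer.close();
        }
    }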

Re: Kafka Architecture diagram

2015-02-05 Thread Gwen Shapira
The Kafka documentation has several good diagrams. Did you check it out? http://kafka.apache.org/documentation.html On Thu, Feb 5, 2015 at 6:31 AM, Ankur Jain ankurmitujj...@gmail.com wrote: Hi Team, I am looking out high and low level architecture diagram of Kafka with Zookeeper, but

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
Dug into this a bit more, and it turns out that we lost one of our 9 brokers at the exact moment this started happening. At the time we lost the broker, we had no under-replicated partitions. Since the broker disappeared, we've had a fairly constant number of under-replicated partitions.

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Jason Rosenberg
What are the defaults for those settings (I assume it will be to continue using only zookeeper by default)? Also, if I have a cluster of consumers sharing the same groupId, and I update them via a rolling release, will it be a problem during the rolling restart if there is inconsistency in the

Re: Logstash to Kafka

2015-02-05 Thread Otis Gospodnetic
Hi, In short, I don't see Kafka having problems with those numbers. Logstash will have a harder time, I believe. That said, it all depends on how you tune things and what kind of / how much hardware you use. 2B or 200B events, yes, big numbers, but how quickly do you need to process those? In 1

Get Latest Offset for Specific Topic for All Partition

2015-02-05 Thread Bhavesh Mistry
Hi All, I just need to get the latest offset # for a topic (not for a consumer group). Which API can I use to get this info? My use case is to analyze whether the data injection rate to each partition is uniform (or close to it). For this, I am planning to dump the latest offset into graphite for each partition
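
One shell-level option is the GetOffsetShell tool that ships with Kafka; a minimal sketch (broker and topic are placeholders; --time -1 requests the latest offsets, -2 the earliest):

    bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
      --broker-list broker.example.com:9092 \
      --topic mytopic \
      --time -1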

generics type for Producer and Consumer do not need to match?

2015-02-05 Thread Yang
in the example https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example we use <String, String> for K, V; in the same set of examples https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example on the consumer side we use <byte[], byte[]> for K, V

Re: Get Latest Offset for Specific Topic for All Partition

2015-02-05 Thread Joel Koshy
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIaccuratelygetoffsetsofmessagesforacertaintimestampusingOffsetRequest? However, you will need to issue a TopicMetadataRequest first to discover the leaders for all the partitions and then issue the offset request. On Thu, Feb 05,
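
A minimal sketch of the offset-request half of that flow with the 0.8 SimpleConsumer (leader discovery via TopicMetadataRequest is elided here; host, port, topic, and partition are placeholders):

    import java.util.HashMap;
    import java.util.Map;
    import kafka.api.PartitionOffsetRequestInfo;
    import kafka.common.TopicAndPartition;
    import kafka.javaapi.OffsetRequest;
    import kafka.javaapi.OffsetResponse;
    import kafka.javaapi.consumer.SimpleConsumer;

    public class LatestOffsetLookup {
        public static void main(String[] args) {
            // Connect to the partition's *leader* (found via TopicMetadataRequest,
            // elided here; host and port are placeholders)
            SimpleConsumer consumer = new SimpleConsumer(
                "broker.example.com", 9092, 100000, 64 * 1024, "offset-lookup");
            TopicAndPartition tp = new TopicAndPartition("mytopic", 0);
            Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<>();
            // LatestTime (-1) asks for the log-end offset; 1 = max offsets to return
            requestInfo.put(tp, new PartitionOffsetRequestInfo(
                kafka.api.OffsetRequest.LatestTime(), 1));
            OffsetRequest request = new OffsetRequest(
                requestInfo, kafka.api.OffsetRequest.CurrentVersion(), "offset-lookup");
            OffsetResponse response = consumer.getOffsetsBefore(request);
            long[] offsets = response.offsets("mytopic", 0);
            System.out.println("latest offset for mytopic-0: " + offsets[0]);
            consumer.close();
        }
    }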

Re: generics type for Producer and Consumer do not need to match?

2015-02-05 Thread Joel Koshy
There has to be an implicit contract between the producer and consumer. The K, V pairs don't _need_ to match but generally _should_. If the producer sends with <PK, PV>, the consumer may receive <CK, CV> as long as it knows how to convert those raw bytes to CK, CV. In the example, if CK == byte[] and CV ==

Re: generics type for Producer and Consumer do not need to match?

2015-02-05 Thread Yang
Thanks, I just noticed the ConsumerConnector.createMessageStreams API; it has two versions:

    public <K,V> Map<String, List<KafkaStream<K,V>>> createMessageStreams(
        Map<String, Integer> topicCountMap, Decoder<K> keyDecoder, Decoder<V> valueDecoder);

    /**
     * Create a list of message streams of type T for
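
A minimal sketch of the decoder-taking version, so K, V come back as Strings, assuming the producer used kafka.serializer.StringEncoder (ZooKeeper address, group id, and topic are placeholders):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.serializer.StringDecoder;
    import kafka.utils.VerifiableProperties;

    public class MatchingDecoders {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "zk.example.com:2181"); // placeholder
            props.put("group.id", "example-group");
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            Map<String, Integer> topicCountMap = new HashMap<>();
            topicCountMap.put("mytopic", 1);
            // If the producer used StringEncoder for K and V, decode with
            // StringDecoder so K, V arrive as Strings rather than byte[]
            StringDecoder decoder = new StringDecoder(new VerifiableProperties());
            Map<String, List<KafkaStream<String, String>>> streams =
                connector.createMessageStreams(topicCountMap, decoder, decoder);
            System.out.println("streams for mytopic: " + streams.get("mytopic").size());
            connector.shutdown();
        }
    }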

Re: Issue with auto topic creation as well

2015-02-05 Thread Sumit Rangwala
On Wed, Feb 4, 2015 at 9:23 PM, Jun Rao j...@confluent.io wrote: Could you try the 0.8.2.0 release? It fixed one issue related to topic creation. Jun, Tried with 0.8.2.0 and I still see the same error. I see the error given below almost incessantly on the client side for topic