Re: some producers stuck when one broker is bad

2015-09-11 Thread Steven Wu
I was doing a rolling bounce of all brokers. Immediately after the bad broker was bounced, those stuck producers recovered. On Fri, Sep 11, 2015 at 9:05 AM, Mayuresh Gharat wrote: > So how did you detect that the broker is bad? If bouncing brokers solved > the problem and you did not find any unu

Re: Kafka Server JMX Log End Offset Not updating after repartioning

2015-09-11 Thread Otis Gospodnetić
Replying 3 months later. Sounds like http://search-hadoop.com/m/uyzND1YlKxr5XZpK , Joe. No fix yet, as far as I know. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Mon, Jun 8, 2015 at 6:33 PM, joe smith

Re: Controlled Shutdown Tool?

2015-09-11 Thread Otis Gospodnetić
Btw. a regular UNIX kill will do the same - SIGTERM - http://linux.die.net/man/1/kill Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Mon, Jul 27, 2015 at 3:57 PM, Andrew Otto wrote: > Ah, thank you, SIGTERM

Re: Zombie Replica Fetcher Threads

2015-09-11 Thread Otis Gospodnetić
Juicy one. https://issues.apache.org/jira/browse/KAFKA-2530 I hope it's related to http://search-hadoop.com/m/uyzND1XVyK12UNtd32/kafka+orphaned&subj=Consumer+lag+lies+orphaned+offsets ! :) Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support *

Re: Metrics to monitor in Kafka

2015-09-11 Thread Otis Gospodnetić
Consumer offset/lag is what people are always after. :) See also: http://search-hadoop.com/?q=kafka+otis+important+metrics Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Tue, Aug 25, 2015 at 3:28 AM, Debraj M

port already in use error when trying to add topic

2015-09-11 Thread allen chan
Hi all, First time testing Kafka with a brand new cluster. Running into an issue that I do not understand. The server started up fine but I get an error when trying to create a topic. [achan@server1 ~]$ ps -ef | grep -i kafka root 6507 1 0 15:42 ?00:00:00 sudo /opt/kafka_2.10-0.8.2.

Re: Zookeeper jmx monitoring for kafka

2015-09-11 Thread Otis Gospodnetić
Hi Prabhjot, Short answer: yes. I used to think ZK was so super stable that it was one of those things that don't require any management, but on a few occasions I witnessed complex distributed applications nearly fall apart because of issues with ZK. We use our own SPM for ZooKeeper to monitor all

Re: How to monitor lag when "kafka" is used as offset.storage?

2015-09-11 Thread Otis Gospodnetić
Hi Shahab - SPM for Kafka captures ~200 Kafka metrics IIRC and has built-in alerting, anomaly detection, and a bunch of other features - see http://sematext.com/spm/integrations/kafka-monitoring.html Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearc

Re: Issue in pulling metrics from kafka-console-producer.

2015-09-11 Thread Otis Gospodnetić
Unfortunately Kafka brokers don't have producer and consumer metrics. :( P & C expose them through their own JMX interfaces. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thu, Sep 10, 2015 at 2:46 AM, Pavan

Re: Kafka cluster cannot start anymore after unexpected shutdown

2015-09-11 Thread Qi Xu
And I tried to clean up the whole kafka-logs folder and then started the Kafka server again. It then reports the following errors: SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/share/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/

Kafka cluster cannot start anymore after unexpected shutdown

2015-09-11 Thread Qi Xu
Hi, We're running the trunk version of Kafka (for its SSL feature) and recently I have been trying to enable the Kafka manager with it. After enabling that, I found that some machine's Kafka server was dead. Looking at the server.log, it has the following logs. java.lang.OutOfMemoryError: Java heap space [20

Re: Unclean leader election docs outdated

2015-09-11 Thread Stevo Slavić
That sentence is in both https://svn.apache.org/repos/asf/kafka/site/083/design.html and https://svn.apache.org/repos/asf/kafka/site/082/design.html near the end of the "Unclean leader election: What if they all die?" section. The next one, "Availability and Durability Guarantees", mentions ability to disa

Re: Unclean leader election docs outdated

2015-09-11 Thread Guozhang Wang
Hi Stevo, Could you point me to the link of the docs? Guozhang On Fri, Sep 11, 2015 at 5:47 AM, Stevo Slavić wrote: > Hello Apache Kafka community, > > Current unclean leader election docs state: > "In the future, we would like to make this configurable to better support > use cases where down

Re: MirrorMaker - Not consuming from all partitions

2015-09-11 Thread Craig Swift
Scary but thanks! :) We'll start digging into the network and see if we can find a smoking gun. Appreciate the response, thanks again. Craig J. Swift Software Engineer - Data Pipeline ReturnPath Inc. Work: 303-999-3220 Cell: 720-560-7038 On Fri, Sep 11, 2015 at 11:29 AM, Steve Miller wrote: >

Re: automatically consume from all topics

2015-09-11 Thread Alexis Midon
When a new topic is created, I agree that the regex would remain unchanged, but how would an existing consumer be notified of the topic creation? AFAIK there's no such notification mechanism in the high-level consumer. On Thu, Sep 10, 2015 at 8:43 AM, tao xiao wrote: > You can create message st

Re: MirrorMaker - Not consuming from all partitions

2015-09-11 Thread Steve Miller
I have a vague feeling that I've seen stuff like this when the network on the broker that's disappearing is actually unreachable from time to time -- though I'd like to believe that's not such an issue when talking to AWS (though there could be a lot of screwed-up Internet between you and it,

Re: some producers stuck when one broker is bad

2015-09-11 Thread Mayuresh Gharat
So how did you detect that the broker is bad? If bouncing brokers solved the problem and you did not find any unusual things in the logs on the brokers, it is likely that the process was up but was isolated from producer requests, and since the producer did not have a timeout, the producer buffer filled up

Re: What can be reason for fetcher thread for slow response.

2015-09-11 Thread Prabhjot Bharaj
Hi, In addition to the parameters asked for by Erik, it would be great if you could share your broker's server.properties as well. Regards, Prabhjot On Fri, Sep 11, 2015 at 8:10 PM, Helleren, Erik wrote: > Hi Madhukar, > Some questions that can help understand what's going on: Which Kafka > version

Re: MirrorMaker - Not consuming from all partitions

2015-09-11 Thread Craig Swift
Just wanted to bump this again and see if the community had any thoughts or if we're just missing something stupid. For added context the topic we're reading from has 24 partitions and we see roughly 15k messages per minute. As I mentioned before the throughput seems fine, but I'm not entirely sure

Re: What can be reason for fetcher thread for slow response.

2015-09-11 Thread Helleren, Erik
Hi Madhukar, Some questions that can help understand what's going on: Which Kafka version is used? Which Producer API is being used (http://kafka.apache.org/documentation.html#producerapi)? And what are the configs for this producer? Also, because I know little about Tomcat, is there a semantic f

Re: Partition Consumer(s)

2015-09-11 Thread Helleren, Erik
Kafka can only do so much. Kafka’s high-level consumer API does guarantee delivery at least once to a living consumer’s message consumption function. Kafka can't guarantee that the business logic that handles that message won’t hang or do things to circumvent its guarantees. But, since we are t

What can be reason for fetcher thread for slow response.

2015-09-11 Thread Madhukar Bharti
Hi, We have 3 brokers in a cluster. Producer requests are failing for broker 2. We frequently get the exception below: 15/09/09 22:09:06 WARN async.DefaultEventHandler: Failed to send producer request with correlation id 1455 to broker 2 with data for partitions [UserEvents,0] >

KafkaClient in 0.8.2.1

2015-09-11 Thread Wilk, George
What is the best way to connect to a broker with the 0.8.2.1 release? I tried using the KafkaClient.poll() method, but it appears that it hasn't been implemented yet. Any suggestions would be greatly appreciated! Cheers, ~george

Unclean leader election docs outdated

2015-09-11 Thread Stevo Slavić
Hello Apache Kafka community, Current unclean leader election docs state: "In the future, we would like to make this configurable to better support use cases where downtime is preferable to inconsistency." If I'm not mistaken, since 0.8.2, unclean leader election strategy (whether to allow it or

Re: Partition Consumer(s)

2015-09-11 Thread Reza Aliakbari
Monitoring and killing the bad consumer is not a good solution. If my consumer can't manage my partition well even when there are idle threads, then I have a bad design. I can't design a system that in some situations doesn't deliver thousands of emails because one thread couldn't manage things we