Reuse ConsumerConnector

2014-12-10 Thread Chen Wang
Hey Guys, I have a use case where my thread reads from a different kafka topic periodically through a timer. The way I am reading from kafka in the timer callback is the following: try { Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumerConnector
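The fragment above builds the high-level consumer's stream map inside a timer callback. A JDK-only sketch of the reuse question under discussion, with a plain queue standing in for the Kafka stream (the stub and helper names are assumptions, not the Kafka API — real code would call `Consumer.createJavaConsumerConnector` once and keep it):

```java
import java.util.*;
import java.util.concurrent.*;

public class PeriodicReader {
    // Stub standing in for the stream a long-lived ConsumerConnector hands back.
    static final BlockingQueue<String> stubStream = new LinkedBlockingQueue<>();

    // What the timer callback should do: drain from the shared, long-lived
    // connector's stream rather than building a new ConsumerConnector per tick.
    static List<String> drainOnce() {
        List<String> batch = new ArrayList<>();
        stubStream.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        stubStream.addAll(Arrays.asList("m1", "m2"));
        System.out.println(drainOnce()); // [m1, m2]
    }
}
```

Recreating the connector on every tick forces a consumer-group rebalance each time, which is why reuse is usually preferred.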

Re: Broker keeps rebalancing

2014-11-14 Thread Chen Wang
/10.93.83.50:44094 which had sessionid 0x149a4cc1b581b5b Chen On Thu, Nov 13, 2014 at 5:25 PM, Jun Rao jun...@gmail.com wrote: Which version of ZK are you using? Thanks, Jun On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang chen.apache.s...@gmail.com wrote: Thanks for the info. It makes sense

Re: Broker keeps rebalancing

2014-11-13 Thread Chen Wang
neha.narkh...@gmail.com wrote: Does this help? https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog ? On Wed, Nov 12, 2014 at 3:53 PM, Chen Wang chen.apache.s...@gmail.com wrote: Hi there, My kafka client is reading a 3 partition topic

Re: Broker keeps rebalancing

2014-11-13 Thread Chen Wang
...@gmail.com wrote: Chen, From the ZK logs it sounds like ZK kept timing out consumers, which triggers rebalances. What is the zk session timeout config value in your consumers? Guozhang On Thu, Nov 13, 2014 at 10:15 AM, Chen Wang chen.apache.s...@gmail.com wrote: Thanks
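The session timeout Guozhang asks about lives in the consumer's config; a fragment might look like the following (values are illustrative, not recommendations):

```properties
# Consumer config fragment (illustrative values, 0.8-era high-level consumer).
zookeeper.connect=zkhost:2181
zookeeper.session.timeout.ms=10000
zookeeper.connection.timeout.ms=10000
# GC pauses or processing stalls longer than the session timeout cause ZK to
# expire the consumer's session, which triggers a group rebalance.
```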

Broker keeps rebalancing

2014-11-12 Thread Chen Wang
Hi there, My kafka client is reading a 3 partition topic from kafka with 3 threads distributed on different machines. I am seeing frequent owner changes on the topics when running: bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group my_test_group --topic mytopic -zkconnect

Re: change retention for a topic on the fly does not work

2014-11-11 Thread Chen Wang
For those who might need to do the same thing, the command is bin/kafka-topics.sh --zookeeper localhost:2182 --alter --topic yourconfig --config retention.ms=17280 On Mon, Nov 10, 2014 at 4:46 PM, Chen Wang chen.apache.s...@gmail.com wrote: Hey guys, i am using kafka_2.9.2-0.8.1.1 bin

change retention for a topic on the fly does not work

2014-11-10 Thread Chen Wang
Hey guys, I am using kafka_2.9.2-0.8.1.1 bin/kafka-topics.sh --zookeeper localhost:2182 --alter --topic my_topic --config log.retention.hours.per.topic=48 It says: Error while executing topic command requirement failed: Unknown configuration "log.retention.hours.per.topic".
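The name that failed here is a broker-level setting; per-topic retention in 0.8.1 is overridden with the topic-level `retention.ms` config, which takes milliseconds. A tiny sketch of the unit conversion (the helper name is ours):

```java
public class RetentionMath {
    // Topic-level "retention.ms" is in milliseconds; the broker-wide default
    // is expressed in hours (log.retention.hours).
    static long hoursToMs(long hours) {
        return hours * 60 * 60 * 1000;
    }

    public static void main(String[] args) {
        System.out.println(hoursToMs(48)); // 172800000
    }
}
```

So 48 hours of retention would be set with --alter --config retention.ms=172800000, as the follow-up message in this thread shows.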

Consumer lag keeps increasing

2014-11-05 Thread Chen Wang
Hey Guys, I have a really simple storm topology with a kafka spout, reading from kafka through the high level consumer. Since the topic has 30 partitions, we have 30 threads in the spout reading from it. However, it seems that the lag keeps increasing even though the thread only reads the message and do

Re: Consumer lag keeps increasing

2014-11-05 Thread Chen Wang
and check if your consumer is blocked on some locks? Guozhang On Wed, Nov 5, 2014 at 2:01 PM, Chen Wang chen.apache.s...@gmail.com wrote: Hey Guys, I have a really simple storm topology with a kafka spout, reading from kafka through the high level consumer. Since the topic has 30 partitions, we

Re: Consumer keeps losing connection

2014-11-02 Thread Chen Wang
(SocketServer.scala:405) at kafka.network.Processor.run(SocketServer.scala:265) at java.lang.Thread.run(Thread.java:744) On Sat, Nov 1, 2014 at 9:46 PM, Chen Wang chen.apache.s...@gmail.com wrote: Hello Folks, I am using Highlevel consumer, and it seems to drop connections intermittently: 2014

Consumer keeps losing connection

2014-11-01 Thread Chen Wang
Hello Folks, I am using Highlevel consumer, and it seems to drop connections intermittently: 2014-11-01 13:34:40 SimpleConsumer [INFO] Reconnect due to socket error: Received -1 when reading from channel, socket has likely been closed. 2014-11-01 13:34:40 ConsumerFetcherThread [WARN]

Kafka lost data

2014-10-27 Thread Chen Wang
Hello folks, I recently noticed that our message volume in kafka seems to have dropped significantly. I didn't see any exceptions on my consumer side. Since the producer is not within my control, I am trying to get some guidance on how I could debug this issue. Our individual message sizes recently have

Re: Correct way to handle ConsumerTimeoutException

2014-08-14 Thread Chen Wang
should probably just increase the timeout value in the configs to avoid it throwing the timeout exception. Guozhang On Tue, Aug 12, 2014 at 2:27 PM, Chen Wang chen.apache.s...@gmail.com wrote: Folks, I am using consumer.timeout.ms to force a consumer to jump out of the hasNext call
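The "stop once nothing arrives for a while" behavior being discussed can be sketched with plain JDK types; a `BlockingQueue` stands in for the Kafka iterator here (an assumption — the real high-level consumer signals this via `consumer.timeout.ms` and `ConsumerTimeoutException`):

```java
import java.util.*;
import java.util.concurrent.*;

public class DrainUntilIdle {
    // Consume until the source is quiet for timeoutMs, then return what we got.
    static List<String> drain(BlockingQueue<String> q, long timeoutMs) {
        List<String> out = new ArrayList<>();
        try {
            String msg;
            // poll(timeout) returning null plays the role of ConsumerTimeoutException.
            while ((msg = q.poll(timeoutMs, TimeUnit.MILLISECONDS)) != null) {
                out.add(msg);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out;
    }

    public static void main(String[] args) {
        BlockingQueue<String> q = new LinkedBlockingQueue<>(Arrays.asList("a", "b"));
        System.out.println(drain(q, 50)); // [a, b]
    }
}
```

Treating the timeout as a normal end-of-batch signal, as above, is usually cleaner than letting the exception propagate.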

Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Folks, Is there any potential issue with creating 240 topics every day? Although the retention of each topic is set to be 2 days, I am a little concerned that since right now there is no delete topic api, the zookeepers might be overloaded. Thanks, Chen

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
it is too large for Zookeeper's default 1 MB data size. You also need to think about the number of open file handles. Even with no data, there will be open files for each topic. -Todd On 8/11/14, 2:19 PM, Chen Wang chen.apache.s...@gmail.com wrote: Folks, Is there any potential issue
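The load Todd describes can be estimated with back-of-envelope arithmetic; the partition and per-partition file counts below are illustrative assumptions, not figures from the thread:

```java
public class TopicLoad {
    // Topics alive at any moment = topics created per day x retention in days.
    static long liveTopics(long perDay, long retentionDays) {
        return perDay * retentionDays;
    }

    // Rough open-file count: each partition keeps at least its current log
    // segment and index open, even when the topic holds no data.
    static long openFiles(long topics, long partitionsPerTopic, long filesPerPartition) {
        return topics * partitionsPerTopic * filesPerPartition;
    }

    public static void main(String[] args) {
        long live = liveTopics(240, 2);            // 480 topics alive at once
        System.out.println(live);
        System.out.println(openFiles(live, 3, 2)); // with assumed 3 partitions, 2 files each
    }
}
```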

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
to use Kafka in this manner. Can you provide more detail? Philip - http://www.philipotoole.com On Monday, August 11, 2014 4:45 PM, Chen Wang chen.apache.s...@gmail.com wrote: Todd, I actually only intend to keep each topic valid for 3 days

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
- http://www.philipotoole.com On Aug 11, 2014, at 6:42 PM, Chen Wang chen.apache.s...@gmail.com wrote: And if you can't consume it all within 6 minutes, partition the topic until you can run enough consumers such that you can keep up. This is what I intend to do for each 6-min topic

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
need. On Tue, Aug 12, 2014 at 8:04 AM, Chen Wang chen.apache.s...@gmail.com wrote: That data has a timestamp: it's actually email campaigns with scheduled send times. But since they can be scheduled ahead (e.g., two days ahead), I cannot read it when it arrives. It has to wait until its actual

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
want to use Kafka. Philip -- http://www.philipotoole.com On Aug 11, 2014, at 7:34 PM, Chen Wang chen.apache.s...@gmail.com wrote: That data has a timestamp: it's actually email campaigns with scheduled send times. But since they can be scheduled ahead

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
to topic2 at 12:06, and so on. The next week, you loop around over exactly the same topics, knowing that the retention settings have cleared out the old data. -Todd On 8/11/14, 4:45 PM, Chen Wang chen.apache.s...@gmail.com wrote: Todd, I actually only intend to keep each topic valid for 3 days

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
#basic_ops_consumer_lag Guozhang On Fri, Aug 8, 2014 at 12:18 PM, Chen Wang chen.apache.s...@gmail.com wrote: sounds like a good idea! I think i will go with the high level consumer then. Another question along with this design is that is there a way to check the lag for a consumer group

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
small. Could you try with larger numbers, like 1? Guozhang On Fri, Aug 8, 2014 at 1:41 PM, Chen Wang chen.apache.s...@gmail.com wrote: Guozhang, I just did a simple test, and kafka does not seem to do what it is supposed to do: I put 20 messages numbered from 1 to 20 to a topic

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-08 Thread Chen Wang
Guozhang, Just curious, do you guys already have a java version of the ConsumerOffsetChecker https://github.com/apache/kafka/blob/0.8/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala so that I could use it in my storm topology? Chen On Fri, Aug 8, 2014 at 2:03 PM, Chen Wang

error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
Folks, I have a process that starts at a specific time and reads from a specific topic. I am currently using the High Level API (consumer group) to read from kafka (and it will stop once there is nothing in the topic, by specifying a timeout). I am most concerned about error recovery in multiple thread
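The multi-threaded layout in question can be sketched with one worker per stream; queues stand in for the `KafkaStream` iterators returned by `createMessageStreams` (an assumption, so the error-recovery parts of the real consumer are out of scope here):

```java
import java.util.*;
import java.util.concurrent.*;

public class ThreadPerStream {
    // One worker per stream, as the high-level consumer encourages; if a worker
    // dies, the group's automatic rebalance reassigns its partitions.
    static int consumeAll(List<? extends Queue<String>> streams) {
        ExecutorService pool = Executors.newFixedThreadPool(streams.size());
        try {
            List<Future<Integer>> counts = new ArrayList<>();
            for (Queue<String> s : streams) {
                counts.add(pool.submit(() -> {
                    int n = 0;
                    while (s.poll() != null) n++; // a real worker would also process and commit
                    return n;
                }));
            }
            int total = 0;
            for (Future<Integer> f : counts) total += f.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Queue<String>> streams = Arrays.asList(
                new ConcurrentLinkedQueue<>(Arrays.asList("a", "b")),
                new ConcurrentLinkedQueue<>(Arrays.asList("c")));
        System.out.println(consumeAll(streams)); // 3
    }
}
```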

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
to commit offset.. On Thu, Aug 7, 2014 at 4:54 PM, Guozhang Wang wangg...@gmail.com wrote: Hello Chen, With high-level consumer, the partition re-assignment is automatic upon consumer failures. Guozhang On Thu, Aug 7, 2014 at 4:41 PM, Chen Wang chen.apache.s...@gmail.com wrote: Folks, I

Re: error recovery in multiple thread reading from Kafka with HighLevel api

2014-08-07 Thread Chen Wang
of the partitions that the consumer is currently fetching, so there is no need to coordinate this operation. On Thu, Aug 7, 2014 at 5:03 PM, Chen Wang chen.apache.s...@gmail.com wrote: But with the auto commit turned on, I am risking off losing the failed message, right? should I turn off

Clean up Consumer Group

2014-02-24 Thread Chen Wang
Hi, It seems that my consumers cannot be shut down properly. I can still see many unused consumers on the portal. Is there a way to get rid of all these consumers? I tried to call shutdown explicitly, but without any luck. Any help appreciated. Chen

Kafka SimpleConsumer not working

2014-02-20 Thread Chen Wang
Hi, I am using kafka for the first time, and was running the sample from https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example However, I cannot read any data from kafka. The topic has 10 partitions, and I tried to read from each of them. The fetch can succeed, however, the

Re: Kafka SimpleConsumer not working

2014-02-20 Thread Chen Wang
Never mind. It was actually working. I just needed to wait a bit longer for data to come into the partition I was testing. Chen On Thu, Feb 20, 2014 at 2:41 PM, Chen Wang chen.apache.s...@gmail.com wrote: i am using 0.8.0. The high level api works as expected. dependency