Re: Consumer rebalancing retry settings and reconnecting after failure

2014-07-14 Thread Guozhang Wang
Hi Michal, Restarting the consumer should be easy to implement in a script. The reason for not implementing this inside the Kafka consumer is to avoid masking any potential issues or bugs that cause the consumer to stop. Guozhang On Mon, Jul 14, 2014 at 1:45 AM, Michal Michalski < michal.michal...@boxever.com>
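A minimal sketch of such a restart script, assuming the consumer runs as a separate process that exits with a non-zero status when it gives up. The function name `supervise` and its parameters are hypothetical and not part of Kafka. Each failure is logged before the restart, so the underlying problem stays visible rather than being hidden.

```python
import subprocess
import sys
import time
from typing import List, Optional


def supervise(command: List[str],
              restart_delay_s: float = 5.0,
              max_restarts: Optional[int] = None) -> int:
    """Run `command`, restarting it whenever it exits with a non-zero status.

    Returns the number of restarts performed. All names here are
    illustrative assumptions, not a Kafka API.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0:
            # Clean shutdown: do not restart.
            return restarts
        # Report the failure so the root cause is not silently lost.
        print(f"consumer exited with code {exit_code}", file=sys.stderr)
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(restart_delay_s)
```

For example, `supervise(["./run-consumer.sh"], restart_delay_s=10.0)` would keep relaunching a hypothetical launcher script until it exits cleanly.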

Re: Consumer rebalancing retry settings and reconnecting after failure

2014-07-14 Thread Michal Michalski
Hi Guozhang, OK, I spent some time getting a better understanding of how Kafka uses ZooKeeper and how sessions are handled, and it seems that the change you proposed should do the job. Thanks :-) But I still think that an (optional?) automatic restart of a consumer could be a good idea! ;-) M. Kind regard

Re: Consumer rebalancing retry settings and reconnecting after failure

2014-07-11 Thread Guozhang Wang
Hi Michal, In your case you could try increasing the ZooKeeper session timeout on the consumer side (the default is 6 sec) and see if this is sufficient to cover the latency jitter. Guozhang On Fri, Jul 11, 2014 at 5:25 AM, Michal Michalski < michal.michal...@boxever.com> wrote: > Hey Guoz
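As an illustration of that suggestion, the sketch below renders a consumer properties fragment that raises the session timeout. It assumes the ZooKeeper-based consumer's `zookeeper.session.timeout.ms` setting is the one meant here; the value 30000 is only an example and not a recommendation from the thread, and `render_properties` is a hypothetical helper.

```python
from typing import Dict


def render_properties(props: Dict[str, object]) -> str:
    """Render key/value pairs in Java .properties style, one per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Illustrative override: the thread gives the default as 6 sec (6000 ms).
consumer_overrides = {
    "zookeeper.session.timeout.ms": 30000,  # example value, an assumption
}

properties_text = render_properties(consumer_overrides)
```

The resulting text can be appended to whatever properties file the consumer is started with.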

Re: Consumer rebalancing retry settings and reconnecting after failure

2014-07-11 Thread Michal Michalski
Hey Guozhang, Thanks for the reply. I get your point on "hiding" some issues, but I'd prefer to separate recovery from reporting a failure. Also, I think that if a simple restart is a possible solution, it shouldn't require implementing it separately or, what's even worse, manual intervention. Maybe I'l

Re: Consumer rebalancing retry settings and reconnecting after failure

2014-07-10 Thread Guozhang Wang
Hi Michal, The rebalance will only be triggered on consumer membership or topic/partition changes. Once triggered, it will try to finish the rebalance at most rebalance.max.retries times, i.e. if it fails it will wait for rebalance.backoff.ms and then try again until number of retries exhauste
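The retry behaviour described in this message can be sketched roughly as follows. This is a simplified model, not Kafka's actual implementation: `attempt_rebalance` is a hypothetical callable that returns True on success, and the two keyword arguments stand in for `rebalance.max.retries` and `rebalance.backoff.ms`. Whether the real consumer also backs off after the final failed attempt is not stated here; this sketch does not.

```python
import time
from typing import Callable


def rebalance_with_retries(
    attempt_rebalance: Callable[[], bool],
    max_retries: int = 4,      # stands in for rebalance.max.retries
    backoff_ms: int = 2000,    # stands in for rebalance.backoff.ms
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to rebalance up to max_retries times, backing off between failures.

    Returns True as soon as one attempt succeeds, False once the
    retries are exhausted.
    """
    for attempt in range(1, max_retries + 1):
        if attempt_rebalance():
            return True
        if attempt < max_retries:
            # Wait before the next attempt (simplification: no wait after the last).
            sleep(backoff_ms / 1000.0)
    return False
```

Passing `sleep` as a parameter is only a convenience for experimenting with the loop without actually waiting.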

Consumer rebalancing retry settings and reconnecting after failure

2014-07-10 Thread Michal Michalski
Hi, Just wondering - is there any reason why rebalance.max.retries is 4 by default? Is there any good reason why I shouldn't expect my consumers to keep trying to rebalance for minutes (e.g. 30 retries every 6 seconds), rather than seconds (4 retries every 2 seconds by default)? Also, if my consu
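To make the "minutes rather than seconds" comparison concrete, here is a rough calculation of the retry window for both settings. It assumes each retry is followed by one backoff period and ignores the time the rebalance attempts themselves take; `retry_window_seconds` is a hypothetical helper.

```python
def retry_window_seconds(max_retries: int, backoff_ms: int) -> float:
    """Approximate total backoff time, assuming one backoff per retry."""
    return max_retries * backoff_ms / 1000.0


# Defaults quoted in the question: 4 retries, 2 seconds apart.
default_window = retry_window_seconds(4, 2000)

# Values proposed in the question: 30 retries, 6 seconds apart.
proposed_window = retry_window_seconds(30, 6000)
```

Under these assumptions the defaults give about 8 seconds of retrying, while the proposed values give about 180 seconds (3 minutes).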