[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-06 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660955#comment-14660955
 ] 

Onur Karaman commented on KAFKA-2400:
-

[~hachikuji]: Yeah, the reason behind the decoupling is valid, and it seems like a 
good idea.
[~guozhang]: Agreed. It seems like more things can go wrong by throttling 
heartbeats as opposed to setting a reasonable lower bound in ConsumerConfig.

> Expose heartbeat frequency in new consumer configuration
> 
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
> Fix For: 0.8.3
>
>
> The consumer coordinator communicates the need to rebalance through responses 
> to heartbeat requests sent from each member of the consumer group. The 
> heartbeat frequency therefore controls how long normal rebalances will take. 
> Currently, the frequency is hard-coded to 3 heartbeats per the configured 
> session timeout, but it would be nice to expose this setting so that the user 
> can control the impact from rebalancing.
> Since the consumer is currently single-threaded and heartbeats are sent in 
> poll(), we cannot guarantee that the heartbeats will actually be sent at the 
> configured frequency. In practice, the user may have to adjust their fetch 
> size to ensure that poll() is called often enough to get the desired 
> heartbeat frequency. For most users, the consumption rate is probably fast 
> enough for this not to matter, but we should make the documentation clear on 
> this point. In any case, we expect that most users will accept the default 
> value.
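
A minimal sketch of the point made in the description, assuming a simplified model in which a heartbeat can only be sent at the moment poll() is called. The class and method names are hypothetical and are not taken from the Kafka code base; only the "3 heartbeats per session timeout" rule comes from the ticket.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Kafka code: heartbeats can only go out when poll() runs.
final class PollDrivenHeartbeat {

    // The rule described in the ticket: 3 heartbeats per configured session timeout.
    static long hardCodedIntervalMs(long sessionTimeoutMs) {
        return sessionTimeoutMs / 3;
    }

    // Times (in ms) at which a heartbeat is actually sent when poll() is called
    // every pollEveryMs and a heartbeat becomes due every intervalMs.
    static List<Long> sendTimes(long intervalMs, long pollEveryMs, long untilMs) {
        List<Long> sent = new ArrayList<>();
        long lastSent = 0;
        for (long now = pollEveryMs; now <= untilMs; now += pollEveryMs) {
            if (now - lastSent >= intervalMs) {
                sent.add(now);
                lastSent = now;
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        System.out.println(hardCodedIntervalMs(30_000));        // 10000
        // poll() every 1 s: heartbeats go out at the configured 3 s interval.
        System.out.println(sendTimes(3_000, 1_000, 10_000));    // [3000, 6000, 9000]
        // poll() every 5 s: the effective interval degrades to 5 s.
        System.out.println(sendTimes(3_000, 5_000, 20_000));    // [5000, 10000, 15000, 20000]
    }
}
```

In this model the configured interval is only a lower bound on the gap between heartbeats; the time between poll() calls determines what is actually achieved.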



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-06 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660950#comment-14660950
 ] 

Jason Gustafson commented on KAFKA-2400:


[~onurkaraman], [~guozhang] One related problem that I didn't think about is 
that the current consumer allows multiple pending heartbeats to be transmitted 
to the coordinator. If some request takes longer than normal on the server 
(maybe a commit for example), then the heartbeats might pile up. It seems 
unlikely to be a major issue as long as the heartbeat interval is reasonable, 
but it should be pretty easy to fix.
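
One way the fix could look, as a rough sketch only: allow at most one heartbeat to be outstanding at a time. The class below is hypothetical and is not the KafkaConsumer implementation; it just illustrates the guard.

```java
// Hypothetical sketch, not the KafkaConsumer implementation.
final class SingleInFlightHeartbeat {
    private final long intervalMs;
    private long lastSendMs;
    private boolean inFlight = false;

    SingleInFlightHeartbeat(long intervalMs, long nowMs) {
        this.intervalMs = intervalMs;
        this.lastSendMs = nowMs;
    }

    // True only when the interval has elapsed and no earlier heartbeat
    // is still waiting for its response.
    boolean shouldSend(long nowMs) {
        return !inFlight && nowMs - lastSendMs >= intervalMs;
    }

    void onSend(long nowMs) {
        inFlight = true;
        lastSendMs = nowMs;
    }

    void onResponse() {
        inFlight = false;
    }

    public static void main(String[] args) {
        SingleInFlightHeartbeat hb = new SingleInFlightHeartbeat(3_000, 0);
        System.out.println(hb.shouldSend(3_000)); // true
        hb.onSend(3_000);
        // The server is slow (e.g. busy with a commit): nothing piles up.
        System.out.println(hb.shouldSend(9_000)); // false
        hb.onResponse();
        System.out.println(hb.shouldSend(9_000)); // true
    }
}
```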



[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660921#comment-14660921
 ] 

Guozhang Wang commented on KAFKA-2400:
--

Hey [~onurkaraman], yeah, that is indeed an issue. However, after thinking it 
over, I feel that for different types of consumer clients these approaches may 
end up with the same effect:

1. Malicious client: a consumer could always claim "my heartbeat frequency is 
X" upon join-group but actually send a heartbeat every 1ms. For this case I 
think the only shield would be throttling; i.e., the protocol between the 
coordinator and the consumer does not really help here.

2. Mis-configured client: a consumer could mistakenly configure its heartbeat 
interval too small; a minimum heartbeat interval could help in this case, while 
throttling might just result in the same effect.

3. Not-care client: they will not override the defaults at all, so all we need 
to do is to make sure the default values are reasonable.

A side note is that we'd better be careful about throttling heartbeats, since 
this could increase the false positives of consumer failure detection if we 
throttle heartbeats the wrong way.
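
For the mis-configured case, a lower bound checked on the client side could look roughly like the sketch below. The class name and the 100 ms floor are hypothetical; the thread does not settle on a value, and this is not the actual ConsumerConfig validation.

```java
// Hypothetical sketch of a client-side lower bound; not the ConsumerConfig code.
final class HeartbeatIntervalCheck {
    // Assumed floor for illustration only; no value was agreed in the thread.
    static final long MIN_HEARTBEAT_INTERVAL_MS = 100;

    // Returns the interval unchanged, or rejects it when it is below the floor.
    static long validate(long heartbeatIntervalMs) {
        if (heartbeatIntervalMs < MIN_HEARTBEAT_INTERVAL_MS) {
            throw new IllegalArgumentException(
                "heartbeat interval " + heartbeatIntervalMs + " ms is below the minimum of "
                    + MIN_HEARTBEAT_INTERVAL_MS + " ms");
        }
        return heartbeatIntervalMs;
    }

    public static void main(String[] args) {
        System.out.println(validate(3_000)); // 3000
        try {
            validate(1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this fails fast at configuration time, whereas throttling on the coordinator acts on live heartbeats and so carries the false-positive risk mentioned above.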



[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-06 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660903#comment-14660903
 ] 

Jason Gustafson commented on KAFKA-2400:


[~onurkaraman] The goal of the ticket was specifically to decouple the 
heartbeat frequency from the session timeout to allow longish session timeouts 
but still have quick expected rebalance times. I think this is a helpful 
feature for users who want to limit the impact from rebalances. Since 
heartbeats are pretty cheap, I'm not too concerned about hammering the 
server, but perhaps it would help to have a minimum value? I also wouldn't be 
opposed to having a hard-coded value that was fairly low.



[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-06 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660896#comment-14660896
 ] 

Onur Karaman commented on KAFKA-2400:
-

It seems that this change could increase the chances for a consumer to hammer 
the coordinator with heartbeats now.

Previously, the heartbeat interval was tied to the session timeouts, so the 
coordinator's min and max session timeouts would in a way limit the heartbeat 
interval. With this patch dissociating the heartbeat interval from session 
timeout, the coordinator's min and max session timeouts no longer help shield 
against this. Is this a concern?
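
To make the old coupling concrete, here is a small sketch of the arithmetic, assuming the "3 heartbeats per session timeout" rule from the ticket. The 6 s and 30 s session timeout limits are hypothetical example values, not settings quoted in this thread.

```java
// Hypothetical sketch of the implied bounds under the old, coupled behaviour.
final class CoupledIntervalBounds {

    // With interval = sessionTimeout / 3, the coordinator's session timeout
    // limits translate directly into limits on the heartbeat interval.
    static long[] bounds(long minSessionTimeoutMs, long maxSessionTimeoutMs) {
        return new long[] {minSessionTimeoutMs / 3, maxSessionTimeoutMs / 3};
    }

    public static void main(String[] args) {
        long[] b = bounds(6_000, 30_000); // example limits, assumed for illustration
        System.out.println(b[0] + " ms .. " + b[1] + " ms"); // 2000 ms .. 10000 ms
    }
}
```

Once the interval is configured independently, no such derived bound exists, which is the gap raised here.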



[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660810#comment-14660810
 ] 

ASF GitHub Bot commented on KAFKA-2400:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/116




[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658686#comment-14658686
 ] 

ASF GitHub Bot commented on KAFKA-2400:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/116

KAFKA-2400; expose heartbeat interval in KafkaConsumer configuration



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-2400

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #116


commit 3c1b1dd0dc44cd454d02aa7c476825c2ba46
Author: Jason Gustafson 
Date:   2015-08-05T18:52:35Z

KAFKA-2400; expose heartbeat interval in KafkaConsumer configuration




> Expose heartbeat frequency in new consumer configuration
> 
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
>
> The consumer coordinator communicates the need to rebalance through responses 
> to heartbeat requests sent from each member of the consumer group. The 
> heartbeat frequency therefore controls how long normal rebalances will take. 
> Currently, the frequency is hard-coded to 3 heartbeats per the configured 
> session timeout, but it would be nice to expose this setting so that the user 
> can control the impact from rebalancing.
> Since the consumer is currently single-threaded and heartbeats are sent in 
> poll(), we cannot guarantee that the heartbeats will actually be sent at the 
> configured frequency. In practice, the user may have to adjust their fetch 
> size to ensure that poll() is called often enough to get the desired 
> heartbeat frequency. For most users, the consumption rate is probably fast 
> enough for this not to matter, but we should make the documentation clear on 
> this point. In any case, we expect that most users will accept the default 
> value.





[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-04 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654050#comment-14654050
 ] 

Jason Gustafson commented on KAFKA-2400:


[~jkreps] I was thinking defaults in the ballpark of 30s for session timeout, 
and 1-5s for heartbeat. 300ms seems a little short, but perhaps it's not 
unreasonable since heartbeats are so cheap.
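
As an illustration of these ballpark figures, the sketch below sets both values in a plain java.util.Properties object. The key names session.timeout.ms and heartbeat.interval.ms are assumed to match the consumer configuration; the values are only the rough numbers floated in this comment (3 s picked from the 1-5 s range), not agreed defaults.

```java
import java.util.Properties;

// Illustrative only: ballpark values from the discussion, not final defaults.
final class ProposedDefaults {

    static Properties build() {
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", "30000");   // ~30 s session timeout
        props.setProperty("heartbeat.interval.ms", "3000"); // within the 1-5 s range
        return props;
    }

    // How many heartbeats fit into one session timeout with these settings.
    static long heartbeatsPerSession(Properties props) {
        long session = Long.parseLong(props.getProperty("session.timeout.ms"));
        long interval = Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        return session / interval;
    }

    public static void main(String[] args) {
        System.out.println(heartbeatsPerSession(build())); // 10
    }
}
```

With these numbers a consumer gets roughly ten heartbeat opportunities per session timeout instead of the three allowed by the hard-coded rule.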

> Expose heartbeat frequency in new consumer configuration
> 
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
>
> The consumer coordinator communicates the need to rebalance through responses 
> to heartbeat requests sent from each member of the consumer group. The 
> heartbeat frequency therefore controls how long normal rebalances will take. 
> Currently, the frequency is hard-coded to 3 heartbeats per the configured 
> session timeout, but it would be nice to expose this setting so that the user 
> can control the impact from rebalancing.
> Since the consumer is currently single-threaded and heartbeats are sent in 
> poll(), we cannot guarantee that the heartbeats will actually be sent at the 
> configured frequency. In practice, the user may have to adjust their fetch 
> size to ensure that poll() is called often enough to get the desired 
> heartbeat frequency. For most users, the consumption rate is probably fast 
> enough for this not to matter, but we should make the documentation clear on 
> this point. In any case, we expect that most users will accept the default 
> value.





[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration

2015-08-03 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652663#comment-14652663
 ] 

Jay Kreps commented on KAFKA-2400:
--

Also, we should set good defaults: 
- session timeout should probably default to something pretty high; this will 
mean a longer time to detect true failures but no false positives or churning. 
Those who want faster detection can tune it down appropriately (most won't care).
- reasonable heartbeat frequency (300 ms?).

> Expose heartbeat frequency in new consumer configuration
> 
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
>
> The consumer coordinator communicates the need to rebalance through responses 
> to heartbeat requests sent from each member of the consumer group. The 
> heartbeat frequency therefore controls how long normal rebalances will take. 
> Currently, the frequency is hard-coded to 3 heartbeats per the configured 
> session timeout, but it would be nice to expose this setting so that the user 
> can control the impact from rebalancing.
> Since the consumer is currently single-threaded and heartbeats are sent in 
> poll(), we cannot guarantee that the heartbeats will actually be sent at the 
> configured frequency. In practice, the user may have to adjust their fetch 
> size to ensure that poll() is called often enough to get the desired 
> heartbeat frequency. For most users, the consumption rate is probably fast 
> enough for this not to matter, but we should make the documentation clear on 
> this point. In any case, we expect that most users will accept the default 
> value.


