[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660955#comment-14660955 ]

Onur Karaman commented on KAFKA-2400:

[~hachikuji]: Yeah the reason behind the decoupling is valid and seems like a good idea.

[~guozhang]: Agreed. It seems like more things can go wrong by throttling heartbeats as opposed to setting a reasonable lower bound in ConsumerConfig.

> Expose heartbeat frequency in new consumer configuration
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
> Fix For: 0.8.3
>
> The consumer coordinator communicates the need to rebalance through responses
> to heartbeat requests sent from each member of the consumer group. The
> heartbeat frequency therefore controls how long normal rebalances will take.
> Currently, the frequency is hard-coded to 3 heartbeats per the configured
> session timeout, but it would be nice to expose this setting so that the user
> can control the impact from rebalancing.
>
> Since the consumer is currently single-threaded and heartbeats are sent in
> poll(), we cannot guarantee that the heartbeats will actually be sent at the
> configured frequency. In practice, the user may have to adjust their fetch
> size to ensure that poll() is called often enough to get the desired
> heartbeat frequency. For most users, the consumption rate is probably fast
> enough for this not to matter, but we should make the documentation clear on
> this point. In any case, we expect that most users will accept the default
> value.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
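To make the issue description concrete, here is a minimal sketch of the difference between the hard-coded ratio (3 heartbeats per session timeout) and an exposed setting. It is not the actual KafkaConsumer code. The property keys `session.timeout.ms` and `heartbeat.interval.ms`, the 30000 ms fallback, and all class and method names are assumptions made for illustration; the thread does not name the configuration key.

```java
import java.util.Properties;

// Illustrative only: not taken from the Kafka code base.
class HeartbeatIntervalSketch {
    // The issue description says the frequency is hard-coded to
    // 3 heartbeats per configured session timeout.
    static final int HEARTBEATS_PER_SESSION_TIMEOUT = 3;

    // Interval implied by the hard-coded ratio.
    static long derivedIntervalMs(long sessionTimeoutMs) {
        return sessionTimeoutMs / HEARTBEATS_PER_SESSION_TIMEOUT;
    }

    // Interval once the setting is exposed: use the explicit value if the user
    // set one, otherwise fall back to the derived value.
    // Both key names and the 30000 ms fallback are assumptions.
    static long effectiveIntervalMs(Properties props) {
        long sessionTimeoutMs =
                Long.parseLong(props.getProperty("session.timeout.ms", "30000"));
        String explicit = props.getProperty("heartbeat.interval.ms");
        return explicit != null
                ? Long.parseLong(explicit)
                : derivedIntervalMs(sessionTimeoutMs);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", "30000");
        System.out.println(effectiveIntervalMs(props)); // prints 10000

        props.setProperty("heartbeat.interval.ms", "3000");
        System.out.println(effectiveIntervalMs(props)); // prints 3000
    }
}
```

With the setting exposed, a long session timeout no longer forces a long wait between heartbeats, which is the decoupling the ticket asks for.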
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660950#comment-14660950 ]

Jason Gustafson commented on KAFKA-2400:

[~onurkaraman], [~guozhang] One related problem that I didn't think about is that the current consumer allows multiple pending heartbeats to be transmitted to the coordinator. If some request takes longer than normal on the server (a commit, for example), then the heartbeats might pile up. It seems unlikely to be a major issue as long as the heartbeat interval is reasonable, but it should be pretty easy to fix.
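One possible way to avoid the pile-up described above is to allow at most one heartbeat in flight at a time. The sketch below shows that idea only; it is not the fix that was applied in Kafka, and the class and method names are hypothetical.

```java
// Hypothetical heartbeat gate; names are illustrative, not Kafka's.
class SingleInFlightHeartbeat {
    private final long intervalMs;
    private long lastSendMs;
    private boolean inFlight = false;

    SingleInFlightHeartbeat(long intervalMs, long startMs) {
        this.intervalMs = intervalMs;
        this.lastSendMs = startMs; // the first heartbeat is due one interval after start
    }

    // Called from the poll loop: true only if the interval has elapsed
    // and no earlier heartbeat is still waiting for its response.
    boolean shouldSend(long nowMs) {
        return !inFlight && nowMs - lastSendMs >= intervalMs;
    }

    // Record that a heartbeat was sent at nowMs.
    void onSend(long nowMs) {
        inFlight = true;
        lastSendMs = nowMs;
    }

    // Record that the coordinator answered the pending heartbeat.
    void onResponse() {
        inFlight = false;
    }
}
```

If the coordinator is slow to respond, `shouldSend` keeps returning false, so a delayed response costs at most one outstanding request instead of a growing queue.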
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660921#comment-14660921 ]

Guozhang Wang commented on KAFKA-2400:

Hey [~onurkaraman], yeah, that is indeed an issue. However, after thinking it over, I feel that for the different types of consumer clients these approaches may end up having the same effect:

1. Malicious client: a consumer could always claim "my heartbeat frequency is X" upon join-group but actually send a heartbeat every 1ms. For this case I think the only shield would be throttling; i.e., a protocol between the coordinator and the consumer does not really help here.
2. Mis-configured client: a consumer could mistakenly configure its heartbeat interval too small; the min heartbeat could help in this case, while throttling might just result in the same effect.
3. Not-care client: they will not override the defaults at all, so all we need to do is make sure the default values are reasonable.

A side note is that we'd better be careful about throttling heartbeats, since this could increase the false positives of consumer failure if we throttle heartbeats the wrong way.
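For the mis-configured client case, a lower bound can be enforced when the configuration is parsed. The following is a minimal sketch of such a check. The 100 ms minimum is a made-up value, the names are hypothetical, and the second check (interval shorter than the session timeout) is an added assumption that the thread does not discuss.

```java
// Hypothetical config validation; the bound and the names are illustrative.
class HeartbeatConfigCheck {
    // Assumed lower bound; the thread does not propose a concrete number.
    static final long MIN_HEARTBEAT_INTERVAL_MS = 100;

    // Returns the interval unchanged if it is acceptable, otherwise throws.
    static long validate(long heartbeatIntervalMs, long sessionTimeoutMs) {
        if (heartbeatIntervalMs < MIN_HEARTBEAT_INTERVAL_MS) {
            throw new IllegalArgumentException(
                    "heartbeat interval " + heartbeatIntervalMs
                    + " ms is below the minimum of " + MIN_HEARTBEAT_INTERVAL_MS + " ms");
        }
        // Assumption: a heartbeat that is not sent within the session timeout
        // cannot keep the session alive, so reject that combination too.
        if (heartbeatIntervalMs >= sessionTimeoutMs) {
            throw new IllegalArgumentException(
                    "heartbeat interval must be shorter than the session timeout");
        }
        return heartbeatIntervalMs;
    }
}
```

A check like this rejects the accidental "1 ms" setting up front, but as noted above it does nothing against a client that deliberately ignores its own configuration.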
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660903#comment-14660903 ]

Jason Gustafson commented on KAFKA-2400:

[~onurkaraman] The goal of the ticket was specifically to decouple the heartbeat frequency from the session timeout, to allow longish session timeouts while still having quick expected rebalance times. I think this is a helpful feature for users who want to limit the impact from rebalances. Since heartbeats are pretty cheap, I'm not too concerned about hammering the server, but perhaps it would help to have a minimum value? I also wouldn't be opposed to having a hard-coded value that was fairly low.
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660896#comment-14660896 ]

Onur Karaman commented on KAFKA-2400:

It seems that this change could increase the chances for a consumer to hammer the coordinator with heartbeats now. Previously, the heartbeat interval was tied to the session timeouts, so the coordinator's min and max session timeouts would in a way limit the heartbeat interval. With this patch dissociating the heartbeat interval from session timeout, the coordinator's min and max session timeouts no longer help shield against this. Is this a concern?
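The implicit limit mentioned above follows from the hard-coded ratio of 3 heartbeats per session timeout. A small sketch of the arithmetic, assuming hypothetical coordinator limits of 6000 ms and 30000 ms (neither value comes from the thread):

```java
// Illustrative arithmetic only; the coordinator limits used below are assumptions.
class CoupledIntervalBounds {
    static final int HEARTBEATS_PER_SESSION_TIMEOUT = 3;

    // Returns {minIntervalMs, maxIntervalMs} implied by the coordinator's
    // session timeout limits while the interval is tied to the session timeout.
    static long[] impliedIntervalBoundsMs(long minSessionTimeoutMs, long maxSessionTimeoutMs) {
        return new long[] {
            minSessionTimeoutMs / HEARTBEATS_PER_SESSION_TIMEOUT,
            maxSessionTimeoutMs / HEARTBEATS_PER_SESSION_TIMEOUT
        };
    }

    public static void main(String[] args) {
        long[] bounds = impliedIntervalBoundsMs(6000, 30000);
        System.out.println(bounds[0] + " ms to " + bounds[1] + " ms"); // prints 2000 ms to 10000 ms
    }
}
```

Once the interval is configured independently, no such bound falls out of the session timeout limits, which is why the question of an explicit minimum comes up.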
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660810#comment-14660810 ]

ASF GitHub Bot commented on KAFKA-2400:

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/116
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658686#comment-14658686 ]

ASF GitHub Bot commented on KAFKA-2400:

GitHub user hachikuji opened a pull request:

    https://github.com/apache/kafka/pull/116

    KAFKA-2400; expose heartbeat interval in KafkaConsumer configuration

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hachikuji/kafka KAFKA-2400

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #116

commit 3c1b1dd0dc44cd454d02aa7c476825c2ba46
Author: Jason Gustafson
Date: 2015-08-05T18:52:35Z

    KAFKA-2400; expose heartbeat interval in KafkaConsumer configuration

> Expose heartbeat frequency in new consumer configuration
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654050#comment-14654050 ]

Jason Gustafson commented on KAFKA-2400:

[~jkreps] I was thinking defaults in the ballpark of 30s for session timeout, and 1-5s for heartbeat. 300ms seems a little short, but perhaps it's not unreasonable since heartbeats are so cheap.

> Expose heartbeat frequency in new consumer configuration
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
[jira] [Commented] (KAFKA-2400) Expose heartbeat frequency in new consumer configuration
[ https://issues.apache.org/jira/browse/KAFKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652663#comment-14652663 ]

Jay Kreps commented on KAFKA-2400:

Also we should set good defaults:
- session timeout should probably default to something pretty high. This will mean a longer time to detect true failures but no false positives or churning; those who want faster detection can tune it down appropriately (most won't care).
- reasonable heartbeat frequency (300 ms?).

> Expose heartbeat frequency in new consumer configuration
>
> Key: KAFKA-2400
> URL: https://issues.apache.org/jira/browse/KAFKA-2400
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Minor
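To compare the defaults floated in this thread (a 30 s session timeout with a heartbeat interval somewhere between 300 ms and 5 s), a rough load estimate may help. The sketch below is plain arithmetic under the assumption that every consumer sends exactly one heartbeat per interval. The group size of 1000 consumers and all names are hypothetical.

```java
// Back-of-the-envelope estimate; the consumer count is a made-up example.
class HeartbeatLoad {
    // Heartbeat requests per second arriving at the coordinator,
    // assuming each consumer sends one heartbeat per interval.
    static double heartbeatsPerSecond(int consumers, long intervalMs) {
        return consumers * 1000.0 / intervalMs;
    }

    // How many heartbeats fit into one session timeout.
    static long heartbeatsPerSessionTimeout(long sessionTimeoutMs, long intervalMs) {
        return sessionTimeoutMs / intervalMs;
    }

    public static void main(String[] args) {
        int consumers = 1000; // hypothetical total across all groups
        for (long intervalMs : new long[] {300, 1000, 5000}) {
            System.out.printf("interval=%d ms: %.1f req/s, %d heartbeats per 30 s session%n",
                    intervalMs,
                    heartbeatsPerSecond(consumers, intervalMs),
                    heartbeatsPerSessionTimeout(30000, intervalMs));
        }
    }
}
```

Under these assumptions a 300 ms interval produces roughly ten times the request rate of a 3 s interval, so whether it is acceptable depends on how cheap a heartbeat really is on the coordinator.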