[jira] [Commented] (KAFKA-2758) Improve Offset Commit Behavior
[ https://issues.apache.org/jira/browse/KAFKA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013271#comment-17013271 ]

Guozhang Wang commented on KAFKA-2758:
--

We used to put this on hold, especially for 1), since KIP-211 was not merged yet. Even now that KIP-211 is merged we should be careful, since a newer-versioned client may talk to an older-versioned broker (2.0-) that does not have KIP-211 yet. We have some plans for automatically detecting broker versions, so I'd suggest we not pick up this ticket before that lands.

> Improve Offset Commit Behavior
> --
>
> Key: KAFKA-2758
> URL: https://issues.apache.org/jira/browse/KAFKA-2758
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Guozhang Wang
> Priority: Major
> Labels: newbie, reliability
>
> There are two scenarios of offset committing that we can improve:
> 1) We can filter out the partitions whose committed offset equals the
> consumed offset, meaning no new messages have been consumed from that
> partition, and hence we do not need to include it in the commit request.
> 2) We can make a commit request right after resetting to a fetch/consume
> position, either according to the reset policy (e.g. on consumer startup,
> or when handling an out-of-range offset) or through {code}seek{code},
> so that if the consumer fails right after these events, upon recovery it
> can restart from the reset position instead of resetting again. Otherwise
> this can lead to, for example, data loss if we use "largest" as the reset
> policy while new messages are arriving at the fetched partitions.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
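As a rough illustration of scenario 1) above, the filtering step could look like the following minimal, self-contained sketch. The `CommitFilter` class, the partition-name strings, and the plain `Map<String, Long>` offset model are illustrative stand-ins for the consumer's actual `TopicPartition`/`OffsetAndMetadata` bookkeeping, not the real client internals.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of scenario 1: before building a commit request, drop partitions
// whose consumed position has not advanced past the last committed offset.
public class CommitFilter {
    static Map<String, Long> filterUnchanged(Map<String, Long> consumed,
                                             Map<String, Long> lastCommitted) {
        Map<String, Long> toCommit = new HashMap<>();
        for (Map.Entry<String, Long> e : consumed.entrySet()) {
            Long committed = lastCommitted.get(e.getKey());
            // Include the partition only if nothing was committed yet or
            // the consumed position moved beyond the committed offset.
            if (committed == null || e.getValue() > committed) {
                toCommit.put(e.getKey(), e.getValue());
            }
        }
        return toCommit;
    }

    public static void main(String[] args) {
        Map<String, Long> consumed = new HashMap<>();
        consumed.put("topic-0", 42L);
        consumed.put("topic-1", 100L);
        Map<String, Long> committed = new HashMap<>();
        committed.put("topic-0", 42L); // unchanged: skipped
        committed.put("topic-1", 90L); // advanced: committed
        System.out.println(filterUnchanged(consumed, committed)); // prints {topic-1=100}
    }
}
```

For a group like MirrorMaker that subscribes to many mostly idle partitions, this kind of filtering shrinks the commit request to only the partitions that actually moved; the caveat discussed in the comments is that skipped partitions then stop refreshing their offset-retention timestamps on brokers without KIP-211.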
[ https://issues.apache.org/jira/browse/KAFKA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012672#comment-17012672 ]

highluck commented on KAFKA-2758:
-

[~guozhang] Is this issue still open?
[ https://issues.apache.org/jira/browse/KAFKA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895953#comment-16895953 ]

Omkar Mestry commented on KAFKA-2758:
-

Is this issue still open, and if so, can I assign it to myself?
[ https://issues.apache.org/jira/browse/KAFKA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722593#comment-16722593 ]

Richard Yu commented on KAFKA-2758:
---

I just want to note that this issue might no longer be relevant; it is quite old.
[ https://issues.apache.org/jira/browse/KAFKA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240796#comment-16240796 ]

Jeff Widman commented on KAFKA-2758:

Item 1 would be significantly more useful if [KIP-211](https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets) gets accepted. That would remove the risk of accidentally expiring a consumer's offsets.
[ https://issues.apache.org/jira/browse/KAFKA-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240608#comment-16240608 ]

Guozhang Wang commented on KAFKA-2758:
--

[~jjkoshy] That's a good point. The main motivation for 1) is services like MirrorMaker (MM), where a commit request may contain a large number of partitions, many of which carry the same offsets; the hope is to reduce the request size in such scenarios. I'm wondering whether this is still a good trade-off given the complexity of modifying the server-side offset-commit handling to update the timestamps for this group id (I think that primarily depends on how much network bandwidth we can save in practice).