[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185926#comment-15185926 ]

ASF GitHub Bot commented on KAFKA-3197:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/857

> Producer can send message out of order even when in flight request is set to 1.
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-3197
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3197
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, producer
>    Affects Versions: 0.9.0.0
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>            Priority: Blocker
>             Fix For: 0.10.0.0
>
> The issue we saw is the following:
> 1. The producer sends message 0 to topic-partition-0 on broker A. The number of in-flight requests to broker A is 1.
> 2. The request is somehow lost.
> 3. The producer refreshes its topic metadata and finds that the leader of topic-partition-0 has migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B, all the subsequent messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on, when the request in step (1) times out, message 0 is retried and sent to broker B. At this point, all the later messages have already been sent, so we have reordering.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
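The five steps in the issue description can be replayed in a small simulation. This is only a sketch of the scenario, not Kafka client code: the class name ReorderSimulation, the run() method and the mutePartitionWhileInFlight flag are hypothetical names introduced for illustration, and the "brokers" are plain in-memory collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of the scenario in the issue description; this is NOT Kafka client code.
// All class and method names here are hypothetical.
class ReorderSimulation {

    /**
     * Replays steps 1-5 and returns the order in which broker B appends messages.
     * If mutePartitionWhileInFlight is true, the partition is not drained while
     * one of its batches is still unacknowledged.
     */
    static List<Integer> run(boolean mutePartitionWhileInFlight) {
        Deque<Integer> accumulator = new ArrayDeque<>(Arrays.asList(0, 1, 2, 3));
        List<Integer> brokerBLog = new ArrayList<>();

        // Steps 1-2: message 0 is sent to broker A and the request is lost.
        Integer inFlightToA = accumulator.poll();

        // Steps 3-4: a metadata refresh shows broker B as the new leader.
        // There is no in-flight request to B, so a per-connection limit of 1
        // does not stop the remaining messages from being sent there.
        if (!mutePartitionWhileInFlight) {
            while (!accumulator.isEmpty()) {
                brokerBLog.add(accumulator.poll());
            }
        }

        // Step 5: the request to broker A times out; message 0 goes back to the
        // front of the queue and is retried against broker B.
        accumulator.addFirst(inFlightToA);
        while (!accumulator.isEmpty()) {
            brokerBLog.add(accumulator.poll());
        }
        return brokerBLog;
    }

    public static void main(String[] args) {
        System.out.println("per-connection limit only: " + run(false));
        System.out.println("partition held back:       " + run(true));
    }
}
```

With the per-connection limit alone, message 0 lands after messages 1 to 3. Holding the partition back until its outstanding batch is resolved keeps the original order, at the cost of waiting for the timeout.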
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174085#comment-15174085 ]

Ismael Juma commented on KAFKA-3197:
------------------------------------

[~jjkoshy], since the patch doesn't introduce any new configuration and it's relatively simple, I think it's fine to go with this approach for now. If we find a better way in the future, we can consider it then.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168065#comment-15168065 ]

Joel Koshy commented on KAFKA-3197:
-----------------------------------

Hi [~fpj], so your suggestion is to preemptively invoke the retransmission logic to retry the affected partitions? It can be done, but I think it would necessitate some weird APIs in {{InFlightRequests}} since, as Becket notes, we would also need to proactively fish out partitions from {{InFlightRequests}} and retry those on the new leader.

I'm +1 on the patch apart from the minor comments, but will leave this open for a few more days in case anyone has further concerns or better ideas.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134760#comment-15134760 ]

Ismael Juma commented on KAFKA-3197:
------------------------------------

[~becket_qin], the plan is to release 0.9.0.1 next week and since the details of how to fix this are still being discussed, do you agree that we should target 0.9.1.0 instead?
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134836#comment-15134836 ]

Jiangjie Qin commented on KAFKA-3197:
-------------------------------------

[~ijuma] Sure, we can target 0.9.1.0. Last-minute changes are always unwanted.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133305#comment-15133305 ]

Jiangjie Qin commented on KAFKA-3197:
-------------------------------------

Thanks [~fpj].
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133265#comment-15133265 ]

Flavio Junqueira commented on KAFKA-3197:
-----------------------------------------

[~becket_qin] thanks for the clarification.

bq. If leader moves, that does not necessarily mean the request to old leader failed. We can always send the unacknowledged message to new leader, but that probably introduce duplicate almost every time leader moves.

I agree that duplicates are inconvenient, but in this scenario we aren't promising no duplicates, so I'd rather treat the duplicates separately.

bq. Currently after batches leave the record accumulator, we only track them in requests.

The record accumulator point is a good one and I'm not super familiar with that part of the code, so I don't have any concrete suggestion right now, but I'll have a closer look. However,

bq. So while the idea of resend unacknowledged message to both old and new leader is natural and makes sense, it seems much more complicated and error prone based on our current implementation and does not buy us much.

True, from your description, it sounds like the change isn't trivial. But let me ask you this: don't we ever have to retransmit messages after a leader change? If we do, then the code path for retransmitting on a different connection must be there. I'm not super familiar with that part of the code, so I don't have any concrete suggestion right now, but I can have a look to see if I'm able to help out.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132571#comment-15132571 ]

Flavio Junqueira commented on KAFKA-3197:
-----------------------------------------

Treating it as a bug sounds right. In the example given in the description, when the producer connects to broker B, shouldn't it resend unacknowledged messages (0 in the example) over the new connection (to broker B in the example)? It can produce duplicates as has been pointed out, but eliminating duplicates is a separate matter.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132580#comment-15132580 ]

Ismael Juma commented on KAFKA-3197:
------------------------------------

I was wondering the same thing, [~fpj].
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132698#comment-15132698 ]

Jiangjie Qin commented on KAFKA-3197:
-------------------------------------

[~fpj] [~ijuma] That was my first thought as well. On second thought, it might be a little complicated for the current implementation. This approach needs the following work:

1. Detect leader movement on each metadata refresh.
2. If the leader moves, that does not necessarily mean the request to the old leader failed. We can always send the unacknowledged message to the new leader, but that would probably introduce duplicates almost every time the leader moves.
3. Currently, after batches leave the record accumulator, we only track them in requests. If the leader migrates, we now need to peek into every in-flight request, take out the batches for the partitions whose leader moved, and re-enqueue them in the record accumulator. This is even more intrusive because we store the batches in the ProduceResponseHandler, which we don't even track today.

Compared with the current approach, the benefit of doing that seems to be that we potentially don't need to wait for the request timeout if a broker is actually down. However, given that the metadata refresh itself is usually triggered by a request timeout, this benefit becomes marginal.

So while the idea of resending unacknowledged messages to both the old and the new leader is natural and makes sense, it seems much more complicated and error-prone based on our current implementation, and does not buy us much.
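For reference, step 1 of the alternative above (detecting leader movement on a metadata refresh) could look roughly like the sketch below. It only compares two partition-to-leader snapshots; the class LeaderMovement, the method movedPartitions and the use of plain strings and broker ids are assumptions made for illustration and do not correspond to the producer's actual metadata classes. The harder part described in step 3, pulling batches back out of in-flight requests, is not covered.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares two metadata snapshots (partition -> leader broker id).
class LeaderMovement {

    /** Returns the partitions whose leader differs between the two snapshots. */
    static Set<String> movedPartitions(Map<String, Integer> before, Map<String, Integer> after) {
        Set<String> moved = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : after.entrySet()) {
            Integer oldLeader = before.get(entry.getKey());
            // A partition that was unknown before is new, not moved.
            if (oldLeader != null && !oldLeader.equals(entry.getValue())) {
                moved.add(entry.getKey());
            }
        }
        return moved;
    }
}
```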
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130084#comment-15130084 ]

ASF GitHub Bot commented on KAFKA-3197:
---------------------------------------

GitHub user becketqin opened a pull request:

    https://github.com/apache/kafka/pull/857

    KAFKA-3197 Fix producer sending records out of order

    This patch adds a new configuration to the producer to enforce the maximum in flight batch for each partition. The patch itself is small, but this is an interface change. Given this is a pretty important fix, maybe we can run a quick KIP on it.

    This patch did not remove the max.in.flight.requests.per.connection configuration because it might still have some value to throttle the number of requests sent to a broker. This is primarily in the broker's interest.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/becketqin/kafka KAFKA-3197

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/857.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #857

commit c12c1e2044fe92954e0c8a27f63263f2020ddd3c
Author: Jiangjie Qin
Date:   2016-02-03T06:51:41Z

    KAFKA-3197 Fix producer sending records out of order
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130306#comment-15130306 ]

Eno Thereska commented on KAFKA-3197:
-------------------------------------

[~becket_qin]: I don't think the documentation for max.in.flight.requests ever promised to send messages in order if in-flight requests is set to 1. Requests can be sent out of order if there are retries. Could the ordering goal be achieved without adding another parameter to the (already long) config file, by using a combination of in-flight requests=1 and (retries=0 or acks=1)? Thanks.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131451#comment-15131451 ]

Jay Kreps commented on KAFKA-3197:
----------------------------------

Would it be better to treat this more as a bug than a configurable thing in the in-flight=1 case? i.e. when would I have in-flight=1 and not want the reordering protection? I agree people are depending on that now.

Slightly longer term, I think we are actively picking up that idempotence/txn/semantics line of work, and I think it is possible that whatever is done for idempotence might be the more principled solution, as it could solve this problem even in the presence of pipelining. The idea here is that there is a sequence number per partition which the server uses to dedupe, and this ensures that if one request fails, all other pipelined requests on that partition also fail (and then retry idempotently), so that you don't reorder irrespective of the depth of the pipelining.
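The per-partition sequence number idea in the comment above can be sketched as follows. This is a toy broker-side model written for this thread, not the design that was eventually implemented: the class SequencedPartitionLog, the Result enum and the string producer id are all hypothetical. It shows the two properties mentioned: a batch that arrives ahead of a missing one is rejected, and a batch that was already appended is recognised as a duplicate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one partition's log with per-producer sequence checking.
class SequencedPartitionLog {
    enum Result { APPENDED, DUPLICATE, OUT_OF_SEQUENCE }

    // Next sequence number expected from each producer for this partition.
    private final Map<String, Integer> nextExpected = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    Result append(String producerId, int sequence, String payload) {
        int expected = nextExpected.getOrDefault(producerId, 0);
        if (sequence < expected) {
            return Result.DUPLICATE;        // already in the log; do not write it again
        }
        if (sequence > expected) {
            return Result.OUT_OF_SEQUENCE;  // an earlier batch is missing; reject so the producer retries in order
        }
        log.add(payload);
        nextExpected.put(producerId, expected + 1);
        return Result.APPENDED;
    }

    List<String> contents() {
        return new ArrayList<>(log);
    }
}
```

If batch 0 is lost while batches 1 and 2 are already pipelined, both are rejected as OUT_OF_SEQUENCE, and the producer can retry 0, 1, 2 in order without the log ever containing a gap or a repeat.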
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131507#comment-15131507 ]

Jiangjie Qin commented on KAFKA-3197:
-------------------------------------

Hey Jay,

Yes, an idempotent producer would solve the problem. And I completely agree that when people set in-flight requests to one they are expecting no reordering.

I was initially thinking of treating max.in.flight.requests.per.connection=1 as implicitly implying in.flight.batch.per.partition=1. This does not need additional configuration. But there is a subtle difference in terms of performance. If a producer has a lot of partitions to send to the same broker, theoretically we can allow in-flight requests > 1 as long as each request addresses distinct partitions. If we enforce max.in.flight.requests.per.connection=1, we lose this parallelism. But given that this is what is already there, it is probably fine to leave it as is.

I'll update the patch to remove the newly added configuration and simply reuse max.in.flight.requests.per.connection. Otherwise, please let me know if you think the subtle optimization is worth the configuration change.
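The "one in-flight batch per partition" idea discussed above, including the optimization of letting several requests be outstanding as long as they address distinct partitions, can be sketched like this. It is an illustration only and not the code from the patch: PartitionGate, enqueue, drain and acknowledge are hypothetical names, and batches are represented as strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical per-partition gate: a partition with an unacknowledged batch is skipped.
class PartitionGate {
    private final Map<String, Deque<String>> queued = new LinkedHashMap<>();
    private final Set<String> inFlight = new HashSet<>();

    void enqueue(String partition, String batch) {
        queued.computeIfAbsent(partition, key -> new ArrayDeque<>()).add(batch);
    }

    /** Builds one request: at most one batch from each partition that has nothing in flight. */
    List<String> drain() {
        List<String> request = new ArrayList<>();
        for (Map.Entry<String, Deque<String>> entry : queued.entrySet()) {
            if (!inFlight.contains(entry.getKey()) && !entry.getValue().isEmpty()) {
                request.add(entry.getValue().poll());
                inFlight.add(entry.getKey());
            }
        }
        return request;
    }

    /** Called when the batch for this partition succeeded or finally failed. */
    void acknowledge(String partition) {
        inFlight.remove(partition);
    }
}
```

In this model a second request can be built while the first is still outstanding, but it only ever carries partitions that have no batch in flight, so ordering within each partition is preserved regardless of which broker the batches end up on.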
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131409#comment-15131409 ]

Jiangjie Qin commented on KAFKA-3197:
-------------------------------------

[~enothereska] We have already defined sync and async in the producer at the per-message level, when the user calls send(). Having another configuration is a little confusing. If the only purpose of sending "sync" is to send messages in order, making that clear in the configuration seems reasonable.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130781#comment-15130781 ]

Joel Koshy commented on KAFKA-3197:
-----------------------------------

[~enothereska] - the documentation is accurate in that it . The reality though is that everyone (or at least most users) interprets that to mean it is possible to achieve strict ordering within a partition, which is necessary for several use-cases.
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130785#comment-15130785 ]

Joel Koshy commented on KAFKA-3197:
-----------------------------------

Sorry - hit enter too soon. "in that it does not specifically say that it can be used to prevent reordering within a partition"
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130761#comment-15130761 ]

Jiangjie Qin commented on KAFKA-3197:
-------------------------------------

[~enothereska] The documentation of max.in.flight.requests.per.connection did not say it explicitly, but I think the following are the guarantees we currently claim (or think) we are providing with different values of max.in.flight.requests.per.connection and retries.

1. retries = 0, regardless of max.in.flight.requests.per.connection
The producer itself does not introduce reordering in this case (all messages are only sent once), but a message send will very likely fail immediately when an event such as leader migration occurs. Users probably only have three choices when a message failure occurs:
a) let it go, so the message is dropped;
b) close the producer, if the user does not tolerate message loss or reordering;
c) resend the message and have reordering (this reordering is introduced by the user).

2. max.in.flight.requests.per.connection > 1 and retries > 0 (some reasonable number)
No worry about frequent message send failures, but reordering could happen when there is a retry.

3. max.in.flight.requests.per.connection = 1 and retries > 0
No reordering and no frequent failures.

The bug here breaks the third guarantee, which we thought we were providing.
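The three cases above map onto producer settings roughly as in the sketch below. The property keys are the producer configuration names under discussion; everything else is an assumption for illustration: the class OrderingConfigs and its method names are hypothetical, the broker address is a placeholder, and the values 5 and 3 merely stand in for "more than one" and "some reasonable number".

```java
import java.util.Properties;

// Hypothetical helper that builds producer settings for the three cases in the comment.
class OrderingConfigs {

    static Properties producerProps(int maxInFlight, int retries) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("max.in.flight.requests.per.connection", Integer.toString(maxInFlight));
        props.setProperty("retries", Integer.toString(retries));
        return props;
    }

    // Case 1: no retries; the producer never reorders, but sends fail on events such as leader migration.
    static Properties noRetries() {
        return producerProps(5, 0);
    }

    // Case 2: pipelining plus retries; few failures, but a retry can reorder messages.
    static Properties pipelinedWithRetries() {
        return producerProps(5, 3);
    }

    // Case 3: one request in flight plus retries; the combination expected to preserve order.
    static Properties orderedWithRetries() {
        return producerProps(1, 3);
    }
}
```

Case 3 is the configuration affected by this issue: users choose it for ordering, and the leader-migration scenario in the description can still reorder messages under it.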
[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.
[ https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130806#comment-15130806 ]

Eno Thereska commented on KAFKA-3197:
-------------------------------------

[~jjkoshy], [~becket_qin] Makes sense. I can't help but think of the analogy to file systems. The only way to guarantee order is to do synchronous requests one at a time. Async requests can never guarantee order. I believe the current solution you are providing would work, but I wonder if it's worth taking a step back and simplifying the options (perhaps to just two: async, with any number of requests outstanding, and sync).