[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185926#comment-15185926
 ] 

ASF GitHub Bot commented on KAFKA-3197:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/857


> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-03-01 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174085#comment-15174085
 ] 

Ismael Juma commented on KAFKA-3197:


[~jjkoshy], since the patch doesn't introduce any new configuration and it's 
relatively simple I think it's fine to go with this approach for now. If we 
find a better way in the future, we can consider it then.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.10.0.0
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-25 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168065#comment-15168065
 ] 

Joel Koshy commented on KAFKA-3197:
---

Hi [~fpj] so your suggestion is to preemptively invoke the retransmission logic 
to retry the affected partitions? It can be done, but I think it would 
necessitate some weird APIs in {{InFlightRequests}} since as Becket notes, we 
would need to also proactively fish out partitions from {{InFlightRequests}} 
and retry those on the new leader.
I’m +1 on the patch apart from the minor comments, but will leave this open for 
a few more days in case anyone has further concerns or better ideas.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.10.0.0
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-05 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134760#comment-15134760
 ] 

Ismael Juma commented on KAFKA-3197:


[~becket_qin], the plan is to release 0.9.0.1 next week and since the details 
of how to fix this are still being discussed, do you agree that we should 
target 0.9.1.0 instead?

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-05 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134836#comment-15134836
 ] 

Jiangjie Qin commented on KAFKA-3197:
-

[~ijuma] Sure, we can target it 0.9.1.0. Last minute change is always unwanted.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.1.0
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-04 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133305#comment-15133305
 ] 

Jiangjie Qin commented on KAFKA-3197:
-

Thanks [~fpj].

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-04 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133265#comment-15133265
 ] 

Flavio Junqueira commented on KAFKA-3197:
-

[~becket_qin] thanks for the clarification. 

bq. If leader moves, that does not necessarily mean the request to old leader 
failed. We can always send the unacknowledged message to new leader, but that 
probably introduce duplicate almost every time leader moves.

I agree that duplicates are inconvenient, but in this scenario we aren't 
promising no duplicates, so I'd rather treat the duplicates separately.

bq. Currently after batches leave the record accumulator, we only track them in 
requests.

The record accumulator point is a good one and I'm not super familiar with that 
part of the code, so I don't have any concrete suggestion right now, but I'll 
have a closer look. However, 

bq. So while the idea of resend unacknowledged message to both old and new 
leader is natural and makes sense, it seems much more complicated and error 
prone based on our current implementation and does not buy us much.

True, from your description, it sounds like the change isn't trivial. But let 
me ask you this: don't we ever have to retransmit messages after a leader 
change? If we do, then the code path for retransmitting on a different 
connection must be there. I'm not super familiar with that part of the code, so 
I don't have any concrete suggestion right now, but I can have a look to see if 
I'm able to help out.



> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-04 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132571#comment-15132571
 ] 

Flavio Junqueira commented on KAFKA-3197:
-

Treating it as a bug sounds right. In the example given in the description, 
when the producer connects to broker B, shouldn't it resend unacknowledged 
messages (0 in the example) over the new connection (to broker B in the 
example)? It can produce duplicates as has been pointed out, but eliminating 
duplicates is a separate matter.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-04 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132580#comment-15132580
 ] 

Ismael Juma commented on KAFKA-3197:


I was wondering the same thing [~fpj]

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-04 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132698#comment-15132698
 ] 

Jiangjie Qin commented on KAFKA-3197:
-

[~fpj] [~ijuma] That was my first thinking as well. After a second thought it 
might be a little bit complicated for the current implementation.

This approach needs the following works:
1. Detect leader movement on each metadata refresh.
2. If leader moves, that does not necessarily mean the request to old leader 
failed. We can always send the unacknowledged message to new leader, but that 
probably introduce duplicate almost every time leader moves.
3. Currently after batches leave the record accumulator, we only track them in 
requests. If leader migrates, now we need to peek into every in flight request, 
take out the batches to the partition whose leader moved, and re-enqueue them 
in the to record accumulator. This is even more intrusive because we store the 
batches in the ProduceResponseHandler which we don't even track today. 

Compared with current approach, the benefit of doing that seems we potentially 
don't need to wait for request timeout if a broker is actually down. However, 
given the metadata refresh itself is usually triggered by request timeout, this 
benefit becomes marginal. 

So while the idea of resend unacknowledged message to both old and new leader 
is natural and makes sense, it seems much more complicated and error prone 
based on our current implementation and does not buy us much.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130084#comment-15130084
 ] 

ASF GitHub Bot commented on KAFKA-3197:
---

GitHub user becketqin opened a pull request:

https://github.com/apache/kafka/pull/857

KAFKA-3197 Fix producer sending records out of order

This patch adds a new configuration to the producer to enforce the maximum 
in flight batch for each partition. The patch itself is small, but this is an 
interface change. Given this is a pretty important fix, may be we can run a 
quick KIP on it. 

This patch did not remove max.in.flight.request.per.connection 
configuration because it might still have some value to throttle the number of 
requests sent to a broker. This is primarily for broker's interest.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/becketqin/kafka KAFKA-3197

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/857.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #857


commit c12c1e2044fe92954e0c8a27f63263f2020ddd3c
Author: Jiangjie Qin 
Date:   2016-02-03T06:51:41Z

KAFKA-3197 Fix producer sending records out of order




> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130306#comment-15130306
 ] 

Eno Thereska commented on KAFKA-3197:
-

[~becket_qin]: I don't think the documentation for max.in.flight.requests ever 
promised to send a message in order if in flight requests is set to 1. Reqs can 
be sent out of order if there are retries. Could the order goal be achieved 
without adding another parameter to the (already long) config file but by using 
a combination of in flight requests=1 and (retries=0 or acks=1)? Thanks.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131451#comment-15131451
 ] 

Jay Kreps commented on KAFKA-3197:
--

Would it be better to treat this more as a bug than a configurable thing in the 
in-flight=1 case? i.e. when would i have in-flight=1 and not want the 
reordering protection? I agree people are depending on that now.

Slightly longer term I think we are actively picking up that 
idempotence/txn/semantics line of work and I think it is possible that whatever 
is done for idempotence might be the more principled solution as it could solve 
this problem even in the presence of pipelining. The idea here is that there is 
a sequence number per-partition which the server uses to dedupe, and this 
ensures that if one request fails all other pipelined requests on that 
partition also fail (and then retry idempotently) so that you don't reorder 
irrespective of the depth of the pipelining.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131507#comment-15131507
 ] 

Jiangjie Qin commented on KAFKA-3197:
-

Hey Jay,

Yes, idempotent producer would solve the problem. And I completely agree that 
when people set in flight request to one they are expecting no re-ordering. 

I was initially thinking of treating in.flight.request.per.connection=1 as 
in.flight.batch.per.partition=1 implicitly needed. This does not need 
additional configuration. But there is a subtle difference in terms of 
performance. If a producer has a lot partitions to send to the same broker, 
theoretically we can allow in flight request > 1 as long as each request 
addresses distinct partitions. If we enforce in.flight.request=1, we lose this 
parallelism. But given this is what already there, so it is probably fine to 
leave it as is.

I'll update the patch to remove the newly added configuration but simply reuse 
in.flight.request.per.connection. Otherwise please let me know if you think the 
subtle optimization worth the configuration change.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131409#comment-15131409
 ] 

Jiangjie Qin commented on KAFKA-3197:
-

[~enothereska] We have already defined sync and async in the producer at per 
message level when user call send(). Having another configuration is a little 
confusing. If the only purpose for send "sync" is to send messages in order, 
making it clear in the configuration seems reasonable.

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130781#comment-15130781
 ] 

Joel Koshy commented on KAFKA-3197:
---

[~enothereska] - the documentation is accurate in that it . The reality though 
is that everyone (or at least most users) interpret that to mean it is possible 
to achieve strict ordering within a partition which is necessary for several 
use-cases.


> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130785#comment-15130785
 ] 

Joel Koshy commented on KAFKA-3197:
---

Sorry - hit enter too soon. "in that it does not specifically say that it can 
be used to prevent reordering within a partition"

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130761#comment-15130761
 ] 

Jiangjie Qin commented on KAFKA-3197:
-

[~enothereska] The documentation of max.in.flight.request.per.connection did 
not say it explicitly, but I think followings are the guarantees we currently 
claim (or think) we are providing with different 
in.flight.request.per.connection and retries.

1. retries = 0, regardless of in.flight.request.per.connection
Producer itself does not introduce reordering in this case (all messages are 
only sent once), but message send will very likely fail immediately when event 
such as leader migration occurs. Users probably only have three choices when 
message failure occurs: a) let it go so the message is dropped; b) close 
producer if user do not tolerate message loss or re-ordering; c) resend the 
message and have re-ordering (this re-ordering is introduced by user)

2. in.flight.request.per.connection >1 and retries > 0 (some reasonable number)
No worry about frequent message send failure, but re-order could happen when 
there is retry.

3. in.flight.request.per.connection = 1 and retries > 0
No re-ordering and no frequent failure.

The bug here breaks the 3rd guarantee which we thought we are providing.


> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3197) Producer can send message out of order even when in flight request is set to 1.

2016-02-03 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130806#comment-15130806
 ] 

Eno Thereska commented on KAFKA-3197:
-

[~jjkoshy], [~becket_qin] Makes sense. I can't help but think of the analogy to 
file systems. The only way to guarantee order is to do synchronous requests one 
at a time. Async requests can never guarantee order. I believe the current 
solution you are providing would work, but I wonder if it's worth taking a step 
back and simplifying the options (perhaps to just two: async ---with any number 
of requests outstanding --- and sync).

> Producer can send message out of order even when in flight request is set to 
> 1.
> ---
>
> Key: KAFKA-3197
> URL: https://issues.apache.org/jira/browse/KAFKA-3197
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.9.0.1
>
>
> The issue we saw is following:
> 1. Producer send message 0 to topic-partition-0 on broker A. The in-flight 
> request to broker A is 1.
> 2. The request is somehow lost
> 3. Producer refreshed its topic metadata and found leader of 
> topic-partition-0 migrated from broker A to broker B.
> 4. Because there is no in-flight request to broker B. All the subsequent 
> messages to topic-partition-0 in the record accumulator are sent to broker B.
> 5. Later on when the request in step (1) times out, message 0 will be retried 
> and sent to broker B. At this point, all the later messages has already been 
> sent, so we have re-order.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)