[jira] [Commented] (KAFKA-4666) Failure test for Kafka configured for consistency vs availability

Emanuele Cesena (JIRA) Tue, 24 Jan 2017 15:56:08 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836828#comment-15836828
 ]


Emanuele Cesena commented on KAFKA-4666:
----------------------------------------

Yes, you're correct, there's nothing unexpected here from Kafka's perspective.

There can be some confusion, imo, from the user's perspective. Say you 
configure Kafka for consistency and forget acks=all. Your producer sends some 
messages and receives acks (with acks=1) from the leader. You think all is 
good, but in the situation described by the test you end up actually losing 
these messages, which is confusing (they were acked, though not by all in-sync 
replicas; this is the user's mistake).

So my proposal was just to clarify the docs a bit, because the acks=all 
requirement is kind of hidden.
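
To make the point concrete, here is a minimal sketch of the kind of check 
that would have flagged our misconfiguration. It is not part of Kafka or of 
the attached test: the helper name durability_gaps is hypothetical, the 
configs are plain dicts, and treating a missing key as unsafe is an 
assumption of this sketch rather than a statement about Kafka's defaults. 
The config keys themselves (acks, unclean.leader.election.enable, 
min.insync.replicas) are the real Kafka setting names.

```python
def durability_gaps(producer_conf, broker_conf):
    """Hypothetical helper: list the settings that weaken durability.

    Both arguments are plain dicts keyed by Kafka config names.
    Missing keys are treated as unsafe (an assumption of this sketch).
    """
    gaps = []

    # acks=all (alias -1) makes the leader wait for the in-sync replicas
    # before acknowledging; acks=1 only waits for the leader itself.
    if str(producer_conf.get("acks", "1")) not in ("all", "-1"):
        gaps.append("producer: acks is not 'all'")

    # Unclean leader election lets an out-of-sync replica become leader,
    # which can discard messages that were already acknowledged.
    unclean = str(broker_conf.get("unclean.leader.election.enable", "true"))
    if unclean.lower() != "false":
        gaps.append("broker: unclean.leader.election.enable is not false")

    # A minimum ISR size below 2 means a single replica can satisfy a write.
    if int(broker_conf.get("min.insync.replicas", 1)) < 2:
        gaps.append("broker: min.insync.replicas is below 2")

    return gaps


if __name__ == "__main__":
    # Brokers configured for consistency, producer left at acks=1:
    # the situation described in this issue.
    producer = {"acks": "1"}
    broker = {"unclean.leader.election.enable": "false",
              "min.insync.replicas": "2"}
    for gap in durability_gaps(producer, broker):
        print(gap)
```

With the broker settings in place and only the producer left at acks=1, the 
check reports a single gap, which is exactly the part that is easy to miss.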

Furthermore, since I wrote the test for myself to be sure I understood the 
issue correctly, I was wondering if you'd like to merge it for future 
reference, but this is a minor convenience.

> Failure test for Kafka configured for consistency vs availability
> -----------------------------------------------------------------
>
>                 Key: KAFKA-4666
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4666
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Emanuele Cesena
>         Attachments: consistency_test.py
>
>
> We recently had an issue with our Kafka setup because of a misconfiguration.
> In short, we thought we had configured Kafka for durability, but we didn't 
> set the producers to acks=all. During a full outage, we had situations where 
> some partitions were "partitioned", meaning that the followers started 
> without properly waiting for the right leader, and thus we lost data. Again, 
> this is not an issue with Kafka, but a misconfiguration on our side.
> I think we reproduced the issue, and we built a Docker test that shows that, 
> if the producer isn't set with acks=all, then data can be lost during an 
> almost full outage. The test is attached.
> I was thinking of sending a PR, but wanted to run this by you first, as 
> it's not necessarily proving that a feature works as expected.
> In addition, I think the documentation could be slightly improved, for 
> instance in the section:
> http://kafka.apache.org/documentation/#design_ha
> by clearly stating that there are 3 steps one should take to configure Kafka 
> for consistency, the third being that producers should be set with acks=all 
> (which is currently part of the 2nd point).
> Please let me know what you think, and I can send a PR if you agree.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
