Kafka periodically flushes to disk, but this can happen after your producer gets an acknowledgement. If you have a replicated topic and produce with acks set to "all", the producer gets an acknowledgement only after the in-sync replicas have received the message. In this way you get safety from a server crash by replicating to other servers. If all of those servers crashed before any of them flushed to disk, you could lose data if consumers have not yet read and processed those messages.
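To make that failure mode concrete, here is a toy model in Python. It is not Kafka code: the Replica class and message_survives function are hypothetical names made up for illustration, and "crashed" is assumed to mean a machine or OS failure that discards the page cache.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical view of one broker's copy of a single message."""
    has_message: bool  # the replica received the message (at least in page cache)
    flushed: bool      # the message has been synced to disk
    crashed: bool      # the machine went down, discarding unflushed page cache


def message_survives(replicas):
    """Return True if at least one replica still holds the message.

    A copy survives if the replica received the message and either
    flushed it to disk or never crashed.
    """
    return any(r.has_message and (r.flushed or not r.crashed) for r in replicas)


# One of three replicas stays up: the acknowledged message is still available.
one_up = [Replica(True, False, True), Replica(True, False, True), Replica(True, False, False)]

# All three crash before any of them flushes: the message is gone.
all_down = [Replica(True, False, True) for _ in range(3)]

print(message_survives(one_up))
print(message_survives(all_down))
```

The point of the sketch is only that replication protects the message as long as the crashes are not simultaneous across every replica that holds it.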
There is a configuration property that deals with this, min.insync.replicas:

"When a producer sets acks to "all", min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend). When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of "all". This will ensure that the producer raises an exception if a majority of replicas do not receive a write."

-Dave

Dave Tauzell | Senior Software Engineer | Surescripts

-----Original Message-----
From: Xin Chen [mailto:[email protected]]
Sent: Monday, June 20, 2016 10:47 AM
To: [email protected]
Subject: handle the data loss in page cache?

Hello everyone,

Does Kafka guarantee that the data flushed by the server is absolutely persisted to disk? Is it possible that data which has only accumulated in the page cache, without being synced to disk, gets lost if the server crashes?

Thanks,
Xin
