Andrew,

As I mentioned above, Kafka provides durability through data replication
rather than synchronous flushing to disk. KIP-98 does not try to change that
part of Kafka: if all your replicas are lost at the same time before the
data was ever flushed to disk, then your data is lost today, and that will
still be the case after KIP-98.

As for atomicity, KIP-98 does provide an all-or-nothing guarantee for writes
to multiple partitions, but that guarantee is built on Kafka's existing
durability guarantees. So if durability breaks, atomicity can be violated as
well: in the scenario above, some of a committed transaction's messages
could be lost while others are successfully appended. My take is that if you
are concerned that Kafka's replication mechanism is not good enough for your
durability requirements today, then you should have the same level of
concern about durability if you want to use Kafka with KIP-98 as your
transactional queuing system.
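
To make that failure mode concrete, here is a toy Python model. It is only a
sketch and not how Kafka is implemented: ToyPartition, commit_txn and
crash_all_replicas are hypothetical names, and each partition's replicas are
collapsed into a single object that either flushed its data or did not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartition:
    """Toy stand-in for one partition whose replicas all share one fate."""
    page_cache: list = field(default_factory=list)  # appended, not yet on disk
    disk: list = field(default_factory=list)        # already flushed to disk

    def append(self, record):
        # An append is acknowledged once it is in memory (no sync flush).
        self.page_cache.append(record)

    def flush(self):
        # Asynchronous flush: move everything in memory onto disk.
        self.disk.extend(self.page_cache)
        self.page_cache.clear()

    def crash_all_replicas(self):
        # Every replica fails at once, so unflushed records are gone.
        self.page_cache.clear()

    def readable(self):
        return self.disk + self.page_cache


def commit_txn(partitions, record):
    """Hypothetical commit: append the record to every partition, no flush."""
    for p in partitions:
        p.append(record)


a, b = ToyPartition(), ToyPartition()
commit_txn([a, b], "txn-1")   # the transaction is committed on both partitions
a.flush()                     # the async flush happens to reach A only
for p in (a, b):
    p.crash_all_replicas()    # all replicas of both partitions fail

surviving = [p.readable() for p in (a, b)]
print(surviving)  # [['txn-1'], []]
```

After the crash, partition A still holds the committed record while partition
B has lost it, so the transaction is no longer all-or-nothing. The atomicity
violation here comes entirely from the durability loss, not from the commit
logic.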


Guozhang


On Mon, Dec 12, 2016 at 1:49 AM, Andrew Schofield <andrew_schofi...@live.com
> wrote:

> Guozhang,
> Exactly. This is the crux of the matter. Because it's async, the log is
> basically slightly out of date with respect to the run-time state, and a
> failure of all replicas might take the data slightly back in time.
>
> Given this, do you think that KIP-98 gives an all-or-nothing,
> no-matter-what guarantee for Kafka transactions? I think the key is
> whether the data which is asynchronously flushed is guaranteed to be
> recovered atomically in all cases. Asynchronous but atomic would be good.
>
> Andrew Schofield
> IBM Watson and Cloud Platform
>
>
> >
> > From: Guozhang Wang <wangg...@gmail.com>
> > Sent: 09 December 2016 22:59
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging
> >
> > Onur,
> >
> > I understand your question now. It is indeed possible that after
> > commitTxn() returns, the messages could still be lost permanently if all
> > replicas fail before the data is flushed to disk. This is by virtue of
> > Kafka's design, which relies on replication (probably in memory) for high
> > availability and therefore flushes asynchronously. This scenario already
> > exists today, and KIP-98 did not intend to change it in any way.
> >
> > Guozhang
> >
> >
> >
>



-- 
-- Guozhang
