I don't think KIP-98 is as ambitious as to provide support for distributed transactions (2 phase commit).

It would be great if I were wrong, though :P

Cheers,

Michał


On 16/06/17 14:21, Piotr Nowojski wrote:
Hi,

I'm looking into Kafka's transactions API as proposed in KIP-98. I've read
the KIP-98 document and looked into the code on the master branch. I would
like to use it to implement a two-phase commit mechanism on top of Kafka's
transactions, one that would allow me to tie multiple systems (some of
which might not be Kafka) into one transaction.

Maybe I'm missing something, but the problem is I don't see a way to
implement it using the proposed Kafka transactions API. Even if I have just
two processes writing to Kafka topics, I don't know how I can guarantee
that if one's transaction is committed, the other will also eventually be
committed. This is because if the first KafkaProducer successfully commits
but the second one fails before committing its data, then after a restart
the second one's "initTransactions" call will (according to my
understanding of the API) abort its previously uncompleted transaction.
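To make the failure mode concrete, here is a toy in-memory model of the scenario. `TxnCoordinator`, `LostCommitDemo`, and all their methods are invented for illustration; this is not Kafka code, only a sketch of the semantics described above:

```java
import java.util.*;

// Toy model, NOT Kafka's classes: shows why a crash between the two
// producers' commits cannot be recovered through the described API.
class TxnCoordinator {
    // transactional.id -> buffered (uncommitted) records
    final Map<String, List<String>> pending = new HashMap<>();
    final List<String> committed = new ArrayList<>();

    // Models initTransactions(): any transaction left open by a previous
    // incarnation of this transactional.id is aborted.
    void initTransactions(String txnId) {
        pending.remove(txnId);
    }

    void send(String txnId, String record) {
        pending.computeIfAbsent(txnId, k -> new ArrayList<>()).add(record);
    }

    void commitTransaction(String txnId) {
        committed.addAll(pending.remove(txnId));
    }
}

public class LostCommitDemo {
    public static void main(String[] args) {
        TxnCoordinator broker = new TxnCoordinator();
        broker.send("producer-A", "a1");
        broker.send("producer-B", "b1");

        broker.commitTransaction("producer-A"); // A commits successfully
        // producer-B crashes here, before commitTransaction()...
        broker.initTransactions("producer-B");  // ...and on restart its data is aborted

        System.out.println(broker.committed);   // prints [a1] -- b1 is lost for good
    }
}
```

Once A has committed, there is no step B can take after its restart that makes "b1" visible again, which is exactly the atomicity gap described above.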

Usually transactional systems expose an API like this
<http://hlinnaka.iki.fi/2013/04/11/how-to-write-a-java-transaction-manager-that-works-with-postgresql/>.
Namely, there is a known identifier for a transaction, and you can
pre-commit it (the void prepare(...) method in the aforementioned example)
and then either commit or abort it. Usually pre-commit involves flushing
data to some temporary files, and commit moves those files to the final
directory. In case of a machine/process failure before "pre-commit", we can
just roll back all transactions from all of the processes. However, once
every process acknowledges that it completed "pre-commit", each process
should call "commit". If some process fails at that stage, then after
restarting that process I would expect to be able to restore its
"pre-committed" transaction (having remembered the transaction's id) and
re-attempt to commit it, which should be guaranteed to eventually succeed.
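The prepare/commit split described above can be sketched as an interface plus an in-memory stand-in for the temp-file pattern. All names here (`TwoPhaseResource`, `FileStyleResource`) are illustrative, modeled loosely on XAResource-style APIs, and not taken from any real transaction manager:

```java
import java.util.*;

// Hypothetical 2PC participant interface.
interface TwoPhaseResource {
    void prepare(String txnId);  // flush to durable "temp" storage; commit must succeed after this
    void commit(String txnId);   // move prepared data to its final destination
    void abort(String txnId);    // discard all work for this transaction
}

// In-memory stand-in for "flush to temp files; commit moves them to the final directory".
class FileStyleResource implements TwoPhaseResource {
    final Map<String, List<String>> open = new HashMap<>();      // not yet prepared
    final Map<String, List<String>> prepared = new HashMap<>();  // durable, keyed by txnId
    final List<String> finalDir = new ArrayList<>();

    void write(String txnId, String data) {
        open.computeIfAbsent(txnId, k -> new ArrayList<>()).add(data);
    }

    public void prepare(String txnId) {
        prepared.put(txnId, open.remove(txnId));   // survives a crash
    }

    public void commit(String txnId) {
        List<String> d = prepared.remove(txnId);
        if (d != null) finalDir.addAll(d);         // committing twice is a harmless no-op
    }

    public void abort(String txnId) {
        open.remove(txnId);
        prepared.remove(txnId);
    }
}
```

After a crash, the coordinator simply re-drives commit(txnId) for every transaction that all participants had prepared; because the prepared state is durable and commit is idempotent, the retry eventually succeeds.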

In other words, it seems to me that the features missing from this API are:
1. the possibility to resume transactions after a machine/process crash; at
least I would expect to be able to commit the "flushed"/"pre-committed"
data of such transactions.
2. a guarantee that committing an already committed transaction doesn't
break anything.

Or maybe there is some other way to integrate Kafka into such two phase
commit system that I'm missing?

Thanks, Piotrek


--
Michal Borowiecki
Senior Software Engineer L4
T: +44 208 742 1600 / +44 203 249 8448
E: michal.borowie...@openbet.com
W: www.openbet.com <http://www.openbet.com/>

OpenBet Ltd
Chiswick Park Building 9
566 Chiswick High Rd
London
W4 5XT
UK

