Hi Hans,

We currently do #2 and it is quite slow. In theory #1 is probably a better choice, although it's not quite what we want, since it doesn't guarantee consistency at any given point in time, as you have already pointed out. Thanks a lot for the response!
kant

On Mon, Nov 7, 2016 at 6:31 AM, Hans Jespersen <h...@confluent.io> wrote:

> I don't believe that either of your two storage systems supports distributed
> atomic transactions. You are just going to have to do one of the following:
>
> 1) update them separately (in parallel) and be aware that their committed
> offsets may be slightly different at certain points in time
> 2) update one and, when you are sure the data is in the first storage,
> update the other storage; be aware that you need to handle your own
> rollback logic if the second storage system is down or throws an error when
> you try to write to it.
>
> It is very common in the Kafka community to do #1, but in either case this
> is no longer a Kafka question and has become more of a distributed database
> design question.
>
> -hans
>
> /**
>  * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
>  * h...@confluent.io (650)924-2670
>  */
>
> On Sun, Nov 6, 2016 at 7:08 PM, kant kodali <kanth...@gmail.com> wrote:
>
> > Hi Hans,
> >
> > The two storage systems we use are Cassandra and Elasticsearch, and they
> > are in the same datacenter for now.
> > The programming language we use is Java, and the OS will be Ubuntu or
> > CentOS.
> > We get messages in JSON format, so we insert into Elasticsearch directly,
> > and for Cassandra we transform the JSON message into the appropriate
> > model so we can insert it into a Cassandra table.
> > The rate we currently get is about 100K/sec, which is awesome, but I am
> > pretty sure this will go down once we implement 2PC or transactional
> > writes.
> >
> > Thanks,
> > kant
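For anyone following along, option #2 from the thread (write to the first store, then the second, with your own rollback if the second write fails) can be sketched roughly as below. This is a minimal illustration, not a production pattern: the two in-memory maps are hypothetical stand-ins for the Cassandra and Elasticsearch clients, and the `failSecond` flag just simulates an outage of the second store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of option #2: sequential dual-write with hand-rolled rollback.
// The maps stand in for the real Cassandra and Elasticsearch clients.
public class DualWrite {
    static final Map<String, String> cassandra = new ConcurrentHashMap<>();
    static final Map<String, String> elastic = new ConcurrentHashMap<>();

    // Returns true only if the record landed in both stores.
    static boolean write(String key, String value, boolean failSecond) {
        cassandra.put(key, value);            // step 1: first storage system
        try {
            if (failSecond) {                 // simulate the second store being down
                throw new RuntimeException("elasticsearch unavailable");
            }
            elastic.put(key, value);          // step 2: second storage system
            return true;
        } catch (RuntimeException e) {
            cassandra.remove(key);            // rollback logic you must own yourself
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(write("k1", "v1", false)); // both stores updated
        System.out.println(write("k2", "v2", true));  // second write failed, first rolled back
        System.out.println(cassandra.containsKey("k2"));
    }
}
```

Note the caveat Hans raises still applies: if the process crashes between step 1 and the rollback, the stores diverge anyway, which is why this is a distributed-database design problem rather than something the sketch fully solves.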