Hello everyone,

I am looking to enhance my Kafka Streams based application from
one instance to many.

Part of the difficulty is the it seems that all of the state providers
are "instance local", either in memory or on local disk.  This means to
answer queries for non-local partitions you have to proxy them to another
node.  There's also race conditions here -- what if node B owns partition 1,
node A redirects a query from a key in that partition, then B fails over to A
concurrently?

I am curious why there's not an option to e.g. use a Memcache, Redis, or 
Cassandra cluster (or similar).
Seems that it would simplify the inter-node communication (you just speak to a 
cluster using
e.g. consistent hashing for keys) and improve availability (application node 
crashing doesn't imply
loss of state for affected partitions)

Is this just because nobody has written it yet?  Is there some reason that 
having
strictly local storage plus a "gossip" like protocol is superior?

How are other people doing this?

Thanks,
Steven


Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply via email to