Hello everyone, I am looking to enhance my Kafka Streams based application from one instance to many.
Part of the difficulty is the it seems that all of the state providers are "instance local", either in memory or on local disk. This means to answer queries for non-local partitions you have to proxy them to another node. There's also race conditions here -- what if node B owns partition 1, node A redirects a query from a key in that partition, then B fails over to A concurrently? I am curious why there's not an option to e.g. use a Memcache, Redis, or Cassandra cluster (or similar). Seems that it would simplify the inter-node communication (you just speak to a cluster using e.g. consistent hashing for keys) and improve availability (application node crashing doesn't imply loss of state for affected partitions) Is this just because nobody has written it yet? Is there some reason that having strictly local storage plus a "gossip" like protocol is superior? How are other people doing this? Thanks, Steven
signature.asc
Description: Message signed with OpenPGP using GPGMail