Thanks Joe. That's a nice pointer. I will explore the possibility. I am just concerned about the leader swap time window, but maybe that's the tradeoff between data consistency and availability.
Regards,
Jagan

---- On Sat, 22 Feb 2014 23:08:00 +0530 Joe Stein <crypt...@gmail.com> wrote ----

Without them you have no durability. With them you have guarantees... more than any other system with messaging features. It is a durable CP commit log. It works very well for data pipelines with AP systems like Cassandra, which is a different system solving different problems.

When a Kafka leader fails, your write might block and wait for 10ms while a new leader is elected, but writes can be guaranteed.

The consumers then read and process data and write to Cassandra. Then have your app read from Cassandra for what was processed.

These are very typical architectures at scale: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop
********************************************/

On Feb 22, 2014, at 11:49 AM, Jagan Ranganathan <ja...@zohocorp.com> wrote:

Hi Joe,

If my understanding is right, Kafka does not satisfy the high availability/replication part well because of the need for a leader and in-sync replicas.

Regards,
Jagan

---- On Sat, 22 Feb 2014 22:02:27 +0530 Joe Stein <crypt...@gmail.com> wrote ----

If performance and availability for messaging is a requirement, then use Apache Kafka: http://kafka.apache.org/

You can pass the same thrift/avro objects through the Kafka commit log, or strings, or whatever you want.

On Feb 22, 2014, at 11:13 AM, Jagan Ranganathan <ja...@zohocorp.com> wrote:

Hi Michael,

Yes, I am planning to use RabbitMQ for my messaging system.
But I wonder which will give better performance: writing directly into Rabbit with Ack support, or using a temporary queue in Cassandra first and then dequeuing and publishing to Rabbit. The complexities involved (handling scenarios like Rabbit connection failure on one side, versus Cassandra write performance and replication with hinted handoff support on the other) make me wonder if this is a better path.

Regards,
Jagan

---- On Sat, 22 Feb 2014 21:01:14 +0530 Michael Laing <michael.la...@nytimes.com> wrote ----

We use RabbitMQ for queuing and Cassandra for persistence. RabbitMQ with clustering and/or federation should meet your high availability needs.

Michael

On Sat, Feb 22, 2014 at 10:25 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

Jagan,

Queue-like data structures are known to be one of the worst anti-patterns for Cassandra: http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets

On Sat, Feb 22, 2014 at 4:03 PM, Jagan Ranganathan <ja...@zohocorp.com> wrote:

Hi,

I need to decouple some of the work being processed from the user thread to provide a better user experience. For that I need a queuing system with the following needs:

High Availability
No Data Loss
Better Performance

Following are some libraries that were considered, along with the limitations I see:

Redis - Data Loss
ZooKeeper - Not advised for a queue system.
TokyoCabinet/SQLite/LevelDB - of these, LevelDB seems to be performing better. With the replication requirement, I probably have to look at Apache ActiveMQ+LevelDB.

After checking on the third option above, I kind of wonder if Cassandra with Leveled Compaction offers a similar system. Do you see any issues in such a usage, or are there other better solutions available? It will be great to get insights on this.

Regards,
Jagan
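
Editorial note on the queue anti-pattern raised in the thread: the sketch below is a toy model, not Cassandra code and not anything posted by the participants. It assumes the commonly described behavior that a delete in Cassandra writes a tombstone marker rather than freeing the cell right away, so a reader looking for the oldest live message has to skip every tombstone in front of it until compaction removes them. The class name QueuePartition and its methods are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class QueuePartition:
    """Toy model of one partition used as a FIFO queue (hypothetical, not a real API)."""

    # Each cell is (message, deleted); a deleted cell stands in for a tombstone.
    cells: List[Tuple[str, bool]] = field(default_factory=list)

    def enqueue(self, message: str) -> None:
        self.cells.append((message, False))

    def dequeue(self) -> Tuple[Optional[str], int]:
        """Return the oldest live message and the number of tombstones skipped."""
        scanned = 0
        for i, (message, deleted) in enumerate(self.cells):
            if deleted:
                scanned += 1
                continue
            # Deleting only marks the cell; nothing is reclaimed yet.
            self.cells[i] = (message, True)
            return message, scanned
        return None, scanned

    def compact(self) -> None:
        """Stand-in for compaction after the grace period: drop the tombstones."""
        self.cells = [cell for cell in self.cells if not cell[1]]


queue = QueuePartition()
for n in range(1000):
    queue.enqueue(f"job-{n}")

# Cost (tombstones skipped) of each of the 1000 dequeues, in order.
costs = [queue.dequeue()[1] for _ in range(1000)]
print(costs[0], costs[499], costs[999])  # 0 499 999
```

In this model the cost of each dequeue grows with the number of messages already consumed, which is the kind of read amplification the linked DataStax article warns about. A log-style system such as Kafka avoids it by having consumers track an offset instead of deleting individual messages.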