Hey Otis,

I think the key phrase in Samza's description is:

"Apache Samza is a distributed stream processing framework"

Kafka is not a stream processing framework. It's a message queue/broker
system. Stream processing frameworks (e.g. Storm, Spark Streaming, Samza,
etc.) use message queueing/brokering systems to pass messages within the
stream processing framework. Samza is more akin to Storm or Spark Streaming.
KOYA is just putting Kafka brokers in a YARN grid.

At least, that's my understanding.

Cheers,
Chris

On 1/16/15 4:05 PM, "Otis Gospodnetic" <otis.gospodne...@gmail.com> wrote:

>Hm. My understanding was that both are aimed at basically the same thing -
>Kafka on YARN. From Samza site:
>"Apache Samza is a distributed stream processing framework. It uses Apache
>Kafka <http://kafka.apache.org/> for messaging, and Apache Hadoop YARN
><http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html>
>to provide fault tolerance, processor isolation, security, and resource
>management."
>
>And KOYA:
>"KOYA is a YARN application that launches Kafka within YARN. It then
>manages the resource negotiation with Resource Manager, and ensures that
>Kafka operates in a YARN native way. For an external publisher or
>subscriber, KOYA would not look any different than Kafka since the same
>code is being run as a YARN application."
>
>It looks like they were never mentioned together.... until now :)
>
>Otis
>--
>Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>Solr & Elasticsearch Support * http://sematext.com/
>
>
>On Fri, Jan 16, 2015 at 3:34 PM, Chris Riccomini <
>criccom...@linkedin.com.invalid> wrote:
>
>> Hey Otis,
>>
>> I'm not terribly familiar with KOYA, but my understanding is that it's a
>> tool for deploying Kafka brokers to YARN, and administering them. I don't
>> think that it has any stream processing functionality built into it. As
>> such, it seems to me that KOYA and Samza could be used together: you could
>> use KOYA to deploy Kafka in YARN, and Samza to read/write messages from
>> the brokers that have been deployed.
>>
>> Samza provides containers that have consumers/producers in them, and allow
>> you to plug in processing logic as new messages arrive. Samza's goal is to
>> provide features that are useful when you're processing the messages, such
>> as fault tolerance (restarting consumers/producers when they fail),
>> checkpointing (saving offsets), state management (if you're counting
>> messages, you want to make sure your count is accurate even if you fail),
>> etc.
>>
>> Cheers,
>> Chris
>>
>> On 1/16/15 12:14 PM, "Otis Gospodnetic" <otis.gospodne...@gmail.com>
>> wrote:
>>
>> >Hi,
>> >
>> >I was wondering if anyone can compare and contrast KOYA and Samza?
>> >
>> >Thanks,
>> >Otis
>> >--
>> >Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> >Solr & Elasticsearch Support * http://sematext.com/
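
To make the broker vs. stream-processor split discussed in this thread a bit more concrete, here is a minimal toy sketch in Python. It uses neither Kafka nor Samza: `Broker`, `CountingProcessor`, and `demo` are hypothetical names invented for illustration, and the in-memory dict standing in for a checkpoint store is an assumption. It only shows the general idea of saving offsets and a running count so that a restarted consumer ends up with an accurate count; it is not how either project actually implements this.

```python
# Toy illustration only: none of these classes are Kafka or Samza APIs.

class Broker:
    """Stands in for a message broker: stores messages and serves them by offset."""

    def __init__(self):
        self.log = []

    def publish(self, message):
        self.log.append(message)

    def fetch(self, offset):
        # Return every message at or after the given offset.
        return self.log[offset:]


class CountingProcessor:
    """Stands in for a stream processing task: consumes, keeps state, checkpoints."""

    def __init__(self, broker, checkpoint):
        self.broker = broker
        self.checkpoint = checkpoint
        # Resume from the last saved offset and count, or start from scratch.
        self.offset = checkpoint.get("offset", 0)
        self.count = checkpoint.get("count", 0)

    def run(self, fail_after=None):
        for n, _message in enumerate(self.broker.fetch(self.offset), start=1):
            if fail_after is not None and n > fail_after:
                raise RuntimeError("simulated processor failure")
            self.count += 1
            self.offset += 1
            # Save offset and count together so a restart neither skips
            # a message nor counts one twice.
            self.checkpoint.update(offset=self.offset, count=self.count)
        return self.count


def demo():
    broker = Broker()
    for i in range(5):
        broker.publish(f"event-{i}")

    checkpoint = {}  # hypothetical durable store; survives the "crash" below
    try:
        CountingProcessor(broker, checkpoint).run(fail_after=2)
    except RuntimeError:
        pass  # the first processor died after handling two messages

    # A restarted processor picks up where the checkpoint left off.
    return CountingProcessor(broker, checkpoint).run()


if __name__ == "__main__":
    print(demo())
```

In this sketch the broker knows nothing about counting or recovery; all of that lives in the processor, which is roughly the division of labor described above.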