An analogy I use: classic MapReduce pairs storage (HDFS) with compute
(MR); here, Kafka is the storage and Samza is the compute.  But HDFS
does not run through YARN...

So, in the same way that HDFS is run and launched separately from YARN
(and hence MapReduce), Kafka is run and launched separately from
Samza (and hence YARN).  KOYA is trying to move Kafka deployment onto
YARN.  There is no equivalent project underway to launch and manage
HDFS through YARN (though there have been rumblings), but such a
project would be the direct counterpart.

On 16 January 2015 at 16:05, Otis Gospodnetic
<otis.gospodne...@gmail.com> wrote:
> Hm.  My understanding was that both are aimed at basically the same thing -
> Kafka on YARN.  From Samza site:
> "Apache Samza is a distributed stream processing framework. It uses Apache
> Kafka <http://kafka.apache.org/> for messaging, and Apache Hadoop YARN
> <http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html>
> to provide fault tolerance, processor isolation, security, and resource
> management."
>
> And KOYA:
> "KOYA is a YARN application that launches Kafka within YARN. It then
> manages the resource negotiation with Resource Manager, and ensures that
> Kafka operates in a YARN native way. For an external publisher or
> subscriber, KOYA would not look any different than Kafka since the same
> code is being run as a YARN application."
>
> It looks like they were never mentioned together.... until now :)
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Fri, Jan 16, 2015 at 3:34 PM, Chris Riccomini <
> criccom...@linkedin.com.invalid> wrote:
>
>> Hey Otis,
>>
>> I'm not terribly familiar with KOYA, but my understanding is that it's a
>> tool for deploying Kafka brokers to YARN, and administering them. I don't
>> think that it has any stream processing functionality built into it. As
>> such, it seems to me that KOYA and Samza could be used together: you could
>> use KOYA to deploy Kafka in YARN, and Samza to read/write messages from
>> the brokers that have been deployed.
>>
>> Samza provides containers that have consumers/producers in them, and allow
>> you to plug in processing logic as new messages arrive. Samza's goal is to
>> provide features that are useful when you're processing the messages, such
>> as fault tolerance (restarting consumers/producers when they fail),
>> checkpointing (saving offsets), state management (if you're counting
>> messages, you want to make sure your count is accurate even if you fail),
>> etc.
>>
>> Cheers,
>> Chris
>>
>> On 1/16/15 12:14 PM, "Otis Gospodnetic" <otis.gospodne...@gmail.com>
>> wrote:
>>
>> >Hi,
>> >
>> >I was wondering if anyone can compare and contrast KOYA and Samza?
>> >
>> >Thanks,
>> >Otis
>> >--
>> >Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> >Solr & Elasticsearch Support * http://sematext.com/
>>
>>
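
To make Chris's point about checkpointing and state management concrete,
here is a minimal sketch in plain Python. It is a simulation only and does
not use Samza's API: the names run_task, checkpoint, and crash_after are
hypothetical, a list stands in for a Kafka partition, and a dict stands in
for durable checkpoint storage. It also checkpoints after every message,
which is a simplification chosen to keep the example short.

```python
# Illustrative simulation only: this is NOT Samza's API.
# All names here (run_task, checkpoint, crash_after) are hypothetical.

def run_task(log, checkpoint, crash_after=None):
    """Count the messages in `log`, resuming from `checkpoint`.

    `checkpoint` holds the next offset to read and the count so far.
    It stands in for durable storage that survives a task failure.
    """
    offset = checkpoint["offset"]
    count = checkpoint["count"]
    processed = 0
    while offset < len(log):
        if crash_after is not None and processed == crash_after:
            raise RuntimeError("simulated container failure")
        count += 1        # the "processing logic": count one message
        offset += 1
        processed += 1
        # Save the offset and the state together so they never disagree.
        checkpoint["offset"] = offset
        checkpoint["count"] = count
    return count


log = ["msg-%d" % i for i in range(10)]   # stand-in for a Kafka partition
checkpoint = {"offset": 0, "count": 0}

try:
    run_task(log, checkpoint, crash_after=4)
except RuntimeError:
    pass  # a supervisor would restart the failed task at this point

# The restarted task resumes from the saved offset, so no message is
# counted twice and none is skipped.
final_count = run_task(log, checkpoint)
print(final_count)
```

Because the offset and the count are saved together, the restarted task
picks up exactly where the failed one stopped, which is the "accurate even
if you fail" property described above.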
