OK, so in your use case, instead of your application(s) writing directly to
Kafka, you have a separate process that tails the log files and ships them
over to Kafka. Is that correct?
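
To make sure I'm picturing the same thing, here is a rough sketch of the "tail and ship" process as I understand it. It is only an illustration under my own assumptions: the function names and the topic name are made up, and `send` is a stand-in for whatever Kafka producer client is actually used, not a real Kafka API.

```python
import io
from typing import Callable, Iterator, TextIO


def read_new_lines(log: TextIO) -> Iterator[str]:
    """Yield complete lines appended to `log` since the last read."""
    while True:
        pos = log.tell()
        line = log.readline()
        if not line.endswith("\n"):
            # EOF or a half-written line: rewind and let a later call retry.
            log.seek(pos)
            return
        yield line.rstrip("\n")


def ship(log: TextIO, topic: str, send: Callable[[str, bytes], object]) -> int:
    """Forward each new complete line to `send(topic, payload)`.

    `send` is hypothetical here; in practice it would wrap a Kafka producer.
    Returns the number of lines forwarded.
    """
    count = 0
    for line in read_new_lines(log):
        send(topic, line.encode("utf-8"))
        count += 1
    return count


if __name__ == "__main__":
    # In-memory stand-ins for a log file and a producer, for demonstration.
    sent = []
    log = io.StringIO("GET /index\nGET /login\n")
    shipped = ship(log, "web-logs", lambda topic, payload: sent.append((topic, payload)))
    print(shipped, sent)
```

A real shipper would call `ship` in a loop with a short sleep between passes, and would also have to handle log rotation, which this sketch ignores.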

On Jun 7, 2013, at 5:33 PM, Jonathan Creasy <j...@box.com> wrote:

> I recommend Kafka or Flume-NG for this.
> 
> Our Analytics team is using a Kafka Producer on each server to tail logs
> and ship them to Kafka. We use Oozie to schedule a MapReduce consumer every
> few minutes to read all the Kafka topics into HDFS.
> 
> We use Kafka as a buffer and keep a few weeks of data there. Our security
> team, for example, sometimes connects and consumes some logs for various
> purposes, usually when they want aggregate log data in real time.
> 
> Most folks access them in HDFS. We have <1 minute of delay for most log
> lines getting from the server where they were written to HDFS.
> 
> -Jonathan
> 
> 
> On Fri, Jun 7, 2013 at 5:30 PM, Mark <static.void....@gmail.com> wrote:
> 
>> Like I said, I'm a bit confused. I see the terms "events", "messages" and
>> "logs" and I'm not quite sure what to make of them.
>> 
>> We are trying to determine the best way to aggregate all of our logs for
>> processing in Hadoop. Kafka seems to fit this bill nicely; however, I want
>> to know if it's suited for other types of messages as well. Are there
>> certain determining factors for choosing Kafka over RabbitMQ? Is it mostly
>> scale, or is it the type of messages/events/logs being produced/consumed?
>> 
>> On Jun 7, 2013, at 5:21 PM, Alexis Richardson <alexis.richard...@gmail.com>
>> wrote:
>> 
>>> On Sat, Jun 8, 2013 at 1:08 AM, Mark <static.void....@gmail.com> wrote:
>>>> I'm a bit confused about the concept of a "message" in Kafka. How does
>>>> this differ, if at all, from a message in RabbitMQ? It seems to me that
>>>> Kafka is better suited for very write-intensive "messages" like log data,
>>>> but RabbitMQ may be a better fit for traditional "messages", i.e. a
>>>> "Product Purchased" or "User Registered" message.
>>> 
>>> I'm not sure why you think this, or how to distinguish between a 'log'
>>> message and some other kind.
>>> 
>>> Messages = data, annotated with metadata.  The latter is typically a
>>> protocol-specific envelope.  Kafka and Rabbit certainly have different
>>> envelopes, e.g. for mapping data to subscribers/queries.
>>> 
>>> alexis
>> 
>> 
> 
> 
> -- 
> *Jonathan Creasy* | Sr. Ops Engineer
> 
> e: j...@box.com | t: 314.580.8909
