Are these always log files in the traditional sense, or do they also contain 
some event data, e.g. "Product A was purchased" or "User A just signed in"?

On Jun 7, 2013, at 6:53 PM, Jonathan Creasy <jcre...@box.com> wrote:

> Correct, we essentially use the logs as an additional buffer in case of an 
> outage in the pipeline. Typically though, messages are produced as soon as 
> they are written.
> 
> 
> -Jonathan
> 
> 
> On Fri, Jun 7, 2013 at 6:06 PM, Mark <static.void....@gmail.com> wrote:
> OK, so in your use case, instead of your application(s) writing directly to 
> Kafka, you have a separate process running that tails log files and ships 
> them over to Kafka. Is that correct?
> 
> On Jun 7, 2013, at 5:33 PM, Jonathan Creasy <j...@box.com> wrote:
> 
> > I recommend Kafka or Flume-NG for this.
> >
> > Our Analytics team is using a Kafka Producer on each server to tail logs
> > and ship them to Kafka. We use Oozie to schedule a MapReduce consumer every
> > few minutes to read all the Kafka topics into HDFS.
> >
> > We use Kafka as a buffer; we keep a few weeks of data there. Our security
> > team, for example, sometimes connects and consumes some logs for various
> > purposes, usually when they want aggregate log data in real time.
> >
> > Most folks access them in HDFS. We have <1 minute of delay for most log
> > lines getting from the server where they were written to HDFS.
> >
> > -Jonathan
> >
> >
> > On Fri, Jun 7, 2013 at 5:30 PM, Mark <static.void....@gmail.com> wrote:
> >
> >> Like I said, I'm a bit confused. I see the terms "events", "messages" and
> >> "logs" and am not quite sure what to make of them.
> >>
> >> We are trying to determine the best way to aggregate all of our logs for
> >> processing in Hadoop. Kafka seems to fit this bill nicely; however, I want to
> >> know if it's suited for other types of messages as well. Are there certain
> >> determining factors for why one would choose Kafka over RabbitMQ? Is it mostly
> >> scale, or is it the type of messages/events/logs being produced/consumed?
> >>
> >> On Jun 7, 2013, at 5:21 PM, Alexis Richardson <alexis.richard...@gmail.com>
> >> wrote:
> >>
> >>> On Sat, Jun 8, 2013 at 1:08 AM, Mark <static.void....@gmail.com> wrote:
> >>>> I'm a bit confused on the concept of a "message" in Kafka. How does
> >>>> this differ, if at all, from a message in RabbitMQ? It seems to me that
> >>>> Kafka is better suited for very write-intensive "messages" like log data
> >>>> but RabbitMQ may be a better fit for traditional "messages", e.g. "Product
> >>>> Purchased" or "User Registered" messages.
> >>>
> >>> I'm not sure why you think this, or how to distinguish between a 'log'
> >>> message and some other kind.
> >>>
> >>> Messages = data, annotated with metadata.  The latter is typically a
> >>> protocol-specific envelope.  Kafka and Rabbit certainly have different
> >>> envelopes, e.g. for mapping data to subscribers/queries.
> >>>
> >>> alexis
> >>
> >>
> >
> >
> > --
> > *Jonathan Creasy* | Sr. Ops Engineer
> >
> > e: j...@box.com | t: 314.580.8909
> 
> 
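For anyone reading this thread later, here is a rough sketch of the "tail a log file and ship each line" pattern discussed above. It is not Box's actual producer. The names `follow` and `ship`, the topic name, and the `send` callable are all made up for illustration. `send` is assumed to take `(topic, payload_bytes)`, which is the shape of something like kafka-python's `KafkaProducer.send`. Log rotation and delivery retries are deliberately left out.

```python
import io
import time
from typing import Callable, Iterator, Optional, TextIO


def follow(stream: TextIO, poll_interval: float = 0.5,
           max_idle_polls: Optional[int] = None) -> Iterator[str]:
    """Yield complete lines as they are appended to `stream`, like `tail -f`.

    `max_idle_polls` is a hypothetical knob so the loop can stop after that
    many consecutive empty reads; None means follow forever.
    """
    idle = 0
    pending = ""  # holds a partially written line until its newline arrives
    while True:
        chunk = stream.readline()
        if chunk:
            idle = 0
            pending += chunk
            if pending.endswith("\n"):
                yield pending.rstrip("\n")
                pending = ""
            continue
        # Nothing new to read: either give up or wait and poll again.
        if max_idle_polls is not None and idle >= max_idle_polls:
            return
        idle += 1
        time.sleep(poll_interval)


def ship(stream: TextIO, send: Callable[[str, bytes], object], topic: str,
         **follow_kwargs) -> int:
    """Send every complete line from `stream` to `topic`; return the count."""
    count = 0
    for line in follow(stream, **follow_kwargs):
        # The payload is just bytes: an access-log line and a
        # "product purchased" event are shipped the same way.
        send(topic, line.encode("utf-8"))
        count += 1
    return count
```

In a real setup you would open the log file, seek to its end, and pass a producer's send method as `send`. Because the payload is opaque bytes, the same loop carries plain log lines and event-style records alike.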
