Correct, we essentially use the logs as an additional buffer in case of an
outage in the pipeline. Typically, though, messages are produced as soon as
they are written.
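
For illustration only, here is a rough sketch of how a log file can act as
that extra buffer. It is not the producer described in this thread: the
function name ship_new_lines, the offset bookkeeping, and the send callable
(a stand-in for whatever Kafka producer client is in use) are all
assumptions made for the example. The idea is that the read offset only
advances after a line has been handed off successfully, so anything written
during an outage simply stays in the file and is picked up on the next pass.

```python
def ship_new_lines(log_path, offset, send):
    """Send complete lines written after `offset`; return the new offset.

    `send` is a hypothetical callable standing in for a real producer
    client. If it raises ConnectionError, the offset stays at the last
    delivered line, so unsent lines remain buffered in the log file.
    """
    with open(log_path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line that is still being written
            try:
                send(line.rstrip(b"\n"))
            except ConnectionError:
                break  # pipeline outage: retry from this offset later
            offset = f.tell()  # advance only after a successful send
    return offset
```

Calling this in a loop with the returned offset gives tail-like behaviour;
a real shipper would also need to persist the offset and handle log
rotation, which this sketch leaves out.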


-Jonathan


On Fri, Jun 7, 2013 at 6:06 PM, Mark <static.void....@gmail.com> wrote:

> OK, so in your use case, instead of your application(s) writing directly to
> Kafka, you have a separate process running that tails log files and ships
> them over to Kafka. Is that correct?
>
> On Jun 7, 2013, at 5:33 PM, Jonathan Creasy <j...@box.com> wrote:
>
> > I recommend Kafka or Flume-NG for this.
> >
> > Our Analytics team is using a Kafka Producer on each server to tail logs
> > and ship them to Kafka. We use Oozie to schedule a MapReduce consumer
> > every few minutes to read all the Kafka topics into HDFS.
> >
> > We use Kafka as a buffer and keep a few weeks of data there. Our security
> > team, for example, sometimes connects and consumes some logs for various
> > purposes, usually when they want aggregate log data in real time.
> >
> > Most folks access them in HDFS. We have <1 minute of delay for most log
> > lines getting from the server where they were written to HDFS.
> >
> > -Jonathan
> >
> >
> > On Fri, Jun 7, 2013 at 5:30 PM, Mark <static.void....@gmail.com> wrote:
> >
> >> Like I said, I'm a bit confused. I see the terms "events", "messages" and
> >> "logs" and am not quite sure what to make of them.
> >>
> >> We are trying to determine the best way to aggregate all of our logs for
> >> processing in Hadoop. Kafka seems to fit this bill nicely; however, I want
> >> to know if it's suited for other types of messages as well. Are there
> >> certain determining factors for why one would choose Kafka over RabbitMQ?
> >> Is it mostly scale, or is it the type of messages/events/logs being
> >> produced/consumed?
> >>
> >> On Jun 7, 2013, at 5:21 PM, Alexis Richardson <alexis.richard...@gmail.com> wrote:
> >>
> >>> On Sat, Jun 8, 2013 at 1:08 AM, Mark <static.void....@gmail.com> wrote:
> >>>> I'm a bit confused on the concept of a "message" in Kafka. How does
> >>>> this differ, if at all, from a message in RabbitMQ? It seems to me that
> >>>> Kafka is better suited for very write-intensive "messages" like log data
> >>>> but RabbitMQ may be a better fit for traditional "messages"… i.e.
> >>>> "Product Purchased" or "User Registered" messages.
> >>>
> >>> I'm not sure why you think this, or how to distinguish between a 'log'
> >>> message and some other kind.
> >>>
> >>> Messages = data, annotated with metadata.  The latter is typically a
> >>> protocol-specific envelope.  Kafka and Rabbit certainly have different
> >>> envelopes, e.g. for mapping data to subscribers/queries.
> >>>
> >>> alexis
> >>
> >>
> >
> >
> > --
> > *Jonathan Creasy* | Sr. Ops Engineer
> >
> > e: j...@box.com | t: 314.580.8909
>
>
