Check out my logstash-kafka project:

https://github.com/joekiller/logstash-kafka

I believe the plugin will be merged into logstash itself soon, but for now you 
can build it yourself.

I would suggest having Apache write its access log as JSON (set the format in 
your Apache config), then streaming the data through the logstash kafka output 
(producer) and parsing it on the other side with the logstash kafka input 
(consumer).

Try something like:

LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"mod_proxy\":{\"x-forwarded-for\":\"%{X-Forwarded-For}i\"},\"mod_headers\":{\"referer\":\"%{Referer}i\",\"user-agent\":\"%{User-Agent}i\",\"host\":\"%{Host}i\"},\"mod_log\":{\"server_name\":\"%V\",\"remote_logname\":\"%l\",\"remote_user\":\"%u\",\"first_request\":\"%r\",\"last_request_status\":\"%>s\",\"response_size_bytes\":%B,\"duration_usec\":%D},\"@version\":1}" logstash_json


CustomLog "|rotatelogs /var/log/httpd/access_log_json-%s 3600" logstash_json
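
As a rough illustration of what the consuming side has to deal with, here is a small Python sketch that decodes one line of such a log. The sample line is hand-written and every value in it is invented; only the field names come from the LogFormat, and I'm assuming each request ends up as one well-formed JSON object per line. parse_access_line is a hypothetical helper, not part of logstash or Kafka.

```python
import json

# Hand-written sample of one access-log line in the logstash_json format.
# All values are invented for illustration; only the field names come from
# the LogFormat directive.
SAMPLE_LINE = (
    '{"@timestamp":"2014-08-07T15:01:00-0400",'
    '"mod_proxy":{"x-forwarded-for":"-"},'
    '"mod_headers":{"referer":"-","user-agent":"curl/7.30.0","host":"example.com"},'
    '"mod_log":{"server_name":"example.com","remote_logname":"-","remote_user":"-",'
    '"first_request":"GET / HTTP/1.1","last_request_status":"200",'
    '"response_size_bytes":1234,"duration_usec":5678},'
    '"@version":1}'
)


def parse_access_line(line):
    """Decode one JSON log line; return None if it is not a JSON object."""
    try:
        event = json.loads(line)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    return event if isinstance(event, dict) else None


event = parse_access_line(SAMPLE_LINE)
print(event["mod_log"]["last_request_status"], event["mod_log"]["duration_usec"])
```

Returning None for lines that do not parse lets a consumer skip damaged records instead of crashing on them.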

________________________________________
From: Philip O'Toole <philip.oto...@yahoo.com.INVALID>
Sent: Thursday, August 07, 2014 3:01 PM
To: users@kafka.apache.org
Subject: Re: Apache webserver access logs + Kafka producer

Fluentd might work. Alternatively, simply configure rsyslog or syslog-ng on the 
box to watch the Apache log files and send them to a suitable producer. For 
example, I wrote something that accepts messages from a syslog client and 
streams them to Kafka: https://github.com/otoolep/syslog-gollector
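
To make the "watch the log file and hand lines to a producer" idea concrete, here is a minimal Python sketch. It is not how rsyslog, syslog-ng or syslog-gollector work internally. read_new_lines and forward are hypothetical helpers, and send stands in for whatever producer call you end up using (here it just appends to a list). Log rotation is not handled.

```python
import os
import tempfile


def read_new_lines(fh):
    """Return the complete lines appended to fh since the last call.

    A trailing partial line (no newline yet) is left unread so that it can
    be picked up whole on a later call.
    """
    lines = []
    while True:
        pos = fh.tell()
        line = fh.readline()
        if not line:
            break  # nothing more to read right now
        if not line.endswith("\n"):
            fh.seek(pos)  # writer is mid-line; retry on the next call
            break
        lines.append(line.rstrip("\n"))
    return lines


def forward(lines, send):
    """Pass each line to send, a stand-in for a real producer call."""
    for line in lines:
        send(line)


# Demo against a temporary file instead of a real Apache log.
path = os.path.join(tempfile.mkdtemp(), "access_log")
with open(path, "w") as w:
    w.write('{"a":1}\n{"a":2}\n{"a"')  # last record is still being written

sent = []
reader = open(path)
forward(read_new_lines(reader), sent.append)

with open(path, "a") as w:
    w.write(":3}\n")  # the writer finishes the partial record
forward(read_new_lines(reader), sent.append)
reader.close()

print(sent)
```

In a real deployment you would call read_new_lines in a loop with a short sleep, and replace sent.append with the send method of your Kafka client.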


More ideas here:

https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

  Philip
-----------------------------------------
http://www.philipotoole.com
On Tuesday, August 5, 2014 2:48 PM, Florian Dambrine <flor...@gumgum.com> wrote:



You might be interested in something like Logstash (http://logstash.org) for
log and event processing.

Regards,

Florian

On 5 August 2014 at 23:17, "Jonathan Weeks" <jonathanbwe...@gmail.com> wrote:

> You can look at something like:
>
> https://github.com/harelba/tail2kafka
>
> (although I don’t know what the effort would be to update it, as it
> doesn’t look like it has been updated in a couple of years)
>
> We are using Flume to gather logs, and then sending them to a Kafka
> cluster via a Flume Kafka sink, e.g.:
>
> https://github.com/thilinamb/flume-ng-kafka-sink
>
> -Jonathan
>
>
> On Aug 5, 2014, at 1:40 PM, mvs.s...@gmail.com wrote:
>
> > Hi,
> >
> > I want to collect Apache web server logs in real time and send them to a
> > Kafka server. Is there an existing producer available for this? If not,
> > can you please suggest a way to implement it?
> >
> > Regards,
> > Sree.
>
>
