Re: Apache webserver access logs + Kafka producer

2014-08-07 Thread Philip O'Toole
Fluentd might work, or you could simply configure rsyslog or syslog-ng on the 
box to watch the Apache log files and send them to a suitable Producer. For 
example, I wrote something that accepts messages from a syslog client and 
streams them to Kafka: https://github.com/otoolep/syslog-gollector


More ideas here:

https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

  Philip
-
http://www.philipotoole.com 
On Tuesday, August 5, 2014 2:48 PM, Florian Dambrine flor...@gumgum.com wrote:
 


You might be interested in something like Logstash (http://logstash.org) for
logs and event processing.

Regards,

Florian

On 5 August 2014 at 23:17, Jonathan Weeks jonathanbwe...@gmail.com wrote:

 You can look at something like:

 https://github.com/harelba/tail2kafka

 (although I don’t know what the effort would be to update it, as it
 doesn’t look like it has been updated in a couple of years)

 We are using Flume to gather logs, and then sending them to a Kafka
 cluster via a Flume Kafka sink, e.g.:

 https://github.com/thilinamb/flume-ng-kafka-sink

 -Jonathan


 On Aug 5, 2014, at 1:40 PM, mvs.s...@gmail.com wrote:

  Hi,
 
  I want to collect Apache web server logs in real time and send them to a
  Kafka server. Is there an existing Producer that does this? If not, can
  you please suggest a way to implement it?
 
  Regards,
  Sree.



RE: Apache webserver access logs + Kafka producer

2014-08-07 Thread Joseph Lawson
Check out my logstash-kafka project:

https://github.com/joekiller/logstash-kafka

I believe the plugin will be merged into Logstash itself soon, but for now you 
can build it yourself.

I would suggest having Apache write its access log as JSON (set in your Apache 
config), then streaming the data through the Logstash Kafka output (producer) 
and parsing it on the other side with the Logstash Kafka input (consumer).

Try something like:

LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"mod_proxy\":{\"x-forwarded-for\":\"%{X-Forwarded-For}i\"},\"mod_headers\":{\"referer\":\"%{Referer}i\",\"user-agent\":\"%{User-Agent}i\",\"host\":\"%{Host}i\"},\"mod_log\":{\"server_name\":\"%V\",\"remote_logname\":\"%l\",\"remote_user\":\"%u\",\"first_request\":\"%r\",\"last_request_status\":\"%>s\",\"response_size_bytes\":%B,\"duration_usec\":%D},\"@version\":1}" logstash_json


CustomLog "|rotatelogs /var/log/httpd/access_log_json-%s 3600" logstash_json
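
For illustration, here is a minimal Python sketch of what a consumer could do with one line written in a JSON access-log format like the one the LogFormat above is meant to produce. The sample line and its values are made up, the function name summarize is hypothetical, and nothing here is part of logstash-kafka itself.

```python
import json

# Hypothetical sample line, shaped like the JSON the LogFormat is meant to emit.
SAMPLE_LINE = (
    '{"@timestamp":"2014-08-07T17:35:00-0400",'
    '"mod_proxy":{"x-forwarded-for":"-"},'
    '"mod_headers":{"referer":"-","user-agent":"curl/7.30.0","host":"example.com"},'
    '"mod_log":{"server_name":"example.com","remote_logname":"-","remote_user":"-",'
    '"first_request":"GET / HTTP/1.1","last_request_status":"200",'
    '"response_size_bytes":512,"duration_usec":1834},'
    '"@version":1}'
)


def summarize(line):
    """Parse one JSON access-log line and pull out a few fields."""
    event = json.loads(line)
    log = event["mod_log"]
    return {
        "timestamp": event["@timestamp"],
        "request": log["first_request"],
        "status": int(log["last_request_status"]),  # logged as a string
        "bytes": log["response_size_bytes"],        # logged as a number
        "duration_ms": log["duration_usec"] / 1000.0,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LINE))
```

Because each line is already valid JSON, the consumer needs no regular expressions or grok patterns to get at individual fields.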





RE: Apache webserver access logs + Kafka producer

2014-08-07 Thread Joseph Lawson
PS: you can also try feeding the logs straight into a Kafka console producer:

TransferLog "|/opt/kafka/bin/kafka-console-producer.sh --topic apache --broker-list broker-1:9092"
ErrorLog "|/opt/kafka/bin/kafka-console-producer.sh --topic apache-errors --broker-list broker-1:9092"

You can pipe a custom log into it as well :)
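
The same piped-log idea can be sketched with a small custom program in place of the console producer. This is only a sketch under assumptions: it relies on the third-party kafka-python package (not mentioned in this thread), reuses the broker-1:9092 address and apache topic from the commands above, and the names forward_lines and main are hypothetical.

```python
import sys


def forward_lines(stream, send):
    """Pass each non-empty line from `stream` to `send`; return the count."""
    count = 0
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue
        send(line.encode("utf-8"))
        count += 1
    return count


def main(broker="broker-1:9092", topic="apache"):
    # Assumption: kafka-python is installed and the broker is reachable.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=broker)
    try:
        # Apache writes log lines to this program's stdin.
        forward_lines(sys.stdin, lambda value: producer.send(topic, value))
    finally:
        producer.flush()  # push out anything still buffered
        producer.close()
```

To use it as a piped-log target, a script would call main() at the bottom and be referenced from a TransferLog or CustomLog directive in the same way as the console producer. Keeping the send function as a parameter means the line handling can be exercised without a running broker.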




Apache webserver access logs + Kafka producer

2014-08-05 Thread mvs.sree
Hi,

I want to collect Apache web server logs in real time and send them to a Kafka
server. Is there an existing Producer that does this? If not, can you please
suggest a way to implement it?

Regards,
Sree.


Re: Apache webserver access logs + Kafka producer

2014-08-05 Thread Jonathan Weeks
You can look at something like: 

https://github.com/harelba/tail2kafka

(although I don’t know what the effort would be to update it, as it doesn’t 
look like it has been updated in a couple of years)
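
The general tail-and-forward approach can be sketched in Python as follows. This is not how tail2kafka is implemented; it is a minimal polling sketch with hypothetical names (read_new_lines, follow), and it assumes the log file is only appended to, so it does not handle rotation or truncation.

```python
import os
import time


def read_new_lines(path, offset):
    """Return (complete lines appended after byte `offset`, new offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Only consume up to the last newline; a trailing partial line is
    # left for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].splitlines()
    return lines, offset + end


def follow(path, send, poll_seconds=1.0):
    """Poll `path` forever, passing each newly appended line to `send`."""
    offset = os.path.getsize(path)  # start at the end, like `tail -f`
    while True:
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            send(line)
        time.sleep(poll_seconds)
```

Here send stands in for whatever hands a line to Kafka, for example a producer's send method bound to a topic.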

We are using Flume to gather logs, and then sending them to a Kafka cluster via 
a Flume Kafka sink, e.g.:

https://github.com/thilinamb/flume-ng-kafka-sink

-Jonathan





Re: Apache webserver access logs + Kafka producer

2014-08-05 Thread Florian Dambrine
You might be interested in something like Logstash (http://logstash.org) for
logs and event processing.

Regards,

Florian