I would have a cron job that runs every day but tracks whether it has already 
pulled the data for the current month.  If it has, it just does nothing.  That 
way, if you have some sort of failure one day (the website is down, etc.), it 
would pull the data the next day.

You could possibly use Kafka itself to store the last month for which it 
grabbed data.  Running once a day is just an example; the basic idea is to have 
some way of automatically dealing with failures.  You might also want some way 
to check each month that the job actually ran, in case it just stops working 
altogether.
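
Something like the following rough sketch shows the idea. It is only an 
illustration, not a tested production job: the state file name, run_once, and 
the fetch/publish callables are all made-up names, and here the "last month 
pulled" marker lives in a local JSON file rather than in Kafka. The publish 
callable is where a real Kafka producer's send call would go.

```python
import datetime
import json
import pathlib
from typing import Callable, Optional

# Hypothetical location for the "last month pulled" marker.
STATE_FILE = pathlib.Path("last_pull.json")


def month_key(day: datetime.date) -> str:
    """Return the month as a string such as '2018-01'."""
    return day.strftime("%Y-%m")


def already_pulled(state_file: pathlib.Path, day: datetime.date) -> bool:
    """True if the state file records a successful pull for day's month."""
    if not state_file.exists():
        return False
    state = json.loads(state_file.read_text())
    return state.get("last_month") == month_key(day)


def run_once(
    fetch: Callable[[], str],
    publish: Callable[[str], None],
    state_file: pathlib.Path = STATE_FILE,
    today: Optional[datetime.date] = None,
) -> bool:
    """Run from cron once a day; returns True only if data was published.

    fetch and publish are placeholders: fetch would pull the value from the
    web site, publish would hand it to a Kafka producer.
    """
    today = today or datetime.date.today()
    if already_pulled(state_file, today):
        return False  # this month is done, so do nothing
    data = fetch()    # if this raises, the marker is not updated
    publish(data)     # same here, so tomorrow's run will retry
    state_file.write_text(json.dumps({"last_month": month_key(today)}))
    return True
```

Because the marker is written only after both steps succeed, a failed day is 
simply retried on the next run. If the job dies between publish and writing the 
marker, the message could be sent twice, so consumers should tolerate a 
duplicate.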

-Dave

-----Original Message-----
From: James Smyth [mailto:smyth.james...@gmail.com]
Sent: Monday, January 15, 2018 1:48 PM
To: users@kafka.apache.org
Subject: what are common ways to convert info on a web site into a log entry?

Hi Kafka people,

I am very new to Kafka, so perhaps my question is naive. I spent some time 
searching through Kafka resources but only became more confused.

What are common ways to pull info from a web site and send it to Kafka to 
become a log entry?
There is a web site that I want to pull a piece of data from once a month and 
have that data written to a Kafka log. Consumers will be listening for that 
message to do processing on it.
I am not sure about common ways to do this.

I am thinking I could have some scheduler (e.g. cron) wake up once per month, 
trigger the pull of the data from the web site, and then send it to a Kafka 
stream.
Does Kafka have the ability to trigger events once a month, or is using cron a 
better idea?
Should the scheduler trigger a stand-alone batch job or start some service 
such as a Kafka producer? Should I worry about a service running all the time 
when it is likely to do only a few seconds of work each month?

Many thanks,

James.

