I would have a cron job that runs every day but tracks whether it has already pulled the data for the current month. If it has, it does nothing. That way, if there is a failure one day (the website is down, etc.), it pulls the data the next day.
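As a rough sketch of that idea, and not a definitive implementation: the function below is meant to be called once a day by cron. It records the last month it successfully handled in a small local JSON file, and only writes that marker after both the pull and the send succeed. The names `run_daily`, `state_path`, `fetch`, and `publish` are hypothetical; `fetch` stands in for whatever pulls the value from the web site, and `publish` for whatever sends it to Kafka.

```python
import datetime
import json
from pathlib import Path


def month_key(today):
    """Return a 'YYYY-MM' string identifying the month of `today`."""
    return today.strftime("%Y-%m")


def run_daily(state_path, fetch, publish, today=None):
    """Pull and publish the data at most once per calendar month.

    Returns True if data was pulled on this run, and False if this
    month was already handled. If `fetch` or `publish` raises, the
    state file is left untouched, so the next daily run retries.
    """
    today = today or datetime.date.today()
    key = month_key(today)

    # Read the last month that was handled successfully, if any.
    last = None
    if state_path.exists():
        last = json.loads(state_path.read_text()).get("last_month")

    if last == key:
        return False  # already done for this month, nothing to do

    data = fetch()    # hypothetical: pull the value from the web site
    publish(data)     # hypothetical: send the value to Kafka

    # Record success only after both steps completed.
    state_path.write_text(json.dumps({"last_month": key}))
    return True
```

Because the marker is written last, a failure anywhere before it means the next day's run simply tries again. If the process dies after `publish` but before the marker is written, the value would be sent twice, so consumers should tolerate an occasional duplicate.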
You could possibly use Kafka itself to store the last month that it grabbed data for. Running once a day is just an example; the basic idea is to have some way of dealing with failures automatically. You might also want some way to monitor it monthly in case it stops working altogether.

-Dave

-----Original Message-----
From: James Smyth [mailto:smyth.james...@gmail.com]
Sent: Monday, January 15, 2018 1:48 PM
To: users@kafka.apache.org
Subject: what are common ways to convert info on a web site into a log entry?

Hi Kafka people,

I am very new to Kafka, so perhaps my question is naive. I spent some time searching through Kafka resources but only became more confused.

What are common ways to pull info from a web site and send it to Kafka to become a log entry?

There is a web site that I want to pull a piece of data from once a month and have that data written to a Kafka log. Consumers will be listening for that message to do processing on it.

I am not sure about common ways to do this. I am thinking I could have some scheduler (e.g. cron) wake up once per month, trigger the pull of the data from the web site, and then send it to a Kafka stream. Does Kafka have the ability to trigger events once a month, or is using cron a better idea? Would the scheduler trigger a stand-alone batch job, or a service such as a Kafka producer? Should I worry about a service running all the time when it is likely to do only a few seconds of work each month?

Many thanks,
James.