Hello, I am new to Spark and have the following requirement.
Problem statement: A third-party application continuously drops files onto a server. Typically about 100 files arrive per hour, and each file is smaller than 50 MB. My application has to process those files.

1) Is it possible for Spark Streaming to trigger a job as soon as a file is placed, instead of triggering a job at a fixed batch interval?
2) If that is not possible with Spark Streaming, can we control this with Kafka/Flume?

Thanks,
Sivaram