I think it's a very relevant use case. In the Apex formulation it would work as follows. An operator runs continuously and maintains internal state that tracks processed files or an offset (e.g. in Kafka). As more data becomes available, the operator performs the appropriate operation and then returns to waiting. In this fashion, batched data is processed as soon as it becomes available, but the process overall is still a batch process since it is limited by the production of the source batches.
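To make the pattern above concrete, here is a minimal sketch in plain Java. It is not the Apex or Malhar API: the class IncrementalBatchScanner, its poll() method, and the file names are all hypothetical, chosen only to show the "remember what was processed, act only on what is new" idea. In a real Apex application this logic would presumably live in an input operator, with the platform checkpointing the state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an always-running input operator (not the Apex API).
class IncrementalBatchScanner {
    // State kept between polls: names of the batches already handled.
    private final Set<String> processed = new LinkedHashSet<>();

    // One polling pass. 'available' is whatever the source currently exposes
    // (for example a directory listing). Returns only batches not seen before.
    public List<String> poll(List<String> available) {
        List<String> fresh = new ArrayList<>();
        for (String name : available) {
            if (processed.add(name)) { // add() returns false for known batches
                fresh.add(name);
            }
        }
        return fresh;
    }

    public int processedCount() {
        return processed.size();
    }

    public static void main(String[] args) {
        IncrementalBatchScanner scanner = new IncrementalBatchScanner();
        // Day 1: one batch file has landed, so it is picked up.
        System.out.println(scanner.poll(Arrays.asList("spend-2017-06-12.csv")));
        // Polling again with nothing new: the operator simply goes back to waiting.
        System.out.println(scanner.poll(Arrays.asList("spend-2017-06-12.csv")));
        // Day 2: only the newly arrived batch is processed.
        System.out.println(scanner.poll(
                Arrays.asList("spend-2017-06-12.csv", "spend-2017-06-13.csv")));
    }
}
```

The same shape applies to an offset-based source: replace the set of names with the last consumed offset and return everything after it.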
There are a couple of examples of this in Malhar, for example the AbstractFileInputOperator.

Your earlier comment regarding your motivation is interesting. Can you elaborate on the load reduction you get with your approach? A number of small batched writes to a DB may prove more efficient from a latency or database utilization standpoint than infrequent large batch writes, particularly if they involve index updates.

________________________________
From: [email protected] <[email protected]>
Sent: Tuesday, June 13, 2017 6:36:29 PM
To: [email protected]; [email protected]
Subject: Re: Is there a way to schedule an operator?

I have input operators that reach out to Google, Facebook, Bing, Yahoo etc. once a day or an hour and download marketing spend statistics. Apex promises batch and streaming to be equal class citizens. How is this equality achieved if there's no scheduler for batch jobs to rely on? I want the DAG to take a data stream from a batch pipeline and affect streaming pipelines running alongside. Do you not see this as a valid use case?

On Tue, Jun 13, 2017 at 5:29 PM, Guilherme Hott <[email protected]> wrote:

Hi guys,

Is there a way to schedule an operator? I need an operator to start the DAG once a day at 00am.

Best
--
Guilherme Hott
Software Engineer
Skype: guilhermehott
@guilhermehott
https://www.linkedin.com/in/guilhermehott
