If you want to go this way, you could:
- as you proposed, use some busy waiting, reading some file from a distributed file system
- wait for some network message (opening your own socket)
- use some other external system for this purpose: Kafka? ZooKeeper?
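The first option above (busy waiting on a file) can be sketched in plain Java, assuming the preload step writes a marker file to a path visible to all tasks. The class name, path handling, and timeout parameters are illustrative assumptions, not part of any Flink API:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: poll a shared (e.g. distributed) file system path
// until a marker file appears, signalling that the preload step finished.
public class MarkerFileWait {

    /**
     * Blocks until markerPath exists, polling every pollMillis,
     * giving up after timeoutMillis. Returns true if the marker was seen.
     */
    public static boolean awaitMarker(Path markerPath, long pollMillis, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (Files.exists(markerPath)) {
                return true; // preload step has signalled completion
            }
            Thread.sleep(pollMillis); // busy wait with a sleep between polls
        }
        return false; // timed out without seeing the marker
    }
}
```

As noted in the thread, this is a workable but hacky signalling mechanism; the polling interval trades latency against file-system load.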
All of them seem hacky, though, and I would prefer (as I proposed before) to pre-compute those ids before starting the main Flink application. That would probably be simpler and easier to maintain.

Piotrek

> On 25 Jan 2018, at 13:47, Ishwara Varnasi <ivarn...@gmail.com> wrote:
>
> FLIP-17 is promising. Until it's available, I'm planning to do this:
> extend the Kafka consumer and add logic to hold off consuming until the other
> source (the fixed set) completes sending and those messages are processed by the
> application. The question, however, is how to let the Kafka consumer know
> that it should now start consuming messages. What is the correct way to
> broadcast messages to other tasks at runtime? I had success with the
> distributed cache (i.e. one task writes status to a file and the other looks for
> status in that file), but that doesn't look like a good solution, although it works.
> Thanks for the pointers.
> Ishwara Varnasi
>
> Sent from my iPhone
>
> On Jan 25, 2018, at 4:03 AM, Piotr Nowojski <pi...@data-artisans.com> wrote:
>
>> Hi,
>>
>> As far as I know there is currently no simple way to do this:
>> Join stream with static data in
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
>> and
>> https://issues.apache.org/jira/browse/FLINK-6131
>>
>> One workaround might be to buffer the Kafka input in state in your
>> TwoInput operator until all of the broadcast messages have arrived.
>> Another option might be to start your application dynamically: first run
>> some computation to determine the fixed list of ids, then start the Flink
>> application with those values hardcoded in / passed via command line arguments.
>>
>> Piotrek
>>
>>> On 25 Jan 2018, at 04:10, Ishwara Varnasi <ivarn...@gmail.com> wrote:
>>>
>>> Hello,
>>> I have a scenario with two sources: one of them is a source of a fixed
>>> list of ids for preloading (caching certain info, which is slow), and the
>>> second one is the Kafka consumer. I need to run the Kafka source after the
>>> first one completes, so I need a mechanism to let the Kafka consumer know
>>> that it can start consuming messages. How can I achieve this?
>>> thanks
>>> Ishwara Varnasi
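The buffering workaround Piotr mentions (hold back the Kafka side in state until the broadcast side is done) boils down to a gate-then-flush pattern. A minimal plain-Java sketch of just that logic is below; in a real Flink job the buffer would live in managed state (e.g. `ListState`) inside a two-input operator or a `CoProcessFunction`, and the class and method names here are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: buffer elements from one input until the other
// input signals completion, then flush the buffer and pass through.
public class GatedBuffer<T> {
    private final List<T> buffer = new ArrayList<>(); // in Flink: managed ListState
    private final Consumer<T> downstream;
    private boolean ready = false;

    public GatedBuffer(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    /** Called for each Kafka-side element (input 1 of the two-input operator). */
    public void onElement(T element) {
        if (ready) {
            downstream.accept(element); // gate is open: pass straight through
        } else {
            buffer.add(element); // gate is closed: hold the element back
        }
    }

    /** Called once the static/broadcast side (input 2) has delivered everything. */
    public void onReady() {
        ready = true;
        for (T buffered : buffer) {
            downstream.accept(buffered); // flush in arrival order
        }
        buffer.clear();
    }
}
```

Note that in an unkeyed streaming job the buffer can grow without bound until the ready signal arrives, which is one reason the thread leans toward pre-computing the ids before the job starts instead.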