I have a Flink job that needs to consume from multiple sources. The data comes from a Kafka topic where the order in which records are consumed is important. Because storage on S3 costs much less than storage on Kafka, we have a separate job that sinks the topic to S3, so the Kafka topic only needs to retain 3 days' worth of data. My job needs to first consume everything from the existing S3 file(s) and only then start consuming from the Kafka topic. When I use the union operator in Flink, the records from the two sources arrive interleaved. Is there any way to control the ordering so that the job reads S3 first and then Kafka, all within the same job?
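
To make the requirement concrete, here is a minimal sketch in plain Java of the ordering I am after. It is not Flink code: the class and method names are made up for illustration, and the two iterators simply stand in for the S3 file(s) and the Kafka topic.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Plain-Java model of the desired ordering; none of this is Flink API. */
final class SequentialSources {

    /** Returns an iterator that drains all of {@code first} before reading {@code second}. */
    static <T> Iterator<T> concat(Iterator<T> first, Iterator<T> second) {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return first.hasNext() || second.hasNext();
            }

            @Override
            public T next() {
                // Only fall through to the second source once the first is exhausted.
                return first.hasNext() ? first.next() : second.next();
            }
        };
    }

    /** Collects the remaining elements of an iterator into a list, in order. */
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

With hypothetical S3 records `s3-1, s3-2, s3-3` and Kafka records `k-1, k-2`, `SequentialSources.drain(SequentialSources.concat(s3.iterator(), kafka.iterator()))` yields `s3-1, s3-2, s3-3, k-1, k-2`. With union I get no such guarantee about the order across the two inputs.
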
Kurt