Hello,

We have a requirement as follows:

We want to stream events from two sources: Parquet files stored in a GCS
bucket and a Kafka topic.
With the release of Hybrid Source in Flink 1.14, we were able to construct
a HybridSource that produces events from two sources: a FileSource
that reads data from a locally saved Parquet file, and a KafkaSource
that consumes events from a remote Kafka broker.

I was wondering whether, instead of using a local Parquet file, it is
possible to stream the file directly from a GCS bucket and construct a
FileSource from it at runtime? The Parquet files are quite big, and it's
a bit expensive to download them.

Does Flink have such functionality? Or has anyone come across such a
use case before? I would greatly appreciate some help with this.

Looking forward to hearing from you.

Thanks,
Megh
