Re: Question about saving data to use across runner's instances

2020-11-16 Thread Reza Ardeshir Rokni
Hi, do you have an upper bound on how large the file will become? If it's small enough to fit into a side input, you may be able to make use of the slowly updating side input pattern: https://beam.apache.org/documentation/patterns/side-inputs/ If not, then a stateful DoFn would be a good choice, but note a
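
To make the idea behind a slowly updating lookup concrete, here is a minimal plain-Java sketch of a periodically refreshed in-memory snapshot. It deliberately does not use the Beam API: the class name `RefreshingSnapshot`, its constructor parameters, and the injectable clock are all hypothetical, introduced only for illustration. In a real pipeline the side input mechanism described at the link above takes care of distributing the refreshed value to workers.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Beam): caches a value and reloads it
 * once the refresh interval has elapsed since the last load.
 */
final class RefreshingSnapshot<T> {
    private final Supplier<T> loader;        // assumption: e.g. reads the file from shared storage
    private final long refreshIntervalMillis;
    private final LongSupplier clockMillis;  // injectable clock so the behavior can be tested

    private T value;
    private long lastLoadedMillis;
    private boolean loaded;

    RefreshingSnapshot(Supplier<T> loader, long refreshIntervalMillis, LongSupplier clockMillis) {
        this.loader = loader;
        this.refreshIntervalMillis = refreshIntervalMillis;
        this.clockMillis = clockMillis;
    }

    /** Returns the cached value, reloading it first if it is missing or stale. */
    synchronized T get() {
        long now = clockMillis.getAsLong();
        if (!loaded || now - lastLoadedMillis >= refreshIntervalMillis) {
            value = loader.get();
            lastLoadedMillis = now;
            loaded = true;
        }
        return value;
    }
}
```

A caller would construct it with something like `new RefreshingSnapshot<>(loader, 60_000, System::currentTimeMillis)` and call `get()` whenever the data is needed. This only works when the whole file fits in memory, which is the same size constraint that applies to a side input.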

Question about saving data to use across runner's instances

2020-11-15 Thread Artur Khanin
Hi all, I am designing a Dataflow pipeline in Java that has to:
* Read a file (it may be pretty large) during initialization and then store it in some sort of shared memory
* Periodically update this file
* Make this file available to read across all of the runner's instances
* Persist