Hi all,

I am designing a Dataflow pipeline in Java that has to:

  *   Read a file (it may be quite large) during initialization and store 
it in some form of shared memory
  *   Periodically update the file
  *   Make the file readable across all runner instances
  *   Persist the file across restarts, crashes, and scale-up/scale-down events

I found some information about stateful processing in Beam using Stateful 
DoFn<https://beam.apache.org/blog/stateful-processing/>. Is that an appropriate 
way to implement this functionality, or is there a better approach?
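For context, here is a minimal sketch of the Stateful DoFn pattern I am considering. It is only an illustration of the API, not my actual pipeline: the key, the element types, and the loadFile() helper are placeholders, and note that Beam state is partitioned per key and window (so "shared across all instances" would need keying everything to a single key, or a different mechanism such as a side input):

```java
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Sketch: cache file contents in per-key state. The runner persists
// state for us, so it survives worker restarts and rescaling.
public class FileCacheFn extends DoFn<KV<String, String>, String> {

  // ValueState holds a single value per key and window.
  @StateId("fileContents")
  private final StateSpec<ValueState<String>> fileContentsSpec =
      StateSpecs.value();

  @ProcessElement
  public void processElement(
      ProcessContext ctx,
      @StateId("fileContents") ValueState<String> fileContents) {
    String cached = fileContents.read();
    if (cached == null) {
      // First element seen for this key: load and cache the file.
      cached = loadFile();
      fileContents.write(cached);
    }
    ctx.output(cached);
  }

  // Placeholder for the actual file read (e.g. from GCS).
  private String loadFile() {
    return "file contents";
  }
}
```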

Any help or information would be greatly appreciated!

Thanks,
Artur Khanin
Akvelon, Inc.