Hi,

In one of my transforms I am using a Map, produced by a previous transform, as a sideInput. This Map<String, Int> holds the count of every word that appears across all documents, so it is potentially very large.
The step that uses the sideInput is quite slow, because it seems to initialise a huge HashMap for every element it processes. (I followed this example: https://beam.apache.org/documentation/programming-guide/#side-inputs)

Is this the wrong way of using sideInputs? By this I mean: can a sideInput be too big to be a sideInput?

I also thought about saving the sideInput in a static class variable, so that in principle I only have to read it once per "transform" initialised in the cluster.

Am I going about this totally wrong? Should I try other approaches?

Best regards,
Augusto
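
P.S. To make the static-variable idea concrete, here is a minimal plain-Java sketch of what I have in mind. It uses no Beam classes at all: WordCountCache, get and loadWordCounts are hypothetical names made up for this illustration, and loadWordCounts is only a stand-in for however the real word counts would be read. The point is just that the expensive load runs once and every later element reuses the same map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Plain-Java sketch: build a large lookup map once per JVM and reuse it for every element. */
final class WordCountCache {
    // Shared by all callers in this JVM; volatile so the fully built map is published safely.
    private static volatile Map<String, Integer> cache;

    // Counts how often the expensive load actually ran (for illustration only).
    static final AtomicInteger loads = new AtomicInteger();

    private WordCountCache() {}

    /** Returns the cached map, calling the loader only on the first request. */
    static Map<String, Integer> get(Supplier<Map<String, Integer>> loader) {
        Map<String, Integer> local = cache;
        if (local == null) {
            synchronized (WordCountCache.class) {
                local = cache;
                if (local == null) {
                    // Copy into a read-only map so later callers cannot modify shared state.
                    local = Collections.unmodifiableMap(new HashMap<>(loader.get()));
                    loads.incrementAndGet();
                    cache = local;
                }
            }
        }
        return local;
    }

    /** Hypothetical stand-in for reading the real word counts. */
    static Map<String, Integer> loadWordCounts() {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("beam", 3);
        counts.put("side", 2);
        counts.put("input", 2);
        return counts;
    }

    public static void main(String[] args) {
        // Simulates per-element processing: each element does one lookup in the shared map.
        String[] elements = {"beam", "input", "missing", "beam"};
        for (String word : elements) {
            int count = get(WordCountCache::loadWordCounts).getOrDefault(word, 0);
            System.out.println(word + " -> " + count);
        }
        System.out.println("loads = " + loads.get());
    }
}
```

As far as I understand, a static field like this would give one copy per worker JVM rather than one per transform instance, and the whole map would still have to fit in each worker's memory.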