Hi,

I am running out of resources on the worker machines. The reasons are:

1. Every PCollection element is a reference to a LARGE file that is copied onto the worker.
2. The worker runs calculations on the copied file using a software library that consumes memory, storage, and compute resources.

I have already changed the workers' CPUs and memory size, but at some point I run out of resources with this method as well. I am looking for a way to limit the number of PCollection elements that are processed in parallel on each worker at a time.

Many thanks for any advice,
Best wishes,
--
Eila
<http://www.orielresearch.com>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>