Hi,
I am running out of resources on the worker machines.
The reasons are:
1. Every PCollection element is a reference to a LARGE file that is copied
to the worker
2. The worker runs calculations on the copied file using a software
library that consumes memory / storage / compute resources

I have already changed the workers' CPU and memory size, but at some point
I run out of resources with that approach as well.
I am looking for a way to limit the number of PCollection elements that
are processed in parallel on each worker at any one time.

Many thanks for any advice,
Best wishes,
-- 
Eila
<http://www.orielresearch.com>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>
