Has anyone done much experimenting on optimal worker sizes? I'm basically unsure if it is better to run with more, smaller workers, or fewer, larger workers. Right now, I'm using ~3GB workers, and around 5 or so per machine. Would it be better to reduce this number?
The main issues that come to mind are If larger workers - if one crashes, more data is lost - more GC issues for larger heap sizes if smaller workers - more overhead - more threads used - less local shuffling capability - more load on ZK/nimbus(?) Thoughts?
