We run small workers: 4 per node at the moment, at 500 MB each.
Moving to 2 per node at 500 MB each soon. That's largely dictated by several of our larger topologies doing a lot of network calls to various services. More machines means more sockets. (We open a lot of sockets.)

On Wed, Dec 18, 2013 at 11:26 PM, Michael Rose <[email protected]> wrote:

> It really comes down to your use case; perhaps you can comment on what
> you're doing.
>
> Personally, we run smaller workers and more of them, mainly because
> having more JVMs helps us avoid internal contention on strangely-locked
> JVM internals. We have on average 2-3 workers per machine with
> moderately sized heaps.
>
> I can't say I'd think the overhead is too much more to have extra
> workers if you're doing shuffle or fields groupings most of the time
> anyway.
>
> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
> [email protected]
>
>
> On Wed, Dec 18, 2013 at 8:54 PM, Jon Logan <[email protected]> wrote:
>
>> Has anyone done much experimenting with optimal worker sizes? I'm
>> basically unsure whether it is better to run with more, smaller
>> workers or fewer, larger workers. Right now I'm using ~3 GB workers,
>> around 5 or so per machine. Would it be better to reduce this number?
>>
>> The main issues that come to mind are:
>>
>> If larger workers:
>> - if one crashes, more data is lost
>> - more GC issues for larger heap sizes
>>
>> If smaller workers:
>> - more overhead
>> - more threads used
>> - less local shuffling capability
>> - more load on ZK/Nimbus(?)
>>
>> Thoughts?

--
Ce n'est pas une signature
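For reference, in Apache Storm the worker count per node and the per-worker heap discussed above are controlled from storm.yaml on each supervisor: each entry in `supervisor.slots.ports` is one worker slot, and `worker.childopts` sets the JVM flags each worker launches with. A minimal sketch of the "2 workers per node at 500 MB each" setup might look like this (the port numbers are illustrative, not required values):

```yaml
# storm.yaml on each supervisor node (sketch; ports are illustrative).
# Two entries in supervisor.slots.ports means at most two worker JVMs
# per node.
supervisor.slots.ports:
    - 6700
    - 6701

# JVM options applied to every worker: cap each worker's heap at 500 MB.
worker.childopts: "-Xmx500m"
```

Scaling to more, smaller workers is then just adding ports to the list and shrinking `-Xmx` accordingly.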
