We run small workers.

Four per node at the moment, 500 MB each.

Soon we're moving to two per node, still at 500 MB each.

That's largely dictated by several of our larger topologies making a lot of
network calls to various services. More machines means more sockets (we open
a lot of sockets).
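For reference, a per-node layout like the one above maps onto Storm configuration roughly as follows. This is only a sketch using Storm's stock `supervisor.slots.ports` and `worker.childopts` settings; the port numbers are illustrative:

```yaml
# Supervisor-side storm.yaml sketch: one slot (port) per worker JVM.
# Two ports => at most two workers can run on this node.
supervisor.slots.ports:
    - 6700
    - 6701

# JVM options passed to every worker; caps each worker's heap at ~500 MB.
worker.childopts: "-Xmx500m"
```

A topology then requests its worker count with `Config.setNumWorkers(n)`, and the scheduler spreads those workers across the available slots.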


On Wed, Dec 18, 2013 at 11:26 PM, Michael Rose <[email protected]> wrote:

> It really comes down to your use case; perhaps you can comment on what
> you're doing.
>
> Personally, we run smaller workers and more of them, mainly because having
> more JVMs helps us avoid contention on strangely-locked JVM internals. We
> have on average 2-3 workers per machine with moderately sized heaps.
>
> I can't say the overhead would be much more with extra workers if you're
> doing shuffle or fields groupings most of the time anyway.
>
>  Michael Rose (@Xorlev <https://twitter.com/xorlev>)
> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
> [email protected]
>
>
> On Wed, Dec 18, 2013 at 8:54 PM, Jon Logan <[email protected]> wrote:
>
>> Has anyone done much experimenting with optimal worker sizes? I'm basically
>> unsure whether it is better to run with more, smaller workers or fewer,
>> larger workers. Right now I'm using ~3 GB workers, around 5 or so per
>> machine. Would it be better to reduce this number?
>>
>> The main issues that come to mind are:
>>
>> If larger workers:
>> - if one crashes, more data is lost
>> - more GC issues with larger heap sizes
>>
>> If smaller workers:
>> - more overhead
>> - more threads used
>> - less local shuffling capability
>> - more load on ZK/Nimbus(?)
>>
>>
>> Thoughts?
>>
>
>


-- 

This is not a signature
