This seems to be one of the eternal questions people ask about Storm.

Ultimately, it's something you can tune, and much like garbage collection
settings, the best method of finding your balance is probably trial and
error, because it depends on so many factors specific to your workload:
tuple size, tuple volume, worker complexity, hardware, etc.

I'd only advise, though, that you not spend too much time on this problem
before you have a relatively stable topology design (this is more general
advice for anyone coming along and reading this later, not necessarily for
you specifically, since if you're bringing this to the list I'd assume you
already have a fairly stable topology design that now needs further
tuning). There's often much more to be gained through better design
decisions than through tuning; if you've done it right, tuning will
primarily just grant you a better cost-performance ratio.
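For reference, the two knobs being traded off here live in the supervisor
and worker configuration. A minimal sketch of a storm.yaml fragment (the
port list and heap value below are illustrative placeholders, not
recommendations):

```yaml
# storm.yaml on each supervisor node.
# Each listed port is one worker slot, so this yields 3 workers per
# machine (roughly the 2-3 workers/machine setup described below).
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702

# JVM options passed to each worker process; a ~3GB heap matches the
# sizing mentioned in the original question.
worker.childopts: "-Xmx3072m"
```

A topology can also request how many workers it wants at submission time
via Config.setNumWorkers(n), which is usually the value you'd vary while
experimenting.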


On Wed, Dec 18, 2013 at 8:26 PM, Michael Rose <[email protected]> wrote:

> It really comes down to your use case, perhaps you can comment on what
> you're doing.
>
> Personally, we run smaller workers and more of them. Mainly because having
> more JVMs helps us avoid internal contention on strangely-locked JVM
> internals. We have on average 2-3 workers per machine with moderately sized
> heaps.
>
> I can't say I'd think the overhead is too much more to have extra workers
> if you're doing shuffles or fields grouping most of the time anyways.
>
> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
> [email protected]
>
>
> On Wed, Dec 18, 2013 at 8:54 PM, Jon Logan <[email protected]> wrote:
>
>> Has anyone done much experimenting on optimal worker sizes? I'm basically
>> unsure if it is better to run with more, smaller workers, or fewer, larger
>> workers. Right now, I'm using ~3GB workers, and around 5 or so per machine.
>> Would it be better to reduce this number?
>>
>> The main issues that come to mind are:
>>
>> If larger workers:
>> - if one crashes, more data is lost
>> - more GC issues for larger heap sizes
>>
>> If smaller workers:
>> - more overhead
>> - more threads used
>> - less local shuffling capability
>> - more load on ZK/Nimbus(?)
>>
>>
>> Thoughts?
>>
>
>
