Re: problem with shuffleGrouping

Stephen Powis Mon, 21 Nov 2016 08:26:09 -0800

So we've seen some weird distributions using ShuffleGrouping as well.  I
noticed there's no test case for ShuffleGrouping and got curious.  Also, the
implementation seemed overly complicated (in my head anyhow; perhaps
there's a reason for it?), so I put together a much simpler version of
round-robin shuffling.


Gist here: https://gist.github.com/Crim/61537958df65a5e13b3844b2d5e28cde
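
For anyone reading without opening the gist, the general idea of round-robin
selection can be sketched roughly as follows. This is only an illustration:
it is not the code from the gist and not Storm's ShuffleGrouping, and the
class name RoundRobinChooser and its chooseTask method are made up for this
sketch. It assumes the target task ids are known up front and do not change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical standalone round-robin chooser (an assumption for
 * illustration; not Storm's ShuffleGrouping and not the gist's code).
 */
class RoundRobinChooser {
    private final List<Integer> targetTasks;
    // Shared counter; AtomicInteger keeps increments safe across threads.
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinChooser(List<Integer> targetTasks) {
        if (targetTasks.isEmpty()) {
            throw new IllegalArgumentException("need at least one target task");
        }
        // Defensive copy so later changes by the caller do not affect us.
        this.targetTasks = new ArrayList<>(targetTasks);
    }

    /** Returns the next task id, cycling through the targets in order. */
    int chooseTask() {
        // floorMod keeps the index non-negative even after the counter
        // overflows and wraps around to negative values.
        int index = Math.floorMod(next.getAndIncrement(), targetTasks.size());
        return targetTasks.get(index);
    }
}
```

Each call hands out the next target in order, so with three targets the
fourth call returns the first target again.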

It's possible I've set up my test cases incorrectly, but it seems that, when
using multiple threads in my test, ShuffleGrouping produces a wildly uneven
distribution.  In the Javadocs above each test case I've pasted the output
that I get locally.
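
The kind of multi-threaded distribution check described above could look
something like the sketch below. It is a generic harness, not the test code
from the gist: DistributionCheck and measure are hypothetical names, and the
chooser is passed in as a plain IntSupplier that returns a target index, so
nothing here depends on Storm classes.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.function.IntSupplier;

/** Hypothetical harness (an assumption for illustration, not the gist's tests). */
class DistributionCheck {

    /**
     * Calls the chooser callsPerThread times from each of `threads` threads
     * and returns how often each target index in [0, numTargets) was chosen.
     */
    static long[] measure(IntSupplier chooser, int numTargets, int threads, int callsPerThread) {
        AtomicLongArray counts = new AtomicLongArray(numTargets);
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < callsPerThread; i++) {
                    counts.incrementAndGet(chooser.getAsInt());
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long[] result = new long[numTargets];
        for (int i = 0; i < numTargets; i++) {
            result[i] = counts.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int numTargets = 6;
        AtomicInteger next = new AtomicInteger();
        // Chooser under test: a simple atomic round-robin over target indexes.
        IntSupplier roundRobin = () -> Math.floorMod(next.getAndIncrement(), numTargets);
        System.out.println(Arrays.toString(measure(roundRobin, numTargets, 8, 60_000)));
    }
}
```

With a chooser backed by a single atomic counter, every target ends up with
the same count whenever the total number of calls is a multiple of the number
of targets. Swapping in a different chooser shows how far its distribution
drifts from that baseline under contention.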

Thoughts?

On Sat, Nov 19, 2016 at 2:49 AM, Ohad Edelstein <[email protected]> wrote:

> It happened to you also?
> We are upgrading from 0.9.3 to 1.0.1,
> In 0.9.3 we didn’t have that problem.
>
> But once I use localOrShuffle, the messages are sent only to the same
> machine.
>
> From: Chien Le <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Saturday, 19 November 2016 at 6:05
> To: "[email protected]" <[email protected]>
> Subject: Re: Testing serializers with multiple workers
>
> Ohad,
>
>
> We found that we had to use localOrShuffle grouping in order to see
> activity in the same worker as the spout.
>
>
> -Chien
>
>
> ------------------------------
> *From:* Ohad Edelstein <[email protected]>
> *Sent:* Friday, November 18, 2016 8:38:35 AM
> *To:* [email protected]
> *Subject:* Re: Testing serializers with multiple workers
>
> Hello,
>
> We just finished setting up Storm 1.0.1 with 3 supervisors and one nimbus
> machine,
> a total of 4 machines in AWS.
>
> We see the following phenomenon:
> let's say the spout is on host2,
> host1 - using 100% cpu
> host3 - using 100% cpu
> host2 - idle (some messages are being handled by it, not many)
> It's not a slots problem; we have an even number of bolts.
>
> We also tried to deploy on only 2 hosts, and the same thing happened: the
> host with the spout is idle, the other host is at 100% cpu.
>
> We switched from shuffleGrouping to noneGrouping, and it seems to work.
> The documentation says that:
> None grouping: This grouping specifies that you don't care how the stream
> is grouped. Currently, none groupings are equivalent to shuffle groupings.
> Eventually though, Storm will push down bolts with none groupings to
> execute in the same thread as the bolt or spout they subscribe from (when
> possible).
>
> We are still trying to understand what is wrong with shuffleGrouping in
> our system,
>
> Any ideas?
>
> Thanks!
>
> From: Aaron Niskodé-Dossett <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, 18 November 2016 at 17:04
> To: "[email protected]" <[email protected]>
> Subject: Re: Testing serializers with multiple workers
>
> Hit send too soon... that really is the option :-)
>
> On Fri, Nov 18, 2016 at 9:03 AM Aaron Niskodé-Dossett <[email protected]>
> wrote:
>
>> topology.testing.always.try.serialize = true
>>
>> On Fri, Nov 18, 2016 at 8:57 AM Kristopher Kane <[email protected]>
>> wrote:
>>
>> Does anyone have any techniques for testing serializers that would only
>> surface when the serializer is used in a multi-worker topology?
>>
>> Kris
>>
>>
