Hi Trude,

One important difference between Storm 1 and Storm 2, in my experience from
upgrading, is the handling of shuffleGrouping. In Storm 1 it was a "strict"
shuffle grouping, meaning tuples were distributed evenly across workers, no
matter where the workers were physically located. In Storm 2, when Storm
recognizes that shuffle grouping would send a tuple to a task running in a
different worker (JVM), it prefers to keep the tuple inside the current
worker, even if that effectively means that only one worker receives tuples.

Have you checked whether this is what happens in your case? It could be that
in Storm 2 only one worker receives tuples, even though you configured two.

To my knowledge, you can restore the old Storm 1 behavior in Storm 2 by
setting TOPOLOGY_DISABLE_LOADAWARE_MESSAGING
(topology.disable.loadaware.messaging) to true.
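
As a rough sketch of what I mean (not your actual setup): the class and
method names below are hypothetical, and I use a plain Map with the string
keys so the snippet runs without Storm on the classpath. The two keys are the
string values behind Config.TOPOLOGY_DISABLE_LOADAWARE_MESSAGING and
Config.TOPOLOGY_WORKERS.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: builds the topology settings
// that should bring back the Storm 1 style shuffle grouping.
class LoadAwareConfigSketch {
    // String value of Config.TOPOLOGY_DISABLE_LOADAWARE_MESSAGING
    static final String DISABLE_LOADAWARE = "topology.disable.loadaware.messaging";
    // String value of Config.TOPOLOGY_WORKERS
    static final String WORKERS = "topology.workers";

    static Map<String, Object> buildConf(int numWorkers) {
        Map<String, Object> conf = new HashMap<>();
        // Number of worker JVMs requested for the topology.
        conf.put(WORKERS, numWorkers);
        // Turn off load-aware messaging so shuffleGrouping no longer
        // prefers tasks inside the current worker.
        conf.put(DISABLE_LOADAWARE, true);
        return conf;
    }
}
```

In a real topology you would put the same entries into an
org.apache.storm.Config object (it is itself a map of String to Object) and
pass that to StormSubmitter.submitTopology. Whether this fixes the slowdown
is something you would have to measure.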

Regards

Jonas

On Mon, 3 May 2021 at 10:23, Trude Gentenaar <
genten...@semlab.nl> wrote:

> Hello all,
>
> After upgrading the Storm platform our topology is running approximately
> 100% slower on the same machine and with the same memory and threading
> settings, i.e. taking twice as long on the same test set.
>
> The topology is processing documents of varying lengths. The documents are
> split into sentences. Further processing is done by bolts that operate on
> either ‘document-level’ or ‘sentence-level’. Bolts that process sentences
> are set to higher parallelism. In Storm version 1.2.0 we found optimal
> performance when running on 2 workers on a single server, with document
> based bolts having their parallelism set to 2 and the sentence bolts having
> parallelism set to 8. Worker Xmx is set to 2048 MB. This configuration
> takes twice as long on Storm 2.2.0. When running the topology on 1 worker and
> with all parallelism set to 1 the speed returns to nearly that of 1.2.0.
>
> Further performance tuning has also been attempted but to no avail. This
> is not the behaviour that we expected of the new platform. Can anyone shed
> some light on this situation or perhaps let us know if our expectations
> were wrong?
>
>
> Thanks in advance,
>
> Trude
>
>
> ----------------------------------
> Trude Gentenaar
> Research&Development
> ----------------------------------
> SemLab
> Zuidpoolsingel 14-A
> 2408 ZE Alphen a/d Rijn
> The Netherlands
> T: +31 172 494 777
> E: genten...@semlab.nl
> W: http://www.semlab.nl
>
