Hi Thomas,

Which version of Storm are you using?
On Sun, 13 Sep 2020 at 20:23, Thomas L. Redman <tomred...@mchsi.com> wrote:

> Sorry, I had previously sent this from a different email address, not sure
> how well that would work with this service, hence this re-send.
>
> I’m running Storm on a 3-node cluster, 32 physical cores in each node. I
> have a complex topology with one spout, which is a singleton, connected to
> several other bolts, almost all of which do natural language processing.
> Most of these are pretty heavyweight. The input spout is easily capable of
> outpacing the downstream bolts. I get good performance, but on only one
> node, even though I specify 3 worker nodes for my topology. Storm UI
> indicates for any given component that the executors for that component on
> the idle machines have emitted very few tuples, and have transferred none!
>
> When I look at the machine usage with htop, I see indeed only one of the
> nodes is really getting any usage at all. My heaviest computation nodes
> have a very high capacity value. But the machine which hosts the spout is
> pegged with significant load. I have used almost exclusively shuffle
> grouping (I prefer localOrShuffleGrouping), but that doesn’t help. I will
> have machines that are simply receiving few tuples to operate on, and those
> few tuples are not transferred (and I admit I don’t know quite what that
> means).
>
> So, I have two questions:
> 1) Why would a component on a node remote from the spout have a lower
> Emitted count, and have a Transferred count always at zero?
>
> 2) What might cause my high capacity (typically over 1) to not be
> offloaded to a more idle machine?
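
On the capacity number in your second question: as far as I understand it, Storm UI derives a bolt's capacity as (tuples executed x average execute latency) / length of the measurement window, so a value near or above 1 means the executors were busy for essentially the whole window. A rough sketch of that arithmetic in Java follows; the class and method names are hypothetical, and the figures are made up for illustration, not taken from your cluster.

```java
// Hypothetical helper, not part of the Storm API: it only illustrates the
// arithmetic behind the "capacity" column as described above.
final class CapacitySketch {

    // executed:            tuples executed by the bolt within the window
    // avgExecuteLatencyMs: average time spent in execute() per tuple, in ms
    // windowMs:            length of the measurement window, in ms
    static double capacity(long executed, double avgExecuteLatencyMs, long windowMs) {
        if (windowMs <= 0) {
            throw new IllegalArgumentException("windowMs must be positive");
        }
        return (executed * avgExecuteLatencyMs) / windowMs;
    }

    public static void main(String[] args) {
        // Made-up example: 120,000 tuples at 6 ms each over a 10-minute window.
        long windowMs = 10L * 60L * 1000L;
        System.out.println(capacity(120_000L, 6.0, windowMs));
    }
}
```

If the figure really is computed this way, it describes how busy the executors of one component are and says nothing by itself about which machine they run on, so a high value would not trigger any rebalancing on its own.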