I recommend against aligning the number of slots on a supervisor with the number of cores in its CPU(s). In general I think that is too fine a granularity, though I was used to supervisor machines that are pretty big (12+ cores).
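As a rough illustration of why the slot count need not track the core count: each worker JVM runs several executor threads, so a handful of workers can still keep a large machine busy. The sketch below is a simplified model, not Storm's scheduler; `spread_executors` is a hypothetical helper, and it only assumes that executors are spread evenly over the workers a topology uses.

```python
def spread_executors(num_executors, num_workers):
    """Return the number of executor threads each worker would run,
    assuming an even spread (simplified model, not Storm's scheduler)."""
    if num_workers < 1:
        raise ValueError("a topology needs at least one worker")
    base, extra = divmod(num_executors, num_workers)
    # The first `extra` workers carry one additional executor.
    return [base + 1 if i < extra else base for i in range(num_workers)]


# Hypothetical topology with 24 executors on a 12-core supervisor.
for workers in (1, 4, 12):
    print(workers, "worker(s):", spread_executors(24, workers))
```

With fewer workers, each JVM simply runs more threads; with one worker per core, each JVM runs only a couple of threads and more tuples cross process boundaries.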
On Thu, Apr 28, 2016 at 10:48 AM, Matthias J. Sax <mj...@apache.org> wrote:

> @Nathan: I am not sure what you recommend against? I did not recommend
> anything so far...
>
> From my point of view, I doubt there is a good general recommendation for
> configuring the slots per supervisor.
>
> As Nathan correctly mentioned, if you have fewer workers, a single worker
> needs to do more work. All executor threads of your topology are
> distributed evenly over all used workers.
>
> Thus, it is a trade-off between network overhead and fault-tolerance.
>
> 1) If you use only a single worker (as an extreme example), there is no
> network involved. If you have a machine with many cores, the single
> worker JVM can use all those cores and utilize your machine quite well.
> Of course, if something goes wrong, your whole topology crashes,
> resulting in an expensive recovery.
>
> 2) If you use as many workers as executors (as the other extreme
> example), each worker will run a small portion of your topology. It is
> unclear whether a single executor thread can utilize a full core (this
> depends heavily on the work to be performed by the executor as well as
> your expected throughput). Thus, you could have one or multiple workers
> per core. The advantage is that a faulty worker is recovered more
> easily -- furthermore, all other parts of your topology keep processing
> data. The disadvantage is that you have inter-process communication for
> worker JVMs on the same supervisor machine and most likely a lot of
> network I/O for worker JVMs on different supervisor machines.
>
> So, it is hard to say in general what the workload of a single worker
> is -- it can range from only a single thread to all threads of a whole
> topology. And thus, it is hard to say how many workers you should
> configure per supervisor.
>
> Hope this helps.
>
> -Matthias
>
> On 04/28/2016 04:01 PM, Nathan Leung wrote:
> > I would recommend against this. Storm will automatically run multiple
> > threads for you, especially if you have more than 1 executor / worker.
> > Every time data transfers between workers, it must be serialized and
> > deserialized. On the other hand, if you have larger workers and one
> > goes down, your topology will have to do more work to recover.
> >
> > On Thu, Apr 28, 2016 at 9:48 AM, I PVP <i...@hotmail.com> wrote:
> >
> > Matthias,
> >
> > Thanks for the clear explanation.
> >
> > Is there any initial guidance for aligning the number of slots a
> > supervisor can handle with the machine's number of CPUs/cores?
> >
> > --
> > IPVP
> >
> > From: Matthias J. Sax <mj...@apache.org>
> > Reply: user@storm.apache.org
> > Date: April 28, 2016 at 4:26:53 AM
> > To: user@storm.apache.org
> > Subject: Re: Slots vs. Topology
> >
> >> The number of slots defines the number of worker JVMs a supervisor can
> >> start. And a single worker JVM only executes code of a single topology
> >> (to isolate topologies for fault-tolerance reasons).
> >>
> >> Thus, you need to have at least a single worker for each topology in
> >> your cluster (i.e., sum of all slots over all supervisors)---assuming a
> >> topology uses only a single worker.
> >>
> >> It is not required to have a slot per supervisor per topology per se.
> >>
> >> However, take into account the parameter "numberOfWorkers" that you can
> >> set per topology. This is the maximum number of slots a topology can
> >> occupy. If fewer workers are present, the topology will run using the
> >> available ones. If you want all your topologies to be able to use this
> >> max number of workers, you need to have enough slots in your cluster.
> >> Otherwise, the first topologies will occupy as many workers as they are
> >> allowed, and for later deployed topologies no slots might be left over.
> >>
> >> -Matthias
> >>
> >> On 04/28/2016 03:35 AM, I PVP wrote:
> >> > Hi everyone,
> >> >
> >> > Do I need to have one slot (supervisor.slots.ports:) in the storm.yaml
> >> > for each Topology?
> >> >
> >> > What is the impact of having a number of Slots smaller than the number
> >> > of Topologies?
> >> >
> >> > So far my understanding is that the impact is that some topologies will
> >> > never run. I have 12 Topologies and they all run fine only when the
> >> > number of slots is equal to or higher than the number of Topologies.
> >> > When the number of Slots is smaller, some topologies never start and
> >> > the uptime for these topologies is blank.
> >> >
> >> > Thanks
> >> >
> >> > --
> >> > IPVP
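The slot accounting described in the thread can be sketched as a small model. This is only an illustration under stated assumptions, not Storm's actual scheduler: `assign_slots` is a hypothetical helper, topologies are served strictly in submission order, and each one takes up to its requested number of workers from whatever slots remain.

```python
def assign_slots(total_slots, topologies):
    """Grant worker slots to topologies in submission order.

    `topologies` is a list of (name, requested_workers) pairs.
    Simplified model: each topology takes min(requested, free) slots.
    """
    free = total_slots
    granted = {}
    for name, requested in topologies:
        take = min(requested, free)
        granted[name] = take
        free -= take
    return granted


# The situation from the original question: 12 single-worker topologies
# on a cluster that offers only 10 slots in total (hypothetical numbers).
topologies = [(f"topology-{i}", 1) for i in range(1, 13)]
result = assign_slots(10, topologies)
starved = [name for name, slots in result.items() if slots == 0]
print("topologies without a slot:", starved)
```

In this model the last two topologies receive no slot, which matches the reported symptom of topologies that never start when the cluster has fewer slots than topologies.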