I recommend against aligning the number of slots on a supervisor to the
number of cores in its CPU(s).  In general I think this is too fine a
granularity, but then I was also used to supervisor machines that are
pretty big (12+ cores).

On Thu, Apr 28, 2016 at 10:48 AM, Matthias J. Sax <mj...@apache.org> wrote:

> @Nathan: I am not sure what you are recommending against? I did not
> recommend anything so far...
>
>
> From my point of view, I doubt there is a good general recommendation
> for how to configure the number of slots per supervisor.
>
> As Nathan mentioned correctly, if you have fewer workers, a single
> worker needs to do more work. All executor threads of your topology are
> distributed evenly over all used workers.
>
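> As a made-up illustration: if a topology runs with two workers and one
> of its bolts is declared with a parallelism hint of 8, e.g.
>
>     builder.setBolt("my-bolt", new MyBolt(), 8); // 8 executor threads
>
> then Storm places 4 of that bolt's executors in each of the 2 workers
> ("my-bolt" and MyBolt are hypothetical names).
>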
> Thus, it is a trade-off between network overhead and fault-tolerance.
>
> 1) If you use only a single worker (as an extreme example) there is no
> network involved. If you have a machine with many cores, the single
> worker JVM can use all those cores and utilize your machine quite well.
> Of course, if something goes wrong, your whole topology crashes,
> resulting in an expensive recovery.
>
> 2) If you use as many workers as executors (as the other extreme
> example), each worker will run a small portion of your topology. It is
> unclear whether a single executor thread can utilize a full core (this
> depends heavily on the work to be performed by the executor as well as
> your expected throughput). Thus, you could have one or multiple workers
> per core. The advantage is that a faulty worker is recovered more
> easily -- furthermore, all other parts of your topology keep processing
> data. The disadvantage is that you have inter-process communication for
> worker JVMs on the same supervisor machine and most likely a lot of
> network I/O for worker JVMs on different supervisor machines.
>
> So it is hard to say in general what the workload of a single worker
> is -- it can range from only a single thread to all threads of a whole
> topology. And thus, it is equally hard to say how many workers you
> should configure per supervisor.
>
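> As a minimal sketch of the supervisor side (the ports below are only
> illustrative defaults; each listed port is one slot, i.e., one
> potential worker JVM on this supervisor):
>
>     # storm.yaml -- four slots on this supervisor
>     supervisor.slots.ports:
>         - 6700
>         - 6701
>         - 6702
>         - 6703
>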
> Hope this helps.
>
>
> -Matthias
>
>
> On 04/28/2016 04:01 PM, Nathan Leung wrote:
> > I would recommend against this.  Storm will automatically run multiple
> > threads for you, especially if you have more than 1 executor per
> > worker.  Every time data is transferred between workers, it must be
> > serialized and deserialized.  On the other hand, if you have larger
> > workers and one goes down, your topology will have to do more work to
> > recover.
> >
> > On Thu, Apr 28, 2016 at 9:48 AM, I PVP <i...@hotmail.com> wrote:
> >
> >     Matthias,
> >
> >     Thanks for the clear explanation.
> >
> >     Is there any initial guidance for aligning the number of slots a
> >     supervisor could handle with the machine's number of CPUs/cores?
> >
> >
> >     --
> >     IPVP
> >
> >
> >     From: Matthias J. Sax <mj...@apache.org>
> >     Reply: user@storm.apache.org
> >     Date: April 28, 2016 at 4:26:53 AM
> >     To: user@storm.apache.org
> >     Subject: Re: Slots vs. Topology
> >
> >>     The number of slots defines the number of worker JVMs a supervisor
> >>     can start. And a single worker JVM only executes code of a single
> >>     topology (to isolate topologies for fault-tolerance reasons).
> >>
> >>     Thus, you need at least one slot for each topology in your cluster
> >>     (i.e., in the sum of all slots over all supervisors) -- assuming
> >>     each topology uses only a single worker.
> >>
> >>     It is not required to have a slot per supervisor per topology per
> >>     se.
> >>
> >>     However, take into account the parameter "numberOfWorkers" that
> >>     you can set per topology. This is the maximum number of slots a
> >>     topology can occupy. If fewer slots are available, the topology
> >>     will run using the available ones. If you want all your topologies
> >>     to be able to use their max number of workers, you need to have
> >>     enough slots in your cluster. Otherwise, the topologies deployed
> >>     first will occupy as many workers as they are allowed, and no
> >>     slots might be left over for topologies deployed later.
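> >>
> >>     As a minimal sketch (assuming Storm 1.x package names; the
> >>     topology name, the worker count, and "builder" -- a
> >>     TopologyBuilder assembled elsewhere -- are made up):
> >>
> >>         import org.apache.storm.Config;
> >>         import org.apache.storm.StormSubmitter;
> >>
> >>         Config conf = new Config();
> >>         // this topology may occupy up to 4 slots (worker JVMs)
> >>         conf.setNumWorkers(4);
> >>         StormSubmitter.submitTopology("my-topology", conf,
> >>                 builder.createTopology());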
> >>
> >>     -Matthias
> >>
> >>
> >>     On 04/28/2016 03:35 AM, I PVP wrote:
> >>     > Hi everyone,
> >>     >
> >>     > Do I need to have one slot (supervisor.slots.ports:) in
> >>     > storm.yaml for each Topology?
> >>     >
> >>     > What is the impact of having a number of slots smaller than the
> >>     > number of Topologies?
> >>     >
> >>     > So far my understanding is that the impact is that some
> >>     > topologies will never run. I have 12 Topologies and they all
> >>     > only run fine when the number of slots is equal to or higher
> >>     > than the number of Topologies. When the number of slots is
> >>     > smaller, some topologies never start and uptime for these
> >>     > topologies is blank.
> >>     >
> >>     > Thanks
> >>     >
> >>     > --
> >>     > IPVP
> >>     >
> >>
> >>
