@Nathan: I am not sure what you are recommending against? I did not
recommend anything so far...


From my point of view, I doubt there is a good general recommendation
for configuring the number of slots per supervisor.
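
Just for context: a supervisor's slot count is simply the number of
worker ports listed under supervisor.slots.ports in its storm.yaml. A
minimal sketch (6700-6703 are Storm's defaults; add or remove ports to
change the number of slots):

# storm.yaml on each supervisor node: one worker slot per listed port
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703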

As Nathan correctly mentioned, if you have fewer workers, a single
worker needs to do more work: all executor threads of your topology are
distributed evenly over all used workers. For example, 12 executor
threads over 4 workers means 3 executor threads per worker.

Thus, choosing the number of workers is a trade-off between network
overhead and fault tolerance.

1) If you use only a single worker (as one extreme example), there is
no network traffic involved. If you have a machine with many cores, the
single worker JVM can use all those cores and utilize your machine
quite well. Of course, if something goes wrong, your whole topology
crashes, resulting in an expensive recovery.

2) If you use as many workers as executors (as the other extreme
example), each worker will run only a small portion of your topology.
It is unclear whether a single executor thread can utilize a full core
(this depends heavily on the work to be performed by the executor as
well as your expected throughput), so you could have one or multiple
workers per core. The advantage is that a faulty worker can be
recovered more easily; furthermore, all other parts of your topology
keep processing data. The disadvantage is that you get inter-process
communication between worker JVMs on the same supervisor machine and,
most likely, a lot of network I/O between worker JVMs on different
supervisor machines.

So it is hard to say in general what the workload of a single worker
is; it can range from only a single thread to all threads of a whole
topology. And thus, it is equally hard to say how many workers you
should configure per supervisor.
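
To make the trade-off concrete, here is a minimal sketch of how the two
knobs interact when submitting a topology (this assumes Storm 1.0
package names; older releases use backtype.storm instead of
org.apache.storm, and MySpout/MyBolt are hypothetical placeholders for
your own components):

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class ExampleTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new MySpout(), 2);   // 2 executor threads
        builder.setBolt("bolt", new MyBolt(), 6)       // 6 executor threads
               .shuffleGrouping("spout");

        Config conf = new Config();
        // 8 executors over 4 workers -> 2 executor threads per worker JVM.
        // setNumWorkers(1) would put the whole topology into a single JVM
        // (extreme 1); setNumWorkers(8) would give one executor per worker
        // (extreme 2).
        conf.setNumWorkers(4);

        StormSubmitter.submitTopology("example-topology", conf,
                builder.createTopology());
    }
}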

Hope this helps.


-Matthias


On 04/28/2016 04:01 PM, Nathan Leung wrote:
> I would recommend against this. Storm will automatically run multiple
> threads for you, especially if you have more than one executor per worker.
> Every time data transfers between workers, it must be serialized and
> deserialized. On the other hand, if you have larger workers and one
> goes down, your topology will have to do more work to recover.
> 
> On Thu, Apr 28, 2016 at 9:48 AM, I PVP <i...@hotmail.com> wrote:
> 
>     Matthias ,
> 
>     Thanks for the clear explanation.
> 
>     Is there any initial guidance for aligning the number of slots a
>     supervisor can handle with the machine's number of CPUs/cores?
>      
>      
>     -- 
>     IPVP
> 
> 
>     From: Matthias J. Sax <mj...@apache.org>
>     Reply: user@storm.apache.org
>     Date: April 28, 2016 at 4:26:53 AM
>     To: user@storm.apache.org
>     Subject: Re: Slots vs. Topology
> 
>>     The number of slots defines the number of worker JVMs a supervisor can
>>     start. A single worker JVM only executes code of a single topology
>>     (to isolate topologies for fault-tolerance reasons).
>>
>>     Thus, you need to have at least a single worker for each topology
>>     in your cluster (i.e., in the sum of all slots over all
>>     supervisors), assuming a topology uses only a single worker.
>>
>>     It is not required to have a slot per supervisor per topology per se.
>>
>>     However, take into account the parameter "numberOfWorkers" that
>>     you can set per topology. This is the maximum number of slots a
>>     topology can occupy. If fewer workers are present, the topology
>>     will run using the available ones. If you want all your topologies
>>     to be able to use this max number of workers, you need to have
>>     enough slots in your cluster. Otherwise, the first topologies will
>>     occupy as many workers as they are allowed, and no slots might be
>>     left over for topologies deployed later.
>>
>>     -Matthias
>>
>>
>>     On 04/28/2016 03:35 AM, I PVP wrote:
>>     > Hi everyone, 
>>     > 
>>     > Do I need to have one slot (supervisor.slots.ports) in storm.yaml
>>     > for each Topology?
>>     > 
>>     > What is the impact of having a number of Slots smaller than the number 
>>     > of Topologies ? 
>>     > 
>>     > So far my understanding is that the impact is that some topologies
>>     > will never run. I have 12 Topologies, and they all only run fine when
>>     > the number of slots is equal to or higher than the number of
>>     > Topologies. When the number of Slots is smaller, some topologies
>>     > never start and the uptime for these topologies is blank.
>>     > 
>>     > Thanks 
>>     > 
>>     > -- 
>>     > IPVP 
>>     > 
>>
> 
> 
