Thank you for the clarification and for pointing me to the JIRA case.

"So in the above pull request, I implemented a solution in the way similar to 
watershed algorithm. It would firstly pick the slots from least used host, 
until that host uses the same number of slots as the second least used slots. 
Then it evenly picks slots from the 2 least used hosts until reaches the 3rd 
one. Iterating this way, we can get the best balanced assignment."

This makes sense now. So mostly it's evenly spread across supervisors.
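
To check my reading of the quoted description, here is a minimal Python sketch
of it. This is not the code from the pull request: it assumes that "pick from
the least used host until it catches up with the next one" is equivalent to
repeatedly taking one free slot from whichever host currently has the fewest
used slots. The function name pick_balanced and both input dicts are
hypothetical, made up for illustration.

```python
import heapq

def pick_balanced(used, free, num_slots):
    """Greedy sketch of the 'watershed' idea (hypothetical helper).

    used: dict host -> number of slots already in use
    free: dict host -> list of free ports on that host
    Returns a list of (host, port) picks.
    """
    # Min-heap keyed on the used-slot count, so the least used host pops first.
    heap = [(used.get(host, 0), host) for host, ports in free.items() if ports]
    heapq.heapify(heap)
    remaining = {host: list(ports) for host, ports in free.items()}
    picked = []
    while heap and len(picked) < num_slots:
        count, host = heapq.heappop(heap)
        picked.append((host, remaining[host].pop(0)))
        # The host now has one more used slot; keep it only if ports remain.
        if remaining[host]:
            heapq.heappush(heap, (count + 1, host))
    return picked
```

With hosts A, B and C using 3, 1 and 0 slots, asking for three slots takes one
from C, then one from B, then another from C, which leaves the hosts at 3, 2
and 2 used slots.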

From: Grant Overby (groverby) [mailto:grove...@cisco.com]
Sent: Friday, March 13, 2015 11:31 AM
To: user@storm.apache.org
Subject: Re: configuring topology.workers

My understanding of the DefaultScheduler isn't complete, so take this with a 
grain of salt. As of 0.9.3, if I understand correctly:

The list of available slots is sorted first by the number of free slots on the 
slot's supervisor, then by port number. If two or more slots tie on both 
criteria, the order between those slots is random. Slots are then consumed in 
this order.

This gives round-robin-style behavior. I say "style" here because it deviates 
from a basic round-robin algorithm when topologies have been killed, freeing 
up other slots. Preference is given to these slots to keep supervisor slot use 
balanced.

So if you already have topologies deployed and then kill some, such that 
Supervisor A has 1 free slot and Supervisor B has 3 free slots, and then deploy 
a topology with two workers, both will go to Supervisor B. This isn't the usual 
case. You can generally expect the topology to be spread evenly, or at least 
somewhat evenly, over the supervisors.
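
A rough Python sketch of the ordering as I described it above, not the actual
Storm code. It assumes slots on the supervisor with the most free slots come
first (which is what the Supervisor B example implies), then lower port
numbers, with ties in random order. The names order_slots and assign_workers
are made up for illustration.

```python
import random
from collections import Counter

def order_slots(free_slots, rng=None):
    """Order free (supervisor, port) slots as described in this email.

    Assumption: more free slots on the supervisor sorts first, then lower
    port number; slots that tie on both end up in random order.
    """
    rng = rng or random.Random()
    free_per_supervisor = Counter(sup for sup, _ in free_slots)
    shuffled = list(free_slots)
    rng.shuffle(shuffled)  # sorted() is stable, so ties keep this random order
    return sorted(shuffled, key=lambda s: (-free_per_supervisor[s[0]], s[1]))

def assign_workers(free_slots, num_workers, rng=None):
    """Consume slots in that order, one per worker."""
    return order_slots(free_slots, rng)[:num_workers]
```

On a fresh cluster where A and B each have ports 6700 to 6703 free, two workers 
land on port 6700 of each supervisor. With A having only 6703 free and B having 
6701 to 6703 free, both workers land on B.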

The primary goal is to balance the used slots between supervisors with a 
secondary goal of spreading a topology across supervisors.

This may help:
https://issues.apache.org/jira/browse/STORM-132





Grant Overby
Software Engineer
Cisco.com<http://www.cisco.com/>
grove...@cisco.com<mailto:grove...@cisco.com>
Mobile: 865 724 4910





This email may contain confidential and privileged material for the sole use of 
the intended recipient. Any review, use, distribution or disclosure by others 
is strictly prohibited. If you are not the intended recipient (or authorized to 
receive for the recipient), please contact the sender by reply email and delete 
all copies of this message.

Please click 
here<http://www.cisco.com/web/about/doing_business/legal/cri/index.html> for 
Company Registration Information.




From: Srividhya Shanmugam 
<srividhyashanmu...@fico.com<mailto:srividhyashanmu...@fico.com>>
Reply-To: "user@storm.apache.org<mailto:user@storm.apache.org>" 
<user@storm.apache.org<mailto:user@storm.apache.org>>
Date: Thursday, March 12, 2015 at 8:49 PM
To: "user@storm.apache.org<mailto:user@storm.apache.org>" 
<user@storm.apache.org<mailto:user@storm.apache.org>>
Subject: RE: configuring topology.workers

Thank you.

To be more specific, I looked at the DefaultScheduler source. It gets the list 
of available slots as node+port combinations. So if one node has two slots and 
the next node has two more, will the workers for the topology be assigned to 
two slots on one node or one slot on each node?
Appreciate your help.


From: Grant Overby (groverby) [mailto:grove...@cisco.com]
Sent: Thursday, March 12, 2015 6:38 PM
To: user@storm.apache.org<mailto:user@storm.apache.org>
Subject: Re: configuring topology.workers

Topologies are assigned to supervisors in a round robin fashion by the default 
scheduler.

You can provide other schedulers: 
http://xumingming.sinaapp.com/885/twitter-storm-how-to-develop-a-pluggable-scheduler/




From: Srividhya Shanmugam 
<srividhyashanmu...@fico.com<mailto:srividhyashanmu...@fico.com>>
Reply-To: "user@storm.apache.org<mailto:user@storm.apache.org>" 
<user@storm.apache.org<mailto:user@storm.apache.org>>
Date: Thursday, March 12, 2015 at 5:26 PM
To: "user@storm.apache.org<mailto:user@storm.apache.org>" 
<user@storm.apache.org<mailto:user@storm.apache.org>>
Subject: configuring topology.workers

All,

I am trying to understand how setting topology.workers will impact distribution 
of work for a given topology. Say a Storm cluster has 2 supervisor nodes and 
both nodes are configured with supervisor.slots.ports: 6700, 6701, 6702, 6703.

If topology.workers is set to 2, will Storm run two worker processes on one 
node, or one worker process on each node? How does Storm determine this?

Thanks much,
Srividhya

This email and any files transmitted with it are confidential, proprietary and 
intended solely for the individual or entity to whom they are addressed. If you 
have received this email in error please delete it immediately.

