I am pretty sure Nathan is referring to rebalancing in that response.

*"When you set the parallelism to 'x', you will have 'x' executors forever."*
No. The number of *tasks* is static. You can change the number of
*executors* using the rebalance command.

Since 0.8.0, 'parallelism' refers to the initial number of executors,
which can be changed, so in that sense the 'parallelism' can be changed
on the fly. It's confusing because 0.8.0 redefined the meaning of
parallelism and then said that the 'parallelism' could be changed on the
fly. That is true, but you need to realize that the number of tasks
remains the same regardless.



Rebalancing becomes useful when you have more than one task per executor.
The default is one task per executor. However, you can override the one
task per executor default and manually set the number of tasks using
setNumTasks.
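
To make the task/executor distinction concrete, here is a toy model in
Python. It is not Storm code and it is not how Storm's scheduler works:
the function name assign_tasks and the round-robin placement are made up
for illustration, and capping the executor count at the task count is an
assumption of the sketch (an executor with no task would have nothing to
run). The only point is that the set of tasks stays fixed while the number
of executors it is spread over can change.

```python
def assign_tasks(num_tasks, num_executors):
    """Spread a fixed set of task ids over executors (hypothetical helper)."""
    # Assumption of this sketch: never use more executors than tasks.
    num_executors = max(1, min(num_executors, num_tasks))
    executors = [[] for _ in range(num_executors)]
    for task_id in range(num_tasks):
        # Round-robin placement, chosen only to keep the example simple.
        executors[task_id % num_executors].append(task_id)
    return executors


# Submitted with 4 tasks but only 2 executors: 2 tasks per executor.
before = assign_tasks(num_tasks=4, num_executors=2)

# After a rebalance to 4 executors: the same 4 tasks, now 1 per executor.
after = assign_tasks(num_tasks=4, num_executors=4)

# Asking for more executors than tasks gains nothing in this model.
capped = assign_tasks(num_tasks=4, num_executors=10)

print(before)       # [[0, 2], [1, 3]]
print(after)        # [[0], [1], [2], [3]]
print(len(capped))  # 4
```

With the default of one task per executor, the first call would already
look like the second one, and a rebalance would have no spare tasks to
spread over extra executors.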


Why do this? I'll just copy Michael's excellent explanation.

*"So one reason for having 2+ tasks per executor thread is to give you the
flexibility to expand/scale up the topology through the storm rebalance
command in the future without taking the topology offline. For instance,
imagine you start out with a Storm cluster of 15 machines but already know
that next week another 10 boxes will be added. Here you could opt for
running the topology at the anticipated parallelism level of 25 machines
already on the 15 initial boxes (which is of course slower than 25 boxes).
Once the additional 10 boxes are integrated you can then storm rebalance
the topology to make full use of all 25 boxes without any downtime."*


http://stackoverflow.com/questions/17257448/what-is-the-task-in-twitter-storm-parallelism




On Fri, Mar 13, 2015 at 6:00 PM, tishan pubudu kanishka dahanayakage <
dtishanpub...@gmail.com> wrote:

> Hi Kosala,
>
> Thanks for the response. Yeah, I came across that. But that was written in
> 2012, whereas [1] is more recent. It says "Note that as of Storm 0.8 the
> parallelism_hint parameter now specifies the 'initial' number of executors
> (not tasks!) for that bolt". Also, in [2] Nathan says that "0.8.0 will let
> you change the parallelism of topologies on the fly". That's why I raised
> this concern. So what you are saying is that if I set parallelism to 'x' it
> will have x executors forever. Please correct me if I am wrong.
>
>
> [1]
> http://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html
> [2] https://groups.google.com/forum/#!topic/storm-user/Rr9K7f-AMLc
>
> Thanks,
> Tishan
>
> On Fri, Mar 13, 2015 at 11:54 AM, Kosala Dissanayake
> <umaradi...@gmail.com> wrote:
>
>> *"initial parallelism value and that value increase dynamically in
>> run-time."*
>>
>> No. The parallelism value is the number of executors you get. This does
>> not change at run-time.
>>
>> Read this.
>> http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
>>
>> On Wed, Mar 11, 2015 at 7:45 PM, tishan pubudu kanishka dahanayakage <
>> dtishanpub...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I went through this[1] and tried a few topology deployments. I just want
>>> to clear up a small doubt. From [1], what I understood was that the
>>> parallelism hint is the initial parallelism value and that value increase
>>> dynamically in run-time. The last comment on [2] also suggests the same.
>>> However, when I tested it in Storm I did not see the parallelism for that
>>> bolt increase with load.
>>>
>>> Is my understanding of how the parallelism hint operates in Storm
>>> correct? If so, do I need to do any more configuration to make it work?
>>>
>>> Thanks,
>>> Tishan
>>>
>>> --
>>> Regards,
>>> Tishan
>>>
>>
>>
>
>
> --
> Regards,
> Tishan
>
