Hi Zainal,

> I guess some of my assumptions above about Storm continuing to work are
> wrong. What is the best practice or recommended way for our situation?


I will need more details. What stops working? How many workers are you
running per supervisor and how many workers is your topology configured to
use?
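
For the post-reboot check mentioned in the quoted thread below, here is a
minimal sketch of what a start script could do after launching the daemons:
poll "storm list" until the topology reports ACTIVE, then request a rebalance.
I have not verified it against your cluster. The function names and the
default retry and wait values are made up for illustration. It assumes the
storm CLI is on PATH and that "storm list" prints the topology name in the
first column and its status in the second. Whether a rebalance without -n or
-e actually moves workers onto the rebooted machine depends on the scheduler
and on free worker slots there.

```shell
#!/bin/sh
# Sketch only: wait for a topology to report ACTIVE, then ask Nimbus to rebalance it.
# Assumptions (not from the thread): the `storm` CLI is on PATH, and `storm list`
# prints one row per topology with the name in column 1 and the status in column 2.

# Succeeds (exit status 0) if the named topology is listed as ACTIVE.
topology_is_active() {
  local name
  name="$1"
  storm list 2>/dev/null |
    awk -v n="$name" '$1 == n && $2 == "ACTIVE" { found = 1 } END { exit !found }'
}

# Polls until the topology is ACTIVE or the attempts run out.
# Arguments: topology name, number of attempts (default 30), delay in seconds (default 10).
wait_for_topology() {
  local name attempts delay i
  name="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if topology_is_active "$name"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Requests a rebalance without -n or -e, so worker and executor counts are unchanged.
# -w is the number of seconds the topology stays deactivated before reassignment.
rebalance_topology() {
  local name wait_secs
  name="$1"
  wait_secs="${2:-10}"
  storm rebalance "$name" -w "$wait_secs"
}
```

After starting nimbus, supervisor and ui, the start script could then run
something like:

  wait_for_topology mytopology 30 10 && rebalance_topology mytopology 10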

On Mon, 24 Feb 2020 at 15:50, Zainal Arifin (BLOOMBERG/ 731 LEX) <
zari...@bloomberg.net> wrote:

> Thanks Rui! It looks like I have a bigger problem, as Storm now seems to
> be having issues; maybe someone can give me a recommendation here.
> Right now I have 2 machines, and both of them have scheduled
> maintenance. The infra team's procedure calls our stop script before a
> machine reboots and our start script after the reboot.
> Our stop script kills the following components: nimbus, supervisor and
> ui. Our start script then launches nimbus, supervisor and ui.
> Right now I don't have a script to check whether the topology is running,
> as I am still experimenting with this machine reboot behavior.
> Note that the machines won't be brought down at the same time: one goes
> down and comes back up, and then the other one is brought down. So I'd
> imagine:
>
> T1: Storm (and the topology) runs on both machine A and machine B
> T2: Storm is killed on machine A and the machine is rebooted; I assume all
> spouts/bolts will now run on machine B
> T3: machine A is back online and Storm is launched at startup; I assume
> it'll pick up the topology
> T4: Storm is killed on machine B and the machine is rebooted
> T5: machine B is back online and Storm is launched; I assume
> everything works fine (somehow it doesn't)
>
> I guess some of my assumptions above about Storm continuing to work are
> wrong. What is the best practice or recommended way for our situation?
>
>
> Thanks,
> zainal
>
>
> From: rui.ab...@gmail.com At: 02/21/20 18:01:56
> To: Zainal Arifin (BLOOMBERG/ 731 LEX ) <zari...@bloomberg.net>
> Cc: user@storm.apache.org
> Subject: Re: machine reboot
>
> The rebalance command in the Storm UI does not guarantee that tasks
> will go to another worker / machine.
>
> There is a command line rebalance command but it's used to increase the
> number of workers or executors:
>
> $ storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10
>
>
> Without trying it, I'm not sure you can call it without parameters.
> And even if it works, I'm not sure it will have the desired effect.
> Please have a look here for further information about the rebalance
> command:
>
>
> https://storm.apache.org/releases/2.1.0/Understanding-the-parallelism-of-a-Storm-topology.html
>
> On Fri, Feb 21, 2020, 23:22 Zainal Arifin (BLOOMBERG/ 731 LEX) <
> zari...@bloomberg.net> wrote:
>
>> Thanks Rui! Can I force the rebalance from a script? I am thinking of
>> adding it to my current script, after it starts Storm and checks that the
>> topology is running. If yes, what would be the command or API?
>>
>> From: rui.ab...@gmail.com At: 02/21/20 16:48:53
>> To: Zainal Arifin (BLOOMBERG/ 731 LEX ) <zari...@bloomberg.net>,
>> user@storm.apache.org
>> Subject: Re: machine reboot
>>
>> As long as the workers and tasks on machine A are healthy and sending
>> heartbeats to Nimbus, they will keep running there. A redeployment of the
>> topologies or a rebalance command (you can use the Storm UI for this) may
>> send tasks to be executed on machine B.
>>
>> On Fri, Feb 21, 2020, 22:14 Zainal Arifin (BLOOMBERG/ 731 LEX) <
>> zari...@bloomberg.net> wrote:
>>
>>> Hi,
>>> We run Storm on 2 machines (let's call them machine A and B), and
>>> everything works fine.
>>> Now I want to test a machine reboot: when the machine is brought
>>> down, it calls my script to stop Storm, and when the machine comes
>>> back up, it calls my script to start Storm.
>>>
>>> In my test of rebooting 1 machine (let's say B), after the machine was
>>> back online, I noticed Storm runs fine there, but all spout/bolt tasks
>>> are running on machine A. I waited for a few minutes (by now it has been
>>> 1 hour), and all tasks are still running on machine A.
>>>
>>> I'd expect some of the tasks to be automatically redistributed
>>> (rebalanced) to machine B. Is that not the case? Or is there something I
>>> need to configure? Thanks!
>>>
>>
>>
>
