[
https://issues.apache.org/jira/browse/STORM-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Li updated STORM-3527:
----------------------------
Description:
Sometimes the supervisor is terminated or dies while writing the username to
the workers-users file. When that happens, the file can be left empty. When
the supervisor later recovers, it cannot get the correct username because the
workers-users file is present but empty. As a result, the supervisor is never
able to clean up this worker, and the supervisor log file shows:
{code:java}
2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911'
contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904'
contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433'
contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510'
contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741'
contained no data, resetting state
{code}
was:
Sometimes supervisor got terminated/died during writing username to
workers-users file. And when it happens, the file could be empty. And when
supervisor recovers after, it wouldn't be able to get the correct username
because the workers-users file is present but empty.
So you could see in supervisor log file:
{code:java}
2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911'
contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904'
contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433'
contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510'
contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file
'/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741'
contained no data, resetting state
{code}
> Container.getWorkerUser() should check if the user name is empty
> ----------------------------------------------------------------
>
> Key: STORM-3527
> URL: https://issues.apache.org/jira/browse/STORM-3527
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Ethan Li
> Assignee: Ethan Li
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Sometimes the supervisor is terminated or dies while writing the username to
> the workers-users file. When that happens, the file can be left empty. When
> the supervisor later recovers, it cannot get the correct username because
> the workers-users file is present but empty. As a result, the supervisor is
> never able to clean up this worker, and the supervisor log file shows:
> {code:java}
> 2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file
> '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911'
> contained no data, resetting state
> 2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file
> '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904'
> contained no data, resetting state
> 2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file
> '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433'
> contained no data, resetting state
> 2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file
> '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510'
> contained no data, resetting state
> 2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file
> '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741'
> contained no data, resetting state
> {code}
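
The check named in the issue title could look roughly like the sketch below.
This is not the actual Storm patch: the class WorkerUserFileSketch, the
methods readWorkerUser and resolveWorkerUser, and the fallback user name are
hypothetical names used only for illustration. The idea is to treat a
workers-users file that is present but empty the same way as a missing one,
so the caller never ends up with an empty user name.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public class WorkerUserFileSketch {

    /**
     * Reads the user name recorded for a worker (hypothetical helper).
     * Returns Optional.empty() when the file is missing, empty, or holds
     * only whitespace, e.g. after an interrupted write.
     */
    static Optional<String> readWorkerUser(Path workerUserFile) throws IOException {
        if (!Files.exists(workerUserFile)) {
            return Optional.empty();
        }
        String content =
            new String(Files.readAllBytes(workerUserFile), StandardCharsets.UTF_8).trim();
        return content.isEmpty() ? Optional.empty() : Optional.of(content);
    }

    /**
     * Resolves the worker user, using fallbackUser when the file gives no
     * usable name. Where the fallback comes from is an assumption here.
     */
    static String resolveWorkerUser(Path workerUserFile, String fallbackUser)
            throws IOException {
        return readWorkerUser(workerUserFile).orElse(fallbackUser);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("workers-users");
        // Simulates a write that was interrupted before any bytes landed.
        Path emptyFile = Files.createFile(dir.resolve("worker-a"));
        // Simulates a completed write.
        Path goodFile = Files.write(dir.resolve("worker-b"),
            "alice\n".getBytes(StandardCharsets.UTF_8));

        System.out.println(resolveWorkerUser(emptyFile, "fallback-user")); // fallback-user
        System.out.println(resolveWorkerUser(goodFile, "fallback-user"));  // alice
    }
}
```

Returning an Optional keeps the "no usable name" case explicit at the call
site, so an empty string cannot be passed on as if it were a real user.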
--
This message was sent by Atlassian Jira
(v8.3.4#803005)