[ 
https://issues.apache.org/jira/browse/STORM-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Li updated STORM-3527:
----------------------------
    Description: 
Sometimes the supervisor is terminated or dies while writing the username to 
the workers-users file. When that happens, the file can be left empty. When 
the supervisor later recovers, it cannot get the correct username because the 
workers-users file is present but empty. As a result, the supervisor is never 
able to clean up this worker, and the supervisor log file shows:

{code:java}
2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741' contained no data, resetting state
{code}


  was:
Sometimes the supervisor is terminated or dies while writing the username to 
the workers-users file. When that happens, the file can be left empty. When 
the supervisor later recovers, it cannot get the correct username because the 
workers-users file is present but empty.

The supervisor log file then shows:

{code:java}
2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741' contained no data, resetting state
{code}



> Container.getWorkerUser() should check if the user name is empty
> ----------------------------------------------------------------
>
>                 Key: STORM-3527
>                 URL: https://issues.apache.org/jira/browse/STORM-3527
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Ethan Li
>            Assignee: Ethan Li
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sometimes the supervisor is terminated or dies while writing the username to 
> the workers-users file. When that happens, the file can be left empty. When 
> the supervisor later recovers, it cannot get the correct username because the 
> workers-users file is present but empty. As a result, the supervisor is never 
> able to clean up this worker, and the supervisor log file shows:
> {code:java}
> 2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911' contained no data, resetting state
> 2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904' contained no data, resetting state
> 2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433' contained no data, resetting state
> 2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510' contained no data, resetting state
> 2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741' contained no data, resetting state
> {code}
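
For illustration only, here is a minimal sketch of the kind of emptiness check the issue title asks for. It is not Storm's actual Container code: the class WorkerUserFiles and the method readWorkerUser are hypothetical names, and the sketch assumes the per-worker file simply holds the user name as UTF-8 text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper for illustration; not part of Apache Storm.
final class WorkerUserFiles {
    private WorkerUserFiles() {
    }

    /**
     * Reads the user name recorded for a worker.
     * Returns Optional.empty() when the file is missing, empty, or
     * contains only whitespace, so a file truncated by an interrupted
     * write is treated the same as a missing one.
     */
    static Optional<String> readWorkerUser(Path userFile) throws IOException {
        if (!Files.isRegularFile(userFile)) {
            return Optional.empty();
        }
        // Assumption: the file content is just the user name in UTF-8.
        String content =
            new String(Files.readAllBytes(userFile), StandardCharsets.UTF_8).trim();
        return content.isEmpty() ? Optional.empty() : Optional.of(content);
    }
}
```

With a check like this, a caller can tell "no usable user name" apart from a valid name and decide how to proceed, instead of using an empty string as the user.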



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
