[ https://issues.apache.org/jira/browse/STORM-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Li updated STORM-3527:
----------------------------
Description:
Sometimes the supervisor is terminated or dies while writing the username to the workers-users file. When that happens, the file can be left empty. When the supervisor later recovers, it cannot get the correct username because the workers-users file is present but empty. As a result, the supervisor is never able to clean up this worker, and the supervisor log file shows:
{code:java}
2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741' contained no data, resetting state
{code}

was:
Sometimes the supervisor is terminated or dies while writing the username to the workers-users file. When that happens, the file can be left empty. When the supervisor later recovers, it cannot get the correct username because the workers-users file is present but empty.
So the supervisor log file shows:
{code:java}
2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904' contained no data, resetting state
2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510' contained no data, resetting state
2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741' contained no data, resetting state
{code}

> Container.getWorkerUser() should check if the user name is empty
> ----------------------------------------------------------------
>
> Key: STORM-3527
> URL: https://issues.apache.org/jira/browse/STORM-3527
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Ethan Li
> Assignee: Ethan Li
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Sometimes the supervisor is terminated or dies while writing the username to
> the workers-users file. When that happens, the file can be left empty. When
> the supervisor later recovers, it cannot get the correct username
> because the workers-users file is present but empty.
> As a result, the supervisor is never able to clean up this worker, and the
> supervisor log file shows:
> {code:java}
> 2019-10-21 18:26:48.272 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/a9290217-f83f-4c16-ac54-781aca150d7f/heartbeats/1571508791911' contained no data, resetting state
> 2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/94967b6b-c666-4020-9d2c-363551d1229b/heartbeats/1571508791904' contained no data, resetting state
> 2019-10-21 18:26:49.282 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/5aa891f0-9b9c-4914-8745-c55e99537ba1/heartbeats/1569158099433' contained no data, resetting state
> 2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/060056f4-9589-4473-b6d0-9ab5fdc278e2/heartbeats/1561524903510' contained no data, resetting state
> 2019-10-21 18:26:49.283 o.a.s.u.LocalState timer [WARN] LocalState file '/home/y/var/storm/workers/bb189497-eb21-48c4-ba62-48ee02acde94/heartbeats/1571508791741' contained no data, resetting state
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
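
For illustration only, the check named in the issue title could look roughly like the sketch below. It is not Storm's actual Container code and not the patch attached to this issue: the class `WorkerUserFile`, the methods `readWorkerUser` and `workerUserOrDefault`, and the idea of a caller-supplied fallback name are all hypothetical. The only point it shows is that a workers-users file that is present but empty is treated the same way as a missing one.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Apache Storm. */
public final class WorkerUserFile {
    private WorkerUserFile() {
    }

    /**
     * Returns the user name stored in the given file, or Optional.empty()
     * if the file is missing, empty, or contains only whitespace
     * (for example after an interrupted write).
     */
    public static Optional<String> readWorkerUser(Path userFile) throws IOException {
        String content;
        try {
            content = new String(Files.readAllBytes(userFile), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            // A missing file carries no user name.
            return Optional.empty();
        }
        String user = content.trim();
        // An empty file is handled exactly like a missing one.
        return user.isEmpty() ? Optional.empty() : Optional.of(user);
    }

    /**
     * Returns the stored user name, or the caller-supplied fallback when the
     * file yields no usable name. Where the fallback comes from is an
     * assumption left to the caller.
     */
    public static String workerUserOrDefault(Path userFile, String fallbackUser)
            throws IOException {
        return readWorkerUser(userFile).orElse(fallbackUser);
    }
}
```

With a check of this kind, a caller can tell "no usable user name" apart from a real name and decide how to proceed with cleanup, instead of passing an empty string along.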