[ https://issues.apache.org/jira/browse/YUNIKORN-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Craig Condit closed YUNIKORN-936.
---------------------------------
    Resolution: Delivered

Closing as this was resolved elsewhere.

> app and node recovery event ordering
> ------------------------------------
>
>         Key: YUNIKORN-936
>         URL: https://issues.apache.org/jira/browse/YUNIKORN-936
>     Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: core - common
>    Reporter: Wilfred Spiegelenburg
>    Assignee: Peter Bacsko
>    Priority: Major
>
> While working on YUNIKORN-905 a number of unit tests failed due to event
> ordering. Looking at the change, we may have had an issue in the RMProxy
> for a long time.
> An update request could contain apps, asks and nodes, and they were
> processed in that order. During recovery the order was, and still is,
> important. There was never an ordering requirement on the events sent by
> a shim, nor did the shim use complex update events to support this
> ordering.
> An event to recover a node could arrive in a separate UpdateRequest from
> the applications that should be recovered. That means we relied on
> goroutine scheduling and event ordering to hopefully do things correctly,
> i.e. that events sent by the shim to create new apps would be processed
> before node recovery started. Even in the previous implementation there
> was no guarantee that all the applications were added before a node was
> recovered. The unit tests in the core relied on the processing order to
> make sure it worked.
> That is not the real-world scenario, and thus a dangerous assumption.
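
The ordering problem described in the issue can be illustrated with a small Go sketch. It is not the fix that was delivered (the ticket only says it was resolved elsewhere) and it does not use YuniKorn code: `NodeRecovery`, `RecoveryBuffer` and all of their methods are hypothetical names made up for this example. The sketch assumes a node recovery event lists the IDs of the applications it depends on, and shows one way to make the outcome independent of arrival order by parking a node until every application it references has been registered.

```go
package main

import "sync"

// NodeRecovery is a hypothetical event: a node being recovered plus the IDs
// of the applications that own the allocations already running on it.
type NodeRecovery struct {
	NodeID string
	AppIDs []string
}

// RecoveryBuffer (hypothetical) parks node recoveries until every
// application they reference is known, so the result does not depend on
// whether app or node events happen to be processed first.
type RecoveryBuffer struct {
	mu        sync.Mutex
	apps      map[string]bool
	pending   []NodeRecovery
	recovered []string
}

// NewRecoveryBuffer returns an empty buffer.
func NewRecoveryBuffer() *RecoveryBuffer {
	return &RecoveryBuffer{apps: make(map[string]bool)}
}

// AddApp registers an application and retries any parked node recoveries.
func (b *RecoveryBuffer) AddApp(appID string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.apps[appID] = true

	var still []NodeRecovery
	for _, n := range b.pending {
		if b.ready(n) {
			b.recovered = append(b.recovered, n.NodeID)
		} else {
			still = append(still, n)
		}
	}
	b.pending = still
}

// RecoverNode recovers the node at once when all of its applications are
// registered, and parks it otherwise.
func (b *RecoveryBuffer) RecoverNode(n NodeRecovery) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.ready(n) {
		b.recovered = append(b.recovered, n.NodeID)
		return
	}
	b.pending = append(b.pending, n)
}

// Pending reports how many node recoveries are still parked.
func (b *RecoveryBuffer) Pending() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return len(b.pending)
}

// RecoveredNodes returns a copy of the node IDs recovered so far, in order.
func (b *RecoveryBuffer) RecoveredNodes() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return append([]string(nil), b.recovered...)
}

// ready reports whether every application the node references is
// registered. The caller must hold b.mu.
func (b *RecoveryBuffer) ready(n NodeRecovery) bool {
	for _, id := range n.AppIDs {
		if !b.apps[id] {
			return false
		}
	}
	return true
}
```

With this sketch, calling `RecoverNode` for a node that references `app-1` and `app-2` before either `AddApp` call leaves the node parked, and it is recovered only once both applications have been added. A unit test written this way would exercise the "node first" arrival order explicitly instead of relying on the processing order inside a single request.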