Hi all, I’m creating a cluster singleton plugin and have found an issue with the lifecycle management of the singleton if the plugin is updated via the API.
When the /cluster/plugin API is called with an update payload, the ClusterSingletons.modified method is called. It first adds the new plugin to the singletonMap (and starts it if applicable), then stops and removes the old one. The order of these operations has a couple of side effects:

1. For a very brief period, there are two instances of the plugin running. This may not really be a problem, but it does seem to violate the singleton principle.
2. Because the map is keyed on the plugin name, adding the replacement first overwrites the existing (old) entry in the map. When the old one is then removed by name, the removal actually deletes the new entry that was just added. This leaves the singletonMap with no entry for the plugin, so when the Overseer node goes down, the stop method is never called for the plugin.

I’ve reproduced the issue by modifying the TestContainerPlugin test, and I can create a Jira issue, but I wonder whether there is a reason I haven’t understood for calling the added and deleted methods in this order. It seems to me that reversing the order in which they are called would solve the issue.

Thanks,
Paul
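
P.S. To make the ordering problem concrete, here is a minimal, self-contained sketch. It is not the Solr code: Plugin, Registry, modifiedAddFirst, modifiedDeleteFirst and isTracked are hypothetical names I made up for illustration, and the only thing it assumes about the real implementation is what is described above (a map keyed on plugin name, with added doing a put-and-start and deleted doing a stop-and-remove).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SingletonOrderDemo {

    /** Hypothetical stand-in for a cluster singleton plugin instance. */
    static final class Plugin {
        final String name;
        final String version;
        boolean running;

        Plugin(String name, String version) {
            this.name = name;
            this.version = version;
        }

        void start() { running = true; }
        void stop()  { running = false; }
    }

    /** Simplified model of a registry keyed on plugin name (an assumption, not Solr's class). */
    static final class Registry {
        final Map<String, Plugin> singletonMap = new LinkedHashMap<>();

        void added(Plugin p) {
            singletonMap.put(p.name, p);   // overwrites any entry with the same name
            p.start();
        }

        void deleted(Plugin p) {
            p.stop();
            singletonMap.remove(p.name);   // removes by name, whichever instance is stored
        }

        // Order described in this message: add the replacement, then delete the old one.
        void modifiedAddFirst(Plugin old, Plugin replacement) {
            added(replacement);
            deleted(old);
        }

        // Proposed order: delete the old one, then add the replacement.
        void modifiedDeleteFirst(Plugin old, Plugin replacement) {
            deleted(old);
            added(replacement);
        }
    }

    /** Returns true if the replacement is still in the map after the update. */
    static boolean isTracked(boolean deleteFirst) {
        Registry registry = new Registry();
        Plugin v1 = new Plugin("my-plugin", "1");
        Plugin v2 = new Plugin("my-plugin", "2");
        registry.added(v1);
        if (deleteFirst) {
            registry.modifiedDeleteFirst(v1, v2);
        } else {
            registry.modifiedAddFirst(v1, v2);
        }
        return registry.singletonMap.get("my-plugin") == v2;
    }

    public static void main(String[] args) {
        System.out.println("add-then-delete keeps new entry: " + isTracked(false)); // false
        System.out.println("delete-then-add keeps new entry: " + isTracked(true));  // true
    }
}
```

In the add-first case the replacement is left running but untracked, which matches the symptom that stop is never called for it later. In the delete-first case the map ends up holding the replacement, and the two instances never run at the same time.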