Hi all,
We are looking at implementing a "cleaner" that can remove orphaned
locations from persisted state.
_*Problem statement*_
In older versions of Brooklyn (e.g. prior to [1]), we sometimes did not
unmanage locations when the associated entity was deleted. This means
that the persisted state for some customers contains many "orphaned
locations" that are no longer referenced.
We want a way to safely delete these. We only want to delete locations
that are not referenced.
These orphaned locations can also cause "dangling references" to be
reported, where the orphaned location(s) hold references to things that
have been deleted.
References to locations can take a few forms:
1. Location is directly referenced from an entity's getLocations().
2. Location is indirectly referenced from an entity (e.g. the location
is the parent of another location that is referenced).
3. Location is referenced by an entity in some other way (rather than
getLocations()) - e.g. in a sensor or config key, such as [2].
4. Location is referenced by a policy or enricher.
For (4), I can't think of any such use-case off-hand, but it's possible
that a customer might write a bespoke policy/enricher that does this.
For (2), it means we need to worry about reachability. Note there might
be groups of locations that are unreachable (e.g. location X and its
parent refer to each other, but are not referenced by anything else).
_*Location deletion: proposed solution*_
We propose an offline tool, similar in use to copy-state [3], which will
clean up the persisted state, and save the cleaned-up copy to a given
location.
It is important that the tool is run offline: a running Brooklyn server
could be in the middle of writing multiple new files, so the tool might
otherwise read inconsistent state.
Ideally this will not deserialize the persisted state (so will not
require classloading, etc). We'll therefore work with
BrooklynMementoRawData [4].
This also means we'd be able to run the tool outside of the Karaf
container.
We can identify location references in the XML using a combination of
the following techniques:
1. The marker <locationProxy>...</locationProxy> for references inside
config keys, sensors, etc.
2. Inside an entity, the <locations>...</locations> section.
3. Inside a location, the <parent>...</parent> and
<children>...</children> section.
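As a rough sketch of how the three markers above could be pulled out of the raw XML without any Brooklyn classloading, the following uses only the JDK's XML and XPath support. The class and method names are hypothetical, and the exact element layout (an `<entity>` or `<location>` root element, with ids held in nested `<string>` elements) is an assumption that would need checking against real persisted state.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts location ids from one persisted XML file.
class LocationReferenceExtractor {

    // Returns the trimmed text of every node matched by the XPath expression.
    static Set<String> extract(String xml, String xpathExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpression, doc, XPathConstants.NODESET);
            Set<String> ids = new LinkedHashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                String id = nodes.item(i).getTextContent().trim();
                if (!id.isEmpty()) {
                    ids.add(id);
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse persisted XML", e);
        }
    }

    // Techniques (1) and (2): locations referenced from an entity.
    // Assumed layout: <entity><locations><string>ID</string></locations>...
    static Set<String> referencesFromEntity(String entityXml) {
        return extract(entityXml, "/entity/locations/string | //locationProxy");
    }

    // Technique (3), plus any proxies inside the location's own config.
    // Assumed layout: <location><parent>ID</parent>
    //                 <children><string>ID</string></children>...
    static Set<String> linksFromLocation(String locationXml) {
        return extract(locationXml,
                "/location/parent | /location/children/string | //locationProxy");
    }
}
```

The input strings would be the per-file XML contents that the raw mementos expose; policies and enrichers could be scanned for `<locationProxy>` with the same `extract` call.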
From (1) and (2), we'll identify all locations that are directly
referenced. From (3), we'll then identify the locations that are
indirectly referenced (i.e. reachable via the parent/children of a
referenced location). We'll then know we can delete all others.
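The reachability step could look roughly like the sketch below: a plain graph traversal that starts from the directly referenced locations and follows parent/children links. All names here are hypothetical, and it assumes the ids and links have already been extracted from the XML into simple collections.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: works on ids only, no Brooklyn classes needed.
class OrphanedLocationFinder {

    /**
     * @param allLocationIds     ids of every persisted location
     * @param directlyReferenced ids found via techniques (1) and (2)
     * @param locationLinks      for each location id, the ids in its
     *                           parent/children sections (technique 3)
     * @return the location ids not reachable from any direct reference
     */
    static Set<String> findOrphans(Set<String> allLocationIds,
                                   Set<String> directlyReferenced,
                                   Map<String, Set<String>> locationLinks) {
        Set<String> reachable = new HashSet<>();
        Deque<String> toVisit = new ArrayDeque<>(directlyReferenced);
        while (!toVisit.isEmpty()) {
            String id = toVisit.pop();
            if (!reachable.add(id)) {
                continue; // already visited; this also terminates cycles
            }
            for (String linked : locationLinks.getOrDefault(id, Set.of())) {
                if (!reachable.contains(linked)) {
                    toVisit.push(linked);
                }
            }
        }
        // Everything that was never reached is a candidate for deletion.
        Set<String> orphans = new TreeSet<>(allLocationIds);
        orphans.removeAll(reachable);
        return orphans;
    }
}
```

Because the traversal only ever starts from directly referenced locations, a group such as location X and its parent that refer only to each other is never visited, so both end up in the orphan set.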
_Optional second part: validating location deletions_
We could validate that we were right to delete those locations. When we
next start Brooklyn, we could look at the set of dangling references
[5]. If anything we deleted is now reported as a dangling reference,
then we'd report this error.
Is this worth doing? Should it be optional (because it requires being
able to class-load everything)?
_*Policy/Enricher deletion: proposed solution*_
We can apply the same logic for deleting policies/enrichers that have
become orphaned.
It is a lot easier to identify the policies/enrichers that are in use:
they are all directly referenced by an entity in the section
<enrichers>...</enrichers> or <policies>...</policies>.
Anything not referenced, we can delete.
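For policies and enrichers there is no reachability to worry about, so the check reduces to a set difference. A minimal sketch, with hypothetical names and assuming the ids listed in each entity's policies/enrichers sections have already been extracted:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: usable for either policies or enrichers.
class OrphanedAdjunctFinder {

    /**
     * @param persistedIds            ids of all persisted policies (or enrichers)
     * @param idsReferencedByEntities for each entity, the ids listed in its
     *                                policies (or enrichers) section
     * @return the persisted ids that no entity refers to
     */
    static Set<String> findOrphans(Set<String> persistedIds,
                                   Collection<Set<String>> idsReferencedByEntities) {
        Set<String> inUse = new HashSet<>();
        for (Set<String> ids : idsReferencedByEntities) {
            inUse.addAll(ids);
        }
        Set<String> orphans = new TreeSet<>(persistedIds);
        orphans.removeAll(inUse);
        return orphans;
    }
}
```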
Aled
[1] https://github.com/apache/brooklyn-server/pull/148
[2] https://github.com/apache/brooklyn-server/blob/0.9.0/core/src/main/java/org/apache/brooklyn/core/location/dynamic/LocationOwner.java#L64
[3] http://brooklyn.apache.org/v/0.9.0/ops/persistence/index.html#cli-commands-for-copying-state
[4] https://github.com/apache/brooklyn-server/blob/0.9.0/api/src/main/java/org/apache/brooklyn/api/mgmt/rebind/mementos/BrooklynMementoRawData.java
[5] https://github.com/apache/brooklyn-server/blob/0.9.0/api/src/main/java/org/apache/brooklyn/api/mgmt/rebind/RebindExceptionHandler.java#L55-L88