Hi Brooklyn devs,

Regarding the recent addition to allow custom Bundle Resolver OSGi services [1], we've discovered a bothersome issue with load order at runtime.  It is non-deterministic whether a custom resolver bundle loads before or after the initial catalog.bom and persisted state.  If the custom resolver loads _after_, then it won't be available to handle the catalog.bom and persisted state, which means the bundle might be loaded by the wrong resolver or it might fail to load altogether.

We want to introduce a mechanism to prevent those errors.  There are several options:

(a) Specify a dependency on the resolver bundle/service inside the bundle that needs it

(b) Specify any resolver OSGi bundle or service names that are required in brooklyn.cfg, and then wait until they are available before initializing the Brooklyn catalog (e.g. using BundleListener / ServiceTracker)

(c) Require the bundle to be explicitly included in the Brooklyn/OSGi startup sequence (boot bundlers or startup.properties) before the catalog/rebind initializes

(d) Wait for "all startup and deploy bundles" to be in their final state (usually active) or a start level before the catalog/rebind initializes

(e) Re-install bundles if we've added a new bundle-parser service (so while it might fail initially, it eventually succeeds)


Option (d) would be the nicest, I think, and the simplest for the user, since it leans on OSGi "start levels".  However, Karaf does not seem to respect start levels.  I'll send an email to the Karaf list to ask.

Option (c) is quite tricky AFAIK: it needs obscure edits to the etc/* directory and some tricky listeners.  (OSGi doesn't encourage the notion of "wait for everything else to be ready", for the obvious reason that if two bundles use that philosophy they will deadlock!)  So I don't like it.

Option (a) makes writing a bundle that uses a custom resolver more difficult (e.g. requiring an OSGi MANIFEST.MF) so I don't like it either.

Option (e) is quite hard to code up, and inefficient, and will cause a lot of warnings in the log as part of the normal case, and potentially disrupt operations if we re-install bundles whenever a resolver is added.  That said, it is a common pattern in OSGi ... but I don't much like it.


That leaves option (b) which is what I'm leaning towards (unless we get an answer re (d)).  Specifically we'd say something like this in `brooklyn.cfg`:

    brooklyn.resolvers.require.services = custom.Resolver1,custom.Resolver2

and then in catalog/rebind we block (with logging) if those are not yet available.  Possibly we would also have

    brooklyn.resolvers.require.timeout = 5m

And if not available within that timeframe it logs a warning and proceeds.
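
To make the blocking behaviour in (b) concrete, here is a rough sketch in plain Java.  It is not Brooklyn code: the class name `RequiredResolverGate`, its methods, the `30s`/`5m`/`1h` timeout syntax and the polling approach are all assumptions for illustration.  The availability check is passed in as a predicate, so a real implementation could back it with an OSGi ServiceTracker or a service-registry lookup instead of polling.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical helper (not part of Brooklyn) sketching option (b). */
final class RequiredResolverGate {

    /** Splits a value such as "custom.Resolver1,custom.Resolver2" into names. */
    static List<String> parseRequired(String value) {
        List<String> names = new ArrayList<>();
        if (value == null) return names;
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) names.add(name);
        }
        return names;
    }

    /** Parses timeouts such as "30s", "5m" or "1h" (an assumed syntax). */
    static Duration parseTimeout(String value) {
        String v = value.trim();
        if (v.length() < 2) throw new IllegalArgumentException("Unsupported timeout: " + value);
        long amount = Long.parseLong(v.substring(0, v.length() - 1));
        switch (v.charAt(v.length() - 1)) {
            case 's': return Duration.ofSeconds(amount);
            case 'm': return Duration.ofMinutes(amount);
            case 'h': return Duration.ofHours(amount);
            default: throw new IllegalArgumentException("Unsupported timeout: " + value);
        }
    }

    /**
     * Blocks until every required name is reported available, or the timeout
     * elapses.  Returns the names still missing (empty if all were found), so
     * the caller can proceed with catalog/rebind either way.
     */
    static Set<String> awaitAvailable(List<String> required, Predicate<String> isAvailable,
                                      Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        Set<String> missing = new LinkedHashSet<>(required);
        while (true) {
            missing.removeIf(isAvailable);
            if (missing.isEmpty()) return missing;
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000;
            if (remainingMillis <= 0) {
                System.err.println("WARN: proceeding without required resolvers: " + missing);
                return missing;
            }
            System.err.println("Waiting for required resolvers: " + missing);
            try {
                Thread.sleep(Math.max(1, Math.min(pollInterval.toMillis(), remainingMillis)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                return missing;
            }
        }
    }
}
```

Catalog/rebind initialization would call `awaitAvailable` with the two parsed config values before loading anything, and carry on whatever the result, treating a non-empty return value as "warned and proceeding".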


Note there are potentially similar issues with PlanTransformers, but I think those will be simple to solve once we've solved the above.

Best
Alex

[1] https://github.com/apache/brooklyn-server/pull/1115

