Hi Brooklyn devs,
Regarding the recent addition to allow custom Bundle Resolver OSGi
services [1], we've discovered a bothersome issue with load order at
runtime. It is non-deterministic whether a custom resolver bundle loads
before or after the initial catalog.bom and persisted state. If the
custom resolver loads _after_, then it won't be available to handle the
catalog.bom and persisted state, which means a bundle that needs it might
be loaded by the wrong resolver or might fail to load altogether.
We want to introduce a mechanism to prevent those errors. There are
several options:
(a) Specify a dependency on the resolver bundle/service inside the
bundle that needs it
(b) Specify any resolver OSGi bundle or service names that are required
in brooklyn.cfg, and then wait until they are available before
initializing the Brooklyn catalog (e.g. using BundleListener / ServiceTracker)
(c) Require the bundle to be explicitly included in the Brooklyn/OSGi
startup sequence (boot bundles or startup.properties) before the
catalog/rebind initializes
(d) Wait for "all startup and deploy bundles" to be in their final state
(usually active) or a start level before the catalog/rebind initializes
(e) Re-install bundles if we've added a new bundle-parser service (so
while it might fail initially, it eventually succeeds)
Option (d) would be the nicest, I think: it is the simplest for the user
and leans on OSGi "start levels". However, Karaf does not seem to respect
start levels. I'll send an email to the Karaf list to ask.
Option (c) is quite tricky AFAIK: it needs obscure edits to the etc/*
directory and some tricky listeners. (OSGi doesn't encourage the notion
of "wait for everything else to be ready", for obvious reasons: if two
bundles use that philosophy they will deadlock!) So I don't like it.
Option (a) makes writing a bundle that uses a custom resolver more
difficult (e.g. requiring an OSGi MANIFEST.MF) so I don't like it either.
Option (e) is quite hard to code up, and inefficient, and will cause a
lot of warnings in the log as part of the normal case, and potentially
disrupt operations if we re-install bundles whenever a resolver is
added. That said, it is a common pattern in OSGi ... but I don't much
like it.
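
To make the cost of (e) concrete, here is a minimal plain-Java sketch of the
retry-on-new-resolver pattern. Everything in it is hypothetical
(`RetryingInstaller`, resolvers modelled as `Predicate<String>`, bundles
modelled as names); it is not Brooklyn or OSGi API. It only retries bundles
that failed outright. Catching bundles that were loaded by the wrong resolver
would mean re-installing already-installed bundles too, which is where the
disruption concern comes from.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical illustration of option (e); not existing Brooklyn code. */
class RetryingInstaller {
    private final List<Predicate<String>> resolvers = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private final List<String> installed = new ArrayList<>();

    /** Tries each known resolver; on failure, logs a warning and parks the bundle. */
    void install(String bundle) {
        if (tryResolve(bundle)) {
            installed.add(bundle);
        } else {
            System.err.println("WARN: no resolver for " + bundle + ", will retry");
            pending.add(bundle);
        }
    }

    /** Registering a resolver triggers another pass over every parked bundle. */
    void addResolver(Predicate<String> resolver) {
        resolvers.add(resolver);
        for (Iterator<String> it = pending.iterator(); it.hasNext(); ) {
            String bundle = it.next();
            if (tryResolve(bundle)) {
                installed.add(bundle);
                it.remove();
            }
        }
    }

    /** True if any registered resolver accepts the bundle. */
    private boolean tryResolve(String bundle) {
        return resolvers.stream().anyMatch(r -> r.test(bundle));
    }

    List<String> installed() { return installed; }
    List<String> pending() { return pending; }
}
```

Note that the warning is emitted on the normal startup path whenever the
resolver happens to arrive late, which is the log noise mentioned above.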
That leaves option (b) which is what I'm leaning towards (unless we get
an answer re (d)). Specifically we'd say something like this in
`brooklyn.cfg`:
    brooklyn.resolvers.require.services = custom.Resolver1,custom.Resolver2
and then in catalog/rebind we block (with logging) if those are not yet
available. Possibly we would also have
    brooklyn.resolvers.require.timeout = 5m
If the required services are not available within that time, Brooklyn logs
a warning and proceeds.
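
A rough sketch of what the blocking in (b) might look like, in plain Java so
it runs without an OSGi container. The class `RequiredResolverGate`, its
method names and the "5m"-style timeout syntax are all assumptions made for
illustration; in a real implementation `resolverAvailable` would presumably
be driven by a ServiceTracker's `addingService` callback, and the logging
would go through the usual logger rather than stdout/stderr.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.stream.Collectors;

/** Hypothetical gate that catalog/rebind init would wait on; not existing Brooklyn code. */
class RequiredResolverGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final Set<String> missing;

    /** Takes the comma-separated value of the proposed require.services key. */
    RequiredResolverGate(String commaSeparatedNames) {
        missing = Arrays.stream(commaSeparatedNames.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    /** Parses values such as "30s", "5m" or "1h" (assumed syntax). */
    static Duration parseTimeout(String value) {
        String v = value.trim();
        long n = Long.parseLong(v.substring(0, v.length() - 1));
        switch (v.charAt(v.length() - 1)) {
            case 's': return Duration.ofSeconds(n);
            case 'm': return Duration.ofMinutes(n);
            case 'h': return Duration.ofHours(n);
            default: throw new IllegalArgumentException("Unsupported timeout: " + value);
        }
    }

    /** Called whenever a resolver service becomes available. */
    void resolverAvailable(String name) {
        lock.lock();
        try {
            if (missing.remove(name)) changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until every required resolver has been seen or the timeout elapses. */
    boolean await(Duration timeout) throws InterruptedException {
        long nanosLeft = timeout.toNanos();
        lock.lock();
        try {
            while (!missing.isEmpty()) {
                if (nanosLeft <= 0) {
                    System.err.println("WARN: proceeding without resolvers " + missing);
                    return false;
                }
                System.out.println("Waiting for resolvers " + missing);
                nanosLeft = changed.awaitNanos(nanosLeft);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

Catalog/rebind initialization would call something like
`gate.await(RequiredResolverGate.parseTimeout("5m"))` and carry on regardless
of the result, since the proposal is to warn and proceed, not to fail.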
Note there are potentially similar issues with PlanTransformers, but I
think those will be simple to solve once we've solved the above.
Best
Alex
[1] https://github.com/apache/brooklyn-server/pull/1115