I have dug into this a bit more, and here is the general design I ended
up with after a couple of iterations:
- The plugin tracks the N most recent provisioning activities. One such
activity covers the whole lifecycle from provisioning to slave deletion.
- The activity has 4 hardcoded phases: provisioning, launching,
operating, completed. Operation starts with the first successful launch
and ends with slave deletion (it is the only productive phase). The
activity is completed once the slave is gone; from then on it is
effectively a history record.
- Each phase execution tracks its start time and a list of attachments.
(I will reconsider making it actionable and using actions instead of
attachments, as they are similar.) The attachment is extensible and can
be a mere piece of HTML, a hyperlink or a model object with a URL
subspace. This is to attach and present any kind of information: logs,
exceptions, etc.
- Each Attachment(/Action) has a state: ok, warn or fail. The worst of
the attached states is propagated to the phase execution and activity
level. (If a slave fails to launch, an exception will be attached
explaining why the launch phase, and thus the whole activity, has
failed.)
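To make the model above concrete, here is a minimal sketch of how the
activity, phase executions and attachments could fit together. All class,
field and method names are hypothetical and only illustrate the idea; this
is not the plugin's actual API. It assumes the state enum is declared from
best to worst so the worst state can be found by comparing ordinals.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; a sketch of the design, not real plugin code.
enum Phase { PROVISIONING, LAUNCHING, OPERATING, COMPLETED }

// Declared from best to worst so ordinal() can be used to pick the worst one.
enum Status { OK, WARN, FAIL }

final class Attachment {
    final Status status;
    final String title;

    Attachment(Status status, String title) {
        this.status = status;
        this.title = title;
    }
}

final class PhaseExecution {
    final long startedMillis;
    private final List<Attachment> attachments = new ArrayList<>();

    PhaseExecution(long startedMillis) {
        this.startedMillis = startedMillis;
    }

    // No check against the current phase: attaching stays possible later on.
    void attach(Attachment attachment) {
        attachments.add(attachment);
    }

    // The worst state among the attachments, OK when there are none.
    Status status() {
        Status worst = Status.OK;
        for (Attachment a : attachments) {
            if (a.status.ordinal() > worst.ordinal()) worst = a.status;
        }
        return worst;
    }
}

final class Activity {
    private final Map<Phase, PhaseExecution> executions = new EnumMap<>(Phase.class);

    // Records the start of a phase; repeated calls keep the first start time.
    void enter(Phase phase, long nowMillis) {
        executions.putIfAbsent(phase, new PhaseExecution(nowMillis));
    }

    PhaseExecution execution(Phase phase) {
        return executions.get(phase);
    }

    // The worst state among all phase executions started so far.
    Status status() {
        Status worst = Status.OK;
        for (PhaseExecution e : executions.values()) {
            Status s = e.status();
            if (s.ordinal() > worst.ordinal()) worst = s;
        }
        return worst;
    }
}
```

With this shape, a failed launch would be recorded by attaching a FAIL
attachment to the launching execution, and both that phase and the whole
activity would then report fail.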
While that sounds reasonable, there are a couple of problems:
1) Each phase is considered completed (as far as time measurement is
concerned) when the next phase starts. This is caused by core extension
points being called at rather unpredictable places. It is possible that
launching starts or even completes (ComputerListener#onOnline) before
provisioning is done (CloudProvisioningListener#onComplete). Each phase
execution therefore needs to stay open for new attachments even after
the next phase has started (e.g. attaching the provisioning log while
launching has already started).
2) A provisioning activity can start without core's involvement
(provisioning from the UI on the /computer page). In such cases, plugins
will have to call a listener in cloud-stats-plugin to have this activity
tracked.
3) There is no concept of templates in the core cloud API. However, it is
used by most cloud implementations and it is valuable information for
statistics.
4) Tracking the same activity as it goes through the
PlannedNode/Computer/Slave phases turned out to be a lot trickier than I
expected. I tried several approaches:
- Almost no plugin uses PlannedNode#displayName as the actual slave
name so it is of no use. Not to mention we would have to reflect slave
renames.
- Calculating a fingerprint based on the PlannedNode's identity and
attaching it to the Slave instance as a NodeProperty in
CloudProvisioningListener#onComplete was the closest thing to a working
solution I have got. The problem is that by the time it gets called,
launching may already have started. Some plugins even wait for the
launch to complete before PlannedNode#future completes. (Plus, for
whatever reason, the computer passed to ComputerListener#preLaunch might
not have a Node assigned yet, which seems like a bug to me.)
Having said that, I see no other way but to require every plugin to
implement a custom interface in its PlannedNode (or its #future),
Computer and Node to have the activity tracked correctly. However
invasive this might be, it will remove problems #3 and #4 entirely.
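As a rough illustration of what such an interface could look like, here
is a sketch in which one identity object is created at provisioning time
and handed from the planned node to the slave and its computer. Every
name below (ActivityId, TrackedItem, the Example* classes) is
hypothetical; the Example* classes are plain stand-ins for a plugin's own
PlannedNode, Slave and Computer subclasses and do not use the Jenkins
API. Carrying the template name in the identity is an assumption about
how problem #3 could be covered.

```java
import java.util.UUID;

// Hypothetical identity shared by all objects that belong to one activity.
final class ActivityId {
    final String cloudName;
    final String templateName; // may be null for clouds without templates
    final String fingerprint;

    ActivityId(String cloudName, String templateName) {
        this.cloudName = cloudName;
        this.templateName = templateName;
        this.fingerprint = UUID.randomUUID().toString();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ActivityId
                && ((ActivityId) o).fingerprint.equals(fingerprint);
    }

    @Override
    public int hashCode() {
        return fingerprint.hashCode();
    }
}

// Hypothetical interface every tracked object would implement.
interface TrackedItem {
    ActivityId getActivityId();
}

// Stand-in for a plugin's PlannedNode; passes its identity to the slave.
final class ExamplePlannedNode implements TrackedItem {
    private final ActivityId id;

    ExamplePlannedNode(ActivityId id) {
        this.id = id;
    }

    public ActivityId getActivityId() {
        return id;
    }

    ExampleSlave provision() {
        return new ExampleSlave(id);
    }
}

// Stand-in for a plugin's Slave; keeps the identity across renames.
final class ExampleSlave implements TrackedItem {
    private final ActivityId id;

    ExampleSlave(ActivityId id) {
        this.id = id;
    }

    public ActivityId getActivityId() {
        return id;
    }

    ExampleComputer createComputer() {
        return new ExampleComputer(this);
    }
}

// Stand-in for a plugin's Computer; delegates to its slave.
final class ExampleComputer implements TrackedItem {
    private final ExampleSlave slave;

    ExampleComputer(ExampleSlave slave) {
        this.slave = slave;
    }

    public ActivityId getActivityId() {
        return slave.getActivityId();
    }
}
```

Because the identity travels with the objects themselves, a listener
could match a computer to its activity regardless of the order in which
the core extension points fire.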
At this point, any feedback is welcome!
--
oliver