I have dug into this a bit more, and here is the general design I ended up with after a couple of iterations:

- The plugin tracks the N most recent provisioning activities. One such activity covers the whole lifecycle from provisioning to slave deletion.
- The activity has 4 hardcoded phases: provisioning, launching, operating, completed. Operating starts with the first successful launch and ends with slave deletion (it is the only productive phase). The activity is completed once the slave is gone, at which point it is effectively history.
- Each phase execution tracks its start time and a list of attachments. (I will reconsider making it actionable and using actions instead of attachments, as they are similar.) The attachment is extensible and can be a mere piece of HTML, a hyperlink or a model object with a URL subspace. This is to attach and present any kind of information: logs, exceptions, etc.
- Each Attachment(/Action) has a state: ok, warn or fail. The worst of the attached states is propagated to the phase execution and activity level. (If a slave fails to launch, an exception will be attached explaining why the launch phase, and thus the whole activity, has failed.)
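To make the model above concrete, here is a minimal sketch in plain Java. All names (Phase, Status, Attachment, PhaseExecution, Activity) are hypothetical and chosen for illustration only; this is not the plugin's actual API, just one way the "worst state wins" propagation could look.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; a sketch, not the plugin's real classes.
enum Phase { PROVISIONING, LAUNCHING, OPERATING, COMPLETED }

// Declared from best to worst, so a higher ordinal means a worse state.
enum Status { OK, WARN, FAIL }

final class Attachment {
    final Status status;
    final String title;

    Attachment(Status status, String title) {
        this.status = status;
        this.title = title;
    }
}

final class PhaseExecution {
    final long startedMillis;
    final List<Attachment> attachments = new ArrayList<>();

    PhaseExecution(long startedMillis) {
        this.startedMillis = startedMillis;
    }

    // The phase is as bad as its worst attachment; no attachments means OK.
    Status status() {
        Status worst = Status.OK;
        for (Attachment a : attachments) {
            if (a.status.compareTo(worst) > 0) {
                worst = a.status;
            }
        }
        return worst;
    }
}

final class Activity {
    private final Map<Phase, PhaseExecution> executions = new EnumMap<>(Phase.class);

    // Records the start of a phase once; repeated calls return the existing execution.
    PhaseExecution enter(Phase phase, long nowMillis) {
        return executions.computeIfAbsent(phase, p -> new PhaseExecution(nowMillis));
    }

    PhaseExecution execution(Phase phase) {
        return executions.get(phase);
    }

    // The activity is as bad as its worst phase execution.
    Status status() {
        Status worst = Status.OK;
        for (PhaseExecution e : executions.values()) {
            Status s = e.status();
            if (s.compareTo(worst) > 0) {
                worst = s;
            }
        }
        return worst;
    }
}
```

With this shape, attaching a FAIL attachment (say, a launch exception) to the launching phase marks both that phase and the whole activity as failed, while the provisioning phase stays ok.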

While that sounds reasonable, there are a couple of problems:

1) Each phase is considered completed (as far as time measurement is concerned) when the next phase starts. This is caused by core extension points being called in rather random places. It is possible that launching starts or even completes (ComputerListener#onOnline) before provisioning is done (CloudProvisioningListener#onComplete). Each phase execution therefore needs to stay open for new attachments even after the next phase has started (e.g. attaching the provisioning log while launching has already started).
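A rough sketch of what problem 1 implies for the data model, again in plain Java with hypothetical names (Stage, PhaseTimeline): the duration of a phase is derived from the start of the next phase that has begun, and attaching to a phase stays legal after a later phase has started.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical names; illustrates the timing rule only.
enum Stage { PROVISIONING, LAUNCHING, OPERATING, COMPLETED }

final class PhaseTimeline {
    private final Map<Stage, Long> started = new EnumMap<>(Stage.class);
    private final Map<Stage, List<String>> notes = new EnumMap<>(Stage.class);

    // Only the first start of a stage is recorded.
    void start(Stage stage, long nowMillis) {
        started.putIfAbsent(stage, nowMillis);
    }

    // Allowed for any started stage, even if a later stage has already begun.
    void attach(Stage stage, String note) {
        if (!started.containsKey(stage)) {
            throw new IllegalStateException(stage + " has not started");
        }
        notes.computeIfAbsent(stage, s -> new ArrayList<>()).add(note);
    }

    List<String> notes(Stage stage) {
        List<String> found = notes.get(stage);
        return found == null ? Collections.<String>emptyList() : found;
    }

    // Time until the next started stage; empty while no later stage has begun.
    OptionalLong durationMillis(Stage stage) {
        Long begin = started.get(stage);
        if (begin == null) {
            return OptionalLong.empty();
        }
        Stage[] all = Stage.values();
        for (int i = stage.ordinal() + 1; i < all.length; i++) {
            Long next = started.get(all[i]);
            if (next != null) {
                return OptionalLong.of(next - begin);
            }
        }
        return OptionalLong.empty();
    }
}
```

For example, if provisioning starts at t=1000 and launching at t=4000, provisioning is reported as lasting 3000 ms, yet a provisioning log can still be attached to it afterwards.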

2) A provisioning activity can start without core's involvement (provisioning from the UI on the /computer page). In such cases, plugins will have to call a listener in cloud-stats-plugin to have this activity tracked.

3) There is no concept of templates in the core cloud API. However, templates are used by most cloud implementations and they are valuable information for statistics.

4) Tracking the same activity as it goes through the PlannedNode/Computer/Slave phases turned out to be a lot trickier than I expected. I tried several approaches:

- Almost no plugin uses PlannedNode#displayName as the actual slave name, so it is of no use. Not to mention we would have to reflect slave renames.
- Calculating a fingerprint based on the PlannedNode's identity and attaching it to the Slave instance as a NodeProperty in CloudProvisioningListener#onComplete was the closest thing to a working solution I have got. The problem is that by the time it gets called, launching can already have started. Some plugins even wait for the launch to complete before leaving PlannedNode#future. (Plus, for whatever reason, the computer passed to ComputerListener#preLaunch might not have a Node assigned yet, which seems like a bug to me.)

Having said that, I see no other way but to require every plugin to implement a custom interface in PlannedNode (or its #future), computer and node to have the activity tracked correctly. However invasive this might be, it will remove problems #3 and #4 entirely.
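As a rough illustration of that idea, the sketch below uses plain Java stand-ins rather than the real Jenkins classes. The names TrackedItem, ActivityId, DemoPlannedNode, DemoComputer and ActivityTracker are all hypothetical. The point is only that if every object in the chain exposes the same id (which can also carry the template name), correlating them becomes a simple map lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical id shared by the planned node, computer and node of one activity.
final class ActivityId {
    final String cloudName;
    final String templateName; // may be null for clouds without templates
    final String uniqueKey;

    ActivityId(String cloudName, String templateName, String uniqueKey) {
        this.cloudName = cloudName;
        this.templateName = templateName;
        this.uniqueKey = uniqueKey;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ActivityId)) {
            return false;
        }
        ActivityId other = (ActivityId) o;
        return Objects.equals(cloudName, other.cloudName)
                && Objects.equals(templateName, other.templateName)
                && Objects.equals(uniqueKey, other.uniqueKey);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cloudName, templateName, uniqueKey);
    }
}

// Hypothetical interface each plugin would implement on its own classes.
interface TrackedItem {
    ActivityId getActivityId();
}

// Stand-ins for a plugin's PlannedNode and Computer subclasses.
final class DemoPlannedNode implements TrackedItem {
    private final ActivityId id;
    DemoPlannedNode(ActivityId id) { this.id = id; }
    @Override public ActivityId getActivityId() { return id; }
}

final class DemoComputer implements TrackedItem {
    private final ActivityId id;
    DemoComputer(ActivityId id) { this.id = id; }
    @Override public ActivityId getActivityId() { return id; }
}

// Correlates events by id, regardless of which object reported them.
final class ActivityTracker {
    private final Map<ActivityId, String> lastEvent = new HashMap<>();

    void record(TrackedItem item, String event) {
        lastEvent.put(item.getActivityId(), event);
    }

    String lastEvent(ActivityId id) {
        return lastEvent.get(id);
    }

    int trackedCount() {
        return lastEvent.size();
    }
}
```

Because the id travels with each object, it no longer matters in which order the core listeners fire or whether the slave gets renamed: events from the planned node and from the computer land on the same activity.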

At this point, any feedback is welcome!
--
oliver
