Re: Jenkins cloud health reporting

Oliver Gondža Fri, 08 Apr 2016 06:44:43 -0700

I have dug into this a bit more, and here is the general design I ended up with after a couple of iterations:

- The plugin tracks the N most recent provisioning activities. One such activity covers the whole lifecycle from provisioning to slave deletion.
- The activity has 4 hardcoded phases: provisioning, launching, operating, completed. Operating starts with the first successful launch and ends with slave deletion (it is the only productive phase). The activity is completed once the slave is gone, at which point it is effectively history.
- Each phase execution tracks its start time and a list of attachments. (I will reconsider making it actionable and using actions instead of attachments, as they are similar.) The attachment is extensible and can be a mere piece of HTML, a hyperlink, or a model object with a URL subspace. This is to attach and present any kind of information: logs, exceptions, etc.
- Each Attachment(/Action) has a state: ok, warn or fail. The worst of the attached states is propagated to the phase execution and activity level. (If a slave fails to launch, an exception will be attached explaining why the launch phase, and thus the whole activity, has failed.)
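To make the model above concrete, here is a minimal sketch in plain Java. All names in it (ActivitySketch, Phase, Status, Attachment, PhaseExecution, Activity) are made up for illustration and are not taken from Jenkins core or any existing plugin; it only shows the four phases and how the worst attachment state could bubble up to the phase and the activity.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the activity model; none of these names are a real plugin API. */
class ActivitySketch {

    /** The four hardcoded phases, in lifecycle order. */
    enum Phase { PROVISIONING, LAUNCHING, OPERATING, COMPLETED }

    /** Attachment states, declared from best to worst so they can be compared. */
    enum Status { OK, WARN, FAIL }

    /** Anything attached to a phase execution; reduced here to a title and a state. */
    static final class Attachment {
        final String title;
        final Status status;

        Attachment(String title, Status status) {
            this.title = title;
            this.status = status;
        }
    }

    /** One execution of a phase: its start time plus a list of attachments. */
    static final class PhaseExecution {
        final long startedMillis;
        final List<Attachment> attachments = new ArrayList<>();

        PhaseExecution(long startedMillis) {
            this.startedMillis = startedMillis;
        }

        /** The worst state among the attachments, or OK if there are none. */
        Status status() {
            Status worst = Status.OK;
            for (Attachment a : attachments) {
                if (a.status.compareTo(worst) > 0) {
                    worst = a.status;
                }
            }
            return worst;
        }
    }

    /** One provisioning activity, from provisioning to slave deletion. */
    static final class Activity {
        final Map<Phase, PhaseExecution> executions = new EnumMap<>(Phase.class);

        /** Records that the given phase has started and returns its execution. */
        PhaseExecution enter(Phase phase, long nowMillis) {
            PhaseExecution execution = new PhaseExecution(nowMillis);
            executions.put(phase, execution);
            return execution;
        }

        /** The worst state among all phase executions entered so far. */
        Status status() {
            Status worst = Status.OK;
            for (PhaseExecution e : executions.values()) {
                if (e.status().compareTo(worst) > 0) {
                    worst = e.status();
                }
            }
            return worst;
        }
    }
}
```

With this shape, attaching a FAIL attachment (say, a launch exception) to the launching phase is enough to mark both that phase and the whole activity as failed.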


While that sounds reasonable, there are a couple of problems:

1) Each phase is considered completed (as far as time measurement is concerned) when the next phase starts. This is caused by core extension points being called at rather random places. It is possible that launching starts or even completes (ComputerListener#onOnline) before provisioning is done (CloudProvisioningListener#onComplete). Each phase execution therefore needs to stay open for new attachments even after the next phase has started. (For example, attaching a provisioning log while launching has already started.)
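A rough sketch of what that means for the data model, again in plain Java with made-up names (PhaseTimingSketch, PhaseRecord, Timeline): the duration of a phase is measured up to the start of the next one, but nothing is ever "closed", so a late provisioning log can still be attached after launching has begun.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: phase timing is derived from the next phase's start. */
class PhaseTimingSketch {

    /** A started phase; it has no explicit end and never rejects attachments. */
    static final class PhaseRecord {
        final String name;
        final long startedMillis;
        final List<String> attachments = new ArrayList<>();

        PhaseRecord(String name, long startedMillis) {
            this.name = name;
            this.startedMillis = startedMillis;
        }

        /** Always allowed, even when a later phase has already started. */
        void attach(String attachment) {
            attachments.add(attachment);
        }
    }

    /** Phases of one activity in the order they were started. */
    static final class Timeline {
        private final List<PhaseRecord> phases = new ArrayList<>();

        PhaseRecord start(String name, long nowMillis) {
            PhaseRecord record = new PhaseRecord(name, nowMillis);
            phases.add(record);
            return record;
        }

        /**
         * Duration of the phase at the given index: up to the start of the
         * next phase, or up to nowMillis if no later phase has started yet.
         */
        long durationMillis(int index, long nowMillis) {
            long end = index + 1 < phases.size()
                    ? phases.get(index + 1).startedMillis
                    : nowMillis;
            return end - phases.get(index).startedMillis;
        }
    }
}
```

The trade-off is that the measured provisioning time can be shorter than the real one when a plugin reports launching early; the sketch does not try to correct for that.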

2) A provisioning activity can start without core's involvement (provisioning from the UI on the /computer page). In such cases, plugins will have to call a listener in cloud-stats-plugin to have this activity tracked.

3) There is no concept of templates in the core cloud API. However, templates are used by most cloud implementations and are valuable information for statistics.

4) Tracking the same activity as it goes through the PlannedNode/Computer/Slave phases turned out to be a lot trickier than I expected. I tried several approaches:

- Almost no plugin uses PlannedNode#displayName as the actual slave name, so it is of no use. Not to mention we would have to reflect slave renames.
- Calculating a fingerprint based on the PlannedNode's identity and attaching it to the Slave instance as a NodeProperty in CloudProvisioningListener#onComplete was the closest thing to a working solution I have got. The problem is that at the time it gets called, launching may already have started. Some plugins even wait for the launch to complete before leaving PlannedNode#future. (Plus, for whatever reason, the computer passed to ComputerListener#preLaunch might not have a Node assigned yet, which seems like a bug to me.)

Having said that, I see no other way but to require every plugin to implement a custom interface in PlannedNode (or its #future), computer and node to have the activity tracked correctly. However invasive this might be, it will remove problems #3 and #4 entirely.
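To illustrate the idea, here is a minimal sketch of such an interface in plain Java. Everything in it is hypothetical (Tracked, ActivityId, Stub, Tracker, and the field names); the stand-in Stub class plays the role of a plugin's planned node, computer or node, none of which are modelled here. The point is only that all three objects hand out the same id, which also carries the template name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a tracking interface; not an existing Jenkins API. */
class TrackingSketch {

    /** Identity shared by every object that belongs to one provisioning attempt. */
    static final class ActivityId {
        final String cloudName;
        final String templateName; // assumption: null when the cloud has no templates
        final String fingerprint;  // assumption: unique per provisioning attempt

        ActivityId(String cloudName, String templateName, String fingerprint) {
            this.cloudName = cloudName;
            this.templateName = templateName;
            this.fingerprint = fingerprint;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ActivityId)) return false;
            ActivityId other = (ActivityId) o;
            return Objects.equals(cloudName, other.cloudName)
                    && Objects.equals(templateName, other.templateName)
                    && Objects.equals(fingerprint, other.fingerprint);
        }

        @Override
        public int hashCode() {
            return Objects.hash(cloudName, templateName, fingerprint);
        }
    }

    /** What a plugin's planned node, computer and node would each implement. */
    interface Tracked {
        ActivityId getActivityId();
    }

    /** Stand-in for any of the three plugin-side objects. */
    static final class Stub implements Tracked {
        private final ActivityId id;

        Stub(ActivityId id) {
            this.id = id;
        }

        @Override
        public ActivityId getActivityId() {
            return id;
        }
    }

    /** Keeps the latest phase per activity, keyed by the shared id. */
    static final class Tracker {
        private final Map<ActivityId, String> phaseById = new HashMap<>();

        void onEvent(Tracked item, String phase) {
            phaseById.put(item.getActivityId(), phase);
        }

        String currentPhase(ActivityId id) {
            return phaseById.get(id);
        }

        int trackedActivities() {
            return phaseById.size();
        }
    }
}
```

Because the id travels with the objects themselves, the tracker no longer depends on display names or on the order in which core calls its listeners.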


At this point, any feedback welcome!
--
oliver

