I was just looking at our revert job. This is what we have found works pretty 
reliably. We take our snapshots with the VM powered down.

Issue the revert-to-snapshot command.

Wait for Jenkins to notice that the slave is gone:
java -jar C:\Tools\BuildTools\jenkins-cli.jar -s http://xxxx.net:8082/ wait-node-offline %VMNAME%

Power up the VM.

Wait for Jenkins to see the slave again:
java -jar C:\Tools\BuildTools\jenkins-cli.jar -s http://xxxxxx.net:8082/ wait-node-online %VMNAME%
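
The steps above can be sketched as one small script. This is only a rough 
illustration, not our actual job: the URL is a placeholder, and the revert and 
power_on callables are hypothetical hooks standing in for whatever your 
hypervisor tooling provides. Only the jar path and the two wait-node commands 
are taken from the lines above.

```python
import subprocess

# Assumed values: adjust for your installation. The jar path matches the
# commands above; the URL is a placeholder, not a real server.
JENKINS_CLI_JAR = r"C:\Tools\BuildTools\jenkins-cli.jar"
JENKINS_URL = "http://jenkins.example.net:8082/"


def cli_command(action, node):
    """Build the jenkins-cli argument list for one wait command."""
    return ["java", "-jar", JENKINS_CLI_JAR, "-s", JENKINS_URL, action, node]


def revert_and_wait(node, revert, power_on, run=subprocess.check_call):
    """Revert to a powered-down snapshot, then block until Jenkins has seen
    the node go offline and come back online.

    revert and power_on are hypothetical callables supplied by the caller;
    they wrap whatever the hypervisor tooling offers.
    """
    revert(node)                                 # VM ends up powered off
    run(cli_command("wait-node-offline", node))  # Jenkins notices the node is gone
    power_on(node)                               # boot the reverted VM
    run(cli_command("wait-node-online", node))   # Jenkins sees the node again
```

Because each wait command blocks until Jenkins reports the state change, the 
downstream job is not started against a stale connection and no fixed snooze 
time has to be guessed.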

From: jenkinsci-users@googlegroups.com 
[mailto:jenkinsci-users@googlegroups.com] On Behalf Of Dale Quigg
Sent: Wednesday, November 12, 2014 12:39 PM
To: jenkinsci-users@googlegroups.com
Subject: How long does it take Jenkins to realize slaves are offline?

Hi,

For our nightly testing, we revert slaves to a VMware snapshot before starting 
jobs.

Most of the time this works, but sometimes the first job after the revert job 
fails.

The weird part is that a downstream job that runs on the same slave
will often succeed less than two minutes after the upstream failure.

I'm guessing that Jenkins thinks the slave is still online and rejecting the 
connection and/or not connecting.

My idea for a work-around is to add a "snooze" time after the revert.  This 
way, Jenkins can figure out that the slave is offline.

So, what amount of time is necessary for Jenkins to realize that a slave is 
offline?
5, 10, ?? minutes?

Note that for non-Windows slaves, I'm not disconnecting the slave prior to 
revert.  I had a "good reason" for doing this at the time, but cannot remember 
it.

Below are some snippets of console output showing my situation.

Running Jenkins ver. 1.509.1

Thanks,
Dale

------------------------------------
http://jenkins/job/Job-A/549/console
------------------------------------
21:45:38 NOT Disconnecting Hudson client before powering on machine: 
AT03-cent5.  (non-Windows slave)
21:45:38 Revert vm 'AT03-cent5' to snapshot '2014-11-03'.
21:45:38 Power on vm AT03-cent5.
21:45:38 Spent 4.327958 seconds powering on.
21:45:38 Snooze an extra 180 seconds to allow boot completion...
<snip>
21:48:40 Finished: SUCCESS

-------------------------------------
http://jenkins/job/Job-B/1109/console
-------------------------------------
21:48:46 Started by upstream project "Job-A" build number 549
<snip>
21:48:49 Building remotely on AT03-cent5 in workspace 
/opt/hudson/workspace/Job-A
21:51:28 FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: 
Unexpected termination of the channel
<snip>

----------------------------------------
http://jenkins/job/Job-C/740/consoleFull
----------------------------------------
21:52:51 Started by upstream project "Job-B" build number 1109
21:52:54 Building remotely on AT03-cent5 in workspace 
/opt/hudson/workspace/Job-C
<snip>
22:35:13 Finished: SUCCESS


--
Dale Quigg
Revolution Analytics, Inc.


--
You received this message because you are subscribed to the Google Groups 
"Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
jenkinsci-users+unsubscr...@googlegroups.com<mailto:jenkinsci-users+unsubscr...@googlegroups.com>.
For more options, visit https://groups.google.com/d/optout.
