[ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=162845#comment-162845 ]

Jason Swager commented on JENKINS-13735:
----------------------------------------

I've been encountering the same problem.  I thought it was in the code of the 
vSphere Plugin, but it turns out that it's not.  Jenkins is issuing connect() 
calls on slaves that, judging by the queued jobs I can see, have no reason to 
be starting up.
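
For reference, below is roughly the check I would expect to gate that connect()
call: only start a slave if some buildable item in the queue could actually run
on it.  This is a minimal sketch against the Jenkins core queue/label API as I
understand it, not a patch; treat the exact calls as my assumption.

    import hudson.model.Label;
    import hudson.model.Node;
    import hudson.model.Queue;
    import jenkins.model.Jenkins;

    public class SlaveStartupCheck {
        /**
         * True if at least one buildable item in the queue could actually run
         * on the given node, i.e. the node has a reason to be powered on.
         */
        public static boolean hasWorkFor(Node node) {
            for (Queue.BuildableItem item
                    : Jenkins.getInstance().getQueue().getBuildableItems()) {
                Label required = item.getAssignedLabel();
                if (required == null) {
                    return true;       // unrestricted job: any node will do
                }
                if (required.contains(node)) {
                    return true;       // restriction matches this node
                }
            }
            return false;              // nothing queued that this node can take
        }
    }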

Part of the problem IS the vSphere Plugin itself.  Originally, when a job was 
fired up, any slave that was down and could run the job would be started by the 
vSphere Plugin, because the connect() method would get called on all of those 
slaves; that resulted in a large number of VMs being powered on for a single 
job.  I added code to the plugin to throttle that behavior.  Unfortunately, the 
throttling is making this problem worse.  Whereas originally the slaves for jA, 
jB, and jC might all have been started up, the slave for jC now only MIGHT get 
started, because the vSphere plugin throttles the VM startups.
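
To make the throttling concrete: the idea is simply to cap how many VMs one
pass is allowed to power on, so a single queued job no longer wakes every
matching slave.  The sketch below illustrates that approach with made-up names;
it is not the actual plugin code.

    import java.util.concurrent.Semaphore;

    /**
     * Illustrative throttle: limit how many vSphere VMs may be powering on at
     * the same time.  Class name and limit are invented for the example.
     */
    public class VmStartupThrottle {
        private final Semaphore slots;

        public VmStartupThrottle(int maxConcurrentPowerOns) {
            this.slots = new Semaphore(maxConcurrentPowerOns);
        }

        /** Call before powering a VM on; false means skip the startup for now. */
        public boolean tryAcquireSlot() {
            return slots.tryAcquire();
        }

        /** Call once the VM is up (or the attempt failed) to free the slot. */
        public void releaseSlot() {
            slots.release();
        }
    }

The failure mode falls straight out of this: if the one available slot goes to
slave A, the slave the queued job actually needs never gets its turn until A
has idled out and been shut down again.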

Initial investigation seems to indicate that the Slave.canTake() method might 
not be behaving as expected.  If I find anything further during my 
investigation, I'll post it here.
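
For context, this is the behaviour I expect canTake() to enforce, reduced to a
standalone model of the reporter's setup (plain Java, no Jenkins classes,
purely to show the intended outcome):

    import java.util.Arrays;
    import java.util.List;

    /** Standalone model of the label restriction: jC is tied to slave C. */
    public class CanTakeModel {

        // A job with no restriction can run anywhere; a restricted job may
        // only run on a slave whose label satisfies the restriction.
        static boolean canTake(String slaveLabel, String jobRestriction) {
            return jobRestriction == null || jobRestriction.equals(slaveLabel);
        }

        public static void main(String[] args) {
            List<String> slaves = Arrays.asList("A", "B", "C");
            String jobRestriction = "C";  // jC: "Restrict where this job can run" = C

            for (String slave : slaves) {
                System.out.printf("slave %s canTake(jC) = %b%n",
                        slave, canTake(slave, jobRestriction));
            }
            // Expected: only slave C reports true, so only slave C is worth
            // powering on for jC.  The reported bug is that A or B gets
            // powered on instead.
        }
    }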
                
> Jenkins starts wrong slave for job restricted to specific one
> -------------------------------------------------------------
>
>                 Key: JENKINS-13735
>                 URL: https://issues.jenkins-ci.org/browse/JENKINS-13735
>             Project: Jenkins
>          Issue Type: Bug
>          Components: master-slave, slave-setup, vsphere-cloud
>    Affects Versions: current
>         Environment: Jenkins 1.463 under Tomcat6 on Linux (SLES 11), Windows 
> XP slave VMs controlled via vSphere Cloud plugin
>            Reporter: Marco Lehnort
>            Assignee: Kohsuke Kawaguchi
>              Labels: slave
>
> I'm using the following setup:
> - WinXP slaves A, B, C
> - jobs jA, jB, jC, tied to slaves A, B, C respectively using "Restrict where this job can run"
> Assume all slaves are disconnected and powered off, no builds are queued.
> When starting a build manually, say jC, the following will happen:
> - job jC will be scheduled and also displayed accordingly in the build queue
> - tooltip will say it's waiting because slave C is offline
> - next, slave A is powered on by Jenkins and connection is established
> - jC will not be started, Jenkins seems to honor the restriction correctly
> - after some idle time, Jenkins notices the slave is idle and shuts it down
> - then, same procedure happens with slave B
> - on occasion, next one is slave A again
> - finally (on good luck?) slave C happens to be started
> - jC is executed
> It is possible that jC waits for hours (indefinitely?), because the required
> slave is never powered on. I also observed this behaviour using a time trigger
> instead of a manual trigger, so I assume it is independent of the type of
> trigger.
> Occasionally it also happens that the correct slave is powered up right away,
> but that seems to happen by chance. The concrete pattern is not obvious to me.
> Note that the component selection above is just my best guess.
> Cheers, Marco

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.jenkins-ci.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
