[JIRA] (JENKINS-37483) Deadlock caused by synchronized methods in EC2Cloud

2018-03-12 Thread franc...@gmail.com (JIRA)
Francis Upton closed an issue as Fixed

Jenkins / JENKINS-37483
Deadlock caused by synchronized methods in EC2Cloud

Change By: Francis Upton
Status: Resolved → Closed

This message was sent by Atlassian JIRA (v7.3.0#73011-sha1:3c73d0e)

-- 
You received this message because you are subscribed to the Google Groups "Jenkins Issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[JIRA] (JENKINS-37483) Deadlock caused by synchronized methods in EC2Cloud

2017-02-24 Thread randall.ra...@gmail.com (JIRA)
Randall Raboy commented on JENKINS-37483

Re: Deadlock caused by synchronized methods in EC2Cloud

I am seeing the same deadlock in our setup (JVM-internal frames omitted):

Handling POST /cloud/ec2-us-west-2/provision from 172.16.6.210 : RequestHandlerThread[#969] - threadId:76774 - state:WAITING
stackTrace:
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- waiting to lock  (a java.util.concurrent.locks.ReentrantLock$NonfairSync) owned by "jenkins.util.Timer [#3]" t@36
...
at hudson.model.Queue._withLock(Queue.java:1307)
at hudson.model.Queue.withLock(Queue.java:1186)
at jenkins.model.Nodes.removeNode(Nodes.java:237)
at jenkins.model.Jenkins.removeNode(Jenkins.java:2084)
at hudson.plugins.ec2.EC2Cloud.countCurrentEC2Slaves(EC2Cloud.java:420)
at hudson.plugins.ec2.EC2Cloud.getPossibleNewSlavesCount(EC2Cloud.java:499)
at hudson.plugins.ec2.EC2Cloud.getNewOrExistingAvailableSlave(EC2Cloud.java:518)
- locked <65f5826a> (a hudson.plugins.ec2.AmazonEC2Cloud)
at hudson.plugins.ec2.EC2Cloud.doProvision(EC2Cloud.java:340)
...
Locked ownable synchronizers:
- locked <112a6eb5> (a java.util.concurrent.ThreadPoolExecutor$Worker)

jenkins.util.Timer [#3] - threadId:36 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED
at hudson.plugins.ec2.EC2Cloud.connect(EC2Cloud.java:634)
- waiting to lock <65f5826a> (a hudson.plugins.ec2.AmazonEC2Cloud) owned by "Handling POST /cloud/ec2-us-west-2/provision from 172.16.6.210 : RequestHandlerThread[#969]" t@76774
at hudson.plugins.ec2.EC2AbstractSlave.getInstance(EC2AbstractSlave.java:277)
at hudson.plugins.ec2.EC2AbstractSlave.fetchLiveInstanceData(EC2AbstractSlave.java:429)
at hudson.plugins.ec2.EC2AbstractSlave.isAlive(EC2AbstractSlave.java:397)
at hudson.plugins.ec2.EC2SpotSlave.terminate(EC2SpotSlave.java:73)
at hudson.plugins.ec2.EC2AbstractSlave.idleTimeout(EC2AbstractSlave.java:344)
at hudson.plugins.ec2.EC2RetentionStrategy.internalCheck(EC2RetentionStrategy.java:136)
at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:85)
at hudson.plugins.ec2.EC2RetentionStrategy.check(EC2RetentionStrategy.java:43)
at hudson.slaves.ComputerRetentionWork$1.run(ComputerRetentionWork.java:72)
at hudson.model.Queue._withLock(Queue.java:1309)
at hudson.model.Queue.withLock(Queue.java:1186)
at hudson.slaves.ComputerRetentionWork.doRun(ComputerRetentionWork.java:63)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:50)
...
Locked ownable synchronizers:
- locked  (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
 

I noticed that this deadlock only happens when I switch from on-demand instances to spot requests; on-demand works fine. I also see the same deadlock when using the EC2 Fleet plugin.

Jenkins version: 2.32.2
EC2 plugin: 1.36
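
For readers following the dump above: the request thread holds the cloud object's monitor and waits for the Queue lock, while the timer thread holds the Queue lock and waits for the cloud object's monitor. The sketch below is only a model of that lock-order inversion, not plugin code. `LockOrderSketch`, `queueLock`, `cloudLock`, `attempt` and `run` are hypothetical names, and both locks are modelled as `ReentrantLock` with a timed `tryLock` so the demonstration cannot hang the way the real threads do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the two locks seen in the thread dump; not plugin code.
class LockOrderSketch {
    // Stands in for the lock taken inside hudson.model.Queue.withLock(...).
    static final ReentrantLock queueLock = new ReentrantLock();
    // Stands in for the monitor of the cloud object (its synchronized methods).
    static final ReentrantLock cloudLock = new ReentrantLock();

    // Take `first`, then try `second` with a timeout so the demo cannot hang.
    static void attempt(ReentrantLock first, ReentrantLock second,
                        CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                        AtomicInteger successes) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();      // inverted case: wait until each thread holds one lock
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                successes.incrementAndGet();
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();          // keep `first` held until both attempts are over
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    // Returns how many of the two threads managed to hold both locks.
    static int run(boolean inverted) {
        int parties = inverted ? 2 : 0; // the latches are no-ops in the ordered case
        CountDownLatch bothHoldFirst = new CountDownLatch(parties);
        CountDownLatch bothTried = new CountDownLatch(parties);
        AtomicInteger successes = new AtomicInteger();

        // "Provisioning" thread: cloud then queue when inverted, queue then cloud otherwise.
        Thread provision = new Thread(() -> attempt(
                inverted ? cloudLock : queueLock,
                inverted ? queueLock : cloudLock,
                bothHoldFirst, bothTried, successes));
        // "Retention timer" thread: always queue then cloud.
        Thread retention = new Thread(() -> attempt(
                queueLock, cloudLock, bothHoldFirst, bothTried, successes));

        provision.start();
        retention.start();
        try {
            provision.join();
            retention.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return successes.get();
    }

    public static void main(String[] args) {
        System.out.println("inverted order, threads that got both locks: " + run(true));
        System.out.println("single order, threads that got both locks: " + run(false));
    }
}
```

With the inverted order neither thread can obtain its second lock, which is the cycle shown in the dump. With a single agreed order (Queue lock first, then the cloud lock) both threads complete. Whether the plugin can adopt such an ordering depends on its call paths, which this sketch does not model.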

[JIRA] (JENKINS-37483) Deadlock caused by synchronized methods in EC2Cloud

2016-08-17 Thread todd.r...@urjanet.com (JIRA)
Todd Rose commented on JENKINS-37483

Re: Deadlock caused by synchronized methods in EC2Cloud

https://github.com/jenkinsci/ec2-plugin/pull/214

This message was sent by Atlassian JIRA (v7.1.7#71011-sha1:2526d7c)



[JIRA] (JENKINS-37483) Deadlock caused by synchronized methods in EC2Cloud

2016-08-17 Thread todd.r...@urjanet.com (JIRA)
Todd Rose commented on JENKINS-37483

Re: Deadlock caused by synchronized methods in EC2Cloud

 I think the quickest fix for this is to make the non-static connect() method synchronize on the class object. connect() is really the only thing that I can see that can be invoked from a lot of different contexts and threads.  
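
A minimal sketch of what this suggestion could look like, assuming a simplified stand-in class. `CloudSketch`, its `client` field and the helper `connectCompletesWhileInstanceLocked` are hypothetical names for illustration. This is not the plugin's code and not necessarily what the linked pull request implements. The idea is that `connect()` no longer needs the instance monitor, so a thread stuck inside a synchronized instance method cannot block it.

```java
// Hypothetical stand-in for the cloud class; names are illustrative only.
class CloudSketch {
    // Stands in for the lazily created EC2 client.
    private Object client;

    // Still guarded by the instance monitor, like the synchronized methods in the report.
    synchronized int provision() {
        return 1;
    }

    // Guarded by the class object instead of `this`.
    Object connect() {
        synchronized (CloudSketch.class) {
            if (client == null) {
                client = new Object();
            }
            return client;
        }
    }

    // Checks that connect() finishes while another thread holds the instance monitor.
    static boolean connectCompletesWhileInstanceLocked() {
        CloudSketch cloud = new CloudSketch();
        Object[] result = new Object[1];
        Thread caller = new Thread(() -> result[0] = cloud.connect());
        synchronized (cloud) {          // emulate a thread inside a synchronized instance method
            caller.start();
            try {
                caller.join(2000);      // join waits on the Thread object, so `cloud` stays locked
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return !caller.isAlive() && result[0] != null;
        }
    }

    public static void main(String[] args) {
        System.out.println("connect() completed: " + connectCompletesWhileInstanceLocked());
    }
}
```

One trade-off worth noting: a class-level lock serializes `connect()` across every instance of the class, so a dedicated private lock object per instance may be preferable if several clouds are configured.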


[JIRA] (JENKINS-37483) Deadlock caused by synchronized methods in EC2Cloud

2016-08-17 Thread todd.r...@urjanet.com (JIRA)
Todd Rose created an issue

Jenkins / JENKINS-37483
Deadlock caused by synchronized methods in EC2Cloud

Issue Type: Bug
Assignee: Francis Upton
Components: ec2-plugin
Created: 2016/Aug/17 8:02 PM
Labels: plugin
Priority: Blocker
Reporter: Todd Rose

This is against 1.35.

EC2Cloud.java has several synchronized methods that can be called from various timers. getNewOrExistingAvailableSlave() and connect() are the problematic ones in this case. Our installation relies heavily on the spot market, and we have a large number of nodes in our fleet. Under load you can easily get into a situation where one thread is terminating an instance while another is trying to provision a new one.

The liberal use of synchronized methods in EC2Cloud is not safe. A finer-grained locking strategy, or a move to a lockless strategy, is advisable.

{{
T1 "Handling POST /view/Adhoc/job/admin_FailedSourceReplayRunner/build from xxx.xx.xxx.xx : RequestHandlerThread2247" – parking to wait for <0x00060090c078> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) which is held by T2 "EC2 alive slaves monitor thread"

"Handling POST /view/Adhoc/job/admin_FailedSourceReplayRunner/build from xxx.xx.xxx.xx : RequestHandlerThread2247":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00060090c078> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at