[JIRA] (OVIRT-296) [jenkins] take offline faulty bad slaves

2016-12-22 Thread eyal edri [Administrator] (oVirt JIRA)

 [ 
https://ovirt-jira.atlassian.net/browse/OVIRT-296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

eyal edri [Administrator] reassigned OVIRT-296:
---

Assignee: Evgheni Dereveanchin  (was: infra)

> [jenkins] take offline faulty bad slaves
> 
>
> Key: OVIRT-296
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-296
> Project: oVirt - virtualization made easy
> Issue Type: Task
> Components: Jenkins
> Affects Versions: Test
> Reporter: eyal edri [Administrator]
> Assignee: Evgheni Dereveanchin
> Labels: jenkins, monitoring
>
> Quite often we hit an issue with a specific slave on phx, for various 
> reasons (out of space, git, network, etc.), which leads to multiple jobs 
> trying to run on it and failing.
> We need an automated way of finding such slaves.
> Proposal:
> Add a Groovy post-build step to jobs that takes a slave offline if it 
> misbehaves, using:
> manager.build.getBuiltOn().toComputer().setTemporarilyOffline(true)
> The trick is to find such a slave and to know whether it has failed 
> consistently in the past X hours, to justify disabling it.
> We need some sort of counter or service that tracks slaves and their error 
> state and takes a specific slave offline accordingly.
> For example:
> if a slave failed x jobs in Y time and the runtime of each was < Z min, it 
> might indicate such a problem
> (e.g. 10 jobs failed on the same slave within a timeframe of 5 min and 
> each job's runtime was less than 1 min).
> The post-build script should email infra@ovirt.org that it disabled a slave 
> and that we should look into it.
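
The counting heuristic described above (x failed jobs within Y time, each running for less than Z minutes) could be sketched roughly as follows. This is a minimal, language-neutral sketch in Python, not the Groovy post-build script itself. `Build`, `FaultyNodeDetector` and `record` are hypothetical names, and the default thresholds are taken from the example in the description. Resetting the counter on a successful build is an assumption about what "consistently" means, and builds are assumed to be recorded in order of completion.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    """One finished build (hypothetical record, not a Jenkins type)."""
    node: str           # name of the slave the build ran on
    finished_at: float  # completion time in seconds
    duration: float     # build runtime in seconds
    failed: bool


class FaultyNodeDetector:
    """Flags a node after `max_failures` quick failures within `window` seconds."""

    def __init__(self, max_failures=10, window=300.0, max_duration=60.0):
        self.max_failures = max_failures  # "x jobs"
        self.window = window              # "Y time", in seconds
        self.max_duration = max_duration  # "Z min", in seconds
        self._quick_failures = defaultdict(deque)  # node -> failure timestamps

    def record(self, build):
        """Record one finished build; return True if its node looks faulty."""
        recent = self._quick_failures[build.node]
        if not build.failed:
            # Assumption: a success means the node is not failing consistently.
            recent.clear()
            return False
        if build.duration >= self.max_duration:
            # Slow failures are more likely real test failures; ignore them.
            return False
        recent.append(build.finished_at)
        # Drop failures that have fallen out of the sliding window.
        while recent and build.finished_at - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.max_failures
```

When `record` returns True, the post-build step would take the slave offline and email infra@ovirt.org. That part depends on the Jenkins API and is left out of the sketch.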



--
This message was sent by Atlassian JIRA
(v1000.621.5#100023)
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


[JIRA] (OVIRT-296) [jenkins] take offline faulty bad slaves

2016-12-08 Thread eyal edri [Administrator] (oVirt JIRA)

 [ 
https://ovirt-jira.atlassian.net/browse/OVIRT-296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

eyal edri [Administrator] updated OVIRT-296:

Priority: Medium  (was: Highest)

> [jenkins] take offline faulty bad slaves
> 
>
> Key: OVIRT-296
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-296
> Project: oVirt - virtualization made easy
> Issue Type: Task
> Components: Jenkins
> Affects Versions: Test
> Reporter: eyal edri [Administrator]
> Assignee: infra
> Labels: jenkins, monitoring





[JIRA] (OVIRT-296) [jenkins] take offline faulty bad slaves

2016-02-10 Thread eyal edri [Administrator] (oVirt JIRA)

eyal edri [Administrator] updated an issue:

oVirt - virtualization made easy / OVIRT-296
[jenkins] take offline faulty bad slaves

Change By: eyal edri [Administrator]
Priority: Highest
