[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-14 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Kartik Mistry  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Kartik Mistry  ---
https://cxserver-beta.wmflabs.org/ is built fine (deployment-cxserver03), so
closing this now.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783
Bug 71783 depends on bug 71873, which changed state.

Bug 71873 Summary: role::labs::lvm::mnt ends up with make-instance-vg: failed 
to create new partition
https://bugzilla.wikimedia.org/show_bug.cgi?id=71873

   What|Removed |Added

 Status|PATCH_TO_REVIEW |RESOLVED
 Resolution|--- |FIXED



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #13 from Antoine "hashar" Musso  ---
(In reply to Yuvi Panda from comment #12)
> So puppet fails on cxserver02 because it tries to create a lvm volume and
> fails (/mnt, I think), leading to cascading failures (among which this is
> one, I presume). ^d ran into the same problem on his new ES box there as
> well, I think.
> 
> I'll investigate in a bit, but andrewbogott/coren/others feel free to take
> this as well...

Yup, I have filed it as another bug for Labs > Infrastructure:

https://bugzilla.wikimedia.org/show_bug.cgi?id=71873



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #12 from Yuvi Panda  ---
So puppet fails on cxserver02 because it tries to create a lvm volume and fails
(/mnt, I think), leading to cascading failures (among which this is one, I
presume). ^d ran into the same problem on his new ES box there as well, I
think.

I'll investigate in a bit, but andrewbogott/coren/others feel free to take this
as well...



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Greg Grossmeier  changed:

   What|Removed |Added

 CC||yuvipa...@gmail.com

--- Comment #11 from Greg Grossmeier  ---
(In reply to Antoine "hashar" Musso from comment #10)
> (In reply to Greg Grossmeier from comment #9)
> > CRITICAL:
> > deployment-prep.deployment-cxserver02.puppetagent.failed_events.value
> > (100.00%)
> 
> Yup that is due to this bug.   I wanted to acknowledge the alarm, but since
> the monitor is on the production Icinga I lack permissions to do so.

Yuvi: Halp? How should we address this?



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #10 from Antoine "hashar" Musso  ---
(In reply to Greg Grossmeier from comment #9)
> CRITICAL:
> deployment-prep.deployment-cxserver02.puppetagent.failed_events.value
> (100.00%)

Yup that is due to this bug.   I wanted to acknowledge the alarm, but since the
monitor is on the production Icinga I lack permissions to do so.



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Greg Grossmeier  changed:

   What|Removed |Added

   Priority|Unprioritized   |Normal



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #9 from Greg Grossmeier  ---
CRITICAL: deployment-prep.deployment-cxserver02.puppetagent.failed_events.value
(100.00%)



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Antoine "hashar" Musso  changed:

   What|Removed |Added

 Depends on||71873



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Nemo  changed:

   What|Removed |Added

 CC||federicol...@tiscali.it

--- Comment #8 from Nemo  ---
(Context:

> The virt1005 compute node died overnight, might explain the issue.

https://lists.wikimedia.org/pipermail/labs-l/2014-October/002982.html )



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #7 from Antoine "hashar" Musso  ---
Kart confirmed we can get rid of the instance.  Since beta cluster is out of
quota, that is convenient.



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #6 from Antoine "hashar" Musso  ---
Creating an instance deployment-cxserver02:

Size: m1.medium
OS: Ubuntu Trusty
Security rules: default, cxserver

I.e. the same as deployment-cxserver01 used to be.



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Antoine "hashar" Musso  changed:

   What|Removed |Added

 CC||kartik.mis...@gmail.com,
   ||niklas.laxst...@gmail.com

--- Comment #5 from Antoine "hashar" Musso  ---
(In reply to Andrew Bogott from comment #4)
> If this instance has any important data I can try to reclaim the drive
> contents.  Otherwise you should just delete and recreate.

Thank you for the time spent on investigating the issue.

I will check with the cxserver folks (Kartik and Niklas, added to cc) and see
whether they need any data.  Else we will recreate it and update relevant
configuration files.  It is fully puppetized AFAIK.



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #4 from Andrew Bogott  ---
If this instance has any important data I can try to reclaim the drive
contents.  Otherwise you should just delete and recreate.



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

--- Comment #3 from Antoine "hashar" Musso  ---
Link to instance information:
https://wikitech.wikimedia.org/wiki/Nova_Resource:I-0421.eqiad.wmflabs



[Bug 71783] Jenkins can not ssh to deployment-cxserver01 (hosted by virt1005)

2014-10-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=71783

Antoine "hashar" Musso  changed:

       What|Removed                 |Added

         CC|                        |abog...@wikimedia.org,
           |                        |m...@uberbox.org,
           |                        |rlan...@gmail.com
  Component|deployment-prep (beta)  |Infrastructure
    Summary|Jenkins can not ssh to  |Jenkins can not ssh to
           |deployment-cxserver01   |deployment-cxserver01
           |                        |(hosted by virt1005)

--- Comment #2 from Antoine "hashar" Musso  ---
The instance is hosted on virt1005 which died overnight.  I have marked the
Jenkins slave as offline:
https://integration.wikimedia.org/ci/computer/deployment-cxserver01/

I attempted to reboot it via OpenStackManager but it does not come back.  I
guess the VM is corrupted.

Impact:
* content translation server is not running for beta cluster
* code updates for content translation servers are obviously not pushed :D

Moving to Infrastructure (corrupted VM apparently)
