Hi all,

I noticed that nodepool was failing to build, out of space again.  We
haven't had a build in about 3 days.

Unlike last time, there wasn't anything to clean up in the cache; it
all seemed to be images.

---
ianw@nodepool:/opt$ sudo du -sh ./*/
16G     ./dib_cache/
12G     ./dib_tmp/
704K    ./gear/
16K     ./lost+found/
7.2M    ./nodepool/
914G    ./nodepool_dib/
66M     ./system-config/
5.6G    ./test_images/
---

The image list at the time I started looked like this:

nodepool@nodepool:~$ nodepool dib-image-list
2016-11-06 23:41:11,267 INFO gear.Connection.nodepool: Disconnected from zuul.openstack.org port 4730
2016-11-06 23:41:11,311 INFO gear.Connection.nodepool: Connected to zuul.openstack.org port 4730
+------+----------------+---------------------------------------------+------------+-------+-------------+
| ID   | Image          | Filename                                    | Version    | State | Age         |
+------+----------------+---------------------------------------------+------------+-------+-------------+
| 1357 | centos-7       | /opt/nodepool_dib/centos-7-1477305240       | 1477305240 | ready | 13:08:03:05 |
| 1415 | centos-7       | /opt/nodepool_dib/centos-7-1478169240       | 1478169240 | ready | 03:08:42:23 |
| 1355 | debian-jessie  | /opt/nodepool_dib/debian-jessie-1477305240  | 1477305240 | ready | 13:09:21:24 |
| 1413 | debian-jessie  | /opt/nodepool_dib/debian-jessie-1478169240  | 1478169240 | ready | 03:09:58:31 |
| 1411 | fedora-23      | /opt/nodepool_dib/fedora-23-1478169240      | 1478169240 | ready | 03:11:41:23 |
| 1418 | fedora-23      | /opt/nodepool_dib/fedora-23-1478255640      | 1478255640 | ready | 02:11:38:16 |
| 1354 | fedora-24      | /opt/nodepool_dib/fedora-24-1477305240      | 1477305240 | ready | 13:10:35:50 |
| 1361 | fedora-24      | /opt/nodepool_dib/fedora-24-1477391640      | 1477391640 | ready | 12:10:28:52 |
| 1342 | ubuntu-precise | /opt/nodepool_dib/ubuntu-precise-1477132440 | 1477132440 | ready | 15:07:10:03 |
| 1349 | ubuntu-precise | /opt/nodepool_dib/ubuntu-precise-1477218840 | 1477218840 | ready | 14:07:24:06 |
| 1344 | ubuntu-trusty  | /opt/nodepool_dib/ubuntu-trusty-1477132440  | 1477132440 | ready | 15:04:45:19 |
| 1416 | ubuntu-trusty  | /opt/nodepool_dib/ubuntu-trusty-1478169240  | 1478169240 | ready | 03:06:59:33 |
| 1345 | ubuntu-xenial  | /opt/nodepool_dib/ubuntu-xenial-1477132440  | 1477132440 | ready | 15:03:23:41 |
| 1417 | ubuntu-xenial  | /opt/nodepool_dib/ubuntu-xenial-1478169240  | 1478169240 | ready | 03:05:01:40 |
+------+----------------+---------------------------------------------+------------+-------+-------------+

There were a lot of leftover builds in /opt/nodepool_dib; I've dumped
a list of them into /opt/nodepool_dib/ianw-cleanup-2016-11.07.txt

I removed all the old builds listed in that file (i.e. all builds not
listed above).  This got us to a usable amount of free space:

 /dev/mapper/main-nodepoolbuild 1008G  579G  430G  58% /opt
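
For anyone repeating this kind of cleanup, the comparison between the
image list and the directory contents could be scripted rather than done
by eye.  The following is only a rough sketch, not necessarily how it was
done here.  It assumes the dib-image-list output has been captured with
its table rows unwrapped, and that every artifact of a build is named
with the listed Filename followed by a dot and an extension.  Both
function names are made up for illustration.  It only reports
candidates; it deletes nothing.

```python
def parse_listed_filenames(table_text):
    """Collect the Filename column from captured dib-image-list output."""
    listed = set()
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows: | ID | Image | Filename | Version | State | Age |
        # Border rows, the header row and log lines fail this check.
        if len(cells) >= 5 and cells[1].isdigit():
            listed.add(cells[3])
    return listed


def find_leftover_builds(listed, entries):
    """Return the entries that do not belong to any listed build.

    Assumption: a build's files are named <Filename>.<extension>, so an
    entry is current if it equals a listed name or extends it with a dot.
    """
    def is_current(path):
        return any(path == p or path.startswith(p + ".") for p in listed)

    return sorted(e for e in entries if not is_current(e))
```

In practice `entries` would be the full paths of everything under
/opt/nodepool_dib (for example built with os.listdir and os.path.join),
and the result should be reviewed by hand before removing anything,
since unrelated files in that directory will also show up in it.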

I then noticed that nodepool was stuck building *a lot* of old images

 nodepool@nodepool:/opt/nodepool_dib$ nodepool image-list | grep building | wc -l
 826

I went through and did an image-delete on each of these building
instances to clear things out.  I have started some image builds now
to see what the deal is.  I will keep an eye on them.
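
With several hundred entries, the image-delete pass is easier to run
from a script.  The sketch below shows one possible way to do it, under
two assumptions that should be checked against the real output first:
that the first column of the image-list table is the numeric ID, and
that image-delete takes that ID as its only argument.  The function
names are hypothetical, and by default it only prints the commands it
would run.

```python
import subprocess


def building_image_ids(table_text):
    """Return the IDs of table rows that have a cell equal to 'building'."""
    ids = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Assumption: cells[1] is the ID column of a data row.
        if len(cells) > 2 and cells[1].isdigit() and "building" in cells[2:]:
            ids.append(cells[1])
    return ids


def delete_images(ids, dry_run=True):
    """Run (or, by default, just print) one image-delete per ID."""
    for image_id in ids:
        cmd = ["nodepool", "image-delete", image_id]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Feeding it the captured output of "nodepool image-list" and leaving
dry_run at True gives a list of commands to review before running
anything for real.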

It's currently very hard to debug the upload process.  I'll soon
propose some changes to split the upload logs out into per-provider
log files, similar to the way we split the build logs out into
separate files.  I think this will help to diagnose issues on
specific providers much more quickly.
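
To give a feel for the idea, here is a minimal sketch of per-provider
log files using Python's standard logging module.  It is not the
proposed change itself; the logger naming scheme, the function name and
the log directory layout are all assumptions made for illustration.

```python
import logging
import os


def get_upload_logger(provider, log_dir):
    """Return a logger that writes to <log_dir>/<provider>.log.

    Hypothetical helper: one logger and one file per provider, so
    messages for different providers do not interleave.
    """
    logger = logging.getLogger("upload." + provider)
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        handler = logging.FileHandler(os.path.join(log_dir, provider + ".log"))
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        # Keep upload messages out of the main log.
        logger.propagate = False
    return logger
```

Each upload would then log through the logger for its own provider, and
a failure on one provider could be followed in a single file.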

-i

_______________________________________________
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
