Andrew, YOU MADE MY DAY! The issue is GONE (sorry, I'm very excited and relieved at the same time). We tried to nuke docker completely on the node (also removing /var/lib/docker), but we hadn't removed /var/lib/origin/.
So, for some obscure reason, we had a lot of old volumes from May, June and July. After removing these folders, our deploys now take less than 5s (time to run the deploy pod + actually starting the services). We haven't seen our cluster running like that in a looooong time. For the record, here's the command we've been using on all nodes:

    find /var/lib/origin/openshift.local.volumes/pods/ -type d -maxdepth 1 -mtime +30 -exec rm -rf \{\} \;

It took more than 30s on some nodes, so I suspect some folders were completely full of sh... Anyway, that's a relief, thanks again for your pugnacity :)
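For anyone repeating this cleanup, it may be worth listing the candidate directories before deleting anything. Below is a minimal sketch of such a dry run. The function name `list_stale_pod_dirs` and its two parameters are made up for illustration, and the default path is simply the one from the command above. It adds `-mindepth 1` (a GNU/BSD find extension, like `-maxdepth`) so that the `pods/` directory itself can never match, which the original command does not guard against.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenShift): print pod volume
# directories that were last modified more than N days ago.
#   $1: directory to scan (defaults to the path used in this thread)
#   $2: age threshold in days (defaults to 30)
list_stale_pod_dirs() {
    pod_dir="${1:-/var/lib/origin/openshift.local.volumes/pods/}"
    days="${2:-30}"
    # -mindepth 1 excludes the scanned directory itself;
    # -maxdepth 1 limits the search to its direct children.
    find "$pod_dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days"
}
```

Running `list_stale_pod_dirs` on a node prints the directories that would be affected. Once the list looks right, the same `find` expression with `-exec rm -rf {} \;` appended should remove them. It is probably safest to check first that none of the listed directories belongs to a pod that is still running.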