On Thu, Jun 13, 2019 at 8:55 AM Javier Pena <jp...@redhat.com> wrote:

> On Thu, Jun 13, 2019 at 8:22 AM Javier Pena <jp...@redhat.com> wrote:
>
>> Hi all,
>>
>> For the last few days, I have been monitoring a spike in disk space
>> utilization for logs.rdoproject.org. The current situation is:
>>
>> - 94% of space used, with less than 140GB out of 2TB available.
>> - The log pruning script has been reclaiming less space than we are using
>> for new logs during this week.
>> - I expect the situation to improve over the weekend, but we're
>> definitely running out of space.
>>
>> I have looked at a random job (https://review.opendev.org/639324, patch
>> set 26), and found that each run is consuming 1.2 GB of disk space in logs.
>> The worst offenders I have found are:
>>
>> - atop.bin.gz files (one per job, 8 jobs per recheck), ranging between 15
>> and 40 MB each
>> - logs/undercloud/home/zuul/tempest/.stackviz directory on
>> tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001 jobs, which is a
>> virtualenv eating up 81 MB.
>>
>
> Can we sync up w/ how you are calculating these results as they do not
> match our results.
> I see each job consuming about 215M of space, we are close on stackviz
> being 83M. Oddly I don't see atop.bin.gz in our calculations so I'll have
> to look into that.
>
> I've checked it directly using du on the logserver. By 1.2 GB I meant the
> aggregate of the 8 jobs running for a single patchset. PS26 is currently
> using 2.5 GB and had one recheck.
>
> About the atop.bin.gz file:
>
> # find . -name atop.bin.gz -exec du -sh {} \;
> 16M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/042cb8f/logs/undercloud/var/log/atop.bin.gz
> 16M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/e4171d7/logs/undercloud/var/log/atop.bin.gz
> 28M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-rocky-branch/ffd4de9/logs/undercloud/var/log/atop.bin.gz
> 26M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-rocky-branch/34d44bf/logs/undercloud/var/log/atop.bin.gz
> 25M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein-branch/b89761d/logs/undercloud/var/log/atop.bin.gz
> 24M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein-branch/9ade834/logs/undercloud/var/log/atop.bin.gz
> 29M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/a10447d/logs/undercloud/var/log/atop.bin.gz
> 44M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/99a5f9f/logs/undercloud/var/log/atop.bin.gz
> 15M  ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/c8a8c60/logs/subnode-2/var/log/atop.bin.gz
> 33M  ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/c8a8c60/logs/undercloud/var/log/atop.bin.gz
> 16M  ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/73ef532/logs/subnode-2/var/log/atop.bin.gz
> 33M  ./tripleo-ci-centos-7-multinode-1ctlr-featureset010/73ef532/logs/undercloud/var/log/atop.bin.gz
> 40M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/109d5ae/logs/undercloud/var/log/atop.bin.gz
> 45M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/c2ebeae/logs/undercloud/var/log/atop.bin.gz
> 39M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/7fe5bbb/logs/undercloud/var/log/atop.bin.gz
> 16M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/5e6cb0f/logs/undercloud/var/log/atop.bin.gz
> 40M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039/c6bf5ea/logs/undercloud/var/log/atop.bin.gz
> 40M  ./tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039/6ec5ac6/logs/undercloud/var/log/atop.bin.gz
>
> Can I safely delete all .stackviz directories? I guess that would give us
> some breathing room.
>

Yup, go for it

>
> Regards,
> Javier
>
> Each job reports the size of the logs e.g. [1]
>
> http://logs.rdoproject.org/24/639324/26/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein-branch/9ade834/logs/quickstart_files/log-size.txt
>
>
>> As a temporary measure, I am reducing log retention from 21 days to 14,
>> but we still need to reduce the rate at which we are uploading logs. Would
>> it be possible to check the oooq-generated logs and see where we can
>> reduce? These jobs are by far the ones consuming most space.
>>
>> Thanks,
>> Javier
>> _______________________________________________
>> dev mailing list
>> dev@lists.rdoproject.org
>> http://lists.rdoproject.org/mailman/listinfo/dev
>>
>> To unsubscribe: dev-unsubscr...@lists.rdoproject.org
>>
>
>
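
For comparing numbers, the du check quoted above can be turned into a single total per offender. The following is only a minimal sketch, not the script used on the logserver: the function name total_kib is hypothetical, and it assumes it is run against a patchset directory laid out like the paths in the listing.

```shell
#!/bin/sh
# Sketch only: total_kib is a hypothetical helper, not part of any RDO tooling.
#
# Print the combined size, in KiB, of every file or directory under the
# root given as $1 whose name matches the pattern given as $2.
total_kib() {
    # -prune stops find from descending into a matched directory
    # (e.g. .stackviz), so nested matches are not counted twice.
    # du -sk prints one "<KiB> <path>" line per match; awk sums column 1.
    find "$1" -name "$2" -prune -exec du -sk {} + 2>/dev/null |
        awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}
```

Run from a patchset directory, `total_kib . atop.bin.gz` prints the space taken by all atop captures, and `total_kib . .stackviz` shows roughly how much deleting those directories would reclaim. When nothing matches, the function prints 0.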