Tests may not be doing docker cleanup. The inventory job runs a docker prune every 12 hours for images older than 24 hours [1]. Randomly looking at one of the recent runs [2], it cleaned up a long list of containers consuming 30+ GB of space. That should be just 12 hours' worth of containers.

[1] https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_Inventory.groovy#L69
[2] https://ci-beam.apache.org/job/beam_Inventory_apache-beam-jenkins-14/501/console
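(For reference, the kind of prune described in [1] amounts to roughly the following; this is a sketch, not the exact invocation in job_Inventory.groovy, and the 24-hour window is taken from the description above:

    # Remove stopped containers and unreferenced images unused for 24+ hours.
    docker container prune --force --filter "until=24h"
    docker image prune --all --force --filter "until=24h"
)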
On Fri, Jul 24, 2020 at 1:07 PM Tyson Hamilton <tyso...@google.com> wrote:

Yes, these are on the same volume, in the /var/lib/docker directory. I'm unsure if they clean up leftover images.

On Fri, Jul 24, 2020 at 12:52 PM Udi Meiri <eh...@google.com> wrote:

I forgot Docker images:

ehudm@apache-ci-beam-jenkins-3:~$ sudo docker system df
TYPE            TOTAL   ACTIVE  SIZE      RECLAIMABLE
Images          88      9       125.4GB   124.2GB (99%)
Containers      40      4       7.927GB   7.871GB (99%)
Local Volumes   47      0       3.165GB   3.165GB (100%)
Build Cache     0       0       0B        0B

There are about 90 images on that machine, with all but 1 less than 48 hours old. I think the docker test jobs need to try harder at cleaning up their leftover images. (Assuming they're already doing it?)

On Fri, Jul 24, 2020 at 12:31 PM Udi Meiri <eh...@google.com> wrote:

The additional slots (@3 directories) take up even more space now than before.

I'm testing out https://github.com/apache/beam/pull/12326, which could help by cleaning up workspaces after a run (just started a seed job).

On Fri, Jul 24, 2020 at 12:13 PM Tyson Hamilton <tyso...@google.com> wrote:

664M   beam_PreCommit_JavaPortabilityApi_Commit
656M   beam_PreCommit_JavaPortabilityApi_Commit@2
611M   beam_PreCommit_JavaPortabilityApi_Cron
616M   beam_PreCommit_JavaPortabilityApiJava11_Commit
598M   beam_PreCommit_JavaPortabilityApiJava11_Commit@2
662M   beam_PreCommit_JavaPortabilityApiJava11_Cron
2.9G   beam_PreCommit_Portable_Python_Commit
2.9G   beam_PreCommit_Portable_Python_Commit@2
1.7G   beam_PreCommit_Portable_Python_Commit@3
3.4G   beam_PreCommit_Portable_Python_Cron
1.9G   beam_PreCommit_Python2_PVR_Flink_Commit
1.4G   beam_PreCommit_Python2_PVR_Flink_Cron
1.3G   beam_PreCommit_Python2_PVR_Flink_Phrase
6.2G   beam_PreCommit_Python_Commit
7.5G   beam_PreCommit_Python_Commit@2
7.5G   beam_PreCommit_Python_Cron
1012M  beam_PreCommit_PythonDocker_Commit
1011M  beam_PreCommit_PythonDocker_Commit@2
1011M  beam_PreCommit_PythonDocker_Commit@3
1002M  beam_PreCommit_PythonDocker_Cron
877M   beam_PreCommit_PythonFormatter_Commit
988M   beam_PreCommit_PythonFormatter_Cron
986M   beam_PreCommit_PythonFormatter_Phrase
1.7G   beam_PreCommit_PythonLint_Commit
2.1G   beam_PreCommit_PythonLint_Cron
7.5G   beam_PreCommit_Python_Phrase
346M   beam_PreCommit_RAT_Commit
341M   beam_PreCommit_RAT_Cron
338M   beam_PreCommit_Spotless_Commit
339M   beam_PreCommit_Spotless_Cron
5.5G   beam_PreCommit_SQL_Commit
5.5G   beam_PreCommit_SQL_Cron
5.5G   beam_PreCommit_SQL_Java11_Commit
750M   beam_PreCommit_Website_Commit
750M   beam_PreCommit_Website_Commit@2
750M   beam_PreCommit_Website_Cron
764M   beam_PreCommit_Website_Stage_GCS_Commit
771M   beam_PreCommit_Website_Stage_GCS_Cron
336M   beam_Prober_CommunityMetrics
693M   beam_python_mongoio_load_test
339M   beam_SeedJob
333M   beam_SeedJob_Standalone
334M   beam_sonarqube_report
556M   beam_SQLBigQueryIO_Batch_Performance_Test_Java
175G   total

On Fri, Jul 24, 2020 at 12:04 PM Tyson Hamilton <tyso...@google.com> wrote:

Ya, looks like something in the workspaces is taking up room:

@apache-ci-beam-jenkins-8:/home/jenkins$ sudo du -shc .
191G  .
191G  total

On Fri, Jul 24, 2020 at 11:44 AM Tyson Hamilton <tyso...@google.com> wrote:

Node 8 is also full. The partition that /tmp is on is here:

Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   485G  482G  2.9G   100%  /

however, after cleaning up /tmp with the crontab command, there is only 8G of usage yet the disk remains 100% full:

@apache-ci-beam-jenkins-8:/tmp$ sudo du -shc /tmp
8.0G  /tmp
8.0G  total

The workspaces are in the /home/jenkins/jenkins-slave/workspace directory. When I run a du on that, it takes really long. I'll let it keep running for a while to see if it ever returns a result, but so far this seems suspect.

On Fri, Jul 24, 2020 at 11:19 AM Tyson Hamilton <tyso...@google.com> wrote:

Everything I've been looking at is in the /tmp dir. Where are the workspaces, and what are they named?

On Fri, Jul 24, 2020 at 11:03 AM Udi Meiri <eh...@google.com> wrote:

I'm curious what you find. Was it /tmp or the workspaces using up the space?

On Fri, Jul 24, 2020 at 10:57 AM Tyson Hamilton <tyso...@google.com> wrote:

Bleck. I just realized that it is 'offline', so that won't work. I'll clean up manually on the machine using the cron command.
On Fri, Jul 24, 2020 at 10:56 AM Tyson Hamilton <tyso...@google.com> wrote:

Something isn't working with the current setup, because node 15 appears to be out of space and is currently 'offline' according to Jenkins. Can someone run the cleanup job? The machine is full:

@apache-ci-beam-jenkins-15:/tmp$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
udev        52G   0     52G    0%    /dev
tmpfs       11G   265M  10G    3%    /run
/dev/sda1   485G  484G  880M   100%  /
tmpfs       52G   0     52G    0%    /dev/shm
tmpfs       5.0M  0     5.0M   0%    /run/lock
tmpfs       52G   0     52G    0%    /sys/fs/cgroup
tmpfs       11G   0     11G    0%    /run/user/1017
tmpfs       11G   0     11G    0%    /run/user/1037

apache-ci-beam-jenkins-15:/tmp$ sudo du -ah --time . | sort -rhk 1,1 | head -n 20
20G   2020-07-24 17:52  .
580M  2020-07-22 17:31  ./junit1031982597110125586
517M  2020-07-22 17:31  ./junit1031982597110125586/junit8739924829337821410/heap_dump.hprof
517M  2020-07-22 17:31  ./junit1031982597110125586/junit8739924829337821410
263M  2020-07-22 12:23  ./pip-install-2GUhO_
263M  2020-07-20 09:30  ./pip-install-sxgwqr
263M  2020-07-17 13:56  ./pip-install-bWSKIV
242M  2020-07-21 20:25  ./beam-pipeline-tempmByU6T
242M  2020-07-21 20:21  ./beam-pipeline-tempV85xeK
242M  2020-07-21 20:15  ./beam-pipeline-temp7dJROJ
236M  2020-07-21 20:25  ./beam-pipeline-tempmByU6T/tmpOWj3Yr
236M  2020-07-21 20:21  ./beam-pipeline-tempV85xeK/tmppbQHB3
236M  2020-07-21 20:15  ./beam-pipeline-temp7dJROJ/tmpgOXPKW
111M  2020-07-23 00:57  ./pip-install-1JnyNE
105M  2020-07-23 00:17  ./beam-artifact1374651823280819755
105M  2020-07-23 00:16  ./beam-artifact5050755582921936972
105M  2020-07-23 00:16  ./beam-artifact1834064452502646289
105M  2020-07-23 00:15  ./beam-artifact682561790267074916
105M  2020-07-23 00:15  ./beam-artifact4691304965824489394
105M  2020-07-23 00:14  ./beam-artifact4050383819822604421

On Wed, Jul 22, 2020 at 12:03 PM Robert Bradshaw <rober...@google.com> wrote:

> On Wed, Jul 22, 2020 at 11:57 AM Tyson Hamilton <tyso...@google.com> wrote:
>
> Ah I see, thanks Kenn. I found some advice from the Apache infra wiki that also suggests using a tmpdir inside the workspace [1]:
>
> Procedures Projects can take to clean up disk space
>
> Projects can help themselves and Infra by taking some basic steps to help clean up their jobs after themselves on the build nodes.
>
> 1. Use a ./tmp dir in your jobs workspace. That way it gets cleaned up when job workspaces expire.

Tests should be (able to be) written to use the standard temporary file mechanisms, and the environment set up on Jenkins such that that falls into the respective workspaces. Ideally this should be as simple as setting the TMPDIR (or similar) environment variable (and making sure it exists/is writable).

> 2. Configure your jobs to wipe workspaces on start or finish.
> 3. Configure your jobs to only keep 5 or 10 previous builds.
> 4. Configure your jobs to only keep 5 or 10 previous artifacts.
>
> [1]: https://cwiki.apache.org/confluence/display/INFRA/Disk+Space+cleanup+of+Jenkins+nodes

On Wed, Jul 22, 2020 at 8:06 AM Kenneth Knowles <k...@apache.org> wrote:

Those file listings look like the result of using standard temp file APIs but with TMPDIR set to /tmp.

On Mon, Jul 20, 2020 at 7:55 PM Tyson Hamilton <tyso...@google.com> wrote:

Jobs are hermetic as far as I can tell and use unique subdirectories inside of /tmp. Here is a quick look at two examples:

@apache-ci-beam-jenkins-4:/tmp$ sudo du -ah --time . | sort -rhk 1,1 | head -n 20
1.6G  2020-07-21 02:25  .
242M  2020-07-17 18:48  ./beam-pipeline-temp3ybuY4
242M  2020-07-17 18:46  ./beam-pipeline-tempuxjiPT
242M  2020-07-17 18:44  ./beam-pipeline-tempVpg1ME
242M  2020-07-17 18:42  ./beam-pipeline-tempJ4EpyB
242M  2020-07-17 18:39  ./beam-pipeline-tempepea7Q
242M  2020-07-17 18:35  ./beam-pipeline-temp79qot2
236M  2020-07-17 18:48  ./beam-pipeline-temp3ybuY4/tmpy_Ytzz
236M  2020-07-17 18:46  ./beam-pipeline-tempuxjiPT/tmpN5_UfJ
236M  2020-07-17 18:44  ./beam-pipeline-tempVpg1ME/tmpxSm8pX
236M  2020-07-17 18:42  ./beam-pipeline-tempJ4EpyB/tmpMZJU76
236M  2020-07-17 18:39  ./beam-pipeline-tempepea7Q/tmpWy1vWX
236M  2020-07-17 18:35  ./beam-pipeline-temp79qot2/tmpvN7vWA
3.7M  2020-07-17 18:48  ./beam-pipeline-temp3ybuY4/tmprlh_di
3.7M  2020-07-17 18:46  ./beam-pipeline-tempuxjiPT/tmpLmVWfe
3.7M  2020-07-17 18:44  ./beam-pipeline-tempVpg1ME/tmpvrxbY7
3.7M  2020-07-17 18:42  ./beam-pipeline-tempJ4EpyB/tmpLTb6Mj
3.7M  2020-07-17 18:39  ./beam-pipeline-tempepea7Q/tmptYF1v1
3.7M  2020-07-17 18:35  ./beam-pipeline-temp79qot2/tmplfV0Rg
2.7M  2020-07-17 20:10  ./pip-install-q9l227ef

@apache-ci-beam-jenkins-11:/tmp$ sudo du -ah --time . | sort -rhk 1,1 | head -n 20
817M  2020-07-21 02:26  .
242M  2020-07-19 12:14  ./beam-pipeline-tempUTXqlM
242M  2020-07-19 12:11  ./beam-pipeline-tempx3Yno3
242M  2020-07-19 12:05  ./beam-pipeline-tempyCrMYq
236M  2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmpstXoL0
236M  2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmpnnVn65
236M  2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmpRF0iNs
3.7M  2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmpbJjUAQ
3.7M  2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmpsmmzqe
3.7M  2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmp5b3ZvY
2.0M  2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmpoj3orz
2.0M  2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmptng9sZ
2.0M  2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmpWp6njc
1.2M  2020-07-19 12:14  ./beam-pipeline-tempUTXqlM/tmphgdj35
1.2M  2020-07-19 12:11  ./beam-pipeline-tempx3Yno3/tmp8ySXpm
1.2M  2020-07-19 12:05  ./beam-pipeline-tempyCrMYq/tmpNVEJ4e
992K  2020-07-12 12:00  ./junit642086915811430564
988K  2020-07-12 12:00  ./junit642086915811430564/beam
984K  2020-07-12 12:00  ./junit642086915811430564/beam/nodes
980K  2020-07-12 12:00  ./junit642086915811430564/beam/nodes/0
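(One way to realize Robert's TMPDIR suggestion above, as a sketch; it assumes Jenkins's standard $WORKSPACE variable and tests that honor TMPDIR:

    # In each job's environment setup: keep temp files inside the per-job
    # workspace so normal workspace expiry cleans them up.
    export TMPDIR="$WORKSPACE/tmp"
    mkdir -p "$TMPDIR"
)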
On Mon, Jul 20, 2020 at 6:46 PM Udi Meiri <eh...@google.com> wrote:

You're right, job workspaces should be hermetic.

On Mon, Jul 20, 2020 at 1:24 PM Kenneth Knowles <k...@apache.org> wrote:

I'm probably late to this discussion and missing something, but why are we writing to /tmp at all? I would expect TMPDIR to point somewhere inside the job directory that will be wiped by Jenkins, and I would expect code to always create temp files via APIs that respect this. Is Jenkins not cleaning up? Do we not have the ability to set this up? Do we have bugs in our code (that we could probably find by setting TMPDIR to somewhere not-/tmp and running the tests without write permission to /tmp, etc.)?

Kenn

On Mon, Jul 20, 2020 at 11:39 AM Ahmet Altay <al...@google.com> wrote:

Related to workspace directory growth, +Udi Meiri <eh...@google.com> filed a relevant issue previously (https://issues.apache.org/jira/browse/BEAM-9865) for cleaning up the workspace directory after successful jobs. Alternatively, we can consider periodically cleaning up the /src directories.

I would suggest moving the cron task from internal cron scripts to the inventory job (https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_Inventory.groovy#L51). That way, we can see all the cron jobs as part of the source tree, adjust frequencies, and clean up code with PRs. I do not know how internal cron scripts are created and maintained, or how they would be recreated for new worker instances.

/cc +Tyson Hamilton <tyso...@google.com>

On Mon, Jul 20, 2020 at 4:50 AM Damian Gadomski <damian.gadom...@polidea.com> wrote:

Hey,

I've recently created a solution for the growing /tmp directory. Part of it is the job mentioned by Tyson: beam_Clean_tmp_directory. It's intentionally not triggered by cron and should be a last-resort solution for some strange cases.

Along with that job, I've also updated every worker with an internal cron script. It's executed once a week and deletes all the files (and only files) that were not accessed for at least three days. That's designed to be as safe as possible for the jobs running on the worker (so as not to delete files that are still in use), and also to be insensitive to the current workload on the machine. The cleanup will always happen, even if some long-running/stuck jobs are blocking the machine.

I also think that currently the "No space left" errors may be a consequence of the growing workspace directory rather than /tmp. I didn't do any detailed analysis, but, for example, on apache-beam-jenkins-7 the workspace directory size is currently 158 GB while /tmp is only 16 GB. We should either guarantee the disk size to hold workspaces for all jobs (because eventually, every worker will execute each job) or also clear the workspaces in some way.

Regards,
Damian

On Mon, Jul 20, 2020 at 10:43 AM Maximilian Michels <m...@apache.org> wrote:

+1 for scheduling it via a cron job if it won't lead to test failures while running. Not a Jenkins expert, but maybe there is the notion of running exclusively while no other tasks are running?

-Max

On 17.07.20 21:49, Tyson Hamilton wrote:

FYI, there was a job introduced to do this in Jenkins: beam_Clean_tmp_directory

Currently it needs to be run manually. I'm seeing some out-of-disk-related errors in precommit tests currently; perhaps we should schedule this job with cron?
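(The per-worker script Damian describes above would be roughly equivalent to a crontab entry like this; a sketch only, since the deployed script is not shown in the thread, and the weekly Sunday 06:00 schedule is an assumption based on his description:

    # Weekly: delete files (and only files) under /tmp not accessed for 3+ days.
    0 6 * * 0  find /tmp -xdev -type f -atime +3 -delete
)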
On 2020/03/11 19:31:13, Heejong Lee <heej...@google.com> wrote:

Still seeing "no space left on device" errors on jenkins-7 (for example:
https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2754/).

On Fri, Mar 6, 2020 at 7:11 PM Alan Myrvold <amyrv...@google.com> wrote:

Did a one-time cleanup of tmp files owned by jenkins older than 3 days. Agree that we need a longer-term solution.

Passing recent tests on all executors except jenkins-12, which has not scheduled recent builds for the past 13 days.

Not scheduling:
https://builds.apache.org/computer/apache-beam-jenkins-12/builds

Recent passing builds:
https://builds.apache.org/computer/apache-beam-jenkins-1/builds
https://builds.apache.org/computer/apache-beam-jenkins-2/builds
https://builds.apache.org/computer/apache-beam-jenkins-3/builds
https://builds.apache.org/computer/apache-beam-jenkins-4/builds
https://builds.apache.org/computer/apache-beam-jenkins-5/builds
https://builds.apache.org/computer/apache-beam-jenkins-6/builds
https://builds.apache.org/computer/apache-beam-jenkins-7/builds
https://builds.apache.org/computer/apache-beam-jenkins-8/builds
https://builds.apache.org/computer/apache-beam-jenkins-9/builds
https://builds.apache.org/computer/apache-beam-jenkins-10/builds
https://builds.apache.org/computer/apache-beam-jenkins-11/builds
https://builds.apache.org/computer/apache-beam-jenkins-13/builds
https://builds.apache.org/computer/apache-beam-jenkins-14/builds
https://builds.apache.org/computer/apache-beam-jenkins-15/builds
https://builds.apache.org/computer/apache-beam-jenkins-16/builds

On Fri, Mar 6, 2020 at 11:54 AM Ahmet Altay <al...@google.com> wrote:

+Alan Myrvold <amyrv...@google.com> is doing a one-time cleanup. I agree that we need to have a solution to automate this task or address the root cause of the buildup.
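(Alan's one-time cleanup above presumably amounted to something like the following; this is an assumption based on his description of "tmp files owned by jenkins older than 3 days", not the actual command he ran:

    # One-off: remove jenkins-owned files under /tmp not modified in 3+ days.
    sudo find /tmp -user jenkins -type f -mtime +3 -delete
)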
On Thu, Mar 5, 2020 at 2:47 AM Michał Walenia <michal.wale...@polidea.com> wrote:

Hi there,

It seems we have a problem with Jenkins workers again. Nodes 1 and 7 both fail jobs with "No space left on device". Who is the best person to contact in these cases (someone with access permissions to the workers)?

I also noticed that such errors are becoming more and more frequent recently, and I'd like to discuss how this can be remedied. Can a cleanup task be automated on Jenkins somehow?

Regards,
Michal

--
Michał Walenia
Polidea <https://www.polidea.com/> | Software Engineer

M: +48 791 432 002
E: michal.wale...@polidea.com

Unique Tech
Check out our projects! <https://www.polidea.com/our-work>