Regarding workspace directory growth, +Udi Meiri <eh...@google.com> previously
filed a relevant issue (https://issues.apache.org/jira/browse/BEAM-9865)
for cleaning up the workspace directory after successful jobs. Alternatively,
we could periodically clean up the /src directories.
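
Before choosing what to clean, it may help to measure which directories
actually dominate disk usage on a worker. The following is only a minimal
Python sketch under my own assumptions: the function names are hypothetical
and are not part of any existing Beam or Jenkins tooling, and the
directories to inspect would need to be supplied by whoever runs it.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # The file disappeared or cannot be read; ignore it.
                continue
    return total


def largest_directories(roots, top=5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first."""
    sizes = [(directory_size_bytes(root), root) for root in roots]
    return sorted(sizes, reverse=True)[:top]
```

Calling largest_directories() with the workspace and temporary directories
of a worker would show which of them is the better cleanup target.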

I would suggest moving the cron task from internal cron scripts to the
inventory job (
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_Inventory.groovy#L51).
That way, we can see all the cron jobs as part of the source tree, and adjust
frequencies and clean up the code through PRs. I do not know how the internal
cron scripts are created and maintained, or how they would be recreated for
new worker instances.
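
To make the idea concrete, here is a rough Python sketch of the kind of
age-based cleanup Damian describes in the quoted message below (delete
files, and only files, that have not been accessed for at least three
days). It is not the actual internal cron script, which I have not seen;
the function name, parameters, and dry-run default are my own assumptions.

```python
import os
import time
from pathlib import Path


def remove_stale_files(root, max_age_days=3.0, now=None, dry_run=True):
    """Select regular files under `root` not accessed for `max_age_days`.

    Directories and symlinks are never removed. With dry_run=True nothing
    is deleted; the selected paths are only returned.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.is_symlink():
                    continue
                if path.stat().st_atime < cutoff:
                    if not dry_run:
                        path.unlink()
                    selected.append(path)
            except OSError:
                # The file vanished or is not accessible; skip it.
                continue
    return sorted(selected)
```

A script like this could be checked into the source tree and invoked from a
scheduled job, so that the threshold and frequency are reviewable in PRs.
Note that it relies on access times, so it would behave differently on
filesystems mounted with access-time updates disabled.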

/cc +Tyson Hamilton <tyso...@google.com>

On Mon, Jul 20, 2020 at 4:50 AM Damian Gadomski <damian.gadom...@polidea.com>
wrote:

> Hey,
>
> I've recently created a solution for the growing /tmp directory. Part of
> it is the job mentioned by Tyson: *beam_Clean_tmp_directory*. It's
> intentionally not triggered by cron and should be a last resort solution
> for some strange cases.
>
> Along with that job, I've also updated every worker with an internal cron
> script. It runs once a week and deletes all the files (and only files)
> that have not been accessed for at least three days. That's designed to
> be as safe as possible for the running jobs on the worker (not to delete
> the files that are still in use), and also to be insensitive to the current
> workload on the machine. The cleanup will always happen, even if some
> long-running/stuck jobs are blocking the machine.
>
> I also think that currently the "No space left" errors may be a
> consequence of the growing workspace directory rather than /tmp. I didn't do
> any detailed analysis, but e.g. currently on apache-beam-jenkins-7 the
> workspace directory size is 158 GB while /tmp is only 16 GB. We should
> either guarantee enough disk space to hold workspaces for all jobs (because
> eventually, every worker will execute each job) or also clear the
> workspaces in some way.
>
> Regards,
> Damian
>
>
> On Mon, Jul 20, 2020 at 10:43 AM Maximilian Michels <m...@apache.org>
> wrote:
>
>> +1 for scheduling it via a cron job if it won't lead to test failures
>> while running. Not a Jenkins expert but maybe there is the notion of
>> running exclusively while no other tasks are running?
>>
>> -Max
>>
>> On 17.07.20 21:49, Tyson Hamilton wrote:
>> > FYI there was a job introduced to do this in Jenkins:
>> > beam_Clean_tmp_directory
>> >
>> > Currently it needs to be run manually. I'm seeing some out-of-disk
>> > errors in precommit tests at the moment; perhaps we should schedule
>> > this job with cron?
>> >
>> >
>> > On 2020/03/11 19:31:13, Heejong Lee <heej...@google.com> wrote:
>> >> Still seeing no space left on device errors on jenkins-7 (for example:
>> >> https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2754/)
>> >>
>> >>
>> >> On Fri, Mar 6, 2020 at 7:11 PM Alan Myrvold <amyrv...@google.com>
>> >> wrote:
>> >>
>> >>> Did a one-time cleanup of tmp files owned by jenkins older than 3
>> >>> days.
>> >>> Agree that we need a longer-term solution.
>> >>>
>> >>> Passing recent tests on all executors except jenkins-12, which has not
>> >>> scheduled recent builds for the past 13 days. Not scheduling:
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-12/builds
>> >>> Recent passing builds:
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-1/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-2/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-3/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-4/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-5/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-6/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-7/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-8/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-9/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-10/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-11/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-13/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-14/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-15/builds
>> >>> https://builds.apache.org/computer/apache-beam-jenkins-16/builds
>> >>>
>> >>> On Fri, Mar 6, 2020 at 11:54 AM Ahmet Altay <al...@google.com> wrote:
>> >>>
>> >>>> +Alan Myrvold <amyrv...@google.com> is doing a one-time cleanup. I
>> >>>> agree that we need to have a solution to automate this task or address
>> >>>> the root cause of the buildup.
>> >>>>
>> >>>> On Thu, Mar 5, 2020 at 2:47 AM Michał Walenia
>> >>>> <michal.wale...@polidea.com> wrote:
>> >>>>
>> >>>>> Hi there,
>> >>>>> it seems we have a problem with Jenkins workers again. Nodes 1 and 7
>> >>>>> both fail jobs with "No space left on device".
>> >>>>> Who is the best person to contact in these cases (someone with
>> >>>>> access permissions to the workers)?
>> >>>>>
>> >>>>> I also noticed that such errors are becoming more and more frequent
>> >>>>> recently and I'd like to discuss how this can be remedied. Can a
>> >>>>> cleanup task be automated on Jenkins somehow?
>> >>>>>
>> >>>>> Regards
>> >>>>> Michal
>> >>>>>
>> >>>>> --
>> >>>>>
>> >>>>> Michał Walenia
>> >>>>> Polidea <https://www.polidea.com/> | Software Engineer
>> >>>>>
>> >>>>> M: +48 791 432 002
>> >>>>> E: michal.wale...@polidea.com
>> >>>>>
>> >>>>> Unique Tech
>> >>>>> Check out our projects! <https://www.polidea.com/our-work>
>> >>>>>
>> >>>>
>> >>
>>
>
