Those file listings look like the result of using standard temp file APIs
but with TMPDIR set to /tmp.
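
To illustrate, here is a minimal Python sketch of how the standard temp file
API follows TMPDIR. The helper name make_job_temp_dir and the "job-tmp"
directory are hypothetical; only the "beam-pipeline-temp" prefix comes from
the listings below, and I have not checked which Beam code actually creates
those directories.

```python
import os
import tempfile


def make_job_temp_dir(job_tmp_root, prefix="beam-pipeline-temp"):
    """Create a unique temp dir under job_tmp_root by pointing TMPDIR at it.

    Hypothetical helper for illustration only.
    """
    os.makedirs(job_tmp_root, exist_ok=True)
    old_tmpdir = os.environ.get("TMPDIR")
    os.environ["TMPDIR"] = job_tmp_root
    # tempfile caches its default directory; clear it so TMPDIR is re-read.
    tempfile.tempdir = None
    try:
        # No dir= argument: the location comes from TMPDIR.
        return tempfile.mkdtemp(prefix=prefix)
    finally:
        # Restore the previous environment and drop the cached value again.
        if old_tmpdir is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_tmpdir
        tempfile.tempdir = None
```

With TMPDIR=/tmp (or unset) the same call lands in /tmp, which matches what
the listings show. Pointing TMPDIR inside the job workspace would move these
directories somewhere Jenkins can wipe.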

On Mon, Jul 20, 2020 at 7:55 PM Tyson Hamilton <[email protected]> wrote:

> Jobs are hermetic as far as I can tell and use unique subdirectories
> inside of /tmp. Here is a quick look into two examples:
>
> @apache-ci-beam-jenkins-4:/tmp$ sudo du -ah --time . | sort -rhk 1,1 | head -n 20
> 1.6G    2020-07-21 02:25        .
> 242M    2020-07-17 18:48        ./beam-pipeline-temp3ybuY4
> 242M    2020-07-17 18:46        ./beam-pipeline-tempuxjiPT
> 242M    2020-07-17 18:44        ./beam-pipeline-tempVpg1ME
> 242M    2020-07-17 18:42        ./beam-pipeline-tempJ4EpyB
> 242M    2020-07-17 18:39        ./beam-pipeline-tempepea7Q
> 242M    2020-07-17 18:35        ./beam-pipeline-temp79qot2
> 236M    2020-07-17 18:48        ./beam-pipeline-temp3ybuY4/tmpy_Ytzz
> 236M    2020-07-17 18:46        ./beam-pipeline-tempuxjiPT/tmpN5_UfJ
> 236M    2020-07-17 18:44        ./beam-pipeline-tempVpg1ME/tmpxSm8pX
> 236M    2020-07-17 18:42        ./beam-pipeline-tempJ4EpyB/tmpMZJU76
> 236M    2020-07-17 18:39        ./beam-pipeline-tempepea7Q/tmpWy1vWX
> 236M    2020-07-17 18:35        ./beam-pipeline-temp79qot2/tmpvN7vWA
> 3.7M    2020-07-17 18:48        ./beam-pipeline-temp3ybuY4/tmprlh_di
> 3.7M    2020-07-17 18:46        ./beam-pipeline-tempuxjiPT/tmpLmVWfe
> 3.7M    2020-07-17 18:44        ./beam-pipeline-tempVpg1ME/tmpvrxbY7
> 3.7M    2020-07-17 18:42        ./beam-pipeline-tempJ4EpyB/tmpLTb6Mj
> 3.7M    2020-07-17 18:39        ./beam-pipeline-tempepea7Q/tmptYF1v1
> 3.7M    2020-07-17 18:35        ./beam-pipeline-temp79qot2/tmplfV0Rg
> 2.7M    2020-07-17 20:10        ./pip-install-q9l227ef
>
>
> @apache-ci-beam-jenkins-11:/tmp$ sudo du -ah --time . | sort -rhk 1,1 | head -n 20
> 817M    2020-07-21 02:26        .
> 242M    2020-07-19 12:14        ./beam-pipeline-tempUTXqlM
> 242M    2020-07-19 12:11        ./beam-pipeline-tempx3Yno3
> 242M    2020-07-19 12:05        ./beam-pipeline-tempyCrMYq
> 236M    2020-07-19 12:14        ./beam-pipeline-tempUTXqlM/tmpstXoL0
> 236M    2020-07-19 12:11        ./beam-pipeline-tempx3Yno3/tmpnnVn65
> 236M    2020-07-19 12:05        ./beam-pipeline-tempyCrMYq/tmpRF0iNs
> 3.7M    2020-07-19 12:14        ./beam-pipeline-tempUTXqlM/tmpbJjUAQ
> 3.7M    2020-07-19 12:11        ./beam-pipeline-tempx3Yno3/tmpsmmzqe
> 3.7M    2020-07-19 12:05        ./beam-pipeline-tempyCrMYq/tmp5b3ZvY
> 2.0M    2020-07-19 12:14        ./beam-pipeline-tempUTXqlM/tmpoj3orz
> 2.0M    2020-07-19 12:11        ./beam-pipeline-tempx3Yno3/tmptng9sZ
> 2.0M    2020-07-19 12:05        ./beam-pipeline-tempyCrMYq/tmpWp6njc
> 1.2M    2020-07-19 12:14        ./beam-pipeline-tempUTXqlM/tmphgdj35
> 1.2M    2020-07-19 12:11        ./beam-pipeline-tempx3Yno3/tmp8ySXpm
> 1.2M    2020-07-19 12:05        ./beam-pipeline-tempyCrMYq/tmpNVEJ4e
> 992K    2020-07-12 12:00        ./junit642086915811430564
> 988K    2020-07-12 12:00        ./junit642086915811430564/beam
> 984K    2020-07-12 12:00        ./junit642086915811430564/beam/nodes
> 980K    2020-07-12 12:00        ./junit642086915811430564/beam/nodes/0
>
>
>
> On Mon, Jul 20, 2020 at 6:46 PM Udi Meiri <[email protected]> wrote:
>
>> You're right, job workspaces should be hermetic.
>>
>>
>>
>> On Mon, Jul 20, 2020 at 1:24 PM Kenneth Knowles <[email protected]> wrote:
>>
>>> I'm probably late to this discussion and missing something, but why are
>>> we writing to /tmp at all? I would expect TMPDIR to point somewhere inside
>>> the job directory that will be wiped by Jenkins, and I would expect code to
>>> always create temp files via APIs that respect this. Is Jenkins not
>>> cleaning up? Do we not have the ability to set this up? Do we have bugs in
>>> our code (that we could probably find by setting TMPDIR to somewhere
>>> not-/tmp and running the tests without write permission to /tmp, etc.)?
>>>
>>> Kenn
>>>
>>> On Mon, Jul 20, 2020 at 11:39 AM Ahmet Altay <[email protected]> wrote:
>>>
>>>> Related to workspace directory growth, +Udi Meiri <[email protected]> filed
>>>> a relevant issue previously (
>>>> https://issues.apache.org/jira/browse/BEAM-9865) for cleaning up
>>>> workspace directory after successful jobs. Alternatively, we can consider
>>>> periodically cleaning up the /src directories.
>>>>
>>>> I would suggest moving the cron task from internal cron scripts to the
>>>> inventory job (
>>>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_Inventory.groovy#L51).
>>>> That way, we can see all the cron jobs as part of the source tree, and adjust
>>>> frequencies and clean up code with PRs. I do not know how the internal cron
>>>> scripts are created and maintained, or how they would be recreated for new
>>>> worker instances.
>>>>
>>>> /cc +Tyson Hamilton <[email protected]>
>>>>
>>>> On Mon, Jul 20, 2020 at 4:50 AM Damian Gadomski <
>>>> [email protected]> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> I've recently created a solution for the growing /tmp directory. Part
>>>>> of it is the job mentioned by Tyson: *beam_Clean_tmp_directory*. It's
>>>>> intentionally not triggered by cron and should be a last resort solution
>>>>> for some strange cases.
>>>>>
>>>>> Along with that job, I've also updated every worker with an internal
>>>>> cron script. It's being executed once a week and deletes all the files
>>>>> (and only files) that were not accessed for at least three days. That's
>>>>> designed to be as safe as possible for the running jobs on the worker
>>>>> (not to delete the files that are still in use), and also to be
>>>>> insensitive to the current workload on the machine. The cleanup will
>>>>> always happen, even if some long-running/stuck jobs are blocking the
>>>>> machine.
>>>>>
>>>>> I also think that currently the "No space left" errors may be a
>>>>> consequence of a growing workspace directory rather than /tmp. I didn't do
>>>>> any detailed analysis, but e.g. currently, on apache-beam-jenkins-7 the
>>>>> workspace directory size is 158 GB while /tmp is only 16 GB. We should
>>>>> either guarantee enough disk space to hold workspaces for all jobs (because
>>>>> eventually, every worker will execute each job) or also clear the
>>>>> workspaces in some way.
>>>>>
>>>>> Regards,
>>>>> Damian
>>>>>
>>>>>
>>>>> On Mon, Jul 20, 2020 at 10:43 AM Maximilian Michels <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> +1 for scheduling it via a cron job if it won't lead to test failures
>>>>>> while running. Not a Jenkins expert but maybe there is the notion of
>>>>>> running exclusively while no other tasks are running?
>>>>>>
>>>>>> -Max
>>>>>>
>>>>>> On 17.07.20 21:49, Tyson Hamilton wrote:
>>>>>> > FYI there was a job introduced to do this in Jenkins:
>>>>>> > beam_Clean_tmp_directory
>>>>>> >
>>>>>> > Currently it needs to be run manually. I'm seeing some out of disk
>>>>>> > related errors in precommit tests currently, perhaps we should schedule
>>>>>> > this job with cron?
>>>>>> >
>>>>>> >
>>>>>> > On 2020/03/11 19:31:13, Heejong Lee <[email protected]> wrote:
>>>>>> >> Still seeing no space left on device errors on jenkins-7 (for example:
>>>>>> >> https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2754/)
>>>>>> >>
>>>>>> >>
>>>>>> >> On Fri, Mar 6, 2020 at 7:11 PM Alan Myrvold <[email protected]> wrote:
>>>>>> >>
>>>>>> >>> Did a one time cleanup of tmp files owned by jenkins older than 3 days.
>>>>>> >>> Agree that we need a longer term solution.
>>>>>> >>>
>>>>>> >>> Passing recent tests on all executors except jenkins-12, which has not
>>>>>> >>> scheduled recent builds for the past 13 days. Not scheduling:
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-12/builds
>>>>>> >>> Recent passing builds:
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-1/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-2/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-3/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-4/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-5/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-6/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-7/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-8/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-9/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-10/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-11/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-13/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-14/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-15/builds
>>>>>> >>> https://builds.apache.org/computer/apache-beam-jenkins-16/builds
>>>>>> >>>
>>>>>> >>> On Fri, Mar 6, 2020 at 11:54 AM Ahmet Altay <[email protected]> wrote:
>>>>>> >>>
>>>>>> >>>> +Alan Myrvold <[email protected]> is doing a one time cleanup. I agree
>>>>>> >>>> that we need to have a solution to automate this task or address the root
>>>>>> >>>> cause of the buildup.
>>>>>> >>>>
>>>>>> >>>> On Thu, Mar 5, 2020 at 2:47 AM Michał Walenia <[email protected]>
>>>>>> >>>> wrote:
>>>>>> >>>>
>>>>>> >>>>> Hi there,
>>>>>> >>>>> it seems we have a problem with Jenkins workers again. Nodes 1 and 7
>>>>>> >>>>> both fail jobs with "No space left on device".
>>>>>> >>>>> Who is the best person to contact in these cases (someone with access
>>>>>> >>>>> permissions to the workers)?
>>>>>> >>>>>
>>>>>> >>>>> I also noticed that such errors are becoming more and more frequent
>>>>>> >>>>> recently and I'd like to discuss how this can be remedied. Can a cleanup
>>>>>> >>>>> task be automated on Jenkins somehow?
>>>>>> >>>>>
>>>>>> >>>>> Regards
>>>>>> >>>>> Michal
>>>>>> >>>>>
>>>>>> >>>>> --
>>>>>> >>>>>
>>>>>> >>>>> Michał Walenia
>>>>>> >>>>> Polidea <https://www.polidea.com/> | Software Engineer
>>>>>> >>>>>
>>>>>> >>>>> M: +48 791 432 002
>>>>>> >>>>> E: [email protected]
>>>>>> >>>>>
>>>>>> >>>>> Unique Tech
>>>>>> >>>>> Check out our projects! <https://www.polidea.com/our-work>
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>
>>>>>>
>>>>>
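
For reference, the cleanup policy Damian describes above (delete files, and
only files, that have not been accessed for at least three days) could be
sketched roughly as follows. This is not the actual cron script, which is not
shown in this thread; the function name remove_stale_files and its parameters
are hypothetical.

```python
import os
import time


def remove_stale_files(root, max_age_days=3, now=None):
    """Delete regular files (never directories) under root whose last
    access time is at least max_age_days old. Returns the removed paths.

    Hypothetical sketch; not the script deployed on the workers.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: judge a symlink by its own atime, not its target's.
                if os.lstat(path).st_atime <= cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # File vanished or is not removable; skip it.
                continue
    return removed
```

Directories are left in place, so a running job never loses the directory it
is writing into. Note that the result depends on access times being
maintained by the filesystem; mount options that limit atime updates would
change what counts as "not accessed".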
