*tl;dr - I want to temporarily reduce the number of builds that we retain
to reduce pressure on Jenkins*

Hey everyone, over the past few days our Jenkins runs have been
particularly flaky across the board, with errors like the following showing
up all over the place [1]:

java.nio.file.FileSystemException:
/home/jenkins/jenkins-home/jobs/beam_PreCommit_Python_Phrase/builds/3352/changelog.xml:
No space left on device [2]


These errors indicate that we're out of space on the Jenkins master node.
After some digging (thanks @Yi Hu <ya...@google.com> @Ahmet Altay
<al...@google.com> and @Bruno Volpato <bvolp...@google.com> for
contributing), we've determined that at least one major contributing factor
is that some of our jobs are taking up too much disk space. For example,
beam_PreCommit_Java_Commit alone is taking up 28GB.
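
For anyone who wants to check per-job disk usage themselves, here is a
rough sketch of one way to do it. This is not the exact method we used; it
assumes read access to the jobs directory (the path in the usage comment is
taken from the error above), and the function names are made up for
illustration.

```python
import os


def dir_size_bytes(path):
    """Total size of regular files under path, skipping symlinks."""
    total = 0
    # os.walk does not follow directory symlinks by default.
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def job_sizes(jobs_dir):
    """Map each job directory name to its size in bytes, largest first."""
    sizes = {
        entry.name: dir_size_bytes(entry.path)
        for entry in os.scandir(jobs_dir)
        if entry.is_dir(follow_symlinks=False)
    }
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical usage on the Jenkins host:
# for name, size in job_sizes("/home/jenkins/jenkins-home/jobs").items():
#     print(f"{size / 1024**3:8.1f} GiB  {name}")
```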

@Yi Hu <ya...@google.com> found one change around code coverage that is
likely a heavy contributor to the problem and rolled it back [3]. We can
continue to look for other contributing factors.

In the meantime, to get us back to a healthy state, *I propose that we
reduce the number of builds that we are retaining to 40 for all jobs that
are using a large amount of storage (>5GB)*. This will hopefully return
Jenkins to a normal functioning state, though at the cost of a significant
amount of build history (right now, for example,
beam_PreCommit_Java_Commit is at 400 retained builds). We could restore the
normal retention limit once the underlying problem is resolved, but the
discarded build history would be gone for good. Given that the deletion is
irreversible (and not guaranteed to fix the problem), I wanted to gather
feedback before doing this. Personally, I rarely use builds that old, but
others may feel differently.
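
To make the proposal concrete, here is a small dry-run sketch of the
selection rule: only jobs over the size threshold are touched, and for
those only the newest 40 builds are kept. It deletes nothing and is not how
the change would actually be applied (that would go through the job
retention settings). The function names and input shape are hypothetical,
and treating 5GB as 5 GiB is my assumption.

```python
def builds_to_discard(build_numbers, keep=40):
    """Return the build numbers that fall outside the newest `keep` builds."""
    newest_first = sorted(build_numbers, reverse=True)
    return sorted(newest_first[keep:])


def plan_cleanup(jobs, keep=40, size_threshold_bytes=5 * 1024**3):
    """Dry run: map each oversized job to the builds it would lose.

    `jobs` maps a job name to (size_in_bytes, iterable_of_build_numbers).
    Jobs at or below the threshold are left out of the plan entirely.
    """
    plan = {}
    for name, (size, builds) in jobs.items():
        if size > size_threshold_bytes:
            plan[name] = builds_to_discard(builds, keep)
    return plan


# Illustrative numbers only, loosely based on the figures above.
example = {
    "beam_PreCommit_Java_Commit": (28 * 1024**3, range(1, 401)),
    "some_small_job": (1 * 1024**3, range(1, 401)),  # hypothetical job
}
plan = plan_cleanup(example)
```

With these example inputs, only beam_PreCommit_Java_Commit shows up in the
plan, and it would lose its 360 oldest builds.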

Please let me know if you object to or support this proposal.

Thanks,
Danny

[1] Tracking issue: https://github.com/apache/beam/issues/26197
[2] Example run with this error:
https://ci-beam.apache.org/job/beam_PreCommit_Python_Phrase/3352/console
[3] Rollback PR: https://github.com/apache/beam/pull/26199
