STINNER Victor added the comment:
The buildbot server migrated to a new machine and is now behind a load
balancer. tcp/80 (buildbot web page, HTTP) and tcp/9020 (used by buildbot
workers) are both behind the load balancer.
Maybe the load balancer closes TCP connections which are idle for 60
STINNER Victor added the comment:
On the worker (client) side, I see many "lost remote step" messages, one
every 1 to 3 minutes. Example with the PPC64LE Fedora Stable
(cstratak-fedora-stable-ppc64le) worker:
2020-08-27 01:30:09-0400 [Broker,client] lost remote step
2020-08-27 01:31:57-0400
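
To check how regular the disconnects are, the worker log can be scanned for "lost remote step" entries and the gaps between them measured. The following is only a minimal sketch: the function name lost_step_gaps is hypothetical, and it assumes every matching log line starts with a timestamp in the "YYYY-MM-DD HH:MM:SS-ZZZZ" form shown above.

```python
from datetime import datetime


def lost_step_gaps(lines):
    """Return the seconds between consecutive 'lost remote step' log entries.

    Assumes each matching line begins with a timestamp such as
    "2020-08-27 01:30:09-0400" (date, then time with a numeric UTC offset).
    """
    stamps = []
    for line in lines:
        if "lost remote step" not in line:
            continue
        # The first two whitespace-separated fields hold the date and the time.
        date_part, time_part = line.split()[:2]
        stamps.append(
            datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S%z")
        )
    # Pair each entry with the next one and compute the time difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Gaps that cluster around a fixed value would support the idea of an idle timeout somewhere between the worker and the master; widely scattered gaps would point elsewhere.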
STINNER Victor added the comment:
> I have found a large number of un-removed files in /tmp.
Right. I found many /tmp/cc.XXX and /tmp/tmpX files. Around 20 GB of
these files! Maybe passing "-pipe" to gcc/clang would avoid the
/tmp/cc.XXX files when a build is interrupted.
David Edelsohn added the comment:
I have found a large number of un-removed files in /tmp. Things seem to
function better with Buildbots running the older 0.x "buildslave" than with
newer "buildbot-worker" instances.
--
nosy: +David.Edelsohn
title: Buildbot: workers detached every
STINNER Victor added the comment:
Statistics on the partitions which are the most full.
Fedora Rawhide x86-64 is ok:
/dev/mapper/vg_root_python--builder--rawhide.osci.io-root  14G  5,4G  7,6G  42% /
/dev/mapper/vg_root_python--builder--rawhide.osci.io-home  36G   24G   11G  70% /home
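
The same kind of check can be scripted so that a worker owner notices a filling partition before builds start failing. This is a rough sketch, not something the workers are known to run: the names usage_percent and fullest are hypothetical and the 70% threshold is arbitrary. The percentage is computed as used / (used + free), which should be close to the Use% column of df.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    # Divide by used + free rather than total, so that blocks reserved
    # for root do not count as available space.
    denominator = usage.used + usage.free
    if denominator == 0:
        return 0.0
    return 100.0 * usage.used / denominator


def fullest(paths, threshold=70.0):
    """Return (path, percent) pairs at or above `threshold`, fullest first."""
    report = [(path, usage_percent(path)) for path in paths]
    report = [item for item in report if item[1] >= threshold]
    return sorted(report, key=lambda item: item[1], reverse=True)
```

For example, fullest(["/", "/home", "/tmp"]) would list only the mount points that are at least 70% full.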
STINNER Victor added the comment:
python-builder-rawhide had its /tmp partition full of temporary "cc.XXX"
files. Before: /tmp was full at 100% (3.9 GB). After sudo rm -f /tmp/cc*, only
52 KB are used (1%).
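
A more cautious variant of that cleanup would remove only leftover files that have not been modified for a while, so that files belonging to a build still in progress are left alone. The sketch below is an illustration under assumptions: the function names, the "cc*" pattern and the one-day age limit are all hypothetical choices, not what was run on the worker.

```python
import glob
import os
import stat
import time


def stale_compiler_temps(tmp_dir="/tmp", pattern="cc*",
                         min_age_seconds=24 * 3600, now=None):
    """Return (path, size) for regular files matching `pattern` in `tmp_dir`
    whose last modification is at least `min_age_seconds` old."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(glob.glob(os.path.join(tmp_dir, pattern))):
        try:
            info = os.lstat(path)
        except OSError:
            continue  # the file vanished between listing and stat
        if not stat.S_ISREG(info.st_mode):
            continue  # skip directories, symlinks and other special files
        if now - info.st_mtime >= min_age_seconds:
            stale.append((path, info.st_size))
    return stale


def remove_stale(entries, dry_run=True):
    """Delete the listed files unless `dry_run` is set.

    Returns the number of bytes freed, or that would be freed in a dry run.
    """
    freed = 0
    for path, size in entries:
        if not dry_run:
            try:
                os.remove(path)
            except OSError:
                continue  # already gone or not permitted; do not count it
        freed += size
    return freed
```

Running remove_stale(stale_compiler_temps()) first with the default dry_run=True shows how much space would be recovered before anything is deleted.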
I'm not sure why gcc/clang left so many temporary files :-/ There are many
Charalampos Stratakis added the comment:
There were almost 10GB of remnant cc* files in /tmp from the compilers used,
which I presume were also the temporary artifacts which remained there after
the disconnects.
Cleaned those up and rebooted the RHEL8 x86_64 buildbot.
--
Charalampos Stratakis added the comment:
I discovered an issue after I returned from holidays: the buildbot-worker
keeps getting disconnected from the master, so builds start and end abruptly,
leaving some artifacts behind.
The next second it tried again with the same
STINNER Victor added the comment:
> It seems many of the RHEL and Fedora builds fail due to disk space
These workers have different owners and so need to reach different people. We
should list all impacted workers.
> https://buildbot.python.org/all/#/builders/185/builds/2
AMD64 RHEL8 3.x
New submission from Karthikeyan Singaravelan:
It seems many of the RHEL and Fedora builds fail due to disk space
https://buildbot.python.org/all/#/builders/185/builds/2
./configure: line 2382: cannot create temp file for here-document: No space
left on device
./configure: line 2394: cannot