[issue41642] Buildbot: workers detached every minute and "no space left on device" issue

2020-09-02 Thread Pablo Galindo Salgado
Pablo Galindo Salgado added the comment: All the pablogsal-* buildbots have been updated

2020-09-02 Thread Gregory P. Smith
Gregory P. Smith added the comment: The gps-* bots have been updated. -- nosy: +gregory.p.smith

2020-09-02 Thread David Edelsohn
David Edelsohn added the comment: I have updated: edelsohn-aix-ppc64, edelsohn-debian-z, edelsohn-fedora-ppc64, edelsohn-fedora-rawhide-z, edelsohn-fedora-z, edelsohn-rhel-z, edelsohn-rhel8-z, edelsohn-sles-z, aixtools-aix-power6

2020-09-02 Thread STINNER Victor
STINNER Victor added the comment: Ah, I also updated: Fedora Stable ppc64le, Fedora Rawhide ppc64le

2020-09-02 Thread STINNER Victor
STINNER Victor added the comment: Charris, Pablo and I identified that the load balancer closes the TCP connections of some buildbot workers. When the "buildbot.python.org" host name is used, TCP connections (TCP port 9020) go through a load balancer. Ernest exposed the TCP port 9020

2020-09-02 Thread STINNER Victor
STINNER Victor added the comment:
> I can provide some information from the logs of one of the buildbots, or change a parameter. Let me know.
David: can you please change the buildbot client configuration to use "buildbot-api.python.org" host name? This host name doesn't go through the
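The requested host name change can be sketched as a small script. This is only an illustration, not something posted in the thread: it assumes the worker's configuration lives in a `buildbot.tac` file containing a `buildmaster_host = '...'` assignment, as in the file generated by `buildbot-worker create-worker`. The helper name `set_buildmaster_host` and the sample text are hypothetical.

```python
import re

# Host name requested in the comment above; it bypasses the load balancer.
NEW_HOST = "buildbot-api.python.org"

# Matches a line such as: buildmaster_host = 'buildbot.python.org'
# Group 1 is the assignment prefix, group 2 is the quote character.
_HOST_RE = re.compile(
    r"^([ \t]*buildmaster_host[ \t]*=[ \t]*)(['\"]).*?\2",
    re.MULTILINE,
)


def set_buildmaster_host(tac_text: str, new_host: str = NEW_HOST) -> str:
    """Return tac_text with the buildmaster_host assignment set to new_host.

    Hypothetical helper: it assumes the worker configuration contains a
    single-line string assignment to buildmaster_host.
    """
    def _replace(match: re.Match) -> str:
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{new_host}{quote}"

    new_text, count = _HOST_RE.subn(_replace, tac_text)
    if count == 0:
        raise ValueError("no buildmaster_host assignment found")
    return new_text


# Hypothetical sample of the relevant part of a worker's buildbot.tac.
sample = "buildmaster_host = 'buildbot.python.org'\nport = 9020\n"
updated = set_buildmaster_host(sample)
print(updated)
```

In practice, editing the one line by hand and restarting the worker has the same effect; the script only shows which setting is involved.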

2020-08-28 Thread David Edelsohn
David Edelsohn added the comment: I can provide some information from the logs of one of the buildbots, or change a parameter. Let me know.

2020-08-28 Thread STINNER Victor
STINNER Victor added the comment: On the server side, it seems like the "edelsohn-rhel8-z" worker is detached because its TCP connection is closed, only 87 seconds after the worker was attached. I added some debug traces: 2020-08-28 09:44:02+ [Broker,2,10.132.169.156] worker

2020-08-28 Thread STINNER Victor
STINNER Victor added the comment: There are multiple errors in the buildbot server logs. I'm not sure whether they are related. 2020-08-28 09:16:25+ [-] while invoking > Traceback (most recent call last): File

2020-08-28 Thread STINNER Victor
STINNER Victor added the comment: I added the keepalive_interval=60 parameter to Worker() in the server configuration: https://github.com/python/buildmaster-config/commit/2d28a4cfe77a3e206028613524a1e938801a1655
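For readers unfamiliar with the setting, a `master.cfg` fragment along these lines shows where such a parameter goes. It is a sketch only, not the content of the linked commit: the worker name and password are placeholders, and it assumes the `buildbot` package is installed. `keepalive_interval` is given in seconds.

```python
# master.cfg fragment (sketch); requires the buildbot package.
from buildbot.plugins import worker

c = BuildmasterConfig = {}

# "example-worker" and "example-password" are placeholders, not real
# credentials from the python.org buildmaster configuration.
c["workers"] = [
    # keepalive_interval=60 asks for a keepalive every 60 seconds,
    # instead of relying on the library's default interval.
    worker.Worker("example-worker", "example-password", keepalive_interval=60),
]
```

The idea is to keep some traffic flowing on the connection so that an intermediary which drops idle TCP connections has less reason to close it.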

2020-08-28 Thread STINNER Victor
STINNER Victor added the comment:
> The buildbot server migrated to a new machine and is now behind a load balancer. tcp/80 (buildbot web page, HTTP) and tcp/9020 (used by buildbot workers) are both behind the load balancer.
> (...)
> Buildbot workers have a TCP keepalive option of 1

2020-08-27 Thread STINNER Victor
STINNER Victor added the comment: 27 buildbot workers are detached. They are detached every minute! Affected workers: aixtools-aix-power6 billenstein-macos bolen-ubuntu bolen-windows10 bolen-windows7 cstratak-RHEL7-aarch64 cstratak-RHEL7-ppc64le cstratak-RHEL7-x86_64

2020-08-27 Thread STINNER Victor
STINNER Victor added the comment: I closed bpo-41648 "edelsohn-* buildbot worker failing with: No space left on device" as a duplicate of this issue.