On Tue, Sep 10, 2019 at 06:12:24PM +0000, Eric Wong wrote:
> > It does seem like there's perhaps a leak somewhere?
>
> Probably. Not seeing any of that on my (smaller) instances;
> but those -httpd haven't been restarted in weeks/months.
>
> The "PerlIO_" prefix is created from open(..., '+>', undef), so
> it either has to be for:
>
> 1. POST bodies for git http-backend
>
> 2. git cat-file --batch-check stderr
>
> 3. experimental ViewVCS|SolverGit which isn't configured on lore :)
>
> = git-http-backend
>
> Looking at the size (984), PerlIO_* prefix and proximity of FD
> numbers, I think those lines above are git-http-backend POST body.

Pretty sure that's the culprit. This is how we replicate from
lore.kernel.org to erol.kernel.org:

- once a minute, two nodes that are behind erol.kernel.org grab the
  newest manifest.js.gz
- if there are changes, each updated repository is pulled from
  lore.kernel.org, so if there were 5 repository updates, there would
  be 10 "git pull" requests

I switched the replication nodes to pull once every 5 minutes instead
of once every minute, and I see a direct correlation between when those
processes run and the number of broken pipes and
"/tmp/PerlIO_* (deleted)" entries showing up and hanging around. Not
every run produces these, but the spikes come at roughly 5-minute
intervals.

On the first run after a public-inbox-httpd restart, the correlation is
direct. This is from one of the mirroring nodes:

[82807] 2019-09-11 09:30:02,044 - INFO - Updating 18 repos from https://lore.kernel.org

This is on lore.kernel.org after the run is completed:

# ls -al /proc/{16212,16213,16214,16215}/fd | grep deleted | wc -l
36

> Any git-http-backend stuck from people fetching/cloning?

No, all git processes seem to exit cleanly on both ends.

> This is -httpd writing to varnish, still, right?

We bypass varnish for git requests, since this is not generally useful.
Nginx goes straight to public-inbox-httpd for those.

I did run some updates on lore.kernel.org on Thursday, including kernel
(3.10.0-957.27.2), nginx (1.16.1) and public-inbox updates. For the
latter, it went from f4f0a3be to what was latest master at the time
(d327141c).

Hope this helps, and thanks for your help!

-K
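
For anyone who wants to track the count per worker rather than as one
total, here is a rough Python sketch of the same check as the ls/grep
pipeline above. It is only an illustration, not something running on
lore: the function names are made up for this example, it is
Linux-only (it reads /proc/<pid>/fd), and it assumes the leaked files
always show up as "/tmp/PerlIO_<suffix> (deleted)".

```python
import os
import re
import sys

# Assumed shape of the readlink() target for an unlinked PerlIO temp file,
# e.g. "/tmp/PerlIO_a1B2c3 (deleted)".
DELETED_PERLIO = re.compile(r"^/tmp/PerlIO_\S+ \(deleted\)$")


def count_deleted_perlio(targets):
    """Count link targets that look like unlinked PerlIO temp files."""
    return sum(1 for target in targets if DELETED_PERLIO.match(target))


def fd_targets(pid):
    """Return the symlink targets found in /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def leaked_per_pid(pids):
    """Map each PID to its number of deleted PerlIO temp files."""
    return {pid: count_deleted_perlio(fd_targets(pid)) for pid in pids}


if __name__ == "__main__":
    # Hypothetical usage: pass the -httpd worker PIDs as arguments.
    counts = leaked_per_pid(int(arg) for arg in sys.argv[1:])
    for pid, count in sorted(counts.items()):
        print(f"{pid}: {count}")
    print(f"total: {sum(counts.values())}")
```

Run as root with the worker PIDs as arguments (for example 16212 16213
16214 16215) before and after a replication run. Unlike the grep above,
it ignores deleted files that are not PerlIO temp files, so the totals
may differ slightly.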