Bug#930728: pg_cltlcluster leaks logfile descriptor into PostgreSQL backends
Hi! > 26 июня 2019 г., в 19:22, Christoph Berg написал(а): > > Re: Andrey Borodin 2019-06-19 > <4178a7b3-b742-49d7-afa9-fcb3439a0...@yandex-team.ru> >> On my observation - yes, descriptor 4 is no longer in lsof. > > Ah, I had not noticed the "4" before. Only 1 and 2 are expected. > We can simply close() that one after dup2()'ing it to stdout/stderr. > > Thanks for spotting! Thanks for pushing fix for this. Maybe we can somehow get rid of 1 and 2 attached to logfile (that will be logrotated and deleted)? Can we redirect output to some temporary file and after pg_ctl append this file to logfile? I simply want to be able to delete this file via logrotate... Best regards, Andrey Borodin.
Bug#930728: pg_cltlcluster leaks logfile descriptor into PostgreSQL backends
> 19 июня 2019 г., в 21:05, Andrey Borodin написал(а): > > But I've checked only on Yandex.Cloud clusters. I think I'll do repro on > fresh Ubuntu... So, I've tried the fix on Ubuntu 18. There are much more leaked descriptors in this case and fix helps only partially postgres@x4m-pg-test:/home/x4mmm$ pg_ctlcluster 10 main start postgres@x4m-pg-test:/home/x4mmm$ lsof | grep 'postgresql-10-main.log' | grep postgres | grep '/var/log' postgres 4715 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4715 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4715 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4715 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4717 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4717 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4717 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4717 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4718 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4718 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4718 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4718 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4719 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4719 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4719 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4719 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4720 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4720 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4720 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4720 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4721 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4721 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4721 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4721 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4722 postgres1w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4722 postgres2w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4722 postgres4w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log postgres 4722 postgres5w REG 252,1 1332 256830 /var/log/postgresql/postgresql-10-main.log fixing pg_ctlcluster manually postgres@x4m-pg-test:/usr/bin$ diff pg_ctlcluster pg_ctlcluster_copy 208,213c208,210 < #my $output = $info{'logfile'} && -r $info{'logfile'} ? $info{'logfile'} : '/dev/null'; < #dup2(POSIX::open($output, POSIX::O_WRONLY | POSIX::O_APPEND | POSIX::O_CREAT), 1); < #dup2(POSIX::open($output, POSIX::O_WRONLY | POSIX::O_APPEND | POSIX::O_CREAT), 2); < < open(STDOUT, ">>$info{logfile}") or die $!; < open(STDERR, '>') or die $!; --- > my $output = $info{'logfile'} && -r $info{'logfile'} ? > $info{'logfile'} : '/dev/null'; > dup2(POSIX::open($output, POSIX::O_WRONLY | POSIX::O_APPEND | > POSIX::O_CREAT), 1); > dup2(POSIX::open($output, POSIX::O_WRONLY | POSIX::O_APPEND | > POSIX::O_CREAT), 2); postgres@x4m-pg-test:/usr/bin$
Bug#930728: pg_cltlcluster leaks logfile descriptor into PostgreSQL backends
> 19 июня 2019 г., в 20:57, Christoph Berg написал(а): > > Control: tags -1 moreinfo > > Re: Andrey Borodin 2019-06-19 > <3fd6aec7-55c8-4129-a46c-0bb3e4296...@yandex-team.ru> >> Package: postgresql-common >> Version: 199 >> >> When I execute >> service postgresql start >> I observe that logfile descriptor will leak into postmaster and every >> backend. This is not fd opened by PG's syslogger. > > That's because we use logging to stderr. > >> -my $fd = POSIX::open($info{'logfile'}, >> POSIX::O_WRONLY|POSIX::O_APPEND|POSIX::O_CREAT) or error "Could not open >> logfile $info{'logfile'}"; >> -dup2($fd, 1); >> -dup2($fd, 2); >> +open(STDOUT, ">>$info{logfile}") or die $!; >> +open(STDERR, '>') or die $!; > > Does that actually do things differently? Does stderr logging still > work with that patch? On my observation - yes, descriptor 4 is no longer in lsof. I looked that I see some lines of pg_ctl output like "output will be redirected etcetc" in pglog. But not from stderr. But I've checked only on Yandex.Cloud clusters. I think I'll do repro on fresh Ubuntu... I'm not an expert in Perl, these two lines were my first experience, and even in them I needed a lot of help from colleagues with Perl background. Best regards, Andrey Borodin.
Bug#930728: pg_cltlcluster leaks logfile descriptor into PostgreSQL backends
Package: postgresql-common Version: 199 When I execute service postgresql start I observe that logfile descriptor will leak into postmaster and every backend. This is not fd opened by PG's syslogger. root@rc1b-tl7je3ulcm4z9v3c ~ # lsof | grep 'postgresql-10-data.log' | grep postgres | grep '/var/log' postgres 28 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290011 postgres1w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290011 postgres2w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290011 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290011 postgres 13w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290013 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290014 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290015 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290016 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290017 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290018 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290019 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290020 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290036 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290211 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290256 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290257 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log postgres 290258 postgres4w REG 253,1 103172283 131900 /var/log/postgresql/postgresql-10-data.log When logfile is rotated, descriptor remain to be open, but occupied space is not accounted by df. This leads to incorrect free space estimation on cluster. When logfile is big enough during cluster start this can result in sudden cluster stop, which could be prevented by monitoring. If I apply following patch descriptor stops leaking. I use Ubuntu. Description:Ubuntu 18.04.2 LTS Release:18.04 Codename: bionic Best regards, Andrey Borodin. diff --git a/pg_ctlcluster b/pg_ctlcluster index 1690d3d..6039659 100755 --- a/pg_ctlcluster +++ b/pg_ctlcluster @@ -207,9 +207,8 @@ sub start { } else { setsid or error "could not start session: $!"; if ($info{'logfile'}) { -my $fd = POSIX::open($info{'logfile'}, POSIX::O_WRONLY|POSIX::O_APPEND|POSIX::O_CREAT) or error "Could not open logfile $info{'logfile'}"; -dup2($fd, 1); -dup2($fd, 2); +open(STDOUT, ">>$info{logfile}") or die $!; +open(STDERR, '>') or die $!; } exec $pg_ctl @options or error "could not exec $pg_ctl @options: $!"; }