Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker bada...@gmail.com writes: On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, Â Â Â Â -DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. Applied with some editorialization. I concluded that it'd be better to put the oom_adj reset right into fork_process, rather than scattering the support across several different files. The latter seems vulnerable to errors of omission in future versions, and there's no really strong reason to not have all the child processes behave the same. Also, a single-file patch is a lot easier for packagers to borrow and apply to existing releases, should they choose to. (Already done and tested in Fedora packages ...) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On fre, 2010-01-08 at 11:37 -0500, Tom Lane wrote: Alex Hunsaker bada...@gmail.com writes: On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, -DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. Anybody have an objection to this basic approach? I'm in a bit of a hurry to get something like this into the Fedora RPMs, so barring objections I'm going to review this, commit it into HEAD, and then make a back-ported patch I can use with 8.4 in Fedora. I find this whole approach a bit evil. If word of this gets out, every server process on Linux will excuse itself from the OOM killer. And then the kernel guys will add another setting to override the process preference. It's an arms race, but maybe that's what's needed. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Sat, Jan 9, 2010 at 14:06, Peter Eisentraut pete...@gmx.net wrote: I find this whole approach a bit evil. I would tend to agree but this type of thing has been known since about 2004... See http://thoughts.j-davis.com/2009/11/29/linux-oom-killer/, particularly the comment from Greg Smith. If word of this gets out, every server process on Linux will excuse itself from the OOM killer. And then the kernel guys will add another setting to override the process preference. Yes, and note debian is already doing that with things like ssh. Who knows what else. (Id be curious to know) Plus maybe it will convince them its time to fix the damn thing. Although postgres really is kind of special in this regard. All the other daemons on my system include X had way lower oom scores. Alsamixer was 3 times more likely to get killed than the first daemon with the highest score (hald) while postgres was 55 times more likely. Yes its the kernel being stupid, but its been known for more than 6 years... (oom scores: alsamxier: 1497, hald: 487, postgres: 26558) It's an arms race, but maybe that's what's needed. Well *shrug* regardless of what core does... Ill certainly be doing it on my postgres linux builds :) Maybe it would convince them more if we could get distros to accept patches that fix the kernel to do correct/better shared mem accounting? May I add good luck? :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Peter Eisentraut wrote: On fre, 2010-01-08 at 11:37 -0500, Tom Lane wrote: Alex Hunsaker bada...@gmail.com writes: On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, -DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. Anybody have an objection to this basic approach? I'm in a bit of a hurry to get something like this into the Fedora RPMs, so barring objections I'm going to review this, commit it into HEAD, and then make a back-ported patch I can use with 8.4 in Fedora. I find this whole approach a bit evil. If word of this gets out, every server process on Linux will excuse itself from the OOM killer. And then the kernel guys will add another setting to override the process preference. It's an arms race, but maybe that's what's needed. The trouble is that the OOM heuristics are pretty bad, and many Linux hackers aren't interested in improving them. One of the most prominent told me some years ago Just turn it off. And the point of this patch is to allow the postmaster to *remove* OOM protection from normal postgres backends. We at least would be playing nice, and not engaging in an arms race. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker bada...@gmail.com writes: On Sat, Jan 9, 2010 at 14:06, Peter Eisentraut pete...@gmx.net wrote: If word of this gets out, every server process on Linux will excuse itself from the OOM killer. Â And then the kernel guys will add another setting to override the process preference. ... maybe it will convince them its time to fix the damn thing. Although postgres really is kind of special in this regard. Yeah. If they had saner handling of shared-memory accounting, maybe there wouldn't be a need for us to kluge around the OOM logic. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 04:46, Alex Hunsaker bada...@gmail.com wrote: On Thu, Jan 7, 2010 at 20:26, Tom Lane t...@sss.pgh.pa.us wrote: Alex Hunsaker bada...@gmail.com writes: We can either drop this in core (with a lot of #ifdef LINUX added) Any thoughts on doing something like (in fork_process.c) #ifdef LINUX void oom_adjust() { ... } #else void oom_adjust() {} #endif So there is only one #ifdef? It still leaves the ugly calls to the function... Seems like a much better way, yes. Especially if we in the future want to do this for more than one platform (if it becomes necessary). or expect Linux packagers to carry it as a patch. Given that the packagers would also have to modify their init scripts to go with, the patch route is not unreasonable. Comments? Id plus +1 for core. The problem certainly does not look to be going away soon (if ever). Yeah, I think core is better. It's not like it's enough code to cause a huge maintenance problem, I think. Do we need to make the value configurable? I'd certainly find it interesting to set backends to say 5 or something like that, that makes them less likely to be killed than any old oops opened too big file in an editor-process, but still possible to kill if the system is *really* running out of memory. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander mag...@hagander.net writes: Do we need to make the value configurable? I'd certainly find it interesting to set backends to say 5 or something like that, that makes them less likely to be killed than any old oops opened too big file in an editor-process, but still possible to kill if the system is *really* running out of memory. I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) What I suggest is that we do something like #ifdef LINUX_OOM_ADJ ... fprintf(oom, %d\n, LINUX_OOM_ADJ); ... #endif Then, somebody who wants the feature would build with, say, -DLINUX_OOM_ADJ=0 or another value if they want that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker wrote: On Thu, Jan 7, 2010 at 20:26, Tom Lane t...@sss.pgh.pa.us wrote: Alex Hunsaker bada...@gmail.com writes: We can either drop this in core (with a lot of #ifdef LINUX added) Any thoughts on doing something like (in fork_process.c) #ifdef LINUX void oom_adjust() { ... } #else void oom_adjust() {} #endif So there is only one #ifdef? It still leaves the ugly calls to the function... The usual solution for this kind of thing is: #ifdef LINUX #define OOM_ADJUST oom_adjust() #else #define OOM_ADJUST do {} while (0) #endif so there is no call or dummy function and you reference it in the code as: OOM_ADJUST; -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 07:53, Bruce Momjian br...@momjian.us wrote: Alex Hunsaker wrote: On Thu, Jan 7, 2010 at 20:26, Tom Lane t...@sss.pgh.pa.us wrote: Alex Hunsaker bada...@gmail.com writes: The usual solution for this kind of thing is: #ifdef LINUX #define OOM_ADJUST oom_adjust() #else #define OOM_ADJUST do {} while (0) #endif so there is no call or dummy function and you reference it in the code as: Surely any compiler worth its salt would turn a call to an empty void function into a noop? Then again maybe I just hate macros :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, -DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. It sets oom_adj for: autovacuum workers archivers (pgarch.c) regular backends Also it updates the contrib linux starup script to start under an oom_adj of -17 Comments? *** a/contrib/start-scripts/linux --- b/contrib/start-scripts/linux *** *** 53,58 DAEMON=$prefix/bin/postmaster --- 53,63 # What to use to shut down the postmaster PGCTL=$prefix/bin/pg_ctl + # Adjust oom_adj on linux to avoid the postmaster from be killed + # note you probably want to compile postgres with -DLINUX_OOM_ADJ=0 + # so that regular backends will be killed on oom + OOM_ADJ=-17 + set -e # Only start if we can find the postmaster. *** *** 62,67 test -x $DAEMON || exit 0 --- 67,73 case $1 in start) echo -n Starting PostgreSQL: + echo $OOM_ADJ /proc/self/oom_adj su - $PGUSER -c $DAEMON -D '$PGDATA' $PGLOG 21 echo ok ;; *** a/src/backend/postmaster/autovacuum.c --- b/src/backend/postmaster/autovacuum.c *** *** 1403,1408 StartAutoVacWorker(void) --- 1403,1411 /* Lose the postmaster's on-exit routines */ on_exit_reset(); + /* allow us to be killed on oom */ + oom_adjust(); + AutoVacWorkerMain(0, NULL); break; #endif *** a/src/backend/postmaster/fork_process.c --- b/src/backend/postmaster/fork_process.c *** *** 66,68 fork_process(void) --- 66,97 } #endif /* ! WIN32 */ + + #if defined(__linux__) defined(LINUX_OOM_ADJ) + /* + * By default linux really likes to kill the postmaster on oom. + * Because linux does not take SYSV shared mem into account it + * (almost) always will SIGKILL the postmaster on an oom event. + * + * In the event you started the postmaster under a low (negative) + * oom_adj. This will adjust regular backends, autovac and archivers + * to LINUX_OOM_ADJ on fork(). Allowing them to be killed in an oom + * event. + * + * Later we might add other OS oom type stuff here. + */ + void + oom_adjust(void) + { + FILE *oom = fopen(/proc/self/oom_adj, w); + + /* ignore errors we dont really care */ + if (oom) + { + fprintf(oom, %d\n, LINUX_OOM_ADJ); + fclose(oom); + } + } + #else + void oom_adjust(void) { } + #endif *** a/src/backend/postmaster/pgarch.c --- b/src/backend/postmaster/pgarch.c *** *** 170,175 pgarch_start(void) --- 170,178 /* Drop our connection to postmaster's shared memory, as well */ PGSharedMemoryDetach(); + /* allow us to be killed on oom */ + oom_adjust(); + PgArchiverMain(0, NULL); break; #endif *** a/src/backend/postmaster/postmaster.c --- b/src/backend/postmaster/postmaster.c *** *** 3076,3081 BackendStartup(Port *port) --- 3076,3084 /* Perform additional initialization and collect startup packet */ BackendInitialize(port); + /* allow us to be killed on oom */ + oom_adjust(); + /* And run the backend */ proc_exit(BackendRun(port)); } *** a/src/include/postmaster/fork_process.h --- b/src/include/postmaster/fork_process.h *** *** 13,17 --- 13,18 #define FORK_PROCESS_H extern pid_t fork_process(void); + extern void oom_adjust(void); #endif /* FORK_PROCESS_H */ *** a/contrib/start-scripts/linux --- b/contrib/start-scripts/linux *** *** 53,58 DAEMON=$prefix/bin/postmaster --- 53,63 # What to use to shut down the postmaster PGCTL=$prefix/bin/pg_ctl + # Adjust oom_adj on linux to avoid the postmaster from be killed + # note you probably want to compile postgres with -DLINUX_OOM_ADJ=0 + # so that regular backends will be killed on oom + OOM_ADJ=-17 + set -e # Only start if we can find the postmaster. *** *** 62,67 test -x $DAEMON || exit 0 --- 67,73 case $1 in start) echo -n Starting PostgreSQL: + echo $OOM_ADJ /proc/self/oom_adj su - $PGUSER -c $DAEMON -D '$PGDATA' $PGLOG 21 echo ok ;; *** a/src/backend/postmaster/autovacuum.c --- b/src/backend/postmaster/autovacuum.c *** *** 1403,1408 StartAutoVacWorker(void) --- 1403,1411 /* Lose the postmaster's on-exit routines */ on_exit_reset(); + /* allow us to be killed on oom */ + oom_adjust(); + AutoVacWorkerMain(0, NULL); break; #endif *** a/src/backend/postmaster/fork_process.c --- b/src/backend/postmaster/fork_process.c *** *** 66,68 fork_process(void) --- 66,97 } #endif /* ! WIN32 */ + + #if defined(__linux__) defined(LINUX_OOM_ADJ) + /* + * By default linux really likes to kill the postmaster on oom. + * Because linux does not take SYSV shared mem into
Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker bada...@gmail.com writes: On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, Â Â Â Â -DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. Anybody have an objection to this basic approach? I'm in a bit of a hurry to get something like this into the Fedora RPMs, so barring objections I'm going to review this, commit it into HEAD, and then make a back-ported patch I can use with 8.4 in Fedora. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Tom Lane wrote: Alex Hunsaker bada...@gmail.com writes: On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, ?? ?? ?? ??-DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. Anybody have an objection to this basic approach? I'm in a bit of a hurry to get something like this into the Fedora RPMs, so barring objections I'm going to review this, commit it into HEAD, and then make a back-ported patch I can use with 8.4 in Fedora. Go for it, but FYI, we need to udpate the our OOM documentation mention to reflect this change. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
* Magnus Hagander (mag...@hagander.net) wrote: Do we need to make the value configurable? I'd certainly find it interesting to set backends to say 5 or something like that, that makes them less likely to be killed than any old oops opened too big file in an editor-process, but still possible to kill if the system is *really* running out of memory. We do need to make it configurable, in at least the sense of being able to control if it's done or not. There are some environments where you won't be able to set it. Perhaps just handling failure gracefully would work, but I'd be happier if you could just disable it in the config file. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Setting oom_adj on linux?
* Tom Lane (t...@sss.pgh.pa.us) wrote: I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) Requiring a custom build to disable it would be horrible, in my view. Or, at best, just means that the packagers won't enable it, which obviously would be less than ideal. Sorry if it's a pain, but I think it needs to either be configurable or not done. As I said before, it definitely needs to handle failure gracefully, but I worry that even that won't be sufficient in some cases. Just thinking about how we run PG under VServers and Linux Containers and whatnot, we ran into some issues with OpenSSH trying to monkey with proc values and I'd really hate to run into the same issues with PG. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 10:07, Stephen Frost sfr...@snowman.net wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) Requiring a custom build to disable it would be horrible, in my view. Or, at best, just means that the packagers won't enable it, which obviously would be less than ideal. FWIW I agree. Sorry if it's a pain, but I think it needs to either be configurable or not done. As I said before, it definitely needs to handle failure gracefully, but I worry that even that won't be sufficient in some cases. Just thinking about how we run PG under VServers and Linux Containers and whatnot, we ran into some issues with OpenSSH trying to monkey with proc values and I'd really hate to run into the same issues with PG. As long as the VM/container you are running under wont kill postmaster for trying to access proc-- the patch I posted should work fine. It just ignores any error (I assumed you might be running in a chroot without proc or some such). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
* Tom Lane (t...@sss.pgh.pa.us) wrote: Alex Hunsaker bada...@gmail.com writes: On Fri, Jan 8, 2010 at 07:27, Tom Lane t...@sss.pgh.pa.us wrote: Then, somebody who wants the feature would build with, say, -DLINUX_OOM_ADJ=0 or another value if they want that. Here is a stab at that. Anybody have an objection to this basic approach? I'm in a bit of a hurry to get something like this into the Fedora RPMs, so barring objections I'm going to review this, commit it into HEAD, and then make a back-ported patch I can use with 8.4 in Fedora. Whoah, I would caution against doing that without being very confident it won't break when installed under things like VServer, Linux Containers, SELinux configurations, etc, when back-porting it. I don't expect people would be too pleased to discover their nice, simple, no-expected-issues upgrade of a minor point release to all of a sudden mean their database doesn't start anymore.. Sorry I havn't got time right now to run down the issue I had with OpenSSH doing something similar, and it might have just been poor coding in OpenSSH, but I wanted to voice my concern. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Setting oom_adj on linux?
Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) Requiring a custom build to disable it would be horrible, in my view. Or, at best, just means that the packagers won't enable it, which obviously would be less than ideal. I'm a packager, and I think that this approach is perfectly fine. The place where the rubber meets the road is in the init script, which is the packager's responsibility. If the packager is going to provide an init script that sets oom_adj in the first place, he can turn on the compensation code inside the binary. If not, the compensation code has no purpose anyhow. There are no moving parts in this as far as the end user is concerned. Sorry if it's a pain, but I think it needs to either be configurable or not done. As I said before, it definitely needs to handle failure gracefully, We just ignore any error from the attempt to write to /proc. but I worry that even that won't be sufficient in some cases. Just thinking about how we run PG under VServers and Linux Containers and whatnot, I think you are missing the point that the code won't even be compiled except on platforms where the packager has determined that it's sensible to have it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Alex, * Alex Hunsaker (bada...@gmail.com) wrote: As long as the VM/container you are running under wont kill postmaster for trying to access proc-- the patch I posted should work fine. It just ignores any error (I assumed you might be running in a chroot without proc or some such). As I recall, oom_adj wasn't visible in the container because you explicitly set what proc entries processes can have access to when using VServers, and OpenSSH didn't handle that cleanly. Guess what I'm just saying is don't just assume everything is as it would be on a 'normal' system when dealing with /proc and friends, and, of course, test, test, test when you're talking about back-porting things. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Setting oom_adj on linux?
Tom Lane wrote: I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) What I suggest is that we do something like #ifdef LINUX_OOM_ADJ ... fprintf(oom, %d\n, LINUX_OOM_ADJ); ... #endif Then, somebody who wants the feature would build with, say, -DLINUX_OOM_ADJ=0 or another value if they want that. +1 for this. Looks like a sound approach. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 12:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) Requiring a custom build to disable it would be horrible, in my view. Or, at best, just means that the packagers won't enable it, which obviously would be less than ideal. I'm a packager, and I think that this approach is perfectly fine. The place where the rubber meets the road is in the init script, which is the packager's responsibility. If the packager is going to provide an init script that sets oom_adj in the first place, he can turn on the compensation code inside the binary. If not, the compensation code has no purpose anyhow. There are no moving parts in this as far as the end user is concerned. There could well be moving parts if the user wants to adjust the value being written to oom_adj, and can't because it's compiled in. I don't see why we can't just add a GUC for this and be done with it. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: I don't want to go to the trouble of creating (and documenting) a configure option for this. Much less a GUC ;-) Requiring a custom build to disable it would be horrible, in my view. BTW, maybe you're confused about the intention here? You'd need a custom build to *enable* it, not to disable it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Robert Haas robertmh...@gmail.com writes: There could well be moving parts if the user wants to adjust the value being written to oom_adj, and can't because it's compiled in. I don't see why we can't just add a GUC for this and be done with it. The number of users who will want to do that might be different from epsilon, but not by enough to justify a GUC. Furthermore, we don't have any reasonable infrastructure for supporting platform-specific GUCs, which means that the amount of effort you're proposing is extremely large. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 10:24, Stephen Frost sfr...@snowman.net wrote: As I recall, oom_adj wasn't visible in the container because you explicitly set what proc entries processes can have access to when using VServers, and OpenSSH didn't handle that cleanly. Guess what I'm just saying is don't just assume everything is as it would be on a 'normal' system when dealing with /proc and friends, and, of course, test, test, test when you're talking about back-porting things. Sure this was openssh? I just looked through the entire cvs history for opensshp and found 0 references to 'oom' let alone 'oom_adj'. Maybe something distro specific? [ for the curious here is what I tried ] $ git clone git://git.infradead.org/openssh.git $ cd openssh $ git grep oom_adj $ git grep 'oom' $ git grep oom | grep -vi loomis | grep -v room $ git log -p | grep oom | grep -vi loomis | grep -v room | grep -v tsoome -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker bada...@gmail.com writes: Sure this was openssh? I just looked through the entire cvs history for opensshp and found 0 references to 'oom' let alone 'oom_adj'. Maybe something distro specific? FWIW, I see no evidence that sshd on Fedora does anything to change its oom score --- the oom_adj file reads as zero for both the parent daemon and its children. Kinda scary to realize the OOM killer could easily lock me out of boxes I run headless. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 12:45, Tom Lane t...@sss.pgh.pa.us wrote: Alex Hunsaker bada...@gmail.com writes: Sure this was openssh? I just looked through the entire cvs history for opensshp and found 0 references to 'oom' let alone 'oom_adj'. Maybe something distro specific? FWIW, I see no evidence that sshd on Fedora does anything to change its oom score --- the oom_adj file reads as zero for both the parent daemon and its children. Kinda scary to realize the OOM killer could easily lock me out of boxes I run headless. [ OT, CC trimmed] Yeah, for me sshd has a score of 24 and has the 13th lowest oom_score on my box. While postgres with 0 connections and shared_buffers = 28MB has a score of 26558 and has the 5th highest oom_score. Only chromium and xmonad are above it. With 5 connections it just about doubles its score (to 47238). Had I been headless postgres would certainly die on oom. But even something like alsamixer or bash is higher than sshd *shrug*. For the curious here below is the raw data and how i generated it [ yes its far from perfect... ]: (for file in /proc/*/; do echo `cat $file/oom_score` `cat $file/cmdline` $file; done) | sort -n 6 /sbin/agetty-838400tty1linux /proc/2204/ 6 /sbin/agetty-838400tty2linux /proc/1977/ 6 /sbin/agetty-838400tty3linux /proc/1978/ 6 /usr/sbin/crond /proc/1954/ 6 /usr/sbin/uptimed /proc/1971/ 10 /usr/sbin/irqbalance /proc/1951/ 12 /usr/sbin/ntpd-s /proc/1967/ 12 /usr/sbin/smartd /proc/2128/ 13 hald-addon-input: Listening on /dev/input/event2 /dev/input/event1 /proc/2044/ 13 hald-addon-storage: polling /dev/sr0 (every 2 sec) /proc/2088/ 13 /usr/lib/hal/hald-addon-cpufreq /proc/2092/ 19 /usr/sbin/syslog-ng /proc/1639/ 24 /usr/sbin/sshd /proc/1943/ 26 /usr/lib/sa/sadc-F-L-SDISK6006- /proc/2716/ 27 supervising syslog-ng /proc/1638/ 38 hald-runner /proc/1992/ 53 cat/proc/self//cmdline /proc/self/ 64 /usr/lib/postfix/master /proc/2089/ 90 /usr/bin/X-nolistentcp /proc/2173/ 102 -bash /proc/10997/ 108 /home/alex/.cabal/bin/xmobar-x1 /proc/2190/ 140 /usr/bin/dbus-daemon--fork--print-pid5--print-address7--session /proc/2256/ 140 /usr/bin/dbus-daemon--system /proc/1983/ 191 /usr/lib/hal/hald-addon-acpi /proc/2093/ 194 dbus-launch--autolaunch004d7f457c373938f22d796a4ae05b60--binary-syntax--close-stderr /proc/2255/ 199 /usr/sbin/ntpd-s /proc/1960/ 206 ssh-agentxmonad /proc/2188/ 287 tail-f/var/log/httpd/error.log /proc/8688/ 310 firefox /proc/18571/ 339 xbindkeys /proc/2192/ 341 -bash /proc/10589/ 354 /usr/local/bin/cmus /proc/2205/ 400 /bin/sh/usr/bin/startx /proc/2155/ 456 sort-n /proc/10998/ 487 /usr/sbin/hald /proc/1991/ 525 qmgr-l-tfifo-u /proc/2097/ 639 /bin/sh/home/alex/.xinitrc /proc/2180/ 736 /usr/lib/GConf/gconfd-2 /proc/18578/ 861 tail-f/var/log/httpd/error.log /proc/1621/ 1122 urxvtd-q-f-o /proc/2189/ 1151 /usr/lib/chromium/chromium /proc/2220/ 1172 -bash /proc/22276/ 1351 mutt-y /proc/27637/ 1497 alsamixer /proc/3528/ 1528 ssh192.168.0.15 /proc/1523/ 1534 -bash /proc/2863/ 1534 -bash /proc/2881/ 1534 -bash /proc/2891/ 1793 /usr/lib/chromium/chromium /proc/2219/ 1828 ssh70.98.186.4 /proc/2860/ 2066 pickup-l-tfifo-u /proc/26254/ 2195 postgres: stats collector process /proc/10602/ 2281 -bash /proc/3521/ 2299 -bash /proc/1516/ 2447 -bash /proc/23990/ 2573 -bash /proc/1538/ 3082 /usr/lib/chromium/chromium --channel=2219.aa9eb00.196295212 --type=renderer --lang=en-US --force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/CacheSize/CacheSizeGroup_6/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/2392/ 3305 -bash /proc/20490/ 3796 /usr/sbin/httpd-kstart /proc/1553/ 4697 /usr/bin/knotify4 /proc/31084/ 6117 /usr/lib/chromium/chromium --channel=2219.afcb450.251437212 --type=renderer --lang=en-US --force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/CacheSize/CacheSizeGroup_6/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/18311/ 6564 /proc/self/exe--channel=2219.a684ce8.115295409--type=extension--lang=en-US--force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/2242/ 6921 /proc/self/exe--channel=2219.a6862e0.261342415--type=extension--lang=en-US--force-fieldtest=AsyncSlowStart/_AsyncSlowStart_off/DnsImpact/_max_500ms_queue_prefetch/GlobalSdch/_global_disable_sdch/SocketLateBinding/_enable_late_binding/ /proc/2245/ 7453 /proc/self/exe--type=plugin--plugin-path=/usr/lib/mozilla/plugins/libflashplayer.so--lang=en-US--plugin-data-dir=/home/alex/.config/chromium/Default--channel=2219.afd16a58.340779570 /proc/18316/ 10132 -bash /proc/2195/ 10154 postgres: wal writer process /proc/10600/ 10154 postgres: writer process /proc/10599/ 10299 postgres: autovacuum launcher process /proc/10601/ 11019 init [3] /proc/1/ 17332 /usr/sbin/httpd-kstart /proc/1616/ 17365 /usr/sbin/httpd-kstart /proc/1611/ 17366 /usr/sbin/httpd-kstart
Re: [HACKERS] Setting oom_adj on linux?
* Tom Lane (t...@sss.pgh.pa.us) wrote: Alex Hunsaker bada...@gmail.com writes: Sure this was openssh? I just looked through the entire cvs history for opensshp and found 0 references to 'oom' let alone 'oom_adj'. Maybe something distro specific? FWIW, I see no evidence that sshd on Fedora does anything to change its oom score --- the oom_adj file reads as zero for both the parent daemon and its children. Kinda scary to realize the OOM killer could easily lock me out of boxes I run headless. There were a few issues, as it turns out, the particularly annoying one was in the init script which caused upgrades to fail due to sshd not being restarted, bug report here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=473573 The other issue was with a Debian-specific patch which was applied to OpenSSH which basically just created noise in the log file, bug report here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=487325 In the end, the problem was with errors being returned from attempts to modify oom_adj. As long as we can just ignore those hopefully there won't be any issues. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Setting oom_adj on linux?
Stephen Frost sfr...@snowman.net writes: The other issue was with a Debian-specific patch which was applied to OpenSSH which basically just created noise in the log file, bug report here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=487325 Hmm, that's pretty interesting, specifically this: : After some discussion with the Linux-Vserver folks, they found some : interesting information I thought it worth adding. First EPERM is not : the error that they expected, and that inside a vserver guest its really : strict about what options you open it with, both O_CREAT and O_TRUNC are : forbidden, and O_WRONLY lets you write 0\n to it. That suggests it might be worth our trouble to use open/write rather than fopen, so that we can ensure the flags are correct to avoid this type of restriction. But in any case the main takeaway seems to be to not insist on the operation succeeding ;-) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Fri, Jan 8, 2010 at 15:24, Stephen Frost sfr...@snowman.net wrote: There were a few issues, as it turns out, the particularly annoying one was in the init script which caused upgrades to fail due to sshd not being restarted, bug report here: Thanks for the pointers! http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=473573 The changes I proposed to the example linux startup script wont suffer from that. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=487325 In the end, the problem was with errors being returned from attempts to modify oom_adj. As long as we can just ignore those hopefully there won't be any issues. Yep sounds good. Thanks again! Tom, sounds like you got busy with other stuff :) Should I submit a new patch that uses open and O_WRONLY? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker bada...@gmail.com writes: Tom, sounds like you got busy with other stuff :) Should I submit a new patch that uses open and O_WRONLY? No, I was just waiting to see if there were more comments. I can take it from here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Mon, Jan 4, 2010 at 09:55, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: Right. Which is why I like the idea of disabling the OOM killer for the *postmaster*, but not the regular backends. Gives it a chance to recover. It's not nice, but it's better than nothing. It doesn't sound like the init script can reenable the killer for the child processes though. So, if there's anything that the core code ought to do, is re-enable OOM-killer for postmaster children, after being disabled by the initscript. Exactly. FWIW here is the patch I run. Stupid as the patch may be, count it as a +1 for people in the field doing this. Hence a reason to think about doing something in core. maybe. This has some oddities like it does not reset oom to 0 for the (wal) writer process. Plus assuming you do oom, the stats collector has a good chance of being hit. Although normal backends will probably have a higher score. [ oom_adj gets set to -17 in the startup script. I run this on top of disabling overcommit, color me paranoid ] *** a/src/backend/postmaster/autovacuum.c --- b/src/backend/postmaster/autovacuum.c *** *** 362,367 StartAutoVacLauncher(void) --- 362,370 #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ + + oom_adjust(); + /* Close the postmaster's sockets */ ClosePostmasterPorts(false); *** a/src/backend/postmaster/fork_process.c --- b/src/backend/postmaster/fork_process.c *** *** 65,68 fork_process(void) --- 65,84 return result; } + void + oom_adjust(void) + { + /* adjust oom */ + FILE *oom = fopen(/proc/self/oom_adj, w); + + /* +* ignore errors we dont really care +*/ + if (oom) + { + fprintf(oom, 0\n); + fclose(oom); + } + } + #endif /* ! WIN32 */ *** a/src/backend/postmaster/pgarch.c --- b/src/backend/postmaster/pgarch.c *** *** 161,166 pgarch_start(void) --- 161,169 #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ + + oom_adjust(); + /* Close the postmaster's sockets */ ClosePostmasterPorts(false); *** a/src/backend/postmaster/pgstat.c --- b/src/backend/postmaster/pgstat.c *** *** 622,627 pgstat_start(void) --- 622,630 #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ + + oom_adjust(); + /* Close the postmaster's sockets */ ClosePostmasterPorts(false); *** a/src/backend/postmaster/postmaster.c --- b/src/backend/postmaster/postmaster.c *** *** 3056,3061 BackendStartup(Port *port) --- 3056,3063 { free(bn); + oom_adjust(); + /* * Let's clean up ourselves as the postmaster child, and close the * postmaster's listen sockets. (In EXEC_BACKEND case this is all *** a/src/backend/postmaster/syslogger.c --- b/src/backend/postmaster/syslogger.c *** *** 530,535 SysLogger_Start(void) --- 530,538 #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ + + oom_adjust(); + /* Close the postmaster's sockets */ ClosePostmasterPorts(true); *** a/src/include/postmaster/fork_process.h --- b/src/include/postmaster/fork_process.h *** *** 13,17 --- 13,18 #define FORK_PROCESS_H extern pid_t fork_process(void); + extern void oom_adjust(void); #endif /* FORK_PROCESS_H */ *** a/src/backend/postmaster/autovacuum.c --- b/src/backend/postmaster/autovacuum.c *** *** 362,367 StartAutoVacLauncher(void) --- 362,370 #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ + + oom_adjust(); + /* Close the postmaster's sockets */ ClosePostmasterPorts(false); *** a/src/backend/postmaster/fork_process.c --- b/src/backend/postmaster/fork_process.c *** *** 65,68 fork_process(void) --- 65,84 return result; } + void + oom_adjust(void) + { + /* adjust oom */ + FILE *oom = fopen(/proc/self/oom_adj, w); + + /* + * ignore errors we dont really care + */ + if (oom) + { + fprintf(oom, 0\n); + fclose(oom); + } + } + #endif /* ! WIN32 */ *** a/src/backend/postmaster/pgarch.c --- b/src/backend/postmaster/pgarch.c *** *** 161,166 pgarch_start(void) --- 161,169 #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ + + oom_adjust(); + /* Close the postmaster's sockets */ ClosePostmasterPorts(false); *** a/src/backend/postmaster/pgstat.c ---
Re: [HACKERS] Setting oom_adj on linux?
Alex Hunsaker bada...@gmail.com writes: FWIW here is the patch I run. Stupid as the patch may be, count it as a +1 for people in the field doing this. Hence a reason to think about doing something in core. maybe. Thanks for the patch --- it's certainly a fine starting point. We can either drop this in core (with a lot of #ifdef LINUX added) or expect Linux packagers to carry it as a patch. Given that the packagers would also have to modify their init scripts to go with, the patch route is not unreasonable. Comments? This has some oddities like it does not reset oom to 0 for the (wal) writer process. FWIW, I think that's probably a feature --- I'd vote for only resetting in regular backends and possibly autovac workers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Thu, Jan 7, 2010 at 20:26, Tom Lane t...@sss.pgh.pa.us wrote: Alex Hunsaker bada...@gmail.com writes: We can either drop this in core (with a lot of #ifdef LINUX added) Any thoughts on doing something like (in fork_process.c) #ifdef LINUX void oom_adjust() { ... } #else void oom_adjust() {} #endif So there is only one #ifdef? It still leaves the ugly calls to the function... or expect Linux packagers to carry it as a patch. Given that the packagers would also have to modify their init scripts to go with, the patch route is not unreasonable. Comments? Id plus +1 for core. The problem certainly does not look to be going away soon (if ever). This has some oddities like it does not reset oom to 0 for the (wal) writer process. FWIW, I think that's probably a feature --- I'd vote for only resetting in regular backends and possibly autovac workers. I think that makes sense +1. In-fact thats why the patch has it as a separate function instead of hacked into fork_process(). However its mainly odd because IIRC I greped for all instances of fork_process() and added the oom_adjusting to the callers. Given that it seems the wall writer procs should also be set to 0. My guess is its a race with my startup script launching postgres and then setting oom_adj. Or maybe I missed a caller? Maybe they don't use fork_process()? Ill check it out. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Setting oom_adj on linux?
I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/proc.txt;h=220cc6376ef80e0c9bcfec162d45552e729cdf5a;hb=45d28b097280a78893ce25a5d0db41e6a2717853 section 3.1. To get the best benefit, I think it needs to be done in cooperation between the startup scripts and PostgreSQL. We'd want -17 (never oom kill) on the postmaster, but some different value on regular backends (since if it has to kill someone, it's better to pick a regular backend so we can do a controlled restart). Only root can drop the value, so the startup script needs to be part of it. But if we then want to increase it for sub-processes, that'd require something in the backend itself... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Grr. I had zero recollectoin of that :S Can't find a useful consensus though? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Mon, 4 Jan 2010 16:57:28 +0100, Magnus Hagander mag...@hagander.net wrote: On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Grr. I had zero recollectoin of that :S Can't find a useful consensus though? I don't think we should set a setting like that automatically. Perhaps a warning on startup? Joshua D. Drake -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander wrote: On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Grr. I had zero recollectoin of that :S Can't find a useful consensus though? It is probably worth trying to protect the postmaster in the init script. Beyond that things probably start to get fairly difficult. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Mon, Jan 4, 2010 at 17:07, Andrew Dunstan and...@dunslane.net wrote: Magnus Hagander wrote: On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Grr. I had zero recollectoin of that :S Can't find a useful consensus though? It is probably worth trying to protect the postmaster in the init script. Beyond that things probably start to get fairly difficult. Right. But AFAICS (though I haven't tested with -17), it will become inherited to children, which is something we'd want to *undo*, no? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander wrote: On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Can't find a useful consensus though? In http://archives.postgresql.org/pgsql-hackers/2008-02/msg00049.php Tom points out that while you could make this adjustment in the init scripts for PostgreSQL, actually doing so is quite questionable as a packaging decision. That's where that thread ended as far as I was concerned. The best I think anyone could do here is to add such a capability into some of the init scripts, but it would probably need to be disabled by default. Since that sort of defeats the purpose of the change, I'm not sure what the benefit there is--if you have to turn it on, you might as well do something at a higher level instead. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.com
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander wrote: On Mon, Jan 4, 2010 at 17:07, Andrew Dunstan and...@dunslane.net wrote: Magnus Hagander wrote: On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Grr. I had zero recollectoin of that :S Can't find a useful consensus though? It is probably worth trying to protect the postmaster in the init script. Beyond that things probably start to get fairly difficult. Right. But AFAICS (though I haven't tested with -17), it will become inherited to children, which is something we'd want to *undo*, no? [experiments] Yes, darnit, you're right. But it looks like the oom_adj file can be set to the default by the process owner: [and...@sophia ~]$ ls -l /proc/6520/oom_adj -rw-r--r-- 1 andrew andrew 0 2010-01-04 12:37 /proc/6520/oom_adj [and...@sophia ~]$ cat /proc/6520/oom_adj 0 [and...@sophia ~]$ id uid=500(andrew) gid=500(andrew) groups=10(wheel),500(andrew) [and...@sophia ~]$ echo -17 /proc/6520/oom_adj -bash: echo: write error: Permission denied [and...@sophia ~]$ echo 0 /proc/6520/oom_adj [and...@sophia ~]$ echo -17 /proc/6520/oom_adj -bash: echo: write error: Permission denied [and...@sophia ~]$ But that would be a pain to have to do. OTOH, disabling the OOM killer is not always an option. I recently tried it on one system and had to revert it rapidly because the system stopped working in minutes. Some software just doesn't live well in such environments, sadly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Mon, Jan 4, 2010 at 17:40, Andrew Dunstan and...@dunslane.net wrote: Magnus Hagander wrote: On Mon, Jan 4, 2010 at 17:07, Andrew Dunstan and...@dunslane.net wrote: Magnus Hagander wrote: On Mon, Jan 4, 2010 at 16:45, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://archives.postgresql.org/message-id/20080201223336.GC24780%40alvh.no-ip.org Grr. I had zero recollectoin of that :S Can't find a useful consensus though? It is probably worth trying to protect the postmaster in the init script. Beyond that things probably start to get fairly difficult. Right. But AFAICS (though I haven't tested with -17), it will become inherited to children, which is something we'd want to *undo*, no? [experiments] Yes, darnit, you're right. But it looks like the oom_adj file can be set to the default by the process owner: [and...@sophia ~]$ ls -l /proc/6520/oom_adj -rw-r--r-- 1 andrew andrew 0 2010-01-04 12:37 /proc/6520/oom_adj [and...@sophia ~]$ cat /proc/6520/oom_adj 0 [and...@sophia ~]$ id uid=500(andrew) gid=500(andrew) groups=10(wheel),500(andrew) [and...@sophia ~]$ echo -17 /proc/6520/oom_adj -bash: echo: write error: Permission denied [and...@sophia ~]$ echo 0 /proc/6520/oom_adj [and...@sophia ~]$ echo -17 /proc/6520/oom_adj -bash: echo: write error: Permission denied [and...@sophia ~]$ But that would be a pain to have to do. OTOH, disabling the OOM killer is not always an option. I recently tried it on one system and had to revert it rapidly because the system stopped working in minutes. Some software just doesn't live well in such environments, sadly. Right. Which is why I like the idea of disabling the OOM killer for the *postmaster*, but not the regular backends. Gives it a chance to recover. It's not nice, but it's better than nothing. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Greg Smith g...@2ndquadrant.com writes: In http://archives.postgresql.org/pgsql-hackers/2008-02/msg00049.php Tom points out that while you could make this adjustment in the init scripts for PostgreSQL, actually doing so is quite questionable as a packaging decision. I just wondered if it would be questioned, I didn't say there was a problem. However, the long and the short of this is that we can't do anything without the close cooperation of an init script. I think that moves it out of the realm of what Postgres as a project should be doing. It seems more like a patch that the Linux-based packagers should be carrying. Memo to self: get off duff and prepare such a patch for the Red Hat/Fedora packages. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander wrote: Right. Which is why I like the idea of disabling the OOM killer for the *postmaster*, but not the regular backends. Gives it a chance to recover. It's not nice, but it's better than nothing. It doesn't sound like the init script can reenable the killer for the child processes though. So, if there's anything that the core code ought to do, is re-enable OOM-killer for postmaster children, after being disabled by the initscript. BTW, is it possible for pg_ctl to disable OOM-killer? I guess not, since it's not run by root ... Is there a way to disable memory overcommit for particular processes? That would be very useful -- just disable overcommit for all Postgres processes, and there shouldn't be much need to enable the killer for backends. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
On Mon, Jan 4, 2010 at 17:55, Alvaro Herrera alvhe...@commandprompt.com wrote: Magnus Hagander wrote: Right. Which is why I like the idea of disabling the OOM killer for the *postmaster*, but not the regular backends. Gives it a chance to recover. It's not nice, but it's better than nothing. It doesn't sound like the init script can reenable the killer for the child processes though. So, if there's anything that the core code ought to do, is re-enable OOM-killer for postmaster children, after being disabled by the initscript. Yeah, that's why the backend code would need to be involved. BTW, is it possible for pg_ctl to disable OOM-killer? I guess not, since it's not run by root ... No, it has to run as root. Is there a way to disable memory overcommit for particular processes? That would be very useful -- just disable overcommit for all Postgres processes, and there shouldn't be much need to enable the killer for backends. Not that I've been able to find. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Alvaro Herrera alvhe...@commandprompt.com writes: Is there a way to disable memory overcommit for particular processes? I would think not --- the very essence of overcommit is that you're promising more total memory than the system has got, and that's inherently a global proposition. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander wrote: On Mon, Jan 4, 2010 at 17:55, Alvaro Herrera alvhe...@commandprompt.com wrote: BTW, is it possible for pg_ctl to disable OOM-killer? I guess not, since it's not run by root ... No, it has to run as root. We could at least make it work on Windows, since it is often run as Administrator and drops privileges afterwards ... ... oh, wait ... -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Magnus Hagander mag...@hagander.net writes: I realize this is a very platform-specific thing, but should we consider setting the value of /proc/pid/oom_adj when running on linux? See: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/proc.txt;h=220cc6376ef80e0c9bcfec162d45552e729cdf5a;hb=45d28b097280a78893ce25a5d0db41e6a2717853 One interesting thing I read there is: Swapped out tasks are killed first. Half of each child's memory size is added to the parent's score if they do not share the same memory. This suggests that PG's shared memory ought not be counted in the postmaster's OOM score, which would mean that the problem shouldn't be quite as bad as we've believed. I wonder if that is a recent change? Or maybe it's supposed to be that way and is not implemented correctly? BTW, the given link shows only chapter 1, see http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob_plain;f=Documentation/filesystems/proc.txt;hb=45d28b097280a78893ce25a5d0db41e6a2717853 for the whole file. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Tom Lane t...@sss.pgh.pa.us wrote: This suggests that PG's shared memory ought not be counted in the postmaster's OOM score, which would mean that the problem shouldn't be quite as bad as we've believed. I wonder if that is a recent change? Or maybe it's supposed to be that way and is not implemented correctly? I've wondered about that based on my experience. When I found that memory leak back in 8.2devel, running on a SLES 9 SP 3 system, the OOM killer killed the offending backend rather than the postmaster, although it took out a couple Java middle tier processes before starting in on PostgreSQL. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Setting oom_adj on linux?
Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: ...oom_adj... One interesting thing I read there is: Swapped out tasks are killed first. Half of each child's memory size is added to the parent's score if they do not share the same memory. This suggests that PG's shared memory ought not be counted in the postmaster's OOM score, which would mean that the problem shouldn't be quite as bad as we've believed. I wonder if that is a recent change? Or maybe it's supposed to be that way and is not implemented correctly? The code for oom_kill.c looks fairly readable (link below [1]): 96 points = mm-total_vm; 117 list_for_each_entry(child, p-children, sibling) { 118 task_lock(child); 119 if (child-mm != mm child-mm) 120 points += child-mm-total_vm/2 + 1; 121 task_unlock(child); 122 } Which seems to add points for each child who doesn't share the same mm structure as the parent. Which I think is a quite a bit stricter interpretation of if they do not share the same memory. [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=mm/oom_kill.c;h=f52481b1c1e5442c9a5b16b06b1b75b9bb7c;hb=HEAD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers