Re: [PERFORM] Postgres server crash

2006-11-27 Thread Florian Weimer
* Jim C. Nasby:
> What's interesting is that apparently FreeBSD also has overcommit (and
> IIRC no way to disable it), yet I never hear people going off on OOM
> kills in FreeBSD. My theory is that FreeBSD admins are smart enough to
> dedicate a decent amount of swap space, so that by the time you

Re: [PERFORM] Postgres server crash

2006-11-27 Thread Michael Stone
On Sun, Nov 26, 2006 at 05:41:02PM -0600, Jim C. Nasby wrote: What's interesting is that apparently FreeBSD also has overcommit (and IIRC no way to disable it), yet I never hear people going off on OOM kills in FreeBSD. Could just be that nobody is using FreeBSD. Seriously, though, there are

Re: [PERFORM] Postgres server crash

2006-11-26 Thread Jim C. Nasby
On Sat, Nov 18, 2006 at 05:28:46PM -0800, Richard Troy wrote:
> ...I read a large number of articles on this subject and am
> absolutely dumbfounded by the -ahem- idiots who think killing a random
> process is an appropriate action. I'm just taking their word for it that
> there's some kind of imp

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Craig A. James
You realize that it had to be turned on explicitly on IRIX, right? But don't let facts get in the way of a good rant... On the contrary, with Irix 4 and earlier it was the default, but it caused so many problems that SGI switched the default to OFF in IRIX 5. But because it had been available

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Michael Stone
On Sun, Nov 19, 2006 at 02:12:01PM -0800, Craig A. James wrote: And speaking of SGI, this very issue was among the things that sank the company. As the low-end graphics cards ate into their visualization market, they tried to become an Oracle Server platform. Their servers were *fast*. But t

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Craig A. James
Michael Stone wrote: At one point someone complained about the ability to configure, e.g., IRIX to allow memory overcommit. I worked on some large IRIX installations where full memory accounting would have required on the order of 100s of gigabytes of swap, due to large shared memory allocatio

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Michael Stone
On Sun, Nov 19, 2006 at 12:42:45PM -0800, Richard Broersma Jr wrote: I don't mean to hijack the thread, but I am interested in learning the science behind configuring memory usage. There isn't one. You need experience, and an awareness of your particular requirements. If it were easy, it w

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Richard Broersma Jr
> (Maybe other people are just > better at configuring their memory usage?) I don't mean to hijack the thread, but I am interested in learning the science behind configuring memory usage. A lot of the docs that I have found on this subject speak in terms of generalities and rules of thumb. A

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Michael Stone
On Sat, Nov 18, 2006 at 05:28:46PM -0800, Richard Troy wrote: On linux you can use the sysctl utility to muck with vm.overcommit_memory; You can disable the "feature." Be aware that there are "reasons" the "feature" exists before you "cast" "aspersions" and "quote marks" all over the place,
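
For anyone reading this from the archives: the sysctl key mentioned above is also exposed as a file under /proc, so the current setting can be inspected without changing anything. Below is a minimal sketch, assuming Linux with /proc mounted. The helper names (describe_overcommit, read_overcommit) are made up for illustration and are not part of any library; only the path /proc/sys/vm/overcommit_memory and the meaning of the values 0, 1 and 2 come from the kernel documentation.

```python
from pathlib import Path

# Meanings of the vm.overcommit_memory values, per the Linux kernel docs.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def describe_overcommit(raw: str) -> str:
    """Translate the raw file contents (e.g. '2\n') into a description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read the current setting; hypothetical helper, read-only."""
    p = Path(path)
    if not p.exists():
        # Not Linux, or /proc is not mounted.
        return "not available on this system"
    return describe_overcommit(p.read_text())


if __name__ == "__main__":
    print(read_overcommit())
```

Changing the setting is a separate, root-only step done with the sysctl utility (for example `sysctl -w vm.overcommit_memory=2`, the mode the PostgreSQL documentation suggests). Whether strict accounting suits a given machine depends on its swap and workload, as the replies in this thread point out.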

Re: [PERFORM] Postgres server crash

2006-11-19 Thread Ron Mayer
Tom Lane wrote:
> "Craig A. James" <[EMAIL PROTECTED]> writes:
>> Here's something I found googling for "memory overcommitment"+linux
>> http://archives.neohapsis.com/archives/postfix/2000-04/0512.html
>
> That might have been right when it was written (note the reference to a
> 2.2 Linux kernel

Re: [PERFORM] Postgres server crash

2006-11-18 Thread Tom Lane
"Craig A. James" <[EMAIL PROTECTED]> writes: > Here's something I found googling for "memory overcommitment"+linux > http://archives.neohapsis.com/archives/postfix/2000-04/0512.html That might have been right when it was written (note the reference to a 2.2 Linux kernel), but it's 100% wrong now

Re: [PERFORM] Postgres server crash

2006-11-18 Thread Craig A. James
Richard Troy wrote: I did that - spent about two f-ing hours looking for what I wanted. (Guess I entered poor choices for my searches. -frown- ) There are a LOT of articles that TALK ABOUT OOM, but precious few actually tell you what you can do about it. Trying to save you some time: On linux

Re: [PERFORM] Postgres server crash

2006-11-18 Thread Richard Troy
On Thu, 16 Nov 2006, Tom Lane wrote:
>
> "Craig A. James" <[EMAIL PROTECTED]> writes:
> > OOM? Can you give me a quick pointer to what this acronym stands for
> > and how I can reconfigure it?
>
> See "Linux Memory Overcommit" at
> http://www.postgresql.org/docs/8.1/static/kernel-resources.html#

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Ben
OOM stands for "Out Of Memory" and it does indeed seem to be the same as what IRIX had. I believe you can turn the feature off and also configure its overcommitment by setting something in /proc/. Unfortunately, I don't remember more than that. On Thu, 16 Nov 2006, Craig A. James wrote:

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Richard Huxton
Craig A. James wrote: Russell Smith wrote: For the third time today, our server has crashed... I would guess it's the linux OOM if you are running linux. You need to turn off killing of processes when you run out of memory. Are you getting close to running out of memory? Good suggestion,

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Merlin Moncure
On 11/15/06, Craig A. James <[EMAIL PROTECTED]> wrote:
Questions:
1. Any idea what happened and how I can avoid this? It's a *big* problem.
2. Why didn't the database recover? Why are there two processes that couldn't be killed?
3. Where did the "signal 9" come from? (Nobody but me

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Richard Huxton
Craig A. James wrote: Richard Huxton wrote: If a "kill -9" as root doesn't get rid of them, I think I'm right in saying that it's a kernel-level problem rather than something else. Sorry I didn't clarify that. "kill -9" did kill them. Other signals did not. It wasn't until I manually inter

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Tom Lane
"Craig A. James" <[EMAIL PROTECTED]> writes: > OOM? Can you give me a quick pointer to what this acronym stands for > and how I can reconfigure it? See "Linux Memory Overcommit" at http://www.postgresql.org/docs/8.1/static/kernel-resources.html#AEN18128 or try googling for "OOM kill" for non-Post

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Tom Lane
Richard Huxton writes:
> Craig A. James wrote:
>> It can't be a coincidence that these were the only two processes in a
>> SELECT operation. Does the server disable signals at critical points?
> If a "kill -9" as root doesn't get rid of them, I think I'm right in
> saying that it's a kernel-le

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Richard Huxton
Craig A. James wrote: By the way, in spite of my questions and concerns, I was *very* impressed by the recovery process. I know it might seem like old hat to you guys to watch the WAL in action, and I know on a theoretical level it's supposed to work, but watching it recover 150 separate datab

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Craig A. James
By the way, in spite of my questions and concerns, I was *very* impressed by the recovery process. I know it might seem like old hat to you guys to watch the WAL in action, and I know on a theoretical level it's supposed to work, but watching it recover 150 separate databases, and find and fix

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Craig A. James
Russell Smith wrote: For the third time today, our server has crashed... I would guess it's the linux OOM if you are running linux. You need to turn off killing of processes when you run out of memory. Are you getting close to running out of memory? Good suggestion, it was a memory leak in

Re: [PERFORM] Postgres server crash

2006-11-16 Thread Richard Huxton
Russell Smith wrote:
Craig A. James wrote:
Questions:
1. Any idea what happened and how I can avoid this? It's a *big* problem.
2. Why didn't the database recover? Why are there two processes that couldn't be killed?
I'm guessing it didn't recover *because* there were two processes t

Re: [PERFORM] Postgres server crash

2006-11-15 Thread Russell Smith
Craig A. James wrote:
For the third time today, our server has crashed, or frozen, actually something in between. Normally there are about 30-50 connections because of mod_perl processes that keep connections open. After the crash, there are three processes remaining:
# ps -ef | grep postgr

[PERFORM] Postgres server crash

2006-11-15 Thread Craig A. James
For the third time today, our server has crashed, or frozen, actually something in between. Normally there are about 30-50 connections because of mod_perl processes that keep connections open. After the crash, there are three processes remaining:
# ps -ef | grep postgres
postgres 23832 1