Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-09-15 Thread Florian Weimer
* Craig James: So it never makes sense to enable overcommitted memory when Postgres, or any server, is running. There are some run-time environments which allocate huge chunks of memory on startup, without marking them as not yet in use. SBCL is in this category, and also the Hotspot VM (at

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-09-15 Thread Craig James
Florian Weimer wrote: * Craig James: So it never makes sense to enable overcommitted memory when Postgres, or any server, is running. There are some run-time environments which allocate huge chunks of memory on startup, without marking them as not yet in use. SBCL is in this category, and

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-09-15 Thread Florian Weimer
* Craig James: There are some run-time environments which allocate huge chunks of memory on startup, without marking them as not yet in use. SBCL is in this category, and also the Hotspot VM (at least to some extent). I stand by my assertion: It never makes sense. Do these applications

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-09-15 Thread Craig James
Florian Weimer wrote: * Craig James: There are some run-time environments which allocate huge chunks of memory on startup, without marking them as not yet in use. SBCL is in this category, and also the Hotspot VM (at least to some extent). I stand by my assertion: It never makes sense. Do

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Magnus Hagander
[EMAIL PROTECTED] wrote: On Thu, 28 Aug 2008, Scott Marlowe wrote: wait a min here, postgres is supposed to be able to survive a complete box failure without corrupting the database, if killing a process can corrupt the database it sounds like a major problem. Yes it is a major problem,

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Matthew Wakeling
On Thu, 28 Aug 2008, [EMAIL PROTECTED] wrote: Huh? Each backend has its own socket. we must be talking about different things. I'm talking about the socket that would be used for clients to talk to postgres, this is either a TCP socket or a unix socket. in either case only one process can

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Bill Moran
In response to Greg Smith [EMAIL PROTECTED]: On Thu, 28 Aug 2008, Bill Moran wrote: In linux, it's possible to tell the OOM killer never to consider certain processes for the axe, using /proc magic. See this page: http://linux-mm.org/OOM_Killer Perhaps this should be in the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Gregory Williamson
Bill Moran wrote: In response to Greg Smith [EMAIL PROTECTED]: snipped... I don't know, Greg. First off, the solution of making the postmaster immune to the OOM killer seems better than disabling overcommit to me anyway; and secondly, I don't understand why we should avoid making the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Alvaro Herrera
[EMAIL PROTECTED] escribió: On Thu, 28 Aug 2008, Alvaro Herrera wrote: [EMAIL PROTECTED] escribió: On Thu, 28 Aug 2008, Scott Marlowe wrote: scenario 1: There's a postmaster, it owns all the child processes. It gets killed. The Postmaster gets restarted. Since there isn't one when the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Craig James
James Mansion wrote: I can't see how an OS can lie to processes about memory being allocated to them and not be ridiculed as a toy, but there you go. I don't think Linux is the only perpetrator - doesn't AIX do this too? This is a leftover from the days of massive physical modeling

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Matthew Wakeling
On Fri, 29 Aug 2008, Craig James wrote: Disable overcommitted memory. There is NO REASON to use it on any modern server-class computer, and MANY REASONS WHY IT IS A BAD IDEA. As far as I can see, the main reason nowadays for overcommit is when a large process forks and then execs. Are there

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Greg Smith
On Fri, 29 Aug 2008, Bill Moran wrote: First off, the solution of making the postmaster immune to the OOM killer seems better than disabling overcommit to me anyway I really side with Craig James here that the right thing to do here is to turn off overcommit altogether. While it's possible

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Valentin Bogdanov
In 2003 I met this guy who was doing Computation Fluid Dynamics and he had to use this software written by physics engineers in FORTRAN. 1 Gig of ram wasn't yet the standard for a desktop pc at that time but the software required at least 1 Gig just to get started. So I thought what is the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-29 Thread Gregory S. Youngblood
Gregory Williamson wrote: Bill Moran wrote: In response to Greg Smith [EMAIL PROTECTED]: snipped... I don't know, Greg. First off, the solution of making the postmaster immune to the OOM killer seems better than disabling overcommit to me anyway; and secondly, I don't understand why we

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Craig James
The OOM killer is a terrible idea for any serious database server. I wrote a detailed technical paper on this almost 15 years ago when Silicon Graphics had this same feature, and Oracle and other critical server processes couldn't be made reliable. The problem with overallocating memory as

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Wed, 27 Aug 2008, Craig James wrote: The OOM killer is a terrible idea for any serious database server. I wrote a detailed technical paper on this almost 15 years ago when Silicon Graphics had this same feature, and Oracle and other critical server processes couldn't be made reliable.

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Tom Lane wrote: [EMAIL PROTECTED] writes: On Wed, 27 Aug 2008, Andrew Sullivan wrote: The upshot of this is that postgres tends to be a big target for the OOM killer, with seriously bad effects to your database. So for good Postgres operation, you want to run on a

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Andrew Sullivan
On Wed, Aug 27, 2008 at 03:22:09PM -0700, [EMAIL PROTECTED] wrote: I disagree with you. I think good Postgres operation is so highly dependent on caching as much data as possible that disabling overcommit (and throwing away a lot of memory that could be used for cache) is a solution that's

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Matthew Wakeling
On Wed, 27 Aug 2008, [EMAIL PROTECTED] wrote: if memory overcommit is disabled, the kernel checks to see if you have an extra 1G of ram available, if you do it allows the process to continue, if you don't it tries to free memory (by throwing away cache, swapping to disk, etc), and if it can't
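For readers following the overcommit argument, here is a minimal sketch of the accounting Linux applies when strict overcommit is selected (vm.overcommit_memory = 2): the commit limit is swap plus a percentage of RAM given by vm.overcommit_ratio (default 50). The function names and the 8 GB swap figure are illustrative assumptions, not something posted in the thread, and the real kernel accounting also adjusts for huge pages and reserved memory.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int = 50) -> int:
    """Approximate CommitLimit when vm.overcommit_memory = 2.

    Simplified: ignores huge pages and kernel reserves.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def allocation_allowed(committed_kb: int, request_kb: int,
                       ram_kb: int, swap_kb: int,
                       overcommit_ratio: int = 50) -> bool:
    """True if a new request would still fit under the commit limit."""
    limit = commit_limit_kb(ram_kb, swap_kb, overcommit_ratio)
    return committed_kb + request_kb <= limit


GB = 1024 * 1024  # kB per GB

# 16 GB of RAM as on the original poster's machine; 8 GB swap is hypothetical.
limit = commit_limit_kb(ram_kb=16 * GB, swap_kb=8 * GB)
print(limit // GB, "GB commit limit")
```

With the default ratio, a machine needs generous swap (or a higher ratio) before strict mode lets processes commit as much as physical RAM, which is part of what the posters are arguing about.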

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Bill Moran
In response to Matthew Wakeling [EMAIL PROTECTED]: Probably the best solution is to just tell the kernel somehow to never kill the postmaster. This thread interested me enough to research this a bit. In linux, it's possible to tell the OOM killer never to consider certain processes for the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Steve Atkins
On Aug 28, 2008, at 6:26 AM, Matthew Wakeling wrote: On Wed, 27 Aug 2008, [EMAIL PROTECTED] wrote: if memory overcommit is disabled, the kernel checks to see if you have an extra 1G of ram available, if you do it allows the process to continue, if you don't it tries to free memory (by

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Jerry Champlin
Another approach we used successfully for a similar problem -- (we had lots of free high memory but were running out of low memory; oom killer wiped out MQ a couple times and postmaster a couple times) -- was to change the settings for how aggressively the virtual memory system protected low

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Craig James
[EMAIL PROTECTED] wrote: On Wed, 27 Aug 2008, Craig James wrote: The OOM killer is a terrible idea for any serious database server. I wrote a detailed technical paper on this almost 15 years ago when Silicon Graphics had this same feature, and Oracle and other critical server processes

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Matthew Wakeling
On Thu, 28 Aug 2008, Steve Atkins wrote: Probably the best solution is to just tell the kernel somehow to never kill the postmaster. Or configure adequate swap space? Oh yes, that's very important. However, that gives the machine the opportunity to thrash. Matthew -- The early bird gets

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Craig James wrote: [EMAIL PROTECTED] wrote: On Wed, 27 Aug 2008, Craig James wrote: The OOM killer is a terrible idea for any serious database server. I wrote a detailed technical paper on this almost 15 years ago when Silicon Graphics had this same feature, and Oracle

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Matthew Wakeling wrote: On Wed, 27 Aug 2008, [EMAIL PROTECTED] wrote: if memory overcommit is disabled, the kernel checks to see if you have an extra 1G of ram available, if you do it allows the process to continue, if you don't it tries to free memory (by throwing away

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Jeff Davis
On Thu, 2008-08-28 at 00:56 -0400, Tom Lane wrote: Actually, the problem with Linux' OOM killer is that it *disproportionately targets the PG postmaster*, on the basis not of memory that the postmaster is using but of memory its child processes are using. This was discussed in the PG archives

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Craig James
Matthew Wakeling wrote: On Thu, 28 Aug 2008, Steve Atkins wrote: Probably the best solution is to just tell the kernel somehow to never kill the postmaster. Or configure adequate swap space? Oh yes, that's very important. However, that gives the machine the opportunity to thrash. No,

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, [EMAIL PROTECTED] wrote: On Thu, 28 Aug 2008, Matthew Wakeling wrote: On Wed, 27 Aug 2008, [EMAIL PROTECTED] wrote: if memory overcommit is disabled, the kernel checks to see if you have an extra 1G of ram available, if you do it allows the process to continue, if you

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Craig James wrote: Matthew Wakeling wrote: On Thu, 28 Aug 2008, Steve Atkins wrote: Probably the best solution is to just tell the kernel somehow to never kill the postmaster. Or configure adequate swap space? Oh yes, that's very important. However, that gives the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Jeff Davis
On Wed, 2008-08-27 at 23:23 -0700, [EMAIL PROTECTED] wrote: there are periodic flamefests on the kernel mailing list over the OOM killer, if you can propose a better algorithm for it to use than the current one that doesn't end up being just as bad for some other workload the kernel policy

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Matthew Wakeling
On Thu, 28 Aug 2008, Jeff Davis wrote: The problem for the postmaster is that the OOM killer counts the children's total vmsize -- including *shared* memory -- against the parent, which is such a bad idea I don't know where to start. If you have shared_buffers set to 1GB and 25 connections, the
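The double counting described above can be made concrete with a small sketch. This is a deliberately simplified model, not the kernel's actual badness heuristic: it only contrasts charging every child's full virtual size (shared segment included) to the parent against counting the shared segment once. The 10 MB of private memory per backend is a hypothetical figure chosen for illustration.

```python
def naive_parent_charge_mb(shared_mb: int, private_mb_per_child: int,
                           children: int) -> int:
    """Sum of each child's full vmsize, so shared memory is counted per child."""
    return children * (shared_mb + private_mb_per_child)


def actual_footprint_mb(shared_mb: int, private_mb_per_child: int,
                        children: int) -> int:
    """Shared memory counted once, plus each child's private memory."""
    return shared_mb + children * private_mb_per_child


# shared_buffers = 1 GB and 25 connections, as in the message above;
# 10 MB private memory per backend is an assumption.
naive = naive_parent_charge_mb(1024, 10, 25)
actual = actual_footprint_mb(1024, 10, 25)
print(f"charged to parent: {naive} MB, really in use: {actual} MB")
```

Under these assumptions the parent looks roughly twenty times larger than the memory the whole process group really occupies, which is why the postmaster ends up an attractive target.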

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Matthew Wakeling
On Thu, 28 Aug 2008, Craig James wrote: If your processes do use the memory, then your performance goes into the toilet, and you know it's time to buy more memory or a second server, but in the mean time your server processes at least keep running while you kill the rogue processes. I'd

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Scott Marlowe
On Thu, Aug 28, 2008 at 2:29 PM, Matthew Wakeling [EMAIL PROTECTED] wrote: Another point is that from a business perspective, a database that has stopped responding is equally bad regardless of whether that is because the OOM killer has appeared or because the machine is thrashing. In both

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Scott Marlowe
On Thu, Aug 28, 2008 at 5:08 PM, [EMAIL PROTECTED] wrote: On Thu, 28 Aug 2008, Scott Marlowe wrote: On Thu, Aug 28, 2008 at 2:29 PM, Matthew Wakeling [EMAIL PROTECTED] wrote: Another point is that from a business perspective, a database that has stopped responding is equally bad regardless

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Scott Marlowe wrote: On Thu, Aug 28, 2008 at 5:08 PM, [EMAIL PROTECTED] wrote: On Thu, 28 Aug 2008, Scott Marlowe wrote: On Thu, Aug 28, 2008 at 2:29 PM, Matthew Wakeling [EMAIL PROTECTED] wrote: Another point is that from a business perspective, a database that has

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Scott Marlowe
On Thu, Aug 28, 2008 at 7:16 PM, [EMAIL PROTECTED] wrote: the ACID guarantees that postgres is making are supposed to mean that even if the machine dies, the CPU goes up in smoke, etc, the transactions that are completed will not be corrupted. And if any of those things happens, the machine

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Matthew Dennis
On Thu, Aug 28, 2008 at 8:11 PM, Scott Marlowe [EMAIL PROTECTED] wrote: wait a min here, postgres is supposed to be able to survive a complete box failure without corrupting the database, if killing a process can corrupt the database it sounds like a major problem. Yes it is a major

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Scott Marlowe
On Thu, Aug 28, 2008 at 7:53 PM, Matthew Dennis [EMAIL PROTECTED] wrote: On Thu, Aug 28, 2008 at 8:11 PM, Scott Marlowe [EMAIL PROTECTED] wrote: wait a min here, postgres is supposed to be able to survive a complete box failure without corrupting the database, if killing a process can

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Alvaro Herrera
Scott Marlowe escribió: scenario 1: There's a postmaster, it owns all the child processes. It gets killed. The Postmaster gets restarted. Since there isn't one running, it comes up. Actually there's an additional step required at this point. There isn't a postmaster running, but a new one

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Scott Marlowe wrote: On Thu, Aug 28, 2008 at 7:53 PM, Matthew Dennis [EMAIL PROTECTED] wrote: On Thu, Aug 28, 2008 at 8:11 PM, Scott Marlowe [EMAIL PROTECTED] wrote: wait a min here, postgres is supposed to be able to survive a complete box failure without corrupting

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Alvaro Herrera
[EMAIL PROTECTED] escribió: On Thu, 28 Aug 2008, Scott Marlowe wrote: scenario 1: There's a postmaster, it owns all the child processes. It gets killed. The Postmaster gets restarted. Since there isn't one when the postmaster gets killed doesn't that kill all its children as well? Of

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread david
On Thu, 28 Aug 2008, Alvaro Herrera wrote: [EMAIL PROTECTED] escribió: On Thu, 28 Aug 2008, Scott Marlowe wrote: scenario 1: There's a postmaster, it owns all the child processes. It gets killed. The Postmaster gets restarted. Since there isn't one when the postmaster gets killed

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Greg Smith
On Tue, 26 Aug 2008, Scott Marlowe wrote: If it is a checkpoint issue then you need more aggressive bgwriter settings, and possibly more bandwidth on your storage array. Since this is 8.3.1 the main useful thing to do is increase checkpoint_segments and checkpoint_completion_target to spread

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread Greg Smith
On Thu, 28 Aug 2008, Bill Moran wrote: In linux, it's possible to tell the OOM killer never to consider certain processes for the axe, using /proc magic. See this page: http://linux-mm.org/OOM_Killer Perhaps this should be in the PostgreSQL docs somewhere? The fact that
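The "/proc magic" mentioned here can be sketched as follows. On 2.6-era kernels, writing -17 to /proc/<pid>/oom_adj exempts that process from the OOM killer; newer kernels prefer /proc/<pid>/oom_score_adj, where -1000 has the same effect. The helper name and the proc_root parameter are hypothetical conveniences (the latter only so the function can be exercised against a scratch directory); on a real system the write requires root.

```python
from pathlib import Path

OOM_DISABLE = -17  # oom_adj value meaning "never select this process"


def exempt_from_oom_killer(pid: int, proc_root: str = "/proc") -> Path:
    """Write OOM_DISABLE to <proc_root>/<pid>/oom_adj and return the path.

    Assumes a kernel that still exposes oom_adj; on newer kernels the
    file to write is oom_score_adj with the value -1000.
    """
    target = Path(proc_root) / str(pid) / "oom_adj"
    target.write_text(f"{OOM_DISABLE}\n")
    return target
```

The setting applies to the process it is written for, so in practice it would be applied to the postmaster's PID from a startup script run as root. Whether child backends should also be exempt is a separate decision that this sketch does not address.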

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-28 Thread James Mansion
[EMAIL PROTECTED] wrote: for example if you have a process that uses 1G of ram (say firefox) and it needs to start a new process (say acroread to handle a pdf file), what it does is it forks the firefox process (each of which have 1G of ram allocated), and then does an exec of the acroread

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Florian Weimer
* henk de wit: On this table we're inserting records with a relatively low frequency of +- 6~10 per second. We're using PG 8.3.1 on a machine with two dual core 2.4 GHz Xeon CPUs, 16 GB of memory and Debian Linux. The machine is completely devoted to PG, nothing else runs on the box. Have

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Bill Moran
In response to henk de wit [EMAIL PROTECTED]: What do your various logs (pgsql, application, etc...) have to say? There is hardly anything helpful in the pgsql log. The application log doesn't mention anything either. We log a great deal of information in our application, but there's

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread DANIEL CRISTIAN CRUZ
Maybe strace could help you find the problem, but it could cause a great deal of overhead... Bill Moran [EMAIL PROTECTED] wrote: ... -- Daniel Cristian Cruz, Database Administrator, Direção Regional - Núcleo de Tecnologia da Informação, SENAI - SC, Telephone:

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Shane Ambler
Bill Moran wrote: On a side note, what version of PG are you using? If it was in a previous email, I missed it. He mentioned 8.3.1 in the first email. Although nothing stands out in the 8.3.2 or 8.3.3 fix list (without knowing his table structure or any contrib modules used) I wonder if

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread david
On Wed, 27 Aug 2008, Florian Weimer wrote: * henk de wit: On this table we're inserting records with a relatively low frequency of +- 6~10 per second. We're using PG 8.3.1 on a machine with two dual core 2.4 GHz Xeon CPUs, 16 GB of memory and Debian Linux. The machine is completely devoted to

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Andrew Sullivan
On Wed, Aug 27, 2008 at 02:45:47PM -0700, [EMAIL PROTECTED] wrote: with memory overcommit enabled (the default), the kernel recognises that most programs that fork don't write to all the memory they have allocated, It doesn't recognise it; it hopes it. It happens to hope correctly in many

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread david
On Wed, 27 Aug 2008, Andrew Sullivan wrote: On Wed, Aug 27, 2008 at 02:45:47PM -0700, [EMAIL PROTECTED] wrote: with memory overcommit enabled (the default), the kernel recognises that most programs that fork don't write to all the memory they have allocated, It doesn't recognise it; it

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote: On Wed, 27 Aug 2008, Andrew Sullivan wrote: separate copies for the separate processes (and if at this time it runs out of memory it invokes the OOM killer to free some space), . . .it kills processes that are using a lot of memory. Those are not necessarily the

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: Some time ago I found that it was possible to fiddle with a /proc entry to convince the OOM to not touch the postmaster. A postmaster with the raw IO capability bit set would be skipped by the OOM killer too (this is an Oracle tweak AFAIK). These are

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-27 Thread Tom Lane
[EMAIL PROTECTED] writes: On Wed, 27 Aug 2008, Andrew Sullivan wrote: The upshot of this is that postgres tends to be a big target for the OOM killer, with seriously bad effects to your database. So for good Postgres operation, you want to run on a machine with the OOM killer disabled. I

[PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-26 Thread henk de wit
Hi, We're currently having a problem with queries on a medium sized table. This table is 22GB in size (via select pg_size_pretty(pg_relation_size('table'));). It has 7 indexes, which bring the total size of the table to 35 GB (measured with pg_total_relation_size). On this table we're

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-26 Thread Mark Lewis
On Tue, 2008-08-26 at 18:44 +0200, henk de wit wrote: Hi, We're currently having a problem with queries on a medium sized table. This table is 22GB in size (via select pg_size_pretty(pg_relation_size('table'));). It has 7 indexes, which bring the total size of the table to 35 GB

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-26 Thread Scott Marlowe
On Tue, Aug 26, 2008 at 10:44 AM, henk de wit [EMAIL PROTECTED] wrote: Hi, We're currently having a problem with queries on a medium sized table. This table is 22GB in size (via select pg_size_pretty(pg_relation_size('table'));). It has 7 indexes, which bring the total size of the table

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-26 Thread henk de wit
If the select returns a lot of data and you haven't enabled cursors (by calling setFetchSize), then the entire SQL response will be loaded in memory at once, so there could be an out-of-memory condition on the client. I hear you. This is absolutely not the case though. There is no other
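The client-side concern quoted here (a whole result set materialised at once versus rows fetched in batches) can be illustrated with a short sketch. It uses Python's standard sqlite3 module as a stand-in for PostgreSQL purely so the example is self-contained; it is not the JDBC API, and the table name, row count, and helper names are assumptions. With the PostgreSQL JDBC driver, batch fetching via setFetchSize additionally requires autocommit to be off.

```python
import sqlite3
from typing import Iterator, Tuple


def iter_rows(conn: sqlite3.Connection, query: str,
              batch_size: int = 1000) -> Iterator[Tuple]:
    """Yield rows one at a time while pulling them from the cursor in batches."""
    cur = conn.cursor()
    cur.execute(query)
    while True:
        rows = cur.fetchmany(batch_size)  # at most batch_size rows held at once
        if not rows:
            break
        yield from rows


def demo() -> int:
    """Build a small in-memory table and count its rows through iter_rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(2500)])
    return sum(1 for _ in iter_rows(conn, "SELECT id FROM t", batch_size=1000))
```

The point of the pattern is that client memory is bounded by the batch size rather than by the size of the result, which is the condition the quoted advice is checking for.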

Re: [PERFORM] select on 22 GB table causes An I/O error occured while sending to the backend. exception

2008-08-26 Thread henk de wit
What do your various logs (pgsql, application, etc...) have to say? There is hardly anything helpful in the pgsql log. The application log doesn't mention anything either. We log a great deal of information in our application, but there's nothing out of the ordinary there, although there's of