Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-11 Thread Jim C. Nasby
On Fri, Feb 10, 2006 at 12:16:04PM -0500, Mark Woodward wrote:
  And even when PostgreSQL has the server all to itself, having a hashagg
  spill to disk is *way* better than pushing the machine into a swap
  storm. At least if you spill the hashagg you only have one backend
  running at a snail's pace; a swap storm means next to nothing gets done.
 
  This was/is an example of where the behavior of PostgreSQL is clearly
  unacceptable. OK, yes, this problem goes away with an ANALYZE, but it
  isn't clear how anyone could have known this, and unexpected behavior is
  bad in any product.
 
  Care to submit a documentation patch before releases are bundled (I
  think on Sunday?) At least then people would be aware that work_mem is
  just a suggestion to hash_aggs. I'd do a patch myself but I doubt I'll
  have time before the release. :(
 
 I would be glad to. What's the process?

Well, find the appropriate file in doc/src/sgml, make a copy, edit the
file, generate a diff with diff -u, and email that diff/patch to
pgsql-patches.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-11 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  That's why merely allocating tons of swap doesn't necessarily protect you.
  It's still possible for a process (or several processes if you allocate more
  swap than you have address space) to mmap gigabytes of memory without 
  touching
  it and then start touching those pages.
 
 So?  If the swap exists to back that memory, there's no problem.  It
 might be slow, but it will not fail.

Sure, but there's no way to know how much swap you need. No matter how much
swap you allocate these processes can allocate more pages of untouched RAM and
then blow up.

Of course realistically allocating 4G of swap is enough to deal with something
like Postgres where you're not being maliciously attacked. One process on ia32
can't accidentally allocate more than 4G of ram.

I was just trying to clarify the situation since someone made some comment
about it having to do with memory being swapped out and then finding nowhere
to swap in when needed. That's not exactly what's happening.

-- 
greg




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-11 Thread Bruce Momjian
Greg Stark wrote:
 
 Tom Lane [EMAIL PROTECTED] writes:
 
  Greg Stark [EMAIL PROTECTED] writes:
   That's why merely allocating tons of swap doesn't necessarily protect you.
   It's still possible for a process (or several processes if you allocate 
   more
   swap than you have address space) to mmap gigabytes of memory without 
   touching
   it and then start touching those pages.
  
  So?  If the swap exists to back that memory, there's no problem.  It
  might be slow, but it will not fail.
 
 Sure, but there's no way to know how much swap you need. No matter how much
 swap you allocate these processes can allocate more pages of untouched RAM and
 then blow up.
 
 Of course realistically allocating 4G of swap is enough to deal with something
 like Postgres where you're not being maliciously attacked. One process on ia32
 can't accidentally allocate more than 4G of ram.
 
 I was just trying to clarify the situation since someone made some comment
 about it having to do with memory being swapped out and then finding nowhere
 to swap in when needed. That's not exactly what's happening.

I guess the fundamental issue is whether Linux requires all mmap()'ed
file contents to be in memory, or whether it pushes data to disk and
unmaps it as it runs low on memory.  I don't know the answer to that.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-11 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 I was just trying to clarify the situation since someone made some comment
 about it having to do with memory being swapped out and then finding nowhere
 to swap in when needed. That's not exactly what's happening.

No.  I believe the case that is actually hard for the kernel to predict
comes from copy-on-write: when a process forks, you could potentially
need twice its current memory image, but in reality you probably won't
ever need that much since many of the shared pages won't ever be written
by either process.  However, a non-overcommitting kernel must assume
that worst case, and hence fail the fork() if it doesn't have enough
swap space to cover both processes.  If the kernel does not make that
assumption, the crunch comes when one process does touch a shared page.
If there is no available swap space at that time, kill -9 is the only
recourse, because there is no way in the Unix API to fail a write to
valid memory.

The reason for having a lot more swap space than you really need is just
to cover the potential demand from copy-on-write of pages that are
currently shared.  But given the price of disk these days, it's pretty
cheap insurance.

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Mark Woodward
 Rick Gigger [EMAIL PROTECTED] writes:
 However if hashagg truly does not obey the limit that is supposed to
 be imposed by work_mem then it really ought to be documented.  Is
 there a misunderstanding here and it really does obey it?  Or is
 hashagg an exception but the other work_mem associated operations
 work fine?  Or is it possible for them all to go out of bounds?

 hashagg is the exception.  It should be fixed, not documented, but no
 one's got round to that.

Well, it is clearly a pathological condition. Fixed? Sure, but someone
should document it so that others don't stumble across it.


 One point to consider is that if the planner's estimate is as far off
 as exhibited in the OP's example, a hashagg that does spill to disk
 is likely to take so long that he'll be back here complaining that
 the query never terminates ;-).

That's not fair, now is it? This isn't about the OP (me), it is about
PostgreSQL behaving badly.

 In most practical situations, I think
 exceeding work_mem is really the best solution, as long as it's not
 by more than 10x or 100x.  It's when the estimate is off by many
 orders of magnitude that you've got a problem.  Running out of memory
 is not necessarily the worst response ... as long as the system doesn't
 kill the process in response to that.

I don't agree with you here. Many PostgreSQL installations use PostgreSQL
as part of a larger whole. Adjusting work_mem should give the admin some
control over the memory footprint of the system. It is documented as the
limit a specific function path will use before spilling to disk.

I set up a lot of systems and I write a lot of software that uses
PostgreSQL. Periodically I run across features/problems/limitations of
PostgreSQL and post them.

This was/is an example of where the behavior of PostgreSQL is clearly
unacceptable. OK, yes, this problem goes away with an ANALYZE, but it
isn't clear how anyone could have known this, and unexpected behavior is
bad in any product.

As for your statement that "he'll be back here complaining that the query
never terminates": that's not true. A long query typically gets examined with
EXPLAIN (or, in Oracle, EXPLAIN PLAN) and evaluated from there. When the
process exhibits runaway memory use, that's a problem.



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Jim C. Nasby
On Thu, Feb 09, 2006 at 03:13:22PM -0500, Greg Stark wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
 
  * Tom Lane ([EMAIL PROTECTED]) wrote:
   Greg Stark [EMAIL PROTECTED] writes:
It doesn't seem like a bad idea to have a max_memory parameter that if a
backend ever exceeded it would immediately abort the current
transaction.
   
   See ulimit (or local equivalent).
  
  As much as setting ulimit in shell scripts is fun, I have to admit that
  I really don't see it happening very much.  
 
 For one thing it requires admin access to the startup scripts to arrange this.
 And it's always cluster-wide.
 
 Having a GUC parameter would mean it could be set per-session. Even if the GUC
 parameter were just implemented by calling setrlimit it might be useful.

Trying to tune work_mem is extremely difficult in PostgreSQL, because
you are constantly running the risk of sending the server into a
swap-storm. Having a settable per-backend memory limit would allow a lot
more flexibility in setting work_mem, because you could now ensure that
you wouldn't push the server into serious swapping.

Even better would be a means to set a cluster-wide memory limit, but of
course that's substantially more work.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Jim C. Nasby
On Thu, Feb 09, 2006 at 05:04:38PM -0500, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  When people talk about disabling the OOM killer, it doesn't stop the
  SIGKILL behaviour,
 
 Yes it does, because the situation will never arise.
 
  it just causes the kernel to return -ENOMEM for
  malloc() much much earlier... (ie when you still actually have memory
  available).
 
 Given the current price of disk, there is no sane reason not to have
 enough swap space configured to make this not-a-problem.  The OOM kill
 mechanism was a reasonable solution for running systems that were not
 expected to be too reliable anyway on small hardware, but if you're
 trying to run a 24/7 server you're simply incompetent if you don't
 disable it.

BTW, I was shocked when I found out that FreeBSD actually has an OOM
killer itself. Yet I've never heard of anyone having problems with it.
Granted, the FreeBSD OOM could be better designed to pick the right
process to kill, but I'd bet that the real reason you never hear about
it is because FreeBSD admins are clued enough to a) set up a reasonable
amount of swap and b) do a better job of monitoring memory usage so that
they don't start swapping in the first place.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Jim C. Nasby
On Fri, Feb 10, 2006 at 09:57:12AM -0500, Mark Woodward wrote:
  In most practical situations, I think
  exceeding work_mem is really the best solution, as long as it's not
  by more than 10x or 100x.  It's when the estimate is off by many
  orders of magnitude that you've got a problem.  Running out of memory
  is not necessarily the worst response ... as long as the system doesn't
  kill the process in response to that.
 
 I don't agree with you here. Many PostgreSQL installations use PostgreSQL
 as part of a larger whole. Adjusting work_mem should give the admin some
 control over the memory footprint of the system. It is documented as the
 limit a specific function path will use before spilling to disk.

And even when PostgreSQL has the server all to itself, having a hashagg
spill to disk is *way* better than pushing the machine into a swap
storm. At least if you spill the hashagg you only have one backend
running at a snail's pace; a swap storm means next to nothing gets done.

 This was/is an example of where the behavior of PostgreSQL is clearly
 unacceptable. OK, yes, this problem goes away with an ANALYZE, but it
 isn't clear how anyone could have known this, and unexpected behavior is
 bad in any product.

Care to submit a documentation patch before releases are bundled (I
think on Sunday?) At least then people would be aware that work_mem is
just a suggestion to hash_aggs. I'd do a patch myself but I doubt I'll
have time before the release. :(
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Martijn van Oosterhout
On Fri, Feb 10, 2006 at 10:01:18AM -0600, Jim C. Nasby wrote:
 BTW, I was shocked when I found out that FreeBSD actually has an OOM
 killer itself. Yet I've never heard of anyone having problems with it.
 Granted, the FreeBSD OOM could be better designed to pick the right
 process to kill, but I'd bet that the real reason you never hear about
 it is because FreeBSD admins are clued enough to a) setup a reasonable
 amount of swap and b) do a better job of monitoring memory usage so that
 you don't start swapping in the first place.

Hmm, I do wonder what FreeBSD's overcommit policy is. For example, on my
computer right now the total allocated VM is approximately 3 times the
actual memory in the computer, and about twice if you include swap. Under a
strict no-overcommit policy my computer wouldn't complete the boot
sequence, whereas currently it runs without using any swap.

Disabling overcommit has a serious cost in that most of your VM will
never be used. Are people really suggesting that I can't run a few
daemons, X and a web-browser on FreeBSD without allocating 3 times my
physical memory in swap?

However, my real question is: while trying to find info about FreeBSD's
overcommit policy, I just get a lot of people complaining about FreeBSD
killing random processes. Does anyone know a site that describes how it
works? I understand Linux's overcommit policy just fine.

Disclaimer: The Linux OOM killer has never killed the wrong process for
me, so I don't have any bad experiences with overcommit.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Mark Woodward
 On Fri, Feb 10, 2006 at 09:57:12AM -0500, Mark Woodward wrote:
  In most practical situations, I think
  exceeding work_mem is really the best solution, as long as it's not
  by more than 10x or 100x.  It's when the estimate is off by many
  orders of magnitude that you've got a problem.  Running out of memory
  is not necessarily the worst response ... as long as the system
 doesn't
  kill the process in response to that.

 I don't agree with you here. Many PostgreSQL installations use
 PostgreSQL
 as part of a larger whole. Adjusting work_mem should give the admin
 some
 control over the memory footprint of the system. It is documented as the
 limit a specific function path will use before spilling to disk.

 And even when PostgreSQL has the server all to itself, having a hashagg
 spill to disk is *way* better than pushing the machine into a swap
 storm. At least if you spill the hashagg you only have one backend
 running at a snail's pace; a swap storm means next to nothing gets done.

 This was/is an example of where the behavior of PostgreSQL is clearly
 unacceptable. OK, yes, this problem goes away with an ANALYZE, but it
 isn't clear how anyone could have known this, and unexpected behavior is
 bad in any product.

 Care to submit a documentation patch before releases are bundled (I
 think on Sunday?) At least then people would be aware that work_mem is
 just a suggestion to hash_aggs. I'd do a patch myself but I doubt I'll
 have time before the release. :(

I would be glad to. What's the process?




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 Disclaimer: The Linux OOM killer has never killed the wrong process for
 me, so I don't have any bad experiences with overcommit.

You haven't tried real hard.  What I've seen recently when I do something
that makes a PG backend go overboard is that the kernel zaps both the
misbehaving backend and the bgwriter process.  No idea what it's got
against the bgwriter, but the behavior's been pretty consistent under
recent Fedora 4 kernels ...

(This is on a development machine, not a server, so I'm not particularly
worried by leaving the default overcommit policy in place.  I wouldn't
do that on a server --- but it's not worth my time to change it on a
devel machine.)

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Greg Stark

Martijn van Oosterhout kleptog@svana.org writes:

 Disabling overcommit has a serious cost in that most of your VM will
 never be used. Are people really suggesting that I can't run a few
 daemons, X and a web-browser on FreeBSD without allocating 3 times my
 physical memory in swap?

There's a possibility you're misreading your process info there. The X server
maps large areas of graphics memory in its address space which shows up as
virtual memory used in ps on some OSes. I'm not sure if BSD is included there.
That's not memory that can ever be swapped out, and it doesn't take any
memory away from regular (non-video) RAM.

That said, it's true that some applications allocate much more memory than
needed. There's a bit of a feedback loop here. Because application writers
know that OSes overcommit they don't worry about avoiding unnecessary
allocations. Sun's original Java memory management system just started off
with allocating 20MB before it ran a single bytecode for example. 

That's why merely allocating tons of swap doesn't necessarily protect you.
It's still possible for a process (or several processes if you allocate more
swap than you have address space) to mmap gigabytes of memory without touching
it and then start touching those pages. Hopefully the OOM killer targets the
offender but there's no real way for it to guarantee it.

-- 
greg




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-10 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 That's why merely allocating tons of swap doesn't necessarily protect you.
 It's still possible for a process (or several processes if you allocate more
 swap than you have address space) to mmap gigabytes of memory without touching
 it and then start touching those pages.

So?  If the swap exists to back that memory, there's no problem.  It
might be slow, but it will not fail.

regards, tom lane



[HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
PostgreSQL promptly uses all available memory for the query and
subsequently crashes.

I'm sure it can be corrected with a setting, but should it crash?

freedb=# create table ucode as select distinct ucode from cdtitles group
by ucode having count(ucode) > 1 ;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

(Table layout)
freedb=# \d cdtitles
  Table public.cdtitles
  Column   |   Type| Modifiers
---+---+---
 cdid  | integer   | default nextval('seq_cdid'::text)
 cdcode| character varying |
 ucode | character varying |
 title | character varying |
 artist| character varying |
 genre | character varying |
 dgenre| character varying |
 year  | integer   |
 processed | character varying |
 submit| character varying |
 revision  | integer   |
 disclen   | integer   |
 tracks| integer   |
 extkey| integer   |
 offsets   | integer[] |
Indexes:
cdtitles_ucode btree (ucode) CLUSTER

(Size of table)
freedb=# select count(*) from cdtitles ;
  count
-
 1927912
(1 row)

(Sample row)
freedb=# select * from cdtitles where cdid = 11 limit 1;
  cdid  |  cdcode  | ucode |  title   |  
artist   | genre | dgenre | year |processed 
  | submit  | revision | disclen | tracks | extkey |
   offsets
+--+---+--++---++--+-+-+--+-+++---
 11 | 320e9e14 | 0ea0-14-3399-0006ad6a | Var Slutar V#65533;arna
(CD1) | Streaplers | rock  ||0 | cddbd v1.5PL3 Copyright (c)
Steve Scherf et al. | dBpowerAMP V4.0 |0 | | 19 |
   |
{21,150,13920,31180,43664,59907,73163,86629,98447,113684,130205,141682,154920,166783,179028,192689,205161,222164,234649,249692,264340,3744,0}
(1 row)

postgresql.conf
Description: Binary data



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
More info: the machine has 512M RAM and 512M swap
work_mem is set to: work_mem = 1024

This shouldn't have crashed, should it?

 PostgreSQL promptly uses all available memory for the query and
 subsequently crashes.

 I'm sure it can be corrected with a setting, but should it crash?

 freedb=# create table ucode as select distinct ucode from cdtitles group
 by ucode having count(ucode) > 1 ;
 server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
 The connection to the server was lost. Attempting reset: Failed.

 (Table layout)
 freedb=# \d cdtitles
   Table public.cdtitles
   Column   |   Type| Modifiers
 ---+---+---
  cdid  | integer   | default nextval('seq_cdid'::text)
  cdcode| character varying |
  ucode | character varying |
  title | character varying |
  artist| character varying |
  genre | character varying |
  dgenre| character varying |
  year  | integer   |
  processed | character varying |
  submit| character varying |
  revision  | integer   |
  disclen   | integer   |
  tracks| integer   |
  extkey| integer   |
  offsets   | integer[] |
 Indexes:
 cdtitles_ucode btree (ucode) CLUSTER

 (Size of table)
 freedb=# select count(*) from cdtitles ;
   count
 -
  1927912
 (1 row)

 (Sample row)
 freedb=# select * from cdtitles where cdid = 11 limit 1;
   cdid  |  cdcode  | ucode |  title   |
 artist   | genre | dgenre | year |processed
   | submit  | revision | disclen | tracks | extkey |
offsets
 +--+---+--++---++--+-+-+--+-+++---
  11 | 320e9e14 | 0ea0-14-3399-0006ad6a | Var Slutar V#65533;arna
 (CD1) | Streaplers | rock  ||0 | cddbd v1.5PL3 Copyright (c)
 Steve Scherf et al. | dBpowerAMP V4.0 |0 | | 19 |
|
 {21,150,13920,31180,43664,59907,73163,86629,98447,113684,130205,141682,154920,166783,179028,192689,205161,222164,234649,249692,264340,3744,0}
 (1 row)





Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Mark Woodward [EMAIL PROTECTED] writes:
 PostgreSQL promptly uses all available memory for the query and
 subsequently crashes.

I'll bet a nickel this is on a Linux machine with OOM kill enabled.
What does the postmaster log show --- or look in the kernel log to
see if it mentions anything about an out-of-memory kill.

 freedb=# create table ucode as select distinct ucode from cdtitles group
 by ucode having count(ucode) > 1 ;
 server closed the connection unexpectedly

What does EXPLAIN show as the plan for that?  If it's a hash aggregate,
try with enable_hashagg turned off.  How many distinct ucode values
are there in the table?
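
For concreteness, a minimal psql sketch of that test, reusing the table and
column names from the original report (a sketch only; everything else is
assumed to be at default settings):

freedb=# SET enable_hashagg = off;   -- force a sort-based GroupAggregate for this session
freedb=# explain select distinct ucode from cdtitles group by ucode having count(ucode) > 1;
freedb=# select count(distinct ucode) from cdtitles;   -- how many distinct values really exist

With enable_hashagg off the plan should show a GroupAggregate, which sorts
its input; sorts honor work_mem and spill to temporary files when they
exceed it.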

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 PostgreSQL promptly uses all available memory for the query and
 subsequently crashes.

 I'll bet a nickel this is on a Linux machine with OOM kill enabled.
 What does the postmaster log show --- or look in the kernel log to
 see if it mentions anything about an out-of-memory kill.

That's a no-brainer. Maybe I should have rephrased the condition: crash
may be the wrong word; it was definitely killed for being out of memory. Sorry.


 freedb=# create table ucode as select distinct ucode from cdtitles group
 by ucode having count(ucode) > 1 ;
 server closed the connection unexpectedly

 What does EXPLAIN show as the plan for that?  If it's a hash aggregate,
 try with enable_hashagg turned off.  How many distinct ucode values
 are there in the table?

There are over 1.7M distinct rows, about 200K non-distinct that I want to
somehow remove.

It does have hash aggregate:
freedb=# explain select distinct ucode from cdtitles group by ucode having
count(ucode) > 1 ;
                                 QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=106536.32..106537.32 rows=200 width=32)
   ->  Sort  (cost=106536.32..106536.82 rows=200 width=32)
         Sort Key: ucode
         ->  HashAggregate  (cost=106527.68..106528.68 rows=200 width=32)
               Filter: (count(ucode) > 1)
               ->  Seq Scan on cdtitles  (cost=0.00..96888.12 rows=1927912 width=32)
(6 rows)


Well, shouldn't hash aggregate respect work memory limits?



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Mark Woodward [EMAIL PROTECTED] writes:
   ->  HashAggregate  (cost=106527.68..106528.68 rows=200 width=32)
         Filter: (count(ucode) > 1)
         ->  Seq Scan on cdtitles  (cost=0.00..96888.12 rows=1927912 width=32)

 Well, shouldn't hash aggregate respect work memory limits?

If the planner thought there were 1.7M distinct values, it wouldn't have
used hash aggregate ... but it only thinks there are 200, which IIRC is
the default assumption.  Have you ANALYZEd this table lately?
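
A quick way to check what the planner believes, sketched against the table
from the original report (pg_stats is the standard statistics view, so this
should work as-is on 8.0):

freedb=# ANALYZE cdtitles;
freedb=# SELECT n_distinct FROM pg_stats WHERE tablename = 'cdtitles' AND attname = 'ucode';

After the ANALYZE, the rows estimate on the HashAggregate in EXPLAIN should
move from the default 200 up toward the real number of groups, and the
planner should stop choosing a hash aggregate for this query.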

Meanwhile, I'd strongly recommend turning off OOM kill.  That's got to
be the single worst design decision in the entire Linux kernel.

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
   ->  HashAggregate  (cost=106527.68..106528.68 rows=200 width=32)
         Filter: (count(ucode) > 1)
         ->  Seq Scan on cdtitles  (cost=0.00..96888.12 rows=1927912 width=32)

 Well, shouldn't hash aggregate respect work memory limits?

 If the planner thought there were 1.7M distinct values, it wouldn't have
 used hash aggregate ... but it only thinks there are 200, which IIRC is
 the default assumption.  Have you ANALYZEd this table lately?

I thought that I had, but I did CLUSTER at some point. Or maybe I didn't,
I'm not sure. I have been working on a file reader/parser/importer
program.  I created and dropped the DB so many times it is hard to keep
track. Still, I would say that it is extremely bad behavior for not having
stats, wouldn't you think?


 Meanwhile, I'd strongly recommend turning off OOM kill.  That's got to
 be the single worst design decision in the entire Linux kernel.

How is this any different than FreeBSD having a default 512M process
size limit? On FreeBSD, the process would have been killed earlier.




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Mark Woodward [EMAIL PROTECTED] writes:
 Still, I would say that is is extremly bad behavior for not having
 stats, wouldn't you think?

Think of it as a kernel bug.

 Meanwhile, I'd strongly recommend turning off OOM kill.  That's got to
 be the single worst design decision in the entire Linux kernel.

 How is this any different than the FreeBSD having a default 512M process
 size limit? On FreeBSD, the process would have been killed earlier.

No, the process would have been politely told it was out of memory, and
would have told you the same.  If the kernel's way of notifying a
process that it's out of memory is SIGKILL, there is not a damn thing
that we can do to operate robustly.

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 Still, I would say that is is extremly bad behavior for not having
 stats, wouldn't you think?

 Think of it as a kernel bug.

While I respect your viewpoint that the Linux kernel should not kill an
offending process if the system runs out of memory, I sort of disagree in
that OOM kill is a disaster preventer. It should be viewed as a last-ditch
"him or me" choice the kernel needs to make, and it should not get into that
position in the first place.

Regardless, it is troubling that failing to have current stats can cause
the system, with a large data set, to exceed working memory limits.

I think it is still a bug. While it may manifest itself as a pg crash on
Linux because of a feature with which you have issue, the fact remains
that PG is exceeding its working memory limit.

Should failing to run ANALYZE cause this behavior?
If so, how does this get clearly documented?
If not, can it be prevented?


 Meanwhile, I'd strongly recommend turning off OOM kill.  That's got to
 be the single worst design decision in the entire Linux kernel.

 How is this any different than the FreeBSD having a default 512M process
 size limit? On FreeBSD, the process would have been killed earlier.

 No, the process would have been politely told it was out of memory, and
 would have told you the same.  If the kernel's way of notifying a
 process that it's out of memory is SIGKILL, there is not a damn thing
 that we can do to operate robustly.

Let's not waste time on a Linux discussion. Linux and FreeBSD have their
strengths, and a debate on the dubious merits of either is long and
contentious. Both systems are fine, just with some subtle
differences in design goals.





Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Andrew Dunstan

Tom Lane wrote:
 Mark Woodward [EMAIL PROTECTED] writes:
  Still, I would say that it is extremely bad behavior for not having
  stats, wouldn't you think?

 Think of it as a kernel bug.

  Meanwhile, I'd strongly recommend turning off OOM kill.  That's got to
  be the single worst design decision in the entire Linux kernel.

  How is this any different than the FreeBSD having a default 512M process
  size limit? On FreeBSD, the process would have been killed earlier.

 No, the process would have been politely told it was out of memory, and
 would have told you the same.  If the kernel's way of notifying a
 process that it's out of memory is SIGKILL, there is not a damn thing
 that we can do to operate robustly.

And we have docco on it: 
http://www.postgresql.org/docs/current/static/kernel-resources.html#AEN18105 
which I assume is still current. Back in October I mentioned the OOM 
killer to Andrew Morton - his reaction was a very pained look and a curt 
admonition: turn it off.


cheers

andrew




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Mark Woodward [EMAIL PROTECTED] writes:
 I think it is still a bug. While it may manifest itself as a pg crash on
 Linux because of a feature with which you have issue, the fact remains
 that PG is exeeding its working memory limit.

The problem is that *we have no way to know what that limit is* ---
short of exceeding it and being summarily killed.  (BTW, the kernel
doesn't know what the limit is either.)  There is simply not any way
to operate robustly under the OOM-kill regime.

While I'll certainly acknowledge that it'd be nice if hashagg had
spill-to-disk capability, that wouldn't alter the fundamental fact that
if you want reliable behavior you MUST turn off OOM kill.  There is not
anything we can do at the database level to work around that kernel-level
misdesign.

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 I think it is still a bug. While it may manifest itself as a pg crash on
 Linux because of a feature with which you have issue, the fact remains
 that PG is exeeding its working memory limit.

 The problem is that *we have no way to know what that limit is* ---
 short of exceeding it and being summarily killed.  (BTW, the kernel
 doesn't know what the limit is either.)  There is simply not any way
 to operate robustly under the OOM-kill regime.

No, you misunderstand what I said: I mean the working memory limit as
defined in postgresql.conf. I don't care about the OS debate.


 While I'll certainly acknowledge that it'd be nice if hashagg had
 spill-to-disk capability, that wouldn't alter the fundamental fact that
 if you want reliable behavior you MUST turn off OOM kill.  There is not
 anything we can do at the database level to work around that kernel-level
 misdesign.

Again, regardless of OS used, hashagg will exceed working memory as
defined in postgresql.conf.

At issue is whether a lack of ANALYZE justifies this behavior. If so, it
should be documented.




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Mark Woodward [EMAIL PROTECTED] writes:
 Again, regardless of OS used, hashagg will exceed working memory as
 defined in postgresql.conf.

So?  If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not.  The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
 Mark Woodward [EMAIL PROTECTED] writes:
  Again, regardless of OS used, hashagg will exceed working memory as
  defined in postgresql.conf.
 
 So?  If you've got OOM kill enabled, it can zap a process whether it's
 strictly adhered to work_mem or not.  The OOM killer is entirely capable
 of choosing a victim process whose memory footprint hasn't changed
 materially since it started (eg, the postmaster).

Unless I've missed something here, disabling the OOM killer doesn't
really solve the problem here.  What solves the problem is running
ANALYZE but it's still certainly the case that it would make some sense
for the Postmaster, upon realizing that it's gone well beyond its
work_mem boundary, to ideally switch plans to one which isn't going to
exceed its work_mem limit.  Less ideally, it could give up and issue an
error to the user instead of running the box out of memory.

I appreciate that this is probably not very easy to implement, but I
do believe the current situation could be improved in this regard.

Thanks,

Stephen




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:
 Again, regardless of OS used, hashagg will exceed working memory as
 defined in postgresql.conf.

 So?  If you've got OOM kill enabled, it can zap a process whether it's
 strictly adhered to work_mem or not.  The OOM killer is entirely capable
 of choosing a victim process whose memory footprint hasn't changed
 materially since it started (eg, the postmaster).

Sorry, I must strongly disagree here. The postgresql.conf work_mem is
a VERY IMPORTANT setting; it is intended to limit the consumption of
memory by the postgresql process. Oftentimes PostgreSQL will work alongside
other application servers on the same system; in fact, it may be a
sub-part of application servers on the same system. (This is, in fact, how
it is used on one of my site servers.)

Clearly, if the server will use 1000 times this number (set to 1024K, but
usage exceeds 1G), this is broken, and it may cause other systems to fail or
perform very poorly.

If it is not something that can be fixed, it should be clearly documented.



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Jim C. Nasby
On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
  Mark Woodward [EMAIL PROTECTED] writes:
  Again, regardless of OS used, hashagg will exceed working memory as
  defined in postgresql.conf.
 
  So?  If you've got OOM kill enabled, it can zap a process whether it's
  strictly adhered to work_mem or not.  The OOM killer is entirely capable
  of choosing a victim process whose memory footprint hasn't changed
  materially since it started (eg, the postmaster).
 
 Sorry, I must strongly disagree here. The postgresql.conf working mem is
 a VERY IMPORTANT setting, it is intended to limit the consumption of
 memory by the postgresql process. Often times PostgreSQL will work along

Actually, no, it's not designed for that at all.

 side other application servers on the same system, infact, may be a
 sub-part of application servers on the same system. (This is, in fact, how
 it is used on one of my site servers.)
 
 Clearly, if the server will use 1000 times this number (Set for 1024K, but
 exceeds 1G) this is broken, and it may cause other systems to fail or
 perform very poorly.
 
 If it is not something that can be fixed, it should be clearly documented.

work_mem (integer)

Specifies the amount of memory to be used by internal sort
operations and hash tables before switching to temporary disk files.
The value is specified in kilobytes, and defaults to 1024 kilobytes
(1 MB). Note that for a complex query, several sort or hash
operations might be running in parallel; each one will be allowed to
use as much memory as this value specifies before it starts to put
data into temporary files. Also, several running sessions could be
doing such operations concurrently. So the total memory used could
be many times the value of work_mem; it is necessary to keep this
fact in mind when choosing the value. Sort operations are used for
ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
joins, hash-based aggregation, and hash-based processing of IN
subqueries. 

So it says right there that it's very easy to exceed work_mem by a very
large amount. Granted, this is a very painful problem to deal with and
will hopefully be changed at some point, but it's pretty clear as to how
this works.
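
For illustration, a minimal sketch of the per-operation nature of work_mem
(t1 and t2 are hypothetical tables, and on 8.x the SET value is in kilobytes):

freedb=# SET work_mem = 2048;   -- 2 MB per sort or hash table, this session only
freedb=# EXPLAIN SELECT * FROM t1 JOIN t2 USING (id) ORDER BY t1.payload;
-- a plan with one hash join and one sort can legitimately use up to
-- 2 MB for the hash table plus another 2 MB for the sort, i.e. 2 x work_mem
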
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Jim C. Nasby
On Thu, Feb 09, 2006 at 11:42:57AM -0500, Mark Woodward wrote:
  Mark Woodward [EMAIL PROTECTED] writes:
  Still, I would say that is is extremly bad behavior for not having
  stats, wouldn't you think?
 
  Think of it as a kernel bug.
 
 While I respect your viewpoint that the Linux kernel should not kill an
 offending process if the system runs out of memory, I sort of disagree in
 that OOM is a disaster preventor. It should be viewed as a last ditch him
 or me choice the kernel needs to make and it should not get into that
 position in the first place.

I've had processes run away on a FreeBSD box before, to the extent of
running entirely out of swap and memory. Instead of random processes
just dying for no apparent reason, I instead started getting a bunch of
out-of-memory errors. No disaster, I just fixed the problem and life
went on.

Well, OK, the box did become rather unresponsive when my fix for the
problem meant that all of a sudden there were about 950 perl processes
trying to run at the same time. I wish I'd captured top showing 900+
runnable processes. But after a few minutes the processes started
completing and exiting and everything was soon back to normal. I rather
doubt Linux would survive that...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Stephen Frost
* Jim C. Nasby ([EMAIL PROTECTED]) wrote:
 On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
  If it is not something that can be fixed, it should be clearly documented.
 
 work_mem (integer)
 
 Specifies the amount of memory to be used by internal sort
 operations and hash tables before switching to temporary disk files.
 The value is specified in kilobytes, and defaults to 1024 kilobytes
 (1 MB). Note that for a complex query, several sort or hash
 operations might be running in parallel; each one will be allowed to
 use as much memory as this value specifies before it starts to put
 data into temporary files. Also, several running sessions could be
 doing such operations concurrently. So the total memory used could
 be many times the value of work_mem; it is necessary to keep this
 fact in mind when choosing the value. Sort operations are used for
 ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
 joins, hash-based aggregation, and hash-based processing of IN
 subqueries. 
 
 So it says right there that it's very easy to exceed work_mem by a very
 large amount. Granted, this is a very painful problem to deal with and
 will hopefully be changed at some point, but it's pretty clear as to how
 this works.

It also says that when it goes over, it'll spill to disk.  Additionally,
we're talking about one hash here, not multiple ones.  It seems at least
misleading since, if I understand correctly, Postgres isn't actually
actively checking whether the amount of memory used by an in-progress
hash has gone over the limit; rather, it guesses at planning time how much
memory will be used, in order to decide whether a hash plan is possible
or not.  That guess can certainly be wrong, but there's
nothing in place to handle the situation where the guess is wrong...

Thanks,

Stephen




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
  Mark Woodward [EMAIL PROTECTED] writes:
  Again, regardless of OS used, hashagg will exceed working memory as
  defined in postgresql.conf.
 
  So?  If you've got OOM kill enabled, it can zap a process whether it's
  strictly adhered to work_mem or not.  The OOM killer is entirely
 capable
  of choosing a victim process whose memory footprint hasn't changed
  materially since it started (eg, the postmaster).

 Sorry, I must strongly disagree here. The postgresql.conf working mem
 is
 a VERY IMPORTANT setting, it is intended to limit the consumption of
 memory by the postgresql process. Often times PostgreSQL will work along

 Actually, no, it's not designed for that at all.

I guess that's a matter of opinion.


 side other application servers on the same system, infact, may be a
 sub-part of application servers on the same system. (This is, in fact,
 how
 it is used on one of my site servers.)

 Clearly, if the server will use 1000 times this number (Set for 1024K,
 but
 exceeds 1G) this is broken, and it may cause other systems to fail or
 perform very poorly.

 If it is not something that can be fixed, it should be clearly
 documented.

 work_mem (integer)

 Specifies the amount of memory to be used by internal sort
 operations and hash tables before switching to temporary disk files.
 The value is specified in kilobytes, and defaults to 1024 kilobytes
 (1 MB). Note that for a complex query, several sort or hash
 operations might be running in parallel; each one will be allowed to
 use as much memory as this value specifies before it starts to put
 data into temporary files. Also, several running sessions could be
 doing such operations concurrently. So the total memory used could
 be many times the value of work_mem; it is necessary to keep this
 fact in mind when choosing the value. Sort operations are used for
 ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
 joins, hash-based aggregation, and hash-based processing of IN
 subqueries.

 So it says right there that it's very easy to exceed work_mem by a very
 large amount. Granted, this is a very painful problem to deal with and
 will hopefully be changed at some point, but it's pretty clear as to how
 this works.

Well, if you read that paragraph carefully, I'll admit that I was a little
too literal in my statement, applying it to the process and not to specific
functions within the process, but in the documentation:

each one will be allowed to use as much memory as this value specifies
before it starts to put data into temporary files.

According to the documentation, the behavior of hashagg is broken. It did
not use up to this amount and then start to use temporary files; it used
1000 times this limit and was killed by the OS.

I think it should be documented as the behavior is unpredictable.




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Greg Stark

Stephen Frost [EMAIL PROTECTED] writes:

 Unless I've missed something here, disabling the OOM killer doesn't
 really solve the problem here.  

Well in a way it does. Postgres would get an out-of-memory error from malloc
which it would handle properly and the world would be happy.

Except not quite, since I think an out-of-memory error still means that the
backend exits instead of just that query failing. That means if you have an
application such as apache running, then all subsequent transactions on that
connection fail too, instead of just the transaction that misbehaved.

And as the other poster mentioned, having Postgres use up every available byte
of memory isn't really very friendly to anything else running on the box.

It doesn't seem like a bad idea to have a max_memory parameter that if a
backend ever exceeded it would immediately abort the current transaction. That
would let an application continue operating normally after getting an error.

-- 
greg




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Except not quite, since I think an out of memory error still means that
 backend exits instead of just that query failing.

Not at all!  PG will recover from this perfectly well ... if it's given
the opportunity, rather than being SIGKILLed.

 It doesn't seem like a bad idea to have a max_memory parameter that if a
 backend ever exceeded it would immediately abort the current
 transaction.

See ulimit (or local equivalent).

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Alvaro Herrera
Greg Stark wrote:

 Well in a way it does. Postgres would get an out-of-memory error from malloc
 which it would handle properly and the world would be happy.
 
 Except not quite, since I think an out of memory error still means that
 backend exits instead of just that query failing.

Not at all -- the transaction is aborted, but the backend can continue
working perfectly fine.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  It doesn't seem like a bad idea to have a max_memory parameter that if a
  backend ever exceeded it would immediately abort the current
  transaction.
 
 See ulimit (or local equivalent).

As much as setting ulimit in shell scripts is fun, I have to admit that
I really don't see it happening very much.  Having Postgres set a ulimit
for itself may not be a bad idea and would perhaps provide the least
surprise for new users.  Perhaps shared_buffers + 10*work_mem +
maintenance_work_mem + max_stack_depth?  Then errors from running out of
memory could provide a 'HINT: Memory consumption went well over allowed
work_mem, perhaps you need to run ANALYZE or raise work_mem?'.
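
As a rough worked example of that formula, assuming 8.0-era defaults
(shared_buffers = 1000 buffers of 8kB, work_mem = 1024kB,
maintenance_work_mem = 16384kB, max_stack_depth = 2048kB):

freedb=# SELECT (1000 * 8) + (10 * 1024) + 16384 + 2048 AS suggested_limit_kb;
 suggested_limit_kb
--------------------
              36672
(1 row)

i.e. a ulimit of roughly 36 MB per backend under those assumed defaults.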

Just some thoughts,

Stephen




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Greg Stark
Stephen Frost [EMAIL PROTECTED] writes:

 * Tom Lane ([EMAIL PROTECTED]) wrote:
  Greg Stark [EMAIL PROTECTED] writes:
   It doesn't seem like a bad idea to have a max_memory parameter that if a
   backend ever exceeded it would immediately abort the current
   transaction.
  
  See ulimit (or local equivalent).
 
 As much as setting ulimit in shell scripts is fun, I have to admit that
 I really don't see it happening very much.  

For one thing it requires admin access to the startup scripts to arrange this.
And it's always cluster-wide.

Having a GUC parameter would mean it could be set per-session. Even if the GUC
parameter were just implemented by calling setrlimit it might be useful.

-- 
greg




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Stephen Frost [EMAIL PROTECTED] writes:

 * Tom Lane ([EMAIL PROTECTED]) wrote:
  Greg Stark [EMAIL PROTECTED] writes:
   It doesn't seem like a bad idea to have a max_memory parameter that
 if a
   backend ever exceeded it would immediately abort the current
   transaction.
 
  See ulimit (or local equivalent).

 As much as setting ulimit in shell scripts is fun, I have to admit that
 I really don't see it happening very much.

 For one thing it requires admin access to the startup scripts to arrange
 this.
 And it's always cluster-wide.

 Having a GUC parameter would mean it could be set per-session. Even if the
 GUC
 parameter were just implemented by calling setrlimit it might be useful.


I don't think it needs a new GUC parameter, just having hashagg respect
work_mem would fix the problem.




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Martijn van Oosterhout
On Thu, Feb 09, 2006 at 02:35:34PM -0500, Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  Except not quite, since I think an out of memory error still means that
  backend exits instead of just that query failing.
 
 Not at all!  PG will recover from this perfectly well ... if it's given
 the opportunity, rather than being SIGKILLed.

FWIW, the problem is mainly from the situation where some process
accesses a piece of memory that has been swapped out, but there is no
memory available to swap the page in. Or write to a page marked
copy-on-write. What do you do? There is no way to return -ENOMEM from
a normal memory access, and PostgreSQL wouldn't handle that anyway.

When people talk about disabling the OOM killer, it doesn't stop the
SIGKILL behaviour, it just causes the kernel to return -ENOMEM for
malloc() much much earlier... (ie when you still actually have memory
available).

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Rick Gigger


On Feb 9, 2006, at 11:22 AM, Stephen Frost wrote:

 * Tom Lane ([EMAIL PROTECTED]) wrote:
  Mark Woodward [EMAIL PROTECTED] writes:
   Again, regardless of OS used, hashagg will exceed working memory as
   defined in postgresql.conf.

  So?  If you've got OOM kill enabled, it can zap a process whether it's
  strictly adhered to work_mem or not.  The OOM killer is entirely capable
  of choosing a victim process whose memory footprint hasn't changed
  materially since it started (eg, the postmaster).

 Unless I've missed something here, disabling the OOM killer doesn't
 really solve the problem here.  What solves the problem is running
 ANALYZE but it's still certainly the case that it would make some sense
 for the Postmaster, upon realizing that it's gone well beyond its
 work_mem boundary, to ideally switch plans to one which isn't going to
 exceed its work_mem limit.  Less ideally, it could give up and issue an
 error to the user instead of running the box out of memory.

So is the work_mem parameter that is set not strictly enforced?  Is
it more like a suggestion as to what it SHOULD use and not a hard
limit on how much memory each process is ALLOWED to use?

If his work_mem is set to 1 mb and that process is using 500 mb for
tasks that are supposed to stay in work_mem, then doesn't that mean
that that limit is not really a hard limit but rather a suggestion?


Rick



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 When people talk about disabling the OOM killer, it doesn't stop the
 SIGKILL behaviour,

Yes it does, because the situation will never arise.

 it just causes the kernel to return -ENOMEM for
 malloc() much much earlier... (ie when you still actually have memory
 available).

Given the current price of disk, there is no sane reason not to have
enough swap space configured to make this not-a-problem.  The OOM kill
mechanism was a reasonable solution for running systems that were not
expected to be too reliable anyway on small hardware, but if you're
trying to run a 24/7 server you're simply incompetent if you don't
disable it.

regards, tom lane



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Mark Woodward
 Martijn van Oosterhout kleptog@svana.org writes:
 When people talk about disabling the OOM killer, it doesn't stop the
 SIGKILL behaviour,

 Yes it does, because the situation will never arise.

 it just causes the kernel to return -ENOMEM for
 malloc() much much earlier... (ie when you still actually have memory
 available).

 Given the current price of disk, there is no sane reason not to have
 enough swap space configured to make this not-a-problem.  The OOM kill
 mechanism was a reasonable solution for running systems that were not
 expected to be too reliable anyway on small hardware, but if you're
 trying to run a 24/7 server you're simply incompetent if you don't
 disable it.

And people say I have STRONG opinions. Don't hold back, Tom, let us know
what you really think.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Ernst Herzberg
On Friday 10 February 2006 00:53, Mark Woodward wrote:
  Martijn van Oosterhout kleptog@svana.org writes:
  When people talk about disabling the OOM killer, it doesn't stop the
  SIGKILL behaviour,
 
  Yes it does, because the situation will never arise.
 
  it just causes the kernel to return -ENOMEM for
  malloc() much much earlier... (ie when you still actually have memory
  available).
 
  Given the current price of disk, there is no sane reason not to have
  enough swap space configured to make this not-a-problem.  The OOM kill
  mechanism was a reasonable solution for running systems that were not
  expected to be too reliable anyway on small hardware, but if you're
  trying to run a 24/7 server you're simply incompetent if you don't
  disable it.

 And people say I have STRONG opinions. Don't hold back Tom, let us know
 what you really think.

Read 
http://linux-mm.org/OverCommitAccounting
or 
file://usr/src/linux/Documentation/vm/overcommit-accounting

It is a good idea to have enough swap space. If not, set 
vm.overcommit_memory=2
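
As a concrete illustration of the setting mentioned above (assuming a
Linux 2.6 kernel and root access; see the overcommit-accounting document
for the accompanying vm.overcommit_ratio knob):

    # switch to strict overcommit accounting: allocations fail with ENOMEM
    # up front instead of the OOM killer picking a victim later
    sysctl -w vm.overcommit_memory=2
    # make it persistent across reboots
    echo "vm.overcommit_memory = 2" >> /etc/sysctl.conf
    # verify the current value
    cat /proc/sys/vm/overcommit_memory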

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Rick Gigger


On Feb 9, 2006, at 12:49 PM, Mark Woodward wrote:


On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:

Mark Woodward [EMAIL PROTECTED] writes:
Again, regardless of OS used, hashagg will exceed working memory as
defined in postgresql.conf.


So?  If you've got OOM kill enabled, it can zap a process whether it's
strictly adhered to work_mem or not.  The OOM killer is entirely capable
of choosing a victim process whose memory footprint hasn't changed
materially since it started (eg, the postmaster).


Sorry, I must strongly disagree here. The postgresql.conf working mem is
a VERY IMPORTANT setting, it is intended to limit the consumption of
memory by the postgresql process. Often times PostgreSQL will work along


Actually, no, it's not designed for that at all.


I guess that's a matter of opinion.




side other application servers on the same system, in fact, may be a
sub-part of application servers on the same system. (This is, in fact,
how it is used on one of my site servers.)

Clearly, if the server will use 1000 times this number (set for 1024K,
but exceeds 1G) this is broken, and it may cause other systems to fail
or perform very poorly.

If it is not something that can be fixed, it should be clearly
documented.


work_mem (integer)

    Specifies the amount of memory to be used by internal sort
    operations and hash tables before switching to temporary disk files.
    The value is specified in kilobytes, and defaults to 1024 kilobytes
    (1 MB). Note that for a complex query, several sort or hash
    operations might be running in parallel; each one will be allowed to
    use as much memory as this value specifies before it starts to put
    data into temporary files. Also, several running sessions could be
    doing such operations concurrently. So the total memory used could
    be many times the value of work_mem; it is necessary to keep this
    fact in mind when choosing the value. Sort operations are used for
    ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
    joins, hash-based aggregation, and hash-based processing of IN
    subqueries.

So it says right there that it's very easy to exceed work_mem by a very
large amount. Granted, this is a very painful problem to deal with and
will hopefully be changed at some point, but it's pretty clear as to how
this works.
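
To make the per-operation wording above concrete, here is a rough,
hypothetical psql sketch; it assumes a local database named test that you
can connect to, and the table name, row count, and padding are arbitrary:

    psql test <<'SQL'
    CREATE TEMP TABLE big AS
      SELECT (random()*100000)::int AS k, repeat('x', 100) AS pad
      FROM generate_series(1, 200000);
    SET work_mem = 1024;  -- kilobytes; applies per sort/hash step, not per backend
    EXPLAIN ANALYZE SELECT k, count(*) FROM big GROUP BY k;  -- typically a hash aggregate
    EXPLAIN ANALYZE SELECT * FROM big ORDER BY k;            -- a sort, expected to spill to temp files
    SQL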


Well, if you read that paragraph carefully, I'll admit that I was a
little too literal in my statement, applying it to the process and not to
specific functions within the process, but in the documentation:

    "each one will be allowed to use as much memory as this value
    specifies before it starts to put data into temporary files."

According to the documentation the behavior of hashagg is broken. It did
not use up to this amount and then start to use temporary files; it used
1000 times this limit and was killed by the OS.

I think it should be documented, as the behavior is unpredictable.


It seems to me that the solution for THIS INCIDENT is to run an analyze.
That should fix the problem at hand.  I have nothing to say about the OOM
issue except that hopefully the analyze will prevent him from running out
of memory at all.
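
For anyone hitting the same thing, a minimal sketch of that workaround,
assuming a database named test and a table named my_table (both are
placeholders):

    # refresh planner statistics for one table
    psql test -c "ANALYZE VERBOSE my_table;"
    # or for the whole database, from the shell
    vacuumdb --analyze test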


However if hashagg truly does not obey the limit that is supposed to be
imposed by work_mem then it really ought to be documented.  Is there a
misunderstanding here and it really does obey it?  Or is hashagg an
exception but the other work_mem associated operations work fine?  Or is
it possible for them all to go out of bounds?


Even if you've got 100 terabytes of swap space, though, it seems like if
your system is very heavy on reads then you would really want that single
backend to start using up your disk space and leave your memory alone, so
that most of your data can stay cached and largely unaffected by the
problem of one backend.


If your bottleneck is writing to the disk then it doesn't really seem to
matter.  You just need to make sure that a huge out-of-control hashagg
never occurs.  If the disks get saturated with writes because of one
backend's hashagg, then every other process that needs to write a lot of
data to disk comes to a grinding halt, queries stop completing quickly
and start to pile up, and you have a huge mess on your hands that
essentially prevents postgres from doing its job even if it doesn't
actually die.  In this situation disk bandwidth is a scarce commodity,
and whether you let the OS handle it all with virtual memory or you let
postgres spill everything to disk for that one operation, you are still
using disk to make up for a lack of RAM.  At some point you've either got
to stock up on enough RAM to run your queries properly or alter how your
queries run to use less RAM.  Having a process go out of control in
resource usage is going to cause big problems one way or another.



Re: [HACKERS] PostgreSQL 8.0.6 crash

2006-02-09 Thread Tom Lane
Rick Gigger [EMAIL PROTECTED] writes:
 However if hashagg truly does not obey the limit that is supposed to  
 be imposed by work_mem then it really ought to be documented.  Is  
 there a misunderstanding here and it really does obey it?  Or is  
 hashagg an exception but the other work_mem associated operations  
 work fine?  Or is it possible for them all to go out of bounds?

hashagg is the exception.  It should be fixed, not documented, but no
one's got round to that.

One point to consider is that if the planner's estimate is as far off
as exhibited in the OP's example, a hashagg that does spill to disk
is likely to take so long that he'll be back here complaining that
the query never terminates ;-).  In most practical situations, I think
exceeding work_mem is really the best solution, as long as it's not
by more than 10x or 100x.  It's when the estimate is off by many
orders of magnitude that you've got a problem.  Running out of memory
is not necessarily the worst response ... as long as the system doesn't
kill the process in response to that.
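
A rough way to spot a misestimate of that size before it bites, again
assuming a local database named test (the table here is a throwaway
example):

    psql test <<'SQL'
    CREATE TEMP TABLE demo AS
      SELECT (random()*50000)::int AS k FROM generate_series(1, 100000);
    -- with no statistics the planner may guess the number of groups badly
    EXPLAIN SELECT k, count(*) FROM demo GROUP BY k;
    ANALYZE demo;
    -- after ANALYZE the estimated row count for the aggregate should be much closer
    EXPLAIN SELECT k, count(*) FROM demo GROUP BY k;
    SQL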

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly