Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-21 Thread Matthew Wakeling
On Wed, 20 Jan 2010, Greg Smith wrote: Basically, to an extent, that's right. However, when you get 16 drives or more into a system, then it starts being an issue. I guess if I test a system with *only* 16 drives in it one day, maybe I'll find out. *Curious* What sorts of systems have you tr

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-20 Thread Greg Smith
Matthew Wakeling wrote: On Fri, 15 Jan 2010, Greg Smith wrote: My theory has been that the "extra processing it has to perform" you describe just doesn't matter in the context of a fast system where physical I/O is always the bottleneck. Basically, to an extent, that's right. However, when yo

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-20 Thread Matthew Wakeling
On Fri, 15 Jan 2010, Greg Smith wrote: It seems to me that CFQ is simply bandwidth limited by the extra processing it has to perform. I'm curious what you are doing when you see this. 16 disc 15kRPM RAID0, when using fadvise with more than 100 simultaneous 8kB random requests. I sent an emai
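For readers unfamiliar with the workload described above, here is a minimal sketch of issuing many 8kB random read-ahead hints with fadvise. It is not Matthew's actual test program: the function name `prefetch_random_blocks`, its parameters, and the fixed seed are assumptions made for illustration, and `os.posix_fadvise` is only available on some Unix platforms, so the hint is skipped where it is missing.

```python
import os
import random


def prefetch_random_blocks(path, requests=100, block_size=8192, seed=0):
    """Hint the kernel to read `requests` random blocks of `path`.

    Returns the list of byte offsets that were hinted (hypothetical helper).
    """
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        n_blocks = os.fstat(fd).st_size // block_size
        if n_blocks == 0:
            return []
        # Pick block-aligned offsets at random, as a random-read workload would.
        offsets = [rng.randrange(n_blocks) * block_size for _ in range(requests)]
        if hasattr(os, "posix_fadvise"):
            for offset in offsets:
                # WILLNEED asks the kernel to start fetching the range now,
                # so many requests can be outstanding at the same time.
                os.posix_fadvise(fd, offset, block_size, os.POSIX_FADV_WILLNEED)
        return offsets
    finally:
        os.close(fd)
```

On a large RAID0 array, the number of hints outstanding at once is what keeps all spindles busy; the sketch only shows how the requests are issued, not how throughput is measured.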

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-17 Thread Tom Lane
Greg Smith writes: > In this context, "priority inversion" is not a generic term related to > running things with lower priorities. It means something very > specific: that you're allowing low-priority jobs to acquire locks on > resources needed by high-priority ones, and therefore blocking t

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-17 Thread Greg Smith
Eduardo Piombino wrote: In the case where priority inversion is not to be used, I would however still greatly benefit from the slow jobs/fast jobs mechanism, just being extra-careful that the slow jobs, obviously, did not acquire any locks that a fast job would ever require. This alone would b

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-17 Thread Eduardo Piombino
> Seems like you'd also need to think about priority inversion, if the > "low-priority" backend is holding any locks. > I'm not sure that priority inversion would be right in this scenario, because in that case the IO storm would still be able to exist, in the cases where the slow jobs collide wit

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-16 Thread Greg Smith
Robert Haas wrote: Seems like you'd also need to think about priority inversion, if the "low-priority" backend is holding any locks. Right, that's what I was alluding to in the last part: the non-obvious piece here is not how to decide when the backend should nap because it's done too muc

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-16 Thread Robert Haas
On Sat, Jan 16, 2010 at 4:09 AM, Greg Smith wrote: > Tom Lane wrote: >> >> This is in fact exactly what the vacuum_cost_delay logic does. >> It might be interesting to investigate generalizing that logic >> so that it could throttle all of a backend's I/O not just vacuum. >> In principle I think i

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-16 Thread Greg Smith
Tom Lane wrote: This is in fact exactly what the vacuum_cost_delay logic does. It might be interesting to investigate generalizing that logic so that it could throttle all of a backend's I/O not just vacuum. In principle I think it ought to work all right for any I/O-bound query. So much for
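The cost-based delay idea discussed here can be sketched roughly as follows. This is a simplified illustration, not PostgreSQL's implementation: the class `CostThrottle` and its methods are hypothetical, and the default costs are only modelled on the `vacuum_cost_page_hit`, `vacuum_cost_page_miss`, `vacuum_cost_page_dirty`, `vacuum_cost_limit` and `vacuum_cost_delay` settings, with values chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CostThrottle:
    """Accumulate an I/O 'cost' and nap once it reaches a limit (sketch)."""
    cost_limit: int = 200          # nap once this much cost has accumulated
    delay_seconds: float = 0.02    # length of each nap
    page_hit_cost: int = 1         # page found in the buffer cache
    page_miss_cost: int = 10       # page had to be read from disk
    page_dirty_cost: int = 20      # page was dirtied and must be written later
    sleep: Callable[[float], None] = time.sleep  # injectable for testing
    balance: int = 0
    naps: int = 0

    def charge(self, cost: int) -> None:
        """Add cost for one page operation and nap if the limit is reached."""
        self.balance += cost
        if self.balance >= self.cost_limit:
            self.sleep(self.delay_seconds)
            self.naps += 1
            self.balance = 0

    def page_hit(self) -> None:
        self.charge(self.page_hit_cost)

    def page_miss(self) -> None:
        self.charge(self.page_miss_cost)

    def page_dirty(self) -> None:
        self.charge(self.page_dirty_cost)
```

A backend throttled this way would call `page_miss()` or `page_dirty()` wherever it touches a buffer, so an I/O-heavy statement slows itself down instead of saturating the disk. The sketch ignores the lock-holding problem raised elsewhere in the thread.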

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Tom Lane
Greg Smith writes: > You might note that only one of these sources--a backend allocating a > buffer--is connected to the process you want to limit. If you think of > the problem from that side, it actually becomes possible to do something > useful here. The most practical way to throttle some

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Greg Smith
Craig Ringer wrote: It's also complicated by the fact that Pg's architecture is very poorly suited to prioritizing I/O based on query or process. (AFAIK) basically all writes go through shared_buffers and the bgwriter - neither Pg nor in fact the OS know what query or what backend created a given

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Craig Ringer
Eduardo Piombino wrote: > I think pg is wasting resources, it could be very well taking advantage > of, if you guys just tell me get better hardware. I mean ... the IO > subsystem is obviously the bottleneck of my system. But most of the time > it is on very very light load, actually ALL of the ti

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Greg Smith
Eduardo Piombino wrote: But already knowing that the base system (i.e. components out of pg's control, like OS, hardware, etc) may be "buggy" or that it can fail in rationalizing the IO, maybe it would be nice to tell to whoever is responsible for making use of the IO subsystem (pg_bg_writer?),

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Eduardo Piombino
I will give it a try, thanks. However, besides all the analysis and tests and stats that I've been collecting, I think the point of discussion turned into whether my hardware is good enough, and whether it can keep up with the needs under normal or even the heaviest user load. And if that is the question, the an

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Greg Smith
Eduardo Piombino wrote: Going to the disk properties (in windows), I just realized it does not have the Write Cache enabled, and it doesn't also allow me to set it up. I've read in google that the lack of ability to turn it on (that is, that the checkbox remains checked after you apply the chan

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Greg Smith
Matthew Wakeling wrote: CFQ is the default scheduler, but in most systems I have seen, it performs worse than the other three schedulers, all of which seem to have identical performance. I would avoid anticipatory on a RAID array though. It seems to me that CFQ is simply bandwidth limited by

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Matthew Wakeling
On Fri, 15 Jan 2010, Craig James wrote: That's the perception I get. CFQ is the default scheduler, but in most systems I have seen, it performs worse than the other three schedulers, all of which seem to have identical performance. I would avoid anticipatory on a RAID array though. I thought
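To see which of the schedulers mentioned here a Linux block device is using, one can read its sysfs entry, where the active scheduler is shown in square brackets. The sketch below assumes the usual `/sys/block/<device>/queue/scheduler` layout; the helper names and the default device `sda` are illustrative, and newer kernels list different schedulers than the four discussed in this thread.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    Example input: "noop anticipatory deadline [cfq]"
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]   # brackets mark the scheduler in use
            active = name
        available.append(name)
    return active, available


def read_scheduler(device="sda"):
    """Read and parse the scheduler setting for one block device."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

Switching schedulers is done by writing one of the listed names back to the same file as root, which the sketch deliberately leaves out.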

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Craig James
Matthew Wakeling wrote: On Thu, 14 Jan 2010, Greg Smith wrote: Andy Colson wrote: So if there is very little io, or if there is way way too much, then the scheduler really doesn't matter. So there is a slim middle ground where the io is within a small percent of the HD capacity where the sch

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-15 Thread Matthew Wakeling
On Thu, 14 Jan 2010, Greg Smith wrote: Andy Colson wrote: So if there is very little io, or if there is way way too much, then the scheduler really doesn't matter. So there is a slim middle ground where the io is within a small percent of the HD capacity where the scheduler might make a diffe

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Eduardo Piombino
Regarding the EA-200 card, here are the specs. It seems it has support for SAS disks, so we are most probably using the embedded/default controller. http://h18000.www1.hp.com/products/quickspecs/12460_div/12460_div.html http://h18000.www1.hp.com/products/quickspecs/12460_div/12460_div.p

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Eduardo Piombino
Regarding the hardware the system is running on: It's an HP Proliant DL-180 G5 server. Here are the specs... our actual configuration only has one CPU, and 16G of RAM. I will post the model of the 2 disks later today, when I get to the server. I was busy with many things, sorry. http://h18000.www1.hp

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Greg Smith
Andy Colson wrote: So if there is very little io, or if there is way way too much, then the scheduler really doesn't matter. So there is a slim middle ground where the io is within a small percent of the HD capacity where the scheduler might make a difference? That's basically how I see it.

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Andy Colson
On 1/14/2010 12:07 PM, Greg Smith wrote: Andy Colson wrote: On 1/13/2010 11:36 PM, Craig Ringer wrote: Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a terminal would go from a 1/4 second operation to a 5 minute operation under heavy write load by one writer. I landed up

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Greg Smith
Andy Colson wrote: On 1/13/2010 11:36 PM, Craig Ringer wrote: Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a terminal would go from a 1/4 second operation to a 5 minute operation under heavy write load by one writer. I landed up having to modify the driver to partially

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Pierre Frédéric Caillaud
"high CPU usage" It might very well be "high IO usage". Try this : Copy (using explorer, the shell, whatever) a huge file. This will create load similar to ALTER TABLE. Measure throughput, how much is it ? If your server blows up just like it did on ALTER TAB

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Andy Colson
On 1/13/2010 11:36 PM, Craig Ringer wrote: Robert Haas wrote: I'm kind of surprised that there are disk I/O subsystems that are so bad that a single thread doing non-stop I/O can take down the whole server. Is that normal? No. Does it happen on non-Windows operating systems? Yes. My 3war

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Craig Ringer
Robert Haas wrote: > I'm kind of surprised that there are disk I/O subsystems that are so > bad that a single thread doing non-stop I/O can take down the whole > server. Is that normal? No. > Does it happen on non-Windows operating > systems? Yes. My 3ware 8500-8 on a Debian Sarge box was so

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Eduardo Piombino
Greg, I will post more detailed data as soon as I'm able to gather it. I was testing whether cancelling the ALTER command worked OK; I might give the ALTER another try, and see how much CPU, RAM and IO usage gets involved. I will be doing this monitoring with the process explorer from sysintern

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Greg Smith
Robert Haas wrote: I'm kind of surprised that there are disk I/O subsystems that are so bad that a single thread doing non-stop I/O can take down the whole server. Is that normal? Does it happen on non-Windows operating systems? What kind of hardware should I not buy to make sure this doesn't

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Scott Marlowe
On Wed, Jan 13, 2010 at 10:54 AM, Eduardo Piombino wrote: > >> OK, I'm not entirely sure this table is not still locking something >> else.  If you make a copy by doing something like: >> >> select * into test_table from a; >> >> and then alter test_table do you still get the same problems?  If so

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Eduardo Piombino
> OK, I'm not entirely sure this table is not still locking something > else. If you make a copy by doing something like: > > select * into test_table from a; > > and then alter test_table do you still get the same problems? If so, > then it is an IO issue, most likely. If not, then there is som

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Scott Marlowe
On Tue, Jan 12, 2010 at 9:59 PM, Eduardo Piombino wrote: ... > Now, with this experience, I tried a simple workaround. > Created an empty version of "a" named "a_empty", identical in every sense. > renamed "a" to "a_full", and "a_empty" to "a". This procedure cost me roughly > 0 seconds of downtim
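The rename workaround quoted above can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, which is an assumption of this sketch and not what the original poster ran; the helper name `swap_in_empty_table` is hypothetical, and the `_empty`/`_full` suffixes follow the quoted message.

```python
import sqlite3


def swap_in_empty_table(conn, table):
    """Replace `table` with an empty copy, keeping the data as `<table>_full`."""
    empty = f"{table}_empty"
    full = f"{table}_full"
    with conn:  # commit on success, roll back on error
        # Copy only the column layout; WHERE 0 selects no rows.
        # (Constraints and indexes are not copied by this shortcut.)
        conn.execute(f'CREATE TABLE "{empty}" AS SELECT * FROM "{table}" WHERE 0')
        conn.execute(f'ALTER TABLE "{table}" RENAME TO "{full}"')
        conn.execute(f'ALTER TABLE "{empty}" RENAME TO "{table}"')
    return full
```

After the swap, the application keeps inserting into the small new table while the heavy operation runs against the renamed full table. As the thread goes on to show, this removes lock contention but does nothing about the I/O load itself.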

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Robert Haas
On Wed, Jan 13, 2010 at 2:03 AM, Eduardo Piombino wrote: > Excellent, lots of useful information in your message. > I will follow your advice and keep you posted on any progress. I have yet > to confirm some technical details of my setup with you, but I'm pretty sure > you hit the nail on the head in any ca

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Eduardo Piombino
With that said, I assume my current version of pgsql DOES make all this heavy work go through WAL logging. Curious thing is that I remember (of course) reviewing logs of the crash times, and I didn't see anything strange, not even the famous warning "you are making checkpoints too often. maybe you

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Euler Taveira de Oliveira
Eduardo Piombino escreveu: > Maybe it does not get logged at all until the ALTER is completed? > This feature [1] was implemented a few months ago and it will be available only in the next PostgreSQL version (8.5). [1] http://archives.postgresql.org/pgsql-committers/2009-11/msg00018.php -- E

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Craig Ringer
On 13/01/2010 3:03 PM, Eduardo Piombino wrote: One last question, this IO issue I'm facing, do you think it is just a matter of RAID configuration speed, or a matter of queue gluttony (and not leaving time for other processes to get into the IO queue in a reasonable time)? Hard to say with the

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-13 Thread Eduardo Piombino
Yes, one of the things I will do asap is to migrate to the latest version. On another occasion I went through the checkpoint parameters you mentioned, but left them untouched since they seemed logical. I'm a little reluctant to change the checkpoint configuration just to let me do a -once in a lif

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Greg Smith
Eduardo Piombino wrote: Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone. Ugh...there are several features in PostgreSQL 8.3 and later specifically to address the sort of issue you're running into. If you want to get good write performance out of this system, you may

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Eduardo Piombino
Excellent, lots of useful information in your message. I will follow your advice and keep you posted on any progress. I have yet to confirm some technical details of my setup with you, but I'm pretty sure you hit the nail on the head in any case. One last question, this IO issue I'm facing, do you think it

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Craig Ringer
On 13/01/2010 1:47 PM, Eduardo Piombino wrote: I'm sorry. The server is a production server HP Proliant, I don't remember the exact model, but the key features were: 4 cores, over 2GHz each (I'm sorry I don't remember the actual specs), I think it had 16G of RAM (if that is possible?) It has two

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Eduardo Piombino
I'm sorry. The server is a production server HP Proliant, I don't remember the exact model, but the key features were: 4 cores, over 2GHz each (I'm sorry I don't remember the actual specs), I think it had 16G of RAM (if that is possible?) It has two 320G disks in RAID (mirrored). I don't even hav

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Craig Ringer
On 13/01/2010 12:59 PM, Eduardo Piombino wrote: My question then is: is there a way to limit the CPU assigned to a specific connection? I mean, I don't care if my ALTER TABLE takes 4 days instead of 4 hours. Something like: pg_set_max_cpu_usage(2/100); You're assuming the issue is CPU. I thin

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Craig James
Eduardo Piombino wrote: Hi list, I'm having a problem when dealing with operations that ask for too much CPU from the server. The scenario is this: A nice description below, but ... you give no information about your system: number of CPUs, disk types and configuration, how much memory, what hav

[PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-12 Thread Eduardo Piombino
Hi list, I'm having a problem when dealing with operations that ask for too much CPU from the server. The scenario is this: I have a multithreaded server, each thread with its own connection to the database. Everything is working fine, actually great, actually outstandingly, in normal operation. I'v