Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-21 Thread Matthew Wakeling

On Wed, 20 Jan 2010, Greg Smith wrote:
Basically, to an extent, that's right. However, when you get 16 drives or 
more into a system, then it starts being an issue.


I guess if I test a system with *only* 16 drives in it one day, maybe I'll 
find out.


*Curious* What sorts of systems have you tried so far?

As the graph I just sent shows, the four schedulers are pretty-much 
identical in performance, until you start saturating it with simultaneous 
requests. CFQ levels out at a performance a little lower than the other 
three.


Seriously though, there is some difference between a completely synthetic 
test like you noted issues with here, and anything you can see when running 
the database.


Granted, this test is rather synthetic. It is testing the rather unusual 
case of lots of simultaneous random small requests - more simultaneous 
requests than the number of backends we advise people to run on a server. 
You'd probably need a RAID array a whole lot bigger than 16 drives before a 
normal workload could demonstrate the performance difference, and even then 
the difference isn't particularly major.


Would be interesting research if anyone has a 200-spindle RAID array 
hanging around somewhere.


Matthew

--
A good programmer is one who looks both ways before crossing a one-way street.
Considering the quality and quantity of one-way streets in Cambridge, it
should be no surprise that there are so many good programmers there.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-20 Thread Matthew Wakeling

On Fri, 15 Jan 2010, Greg Smith wrote:
It seems to me that CFQ is simply bandwidth limited by the extra processing 
it has to perform.


I'm curious what you are doing when you see this.


16 disc 15kRPM RAID0, when using fadvise with more than 100 simultaneous 
8kB random requests. I sent an email to the mailing list on 29 Jan 2008, 
but it got thrown away by the mailing list spam filter because it had an 
image in it (the graph showing interesting information). Gregory Stark 
replied to it in 
http://archives.postgresql.org/pgsql-performance/2008-01/msg00285.php


I was using his synthetic test case program.

My theory has been that the extra processing it has to perform you describe 
just doesn't matter in the context of a fast system where physical I/O is 
always the bottleneck.


Basically, to an extent, that's right. However, when you get 16 drives or 
more into a system, then it starts being an issue.


Matthew

--
For every complex problem, there is a solution that is simple, neat, and wrong.
 -- H. L. Mencken 


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-20 Thread Greg Smith

Matthew Wakeling wrote:

On Fri, 15 Jan 2010, Greg Smith wrote:
My theory has been that the extra processing it has to perform you 
describe just doesn't matter in the context of a fast system where 
physical I/O is always the bottleneck.


Basically, to an extent, that's right. However, when you get 16 drives 
or more into a system, then it starts being an issue.


I guess if I test a system with *only* 16 drives in it one day, maybe 
I'll find out.


Seriously though, there is some difference between a completely 
synthetic test like you noted issues with here, and anything you can see 
when running the database.  I was commenting more on the state of things 
from the perspective of a database app, where I just haven't seen any of 
the CFQ issues I hear reports of in other contexts.  I'm sure there are 
plenty of low-level tests where the differences between the schedulers 
are completely obvious and it doesn't look as good anymore, and I'll take 
a look at whether I can replicate the test case you saw a specific 
concern with here.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-17 Thread Eduardo Piombino
 Seems like you'd also need to think about priority inversion, if the
 low-priority backend is holding any locks.


I'm not sure that priority inversion would be right in this scenario,
because in that case the IO storm would still be able to occur whenever the
slow jobs collide with the need-to-remain-fast (aka real-time) operations on
some lock. I'm using pg in a real-time environment communicating with many
different hardware devices, which all produce a light load but all require
real-time response times, and allowing a priority inversion would indirectly
allow IO storms in those cases, going back to where everything started.

However, if such a mechanism were to be implemented, maybe it (the inversion
of priorities) could be left as an option in the configuration that could be
turned on or off. In my case, I would just leave it off, but maybe some
applications would find it useful, knowing that IO storms may still appear
under certain conditions.

In the case where priority inversion is not to be used, I would however
still greatly benefit from the slow jobs/fast jobs mechanism, just being
extra careful that the slow jobs, obviously, did not acquire any locks that
a fast job would ever require. This alone would still be a *huge* feature
if it were ever introduced, reinforcing the real-time awareness/requirements
that many applications look for today.


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-17 Thread Greg Smith

Eduardo Piombino wrote:
In the case where priority inversion is not to be used, I would 
however still greatly benefit from the slow jobs/fast jobs mechanism, 
just being extra careful that the slow jobs, obviously, did not 
acquire any locks that a fast job would ever require. This alone would 
still be a *huge* feature if it were ever introduced, reinforcing the 
real-time awareness/requirements that many applications look for today.


In this context, priority inversion is not a generic term related to 
running things with lower priorities.  It means something very 
specific:  that you're allowing low-priority jobs to acquire locks on 
resources needed by high-priority ones, and therefore blocking the 
high-priority ones from running effectively.  Unfortunately, much like 
deadlock, it's impossible to avoid the problem in a generic way just by 
being careful.  It's one of the harder issues that needs to be 
considered in order to make progress on implementing this feature one day.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-16 Thread Greg Smith

Tom Lane wrote:

This is in fact exactly what the vacuum_cost_delay logic does.
It might be interesting to investigate generalizing that logic
so that it could throttle all of a backend's I/O not just vacuum.
In principle I think it ought to work all right for any I/O-bound
query.
  


So much for inventing a new idea; never considered that parallel 
before.  The logic is perfectly reusable, not so sure how much of the 
implementation would be though.


I think the main difference is that there's one shared VacuumCostBalance 
to worry about, whereas each backend that might be limited would need 
its own clear scratchpad to accumulate costs into.  That part seems 
similar to how the new EXPLAIN BUFFERS capability instruments things 
though, which was the angle I was thinking of approaching this from.  
Make that instrumenting more global, periodically compute a total cost 
from that instrument snapshot, and nap whenever the delta between the 
cost at the last nap and the current cost exceeds your threshold.


Bet I could find some more consumers in user land who'd love to watch 
that instrumented data too, if it were expanded to be available for 
operations beyond just plan execution.  I know it would make a lot of 
jobs easier if you could measure that "x statement cost you y" for 
more than just queries--for example, tracking whether any given UPDATE 
goes outside of the buffer cache or not would be fascinating tuning 
fodder.  Ditto if you could get a roll-up of everything a particular 
connection did.
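
As a concrete example of the sort of per-statement accounting being talked 
about, the EXPLAIN BUFFERS capability mentioned above already reports buffer 
hits, reads, and pages dirtied or written for a single statement. A sketch, 
with a hypothetical table (note that EXPLAIN ANALYZE actually executes the 
statement):

    -- Hypothetical table and columns; shows shared-buffer hits/reads and
    -- pages dirtied/written while running this one UPDATE.
    EXPLAIN (ANALYZE, BUFFERS)
    UPDATE accounts SET balance = balance + 1 WHERE id = 42;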


The part specific to the rate limiting that I don't have any good idea 
about yet is where to put the napping logic at, such that it would work 
across everything an I/O limited backend might do.  The only common 
point here seems to be the calls into the buffer manager code, but since 
that's happening with locks and pins you can't sleep in there.  Not 
enthusiastic about sprinkling every type of backend operation with a 
call to some nap check routine.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-16 Thread Greg Smith

Robert Haas wrote:

Seems like you'd also need to think about priority inversion, if the
low-priority backend is holding any locks.
  


Right, that's what I was alluding to in the last part:  the non-obvious 
piece here is not how to decide when the backend should nap because it's 
done too much I/O, it's how to figure out when it's safe for it to do so 
without causing trouble for others.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Matthew Wakeling

On Thu, 14 Jan 2010, Greg Smith wrote:

Andy Colson wrote:
So if there is very little io, or if there is way way too much, then the 
scheduler really doesn't matter.  So there is a slim middle ground where 
the io is within a small percent of the HD capacity where the scheduler 
might make a difference?


That's basically how I see it.  There seem to be people who run into 
workloads in the middle ground where the scheduler makes a world of 
difference.  I've never seen one myself, and suspect that some of the reports 
of deadline being a big improvement just relate to some bugginess in the 
default CFQ implementation that I just haven't encountered.


That's the perception I get. CFQ is the default scheduler, but in most 
systems I have seen, it performs worse than the other three schedulers, 
all of which seem to have identical performance. I would avoid 
anticipatory on a RAID array though.


It seems to me that CFQ is simply bandwidth limited by the extra 
processing it has to perform.


Matthew

--
Experience is what allows you to recognise a mistake the second time you
make it.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Craig James

Matthew Wakeling wrote:

On Thu, 14 Jan 2010, Greg Smith wrote:

Andy Colson wrote:
So if there is very little io, or if there is way way too much, then 
the scheduler really doesn't matter.  So there is a slim middle 
ground where the io is within a small percent of the HD capacity 
where the scheduler might make a difference?


That's basically how I see it.  There seem to be people who run into 
workloads in the middle ground where the scheduler makes a world of 
difference.  I've never seen one myself, and suspect that some of the 
reports of deadline being a big improvement just relate to some 
 bugginess in the default CFQ implementation that I just haven't 
encountered.


That's the perception I get. CFQ is the default scheduler, but in most 
systems I have seen, it performs worse than the other three schedulers, 
all of which seem to have identical performance. I would avoid 
anticipatory on a RAID array though.


I thought the best strategy for a good RAID controller was NOOP.  Anything the 
OS does just makes it harder for the RAID controller to do its job.  With a 
direct-attached disk, the OS knows where the heads are, but with a 
battery-backed RAID controller, the OS has no idea what's actually happening.

Craig

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Matthew Wakeling

On Fri, 15 Jan 2010, Craig James wrote:
That's the perception I get. CFQ is the default scheduler, but in most 
systems I have seen, it performs worse than the other three schedulers, all 
of which seem to have identical performance. I would avoid anticipatory on 
a RAID array though.


I thought the best strategy for a good RAID controller was NOOP.


Agreed. That's what we use here. My observation is though that noop is 
identical in performance to anticipatory and deadline. Theoretically, it 
should be faster.


Matthew

--
Take care that thou useth the proper method when thou taketh the measure of
high-voltage circuits so that thou doth not incinerate both thee and the
meter; for verily, though thou has no account number and can be easily
replaced, the meter doth have one, and as a consequence, bringeth much woe
upon the Supply Department.   -- The Ten Commandments of Electronics

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Greg Smith

Matthew Wakeling wrote:
CFQ is the default scheduler, but in most systems I have seen, it 
performs worse than the other three schedulers, all of which seem to 
have identical performance. I would avoid anticipatory on a RAID array 
though.


It seems to me that CFQ is simply bandwidth limited by the extra 
processing it has to perform.


I'm curious what you are doing when you see this. I've got several 
hundred hours' worth of pgbench data on CFQ vs. deadline from a couple of 
systems collected over the last three years, and I've never seen either a 
clear deadline win or a major CFQ failing. Most results are an even tie, 
with the occasional mild preference for CFQ under really brutal loads.


My theory has been that the extra processing it has to perform you 
describe just doesn't matter in the context of a fast system where 
physical I/O is always the bottleneck. I'd really like to have a 
compelling reason to prefer deadline, because the concept seems better, 
but I just haven't seen the data to back that up.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Greg Smith

Eduardo Piombino wrote:
Going to the disk properties (in Windows), I just realized it does not 
have the Write Cache enabled, and it doesn't allow me to turn it on 
either. I've read on Google that the inability to turn it on (that is, 
the checkbox doesn't stay checked after you apply the changes) has to do 
with the lack of battery backup in the controller (which is the default 
bundle option for the embedded EA-200, which is our case).


Regarding actual disk performance, I did some silly tests:
I copied a 496 MB file from one folder to another on C: and it took 
almost 90 secs.

That would be 496MB/90 sec = 5.51MB/sec



I'd suggest http://www.hdtune.com/ as a better way to test transfer 
speed here across the drive(s).


I think you'll find that your server continues to underperform 
expectations until you get the battery installed that allows turning the 
write cache on.  A quick look at HP's literature suggests they believe 
you only need the battery to enable the write-cache if you're using 
RAID5.  That's completely wrong for database use, where you will greatly 
benefit from it regardless of underlying RAID setup.  If you've got an 
EA-200 but don't have a battery for it to unlock all the features, 
you're unlikely to find a more cost-effective way to improve your system 
than to buy one.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Greg Smith

Eduardo Piombino wrote:
But already knowing that the base system (i.e. components out of pg's 
control, like OS, hardware, etc) may be buggy or that it can fail in 
rationalizing the IO, maybe it would be nice to tell whoever is 
responsible for making use of the IO subsystem (pg_bg_writer?) to use 
it in a moderate manner. That is ... This operation is not critical, 
please do not trash my system because it is not necessary. Use all the 
delays you would like, go slowly, please, I don't really care if you 
take a month. Or at least, be aware of the current status of the IO 
system. If it is busy, slow down; if it is free, speed up. Of course I 
won't complain if it takes less time to complete.


There are three problems here:

1) The background writer does not have a central role in the I/O of the 
system, and even if it did that would turn into a scalability issue.  
Clients initiate a lot of work on their own, and it's not so easy to 
actually figure out where to put a limiter at given that.


2) PostgreSQL aims to be cross-platform, and writing something that 
adjusts operations based on what the OS is doing requires a lot of 
OS-specific code.  You end up needing to write a whole new library for 
every platform you want to support.


3) Everyone who is spending money/time improving PostgreSQL has things 
they think are more important to work on than resource limiters, so 
there's just not anybody working on this.


Your request is completely reasonable and there are plenty of uses for 
it.  It's just harder than it might seem to build.  One day we may find 
someone with money to spend who can justify sponsoring development in 
this area because it's a must-have for their PostgreSQL deployment.  I 
assure you that any number of people reading this list would be happy to 
quote out that job.


But right now, there is no such sponsor I'm aware of.  That means the 
best we can do is try and help people work around the issues they do run 
into in the most effective way possible, which in your case has wandered 
into this investigation of your underlying disk subsystem.  It's not 
that we don't see that an alternate approach would make the problem go 
away, the code needed just isn't available, and other project 
development work (like the major replication advance that was just 
committed today) are seen as more important.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Craig Ringer
Eduardo Piombino wrote:

 I think pg is wasting resources that it could very well be taking
 advantage of, if you guys just tell me to get better hardware. I mean ...
 the IO subsystem is obviously the bottleneck of my system. But most of the
 time it is under very, very light load, actually ALL of the time, unless I
 do some heavy background processing like the original ALTER, or the
 procedure that updates 800,000 rows. What I would consider to be a great
 feature would be to be able to tell pgsql that a certain operation is not
 time critical, so that it does not try to use all the available IO
 subsystem at once, but rather rationalizes its use.

Rate-limiting (or preferably prioritizing) I/O from Pg would be nice.

It's already possible to prioritize I/O from Pg, though, albeit somewhat
clumsily:

  http://wiki.postgresql.org/wiki/Priorities

... as the OS provides I/O priority features. Pg shouldn't have to
re-implement those, only provide more convenient access to them.

( On Windows? Who knows. If you find out how to set I/O priorities on
Windows please extend that article! )
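
On Linux, the wiki recipe boils down to grabbing the backend's process ID 
from SQL and handing it to ionice/renice from a shell. A minimal sketch of 
the SQL half (using the pg_stat_activity column names from the 8.2/8.3 era):

    -- PID of your own session's backend, to be passed to ionice:
    SELECT pg_backend_pid();

    -- Or find another session's backend by what it is currently running:
    SELECT procpid, usename, current_query FROM pg_stat_activity;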

The trouble is that if you have a crappy RAID setup, the OS's I/O
priorities may be ineffective. The OS will do its best to prioritize
anything else over your don't-care-how-long-it-takes backend's query,
but if the RAID controller is queuing requests seconds-deep nothing the
OS does will make any real difference.

To my eternal frustration, there don't seem to be any RAID controllers
that have any concept of I/O priorities. I'd love Linux to be able to
submit requests to different queues within the controller depending on
priority, so low priority requests only got serviced if the
higher-priority queue was empty. AFAIK there isn't really anything like
that out there, though - all the RAID controllers seem to be built for
overall throughput at the expense of request latency to one extent or
another.

So ... you can prioritize I/O in the OS as much as you like, but your
RAID controller may merrily undermine all your work.

Doing it within Pg would suffer from many of the same issues. Pg has no
way to know how deeply the controller is queuing requests and when it's
actually finished a request, so it's very hard for Pg to rate-limit
its I/O effectively for low-priority work. It doesn't know how to
strike a balance between sending requests too fast (ruining latency for
higher priority work) and sending far too few (so taking forever for the
low priority work). What's insanely conservative on some hardware is
insanely too much to ask from other hardware. To be sure the controller
is done with a set of writes and ready for another, you'd have to
fsync(), and that'd be murderous on performance, completely ruining any
benefits gained from pacing the work.

It's also complicated by the fact that Pg's architecture is very poorly
suited to prioritizing I/O based on query or process. (AFAIK) basically
all writes go through shared_buffers and the bgwriter - neither Pg nor
in fact the OS know what query or what backend created a given set of
block writes.

To be able to effectively prioritize I/O you'd really have to be able to
bypass the bgwriter, instead doing the writes direct from the low
priority backend after ionice()ing or otherwise setting up low OS-level
I/O priorities. Even then, RAID-controller level queuing and buffering
might land up giving most of the I/O bandwidth to the low priority work
anyway.

I guess some kind of dynamic rate-limiting could theoretically also
allow Pg to write at (say) 50% of the device's write capacity at any
given time, but the multiple layers of buffering and the dynamic load
changes in the system would make it incredibly hard to effectively
evaluate what the system's write capacity actually was. You'd probably
have to run a dedicated Pg benchmark to generate some parameters to
calibrate low priority write rates... but they'd still change depending
on the random vs seq I/O mix of other processes and Pg backends on the
system, the amount of fsync() activity, etc etc etc. It's a more
complicated (!) version of the problem of rate-limiting TCP/IP data sending.

( Actually, implementing something akin to TCP/IP connection rate
limiting for allocating I/O write bandwidth in low-priority connections
would be ... fascinating. I'm sure the people who write OS write
schedulers and priority systems like ionice have looked into it and
found reasons why it's not suitable. )


The point of all that rambling: it's not as easy as just adding query
priorities to Pg!

 responsible for making use of the IO subsystem (pg_bg_writer?), to use
 it in a moderately manner. That is ... This operation is not critical,
 please do not trash my system because it is not necessary. Use all the
 delays you would like, go slowly, please, I don't really care if you
 take a month.

Trouble is, that's a rather rare case. Usually you *do* care if it takes
a month vs a week, because you're worried about lock times.

 Or at least, be aware of 

Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Greg Smith

Craig Ringer wrote:

It's also complicated by the fact that Pg's architecture is very poorly
suited to prioritizing I/O based on query or process. (AFAIK) basically
all writes go through shared_buffers and the bgwriter - neither Pg nor
in fact the OS know what query or what backend created a given set of
block writes.


You're correct that all writes go through shared_buffers, and all 
information about the query that dirties the page in the first place is 
gone by the time it's written out.  In 8.3 and later, buffers get 
written three ways:


(1) A backend needs to allocate a buffer to do some work.  The buffer it 
is allocated is dirty.  In this case, the backend itself ends up writing 
the page out.


(2) The background writer monitors how many allocations are going on, 
and it tries to keep ahead of the backends by writing pages likely to be 
re-used in the near future out before (1) happens.  (This is the part 
that was different in earlier versions--the background writer just 
roamed the whole buffer cache looking for work to do before, unrelated 
to the amount of activity on the system).


(3) Checkpoints (which are also executed by the background writer) have 
to write out every dirty buffer in order to reconcile everything between 
memory and disk.


One reason you can't just ionice the backend and make all the problems 
go away is (3); you can't let a sluggish backend stop checkpoints from 
happening.


You might note that only one of these sources--a backend allocating a 
buffer--is connected to the process you want to limit.  If you think of 
the problem from that side, it actually becomes possible to do something 
useful here.  The most practical way to throttle something down without 
a complete database redesign is to attack the problem via allocation.  
If you limited the rate of how many buffers a backend was allowed to 
allocate and dirty in the first place, that would be extremely effective 
in limiting its potential damage to I/O too, albeit indirectly.  Trying 
to limit the damage on the write and OS side instead is a dead end, 
you'll never make that work without a major development job--one that I 
would bet against ever being committed even if someone did it for a 
specific platform, because they're all going to be so different and the 
code so hackish.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-15 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 You might note that only one of these sources--a backend allocating a 
 buffer--is connected to the process you want to limit.  If you think of 
 the problem from that side, it actually becomes possible to do something 
 useful here.  The most practical way to throttle something down without 
 a complete database redesign is to attack the problem via allocation.  
 If you limited the rate of how many buffers a backend was allowed to 
 allocate and dirty in the first place, that would be extremely effective 
 in limiting its potential damage to I/O too, albeit indirectly.

This is in fact exactly what the vacuum_cost_delay logic does.
It might be interesting to investigate generalizing that logic
so that it could throttle all of a backend's I/O not just vacuum.
In principle I think it ought to work all right for any I/O-bound
query.
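
For reference, these are the knobs that the existing cost-delay logic 
exposes for vacuum; a generalized version would presumably grow an analogous 
per-backend set. The values below are purely illustrative, not 
recommendations:

    SET vacuum_cost_delay = 20;       -- ms to nap once the cost budget is spent (0 disables it)
    SET vacuum_cost_limit = 200;      -- accumulated cost allowed before each nap
    SET vacuum_cost_page_hit = 1;     -- cost charged for a buffer found in shared_buffers
    SET vacuum_cost_page_miss = 10;   -- cost charged for a page read from disk
    SET vacuum_cost_page_dirty = 20;  -- cost charged for dirtying a page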

But, as noted upthread, this is not high on the priority list
of any of the major developers.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Andy Colson

On 1/13/2010 11:36 PM, Craig Ringer wrote:

Robert Haas wrote:


I'm kind of surprised that there are disk I/O subsystems that are so
bad that a single thread doing non-stop I/O can take down the whole
server.  Is that normal?


No.


Does it happen on non-Windows operating
systems?


Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.


On a side note, on linux, would using the deadline scheduler resolve that?

-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Pierre Frédéric Caillaud



high CPU usage

It might very well be high IO usage.


Try this :

Copy (using explorer, the shell, whatever) a huge file.
This will create load similar to ALTER TABLE.
Measure throughput, how much is it ?

	If your server blows up just like it did on ALTER TABLE, you've got an IO 
system problem.

If everything is smooth, you can look into other things.

	How's your fragmentation? Did the disk ever get full? What does the 
task manager say (swap in/out, disk queue length, etc.)?


PS : try a separate tablespace on another disk.
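
	If you try that last suggestion, the SQL side is just a tablespace on the 
other disk plus moving the table there; a sketch, with a hypothetical path 
and the table name from earlier in the thread (the directory must already 
exist and be writable by the postgres service account):

    -- Note: moving an existing table rewrites it, so do this at a quiet time.
    CREATE TABLESPACE bulk_space LOCATION 'D:/otherdisk/pg_bulk';
    ALTER TABLE a_full SET TABLESPACE bulk_space;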

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Greg Smith

Andy Colson wrote:

On 1/13/2010 11:36 PM, Craig Ringer wrote:

Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.


On a side note, on linux, would using the deadline scheduler resolve 
that?


I've never seen the deadline scheduler resolve anything.  If you're out 
of I/O capacity and that's blocking other work, performance is dominated 
by the policies of the underlying controller/device caches.  Think about 
it a minute:  disks nowadays can easily have 32MB of buffer in them, 
right?  And random read/write operations are lucky to clear 2MB/s on 
cheap drives.  So once the drive is filled with requests, you can 
easily sit there for ten seconds before the scheduler even has any input 
on resolving the situation.  That's even more true if you've got a 
larger controller cache in the mix.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Andy Colson

On 1/14/2010 12:07 PM, Greg Smith wrote:

Andy Colson wrote:

On 1/13/2010 11:36 PM, Craig Ringer wrote:

Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.


On a side note, on linux, would using the deadline scheduler resolve
that?


I've never seen the deadline scheduler resolve anything. If you're out
of I/O capacity and that's blocking other work, performance is dominated
by the policies of the underlying controller/device caches. Think about
it a minute: disks nowadays can easily have 32MB of buffer in them,
right? And random read/write operations are lucky to clear 2MB/s on
cheap drives. So once the drive is filled with requests, you can easily
sit there for ten seconds before the scheduler even has any input on
resolving the situation. That's even more true if you've got a larger
controller cache in the mix.



That makes sense.  So if there is very little io, or if there is way way 
too much, then the scheduler really doesn't matter.  So there is a slim 
middle ground where the io is within a small percent of the HD capacity 
where the scheduler might make a difference?


-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Greg Smith

Andy Colson wrote:
So if there is very little io, or if there is way way too much, then 
the scheduler really doesn't matter.  So there is a slim middle ground 
where the io is within a small percent of the HD capacity where the 
scheduler might make a difference?


That's basically how I see it.  There seem to be people who run into 
workloads in the middle ground where the scheduler makes a world of 
difference.  I've never seen one myself, and suspect that some of the 
reports of deadline being a big improvement just relate to some bugginess 
in the default CFQ implementation that I just haven't encountered.


--
Greg Smith   2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Eduardo Piombino
Regarding the hardware the system is running on:

It's an HP Proliant DL-180 G5 server.

Here are the specs... our actual configuration only has one CPU, and 16G of
RAM.
The model of the 2 disks I will post later today, when I get to the server.
I've been busy with many things, sorry.

http://h18000.www1.hp.com/products/quickspecs/12903_na/12903_na.HTML
http://h18004.www1.hp.com/products/quickspecs/DS_00126/DS_00126.pdf

*At A Glance
*The HP ProLiant DL180 G5 is a low cost high capacity storage optimized
2-way server that delivers on a history of design excellence and 2U density
for a variety of rack deployments and applications.

   - Processors:
  - Supports up to two Quad-Core Intel® Xeon® processors: 5400 sequence
  with 12MB Level 2 cache
  - Intel® 5100 Chipset
   - Memory:
  - Up to 32 GB of memory supported by six (6) PC2-5300 (667 MHz) DDR2
  memory slots
   - Internal Drive Support:
  - Supports up to twelve via CTO with controller or up to eight via BTO
  with the addition of a controller:
 - Hot Plug Serial ATA (SATA) 3.5" hard drives; or
 - Hot Plug Serial Attached SCSI (SAS) 3.5" hard drives
 *NOTE:* 4 hard drives are supported standard via BTO. 8 hard drive
 support requires the addition of a Smart Array or HBA
controller. Hot Plug
 and SAS functionality require the addition of a Smart Array or HBA
 controller. 12 hard drive support available via CTO only and
requires a SAS
 controller that supports expanders.
  - Internal storage capacity:
 - SATA Models: Up to 12.0TB (12 x 1TB Hot Plug 3.5 hard drives)
 - SAS Model: Up to 12.0TB (12 x 1TB Hot Plug 3.5 hard drives)
   - Network Controller:
  - One integrated NC105i PCI-e Gigabit NIC (embedded) (Wake on LAN and
  PXE capable)
   - Storage Controllers:
  - HP Embedded SATA RAID Controller (up to 4 hard drive support on
  standard BTO models)
  *NOTE:* Transfer rate 1.5 Gb/s SATA
   - Expansion Slots:
  - One available Low Profile x8 PCI-Express slot using a Low profile
  Riser.
  - Two Full Height/ Full Length Riser options
 - Option1: 2 full-length/full-height PCI-Express x8 connector slots
 (x4 electrical - Standard)
 - Option2: full-length/full-height riser with 2 PCI-X
 Slots(Optional)
   - Infrastructure Management:
  - Optional HP Lights Out 100c Remote Management card with Virtual KVM
  and Virtual Media support (includes IPMI2.0 and SMASH support)
   - USB Ports:
  - Seven USB ports (2) front, (4) rear, (1) internal
   - Optical Drive:
  - Support for one:
 - Optional Multi-bay DVD
 - Optional Floppy (USB only, USB key)
   - Power Supply:
  - 750W Power Supply (Optional Redundancy Hot Plug, Autoswitching) CSCI
  2007/8
  - 1200W High Efficiency Power Supply (Optional Redundancy Hot Plug,
  Autoswitching) (Optional) CSCI 2007/8
 - *NOTE:* Climate Savers Computing Initiative, 2007-2008 Compliant
   - Form Factor:
  - 2U rack models


Regarding the SATA RAID controller, on the other spec pages it says that the
8-disk model (ours) comes with a Smart Array E200. I will try to check
whether we are using the original, since I recall hearing something about
our disks being SAS (Serial Attached SCSI), and I don't know if it is
possible to connect those disks to the embedded Smart Array E200 controller.
Would that be possible?

On Wed, Jan 13, 2010 at 4:13 PM, Eduardo Piombino drak...@gmail.com wrote:

 Greg, I will post more detailed data as soon as I'm able to gather it.

 I was testing whether cancelling the ALTER cmd worked ok; I might
 give the ALTER another try and see how much CPU, RAM and IO usage gets
 involved. I will be doing this monitoring with Process Explorer from
 Sysinternals, but I don't know how I can make it log the results. Do you
 know of any tool that you have used that can help me generate this evidence? I
 will google a little as soon as possible.



 On Wed, Jan 13, 2010 at 3:46 PM, Greg Smith g...@2ndquadrant.com wrote:

 Robert Haas wrote:

 I'm kind of surprised that there are disk I/O subsystems that are so
 bad that a single thread doing non-stop I/O can take down the whole
 server.  Is that normal?  Does it happen on non-Windows operating
 systems?  What kind of hardware should I not buy to make sure this
 doesn't happen to me?


 You can kill any hardware on any OS with the right abusive client.  Create
 a wide table and insert a few million records into it with generate_series
 one day and watch what it does to queries trying to run in parallel with
 that.
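
 A sketch of that kind of abusive client, with hypothetical names and sizes;
 on most hardware the write flood it produces will visibly hurt anything
 running alongside it:

    CREATE TABLE wide_junk (id int, pad1 text, pad2 text, pad3 text);
    INSERT INTO wide_junk
    SELECT i, repeat('x', 500), repeat('y', 500), repeat('z', 500)
    FROM generate_series(1, 5000000) AS i;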

 I think the missing step here to nail down exactly what's happening on
 Eduardo's system is that he should open up some of the Windows system
 monitoring tools, look at both disk I/O and CPU usage, and then watch what
 changes when the troublesome ALTER TABLE shows up.


 --
 Greg Smith   2ndQuadrant   

Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-14 Thread Eduardo Piombino
Regarding the EA-200 card, here are the specs.
It seems it has support for SAS disks, so most probably we are using the
embedded/default controller.

http://h18000.www1.hp.com/products/quickspecs/12460_div/12460_div.html
http://h18000.www1.hp.com/products/quickspecs/12460_div/12460_div.pdf

*Key Features *

   - Seamless upgrades from past generations and upgrades to next generation
   HP high performance and high capacity Serial Attached SCSI Smart Array
   controllers.
   - 3G SAS technology delivers high performance and data bandwidth up to
    300 MB/s per physical link and contains full compatibility with 1.5G SATA
   technology.
   - x4 2.5G PCI Express host interface technology delivers high performance
   and data bandwidth up to 2 GB/s maximum bandwidth.
   - Addition of the battery backed cache upgrade enables BBWC, RAID 5,
   Capacity Expansion, RAID migration, and Stripe Size Migration.
   - Mix-and-match SAS and SATA hard drives, lets you deploy drive
   technology as needed to fit your computing environment.
   - Support for up to 2 TB in a single logical drive.
   - Software consistency among all Smart Array family products: Array
   Configuration Utility (ACU), Option ROM Configuration for Arrays (ORCA),
   Systems Insight Manager, Array Diagnostic Utility (ADU) and SmartStart. Some
   of these features are not available with ProLiant 100 series platforms.
   - The SA-E200 controller supports up to 8 drives. The SA-E200i supports
   2-8 drives depending on the server implementation.


*Performance*

HP's High Performance Architecture sets new boundaries of industry
performance expectations!

   - 3Gb/s SAS (300MB/s bandwidth per physical link)
   - x8 3Gb/s SAS physical links (compatible with 1.5G SATA)
   - 64 MB or 128 MB DDR1-266 battery-backed cache provides up to 4.2 GB/s
   maximum bandwidth.
   - x4 2.5G PCI Express host interface provides 2 GB/s maximum bandwidth.
   - MIPS 32-bit Processor
   - Read ahead caching
   - Write-back caching (with battery-backed write cache upgrade)


*Capacity *

Given the increasing need for high performance and rapid capacity expansion,
the SA-E200 offers:

   - Up to 6TB of total storage with 6 x 1TB SATA MDL hard drives (3.5)
   *NOTE:* Support for greater than 2TB in a single logical drive.
   - Up to 2.4TB of total storage with 8 x 300GB SFF SAS hard drives


On Thu, Jan 14, 2010 at 5:49 PM, Eduardo Piombino drak...@gmail.com wrote:

 Regarding the hardware the system is running on:

 It's an HP Proliant DL-180 G5 server.

 Here are the specs... our actual configuration only has one CPU, and 16G of
 RAM.
 The model of the 2 disks I will post later today, when I get to the server.
 I was with many things, sorry.

 http://h18000.www1.hp.com/products/quickspecs/12903_na/12903_na.HTML
 http://h18004.www1.hp.com/products/quickspecs/DS_00126/DS_00126.pdf

 *At A Glance
 *The HP ProLiant DL180 G5 is a low cost high capacity storage optimized
 2-way server that delivers on a history of design excellence and 2U density
 for a variety of rack deployments and applications.

- Processors:
   - Supports up to two Quad-Core Intel® Xeon® processors: 5400
   sequence with 12MB Level 2 cache
   - Intel® 5100 Chipset
- Memory:
   - Up to 32 GB of memory supported by six (6) PC2-5300 (667 MHz) DDR2
   memory slots
- Internal Drive Support:
   - Supports up to twelve via CTO with controller or up to eight via
   BTO with the addition of a controller:
   - Hot Plug Serial ATA (SATA) 3.5" hard drives; or
   - Hot Plug Serial Attached SCSI (SAS) 3.5" hard drives
  *NOTE:* 4 hard drives are supported standard via BTO. 8 hard
  drive support requires the addition of a Smart Array or HBA 
 controller. Hot
  Plug and SAS functionality require the addition of a Smart Array or 
 HBA
  controller. 12 hard drive support available via CTO only and 
 requires a SAS
  controller that supports expanders.
   - Internal storage capacity:
  - SATA Models: Up to 12.0TB (12 x 1TB Hot Plug 3.5 hard drives)
  - SAS Model: Up to 12.0TB (12 x 1TB Hot Plug 3.5 hard drives)
- Network Controller:
   - One integrated NC105i PCI-e Gigabit NIC (embedded) (Wake on LAN
   and PXE capable)
- Storage Controllers:
   - HP Embedded SATA RAID Controller (up to 4 hard drive support on
   standard BTO models)
   *NOTE:* Transfer rate 1.5 Gb/s SATA
- Expansion Slots:
   - One available Low Profile x8 PCI-Express slot using a Low profile
   Riser.
   - Two Full Height/ Full Length Riser options
  - Option1: 2 full-length/full-height PCI-Express x8 connector
  slots (x4 electrical - Standard)
  - Option2: full-length/full-height riser with 2 PCI-X
  Slots(Optional)
- Infrastructure Management:
   - Optional HP Lights Out 100c Remote Management card with Virtual
   KVM and Virtual Media support 

Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Eduardo Piombino
Yes, one of the things I will do asap is to migrate to the latest version.

On another occasion I went through the checkpoint parameters you mentioned,
but left them untouched since they seemed logical.
I'm a little reluctant to change the checkpoint configuration just to let
me do a once-in-a-lifetime ALTER.
The checkpoints would then end up too far apart in time (or in traffic).
And the thought of touching it and retouching it every time I need to do
something different bugs me a little. But if there is no other option I will
definitely give it a try.

Are you sure, for instance, that the ALTER command (and the internal data it
may require to handle, let's say 1.8 million records * 1024 bytes/record
(approx.)) goes to RAM, then to disk, and gets logged in the WAL during the
whole process? Maybe it does not get logged at all until the ALTER is
completed? Since the original table can be left untouched until this copy of
the table gets updated ... Just guessing here.


On Wed, Jan 13, 2010 at 4:39 AM, Greg Smith g...@2ndquadrant.com wrote:

 Eduardo Piombino wrote:

 Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone.


 Ugh...there are several features in PostgreSQL 8.3 and later specifically
 to address the sort of issue you're running into.  If you want to get good
 write performance out of this system, you may need to upgrade to at least
 that version.  It's impossible to resolve several of the common problems in
 write operations being too intense using any 8.2 version.

 The final effect was that the server went non-responsive, for all intents
 and purposes; not even the Task Manager would come up when I hit
 CTRL-ALT-DEL, and of course every client would suffer horrific delays (+20
 secs) for the simplest operations like SELECT NOW();


 The thing that you have to realize is that altering a table is basically
 making a new copy of that table, which is a really heavy amount of writing.
  It's quite easy for an I/O heavy operation like that to fill up a lot of
 RAM with data to be written out, and when the database periodically needs to
 force all that data out to disk the whole system grinds to a halt when it
 happens.  There's no way I'm aware of to throttle that writing down to a
 reasonable amount under Windows either, to achieve your goal of just making
 the ALTER run using less resources.

 Some reading:

 http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server goes over
 basic tuning of the database server.  If you haven't already increased the
 checkpoint_segments parameters of your system, that's the first thing to
 try--increase it *a lot* (32 or more, default is 3) because it can really
 help with this problem.  A moderate increase to shared_buffers is in order
 too; since you're on Windows, increasing it to 256MB is a reasonable change.
  The rest of the changes in there aren't likely to help out with this
 specific problem.
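
 To see where those two settings currently stand, SHOW works from any
 session; the suggested starting points from above are noted in the comments.
 Both are changed in postgresql.conf rather than per session:

    SHOW checkpoint_segments;  -- default 3; 32 or more suggested for heavy writes
    SHOW shared_buffers;       -- a moderate bump, to around 256MB, on this Windows box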

 http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm covers
 the most likely cause of the issue you're running into. Unfortunately, most
 of the solutions you'll see there are things changed in 8.3.

 --
  Greg Smith   2ndQuadrant   Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Craig Ringer

On 13/01/2010 3:03 PM, Eduardo Piombino wrote:

One last question, this IO issue I'm facing, do you think it is just a
matter of RAID configuration speed, or a matter of queue gluttony (and
not leaving time for other processes to get into the IO queue in a
reasonable time)?


Hard to say with the data provided. It's not *just* a matter of a slow 
array, but that might contribute.


Specifically, though, by slow array in this case I'm looking at 
latency rather than throughput, particularly read latency under heavy 
write load. Simple write throughput isn't really the issue, though bad 
write throughput can make it fall apart under a lighter load than it 
would otherwise.


High read latencies may not be caused by deep queuing, though that's one 
possible cause. A controller that prioritizes batching sequential writes 
efficiently over serving random reads would cause it too - though 
reducing its queue depth so it can't see as many writes to batch would help.


Let me stress, again, that if you have a decent RAID controller with a 
battery backed cache unit you can enable write caching and most of these 
issues just go away. Using an array format with better read/write 
concurrency, like RAID 10, may help as well.


Honestly, though, at this point you need to collect data on what the 
system is actually doing, what's slowing it down and where. *then* look 
into how to address it. I can't advise you much on that as you're using 
Windows, but there must be lots of info on optimising windows I/O 
latencies and throughput on the 'net...



Because if it was just a matter of speed, ok, with my actual RAID
configuration lets say it takes 10 minutes to process the ALTER TABLE
(leaving no space to other IOs until the ALTER TABLE is done), lets say
then i put the fastest possible RAID setup, or even remove RAID for the
sake of speed, and it completes in lets say again, 10 seconds (an unreal
assumption). But if my table now grows 60 times, I would be facing the
very same problem again, even with the best RAID configuration.


Only if the issue is one of pure write throughput. I don't think it is. 
You don't care how long the ALTER takes, only how much it impacts other 
users. Reducing the impact on other users so your ALTER can complete in 
its own time without stamping all over other work is the idea.



The problem would seem to be in the way the OS (or hardware, or someone
else, or all of them) is/are inserting the IO requests into the queue.


It *might* be. There's just not enough information to tell that yet. 
You'll need to do quite a bit more monitoring. I don't have the 
expertise to advise you on what to do and how to do it under Windows.



What can I do to control the order in which these IO requests are
finally entered into the queue?


No idea. You probably need to look into I/O priorities on Windows.

Ideally you shouldn't have to, though. If you can keep read latencies at 
sane levels under high write load on your array, you don't *need* to 
mess with this.


Note that I'm still guessing about the issue being high read latencies 
under write load. It fits what you describe, but there isn't enough data 
to be sure, and I don't know how to collect it on Windows.



What cards do I have to manipulate the order the IO requests are entered
into the queue?
Can I disable this queue?
Should I turn disk's IO operation caches off?
Not use some specific disk/RAID  vendor, for instance?


Don't know. Contact your RAID card tech support, Google, search MSDN, etc.

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Euler Taveira de Oliveira
Eduardo Piombino wrote:
 Maybe it does not get logged at all until the ALTER is completed?
 
This feature [1] was implemented a few months ago and it will be available
only in the next PostgreSQL version (8.5).

[1] http://archives.postgresql.org/pgsql-committers/2009-11/msg00018.php


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Eduardo Piombino
With that said, I assume my current version of pgsql DOES make all this
heavy work go through WAL logging.

The curious thing is that I remember (of course) reviewing the logs from the
crash times, and I didn't see anything strange, not even the famous warning
along the lines of "you are making checkpoints too often, maybe you should
consider extending the checkpoint_segments parameter".

I will check it again.
Besides, I will try to gather as much information on the system itself (RAID
controllers, disk vendors, etc).
Thank you, will keep you posted.

On Wed, Jan 13, 2010 at 11:06 AM, Euler Taveira de Oliveira 
eu...@timbira.com wrote:

  Eduardo Piombino wrote:
  Maybe it does not get logged at all until the ALTER is completed?
 
 This feature [1] was implemented a few months ago and it will be available
 only in the next PostgreSQL version (8.5).

 [1] http://archives.postgresql.org/pgsql-committers/2009-11/msg00018.php


 --
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Robert Haas
On Wed, Jan 13, 2010 at 2:03 AM, Eduardo Piombino drak...@gmail.com wrote:
 Excellent, lots of useful information in your message.
 I will follow your advice and keep you posted on any progress. I have yet
 to confirm some technical details of my setup with you, but I'm pretty sure
 you hit the nail on the head in any case.

 One last question, this IO issue I'm facing, do you think it is just a
 matter of RAID configuration speed, or a matter of queue gluttony (and not
 leaving time for other processes to get into the IO queue in a reasonable
 time)?

 Because if it was just a matter of speed, ok, with my actual RAID
 configuration lets say it takes 10 minutes to process the ALTER TABLE
 (leaving no space to other IOs until the ALTER TABLE is done), lets say then
 i put the fastest possible RAID setup, or even remove RAID for the sake of
 speed, and it completes in lets say again, 10 seconds (an unreal
 assumption). But if my table now grows 60 times, I would be facing the very
 same problem again, even with the best RAID configuration.

 The problem would seem to be in the way the OS (or hardware, or someone
 else, or all of them) is/are inserting the IO requests into the queue.
 What can I do to control the order in which these IO requests are finally
 entered into the queue?
 I mean ... what I would like to obtain is:

 Considering the ALTER TABLE as a sequence of 100,000 READ/WRITE OPERATIONS
 Considering the SELECT * FROM xxx as a sequence of 100 READ OPERATIONS
 (totally unrelated on disk)

 First I run the ALTER TABLE on a thread...
 Let's say by the time it generates 1,000 READ/WRITE OPERATIONS, the other
 thread starts with the SELECT * FROM xxx ...
 I would expect the IO system to give those 100 READ OPERATIONS a chance to
 execute immediately (with no need to wait for the remaining 99,000
 READ/WRITE OPERATIONS to finish), that is, to enter the queue at *almost*
 the very same moment the IO requests were issued.

 If I cannot guarantee that, I'm kinda doomed, because the larger the
 amount of IO operations requested by a heavy duty operation, the longer
 it will take any other thread to start doing anything.

One thing you can do - although it's a darn lot of work compared to
just running a DDL command - is create a new empty table with the
schema you want and then write a script that copies, say, 1000 records
from the old table to the new table.  If your table has a primary key
with a natural sort ordering, it's not too hard to keep track of where
you left off the last time and continue on from there.  Then you can
incrementally get all of your data over without swamping the system.
I realize that's a pain in the neck, of course.
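
In sketch form, and purely as an illustration (the copy target a_new and a
serial primary key column named id are placeholder names, not something from
this thread), each batch could be something like this, driven from a psql
script that remembers the last id copied:

INSERT INTO a_new
SELECT * FROM a
WHERE id > :last_id      -- psql variable: highest id copied by the previous batch
ORDER BY id
LIMIT 1000;

Sleeping briefly between batches is what actually keeps the I/O pressure down;
the batch size and the pause can be tuned until the rest of the system stays
responsive.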

I'm kind of surprised that there are disk I/O subsystems that are so
bad that a single thread doing non-stop I/O can take down the whole
server.  Is that normal?  Does it happen on non-Windows operating
systems?  What kind of hardware should I not buy to make sure this
doesn't happen to me?

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Scott Marlowe
On Tue, Jan 12, 2010 at 9:59 PM, Eduardo Piombino drak...@gmail.com wrote:
...

 Now, with this experience, I tried a simple workaround.
 Created an empty version of a named a_empty, identical in every sense.
 renamed a to a_full, and a_empty to a. This procedure cost me like
 0 seconds of downtime, and everything kept working smoothly. Maybe a cpl of
 operations could have failed if they tried to write in the very second that
 there was actually no table named a, but since the operation was
 transactional, the worst scenario was that if the operation should have
 failed, the client application would just inform of the error and ask the
 user for a retry. No big deal.

 Now, this table, which is totally unattached to the system in every way (no
 one references this table, it's like a dumpster for old records), is not
 being accessed by any other thread in the system, so an ALTER table on it, to
 turn a char(255) to char(250), should have no effect on the system.

 So, with this in mind, I tried the ALTER TABLE this time on the a_full
 (totally unrelated) table.
 The system went non-responsive again, and this time it had nothing to do
 with threads waiting for the alter table to complete. The pgAdmin GUI went
 non-responsive, as well as the application's server GUI, whose threads kept
 working in the background, but started to take more and more time for every
 client's request (up to 25 seconds, which is just ridiculous and completely
 unacceptable in normal conditions).

OK, I'm not entirely sure this table is not still locking something
else.  If you make a copy by doing something like:

select * into test_table from a;

and then alter test_table do you still get the same problems?  If so,
then it is an IO issue, most likely.  If not, then there is some
client connection still referencing this table or something and that
could cause this type of behaviour as well.
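
Spelled out, the test would be roughly as follows (the column name and the
target type here are placeholders for whatever the real ALTER does):

SELECT * INTO test_table FROM a;
ALTER TABLE test_table ALTER COLUMN some_col TYPE char(250);
-- if this also makes the whole server unresponsive, locking is ruled out
DROP TABLE test_table;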

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Eduardo Piombino
 OK, I'm not entirely sure this table is not still locking something
 else.  If you make a copy by doing something like:

 select * into test_table from a;

 and then alter test_table do you still get the same problems?  If so,
 then it is an IO issue, most likely.  If not, then there is some
 client connection still referencing this table or something and that
 could cause this type of behaviour as well.


I can guarantee you that the table is not being referenced by any other
thread, table or process, and that it is totally unrelated to everything
else in the system.

It's just a plain table, with 1.8 million records, that no thread knows
exists. It has no foreign keys that would allow thinking of a possible
lock on a parent table, nor is it being referenced by any other table in
the model. It has no triggers associated, and no indexes. It could very well
even be on another database on the same physical server, and still do the
same damage. I did not try this, but I'm pretty sure of the outcome. I
wouldn't like to bring the server down just to prove this, but I will do it
if I find it necessary.

The only things that are common to this table and other tables in the
system, as I see are:
RAM, IO, and CPU, at a very low level. One of these is being stressed out by
the thread executing the ALTER, and the other threads (not just pgsql
application threads, but system processes in general) suffer from the lack
of this resource. All the previous discussions tend to induce that the
resource we are talking about is IO.

The fact that the Task Manager does not come up would also not be explained
by a lock in a client thread.
Besides all that, all the client queries are NO WAIT, thus any lock would
just return immediately, and no retry would be done until the response gets
back to the user and the user confirms it. In that case, all the errors
presented to the final users would be "The element is being processed some
other place", from my default handler for pgsql error code 55P03, instead of
the horrible "Operation timed out", which is what final users got during the
huge slowdown/downtime.





--
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Scott Marlowe
On Wed, Jan 13, 2010 at 10:54 AM, Eduardo Piombino drak...@gmail.com wrote:

 OK, I'm not entirely sure this table is not still locking something
 else.  If you make a copy by doing something like:

 select * into test_table from a;

 and then alter test_table do you still get the same problems?  If so,
 then it is an IO issue, most likely.  If not, then there is some
 client connection still referencing this table or something and that
 could cause this type of behaviour as well.

 I can guarantee you that the table is not being referenced by any other
 thread, table or process, and that it is totally unrelated to everything
 else in the system.

If you rename a table that WAS being referenced by other threads, then
it might still be being accessed or waited on etc by those threads, as
their transaction would have started earlier.

The only way you can guarantee it's not being referenced in some way is
to create it fresh and new as I suggested and test on that.  Until
then, your guarantee is based on a belief, not verifiable fact.  I too
tend to believe this is an IO problem btw, but claiming that it can't
be a problem with some locks, without at least looking at pg_locks, is
a bit premature.
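
For instance, something as simple as the following, run while the ALTER and
the slow queries are in flight, would show whether anything is actually stuck
waiting on a lock:

SELECT locktype, relation::regclass AS relation, mode, granted, pid
FROM pg_locks
WHERE NOT granted;

An empty result while the system is crawling points back at I/O rather than
locking.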

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Greg Smith

Robert Haas wrote:

I'm kind of surprised that there are disk I/O subsystems that are so
bad that a single thread doing non-stop I/O can take down the whole
server.  Is that normal?  Does it happen on non-Windows operating
systems?  What kind of hardware should I not buy to make sure this
doesn't happen to me?
  
You can kill any hardware on any OS with the right abusive client.  
Create a wide table and insert a few million records into it with 
generate_series one day and watch what it does to queries trying to run 
in parallel with that.


I think the missing step here to nail down exactly what's happening on 
Eduardo's system is that he should open up some of the Windows system 
monitoring tools, look at both disk I/O and CPU usage, and then watch 
what changes when the troublesome ALTER TABLE shows up.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Eduardo Piombino
Greg, I will post more detailed data as soon as I'm able to gather it.

I was checking whether cancellation of the ALTER cmd worked ok. I might
give the ALTER another try, and see how much CPU, RAM and IO usage gets
involved. I will be doing this monitoring with Process Explorer from
Sysinternals, but I don't know how to make it log the results. Do you
know of any tool you have used that can help me generate this evidence? I
will google a little as soon as possible.


On Wed, Jan 13, 2010 at 3:46 PM, Greg Smith g...@2ndquadrant.com wrote:

 Robert Haas wrote:

 I'm kind of surprised that there are disk I/O subsystems that are so
 bad that a single thread doing non-stop I/O can take down the whole
 server.  Is that normal?  Does it happen on non-Windows operating
 systems?  What kind of hardware should I not buy to make sure this
 doesn't happen to me?


 You can kill any hardware on any OS with the right abusive client.  Create
 a wide table and insert a few million records into it with generate_series
 one day and watch what it does to queries trying to run in parallel with
 that.

 I think the missing step here to nail down exactly what's happening on
 Eduardo's system is that he should open up some of the Windows system
 monitoring tools, look at both disk I/O and CPU usage, and then watch what
 changes when the troublesome ALTER TABLE shows up.


 --
 Greg Smith2ndQuadrant   Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-13 Thread Craig Ringer
Robert Haas wrote:

 I'm kind of surprised that there are disk I/O subsystems that are so
 bad that a single thread doing non-stop I/O can take down the whole
 server.  Is that normal?

No.

 Does it happen on non-Windows operating
 systems? 

Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.

I landed up having dramatically better results by disabling the
controller's RAID features, instead exposing each disk to the OS
separately and using Linux's software RAID.

 What kind of hardware should I not buy to make sure this
 doesn't happen to me?

3ware's older cards. Apparently their new ones are a lot better, but I
haven't verified this personally.

Anything in RAID-5 without a BBU.

Anything at all without a BBU, preferably.

--
Craig Ringer

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-12 Thread Craig James

Eduardo Piombino wrote:
Hi list, I'm having a problem when dealing with operations that ask too
much CPU from the server.

The scenario is this:


A nice description below, but ... you give no information about your system: 
number of CPUs, disk types and configuration, how much memory, what have you 
changed in your Postgres configuration?  And what operating system, what 
version of Postgres, etc., etc.  The more information you give, the better the 
answer.

If you're operating on a single disk with a tiny amount of memory, and an old,
misconfigured Postgres on a laptop computer, that's a whole different problem
than if you're on a big system with 16 CPUs and a huge RAID 1+0 with
battery-backed cache.

Craig



I have a multithreaded server, each thread with its own connection to 
the database. Everything is working fine, actually great, actually 
outstandingly, in normal operation.


I've a table named a with 1.8 million records, and growing, but I'm ok 
with it, at least for the moment. Maybe in the near future we will cut 
it down, backup old data, and free it up. But this is not the issue, as 
I said, everything is working great. I have a cpl of indexes to help 
some queries, and that's it.


Now my problem started when I tried to do some model refactoring on this 
production table.


First I tried a dumb approach.
I connected from pgadmin, opened a new session.
I tried an ALTER TABLE on this table just to turn a char(255) field into 
char(250), and it locked up my system.


No surprise, since I had many threads waiting for this alter table to 
finish. What I did not foresee was that this alter table would take up 
so much time. Ok, my fault, for not having calculated the time that it 
would take the ALTER TABLE to complete.


Now, with this experience, I tried a simple workaround.
Created an empty version of a named a_empty, identical in every sense.
renamed a to a_full, and a_empty to a. This procedure cost me
like 0 seconds of downtime, and everything kept working smoothly. Maybe
a cpl of operations could have failed if they tried to write in the very 
second that there was actually no table named a, but since the 
operation was transactional, the worst scenario was that if the 
operation should have failed, the client application would just inform 
of the error and ask the user for a retry. No big deal.


Now, this table, which is totally unattached to the system in every way
(no one references this table, it's like a dumpster for old records), is
not being accessed by any other thread in the system, so an ALTER table
on it, to turn a char(255) to char(250), should have no effect on the
system.


So, with this in mind, I tried the ALTER TABLE this time on the a_full 
(totally unrelated) table.
The system went non-responsive again, and this time it had nothing to do
with threads waiting for the alter table to complete. The pgAdmin GUI
went non-responsive, as well as the application's server GUI, whose
threads kept working in the background, but started to take more and
more time for every client's request (up to 25 seconds, which is just
ridiculous and completely unacceptable in normal conditions).


This resulted in my client applications starting to disconnect after
their operations failed due to timeout, and the system basically went
down again, from a user's point of view.


This time, since I saw no relation between my operation on a totally
unrelated table and the server's BIG slowdown, I blamed the server's memory.


After some tests, I came to the conclusion that any heavy duty
operation on any thread (ALTER TABLE on 1.8 million record tables,
updates on this table, or an infinite loop, just to make my point)
would affect the whole server.


Bottom line is, I can't seem to do any heavy processing on the database
(or any operation that would require the server to enter into high CPU
usage), and still expect the server to behave normally. Whatever heavy
duty operation, DDL or DML, on whatever table (related or unrelated), on
whatever thread, would tear down my server's integrity.


My question then is: is there a way to limit the CPU assigned to a 
specific connection?

I mean, I don't care if my ALTER TABLE takes 4 days instead of 4 hours.

Something like:
pg_set_max_cpu_usage(2/100);

and rest assured that no matter what that thread is asking the database
to do, it just won't affect the other running threads. Obviously, assuming
that the process itself does not involve any locking of the other threads.


Is something like that possible?

Thanks in advance,
Eduardo.




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-12 Thread Craig Ringer

On 13/01/2010 12:59 PM, Eduardo Piombino wrote:


My question then is: is there a way to limit the CPU assigned to a
specific connection?
I mean, I don't care if my ALTER TABLE takes 4 days instead of 4 hours.

Something like:
pg_set_max_cpu_usage(2/100);


You're assuming the issue is CPU. I think that unlikely. In general, a 
single thread/process that wants as much CPU as it can get won't bring 
any machine with a half-decent OS to its knees. Any UNIX system should 
barely notice - everything else will slow down somewhat, depending on 
its scheduler, but in any sane setup shouldn't slow down by more than 
1/2. Modern Windows tends to be fairly well behaved here too.


What's much more likely is that you're working with a crappy disk setup 
- such as a RAID 5 array without battery-backed cache, or a single slow 
disk. You probably also have quite deep write queuing in the RAID 
controller / disk / OS. This means that your disk-intensive ALTER TABLE 
makes your disk subsystem so busy that it takes ages before any other 
process gets a look-in. It's not unlikely that I/O requests are being 
queued so deeply that it (often) takes several seconds for the 
controller to get around to executing a newly submitted read or write 
request. If your other queries need to do more than a few steps where 
they read some data, think about it, and read other data depending on 
the first read, then they're going to take forever, because they're
going to have to endure a long delay before disk access each time.
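
To put rough, purely illustrative numbers on it: a mirrored pair of SATA
disks might manage something like 150 random IOPS, so if the controller
already has 200 requests queued ahead of a newly submitted read, that read
waits on the order of 200 / 150 of a second - over a second - before the
disks even start on it, and a query that issues a few dozen dependent reads
multiplies that delay accordingly.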


Of course, that's just a guess, since you've provided no information on 
your hardware. Try collecting up some of the information shown here:


http://wiki.postgresql.org/wiki/Guide_to_reporting_problems#Things_you_need_to_mention

In any case, if it *is* I/O related, what to do about it depends on 
exactly what sort of I/O issue it is. Extremely deep queuing? Looks good 
for throughput benchmarks, but is stupid if you care about latency and 
have some I/O that's higher priority than others, so reduce your queue 
depth. Very slow writes hammering reads? Don't use RAID 5. Etc.


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-12 Thread Eduardo Piombino
I'm sorry.

The server is a production server HP Proliant, I don't remember the exact
model, but the key features were:
4 cores, over 2GHz each (I'm sorry I don't remember the actual specs), I
think it had 16G of RAM (if that is possible?)
It has two 320G disks in RAID (mirrored).

I don't even have the emails with the specs here, but I can give you the
exact configuration by tomorrow.

Operating system: Windows 2003 server, with latest patches.
Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone.

At any given time, the server is at 0% CPU load, with peaks of 1%, 2% max,
in normal operation.

I've been digging a little in the archives, and one thing it helped me
realize is that I don't really remember seeing high CPU usage (a fact
that surprised me), but I do remember seeing high IO activity. I'm sorry,
it's pretty late here.
I know this single statement is enough to almost change everything I've just
asked.
Please try interpreting my original mail again, considering that when I said
"high CPU usage" it might very well be high IO usage.

The final effect was that the server went non-responsive for all practical
purposes: not even the Task Manager would come up when I hit CTRL-ALT-DEL,
and of course every client would suffer horrific delays (+20 secs) for the
simplest operations, like SELECT NOW();

I've just made a little modification to my original questions, to extend them
to the possibility of an IO usage issue, instead of just CPU.

 Bottom line is, I can't seem to do any heavy processing on the database
 (or any operation that would require the server to enter into high CPU usage
 or IO USAGE), and still expect the server to behave normally. Whatever
 heavy duty operation, DDL or DML, on whatever table (related or unrelated),
 on whatever thread, would tear down my server's integrity.

 My question then is: is there a way to limit the CPU or IO USAGE assigned to
 a specific connection?

 I mean, I don't care if my ALTER TABLE takes 4 days instead of 4 hours.

 Something like:
 pg_set_max_cpu_or_io_usage(2/100);



On Wed, Jan 13, 2010 at 2:14 AM, Craig James craig_ja...@emolecules.comwrote:

 Eduardo Piombino wrote:

 Hi list, I'm having a problem when dealing with operations that ask too
 much CPU from the server.
 The scenario is this:


 A nice description below, but ... you give no information about your
 system: number of CPUs, disk types and configuration, how much memory, what
 have you changed in your Postgres configuration?  And what operating system,
 what version of Postgres, etc., etc.  The more information you give, the
 better the answer.

 If you're operating on a single disk with a tiny amount of memory, and an old,
 misconfigured Postgres on a laptop computer, that's a whole different
 problem than if you're on a big system with 16 CPUs and a huge RAID 1+0 with
 battery-backed cache.


Craig



 I have a multithreaded server, each thread with its own connection to the
 database. Everything is working fine, actually great, actually
 outstandingly, in normal operation.

 I've a table named a with 1.8 million records, and growing, but I'm ok
 with it, at least for the moment. Maybe in the near future we will cut it
 down, backup old data, and free it up. But this is not the issue, as I said,
 everything is working great. I have a cpl of indexes to help some queries,
 and that's it.

 Now my problem started when I tried to do some model refactoring on this
 production table.

 First I tried a dumb approach.
 I connected from pgadmin, opened a new session.
 I tried an ALTER TABLE on this table just to turn a char(255) field into
 char(250), and it locked up my system.

 No surprise, since I had many threads waiting for this alter table to
 finish. What I did not foresee was that this alter table would take up so
 much time. Ok, my fault, for not having calculated the time that it would
 take the ALTER TABLE to complete.

 Now, with this experience, I tried a simple workaround.
 Created an empty version of a named a_empty, identical in every sense.
 renamed a to a_full, and a_empty to a. This procedure cost me
 like 0 seconds of downtime, and everything kept working smoothly. Maybe a
 cpl of operations could have failed if they tried to write in the very
 second that there was actually no table named a, but since the operation
 was transactional, the worst scenario was that if the operation should have
 failed, the client application would just inform of the error and ask the
 user for a retry. No big deal.

 Now, this table, which is totally unattached to the system in every way (no
 one references this table, it's like a dumpster for old records), is not
 being accessed by any other thread in the system, so an ALTER table on it, to
 turn a char(255) to char(250), should have no effect on the system.

 So, with this in mind, I tried the ALTER TABLE this time on the a_full
 (totally unrelated) table.
 The system went non-responsive again, and this time it had nothing 

Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-12 Thread Craig Ringer

On 13/01/2010 1:47 PM, Eduardo Piombino wrote:

I'm sorry.

The server is a production server HP Proliant, I don't remember the
exact model, but the key features were:
4 cores, over 2GHz each (I'm sorry I don't remember the actual specs), I
think it had 16G of RAM (if that is possible?)
It has two 320G disks in RAID (mirrored).


Plain 'ol SATA disks in RAID-1?

Hardware RAID (and if so, controller model)? With battery backup? Write 
cache on or off?


Or software RAID? If so, Windows built-in sw raid, or some vendor's
fakeraid (Highpoint, Promise, Adaptec, etc)?


Anyway, with two disks in RAID-1 I'm not surprised you're seeing some 
performance issues with heavy writes, especially since it seems unlikely 
that you have a BBU hardware RAID controller. In RAID-1 a write must hit 
both disks, so a 1MB write effectively costs twice as much as a 1MB
read. Since many controllers try for high throughput (because it looks
good in benchmarks) at the expense of latency they also tend to try to 
batch writes into long blocks, which keeps the disks busy in extended 
bursts. That slaughters read latencies.


I had this sort of issue with a 3Ware 8500-8, and landed up modifying 
and recompiling the driver to reduce its built-in queue depth. I also 
increased readahead. It was still pretty awful as I was working with 
RAID 5 on SATA disks, but it made a big difference and more importantly 
meant that my Linux server was able to honour `ionice' priorities and 
feed more important requests to the controller first.


On Windows, I really don't know what to do about it beyond getting a
better I/O subsystem. Google may help - look into I/O priorities, queue 
depths, reducing read latencies, etc.



I don't even have the emails with the specs here, but I can give you the
exact configuration by tomorrow.

Operating system: Windows 2003 server, with latest patches.
Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone.


Urk. 8.2 ?

Pg on Windows improves a lot with each release, and that's an old buggy 
version of 8.2 at that. Looking into an upgrade would be a really, 
REALLY good idea.



Please try interpreting again my original mail, considering that when I
said high CPU usage It might very well be high IO usage.

The final effect was that the server went non-responsive, for all
matters, not even the TaskManager would come up when i hit CTRL-ALT-DEL,
and of course, every client would suffer horrific (+20 secs) for the
simplest operations like SELECT NOW();


That sounds a LOT like horrible read latencies caused by total I/O 
overload. It could also be running out of memory and swapping heavily, 
so do keep an eye out for that, but I wouldn't expect to see that with 
an ALTER TABLE - especially on a 16GB server.



My question then is: is there a way to limit the CPU or IO
USAGE assigned to a specific connection?


In win32 you can set CPU priorities manually in Task Manager, but only 
once you already know the process ID of the Pg backend that's going to 
be hammering the machine. Not helpful.


I don't know of any way to do per-process I/O priorities in Win32, but I 
only use win32 reluctantly and don't use it for anything I care about 
(like a production Pg server) so I'm far from a definitive source.


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-12 Thread Eduardo Piombino
Excellent, lots of useful information in your message.
I will follow your advice, and keep you posted on any progress. I have yet
to confirm some technical details of my setup with you, but I'm pretty sure
you hit the nail on the head in any case.

One last question, this IO issue I'm facing, do you think it is just a
matter of RAID configuration speed, or a matter of queue gluttony (and not
leaving time for other processes to get into the IO queue in a reasonable
time)?

Because if it was just a matter of speed, ok, with my actual RAID
configuration let's say it takes 10 minutes to process the ALTER TABLE
(leaving no room for other IOs until the ALTER TABLE is done); let's say then
I put the fastest possible RAID setup, or even remove RAID for the sake of
speed, and it completes in, let's say again, 10 seconds (an unreal
assumption). But if my table now grows 60 times, I would be facing the very
same problem again, even with the best RAID configuration.

The problem would seem to be in the way the OS (or hardware, or someone
else, or all of them) is/are inserting the IO requests into the queue.
What can I do to control the order in which these IO requests are finally
entered into the queue?
I mean .. what I would like to obtain is:

Considering the ALTER TABLE as a sequence of 100.000 READ/WRITE OPERATIONS
Considering the SELECT * FROM xxx as a sequence of 100 READ OPERATIONS
(totally unrelated on disk)

First I run the ALTER TABLE on a thread...
Let's say by the time it generates 1.000 READ/WRITE OPERATIONS, the other
thread starts with the SELECT * FROM xxx ...
I would expect the IO system to give those 100 READ OPERATIONS a chance
to execute immediately (with no need to wait for the remaining 99.000
READ/WRITE OPERATIONS to finish), that is, to enter the queue at *almost* the
very same moment the IO requests were issued.

If I cannot guarantee that, I'm kinda doomed, because the larger the
amount of IO operations requested by a heavy duty operation, the longer
it will take any other thread to start doing anything.

What cards do I have to manipulate the order in which the IO requests are
entered into the queue?
Can I disable this queue?
Should I turn the disks' IO operation caches off?
Should I avoid some specific disk/RAID vendor, for instance?

I think I have some serious reading to do on this matter; google will help,
of course, but as always, every piece of advice, however small it may seem,
will be very much appreciated.

Nonetheless, thanks a lot for all the light you already brought me on this
matter.
I really appreciate it.

Eduardo.



On Wed, Jan 13, 2010 at 3:02 AM, Craig Ringer
cr...@postnewspapers.com.auwrote:

 On 13/01/2010 1:47 PM, Eduardo Piombino wrote:

 I'm sorry.

 The server is a production server HP Proliant, I don't remember the
 exact model, but the key features were:
 4 cores, over 2GHz each (I'm sorry I don't remember the actual specs), I
 think it had 16G of RAM (if that is possible?)
 It has two 320G disks in RAID (mirrored).


 Plain 'ol SATA disks in RAID-1?

 Hardware RAID (and if so, controller model)? With battery backup? Write
 cache on or off?

 Or software RAID? If so, Windows built-in sw raid, or some vendor's
 fakeraid (Highpoint, Promise, Adaptec, etc)?

 Anyway, with two disks in RAID-1 I'm not surprised you're seeing some
 performance issues with heavy writes, especially since it seems unlikely
 that you have a BBU hardware RAID controller. In RAID-1 a write must hit
 both disks, so a 1Mb write effectively costs twice as much as a 1Mb read.
 Since many controllers try for high throughput (because it looks good in
 benchmarks) at the expense of latency they also tend to try to batch writes
 into long blocks, which keeps the disks busy in extended bursts. That
 slaughters read latencies.

 I had this sort of issue with a 3Ware 8500-8, and landed up modifying and
 recompiling the driver to reduce its built-in queue depth. I also increased
 readahead. It was still pretty awful as I was working with RAID 5 on SATA
 disks, but it made a big difference and more importantly meant that my Linux
 server was able to honour `ionice' priorities and feed more important
 requests to the controller first.

 On windows, I really don't know what to do about it beyond getting a better
 I/O subsystem. Google may help - look into I/O priorities, queue depths,
 reducing read latencies, etc.


  I don't even have the emails with the specs here, but I can give you the
 exact configuration by tomorrow.

 Operating system: Windows 2003 server, with latest patches.
 Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone.


 Urk. 8.2 ?

 Pg on Windows improves a lot with each release, and that's an old buggy
 version of 8.2 at that. Looking into an upgrade would be a really, REALLY
 good idea.


  Please try interpreting again my original mail, considering that when I
 said high CPU usage It might very well be high IO usage.

 The final effect was that the server went non-responsive, for all
 matters, not even 

Re: [PERFORM] a heavy duty operation on an unused table kills my server

2010-01-12 Thread Greg Smith

Eduardo Piombino wrote:

Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone.


Ugh...there are several features in PostgreSQL 8.3 and later 
specifically to address the sort of issue you're running into.  If you 
want to get good write performance out of this system, you may need to 
upgrade to at least that version.  It's impossible to resolve several of 
the common problems in write operations being too intense using any 8.2 
version. 

The final effect was that the server went non-responsive, for all 
matters, not even the TaskManager would come up when i hit 
CTRL-ALT-DEL, and of course, every client would suffer horrific (+20 
secs) for the simplest operations like SELECT NOW();


The thing that you have to realize is that altering a table is basically 
making a new copy of that table, which is a really heavy amount of 
writing.  It's quite easy for an I/O heavy operation like that to fill 
up a lot of RAM with data to be written out, and when the database 
periodically needs to force all that data out to disk the whole system 
grinds to a halt when it happens.  There's no way I'm aware of to 
throttle that writing down to a reasonable amount under Windows either, 
to achieve your goal of just making the ALTER run using less resources.
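
To get a rough sense of scale (illustrative numbers only): rewriting a
1.8-million-row table can easily mean a couple of GB of new heap data plus
the WAL that goes with it, and with the default checkpoint_segments = 3 a
checkpoint is forced roughly every 3 x 16MB = 48MB of WAL written, so a
rewrite of that size triggers dozens of back-to-back checkpoints, each one
forcing a pile of dirty buffers out to disk at once.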


Some reading:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server goes over 
basic tuning of the database server.  If you haven't already increased 
the checkpoint_segments parameters of your system, that's the first 
thing to try--increase it *a lot* (32 or more, default is 3) because it 
can really help with this problem.  A moderate increase to 
shared_buffers is in order too; since you're on Windows, increasing it 
to 256MB is a reasonable change.  The rest of the changes in there 
aren't likely to help out with this specific problem.
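
Concretely, that advice translates into something like the following in
postgresql.conf (illustrative starting values, not a tuned recommendation):

checkpoint_segments = 32    # default is 3; spaces checkpoints much further apart
shared_buffers = 256MB      # moderate increase that behaves well on Windows

On 8.2, checkpoint_segments takes effect on a configuration reload, while
shared_buffers requires a server restart.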


http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm : 
covers the most likely cause of the issue you're running into.  
Unfortunately, most of the solutions you'll see there are things changed 
in 8.3.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance