On Fri, 4 Feb 2011, Chris Browne wrote:
2. The query needs to NOT be I/O-bound. If it's I/O bound, then your
system is waiting for the data to come off disk, rather than to do
processing of that data.
yes and no on this one.
it is very possible to have a situation where the process generating the
I/O is waiting for the data to come off disk, but if there are still idle
resources in the disk subsystem.
it may be that the best way to address this is to have the process
generating the I/O send off more requests, but that sometimes is
significantly more complicated than splitting the work between two
processes and letting them each generate I/O requests
with rotating disks, ideally you want to have at least two requests
outstanding, one that the disk is working on now, and one for it to start
on as soon as it finishes the one that it's on (so that the disk doesn't
sit idle while the process decides what the next read should be). In
practice you tend to want to have even more outstanding from the
application so that they can be optimized (combined, reordered, etc) by
the lower layers.
if you end up with a largish raid array (say 16 disks), this can translate
into a lot of outstanding requests that you want to have active to fully
untilize the array, but having the same number of requests outstanding
with a single disk would be counterproductive as the disk would not be
able to see all the outstanding requests and therefor would not be able to
optimize them as effectivly.
David Lang
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance