Prefetch CPU cost should be rather low in the grand scheme of things, and it 
does help performance even for very fast I/O.  I would not expect a large CPU 
use increase from that sort of patch - there is a lot that is more expensive 
to do on a per-block basis.

There are two ways to look at non-I/O bound performance:
* Aggregate performance across many concurrent activities - here you want the 
least CPU used per action, and the fewest collisions on locks or shared data 
structures.  Holding resources for as short an interval as possible also helps 
a lot here.
* Single-query performance, where you want to shorten the query time, perhaps 
at the cost of more average CPU.  Here, something like the fadvise stuff helps 
- as would any thread parallelism.  Perhaps less efficient in aggregate, but 
more efficient for a single query.

The real question is the overall CPU cost of accessing and reading data.  If 
this comes from disk, the big gains will be along the whole chain:  driver to 
file system cache, file system cache to process, process-specific tasks (cache 
eviction, placement, tracking), examining page tuples, locating tuples within 
pages, etc.   Anything that currently occurs on a per-block basis but could be 
done for a larger batch or set of blocks may be a big gain.  Another place 
that commonly consumes CPU in larger software projects is memory allocation, 
if more advanced allocation techniques are not used.  I have no idea what 
Postgres uses here, however.  I do know that commercial databases have done 
extensive work in this area for performance, as well as for reliability 
(harder to cause a leak, easier to detect one) and ease of use (no need to 
even bother to free in certain contexts).

> On 12/9/08 2:58 PM, "Robert Haas" <[EMAIL PROTECTED]> wrote:

> I don't believe the thesis.  The gap between disk speeds and memory
> speeds may narrow over time, but I doubt it's likely to disappear
> altogether any time soon, and certainly not for all users.

Well, when select count(1) reads pages slower than my disk can deliver them, 
it is 16x or more slower than my RAM.  Until one can demonstrate that the 
system can even read pages in RAM faster than what disks will do next year, it 
doesn't matter much that RAM is faster.   RAM speed does matter for sorts, 
hashes, and other operations, but at the current time it does not for the raw 
pages themselves, from what I can measure.

This is, in fact, central to my point.  Things will be CPU bound, not I/O 
bound.  It has been mentioned that we still have to access things over the 
bus, and that memory is faster, etc.  But Postgres is too CPU bound on page 
access to take advantage of the fact that memory is faster (for reading data 
pages).

The biggest change is not just that disks are getting closer to RAM, but that 
the random I/O penalty is diminishing significantly.  Low latency makes 
seek-driven queries that used to consume mostly disk time consume CPU time 
instead.  High CPU cost for accessing pages makes a fast disk surprisingly 
close to RAM speed.

> Besides which, I believe the CPU overhead of that patch is pretty darn
> small when the feature is not enabled.

> ...Robert

I doubt it is much CPU, on or off.  It will help with SSDs when optimizing a 
single query; it may not help much if a system has enough 'natural' 
parallelism from other concurrent queries.  However, there is a clear CPU 
benefit to getting individual queries out of the way faster and occupying 
precious work_mem or other resources for a shorter time.  Occupying resources 
for a shorter period always translates to some CPU savings on a machine 
running at its limit with high concurrency.
