Gregory Stark wrote:
"Gregory Stark" <[EMAIL PROTECTED]> writes
The two interfaces I'm aware of for this are posix_fadvise() and libaio.
I've run tests with a synthetic benchmark which generates a large file then
reads a random selection of blocks from within it using either synchronous
reads like we do now or either of those interfaces. I saw impressive speed
gains on a machine with only three drives in a raid array. I did this a
while ago so I don't have the results handy. I'll rerun the tests again and
post them.
Here's the results of running the synthetic test program on a 3-drive raid
array. Note that the results *exceeded* the 3x speedup I expected, even for
ordered blocks. Either the drive (or the OS) is capable of reordering the
block requests better than the offset into the file would appear or some other
effect is kicking in.
The test is with an 8GB file, picking 8,192 random 8k blocks from within it.
The pink diamonds represent the bandwidth obtained if the random blocks are
sorted before fetching (like a bitmap indexscan) and the blue if they're
unsorted.
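
For anyone who wants to try a test of this shape, here is a minimal sketch in Python. It is not the test program used for the numbers above; the helper names (pick_blocks, read_blocks) and the fixed seed are assumptions made for illustration. It picks random 8k blocks, optionally sorts them, and reads them either synchronously or after hinting each one with posix_fadvise(POSIX_FADV_WILLNEED).

```python
import os
import random

BLOCK_SIZE = 8192  # 8k blocks, matching the test described above


def pick_blocks(file_size, count, sort_blocks, seed=0):
    """Choose `count` distinct random block numbers, optionally sorted.

    Hypothetical helper: sorting stands in for the ordered access of a
    bitmap index scan, unsorted for plain random access.
    """
    total_blocks = file_size // BLOCK_SIZE
    rng = random.Random(seed)
    blocks = rng.sample(range(total_blocks), count)
    return sorted(blocks) if sort_blocks else blocks


def read_blocks(fd, blocks, prefetch):
    """Read every block with pread() and return the number of bytes read.

    With prefetch=True, all blocks are first hinted to the kernel so it
    can start fetching them before the synchronous reads arrive.
    """
    if prefetch and hasattr(os, "posix_fadvise"):
        for b in blocks:
            os.posix_fadvise(fd, b * BLOCK_SIZE, BLOCK_SIZE,
                             os.POSIX_FADV_WILLNEED)
    total = 0
    for b in blocks:
        total += len(os.pread(fd, BLOCK_SIZE, b * BLOCK_SIZE))
    return total
```

Timing read_blocks() with and without prefetch, on a file much larger than RAM, would give a rough version of the sorted/unsorted comparison. Results on a small or already cached file will say little about the drives.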
I didn't see anything exceeding 3X in the graph. But I certainly see 2X+
for most of the graph, and ~3X for very small reads. Possibly it is
avoiding unnecessary read-ahead at the drive or OS level?
I think we expected to see raw reads significantly faster for the single
process case. I thought your simulation was going to involve a tweak to
PostgreSQL on a real query to see what overall effect it would have on
typical queries and on special queries like Matthew's. Are you able to
tweak the index scan and bitmap scan methods to do posix_fadvise()
before running? Even if it doesn't do anything more intelligent such as
you described in another post?
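
To make the question concrete, here is a rough sketch of the simplest form of "advise before reading", written in Python rather than against the PostgreSQL source. Everything in it (scan_with_prefetch, the distance parameter, the 8k block size) is a hypothetical illustration, not an existing PostgreSQL interface. The scan already knows the list of blocks it will visit, so it keeps the posix_fadvise() hints a few blocks ahead of the block it is currently reading.

```python
import os

BLOCK_SIZE = 8192  # assumed block size for this sketch


def scan_with_prefetch(fd, blocks, distance=4):
    """Yield (block_number, data) for each block, in the order given.

    Before reading blocks[i], every block up to blocks[i + distance]
    has been hinted with POSIX_FADV_WILLNEED, so the kernel can overlap
    those fetches with the current read. `distance` is a made-up
    tuning knob, not a PostgreSQL setting.
    """
    can_advise = hasattr(os, "posix_fadvise")
    hinted = 0  # index of the next block that still needs a hint
    for i, b in enumerate(blocks):
        while can_advise and hinted < len(blocks) and hinted <= i + distance:
            os.posix_fadvise(fd, blocks[hinted] * BLOCK_SIZE, BLOCK_SIZE,
                             os.POSIX_FADV_WILLNEED)
            hinted += 1
        yield b, os.pread(fd, BLOCK_SIZE, b * BLOCK_SIZE)
```

A fixed look-ahead like this is the unintelligent version; choosing the distance based on the number of spindles or the query would be the smarter follow-up.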
Cheers,
mark
--
Mark Mielke <[EMAIL PROTECTED]>