On Apr 22, 2008, at 5:13 AM, PFC wrote:
> In order to have it use the fast plan I must set random_page_cost to 1, which I absolutely don't want to do.
> Setting effective_cache_size to huge values has no effect.
> If I select a value of parent_id that has far fewer children, the index will be used, but in this case I think the threshold is misplaced; it should be slightly higher. Here about 5% of the rows are selected, and the hash join only becomes better at around 15% because the table is cached.
> This is 8.3.
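
For reference, the plan flip can be reproduced per session rather than by changing the server-wide default; the table and column names here are made up for illustration:

  EXPLAIN ANALYZE SELECT * FROM child_table WHERE parent_id = 12345;

  SET random_page_cost = 1;   -- session-local override, affects only this backend
  EXPLAIN ANALYZE SELECT * FROM child_table WHERE parent_id = 12345;
  RESET random_page_cost;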

> Perhaps there should be a per-object setting (object = table, index, or partition) to alter the aggressiveness/laziness of page flushing and how long that object's pages are kept in shared_buffers. This could be used to modify random_page_cost on a per-table/index/partition basis.

> Example: let's imagine a "cache priority" setting.

- "cache priority" set to the minimum means this table is mostly write-only - "cache priority" set to default would give current behaviour (which is correct in most cases) - "cache priority" set to a high value would tell Postgres "I know this table/index/partition is small and often accessed rather randomly, so I want you to keep it in shared_buffers, purge it if you must but otherwise keep it in memory, flush something else instead which has lower cache_priority".

> The optimizer could then use a different (much lower) value of random_page_cost for tables whose "cache priority" is set highest, since it would know those objects are likely to stay in memory.
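
Purely as a mock-up of what such a knob might look like (none of this exists; the reloptions-style syntax is only borrowed here to make the idea concrete):

  ALTER TABLE small_hot_table SET (cache_priority = high);  -- hypothetical option
  ALTER TABLE bulk_audit_log  SET (cache_priority = min);   -- hypothetical: mostly write-only
  ALTER INDEX small_hot_idx   SET (cache_priority = high);  -- hypothetical option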

"cache priority" to me sounds like we're trying to influence caching behavior, which isn't what's happening. I do agree that we need a better way to tell the planner what tables are in memory.

> An alternative would be for the background writer to keep some stats and do this for us:
>
> - begin bgwriter scan
> - set up a hashtable of [relid => page count]
> - for each page scanned, increment the "page count" for its relation (uses very little CPU)
> - end bgwriter scan
> - for each relation, compare the number of pages found in shared_buffers with the number of pages in the relation, and draw conclusions about how well cached the relation is
> - update random_page_cost accordingly for that relation
>
> This would not examine whatever is in the OS cache, though.
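
Something close to that per-relation page count can already be computed from SQL with the contrib/pg_buffercache module (one row per shared buffer); a rough sketch, ignoring database/tablespace matching:

  SELECT c.relname,
         count(*)    AS buffers_in_shared_buffers,
         c.relpages  AS pages_at_last_vacuum_analyze
  FROM pg_buffercache b
  JOIN pg_class c ON c.relfilenode = b.relfilenode
  GROUP BY c.relname, c.relpages
  ORDER BY count(*) DESC;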

Actually, there's no need for bgwriter to do that; we can just look at the hit rate for the object. But we'd also need stats on how often we find a relation's pages in the OS cache, and no one has come up with a good method for gathering those.
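
The per-table shared_buffers hit rate is already available from the stats views; something like the query below gives a rough ratio (it says nothing about the OS cache, which is the part we can't measure):

  SELECT relname,
         heap_blks_hit,
         heap_blks_read,
         round(heap_blks_hit::numeric
               / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS shared_buffers_hit_ratio
  FROM pg_statio_user_tables
  ORDER BY heap_blks_hit + heap_blks_read DESC;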
--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

