> I've been having a look at this and I'm wondering about a certain scenario:
> In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block, and on the next iteration of the loop we have the same block again, have you benchmarked any caching code to store if tbm_page_is_lossy() returned true for that block on the previous iteration of the loop? This would save from having to call tbm_page_is_lossy() again for the same block. Or are you just expecting that tbm_page_is_lossy() returns true so rarely that you'll end up caching the page most of the time, and gain on skipping both hash lookups on the next loop, since page will be set in this case?
I believe that once we fall into lossy pages, the tidbitmap will not have a significant impact on performance, because postgres will spend a lot of time rechecking tuples on the page. If work_mem is too small to keep an exact tidbitmap, postgres will slow down significantly. I implemented it (v2.1 in the attachments), but I don't think it is an improvement, or at least not a significant one.
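To make the trade-off concrete, here is a minimal, self-contained sketch of the idea under discussion: remember the last block seen in the loop, together with whether it was lossy, so that consecutive TIDs on the same block skip both lookups. It is not the code from either attached patch. All `Demo*`/`demo_*` names are hypothetical, and the "hash lookups" are replaced by counters so the saving can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical stand-in for a TID: block number plus offset on the page. */
typedef struct
{
    BlockNumber blk;
    uint16_t    off;
} DemoTid;

/* Hypothetical toy bitmap: it counts probes instead of hashing anything. */
typedef struct
{
    BlockNumber lossy_blk;     /* the one block treated as lossy in this demo */
    long        lossy_probes;  /* stands in for calls to tbm_page_is_lossy() */
    long        page_probes;   /* stands in for the per-page entry lookup */
    long        tuples_set;    /* bits that would be set in exact pages */
} DemoBitmap;

static bool
demo_page_is_lossy(DemoBitmap *tbm, BlockNumber blk)
{
    tbm->lossy_probes++;
    return blk == tbm->lossy_blk;
}

static void
demo_get_page(DemoBitmap *tbm, BlockNumber blk)
{
    (void) blk;
    tbm->page_probes++;
}

/*
 * Loop shaped like tbm_add_tuples. With use_cache set, the lookups run only
 * when the block changes; otherwise they run for every TID.
 */
static void
demo_add_tuples(DemoBitmap *tbm, const DemoTid *tids, size_t n, bool use_cache)
{
    BlockNumber cur_blk = InvalidBlockNumber;
    bool        cur_lossy = false;

    assert(tids != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        BlockNumber blk = tids[i].blk;

        if (!use_cache || blk != cur_blk)
        {
            cur_lossy = demo_page_is_lossy(tbm, blk);
            if (!cur_lossy)
                demo_get_page(tbm, blk);
            cur_blk = blk;
        }

        if (cur_lossy)
            continue;           /* whole page is already marked */

        tbm->tuples_set++;      /* would set the bit for tids[i].off */
    }
}
```

With six TIDs on blocks 1, 1, 1, 7, 7, 2 and block 7 lossy, the uncached loop does 6 lossy probes and 4 page probes, while the cached loop does 3 and 2. The saving is real in terms of lookups, but it says nothing about whether it is visible next to the cost of rechecking tuples on lossy pages.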
> It would be nice to see a comment to explain why it might be a good idea to cache the page lookup. Perhaps something like:
Added, see attachment (v2).
> I also wondered if there might be a small slowdown in the case where the index only finds 1 matching tuple. So I tried the following:
> avg.  2372.4456  2381.909   99.6%
> med.  2371.224   2359.494  100.5%
> It appears that if it does, then it's not very much.
I believe that's unmeasurable, because the standard deviation of your tests is about 2%, which is greater than the difference between the patched and master versions.
-- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
tbm_cachepage-2.patch.gz
Description: GNU Zip compressed data
tbm_cachepage-2.1.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers