Hi all, In the current lazy vacuum implementation, some index AMs such as btree indexes call lazy_tid_reaped() for each index tuples during ambulkdelete to check if the index tuple points to the (collected) garbage tuple. In that function, we simply call bsearch(3) but we should be able to know the result without bsearch(3) if the index tuple points to the heap tuple that is out of range of the collected garbage tuples. I've checked whether bsearch(3) does the boundary value check or not but looking at the bsearch(3) implementation of FreeBSD libc[1], there is not. The same is true for glibc.
So my proposal is to add boundary value check in lazy_tid_reaped() before executing bsearch(3). This will help when index vacuum happens multiple times or when garbage tuples are concentrated to a narrow range. I’ve done three performance tests. maintenance_work_mem is set to 64MB with which we can collect about 11 million tuples at maximum and the table size is 10GB. The table has one btree index. Here is the average of index vacuum (ambulkdelete) execution time of three trials: * Test case 1 I made all pages dirty with 15 million garbage tuples to invoke index vacuum twice. HEAD: 164.7 s Patched: 138.81 s * Test case 2 I made all pages dirty with 10 million garbage tuples to invoke index vacuum only once, checking the performance degradation. HEAD: 127.8 s Patched: 128.25 s * Test case 3 I made a certain range of table dirty with 1 million garbage tuples. HEAD: 31.07 s Patched: 9.41 s I thought that we can have a generic function wrapping bsearch(3) that does boundary value checks and then does bsearch(3) so that we can use it in other similar places as well. But the attached patch doesn't do that as I'd like to hear opinions on the proposal first. Feedback is very welcome. Regards, [1] https://svnweb.freebsd.org/base/head/lib/libc/stdlib/bsearch.c?revision=326025&view=markup -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
vacuum_boundary_check.patch
Description: Binary data