On 3/2/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> > - Another problem with the current HOT patch is that it generates
> > tuple level fragmentation while reusing LP_DELETEd items when
> > the new tuple is of smaller size than the original one. Heikki
> > supported using best-fit strategy to reduce the fragmentation
> > and thats worth trying. But ISTM that we can also correct
> > row-level defragmentation whenever we run out of free space
> > and LP_DELETEd tuples while doing UPDATE. Since this does not
> > require moving tuples around, we can do this by a simple EXCLUSIVE
> > lock on the page.
>
> You are mistaken. To move existing tuples requires LockBufferForCleanup,
> the same as VACUUM needs; otherwise some other backend might continue
> to access a tuple it found previously.
I am not suggesting moving tuples around. This is a specific case of reusing LP_DELETEd tuples. For example, say the HOT-update chain had two tuples: the first is of length 100 and the next is of length 125. When the first becomes dead, we remove it from the chain and set its LP_DELETE flag. Now say this slot is reused to store a tuple of length 80; this results in tuple-level fragmentation of 20 bytes. The information about the original size of the tuple is lost. Later, when this tuple is also LP_DELETEd, we cannot use it to store a tuple of size greater than 80, even though there are another 20 bytes of unused free space. What I am suggesting is to clean up this fragmentation (only for LP_DELETEd tuples) by resetting the lp_len of these tuples to the max possible value. None of the live tuples are touched.

Btw, I haven't yet implemented this stuff, so I am seeking opinions.

> How much testing of this patch's concurrent behavior has been done?
> I'm wondering if any other locking thinkos are in there ...
I have tested it with pgbench using a maximum of 90 clients and a scaling factor of 90, with 50000 txns/client (please see my other post with preliminary results). I have done this quite a few times. I am not saying there are no bugs, but I have good confidence in the patch. These tests were run on SMP machines. I also run data consistency checks at the end of the pgbench runs to validate the UPDATEs. I also ran 4-hour DBT2 tests 3-4 times and have not seen any failures.

I would appreciate any independent tests, maybe in different setups.

Thanks,
Pavan

--
EnterpriseDB http://www.enterprisedb.com
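
For anyone who wants the lp_len idea from earlier in this mail in concrete form, here is a rough toy model in C. It is only a sketch and not code from the HOT patch: ToyItemId, the TOY_LP_* flags and toy_reclaim_fragments() are made-up names, the struct is a simplified stand-in for a real line pointer, alignment padding is ignored, and it assumes every item in the array owns storage. It also says nothing about which buffer lock is sufficient, which is the open question in this thread.

```c
#include <assert.h>

/* Hypothetical flag values for this toy model only. */
#define TOY_LP_USED   0x01
#define TOY_LP_DELETE 0x02

/* Simplified stand-in for a line pointer: where the tuple storage
 * starts, how long the pointer claims it is, and its state flags. */
typedef struct ToyItemId
{
    unsigned off;
    unsigned len;
    unsigned flags;
} ToyItemId;

/*
 * For every LP_DELETEd item, widen its recorded length so that it
 * reaches the start of the next tuple storage on the page, or the end
 * of the tuple area (area_end) if nothing follows it. Live items are
 * never modified and no tuple data is moved.
 */
static void
toy_reclaim_fragments(ToyItemId *items, int nitems, unsigned area_end)
{
    for (int i = 0; i < nitems; i++)
    {
        if (!(items[i].flags & TOY_LP_DELETE))
            continue;

        /* Find the closest storage that starts after this item. */
        unsigned next = area_end;

        for (int j = 0; j < nitems; j++)
        {
            if (j != i && items[j].off > items[i].off && items[j].off < next)
                next = items[j].off;
        }

        assert(next > items[i].off);
        items[i].len = next - items[i].off;
    }
}
```

With the numbers from the example: a slot that originally held 100 bytes, was reused for an 80-byte tuple and is now LP_DELETEd again, sitting directly in front of the live 125-byte tuple, would get its recorded length set back to 100, while the live tuple stays at 125.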