On Thu, Jan 17, 2013 at 2:57 PM, Abhijit Menon-Sen <a...@2ndquadrant.com>wrote:
> > > There was considerable discussion after this (accessible through the > archives link above), which I won't attempt to summarise. > > I thought Robert made those comments after considerable discussions on Jeff's approach. So he probably still stands by his objections or at least not satisfied/seen the numbers. Now that I look at the patch, I wonder if there is another fundamental issue with the patch. Since the patch removes WAL logging for the VM set operation, this can happen: 1. Vacuum kicks in and clears all dead tuples in a page and decides that its all-visible 2. Vacuum WAL-logs the cleanup activity and marks the page dirty 3. Vacuum sets the visibility bit and marks the VM page dirty 4. Say the VM page gets written to the disk. The heap page is not yet written neither the WAL log corresponding to the cleanup operation 5. CRASH After recovery, the VM bit will remain set because the VM page got written before the crash. But since heap page's cleanup WAL did not made to the disk, those operations won't be replayed. The heap page will be left with not-all-visible tuples in that case and its not a good state to be in. The original code does not have this problem because the VM set WAL gets written after the heap page cleanup WAL. So its guaranteed that the VM bit will be set during recovery only if the cleanup WAL is replayed too (there is more magic than what meets the eye and I think its not fully documented). Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee