Re: [HACKERS] Crash safe visibility map vs hint bits

[email protected] Sat, 04 Dec 2010 00:21:43 -0800

On 4 Dec 2010, at 08:48, Heikki Linnakangas 
<[email protected]> wrote:

> On 04.12.2010 09:14, [email protected] wrote:
>> There has been a lot of discussion about index-only scans and how to make the 
>> visibility map crash safe, followed by a good discussion about hint 
>> bits.
>> 
>> The main concern seems to be the added WAL volume, which makes me 
>> wonder if there is a middle way that looks more like hint bits.
>> 
>> How about lazily WAL-logging the complete visibility map, say every X minutes or 
>> every N tuple updates, and making WAL recovery recheck the 
>> visibility of pages touched by the WAL stream?
> 
> If you WAL-log the visibility map changes after-the-fact, it doesn't solve 
> the race condition we're struggling with: the visibility map change might hit 
> the disk before the PD_ALL_VISIBLE change to the heap page. If you crash, you can 
> end up with a situation where the PD_ALL_VISIBLE flag on the heap page is not 
> set, but the bit in the visibility map is. Which causes serious issues later 
> on.

My imagination is probably not as good, but suppose that at time A you WAL-log the 
complete map, and at A+1 you update a tuple so the visibility bit is cleared, but 
the map bit change does not happen due to a crash. Then at WAL replay time you 
restore the map from time A, and if the tuple change at A+1 is represented in 
the WAL stream, then you also update the visibility map. Is this the situation 
where the heap tuple hit disk but the map is left in a broken state? Or is it 
a different, similar-looking situation?

The tuple change in the WAL stream will require the system to reinspect the 
page anyway, so there shouldn't be any additional disk I/O on replay due to this.
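
To make the idea concrete, here is a toy model of the replay I have in mind, 
written in Python. It is only a sketch: WalRecord, the record kinds 
"vm_snapshot" and "heap_update", and replay() are names made up for this 
example and do not correspond to anything in the PostgreSQL source. It models 
only the case described above, a full map image logged at time A followed by 
heap changes in the WAL stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical WAL record, for illustration only."""
    lsn: int
    kind: str                          # "vm_snapshot" or "heap_update"
    page: int = -1                     # heap page touched (heap_update only)
    vm_image: frozenset = frozenset()  # all-visible pages (vm_snapshot only)


def replay(wal):
    """Return the set of pages marked all-visible after replaying `wal`."""
    vm = set()
    for rec in sorted(wal, key=lambda r: r.lsn):
        if rec.kind == "vm_snapshot":
            # Restore the complete map as it was logged at time A.
            vm = set(rec.vm_image)
        elif rec.kind == "heap_update":
            # Any heap change seen in the WAL stream clears the map bit
            # for that page, whether or not the on-disk map was updated.
            vm.discard(rec.page)
    return vm


# Time A: the full map is logged. Time A+1: a tuple on page 2 is updated.
wal = [
    WalRecord(lsn=100, kind="vm_snapshot", vm_image=frozenset({1, 2, 3})),
    WalRecord(lsn=101, kind="heap_update", page=2),
]
recovered = replay(wal)
```

In this model, page 2 is no longer marked all-visible after replay, because 
the recovered map is built only from the logged image and the WAL stream, not 
from whatever the on-disk map contained at crash time.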

Jesper

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
