On Fri, Jan 7, 2011 at 1:28 PM, Jim Nasby wrote:
> On Jan 5, 2011, at 8:10 PM, Robert Haas wrote:
>> On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote:
>>> Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
>>> serve?
>>
>> If we modify a page on which PD_ALL_VISIBLE isn
On Jan 5, 2011, at 8:10 PM, Robert Haas wrote:
> On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote:
>> Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
>> serve?
>
> If we modify a page on which PD_ALL_VISIBLE isn't set, we don't
> attempt to update the visibility map.
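A minimal sketch of the behavior described in the reply above, not the actual heapam code. HeapPage, VisibilityMap and modify_page are hypothetical names; the sketch assumes that when the flag is set, a modification clears both the page flag and the map bit.

```python
from dataclasses import dataclass, field


@dataclass
class HeapPage:
    # Stands in for the PD_ALL_VISIBLE bit in the page header.
    all_visible: bool = False


@dataclass
class VisibilityMap:
    bits: dict = field(default_factory=dict)  # block number -> bool
    touches: int = 0                          # how often the map was accessed

    def clear(self, blkno):
        self.touches += 1
        self.bits[blkno] = False


def modify_page(page, blkno, vm):
    """Modify a heap page; touch the visibility map only if the flag was set."""
    if page.all_visible:
        page.all_visible = False
        vm.clear(blkno)
    # The tuple change itself would happen here.


vm = VisibilityMap(bits={0: True})
flagged, unflagged = HeapPage(all_visible=True), HeapPage()
modify_page(unflagged, 1, vm)   # flag not set: the map is never consulted
modify_page(flagged, 0, vm)     # flag set: clear the flag and the map bit
print(vm.touches)
```

The point of the page-level flag in this toy model is that the common case (flag not set) costs one in-page test and no access to the map.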
On 2011-01-06 03:10, Robert Haas wrote:
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote:
Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
serve?
If we modify a page on which PD_ALL_VISIBLE isn't set, we don't
attempt to update the visibility map. In theory, this
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote:
> Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
> serve?
If we modify a page on which PD_ALL_VISIBLE isn't set, we don't
attempt to update the visibility map. In theory, this is an important
optimization to reduce
On 2010-11-30 05:57, Robert Haas wrote:
Last week, I posted a couple of possible designs for making the
visibility map crash-safe, which did not elicit much comment. Since
this is an important prerequisite to index-only scans, I'm trying
again.
The logic seems to be:
* If the visibility map
* Robert Haas:
> Those hint bit tests are a single machine instruction. It's tough
> to beat that. It's tough to get within two orders of magnitude.
> I'd like to, but I don't see how.
For some scans, it might be possible to hoist the checks out of inner
loops. (At least in principle, I'm not
On Thu, 2010-12-02 at 19:06 -0500, Robert Haas wrote:
> I don't think that you can seriously suggest that emitting that volume
> of FPIs isn't going to be a problem immediately. We have to have some
> solution to that problem out of the gate.
Fair enough. I think you understand my point, and it's
On Thu, Dec 2, 2010 at 6:37 PM, Jeff Davis wrote:
>> It seems to me that a COPY command executed in a transaction with no
>> other open snapshots writing to a table created or truncated within
>> the same transaction should be able to write frozen tuples from the
>> get-go, regardless of anything
On Thu, 2010-12-02 at 17:00 -0500, Robert Haas wrote:
> I'm not really convinced that this problem is confined to bulk
> loading. Every INSERT or UPDATE results in a new tuple that may need
> hint bits set and eventually to be frozen. A bulk load is just a time
> when you do lots of inserts all at
On Thu, Dec 2, 2010 at 2:01 PM, Jeff Davis wrote:
> * We don't get an exclusive lock when dirtying a page with hint bits
> - Why: we write while reading, and we want good concurrency.
> - Why': because after a bulk load, we don't have any hint bits, and the
> only way to get them set without VACUU
Jeff Davis wrote:
> And, if we had a bulk loading path, we could probably get away
> with writing the data only twice (today, we write it 3 times
> including the hint bits) or maybe once if WAL archiving is off.
If you're counting WAL writes, you're low. If you don't go out of
your way to avo
On Wed, 2010-12-01 at 23:22 -0500, Robert Haas wrote:
> Well, let's think about what we'd need to do to make CRCs work
> reliably. There are two problems.
>
> 1. [...] If we CRC the entire page, the torn pages are never
> acceptable, so every action that modifies the page must be WAL-logged.
>
On Thu, Dec 2, 2010 at 6:37 AM, Dimitri Fontaine wrote:
> Robert Haas writes:
>> Or maybe I do. One other thing I've been thinking about with regard
>> to hint bit updates is that we might choose to mark pages that are
>> hint-bit-updated as "untidy" rather than "dirty". The background
>
> Please rev
Robert Haas writes:
> Or maybe I do. One other thing I've been thinking about with regard
> to hint bit updates is that we might choose to mark pages that are
> hint-bit-updated as "untidy" rather than "dirty". The background
Please review archives, you'll find the idea discussed and some patches
to
On Wed, Dec 1, 2010 at 6:41 PM, Jim Nasby wrote:
> On Dec 1, 2010, at 2:59 PM, Robert Haas wrote:
>> 2. Hint bits are necessary because an old XID can't be viewed as
>> guaranteed committed.
>
> Hmm... I thought hint bits were necessary because it's too expensive to query
> CLOG for every tuple.
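The cost argument in the reply above (a CLOG lookup per tuple versus a cached bit) can be sketched as follows. This is an illustration only: Clog, xmin_committed and the HINT_XMIN_COMMITTED value are hypothetical stand-ins, and the sketch ignores aborted and in-progress transactions.

```python
HINT_XMIN_COMMITTED = 0x1  # hypothetical bit value for this sketch


class Clog:
    """Toy commit log that counts how often it is consulted."""

    def __init__(self, committed):
        self.committed = set(committed)
        self.lookups = 0

    def did_commit(self, xid):
        self.lookups += 1
        return xid in self.committed


def xmin_committed(tup, clog):
    """Return True if the inserting transaction committed, caching the answer."""
    if tup["infomask"] & HINT_XMIN_COMMITTED:
        return True                       # cheap path: one bit test
    if clog.did_commit(tup["xmin"]):
        tup["infomask"] |= HINT_XMIN_COMMITTED
        return True
    return False


clog = Clog({100})
tup = {"xmin": 100, "infomask": 0}
xmin_committed(tup, clog)   # first check goes to the commit log
xmin_committed(tup, clog)   # second check is answered by the hint bit
print(clog.lookups)
```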
On Wed, Dec 1, 2010 at 5:24 PM, Jeff Davis wrote:
> On Wed, 2010-12-01 at 15:59 -0500, Robert Haas wrote:
>> As for CRCs, there's a pretty direct chain of inference here:
>>
>> 1. CRCs are hard (really impossible) because we have hint bits.
>
> I would disagree with "impossible". If we don't set h
On Dec 1, 2010, at 2:59 PM, Robert Haas wrote:
> 2. Hint bits are necessary because an old XID can't be viewed as
> guaranteed committed.
Hmm... I thought hint bits were necessary because it's too expensive to query
CLOG for every tuple. If my understanding is correct then if we fix the CLOG
per
On Wed, 2010-12-01 at 15:59 -0500, Robert Haas wrote:
> As for CRCs, there's a pretty direct chain of inference here:
>
> 1. CRCs are hard (really impossible) because we have hint bits.
I would disagree with "impossible". If we don't set hint bits during
reading; and when we do set them, we log t
Robert Haas writes:
> If we switched from per-tuple MVCC based on XIDs to per-page MVCC
> based on LSNs and a rollback segment, all of this stuff would go out
> the window. Hint bits, gone. Anti-wraparound VACUUM, gone. CRCs,
> feasible. Visibility map... we might still need that, but the
> p
On Wed, Dec 1, 2010 at 3:31 PM, Jeff Davis wrote:
> On Wed, 2010-12-01 at 11:25 -0500, Robert Haas wrote:
>> 1. Every time we observe a page as all-visible, (a) set the
>> PD_ALL_VISIBLE bit on the page, without bumping the LSN;
>
> ...
>
>> 2. Every time we observe a page as all-visible, (a) set
On Wed, 2010-12-01 at 11:25 -0500, Robert Haas wrote:
> 1. Every time we observe a page as all-visible, (a) set the
> PD_ALL_VISIBLE bit on the page, without bumping the LSN;
...
> 2. Every time we observe a page as all-visible, (a) set the
> PD_ALL_VISIBLE bit on the page, without bumping the LS
On Wed, Dec 1, 2010 at 12:22 PM, Tom Lane wrote:
> Robert Haas writes:
>> I think we can improve this a bit further by also introducing a
>> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
>> FrozenXID. This allows us to freeze tuples aggressively - if we want
>> - without losi
On Wed, Dec 1, 2010 at 11:40 AM, Heikki Linnakangas
wrote:
> On 01.12.2010 18:25, Robert Haas wrote:
>>
>> I think we can improve this a bit further by also introducing a
>> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
>> FrozenXID. This allows us to freeze tuples aggressivel
Robert Haas writes:
> I think we can improve this a bit further by also introducing a
> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
> FrozenXID. This allows us to freeze tuples aggressively - if we want
> - without losing any forensic information.
So far so good ...
> We c
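A rough sketch of the HEAP_XMIN_FROZEN proposal quoted above, contrasted with overwriting xmin. The HeapTuple class and the bit value are hypothetical; only FROZEN_XID = 2 mirrors PostgreSQL's FrozenTransactionId.

```python
from dataclasses import dataclass

HEAP_XMIN_FROZEN = 0x1  # hypothetical bit value for this sketch
FROZEN_XID = 2          # PostgreSQL's FrozenTransactionId


@dataclass
class HeapTuple:
    xmin: int
    infomask: int = 0


def freeze_by_overwrite(tup):
    """Freeze by replacing xmin; the inserting XID is lost."""
    tup.xmin = FROZEN_XID


def freeze_by_flag(tup):
    """Proposed alternative: set a bit and leave xmin in place."""
    tup.infomask |= HEAP_XMIN_FROZEN


def is_frozen(tup):
    return tup.xmin == FROZEN_XID or bool(tup.infomask & HEAP_XMIN_FROZEN)


old_style, new_style = HeapTuple(xmin=1234), HeapTuple(xmin=1234)
freeze_by_overwrite(old_style)
freeze_by_flag(new_style)
print(is_frozen(old_style), old_style.xmin)
print(is_frozen(new_style), new_style.xmin)
```

Both tuples read as frozen, but only the flagged one still carries the original inserting XID, which is the forensic information the proposal wants to keep.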
Heikki Linnakangas writes:
> On 01.12.2010 18:40, Tom Lane wrote:
>> Um, no it isn't. Suppose the heap page gets to disk but we crash before
>> the WAL record does. Now we have a persistent state where the heap page
>> is marked PD_ALL_VISIBLE but the corresponding VM bit is not set. The
>> VM
On 01.12.2010 18:40, Tom Lane wrote:
Robert Haas writes:
As far as I can tell, there are basically two viable solutions on the
table here.
1. Every time we observe a page as all-visible, (a) set the
PD_ALL_VISIBLE bit on the page, without bumping the LSN; (b) set the
bit in the visibility ma
Heikki Linnakangas writes:
> Hmm, actually, if we're willing to believe PD_ALL_VISIBLE in the page
> header over the xmin/xmax on the tuples, we could simply not bother
> doing anti-wraparound vacuums for pages that have the flag set. I'm not
> sure what changes that would require outside heapa
Robert Haas writes:
> As far as I can tell, there are basically two viable solutions on the
> table here.
> 1. Every time we observe a page as all-visible, (a) set the
> PD_ALL_VISIBLE bit on the page, without bumping the LSN; (b) set the
> bit in the visibility map page, bumping the LSN as usual
On 01.12.2010 18:25, Robert Haas wrote:
I think we can improve this a bit further by also introducing a
HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
FrozenXID. This allows us to freeze tuples aggressively - if we want
- without losing any forensic information. We can then m
On Wed, Dec 1, 2010 at 10:36 AM, Bruce Momjian wrote:
> Oh, we don't update the LSN when we set the PD_ALL_VISIBLE flag? OK,
> please let me think some more. Thanks.
As far as I can tell, there are basically two viable solutions on the
table here.
1. Every time we observe a page as all-visible
On Wed, Dec 1, 2010 at 9:57 AM, Kevin Grittner
wrote:
> Heikki Linnakangas wrote:
>
>> it would be annoying to have to checkpoint after a data load
>
> Heck, in my world it's currently pretty much a necessity to run
> VACUUM FREEZE ANALYZE on a table after a data load before it's
> reasonable to
Heikki Linnakangas wrote:
> On 01.12.2010 15:39, Bruce Momjian wrote:
> > Heikki Linnakangas wrote:
> >> On 01.12.2010 03:35, Bruce Momjian wrote:
> >>> Heikki Linnakangas wrote:
> Let's recap what happens when a VM bit is set: You set the
> PD_ALL_VISIBLE flag on the heap page (assuming
Heikki Linnakangas wrote:
> it would be annoying to have to checkpoint after a data load
Heck, in my world it's currently pretty much a necessity to run
VACUUM FREEZE ANALYZE on a table after a data load before it's
reasonable to expose the table to production use. It would hardly
be an incon
On 01.12.2010 15:39, Bruce Momjian wrote:
Heikki Linnakangas wrote:
On 01.12.2010 03:35, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Let's recap what happens when a VM bit is set: You set the
PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it
usually isn't), and then se
Heikki Linnakangas wrote:
> On 01.12.2010 03:35, Bruce Momjian wrote:
> > Heikki Linnakangas wrote:
> >> Let's recap what happens when a VM bit is set: You set the
> >> PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it
> >> usually isn't), and then set the bit in the VM while
On 01.12.2010 03:35, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Let's recap what happens when a VM bit is set: You set the
PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it
usually isn't), and then set the bit in the VM while keeping the heap
page locked.
What if we s
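The recap quoted above (set the page flag, then set the map bit while the heap page is still locked) might be sketched like this. It is a toy model under stated assumptions: HeapPage, mark_all_visible and the plain dict used as the visibility map are hypothetical, and a threading.Lock stands in for the buffer content lock.

```python
import threading


class HeapPage:
    def __init__(self):
        self.lock = threading.Lock()   # stands in for the buffer content lock
        self.all_visible = False       # stands in for PD_ALL_VISIBLE


def mark_all_visible(page, blkno, vm_bits):
    """Set the page flag, then the map bit, without releasing the page lock."""
    with page.lock:
        if not page.all_visible:       # usually not set yet, per the recap
            page.all_visible = True
        vm_bits[blkno] = True          # map bit set while the page is locked


page, vm_bits = HeapPage(), {}
mark_all_visible(page, 3, vm_bits)
print(page.all_visible, vm_bits)
```

Holding the heap page lock across both steps means no concurrent modifier can clear the flag between the two updates in this model.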
Heikki Linnakangas wrote:
> On 30.11.2010 18:33, Tom Lane wrote:
> > Robert Haas writes:
> >> Oh, but it's worse than that. When you XLOG a WAL record for each of
> >> those pages, you're going to trigger full-page writes for all of them.
> >> So now you've turned 1GB of data to write into 2+ G
Heikki Linnakangas writes:
> On 30.11.2010 19:22, Tom Lane wrote:
>> But having said that, I wonder whether we need a full-page image for
>> a WAL-logged action that is known to involve only setting a single bit
>> and updating LSN.
> You have to write a full-page image if you update the LSN, bec
On Tue, Nov 30, 2010 at 12:25 PM, Robert Haas wrote:
> On Tue, Nov 30, 2010 at 12:22 PM, Tom Lane wrote:
>> But having said that, I wonder whether we need a full-page image for
>> a WAL-logged action that is known to involve only setting a single bit
>> and updating LSN. Would omitting the FPI b
On Tue, Nov 30, 2010 at 12:22 PM, Tom Lane wrote:
> But having said that, I wonder whether we need a full-page image for
> a WAL-logged action that is known to involve only setting a single bit
> and updating LSN. Would omitting the FPI be any more risky than what
> happens now (ie, the page does
On 30.11.2010 19:22, Tom Lane wrote:
But having said that, I wonder whether we need a full-page image for
a WAL-logged action that is known to involve only setting a single bit
and updating LSN. Would omitting the FPI be any more risky than what
happens now (ie, the page does get written back to
Robert Haas writes:
> On Tue, Nov 30, 2010 at 12:10 PM, Tom Lane wrote:
>> It's ridiculous to claim that that "doubles the cost of VACUUM". In the
>> worst case, it will add 25% to the cost of setting an all-visible bit on
>> a page where there is no other work to do. (You already are writing o
On Tue, Nov 30, 2010 at 12:10 PM, Tom Lane wrote:
> Robert Haas writes:
>> We're not going to double the cost of VACUUM to get index-only scans.
>> And that's exactly what will happen if you do full-page writes of
>> every heap page to set a single bit.
>
> It's ridiculous to claim that that "dou
Robert Haas writes:
> We're not going to double the cost of VACUUM to get index-only scans.
> And that's exactly what will happen if you do full-page writes of
> every heap page to set a single bit.
It's ridiculous to claim that that "doubles the cost of VACUUM". In the
worst case, it will add 2
On Tue, Nov 30, 2010 at 11:59 AM, Tom Lane wrote:
> Robert Haas writes:
>> On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane wrote:
>>> Ouch. That seems like it could shoot down all these proposals. There
>>> definitely isn't any way to make VM crash-safe if there is no WAL-driven
>>> mechanism for s
On Tue, Nov 30, 2010 at 11:55 AM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> Can we get away with not setting the LSN on the heap page, even though
>> we set the PD_ALL_VISIBLE flag? If we don't set the LSN, the heap page
>> can be flushed to disk before the WAL record, but I think that's fi
On Tue, Nov 30, 2010 at 11:49 AM, Heikki Linnakangas
wrote:
> On 30.11.2010 18:33, Tom Lane wrote:
>>
>> Robert Haas writes:
>>>
>>> Oh, but it's worse than that. When you XLOG a WAL record for each of
>>> those pages, you're going to trigger full-page writes for all of them.
>>> So now you've
Robert Haas writes:
> On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane wrote:
>> Ouch. That seems like it could shoot down all these proposals. There
>> definitely isn't any way to make VM crash-safe if there is no WAL-driven
>> mechanism for setting the bits.
> Heikki's intent method works fine, be
Heikki Linnakangas writes:
> Can we get away with not setting the LSN on the heap page, even though
> we set the PD_ALL_VISIBLE flag? If we don't set the LSN, the heap page
> can be flushed to disk before the WAL record, but I think that's fine
> because it's OK to have the flag set in the heap
On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane wrote:
> Robert Haas writes:
>> That's definitely sucky, but in some ways it would be more complicated
>> if they did, because I don't think all-visible on the master implies
>> all-visible on the standby.
>
> Ouch. That seems like it could shoot down a
On 30.11.2010 18:33, Tom Lane wrote:
Robert Haas writes:
Oh, but it's worse than that. When you XLOG a WAL record for each of
those pages, you're going to trigger full-page writes for all of them.
So now you've turned 1GB of data to write into 2+ GB of data to
write.
No, because only the f
On 30.11.2010 18:40, Tom Lane wrote:
Robert Haas writes:
That's definitely sucky, but in some ways it would be more complicated
if they did, because I don't think all-visible on the master implies
all-visible on the standby.
Ouch. That seems like it could shoot down all these proposals. The
On Tue, Nov 30, 2010 at 11:33 AM, Tom Lane wrote:
> Robert Haas writes:
>> Oh, but it's worse than that. When you XLOG a WAL record for each of
>> those pages, you're going to trigger full-page writes for all of them.
>> So now you've turned 1GB of data to write into 2+ GB of data to
>> write.
Robert Haas writes:
> That's definitely sucky, but in some ways it would be more complicated
> if they did, because I don't think all-visible on the master implies
> all-visible on the standby.
Ouch. That seems like it could shoot down all these proposals. There
definitely isn't any way to make
Heikki Linnakangas writes:
> On 30.11.2010 18:10, Tom Lane wrote:
>> I'm not convinced it works at all. Consider write intent record,
>> checkpoint, set bit, crash before completing vacuum. There will be
>> no second intent record at which you could clean up if things are
>> inconsistent.
> Tha
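The objection quoted above (intent record, checkpoint, set bit, crash) can be played through in a toy replay loop. This is a simplification under loud assumptions: recover and the record layout are hypothetical, and redo is assumed to start immediately after the last checkpoint record.

```python
def recover(wal, vm_bits):
    """Replay records after the last checkpoint; intent records clear their bits."""
    start = 0
    for i, rec in enumerate(wal):
        if rec[0] == "CHECKPOINT":
            start = i + 1              # assumed redo start point
    for kind, *args in wal[start:]:
        if kind == "INTENT":
            for blkno in args[0]:
                vm_bits[blkno] = False
    return vm_bits


# Intent logged, bit set, crash: replay sees the intent and clears the bit.
safe = recover([("INTENT", [0])], {0: True})

# Intent logged, checkpoint, bit set, crash: the intent is never replayed.
unsafe = recover([("INTENT", [0]), ("CHECKPOINT",)], {0: True})

print(safe, unsafe)
```

In the second run the possibly incorrect bit survives recovery, because nothing after the checkpoint tells recovery to clear it.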
Robert Haas writes:
> Oh, but it's worse than that. When you XLOG a WAL record for each of
> those pages, you're going to trigger full-page writes for all of them.
> So now you've turned 1GB of data to write into 2+ GB of data to
> write.
No, because only the first mod of each VM page would tri
On 30.11.2010 18:10, Tom Lane wrote:
Heikki Linnakangas writes:
Yeah, I'm not terribly excited about any of these schemes. The "intent"
record seems like the simplest one, but even that is quite different
from the traditional WAL-logging we do that it makes me slightly nervous.
I'm not convin
On 30.11.2010 18:22, Robert Haas wrote:
On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane wrote:
How much is "quite a lot"? Do we have any real reason to think that
this solution is unacceptable performance-wise?
Well, let's imagine a 1GB insert-only table. It has 128K pages. If
you XLOG setting
On Tue, Nov 30, 2010 at 11:22 AM, Robert Haas wrote:
> On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane wrote:
>> How much is "quite a lot"? Do we have any real reason to think that
>> this solution is unacceptable performance-wise?
>
> Well, let's imagine a 1GB insert-only table. It has 128K pages.
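The page count in the example above follows from the block size. A small check, assuming PostgreSQL's default 8 kB block size (BLCKSZ); heap_pages is a hypothetical helper name:

```python
BLCKSZ = 8192  # default PostgreSQL block size in bytes (assumed here)


def heap_pages(table_bytes, block_size=BLCKSZ):
    """Number of blocks needed to hold table_bytes, rounded up."""
    return -(-table_bytes // block_size)


pages = heap_pages(1 << 30)       # a 1GB table
print(pages, pages // 1024)       # 131072 pages, i.e. 128K
```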
On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane wrote:
> How much is "quite a lot"? Do we have any real reason to think that
> this solution is unacceptable performance-wise?
Well, let's imagine a 1GB insert-only table. It has 128K pages. If
you XLOG setting the bit on each page, you'll need to wri
Heikki Linnakangas writes:
> The trivial solution to this is to WAL-log setting the visibility map
> bit, like we WAL-log any other operation. Lock the heap page, lock the
> visibility map page, write WAL-record, and release locks. That works,
> but the problem is that it creates quite a lot of
Heikki Linnakangas writes:
> On 30.11.2010 17:38, Tom Lane wrote:
>> Wouldn't it be easier and more robust to just consider VM bit changes to
>> be part of the WAL-logged actions? That would include updating LSNs on
>> VM pages and flushing VM pages to disk during checkpoint based on their
>> LSN
Here's one more idea:
The trivial solution to this is to WAL-log setting the visibility map
bit, like we WAL-log any other operation. Lock the heap page, lock the
visibility map page, write WAL-record, and release locks. That works,
but the problem is that it creates quite a lot of new WAL tra
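The "trivial solution" steps above (lock the heap page, lock the visibility map page, write the WAL record, release the locks) could look roughly like this. Everything here is a stand-in: set_vm_bit_logged is a hypothetical name, a Python list plays the WAL, and the position of the bit update relative to the WAL write is an assumption of the sketch.

```python
import threading


def set_vm_bit_logged(heap_lock, vm_lock, vm_bits, blkno, wal):
    """Set one visibility map bit as an ordinary WAL-logged action."""
    with heap_lock:                          # 1. lock the heap page
        with vm_lock:                        # 2. lock the visibility map page
            wal.append(("VM_SET", blkno))    # 3. write the WAL record
            vm_bits[blkno] = True            #    apply the change it describes
    # 4. both locks are released on leaving the with-blocks
    return len(wal)


wal, vm_bits = [], {}
heap_lock, vm_lock = threading.Lock(), threading.Lock()
for blkno in range(4):
    set_vm_bit_logged(heap_lock, vm_lock, vm_bits, blkno, wal)
print(len(wal))
```

One record per heap page is what makes the extra WAL volume proportional to table size in this model.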
On Tue, Nov 30, 2010 at 10:43 AM, Heikki Linnakangas
wrote:
>> It seems like you'll need to hold some kind of lock between the time
>> you examine RedoRecPtr and the time you actually examine the bit.
>> WALInsertLock in shared mode, maybe?
>
> It's enough to hold an exclusive lock on the visibili
On 30.11.2010 17:38, Tom Lane wrote:
Heikki Linnakangas writes:
On 30.11.2010 06:57, Robert Haas wrote:
I can't say I'm totally in love with any of these designs. Anyone
else have any ideas, or any opinions about which one is best?
Well, the design I've been pondering goes like this:
Wou
On Tue, Nov 30, 2010 at 10:38 AM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 30.11.2010 06:57, Robert Haas wrote:
>>> I can't say I'm totally in love with any of these designs. Anyone
>>> else have any ideas, or any opinions about which one is best?
>
>> Well, the design I've been ponder
On 30.11.2010 17:32, Robert Haas wrote:
On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas
wrote:
Some care is needed with checkpoints. Setting visibility map bits in step 2
is safe because crash recovery will replay the intent XLOG record and clear
any incorrectly set bits. But if a checkpoi
Heikki Linnakangas writes:
> On 30.11.2010 06:57, Robert Haas wrote:
>> I can't say I'm totally in love with any of these designs. Anyone
>> else have any ideas, or any opinions about which one is best?
> Well, the design I've been pondering goes like this:
Wouldn't it be easier and more robust
On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas
wrote:
> Some care is needed with checkpoints. Setting visibility map bits in step 2
> is safe because crash recovery will replay the intent XLOG record and clear
> any incorrectly set bits. But if a checkpoint has happened after the intent
> XLO
On Mon, Nov 29, 2010 at 9:57 PM, Robert Haas wrote:
> 1. Pin each visibility map page. If any VM_BECOMING_ALL_VISIBLE bits
> are set, take the exclusive content lock for long enough to clear
> them.
I wonder what the performance hit will be to workloads with contention
and if this feature should
On 30.11.2010 06:57, Robert Haas wrote:
I can't say I'm totally in love with any of these designs. Anyone
else have any ideas, or any opinions about which one is best?
Well, the design I've been pondering goes like this:
At vacuum:
1. Write an "intent" XLOG record listing a chunk of visibili
Last week, I posted a couple of possible designs for making the
visibility map crash-safe, which did not elicit much comment. Since
this is an important prerequisite to index-only scans, I'm trying
again.
http://archives.postgresql.org/pgsql-hackers/2010-11/msg01474.php
http://archives.postgresql