Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Heikki Linnakangas

On 03/12/2014 02:05 PM, Robert Haas wrote:

On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas
 wrote:

The attached patch doesn't apply any more, but it looks like this
issue still exists.


Fixed.


Did you forget to push?


Yep. Pushed now.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas
 wrote:
>> The attached patch doesn't apply any more, but it looks like this
>> issue still exists.
>
> Fixed.

Did you forget to push?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Heikki Linnakangas

On 03/10/2014 09:44 PM, Robert Haas wrote:

On Mon, Apr 30, 2012 at 1:34 PM, Noah Misch  wrote:

When GIN changes a metapage, we WAL-log its ex-header content and never use a
backup block.  This reduces WAL volume since the vast majority of the metapage
is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
content if the metapage LSN predates the WAL record LSN.  If a metapage write
tore and updated the LSN but not the other content, we would fail to complete
the update.  Instead, unconditionally reinitialize the metapage similar to how
_bt_restore_meta() handles the situation.

I found this problem by code reading and did not attempt to build a test case
illustrating its practical consequences.  It's possible that there's no
problem in practice on account of some reason I haven't contemplated.


The attached patch doesn't apply any more, but it looks like this
issue still exists.


Fixed.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-10 Thread Robert Haas
On Mon, Apr 30, 2012 at 1:34 PM, Noah Misch  wrote:
> When GIN changes a metapage, we WAL-log its ex-header content and never use a
> backup block.  This reduces WAL volume since the vast majority of the metapage
> is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
> content if the metapage LSN predates the WAL record LSN.  If a metapage write
> tore and updated the LSN but not the other content, we would fail to complete
> the update.  Instead, unconditionally reinitialize the metapage similar to how
> _bt_restore_meta() handles the situation.
>
> I found this problem by code reading and did not attempt to build a test case
> illustrating its practical consequences.  It's possible that there's no
> problem in practice on account of some reason I haven't contemplated.

The attached patch doesn't apply any more, but it looks like this
issue still exists.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-03 Thread Tom Lane
Robert Haas  writes:
> Are you planning to commit Noah's patch?

I wasn't intending to do so personally in the near future; I've got
other things on my to-do list.  I won't object if somebody else
commits it though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-03 Thread Robert Haas
On Thu, May 3, 2012 at 12:16 AM, Tom Lane  wrote:
> Having said all that, I wasn't really arguing that this was a guaranteed
> safe thing for us to rely on; just pointing out that it's quite likely
> that the issue hasn't been seen in the field because of this type of
> consideration.

Well, we do rely, in numerous places, on writes << 512 bytes not
getting torn.  pd_prune_xid, index tuple kills, heap tuple hint bits,
relmapper files, etc.  We generally assume, for example, that a 4-byte
write which is 4-byte aligned does not need to be WAL-logged, which
would be necessary if we thought that the write might be torn.

Are you planning to commit Noah's patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-02 Thread Tom Lane
Daniel Farina  writes:
> On Wed, May 2, 2012 at 6:06 PM, Noah Misch  wrote:
>> Can we indeed assume that all support-worthy filesystems align the start of
>> every file to a physical sector?  I know little about modern filesystem
>> design, but these references leave me wary of that assumption:
>> 
>> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
>> http://en.wikipedia.org/wiki/Block_suballocation
>> 
>> If it is a safe assumption, we could exploit it elsewhere.

> Not to say whether this is safe or not, but it *is* exploited
> elsewhere, as I understand it: the pg_control information, whose
> justification for its safety is its small size.  That may point to a
> very rare problem with pg_control rather the safety of the assumption
> it makes.

I think it's somewhat common now for filesystems to attempt to optimize
very small files (on the order of a few dozen bytes) in that way.  It's
hard to see where's the upside for changing the conventional storage
allocation when the file is sector-sized or larger; the file system does
have to be prepared to rewrite the file on demand, and moving it from
one place to another isn't cheap.

That wikipedia reference argues for doing this type of optimization on
the last partial block of a file, which is entirely irrelevant for our
purposes since we always ask for page-multiples of space.  (The fact
that much of that might be useless padding is, I think, unknown to the
filesystem.)

Having said all that, I wasn't really arguing that this was a guaranteed
safe thing for us to rely on; just pointing out that it's quite likely
that the issue hasn't been seen in the field because of this type of
consideration.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-02 Thread Daniel Farina
On Wed, May 2, 2012 at 6:06 PM, Noah Misch  wrote:
> Can we indeed assume that all support-worthy filesystems align the start of
> every file to a physical sector?  I know little about modern filesystem
> design, but these references leave me wary of that assumption:
>
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
> http://en.wikipedia.org/wiki/Block_suballocation
>
> If it is a safe assumption, we could exploit it elsewhere.

Not to say whether this is safe or not, but it *is* exploited
elsewhere, as I understand it: the pg_control information, whose
justification for its safety is its small size.  That may point to a
very rare problem with pg_control rather the safety of the assumption
it makes.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-02 Thread Noah Misch
On Mon, Apr 30, 2012 at 02:35:20PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > When GIN changes a metapage, we WAL-log its ex-header content and never use 
> > a
> > backup block.  This reduces WAL volume since the vast majority of the 
> > metapage
> > is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
> > content if the metapage LSN predates the WAL record LSN.  If a metapage 
> > write
> > tore and updated the LSN but not the other content, we would fail to 
> > complete
> > the update.  Instead, unconditionally reinitialize the metapage similar to 
> > how
> > _bt_restore_meta() handles the situation.
> 
> > I found this problem by code reading and did not attempt to build a test 
> > case
> > illustrating its practical consequences.  It's possible that there's no
> > problem in practice on account of some reason I haven't contemplated.
> 
> I think there's no problem in practice; the reason is that the
> GinMetaPageData struct isn't large enough to extend past the first
> physical sector of the page.  So it's in the same disk sector as the
> LSN and tearing is impossible.  Still, this might be a good
> future-proofing move, in case GinMetaPageData gets larger.

Can we indeed assume that all support-worthy filesystems align the start of
every file to a physical sector?  I know little about modern filesystem
design, but these references leave me wary of that assumption:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
http://en.wikipedia.org/wiki/Block_suballocation

If it is a safe assumption, we could exploit it elsewhere.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-04-30 Thread Tom Lane
Noah Misch  writes:
> When GIN changes a metapage, we WAL-log its ex-header content and never use a
> backup block.  This reduces WAL volume since the vast majority of the metapage
> is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
> content if the metapage LSN predates the WAL record LSN.  If a metapage write
> tore and updated the LSN but not the other content, we would fail to complete
> the update.  Instead, unconditionally reinitialize the metapage similar to how
> _bt_restore_meta() handles the situation.

> I found this problem by code reading and did not attempt to build a test case
> illustrating its practical consequences.  It's possible that there's no
> problem in practice on account of some reason I haven't contemplated.

I think there's no problem in practice; the reason is that the
GinMetaPageData struct isn't large enough to extend past the first
physical sector of the page.  So it's in the same disk sector as the
LSN and tearing is impossible.  Still, this might be a good
future-proofing move, in case GinMetaPageData gets larger.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-04-30 Thread Noah Misch
When GIN changes a metapage, we WAL-log its ex-header content and never use a
backup block.  This reduces WAL volume since the vast majority of the metapage
is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
content if the metapage LSN predates the WAL record LSN.  If a metapage write
tore and updated the LSN but not the other content, we would fail to complete
the update.  Instead, unconditionally reinitialize the metapage similar to how
_bt_restore_meta() handles the situation.

I found this problem by code reading and did not attempt to build a test case
illustrating its practical consequences.  It's possible that there's no
problem in practice on account of some reason I haven't contemplated.

Thanks,
nm
*** a/src/backend/access/gin/ginxlog.c
--- b/src/backend/access/gin/ginxlog.c
***
*** 492,504  ginRedoUpdateMetapage(XLogRecPtr lsn, XLogRecord *record)
return; /* assume index was 
deleted, nothing to do */
metapage = BufferGetPage(metabuffer);
  
!   if (!XLByteLE(lsn, PageGetLSN(metapage)))
!   {
!   memcpy(GinPageGetMeta(metapage), &data->metadata, 
sizeof(GinMetaPageData));
!   PageSetLSN(metapage, lsn);
!   PageSetTLI(metapage, ThisTimeLineID);
!   MarkBufferDirty(metabuffer);
!   }
  
if (data->ntuples > 0)
{
--- 492,503 
return; /* assume index was 
deleted, nothing to do */
metapage = BufferGetPage(metabuffer);
  
!   GinInitMetabuffer(metabuffer);
!   memcpy(GinPageGetMeta(metapage), &data->metadata, 
sizeof(GinMetaPageData));
! 
!   PageSetLSN(metapage, lsn);
!   PageSetTLI(metapage, ThisTimeLineID);
!   MarkBufferDirty(metabuffer);
  
if (data->ntuples > 0)
{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers