Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Heikki Linnakangas

Tom Lane wrote:

I wrote:

Two different buildfarm machines are currently showing the same failure:
ERROR:  could not fsync segment 0 of relation 1663/16384/29270/1: No such file 
or directory
ERROR:  checkpoint request failed


Some tests show that when the serial regression tests are run in a
freshly initdb'd installation, HEAD assigns OID 29270 to bmscantest
in the bitmapops test.  So that's been dropped some time before the
failure occurs; which means that this isn't a narrow-window race
condition; which raises the question of why we're not seeing it on more
machines.  I notice now that kudu and dragonfly are actually the same
machine ... could this be an OS-specific problem?  Kris, has there been
any system-software change on that machine recently?


Must be because of this little missing line here:

--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -1012,6 +1012,7 @@ ForwardFsyncRequest(RelFileNode rnode, ForkNumber 
forknum, BlockNumber segno)

}
request = BgWriterShmem-requests[BgWriterShmem-num_requests++];
request-rnode = rnode;
+   request-forknum = forknum;
request-segno = segno;
LWLockRelease(BgWriterCommLock);
return true;

Most fsync requests are for main fork, not FSM, so I guess what usually 
happens because of that bug is that the we skip the fsync on the FSM, 
which is why we haven't noticed before.


I still wonder, though, why we're seeing the error consistently on kudu, 
and not on any other animal. Perhaps the forknum field that's left 
uninitialized gets a different value there than on other platforms.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Heikki Linnakangas

Zdenek Kotala wrote:

Tom Lane napsal(a):

Heikki Linnakangas [EMAIL PROTECTED] writes:
I still wonder, though, why we're seeing the error consistently on 
kudu, and not on any other animal. Perhaps the forknum field that's 
left uninitialized gets a different value there than on other platforms.


Hmm ... AFAICS this mistake would mean that no forknum field of the
requests[] array ever gets set at all, so they would stay at whatever
the virgin value in the shmem segment had been.  Perhaps Solaris doesn't
guarantee that a shared memory block starts out as zeroes?  


For security reason any OS should clean memory pages before process 
first touches them.


Yeah. But it doesn't necessarily need to fill them with zeros, any 
garbage will do.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Zdenek Kotala wrote:
 Tom Lane napsal(a):
 Hmm ... AFAICS this mistake would mean that no forknum field of the
 requests[] array ever gets set at all, so they would stay at whatever
 the virgin value in the shmem segment had been.  Perhaps Solaris doesn't
 guarantee that a shared memory block starts out as zeroes?  
 
 For security reason any OS should clean memory pages before process 
 first touches them.

 Yeah. But it doesn't necessarily need to fill them with zeros, any 
 garbage will do.

Yeah, but the observed symptoms seem to indicate that the fill is mostly
zeroes with a very occasional one.  This seems less than probable.

The only theory I've thought of that seems to fit the facts is that
someplace we have a wild store that is clobbering that particular word.
Which is a pretty unpleasant thought.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:

Zdenek Kotala wrote:
For security reason any OS should clean memory pages before process 
first touches them.


Yeah. But it doesn't necessarily need to fill them with zeros, any 
garbage will do.


Yeah, but the observed symptoms seem to indicate that the fill is mostly
zeroes with a very occasional one.  This seems less than probable.

The only theory I've thought of that seems to fit the facts is that
someplace we have a wild store that is clobbering that particular word.
Which is a pretty unpleasant thought.


The bug only affected fsync/forget requests that are forwarded from 
backends, not the ones that bgwriter puts into the hash table itself. If 
the fsync request is queued by the bgwriter itself, and the forget 
request comes from a backend, you get the error that we saw, I think. I 
noted that kudu has a small shared_buffers setting, 5.6 MB, compared to 
most buildfarm members, which might explain different behavior wrt. 
which buffers are written by backends and which are written by bgwriter.


(kudu is green now BTW)

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 The bug only affected fsync/forget requests that are forwarded from 
 backends, not the ones that bgwriter puts into the hash table itself.

Oh, of course.  So the actual sequence of events was:

* bgwriter queues an fsync request for a FSM fork, after
  writing a dirty FSM buffer.
* backend drops table, sends forget request for FSM fork,
  which the bug converts into an extra forget for the main fork.
* at checkpoint, we got trouble.

 I noted that kudu has a small shared_buffers setting, 5.6 MB, compared to 
 most buildfarm members, which might explain different behavior wrt. 
 which buffers are written by backends and which are written by bgwriter.

Yup, that must have been the determining factor.  That would also
explain a certain amount of nondeterminism in the bug's manifestation,
which also squares with the buildfarm observations.

Okay, I feel better now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 I still wonder, though, why we're seeing the error consistently on kudu, 
 and not on any other animal. Perhaps the forknum field that's left 
 uninitialized gets a different value there than on other platforms.

Hmm ... AFAICS this mistake would mean that no forknum field of the
requests[] array ever gets set at all, so they would stay at whatever
the virgin value in the shmem segment had been.  Perhaps Solaris doesn't
guarantee that a shared memory block starts out as zeroes?   But if
there were random garbage in the forknum fields you'd expect rather more
failures than are observed.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-14 Thread Zdenek Kotala

Tom Lane napsal(a):

Heikki Linnakangas [EMAIL PROTECTED] writes:
I still wonder, though, why we're seeing the error consistently on kudu, 
and not on any other animal. Perhaps the forknum field that's left 
uninitialized gets a different value there than on other platforms.


Hmm ... AFAICS this mistake would mean that no forknum field of the
requests[] array ever gets set at all, so they would stay at whatever
the virgin value in the shmem segment had been.  Perhaps Solaris doesn't
guarantee that a shared memory block starts out as zeroes?  


For security reason any OS should clean memory pages before process 
first touches them.


Zdenek

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-13 Thread Tom Lane
Two different buildfarm machines are currently showing the same failure:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=kududt=2008-10-13%2015:30:00
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dragonflydt=2008-10-13%2016:30:01

The postmaster log in each case shows

ERROR:  could not fsync segment 0 of relation 1663/16384/29270/1: No such file 
or directory
ERROR:  checkpoint request failed
HINT:  Consult recent messages in the server log for details.
STATEMENT:  DROP TABLESPACE testspace;
ERROR:  could not fsync segment 0 of relation 1663/16384/29270/1: No such file 
or directory
ERROR:  checkpoint request failed
HINT:  Consult recent messages in the server log for details.
STATEMENT:  DROP TABLESPACE testspace;

which looks like an fsync request has been allowed to hang around too
long, ie, after its file is already gone.  Since /1 indicates the FSM
fork, I suppose this has been introduced by the new FSM code.  I haven't
dug any more deeply than that though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-13 Thread Tom Lane
I wrote:
 Two different buildfarm machines are currently showing the same failure:
 ERROR:  could not fsync segment 0 of relation 1663/16384/29270/1: No such 
 file or directory
 ERROR:  checkpoint request failed

Some tests show that when the serial regression tests are run in a
freshly initdb'd installation, HEAD assigns OID 29270 to bmscantest
in the bitmapops test.  So that's been dropped some time before the
failure occurs; which means that this isn't a narrow-window race
condition; which raises the question of why we're not seeing it on more
machines.  I notice now that kudu and dragonfly are actually the same
machine ... could this be an OS-specific problem?  Kris, has there been
any system-software change on that machine recently?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] There's some sort of race condition with the new FSM stuff

2008-10-13 Thread Kris Jurka



On Mon, 13 Oct 2008, Tom Lane wrote:

I notice now that kudu and dragonfly are actually the same machine ... 
could this be an OS-specific problem?  Kris, has there been any 
system-software change on that machine recently?


This is a VM that I haven't touched in some time.  It was turned off after 
a host kernel upgrade no longer allowed vmware to work on it.  I recently 
turned it back on after switching from vmware workstation 5.5 to vmware 
server 2.0.  On the VM itself the only change I've made was to switch the 
timezone from Mountain to Pacific.


http://pgbuildfarm.org/cgi-bin/show_history.pl?nm=dragonflybr=HEAD

Kris Jurka

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers