On Wed, May 30, 2012 22:25, Robert Haas wrote:
> On Wed, May 30, 2012 at 2:52 PM, Robert Haas wrote:
>> On Wed, May 30, 2012 at 1:47 PM, Robert Haas wrote:
>>>> The process holding the AccessExclusiveLock is the startup process. It's
>>>> holding the lock on behalf of the transaction in the master. But something's
>>>> wrong, and the AccessExclusiveLock doesn't stop a regular backend from
>>>> acquiring the AccessShareLock on the table. I suspect the fast-path locking
>>>> patch, because this works on 9.1.
>>>
>>> Yeah, apparently so. gdb says that FastPathStrongRelationLocks on the
>>> standby is all-zeros even after that record has been replayed. Not
>>> sure how that's possible yet.
>>
>> Ah. The problem is that FastPathTag() expects that locks on database
>> objects will only be taken by backends with a non-zero value for
>> MyDatabaseId. Apparently the can-i-use-the-fastpath test and the
>> do-i-need-to-force-other-people-out-of-the-fastpath test need to be a
>> bit more asymmetrical than they are at present.
>
> I've fixed things so that Heikki's test case now behaves as expected.
> Hopefully this fixes Erik's problem as well, but I haven't tested.
>
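To make the asymmetry described in the quoted analysis concrete, here is a minimal sketch in C. It is not the code from lock.c: every type and function name below is a hypothetical stand-in, and it assumes (as the quoted text implies) that the startup process runs with a zero MyDatabaseId. The point is only that the "can I use the fast path" test may depend on the caller's own database, while the "do I need to force others out" test should not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's definitions. */
typedef uint32_t Oid;
#define INVALID_OID ((Oid) 0)

/* Lock modes in increasing strength, mirroring PostgreSQL's ordering. */
typedef enum {
    ACCESS_SHARE_LOCK = 1,
    ROW_SHARE_LOCK,
    ROW_EXCLUSIVE_LOCK,
    SHARE_UPDATE_EXCLUSIVE_LOCK,
    SHARE_LOCK,
    SHARE_ROW_EXCLUSIVE_LOCK,
    EXCLUSIVE_LOCK,
    ACCESS_EXCLUSIVE_LOCK
} LockMode;

typedef struct {
    Oid db_oid;   /* database that contains the relation */
    Oid rel_oid;  /* the relation being locked */
} RelLockTag;

/* True only if the caller is connected to a database and the tag is in it. */
bool tag_in_my_database(const RelLockTag *tag, Oid my_database_id)
{
    return my_database_id != INVALID_OID && tag->db_oid == my_database_id;
}

/* "Can I use the fast path?": weak lock on a relation in my own database. */
bool eligible_for_fast_path(const RelLockTag *tag, LockMode mode,
                            Oid my_database_id)
{
    return tag_in_my_database(tag, my_database_id)
        && mode < SHARE_UPDATE_EXCLUSIVE_LOCK;
}

/*
 * "Do I need to force others out of the fast path?", symmetric variant.
 * It reuses the own-database check, so a process with a zero database id
 * never reports a conflict, however strong the lock it takes.
 */
bool must_evict_symmetric(const RelLockTag *tag, LockMode mode,
                          Oid my_database_id)
{
    return tag_in_my_database(tag, my_database_id)
        && mode > SHARE_UPDATE_EXCLUSIVE_LOCK;
}

/*
 * Asymmetric variant: ignores the caller's database id and only requires
 * that the tag names a relation inside some database.
 */
bool must_evict_asymmetric(const RelLockTag *tag, LockMode mode)
{
    return tag->db_oid != INVALID_OID
        && mode > SHARE_UPDATE_EXCLUSIVE_LOCK;
}
```

With a tag such as { 1000, 2000 } (made-up OIDs) and a caller whose database id is zero, must_evict_symmetric() returns false for ACCESS_EXCLUSIVE_LOCK while must_evict_asymmetric() returns true. That would be consistent with the all-zeros FastPathStrongRelationLocks seen in gdb, though the sketch does not claim to reproduce the actual fix.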
(I double-checked that I got your latest commit in)
I'm afraid it's not yet resolved; the sync-slave still crashes almost
immediately:
master logfile says:
2012-05-30 23:30:07.846 CEST 3918 LOG: standby wal_receiver_01 is now the synchronous standby with priority 1
sync-slave logfile:
[...]
2012-05-30 23:30:07.833 CEST 3908 LOG: database system is ready to accept read only connections
cp: cannot stat `/home/aardvark/pg_stuff/archive_dir/000000010000000000000004': No such file or directory
2012-05-30 23:30:07.845 CEST 3917 LOG: streaming replication successfully connected to primary
2012-05-30 23:40:52.635 CEST 5287 ERROR: could not open relation with OID 26563
2012-05-30 23:40:52.635 CEST 5287 STATEMENT: select current_setting('port') port, count(*) from public.t
2012-05-30 23:40:57.909 CEST 3909 FATAL: could not open file "base/21268/26569": No such file or directory
2012-05-30 23:40:57.909 CEST 3909 CONTEXT: writing block 5152 of relation base/21268/26569
xlog redo multi-insert (init): rel 1663/21268/26581; blk 3852; 35 tuples
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741)
2012-05-30 23:40:58.006 CEST 5331 FATAL: could not open file "base/21268/26569": No such file or directory
2012-05-30 23:40:58.006 CEST 5331 CONTEXT: writing block 5153 of relation base/21268/26569
2012-05-30 23:40:59.661 CEST 3908 LOG: startup process (PID 3909) was terminated by signal 6: Aborted
2012-05-30 23:40:59.661 CEST 3908 LOG: terminating any other active server processes