On 18/07/10 08:22, Bruce Momjian wrote:
The bug is that we can't replay mkdir()/symlink() and assume those will
always succeed.  I looked at the createdb redo code and it basically
drops the directory before creating it.

The tablespace directory/symlink setup is more complex, so I just wrote
the attached patch to trigger a redo-'delete' tablespace operation
before the create tablespace redo operation.

Redoing a drop tablespace assumes the tablespace directory is empty, which isn't necessarily the case when redoing a create tablespace command:

postgres=# CREATE TABLESPACE t LOCATION '/tmp/t';
CREATE TABLESPACE
postgres=#  CREATE TABLE tfoo (id int4) TABLESPACE t;
CREATE TABLE
postgres=# \q
$ killall -9 postmaster
$ bin/postmaster -D data
LOG:  database system was interrupted; last known up at 2010-07-18 08:48:32 EEST
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  consistent recovery state reached at 0/5E889C
LOG:  redo starts at 0/5E889C
FATAL:  tablespace 16402 is not empty
CONTEXT:  xlog redo create ts: 16402 "/tmp/t"
LOG:  startup process (PID 5987) exited with exit code 1
LOG:  aborting startup due to startup process failure

Also, casting the xl_tblspc_create_rec as xl_tblspc_drop_rec is a bit questionable. It works because both structs begin with the tablespace Oid, but it doesn't look right, and it can break in hard-to-notice ways if the layout of either struct ever changes.
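
To illustrate what I mean, here is a rough sketch of building the drop record explicitly instead of casting. The struct definitions are simplified stand-ins for the ones in the tree, and drop_rec_from_create() is a made-up name, not an existing function:

```c
#include <string.h>

/*
 * Simplified stand-ins for the WAL record structs discussed above.
 * These are assumptions for illustration; the real definitions live in
 * the PostgreSQL source and may differ.
 */
typedef unsigned int Oid;

typedef struct xl_tblspc_create_rec
{
    Oid         ts_id;
    char        ts_path[1];     /* variable-length string in a real record */
} xl_tblspc_create_rec;

typedef struct xl_tblspc_drop_rec
{
    Oid         ts_id;
} xl_tblspc_drop_rec;

/*
 * Hypothetical helper: copy the fields we need by name, so the code does
 * not depend on both structs happening to start with the same member.
 */
static xl_tblspc_drop_rec
drop_rec_from_create(const xl_tblspc_create_rec *create)
{
    xl_tblspc_drop_rec drop;

    memset(&drop, 0, sizeof(drop));
    drop.ts_id = create->ts_id;
    return drop;
}
```

With something like this, reordering or adding fields in either struct either keeps working or fails at compile time, rather than silently reading the wrong bytes.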

Ignoring mkdir/symlink creation failure is not an option because the
symlink might point to some wrong location or something.

Maybe you should check that it points to the right location? Or drop and recreate the symlink, and ignore failure at mkdir.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
