Hoi, I ran into this in our CI setup and I thinks it's an actual bug. The issue appears to be that when a table is created *and truncated* in a single transaction, that the WAL log is logging a truncate record it shouldn't, such that if the database crashes you have a broken index. It would also lose any data that was in the table at commit time.
I didn't test 9.4.4 yet, though I don't see anything in the release
notes that resemble this.
Detail:
=== Start with an empty database
martijn@martijn-jessie:$ psql ctmp -h localhost -U username
Password for user username:
psql (9.4.3)
Type "help" for help.
ctmp=# begin;
BEGIN
ctmp=# create table test(id serial primary key);
CREATE TABLE
ctmp=# truncate table test;
TRUNCATE TABLE
ctmp=# commit;
COMMIT
ctmp=# select relname, relfilenode from pg_class where relname like 'test%';
relname | relfilenode
-------------+-------------
test | 16389
test_id_seq | 16387
test_pkey | 16393
(3 rows)
=== Note: if you do a CHECKPOINT here the issue doesn't happen
=== obviously.
ctmp=# \q
martijn@martijn-jessie:$ sudo ls -l
/data/postgres/base/16385/{16389,16387,16393}
[sudo] password for martijn:
-rw------- 1 messagebus ssl-cert 8192 Jul 2 23:34
/data/postgres/base/16385/16387
-rw------- 1 messagebus ssl-cert 0 Jul 2 23:34
/data/postgres/base/16385/16389
-rw------- 1 messagebus ssl-cert 8192 Jul 2 23:34
/data/postgres/base/16385/16393
=== Note the index file is 8KB.
=== At this point nuke the database server (in this case it was simply
=== destroying the container it was running in.
=== Dump the xlogs just to show what got recorded. Note there's a
=== truncate for the data file and the index file.
martijn@martijn-jessie:$ sudo /usr/lib/postgresql/9.4/bin/pg_xlogdump -p
/data/postgres/pg_xlog/ 000000010000000000000001 |grep -wE '16389|16387|16393'
rmgr: XLOG len (rec/tot): 72/ 104, tx: 0, lsn:
0/016A9240, prev 0/016A9200, bkp: 0000, desc: checkpoint: redo 0/16A9240; tli
1; prev tli 1; fpw true; xid 0/686; oid 16387; multi 1; offset 0; oldest xid
673 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown
rmgr: Storage len (rec/tot): 16/ 48, tx: 0, lsn:
0/016A92D0, prev 0/016A92A8, bkp: 0000, desc: file create: base/16385/16387
rmgr: Sequence len (rec/tot): 158/ 190, tx: 686, lsn:
0/016B5E50, prev 0/016B5D88, bkp: 0000, desc: log: rel 1663/16385/16387
rmgr: Storage len (rec/tot): 16/ 48, tx: 686, lsn:
0/016B5F10, prev 0/016B5E50, bkp: 0000, desc: file create: base/16385/16389
rmgr: Storage len (rec/tot): 16/ 48, tx: 686, lsn:
0/016BB028, prev 0/016BAFD8, bkp: 0000, desc: file create: base/16385/16393
rmgr: Sequence len (rec/tot): 158/ 190, tx: 686, lsn:
0/016BE4F8, prev 0/016BE440, bkp: 0000, desc: log: rel 1663/16385/16387
rmgr: Storage len (rec/tot): 16/ 48, tx: 686, lsn:
0/016BE6B0, prev 0/016BE660, bkp: 0000, desc: file truncate: base/16385/16389
to 0 blocks
rmgr: Storage len (rec/tot): 16/ 48, tx: 686, lsn:
0/016BE6E0, prev 0/016BE6B0, bkp: 0000, desc: file truncate: base/16385/16393
to 0 blocks
pg_xlogdump: FATAL: error in WAL record at 0/16BE710: record with zero length
at 0/16BE740
=== Start the DB up again
database_1 | LOG: database system was interrupted; last known up at 2015-07-02
21:08:05 UTC
database_1 | LOG: database system was not properly shut down; automatic
recovery in progress
database_1 | LOG: redo starts at 0/16A92A8
database_1 | LOG: record with zero length at 0/16BE740
database_1 | LOG: redo done at 0/16BE710
database_1 | LOG: last completed transaction was at log time 2015-07-02
21:34:45.664989+00
database_1 | LOG: database system is ready to accept connections
database_1 | LOG: autovacuum launcher started
=== Oops, the index file is empty now
martijn@martijn-jessie:$ sudo ls -l
/data/postgres/base/16385/{16389,16387,16393}
-rw------- 1 messagebus ssl-cert 8192 Jul 2 23:37
/data/postgres/base/16385/16387
-rw------- 1 messagebus ssl-cert 0 Jul 2 23:34
/data/postgres/base/16385/16389
-rw------- 1 messagebus ssl-cert 0 Jul 2 23:37
/data/postgres/base/16385/16393
martijn@martijn-jessie:$ psql ctmp -h localhost -U username
Password for user username:
psql (9.4.3)
Type "help" for help.
=== And now the index is broken. I think the only reason it doesn't
=== complain about the data file is because zero bytes there is OK. But if
=== the table had data before it would be gone now.
ctmp=# select * from test;
ERROR: could not read block 0 in file "base/16385/16393": read only 0 of 8192
bytes
ctmp=# select version();
version
-----------------------------------------------------------------------------------------------
PostgreSQL 9.4.3 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.9.2-10) 4.9.2, 64-bit
(1 row)
Hope this helps.
--
Martijn van Oosterhout <[email protected]> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer
signature.asc
Description: Digital signature
