This is an FYI re: a bug I ran across.

Background: RHEL 6 & ext4. PGDATA, a tablespace, and the WAL (pg_xlog) are each 
on their own partition.

The WAL partition filled up (wal_keep_segments was changed, but Pg hadn't been 
restarted). A write happened while the partition was full, and it appears to 
have left an index page corrupt. REINDEXing the table failed with the 
following error:

> WARNING: concurrent delete in progress within table "tblA"

> WARNING: concurrent delete in progress within table "tblA"
> ERROR: could not access status of transaction 86081816
> DETAIL: Could not read from file "pg_subtrans/0521" at offset 131072: Success.

pg_subtrans/0521 was a 90112 byte file and pg_clog/0052 was 24576 bytes in size.
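For what it's worth, the numbers are consistent with a read past the end of a 
short segment file. A rough sketch of the arithmetic, assuming the default 8192 
byte block size and 32 pages per SLRU segment (I haven't confirmed the build 
options on this box; describe_read is just an illustrative helper name):

```python
BLCKSZ = 8192                # assumed: default PostgreSQL block size
SLRU_PAGES_PER_SEGMENT = 32  # assumed: pages per pg_subtrans/pg_clog segment


def describe_read(file_size, offset, page_size=BLCKSZ):
    """Relate a failed read offset to the size of the segment file on disk."""
    return {
        "pages_on_disk": file_size // page_size,   # whole pages present
        "page_wanted": offset // page_size,        # page the read asked for
        "past_eof": offset >= file_size,           # read starts beyond EOF
        "full_segment_bytes": page_size * SLRU_PAGES_PER_SEGMENT,
    }


if __name__ == "__main__":
    # Sizes taken from the error message and the file listing above.
    print(describe_read(file_size=90112, offset=131072))
```

Under those assumptions the file holds 11 pages while the read wanted page 16, 
which would also explain the odd ": Success" (a short read doesn't set errno).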

I haven't dug into the code, but I did see the commit message for 
pgsql/src/backend/access/transam/slru.c 1.23.4.2, which I'm guessing is 
related. My WAG is that there's an assumption someplace that pg_subtrans and 
pg_clog are on the same partition as pg_xlog, or that creating files in 
pg_subtrans and pg_clog will either succeed completely or fail completely. It 
could also be that Linux reported a successful write(2) without actually 
having the space available (ext4).

Anyway, after extending pg_subtrans/0521 with zeros to its proper 256KB size, I 
was able to REINDEX the table, but there was a stream of WARNINGs about 
"concurrent inserts and deletes" that I didn't dig into. Upon learning that 
WAL files had been removed as a temporary fix for the space problem, I opted 
to dump, re-initdb, and load the data, which completed without any errors or 
warnings.
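
In case it helps anyone else, the zero-extension step can be sketched roughly 
as below. This is not the exact command I ran; pad_with_zeros is a made-up 
helper name, the 262144 byte target assumes the default 8KB block size and 32 
pages per segment, and I'd only try it with the server stopped and a copy of 
the original file set aside.

```python
import os

SEGMENT_BYTES = 32 * 8192  # assumed full SLRU segment size (256KB)


def pad_with_zeros(path, target_size=SEGMENT_BYTES):
    """Append zero bytes until the file reaches target_size.

    Returns the number of bytes appended (0 if the file is already
    at least target_size). Existing contents are left untouched.
    """
    current = os.path.getsize(path)
    if current >= target_size:
        return 0
    missing = target_size - current
    with open(path, "ab") as f:       # append mode: never rewrites old data
        f.write(b"\0" * missing)
        f.flush()
        os.fsync(f.fileno())          # push the new bytes to disk
    return missing


if __name__ == "__main__":
    # Hypothetical path; adjust to the actual data directory.
    print(pad_with_zeros("pg_subtrans/0521"))
```

Zeroed pages only make the file readable again; they say nothing about whether 
the transaction status they stand in for is correct, which is part of why I 
ended up doing the dump and reload.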

I've saved the data in case there are pointed questions about its contents.

-sc

--
Sean Chittenden
s...@chittenden.org



-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
