(Adding Robert in CC.) On Thu, Jan 26, 2017 at 4:34 AM, Wang Hao <whbe...@gmail.com> wrote: > An unlogged table has an initialization fork. The initialization fork does > not have an BM_PERMANENT flag when get a buffer. > In checkpoint (not shutdown or end of recovery), it will not write to disk. > after a crash recovery, the page of initialization fork will not correctly, > then make the main fork not correctly too.
For init forks the flush need absolutely to happen, so that's really not good. We ought to fix BufferAlloc() appropriately here. > Here is an example for GIN index. > > create unlogged table gin_test_tbl(i int4[]); > create index gin_test_idx on gin_test_tbl using gin (i); > checkpoint; > > kill all the postgres process, and restart again. > > vacuum gin_test_tbl; -- crash. > > It seems have same problem in BRIN, GIN, GiST and HASH index which using > buffer for meta page initialize in ambuildempty function. Yeah, other index AMs deal directly with the sync of the page, that's why there is no issue for them. So the patch attached fixes the problem by changing BufferAlloc() in such a way that initialization forks are permanently written to disk, which is what you are suggesting. As a simple fix for back-branches that's enough, though on HEAD I think that we should really rework the empty() routines so as the write goes through shared buffers first, that seems more solid than relying on the sgmr routines to do this work. Robert, what do you think? -- Michael
unlogged-flush-fix.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers