I wrote:
>> I poked around in the dump for a while.  I still can't reproduce the
>> failure from a standing start.

I've had a bit of success reproducing the failure from scratch.
First load the SQL dump that Tatsuo previously supplied into a UTF8,
C-locale database; then generate a data-only dump of it, along the
lines of

pg_dump -a ... >ishii-data.sql
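
For reference, the database creation and initial load just before that
dump might look about like this (the database name and the name of
Tatsuo's dump file are placeholders I made up, and I'm only guessing at
reasonable createdb options):

createdb -T template0 -E UTF8 --lc-collate=C --lc-ctype=C ishii_test
psql -f tatsuo-dump.sql ishii_test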

Then run this psql script in the database (the function is designed to
exercise every entry in the GIN index):

create table msgstats as
  select * from ts_stat('select body_index from msginfo');

create language plpgsql;
create function testit () returns int as $$
declare r record;
        ifail int := 0;
begin
  for r in select * from msgstats loop
    begin
      perform count(*) from msginfo where plainto_tsquery(r.word) @@ body_index;
    exception when others then
      raise notice 'failed for "%": %', r.word, sqlerrm;
      ifail := ifail + 1;
    end;
  end loop;
  return ifail;
end $$ language plpgsql;

delete from msginfo;
truncate msg_folderinfo;
\i ishii-data.sql
reset client_min_messages;  -- undo the SET at the top of the pg_dump output

select testit();

The results are not very stable, but I fairly often get "tuple offset out
of range" errors, and my last try resulted in an Assert failure in
autovacuum.  What I think is happening is that there's some inadequately
locked page-changing logic in there, and if you can get autovacuum to
happen on msginfo concurrently with the data reload step, it'll mess up.
Given that we've not seen reports of this on 8.3, I'll bet lunch that
the bug is somewhere in the "fast insert" code.  I haven't found it
yet, though.
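
To raise the odds of hitting that collision, one recipe that seems
worth trying (just a guess, not something I've verified) is to make
autovacuum visit msginfo as aggressively as possible before the reload
step, e.g.

alter table msginfo set (autovacuum_vacuum_threshold = 1,
                         autovacuum_vacuum_scale_factor = 0.0);

together with a small autovacuum_naptime in postgresql.conf; or simply
issue a manual "VACUUM msginfo;" from a second session while the
\i ishii-data.sql step is running, which should exercise much the same
vacuum code path as autovacuum does.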

BTW, I also did more analysis of Tatsuo's binary dump using this
function.  Of some 38000 distinct lexemes in the body_index column,
about 1500 trigger an error in testit(); the errors are either
"tuple offset out of range" or something like "could not open segment 1
of relation base/16384/16497 (target block 955254869): No such file or
directory".  There are about 440 distinct messages that show up,
suggesting that approximately that many pages of the index contain
corrupted TIDs.  Manual examination of the index is consistent with
this, and also indicates that all of the corrupt pages are in the second
half of the table.  That's not too hard to square with the
concurrent-update theory --- autovac might have already done about half
the index before a large data load started.  However, it means that
corruption must be *very* probable when data insertion happens
concurrently with a vacuum; otherwise not such a large number of pages
would have gotten messed up.
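
For anyone wanting to repeat that tally: one way to do it (a sketch,
not necessarily exactly what I ran) is a variant of testit() that logs
the failures into a table instead of just raising notices, so the
distinct error texts can be counted afterwards:

create table testit_errors (word text, errmsg text);

create or replace function testit_log() returns int as $$
declare r record;
        ifail int := 0;
begin
  for r in select * from msgstats loop
    begin
      perform count(*) from msginfo
        where plainto_tsquery(r.word) @@ body_index;
    exception when others then
      insert into testit_errors values (r.word, sqlerrm);
      ifail := ifail + 1;
    end;
  end loop;
  return ifail;
end $$ language plpgsql;

select testit_log();
select count(*) as failing_lexemes,
       count(distinct errmsg) as distinct_errors
from testit_errors;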

                        regards, tom lane
