Thomas Munro <thomas.mu...@enterprisedb.com> writes:
> This type of violent shutdown seems to be associated with occasional
> corruption of .gcda files (the files output by GCC coverage builds).
> The symptoms are that if you use --enable-coverage and make
> check-world you'll very occasionally get a spurious TAP test failure
> like this:

> #   Failed test 'pg_ctl start: no stderr'
> #   at
> /home/travis/build/postgresql-cfbot/postgresql/src/bin/pg_ctl/../../../src/test/perl/TestLib.pm
> line 301.
> #          got:
> 'profiling:/home/travis/build/postgresql-cfbot/postgresql/src/backend/nodes/copyfuncs.gcda:Merge
> mismatch for function 94
> # '
> #     expected: ''

> I'm not sure of the exact mechanism though.  GCC supplies a function
> __gcov_flush() that normally runs at exit or execve, so if you're
> killed without reaching those you don't get any .gcda data.  Perhaps
> we are in exit (or fork/exec) partway through writing out coverage
> data in __gcov_flush(), and at that moment we are killed.  Then a
> subsequent run of instrumented code will find the half-written file
> and print the "Merge mismatch" message.

On a slow/loaded machine, perhaps it could be that the postmaster loses
patience and SIGKILLs a backend that's still writing its .gcda data?
If so, maybe we could make SIGKILL_CHILDREN_AFTER_SECS longer in
coverage builds?  Or bite the bullet and make it configurable ...

			regards, tom lane
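For illustration only, here is a minimal sketch of the failure mode hypothesized above: a process that is SIGKILLed partway through writing a file leaves a truncated file behind, because SIGKILL cannot be caught and no cleanup code runs. This is not GCC's gcov code or anything from the PostgreSQL tree; the file name fake.gcda, the HEAD/TAIL payload, and the helper names are all made up for the demo, and it assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Hypothetical child: write the first half of a record, announce it, then
# stall before the second half (standing in for a slow coverage flush).
CHILD = r"""
import sys, time
with open(sys.argv[1], "wb") as f:
    f.write(b"HEAD")
    f.flush()                  # first half is now visible to other processes
    print("half", flush=True)  # tell the parent we are mid-write
    time.sleep(60)             # pretend the machine is slow/loaded
    f.write(b"TAIL")           # never reached in this demo
"""


def kill_mid_write(path):
    """Start the child, SIGKILL it mid-write, return (returncode, file bytes)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, path], stdout=subprocess.PIPE
    )
    proc.stdout.readline()            # block until the child reports "half"
    proc.send_signal(signal.SIGKILL)  # uncatchable: no exit handlers run
    proc.wait()
    proc.stdout.close()
    with open(path, "rb") as f:
        return proc.returncode, f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        return kill_mid_write(os.path.join(d, "fake.gcda"))


if __name__ == "__main__":
    rc, data = demo()
    print("child return code:", rc)
    print("file contents after kill:", data)
```

The parent waits for the child's "half" message before sending the signal, so the truncation is deterministic here; in the real case the timing would depend on machine load, which would fit the failure being only occasional.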