Hello,

I've a slightly modified version of test_shm_mq, that I changed to include
a shared fileset.  The motivation to do that came because I hit an
assertion failure with PG15 while doing some development work on BDR and I
suspected it to be a PG15 bug.

The stack trace looks as below:

(lldb) bt
* thread #1
  * frame #0: 0x00007ff8187b100e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007ff8187e71ff libsystem_pthread.dylib`pthread_kill + 263
    frame #2: 0x00007ff818732d24 libsystem_c.dylib`abort + 123
    frame #3: 0x000000010fce1bab
postgres`ExceptionalCondition(conditionName="pgstat_is_initialized &&
!pgstat_is_shutdown", errorType="FailedAssertion", fileName="pgstat.c",
lineNumber=1227) at assert.c:69:2
    frame #4: 0x000000010fb06412 postgres`pgstat_assert_is_up at
pgstat.c:1227:2
    frame #5: 0x000000010fb0d2c7
postgres`pgstat_get_entry_ref(kind=PGSTAT_KIND_DATABASE, dboid=0, objoid=0,
create=true, created_entry=0x0000000000000000) at pgstat_shmem.c:406:2
    frame #6: 0x000000010fb07579
postgres`pgstat_prep_pending_entry(kind=PGSTAT_KIND_DATABASE, dboid=0,
objoid=0, created_entry=0x0000000000000000) at pgstat.c:1068:14
    frame #7: 0x000000010fb09cce
postgres`pgstat_prep_database_pending(dboid=0) at pgstat_database.c:327:14
    frame #8: 0x000000010fb09dff
postgres`pgstat_report_tempfile(filesize=0) at pgstat_database.c:179:10
    frame #9: 0x000000010fa8dbe9
postgres`ReportTemporaryFileUsage(path="base/pgsql_tmp/pgsql_tmp17312.0.fileset/test_mq_sharefile.0",
size=0) at fd.c:1521:2
    frame #10: 0x000000010fa8db3c
postgres`PathNameDeleteTemporaryFile(path="base/pgsql_tmp/pgsql_tmp17312.0.fileset/test_mq_sharefile.0",
error_on_failure=false) at fd.c:1945:3
    frame #11: 0x000000010fa8d3a8
postgres`unlink_if_exists_fname(fname="base/pgsql_tmp/pgsql_tmp17312.0.fileset/test_mq_sharefile.0",
isdir=false, elevel=15) at fd.c:3674:3
    frame #12: 0x000000010fa8d270
postgres`walkdir(path="base/pgsql_tmp/pgsql_tmp17312.0.fileset",
action=(postgres`unlink_if_exists_fname at fd.c:3663),
process_symlinks=false, elevel=15) at fd.c:3573:5
    frame #13: 0x000000010fa8d0e2
postgres`PathNameDeleteTemporaryDir(dirname="base/pgsql_tmp/pgsql_tmp17312.0.fileset")
at fd.c:1689:2
    frame #14: 0x000000010fa91ac1
postgres`FileSetDeleteAll(fileset=0x0000000119240870) at fileset.c:165:3
    frame #15: 0x000000010fa92b08
postgres`SharedFileSetOnDetach(segment=0x00007f93ff00a7c0,
datum=4716759152) at sharedfileset.c:119:3
    frame #16: 0x000000010fa96b00
postgres`dsm_detach(seg=0x00007f93ff00a7c0) at dsm.c:801:3
    frame #17: 0x000000010fa96f51 postgres`dsm_backend_shutdown at
dsm.c:738:3
    frame #18: 0x000000010fa99402 postgres`shmem_exit(code=1) at ipc.c:259:2
    frame #19: 0x000000010fa99227 postgres`proc_exit_prepare(code=1) at
ipc.c:194:2
    frame #20: 0x000000010fa99133 postgres`proc_exit(code=1) at ipc.c:107:2
    frame #21: 0x000000010fce318c postgres`errfinish(filename="postgres.c",
lineno=3204, funcname="ProcessInterrupts") at elog.c:661:3
    frame #22: 0x000000010fad7c51 postgres`ProcessInterrupts at
postgres.c:3201:4
    frame #23: 0x000000011924d85b
test_shm_mq.so`test_shm_mq_main(main_arg=1155036180) at worker.c:159:2
    frame #24: 0x000000010f9da11e postgres`StartBackgroundWorker at
bgworker.c:858:2
    frame #25: 0x000000010f9e80b4
postgres`do_start_bgworker(rw=0x00007f93ef904080) at postmaster.c:5823:4
    frame #26: 0x000000010f9e2524 postgres`maybe_start_bgworkers at
postmaster.c:6047:9
    frame #27: 0x000000010f9e0e63
postgres`sigusr1_handler(postgres_signal_arg=30) at postmaster.c:5204:3
    frame #28: 0x00007ff8187fcdfd libsystem_platform.dylib`_sigtramp + 29
    frame #29: 0x00007ff8187b2d5b libsystem_kernel.dylib`__select + 11
    frame #30: 0x000000010f9e268d postgres`ServerLoop at
postmaster.c:1770:13
    frame #31: 0x000000010f9e0157 postgres`PostmasterMain(argc=8,
argv=0x0000600002f30190) at postmaster.c:1478:11
    frame #32: 0x000000010f8bc930 postgres`main(argc=8,
argv=0x0000600002f30190) at main.c:202:3
    frame #33: 0x000000011f7d651e dyld`start + 462

I notice that pgstat_shutdown_hook() is registered as a before-shmem-exit
callback. The callback is responsible for detaching from the pgstat shared
memory segment. But looks like other parts of the system still expect it to
be available during later stages of proc exit.

It's not clear to me if pgstat shutdown should happen later or code that
gets executed later in the cycle should not try to use pgstat. It's also
entirely possible that my usage of SharedFileSet is completely wrong. If
that's the case, please let me know the mistake in the usage.

Patch modifying the test case is attached. In order to reproduce the
problem quickly,  I added a CHECK_FOR_INTERRUPTS() in the test, but I don't
see why that would be a bad coding practice.

Thanks,
Pavan

Attachment: crash_pgstat.patch
Description: Binary data

Reply via email to