Reinitialize stack base after fork (for the benefit of rr)?
Hi, I've found rr [1] very useful to debug issues in postgres. The ability to hit a bug, and then e.g. identify a pointer with problematic contents, set a watchpoint on its contents, and reverse-continue is extremely powerful. Unfortunately, when running postgres, it currently occasionally triggers spurious stack-too-deep errors. That turns out to be because it has to use an alternative stack in some corner cases (IIUC when a signal arrives while already in a signal handler). That corner case can unfortunately be hit from within postmaster, and at least can lead to sigusr1_handler() being called with an alternative stack set. Unfortunately that means that processes that postmaster fork()s while using that alternative stack will continue their live using that alternative stack. Which then subsequently means that our stack depth checks always trigger. I've not seen this trigger for normal backends (which makes sense, they're not started from a signal handler), but for bgworkers. In particular parallel workers are prone to hit the issue. I've locally fixed the issue by computing the stack base address anew for postmaster children. Currently in InitPostmasterChild(). I'd like to get that change upstream. The rr hackers have fixed a number of other issues that could be hit with postgres, but they couldn't see a good way to address the potential for a different signal stack in this edge case. And it doesn't seem crazy to me to compute the stack base again in postmaster children: It's cheap enough and it's extremely unlikely that postmaster uses up a crazy amount of stack. I also don't find it too crazy to guard against forks in signal handlers leading to a different stack base address. It's a pretty odd thing to do. Tom, while imo not a fix of the right magnitude here: Are you planning / hoping to work again on your postmaster latch patch? I think it'd be really good if we could restructure the postmaster code to do far far less in signal handlers. And the postmaster latch patch seems like a big step in that direction. I think we mostly dropped it due to the release schedule last time round? Greetings, Andres Freund [1] https://github.com/mozilla/rr/
Re: Reinitialize stack base after fork (for the benefit of rr)?
Andres Freund writes: > I've locally fixed the issue by computing the stack base address anew > for postmaster children. Currently in InitPostmasterChild(). > I'd like to get that change upstream. The rr hackers have fixed a number > of other issues that could be hit with postgres, but they couldn't see a > good way to address the potential for a different signal stack in this > edge case. And it doesn't seem crazy to me to compute the stack base > again in postmaster children: It's cheap enough and it's extremely > unlikely that postmaster uses up a crazy amount of stack. Seems reasonable. I think we'd probably also need this in the EXEC_BACKEND case, in case ASLR puts the child process's stack somewhere else. Can you merge your concern with that one? On the other hand, it might be better to not launch children from the signal handler, because I don't think we should assume the alternate stack can grow as large as the main one. Does POSIX talk about this? > Tom, while imo not a fix of the right magnitude here: Are you planning / > hoping to work again on your postmaster latch patch? Um ... -ESWAPPEDOUT. What are you thinking of? regards, tom lane
Re: Reinitialize stack base after fork (for the benefit of rr)?
Hi, On 2020-03-27 14:34:56 -0400, Tom Lane wrote: > Andres Freund writes: > > I've locally fixed the issue by computing the stack base address anew > > for postmaster children. Currently in InitPostmasterChild(). > > > I'd like to get that change upstream. The rr hackers have fixed a number > > of other issues that could be hit with postgres, but they couldn't see a > > good way to address the potential for a different signal stack in this > > edge case. And it doesn't seem crazy to me to compute the stack base > > again in postmaster children: It's cheap enough and it's extremely > > unlikely that postmaster uses up a crazy amount of stack. > > Seems reasonable. I think we'd probably also need this in the > EXEC_BACKEND case, in case ASLR puts the child process's stack > somewhere else. Can you merge your concern with that one? We currently already do that there, in SubPostmasterMain(). If we add a set_stack_base() to InitPostmasterChild() we can remove it from there, though. > On the other hand, it might be better to not launch children from the > signal handler, because I don't think we should assume the alternate > stack can grow as large as the main one. Does POSIX talk about this? I strongly agree that it'd be better - independent of what we conclude re a localized fix for rr. I think I looked for what specs around this a while ago and couldn't find much. fork() is listed as signal safe (but there was discussion about removing it - going nowhere I think). > > Tom, while imo not a fix of the right magnitude here: Are you planning / > > hoping to work again on your postmaster latch patch? > > Um ... -ESWAPPEDOUT. What are you thinking of? https://postgr.es/m/18193.1492793404%40sss.pgh.pa.us That doesn't convert all that much of postmaster to latches, but once the basic infrastructure is in place, it doesn't seem too hard to convert more. In particular sigusr1_handler, which is the relevant one here, looks fairly easy. SIGHUP_handler(), reaper() shouldn't be hard either. Whether it could make sense to convert pmdie for SIGQUIT is less clear to me, but also seems less clearly necessary: We don't fork, and shutting down anyway. Greetings, Andres Freund
Re: Reinitialize stack base after fork (for the benefit of rr)?
Andres Freund writes: > On 2020-03-27 14:34:56 -0400, Tom Lane wrote: >> Andres Freund writes: >>> Tom, while imo not a fix of the right magnitude here: Are you planning / >>> hoping to work again on your postmaster latch patch? >> Um ... -ESWAPPEDOUT. What are you thinking of? > https://postgr.es/m/18193.1492793404%40sss.pgh.pa.us Oh, I thought we'd dropped that line of thinking in favor of trying to not do work in the postmaster signal handlers (i.e. I thought *you* were pushing this forward, not me). regards, tom lane
Re: Reinitialize stack base after fork (for the benefit of rr)?
Hi, On 2020-03-27 14:59:56 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2020-03-27 14:34:56 -0400, Tom Lane wrote: > >> Andres Freund writes: > >>> Tom, while imo not a fix of the right magnitude here: Are you planning / > >>> hoping to work again on your postmaster latch patch? > > >> Um ... -ESWAPPEDOUT. What are you thinking of? > > > https://postgr.es/m/18193.1492793404%40sss.pgh.pa.us > > Oh, I thought we'd dropped that line of thinking in favor of trying > to not do work in the postmaster signal handlers (i.e. I thought *you* > were pushing this forward, not me). Hm - the way I imagine that to work is that we'd do a SetLatch() in the various signal handlers and that the main loop would then react to got_sigchld type variables. But for that we'd need latch support in postmaster - which I think is pretty exactly what your patch in the above message does? Of course there'd need to be several subsequent patches to move work out of signal handlers into the main loop. Were you thinking of somehow doing that without using a latch? Greetings, Andres Freund
Re: Reinitialize stack base after fork (for the benefit of rr)?
On Fri, Mar 27, 2020 at 11:22 AM Andres Freund wrote: > I've found rr [1] very useful to debug issues in postgres. The ability > to hit a bug, and then e.g. identify a pointer with problematic > contents, set a watchpoint on its contents, and reverse-continue is > extremely powerful. I agree that rr is very useful. It would be great if we had a totally smooth workflow for debugging using rr. -- Peter Geoghegan
Re: Reinitialize stack base after fork (for the benefit of rr)?
On 2020-04-04 21:02:56 -0700, Peter Geoghegan wrote: > On Fri, Mar 27, 2020 at 11:22 AM Andres Freund wrote: > > I've found rr [1] very useful to debug issues in postgres. The ability > > to hit a bug, and then e.g. identify a pointer with problematic > > contents, set a watchpoint on its contents, and reverse-continue is > > extremely powerful. > > I agree that rr is very useful. It would be great if we had a totally > smooth workflow for debugging using rr. I just pushed that.
Re: Reinitialize stack base after fork (for the benefit of rr)?
On Sun, Apr 5, 2020 at 6:54 PM Andres Freund wrote: > I just pushed that. Great! I have found that it's useful to use rr to debug Postgres by following certain recipes. I'll share some of the details now, in case anybody else wants to start using rr and isn't sure where to start. I have a script that records a postgres session using rr with these options: rr record -M /code/postgresql/$BRANCH/install/bin/postgres \ -D /code/postgresql/$BRANCH/data \ --log_line_prefix="%m %p " \ --autovacuum=off \ --effective_cache_size=1GB \ --random_page_cost=4.0 \ --work_mem=4MB \ --maintenance_work_mem=64MB \ --fsync=off \ --log_statement=all \ --log_min_messages=DEBUG5 \ --max_connections=50 \ --shared_buffers=32MB Most of these settings were taken from a similar script that I use to run Valgrind, so the particulars may not matter much -- though it's useful to make the server logs as verbose as possible (you'll see why in a minute). I find it quite practical to run "make installcheck" against the server, recording the entire execution. I find that it's not that much slower than just running the tests against a regular debug build of Postgres. It's still much faster than Valgrind, for example. (Replaying the recording seems to be where having a high end machine helps a lot.) Once the tests are done, I stop Postgres in the usual way (Ctrl + C). The recording is saved to the $HOME/.local/share/rr/ directory on my Linux distro -- rr creates a directory for each distinct recording in this parent directory. rr also maintains a symlink (latest-trace) that points to the latest recording directory, which I rely on most of the time when replaying a recording (it's the default). I am careful to not leave too many recordings around, since they're large enough that that could become a concern. The record/Postgres terminal has output that looks like this: [rr 1786705 1241867]2020-04-04 21:55:05.018 PDT 1786705 DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 63992/1/2 [rr 1786705 1241898]2020-04-04 21:55:05.019 PDT 1786705 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0 [rr 1786705 1241902]2020-04-04 21:55:05.019 PDT 1786705 LOG: statement: CREATE TYPE test_type_empty AS (); [rr 1786705 1241906]2020-04-04 21:55:05.020 PDT 1786705 DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 63993/1/1 [rr 1786705 1241936]2020-04-04 21:55:05.020 PDT 1786705 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0 [rr 1786705 1241940]2020-04-04 21:55:05.020 PDT 1786705 LOG: statement: DROP TYPE test_type_empty; [rr 1786705 1241944]2020-04-04 21:55:05.021 PDT 1786705 DEBUG: drop auto-cascades to composite type test_type_empty [rr 1786705 1241948]2020-04-04 21:55:05.021 PDT 1786705 DEBUG: drop auto-cascades to type test_type_empty[] [rr 1786705 1241952]2020-04-04 21:55:05.021 PDT 1786705 DEBUG: MultiXact: setting OldestMember[2] = 9 [rr 1786705 1241956]2020-04-04 21:55:05.021 PDT 1786705 DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 63994/1/3 The part of each log line in square brackets comes from rr (since we used -M when recording) -- the first number is a PID, the second an event number. I usually don't care about the PIDs, though, since the event number alone unambiguously identifies a particular "event" in a particular backend (rr recordings are single threaded, even when there are multiple threads or processes). Suppose I want to get to the "CREATE TYPE test_type_empty AS ();" query -- I can get to the end of the query by replaying the recording with this option: $ rr replay -M -g 1241902 Replaying the recording like this takes me to the point where the Postgres backend prints the log message at the end of executing the query I mentioned -- I get a familiar gdb debug server (rr implements a gdb backend). This isn't precisely the point of execution that interests me, but it's close enough. I can easily set a breakpoint to the precise function I'm interested in, and then "reverse-continue" to get there by going backwards. I can also find the point where a particular backend starts by using the fork option instead. So for the PID 1786705, that would look like: $ rr replay -M -f 1786705 (Don't try to use the similar -p option, since that starts a debug server when the pid has been exec'd.) rr really shines when debugging things like tap tests, where there is complex scaffolding that may run multiple Postgres servers. You can run an entire "rr record make check", without having to worry about how that scaffolding works. Once you have useful event numbers to work off of, it doesn't take too long to get an interactive debugging session in the backend of interest by applying the same techniques. Note that saving the output of a recording using standard tools like
Re: Reinitialize stack base after fork (for the benefit of rr)?
Hi, On 2020-04-05 20:35:50 -0700, Peter Geoghegan wrote: > I have found that it's useful to use rr to debug Postgres by following > certain recipes. I'll share some of the details now, in case anybody > else wants to start using rr and isn't sure where to start. Perhaps put it on a wiki page? > I have a script that records a postgres session using rr with these options: > > rr record -M /code/postgresql/$BRANCH/install/bin/postgres \ > -D /code/postgresql/$BRANCH/data \ > --log_line_prefix="%m %p " \ > --autovacuum=off \ Were you doing this because of occasional failures in autovacuum workers? If so, that shouldn't be necessary after the stack base change (previously workers IIRC also could start with the wrong stack base - but didn't end up checking stack depth except for expression indexes). Greetings, Andres Freund
Re: Reinitialize stack base after fork (for the benefit of rr)?
On Sun, Apr 5, 2020 at 8:56 PM Andres Freund wrote: > Perhaps put it on a wiki page? I added a new major section to the "getting a stack trace" wiki page: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Recording_Postgres_using_rr_Record_and_Replay_Framework Feel free to add to and edit this section yourself. > Were you doing this because of occasional failures in autovacuum > workers? If so, that shouldn't be necessary after the stack base change > (previously workers IIRC also could start with the wrong stack base - > but didn't end up checking stack depth except for expression indexes). No, just a personal preference for things like this. -- Peter Geoghegan