Hi, On 2026-02-19 11:25:43 +0200, Heikki Linnakangas wrote: > On 19/02/2026 09:21, Kirill Reshke wrote: > > Hi! Thank you for your input. > > Stupid question from me: in singleuser mode, how should this work? We > > still should not ereport inside signal handlers, are we? > > If singleuser mode is somehow safe with die & ereport then... I don't > > understand how. > > It's not safe to ereport() from die() in single-user mode either.
Well, the consequences of doing something stupid in single user mode are just much less severe. There is the potential that doing invalid stuff in a signal handler in a normal backend could lead to vulnerabilities that a user could exploit, particularly if TLS is used, but that's not a threat in single user mode. That said, it'd obviously be much better if we stopped doing that stuff. > We've just accepted it as the lesser evil: > > /* > > * If we're in single user mode, we want to quit immediately - we can't > > * rely on interrupts as they wouldn't work when stdin/stdout is a file. > > * Rather ugly, but it's unlikely to be worthwhile to invest much more > > * effort just for the benefit of single user mode. > > */ > > if (DoingCommandRead && whereToSendOutput != DestRemote) > > ProcessTerminateInterrupt(); > > I don't quite understand that though. I understand that a read/write on a > file might be uninterruptible. But so what? It presumably won't take very > long, so it's fine to wait for the read/write to finish. It can take a while, in some cases, even leaving NFS and such aside, think of a redirect to a file where the kernel suddenly blocks you dirtying more data. That can be a while. Of course, whether we actually care about that is another question. Right now the main issue would be that we don't actually do anything to get the interrupt until further input has arrived. Think of the single user code being in InteractiveBackend()->getc(). > > > Could you use WaitLatchOrSocket to sleep on stdin? > > > > Unfortunately I don't think so. poll() etc only works properly on > > network handles, pipes etc - but stdin can be a file :(. And I think > > what exactly happens if it's a file fd isn't super well defined. On > > linux the file is always marked as ready, which would probably actually > > work... > > I think that's actually wrong. POSIX says > [https://pubs.opengroup.org/onlinepubs/9799919799/functions/poll.html]: > > > The poll() and ppoll() functions shall support regular files, > > terminal and pseudo-terminal devices, FIFOs, pipes, and sockets. The > > behavior of poll() and ppoll() on elements of fds that refer to > > other types of file is unspecified. > > > > Regular files shall always poll TRUE for reading and writing. > > So that is pretty well-defined, and we could use poll() on stdin too. Pretty sure poll on pipes etc doesn't work on windows :(. > > > quickdie() obviously does does reach the ereport(). I think it's a bad > > > idea, > > > with a bad justification: For one, libpq makes up the same error reason > > > if the > > > client just vanishes. For another, because it just uses > > > WARNING[_CLIENT_ONLY], it'll not even reach clients if they use a higher > > > client_min_messages. > > > > > > Just hoping that the client communication, including openssl, is in a > > > state > > > that makes it somewhat safe to send messages from a signal handler, is > > > bonkers, IMNSHO. Yes, we have the "safety" mechanism of postmaster > > > SIGKILLing > > > processes, but that only protects against self deadlocks, not against > > > corrupted datastructures etc. > > PqCommBusy gives some protection from messing with openssl state. But yes, > it's generally unsafe. Not sure how far that even helps us, as PqCommBusy isn't volatile, the compiler is free to keep it in a register if it can prove that no code is called that could use PqCommBusy. C runtime functions are often marked to not access user variables outside of arguments... > > > It's one thing to ereport() in a signal handler during a crash triggered > > > quickdie(), the shit already has hit the fan after all, but doing it as > > > part > > > of an intended immediate shutdown is a seriously bad idea. > > I don't see much difference between a crash and an "intended immediate > shutdown". In both cases, you don't want to corrupt datastructures etc. more > than you already have, but on the other hand the system is going down > anyway. An immediate shutdown may be executed in completely normal operations, whereas a crash shutdown should only happen after shit already has hit the fan (and thus there are already other issues). I think there's some difference. > > So, after all, we need to remove ereport from quick die completely, no > > backup plan? > > This will make whole system less verbose about its shutdown reasons... > > Not sure how vital that is > > What's the proposed alternative to ereporting() from quickdie? If we just > remove the ereport(), yeah, clients will miss out on the messages. Are we OK > with that? I don't think the information that we do print is actually particularly useful right now. What's the client going to do differently based on the error messages? The "unexpected SIGQUIT" one really is only relevant for PG developers. I guess there's some possibility of a client benefiting from the difference between a crash and an immediate shutdown. Hampered by it already not being reliable that we can send this information to the client. Without TLS in the game, this would be a lot easier. We could do a non-blocking write to the socket with some non-translated message. But with TLS... > Another alternative is to try to handle SIGQUIT more like SIGTERM, and delay > the exiting until next CHECK_FOR_INTERRUPTS(). Maybe with a short timer to > just SIGKILL if it the interrupt isn't processed quickly. It increases the > risk that the process won't exit as quickly as we'd like, but maybe that's > the right tradeoff. I'm doubtful that that's the right thing, we really don't want to have backends continue any longer than possible if there was a stuff-is-inconsistent style PANIC. They could screw up things further. I guess I could get more on board doing so for an immediate shutdown, just to give backends a chance to actually send a connection termination. Greetings, Andres Freund
