On 11/18/25 15:06, David Geier wrote:
> Hi Tomas!
>
> On 15.11.2025 00:00, Tomas Vondra wrote:
>> On 11/14/25 19:20, David Geier wrote:
>>>
>>> Ooops. That can likely be fixed.
>>>
>
> I'll take a look at why this happens the next days, if you think this
> approach generally has a chance to be accepted. See below.
>
>>>> And I very much doubt inventing a new ad hoc way to signal workers is
>>>> the right solution (even if there wasn't the InstrEndLoop issue).
>>>>
>>
>> Good point, I completely forgot about (2).
>>
>
> In that light, could you take another look at my patch?
>
> Some clarifications: I'm not inventing a new way to signal workers but
> I'm using the existing SendProcSignal() machinery to inform parallel
> workers to stop. I just added another signal PROCSIG_PARALLEL_STOP and
> the corresponding functions to handle it from ProcessInterrupts().
>

Sure, but I still don't quite see the need to do all this.

> What is "new" is how I'm stopping the parallel workers once they've
> received the stop signal: the challenge is that the workers need to
> actually jump out of whatever they are doing - even if they aren't
> producing any rows at this point; but e.g. are scanning a table
> somewhere deep down in ExecScan() / SeqNext().
>
> The only way I can see to make this work, without a huge patch that adds
> new code all over the place, is to instruct process termination from
> inside ProcessInterrupts(). I'm siglongjmp-ing out of the ExecutorRun()
> function so that all parallel worker cleanup code still runs as if the
> worker processed to completion. I've tried to end the process without
> but that caused all sorts of fallout (instrumentation not collected,
> postmaster thinking the process stopped unexpectedly, ...).
>
> Instead of siglongjmp-ing we could maybe call some parallel worker
> shutdown function but that would require access to the parallel worker
> state variables, which are currently not globally accessible.
>

But why? The leader and workers already share state - the parallel scan
state (for the parallel-aware scan on the "driving" table). Why couldn't
the leader set a flag in the scan, and force it to end in workers? That
should, AFAICS, lead to the workers terminating shortly after.

All the code / handling is already in place. It will need a bit of new
code in the parallel scans, but not much I think.

regards

--
Tomas Vondra
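
To illustrate the flag-in-the-scan idea above, here is a rough standalone
sketch. It is not PostgreSQL code: DemoParallelScan, the demo_* functions
and INVALID_BLOCK are all hypothetical names, and C11 atomics stand in
for whatever the real shared scan state would use. The only point is
that the routine handing out the next block checks a shared flag first,
so a worker sees an ordinary end-of-scan and leaves through its normal
path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_BLOCK UINT32_MAX    /* hypothetical "no more blocks" marker */

/* Hypothetical stand-in for the shared parallel scan state. */
typedef struct DemoParallelScan
{
    uint32_t             nblocks;         /* blocks in the scanned table */
    atomic_uint_fast32_t next_block;      /* next block to hand out */
    atomic_bool          stop_requested;  /* set by the leader */
} DemoParallelScan;

void
demo_scan_init(DemoParallelScan *scan, uint32_t nblocks)
{
    assert(scan != NULL);
    scan->nblocks = nblocks;
    atomic_init(&scan->next_block, 0);
    atomic_init(&scan->stop_requested, false);
}

/* Leader side: ask every participant to finish its scan early. */
void
demo_scan_request_stop(DemoParallelScan *scan)
{
    atomic_store(&scan->stop_requested, true);
}

/*
 * Worker side: claim the next block, or report end-of-scan if the table
 * is exhausted or the leader has asked for a stop.
 */
uint32_t
demo_scan_next_block(DemoParallelScan *scan)
{
    uint_fast32_t blk;

    if (atomic_load(&scan->stop_requested))
        return INVALID_BLOCK;

    blk = atomic_fetch_add(&scan->next_block, 1);
    if (blk >= scan->nblocks)
        return INVALID_BLOCK;

    return (uint32_t) blk;
}

/* Scan up to max_blocks blocks; returns how many were actually scanned. */
uint32_t
demo_worker_scan(DemoParallelScan *scan, uint32_t max_blocks)
{
    uint32_t scanned = 0;

    while (scanned < max_blocks)
    {
        if (demo_scan_next_block(scan) == INVALID_BLOCK)
            break;              /* regular end-of-scan path, no longjmp */
        scanned++;              /* a real worker would read the block here */
    }
    return scanned;
}
```

In this sketch a worker only notices the flag when it asks for another
block, so it finishes the block it is currently on before stopping. The
worker code itself needs no extra exit path, because a stop request
looks the same to it as running out of blocks.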
