On 29 September 2015 at 20:52, Shulgin, Oleksandr <oleksandr.shul...@zalando.de> wrote:
> On Tue, Sep 29, 2015 at 8:34 PM, Simon Riggs <si...@2ndquadrant.com>
> wrote:
>
>> On 29 September 2015 at 12:52, Shulgin, Oleksandr <oleksandr.shul...@zalando.de> wrote:
>>
>>> Hitting a process with a signal and hoping it will produce a meaningful
>>> response in all circumstances without disrupting its current task was way
>>> too naive.
>>
>> Hmm, I would have to disagree, sorry. For me the problem was dynamically
>> allocating everything at the time the signal is received and getting into
>> problems when that caused errors.
>
> What I mean is that we need to move the actual EXPLAIN run out of
> ProcessInterrupts(). It can be still fine to trigger the communication
> with a signal.

Good

>> * INIT - Allocate N areas of memory for use by queries, which can be
>> expanded/contracted as needed. Keep a freelist of structures.
>> * OBSERVER - When requested, gain exclusive access to a diagnostic area,
>> then allocate the designated process to that area, then send a signal
>> * QUERY - When signal received dump an EXPLAIN ANALYZE to the allocated
>> diagnostic area, (set flag to show complete, set latch on observer)
>> * OBSERVER - process data in diagnostic area and then release area for
>> use by next observation
>>
>> If the EXPLAIN ANALYZE doesn't fit into the diagnostic chunk, LOG it as a
>> problem and copy data only up to the size defined. Any other ERRORs that
>> are caused by this process cause it to fail normally.
>
> Do you envision problems if we do this with a newly allocated DSM every
> time instead of pre-allocated area? This will have to revert the workflow,
> because only the QUERY knows the required segment size:

That's too fiddly; we need to keep it simple by using just fixed sizes.
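
As a rough illustration of the fixed-size scheme in the INIT / OBSERVER / QUERY steps quoted above, here is a toy model in Python. It is not backend code and is not taken from any patch. Every name in it (DiagArea, DiagPool, AREA_SIZE, acquire, publish, consume) is hypothetical. Signals, latches and locking are not modelled, and a boolean flag stands in for both "set latch on observer" and the LOG message on truncation.

```python
from dataclasses import dataclass
from typing import Optional

AREA_SIZE = 64  # hypothetical fixed size of one diagnostic area, in bytes


@dataclass
class DiagArea:
    """One fixed-size diagnostic slot (illustrative only)."""
    target_pid: Optional[int] = None
    data: bytes = b""
    truncated: bool = False  # stands in for "LOG it as a problem"
    complete: bool = False   # stands in for "set flag, set latch on observer"


class DiagPool:
    def __init__(self, n_areas: int) -> None:
        # INIT: allocate all areas up front and keep a freelist of slot indexes.
        self.areas = [DiagArea() for _ in range(n_areas)]
        self.freelist = list(range(n_areas))

    def acquire(self, target_pid: int) -> Optional[int]:
        # OBSERVER: take a free area and assign the target process to it.
        # A real backend would send the signal after this step.
        if not self.freelist:
            return None
        slot = self.freelist.pop()
        self.areas[slot] = DiagArea(target_pid=target_pid)
        return slot

    def publish(self, slot: int, plan_text: str) -> None:
        # QUERY: copy the plan into the area, only up to the fixed size.
        raw = plan_text.encode("utf-8")
        area = self.areas[slot]
        area.truncated = len(raw) > AREA_SIZE
        area.data = raw[:AREA_SIZE]
        area.complete = True

    def consume(self, slot: int) -> DiagArea:
        # OBSERVER: read the result, then release the area for the next observation.
        area = self.areas[slot]
        self.areas[slot] = DiagArea()
        self.freelist.append(slot)
        return area
```

Because nothing is allocated in publish() beyond the bounded copy, the step that would run in the signalled process cannot fail for lack of space; an oversized plan is simply cut at AREA_SIZE and flagged.
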

> OBSERVER - sends a signal and waits for its proc latch to be set
> QUERY - when signal is received allocates a DSM just big enough to fit the
> EXPLAIN plan, then locates the OBSERVER(s) and sets its latch (or their
> latches)
>
> The EXPLAIN plan should already be produced somewhere in the executor, to
> avoid calling into explain.c from ProcessInterrupts().
>
>> That allows the observer to be another backend, or it allows the query
>> process to perform self-observation based upon a timeout (e.g. >1 hour) or
>> a row limit (e.g. when an optimizer estimate is seen to be badly wrong).
>
> Do you think there is one single best place in the executor code where
> such a check could be added? I have very little idea about that.

Fairly simple. Main problem is knowing how to handle nested calls to the executor.

I'll look at the patch.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services