On Mon, Jul 6, 2015 at 10:48 AM, Fujii Masao <masao.fu...@gmail.com> wrote: > According to his patch, the wait events that he was thinking to add were: > > + typedef enum PgCondition > + { > + PGCOND_UNUSED = 0, /* unused */ > + > + /* 10000 - CPU */ > + PGCOND_CPU = 10000, /* generic cpu operations */ > + /* 11000 - CPU:PARSE */ > + PGCOND_CPU_PARSE = 11000, /* pg_parse_query */ > + PGCOND_CPU_PARSE_ANALYZE = 11100, /* parse_analyze */ > + /* 12000 - CPU:REWRITE */ > + PGCOND_CPU_REWRITE = 12000, /* pg_rewrite_query */ > + /* 13000 - CPU:PLAN */ > + PGCOND_CPU_PLAN = 13000, /* pg_plan_query */ > + /* 14000 - CPU:EXECUTE */ > + PGCOND_CPU_EXECUTE = 14000, /* PortalRun or > PortalRunMulti */ [ etc. ]
Sorry to say it, but I this design is a mess. Suppose we are executing a query, and during the query we execute the ILIKE operator, and within that we try to acquire a buffer content lock (say, to detoast some data). So at the outermost level our state is PGCOND_CPU_EXECUTE, and then within that we are in state PGCOND_CPU_ILIKE, and then within that we are in state PGCOND_LWLOCK_PAGE. When we exit each of the inner states, we've got to restore the proper outer state, or time will be mis-attribtued. Error handling has got to pop all of the items off the stack that were added since the PG_TRY() block started, and then push on a new state for error handling, which gets popped when the PG_TRY block finishes. Another problem is that some of these things are incredibly specific (like "running the ILIKE operator") and others are extremely general (like "executing the query"). Why does ILIKE get a code but +(int4,int4) does not? We need some less-arbitrary way of assigning codes than what's shown here. Now, that's not to say there are no good ideas here. For example, pg_stat_activity could expose a byte of state indicating which phase of query processing is current: parse / parse analysis / rewrite / plan / execute / none. I think that'd be a fine thing to do, and I support doing it, although maybe not in the same patch as my original proposal. On the flip side, I don't support trying to expose information on the level of "which C function are we currently executing?" because I think there's going to be absolutely no reasonable way to make that sufficiently low-overhead, and also because I don't see any way to make it less than nightmarishly onerous from the point of view of code maintenance. We could expose some functions but not others, but that seems like a mess; I think unless and until we have a better solution, the right answer to "I need to know which C function is running in each backend" is "that's what perf is for". In any case, I think the main point is that Itagaki-san's proposal is not really a proposal for wait events. It is a proposal to expose some state about "what is the backend doing right now?" which might be waiting or something else. I believe those things are should be separated into several separate pieces of state. It's entirely reasonable to want to know whether we are in the parse phase or plan phase separately from knowing whether we are waiting on an lwlock or not, because then you could (for example) judge what percentage of your lock waits are coming from parsing vs. what fraction are coming from planning, which somebody might care about. Or you might care about ONLY the phase of query processing and not about wait events at all, and then you can ignore one column and look at the other. With the above proposal, those things all get munged together in a way that I think is bound to be awkward. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers