Hi all,

Following the ideas in

http://wiki.postgresql.org/wiki/Query_progress_indication

and, for the most part, this paper:

http://www.postech.ac.kr/~swhwang/progress2.pdf [1]

I'm trying to implement the "dne" (driver node estimator) method in PostgreSQL.

I added another column to the output of pg_stat_get_activity to report the 
percentage of work done for the current query (of course, any other method 
could be used; how the percentage is reported to the user is easy to change).

I attached a first patch (just to see whether anyone is interested; the work 
is by no means finished).

I have probably made a lot of mistakes, since I am not familiar with the 
PostgreSQL code...

1) the progress indicator should be switchable at run time; this could be done 
with a configuration parameter (at the moment it is always on)

2) I added a new structure (Progress) to PlanState to keep all the info about 
execution progress

3) I needed a pointer to the root of the PlanStates, to be able to calculate 
the total progress of the query tree (I bet this pointer was already available 
somewhere, but I couldn't find where...)

4) sub-plans are not included yet (to be honest, I don't really know what 
PostgreSQL means by those... :) )

5) the percentage is updated at most every second (can be easily changed)

6) the methods to adjust upper/lower bounds in [1] are not implemented yet (but 
that shouldn't be a problem)

7) the "spilled tuples" handling in [1] is not supported yet

8) only hash join, nested loop join, aggregate, and sequential scan nodes are 
handled at the moment

9) I added another flag (EXEC_FLAG_DRIVER_BRANCH) in executor.h to tell the 
sub-nodes whether they are part of a branch that will contain a driver node 
(for example, the inner subtree of a nested loop join is not a driver branch). 
I guess this could be done better at the Plan level (instead of PlanState), but 
this way less code has to change

10) at the moment all driver nodes have the same "work_per_tuple=1", but this 
could be changed (for example, CPU-intensive driver nodes could have a smaller 
work_per_tuple value)
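
To make items 2, 3, 9 and 10 more concrete, here is a minimal standalone 
sketch of the bookkeeping. It is not code from the patch: SketchNode, 
SKETCH_FLAG_DRIVER_BRANCH and the sketch_* functions are hypothetical names, 
and treating only sequential scans as driver nodes is an assumption made for 
the example. The flag is pushed down the tree, cleared for the inner side of a 
nested loop, and the percentage is the work done by driver nodes divided by the 
work expected from them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag and node types for illustration; not PostgreSQL's own. */
#define SKETCH_FLAG_DRIVER_BRANCH 0x01

typedef enum { SK_SEQSCAN, SK_NESTLOOP, SK_HASHJOIN } SketchKind;

typedef struct SketchNode
{
    SketchKind  kind;
    struct SketchNode *outer;   /* left input, or NULL */
    struct SketchNode *inner;   /* right input, or NULL */
    double      expected_rows;  /* planner estimate for this node */
    double      rows_so_far;    /* rows emitted so far */
    double      work_per_tuple; /* relative cost of one row */
    bool        is_driver_branch;
    bool        is_driver_node;
} SketchNode;

/* Push the driver-branch flag down the tree and mark driver nodes. */
static void
sketch_init(SketchNode *node, int flags)
{
    if (node == NULL)
        return;
    node->is_driver_branch = (flags & SKETCH_FLAG_DRIVER_BRANCH) != 0;
    /* Assumption: only sequential scans act as driver nodes here. */
    node->is_driver_node = node->is_driver_branch && node->kind == SK_SEQSCAN;

    sketch_init(node->outer, flags);
    /* The inner side of a nested loop is rescanned, so it is not a driver branch. */
    if (node->kind == SK_NESTLOOP)
        sketch_init(node->inner, flags & ~SKETCH_FLAG_DRIVER_BRANCH);
    else
        sketch_init(node->inner, flags);
}

/* Sum expected and completed work over all driver nodes. */
static void
sketch_accumulate(const SketchNode *node, double *expected, double *done)
{
    if (node == NULL || !node->is_driver_branch)
        return;
    if (node->is_driver_node)
    {
        *expected += node->expected_rows * node->work_per_tuple;
        *done += node->rows_so_far * node->work_per_tuple;
    }
    sketch_accumulate(node->outer, expected, done);
    sketch_accumulate(node->inner, expected, done);
}

/* Percentage of driver-node work completed, clamped to 100. */
static double
sketch_progress_percent(const SketchNode *root)
{
    double      expected = 0.0;
    double      done = 0.0;
    double      pct;

    assert(root != NULL);
    sketch_accumulate(root, &expected, &done);
    if (expected <= 0.0)
        return 0.0;
    pct = done * 100.0 / expected;
    return pct > 100.0 ? 100.0 : pct;
}
```

In this sketch a nested loop whose outer scan has emitted 250 of an estimated 
1000 rows reports 25%, however many times the inner side has been rescanned. 
The clamp to 100 is my addition for the sketch, since the planner estimate can 
be too low.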

Some (very early) tests on a TPC-D database showed it works as expected 
(although I only ran a few tests...)
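
For reference, the "at most every second" update from item 5 boils down to the 
following check. This is a simplified standalone sketch rather than the patch 
code: SketchThrottle and sketch_should_report are hypothetical names, and the 
clock value is passed in as a parameter so the logic can be exercised without 
waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper type: remembers the second of the last published value. */
typedef struct SketchThrottle
{
    time_t      last_report;
} SketchThrottle;

/*
 * Return true (and remember the time) only when a new wall-clock second
 * has started since the last report, so the percentage is recomputed and
 * published at most once per second.
 */
static bool
sketch_should_report(SketchThrottle *t, time_t now)
{
    assert(t != NULL);
    if (now > t->last_report)
    {
        t->last_report = now;
        return true;
    }
    return false;
}
```

A caller would use something like sketch_should_report(&throttle, time(NULL)) 
after each driver-node tuple. Reading the clock for every tuple may itself have 
a cost; checking it only every N tuples would be one way to reduce that, but I 
have not measured it.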

Hope someone is interested


      
Index: src/include/pgstat.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/pgstat.h,v
retrieving revision 1.83
diff -r1.83 pgstat.h
568a569,571
> 
>     /* current percentage of progress */
>     float       st_progress_perc;
646a650,651
> extern void pgstat_report_progress_percentage(double perc);
> 
Index: src/backend/executor/Makefile
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/Makefile,v
retrieving revision 1.29
diff -r1.29 Makefile
25c25
<        nodeWindowAgg.o tstoreReceiver.o spi.o
---
>        nodeWindowAgg.o tstoreReceiver.o spi.o progress.o
Index: src/backend/executor/execProcnode.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execProcnode.c,v
retrieving revision 1.65
diff -r1.65 execProcnode.c
109a110,111
> #include "executor/progress.h"
> #include "pgstat.h"
111a114
> void ProgressUpdate(PlanState* node, double* tot_operations_expected, double* tot_operations_so_far);
132a136
>       bool    is_driver_node_candidate = false;
175a180,181
>                       is_driver_node_candidate = true;
> 
261a268
>                       is_driver_node_candidate = true;
314a322,325
>       /* Set up progress info for this node if requested */
>       if (result->state->es_progress)
>               ProgressSetInfo(result, node, eflags, is_driver_node_candidate);
> 
328a340,343
>       struct timeval t;
>       double  tot_operations_expected = 0;
>       double  tot_operations_so_far = 0;
> 
462a478,493
>       // progress calcs (only if required)
>       if (node->state->es_progress && node->progress != NULL && node->progress->is_driver_node)
>       {
>               node->progress->operations_so_far++;
>               gettimeofday(&t, NULL);
>               if (t.tv_sec > node->state->es_progress_last_update.tv_sec)
>               {
>                       ProgressUpdate(node->state->es_root_planstate, &tot_operations_expected, &tot_operations_so_far);
>                       if (tot_operations_expected != 0)
>                       {
>                               pgstat_report_progress_percentage(tot_operations_so_far*100/tot_operations_expected);
>                               node->state->es_progress_last_update = t;
>                       }
>               }
>       }
> 
466a498,524
> void ProgressUpdate(PlanState* node, double* tot_operations_expected, double* tot_operations_so_far)
> {
>       // TODO here a  switch (nodeTag(node)) is needed in case we want upper/lower limit update
>       if (node->progress->is_driver_node)
>       {
>               *tot_operations_expected += node->progress->lower_bound;
>               *tot_operations_so_far += node->progress->operations_so_far;
>       }
> 
>       /*else ??? */if (node->progress->is_driver_branch)
>       {
>               if (outerPlanState(node) != NULL)
>               {
>                       ProgressUpdate(outerPlanState(node), tot_operations_expected, tot_operations_so_far);
>               }
>               if (innerPlanState(node) != NULL)
>               {
>                       ProgressUpdate(innerPlanState(node), tot_operations_expected, tot_operations_so_far);
>               }
>       }
> 
> 
> 
> }
> 
> 
> 
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.326
diff -r1.326 execMain.c
829a830,832
>       // TODO use configuration parameter
>       estate->es_progress = true;
> 
835c838,841
<       planstate = ExecInitNode(plan, estate, eflags);
---
>       planstate = ExecInitNode(plan, estate, eflags | EXEC_FLAG_DRIVER_BRANCH);
> 
>       estate->es_root_planstate = planstate;
> 
2737c2743
<       epq->planstate = ExecInitNode(estate->es_plannedstmt->planTree, epqstate, 0);
---
>       epq->planstate = ExecInitNode(estate->es_plannedstmt->planTree, epqstate, 0 | EXEC_FLAG_DRIVER_BRANCH);
Index: src/backend/executor/nodeHashjoin.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v
retrieving revision 1.101
diff -r1.101 nodeHashjoin.c
400a401
>       //innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags & ~EXEC_FLAG_DRIVER_BRANCH);
Index: src/backend/executor/nodeNestloop.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeNestloop.c,v
retrieving revision 1.53
diff -r1.53 nodeNestloop.c
317a318,319
> 
>       // for progress estimation: the inner branch is not a driver branch
319c321
<                                                  eflags | EXEC_FLAG_REWIND);
---
>                                                  (eflags | EXEC_FLAG_REWIND) & ~(EXEC_FLAG_DRIVER_BRANCH));
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.189
diff -r1.189 pgstat.c
2215a2216
>       beentry->st_progress_perc = 0;
2341a2343,2363
> /*
>  * Report current progress percentage
>  */
> void
> pgstat_report_progress_percentage(double perc)
> {
>       volatile PgBackendStatus *beentry = MyBEEntry;
> 
>       if (!pgstat_track_activities || !beentry)
>               return;
> 
>       /*
>        * Update my status entry, following the protocol of bumping
>        * st_changecount before and after.  We use a volatile pointer here to
>        * ensure the compiler doesn't try to get cute.
>        */
>       beentry->st_changecount++;
>       beentry->st_progress_perc = perc;
>       beentry->st_changecount++;
>       Assert((beentry->st_changecount & 1) == 0);
> }
Index: src/include/executor/executor.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/executor/executor.h,v
retrieving revision 1.155
diff -r1.155 executor.h
51a52
> #define EXEC_FLAG_DRIVER_BRANCH       0x0010  /* for progress update: this is a driver branch  */
Index: src/backend/utils/adt/pgstatfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/pgstatfuncs.c,v
retrieving revision 1.54
diff -r1.54 pgstatfuncs.c
419c419
<               tupdesc = CreateTemplateTupleDesc(10, false);
---
>               tupdesc = CreateTemplateTupleDesc(11, false);
429a430
>               TupleDescInitEntry(tupdesc, (AttrNumber) 11, "progress_perc", FLOAT4OID, -1, 0);
481,482c482,483
<               Datum           values[10];
<               bool            nulls[10];
---
>               Datum           values[11];
>               bool            nulls[11];
601a603,604
>                       nulls[10] = false;
>             values[10] = Float4GetDatum(beentry->st_progress_perc);
612a616
>                       nulls[10] = true;
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.544
diff -r1.544 pg_proc.h
2984c2984
< DATA(insert OID = 2022 (  pg_stat_get_activity                        PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
---
> DATA(insert OID = 2022 (  pg_stat_get_activity                        PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,16,1184,1184,1184,869,23,700}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port,progress_perc}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
Index: src/include/executor/progress.h
===================================================================
RCS file: src/include/executor/progress.h
diff -N src/include/executor/progress.h
0a1,43
> /*-------------------------------------------------------------------------
>  *
>  * progress.h
>  *      definitions for run-time progress information
>  *
>  *
>  * Copyright (c) 2001-2009, PostgreSQL Global Development Group
>  *
>  *-------------------------------------------------------------------------
>  */
> #ifndef PROGRESS_H
> #define PROGRESS_H
> 
> #include "nodes/execnodes.h"
> 
> 
> typedef struct Progress
> {
>       // information for progress estimation
>       bool            is_driver_branch;               /* is this node in a branch that will contain a driver node? */
>       bool            is_driver_node;                 /* is this node a driver node? */
>       double          operations_so_far;              /* rows emitted so far */
>       double          upper_bound;            /* max number of expected rows */
>       double          lower_bound;            /* min number of expected rows */
>       float           work_per_tuple;         /* could be nice to divide CPU vs disk nodes */
> } Progress;
> 
> extern Progress *ProgressAlloc(void);
> 
> /* increment rows processed in case this is a driver node */
> #define ProgressNewRow(node) \
>       do { \
>               if (node->ps.state->es_progress && /*this shouldn't be needed...*/ node->ps.progress != 0) { \
>                       node->ps.progress->rows_so_far++; \
>                       ProgressUpdate((PlanState*)&(node->ps));        \
>               } \
>       } while(0);
> 
> 
> void ProgressSetInfo(PlanState * planstate, Plan* plan, int eflags, bool is_candidate_driver_node);
> 
> 
> #endif   /* PROGRESS_H */
Index: src/backend/executor/progress.c
===================================================================
RCS file: src/backend/executor/progress.c
diff -N src/backend/executor/progress.c
0a1,44
> /*-------------------------------------------------------------------------
>  *
>  * progress.c
>  *     functions for progress info of plan execution
>  *
>  *
>  * Copyright (c) 2001-2009, PostgreSQL Global Development Group
>  *
>  * IDENTIFICATION
>  *
>  *-------------------------------------------------------------------------
>  */
> #include "postgres.h"
> 
> #include <unistd.h>
> 
> #include "executor/progress.h"
> #include "pgstat.h"
> #include "executor/executor.h"
> 
> 
> /* Allocate new progress structure */
> Progress *ProgressAlloc(void)
> {
>       Progress *prog = palloc0(sizeof(Progress));
>       prog->work_per_tuple = 1;
> 
>       return prog;
> }
> 
> /* Set up progress info for this node if requested */
> void ProgressSetInfo(PlanState * planstate, Plan* plan, int eflags, bool is_candidate_driver_node)
> {
>       planstate->progress = ProgressAlloc();
>       if (eflags & EXEC_FLAG_DRIVER_BRANCH)
>       {
>               planstate->progress->is_driver_branch = true;
>               if (is_candidate_driver_node)
>               {
>                       planstate->progress->lower_bound = plan->plan_rows;
>                       planstate->progress->is_driver_node = true;
>               }
>       }
> }