Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-15 Thread Thakur, Sameer
Hello,
I am not really willing to show up as the picky guy here, but could it be 
possible to receive those patches as attached to emails instead of having them 
referenced by URL? I imagine that you are directly using the nabble interface.
Just configured a new mail client for nabble, did not know how to use it within 
an existing conversation.
Now I can send the patch attached!
Thanks
Sameer


__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.


IndexScanProgress.patch
Description: IndexScanProgress.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-15 Thread Amit Langote
On 2015-07-16 AM 05:18, Rahila Syed wrote:
 
 GUC parameter 'pgstat_track_progress' is currently PGC_SUSET in line with
 'track_activities' GUC parameter.

Naming the GUC pgstat* seems a little inconsistent. It could be called,
say, track_maintenance_progress_interval/track_vacuum_progress_interval.
That way, it will look similar to existing track_* parameters:

#track_activities = on
#track_counts = on
#track_io_timing = off
#track_functions = none # none, pl, all
#track_activity_query_size = 1024   # (change requires restart)

Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
might be helpful.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-15 Thread Rahila Syed
Hello,

Please find attached updated patch with an interface to calculate command
progress in pgstat.c. This interface currently implements VACUUM progress
 tracking .
A column named percent_complete has been added in pg_stat_activity to
report progress.

VACUUM calls the progress calculation interface periodically at an interval
specified by pgstat_track_progress GUC in ms.
Progress calculation can be disabled by setting pgstat_track_progress  as
-1.

Remaining_time for VACUUM is not included in current patch to avoid
cluttering pg_stat_activity with too many columns.
But the estimate as seen from previous implementation seems reasonable
enough to be included in progress information , may be as an exclusive view
for vacuum progress information.

GUC parameter 'pgstat_track_progress' is currently PGC_SUSET in line with
'track_activities' GUC parameter.  Although IMO, pgstat_track_progress can
be made PGC_USERSET in order to provide more flexibility to any user to
enable/disable progress calculation provided progress is tracked only if
track_activities GUC parameter  is enabled.

In this patch, index scans are not taken into account for progress
calculation as of now .

Thank you,
Rahila Syed.


Vacuum_progress_checker_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-03 Thread Syed, Rahila
Hello,

TBH, I think that designing this as a hook-based solution is adding a whole 
lot of complexity for no value.  The hard parts of the problem are collecting 
the raw data and making the results visible to users, and both of those 
require involvement of the core code.  Where is the benefit from pushing some 
trivial intermediate arithmetic into an external module?
If there's any at all, it's certainly not enough to justify problems such as 
you mention here.

So I'd just create a pgstat_report_percent_done() type of interface in 
pgstat.c and then teach VACUUM to call it directly.

Thank you for suggestion. I agree that adding code in core will reduce code 
complexity with no additional overhead. 
Going by the consensus, I will update the patch with code to collect and store 
progress information from vacuum in pgstat.c and
UI using pg_stat_activity view.

Thank you,
Rahila Syed

__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-02 Thread Syed, Rahila
Hello,

Unless I am missing something, I guess you can still keep the actual code that 
updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and storing in shared 
memory can be outside core even with pg_stat_activity as a user interface.

Whatever the view (existing/new), any related counters would have a valid 
(non-NULL) value when read off the view iff hooks are set perhaps because you 
have an extension that sets them. 
I guess that means any operation that supports progress tracking would have 
an extension with suitable hooks implemented.
Do you mean to say , any operation/application that want progress  tracking 
feature will dynamically load the progress checker module which will set the 
hooks for progress reporting?
If yes , unless I am missing something such dynamic loading cannot happen if we 
use pg_stat_activity as it gets values from shared memory. Module has to be a 
shared_preload_library
to allocate a shared memory. So this will mean the module can be loaded only at 
server restart. Am I missing something?

Thank you,
Rahila Syed




__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-02 Thread Tom Lane
Syed, Rahila rahila.s...@nttdata.com writes:
 Hello,
 Unless I am missing something, I guess you can still keep the actual code 
 that updates counters outside the core if you adopt an approach that Simon 
 suggests.
 Yes. The code to extract progress information from VACUUM and storing in 
 shared memory can be outside core even with pg_stat_activity as a user 
 interface.

 Whatever the view (existing/new), any related counters would have a valid 
 (non-NULL) value when read off the view iff hooks are set perhaps because 
 you have an extension that sets them. 
 I guess that means any operation that supports progress tracking would 
 have an extension with suitable hooks implemented.
 Do you mean to say , any operation/application that want progress  tracking 
 feature will dynamically load the progress checker module which will set the 
 hooks for progress reporting?
 If yes , unless I am missing something such dynamic loading cannot happen if 
 we use pg_stat_activity as it gets values from shared memory. Module has to 
 be a shared_preload_library
 to allocate a shared memory. So this will mean the module can be loaded
 only at server restart. Am I missing something?

TBH, I think that designing this as a hook-based solution is adding a
whole lot of complexity for no value.  The hard parts of the problem are
collecting the raw data and making the results visible to users, and
both of those require involvement of the core code.  Where is the benefit
from pushing some trivial intermediate arithmetic into an external module?
If there's any at all, it's certainly not enough to justify problems such
as you mention here.

So I'd just create a pgstat_report_percent_done() type of interface in
pgstat.c and then teach VACUUM to call it directly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-02 Thread Amit Langote
On 2015-07-02 PM 11:00, Syed, Rahila wrote:
 Hello,
 
 Unless I am missing something, I guess you can still keep the actual code 
 that updates counters outside the core if you adopt an approach that Simon 
 suggests.
 Yes. The code to extract progress information from VACUUM and storing in 
 shared memory can be outside core even with pg_stat_activity as a user 
 interface.
 
 Whatever the view (existing/new), any related counters would have a valid 
 (non-NULL) value when read off the view iff hooks are set perhaps because 
 you have an extension that sets them. 
 I guess that means any operation that supports progress tracking would 
 have an extension with suitable hooks implemented.
 Do you mean to say , any operation/application that want progress  tracking 
 feature will dynamically load the progress checker module which will set the 
 hooks for progress reporting?
 If yes , unless I am missing something such dynamic loading cannot happen if 
 we use pg_stat_activity as it gets values from shared memory. Module has to 
 be a shared_preload_library
 to allocate a shared memory. So this will mean the module can be loaded only 
 at server restart. Am I missing something?
 

Assuming that set of hooks per command and shared memory structure(s) is a way
to go, I meant to say that hook implementations per command would be in their
separate modules, of course loaded at the server start for shared memory). Of
those, your proposed patch has vacuum_progress, for example. And in context of
my comment above, that means the view would say NULL for commands for which
the module has not been set up in advance. IOW, between showing NULL in the
view and dynamically loading hook functions, we choose the former because I
don't know what the latter means in postgres.

Having said that, Tom's suggestion to export pgstat.c function(s) for
command(s) may be a more appealing way to go.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-01 Thread Simon Riggs
On 2 July 2015 at 03:00, Rahila Syed rahilasye...@gmail.com wrote:


 Yes, I suggest just a single column on pg_stat_activity called
 pct_complete

 Reporting remaining time also can be crucial to make decisions regarding
 continuing or aborting VACUUM.
 The same has been suggested  in  the thread below,

 http://www.postgresql.org/message-id/13072.1284826...@sss.pgh.pa.us

 trace_completion_interval = 5s (default)

 Every interval, we report the current % complete for any operation that
 supports it. We just show NULL if the current operation has not reported
 anything or never will.

 We do this for VACUUM first, then we can begin adding other operations
 as we work out how (for that operation).

 Thank you for explaining. This design seems good to me except, adding more
 than one columns(percent_complete, remaining_time)


It is attractive to have a remaining_time column, or a
predicted_completion_timestamp, but those columns are prediction
calculations rather than actual progress reports. I'm interested in seeing
a report that relates to actual progress made.

Predicted total work required is also interesting, but is much less
trustworthy figure.

I think we'll need to get wider input about the user interface for this
feature.



 if required to pg_stat_activity can be less user intuitive than having a
 separate view for VACUUM.


I think it is a mistake to do something just for VACUUM.

Monitoring software will look at pg_stat_activity. I don't think we should
invent a separate view for progress statistics because it will cause users
to look in two places rather than just one. Reporting progress is fairly
cheap instrumentation, calculating a prediction of completion time might be
expensive.

Having said that, monitoring systems currently use a polling mechanism to
retrieve status data. They look at information published by the backend. We
don't currently have a mechanism to defer publication of expensive
monitoring information until requested by the monitoring system. If you
have a design for how that might work then say so, otherwise we need to
assume a simple workflow: the backend publishes whatever it chooses,
whenever it chooses and then that is made available via the monitoring
system via views.


Your current design completely misses the time taken to scan indexes, which
is significant.

There might be a justification to put this out of core, but measuring
progress of VACUUM wouldn't be it, IMHO.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-01 Thread Amit Langote
On 2015-07-02 AM 11:41, Rahila Syed wrote:
 Hello,
 
 I though about the possibilities of progress visualization - and one
 possibility is one or two special column in pg_stat_activity table - this
 info can be interesting for VACUUM started by autovacuum too.
 
 Thank you for suggestion. The design with hooks and a separate view was
 mainly to keep most of the code outside core as the feature proposed is
 specific to VACUUM command. Also, having a separate view can give more
 flexibility in terms of displaying various progress parameters.
 

Unless I am missing something, I guess you can still keep the actual code that
updates counters outside the core if you adopt an approach that Simon
suggests. Whatever the view (existing/new), any related counters would have a
valid (non-NULL) value when read off the view iff hooks are set perhaps
because you have an extension that sets them. I guess that means any operation
that supports progress tracking would have an extension with suitable hooks
implemented.

Of course unless I misinterpreted Simon's words.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-01 Thread Sameer Thakur
Hello,
Thank you for suggestion. The design with hooks and a separate view was
mainly to keep most of the code outside core as the feature proposed is
specific to VACUUM command. Also, having a separate view can give more
flexibility in terms of displaying various progress parameters.
FWIW ,there was resistance to include columns in pg_stat_activity earlier
in the following thread,
http://www.postgresql.org/message-id/AANLkTi=TcuMA38oGUKX9p5WVPpY+M3L0XUp7=PLT+LCT@...

Perhaps as suggested in the link, the progress could be made available via a
function call which does progress calculation on demand. Then we do not
need a separate view, or clutter pg_stat_activity, and also has benefit of
calculating progress just when it's needed.

  



--
View this message in context: 
http://postgresql.nabble.com/PROPOSAL-VACUUM-Progress-Checker-tp5855849p5856192.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-01 Thread Guillaume Lelarge
Le 2 juil. 2015 7:28 AM, Simon Riggs si...@2ndquadrant.com a écrit :

 On 2 July 2015 at 03:00, Rahila Syed rahilasye...@gmail.com wrote:


 Yes, I suggest just a single column on pg_stat_activity called
pct_complete

 Reporting remaining time also can be crucial to make decisions regarding
continuing or aborting VACUUM.
 The same has been suggested  in  the thread below,

 http://www.postgresql.org/message-id/13072.1284826...@sss.pgh.pa.us

 trace_completion_interval = 5s (default)

 Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.

 We do this for VACUUM first, then we can begin adding other operations
as we work out how (for that operation).

 Thank you for explaining. This design seems good to me except, adding
more than one columns(percent_complete, remaining_time)


 It is attractive to have a remaining_time column, or a
predicted_completion_timestamp, but those columns are prediction
calculations rather than actual progress reports. I'm interested in seeing
a report that relates to actual progress made.


Agreed.

 Predicted total work required is also interesting, but is much less
trustworthy figure.


And it is something a client app or an extension can compute. No need to
put this in core as long as we have the actual progress.

 I think we'll need to get wider input about the user interface for this
feature.



 if required to pg_stat_activity can be less user intuitive than having a
separate view for VACUUM.


 I think it is a mistake to do something just for VACUUM.

 Monitoring software will look at pg_stat_activity. I don't think we
should invent a separate view for progress statistics because it will cause
users to look in two places rather than just one. Reporting progress is
fairly cheap instrumentation, calculating a prediction of completion time
might be expensive.


+1

 Having said that, monitoring systems currently use a polling mechanism to
retrieve status data. They look at information published by the backend. We
don't currently have a mechanism to defer publication of expensive
monitoring information until requested by the monitoring system. If you
have a design for how that might work then say so, otherwise we need to
assume a simple workflow: the backend publishes whatever it chooses,
whenever it chooses and then that is made available via the monitoring
system via views.


 Your current design completely misses the time taken to scan indexes,
which is significant.

 There might be a justification to put this out of core, but measuring
progress of VACUUM wouldn't be it, IMHO.

 --
 Simon Riggshttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-01 Thread Rahila Syed
Hello,

Thank you for suggestions.

Yes, I suggest just a single column on pg_stat_activity called pct_complete

Reporting remaining time also can be crucial to make decisions regarding
continuing or aborting VACUUM.
The same has been suggested  in  the thread below,

http://www.postgresql.org/message-id/13072.1284826...@sss.pgh.pa.us

trace_completion_interval = 5s (default)

Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.

We do this for VACUUM first, then we can begin adding other operations as
we work out how (for that operation).

Thank you for explaining. This design seems good to me except, adding more
than one columns(percent_complete, remaining_time) if required to
pg_stat_activity can be less user intuitive than having a separate view for
VACUUM.

-Rahila Syed



















On Tue, Jun 30, 2015 at 2:02 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 30 June 2015 at 08:52, Pavel Stehule pavel.steh...@gmail.com wrote:


 I though about the possibilities of progress visualization - and one
 possibility is one or two special column in pg_stat_activity table - this
 info can be interesting for VACUUM started by autovacuum too.


 Yes, I suggest just a single column on pg_stat_activity called pct_complete

 trace_completion_interval = 5s (default)

 Every interval, we report the current % complete for any operation that
 supports it. We just show NULL if the current operation has not reported
 anything or never will.

 We do this for VACUUM first, then we can begin adding other operations as
 we work out how (for that operation).

 --
 Simon Riggshttp://www.2ndQuadrant.com/
 http://www.2ndquadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-01 Thread Rahila Syed
Hello,

I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.

Thank you for suggestion. The design with hooks and a separate view was
mainly to keep most of the code outside core as the feature proposed is
specific to VACUUM command. Also, having a separate view can give more
flexibility in terms of displaying various progress parameters.

FWIW ,there was resistance to include columns in pg_stat_activity earlier
in the following thread,
http://www.postgresql.org/message-id/AANLkTi=TcuMA38oGUKX9p5WVPpY+M3L0XUp7=plt+...@mail.gmail.com

Thank you,
Rahila Syed

On Tue, Jun 30, 2015 at 1:22 PM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hi

 2015-06-30 9:37 GMT+02:00 Rahila Syed rahilasye...@gmail.com:

 Hello Hackers,

 Following is a proposal for feature to calculate VACUUM progress.


 interesting idea - I like to see it integrated to core.



 Use Case : Measuring progress of long running VACUUMs to help DBAs make
 informed decision
 whether to continue running VACUUM or abort it.

 Design:

 A shared preload library to store progress information from different
 backends running VACUUM, calculate remaining time for each and display
 progress in the
 in the form a view.


 probably similar idea can be used for REINDEX, CREATE INDEX, COPY TO
 statements

 I though about the possibilities of progress visualization - and one
 possibility is one or two special column in pg_stat_activity table - this
 info can be interesting for VACUUM started by autovacuum too.

 Regards

 Pavel



 VACUUM  needs to be instrumented with a hook to collect progress
 information (pages vacuumed/scanned) periodically.

 The patch attached implements a new hook to store vacuumed_pages and
 scanned_pages count at the end of each page scanned by VACUUM.

 This information is stored in a shared memory structure.

 In addition to measuring progress this function using hook also
 calculates remaining time for VACUUM.



 The frequency of collecting progress information can be reduced by
 appending delays in between hook function calls.

 Also, a GUC parameter

 log_vacuum_min_duration can be used.

 This will cause VACUUM progress to be calculated only if VACUUM runs more
 than specified milliseconds.

 A value of zero calculates VACUUM progress for each page processed. -1
 disables logging.


 Progress calculation :


 percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;

 remaining_time = elapsed_time * (total_pages_to_be_scanned -
 scanned_pages) / scanned_pages;


 Shared memory struct:

 typedef struct PgStat_VacuumStats

 {

   Oid databaseoid;

   Oid tableoid;

   Int32   vacuumed_pages;

   Int32   total_pages;

   Int32   scanned_pages;

   doubleelapsed_time;

   doubleremaining_time;

  } PgStat_VacuumStats[max_connections];



 Reporting :

  A view named 'pg_maintenance_progress' can be created using the values
 in the struct above.

 pg_stat_maintenance can be called from any other backend and will display
 progress of

 each running VACUUM.


 Other uses of hook in VACUUM:


 Cost of VACUUM in terms of pages hit , missed and dirtied same as
 autovacuum can be collected using this hook.

 Autovacuum does it at the end of VACUUM for each table. It can be done
 while VACUUM on a table is in progress.
 This can be helpful to track manual VACUUMs also not just autovacuum.

 Read/Write(I/O) rates can be computed on the lines of autovacuum.
 Read rate patterns can be used to help tuning future vacuum on the
 table(like shared buffers tuning)
 Other resource usages can also be collected using progress checker hook.


 Attached patch is POC patch of progress calculation for a single backend.

 Also attached is a brief snapshot of the output log.




 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers





Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-06-30 Thread Thomas Munro
On Tue, Jun 30, 2015 at 7:37 PM, Rahila Syed rahilasye...@gmail.com wrote:
 Hello Hackers,

 Following is a proposal for feature to calculate VACUUM progress.

 Use Case : Measuring progress of long running VACUUMs to help DBAs make
 informed decision
 whether to continue running VACUUM or abort it.

+1

I was thinking recently that it would be very cool to see some
estimation of the progress of VACUUM and CLUSTER in a view similar to
pg_stat_activity, or the ps title.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-06-30 Thread dinesh kumar
On Tue, Jun 30, 2015 at 1:07 PM, Rahila Syed rahilasye...@gmail.com wrote:

 Hello Hackers,

 Following is a proposal for feature to calculate VACUUM progress.

 Use Case : Measuring progress of long running VACUUMs to help DBAs make
 informed decision
 whether to continue running VACUUM or abort it.

 +1

I am excited to know how the progress works in when any of the statement
got blocked during locks. Rather displaying the stats in the LOG, shall we
have this in a pg_stat_vacuum_activity[ New catalog for all auto-vacuum
stats].

Best Regards,
Dinesh
manojadinesh.blogspot.com

Design:

 A shared preload library to store progress information from different
 backends running VACUUM, calculate remaining time for each and display
 progress in the
 in the form a view.


 VACUUM  needs to be instrumented with a hook to collect progress
 information (pages vacuumed/scanned) periodically.

 The patch attached implements a new hook to store vacuumed_pages and
 scanned_pages count at the end of each page scanned by VACUUM.

 This information is stored in a shared memory structure.

 In addition to measuring progress this function using hook also calculates
 remaining time for VACUUM.



 The frequency of collecting progress information can be reduced by
 appending delays in between hook function calls.

 Also, a GUC parameter

 log_vacuum_min_duration can be used.

 This will cause VACUUM progress to be calculated only if VACUUM runs more
 than specified milliseconds.

 A value of zero calculates VACUUM progress for each page processed. -1
 disables logging.


 Progress calculation :


 percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;

 remaining_time = elapsed_time * (total_pages_to_be_scanned -
 scanned_pages) / scanned_pages;


 Shared memory struct:

 typedef struct PgStat_VacuumStats

 {

   Oid databaseoid;

   Oid tableoid;

   Int32   vacuumed_pages;

   Int32   total_pages;

   Int32   scanned_pages;

   doubleelapsed_time;

   doubleremaining_time;

  } PgStat_VacuumStats[max_connections];



 Reporting :

  A view named 'pg_maintenance_progress' can be created using the values in
 the struct above.

 pg_stat_maintenance can be called from any other backend and will display
 progress of

 each running VACUUM.


 Other uses of hook in VACUUM:


 Cost of VACUUM in terms of pages hit , missed and dirtied same as
 autovacuum can be collected using this hook.

 Autovacuum does it at the end of VACUUM for each table. It can be done
 while VACUUM on a table is in progress.
 This can be helpful to track manual VACUUMs also not just autovacuum.

 Read/Write(I/O) rates can be computed on the lines of autovacuum.
 Read rate patterns can be used to help tuning future vacuum on the
 table(like shared buffers tuning)
 Other resource usages can also be collected using progress checker hook.


 Attached patch is POC patch of progress calculation for a single backend.

 Also attached is a brief snapshot of the output log.




 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-06-30 Thread Pavel Stehule
Hi

2015-06-30 9:37 GMT+02:00 Rahila Syed rahilasye...@gmail.com:

 Hello Hackers,

 Following is a proposal for feature to calculate VACUUM progress.


interesting idea - I like to see it integrated to core.



 Use Case : Measuring progress of long running VACUUMs to help DBAs make
 informed decision
 whether to continue running VACUUM or abort it.

 Design:

 A shared preload library to store progress information from different
 backends running VACUUM, calculate remaining time for each and display
 progress in the
 in the form a view.


probably similar idea can be used for REINDEX, CREATE INDEX, COPY TO
statements

I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.

Regards

Pavel



 VACUUM  needs to be instrumented with a hook to collect progress
 information (pages vacuumed/scanned) periodically.

 The patch attached implements a new hook to store vacuumed_pages and
 scanned_pages count at the end of each page scanned by VACUUM.

 This information is stored in a shared memory structure.

 In addition to measuring progress this function using hook also calculates
 remaining time for VACUUM.



 The frequency of collecting progress information can be reduced by
 appending delays in between hook function calls.

 Also, a GUC parameter

 log_vacuum_min_duration can be used.

 This will cause VACUUM progress to be calculated only if VACUUM runs more
 than specified milliseconds.

 A value of zero calculates VACUUM progress for each page processed. -1
 disables logging.


 Progress calculation :


 percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;

 remaining_time = elapsed_time * (total_pages_to_be_scanned -
 scanned_pages) / scanned_pages;


 Shared memory struct:

 typedef struct PgStat_VacuumStats

 {

   Oid databaseoid;

   Oid tableoid;

   Int32   vacuumed_pages;

   Int32   total_pages;

   Int32   scanned_pages;

   doubleelapsed_time;

   doubleremaining_time;

  } PgStat_VacuumStats[max_connections];



 Reporting :

  A view named 'pg_maintenance_progress' can be created using the values in
 the struct above.

 pg_stat_maintenance can be called from any other backend and will display
 progress of

 each running VACUUM.


 Other uses of hook in VACUUM:


 Cost of VACUUM in terms of pages hit , missed and dirtied same as
 autovacuum can be collected using this hook.

 Autovacuum does it at the end of VACUUM for each table. It can be done
 while VACUUM on a table is in progress.
 This can be helpful to track manual VACUUMs also not just autovacuum.

 Read/Write(I/O) rates can be computed on the lines of autovacuum.
 Read rate patterns can be used to help tuning future vacuum on the
 table(like shared buffers tuning)
 Other resource usages can also be collected using progress checker hook.


 Attached patch is POC patch of progress calculation for a single backend.

 Also attached is a brief snapshot of the output log.




 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-06-30 Thread Simon Riggs
On 30 June 2015 at 08:52, Pavel Stehule pavel.steh...@gmail.com wrote:


 I though about the possibilities of progress visualization - and one
 possibility is one or two special column in pg_stat_activity table - this
 info can be interesting for VACUUM started by autovacuum too.


Yes, I suggest just a single column on pg_stat_activity called pct_complete

trace_completion_interval = 5s (default)

Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.

We do this for VACUUM first, then we can begin adding other operations as
we work out how (for that operation).

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


<    1   2   3