Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi, > -Original Message- > From: Robert Haas [mailto:robertmh...@gmail.com] > Sent: Wednesday, March 16, 2016 2:34 AM > To: Amit Langote> Cc: Amit Langote ; SPS ポクレ ヴィナヤック( > 三技術) ; Kyotaro HORIGUCHI > ; pgsql-hackers@postgresql.org; SPS 坂野 > 昌平(三技術) > Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker. > > On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote > wrote: > > On 2016/03/15 3:41, Robert Haas wrote: > >> On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote > wrote: > >>> Instead, the attached patch adds a IndexBulkDeleteProgressCallback > >>> which AMs should call for every block that's read (say, right before > >>> a call to ReadBufferExtended) as part of a given vacuum run. The > >>> callback with help of some bookkeeping state can count each block > >>> and report to pgstat_progress API. Now, I am not sure if all AMs > >>> read 1..N blocks for every vacuum or if it's possible that some > >>> blocks are read more than once in single vacuum, etc. IOW, some > >>> AM's processing may be non-linear and counting blocks 1..N (where N > >>> is reported total index blocks) may not be possible. However, this > >>> is the best I could think of as doing what we are trying to do here. > >>> Maybe index AM experts can chime in on that. > >>> > >>> Thoughts? > >> > >> Well, I think you need to study the index AMs and figure this out. > > > > OK. I tried to put calls to the callback in appropriate places, but > > couldn't get the resulting progress numbers to look sane. So I ended > > up concluding that any attempt to do so is futile unless I analyze > > each AM's vacuum code carefully to be able to determine in advance the > > max bound on the count of blocks that the callback will report. > > Anyway, as you suggest, we can improve it later. > > I don't think there is any way to bound that, because new blocks can get > added to the index concurrently, and we might end up needing to scan > them. Reporting the number of blocks scanned can still be useful, though - > any chance you can just implement that part of it? > > >> But I think for starters you should write a patch that reports the > >> following: > >> > >> 1. phase > >> 2. number of heap blocks scanned > >> 3. number of heap blocks vacuumed > >> 4. number of completed index vac cycles 5. number of dead tuples > >> collected since the last index vac cycle 6. number of dead tuples > >> that we can store before needing to perform an index vac cycle > >> > >> All of that should be pretty straightforward, and then we'd have > >> something we can ship. We can add the detailed index reporting > >> later, when we get to it, perhaps for 9.7. > > > > OK, I agree with this plan. Attached updated patch implements this. > > Sorta. Committed after renaming what you called heap blocks vacuumed > back to heap blocks scanned, adding heap blocks vacuumed, removing the > overall progress meter which I don't believe will be anything close to > accurate, fixing some stylistic stuff, arranging to update multiple counters > automatically where it could otherwise produce confusion, moving the new > view near similar ones in the file, reformatting it to follow the style of the > rest of the file, exposing the counter #defines via a header file in case > extensions want to use them, and overhauling and substantially expanding > the documentation. Thank you for committing this feature. There is one minor bug. s/ pgstat_progress_update_params/ pgstat_progress_update_multi_param/g Attached patch fixes a minor bug. Regards, Vinayak Pokale pgstat_progress_function-typo.patch Description: pgstat_progress_function-typo.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi Amit, Thank you for updating the patch. > -Original Message- > From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp] > Sent: Thursday, March 10, 2016 5:09 PM > To: SPS ポクレ ヴィナヤック(三技術); > robertmh...@gmail.com > Cc: horiguchi.kyot...@lab.ntt.co.jp; amitlangot...@gmail.com; pgsql- > hack...@postgresql.org; SPS 坂野 昌平(三技術) > Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker. > > > Hi Vinayak, > > Thanks for the quick review! > > On 2016/03/10 16:22, poku...@pm.nttdata.co.jp wrote: > >> On 2016/03/10 14:29, Amit Langote wrote: > >> Updated patches attached. > > In 0002- > > [ snip ] > > > I think we need to use datid instead of datname. > > Robert added datid in pg_stat_get_progress_info() and we are using that > function here. > > +values[1] = ObjectIdGetDatum(beentry->st_databaseid); > > [ snip ] > > > So I think it's better to report datid not datname. > > The definition of view is simply like: > > +CREATE VIEW pg_stat_progress_vacuum AS > > +SELECT > > +S.pid AS pid, > > +S.datid AS datid, > > +S.relid AS relid, > > +CASE S.param1 > > +WHEN 1 THEN 'scanning heap' > > +WHEN 2 THEN 'vacuuming indexes' > > +WHEN 3 THEN 'vacuuming heap' > > +WHEN 4 THEN 'cleanup' > > +ELSE 'unknown phase' > > +END AS processing_phase, > > +S.param2 AS total_heap_blocks, > > +S.param3 AS current_heap_block, > > +S.param4 AS total_index_blocks, > > +S.param5 AS index_blocks_done, > > +S.param6 AS index_vacuum_count, > > +CASE S.param2 > > +WHEN 0 THEN round(100.0, 2) > > + ELSE round((S.param3 + 1) * 100.0 / S.param2, 2) > > +END AS percent_done > > +FROM pg_stat_get_progress_info('VACUUM') AS S; > > > > So maybe we can add datname as separate column in > pg_stat_progress_vacuum, I think it's not required only datid is sufficient. > > Any comment? > > Why do you think showing the name may be unacceptable? Wouldn't that > be a little more user-friendly? Though maybe, we can follow the > pg_stat_activity style and have both instead, as you suggest. Attached > updated version does that. +1 I think reporting both (datid and datname) is more user-friendly. Thank you. Regards, Vinayak Pokale -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi Amit, Thank you for updating the patch. > -Original Message- > From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp] > Sent: Thursday, March 10, 2016 3:36 PM > To: Robert Haas> Cc: Kyotaro HORIGUCHI ; Amit Langote > ; SPS ポクレ ヴィナヤック(三技術) > ; pgsql-hackers@postgresql.org; SPS 坂野 昌 > 平(三技術) > Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker. > > On 2016/03/10 14:29, Amit Langote wrote: > > I rebased remainder patches (attached). > > > > 0001 is a small patch to fix issues reported by Tomas and Vinayak. > > 0002 and 0003 are WIP patches to implement progress reporting for > vacuum. > > Oops, in 0002, I wrongly joined with pg_class in the definition of > pg_stat_progress_vacuum to output the schema-qualified name of the table > being vacuumed. That means we need to connect to the correct database, > which is undesirable. Updated version fixes that (shows database name and > relid). You may also have noticed that I said pg_stat_progress_vacuum, not > pg_stat_vacuum_progress (IMHO, the former is a better name). > > Updated patches attached. In 0002- +CREATE VIEW pg_stat_progress_vacuum AS +SELECT +S.pid AS pid, +D.datname AS database, +S.relid AS relid, . . . . +FROM pg_database D, pg_stat_get_progress_info('VACUUM') AS S +WHERE S.datid = D.oid; I think we need to use datid instead of datname. Robert added datid in pg_stat_get_progress_info() and we are using that function here. +values[1] = ObjectIdGetDatum(beentry->st_databaseid); +DATA(insert OID = 3318 ( pg_stat_get_progress_info PGNSP PGUID 12 1 100 0 0 f f f f f t s r 1 0 2249 "25" "{25,23,26,26,20,20,20,20,20,20,20,20,20,20}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}" _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ )); So I think it's better to report datid not datname. The definition of view is simply like: +CREATE VIEW pg_stat_progress_vacuum AS +SELECT +S.pid AS pid, +S.datid AS datid, +S.relid AS relid, +CASE S.param1 +WHEN 1 THEN 'scanning heap' +WHEN 2 THEN 'vacuuming indexes' +WHEN 3 THEN 'vacuuming heap' +WHEN 4 THEN 'cleanup' +ELSE 'unknown phase' +END AS processing_phase, +S.param2 AS total_heap_blocks, +S.param3 AS current_heap_block, +S.param4 AS total_index_blocks, +S.param5 AS index_blocks_done, +S.param6 AS index_vacuum_count, +CASE S.param2 +WHEN 0 THEN round(100.0, 2) + ELSE round((S.param3 + 1) * 100.0 / S.param2, 2) +END AS percent_done +FROM pg_stat_get_progress_info('VACUUM') AS S; In the pg_stat_activity view, datid and datname are the separate columns. So maybe we can add datname as separate column in pg_stat_progress_vacuum, but I think it's not required only datid is sufficient. Any comment? Regards, Vinayak Pokale -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi, Thank you very much for committing this feature. > -Original Message- > From: Robert Haas [mailto:robertmh...@gmail.com] > Sent: Thursday, March 10, 2016 2:17 AM > To: Amit Langote> Cc: Kyotaro HORIGUCHI ; Amit Langote > ; SPS ポクレ ヴィナヤック(三技術) > ; pgsql-hackers@postgresql.org; SPS 坂野 昌 > 平(三技術) > Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker. > > On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote > wrote: > > On 2016/03/09 10:11, Amit Langote wrote: > >> The attached revision addresses above and one of Horiguchi-san's > >> comments in his email yesterday. > > > > I fixed one more issue in 0002 per Horiguchi-san's comment. Sorry > > about so many versions. > > I've committed 0001 with heavy revisions. Just because we don't need an > SQL-visible function to clear the command progress doesn't mean we don't > need to clear it at all; rather, it has to happen automatically. > I also did a bunch of identifier renaming, added datid to the view output, > adjusted the comments, and so on. Please rebase the remainder of the > series. Thanks. Some minor typos need to fix. +/*--- + * pgstat_progress_start_command() - + * + * Set st_command in own backend entry. Also, zero-initialize + * st_progress_param array. + *--- + */ In the description we need to use st_progress_command instead of st_command. +/*--- + * pgstat_progress_end_command() - + * + * Update index'th member in st_progress_param[] of own backend entry. + *--- + */ Here also need to change the description. Regards, Vinayak Pokale -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi Amit, > -Original Message- > From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp] > Sent: Wednesday, March 09, 2016 4:29 PM > To: Kyotaro HORIGUCHI> Cc: robertmh...@gmail.com; amitlangot...@gmail.com; SPS ポクレ ヴィ > ナヤック(三技術) ; pgsql- > hack...@postgresql.org; SPS 坂野 昌平(三技術) > Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker. > > > On 2016/03/08 18:19, Kyotaro HORIGUCHI wrote: > >> +WHEN 0 THEN 100::numeric(5, 2) > >> +ELSE ((S.param3 + 1)::numeric / S.param2 * > 100)::numeric(5, 2) > >> > >> This usage of numeric seems overkill to me. > > > > Hmm, how could this rather be written? > > OK, agreed about the overkill. Following might be better: > > + WHEN 0 THEN round(100.0, 2) > + ELSE round((S.param3 + 1) * 100.0 / S.param2, 2) +1 > Will update that patch. > > Thanks, > Amit > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hi Amit, Thank you for updating the patch. I am testing it and I will try to improve it. Regards, Vinayak > -Original Message- > From: Amit Langote [mailto:amitlangot...@gmail.com] > Sent: Saturday, March 05, 2016 4:41 PM > To: Robert Haas> Cc: SPS ポクレ ヴィナヤック(三技術) ; > Amit Langote ; Kyotaro HORIGUCHI > ; pgsql-hackers@postgresql.org; SPS 坂野 > 昌平(三技術) > Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker. > > On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote > wrote: > > So, I took the Vinayak's latest patch and rewrote it a little > ... > > I broke it into two: > > > > 0001-Provide-a-way-for-utility-commands-to-report-progres.patch > > 0002-Implement-progress-reporting-for-VACUUM-command.patch > > Oops, unamended commit messages in those patches are misleading. So, > please find attached corrected versions. > > Thanks, > Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello, Thank you for your comments. Please find attached patch addressing following comments. >As I might have written upthread, transferring the whole string >as a progress message is useless at least in this scenario. Since >they are a set of fixed messages, each of them can be represented >by an identifier, an integer number. I don't see a reason for >sending the whole of a string beyond a backend. Agreed. I used following macros. #define VACUUM_PHASE_SCAN_HEAP 1 #define VACUUM_PHASE_VACUUM_INDEX_HEAP 2 >I guess num_index_scans could better be reported after all the indexes are >done, that is, after the for loop ends. Agreed. I have corrected it. > CREATE VIEW pg_stat_vacuum_progress AS > SELECT S.s[1] as pid, > S.s[2] as relid, > CASE S.s[3] >WHEN 1 THEN 'Scanning Heap' >WHEN 2 THEN 'Vacuuming Index and Heap' >ELSE 'Unknown phase' > END, > > FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S; > > # The name of the function could be other than *_command_progress. The name of function is updated as pg_stat_get_progress_info() and also updated the function. Updated the pg_stat_vacuum_progress view as suggested. Regards, Vinayak Vacuum_progress_checker_v12.patch Description: Vacuum_progress_checker_v12.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello, Please find attached updated patch. >The point of having pgstat_report_progress_update_counter() is so that >you can efficiently update a single counter without having to update >everything, when only one counter has changed. But here you are >calling this function a whole bunch of times in a row, which >completely misses the point - if you are updating all the counters, >it's more efficient to use an interface that does them all at once >instead of one at a time. The pgstat_report_progress_update_counter() is called at appropriate places in the attached patch. >So I've spent a fair amount of time debugging really-long-running >VACUUM processes with customers, and generally what I really want to >know is: What block number are we at? <<< Agreed. The attached patch reported current block number. Regards, Vinayak Vacuum_progress_checker_v11.patch Description: Vacuum_progress_checker_v11.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers