On Fri, 8 Mar 2019 at 09:14, Tatsuro Yamada
<[email protected]> wrote:
>
> On 2019/03/06 15:38, Tatsuro Yamada wrote:
> > On 2019/03/05 17:56, Tatsuro Yamada wrote:
> >> On 2019/03/05 11:35, Robert Haas wrote:
> >>> On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
> >>> <[email protected]> wrote:
> >>>> === Current design ===
> >>>>
> >>>> CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
> >>>> Depending on which one is chosen, the command will proceed in the
> >>>> following sequence of phases:
> >>>>
> >>>> * Scan method: Seq Scan
> >>>> 0. initializing (*2)
> >>>> 1. seq scanning heap (*1)
> >>>> 3. sorting tuples (*2)
> >>>> 4. writing new heap (*1)
> >>>> 5. swapping relation files (*2)
> >>>> 6. rebuilding index (*2)
> >>>> 7. performing final cleanup (*2)
> >>>>
> >>>> * Scan method: Index Scan
> >>>> 0. initializing (*2)
> >>>> 2. index scanning heap (*1)
> >>>> 5. swapping relation files (*2)
> >>>> 6. rebuilding index (*2)
> >>>> 7. performing final cleanup (*2)
> >>>>
> >>>> VACUUM FULL command will proceed in the following sequence of phases:
> >>>>
> >>>> 1. seq scanning heap (*1)
> >>>> 5. swapping relation files (*2)
> >>>> 6. rebuilding index (*2)
> >>>> 7. performing final cleanup (*2)
> >>>>
> >>>> (*1): increasing the value in heap_tuples_scanned column
> >>>> (*2): only shows the phase in the phase column
> >>>
> >>> All of that sounds good.
> >>>
> >>>> The view provides the information of CLUSTER command progress details as
> >>>> follows
> >>>> # \d pg_stat_progress_cluster
> >>>> View "pg_catalog.pg_stat_progress_cluster"
> >>>> Column | Type | Collation | Nullable | Default
> >>>> ---------------------------+---------+-----------+----------+---------
> >>>> pid | integer | | |
> >>>> datid | oid | | |
> >>>> datname | name | | |
> >>>> relid | oid | | |
> >>>> command | text | | |
> >>>> phase | text | | |
> >>>> cluster_index_relid | bigint | | |
> >>>> heap_tuples_scanned | bigint | | |
> >>>> heap_tuples_vacuumed | bigint | | |
> >>>
> >>> Still not sure if we need heap_tuples_vacuumed. We could try to
> >>> report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
> >>> we're using a Seq Scan.
> >>
> >> I have no strong opinion to add heap_tuples_vacuumed, so I'll remove that
> >> in
> >> next patch.
> >>
> >> Regarding heap_blks_scanned and heap_blks_total, I suppose that it is able
> >> to
> >> get those from initscan(). I'll investigate it more.
> >>
> >> cluster.c
> >> copy_heap_data()
> >> heap_beginscan()
> >> heap_beginscan_internal()
> >> initscan()
> >>
> >>
> >>
> >>>> === Discussion points ===
> >>>>
> >>>> - Progress counter for "3. sorting tuples" phase
> >>>> - Should we add pgstat_progress_update_param() in tuplesort.c like
> >>>> a
> >>>> "trace_sort"?
> >>>> Thanks to Peter Geoghegan for the useful advice!
> >>>
> >>> How would we avoid an abstraction violation?
> >>
> >> Hmm... What do you mean an abstraction violation?
> >> If it is difficult to solve, I'd not like to add the progress counter for
> >> the sorting tuples.
> >>
> >>
> >>>> - Progress counter for "6. rebuilding index" phase
> >>>> - Should we add "index_vacuum_count" in the view like a vacuum
> >>>> progress monitor?
> >>>> If yes, I'll add pgstat_progress_update_param() to
> >>>> reindex_relation() of index.c.
> >>>> However, I'm not sure whether it is okay or not.
> >>>
> >>> Doesn't seem unreasonable to me.
> >>
> >> I see, I'll add it later.
> >
> >
> > Attached file is revised and WIP patch including:
> >
> > - Remove heap_tuples_vacuumed
> > - Add heap_blks_scanned and heap_blks_total
> > - Add index_vacuum_count
> >
> > I tried to "add heap_blks_scanned and heap_blks_total" columns and I
> > realized that
> > "heap_tuples_scanned" column is suitable as a counter when a scan method is
> > both index-scan and seq-scan because CLUSTER is on a tuple basis.
>
>
> Attached file is rebased patch on current HEAD.
> I changed a status. :)
>
>
Looks like the patch needs a rebase.
I was on the commit fb5806533f9fe0433290d84c9b019399cd69e9c2
PFA reject file in case you want to have a look.
> Regards,
> Tatsuro Yamada
>
>
>
--
Regards,
Rafia Sabih
--- src/backend/commands/cluster.c
+++ src/backend/commands/cluster.c
@@ -942,14 +960,33 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
*/
if (OldIndex != NULL && !use_sort)
{
+ const int ci_index[] = {
+ PROGRESS_CLUSTER_PHASE,
+ PROGRESS_CLUSTER_INDEX_RELID
+ };
+ int64 ci_val[2];
+
+ /* Set phase and OIDOldIndex to columns */
+ ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+ ci_val[1] = OIDOldIndex;
+ pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
heapScan = NULL;
indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
index_rescan(indexScan, NULL, 0, NULL, 0);
}
else
{
+ /* Set phase */
+ pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+ PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
indexScan = NULL;
+
+ /* Set total heap blocks */
+ pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+ heapScan->rs_nblocks);
}
/* Log what we're doing */