enhance the efficiency of migrating particularly large tables

David Zhang Mon, 08 Apr 2024 14:52:44 -0700

Hi Postgres hackers,

I'm reaching out to gather some comments on enhancing the efficiency ofmigrating particularly large tables with significant data volumes inPostgreSQL.

When migrating a particularly large table with a significant amount ofdata, users sometimes tend to split the table into multiple segments andutilize multiple sessions to process data from different segments inparallel, aiming to enhance efficiency. When segmenting a large table,it's challenging if the table lacks fields suitable for segmentation orif the data distribution is uneven. I believe that the data volume ineach block should be relatively balanced when vacuum is enabled.Therefore, the ctid can be used to segment a large table, and I amthinking the entire process can be outlined as follows:

1) determine the minimum and maximum ctid.

2) calculate the number of data blocks based on the maximum and minimumctid.3) generate multiple SQL queries, such as SELECT * FROM tbl WHERE ctid>= '(xx,1)' AND ctid < '(xxx,1)'.

However, when executing SELECT min(ctid) and max(ctid), it performs aSeq Scan, which can be slow for a large table. Is there a way toretrieve the minimum and maximum ctid other than using the systemfunctions min() and max()?

Since the minimum and maximum ctid are in order, theoretically, itshould start searching from the first block and can stop as soon as itfinds the first available one when retrieving the minimum ctid.Similarly, it should start searching in reverse order from the lastblock and stop upon finding the first occurrence when retrieving themaximum ctid. Here's a piece of code snippet:


        /* scan the relation for minimum or maximum ctid */
        if (find_max_ctid)
            dir = BackwardScanDirection;
        else
            dir = ForwardScanDirection;

        while ((tuple = heap_getnext(scan, dir)) != NULL)
        ...

The attached is a simple POC by referring to the extension pgstattuple.Any feedback, suggestions, or alternative solutions from the communitywould be greatly appreciated.


Thank you,

David

#include "postgres.h"
#include "fmgr.h"

#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/primnodes.h"
#include "utils/relcache.h"
#include "catalog/namespace.h"
#include "catalog/pg_am_d.h"
#include "utils/varlena.h"
#include "storage/bufmgr.h"
#include "utils/snapmgr.h"
#include "access/heapam.h"
#include "access/relscan.h"
#include "access/tableam.h"


PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(get_ctid);


Datum
get_ctid(PG_FUNCTION_ARGS)
{
    bool                find_max_ctid = 0;
    text                *relname;
        RangeVar        *relrv;
        Relation        rel;
        char            ctid[32] = {0};

        if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
                ereport(ERROR,
                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                         errmsg("two parameters are required")));

        relname = PG_GETARG_TEXT_PP(0);
        find_max_ctid = PG_GETARG_UINT32(1);

        /* open relation */
        relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
        rel = relation_openrv(relrv, AccessShareLock);

        if (RELATION_IS_OTHER_TEMP(rel))
                ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("cannot access temporary tables of 
other sessions")));

        if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind) ||
                        rel->rd_rel->relkind == RELKIND_SEQUENCE)
        {
                TableScanDesc scan;
                HeapScanDesc hscan;
                HeapTuple       tuple;
                SnapshotData SnapshotDirty;
                BlockNumber blockNumber;
                OffsetNumber offsetNumber;
                ScanDirection dir;

                if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
                        ereport(ERROR,
                                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                         errmsg("only heap AM is supported")));

                /* Disable syncscan because we assume we scan from block zero 
upwards */
                scan = table_beginscan_strat(rel, SnapshotAny, 0, NULL, true, 
false);
                hscan = (HeapScanDesc) scan;

                InitDirtySnapshot(SnapshotDirty);

                /* scan the relation for minimum or maximum ctid */
                if (find_max_ctid)
                        dir = BackwardScanDirection;
                else
                        dir = ForwardScanDirection;

                while ((tuple = heap_getnext(scan, dir)) != NULL)
                {
                        CHECK_FOR_INTERRUPTS();

                        /* must hold a buffer lock to call 
HeapTupleSatisfiesVisibility */
                        LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);

                        if (HeapTupleSatisfiesVisibility(tuple, &SnapshotDirty, 
hscan->rs_cbuf))
                        {
                                blockNumber = 
ItemPointerGetBlockNumberNoCheck(&tuple->t_self);
                                offsetNumber = 
ItemPointerGetOffsetNumberNoCheck(&tuple->t_self);

                                /* Perhaps someday we should output this as a 
record. */
                                snprintf(ctid, sizeof(ctid), "(%u,%u)", 
blockNumber, offsetNumber);
                                LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_UNLOCK);
                                break;
                        }
                        LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_UNLOCK);
                }

                table_endscan(scan);
                relation_close(rel, AccessShareLock);

                PG_RETURN_CSTRING(pstrdup(ctid));
        }
        else
        {
                ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("cannot get tuple-level statistics for 
relation \"%s\"",
                                                RelationGetRelationName(rel)),
                                 
errdetail_relkind_not_supported(rel->rd_rel->relkind)));
        }

        PG_RETURN_CSTRING(pstrdup("(0,0)"));    /* should not happen */
}

enhance the efficiency of migrating particularly large tables

Reply via email to