On Thu, Aug 10, 2017 at 8:26 PM, Robert Haas <[email protected]> wrote:
> On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia
> <[email protected]> wrote:
> >> (1) seems like a pretty arbitrary restriction, so I don't like that
> >> option. (2) would hurt performance in some use cases. Do we have an
> >> option (3)?
> >
> > How about protecting option 2) with the load-via-partition-root
> protection.
> > Means
> > load the parents information even dump is not set only when
> > load-via-partition-root
> > & ispartition.
> >
> > This won't hurt performance for the normal cases.
>
> Yes, that seems like the right approach.
>
> + Dump data via the top-most partitioned table (rather than
> partition
> + table) when dumping data for the partition table.
>
> I think we should phrase this a bit more clearly, something like this:
> When dumping a COPY or INSERT statement for a partitioned table,
> target the root of the partitioning hierarchy which contains it rather
> than the partition itself. This may be useful when reloading data on
> a server where rows do not always fall into the same partitions as
> they did on the original server. This could happen, for example, if
> the partitioning column is of type text and the two system have
> different definitions of the collation used to partition the data.
>
>
Done.
> + printf(_(" --load-via-partition-root load partition table via
> the root relation\n"));
>
> "relation" seems odd to me here. root table?
>
>
Done.
> /* Don't bother computing anything for non-target tables, either
> */
> if (!tblinfo[i].dobj.dump)
> + {
> + /*
> + * Load the parents information for the partition table when
> + * the load-via-partition-root option is set. As we need the
> + * parents information to get the partition root.
> + */
> + if (dopt->load_via_partition_root &&
> + tblinfo[i].ispartition)
> + findParentsByOid(&tblinfo[i], inhinfo, numInherits);
> continue;
> + }
>
> Duplicating the call to findParentsByOid seems less then ideal. How
> about doing something like this:
>
> if (tblinfo[i].dobj.dump)
> {
> find_parents = true;
> mark_parents = true;
> }
> else if (dopt->load_via_partition_root && tblinfo[i].ispartition)
> find_parents = true;
>
> if (find_parents)
> findParentsByOid(&tblinfo[i], inhinfo, numInherits);
>
> etc.
>
>
Done changes to avoid duplicate call to findParentsByOid().
> The comments for this function also need some work - e.g. the function
> header comment deserves some kind of update for these changes.
>
>
Done.
> +static TableInfo *
> +getRootTableInfo(TableInfo *tbinfo)
> +{
> + Assert(tbinfo->ispartition);
> + Assert(tbinfo->numParents == 1);
> + if (tbinfo->parents[0]->ispartition)
> + return getRootTableInfo(tbinfo->parents[0]);
> +
> + return tbinfo->parents[0];
> +}
>
> This code should iterate, not recurse, to avoid any risk of blowing
> out the stack.
>
>
Done.
Please find attach patch with the changes.
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index bafa031..de8297a 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -889,6 +889,21 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ When dumping a COPY or INSERT statement for a partitioned table,
+ target the root of the partitioning hierarchy which contains it rather
+ than the partition itself. This will be useful when reloading data on
+ a server where rows do not always fall into the same partitions as
+ they did on the original server. This could happen, for example, if
+ the partitioning column is of type text and the two system have
+ different definitions of the collation used to partition the data.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
<listitem>
<para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index aa944a2..dc1d3cc 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -431,6 +431,21 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ When dumping a COPY or INSERT statement for a partitioned table,
+ target the root of the partitioning hierarchy which contains it rather
+ than the partition itself. This will be useful when reloading data on
+ a server where rows do not always fall into the same partitions as
+ they did on the original server. This could happen, for example, if
+ the partitioning column is of type text and the two system have
+ different definitions of the collation used to partition the data.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--use-set-session-authorization</></term>
<listitem>
<para>
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 47191be..e7ee069 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -66,7 +66,7 @@ static int numExtensions;
static ExtensionMemberId *extmembers;
static int numextmembers;
-static void flagInhTables(TableInfo *tbinfo, int numTables,
+static void flagInhTables(Archive *fout, TableInfo *tbinfo, int numTables,
InhInfo *inhinfo, int numInherits);
static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables);
static DumpableObject **buildIndexArray(void *objArray, int numObjs,
@@ -243,7 +243,7 @@ getSchemaData(Archive *fout, int *numTablesPtr)
/* Link tables to parents, mark parents of target tables interesting */
if (g_verbose)
write_msg(NULL, "finding inheritance relationships\n");
- flagInhTables(tblinfo, numTables, inhinfo, numInherits);
+ flagInhTables(fout, tblinfo, numTables, inhinfo, numInherits);
if (g_verbose)
write_msg(NULL, "reading column info for interesting tables\n");
@@ -301,12 +301,16 @@ getSchemaData(Archive *fout, int *numTablesPtr)
* This is sufficient; we don't much care whether they inherited their
* attributes or not.
*
+ * When load-via-partition-root is set, load the parents information for the
+ * partition table.
+ *
* modifies tblinfo
*/
static void
-flagInhTables(TableInfo *tblinfo, int numTables,
+flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables,
InhInfo *inhinfo, int numInherits)
{
+ DumpOptions *dopt = fout->dopt;
int i,
j;
int numParents;
@@ -314,6 +318,8 @@ flagInhTables(TableInfo *tblinfo, int numTables,
for (i = 0; i < numTables; i++)
{
+ bool only_find_parents = false;
+
/* Some kinds never have parents */
if (tblinfo[i].relkind == RELKIND_SEQUENCE ||
tblinfo[i].relkind == RELKIND_VIEW ||
@@ -322,11 +328,26 @@ flagInhTables(TableInfo *tblinfo, int numTables,
/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
- continue;
+ {
+ /*
+ * Load the parents information for the partition table when the
+ * load-via-partition-root option is set. As we need the parents
+ * information to get the partition root.
+ */
+ if (dopt->load_via_partition_root &&
+ tblinfo[i].ispartition)
+ only_find_parents = true;
+ else
+ continue;
+ }
/* Find all the immediate parent tables */
findParentsByOid(&tblinfo[i], inhinfo, numInherits);
+ /* Skip mark the parents */
+ if (only_find_parents)
+ continue;
+
/* Mark the parents as interesting for getTableAttrs */
numParents = tblinfo[i].numParents;
parents = tblinfo[i].parents;
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068a..ce3100f 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
int outputNoTablespaces;
int use_setsessauth;
int enable_row_security;
+ int load_via_partition_root;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 393b9e2..1703de8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
const char *prefix, Archive *fout);
static char *get_synchronized_snapshot(Archive *fout);
static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"section", required_argument, NULL, 5},
{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root table\n"));
printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n"));
printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n"));
printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
if (insertStmt == NULL)
{
insertStmt = createPQExpBuffer();
+
+ /*
+ * When load-via-partition-root is set, get the root table
+ * name for the partition table, so that we can reload data
+ * through the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we loading data through the root, we will qualify
+ * the table name. This is needed because earlier
+ * search_path will be set for the partition table.
+ */
+ classname = (char *) fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ classname = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
- fmtId(classname));
+ classname);
/* corner case for zero-column table */
if (nfields == 0)
@@ -2025,6 +2052,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
return 1;
}
+/*
+ * getRootTableInfo:
+ * get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ TableInfo *parentTbinfo;
+
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+
+ parentTbinfo = tbinfo->parents[0];
+ while (parentTbinfo->ispartition)
+ {
+ Assert(parentTbinfo->numParents == 1);
+ parentTbinfo = parentTbinfo->parents[0];
+ }
+
+ return parentTbinfo;
+}
/*
* dumpTableData -
@@ -2041,14 +2089,38 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
PQExpBuffer clistBuf = createPQExpBuffer();
DataDumperPtr dumpFn;
char *copyStmt;
+ const char *copyFrom;
if (!dopt->dump_inserts)
{
/* Dump/restore using COPY */
dumpFn = dumpTableData_copy;
- /* must use 2 steps here 'cause fmtId is nonreentrant */
+
+ /*
+ * When load-via-partition-root is set, get the root table name for
+ * the partition table, so that we can reload data through the root
+ * table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we loading data through the root, we will qualify the
+ * table name. This is needed because earlier search_path will be
+ * set for the partition table.
+ */
+ copyFrom = fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ copyFrom = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(copyBuf, "COPY %s ",
- fmtId(tbinfo->dobj.name));
+ copyFrom);
appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
fmtCopyColumnList(tbinfo, clistBuf),
(tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e..c4741c2 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int no_subscriptions = 0;
static int no_unlogged_table_data = 0;
static int no_role_passwords = 0;
static int server_version;
+static int load_via_partition_root = 0;
static char role_catalog[10];
#define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &no_tablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
if (quote_all_identifiers)
appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+ if (load_via_partition_root)
+ appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root");
if (use_setsessauth)
appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
if (no_publications)
@@ -606,6 +610,7 @@ help(void)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root table\n"));
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers