[PATCHES] pg_dump: multiple tables, schemas with exclusions and wildcards

2006-07-16 Thread Greg Sabino Mullane
Here's the latest pg_dump patch I've been (too sporadically) working on.
I abandoned building linked lists and decided to make the backend do all
the work, from building the list of good relations, to doing the POSIX
regex matching. I've added numerous examples to the docs, but it may
still need some more explaining. It should be nearly 100% backwards
compatible with any existing scripts that use a single -t as well.

--
Greg Sabino Mullane [EMAIL PROTECTED]
End Point Corporation
PGP Key: 0x14964AC8 200607162215
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

Index: doc/TODO
===
RCS file: /projects/cvsroot/pgsql/doc/TODO,v
retrieving revision 1.1914
diff -u -r1.1914 TODO
--- doc/TODO	10 Jul 2006 15:44:31 -	1.1914
+++ doc/TODO	17 Jul 2006 02:10:18 -
@@ -795,7 +795,7 @@
 
 * pg_dump
 
-	o %Allow pg_dump to use multiple -t and -n switches  [pg_dump]
+	o -Allow pg_dump to use multiple -t and -n switches  (Greg)
 
 	  http://archives.postgresql.org/pgsql-patches/2006-01/msg00190.php
 
Index: doc/src/sgml/ref/pg_dump.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.86
diff -u -r1.86 pg_dump.sgml
--- doc/src/sgml/ref/pg_dump.sgml	13 May 2006 17:10:35 -	1.86
+++ doc/src/sgml/ref/pg_dump.sgml	17 Jul 2006 02:10:18 -
@@ -398,18 +398,66 @@
   

 Dump data for table
-only. It is possible for there to be
-multiple tables with the same name in different schemas; if that
-is the case, all matching tables will be dumped.  Specify both
---schema and --table to select just one table.
+only. It is possible for there to be multiple tables with the same 
+name in different schemas; if that is the case, all matching tables 
+will be dumped. The argument can also contain POSIX regular expressions 
+which may match more than one table. Using a "+" sign at the end of the 
+name will force a regular expression search. For example, 
+-t employee will dump only tables named exactly "employee", 
+while -t employee+ will dump all tables with the letters
+"employee" inside of it.

 
+   
+	   The options -t, -T, -n, and -N 
+   can be used together to achieve a high degree of control over what will get 
+   dumped. Multiple arguments can be used, and are parsed in the order they are 
+   given to build a list of vaid tables and schemas. The schema options are 
+   parsed first to create a list of schemas to dump, and then the table options 
+   are parsed to only find tables in the matching schemas.
+   
+
+   Examples
+
+   To dump a single table named pg_class:
+
+
+$ pg_dump -t pg_class mydb > db.out
+
+   
+
+   To dump all tables starting with employee in the 
+   detroit schema, except for the table named employee_log:
+
+
+$ pg_dump -n detroit -t ^employee -T employee_log mydb > db.out
+
+   
+
+   To dump all schemas starting with east or west and ending in 
+   gsm, but not schemas that contain the letters test, except for 
+   one named east_alpha_test_five:
+
+
+$ pg_dump -n "^(east|west).*gsm$" -N test -n east_alpha_test_five mydb > db.out
+
+   
+
+
+   To dump all tables except for those beginning with ts_:
+
+
+$ pg_dump -T "^ts_" mydb > db.out
+
+   
+
+

 
  In this mode, pg_dump makes no
- attempt to dump any other database objects that the selected table
+ attempt to dump any other database objects that the selected tables
  may depend upon. Therefore, there is no guarantee
- that the results of a single-table dump can be successfully
+ that the results of a specific-table dump can be successfully
  restored by themselves into a clean database.
 

@@ -417,6 +465,46 @@
  
 
  
+  -T table
+  --exclude-table=table
+  
+   
+Do not dump any matching tables.
+		More than one option may be used, and POSIX regular expressions are allowed. 
+See -t for examples.
+   
+
+   
+  
+ 
+
+ 
+  -n schema
+  --schema=schema
+  
+   
+Dump only the matching schemas.
+		More than one option may be used, and POSIX regular expressions are allowed. 
+See -t for examples.
+   
+
+  
+ 
+
+ 
+  -N schema
+  --exclude-schema=schema
+  
+   
+Do not dump the matching schemas.
+		More than one option may be used, and POSIX regular expressions are allowed. 
+See -t for examples.
+   
+
+  
+ 
+
+ 
   -v
   --verbose
   
Index: src/bin/pg_dump/common.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/pg_dump/common.c,v
retrie

Re: [PATCHES] Have psql display names and OUT/INOUT in \df output

2006-07-16 Thread Neil Conway
On Sun, 2006-07-16 at 07:18 -0700, David Fetter wrote:
> Thanks for clarifying this.  Patch attached.

Applied, with one more fix: format_type's second argument should be NULL
if it is not otherwise known. Thanks for the patch.

(Speaking of which, there is probably room for a one-parameter version
of format_type() that just calls the normal version with a NULL for the
second parameter...)

-Neil



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Restartable Recovery

2006-07-16 Thread Simon Riggs
On Sun, 2006-07-16 at 10:51 -0400, Tom Lane wrote:
> Andreas Seltenreich <[EMAIL PROTECTED]> writes:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> >> [2. text/x-patch; restartableRecovery.patch]
> 
> > Hmm, wouldn't you have to reboot the resource managers at each
> > checkpoint? I'm afraid otherwise things like postponed page splits
> > could get lost on restart from a later checkpoint.
> 
> Ouch.  That's a bit nasty.  You can't just apply a postponed split at
> checkpoint time, because the WAL record could easily be somewhere after
> the checkpoint, leading to duplicate insertions.  Right offhand I don't
> see how to make this work :-(

Yes, ouch. So much for gung-ho code sprints; thanks Andreas.

To do this we would need to have another rmgr specific routine that gets
called at a recovery checkpoint. This would then write to disk the
current state of the incomplete multi-WAL actions, in some manner.
During the startup routines we would check for any pre-existing state
files and use those to initialise the incomplete action cache. Cleanup
would then discard all state files. 

That allows us to not-forget actions, but it doesn't help us if there
are problems repeating actions twice. We would at least know that we are
in a potential double-action zone and could give different kinds of
errors or handling.

Or we can simply mark any indexes incomplete-needs-rebuild if they had a
page split during the overlap time between the last known good recovery
checkpoint and the following one. But that does lead to randomly bounded
recovery time, which might be better to have started from scratch
anyway.

Given time available for 8.2, neither one is a quick fix.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Restartable Recovery

2006-07-16 Thread Tom Lane
Andreas Seltenreich <[EMAIL PROTECTED]> writes:
> Simon Riggs <[EMAIL PROTECTED]> writes:
>> [2. text/x-patch; restartableRecovery.patch]

> Hmm, wouldn't you have to reboot the resource managers at each
> checkpoint? I'm afraid otherwise things like postponed page splits
> could get lost on restart from a later checkpoint.

Ouch.  That's a bit nasty.  You can't just apply a postponed split at
checkpoint time, because the WAL record could easily be somewhere after
the checkpoint, leading to duplicate insertions.  Right offhand I don't
see how to make this work :-(

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Have psql display names and OUT/INOUT in \df output

2006-07-16 Thread David Fetter
On Sun, Jul 16, 2006 at 05:16:58AM -0700, Neil Conway wrote:
> On Sun, 2006-07-16 at 01:00 -0700, David Fetter wrote:
> > On Sun, Jul 16, 2006 at 12:21:12AM -0700, Neil Conway wrote:
> > > You need to schema-qualify references to builtin functions, to
> > > avoid accidentally using functions of the same name that appear
> > > earlier in the user's search path.
> > 
> > I don't understand what you mean here.
> 
> For example, you shouldn't be using "generate_series" in the SQL
> query: if the user has a function of the same name earlier in their
> search path, psql will invoke the wrong function. Instead, you
> should call "pg_catalog.generate_series", as psql was careful to do
> prior to the patch.

Thanks for clarifying this.  Patch attached.

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Index: src/bin/psql/describe.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/psql/describe.c,v
retrieving revision 1.140
diff -c -r1.140 describe.c
*** src/bin/psql/describe.c 14 Jun 2006 16:49:02 -  1.140
--- src/bin/psql/describe.c 16 Jul 2006 14:18:08 -
***
*** 170,177 
  "SELECT n.nspname as \"%s\",\n"
  "  p.proname as \"%s\",\n"
  "  CASE WHEN p.proretset THEN 'setof 
' ELSE '' END ||\n"
! "  pg_catalog.format_type(p.prorettype, NULL) 
as \"%s\",\n"
! "  
pg_catalog.oidvectortypes(p.proargtypes) as \"%s\"",
  _("Schema"), _("Name"), _("Result 
data type"),
  _("Argument data types"));
  
--- 170,204 
  "SELECT n.nspname as \"%s\",\n"
  "  p.proname as \"%s\",\n"
  "  CASE WHEN p.proretset THEN 'setof 
' ELSE '' END ||\n"
! "  
pg_catalog.format_type(p.prorettype, NULL) as \"%s\",\n"
! "  CASE WHEN proallargtypes IS NOT 
NULL THEN\n"
! "
pg_catalog.array_to_string(ARRAY(\n"
! "  SELECT\n"
! "CASE\n"
! "  WHEN p.proargmodes[s.i] = 
'i' THEN ''\n"
! "  WHEN p.proargmodes[s.i] = 
'o' THEN 'OUT '\n"
! "  WHEN p.proargmodes[s.i] = 
'b' THEN 'INOUT '\n"
! "END ||\n"
! "CASE\n"
! "  WHEN p.proargnames[s.i] = 
'' OR p.proargnames[s.i] IS NULL THEN ''\n"
! "  ELSE p.proargnames[s.i] || 
' ' \n"
! "END ||\n"
! "
pg_catalog.format_type(p.proallargtypes[s.i],-1)\n"
! "  FROM\n"
! "
pg_catalog.generate_series(1,pg_catalog.array_upper(p.proallargtypes,1)) AS 
s(i)\n"
! "), ', ')\n"
! "  ELSE\n"
! "
pg_catalog.array_to_string(ARRAY(\n"
! "  SELECT\n"
! "CASE\n"
! "  WHEN p.proargnames[s.i+1] 
= '' OR p.proargnames[s.i+1] IS NULL THEN ''\n"
! "  ELSE p.proargnames[s.i+1] 
|| ' '\n"
! "  END ||\n"
! "
pg_catalog.format_type(p.proargtypes[s.i],-1)\n"
! "  FROM\n"
! "
pg_catalog.generate_series(0,pg_catalog.array_upper(p.proargtypes,1)) AS s(i)\n"
! "), ', ')\n"
! "  END AS  \"%s\"",
  _("Schema"), _("Name"), _("Result 
data type"),
  _("Argument data types"));
  

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Restartable Recovery

2006-07-16 Thread Andreas Seltenreich
Simon Riggs <[EMAIL PROTECTED]> writes:

> [2. text/x-patch; restartableRecovery.patch]

Hmm, wouldn't you have to reboot the resource managers at each
checkpoint? I'm afraid otherwise things like postponed page splits
could get lost on restart from a later checkpoint.

regards,
andreas

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Have psql display names and OUT/INOUT in \df output

2006-07-16 Thread Neil Conway
On Sun, 2006-07-16 at 01:00 -0700, David Fetter wrote:
> On Sun, Jul 16, 2006 at 12:21:12AM -0700, Neil Conway wrote:
> > You need to schema-qualify references to builtin functions, to avoid
> > accidentally using functions of the same name that appear earlier in
> > the user's search path.
> 
> I don't understand what you mean here.

For example, you shouldn't be using "generate_series" in the SQL query:
if the user has a function of the same name earlier in their search
path, psql will invoke the wrong function. Instead, you should call
"pg_catalog.generate_series", as psql was careful to do prior to the
patch.

-Neil



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Have psql display names and OUT/INOUT in \df output

2006-07-16 Thread David Fetter
On Sun, Jul 16, 2006 at 12:21:12AM -0700, Neil Conway wrote:
> On Sat, 2006-07-15 at 23:16 -0700, David Fetter wrote:
> > Anyhow, please find enclosed the context-style diff.
> 
> How carefully did you test this?

Not enough.  Here's the latest.

> postgres=# \df abc
>List of functions
>  Schema | Name | Result data type | Argument data types 
> +--+--+-
>  public | abc  | integer  | a integer,b integer
> (1 row)
> 
> (The argument list should be separated by both a comma and whitespace.)
> 
> postgres=# create or replace function xyz(inout a int, inout int)
> returns record as 'select (1, 2);' language sql;
> CREATE FUNCTION
> postgres=# \df xyz
>   List of functions
>  Schema | Name | Result data type |   Argument data types
> +--+--+--
>  public | xyz  | record   |  INOUT a integer, INOUT  integer
> (1 row)

These are now fixed.

> (Spurious whitespace for the unnamed INOUT parameter.)
> 
> You need to schema-qualify references to builtin functions, to avoid
> accidentally using functions of the same name that appear earlier in
> the user's search path.

I don't understand what you mean here.  The schema name comes with
both versions of \df, just as it did before.

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Index: src/bin/psql/describe.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/psql/describe.c,v
retrieving revision 1.140
diff -c -r1.140 describe.c
*** src/bin/psql/describe.c 14 Jun 2006 16:49:02 -  1.140
--- src/bin/psql/describe.c 16 Jul 2006 07:59:48 -
***
*** 170,177 
  "SELECT n.nspname as \"%s\",\n"
  "  p.proname as \"%s\",\n"
  "  CASE WHEN p.proretset THEN 'setof 
' ELSE '' END ||\n"
! "  pg_catalog.format_type(p.prorettype, NULL) 
as \"%s\",\n"
! "  
pg_catalog.oidvectortypes(p.proargtypes) as \"%s\"",
  _("Schema"), _("Name"), _("Result 
data type"),
  _("Argument data types"));
  
--- 170,204 
  "SELECT n.nspname as \"%s\",\n"
  "  p.proname as \"%s\",\n"
  "  CASE WHEN p.proretset THEN 'setof 
' ELSE '' END ||\n"
! "  
pg_catalog.format_type(p.prorettype, NULL) as \"%s\",\n"
! "  CASE WHEN proallargtypes IS NOT 
NULL THEN\n"
! "array_to_string(ARRAY(\n"
! "  SELECT\n"
! "CASE\n"
! "  WHEN p.proargmodes[s.i] = 
'i' THEN ''\n"
! "  WHEN p.proargmodes[s.i] = 
'o' THEN 'OUT '\n"
! "  WHEN p.proargmodes[s.i] = 
'b' THEN 'INOUT '\n"
! "END ||\n"
! "CASE\n"
! "  WHEN p.proargnames[s.i] = 
'' OR p.proargnames[s.i] IS NULL THEN ''\n"
! "  ELSE p.proargnames[s.i] || 
' ' \n"
! "END ||\n"
! "
format_type(p.proallargtypes[s.i],-1)\n"
! "  FROM\n"
! "
generate_series(1,array_upper(p.proallargtypes,1)) AS s(i)\n"
! "), ', ')\n"
! "  ELSE\n"
! "array_to_string(ARRAY(\n"
! "  SELECT\n"
! "CASE\n"
! "  WHEN p.proargnames[s.i+1] 
= '' OR p.proargnames[s.i+1] IS NULL THEN ''\n"
! "  ELSE p.proargnames[s.i+1] 
|| ' '\n"
! "  END ||\n"
! "
format_type(p.proargtypes[s.i],-1)\n"
! "  FROM\n"
! "
generate_series(0,array_upper(p.proargtypes,1)) AS s(i)\n"
!  

Re: [PATCHES] Have psql display names and OUT/INOUT in \df output

2006-07-16 Thread Neil Conway
On Sat, 2006-07-15 at 23:16 -0700, David Fetter wrote:
> Anyhow, please find enclosed the context-style diff.

How carefully did you test this?

postgres=# \df abc
   List of functions
 Schema | Name | Result data type | Argument data types 
+--+--+-
 public | abc  | integer  | a integer,b integer
(1 row)

(The argument list should be separated by both a comma and whitespace.)

postgres=# create or replace function xyz(inout a int, inout int)
returns record as 'select (1, 2);' language sql;
CREATE FUNCTION
postgres=# \df xyz
  List of functions
 Schema | Name | Result data type |   Argument data types
+--+--+--
 public | xyz  | record   |  INOUT a integer, INOUT  integer
(1 row)

(Spurious whitespace for the unnamed INOUT parameter.)

You need to schema-qualify references to builtin functions, to avoid
accidentally using functions of the same name that appear earlier in the
user's search path. (As a general rule, look at the surrounding code
carefully before you modify it.)

It would be nice to be consistent about SQL style, at least within a
single query (e.g. don't separate function arguments with whitespace in
some places but not others).

-Neil



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings