[GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread John T. Dow
By bad data, I mean a character that's not UTF8, such as hex 98.

As far as I can tell, pg_dump is the tool to use. But it has
serious drawbacks.

If you dump in the custom format, the data is compressed (nice) and
includes large objects (very nice). But, from my tests and the postings of
others, if there is invalid data in a table, although PostgreSQL won't complain
and pg_dump won't complain, pg_restore will strenuously object, rejecting all
rows for that particular table (not nice at all).

If you dump in plain text format, you can at least inspect the dumped
data and fix it manually or with iconv. But the plain text
format doesn't support large objects (again, not nice). While byte arrays
are supported, they result in very large dump files.
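
For what it's worth, the cleanup I have in mind for the plain text route
looks roughly like this (database and file names are made up here; iconv -c
simply drops the bytes it can't convert, so the diff is worth checking
before restoring):

   pg_dump -Fp mydb > mydb.sql                            # plain text dump
   iconv -f UTF-8 -t UTF-8 -c mydb.sql > mydb.clean.sql   # drop bytes that aren't valid UTF-8
   diff mydb.sql mydb.clean.sql                           # see what was thrown away
   psql -d mydb_restored -f mydb.clean.sql                # restore the cleaned dump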

Also, neither of these methods gets information such as the roles, so
that has to be captured some other way if the database has to be rebuilt
from scratch.

Is my understanding incomplete or wrong? Is there no good solution?

Why isn't there a dumpall that writes in compressed format and allows
recovery from bad data?

John


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread Tom Lane
John T. Dow [EMAIL PROTECTED] writes:
> If you dump in plain text format, you can at least inspect the dumped
> data and fix it manually or with iconv. But the plain text
> format doesn't support large objects (again, not nice).

It does in 8.1 and later ...

> Also, neither of these methods gets information such as the roles,

Use pg_dumpall.

regards, tom lane



Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread John T. Dow
Tom

My mistake in not realizing that 8.1 and later can dump large objects in the 
plain text format. I guess when searching for answers to a problem, the posted 
information doesn't always specify the version. So, sorry about that.

But the plain text format still has serious problems: the generated file is
very large when there are byte arrays or large objects, there is no way to
selectively restore a single table, and bad data still isn't detected until
you try to restore.

Or did I miss something else?

John

PS: Yes, I know you can pipe the output from pg_dumpall into an archiver, but 
it's my understanding that the binary data is output in an inefficient format 
so even if zipped, the resulting file would be significantly larger than the 
custom format.





Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread Joshua Drake
On Mon, 25 Aug 2008 10:21:54 -0400
John T. Dow [EMAIL PROTECTED] wrote:

> By bad data, I mean a character that's not UTF8, such as hex 98.
>
> As far as I can tell, pg_dump is the tool to use. But it has
> serious drawbacks.
>
> If you dump in the custom format, the data is compressed (nice) and
> includes large objects (very nice). But, from my tests and the
> postings of others, if there is invalid data in a table, although
> PostgreSQL won't complain and pg_dump won't complain, pg_restore will
> strenuously object, rejecting all rows for that particular table (not
> nice at all).

You can use the TOC feature of -Fc to skip restoring that single
table. You can then convert that single table to a plain text dump and
clean the data. Then restore it separately.

If you have foreign keys and indexes on the bad data table, don't
restore the keys until *after* you have done the above.
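
Roughly, the sequence is something like this (file, database and table
names here are invented):

   pg_dump -Fc mydb > mydb.dump                  # custom-format dump
   pg_restore -l mydb.dump > mydb.toc            # write the TOC out as plain text
   # edit mydb.toc: put a ';' in front of the "TABLE DATA ... badtable" entry
   createdb newdb
   pg_restore -L mydb.toc -d newdb mydb.dump     # restore everything except that table's data
   pg_restore --data-only -t badtable -f badtable.sql mydb.dump
   # clean up badtable.sql (iconv, editor, ...), then:
   psql -d newdb -f badtable.sql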

Sincerely,

Joshua D. Drake

-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread John T. Dow
Joshua

The TOC feature sounds good, as does converting a single table to plain text.

But I can't find documentation for the TOC feature under pg_dump or pg_restore. 
I'm looking in postgresql-8.2.1-US.pdf.

Neither could I see anything about converting a single table to a plain text 
dump.

Also, I stumbled across the statement that you can't restore large objects for 
a single table. Is that true?

Another thing I couldn't find was how to dump roles using -Fc.

John





Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread Joshua Drake
On Mon, 25 Aug 2008 13:37:13 -0400
John T. Dow [EMAIL PROTECTED] wrote:

> Joshua
>
> The TOC feature sounds good, as does converting a single table to
> plain text.
>
> But I can't find documentation for the TOC feature under pg_dump or
> pg_restore. I'm looking in postgresql-8.2.1-US.pdf.

The commands you are looking for are:

pg_restore -l to get the toc
pg_restore -L to use the toc

If you open the resulting file from something like pg_restore -l > foo.toc,
it is just a plain text list of objects to restore.
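
The entries in that file look roughly like this (IDs and names are
invented here):

   1842; 0 16421 TABLE DATA public badtable postgres

and putting a ';' at the start of the line makes pg_restore -L skip it:

   ;1842; 0 16421 TABLE DATA public badtable postgres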

I don't know how well it is documented but I am sure we would accept a
patch.

 
> Neither could I see anything about converting a single table to a
> plain text dump.

pg_restore allows you to do so. Something like:

pg_restore foo.sqlc --file=foo.sql
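
If you only want a single table out of the archive, -t should narrow it
down, something like (table name invented):

pg_restore -t badtable --file=badtable.sql foo.sqlc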

 
> Also, I stumbled across the statement that you can't restore large
> objects for a single table. Is that true?

Large objects are stored in a central table called pg_largeobject, so
yes, that would be accurate.


 
> Another thing I couldn't find was how to dump roles using -Fc.
 

You can't; that is a known and irritating limitation.

Sincerely,

Joshua D. Drake



-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread John T. Dow
Joshua

Thank you very much for answering these various questions.

I guess the compressed format is the best overall solution, except for roles. I 
find myself having a table with other information about users (application 
specific user type, etc) so perhaps the thing to do is record enough 
information there to reconstruct the roles should that become necessary.

Can pg_dump dump roles to plain text? How does pg_dumpall do it? Doesn't
it do everything via pg_dump?

John




Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread Joshua Drake
On Mon, 25 Aug 2008 17:05:53 -0400
John T. Dow [EMAIL PROTECTED] wrote:

> Joshua
>
> Thank you very much for answering these various questions.
>
> I guess the compressed format is the best overall solution, except
> for roles. I find myself having a table with other information about
> users (application specific user type, etc) so perhaps the thing to
> do is record enough information there to reconstruct the roles should
> that become necessary.
>
> Can pg_dump dump roles to plain text? How does pg_dumpall do it,

pg_dumpall -g will dump just roles via plain text.
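
So a common pairing is something like this (database name invented):

   pg_dumpall -g > globals.sql         # roles and other global objects, plain text
   pg_dump -Fc mydb > mydb.dump        # per-database dump, compressed, includes large objects

   # rebuilding from scratch:
   psql -d postgres -f globals.sql
   createdb mydb
   pg_restore -d mydb mydb.dump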

Joshua D. Drake
-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: [GENERAL] Dump/restore with bad data and large objects

2008-08-25 Thread John T. Dow
Sorry, I missed that. Thanks again.

Now to put this all into effect.

John
