Re: [HACKERS] pg_dump --split patch

2012-11-28 Thread Alvaro Herrera
Marko Tiikkaja wrote: On 16/11/2012 15:52, Dimitri Fontaine wrote: What happens if you have a table foo and another table FoO? They would go to the same file. If you think there are technical issues behind that decision (e.g. the dump would not restore), I would like to hear an example

Re: [HACKERS] pg_dump --split patch

2012-11-19 Thread Dimitri Fontaine
Marko Tiikkaja pgm...@joh.to writes: What happens if you have a table foo and another table FoO? They would go to the same file. If you think there are technical issues behind that decision (e.g. the dump would not restore), I would like to hear an example case. I didn't try the patch
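The foo / FoO question above can be checked before a split dump is written. The following is a minimal sketch, not taken from the patch: it assumes the output paths are already known as strings and uses a hypothetical helper, find_case_collisions, to report names that would land in the same file on a case-insensitive filesystem.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that differ only by letter case (hypothetical helper).

    Returns a dict mapping the case-folded path to the sorted list of
    original paths that fold to it, for groups with more than one member.
    """
    groups = defaultdict(list)
    for path in paths:
        # casefold() is a caseless comparison key; real filesystems may
        # apply slightly different rules, so treat this as an approximation.
        groups[path.casefold()].append(path)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    dump_paths = [
        "public/TABLES/foo.sql",
        "public/TABLES/FoO.sql",
        "public/TABLES/bar.sql",
    ]
    for folded, members in find_case_collisions(dump_paths).items():
        print(folded, "<-", members)
```

Whether such a collision is a problem depends on whether both definitions end up in the shared file, which is the point raised in the reply.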

Re: [HACKERS] pg_dump --split patch

2012-11-19 Thread Dimitri Fontaine
Dimitri Fontaine dimi...@2ndquadrant.fr writes: pg_dump | pg_restore pg_export | psql While I agree that this idea - when implemented - would be nicer in practically every way, I'm not sure I want to volunteer to do all the necessary work. What I think needs to happen now is a

Re: [HACKERS] pg_dump --split patch

2012-11-19 Thread Andrew Dunstan
On 11/19/2012 09:07 AM, Dimitri Fontaine wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: pg_dump | pg_restore pg_export | psql While I agree that this idea - when implemented - would be nicer in practically every way, I'm not sure I want to volunteer to do all the necessary

Re: [HACKERS] pg_dump --split patch

2012-11-18 Thread Marko Tiikkaja
Hi, On 16/11/2012 15:52, Dimitri Fontaine wrote: Marko Tiikkaja pgm...@joh.to writes: The general output scheme looks like this: schemaname/OBJECT_TYPES/object_name.sql, I like this feature, I actually did have to code it myself in the past and several other people did so, so we already

Re: [HACKERS] pg_dump --split patch

2012-11-16 Thread Dimitri Fontaine
Hi, Marko Tiikkaja pgm...@joh.to writes: The general output scheme looks like this: schemaname/OBJECT_TYPES/object_name.sql, I like this feature, I actually did have to code it myself in the past and several other people did so, so we already have at least 3 copies of `getddl` variants
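As an illustration of the schemaname/OBJECT_TYPES/object_name.sql scheme quoted above, here is a small sketch. The helper split_path and the TYPE_DIRS mapping are hypothetical and are not part of pg_dump or the patch. TABLES and VIEWS appear elsewhere in the thread, while FUNCTIONS is an assumed directory name.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from object kind to directory name.
# TABLES and VIEWS are mentioned in the thread; FUNCTIONS is an assumption.
TYPE_DIRS = {
    "table": "TABLES",
    "view": "VIEWS",
    "function": "FUNCTIONS",
}


def split_path(schema: str, kind: str, name: str) -> PurePosixPath:
    """Return schemaname/OBJECT_TYPES/object_name.sql for one object."""
    return PurePosixPath(schema) / TYPE_DIRS[kind] / f"{name}.sql"


if __name__ == "__main__":
    print(split_path("public", "table", "accounts"))
```

The sketch ignores quoting of unusual identifiers and filename length limits, both of which the thread treats as open questions.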

Re: [HACKERS] pg_dump --split patch

2012-10-22 Thread Marko Tiikkaja
Hi, Now that the (at least as far as I know) last ordering problem in pg_dump has been solved [1], I'm going to attempt resurrecting this old thread. It seemed to me that the biggest objections to this patch in the old discussions were directed at the implementation, which I have tried to

Re: [HACKERS] pg_dump --split patch

2011-01-22 Thread Robert Haas
On Mon, Jan 3, 2011 at 2:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 3, 2011 at 1:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, that's exactly it.  I can think of some possible uses for splitting up pg_dump output, but frankly to ease

Re: [HACKERS] pg_dump --split patch

2011-01-04 Thread Hannu Krosing
On 28.12.2010 22:44, Joel Jacobson wrote: Sent from my iPhone On 28 dec 2010, at 21:45, Gurjeet Singh singh.gurj...@gmail.com wrote: The problem I see with suffixing a sequence id to the objects with name collision is that one day the dump may name

Re: [HACKERS] pg_dump --split patch

2011-01-04 Thread Hannu Krosing
On 28.12.2010 23:51, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 12/28/2010 04:44 PM, Joel Jacobson wrote: Perhaps abbreviations are to prefer, e.g., myfunc_i, myfunc_i_c, etc to reduce the need of truncating filenames. I think that's just horrible. Does the i stand for

Re: [HACKERS] pg_dump --split patch

2011-01-04 Thread Hannu Krosing
On 28.12.2010 17:00, Joel Jacobson wrote: Dear fellow hackers, Problem: A normal diff of two slightly different schema dump files (pg_dump -s), will not produce a user-friendly diff, as you get all changes in the same file. Another Solution: I have used a python script for splitting dump -s

Re: [HACKERS] pg_dump --split patch

2011-01-04 Thread Greg Smith
Joel Jacobson wrote: To understand a change to my database functions, I would start by looking at the top-level, only focusing on the names of the functions modified/added/removed. At this stage, you want as little information as possible about each change, such as only the names of the

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Dmitry Koterov
To me, this is a wonderful feature, thanks! I think many people would be happy if this patch would be included in the mainstream (and it is quite short and simple). About name ordering - I think that the problem exists for objects: 1. Stored functions. 2. Foreign keys/triggers (objects which has

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Robert Haas
On Mon, Jan 3, 2011 at 7:11 AM, Dmitry Koterov dmi...@koterov.ru wrote: To me, this is a wonderful feature, thanks! I think many people would be happy if this patch would be included in the mainstream (and it is quite short and simple). About name ordering - I think that the problem exists for

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Joel Jacobson
2011/1/3 Robert Haas robertmh...@gmail.com: will become confusing for users and hard for us to maintain.  We're going to need to agree on something that won't be perfect for everyone, but will hopefully be a sufficient improvement for enough people to be worth doing. Good point. I think we

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On the specific issue of overloaded functions, I have a feeling that the only feasible option is going to be to put them all in the same file. If you put them in different files, the names will either be very long (because they'll have to include the

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Joel Jacobson
2011/1/3 Tom Lane t...@sss.pgh.pa.us: pg_dump from dumping objects in a consistent order ... and once you do that, you don't need this patch. Yeah, that's exactly it.  I can think of some possible uses for splitting up pg_dump output, but frankly to ease diff-ing is not one of them.  For that

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Robert Haas
On Mon, Jan 3, 2011 at 1:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On the specific issue of overloaded functions, I have a feeling that the only feasible option is going to be to put them all in the same file.  If you put them in different files, the

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 3, 2011 at 1:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, that's exactly it.  I can think of some possible uses for splitting up pg_dump output, but frankly to ease diff-ing is not one of them.  For that problem, it's nothing but a crude

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Joel Jacobson
Robert Haas robertmh...@gmail.com writes: I have to admit I'm a bit unsold on the approach as well.  It seems like you could write a short Perl script which would transform a text format dump into the proposed format pretty easily, and if you did that and published the script, then the next

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Robert Haas
On Mon, Jan 3, 2011 at 2:46 PM, Joel Jacobson j...@gluefinance.com wrote: My major concern of parsing the schema file is I would never fully trust the output from the script, even if the regex is extremely paranoid and really strict, there is still a risk it contains a bug. That could possibly

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes: I have to admit I'm a bit unsold on the approach as well. It seems like you could write a short Perl script which would transform a text format dump into the proposed format pretty easily, and if you did that and published the script, then the next

Re: [HACKERS] pg_dump --split patch

2011-01-03 Thread Robert Haas
On Mon, Jan 3, 2011 at 3:15 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: On the other hand, I can certainly think of times when even a pretty dumb implementation of this would have saved me some time. You mean like those:  https://labs.omniti.com/labs/pgtreats/wiki/getddl  

Re: [HACKERS] pg_dump --split patch

2011-01-01 Thread Peter Eisentraut
On tis, 2010-12-28 at 12:33 -0500, Tom Lane wrote: (2) randomly different ordering of rows within a table. Your patch didn't address that, unless I misunderstood quite a bit. This issue here is just comparing schemas, so that part is a separate problem for someone else. I think the correct

Re: [HACKERS] pg_dump --split patch

2011-01-01 Thread Peter Eisentraut
On tis, 2010-12-28 at 20:51 -0500, Andrew Dunstan wrote: try: diff -F '^CREATE' ... This works about 67% of the time and still doesn't actually tell at a glance what changed. It will only tell you what the change you are currently looking at probably belongs to.

Re: [HACKERS] pg_dump --split patch

2010-12-30 Thread Robert Treat
On Thu, Dec 30, 2010 at 2:13 AM, Joel Jacobson j...@gluefinance.com wrote: 2010/12/29 Dimitri Fontaine dimi...@2ndquadrant.fr Please have a look at getddl: https://github.com/dimitri/getddl Nice! Looks like a nifty tool. When I tried it, ./getddl.py -f -F /crypt/funcs -d glue, I got the

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Aidan Van Dyk
On Wed, Dec 29, 2010 at 2:27 AM, Joel Jacobson j...@gluefinance.com wrote: description of split stuff So, how different (or not) is this to the directory format that was coming out of the desire of a parallel pg_dump? a.

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Joel Jacobson
2010/12/29 Aidan Van Dyk ai...@highrise.ca On Wed, Dec 29, 2010 at 2:27 AM, Joel Jacobson j...@gluefinance.com wrote: description of split stuff So, how different (or not) is this to the directory format that was coming out of the desire of a parallel pg_dump? Not sure what format you

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Gurjeet Singh
On Wed, Dec 29, 2010 at 8:31 AM, Joel Jacobson j...@gluefinance.com wrote: 2010/12/29 Aidan Van Dyk ai...@highrise.ca On Wed, Dec 29, 2010 at 2:27 AM, Joel Jacobson j...@gluefinance.com wrote: description of split stuff So, how different (or not) is this to the directory format that was

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Aidan Van Dyk
On Wed, Dec 29, 2010 at 9:11 AM, Gurjeet Singh singh.gurj...@gmail.com wrote: On Wed, Dec 29, 2010 at 8:31 AM, Joel Jacobson j...@gluefinance.com wrote: 2010/12/29 Aidan Van Dyk ai...@highrise.ca On Wed, Dec 29, 2010 at 2:27 AM, Joel Jacobson j...@gluefinance.com wrote: description of

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Tom Lane
Aidan Van Dyk ai...@highrise.ca writes: On Wed, Dec 29, 2010 at 9:11 AM, Gurjeet Singh singh.gurj...@gmail.com wrote: AFAIK, that applies to parallel dumps of data (may help in --schema-only dumps too), and what you are trying is for schema. Right, but one of the things it does is break the

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Joel Jacobson
2010/12/29 Tom Lane t...@sss.pgh.pa.us I think they're fundamentally different things, because the previously proposed patch is an extension of the machine-readable archive format, and has to remain so because of the expectation that people will want to use parallel restore with it. Joel is

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Dimitri Fontaine
Joel Jacobson j...@gluefinance.com writes: Solution: I propose a new option to pg_dump, --split, which dumps each object to a separate file in a user friendly directory structure: Please have a look at getddl: https://github.com/dimitri/getddl Regards, -- Dimitri Fontaine

Re: [HACKERS] pg_dump --split patch

2010-12-29 Thread Joel Jacobson
2010/12/29 Dimitri Fontaine dimi...@2ndquadrant.fr Please have a look at getddl: https://github.com/dimitri/getddl Nice! Looks like a nifty tool. When I tried it, ./getddl.py -f -F /crypt/funcs -d glue, I got the error No such file or directory: 'sql/schemas.sql'. While the task of

[HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
Dear fellow hackers, Problem: A normal diff of two slightly different schema dump files (pg_dump -s), will not produce a user-friendly diff, as you get all changes in the same file. Solution: I propose a new option to pg_dump, --split, which dumps each object to a separate file in a user

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Tom Lane
Joel Jacobson j...@gluefinance.com writes: Dear fellow hackers, Problem: A normal diff of two slightly different schema dump files (pg_dump -s), will not produce a user-friendly diff, as you get all changes in the same file. Solution: I propose a new option to pg_dump, --split, which dumps

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/28 Tom Lane t...@sss.pgh.pa.us Joel Jacobson j...@gluefinance.com writes: Dear fellow hackers, Problem: A normal diff of two slightly different schema dump files (pg_dump -s), will not produce a user-friendly diff, as you get all changes in the same file. Solution: I propose

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Tom Lane
Joel Jacobson j...@gluefinance.com writes: 2010/12/28 Tom Lane t...@sss.pgh.pa.us Joel Jacobson j...@gluefinance.com writes: Solution: I propose a new option to pg_dump, --split, which dumps each object to a separate file in a user friendly directory structure: Um ... how does that solve

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/28 Tom Lane t...@sss.pgh.pa.us That has at least as many failure modes as the other representation. I don't follow, what do you mean with failure modes? The oid in the filename? I suggested to use a sequence instead but you didn't comment on that. Are there any other failure modes

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Andrew Dunstan
On 12/28/2010 11:59 AM, Joel Jacobson wrote: 2010/12/28 Tom Lane t...@sss.pgh.pa.us That has at least as many failure modes as the other representation. I don't follow, what do you mean with failure modes? The oid in the filename? I suggested to use a

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Tom Lane
Joel Jacobson j...@gluefinance.com writes: 2010/12/28 Tom Lane t...@sss.pgh.pa.us That has at least as many failure modes as the other representation. I don't follow, what do you mean with failure modes? The oid in the filename? I suggested to use a sequence instead but you didn't comment on

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Gurjeet Singh
On Tue, Dec 28, 2010 at 11:00 AM, Joel Jacobson j...@gluefinance.com wrote: Dear fellow hackers, Problem: A normal diff of two slightly different schema dump files (pg_dump -s), will not produce a user-friendly diff, as you get all changes in the same file. Solution: I propose a new option

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Aidan Van Dyk
On Tue, Dec 28, 2010 at 11:59 AM, Joel Jacobson j...@gluefinance.com wrote: I don't follow, what do you mean with failure modes? The oid in the filename? I suggested to use a sequence instead but you didn't comment on that. Are there any other failure modes which could cause a diff -r between

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/28 Gurjeet Singh singh.gurj...@gmail.com I would suggest the directory structure as: /crypt/pg.dump-split/schema-name-1/VIEWS/view-name-1.sql /crypt/pg.dump-split/schema-name-1/TABLES/table-name-1.sql ... /crypt/pg.dump-split/schema-name-2/VIEWS/view-name-1.sql

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Gurjeet Singh
On Tue, Dec 28, 2010 at 2:39 PM, Joel Jacobson j...@gluefinance.com wrote: 2010/12/28 Gurjeet Singh singh.gurj...@gmail.com I would suggest the directory structure as: /crypt/pg.dump-split/schema-name-1/VIEWS/view-name-1.sql /crypt/pg.dump-split/schema-name-1/TABLES/table-name-1.sql ...

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
Sent from my iPhone On 28 dec 2010, at 21:45, Gurjeet Singh singh.gurj...@gmail.com wrote: The problem I see with suffixing a sequence id to the objects with name collision is that one day the dump may name myfunc(int) as myfunc.sql and after an overloaded version is created, say myfunc(char,

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Andrew Dunstan
On 12/28/2010 04:44 PM, Joel Jacobson wrote: The problem I see with suffixing a sequence id to the objects with name collision is that one day the dump may name myfunc(int) as myfunc.sql and after an overloaded version is created, say myfunc(char, int), then the same myfunc(int) may be

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/28 Andrew Dunstan and...@dunslane.net I think that's just horrible. Does the i stand for integer or inet? And it will get *really* ugly for type names with spaces in them ... True, true. But while c is too short, I think character varying is too long. Is there some convenient lookup

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: On 12/28/2010 04:44 PM, Joel Jacobson wrote: Perhaps abbreviations are to prefer, e.g., myfunc_i, myfunc_i_c, etc to reduce the need of truncating filenames. I think that's just horrible. Does the i stand for integer or inet? And it will get

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Gurjeet Singh
On Tue, Dec 28, 2010 at 4:57 PM, Andrew Dunstan and...@dunslane.net wrote: On 12/28/2010 04:44 PM, Joel Jacobson wrote: The problem I see with suffixing a sequence id to the objects with name collision is that one day the dump may name myfunc(int) as myfunc.sql and after an overloaded

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread David Wilson
On Tue, Dec 28, 2010 at 2:39 PM, Joel Jacobson j...@gluefinance.com wrote: I think you are right about functions (and aggregates) being the only desc-type where two objects can share the same name in the same schema. This means the problem of dumping objects in different order is a very

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/29 David Wilson david.t.wil...@gmail.com Why not place all overloads of a function within the same file? Then, assuming you order them deterministically within that file, we sidestep the file naming issue and maintain useful diff capabilities, since a diff of the function's file will
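The suggestion above (one file per function name, with overloads ordered deterministically inside it) might look roughly like the sketch below. It is an assumption-laden illustration rather than the patch's behaviour: group_overloads is a hypothetical helper, and sorting by the tuple of argument type names is just one possible deterministic order.

```python
from collections import defaultdict


def group_overloads(functions):
    """Collect all overloads of a function into one file body.

    `functions` is an iterable of (name, arg_types, ddl) tuples, where
    arg_types is a tuple of type-name strings. Hypothetical helper.
    """
    files = defaultdict(list)
    for name, arg_types, ddl in functions:
        files[f"{name}.sql"].append((arg_types, ddl))
    # Sorting by argument types keeps the order stable regardless of the
    # order in which the dump happened to emit the overloads.
    return {
        filename: "\n".join(ddl for _, ddl in sorted(entries))
        for filename, entries in sorted(files.items())
    }


if __name__ == "__main__":
    dumped = [
        ("myfunc", ("int",), "-- myfunc(int)"),
        ("myfunc", ("char", "int"), "-- myfunc(char, int)"),
    ]
    print(group_overloads(dumped)["myfunc.sql"])
```

Because the file name no longer depends on which overloads exist, adding myfunc(char, int) later changes the contents of myfunc.sql but not its name, which avoids the renaming problem raised about sequence suffixes.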

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Tom Lane
David Wilson david.t.wil...@gmail.com writes: On Tue, Dec 28, 2010 at 2:39 PM, Joel Jacobson j...@gluefinance.com wrote: I didn't include the arguments in the file name, as it would lead to very long file names unless truncated, and since the problem is very limited, I think we shouldn't

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/29 Tom Lane t...@sss.pgh.pa.us If you've solved the deterministic-ordering problem, then this entire patch is quite useless. You can just run a normal dump and diff it. No, that's only half true. Diff will do a good job minimizing the size of the diff output, yes, but such a diff

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Andrew Dunstan
On 12/28/2010 08:18 PM, Joel Jacobson wrote: 2010/12/29 Tom Lane t...@sss.pgh.pa.us If you've solved the deterministic-ordering problem, then this entire patch is quite useless. You can just run a normal dump and diff it. No, that's only half true. Diff

Re: [HACKERS] pg_dump --split patch

2010-12-28 Thread Joel Jacobson
2010/12/29 Andrew Dunstan and...@dunslane.net try: diff -F '^CREATE' ... cheers andrew Embarrassing, I'm sure I've done `man diff` before, must have missed that one, wish I'd known about that feature before, would have saved me many hours! :-) Thanks for the tip! There are some other