Re: [HACKERS] pg_upgrade and statistics

2012-03-16 Thread Ants Aasma
On Thu, Mar 15, 2012 at 8:48 PM, Alvaro Herrera alvhe...@commandprompt.com What Peter proposed seems to me pretty reasonable, in the sense that it should be possible to come up with a function that creates some text representation of whatever is in pg_statistic, and another function to load

Re: [HACKERS] pg_upgrade and statistics

2012-03-16 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 11:46:04AM -0400, Bruce Momjian wrote: On Thu, Mar 15, 2012 at 11:15:42AM -0400, Andrew Dunstan wrote: You're not the only person who could do that. I don't think this is all down to you. It should just be understood that if the stats format is changed, adjusting

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Peter Eisentraut
On ons, 2012-03-14 at 17:36 -0400, Bruce Momjian wrote: Well, I have not had to make major adjustments to pg_upgrade since 9.0, meaning the code is almost complete unchanged and does not require additional testing for each major release. If we go down the road of dumping stats, we will need

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 08:22:24AM +0200, Peter Eisentraut wrote: On ons, 2012-03-14 at 17:36 -0400, Bruce Momjian wrote: Well, I have not had to make major adjustments to pg_upgrade since 9.0, meaning the code is almost complete unchanged and does not require additional testing for each

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Andrew Dunstan
On 03/15/2012 11:03 AM, Bruce Momjian wrote: On Thu, Mar 15, 2012 at 08:22:24AM +0200, Peter Eisentraut wrote: On ons, 2012-03-14 at 17:36 -0400, Bruce Momjian wrote: Well, I have not had to make major adjustments to pg_upgrade since 9.0, meaning the code is almost complete unchanged and

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: I think we have two choices --- either migrate the statistics, or adopt my approach to generating incremental statistics quickly. Does anyone see any other options? Would it make any sense to modify the incremental approach to do a first pass of any

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 11:15:42AM -0400, Andrew Dunstan wrote: You're not the only person who could do that. I don't think this is all down to you. It should just be understood that if the stats format is changed, adjusting pg_upgrade needs to be part of the change. When we modified how enums

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Bruce Momjian
On Thu, Mar 15, 2012 at 10:20:02AM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: I think we have two choices --- either migrate the statistics, or adopt my approach to generating incremental statistics quickly. Does anyone see any other options? Would it make any

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: On 03/15/2012 11:03 AM, Bruce Momjian wrote: On Thu, Mar 15, 2012 at 08:22:24AM +0200, Peter Eisentraut wrote: I think this could be budgeted under keeping pg_dump backward compatible. You have to do that anyway for each catalog change, and so doing

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Greg Stark
On Thu, Mar 15, 2012 at 3:15 PM, Andrew Dunstan and...@dunslane.net wrote: You're not the only person who could do that. I don't think this is all down to you. It should just be understood that if the stats format is changed, adjusting pg_upgrade needs to be part of the change. When we

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Peter Eisentraut
On tor, 2012-03-15 at 11:15 -0400, Andrew Dunstan wrote: I haven't looked at it, but I'm wondering how hard it is going to be in practice? Take a look at the commit log of pg_statistic.h; it's not a lot. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes: On tor, 2012-03-15 at 11:15 -0400, Andrew Dunstan wrote: I haven't looked at it, but I'm wondering how hard it is going to be in practice? Take a look at the commit log of pg_statistic.h; it's not a lot. That says nothing as all about the cost of

Re: [HACKERS] pg_upgrade and statistics

2012-03-15 Thread Alvaro Herrera
Excerpts from Greg Stark's message of jue mar 15 14:45:16 -0300 2012: On Thu, Mar 15, 2012 at 3:15 PM, Andrew Dunstan and...@dunslane.net wrote: You're not the only person who could do that. I don't think this is all down to you. It should just be understood that if the stats format is

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Peter Eisentraut
On tis, 2012-03-13 at 20:34 -0400, Bruce Momjian wrote: I frankly am worried that if we copy over statistics even in ASCII that don't match what the server expects, it might lead to a crash, which has me back to wanting to speed up vacuumdb. Why can't we maintain a conversion routine for

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Wed, Mar 14, 2012 at 10:40:41PM +0200, Peter Eisentraut wrote: On tis, 2012-03-13 at 20:34 -0400, Bruce Momjian wrote: I frankly am worried that if we copy over statistics even in ASCII that don't match what the server expects, it might lead to a crash, which has me back to wanting to

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:15:52PM -0400, Bruce Momjian wrote: On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote: On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: What is the target=10 duration? I think 10 is as low

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: Does anyone know how bad the queries will be with only one target? Bad. That cycle seems like largely a waste of time. About the only thing it would do for you is ensure that relpages/reltuples are up to date, which seems like something we could possibly

Re: [HACKERS] pg_upgrade and statistics

2012-03-14 Thread Bruce Momjian
On Wed, Mar 14, 2012 at 08:26:06PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Does anyone know how bad the queries will be with only one target? Bad. That cycle seems like largely a waste of time. About the only thing it would do for you is ensure that

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Daniel Farina
On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian br...@momjian.us wrote: To answer your specific question, I think clearing the last analyzed fields should cause autovacuum to run on analyze those tables.  What I don't know is whether not clearing the last vacuum datetime will cause the table

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Daniel Farina
On Mon, Mar 12, 2012 at 9:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: Copying the statistics from the old server is on the pg_upgrade TODO list.  I have avoided it because it will add an additional requirement that will make pg_upgrade more fragile in case

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 12:12:27AM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Copying the statistics from the old server is on the pg_upgrade TODO list. I have avoided it because it will add an additional requirement that will make pg_upgrade more fragile in case of

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian br...@momjian.us wrote: To answer your specific question, I think clearing the last analyzed fields should cause autovacuum to run on analyze those tables.  What I don't know is

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: I just received a sobering blog comment stating that pg_upgrade took 5 minutes on a 0.5TB database, but analyze took over an hour: Yeah, we have had similar experiences. Even if this can't be done for every release or for every data type, bringing over

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of mar mar 13 11:14:43 -0300 2012: On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian br...@momjian.us wrote: To answer your specific question, I think clearing the last analyzed fields

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 11:34:16AM -0300, Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of mar mar 13 11:14:43 -0300 2012: On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: On Mon, Mar 12, 2012 at 8:10 PM, Bruce Momjian br...@momjian.us wrote: To answer your

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of mar mar 13 11:49:26 -0300 2012: On Tue, Mar 13, 2012 at 11:34:16AM -0300, Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of mar mar 13 11:14:43 -0300 2012: On Tue, Mar 13, 2012 at 12:33:09AM -0700, Daniel Farina wrote: On Mon,

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:28:58AM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: I just received a sobering blog comment stating that pg_upgrade took 5 minutes on a 0.5TB database, but analyze took over an hour: Yeah, we have had similar experiences. Even if this

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 12:08:41PM -0300, Alvaro Herrera wrote: You're wrong. Autovacuum does not consider time, only dead/live tuple counts. The formulas it uses are in the autovacuum docs; some details (such as the fact that it skips tables that do not have stat entries) might be

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Greg Stark
On Tue, Mar 13, 2012 at 1:38 AM, Daniel Farina dan...@heroku.com wrote: You probably are going to ask: why not just run ANALYZE and be done with it? Uhm yes. If analyze takes a long time then something is broken. It's only reading a sample which should be pretty much a fixed number of pages per

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Greg Stark st...@mit.edu wrote: Daniel Farina dan...@heroku.com wrote: You probably are going to ask: why not just run ANALYZE and be done with it? Uhm yes. If analyze takes a long time then something is broken. It's only reading a sample which should be pretty much a fixed number of pages

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 05:46:06PM +, Greg Stark wrote: On Tue, Mar 13, 2012 at 1:38 AM, Daniel Farina dan...@heroku.com wrote: You probably are going to ask: why not just run ANALYZE and be done with it? Uhm yes. If analyze takes a long time then something is broken. It's only

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 01:18:58PM -0500, Kevin Grittner wrote: Greg Stark st...@mit.edu wrote: Daniel Farina dan...@heroku.com wrote: You probably are going to ask: why not just run ANALYZE and be done with it? Uhm yes. If analyze takes a long time then something is broken. It's

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: On Tue, Mar 13, 2012 at 01:18:58PM -0500, Kevin Grittner wrote: cir=# analyze CaseHist; ANALYZE Time: 143450.467 ms cir=# select relpages, reltuples from pg_class where relname = 'CaseHist'; relpages | reltuples --+-

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Bruce Momjian br...@momjian.us wrote: That is 2.5 minutes. How large is that database? I dug around a little and found that we had turned on vacuum cost limits on the central databases, because otherwise the web team complained about

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 02:07:14PM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: On Tue, Mar 13, 2012 at 01:18:58PM -0500, Kevin Grittner wrote: cir=# analyze CaseHist; ANALYZE Time: 143450.467 ms cir=# select relpages, reltuples from pg_class where relname =

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Tom Lane
Greg Stark st...@mit.edu writes: On Tue, Mar 13, 2012 at 1:38 AM, Daniel Farina dan...@heroku.com wrote: You probably are going to ask: why not just run ANALYZE and be done with it? Uhm yes. If analyze takes a long time then something is broken. It's only reading a sample which should be

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: cir=# analyze CaseHist; ANALYZE Time: 143450.467 ms OK, so a single 44GB tables took 2.5 minutes to analyze; that is not good. It would require 11 such tables to reach 500GB (0.5 TB), and would take 27 minutes. The report I had was twice as long,

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Peter Eisentraut
On tis, 2012-03-13 at 15:44 -0400, Tom Lane wrote: I wonder whether it'd be worth recommending that people do an initial ANALYZE with a low stats target, just to get some stats in place, and then go back to analyze at whatever their normal setting is. Perhaps going even further, ANALYZE could

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: OK, so a single 44GB tables took 2.5 minutes to analyze; that is not good. It would require 11 such tables to reach 500GB (0.5 TB), and would take 27 minutes. The report I had was twice as long, but still in the ballpark of too long. :-( But it's

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 10:10:02PM +0200, Peter Eisentraut wrote: On tis, 2012-03-13 at 15:44 -0400, Tom Lane wrote: I wonder whether it'd be worth recommending that people do an initial ANALYZE with a low stats target, just to get some stats in place, and then go back to analyze at

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: OK, good idea. Kevin, can you test this: PGOPTIONS='-c default_statistics_target=10' vacuumdb --all --analyze-only Is it faster? Thanks. Well, I just did something similar in psql -- I disabled the delays by: set vacuum_cost_delay = 0;

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Greg Stark
On Tue, Mar 13, 2012 at 7:30 PM, Bruce Momjian br...@momjian.us wrote: OK, so a single 44GB tables took 2.5 minutes to analyze;  that is not good.  It would require 11 such tables to reach 500GB (0.5 TB), and would take 27 minutes.  The report I had was twice as long, but still in the ballpark

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 09:28:33PM +, Greg Stark wrote: hmph. One thing that could speed up analyze on raid arrays would be doing prefetching so more than one spindle can be busy. Sacrificing statistical accuracy by reading a less random sample on contiguous blocks of rows would also be

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 03:29:22PM -0500, Kevin Grittner wrote: I went even lower than you suggested: set default_statistics_target = 4; And it was much faster, but still more time than the pg_upgrade run itself: cir=# analyze; ANALYZE Time: 474319.826 ms A little under 8

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Robert Haas
On Tue, Mar 13, 2012 at 5:42 PM, Bruce Momjian br...@momjian.us wrote: What is the target=10 duration?  I think 10 is as low as we can acceptably recommend.  Should we recommend they run vacuumdb twice, once with default_statistics_target = 4, and another with the default? I'm not sure why

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote: What is the target=10 duration? I think 10 is as low as we can acceptably recommend. Should we recommend they run vacuumdb twice, once with default_statistics_target = 4, and another with the default? Here are the results at various settings. 1 :

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: I'm not sure why we're so glibly rejecting Dan's original proposal. Sure, adjusting pg_upgrade when we whack around pg_statistic is work, but who ever said that a workable in-place upgrade facility would be maintenance-free? We're operating under a

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Daniel Farina
On Tue, Mar 13, 2012 at 3:30 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 13, 2012 at 5:42 PM, Bruce Momjian br...@momjian.us wrote: What is the target=10 duration?  I think 10 is as low as we can acceptably recommend.  Should we recommend they run vacuumdb twice, once with

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Andrew Dunstan
On 03/13/2012 06:30 PM, Robert Haas wrote: On Tue, Mar 13, 2012 at 5:42 PM, Bruce Momjianbr...@momjian.us wrote: What is the target=10 duration? I think 10 is as low as we can acceptably recommend. Should we recommend they run vacuumdb twice, once with default_statistics_target = 4, and

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: What is the target=10 duration? I think 10 is as low as we can acceptably recommend. Should we recommend they run vacuumdb twice, once with default_statistics_target = 4, and another

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: Another idea is to just copy over pg_statistic like we copy of pg_largeobject now, and force autovacuum to run. That would be an automatic crash in a 9.1 to 9.2 migration, as well as any other release where we changed the column layout of pg_statistic.

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 08:30:17PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Another idea is to just copy over pg_statistic like we copy of pg_largeobject now, and force autovacuum to run. That would be an automatic crash in a 9.1 to 9.2 migration, as well as any other

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Bruce Momjian
On Tue, Mar 13, 2012 at 08:22:51PM -0400, Bruce Momjian wrote: On Tue, Mar 13, 2012 at 05:33:29PM -0500, Kevin Grittner wrote: Bruce Momjian br...@momjian.us wrote: What is the target=10 duration? I think 10 is as low as we can acceptably recommend. Should we recommend they run

Re: [HACKERS] pg_upgrade and statistics

2012-03-13 Thread Euler Taveira
On 13-03-2012 21:34, Bruce Momjian wrote: It might be a solution for cases where we don't modify it. I frankly am worried that if we copy over statistics even in ASCII that don't match what the server expects, it might lead to a crash, which has me back to wanting to speed up vacuumdb. That

Re: [HACKERS] pg_upgrade and statistics

2012-03-12 Thread Bruce Momjian
On Mon, Mar 12, 2012 at 06:38:30PM -0700, Daniel Farina wrote: You probably are going to ask: why not just run ANALYZE and be done with it? The reasons are: * ANALYZE can take a sufficiently long time on large databases that the downtime of switching versions is not attractive * If

Re: [HACKERS] pg_upgrade and statistics

2012-03-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: Copying the statistics from the old server is on the pg_upgrade TODO list. I have avoided it because it will add an additional requirement that will make pg_upgrade more fragile in case of major version changes. Does anyone have a sense of how often we