Re: [HACKERS] [pgsql-advocacy] PGUpgrade WAS: Audio interview

2006-02-08 Thread Josh Berkus

Andrew,

> This would be a very fine project for someone to pick up (maybe one of
> the corporate supporters could sponsor someone to work on it?)


We looked at it for Greenplum but just couldn't justify putting it near 
the top of the priority list.  The work/payoff ratio is terrible.


One justification for in-place upgrades is to be faster than 
dump/reload.  However, if we're assuming the possibility of new/modified 
header fields which could then cause page splits on pages which are at 
90% capacity, then this time savings would be on the order of no more 
than 50% of load time, not the 90% of load time required to justify the 
programming effort involved -- especially when you take into account 
needing to provide multiple conversions, e.g. 7.3 -> 8.1, 7.4 -> 8.1, etc.
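
To put rough numbers on that argument (back-of-envelope only, in Python; 
every figure below is an assumed illustration, not a measurement):

    # Back-of-envelope: none of these numbers are measurements.
    db_size_gb = 300
    reload_rate_gb_per_hour = 30.0          # assumed dump/reload throughput

    dump_reload_hours = db_size_gb / reload_rate_gb_per_hour        # ~10 h

    # In-place upgrade that still has to move tuples (new header fields
    # overflowing pages that are already ~90% full): call it half the work.
    in_place_rewrite_hours = 0.5 * dump_reload_hours                 # ~5 h

    # What would actually justify the effort: cutting ~90% of the time.
    worthwhile_hours = 0.1 * dump_reload_hours                       # ~1 h

    print(dump_reload_hours, in_place_rewrite_hours, worthwhile_hours)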


The second reason for in-place upgrade is for large databases where the 
owner does not have enough disk space for two complete copies of the 
database.  Again, in-place upgrade doesn't really solve this: if we want 
it to be fault-tolerant, then we need the doubled disk space anyway (you 
could do a certain amount with compression, but you'd still need 
150%-175% of the space, so it's not much help).


Overall, it would be both easier and more effective to write a Slony 
automation wrapper which does the replication, population, and 
switchover for you.
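
If anyone wants a starting point, such a wrapper really only has to drive 
slonik and psql.  A rough sketch (Python; the cluster name, conninfos, 
table list and slon-daemon setup are placeholders, and the slonik steps 
are written from memory, so check them against the Slony-I docs):

    # Sketch of an "upgrade via Slony" wrapper: copy the schema, replicate
    # the data, then switch over.  Placeholders throughout; tables need
    # primary keys, and slon daemons must be running against both nodes.
    import subprocess

    OLD = "dbname=app host=oldbox"                 # old-version PostgreSQL
    NEW = "dbname=app host=newbox"                 # new-version PostgreSQL
    TABLES = ["public.accounts", "public.orders"]  # placeholder list

    def slonik(script):
        subprocess.run(["slonik"], input=script, text=True, check=True)

    # 1. Schema only -- Slony replicates data, not DDL.
    subprocess.run("pg_dump -s -h oldbox app | psql -h newbox app",
                   shell=True, check=True)

    # 2. Define the cluster, add the tables, subscribe the new node.
    setup = ["cluster name = upgrade;",
             "node 1 admin conninfo = '%s';" % OLD,
             "node 2 admin conninfo = '%s';" % NEW,
             "init cluster (id = 1, comment = 'old master');",
             "store node (id = 2, comment = 'new subscriber');",
             "store path (server = 1, client = 2, conninfo = '%s');" % OLD,
             "store path (server = 2, client = 1, conninfo = '%s');" % NEW,
             "create set (id = 1, origin = 1, comment = 'everything');"]
    for i, t in enumerate(TABLES, start=1):
        setup.append("set add table (set id = 1, origin = 1, id = %d,"
                     " fully qualified name = '%s');" % (i, t))
    setup.append("subscribe set (id = 1, provider = 1, receiver = 2,"
                 " forward = no);")
    slonik("\n".join(setup))

    # 3. Once node 2 has caught up, the switchover itself takes seconds.
    slonik("cluster name = upgrade;\n"
           "node 1 admin conninfo = '%s';\n"
           "node 2 admin conninfo = '%s';\n"
           "lock set (id = 1, origin = 1);\n"
           "move set (id = 1, old origin = 1, new origin = 2);" % (OLD, NEW))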


--Josh



Re: [HACKERS] [pgsql-advocacy] PGUpgrade WAS: Audio interview

2006-02-08 Thread Neil Conway
On Wed, 2006-02-08 at 11:55 -0800, Josh Berkus wrote:
> One justification for in-place upgrades is to be faster than
> dump/reload.  However, if we're assuming the possibility of new/modified
> header fields which could then cause page splits on pages which are at
> 90% capacity, then this time savings would be on the order of no more
> than 50% of load time

Well, if you need to start shuffling heap tuples around, you also need
to update indexes, in addition to rewriting all the heap pages. This
would require work on the order of VACUUM FULL in the worst case, which
is pretty expensive.

However, we don't change the format of heap or index pages _that_ often.
An in-place upgrade script that worked when the heap/index page format
has not changed would still be valuable -- only the system catalog
format would need to be modified.
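
Very roughly, such a script would dump and restore just the schema (which
rebuilds the catalogs under the new format) and then point the new catalog
entries at the old, untouched heap and index files.  A hand-wavy sketch
(Python; paths, ports and database OIDs are placeholders, and it ignores
TOAST, tablespaces, >1GB segments, transaction/clog state and much else):

    # Hand-wavy sketch only; see caveats above.
    import os, shutil, subprocess

    OLD_DATA, OLD_PORT, OLD_DB_OID = "/pg/old/data", 5432, "17000"  # placeholders
    NEW_DATA, NEW_PORT, NEW_DB_OID = "/pg/new/data", 5433, "16500"  # placeholders
    DB = "app"

    # 1. Recreate the schema in the new cluster: this is the "system catalog
    #    format" part of the upgrade, and plain schema-only dump handles it.
    subprocess.run("pg_dumpall -s -p %d | psql -p %d postgres" % (OLD_PORT, NEW_PORT),
                   shell=True, check=True)

    # 2. Map relation name -> relfilenode in each cluster.
    def relfilenodes(port):
        out = subprocess.run(
            ["psql", "-At", "-p", str(port), DB, "-c",
             "SELECT oid::regclass, relfilenode FROM pg_class "
             "WHERE relkind IN ('r', 'i') AND oid >= 16384"],
            capture_output=True, text=True, check=True).stdout
        return dict(line.split("|") for line in out.splitlines())

    old, new = relfilenodes(OLD_PORT), relfilenodes(NEW_PORT)

    # 3. Then, with both clusters shut down (not shown), adopt the old data
    #    files unchanged -- legitimate only because the page format is the same.
    for rel, old_node in old.items():
        shutil.copy(os.path.join(OLD_DATA, "base", OLD_DB_OID, old_node),
                    os.path.join(NEW_DATA, "base", NEW_DB_OID, new[rel]))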

> The second reason for in-place upgrade is for large databases where the
> owner does not have enough disk space for two complete copies of the
> database.  Again, in-place upgrade doesn't really solve this: if we want
> it to be fault-tolerant, then we need the doubled disk space anyway

When the heap/index page format hasn't changed, we would only need to
back up the system catalogs, which would be far less expensive.

-Neil





Re: [HACKERS] [pgsql-advocacy] PGUpgrade WAS: Audio interview

2006-02-08 Thread Rick Gigger


On Feb 8, 2006, at 12:55 PM, Josh Berkus wrote:


> Andrew,
>
>> This would be a very fine project for someone to pick up (maybe
>> one of the corporate supporters could sponsor someone to work on it?)
>
> We looked at it for Greenplum but just couldn't justify putting it
> near the top of the priority list.  The work/payoff ratio is terrible.
>
> One justification for in-place upgrades is to be faster than
> dump/reload.  However, if we're assuming the possibility of new/modified
> header fields which could then cause page splits on pages which are at
> 90% capacity, then this time savings would be on the order of no more
> than 50% of load time, not the 90% of load time required to justify the
> programming effort involved -- especially when you take into account
> needing to provide multiple conversions, e.g. 7.3 -> 8.1, 7.4 -> 8.1, etc.


I just posted an idea for first upgrading a physical backup of the  
data directory that you would create when doing online backups, and  
then also altering the WAL records as they are applied during  
recovery.  That way the actual load time might still be huge, but  
since it could run in parallel with the running server it would  
probably eliminate 99% of the downtime.  Would that be worth the effort?


Also, all the heavy lifting could be offloaded to a separate box while  
your production server just keeps running unaffected.
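
In outline, what I'm picturing is something like the following (sketch
only; pg_start_backup()/pg_stop_backup(), WAL archiving and restore_command
are the existing online-backup machinery, while "pg_convertwal" stands for
the hypothetical converter that would have to be written):

    # Sketch of the workflow; "pg_convertwal" does not exist.
    import subprocess

    # 1. Base backup of the live master -- no downtime here.
    subprocess.run(["psql", "-c", "SELECT pg_start_backup('upgrade')"], check=True)
    subprocess.run(["rsync", "-a", "/var/lib/pgsql/data/",
                    "upgradebox:/pgdata.old/"], check=True)
    subprocess.run(["psql", "-c", "SELECT pg_stop_backup()"], check=True)

    # 2. On the upgrade box, convert the copied data files to the new on-disk
    #    format while the master keeps running (hypothetical tool):
    #        pg_convertwal --convert-datadir /pgdata.old --output /pgdata.new
    #
    # 3. Keep the copy current by replaying the master's archived WAL through
    #    a converting restore_command in the new cluster's recovery.conf:
    #        restore_command = 'pg_convertwal --wal /wal_archive/%f > "%p"'
    #
    # 4. At cutover: stop the master, let the last WAL segments convert and
    #    apply, then point the applications at the new server.  Only step 4
    #    is downtime; steps 1-3 run alongside production.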


> The second reason for in-place upgrade is for large databases where
> the owner does not have enough disk space for two complete copies
> of the database.  Again, in-place upgrade doesn't really solve this:
> if we want it to be fault-tolerant, then we need the doubled disk
> space anyway (you could do a certain amount with compression, but
> you'd still need 150%-175% of the space, so it's not much help).


Yeah, anyone who has so much data that they need this feature but  
isn't willing to back it up is crazy.  Plus disk space is cheap.


> Overall, it would be both easier and more effective to write a
> Slony automation wrapper which does the replication, population,
> and switchover for you.


Now that is something that I would actually use.  I think that a  
little bit of automation would greatly increase the number of users  
using Slony.


Rick




Re: [HACKERS] [pgsql-advocacy] PGUpgrade WAS: Audio interview

2006-02-08 Thread Tom Lane
Josh Berkus <josh@agliodbs.com> writes:
>> This would be a very fine project for someone to pick up (maybe one of
>> the corporate supporters could sponsor someone to work on it?)
>
> We looked at it for Greenplum but just couldn't justify putting it near
> the top of the priority list.  The work/payoff ratio is terrible.

I agree that doing pgupgrade in full generality is probably not worth
the investment required.  However, handling the restricted case where
no changes are needed in user tables or indexes would be considerably
easier, and I think it would be worth doing.

If such a tool were available, I don't think it'd be hard to get
consensus on organizing our releases so that it would be applicable more
often than not.  We could postpone changes that would affect user
table contents until we'd built up a backlog that would all go into
one release.  Even a minimal commitment in that line would probably
result in pgupgrade working for at least every other release, and
that would be enough to make it worthwhile if you ask me ...

regards, tom lane



Re: [HACKERS] [pgsql-advocacy] PGUpgrade WAS: Audio interview

2006-02-08 Thread Hannu Krosing
On Wed, 2006-02-08 at 15:51, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>>> This would be a very fine project for someone to pick up (maybe one of
>>> the corporate supporters could sponsor someone to work on it?)
>>
>> We looked at it for Greenplum but just couldn't justify putting it near
>> the top of the priority list.  The work/payoff ratio is terrible.
>
> I agree that doing pgupgrade in full generality is probably not worth
> the investment required.  However, handling the restricted case where
> no changes are needed in user tables or indexes would be considerably
> easier, and I think it would be worth doing.

How hard would it be to modify postgres so that it can handle multiple
heap page formats?

This could come in handy for pgupgrade, but my real interest would be to
have several task-specific formats supported even in non-upgrade
situations, such as a more compact heap page format for read-only
archive/analysis tables.
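
To make the question concrete: the page header already carries a layout
version number (pd_pagesize_version), so conceptually it means turning
every "the page looks like X" assumption into a dispatch on that version.
A toy sketch of the shape of it (Python; the Page stand-in, the reader
functions and the version numbers are all invented for illustration):

    # Toy sketch; Page is a stand-in, not PostgreSQL's PageHeaderData.
    from collections import namedtuple

    Page = namedtuple("Page", ["layout_version", "raw"])

    def read_tuples_v3(page):
        """Decode tuples under the current heap page layout (stub)."""
        return []

    def read_tuples_v2(page):
        """Decode tuples under an older layout kept readable post-upgrade (stub)."""
        return []

    def read_tuples_compact(page):
        """Hypothetical packed format for read-only archive tables (stub)."""
        return []

    READERS = {3: read_tuples_v3, 2: read_tuples_v2, 100: read_tuples_compact}

    def heap_read(page):
        # Every code path that now assumes a single layout would have to go
        # through a dispatch like this; that pervasiveness is the hard part.
        return READERS[page.layout_version](page)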

--
Hannu




Re: [HACKERS] [pgsql-advocacy] PGUpgrade WAS: Audio interview

2006-02-08 Thread Josh Berkus
Tom,

> If such a tool were available, I don't think it'd be hard to get
> consensus on organizing our releases so that it would be applicable more
> often than not.  We could postpone changes that would affect user
> table contents until we'd built up a backlog that would all go into
> one release.  Even a minimal commitment in that line would probably
> result in pgupgrade working for at least every other release, and
> that would be enough to make it worthwhile if you ask me ...

We could even make that the meaning of the first/second version-number 
difference in the future.  That is, 8.2 would be pg-upgradable from 8.1, 
but 9.0 would not.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
