Re: [BUGS] BUG #7884: pg_upgradecluster is terribly slow
On 2/15/2013 4:27 PM, Bruce Momjian wrote: > >do note this -k option only functions if the old and new cluster are on the >same mount point, so the files can be mv'd Actually, hard linked, not moved. errr, that. right. ;) -- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #7884: pg_upgradecluster is terribly slow
On Fri, Feb 15, 2013 at 03:55:02PM -0800, John R Pierce wrote: > On 2/15/2013 10:27 AM, Kevin Grittner wrote: > > You might want to try the utility which *is* supported by the > PostgreSQL community, which is pg_upgrade. For speed, consider > using the -k option. I've been able to upgrade a 3 TB database in > just a few minutes with that. (Timing depends more on the number > of database objects than their size.) > > > do note this -k option only functions if the old and new cluster are on the > same mount point, so the files can be mv'd Actually, hard linked, not moved. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #7884: pg_upgradecluster is terribly slow
On 2/15/2013 10:27 AM, Kevin Grittner wrote: You might want to try the utility which*is* supported by the PostgreSQL community, which is pg_upgrade. For speed, consider using the -k option. I've been able to upgrade a 3 TB database in just a few minutes with that. (Timing depends more on the number of database objects than their size.) do note this -k option only functions if the old and new cluster are on the same mount point, so the files can be mv'd -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [BUGS] BUG #7884: pg_upgradecluster is terribly slow
Hello, r...@me.com [2013-02-15 17:41 +]: > I have tried upgrading a database from 8.4 to 9.1 and pg_upgradecluster is > terribly slow. Please note that this is a Debian/Ubuntu specific bug, pg_upgradecluster is not shipped by upstream PostgreSQL. I admittedly don't have any first-hand experience with migrating large databases, and get very little feedback about it, so thanks for your observations! First of all, Peter has worked on a branch that makes pg_upgradecluster use pg_upgrade, which hopefully will be a lot more performant. > 1. It uses "-Fc". This is bad as it enables compression by default. Which is > pointless and wastes CPU power. The tar format (-Ft) has also been supported since at least 8.4 (that's the earliest version which is still supported in Debian/Ubuntu), and documentation says it's not compressed. If you replace -Fc with -Ft, do you see a significant performance increase? > 2. It seems to migrate the schema first and only then it moves data. This is > bad as indices are being re-built when the data is finally poured in. At least in earlier versions pg_dumpall wasn't able to dump BLOBs, and the manpage doesn't indicate otherwise for 9.2 either. AFAIK this just supports the plain SQL text format. I don't believe pg_dumpall would avoid the rebuilding of incides? > 3. The migration does not happen within a transaction. > 1. migrate the schema pg_upgradecluster does that, too. > 2. drop all databases (leaving roles in) > 3. re-creating the databases (without any content) Out of interest, why does that help? > 4. for each database run pg_dump | psql --single-transaction > As written above, I don't want to use the text format and psql, but pg_restore also supports --single-transaction since at least 8.4. The question which I'm not sure about is whether it's ok to use --single-transaction even for very large databases. I. e. is piling up gigabytes of data in a transaction and committing it all in the end always more efficient than the default mode (which I assume will use one transaction by row)? Is that a safe thing to do, or could one run into out-of-memory conditions? http://people.canonical.com/~pitti/tmp/pg_upgradecluster is a version with these two changes: pg_restore --single-transaction and using the tar format. Perhaps you can try this, and compare performance? Thanks, Martin -- Martin Pitt| http://www.piware.de Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org) -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #7884: pg_upgradecluster is terribly slow
"r...@me.com" wrote: > I have tried upgrading a database from 8.4 to 9.1 and > pg_upgradecluster is terribly slow. I had never heard of pg_upgradecluster sluster; a google search suggests that there is an Ubuntu utility by that name, so you might want to register a complaint there; this list is definitely not the right place for that. You might want to try the utility which *is* supported by the PostgreSQL community, which is pg_upgrade. For speed, consider using the -k option. I've been able to upgrade a 3 TB database in just a few minutes with that. (Timing depends more on the number of database objects than their size.) http://www.postgresql.org/docs/9.1/interactive/pgupgrade.html -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
[BUGS] BUG #7884: pg_upgradecluster is terribly slow
The following bug has been logged on the website: Bug reference: 7884 Logged by: Denys Rtveliashvili Email address: r...@me.com PostgreSQL version: 9.1.7 Operating system: Ubuntu 12.04 Description: I have tried upgrading a database from 8.4 to 9.1 and pg_upgradecluster is terribly slow. The bandwidth of data stream is about 10MB/sec (measured by iostat). This is quite bad, as this runs on a quite new and powerful machine (new, with proper battery-backed HW RAID, many disks). IO is certainly not a bottleneck. The problem is that "psql" and "pg_dump" are consuming a lot of CPU power while doing almost nothing. When I do "pg_dump > /dev/null" I see read bandwidth about 10MB/sec. This is tiny comparing to 500MB/sec shown by iozone3. Also, I think there are obvious problems with pg_upgradecluster: 1. It uses "-Fc". This is bad as it enables compression by default. Which is pointless and wastes CPU power. 2. It seems to migrate the schema first and only then it moves data. This is bad as indices are being re-built when the data is finally poured in. 3. The migration does not happen within a transaction. Thus, there are two consequences: 3.1. The migration will not fail obviously in case data cannot be moved carefully for whatever reason. This is just not safe. 3.2. The migration does not work as fast as it could. "COPY" can run faster in case it happens within the same transaction where table has been created. However, as creation of the table and copying into it are separated, the postgres has to do a lot of useless work which slows it down. I was able to improve the performance of upgrade approximately 4 times by following a different approach: 1. migrate the schema 2. drop all databases (leaving roles in) 3. re-creating the databases (without any content) 4. for each database run pg_dump | psql --single-transaction Perhaps the same thing should also be used in pg_upgradecluster? -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs