Re: [PERFORM] Postgres replication: dump/restore, PITR, Slony,...?
All right, so I misspelled Bucardo (also Mammoth...), and the company's name is Command Prompt (please get someone to work on that incomprehensible logo - I went back and looked at it and still have no clue what it means :-). Now how about some serious answers relating to my questions?

Dimitri, thanks for your answer. I don't need to replicate TO the staging server (that is where the changes happen) but rather FROM the staging server TO the Web (query) servers; I think my description wasn't clear enough. Currently the staging DB changes daily as new records are inserted into it (I would have liked to use COPY instead, but I believe that's only useful for bulk loading the whole DB, not for appending to it?). Those changes need to be reflected on the Web servers. Today this is done via dump, file copy, and restore of the whole DB (we shut down each Web server's DB while restoring it, obviously), and we are looking for a better way. I would truly appreciate specific suggestions and pointers/references ("some trigger-based asynchronous replication" doesn't help much...). Also, is my understanding of the PITR limitations correct?

Thanks,
-- Shaul

On Thu, Jun 11, 2009 at 7:32 PM, Joshua D. Drake <j...@commandprompt.com> wrote:
> On Thu, 2009-06-11 at 16:30 +, Greg Sabino Mullane wrote:
> > > Then there are Slony-I, Buchardo, Mamoth Replicator from CMO,
> > > simple replication in Postgres 8.4 and other projects...
> >
> > CMO? :) Buchardo? :)
>
> A new dessert, the Buchardo CMO:
>
>   Two shots of brandy
>   One shot of rum
>   Vanilla ice cream
>   Cherries
>
>   Blend to perfection.
>
> Joshua D. Drake
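On the COPY question raised above: COPY is not limited to bulk-loading a whole database - it appends rows to whatever existing table you point it at, so it can serve daily inserts. A minimal sketch (database, table, and file names here are made up, not from the thread):

```shell
# \copy runs COPY ... FROM STDIN client-side; the rows are appended to
# the existing table, not substituted for its current contents.
psql -d staging -c "\copy items FROM 'new_records.csv' WITH CSV"
```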
Re: [PERFORM] GiST index performance
Matthew Wakeling wrote:
> Okay, I don't know quite what's happening here. Tom, perhaps you could advise. Running opannotate --source, I get this sort of stuff:
>
>   /*
>    * Total samples for file : ".../postgresql-8.4beta2/src/backend/access/gist/gistget.c"
>    *
>    *   6880  0.2680
>    */
>
> and then:
>
>                 :static int64
>                 :gistnext(IndexScanDesc scan, TIDBitmap *tbm)
>      81  0.0032 :{ /* gistnext total: 420087 16.3649 */
>                 :    Page        p;
>
> The gistnext total doesn't seem to correspond to the amount I get by adding up all the individual lines in gistnext. Moreover, it is greater than the total samples attributed to the whole file, and greater than the samples assigned to all the lines where gistnext is called.

There's another alternative for profiling that you might try if you can't get sensible results out of oprofile: cachegrind, which is part of the valgrind toolset. Basically it runs the code in an emulated environment, but records every access (reads/writes/CPU cycles/cache hits/misses/etc.). It's *extremely* good at finding hotspots, even when they are due to "cache flushing" behavior in your code (for example, trawling a linked list touches a bunch of pages and effectively blows your CPU cache).

There's an associated graphical tool called kcachegrind which takes the dumped output and lets you drill down, even to the source-code level (with cycle count/percentage annotations on the source lines).

All you need to do is compile postgres with debug symbols (and full optimization ON, otherwise you end up reaching the wrong conclusions). There's an example of running valgrind on postgres here:

http://blog.cleverelephant.ca/2008/08/valgrinding-postgis.html

For cachegrind, you basically need to use 'cachegrind' instead of 'valgrind', and don't disable optimization when you build.
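A rough sketch of that cachegrind workflow (the configure flags, data directory, and output file name are assumptions for illustration, not from the thread):

```shell
# Build postgres with debug symbols but optimization left on, so the
# annotated source corresponds to realistic machine code.
./configure CFLAGS="-O2 -g"
make && make install

# Run the server under cachegrind; --trace-children=yes follows the
# forked backend processes. Every memory access is simulated, so
# expect a large slowdown while profiling.
valgrind --tool=cachegrind --trace-children=yes \
    postgres -D /usr/local/pgsql/data

# Reproduce the slow GiST query from another session, stop the server,
# then annotate the per-function counts (or open the dump in kcachegrind).
cg_annotate cachegrind.out.<pid>
```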
Re: [PERFORM] Postgres replication: dump/restore, PITR, Slony,...?
> Currently we use dump (to SQL file, i.e. pg_dump with no args) + file copy to replicate the DB daily between the staging and web servers, and then restore (via psql) the servers one at a time. In our application we expect that the average daily change is to only 3% of the records. My question is: what would be the best way to do this replication?

Bucardo should handle this easily enough. Just install Bucardo, tell it about the databases, tell it which tables to replicate, and start it up. If the tables have unique indexes (e.g. PKs) you can use the 'pushdelta' type of sync, which will copy rows as they change from the staging server to the web servers. If the tables don't have unique indexes, you'll have to use the 'fullcopy' sync type, which, as you might imagine, copies the entire table each time.

You can further control both of these to fire automatically when the data on the staging server changes, or to fire only when you tell it to, e.g. every X minutes, or based on some other criteria. You can also configure how many of the web servers get pushed to at one time, from 1 up to all of them.

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200906121509
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
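The steps Greg describes can be sketched roughly as follows. The commands follow the bucardo_ctl administration tool; the host names, database names, and table list are made up, and the exact syntax should be checked against the Bucardo documentation for your version:

```shell
# Tell Bucardo about the source (staging) and one target (web) database.
# All names here are hypothetical.
bucardo_ctl add database staging host=staging.example.com dbname=app
bucardo_ctl add database web1 host=web1.example.com dbname=app

# Group the tables to replicate into a herd, then define a pushdelta
# sync: only rows that changed on staging are pushed to the target,
# which suits a ~3% daily change rate far better than a full reload.
bucardo_ctl add table public.items herd=daily_tables
bucardo_ctl add sync daily_push source=daily_tables targetdb=web1 type=pushdelta

# Start the Bucardo daemon; the sync can fire on change or on demand.
bucardo_ctl start
```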
Re: [PERFORM] what server stats to track / monitor ?
On Fri, Jun 12, 2009 at 03:52:19PM -0400, Alan McKay wrote:
> I want to turn off the graphing of unimportant data, to unclutter the graphs and focus on what's important.

I'm unfamiliar with Munin, but if you can turn off the graphing (so as to achieve your desired level of un-cluttered-ness) without disabling the capture of the data that was being graphed, you'll be better off. Others' opinions may certainly vary, but in my experience, provided you're not causing a performance problem simply because you're monitoring so much stuff, you're best off capturing every statistic reasonably possible. The time will probably come when you'll find that that statistic, and all the history you've been capturing for it, becomes useful.

- Josh / eggyknap
Re: [PERFORM] what server stats to track / monitor ?
> I'm unfamiliar with Munin, but if you can turn off the graphing (so as to achieve your desired level of un-cluttered-ness) without disabling the capture of the data that was being graphed, you'll be better off. Others' opinions may certainly vary, but in my experience, provided you're not causing a performance problem simply because you're monitoring so much stuff, you're best off capturing every statistic reasonably possible. The time will probably come when you'll find that that statistic, and all the history you've been capturing for it, becomes useful.

Yes, Munin does allow me to turn off graphing without turning off collecting. Any pointers for good reading material here? Other tips?

--
“Don't eat anything you've ever seen advertised on TV” - Michael Pollan, author of In Defense of Food
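One low-effort way to capture "every statistic reasonably possible" on the PostgreSQL side is to snapshot the server's cumulative counters on a schedule and graph the deltas later. A sketch; the database name is hypothetical:

```shell
# pg_stat_database accumulates commits, rollbacks, and block I/O since
# the last stats reset; sampling it periodically (e.g. from cron) gives
# TPS and cache hit-rate history even for metrics not graphed yet.
psql -d app -c "SELECT datname, xact_commit, xact_rollback,
                       blks_read, blks_hit
                FROM pg_stat_database;"
```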
Re: [PERFORM] what server stats to track / monitor ?
Yes, I'm familiar with Staplr - if anyone from myyearbook.com is listening in, I'm still hoping for that 0.7 update :-) I plan to run both for the immediate term at least. But this only concerns collecting - my biggest concern is how to read/interpret the data! Pointers to good reading material would be greatly appreciated.

On Fri, Jun 12, 2009 at 4:40 PM, Rauan Maemirov <ra...@maemirov.com> wrote:
> Hi Alan. For simple needs you can use Staplr, it's very easy to configure. There's also another one, zabbix.

--
“Don't eat anything you've ever seen advertised on TV” - Michael Pollan, author of In Defense of Food
Re: [PERFORM] what server stats to track / monitor ?
Hi Alan. For simple needs you can use Staplr, it's very easy to configure. There's also another one, zabbix.

2009/6/13 Alan McKay:
> Hey folks,
>
> I'm new to performance monitoring and tuning of PG/Linux (I have a fair bit of experience in Windows, though those skills were last used about 5 years ago).
>
> I finally have Munin set up in my production environment, and my goodness it tracks a whole whack of stuff by default! I want to turn off the graphing of unimportant data, to unclutter the graphs and focus on what's important. So, from the perspective of both Linux and PG, is there a canonical list of "the most important X things to track"?
>
> On the PG side I currently have one graph for # of connections, another for DB size, and another for TPS. Then there are a few more graphs that are really cluttered up, each with 8 or 9 things on them.
>
> On the Linux side, I clearly want to track HD usage, CPU, and memory, but I'm not sure what aspects of each. There is also a default Munin graph for iostat - not sure what I am looking for there (I know what it does, of course, just not sure what to look for in the numbers).
>
> I know some of this stuff was mentioned at PGCon, so now I'll start going back through all my notes and the videos. Already been reviewing. If there is not already a wiki page for this, I'll write one. I see this is a good general jump-off point:
>
> http://wiki.postgresql.org/wiki/Performance_Optimization
>
> But jumping off from there (and searching on "Performance") does not come up with anything like what I am talking about. Is there some good Linux performance monitoring and tuning reading that you can recommend?
> thanks,
> -Alan
>
> --
> “Mother Nature doesn’t do bailouts.” - Glenn Prickett
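On the iostat question: the numbers usually worth watching in `iostat -x` output are await (average time a request spends queued plus being serviced, in ms) and %util (fraction of time the device was busy). A toy illustration with made-up, deliberately simplified columns, filtering for devices running near saturation:

```shell
# Simplified, hypothetical iostat-like output: device, await (ms), %util.
iostat_sample='Device: await %util
sda 4.2 12.0
sdb 85.3 97.5'

# Flag any device busier than 80% -- a common first sign that disk,
# not CPU or memory, is the bottleneck.
echo "$iostat_sample" | awk 'NR > 1 && $3 > 80 { print $1, "is saturated at", $3 "%" }'
```

Here sdb would be flagged while sda is idle enough to ignore; sustained high await on the flagged device confirms requests are queuing.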