Re: [PERFORM] Postgres replication: dump/restore, PITR, Slony,...?

2009-06-12 Thread Shaul Dar
All right, so I misspelled Bucardo (also Mammoth...), and the company's name
is Command Prompt (please get someone to work on that incomprehensible logo
- I went back and looked at it and still have no clue what it means :-).

Now how about some serious answers relating to my questions?

Dimitri, thanks for your answer. I don't need to replicate TO the staging
server (this is where the changes happen) but rather FROM the staging server
TO the Web (query) servers. I think my description wasn't clear enough.
Currently the staging DB changes daily as new records are inserted to it
(would have liked to use COPY instead, but I believe that's only useful for
bulk loading the whole DB, not appending to it?). Those changes need to be
reflected on the Web servers. Today this is done via dump-copy files-restore
of the whole DB (we shut down each Web server DB while restoring it,
obviously), and I we are looking for a better way.

I would truly appreciate specific suggestions and pointers/references (some
trigger based asynchronous replication doesn't help much...).

Also is my understanding of PITR limitations correct?

Thanks,

-- Shaul

On Thu, Jun 11, 2009 at 7:32 PM, Joshua D. Drake j...@commandprompt.comwrote:

 On Thu, 2009-06-11 at 16:30 +, Greg Sabino Mullane wrote:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: RIPEMD160
 
   Then there are Slony-I, Buchardo, Mamoth Replicator from CMO, simple
   replication in Postgres 8.4 and other projects...
 
   CMO? :)
 
  Buchardo? :)

 A new desert, Buchardo CMO:

 Two shots of brandy
 One shot of rum
 Vanilla Ice cream
 Cherries

 Blend to perfection.

 Joshua D. Drake



Re: [PERFORM] GiST index performance

2009-06-12 Thread Adam Gundy

Matthew Wakeling wrote:
Okay, I don't know quite what's happening here. Tom, perhaps you could 
advise. Running opannotate --source, I get this sort of stuff:


/*
 * Total samples for file : 
.../postgresql-8.4beta2/src/backend/access/gist/gistget.c

 *
 *   6880  0.2680
 */

and then:

   :static int64
   :gistnext(IndexScanDesc scan, TIDBitmap *tbm)
81  0.0032 :{ /* gistnext total: 420087 16.3649 */
   :Pagep;



The gistnext total doesn't seem to correspond to the amount I get by 
adding up all the individual lines in gistnest. Moreover, it is greater 
than the total samples attributed to the whole file, and greater than 
the samples assigned to all the lines where gistnext is called.


there's another alternative for profiling that you might try if you 
can't get sensible results out of oprofile - cachegrind (which is part 
of the valgrind toolset).


basically it runs the code in an emulated environment, but records every 
access (reads/writes/CPU cycles/cache hits/misses/etc). it's *extremely* 
good at finding hotspots, even when they are due to 'cache flushing' 
behavior in your code (for example, trawling a linked list is touching a 
bunch of pages and effectively blowing your CPU cache..)


there's an associated graphical tool called kcachegrind which takes the 
dumped output and lets you drill down, even to the source code level 
(with cycle count/percentage annotations on the source lines)


all you need to do is compile postgres with debug symbols (full 
optimization ON, otherwise you end up reaching the wrong conclusions).


there's an example of running valgrind on postgres here:

  http://blog.cleverelephant.ca/2008/08/valgrinding-postgis.html

for cachegrind, you basically need to use 'cachegrind' instead of 
'valgrind', and don't disable optimization when you build..


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PERFORM] Postgres replication: dump/restore, PITR, Slony,...?

2009-06-12 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160   


 Currently we use dump (to SQL file, i.e. pg_dump with no args) + file copy
 to replicate the DB daily between the staging and web servers, and then
 restore (via psql) the servers one at the time. In our application we expect
 that average daily change is only to 3% of the records. My question is what
 would be the best way to do this replication?

Bucardo should handle this easy enough. Just install Bucardo, tell it about the
databases, tell it which tables to replicate, and start it up. If the tables
have unique indexes (e.g. PKs) you can use the 'pushdelta' type of sync, which
will copy rows as they change from the staging server to the web servers.
If the tables don't have unique indexes, you'll have to use the 'fullcopy'
sync type, which, as you might imagine, copies the entire table each time.

You can further control both of these to fire automatically when the data
on the staging server changes, or to only fire when you tell it to, e.g.
every X minutes, or based on some other criteria. You can also configure
how many of the web servers get pushed to at one time, from 1 up to
all of them.

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200906121509
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkoyqFkACgkQvJuQZxSWSsjB8ACffcQRD+Vb7SV0RZnoo70hkpwB
nycAn0QDiogs3EuCrc9+h4rMoToTFopz
=Sltu
-END PGP SIGNATURE-



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] what server stats to track / monitor ?

2009-06-12 Thread Joshua Tolley
On Fri, Jun 12, 2009 at 03:52:19PM -0400, Alan McKay wrote:
 I want to turn off the graphing of unimportant data, to unclutter the
 graphs and focus on what's important.

I'm unfamiliar with Munin, but if you can turn off the graphing (so as to
achieve your desired level of un-cluttered-ness) without disabling the capture
of the data that was being graphed, you'll be better off. Others' opinions may
certainly vary, but in my experience, provided you're not causing a
performance problem simply because you're monitoring so much stuff, you're
best off capturing every statistic reasonably possible. The time will probably
come when you'll find that that statistic, and all the history you've been
capturing for it, becomes useful.

- Josh / eggyknap



signature.asc
Description: Digital signature


Re: [PERFORM] what server stats to track / monitor ?

2009-06-12 Thread Alan McKay
 I'm unfamiliar with Munin, but if you can turn off the graphing (so as to
 achieve your desired level of un-cluttered-ness) without disabling the capture
 of the data that was being graphed, you'll be better off. Others' opinions may
 certainly vary, but in my experience, provided you're not causing a
 performance problem simply because you're monitoring so much stuff, you're
 best off capturing every statistic reasonably possible. The time will probably
 come when you'll find that that statistic, and all the history you've been
 capturing for it, becomes useful.

Yes, Munin does allow me to turn off graphing without turning off collecting.

Any pointers for good reading material here?   Other tips?


-- 
“Don't eat anything you've ever seen advertised on TV”
 - Michael Pollan, author of In Defense of Food

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] what server stats to track / monitor ?

2009-06-12 Thread Alan McKay
Yes, I'm familiar with Staplr - if anyone from myyearbook.com is
listening in, I'm still hoping for that 0.7 update :-)   I plan to run
both for the immediate term at least.

But this only concerns collecting - my biggest concern is how to
read/interpret the data!  Pointers to good reading material would be
greatly appreciated.

On Fri, Jun 12, 2009 at 4:40 PM, Rauan Maemirovra...@maemirov.com wrote:
 Hi Alan. For simple needs you can use Staplr, it's very easy to configure.
 There's also one - zabbix, pretty much.


-- 
“Don't eat anything you've ever seen advertised on TV”
 - Michael Pollan, author of In Defense of Food

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] what server stats to track / monitor ?

2009-06-12 Thread Rauan Maemirov
Hi Alan. For simple needs you can use Staplr, it's very easy to configure.
There's also one - zabbix, pretty much.

2009/6/13 Alan McKay:
 Hey folks,

 I'm new to performance monitoring and tuning of PG/Linux (have a fair
 bit of experience in Windows, though those skills were last used about
 5 years ago)

 I finally have Munin set up in my production environment, and my
 goodness it tracks a whole whack of stuff by default!

 I want to turn off the graphing of unimportant data, to unclutter the
 graphs and focus on what's important.

 So, from the perspective of both Linux and PG, is there canonical list
 of here are the most important X things to track ?

 On the PG side I currently have 1 graph for # connections, another for
 DB size, and another for TPS.  Then there are a few more graphs that
 are really cluttered up, each with 8 or 9 things on them.

 On the Linux side, I clearly want to track HD usage, CPU, memory.  But
 not sure what aspects of each.  There is also a default Munin graph
 for IO Stat - not sure what I am looking for there (I know what it
 does of course, just not sure what to look for in the numbers)

 I know some of this stuff was mentioned at PG Con so now I start going
 back through all my notes and the videos.  Already been reviewing.

 If there is not already a wiki page for this I'll write one.   I see
 this is a good general jump off point :

 http://wiki.postgresql.org/wiki/Performance_Optimization

 But jumping off from there (and searching on Performance) does not
 come up with anything like what I am talking about.

 Is there some good Linux performance monitoring and tuning reading
 that you can recommend?

 thanks,
 -Alan

 --
 “Mother Nature doesn’t do bailouts.”
         - Glenn Prickett

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance