Re: [PERFORM] Replication: Slony-I vs. Mammoth Replicator vs. ?

2004-08-14 Thread Joshua D. Drake

Once again, Joshua, would you please explain what you mean with 
batch and live replication system? Slony does group multiple 
master transactions into one replication transaction to improve 
performance (fewer commits on the slaves). The interval of these 
groups is configurable and for high volume DBs it is recommended to 
use about one second, which means that all commits that fall into an 
interval of one second are replicated in one transaction on the slave. 
On normal running systems this results in a replication lag of 600 to 
800 milliseconds in average. On overloaded systems the asynchronous 
nature of course allows the slaves to fall behind.

Your description above is what I considered batch... you are taking a 
batch of transactions and replicating them versus each transaction. I 
am not saying it is bad in any way. I am just saying it is different 
that replicator.

What is a usual average replication lag of Mammoth Replicator?
Obviously it depends on the system, the network connectivity between the 
systems etc... In our test systems it takes less than 100 ms to 
replicate the data. Again it depends on the size of the transaction (the 
data being moved).

What happens to the other existing slaves when you promote by hand? 
This is something that Slony has over replicator. Currently the new 
master will force a full dump to the slaves. Of course this is already 
on the road map, thanks to Slony :) and should be resolved by months end.

The Slony documentation is an issue at the moment and the 
administrative tools around it are immature. The replication engine 
itself exceeds my own expectations and performs very robust.

I have never suggested otherwise. My only comment about maturity is that 
their are actually many companies using replicator in production. We 
have already dealt with the 1.0 blues as they say.

I hope you understand that I, in no way have ever suggested (purposely) 
anything negative about Slony. Only that I believe they serve different 
technical solutions.

Sincerely,
Joshua D. Drake

Jan

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Replication: Slony-I vs. Mammoth Replicator vs. ?

2004-08-14 Thread Christopher Browne
Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] (Joshua D. Drake) would 
write:
 I hope you understand that I, in no way have ever suggested
 (purposely) anything negative about Slony. Only that I believe they
 serve different technical solutions.

Stipulating that I may have some bias ;-), I still don't find it at
all clear what the different situations are shaped like that lead to
Mammoth being forcibly preferable to Slony-I.

(Note that I have a pretty decent understanding about how ERS and
Slony work, so I'm not too frightened of technicalities...  I set up
instances of both on Thursday, so I'm pretty up to speed :-).)

Win32 support may be true at the moment, although I have to discount
that as we only just got the start of a beta release of native Win32
support for PostgreSQL proper.  For that very reason, I had to point
my youngest brother who needed something better than Access to
Firebird last Saturday; I played with my niece while he was doing the
install.  And there is little reason to think that Slony-I won't be
portable to Win32 given a little interest and effort, particularly
once work to make it play well with pgxs gets done.
-- 
(format nil [EMAIL PROTECTED] cbbrowne ntlug.org)
http://www3.sympatico.ca/cbbrowne/multiplexor.html
At  Microsoft, it doesn't  matter which  file you're  compiling, only
which flags you #define.  -- Colin Plumb

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Replication: Slony-I vs. Mammoth Replicator vs. ?

2004-08-14 Thread Joshua D. Drake




Christopher Browne wrote:

  Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] ("Joshua D. Drake") would write:
  
  
I hope you understand that I, in no way have ever suggested
(purposely) anything negative about Slony. Only that I believe they
serve different technical solutions.

  
  
Stipulating that I may have some bias ;-), I still don't find it at
all clear what the different situations are "shaped like" that lead to
Mammoth being forcibly preferable to Slony-I.
  

I would choose replicator if:

1. You want ease of setup
2. You want your each transaction to be replicated at time of commit
3. Your database is already laden with triggers
4. You are pushing a very high transactional load*

* Caveat I have no idea how well Slony performs on a system that does
say 200,000 transactions
an hours that are heavily geared toward updates. Replicator performs
very well in this scenario.

5. Replicators administrative tools are more mature than Slony (for
example you know exactly what state your slaves are in with Replicator).

I would choose Slony if:

1. The fact that it is Open Source matters to you
2. The auto promotion of slaves is important*

*This will be fixed in a couple of weeks with Replicator

To be fair, in the real world --- 

It doesn't make a bit of difference which one you choose it really
comes down to this:

Replicator is dumb simple to setup. Any halfway talented person can
setup replicator
in 30 minutes with a single master / slave configuration.

Slony is Open Source and thus a little easier on the pocket book
initially.

Command Prompt, will support either one -- so the Replicator is
commercially supported
argument is a little weak here. 

Sincerely,

Joshua D. Drake




Sincerely,

Joshua D. Drake





  
(Note that I have a pretty decent understanding about how ERS and
Slony work, so I'm not too frightened of technicalities...  I set up
instances of both on Thursday, so I'm pretty up to speed :-).)

Win32 support may be true at the moment, although I have to discount
that as we only just got the start of a beta release of native Win32
support for PostgreSQL proper.  For that very reason, I had to point
my youngest brother who needed "something better than Access" to
Firebird last Saturday; I played with my niece while he was doing the
install.  And there is little reason to think that Slony-I won't be
portable to Win32 given a little interest and effort, particularly
once work to make it play well with "pgxs" gets done.
  



-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL




[PERFORM] Faster with a sub-query then without

2004-08-14 Thread Martin Foster
I thought this could generate some interesting discussion.  Essentially, 
there are three queries below, two using sub-queries to change the way 
the randomized information (works first by author and then by work) and 
the original which simply randomizes out of all works available.

The one not using sub-queries under EXPLAIN ANALYZE proves itself to be 
less efficient and have a far higher cost then those with the penalty of 
a sub-query.   Since this seems to be counter to what I have been told 
in the past, I thought I would bring this forward and get some 
enlightenment.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
---
SELECT
 g.GalleryID,
 w.WorkID,
 w.WorkName,
 w.WorkImageThumbnail,
 g.GalleryRating,
 g.GalleryPenName
FROM ethereal.Work w, ethereal.Gallery g
WHERE w.GalleryID = g.GalleryID
 AND g.GalleryPrivacy = 'no'
 AND w.WorkImageThumbnail IS NOT NULL
 AND g.PuppeteerLogin = (SELECT PuppeteerLogin
  FROM ethereal.Gallery
  WHERE GalleryType='image'
  GROUP BY PuppeteerLogin
  ORDER BY RANDOM() LIMIT 1)
ORDER BY RANDOM() LIMIT 1
 Limit  (cost=60.70..60.70 rows=1 width=100) (actual time=1.013..1.013 
rows=0 loops=1)
   InitPlan
 -  Limit  (cost=6.36..6.37 rows=1 width=11) (actual 
time=0.711..0.713 rows=1 loops=1)
   -  Sort  (cost=6.36..6.45 rows=33 width=11) (actual 
time=0.708..0.708 rows=1 loops=1)
 Sort Key: random()
 -  HashAggregate  (cost=5.45..5.53 rows=33 width=11) 
(actual time=0.420..0.553 rows=46 loops=1)
   -  Seq Scan on gallery  (cost=0.00..5.30 
rows=60 width=11) (actual time=0.007..0.227 rows=59 loops=1)
 Filter: ((gallerytype)::text = 'image'::text)
   -  Sort  (cost=54.33..54.37 rows=16 width=100) (actual 
time=1.009..1.009 rows=0 loops=1)
 Sort Key: random()
 -  Nested Loop  (cost=0.00..54.01 rows=16 width=100) (actual 
time=0.981..0.981 rows=0 loops=1)
   -  Seq Scan on gallery g  (cost=0.00..5.56 rows=2 
width=24) (actual time=0.855..0.888 rows=1 loops=1)
 Filter: (((galleryprivacy)::text = 'no'::text) AND 
((puppeteerlogin)::text = ($0)::text))
   -  Index Scan using pkwork on work w 
(cost=0.00..24.10 rows=8 width=80) (actual time=0.080..0.080 rows=0 loops=1)
 Index Cond: (w.galleryid = outer.galleryid)
 Filter: (workimagethumbnail IS NOT NULL)
 Total runtime: 1.211 ms

---
SELECT
 g.GalleryID,
 w.WorkID,
 w.WorkName,
 w.WorkImageThumbnail,
 g.GalleryRating,
 g.GalleryPenName
FROM ethereal.Work w, ethereal.Gallery g
WHERE w.GalleryID = g.GalleryID
 AND g.GalleryPrivacy = 'no'
 AND w.WorkImageThumbnail IS NOT NULL
 AND g.GalleryPenName = (SELECT GalleryPenName
  FROM ethereal.Gallery
  WHERE GalleryType='image'
  GROUP BY GalleryPenName
  ORDER BY RANDOM() LIMIT 1)
ORDER BY RANDOM() LIMIT 1
 Limit  (cost=59.92..59.92 rows=1 width=100) (actual time=0.904..0.906 
rows=1 loops=1)
   InitPlan
 -  Limit  (cost=6.69..6.69 rows=1 width=14) (actual 
time=0.731..0.733 rows=1 loops=1)
   -  Sort  (cost=6.69..6.79 rows=42 width=14) (actual 
time=0.729..0.729 rows=1 loops=1)
 Sort Key: random()
 -  HashAggregate  (cost=5.45..5.56 rows=42 width=14) 
(actual time=0.431..0.568 rows=48 loops=1)
   -  Seq Scan on gallery  (cost=0.00..5.30 
rows=60 width=14) (actual time=0.011..0.233 rows=59 loops=1)
 Filter: ((gallerytype)::text = 'image'::text)
   -  Sort  (cost=53.23..53.27 rows=16 width=100) (actual 
time=0.899..0.899 rows=1 loops=1)
 Sort Key: random()
 -  Nested Loop  (cost=0.00..52.91 rows=16 width=100) (actual 
time=0.808..0.862 rows=6 loops=1)
   -  Index Scan using idxgallery_pen on gallery g 
(cost=0.00..4.45 rows=2 width=24) (actual time=0.767..0.769 rows=1 loops=1)
 Index Cond: ((gallerypenname)::text = ($0)::text)
 Filter: ((galleryprivacy)::text = 'no'::text)
   -  Index Scan using pkwork on work w 
(cost=0.00..24.10 rows=8 width=80) (actual time=0.020..0.042 rows=6 loops=1)
 Index Cond: (w.galleryid = outer.galleryid)
 Filter: (workimagethumbnail IS NOT NULL)
 Total runtime: 1.117 ms

---
SELECT
 g.GalleryID,
 w.WorkID,
 w.WorkName,
 w.WorkImageThumbnail,
 g.GalleryRating,
 g.GalleryPenName
FROM ethereal.Work w, ethereal.Gallery g
WHERE w.GalleryID = g.GalleryID
 AND g.GalleryType = 'image'
 AND g.GalleryPrivacy = 'no'
 AND w.WorkImageThumbnail IS NOT NULL
ORDER BY RANDOM() LIMIT 1

 Limit  (cost=111.73..111.73 rows=1 width=100) (actual 
time=13.021..13.023 rows=1 loops=1)
   -  Sort  (cost=111.73..113.70 rows=786 width=100) 

Re: [PERFORM] Faster with a sub-query then without

2004-08-14 Thread Tom Lane
Martin Foster [EMAIL PROTECTED] writes:
 The one not using sub-queries under EXPLAIN ANALYZE proves itself to be 
 less efficient and have a far higher cost then those with the penalty of 
 a sub-query.   Since this seems to be counter to what I have been told 
 in the past, I thought I would bring this forward and get some 
 enlightenment.

The ones with the subqueries are not having to form the full join of W
and G; they just pick a few rows out of G and look up the matching W
rows.

The subquery penalty is nonexistent in this case because the
subqueries are not dependent on any variables from the outer query, and
so they need be evaluated only once, rather than once per outer-query
row which is what I suppose you were expecting.  This is reflected in
the EXPLAIN output: notice they are shown as InitPlans not SubPlans.
The outputs of the InitPlans are essentially treated as constants (shown
as $0 in the EXPLAIN output) and the outer plan is approximately what
it would be if you'd written WHERE g.field = 'constant' instead of
WHERE g.field = (select ...)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster