Re: [ADMIN] High Availability: Hot Standby vs. Warm Standby

2010-07-12 Thread Brad Nicholson
On Mon, 2010-07-12 at 08:58 +0200, Thomas Kellerer wrote:
> Greg Smith, 10.07.2010 14:44:
> >> Is there a difference in how much data could potentially be lost in
> >> case of a failover? E.g. because 9.0 replicates the changes quicker than 
> >> 8.4?
> >
> > There's nothing that 9.0 does that you can't do with 8.4 and the right
> > software to aggressively ship partial files around. In practice though,
> > streaming replication is likely to result in less average data loss simply
> > because it will do the right thing and ship new transactions
> > automatically. Getting the same reaction time and resulting low amount
> > of lag out of an earlier version requires a level of external script
> > configuration that few sites ever actually manage to obtain. You can
> > think of the 9.0 features as significantly reducing the complexity of
> > the setup needed to achieve low latency. I would bet
> > that if you tried to set up 8.4 to achieve the same quality level in
> > terms of quick replication, your result would be more fragile and buggy
> > than just using 9.0--the bugs would just be in your own code rather
> > than in the core server.
> >
> 
> Greg and Rob,
> 
> thanks for the answers.
> 
> I didn't "plan" (or expect) to get the same level of reliability from a 
> "standard" 8.4 HA installation, so I don't think I would go that way. If we 
> do need that level, we'd go for 9.0 or for some other solution.
> 
> The manual lists three possible solutions to HA: shared disk failover, file 
> system replication and Warm/Hot Standby. I'm not an admin (nor a DBA), so my 
> question might sound a bit stupid: from my point of view solutions using 
> shared disk failover or file system replication seem to be more reliable in 
> terms of how much data can get lost (and possibly the switchover lag)

With shared disk failover, you don't use filesystem replication.  Your
disk resources are available to a secondary server, and in the event of
a failure of the primary server, your secondary takes ownership of the
disk resources.

The nice thing about shared disk solutions is that you won't lose any
committed data if a server fails.

The downsides are that shared disk can be really tough to set up
properly.  Your storage is still a single point of failure, so you need
to make sure that it's reliable, and most likely still use alternate
means to protect against failure of the storage.

Warm/Hot Standby is a lot easier to set up, but there is a window for
data loss on failure.  This can be minimized or eliminated by using some
sort of block-level synchronous replication (DRBD, array- or SAN-based)
if you can afford the overhead.  I don't have any first-hand experience
with the synchronous stuff, so I can't comment much further than that.
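
For anyone curious what the basic file-shipping setup looks like, here
is a rough sketch of an 8.x-era warm standby (the paths and trigger file
location are placeholders, not a tested recipe).  On the primary, in
postgresql.conf:

    archive_mode = on                                    # 8.3+
    archive_command = 'cp %p /mnt/server/archivedir/%f'  # or rsync/scp to the standby

On the standby, after restoring a base backup, in recovery.conf:

    restore_command = 'pg_standby -t /tmp/pgsql.trigger /mnt/server/archivedir %f %p %r'

pg_standby (from contrib) keeps waiting for the next WAL segment to
arrive; touching the trigger file tells it to stop waiting and finish
recovery, which is how you promote the standby.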

Switchover times are really going to vary.

For shared clusters, there is some overhead in dealing with the low
level disk stuff, but I find it's not that bad.

The bigger issue on switchover is whether or not you have time to call a
fast shutdown instead of having the server do a hard crash.  If it's a
hard crash (which it usually is), you'll start up in recovery mode on
the secondary server and have to replay through WAL.  If you have a lot
of WAL files to replay on startup, the switchover time can be
quite long.

Warm/Hot Standby tends to be faster on failover as long as you are
applying the WAL files at a reasonable rate.

One further thing to mention - all of these solutions are based on
making the physical blocks available (actually, I'm not sure about
Streaming replication in 9.0).  As such, it is possible for corruption
to hit the master at the block level and get replicated through the
chain.

Logical solutions like Slony/Bucardo/Londiste do give some additional
protection against this.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.





Re: [ADMIN] Postgresql shared_buffer and SHMMAX configuration

2010-07-12 Thread Kevin Grittner
"Gnanakumar"  wrote:
 
> We have our java servlet code running in Tomcat and the connection
> is managed by pgpool.
 
If you're running under Tomcat, you might not even need pgpool; see
this recent post:
 
http://archives.postgresql.org/pgsql-performance/2010-07/msg00154.php
 
> When we did a load test in this environment with 150 connections
> in num_init_children, we got an error on the Java side saying "DBACCESS
> CONNECTION exception no connection could be got",
> while there were no errors in the pgpool and PostgreSQL logs.
> 
> When we increased this to 420, it was working fine.
 
You're not going to get errors with your current configuration, but
you're also not going to see optimal performance.  If performance is
"good enough" for your needs, maybe you don't need to care.
 
>> For starters, you should not be configuring a shared memory
>> maximum of over 18GB for your 15GB machine.
>  
>> More like (assuming your "given" settings):
>  
>> (250 kB) + (2304 MB * 1.025) + (14.2 kB * 500)
>>  = 2.48384348 * 10^9 bytes
>  
>> plus whatever shared memory you need for other processes and a
>> little "slush".  I might just go to something in the 4GB range,
>> unless I know something else needs a lot.
> Is there any documentation link for the above formula?
> 250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections was our
> formula.
 
Actually, I was just using your formula, correcting for an obvious
error -- you took "8.2 kB * shared_buffers" to mean that you multiply
the kB of buffer space by 8.2 kB.  This is obviously silly, since
there's no such thing as square kB.  The thing to multiply by 8.2 kB
would appear to be the *number of 8 kB buffers*, which allows
for some overhead involved in tracking the buffers.  8.2 / 8 =
1.025.
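
Spelled out with your numbers (a quick sketch; the per-unit figures are
just the documentation's rough estimates):

    KB = 1024
    MB = 1024 * KB

    shared_buffers = 2304 * MB              # configured buffer space
    max_connections = 500

    n_buffers = shared_buffers // (8 * KB)  # number of 8 kB buffers
    estimate = 250 * KB + n_buffers * 8.2 * KB + max_connections * 14.2 * KB

    print(estimate / 1e9)                   # ~2.48e9 bytes, hence my 4GB suggestion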
 
>> I would expect these changes in the pgpool and PostgreSQL
>> settings to actually make things worse, although the extra
>> caching from the additional RAM may counteract that to some
>> degree.  Before I could venture a suggestion on what good
>> settings might be, I would need to know more about the database
>> server hardware.  How many CPU cores?  How many disks in what
>> arrangement?
>  
> We have our database server running in Amazon EC2.
> It is not a dedicated database server; our Java application is also
> running on the same instance.
> 
> Regarding server hardware, CPU cores, etc.: it is 4 virtual cores
> with 2 EC2 Compute Units each.
 
You said nothing about the effective spindle count.  Benchmarks by
myself and others have suggested that PostgreSQL gets best
performance (both response time and throughput) when a connection
pool limits the number of active queries to somewhere around:
 
((2 * CPU core count) + (effective spindle count))
 
Beyond that, resource contention and context switching cause
performance to degrade.  I always try to use a pooling technique
which will limit active queries on the database to some hard limit,
queuing excess queries to run as other queries finish.  You're using
what by all accounts is a good pooling product, but you're not
configuring it to do this for you.  And since Tomcat has a built-in
pool, it would probably be better to use that and not even set up an
intermediate pool; although since you already have one I would
recommend benchmarking both ways to check.
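
For example, if that instance has 4 cores and something like 2 effective
spindles behind it, the formula suggests capping around 10 active
connections.  With Tomcat's built-in DBCP pool that cap might look
roughly like this (resource name, credentials, and the spindle count are
invented for illustration):

    <!-- context.xml: pool capped at (2 * 4 cores) + 2 spindles = 10 -->
    <Resource name="jdbc/mydb" auth="Container" type="javax.sql.DataSource"
              driverClassName="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/mydb"
              username="appuser" password="secret"
              maxActive="10" maxIdle="5" maxWait="10000"/>

Requests beyond maxActive wait (up to maxWait milliseconds) for a
connection to free up, which gives roughly the queuing behavior I was
describing.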
 
-Kevin



Re: [ADMIN] High Availability: Hot Standby vs. Warm Standby

2010-07-12 Thread Greg Smith

Thomas Kellerer wrote:
> The manual lists three possible solutions to HA: shared disk failover,
> file system replication and Warm/Hot Standby. I'm not an admin (nor a
> DBA), so my question might sound a bit stupid: from my point of view
> solutions using shared disk failover or file system replication seem
> to be more reliable in terms of how much data can get lost (and
> possibly the switchover lag)


Yes, but if you try you'll discover that actually getting any shared
disk or file system replication solution set up so that you really do
achieve less failover loss than the file shipping approach will be
expensive, complicated, fragile in its own way, and just generally a
pain to pull off.  The fundamental problem with shared storage, for
example, is how to keep a node that has failed from trying to reassume
the master role when it comes back.  Doing that well requires hardware
support aimed at that specific use case.

Meanwhile, file shipping for Warm Standby use requires nothing special
at all except some modest software setup.  It's comparatively simple to
set up, validate, and keep going on any hardware capable of running the
database.  This is why shared storage and the like isn't the only
obvious solution even though it's technically capable of losing fewer
transactions; you'll discover that keeping from losing that last little
bit of data when there's a crash turns out to be quite expensive.
Whether it's worth it or not depends on the value of your data and
whether it can be retained at some higher level when this happens
instead.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [ADMIN] High Availability: Hot Standby vs. Warm Standby

2010-07-12 Thread Greg Smith

Brad Nicholson wrote:

> One further thing to mention - all of these solutions are based on
> making the physical blocks available (actually, I'm not sure about
> Streaming replication in 9.0).


You're right here; the SR feature in 9.0 is essentially near real-time 
partial WAL file shipping, and the WAL contains physical disk block 
change data.  If your master has corrupted data blocks, the next time
you do a base backup against it, that corruption will be mirrored to the
standby too.  I've mentioned on one of these lists recently that I like
to schedule a periodic pg_dump even if log shipping is the main backup 
mechanism for a database, just so that corruption in the underlying 
files is caught as early as possible by trying to read every block and 
confirm it has valid data.
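
Something as simple as this (schedule, user, and database name invented
for illustration) is enough, since pg_dump has to read through every
table to produce its output:

    # crontab entry: weekly full dump, discarded -- a failing exit status
    # or errors in the mail from cron are the early warning
    0 3 * * 0  pg_dump -U postgres mydb > /dev/null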


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

