Re: [Drizzle-discuss] What are the "big" replication problems and solutions?

Paul McCullagh Wed, 24 Mar 2010 01:12:10 -0700

Hi Robert,

On Mar 23, 2010, at 5:13 PM, Robert Hodges wrote:

On 3/23/10 3:24 AM PDT, "Paul McCullagh" <[email protected]> wrote:

Hi Robert,

On Mar 22, 2010, at 8:43 PM, Robert Hodges wrote:


1.) What are the "big" problems in replication?  Let's say we have
things like availability and basic read scaling basically handled.
What's next on the list?  (Big data, No-SQL, replication/database
impedance mismatch due to faster hardware, complex topologies,
management, etc., all suggestions are welcome.)

2.) What replication solutions are emerging to address those
problems?  Jay and I work on or know most of the usual suspects like
Drizzle, Tungsten, Galera, MySQL 5.4, Rabbit MQ, etc.  However, if
there's something really cool out there we'll add it to the list.


<plug>
It may be relevant to mention engine-level replication in a survey of
the replication landscape. This is now built into PBXT.

Engine-level replication is not as flexible as Drizzle replication,
for example, but it is extremely efficient.

It is relatively low-level, but not as low-level as block device
replication (DRBD), and is much easier to set up and maintain.

I believe this has a role to play, positioned between block-level
replication on one side and, on the other, the high-level,
replicate-to-anywhere approach being developed for Drizzle and global
replication systems like Tungsten.
</plug>

This is a very interesting point. Physical and logical replication are
kind of like Yin and Yang for building systems.

I was thinking of pointing out in the talk the work going on in
PostgreSQL with PG 9 and streaming replication/hot standby. The
PostgreSQL approach, as you are probably aware, has been to implement
very solid physical replication while leaving logical replication to
external, trigger-based products like Londiste.


This is basically the PBXT approach.

How do you get around some of the issues like snapshot maintenance for
queries that are currently consuming cycles in the PostgreSQL effort?
In the current PG 9 alpha, users may have to choose between up-to-date
slaves and the ability to run queries that maintain a snapshot for a
lengthy period of time.

This should not be much of a problem because the PBXT MVCC
implementation can handle long-running snapshot queries/transactions.
PBXT stores all versions of a row on disk, so it does not limit the
size of a snapshot.

However, each version of a row is indexed, so a long-running snapshot
can lead to inflation of the index. This is not a problem for updates,
however, and especially not for replicated updates, which are based on
the internal row ID. So a snapshot should not have much effect on the
speed at which replication changes are applied.

But it can affect the speed of long-running SELECTs that are being
done on the slave.
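
To make the trade-off above concrete, here is a minimal toy sketch in
Python. It is not PBXT code and does not reflect PBXT's actual data
structures; the class ToyVersionStore and all of its method names are
hypothetical, invented only for illustration. It assumes one index
entry per row version and a purge step that may only drop versions no
open snapshot can still see. Updates address rows by row ID and never
scan the index, while the number of index entries grows for as long as
an old snapshot stays open.

```python
from collections import defaultdict


class ToyVersionStore:
    """Toy multi-version store (hypothetical, not PBXT's implementation)."""

    def __init__(self):
        self.last_txn = 0
        # row_id -> list of (commit_txn, value), oldest first
        self.versions = defaultdict(list)
        # Assumption: one index entry per stored row version.
        self.index = set()
        self.open_snapshots = []

    def begin_snapshot(self):
        # A snapshot sees every version committed up to this point.
        self.open_snapshots.append(self.last_txn)
        return self.last_txn

    def end_snapshot(self, snap):
        self.open_snapshots.remove(snap)

    def apply_update(self, row_id, value):
        # Replicated updates locate the row by ID, so their cost does
        # not depend on how many old versions sit in the index.
        self.last_txn += 1
        self.versions[row_id].append((self.last_txn, value))
        self.index.add((row_id, self.last_txn))

    def read(self, row_id, snap):
        # Return the newest version visible to the snapshot.
        for txn, value in reversed(self.versions[row_id]):
            if txn <= snap:
                return value
        return None

    def purge(self):
        # Versions older than the oldest open snapshot can go, except
        # the newest one that snapshot can still see.
        horizon = min(self.open_snapshots, default=self.last_txn)
        for row_id in list(self.versions):
            chain = self.versions[row_id]
            older = [v for v in chain if v[0] <= horizon]
            newer = [v for v in chain if v[0] > horizon]
            for txn, _ in older[:-1]:
                self.index.discard((row_id, txn))
            self.versions[row_id] = older[-1:] + newer
```

In this sketch, opening a snapshot and then applying many updates to
the same row leaves every one of those versions in the index until the
snapshot ends, which is the kind of index inflation that would slow
index scans for long-running SELECTs without slowing the updates
themselves.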


Best regards,

Paul



--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com




_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
