> I'm trying to address 2 issues:
>
> A. Avoiding a single point of failure associated with
>    having a central repository for the data, such as an NFS
>    share or a single database server.
> B. Avoiding the overhead from using heavyweight tools like
>    database replication.
>
> So I've been thinking about how to pull that off, and I think
> I've figured out how, as long as I don't need every machine to
> have exactly the same version of the data structure at all times.

There are many approaches to this problem, and the one that's appropriate
depends on what you're using the data for, how large your cluster is, and
how out-of-sync the nodes can be.

> What it comes down to is implementing 2 classes: one implements
> a daemon running on each server in the cluster, responsible for
> handling requests to update the data across the network, and the
> other is a class usable inside mod_perl to handle local updates
> and inform other servers of updates.

That will get cumbersome if you have a large number of nodes all trying to
tell each other about updates.  On a recent project, we wanted to share some
cached data which didn't have to be well synchronized.  We did it by writing
a daemon like you're suggesting that sits on top of a fast BerkeleyDB
database, but we used multicast for sending out updates.  With a large
cluster, you'll quickly tie up all your resources if every update has to be
sent separately to every other server.
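To make the contrast concrete: with multicast, an update is one datagram to a group address, and every node's daemon picks it up, instead of N-1 separate connections.  Our daemon was Perl over BerkeleyDB, but the idea translates; here's a minimal Python sketch of just the sending side (the group address, port, and message format are made up for illustration):

```python
import json
import socket

MCAST_GROUP = "239.0.0.1"   # hypothetical multicast group for the cluster
MCAST_PORT = 4711           # hypothetical port the daemons listen on

def encode_update(key, value):
    """Serialize one cache update so every listening daemon can
    apply it to its local BerkeleyDB (or other) store."""
    return json.dumps({"key": key, "value": value}).encode("utf-8")

def send_update(key, value):
    """Send a single datagram to the multicast group.  Every node
    receives it at once -- no per-server unicast connections."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Keep the packet on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    try:
        sock.sendto(encode_update(key, value), (MCAST_GROUP, MCAST_PORT))
    finally:
        sock.close()
```

The receiving daemon would join the group and apply each decoded update to its local store; updates can be lost or arrive out of order, which is why this only works for data that doesn't have to be well synchronized.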

If all you're after is redundancy for your short-term user data, you could
do something like the "TCP-ring" sessions described here:
http://www.caucho.com/products/resin/java_tut/tcp-sessions.xtp.  There's
nothing especially tricky about this: just write all of your updates through
to a backup database on another server, and embed enough information in a
cookie to find the backup if the main one for a session fails.
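A rough sketch of that write-through-plus-cookie scheme, in Python for illustration (the cookie layout and the dict-backed stores are invented stand-ins for real session storage):

```python
def make_session_cookie(session_id, primary, backup):
    """Pack enough routing info into the cookie that the application
    can find the backup copy if the primary server for this session
    fails.  Format is hypothetical: id:primary_host:backup_host."""
    return f"{session_id}:{primary}:{backup}"

def parse_session_cookie(cookie):
    """Recover the session id and both server names from the cookie."""
    session_id, primary, backup = cookie.split(":")
    return session_id, primary, backup

def save_session(primary_store, backup_store, session_id, data):
    """Write-through: every session update goes to the primary store
    and to the backup store on another server, so either copy can
    serve the session after a failure."""
    primary_store[session_id] = data
    backup_store[session_id] = data
```

Reads go to the primary named in the cookie; if that server is down, the same cookie tells you where the backup lives.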

> I believe I wouldn't be the only person finding something like
> this terrifically useful.  Furthermore, I see that Cache::Cache
> could be the underlying basis for those classes.  Most of the
> deep network programming is already there in Net::Daemon.

You might want to look at http://www.spread.org/ or Recall:
http://www.fault-tolerant.org/recall/.

Incidentally, database replication may not sound like such a bad idea after
you examine some of the alternatives.  It's really just one group's solution
to the problem you're posing.  There are replication tools for MySQL which
are supposedly fairly easy to run.

- Perrin
