Rudi Kramer - MWEB wrote:
Michael Christie:

I want to cluster some FreeBSD servers. The purpose of this is to
learn. I would like to run some basic services like www and mail on a
test network. I would like to set up the servers so that if one server
falls over the other will take over the services automatically; load
balancing would be good as well. I have googled, and I could be looking
in the wrong place, but there seems not to be much in regard to setting
up FreeBSD in a cluster, though there is lots on Linux. I have looked at
the High Availability Linux project, and I see on the front page that it
will run on FreeBSD.

So I am a bit lost, and I want to learn how to cluster FreeBSD web
and mail servers. I have looked at Beowulf clusters, which seem to give
computers more grunt. Can someone on the list please advise me on what
clustering software I need to get started, and whether the High
Availability Linux project software will do the job?

I also did some research a while ago and found Wackamole. It looks
pretty interesting as you don't need a central "director" server but all
servers in the cluster check each other. It's also in the ports tree :-)

Site: http://www.backhand.org/wackamole/ Port: /usr/ports/net/wackamole

There's clustering and clustering.  Neither of the two applications
the OP mentioned needs anything like as tight a coupling as many
commercial 'cluster' solutions provide, or as compute-cluster solutions
like Beowulf or Grid Engine[!] provide.

WWW clustering requires two things:

   * A means to detect failed / out of service machines and redirect
     traffic to alternative servers

   * A means to delocalize user sessions between servers

The first requirement can be handled with programs already mentioned,
such as wackamole/spread or hacluster -- another alternative is
hoststated(8)[*] on OpenBSD. You can also use mod_proxy_balancer[+] on
recent Apache 2.2.x to good effect. Certain web technologies provide
this sort of capability directly: e.g. the mod_jk or the newer
mod_proxy_ajp modules for Apache can balance traffic across a number of
back-end Tomcat workers. Of course, this only applies to sites written
in Java.

If you're dealing with high traffic levels and have plenty of money to spend, then a hardware load balancer (Cisco Arrowpoint, Alteon Acedirector, Foundry ServerIron etc.) is a pretty standard choice.
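
As a rough illustration of the first requirement (this is not how
wackamole or hoststated work internally), the sketch below probes each
back end with a plain TCP connect and hands out the next healthy one in
round-robin order. The Balancer class, the is_alive() helper and the
addresses are all made up for the example.

```python
import socket


def is_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Balancer:
    """Hypothetical round-robin picker that skips back ends failing the check."""

    def __init__(self, backends, check=is_alive):
        self.backends = list(backends)   # list of (host, port) tuples
        self.check = check               # health-check callable(host, port) -> bool
        self._next = 0                   # index to start the next search from

    def pick(self):
        """Return the next healthy (host, port), or None if all are down."""
        n = len(self.backends)
        for i in range(n):
            idx = (self._next + i) % n
            host, port = self.backends[idx]
            if self.check(host, port):
                self._next = (idx + 1) % n
                return (host, port)
        return None


if __name__ == "__main__":
    # Example addresses only; substitute the web servers on your test network.
    lb = Balancer([("192.0.2.10", 80), ("192.0.2.11", 80)])
    print(lb.pick())
```

A real balancer would run the health checks in the background rather
than on every request, but the failure-detection-plus-redirect idea is
the same.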

The second requirement is more subtle.  Any reasonably complicated
web application nowadays is unlikely to be completely stateless.  Either
you have to recognise each session and direct the traffic back to the
same server each time, or you have to store the session state in a way
that is accessible to all servers -- typically in a back-end database.
Implementing 'sticky sessions' is generally slightly easier in terms of
application programming, but less resilient to machine failure. There
are other alternatives: Java Servlet based applications running under
Apache Tomcat can cluster about 4 machines together so that session
state is replicated to all of them.  This solution is however not at
all scalable beyond 4 machines, as they'll quickly spend more time
passing state information between themselves than they do actually
serving incoming web queries.
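
To make the two session-handling options concrete, here is a minimal
sketch. sticky_backend() pins a session to one server by hashing its ID,
while SharedSessionStore is an in-memory stand-in for the back-end
database, so that any server can pick up any session. All the names are
invented for illustration; a real deployment would use the load
balancer's own stickiness feature or a real database.

```python
import hashlib


def sticky_backend(session_id, backends):
    """Map a session ID to one back end, the same one on every request."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


class SharedSessionStore:
    """Stand-in for a back-end database that every web server can reach."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        # Return a copy so callers cannot mutate the stored state by accident.
        return dict(self._data.get(session_id, {}))

    def save(self, session_id, state):
        self._data[session_id] = dict(state)


def handle_request(store, session_id):
    """Serve one request on any server: load state, update it, save it back."""
    state = store.load(session_id)
    state["hits"] = state.get("hits", 0) + 1
    store.save(session_id, state)
    return state["hits"]


if __name__ == "__main__":
    servers = ["web1", "web2", "web3"]        # hypothetical server names
    print(sticky_backend("abc123", servers))  # same answer on every call
    store = SharedSessionStore()
    print(handle_request(store, "abc123"), handle_request(store, "abc123"))
```

Note that with the sticky approach, taking a server out of the list
changes where sessions hash to, and whatever state lived only on the
failed machine is gone. With the shared store, the session survives as
long as the store itself does.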

Mail clustering is an entirely different beast.  In fact, it's two
different beasts with entirely different characteristics.

The easy part with mail is the MTA -- SMTP has built-in concepts of
fail-over and retrying with alternate servers. Just set up appropriate
MX records in the DNS pointing at a selection of servers and it all
should work pretty much straight away. You may need to share certain
data between your SMTP servers (like greylisting status, Bayesian spam
filtering, authentication databases) but the software is generally
written with this capability built in.
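
The fail-over behaviour that a sending MTA gets from MX records can be
sketched roughly as follows: sort the records by preference (lowest
number first) and try each host in turn until one accepts the message.
The hostnames and the try_host callback are hypothetical, and a real MTA
also randomises among equal-preference hosts and queues the message for
a later retry, which this sketch leaves to the caller.

```python
def delivery_order(mx_records):
    """Return hostnames from (preference, hostname) pairs, best first.

    Lower preference numbers are tried first. Ties are broken by hostname
    here purely to keep the example deterministic.
    """
    return [host for _pref, host in sorted(mx_records)]


def deliver(message, mx_records, try_host):
    """Try each MX host in order; return the host that accepted, or None.

    try_host(host, message) is a caller-supplied function returning True
    on successful hand-off. None means every host failed and the caller
    should queue the message and retry later.
    """
    for host in delivery_order(mx_records):
        if try_host(host, message):
            return host
    return None


if __name__ == "__main__":
    # Example records only, as you might publish for a test domain.
    records = [(20, "mx2.example.org"), (10, "mx1.example.org")]
    # Pretend the primary is down: only mx2 accepts mail.
    accepted_by = deliver("test", records, lambda h, m: h == "mx2.example.org")
    print(accepted_by)
```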

The hard part with mail clustering is the mail store, which provides the
IMAP or POP3 or WebMail interface to allow users to actually read their
mail. To my knowledge there is no freely available open source solution
that provides an entirely resilient IMAP/POP3 service.  Cyrus Murder
comes close, in that it provides multiple back-end mail stores, easy
migration of mailboxes between stores and resilient front ends. The
typical approach here is to use a high-spec server with RAIDed disk
systems, multiple PSUs etc. and to keep very good backups.

        Cheers,

        Matthew


[!] http://gridengine.sunsource.net/

[*] hoststated(8) integrates with the traffic redirection capabilities
of pf(4) to provide pretty much the same sort of functionality as a
hardware load balancer via a firewall machine, but a lot cheaper.
http://www.openbsd.org/cgi-bin/man.cgi?query=hoststated&sektion=8&format=html

[+] http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html

--
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                 Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                 Kent, CT11 9PW
