On 24 Oct 2008, at 3:09 pm, Joe Landman wrote:

Carsten Aulbert wrote:
Hi Jon
Jon Aquilina wrote:
but why waste time sifting through all 26,000+ pkgs in the repos when u
can have a distro with repos focused on clustering pkgs?
Because you might/will save time later when you hit user requests which
want packages which are not pre-packaged in your cluster distro.

Allow me to expand on this.

Some distro packaged stuff is garbage, and broken. Perl in RHEL4 and RHEL5 is notoriously bad (long discussions on this on a few other lists I lurk on). The rationale for keeping it bad is compatibility. Which curiously leads to many developers building their own base tools trees.

We do that to an extent, mainly so that machines running different OS's are running a consistent perl environment, for example. But we don't do it because of breakages in the upstream distro. If distros are that broken, we tend to not use them at all. We abandoned pretty much all Red Hat flavours years ago for that reason. For years, large parts of Red Hat were not 64-bit file aware, which was massively infuriating, and as you say, the kernel is in a world of its own (which of course leads to all sorts of fun problems with ISV software, which only supports Red Hat, and then doesn't work on other distros because it's been ported specifically to the Red Hat Broken View of the World)

You can only trust the distro supplied tools so far. Apache2 has greatly improved in RHEL, and Debian/Ubuntu as compared to Apache in RHEL. PHP is ancient, as are MySQL, PostgreSQL, etc.

That's always going to happen with any distribution. Ubuntu is, thanks to its 6-month release cycle, usually rather more current than Debian. But it's a trivial matter, usually, if you want something more up to date, to grab the source package from the distro's development tree, and build it on the current stable release. Indeed, there are public repositories (such as etch-backports) where communities are doing just that. But it's easy to do yourself if you want finer control.

However, for things like mysql, we tend to do as you describe, and install the versions directly obtained from upstream.

The issue is that any cluster distribution based upon any base distribution inherits all of the underlying issues of the base. And some of those issues are really pretty annoying. In some cases, they are broken.

I can't think of any real show-stoppers in the five or so years we've been running Debian. The closest we came to a major snafu there was when Debian made their cock-up with SSH key security. But that was easy enough to put right, and fortunately we hadn't migrated to Etch wholesale when it came to light, so we weren't badly affected.

This is why we tend to prefer underlying-OS insensitive systems. As long as the underlying OS works, we don't care what it is. When it doesn't work, this is when we care, and have to figure out if the cost of making it work is worth the effort. The cost is time in this case.

I agree wholeheartedly with that - time is the most important cost. I also try not to care too much what the underlying OS is, but I also want to minimise the amount of software stack maintenance I have to do, so I tend to ask myself the following questions of the piece of software I'm considering:

1)  Does it need to be installed on every machine?
2)  Is the precise version present on the machine important?
3)  Is the software being rapidly developed, and consequently likely to be out of date in distros?
4)  Do I have an ISV support matrix to consider?

If the answer is yes to questions 1 and 4, or no to questions 2 and 3, then I tend to lean towards using the distro's packaging. If the answers are the opposite to those, I will tend to use a copy I build and maintain myself, preferably on a central NFS server so I don't have to synchronise it everywhere. There's no hard-and-fast answer to which approach is always best; it's very dependent on the situation.
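To make the four questions concrete, here is a rough sketch of that rule of thumb in Python. The scoring is an assumption made for illustration: each answer counts as one vote, the majority decides, and an even split is left as a judgment call. The names `Answers` and `lean`, and the example answers at the bottom, are hypothetical and not taken from any real tool.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    needed_everywhere: bool   # question 1: must it be on every machine?
    version_critical: bool    # question 2: does the precise version matter?
    fast_moving: bool         # question 3: rapidly developed upstream?
    isv_support_matrix: bool  # question 4: is there an ISV support matrix?


def lean(a: Answers) -> str:
    """Return which way the four answers lean (one vote per question)."""
    # "Yes" to 1 and 4, or "no" to 2 and 3, each count towards the distro package.
    distro_votes = sum([
        a.needed_everywhere,
        a.isv_support_matrix,
        not a.version_critical,
        not a.fast_moving,
    ])
    if distro_votes > 2:
        return "distro package"
    if distro_votes < 2:
        return "self-built copy"
    # Assumption: an even split has no clear lean either way.
    return "judgment call"


# Hypothetical answers for a fast-moving, version-sensitive package.
print(lean(Answers(needed_everywhere=False, version_critical=True,
                   fast_moving=True, isv_support_matrix=False)))
```

Treating every question as an equal vote is the simplest possible weighting; in practice an ISV support matrix may well outweigh the other three.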

Tim


Beowulf mailing list, [email protected]