Oh, but memory is a problem – though perhaps not if you have just a small cluster of
machines!
Our set-up is much larger than that – and nearly all of it runs virtual machines
(only a small proportion of them web related), so memory would rapidly become a
limiting factor in our data centre. We run VMware (995 hosts) and OpenStack (10,000s
of hosts), plus a selection of large-memory machines (measured in TBs of memory per machine).
We would be looking at somewhere between 0.5 PB and 1 PB of memory – and it is not
just the price of buying that amount of memory (for many machines we need the
fastest memory money can buy for the workload). We would also need a lot more
CPUs than we currently have, as we would need far more machines to provide 64GB
virtual machines (we would only get 2 VMs per host). We currently have
approximately 1,000-2,000 CPUs in our hardware (last time I had a figure) – it would
probably need to go to approximately 5,000-10,000!
It is not just the initial outlay but the environmental and financial cost of
running that number of machines, and finding space to house them without putting
the cooling costs through the roof! That is without considering the additional
constraints the extra machines may put on storage (at the last count, a year ago,
we had over 30 PB of storage on site – and a large amount of offsite backup).
We would also stretch the amount of power we can get from the national grid to
power it all – we currently have 3 feeds from different parts of the national
grid (we are fortunately in a position where this is possible), and the dedicated
link we would need to add more power would be at least 50 miles long!
So – managing cores/memory is vitally important to us. Moving to the cloud is
an option we are looking at, but that is more than 4 times the price of our
onsite set-up (even with substantial discounts from AWS) and would require an
upgrade of our existing link to the internet – which is currently 40 Gbit
(I think).
Currently we are analysing very large amounts of data directly linked to the
current major world problem – this is why the UK is currently being isolated: we
have discovered, and can track, a new strain in near real time, which other
countries have no ability to do. In a day we can and do handle, sequence and
analyse more samples than the whole of France has sequenced since February.
We probably don't have more of the new variant strain than other areas of
the world – it is just that we know we have it because of the amount of sequencing
and analysis that we in the UK have done.
From: Matthias Peng <[email protected]>
Sent: 23 December 2020 12:02
To: mod_perl list <[email protected]>
Subject: Re: Confused about two development utils [EXT]
Today memory is not a serious problem – each of our servers has 64GB of memory.
Forgot to add – our FCGI servers need a lot (and I mean a lot) more memory
than the mod_perl servers to serve the same level of content (just in case
memory blows up with the FCGI backends).
-----Original Message-----
From: James Smith <[email protected]>
Sent: 23 December 2020 11:34
To: André Warnier (tomcat/perl) <[email protected]>; [email protected]
Subject: RE: Confused about two development utils [EXT]
> This costs memory, and all the more since many perl modules are not
> thread-safe, so if you use them in your code, at this moment the only safe
> way to do it is to use the Apache httpd prefork model. This means that each
> Apache httpd child process has its own copy of the perl interpreter, which
> means that the memory used by this embedded perl interpreter has to be
> counted n times (as many times as there are Apache httpd child processes
> running at any one time).
This isn't quite true - if you load modules before the process forks, then the
child processes can share the same parts of memory (copy-on-write, on Linux at
least). It is useful to be able to "pre-load" the core functionality that is used
across all of the handlers. It also speeds up child-process creation, as the
modules are already in memory and compiled to bytecode.
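For illustration, a minimal startup.pl sketch of the idea (the application module
and the path are just placeholders, not our actual code) – everything loaded here
is compiled once in the parent httpd and shared copy-on-write by the children:

    # startup.pl – pulled in once by the parent httpd, e.g. via
    #   PerlRequire /etc/httpd/conf/startup.pl
    use strict;
    use warnings;

    # Pre-load the heavy, commonly used modules so that every forked
    # child shares these compiled pages instead of building its own copy.
    use DBI ();
    use JSON ();
    use My::App::Core ();   # placeholder for the shared application code

    1;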
One of the great advantages of mod_perl is Apache2::SizeLimit, which can blow
away large child processes - and then, if needed, new ones are created. That is not
the case with some of the FCGI solutions: the individual processes can grow if
there is a memory leak or a request that retrieves a large amount of content
(even if it is not served), and Perl can't give the memory back. So FCGI processes
only get bigger and bigger and eventually blow up memory (or hit swap first).
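To give a feel for it, a typical Apache2::SizeLimit set-up looks something like
the following (the thresholds are illustrative numbers only, and sizes are in KB –
tune them to your own processes):

    <Perl>
      use Apache2::SizeLimit;
      # All sizes are in KB; the figures below are illustrative only.
      Apache2::SizeLimit->set_max_process_size(300_000);   # kill a child whose total size passes ~300MB
      Apache2::SizeLimit->set_min_shared_size(20_000);     # ...or whose shared memory falls below ~20MB
      Apache2::SizeLimit->set_max_unshared_size(250_000);  # ...or whose unshared memory passes ~250MB
    </Perl>

    # Check the size at the end of each request; httpd then spawns a
    # fresh child to replace any that has grown too large.
    PerlCleanupHandler Apache2::SizeLimit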
--
The Wellcome Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.