Hello Frank,
Here at the CLS we have 8-CPU AMD quad-core 2.9 GHz machines (i.e. 32 cores per 
node). There are newer models capable of 48 cores. Each node has four Gb 
network ports bonded through EtherChannel.  Our file server is a Sun 7310 
storage server, also with four bonded Gb network ports. We use NFSv4.

We also use XDS extensively and I have spent some time trying to optimize the 
system. In addition to the comments on the XDS wiki and in Kay's reply, I have 
made a few observations:

- I determine the optimum number of CPUs and number of jobs to provide in 
XDS.INP using the following Python snippet:

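import math

# Frames in one DELPHI-sized batch (one core per frame)
min_cpus = int(round(DELPHI/delta))
# Frames each core must handle if all cores share the data set evenly
stride = int(math.ceil(num_frames/float(total_cores)))
jobs = (num_frames//min_cpus)//stride
num_cpus = 1 + (num_frames//jobs)//stride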

For example, if you have a cluster with 96 cores and you want to process 360 
frames with a delta of 1 deg (and the default DELPHI of 5 deg), this gives you
MAXIMUM_NUMBER_OF_PROCESSORS=6
MAXIMUM_NUMBER_OF_JOBS=18

So you get 18 jobs, each with 4 batches. Each batch will use 5 cores, each of 
which processes one frame of the batch. This maximises CPU usage without 
overcommitting the cores, and each core will process 4 images before the 
integration job is complete.
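
The arithmetic behind these numbers can be checked with a few lines of Python. 
This is only a sketch of the bookkeeping, assuming all 18 jobs run at the same 
time and each uses 5 cores; the variable names are mine and are not XDS 
keywords.

```python
# Values taken from the example above.
num_frames = 360
total_cores = 96
jobs = 18
frames_per_batch = 5  # assumed: one core per frame within a batch

frames_per_job = num_frames // jobs                   # frames handled by one job
batches_per_job = frames_per_job // frames_per_batch  # batches each job works through
cores_in_use = jobs * frames_per_batch                # cores busy if all jobs run at once
idle_cores = total_cores - cores_in_use               # cores left uncommitted

print(frames_per_job, batches_per_job, cores_in_use, idle_cores)
```

Under these assumptions 90 of the 96 cores are busy, so the cluster is nearly 
full but not overcommitted.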

I find that raising MAXIMUM_NUMBER_OF_PROCESSORS far above the number of cores 
actually used in each batch is counterproductive. I simply add 1 because 
sometimes a few batches will get an extra frame if the number of frames is not 
an integral multiple of the number of jobs.
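
For convenience, the snippet above can be wrapped in a small function. This is 
a minimal sketch rather than a tested recipe: the function name, the delphi 
default of 5 deg, and the max(1, ...) guards against division by zero for very 
small data sets are my additions, not part of the original snippet.

```python
import math


def xds_parallel_settings(num_frames, delta, total_cores, delphi=5.0):
    """Return (num_cpus, jobs) computed as in the snippet above.

    delphi=5.0 (degrees) is an assumed default; pass your own value.
    The max(1, ...) guards are additions to avoid dividing by zero.
    """
    # Frames in one DELPHI-sized batch (one core per frame).
    min_cpus = max(1, int(round(delphi / delta)))
    # Frames each core must handle if all cores share the data set evenly.
    stride = int(math.ceil(num_frames / float(total_cores)))
    jobs = max(1, (num_frames // min_cpus) // stride)
    # The +1 leaves room for batches that receive an extra frame.
    num_cpus = 1 + (num_frames // jobs) // stride
    return num_cpus, jobs


num_cpus, jobs = xds_parallel_settings(num_frames=360, delta=1.0, total_cores=96)
print("MAXIMUM_NUMBER_OF_PROCESSORS=%d" % num_cpus)
print("MAXIMUM_NUMBER_OF_JOBS=%d" % jobs)
```

For the 96-core, 360-frame example this prints the same two values quoted 
above (6 and 18).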

I hope this helps.


Michel Fodje,
Canadian Macromolecular Crystallography Facility,
Canadian Light Source


-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Frank 
Murphy
Sent: April-19-11 6:06 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Cluster Design

Dear All,

Here at NE-CAT, we make extensive use of XDS in a parallel environment. We are 
looking to purchase some new hardware, so I am soliciting your opinions.

Our current cluster is made up of 16 nodes, each with 2 processors that have 
four cores, running at 2.2 GHz (I believe). We run with hyperthreading on, so 8 
physical and 16 virtual cores per node.

Our benchmarking with XDS (see 
https://rapd.nec.aps.anl.gov/wiki/RAPD_NecatStats for an example) shows a 
diminishing return on increasing the MAXIMUM_NUMBER_OF_PROCESSORS beyond the 
number of physical cores, and we are wondering if this is due to the test, the 
processor, the RAM, or XDS. In short, will going to 2 six core processors speed 
up processing using up to 12 for  MAXIMUM_NUMBER_OF_PROCESSORS?

Please do not feel the need to constrain the discussion to XDS, as we use our 
cluster for pretty much all the common crystallographic tasks.

Thanks in advance,

Frank Murphy
Beamline Scientist, NE-CAT
