Re: [gmx-users] Re: Question about parallazing Gromacs (Qiao Baofu)

2006-09-14 Thread Qiao Baofu
Hi,

Thanks. I have tested different numbers of CPUs. Our institute has two clusters: on one, each node has 4 CPUs (A); on the other, each node has only 1 CPU (B). I ran different tests on the two clusters and on my local computer using the same system. See the following results:

A (for 1 hour)
# of cpus ; MD steps
 4   finished (20steps for 26:21)
 8   finished (20steps for 40:57)
12   87950
20   42749
44   5962 !

B (for 1 hour)
# of cpus ; MD steps
 1   156991 for 56:12
 2   179820
 3   200,000 for 54:20
 4   200,000 for 51:12

c. Local (single cpu), 20 steps for 1h52:38

One can see that
1. On cluster A, one node (4 CPUs) is about 4 times as fast as my local computer.
2. Using more than one node decreases the performance of GROMACS.
3. On cluster B, the more CPUs used, the faster GROMACS runs, but the difference in speed is small.
4. Cluster B with 4 CPUs is half as fast as cluster A with 1 node (4 CPUs).

I wonder if anyone can tell where the bottleneck is: the hardware of the cluster, or GROMACS?

2006/9/14, Mark Abraham [EMAIL PROTECTED]:
 You have to find your optimum making some tests with your settings. To do that you can start your simulation and interrupt after a while to have some data logged in the log file. Then, from the information in that log file you can estimate the time that the whole task will take and compare using more or less number of processors until you find your optimum value.

 Of course, that while should be at least of the order of several minutes. There is a set-up cost borne once at the start of the calculation which is not proportional to the length of the calculation, so you need to run long enough to get out of the time period during which it dominates the linear component.

 Mark

-- 
Sincerely yours,
Baofu Qiao, PhD
Frankfurt Institute for Advanced Studies
___
gmx-users mailing listgmx-users@gromacs.org
http://www.gromacs.org/mailman/listinfo/gmx-users
Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to [EMAIL PROTECTED]
Can't post? Read http://www.gromacs.org/mailing_lists/users.php

Re: [gmx-users] Re: Question about parallazing Gromacs (Qiao Baofu)

2006-09-14 Thread Florian Haberl
On Thursday 14 September 2006 09:53, Mark Abraham wrote:
 Qiao Baofu wrote:
  Hi,
 
  Thanks. I have tested different numbers of CPUs. Our institute has two
  clusters: on one, each node has 4 CPUs (A); on the other, each node has
  only 1 CPU (B). I ran different tests on the two clusters and on my local
  computer using the same system. See the following results:
 
  A (for 1 hour)
  # of cpus ; MD steps
   4   finished (20steps for 26:21)
   8   finished (20steps for 40:57)
  12   87950
  20   42749
  44   5962 !

  B (for 1 hour)
  # of cpus ; MD steps
   1   156991 for 56:12
   2   179820
   3   200,000 for 54:20
   4   200,000 for 51:12

  c. Local (single cpu), 20 steps for 1h52:38
 
  One can see that
  1. On cluster A, one node (4 CPUs) is about 4 times as fast as my local
  computer.
  2. Using more than one node decreases the performance of GROMACS.
  3. On cluster B, the more CPUs used, the faster GROMACS runs, but the
  difference in speed is small.
  4. Cluster B with 4 CPUs is half as fast as cluster A with 1 node
  (4 CPUs).
 
  I wonder if anyone can tell where the bottleneck is: the hardware of the
  cluster, or GROMACS?

 Probably your interconnects between nodes are using carrier pigeons or
 something :-) I expect that 1 cpu on machine A will require around four
 times as long as 1 4-cpu node, which you can presumably test for yourself.

 For next time, if you want to compare hardware like this, either use the
 same length of time or the same number of MD steps for all of your runs.
 Also when reporting runtimes, make it clear whether you are reporting
 walltime or some time * number_of_cpus, etc. :-)
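
One way to follow this advice is to convert every run to the same two figures, steps per wall-clock hour and steps per CPU-hour. The sketch below does that for the cluster B numbers quoted above. It is only an illustration: the function names are made up here, and the 2-CPU run, for which no time was reported, is assumed to have used the full 1-hour walltime limit.

```python
# Cluster B runs as quoted in this thread: (cpus, md_steps, wall_seconds).
# ASSUMPTION: the 2-cpu run reported no time, so the 1 h walltime limit is used.
runs_b = [
    (1, 156991, 56 * 60 + 12),
    (2, 179820, 3600),
    (3, 200000, 54 * 60 + 20),
    (4, 200000, 51 * 60 + 12),
]


def steps_per_wall_hour(steps, wall_seconds):
    """MD steps completed per hour of wall-clock time."""
    return steps * 3600.0 / wall_seconds


def steps_per_cpu_hour(cpus, steps, wall_seconds):
    """MD steps completed per hour of CPU time (wall time * number of cpus)."""
    return steps_per_wall_hour(steps, wall_seconds) / cpus


for cpus, steps, secs in runs_b:
    wall = steps_per_wall_hour(steps, secs)
    per_cpu = steps_per_cpu_hour(cpus, steps, secs)
    print(f"{cpus} cpu(s): {wall:10.0f} steps/wall-hour, {per_cpu:10.0f} steps/cpu-hour")
```

If steps per CPU-hour drops sharply as CPUs are added, most of the extra processors' time is going to communication or waiting rather than to MD steps.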

Search the mailing list; I have posted benchmark results for different
systems with the standard benchmark suite several times.

http://www.gromacs.org/pipermail/gmx-developers/2006-January/001473.html



 Mark

Greetings,

Florian 

-- 
---
 Florian Haberl
 Computer-Chemie-Centrum   
 Universitaet Erlangen/ Nuernberg
 Naegelsbachstr 25
 D-91052 Erlangen
 Telephone: +49(0) − 9131 − 85 26581
 Mailto: florian.haberl AT chemie.uni-erlangen.de
---


Re: [gmx-users] Re: Question about parallazing Gromacs (Qiao Baofu)

2006-09-14 Thread Qiao Baofu
2006/9/14, Mark Abraham [EMAIL PROTECTED]:
 Probably your interconnects between nodes are using carrier pigeons or something :-) I expect that 1 cpu on machine A will require around four times as long as 1 4-cpu node, which you can presumably test for yourself.

It is forbidden to run on only one CPU on cluster A at my institute.

 For next time, if you want to compare hardware like this, either use the same length of time or the same number of MD steps for all of your runs. Also when reporting runtimes, make it clear whether you are reporting walltime or some time * number_of_cpus, etc. :-)

For all the jobs (except the one on my local computer), I set walltime=1hour, nsteps=200,000, and dt=0.001. The running times are taken from the end of the .log file. See the following example.

               NODE (s)   Real (s)      (%)
       Time:   1581.000   1581.000    100.0
                       26:21
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     56.376      4.515     10.930      2.196

 Mark

-- 
Sincerely yours,
Baofu Qiao, PhD
Frankfurt Institute for Advanced Studies
Max-von-Laue-Str. 1
60438 Frankfurt am Main, Germany
TEL: +49-69-7984-7529
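
As a cross-check of the log excerpt above, the reported hour/ns figure can be turned back into a wall time for the stated nsteps and dt. This is a minimal sketch, assuming dt is given in picoseconds (the unit GROMACS .mdp files use); the helper names are made up for illustration.

```python
def wall_time_seconds(nsteps, dt_ps, hours_per_ns):
    """Wall time implied by a reported hour/ns performance figure."""
    simulated_ns = nsteps * dt_ps / 1000.0   # ps -> ns
    return simulated_ns * hours_per_ns * 3600.0


def as_min_sec(seconds):
    """Format a duration as M:SS, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


# Values taken from the message above: nsteps=200,000, dt=0.001, 2.196 hour/ns.
t = wall_time_seconds(200_000, 0.001, 2.196)
print(f"{t:.2f} s = {as_min_sec(t)}")
```

The result agrees with the 1581 s (26:21) in the Time line, which suggests that this log excerpt comes from a run that completed all 200,000 steps.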

[gmx-users] Re: Question about parallazing Gromacs (Qiao Baofu)

2006-09-13 Thread Cesar Araujo
Well, there is a common misconception about parallel computing. Usually
there is an optimum number of processors that gives the best performance.
Using more or fewer than that number results in decreased performance and
longer computation times. The optimum number of processors depends on the
particular problem and the hardware/software configuration of your
cluster. For instance, in my case, for docking experiments, I've found
that 4 CPUs is OK. If I try to use more than 4 CPUs the performance is
worse. The same goes for fewer than 4 CPUs. You have to find your optimum
by making some tests with your settings. To do that you can start your
simulation and interrupt it after a while to have some data logged in the
log file. Then, from the information in that log file, you can estimate
the time that the whole task will take, and compare using more or fewer
processors until you find your optimum value.
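
The estimate described above is a linear extrapolation from a partial run. A small sketch, using the one-hour step counts from the original question quoted below; the total of 200,000 steps is taken from a later message in the thread, and the function name is made up for illustration. The extrapolation ignores any start-up cost, so the partial runs should not be too short.

```python
def projected_hours(steps_done, hours_elapsed, total_steps):
    """Linearly extrapolate the wall time needed for total_steps."""
    return total_steps * hours_elapsed / steps_done


# cpus -> MD steps completed in 1 hour, as reported in the original question.
partial_runs = {12: 87950, 20: 42749, 44: 5962}
total_steps = 200_000  # nsteps, as stated later in the thread

estimates = {
    cpus: projected_hours(steps, 1.0, total_steps)
    for cpus, steps in partial_runs.items()
}
best = min(estimates, key=estimates.get)

for cpus, hours in sorted(estimates.items()):
    print(f"{cpus:3d} cpus: about {hours:6.2f} h for {total_steps} steps")
print(f"fastest of the tested settings: {best} cpus")
```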


I hope it helps.

Regards,
César.-


--

Message: 1
Date: Wed, 13 Sep 2006 12:42:37 +0200
From: Qiao Baofu [EMAIL PROTECTED]
Subject: [gmx-users] Question about parallazing Gromacs
To: gmx-users@gromacs.org
Message-ID:
[EMAIL PROTECTED]
Content-Type: text/plain; charset=iso-8859-1

Hi all,

I have a question about parallelizing GROMACS: I run the same system on a
cluster of my institute and on my local computer.
Cluster: dual-processor boards, AMD Opteron 270 (Dual-Core), 2.0 GHz
Local computer: AMD X86-64 CPU, double precision

1. The cluster (nodes=3:ppn=4) runs 87950 MD steps in one hour
2. The cluster (nodes=5:ppn=4) runs 42749 MD steps in one hour
3. The cluster (nodes=11:ppn=4) runs 5962 MD steps in one hour
4. My local computer runs 179090 MD steps in 1 hour 51 minutes.

It is very strange that the more CPUs I use, the slower GROMACS runs!

Who knows what's wrong with my job? And for parallel GROMACS, how many
CPUs are preferred?



The grompp command is:   grompp -np 12 -o md3.mdp -c md3in.gro -p MCl.top -o md3.tpr

The following is one of the the job scripts on the cluster:

#
# MD NTP(BerendsenBerendsen, T=425P=1bar),200ps tau_p=4
#
#
#!/bin/bash
#PBS -N md3
#
#PBS -l walltime=01:00:00,nodes=3:ppn=4
#
#PBS -m abe
#
#PBS -o md3.out
#
#PBS -e md3.err
#
#
cd /work/fias/qiao/time_checking/nodes3/
/usr/local/Cluster-Apps/lam/gcc/64/7.1.1/bin/lamboot $PBS_NODEFILE
/usr/local/Cluster-Apps/lam/gcc/64/7.1.1/bin/mpirun -np 12 mdrun -v -s md3.tpr -x md3 -e md3 -c md3 -g md3
exit 0


--
Sincerely yours,
**
Baofu Qiao, PhD
Frankfurt Institute for Advanced Studies
**


Re: [gmx-users] Re: Question about parallazing Gromacs (Qiao Baofu)

2006-09-13 Thread Mark Abraham
 You have to
 find your optimum making some tests with your settings. To do that you can
 start your simulation and interrupt after a while to have some data logged
 in the log file. Then, from the information in that log file you can
 estimate the time that the whole task will take and compare using more or
 less number of processors until you find your optimum value.

Of course, that while should be at least of the order of several
minutes. There is a set-up cost borne once at the start of the calculation
which is not proportional to the length of the calculation, so you need to
run long enough to get out of the time period during which it dominates
the linear component.
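
To illustrate the point about the set-up cost: if wall time is modelled as t = setup + steps * per_step, two runs of different length are enough to separate the two terms. The sketch below uses hypothetical timings (not measurements from this thread) and a made-up function name.

```python
def fit_setup_and_rate(run_a, run_b):
    """Solve t = setup + steps * per_step from two (steps, wall_seconds) runs."""
    (n1, t1), (n2, t2) = run_a, run_b
    if n1 == n2:
        raise ValueError("the two runs must have different step counts")
    per_step = (t2 - t1) / (n2 - n1)
    setup = t1 - n1 * per_step
    return setup, per_step


# HYPOTHETICAL timings, chosen only to illustrate the effect.
short_run = (1_000, 40.0)    # 1,000 steps took 40 s
long_run = (10_000, 130.0)   # 10,000 steps took 130 s

setup, per_step = fit_setup_and_rate(short_run, long_run)
naive_per_step = short_run[1] / short_run[0]  # ignores the set-up cost

print(f"set-up cost: {setup:.1f} s, cost per step: {per_step:.4f} s")
print(f"naive estimate from the short run alone: {naive_per_step:.4f} s per step")
```

With these made-up numbers the short run alone overestimates the per-step cost by a factor of four, which is why the test run has to be long enough for the linear term to dominate.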

Mark
