Re: [gmx-users] Replica Exchange MD on more than 64 processors

2010-02-03 Thread Sebastian Breuers

Hey,

thanks a lot for the quick answers.

Installing mvapich 1.2 and compiling and linking mdrun 
against its libraries seems to do the trick.
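
For anyone hitting the same problem: the rebuild itself was nothing special. 
Roughly the following, with the install prefix and compiler wrapper below being 
purely illustrative (adjust to wherever MVAPICH 1.2 lives on your cluster):

   # put the MVAPICH 1.2 wrappers first in PATH, then rebuild mdrun against them
   export PATH=/opt/mvapich-1.2/bin:$PATH
   export CC=mpicc
   ./configure --enable-threads --enable-mpi --with-fft=mkl --enable-double
   make mdrun && make install-mdrun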


Kind regards

Sebastian



Mark Abraham wrote:

- Original Message -
From: Berk Hess g...@hotmail.com
Date: Wednesday, February 3, 2010 5:13
Subject: RE: [gmx-users] Replica Exchange MD on more than 64 processors
To: Discussion list for GROMACS users gmx-users@gromacs.org





Hi,
  

One issue could be MPI memory usage.
I have noticed that many MPI implementations use an amount of memory
per process that is quadratic (!) in the number of processes involved.
This can quickly get out of hand. But 28 GB is a lot of memory.



The OP was using MVAPICH 1.1, which is not the most current version. MVAPICH 
1.2 claims to scale with near-constant memory usage. I suggest an upgrade.

Mark

  

One thing that might help slightly is to not use double precision,
which is almost never required. This will also make your simulations
a factor of 1.4 faster.

Berk




[gmx-users] Replica Exchange MD on more than 64 processors

2010-02-02 Thread Sebastian Breuers

Dear list

I recently ran into a problem with a replica exchange simulation. The 
simulation is run with gromacs-mpi version 4.0.7, compiled with the 
following flags: --enable-threads --enable-mpi --with-fft=mkl --enable-double,

intel compiler version 11.0
mvapich version 1.1.0
mkl version 10.1

The program works fine in this cluster environment, which consists of 32 nodes 
with 8 processors and 32 GB of memory each. I've already run several 
simulations using the MPI feature.

It seems that I am stuck on a problem similar to one already reported on this 
list by bharat v. adkar in December 2009, without an eventual solution:

http://www.mail-archive.com/gmx-users@gromacs.org/msg27175.html

I am doing a replica exchange simulation on a simulation box with 5000 
molecules (81 atoms each) and
4 different temperatures. The simulation runs nicely with 64 processors (8 
nodes) but stops with an
error message on 128 processors (16 nodes).
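
For completeness, the job is launched in the usual -multi fashion; the 
exchange interval and tpr naming below are just placeholders following the 
pattern from the December 2009 thread linked above:

command line: mpiexec -np 128 mdrun -multi 4 -replex 1000 -s chk_.tpr -v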

Taking the following four points into account:

1. every cluster node has at least 28 GB of memory available for use
2. the system I am working with should only use
   5000*81*900 B = 347.614 MB (according to the FAQ; multiplied out below)
3. even if all four replicas run on the same node, the memory usage
   should be less than 2 GB
4. the simulation works fine with 64 processors

it seems to me that the following error

---
Program mdrun, VERSION 4.0.7
Source code file: smalloc.c, line: 179

Fatal error:
Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr, 
nlist->jjnr=0xae70b7b0
(called from file ns.c, line 503)
---

must be caused by something other than missing memory.
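
A quick back-of-the-envelope check of point 2, simply multiplying the same 
numbers out: 5000 molecules x 81 atoms x 900 B per atom = 364,500,000 B, i.e. 
about 347.6 MB per replica (with 1 MB = 1024*1024 B). Even all four replicas 
on one node would therefore stay below roughly 1.4 GB, about a factor of 20 
under the 28 GB that is available.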

I am wondering if there is anyone else who is still facing the same problem or 
has already found a
solution for this issue.

Kind regards

Sebastian

--
_

Sebastian Breuers                    Tel: +49-221-470-4108
EMail: breue...@uni-koeln.de

Universität zu Köln                  University of Cologne
Department für Chemie                Department Chemistry
Organische Chemie                    university of Cologne

Greinstr. 4                          Greinstr. 4
D-50939 Köln                         D-50939 Cologne, Federal Rep. of Germany
_



RE: [gmx-users] Replica Exchange MD on more than 64 processors

2010-02-02 Thread Berk Hess

Hi,

One issue could be MPI memory usage.
I have noticed that many MPI implementations use an amount of memory
per process that is quadratic (!) in the number of processes involved.
This can quickly get out of hand. But 28 GB is a lot of memory.
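
To make that scaling concrete (the buffer size here is purely illustrative): 
if an MPI implementation pre-allocates a fixed buffer for every peer rank, 
each of N processes holds N such buffers, so the job as a whole holds N*N of 
them. At, say, 256 kB per peer that is about 1 GB in total for 64 ranks, but 
already about 4 GB for 128 ranks, and the per-node share grows the same way.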

One thing that might help slightly is to not use double precision,
which is almost never required. This will also make your simulations
a factor of 1.4 faster.
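
In configure terms that just means leaving out the double-precision switch 
when rebuilding; the other flags below simply mirror the ones from the 
original post and may need adjusting for your setup:

   # illustrative single-precision MPI build
   ./configure --enable-threads --enable-mpi --with-fft=mkl
   make mdrun && make install-mdrun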

Berk




Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-28 Thread bharat v. adkar

On Mon, 28 Dec 2009, David van der Spoel wrote:

 bharat v. adkar wrote:

  On Mon, 28 Dec 2009, Mark Abraham wrote:

   bharat v. adkar wrote:
    On Sun, 27 Dec 2009, Mark Abraham wrote:
     bharat v. adkar wrote:
      On Sun, 27 Dec 2009, Mark Abraham wrote:
       bharat v. adkar wrote:
        Dear all,
        I am trying to perform replica exchange MD (REMD) on a 'protein in
        water' system. I am following the instructions given on the wiki
        (How-Tos -> REMD). I have to perform the REMD simulation with 35
        different temperatures. As per the advice on the wiki, I
        equilibrated the system at the respective temperatures (a total of
        35 equilibration simulations). After this I generated chk_0.tpr,
        chk_1.tpr, ..., chk_34.tpr files from the equilibrated structures.

        Now when I submit the final job for REMD with the following
        command line, it gives some error:

        command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v

        error msg:
        ---
        Program mdrun_mpi, VERSION 4.0.7
        Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179

        Fatal error:
        Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
        nlist->jjnr=0x9a400030
        (called from file ../../../SRC/src/mdlib/ns.c, line 503)
        ---
        Thanx for Using GROMACS - Have a Nice Day
        : Cannot allocate memory
        Error on node 19, will try to stop all the nodes
        Halting parallel program mdrun_mpi on CPU 19 out of 70
        ***

        The individual node on the cluster has 8GB of physical memory and
        16GB of swap memory. Moreover, when logged onto the individual
        nodes, it shows more than 1GB of free memory, so there should be
        no problem with cluster memory. Also, the equilibration jobs for
        the same system are run on the same cluster without any problem.

        What I have observed by submitting different test jobs with
        varying numbers of processors (and no. of replicas, wherever
        necessary) is that any job with a total number of processors <= 64
        runs faithfully without any problem. As soon as the total number
        of processors is more than 64, it gives the above error. I have
        tested this with 65 processors/65 replicas also.

       This sounds like you might be running on fewer physical CPUs than
       you have available. If so, running multiple MPI processes per
       physical CPU can lead to memory shortage conditions.

      I don't understand what you mean. Do you mean, there might be more
      than 8 processes running per node (each node has 8 processors)? But
      that also does not seem to be the case, as SGE (sun grid engine)
      output shows only eight processes per node.

     65 processes can't have 8 processes per node.

    why can't it have? as i said, there are 8 processors per node. what i
    have not mentioned is how many nodes it is using. The jobs got
    distributed over 9 nodes, 8 of which correspond to 64 processors + 1
    processor from the 9th node.

   OK, that's a full description. Your symptoms are indicative of someone
   making an error somewhere. Since GROMACS works over more than 64
   processors elsewhere, the presumption is that you are doing something
   wrong or the machine is not set up in the way you think it is or
   should be. To get the most effective help, you need to be sure you're
   providing full information - else we can't tell which error you're
   making or (potentially) eliminate you as a source of error.

  Sorry for not being clear in statements.

   As far as I can tell, job distribution seems okay to me. It is 1 job
   per processor.

   Does non-REMD GROMACS run on more than 64 processors? Does your
   cluster support using more than 8 nodes in a run? Can you run an MPI
   "Hello world" application that prints the processor and node ID across
   more than 64 processors?

  Yes, the cluster supports runs with more than 8 nodes. I generated a
  system with a 10 nm water box and submitted it on 80 processors. It was
  running fine. It printed all 80 NODEIDs. Also showed me when the job
  will get over.

  bharat

   Mark

  I don't know what you mean by swap memory.
  Sorry, I meant cache memory..
  bharat
   Mark
   System: Protein + water + Na ions (total 46878 atoms)
   Gromacs version: tested with both v4.0.5 and v4.0.7
   compiled with:

Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-28 Thread Mark Abraham

bharat v. adkar wrote:

 On Mon, 28 Dec 2009, David van der Spoel wrote:

  bharat v. adkar wrote:

   On Mon, 28 Dec 2009, Mark Abraham wrote:

    bharat v. adkar wrote:
     On Sun, 27 Dec 2009, Mark Abraham wrote:
      ...

    Does non-REMD GROMACS run on more than 64 processors? Does your
    cluster support using more than 8 nodes in a run? Can you run an MPI
    "Hello world" application that prints the processor and node ID
    across more than 64 processors?

   Yes, the cluster supports runs with more than 8 nodes. I generated a
   system with a 10 nm water box and submitted it on 80 processors. It
   was running fine. It printed all 80 NODEIDs. Also showed me when the
   job will get over.

   bharat

    Mark
    ...
    Gromacs version: tested with both v4.0.5 and v4.0.7

Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-27 Thread bharat v. adkar

On Sun, 27 Dec 2009, Mark Abraham wrote:


bharat v. adkar wrote:


 Dear all,
   I am trying to perform replica exchange MD (REMD) on a 'protein in
 water' system. I am following instructions given on wiki (How-Tos -
 REMD). I have to perform the REMD simulation with 35 different
 temperatures. As per advise on wiki, I equilibrated the system at
 respective temperatures (total of 35 equilibration simulations). After
 this I generated chk_0.tpr, chk_1.tpr, ..., chk_34.tpr files from the
 equilibrated structures.

 Now when I submit final job for REMD with following command-line, it gives
 some error:

 command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v

 error msg:
 ---
 Program mdrun_mpi, VERSION 4.0.7
 Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179

 Fatal error:
 Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
 nlist->jjnr=0x9a400030
 (called from file ../../../SRC/src/mdlib/ns.c, line 503)
 ---

 Thanx for Using GROMACS - Have a Nice Day
:  Cannot allocate memory
 Error on node 19, will try to stop all the nodes
 Halting parallel program mdrun_mpi on CPU 19 out of 70
 ***


 The individual node on the cluster has 8GB of physical memory and 16GB of
 swap memory. Moreover, when logged onto the individual nodes, it shows
 more than 1GB of free memory, so there should be no problem with cluster
 memory. Also, the equilibration jobs for the same system are run on the
 same cluster without any problem.

 What I have observed by submitting different test jobs with varying number
 of processors (and no. of replicas, wherever necessary), that any job with
 total number of processors <= 64, runs faithfully without any problem. As
 soon as total number of processors are more than 64, it gives the above
 error. I have tested this with 65 processors/65 replicas also.


This sounds like you might be running on fewer physical CPUs than you have 
available. If so, running multiple MPI processes per physical CPU can lead to 
memory shortage conditions.


I don't understand what you mean. Do you mean, there might be more than 8 
processes running per node (each node has 8 processors)? But that also 
does not seem to be the case, as SGE (sun grid engine) output shows only 
eight processes per node.
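
One quick way to double-check that while the job is running (assuming the 
usual SGE $PE_HOSTFILE layout and that the binary is called mdrun_mpi) is to 
count the processes on each allocated host:

   # one line per host: number of mdrun_mpi processes found there
   for h in $(awk '{print $1}' $PE_HOSTFILE | sort -u); do
       printf '%s: ' $h; ssh $h 'ps -C mdrun_mpi --no-headers | wc -l'
   done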




I don't know what you mean by swap memory.


Sorry, I meant cache memory..

bharat



Mark


 System: Protein + water + Na ions (total 46878 atoms)
 Gromacs version: tested with both v4.0.5 and v4.0.7
 compiled with: --enable-float --with-fft=fftw3 --enable-mpi
 compiler: gcc_3.4.6 -O3
 machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux


 I tried searching the mailing-list without any luck. I am not sure, if i
 am doing anything wrong in giving commands. Please correct me if it is
 wrong.

 Kindly let me know the solution.


 bharat








Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-27 Thread Mark Abraham

bharat v. adkar wrote:

On Sun, 27 Dec 2009, Mark Abraham wrote:


bharat v. adkar wrote:


 Dear all,
   I am trying to perform replica exchange MD (REMD) on a 'protein in
 water' system. I am following instructions given on wiki (How-Tos -
 REMD). I have to perform the REMD simulation with 35 different
 temperatures. As per advise on wiki, I equilibrated the system at
 respective temperatures (total of 35 equilibration simulations). After
 this I generated chk_0.tpr, chk_1.tpr, ..., chk_34.tpr files from the
 equilibrated structures.

 Now when I submit final job for REMD with following command-line, it 
gives

 some error:

 command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s 
chk_.tpr -v


 error msg:
 ---
 Program mdrun_mpi, VERSION 4.0.7
 Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179

 Fatal error:
 Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
 nlist->jjnr=0x9a400030
 (called from file ../../../SRC/src/mdlib/ns.c, line 503)
 ---

 Thanx for Using GROMACS - Have a Nice Day
:  Cannot allocate memory
 Error on node 19, will try to stop all the nodes
 Halting parallel program mdrun_mpi on CPU 19 out of 70
 ***


 The individual node on the cluster has 8GB of physical memory and 
16GB of

 swap memory. Moreover, when logged onto the individual nodes, it shows
 more than 1GB of free memory, so there should be no problem with 
cluster

 memory. Also, the equilibration jobs for the same system are run on the
 same cluster without any problem.

 What I have observed by submitting different test jobs with varying 
number
 of processors (and no. of replicas, wherever necessary), that any 
job with
 total number of processors <= 64, runs faithfully without any 
problem. As

 soon as total number of processors are more than 64, it gives the above
 error. I have tested this with 65 processors/65 replicas also.


This sounds like you might be running on fewer physical CPUs than you 
have available. If so, running multiple MPI processes per physical CPU 
can lead to memory shortage conditions.


I don't understand what you mean. Do you mean, there might be more than 
8 processes running per node (each node has 8 processors)? But that also 
does not seem to be the case, as SGE (sun grid engine) output shows only 
eight processes per node.


65 processes can't have 8 processes per node.

Mark


I don't know what you mean by swap memory.


Sorry, I meant cache memory..

bharat



Mark


 System: Protein + water + Na ions (total 46878 atoms)
 Gromacs version: tested with both v4.0.5 and v4.0.7
 compiled with: --enable-float --with-fft=fftw3 --enable-mpi
 compiler: gcc_3.4.6 -O3
 machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux


 I tried searching the mailing-list without any luck. I am not sure, 
if i

 am doing anything wrong in giving commands. Please correct me if it is
 wrong.

 Kindly let me know the solution.


 bharat









Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-27 Thread bharat v. adkar

On Sun, 27 Dec 2009, Mark Abraham wrote:


bharat v. adkar wrote:

 On Sun, 27 Dec 2009, Mark Abraham wrote:

  bharat v. adkar wrote:
  
Dear all,

  I am trying to perform replica exchange MD (REMD) on a 'protein in
water' system. I am following instructions given on wiki (How-Tos -
REMD). I have to perform the REMD simulation with 35 different
temperatures. As per advise on wiki, I equilibrated the system at
respective temperatures (total of 35 equilibration simulations). 
After

this I generated chk_0.tpr, chk_1.tpr, ..., chk_34.tpr files from the
equilibrated structures.
  
   Now when I submit final job for REMD with following command-line, it 
   gives

some error:
  
   command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr 
   -v
  
error msg:

---
Program mdrun_mpi, VERSION 4.0.7
Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179
  
Fatal error:

Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
nlist->jjnr=0x9a400030
(called from file ../../../SRC/src/mdlib/ns.c, line 503)
---
  
Thanx for Using GROMACS - Have a Nice Day

  :   Cannot allocate memory
Error on node 19, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 19 out of 70
***
  
  
   The individual node on the cluster has 8GB of physical memory and 16GB 
   of
swap memory. Moreover, when logged onto the individual nodes, it 
shows
more than 1GB of free memory, so there should be no problem with 
   cluster
memory. Also, the equilibration jobs for the same system are run on 
the

same cluster without any problem.
  
   What I have observed by submitting different test jobs with varying 
   number
   of processors (and no. of replicas, wherever necessary), that any job 
   with
   total number of processors <= 64, runs faithfully without any problem. 
   As
soon as total number of processors are more than 64, it gives the 
above

error. I have tested this with 65 processors/65 replicas also.
 
  This sounds like you might be running on fewer physical CPUs than you 
  have available. If so, running multiple MPI processes per physical CPU 
  can lead to memory shortage conditions.


 I don't understand what you mean. Do you mean, there might be more than 8
 processes running per node (each node has 8 processors)? But that also
 does not seem to be the case, as SGE (sun grid engine) output shows only
 eight processes per node.


65 processes can't have 8 processes per node.
why can't it have? as i said, there are 8 processors per node. what i have 
not mentioned is how many nodes it is using. The jobs got distributed 
over 9 nodes, 8 of which correspond to 64 processors + 1 processor from 
the 9th node.
As far as I can tell, job distribution seems okay to me. It is 1 job per 
processor.


bharat



Mark


  I don't know what you mean by swap memory.

 Sorry, I meant cache memory..

 bharat

 
  Mark
 
System: Protein + water + Na ions (total 46878 atoms)

Gromacs version: tested with both v4.0.5 and v4.0.7
compiled with: --enable-float --with-fft=fftw3 --enable-mpi
compiler: gcc_3.4.6 -O3
machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux
  
  
   I tried searching the mailing-list without any luck. I am not sure, if 
   i
am doing anything wrong in giving commands. Please correct me if it 
is

wrong.
  
Kindly let me know the solution.
  
  
bharat
  
  









Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-27 Thread Mark Abraham

bharat v. adkar wrote:

On Sun, 27 Dec 2009, Mark Abraham wrote:


bharat v. adkar wrote:

 On Sun, 27 Dec 2009, Mark Abraham wrote:

  bharat v. adkar wrote:
   On Sun, 27 Dec 2009, Mark Abraham wrote:
    bharat v. adkar wrote:
     Dear all,
     I am trying to perform replica exchange MD (REMD) on a 'protein in
     water' system. I am following the instructions given on the wiki
     (How-Tos -> REMD). I have to perform the REMD simulation with 35
     different temperatures. ...

     command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v

     error msg:
     ---
     Program mdrun_mpi, VERSION 4.0.7
     Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179

     Fatal error:
     Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
     nlist->jjnr=0x9a400030
     (called from file ../../../SRC/src/mdlib/ns.c, line 503)
     ---

     The individual node on the cluster has 8GB of physical memory and
     16GB of swap memory. Moreover, when logged onto the individual
     nodes, it shows more than 1GB of free memory, so there should be no
     problem with cluster memory. Also, the equilibration jobs for the
     same system are run on the same cluster without any problem.

     What I have observed by submitting different test jobs with varying
     numbers of processors (and no. of replicas, wherever necessary) is
     that any job with a total number of processors <= 64 runs faithfully
     without any problem. As soon as the total number of processors is
     more than 64, it gives the above error. I have tested this with 65
     processors/65 replicas also.

    This sounds like you might be running on fewer physical CPUs than you
    have available. If so, running multiple MPI processes per physical
    CPU can lead to memory shortage conditions.

   I don't understand what you mean. Do you mean, there might be more
   than 8 processes running per node (each node has 8 processors)? But
   that also does not seem to be the case, as SGE (sun grid engine)
   output shows only eight processes per node.

  65 processes can't have 8 processes per node.

 why can't it have? as i said, there are 8 processors per node. what i
 have not mentioned is how many nodes it is using. The jobs got
 distributed over 9 nodes, 8 of which correspond to 64 processors + 1
 processor from the 9th node.


OK, that's a full description. Your symptoms are indicative of someone 
making an error somewhere. Since GROMACS works over more than 64 
processors elsewhere, the presumption is that you are doing something 
wrong or the machine is not set up in the way you think it is or should 
be. To get the most effective help, you need to be sure you're providing 
full information - else we can't tell which error you're making or 
(potentially) eliminate you as a source of error.


As far I can tell you, job distribution seems okay to me. It is 1 job 
per processor.


Does non-REMD GROMACS run on more than 64 processors? Does your cluster 
support using more than 8 nodes in a run? Can you run an MPI Hello 
world application that prints the processor and node ID across more 
than 64 processors?
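
Even without writing any MPI code, a crude stand-in for such a test (it only 
relies on mpiexec being willing to launch a plain executable) is

   mpiexec -np 70 hostname | sort | uniq -c

which should print each node name together with the number of processes 
started on it.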


Mark



bharat



Mark


  I don't know what you mean by swap memory.

 Sorry, I meant cache memory..

 bharat

   Mark
 System: Protein + water + Na ions (total 46878 atoms)
Gromacs version: tested with both v4.0.5 and v4.0.7
compiled with: --enable-float --with-fft=fftw3 --enable-mpi
compiler: gcc_3.4.6 -O3
machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux
   I tried searching the mailing-list without any luck. I am not sure if i
   am doing anything wrong in giving commands. Please correct me if it is
   wrong.
   Kindly let me know the solution.
bharat









Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-27 Thread bharat v. adkar

On Mon, 28 Dec 2009, Mark Abraham wrote:


bharat v. adkar wrote:

 On Sun, 27 Dec 2009, Mark Abraham wrote:

  bharat v. adkar wrote:
   On Sun, 27 Dec 2009, Mark Abraham wrote:
    bharat v. adkar wrote:
     Dear all,
     I am trying to perform replica exchange MD (REMD) on a 'protein in
     water' system. ...

     command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v

     Fatal error:
     Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
     nlist->jjnr=0x9a400030
     (called from file ../../../SRC/src/mdlib/ns.c, line 503)
     ...

     What I have observed by submitting different test jobs with varying
     numbers of processors (and no. of replicas, wherever necessary) is
     that any job with a total number of processors <= 64 runs faithfully
     without any problem. As soon as the total number of processors is
     more than 64, it gives the above error. I have tested this with 65
     processors/65 replicas also.

    This sounds like you might be running on fewer physical CPUs than you
    have available. If so, running multiple MPI processes per physical
    CPU can lead to memory shortage conditions.

   I don't understand what you mean. Do you mean, there might be more
   than 8 processes running per node (each node has 8 processors)? But
   that also does not seem to be the case, as SGE (sun grid engine)
   output shows only eight processes per node.

  65 processes can't have 8 processes per node.

 why can't it have? as i said, there are 8 processors per node. what i
 have not mentioned is how many nodes it is using. The jobs got
 distributed over 9 nodes, 8 of which correspond to 64 processors + 1
 processor from the 9th node.

OK, that's a full description. Your symptoms are indicative of someone 
making an error somewhere. Since GROMACS works over more than 64 
processors elsewhere, the presumption is that you are doing something 
wrong or the machine is not set up in the way you think it is or should 
be. To get the most effective help, you need to be sure you're providing 
full information - else we can't tell which error you're making or 
(potentially) eliminate you as a source of error.



Sorry for not being clear in statements.


 As far I can tell you, job distribution seems okay to me. It is 1 job per
 processor.


Does non-REMD GROMACS run on more than 64 processors? Does your cluster 
support using more than 8 nodes in a run? Can you run an MPI Hello world 
application that prints the processor and node ID across more than 64 
processors?


Yes, the cluster supports runs with more than 8 nodes. I generated a 
system with 10 nm water box and submitted on 80 processors. It was running 
fine. It printed all 80 NODEIDs. Also showed me when the job will get 
over.


bharat




Mark



 bharat

 
  Mark
 
 I don't know what you mean by swap memory.
  
Sorry, I meant cache memory..
  
bharat
  
  Mark

System: Protein + water + Na ions (total 46878 atoms)
   Gromacs version: tested with both v4.0.5 and v4.0.7
   compiled with: --enable-float --with-fft=fftw3 --enable-mpi
   compiler: gcc_3.4.6 -O3
   machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux
   I tried searching the mailing-list without any luck. I am not sure if i
   am doing anything wrong in giving commands. Please correct me if it is
   wrong.
   Kindly let me know the solution.
 

Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-27 Thread David van der Spoel

bharat v. adkar wrote:

 On Mon, 28 Dec 2009, Mark Abraham wrote:

  bharat v. adkar wrote:
   On Sun, 27 Dec 2009, Mark Abraham wrote:
    ...

  Does non-REMD GROMACS run on more than 64 processors? Does your cluster
  support using more than 8 nodes in a run? Can you run an MPI "Hello
  world" application that prints the processor and node ID across more
  than 64 processors?

 Yes, the cluster supports runs with more than 8 nodes. I generated a
 system with a 10 nm water box and submitted it on 80 processors. It was
 running fine. It printed all 80 NODEIDs. Also showed me when the job
 will get over.

 bharat
  ...
  Kindly let me

[gmx-users] Replica Exchange MD on more than 64 processors

2009-12-26 Thread bharat v. adkar


Dear all,
  I am trying to perform replica exchange MD (REMD) on a 'protein in 
water' system. I am following the instructions given on the wiki (How-Tos -> 
REMD). I have to perform the REMD simulation with 35 different 
temperatures. As per the advice on the wiki, I equilibrated the system at 
the respective temperatures (a total of 35 equilibration simulations). After 
this I generated chk_0.tpr, chk_1.tpr, ..., chk_34.tpr files from the 
equilibrated structures.
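
For reference, one way to script that step is a grompp call per replica; the 
input file names here only indicate the pattern, not the actual ones:

   for i in $(seq 0 34); do
       grompp -f remd_$i.mdp -c equil_$i.gro -p topol.top -o chk_$i.tpr
   done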


Now when I submit the final job for REMD with the following command line, it 
gives some error:


command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v

error msg:
---
Program mdrun_mpi, VERSION 4.0.7
Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179

Fatal error:
Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr, 
nlist->jjnr=0x9a400030

(called from file ../../../SRC/src/mdlib/ns.c, line 503)
---

Thanx for Using GROMACS - Have a Nice Day
: Cannot allocate memory
Error on node 19, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 19 out of 70
***


The individual node on the cluster has 8GB of physical memory and 16GB of 
swap memory. Moreover, when logged onto the individual nodes, it shows 
more than 1GB of free memory, so there should be no problem with cluster 
memory. Also, the equilibration jobs for the same system are run on the 
same cluster without any problem.


What I have observed by submitting different test jobs with varying numbers 
of processors (and no. of replicas, wherever necessary) is that any job with 
a total number of processors <= 64 runs faithfully without any problem. As 
soon as the total number of processors is more than 64, it gives the above 
error. I have tested this with 65 processors/65 replicas also.


System: Protein + water + Na ions (total 46878 atoms)
Gromacs version: tested with both v4.0.5 and v4.0.7
compiled with: --enable-float --with-fft=fftw3 --enable-mpi
compiler: gcc_3.4.6 -O3
machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux


I tried searching the mailing list without any luck. I am not sure if I 
am doing anything wrong in the commands. Please correct me if it is 
wrong.


Kindly let me know the solution.


bharat




Re: [gmx-users] Replica Exchange MD on more than 64 processors

2009-12-26 Thread Mark Abraham

bharat v. adkar wrote:


Dear all,
  I am trying to perform replica exchange MD (REMD) on a 'protein in 
water' system. I am following instructions given on wiki (How-Tos - 
REMD). I have to perform the REMD simulation with 35 different 
temperatures. As per advise on wiki, I equilibrated the system at 
respective temperatures (total of 35 equilibration simulations). After 
this I generated chk_0.tpr, chk_1.tpr, ..., chk_34.tpr files from the 
equilibrated structures.


Now when I submit final job for REMD with following command-line, it 
gives some error:


command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v

error msg:
---
Program mdrun_mpi, VERSION 4.0.7
Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179

Fatal error:
Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr, 
nlist->jjnr=0x9a400030

(called from file ../../../SRC/src/mdlib/ns.c, line 503)
---

Thanx for Using GROMACS - Have a Nice Day
: Cannot allocate memory
Error on node 19, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 19 out of 70
***


The individual node on the cluster has 8GB of physical memory and 16GB 
of swap memory. Moreover, when logged onto the individual nodes, it 
shows more than 1GB of free memory, so there should be no problem with 
cluster memory. Also, the equilibration jobs for the same system are run 
on the same cluster without any problem.


What I have observed by submitting different test jobs with varying 
number of processors (and no. of replicas, wherever necessary), that any 
job with a total number of processors <= 64 runs faithfully without any 
problem. As soon as total number of processors are more than 64, it 
gives the above error. I have tested this with 65 processors/65 replicas 
also.


This sounds like you might be running on fewer physical CPUs than you 
have available. If so, running multiple MPI processes per physical CPU 
can lead to memory shortage conditions.


I don't know what you mean by swap memory.

Mark


System: Protein + water + Na ions (total 46878 atoms)
Gromacs version: tested with both v4.0.5 and v4.0.7
compiled with: --enable-float --with-fft=fftw3 --enable-mpi
compiler: gcc_3.4.6 -O3
machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux


I tried searching the mailing-list without any luck. I am not sure, if i 
am doing anything wrong in giving commands. Please correct me if it is 
wrong.


Kindly let me know the solution.


bharat


