----- Original Message -----
From: xho...@sohu.com
Date: Tuesday, June 1, 2010 11:53
Subject: [gmx-users] “Fatal error in PMPI_Bcast: Other MPI error, …..” occurs when using the ‘particle decomposition’ option.
To: gmx-users <gmx-users@gromacs.org>
> Hi, everyone of gmx-users,
>
> I met a problem when I use the ‘particle decomposition’ option in an
> NPT MD simulation of Engrailed Homeodomain (En) in a Cl- neutralized
> water box. It just crashed with the error “Fatal error in PMPI_Bcast:
> Other MPI error, error stack: …..”. However, with the ‘domain
> decomposition’ option everything is OK! I use GROMACS 4.0.5 and
> 4.0.7, and the MPI library is mpich2-1.2.1p1. The system box size is
> (5.386 nm)^3. The MDP file is listed below:
>
> ########################################################
> title            = En
> ;cpp             = /lib/cpp
> ;include         = -I../top
> define           =
> integrator       = md
> dt               = 0.002
> nsteps           = 3000000
> nstxout          = 500
> nstvout          = 500
> nstlog           = 250
> nstenergy        = 250
> nstxtcout        = 500
> comm-mode        = Linear
> nstcomm          = 1
>
> ;xtc_grps        = Protein
> energygrps       = protein non-protein
>
> nstlist          = 10
> ns_type          = grid
> pbc              = xyz          ; default xyz
> ;periodic_molecules = yes       ; default no
> rlist            = 1.0
>
> coulombtype      = PME
> rcoulomb         = 1.0
> vdwtype          = Cut-off
> rvdw             = 1.4
> fourierspacing   = 0.12
> fourier_nx       = 0
> fourier_ny       = 0
> fourier_nz       = 0
> pme_order        = 4
> ewald_rtol       = 1e-5
> optimize_fft     = yes
>
> tcoupl           = v-rescale
> tc_grps          = protein non-protein
> tau_t            = 0.1 0.1
> ref_t            = 298 298
> Pcoupl           = Parrinello-Rahman
> pcoupltype       = isotropic
> tau_p            = 0.5
> compressibility  = 4.5e-5
> ref_p            = 1.0
>
> gen_vel          = yes
> gen_temp         = 298
> gen_seed         = 173529
>
> constraints      = hbonds
> lincs_order      = 10
> ########################################################
>
> When I conduct MD using “nohup mpiexec -np 2 mdrun_dmpi -s 11_Trun.tpr
> -g 12_NTPmd.log -o 12_NTPmd.trr -c 12_NTPmd.pdb -e 12_NTPmd_ener.edr
> -cpo 12_NTPstate.cpt &”, everything is OK.
>
> Since the system doesn’t support more than 2 processes under the
> ‘domain decomposition’ option, it took me about 30 days to calculate
> a 6 ns trajectory. So I decided to use the ‘particle decomposition’
> option.

Why no more than 2? What GROMACS version? Why are you using double
precision with temperature coupling? MPICH has known issues. Use
OpenMPI. (A quick test of the MPI layer itself is sketched below,
after the rest of your report.)

> The command line is “nohup mpiexec -np 6 mdrun_dmpi -pd -s 11_Trun.tpr
> -g 12_NTPmd.log -o 12_NTPmd.trr -c 12_NTPmd.pdb -e 12_NTPmd_ener.edr
> -cpo 12_NTPstate.cpt &”. I got this crash in the nohup file:
>
> ####################
> Fatal error in PMPI_Bcast: Other MPI error, error stack:
> PMPI_Bcast(1302)......................: MPI_Bcast(buf=0x8fedeb0,
> count=60720, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
> MPIR_Bcast(998).......................:
> MPIR_Bcast_scatter_ring_allgather(842):
> MPIR_Bcast_binomial(187)..............:
> MPIC_Send(41).........................:
> MPIC_Wait(513)........................:
> MPIDI_CH3I_Progress(150)..............:
> MPID_nem_mpich2_blocking_recv(948)....:
> MPID_nem_tcp_connpoll(1720)...........:
> state_commrdy_handler(1561)...........:
> MPID_nem_tcp_send_queued(127).........: writev to socket failed -
> Bad address
> rank 0 in job 25 cluster.cn_52655 caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
> ####################
>
> The end of the log file reads:
>
> ####################
> ........
> bQMMM          = FALSE
> QMconstraints  = 0
> QMMMscheme     = 0
> scalefactor    = 1
> qm_opts:
> ngQM           = 0
> ####################
>
> I have searched the gmx-users mailing list and tried to adjust the MD
> parameters, but no solution was found. The “mpiexec -np x” option
> does not work except when x = 1.
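Before blaming mdrun, it is worth checking whether your MPICH2
installation can even survive a broadcast of the size that failed
(count=60720, MPI_BYTE, from your error stack). A minimal sketch; the
file name bcast_test.c is arbitrary, and it assumes mpicc and mpiexec
come from the same mpich2-1.2.1p1 installation you built GROMACS
against:

####################
cat > bcast_test.c <<'EOF'
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* Broadcast a buffer of the same size as the failing MPI_Bcast
 * call reported in the error stack (count=60720, MPI_BYTE). */
int main(int argc, char **argv)
{
    int   rank, i;
    char *buf = malloc(60720);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        for (i = 0; i < 60720; i++) {
            buf[i] = (char)(i & 0xff);
        }
    }
    MPI_Bcast(buf, 60720, MPI_BYTE, 0, MPI_COMM_WORLD);
    printf("rank %d: broadcast OK\n", rank);
    free(buf);
    MPI_Finalize();
    return 0;
}
EOF
mpicc bcast_test.c -o bcast_test
mpiexec -np 6 ./bcast_test
####################

If this also dies with a "writev to socket failed" error, the problem
is in the MPI library or the TCP setup between your nodes, not in
GROMACS.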
> I did find that when the whole En protein is constrained using
> position restraints (define = -DPOSRES), the ‘particle decomposition’
> option works. However, this is not the kind of MD I want to conduct.
>
> Could anyone help me with this problem? I would also like to know how
> I can accelerate this kind of MD (a long simulation of a small
> system) using GROMACS. Thanks a lot!
>
> (Further information about the simulated system: it contains one En
> protein (54 residues, 629 atoms), 4848 SPC/E waters in total, and 7
> Cl- ions to neutralize the system. The system was minimized first. A
> 20 ps MD was also performed for the waters and ions before EM.)

This should be bread-and-butter with either decomposition up to at
least 16 processors, for a correctly compiled GROMACS with a useful
MPI library.

Mark
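P.S. If rebuilding against OpenMPI turns out to be the fix, it is
quick on a 4.0.x source tree. A sketch, assuming the standard
autotools build and that OpenMPI's mpicc wrapper is on your PATH; the
_mpi program suffix is a convention, not a requirement:

####################
# point the build at OpenMPI's compiler wrapper
export CC=mpicc
./configure --enable-mpi --program-suffix=_mpi
# build and install only mdrun; the analysis tools do not need MPI
make mdrun
make install-mdrun
# then run as before (add -pd only if you still want particle
# decomposition):
mpiexec -np 6 mdrun_mpi -s 11_Trun.tpr -g 12_NTPmd.log -o 12_NTPmd.trr \
        -c 12_NTPmd.pdb -e 12_NTPmd_ener.edr -cpo 12_NTPstate.cpt
####################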