Hi, there was also an issue with locking the general md.log output file, which was fixed in 4.5.2. Updating might help.
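To show the kind of check that sits behind a "Failed to lock ... Already running simulation?" message, here is a minimal sketch of a non-blocking advisory file lock. This is not the GROMACS code. It assumes a POSIX system and uses Python's `fcntl.flock` purely for illustration. The helper name `try_lock` and the temporary `md.log` path are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open path and try to take an exclusive, non-blocking advisory lock.

    Returns the open file object on success (keep it open to hold the lock),
    or None if another open handle already holds the lock.
    """
    handle = open(path, "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # The lock is held elsewhere; a blocking call would wait here instead.
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    # Hypothetical log file standing in for md100ns.log.
    log_path = os.path.join(tempfile.mkdtemp(), "md.log")

    first = try_lock(log_path)   # plays the role of the running simulation
    second = try_lock(log_path)  # a restart attempted while the lock is held
    print("first run locked the log:", first is not None)
    print("second run locked the log:", second is not None)

    first.close()                # closing the file releases the lock
    third = try_lock(log_path)
    print("lock available after release:", third is not None)
    third.close()
```

If a check of this kind fails even though no other run is writing to the log, the cause may lie in the locking itself (for example in the file system or in the program) and not in a second simulation.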
Carsten

On Nov 3, 2010, at 3:50 PM, Florian Dommert wrote:

> On 11/03/2010 03:38 PM, Hong, Liang wrote:
>> Dear all,
>> I'm performing a three-day simulation. It runs well for the first day, but
>> stops for the second one. The error message is below. Does anyone know what
>> might be the problem? Thanks
>> Liang
>>
>> Program mdrun, VERSION 4.5.1-dev-20101008-e2cbc-dirty
>> Source code file: /home/z8g/download/gromacs.head/src/gmxlib/checkpoint.c,
>> line: 1748
>>
>> Fatal error:
>> Failed to lock: md100ns.log. Already running simulation?
>> For more information and tips for troubleshooting, please check the GROMACS
>> website at http://www.gromacs.org/Documentation/Errors
>> -------------------------------------------------------
>>
>> "Sitting on a rooftop watching molecules collide" (A Camp)
>>
>> Error on node 0, will try to stop all the nodes
>> Halting parallel program mdrun on CPU 0 out of 32
>>
>> gcq#348: "Sitting on a rooftop watching molecules collide" (A Camp)
>>
>> --------------------------------------------------------------------------
>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>> with errorcode -1.
>>
>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> You may or may not see output from other processes, depending on
>> exactly when Open MPI kills them.
>> --------------------------------------------------------------------------
>> [node139:04470] [[37327,0],0]-[[37327,1],0] mca_oob_tcp_msg_recv: readv
>> failed: Connection reset by peer (104)
>> --------------------------------------------------------------------------
>> mpiexec has exited due to process rank 0 with PID 4471 on
>> node node139 exiting without calling "finalize". This may
>> have caused other processes in the application to be
>> terminated by signals sent by mpiexec (as reported here).
>
> Perhaps the queueing system of your cluster does not allow running a job
> longer than 24h. Or the default is 24h and you have to supply the
> corresponding information to the submission script.
>
> /Flo
>
> --
> Florian Dommert
> Dipl.-Phys.
>
> Institute for Computational Physics
> University Stuttgart
>
> Pfaffenwaldring 27
> 70569 Stuttgart
>
> Phone: +49(0)711/685-6-3613
> Fax: +49-(0)711/685-6-3658
>
> EMail: domm...@icp.uni-stuttgart.de
> Home: http://www.icp.uni-stuttgart.de/~icp/Florian_Dommert
> --
> gmx-users mailing list gmx-users@gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-users-requ...@gromacs.org.
> Can't post? Read http://www.gromacs.org/Support/Mailing_Lists