[OMPI users] fork in Fortran
Dear users,

How can I use fork()- and vfork()-type functions in a Fortran program?

Thanking you in advance

--
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. of Chemistry
IIT Kanpur-208016
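Fortran has no intrinsic fork(), but since Fortran 2003 one can bind directly to the C library's fork(2) through ISO_C_BINDING. A minimal sketch, assuming a POSIX system where pid_t maps to a C int; the program and wrapper names are illustrative:

    program fork_demo
       use iso_c_binding, only: c_int
       implicit none

       interface
          ! POSIX fork(2) from the C library; pid_t assumed to be a C int
          function c_fork() bind(c, name="fork") result(pid)
             import :: c_int
             integer(c_int) :: pid
          end function c_fork
       end interface

       integer(c_int) :: pid

       pid = c_fork()
       if (pid < 0) then
          print *, "fork() failed"
       else if (pid == 0) then
          print *, "hello from the child"
       else
          print *, "hello from the parent; child pid =", pid
       end if
    end program fork_demo

One caveat: calling fork() from inside a running MPI process is generally unsafe; Open MPI in particular warns about fork() when the openib BTL is in use, so this technique is best confined to serial code.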
[OMPI users] system call
Dear Users,

I have two separate programs, say PROG1 and PROG2; each runs in parallel independently. I have made the following modification: PROG1 runs first, does its own job, and prepares an input file for PROG2; PROG2 is then started with a system call, and its result is used back in PROG1. This runs successfully. I would now like the program launched by that system call to run in parallel as well. Is there any way to do this?

Thanking you

--
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. of Chemistry
IIT Kanpur-208016
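One possible sketch: the Fortran 2008 intrinsic execute_command_line (the standard replacement for the non-standard system() extension) can launch PROG2 under its own mpirun and wait for it to finish. The mpirun command string, process count, and file names below are assumptions, not taken from the original post:

    program prog1_driver
       implicit none
       integer :: exitstat, cmdstat

       ! ... PROG1 does its own work and writes prog2.in ...

       ! Launch PROG2 in parallel and block until it finishes.
       call execute_command_line("mpirun -np 4 ./prog2 < prog2.in > prog2.out", &
                                 wait=.true., exitstat=exitstat, cmdstat=cmdstat)

       if (cmdstat /= 0) then
          print *, "could not launch PROG2, cmdstat =", cmdstat
       else
          print *, "PROG2 finished with exit status", exitstat
       end if

       ! ... PROG1 reads prog2.out and continues ...
    end program prog1_driver

Note that if PROG1 is itself an MPI program, nesting an mpirun inside it is not guaranteed to work with every resource manager; the MPI-standard route for starting a parallel child job from a running MPI program is MPI_COMM_SPAWN.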
[OMPI users] LD_LIBRARY_PATH Problem
Dear users,

I am getting the following error while running a calculation; the job terminates before writing anything to the output file.

ssh: ibc18: Name or service not known^M
--------------------------------------------------------------------------
A daemon (pid 8103) died unexpectedly with status 255 while attempting
to launch, so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes, and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

Can anyone help me?

Thanks in advance

--
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. of Chemistry
IIT Kanpur-208016
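Two tentative notes on this log: the very first line, "ssh: ibc18: Name or service not known^M", indicates the remote hostname could not be resolved at all, and the trailing ^M suggests a stray carriage return (e.g. from a DOS-formatted hostfile), so the shared-library advice may be secondary here. If LD_LIBRARY_PATH does turn out to be the problem, Open MPI's mpirun can export it to the remote nodes with -x, for example (the process count and executable name are placeholders):

    mpirun -x LD_LIBRARY_PATH -np 16 ./myprog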
[OMPI users] line 60: echo: write error: No space left on device
Dear Open MPI users,

I am running a CPMD calculation in parallel. The job was killed with the error below. What is this error, and how can I fix it?

/opt/gridengine/mpi/startmpi.sh: line 60: echo: write error: No space left on device
[the line above is repeated many times]
[compute-0-6.local:17488] opal_os_dirpath_create: Error: Unable to create the sub-directory (/data/20952.1.all.q/openmpi-sessions-sudhirs@compute-0-6.local_0) of (/data/20952.1.all.q/openmpi-sessions-sudhirs@compute-0-6.local_0/43063/0/0), mkdir failed [1]
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 101
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 425
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file ess_hnp_module.c at line 273
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can fail
during orte_init; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  orte_session_dir failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 132
Re: [OMPI users] line 60: echo: write error: No space left on device
Dear John Hearns,

Thank you. It is working fine after mounting the filesystem.

On Tue, Oct 1, 2013 at 12:42 PM, John Hearns wrote:
> Do you have a filesystem which is full? df will tell you.
> Or maybe it is mounted read-only.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. of Chemistry
IIT Kanpur-208016
[OMPI users] Error
Dear Open MPI users,

I am running a CPMD calculation in parallel. The job was killed with the error below. What is this error, and how can I fix it?

[[12065,1],23][btl_openib_component.c:2948:handle_wc] from compute-0-0.local to: compute-0-7 error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id 396116864 opcode 0 vendor error 129 qp_idx 1
--------------------------------------------------------------------------
The InfiniBand retry count between two MPI processes has been exceeded.
"Retry count" is defined in the InfiniBand spec 1.2 (section 12.7.38):

  The total number of times that the sender wishes the receiver to retry
  timeout, packet sequence, etc. errors before posting a completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself. You should note the hosts on which this error
has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will attempt
  to retry (defaulted to 7, the maximum value).
* btl_openib_ib_timeout - The local ACK timeout parameter (defaulted to 10).
  The actual timeout value used is calculated as:
    4.096 microseconds * (2^btl_openib_ib_timeout)
  See the InfiniBand spec 1.2 (section 12.7.34) for more details.

Below is some information about the host that raised the error and the
peer to which it was connected:

  Local host:   compute-0-0.local
  Local device: mthca0
  Peer host:    compute-0-7

You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------
mpirun has exited due to process rank 23 with PID 24240 on node
compute-0-0 exiting without calling "finalize". This may have caused
other processes in the application to be terminated by signals sent by
mpirun (as reported here).
--------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)

Image              PC            Routine  Line     Source
mca_btl_openib.so  2AD8CFE0DED0  Unknown  Unknown  Unknown

forrtl: error (78): process killed (SIGTERM)

Image              PC            Routine  Line     Source
mca_btl_sm.so      2B316684B029  Unknown  Unknown  Unknown
libopen-pal.so.0   2B3162A0FD97  Unknown  Unknown  Unknown
libmpi.so.0        2B31625008B6  Unknown  Unknown  Unknown
mca_coll_tuned.so  2B3167902A3E  Unknown  Unknown  Unknown
mca_coll_tuned.so  2B31678FF6F5  Unknown  Unknown  Unknown
libmpi.so.0        2B31625178C6  Unknown  Unknown  Unknown
libmpi_f77.so.0    2B31622B7725  Unknown  Unknown  Unknown
cpmd.x             00808017      Unknown  Unknown  Unknown
cpmd.x             00805AF8      Unknown  Unknown  Unknown
cpmd.x             0050C49D      Unknown  Unknown  Unknown
cpmd.x             005B6FC8      Unknown  Unknown  Unknown
cpmd.x             0051D5DE      Unknown  Unknown  Unknown
cpmd.x             005B3557      Unknown  Unknown  Unknown
cpmd.x             0095817C      Unknown  Unknown  Unknown
cpmd.x             00959557      Unknown  Unknown  Unknown
cpmd.x             00657E07      Unknown  Unknown  Unknown
cpmd.x             0046F2D1      Unknown  Unknown  Unknown
cpmd.x             0046EF6C      Unknown  Unknown  Unknown
libc.so.6          003F34E1D974  Unknown  Unknown  Unknown
cpmd.x             0046EE79      Unknown  Unknown  Unknown

Thanking you

--
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. of Chemistry
IIT Kanpur-208016
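To make the timeout formula above concrete: with the default btl_openib_ib_timeout of 10, the local ACK timeout is 4.096 microseconds * 2^10 = 4.096 microseconds * 1024, or roughly 4.2 ms, while a value of 14 (chosen here purely for illustration) would give 4.096 microseconds * 2^14, roughly 67 ms. MCA parameters such as these can be set on the mpirun command line with --mca, e.g.:

    mpirun --mca btl_openib_ib_timeout 14 ...

(the rest of the command line is elided).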