[OMPI users] fork in Fortran

2012-08-30 Thread sudhirs@
Dear users,
How can I use fork()- or vfork()-type functions in Fortran programs?
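For reference, one common approach is to call the C library fork() through the Fortran 2003 ISO_C_BINDING feature. The sketch below is only illustrative and assumes a POSIX system where pid_t has the size of a C int; note also that calling fork() from inside an MPI process is generally discouraged.

program fork_demo
  use iso_c_binding, only: c_int
  implicit none
  ! Interface to the C library fork(); assumes pid_t is a C int,
  ! which is true on typical Linux systems.
  interface
     function c_fork() bind(C, name="fork") result(pid)
       import :: c_int
       integer(c_int) :: pid
     end function c_fork
  end interface
  integer(c_int) :: pid

  pid = c_fork()
  if (pid == 0_c_int) then
     print *, 'child process running'
  else if (pid > 0_c_int) then
     print *, 'parent process, child pid =', pid
  else
     print *, 'fork() failed'
  end if
end program fork_demo

vfork() could be wrapped the same way. If the goal is simply to run another program, the Fortran 2008 intrinsic EXECUTE_COMMAND_LINE (or the common SYSTEM extension) may be more convenient.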

Thanking you in advance

-- 
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. Of Chemistry
IIT Kanpur-208016


[OMPI users] system call

2012-09-11 Thread sudhirs@
Dear Users,
I have two separate programs, say PROG1 and PROG2. Both programs run
in parallel independently, but I have made the following modification:
PROG1 runs first, does its own job, and prepares an input file for PROG2;
CALL SYSTEM is then used to run PROG2. The result of PROG2 is used in
PROG1. This works successfully.

I am interested in making this system call parallel. Is there any way to do it?
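One possibility, sketched below under the assumption that PROG2 is itself an MPI program, is to launch PROG2 as a parallel child job with MPI_Comm_spawn instead of CALL SYSTEM. The executable name prog2.x and the process count of 4 are placeholders.

program prog1
  use mpi
  implicit none
  integer :: ierr, child_comm
  integer :: errcodes(4)

  call MPI_Init(ierr)

  ! ... PROG1 does its own work and writes the input file for PROG2 ...

  ! Launch 4 copies of PROG2 as a separate parallel job.
  call MPI_Comm_spawn('prog2.x', MPI_ARGV_NULL, 4, MPI_INFO_NULL, 0, &
                      MPI_COMM_WORLD, child_comm, errcodes, ierr)

  ! Wait for PROG2 to finish before reading its results, for example
  ! through a handshake message over the intercommunicator child_comm.

  call MPI_Finalize(ierr)
end program prog1

If PROG2 has to stay a serial program, another option is to let only rank 0 issue the CALL SYSTEM (or EXECUTE_COMMAND_LINE) and surround it with MPI_Barrier calls so the other ranks wait for the result.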

Thanking you
-- 
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. Of Chemistry
IIT Kanpur-208016


[OMPI users] LD_LIBRARY_PATH Problem

2013-04-29 Thread sudhirs@
Dear users,
I am getting the following error while running a calculation. The job is
terminated before anything is written to the output file.
==
ssh: ibc18: Name or service not known^M
--
A daemon (pid 8103) died unexpectedly with status 255 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
mpirun: clean termination accomplished

Can anyone help me?
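Two things in the log above may be worth checking. The line "ssh: ibc18: Name or service not known^M" suggests that the node name ibc18 could not be resolved from the launching host, and the trailing ^M usually indicates a stray carriage return (for example from a DOS-formatted hostfile). For the LD_LIBRARY_PATH suggestion in the message itself, a typical invocation looks like the following, where the installation path, process count, and program name are placeholders:

export LD_LIBRARY_PATH=/path/to/openmpi/lib:$LD_LIBRARY_PATH
mpirun --prefix /path/to/openmpi -x LD_LIBRARY_PATH -np 16 ./your_program

The --prefix option, or -x to export an environment variable to the remote nodes, helps the remote daemon find the Open MPI shared libraries.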

Thanks in advance
-- 
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. Of Chemistry
IIT Kanpur-208016


[OMPI users] line 60: echo: write error: No space left on device

2013-10-01 Thread sudhirs@
Dear Open MPI users,
I am running a CPMD calculation in parallel. I got the following error and
the job was killed. The error message is given below. What is this error
and how can I fix it?

/opt/gridengine/mpi/startmpi.sh: line 60: echo: write error: No space left
on device
[the same message repeated many more times]
[compute-0-6.local:17488] opal_os_dirpath_create: Error: Unable to create
the sub-directory
(/data/20952.1.all.q/openmpi-sessions-sudhirs@compute-0-6.local_0) of
(/data/20952.1.all.q/openmpi-sessions-sudhirs@compute-0-6.local_0/43063/0/0),
mkdir failed [1]
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file
util/session_dir.c at line 101
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file
util/session_dir.c at line 425
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file
ess_hnp_module.c at line 273
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_session_dir failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--
[compute-0-6.local:17488] [[43063,0],0] ORTE_ERROR_LOG: Error in file
runtime/orte_init.c at line 132
--
It looks like orte_init failed for some reason; your parallel pr

Re: [OMPI users] line 60: echo: write error: No space left on device

2013-10-01 Thread sudhirs@
Dear John Hearns,
Thank you. It is working fine after the filesystem was mounted.


On Tue, Oct 1, 2013 at 12:42 PM, John Hearns  wrote:

> Do you have a filesystem which is full? df will tell you.
> Or maybe it is mounted read-only.



-- 
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. Of Chemistry
IIT Kanpur-208016


[OMPI users] Error

2013-10-18 Thread sudhirs@
Dear Open MPI users,
I am running a CPMD calculation in parallel. I got the following error and
the job was killed. The error message is given below. What is this error
and how can I fix it?


[[12065,1],23][btl_openib_component.c:2948:handle_wc] from
compute-0-0.local to: compute-0-7 error polling LP CQ with status RETRY
EXCEEDED ERROR status number 12 for wr_id 396116864 opcode 0  vendor error
129 qp_idx 1
--
The InfiniBand retry count between two MPI processes has been
exceeded.  "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):

The total number of times that the sender wishes the receiver to
retry timeout, packet sequence, etc. errors before posting a
completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself.  You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will
  attempt to retry (defaulted to 7, the maximum value).
* btl_openib_ib_timeout - The local ACK timeout parameter (defaulted
  to 10).  The actual timeout value used is calculated as:

 4.096 microseconds * (2^btl_openib_ib_timeout)

  See the InfiniBand spec 1.2 (section 12.7.34) for more details.

Below is some information about the host that raised the error and the
peer to which it was connected:

  Local host:   compute-0-0.local
  Local device: mthca0
  Peer host:compute-0-7

You may need to consult with your system administrator to get this
problem fixed.

--
--
mpirun has exited due to process rank 23 with PID 24240 on
node compute-0-0 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source
mca_btl_openib.so  2AD8CFE0DED0  Unknown   Unknown  Unknown
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source
mca_btl_sm.so  2B316684B029  Unknown   Unknown  Unknown
libopen-pal.so.0   2B3162A0FD97  Unknown   Unknown  Unknown
libmpi.so.02B31625008B6  Unknown   Unknown  Unknown
mca_coll_tuned.so  2B3167902A3E  Unknown   Unknown  Unknown
mca_coll_tuned.so  2B31678FF6F5  Unknown   Unknown  Unknown
libmpi.so.02B31625178C6  Unknown   Unknown  Unknown
libmpi_f77.so.02B31622B7725  Unknown   Unknown  Unknown
cpmd.x 00808017  Unknown   Unknown  Unknown
cpmd.x 00805AF8  Unknown   Unknown  Unknown
cpmd.x 0050C49D  Unknown   Unknown  Unknown
cpmd.x 005B6FC8  Unknown   Unknown  Unknown
cpmd.x 0051D5DE  Unknown   Unknown  Unknown
cpmd.x 005B3557  Unknown   Unknown  Unknown
cpmd.x 0095817C  Unknown   Unknown  Unknown
cpmd.x 00959557  Unknown   Unknown  Unknown
cpmd.x 00657E07  Unknown   Unknown  Unknown
cpmd.x 0046F2D1  Unknown   Unknown  Unknown
cpmd.x 0046EF6C  Unknown   Unknown  Unknown
libc.so.6  003F34E1D974  Unknown   Unknown  Unknown
cpmd.x 0046EE79  Unknown   Unknown  Unknown
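
For reference, the two MCA parameters named in the help text above are set on the mpirun command line. A hedged example, where the process count and input file name are placeholders and 20 is just an illustrative timeout value in the allowed 0-31 range:

mpirun --mca btl_openib_ib_retry_count 7 --mca btl_openib_ib_timeout 20 -np 32 ./cpmd.x input.inp

Using the formula quoted above, the default btl_openib_ib_timeout of 10 corresponds to 4.096 us * 2^10, roughly 4 ms per retry, while a value of 20 corresponds to roughly 4 s. As the message itself notes, however, retry-exceeded errors usually point at the InfiniBand fabric or a particular host rather than at these parameters.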


Thanking you
-- 
Sudhir Kumar Sahoo
Ph.D Scholar
Dept. Of Chemistry
IIT Kanpur-208016