Re: [OMPI users] mpi problems,

2011-04-07 Thread Nehemiah Dacres
Oh, thank you! That might work.
On Thu, Apr 7, 2011 at 5:31 AM, Terry Dontje <terry.don...@oracle.com> wrote:

>  Nehemiah,
> I took a look at an old version of an HPL Makefile I have.  I think what you
> really want to do is leave the MP* variables unset and, near the end
> of the Makefile, set CC and LINKER to mpicc.  You may also need to change the
> CFLAGS and LINKERFLAGS variables to match the compiler/arch you are
> using.
>
> --td
>
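A minimal sketch of that Makefile change, written as a shell helper rather than a hand edit. It assumes the HPL Make.<arch> file writes each variable as NAME = value at the start of a line; the function name use_mpicc and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: copy an HPL Make.<arch> file, blanking the MP* variables and
# pointing CC and LINKER at the MPI wrapper compiler.
# Assumption: each variable is defined as "NAME = value" at the start of a line.
use_mpicc() {
    # $1 = input Make.<arch> file, $2 = output file
    sed -e 's/^\(MPdir[[:space:]]*=\).*/\1/' \
        -e 's/^\(MPinc[[:space:]]*=\).*/\1/' \
        -e 's/^\(MPlib[[:space:]]*=\).*/\1/' \
        -e 's/^\(CC[[:space:]]*=\).*/\1 mpicc/' \
        -e 's/^\(LINKER[[:space:]]*=\).*/\1 mpicc/' \
        "$1" > "$2"
}
```

Usage might look like "use_mpicc Make.Linux_sunct Make.Linux_sunct.new", followed by a diff of the two files before replacing the original. The flags variables are left untouched and still need to be checked by hand against the compiler in use.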
> On 04/07/2011 06:20 AM, Terry Dontje wrote:
>
> On 04/06/2011 03:38 PM, Nehemiah Dacres wrote:
>
> I am also trying to get netlib's hpl to run via sun cluster tools so i am
> trying to compile it and am having trouble. Which is the proper mpi library
> to give?
> naturally this isn't going to work
>
> MPdir= /opt/SUNWhpc/HPC8.2.1c/sun/
> MPinc= -I$(MPdir)/include
> *MPlib= $(MPdir)/lib/libmpi.a*
>
> Is there a reason you are trying to link with a static libmpi?  You really
> want to link with libmpi.so.  It also seems like whatever Makefile you are
> using is not using mpicc; is that true?  The reason that is important is that
> mpicc would pick up the right libs you need.  Which brings me to Ralph's
> comment: if you really want to go around the mpicc way of compiling, use
> mpicc --showme, copy the compile line shown in that command's output, and
> insert your files accordingly.
>
> --td
>
>
> because that doesn't exist
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libotf.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.fmpi.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.omp.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.ompi.a
>
> is what I have for listing *.a in the lib directory. None of those are
> equivalent, because they are all linked with VampirTrace if I am reading
> the names right. I've already tried putting
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a for this and it didn't work,
> giving errors like
>
> On Wed, Apr 6, 2011 at 12:42 PM, Terry Dontje <terry.don...@oracle.com>wrote:
>
>>  Something looks fishy about your numbers.  The first two sets of numbers
>> look the same and the last set do look better for the most part.  Your
>> mpirun command line looks weird to me with the "-mca
>> orte_base_help_aggregate btl,openib,self," did something get chopped off
>> with the text copy?  You should have had a "-mca btl openib,self".  Can you
>> do a run with "-mca btl tcp,self", it should be slower.
>>
>> I really wouldn't have expected another compiler over IB to be that
>> dramatically lower performing.
>>
>> --td
>>
>>
>>
>> On 04/06/2011 12:40 PM, Nehemiah Dacres wrote:
>>
>>  also, I'm not sure if I'm reading the results right. According to the
>> last run, did using the sun compilers (update 1 )  result in higher
>> performance with sunct?
>>
>> On Wed, Apr 6, 2011 at 11:38 AM, Nehemiah Dacres <dacre...@slu.edu>wrote:
>>
>>> some tests I did. I hope this isn't an abuse of the list. please tell me
>>> if it is but thanks to all those who helped me.
>>>
>>> this  goes to say that the sun MPI works with programs not compiled with
>>> sun’s compilers.
>>> this first test was run as a base case to see if MPI works., the sedcond
>>> run is to see the speed up using OpenIB provides
>>> jian@therock ~]$ mpirun -machinefile list
>>> /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
>>> Start mpi_stress at Wed Apr  6 10:56:29 2011
>>>
>>> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>>>           32           1              2.77      86485.67
>>>           64           1              5.76      90049.42
>>>          128           1             11.00      85923.85
>>>          256           1             18.78      73344.43
>>>          512           1             34.47      67331.98
>>>         1024           1             34.81      33998.09
>>>         2048           1             17.31       8454.27
>>>         4096           1             18.34       4476.61
>>>         8192           1             25.43       3104.28
>>>        16384           1             15.56        949.50
>>>        32768           1             13.95        425.74
>>>
>>>   6

Re: [OMPI users] mpi problems,

2011-04-06 Thread Nehemiah Dacres
[jian@therock lib]$ ls lib64/*.a
lib64/libotf.a  lib64/libvt.fmpi.a  lib64/libvt.omp.a
lib64/libvt.a   lib64/libvt.mpi.a   lib64/libvt.ompi.a
Last time I linked one of those files, it told me they were in the wrong
format. These are in archive format. What format should they be in?


On Wed, Apr 6, 2011 at 2:44 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Look at your output from mpicc --showme. It indicates that the OMPI libs
> were put in the lib64 directory, not lib.
>
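One way to see which library directories the wrapper compiler actually uses is to pull the -L entries out of the line that mpicc --showme prints. The helper below is a sketch: the name libdirs_from_showme is made up here, and it assumes the options in that line are separated by spaces and that no path contains a space.

```shell
#!/bin/sh
# Sketch only: read a compiler command line on stdin and print each
# directory that was passed with -L, one per line.
libdirs_from_showme() {
    # Put every option on its own line, then keep only the -L ones
    # and strip the "-L" prefix.
    tr ' ' '\n' | sed -n 's/^-L//p'
}
```

It would be used as "mpicc --showme | libdirs_from_showme". Listing libmpi* in each directory it prints should show whether a shared libmpi.so is present there.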
>
> On Apr 6, 2011, at 1:38 PM, Nehemiah Dacres wrote:
>
> I am also trying to get netlib's hpl to run via sun cluster tools so i am
> trying to compile it and am having trouble. Which is the proper mpi library
> to give?
> naturally this isn't going to work
>
> MPdir= /opt/SUNWhpc/HPC8.2.1c/sun/
> MPinc= -I$(MPdir)/include
> *MPlib= $(MPdir)/lib/libmpi.a*
>
> because that doesn't exist
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libotf.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.fmpi.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.omp.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.ompi.a
>
> is what I have for listing *.a in the lib directory. None of those are
> equivalent, because they are all linked with VampirTrace if I am reading
> the names right. I've already tried putting
> /opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a for this and it didn't work,
> giving errors like
>
> On Wed, Apr 6, 2011 at 12:42 PM, Terry Dontje <terry.don...@oracle.com>wrote:
>
>>  Something looks fishy about your numbers.  The first two sets of numbers
>> look the same and the last set do look better for the most part.  Your
>> mpirun command line looks weird to me with the "-mca
>> orte_base_help_aggregate btl,openib,self," did something get chopped off
>> with the text copy?  You should have had a "-mca btl openib,self".  Can you
>> do a run with "-mca btl tcp,self", it should be slower.
>>
>> I really wouldn't have expected another compiler over IB to be that
>> dramatically lower performing.
>>
>> --td
>>
>>
>>
>> On 04/06/2011 12:40 PM, Nehemiah Dacres wrote:
>>
>> also, I'm not sure if I'm reading the results right. According to the last
>> run, did using the sun compilers (update 1 )  result in higher performance
>> with sunct?
>>
>> On Wed, Apr 6, 2011 at 11:38 AM, Nehemiah Dacres <dacre...@slu.edu>wrote:
>>
>>> some tests I did. I hope this isn't an abuse of the list. please tell me
>>> if it is but thanks to all those who helped me.
>>>
>>> this  goes to say that the sun MPI works with programs not compiled with
>>> sun’s compilers.
>>> this first test was run as a base case to see if MPI works., the sedcond
>>> run is to see the speed up using OpenIB provides
>>> jian@therock ~]$ mpirun -machinefile list
>>> /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
>>> Start mpi_stress at Wed Apr  6 10:56:29 2011
>>>
>>> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>>>           32           1              2.77      86485.67
>>>           64           1              5.76      90049.42
>>>          128           1             11.00      85923.85
>>>          256           1             18.78      73344.43
>>>          512           1             34.47      67331.98
>>>         1024           1             34.81      33998.09
>>>         2048           1             17.31       8454.27
>>>         4096           1             18.34       4476.61
>>>         8192           1             25.43       3104.28
>>>        16384           1             15.56        949.50
>>>        32768           1             13.95        425.74
>>>
>>>        65536           1              9.88        150.79
>>>       131072        8192             11.05         84.31
>>>       262144        4096             13.12         50.04
>>>       524288        2048             16.54         31.55
>>>      1048576        1024             19.92         18.99
>>>      2097152         512             22.54         10.75
>>>      4194304         256

Re: [OMPI users] mpi problems,

2011-04-06 Thread Nehemiah Dacres
I am also trying to get netlib's HPL to run via Sun Cluster Tools, so I am
trying to compile it and am having trouble. Which is the proper MPI library
to give?
Naturally, this isn't going to work:

MPdir= /opt/SUNWhpc/HPC8.2.1c/sun/
MPinc= -I$(MPdir)/include
*MPlib= $(MPdir)/lib/libmpi.a*

because that doesn't exist
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libotf.a
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.fmpi.a
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.omp.a
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.a
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.ompi.a

is what I have for listing *.a in the lib directory. None of those are
equivalent, because they are all linked with VampirTrace if I am reading
the names right. I've already tried putting
/opt/SUNWhpc-O/HPC8.2.1c/sun/lib/libvt.mpi.a for this and it didn't work,
giving errors like

On Wed, Apr 6, 2011 at 12:42 PM, Terry Dontje <terry.don...@oracle.com> wrote:

>  Something looks fishy about your numbers.  The first two sets of numbers
> look the same and the last set do look better for the most part.  Your
> mpirun command line looks weird to me with the "-mca
> orte_base_help_aggregate btl,openib,self," did something get chopped off
> with the text copy?  You should have had a "-mca btl openib,self".  Can you
> do a run with "-mca btl tcp,self", it should be slower.
>
> I really wouldn't have expected another compiler over IB to be that
> dramatically lower performing.
>
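For reference, the two command lines suggested above differ only in the value given to the btl parameter. The sketch below just prints them instead of running anything; btl_cmd is a hypothetical helper, and the machinefile and program names are copied from the earlier runs.

```shell
#!/bin/sh
# Sketch only: print (do not run) an mpirun command line that selects
# the BTL components explicitly.
# $1 = comma-separated BTL list, $2 = machinefile, $3 = program
btl_cmd() {
    printf 'mpirun -mca btl %s -machinefile %s %s\n' "$1" "$2" "$3"
}

# InfiniBand run, then a TCP run for comparison.
btl_cmd openib,self list /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
btl_cmd tcp,self list /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
```

Printing the commands first makes it easy to check that "btl" and its value ended up as two separate arguments before pasting the line into a terminal.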
> --td
>
>
>
> On 04/06/2011 12:40 PM, Nehemiah Dacres wrote:
>
> also, I'm not sure if I'm reading the results right. According to the last
> run, did using the sun compilers (update 1 )  result in higher performance
> with sunct?
>
> On Wed, Apr 6, 2011 at 11:38 AM, Nehemiah Dacres <dacre...@slu.edu> wrote:
>
>> some tests I did. I hope this isn't an abuse of the list. please tell me
>> if it is but thanks to all those who helped me.
>>
>> this  goes to say that the sun MPI works with programs not compiled with
>> sun’s compilers.
>> this first test was run as a base case to see if MPI works., the sedcond
>> run is to see the speed up using OpenIB provides
>> jian@therock ~]$ mpirun -machinefile list
>> /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
>> Start mpi_stress at Wed Apr  6 10:56:29 2011
>>
>> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>>           32           1              2.77      86485.67
>>           64           1              5.76      90049.42
>>          128           1             11.00      85923.85
>>          256           1             18.78      73344.43
>>          512           1             34.47      67331.98
>>         1024           1             34.81      33998.09
>>         2048           1             17.31       8454.27
>>         4096           1             18.34       4476.61
>>         8192           1             25.43       3104.28
>>        16384           1             15.56        949.50
>>        32768           1             13.95        425.74
>>
>>        65536           1              9.88        150.79
>>       131072        8192             11.05         84.31
>>       262144        4096             13.12         50.04
>>       524288        2048             16.54         31.55
>>      1048576        1024             19.92         18.99
>>      2097152         512             22.54         10.75
>>      4194304         256             25.46          6.07
>>
>> Iteration 0 : errors = 0, total = 0 (495 secs, Wed Apr  6 11:04:44 2011)
>> After 1 iteration(s), 8 mins and 15 secs, total errors = 0
>>
>> here is the infiniband run
>>
>> [jian@therock ~]$ mpirun -mca orte_base_help_aggregate btl,openib,self,
>> -machinefile list /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
>> Start mpi_stress at Wed Apr  6 11:07:06 2011
>>
>> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>>           32           1              2.72      84907.69
>>           64           1              5.83      91097.94
>>          128           1             10.75      83959.63
>>          256           1             18.53      72384.48
>>          512           1             34.96      68285.00
>>  

Re: [OMPI users] mpi problems,

2011-04-06 Thread Nehemiah Dacres
Also, I'm not sure if I'm reading the results right. According to the last
run, did using the Sun compilers (update 1) result in higher performance
with sunct?

On Wed, Apr 6, 2011 at 11:38 AM, Nehemiah Dacres <dacre...@slu.edu> wrote:

> some tests I did. I hope this isn't an abuse of the list. please tell me if
> it is but thanks to all those who helped me.
>
> this  goes to say that the sun MPI works with programs not compiled with
> sun’s compilers.
> this first test was run as a base case to see if MPI works., the sedcond
> run is to see the speed up using OpenIB provides
> jian@therock ~]$ mpirun -machinefile list
> /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
> Start mpi_stress at Wed Apr  6 10:56:29 2011
>
> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>           32           1              2.77      86485.67
>           64           1              5.76      90049.42
>          128           1             11.00      85923.85
>          256           1             18.78      73344.43
>          512           1             34.47      67331.98
>         1024           1             34.81      33998.09
>         2048           1             17.31       8454.27
>         4096           1             18.34       4476.61
>         8192           1             25.43       3104.28
>        16384           1             15.56        949.50
>        32768           1             13.95        425.74
>
>        65536           1              9.88        150.79
>       131072        8192             11.05         84.31
>       262144        4096             13.12         50.04
>       524288        2048             16.54         31.55
>      1048576        1024             19.92         18.99
>      2097152         512             22.54         10.75
>      4194304         256             25.46          6.07
>
> Iteration 0 : errors = 0, total = 0 (495 secs, Wed Apr  6 11:04:44 2011)
> After 1 iteration(s), 8 mins and 15 secs, total errors = 0
>
> here is the infiniband run
>
> [jian@therock ~]$ mpirun -mca orte_base_help_aggregate btl,openib,self,
> -machinefile list /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
> Start mpi_stress at Wed Apr  6 11:07:06 2011
>
> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>           32           1              2.72      84907.69
>           64           1              5.83      91097.94
>          128           1             10.75      83959.63
>          256           1             18.53      72384.48
>          512           1             34.96      68285.00
>         1024           1             11.40      11133.10
>         2048           1             20.88      10196.34
>         4096           1             10.13       2472.13
>         8192           1             19.32       2358.25
>        16384           1             14.58        890.10
>        32768           1             15.85        483.61
>        65536           1              9.04        137.95
>       131072        8192             10.90         83.12
>       262144        4096             13.57         51.76
>       524288        2048             16.82         32.08
>      1048576        1024             19.10         18.21
>      2097152         512             22.13         10.55
>      4194304         256             21.66          5.16
>
> Iteration 0 : errors = 0, total = 0 (511 secs, Wed Apr  6 11:15:37 2011)
> After 1 iteration(s), 8 mins and 31 secs, total errors = 0
> compiled with the sun compilers i think
> [jian@therock ~]$ mpirun -mca orte_base_help_aggregate btl,openib,self,
> -machinefile list sunMpiStress
> Start mpi_stress at Wed Apr  6 11:23:18 2011
>
> Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
>           32           1              2.60      81159.60
>           64           1              5.19      81016.95
>          128           1             10.23      79953.34
>          256           1             16.74      65406.52
>          512           1             23.71

Re: [OMPI users] mpi problems,

2011-04-06 Thread Nehemiah Dacres
Some tests I did. I hope this isn't an abuse of the list; please tell me if
it is. Thanks to all those who helped me.

This goes to show that the Sun MPI works with programs not compiled with
Sun's compilers.
The first test was run as a base case to see if MPI works; the second run
is to see the speedup that OpenIB provides.
[jian@therock ~]$ mpirun -machinefile list /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
Start mpi_stress at Wed Apr  6 10:56:29 2011

Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
          32           1              2.77      86485.67
          64           1              5.76      90049.42
         128           1             11.00      85923.85
         256           1             18.78      73344.43
         512           1             34.47      67331.98
        1024           1             34.81      33998.09
        2048           1             17.31       8454.27
        4096           1             18.34       4476.61
        8192           1             25.43       3104.28
       16384           1             15.56        949.50
       32768           1             13.95        425.74

       65536           1              9.88        150.79
      131072        8192             11.05         84.31
      262144        4096             13.12         50.04
      524288        2048             16.54         31.55
     1048576        1024             19.92         18.99
     2097152         512             22.54         10.75
     4194304         256             25.46          6.07

Iteration 0 : errors = 0, total = 0 (495 secs, Wed Apr  6 11:04:44 2011)
After 1 iteration(s), 8 mins and 15 secs, total errors = 0
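Reading across a row, the third column appears to be the first column multiplied by the fourth and divided by 10^6 (for example, 32 x 86485.67 is about 2.77 million). That relation is inferred from the numbers above, not taken from the mpi_stress documentation, but it gives a quick consistency check when a row looks odd. A small sketch, with the hypothetical helper name mb_per_s:

```shell
#!/bin/sh
# Sketch only: recompute the TxMillionBytes/s column from the message size
# and the TxMessages/s column.
# Assumption (inferred from the table, not documented): rate = size * msgs/s / 1e6
# $1 = message size in bytes, $2 = messages per second
mb_per_s() {
    awk -v s="$1" -v r="$2" 'BEGIN { printf "%.2f\n", s * r / 1000000 }'
}
```

For instance, "mb_per_s 1024 33998.09" can be compared against the 34.81 reported in the 1024-byte row.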

here is the infiniband run

[jian@therock ~]$ mpirun -mca orte_base_help_aggregate btl,openib,self, -machinefile list /opt/iba/src/mpi_apps/mpi_stress/mpi_stress
Start mpi_stress at Wed Apr  6 11:07:06 2011

Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
          32           1              2.72      84907.69
          64           1              5.83      91097.94
         128           1             10.75      83959.63
         256           1             18.53      72384.48
         512           1             34.96      68285.00
        1024           1             11.40      11133.10
        2048           1             20.88      10196.34
        4096           1             10.13       2472.13
        8192           1             19.32       2358.25
       16384           1             14.58        890.10
       32768           1             15.85        483.61
       65536           1              9.04        137.95
      131072        8192             10.90         83.12
      262144        4096             13.57         51.76
      524288        2048             16.82         32.08
     1048576        1024             19.10         18.21
     2097152         512             22.13         10.55
     4194304         256             21.66          5.16

Iteration 0 : errors = 0, total = 0 (511 secs, Wed Apr  6 11:15:37 2011)
After 1 iteration(s), 8 mins and 31 secs, total errors = 0
This one was compiled with the Sun compilers, I think:
[jian@therock ~]$ mpirun -mca orte_base_help_aggregate btl,openib,self, -machinefile list sunMpiStress
Start mpi_stress at Wed Apr  6 11:23:18 2011

Size (bytes)  TxMessages  TxMillionBytes/s  TxMessages/s
          32           1              2.60      81159.60
          64           1              5.19      81016.95
         128           1             10.23      79953.34
         256           1             16.74      65406.52
         512           1             23.71      46304.92
        1024           1             54.62      53340.73
        2048           1             45.75      22340.58
        4096           1             29.32       7158.87
        8192           1             28.61       3492.77
       16384           1            184.03      11232.26
       32768           1            215.69       6582.21
       65536           1            229.88       3507.64
      131072        8192            231.64       1767.25
      262144        4096

Re: [OMPI users] mpi problems,

2011-04-06 Thread Nehemiah Dacres
On Mon, Apr 4, 2011 at 7:35 PM, Terry Dontje wrote:

>  libfui.so is a library a part of the Solaris Studio FORTRAN tools.  It
> should be located under lib from where your Solaris Studio compilers are
> installed from.  So one question is whether you actually have Studio Fortran
> installed on all your nodes or not?
>
> --td
>

Actually, I kind of realized this shortly after I read this message.


On 04/04/2011 04:02 PM, Ralph Castain wrote:

Well, where is libfui located? Is that location in your ld path? Is the lib
present on all nodes in your hostfile?


thank you all for your help

-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] mpi problems,

2011-04-06 Thread Nehemiah Dacres
Thanks, all. I realized that the Sun compilers weren't installed on all the
nodes. It seems to be working now; soon I will test the MCA parameters for IB.

On Mon, Apr 4, 2011 at 7:35 PM, Terry Dontje <terry.don...@oracle.com> wrote:

>  libfui.so is a library a part of the Solaris Studio FORTRAN tools.  It
> should be located under lib from where your Solaris Studio compilers are
> installed from.  So one question is whether you actually have Studio Fortran
> installed on all your nodes or not?
>
> --td
>
>
> On 04/04/2011 04:02 PM, Ralph Castain wrote:
>
> Well, where is libfui located? Is that location in your ld path? Is the lib
> present on all nodes in your hostfile?
>
>
>  On Apr 4, 2011, at 1:58 PM, Nehemiah Dacres wrote:
>
>  [jian@therock ~]$ echo $LD_LIBRARY_PATH
>
> /opt/sun/sunstudio12.1/lib:/opt/vtk/lib:/opt/gridengine/lib/lx26-amd64:/opt/gridengine/lib/lx26-amd64:/home/jian/.crlibs:/home/jian/.crlibs32
> [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/mpirun  -np 4 -hostfile
> list ring2
> ring2: error while loading shared libraries: libfui.so.1: cannot open
> shared object file: No such file or directory
> ring2: error while loading shared libraries: libfui.so.1: cannot open
> shared object file: No such file or directory
> ring2: error while loading shared libraries: libfui.so.1: cannot open
> shared object file: No such file or directory
> mpirun: killing job...
>
>
> --
> mpirun noticed that process rank 1 with PID 31763 on node compute-0-1
> exited on signal 0 (Unknown signal 0).
> --
> mpirun: clean termination accomplished
>
>  I really don't know what's wrong here. I was sure that would work
>
> On Mon, Apr 4, 2011 at 2:43 PM, Samuel K. Gutierrez <sam...@lanl.gov>wrote:
>
>> Hi,
>>
>>  Try prepending the path to your compiler libraries.
>>
>>  Example (bash-like):
>>
>>  export LD_LIBRARY_PATH=/compiler/prefix/lib:/ompi/prefix/lib:$LD_LIBRARY_PATH
>>
>>  --
>>Samuel K. Gutierrez
>> Los Alamos National Laboratory
>>
>>
>>  On Apr 4, 2011, at 1:33 PM, Nehemiah Dacres wrote:
>>
>>   altering LD_LIBRARY_PATH alter's the process's path to mpi's libraries,
>> how do i alter its path to compiler libs like libfui.so.1? it needs to find
>> them cause it was compiled by a sun compiler
>>
>>  On Mon, Apr 4, 2011 at 10:06 AM, Nehemiah Dacres <dacre...@slu.edu>wrote:
>>
>>>
>>>  As Ralph indicated, he'll add the hostname to the error message (but
>>>> that might be tricky; that error message is coming from rsh/ssh...).
>>>>
>>>> In the meantime, you might try (csh style):
>>>>
>>>> foreach host (`cat list`)
>>>>    echo $host
>>>>    ls -l /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
>>>> end
>>>>
>>>>
>>>  that's what the tentakel line was refering to, or ...
>>>
>>>
>>>
>>>>
>>>> On Apr 4, 2011, at 10:24 AM, Nehemiah Dacres wrote:
>>>>
>>>> > I have installed it via a symlink on all of the nodes, I can go
>>>> 'tentakel which mpirun ' and it finds it' I'll check the library paths but
>>>> isn't there a way to find out which nodes are returning the error?
>>>>
>>>
>>>  I found it misslinked on a couple nodes. thank you
>>>
>>> --
>>>  Nehemiah I. Dacres
>>> System Administrator
>>> Advanced Technology Group Saint Louis University
>>>
>>>
>>
>>
>> --
>> Nehemiah I. Dacres
>> System Administrator
>> Advanced Technology Group Saint Louis University
>>
>>   ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>>
>
>
>
> --
> Nehemiah I. Dacres
> System Administrator
> Advanced Technology Group Saint Louis University
>
>
>
>
>
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
>  Oracle * - Performance Technologies*
>  95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
>
>
>
>
>



-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] mpi problems,

2011-04-04 Thread Nehemiah Dacres
[jian@therock ~]$ echo $LD_LIBRARY_PATH
/opt/sun/sunstudio12.1/lib:/opt/vtk/lib:/opt/gridengine/lib/lx26-amd64:/opt/gridengine/lib/lx26-amd64:/home/jian/.crlibs:/home/jian/.crlibs32
[jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/mpirun  -np 4 -hostfile
list ring2
ring2: error while loading shared libraries: libfui.so.1: cannot open shared
object file: No such file or directory
ring2: error while loading shared libraries: libfui.so.1: cannot open shared
object file: No such file or directory
ring2: error while loading shared libraries: libfui.so.1: cannot open shared
object file: No such file or directory
mpirun: killing job...

--
mpirun noticed that process rank 1 with PID 31763 on node compute-0-1 exited
on signal 0 (Unknown signal 0).
--
mpirun: clean termination accomplished

I really don't know what's wrong here. I was sure that would work
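A quick way to confirm which shared libraries a binary cannot resolve on a given node is to filter the output of ldd for entries marked as not found. This is a sketch under assumptions: it expects the glibc ldd output format ("name => not found"), and missing_libs is a name made up for this example.

```shell
#!/bin/sh
# Sketch only: read ldd output on stdin and print the name of every
# library that the dynamic loader could not resolve.
# Assumption: glibc-style lines such as "libfui.so.1 => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

On a compute node this could be run as "ldd ./ring2 | missing_libs" from the directory holding the binary. The result depends on the LD_LIBRARY_PATH of the shell that runs ldd, so it is most useful when run on the node that reports the error.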

On Mon, Apr 4, 2011 at 2:43 PM, Samuel K. Gutierrez <sam...@lanl.gov> wrote:

> Hi,
>
> Try prepending the path to your compiler libraries.
>
> Example (bash-like):
>
> export LD_LIBRARY_PATH=/compiler/prefix/lib:/ompi/prefix/lib:$LD_LIBRARY_PATH
>
> --
> Samuel K. Gutierrez
> Los Alamos National Laboratory
>
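The same idea as a small helper that avoids adding a directory twice. It is only a sketch: prepend_path is a hypothetical name, and the directory in the usage note is the Sun Studio lib path from the echo output above, which would need to exist on every node for this to help.

```shell
#!/bin/sh
# Sketch only: print a colon-separated path list with a directory placed
# in front, unless that directory is already somewhere in the list.
# $1 = directory to prepend, $2 = current list (may be empty)
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;            # already present, unchanged
        *)        printf '%s\n' "${1}${2:+:$2}" ;;  # prepend, no stray colon
    esac
}
```

Usage might look like: LD_LIBRARY_PATH=$(prepend_path /opt/sun/sunstudio12.1/lib "$LD_LIBRARY_PATH"); export LD_LIBRARY_PATH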
>
> On Apr 4, 2011, at 1:33 PM, Nehemiah Dacres wrote:
>
> altering LD_LIBRARY_PATH alter's the process's path to mpi's libraries, how
> do i alter its path to compiler libs like libfui.so.1? it needs to find them
> cause it was compiled by a sun compiler
>
> On Mon, Apr 4, 2011 at 10:06 AM, Nehemiah Dacres <dacre...@slu.edu> wrote:
>
>>
>> As Ralph indicated, he'll add the hostname to the error message (but that
>>> might be tricky; that error message is coming from rsh/ssh...).
>>>
>>> In the meantime, you might try (csh style):
>>>
>>> foreach host (`cat list`)
>>>    echo $host
>>>    ls -l /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
>>> end
>>>
>>>
>> that's what the tentakel line was refering to, or ...
>>
>>
>>
>>>
>>> On Apr 4, 2011, at 10:24 AM, Nehemiah Dacres wrote:
>>>
>>> > I have installed it via a symlink on all of the nodes, I can go
>>> 'tentakel which mpirun ' and it finds it' I'll check the library paths but
>>> isn't there a way to find out which nodes are returning the error?
>>>
>>
>> I found it misslinked on a couple nodes. thank you
>>
>> --
>> Nehemiah I. Dacres
>> System Administrator
>> Advanced Technology Group Saint Louis University
>>
>>
>
>
> --
> Nehemiah I. Dacres
> System Administrator
> Advanced Technology Group Saint Louis University
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
>



-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] mpi problems,

2011-04-04 Thread Nehemiah Dacres
Altering LD_LIBRARY_PATH alters the process's path to MPI's libraries. How
do I alter its path to compiler libs like libfui.so.1? It needs to find them
because it was compiled by a Sun compiler.

On Mon, Apr 4, 2011 at 10:06 AM, Nehemiah Dacres <dacre...@slu.edu> wrote:

>
> As Ralph indicated, he'll add the hostname to the error message (but that
>> might be tricky; that error message is coming from rsh/ssh...).
>>
>> In the meantime, you might try (csh style):
>>
>> foreach host (`cat list`)
>>    echo $host
>>    ls -l /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
>> end
>>
>>
> that's what the tentakel line was refering to, or ...
>
>
>
>>
>> On Apr 4, 2011, at 10:24 AM, Nehemiah Dacres wrote:
>>
>> > I have installed it via a symlink on all of the nodes, I can go
>> 'tentakel which mpirun ' and it finds it' I'll check the library paths but
>> isn't there a way to find out which nodes are returning the error?
>>
>
> I found it misslinked on a couple nodes. thank you
>
> --
> Nehemiah I. Dacres
> System Administrator
> Advanced Technology Group Saint Louis University
>
>


-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] mpi problems,

2011-04-04 Thread Nehemiah Dacres
> As Ralph indicated, he'll add the hostname to the error message (but that
> might be tricky; that error message is coming from rsh/ssh...).
>
> In the meantime, you might try (csh style):
>
> foreach host (`cat list`)
>    echo $host
>    ls -l /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
> end
>
>
That's what the tentakel line was referring to, or ...


>
> On Apr 4, 2011, at 10:24 AM, Nehemiah Dacres wrote:
>
> > I have installed it via a symlink on all of the nodes, I can go 'tentakel
> which mpirun ' and it finds it' I'll check the library paths but isn't there
> a way to find out which nodes are returning the error?
>

I found it mislinked on a couple of nodes. Thank you.

-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] mpi problems,

2011-04-04 Thread Nehemiah Dacres
that's an excellent suggestion

On Mon, Apr 4, 2011 at 9:45 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> As Ralph indicated, he'll add the hostname to the error message (but that
> might be tricky; that error message is coming from rsh/ssh...).
>
> In the meantime, you might try (csh style):
>
> foreach host (`cat list`)
>    echo $host
>    ls -l /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
> end
>
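A Bourne-shell variant of that loop, as a sketch. As written above, the ls runs on the machine where the loop is typed; this version assumes the check is meant to run on each host over ssh with passwordless login. The names missing_on_hosts and REMOTE_SH are hypothetical, the latter existing only so the loop can be tried with a stand-in command.

```shell
#!/bin/sh
# Sketch only: print every host from a hostfile on which a given file is
# missing or not executable. A host that cannot be reached is printed too,
# because the remote test then fails as well.
# $1 = hostfile (first word of each line is the host name)
# $2 = absolute path to check on each host
missing_on_hosts() {
    while read -r host rest; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        # stdin is redirected so ssh does not swallow the rest of the hostfile.
        ${REMOTE_SH:-ssh} "$host" test -x "$2" < /dev/null || echo "$host"
    done < "$1"
}
```

With the files from this thread it would be called as "missing_on_hosts list /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted", and any host it prints is a candidate for the broken symlink.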
>
>
> On Apr 4, 2011, at 10:24 AM, Nehemiah Dacres wrote:
>
> > I have installed it via a symlink on all of the nodes. I can run 'tentakel
> > which mpirun' and it finds it. I'll check the library paths, but isn't there
> > a way to find out which nodes are returning the error?
> >
> >
> > On Thu, Mar 31, 2011 at 7:30 AM, Jeff Squyres <jsquy...@cisco.com>
> wrote:
> > The error message seems to imply that you don't have OMPI installed on
> all your nodes (because it didn't find /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
> on a remote node).
> >
> >
> > On Mar 30, 2011, at 4:24 PM, Nehemiah Dacres wrote:
> >
> > > I am trying to figure out why my jobs aren't getting distributed and
> need some help. I have an install of sun cluster tools on Rockscluster 5.2
> (essentially centos4u2). this user's account has its home dir shared via
> nfs. I am getting some strange errors. here's an example run
> > >
> > >
> > > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/mpirun -np 3
> -hostfile list ./job2.sh
> > > bash: /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted: No such file or directory
> > >
> --
> > > A daemon (pid 20362) died unexpectedly with status 127 while attempting
> > > to launch so we are aborting.
> > >
> > > There may be more information reported by the environment (see above).
> > >
> > > This may be because the daemon was unable to find all the needed shared
> > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> the
> > > location of the shared libraries on the remote nodes and this will
> > > automatically be forwarded to the remote nodes.
> > >
> --
> > >
> --
> > > mpirun noticed that the job aborted, but has no info as to the process
> > > that caused that situation.
> > >
> --
> > > mpirun: clean termination accomplished
> > >
> > > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/
> > > bin/        examples/   instrument/ man/
> > > etc/        include/    lib/        share/
> > > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orte
> > > orte-clean  orted       orte-iof    orte-ps     orterun
> > > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
> > > [therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found
> in file runtime/orte_init.c at line 125
> > >
> --
> > > It looks like orte_init failed for some reason; your parallel process
> is
> > > likely to abort.  There are many reasons that a parallel process can
> > > fail during orte_init; some of which are due to configuration or
> > > environment problems.  This failure appears to be an internal failure;
> > > here's some additional information (which may only be relevant to an
> > > Open MPI developer):
> > >
> > >   orte_ess_base_select failed
> > >   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> > >
> --
> > > [therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found
> in file orted/orted_main.c at line 325
> > > [jian@therock ~]$
> > >
> > >
> > > --
> > > Nehemiah I. Dacres
> > > System Administrator
> > > Advanced Technology Group Saint Louis University
> > >
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> > ___
> > users mailing list
> > us...@op

Re: [OMPI users] mpi problems,

2011-04-04 Thread Nehemiah Dacres
You do realize that this is the Sun Cluster Tools branch? (It is a branch,
right? Or is it a *port* of Open MPI to Sun's compilers?) I'm not sure if your
changes made it into sunct 8.2.1.

On Mon, Apr 4, 2011 at 9:34 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Guess I can/will add the node name to the error message - should have been
> there before now.
>
> If it is a debug build, you can add "-mca plm_base_verbose 1" to the cmd
> line and get output tracing the launch and showing you what nodes are having
> problems.
>
>
> On Apr 4, 2011, at 8:24 AM, Nehemiah Dacres wrote:
>
> I have installed it via a symlink on all of the nodes, I can go 'tentakel
> which mpirun ' and it finds it' I'll check the library paths but isn't there
> a way to find out which nodes are returning the error?
>
>
> On Thu, Mar 31, 2011 at 7:30 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
>
>> The error message seems to imply that you don't have OMPI installed on all
>> your nodes (because it didn't find /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted on a
>> remote node).
>>
>>
>> On Mar 30, 2011, at 4:24 PM, Nehemiah Dacres wrote:
>>
>> > I am trying to figure out why my jobs aren't getting distributed and
>> need some help. I have an install of sun cluster tools on Rockscluster 5.2
>> (essentially centos4u2). this user's account has its home dir shared via
>> nfs. I am getting some strange errors. here's an example run
>> >
>> >
>> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/mpirun -np 3 -hostfile
>> list ./job2.sh
>> > bash: /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted: No such file or directory
>> >
>> --
>> > A daemon (pid 20362) died unexpectedly with status 127 while attempting
>> > to launch so we are aborting.
>> >
>> > There may be more information reported by the environment (see above).
>> >
>> > This may be because the daemon was unable to find all the needed shared
>> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
>> the
>> > location of the shared libraries on the remote nodes and this will
>> > automatically be forwarded to the remote nodes.
>> >
>> --
>> >
>> --
>> > mpirun noticed that the job aborted, but has no info as to the process
>> > that caused that situation.
>> >
>> --
>> > mpirun: clean termination accomplished
>> >
>> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/
>> > bin/examples/   instrument/ man/
>> > etc/include/lib/share/
>> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orte
>> > orte-clean  orted   orte-ioforte-ps orterun
>> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
>> > [therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
>> file runtime/orte_init.c at line 125
>> >
>> --
>> > It looks like orte_init failed for some reason; your parallel process is
>> > likely to abort.  There are many reasons that a parallel process can
>> > fail during orte_init; some of which are due to configuration or
>> > environment problems.  This failure appears to be an internal failure;
>> > here's some additional information (which may only be relevant to an
>> > Open MPI developer):
>> >
>> >   orte_ess_base_select failed
>> >   --> Returned value Not found (-13) instead of ORTE_SUCCESS
>> >
>> --
>> > [therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
>> file orted/orted_main.c at line 325
>> > [jian@therock ~]$
>> >
>> >
>> > --
>> > Nehemiah I. Dacres
>> > System Administrator
>> > Advanced Technology Group Saint Louis University
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Nehemiah I. Dacres
> System Administrator
> Advanced Technology Group Saint Louis University
>
>  ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] mpi problems,

2011-04-04 Thread Nehemiah Dacres
I have installed it via a symlink on all of the nodes. I can run 'tentakel
which mpirun' and it finds it. I'll check the library paths, but isn't there
a way to find out which nodes are returning the error?
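
One way to narrow this down by hand, sketched under some assumptions: loop
over the hosts in the hostfile and ask each one whether the orted binary is
present and executable. The function name check_hosts_for_binary and the
REMOTE_SHELL variable are made up for this example. It assumes passwordless
ssh to every node and a hostfile with one host per line (optionally followed
by slots=N).

```sh
# Report, per host, whether a given binary exists and is executable.
# REMOTE_SHELL is a hypothetical override (defaults to ssh) so the loop
# can be tried out without a cluster.
check_hosts_for_binary() {
    hostfile=$1
    binary=$2
    remote=${REMOTE_SHELL:-ssh}
    missing=0
    # Take the first field of each line; skip blank lines and # comments.
    for host in $(awk '!/^[ \t]*(#|$)/ {print $1}' "$hostfile"); do
        if $remote "$host" test -x "$binary" </dev/null; then
            echo "$host: ok"
        else
            # Also reached when the host is unreachable (ssh returns nonzero).
            echo "$host: MISSING $binary"
            missing=1
        fi
    done
    return $missing
}
```

Usage would be something like
check_hosts_for_binary list /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
where "list" is the hostfile passed to mpirun.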


On Thu, Mar 31, 2011 at 7:30 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> The error message seems to imply that you don't have OMPI installed on all
> your nodes (because it didn't find /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted on a
> remote node).
>
>
> On Mar 30, 2011, at 4:24 PM, Nehemiah Dacres wrote:
>
> > I am trying to figure out why my jobs aren't getting distributed and need
> some help. I have an install of sun cluster tools on Rockscluster 5.2
> (essentially centos4u2). this user's account has its home dir shared via
> nfs. I am getting some strange errors. here's an example run
> >
> >
> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/mpirun -np 3 -hostfile
> list ./job2.sh
> > bash: /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted: No such file or directory
> >
> --
> > A daemon (pid 20362) died unexpectedly with status 127 while attempting
> > to launch so we are aborting.
> >
> > There may be more information reported by the environment (see above).
> >
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> the
> > location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> >
> --
> >
> --
> > mpirun noticed that the job aborted, but has no info as to the process
> > that caused that situation.
> >
> --
> > mpirun: clean termination accomplished
> >
> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/
> > bin/examples/   instrument/ man/
> > etc/include/lib/share/
> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orte
> > orte-clean  orted   orte-ioforte-ps orterun
> > [jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
> > [therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
> file runtime/orte_init.c at line 125
> >
> --
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems.  This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> >   orte_ess_base_select failed
> >   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> >
> --
> > [therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
> file orted/orted_main.c at line 325
> > [jian@therock ~]$
> >
> >
> > --
> > Nehemiah I. Dacres
> > System Administrator
> > Advanced Technology Group Saint Louis University
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


[OMPI users] mpi problems,

2011-03-30 Thread Nehemiah Dacres
I am trying to figure out why my jobs aren't getting distributed and need
some help. I have an install of Sun Cluster Tools on Rocks Cluster 5.2
(essentially centos4u2). This user's account has its home dir shared via
NFS. I am getting some strange errors. Here's an example run:


[jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/mpirun -np 3 -hostfile list ./job2.sh
bash: /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted: No such file or directory
--
A daemon (pid 20362) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
mpirun: clean termination accomplished

[jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/
bin/         examples/    instrument/  man/
etc/         include/     lib/         share/
[jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orte
orte-clean  orted       orte-iof    orte-ps     orterun
[jian@therock ~]$ /opt/SUNWhpc/HPC8.2.1c/sun/bin/orted
[therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
file runtime/orte_init.c at line 125
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--
[therock.slu.loc:20365] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
file orted/orted_main.c at line 325
[jian@therock ~]$
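
A note on the "status 127" in the output above: 127 is the exit status a
POSIX shell returns when it cannot find the command it was asked to run,
which fits the "No such file or directory" line printed for orted. A small
sketch that reproduces the status locally (the path below is a deliberately
nonexistent placeholder, not a real install location):

```sh
# Ask a shell to run a path that does not exist and capture its exit status.
sh -c '/nonexistent/path/to/orted' 2>/dev/null
status=$?
echo "exit status: $status"   # prints "exit status: 127"
```

So the daemon most likely never started on the remote side; the remote
shell simply could not find the binary at that path.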


-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Nehemiah Dacres
Thanks.
FYI: it's openmpi-1.4.2 from a tarball, as you assumed.
I changed this line:
 *Sun\ F* | *Sun*Fortran*)
  # Sun Fortran 8.3 passes all unrecognized flags to the linker
  _LT_TAGVAR(lt_prog_compiler_pic, $1)='-KPIC'
  _LT_TAGVAR(lt_prog_compiler_static, $1)='-Bstatic'
  _LT_TAGVAR(lt_prog_compiler_wl, $1)='-Qoption ld '

Unfortunately, my autoconf is out of date (2.59; it says it wants
2.60+).
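
For anyone repeating this, the edit to config/libtool.m4 can also be
scripted. This is only a sketch of the change described in the ticket, not
an official patch: the function name patch_sun_fortran_wl is made up, and it
assumes the Sun Fortran block starts with a case pattern containing
"Sun\ F" and ends at the next ";;".

```sh
# Set lt_prog_compiler_wl to '-Qoption ld ' inside every Sun Fortran block
# of the given libtool.m4. A copy of the original is kept as <file>.orig.
patch_sun_fortran_wl() {
    m4file=$1
    cp "$m4file" "$m4file.orig" || return 1
    # The range /Sun\\ F/,/;;/ limits the substitution to the Sun Fortran
    # case branch, so other compilers' empty wl settings are left alone.
    sed "/Sun\\\\ F/,/;;/ s/(lt_prog_compiler_wl, \\\$1)=''/(lt_prog_compiler_wl, \$1)='-Qoption ld '/" \
        "$m4file.orig" > "$m4file"
}
```

Run it as patch_sun_fortran_wl config/libtool.m4 from the top of the source
tree. The change only takes effect after ./autogen.sh has regenerated
configure, which is where the autoconf version requirement comes in.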


On Mon, Nov 29, 2010 at 4:11 PM, Rolf vandeVaart <rolf.vandeva...@oracle.com
> wrote:

>  No, I do not believe so.  First, I assume you are trying to build either
> 1.4 or 1.5, not the trunk.
> Secondly, I assume you are building from a tarfile that you have
> downloaded.  Assuming these
> two things are true, then (as stated in the bug report), prior to running
> configure, you want to
> make the following edits to config/libtool.m4 in all the places you see it.
> ( I think just one place)
>
> FROM:
>
> *Sun\ F*)
>   # Sun Fortran 8.3 passes all unrecognized flags to the linker
>   _LT_TAGVAR(lt_prog_compiler_pic, $1)='-KPIC'
>   _LT_TAGVAR(lt_prog_compiler_static, $1)='-Bstatic'
>   _LT_TAGVAR(lt_prog_compiler_wl, $1)=''
>   ;;
>
> TO:
>
> *Sun\ F*)
>   # Sun Fortran 8.3 passes all unrecognized flags to the linker
>   _LT_TAGVAR(lt_prog_compiler_pic, $1)='-KPIC'
>   _LT_TAGVAR(lt_prog_compiler_static, $1)='-Bstatic'
>   _LT_TAGVAR(lt_prog_compiler_wl, $1)='-Qoption ld '
>   ;;
>
>
>
> Note the difference in the lt_prog_compiler_wl line.
>
I ran ./configure anyway, but I don't think it did anything

>
> Then, you need to run ./autogen.sh.  Then, redo your configure but you do
> not need to do anything
> with LDFLAGS.  Just use your original flags.  I think this should work, but
> I am only reading
> what is in the ticket.
>
> Rolf
>
>
>
> On 11/29/10 16:26, Nehemiah Dacres wrote:
>
> that looks about right. So the suggestion:
>
> ./configure LDFLAGS="-notpath ... ... ..."
>
> -notpath should be replaced by whatever the proper flag should be, in my case 
> -L ?
>
>
>
>
> On Mon, Nov 29, 2010 at 3:16 PM, Rolf vandeVaart <
> rolf.vandeva...@oracle.com> wrote:
>
>> This problem looks a lot like a thread from earlier today.  Can you look
>> at this
>> ticket and see if it helps?  It has a workaround documented in it.
>>
>> https://svn.open-mpi.org/trac/ompi/ticket/2632
>>
>> Rolf
>>
>>
>> On 11/29/10 16:13, Prentice Bisbal wrote:
>>
>> No, it looks like ld is being called with the option -path, and your
>> linker doesn't use that switch. Grep you Makefile(s) for the string
>> "-path". It's probably in a statement defining LDFLAGS somewhere.
>>
>> When you find it, replace it with the equivalent switch for your
>> compiler. You may be able to override it's value on the configure
>> command-line, which is usually easiest/best:
>>
>> ./configure LDFLAGS="-notpath ... ... ..."
>>
>> --
>> Prentice
>>
>>
>> Nehemiah Dacres wrote:
>>
>>
>>  it may have been that  I didn't set ld_library_path
>>
>> On Mon, Nov 29, 2010 at 2:36 PM, Nehemiah Dacres <dacre...@slu.edu> wrote:
>>
>> thank you, you have been doubly helpful, but I am having linking
>> errors and I do not know what the solaris studio compiler's
>> preferred linker is. The
>>
>> the configure statement was
>>
>> ./configure --prefix=/state/partition1/apps/sunmpi/
>> --enable-mpi-threads --with-sge --enable-static
>> --enable-sparse-groups CC=/opt/oracle/solstudio12.2/bin/suncc
>> CXX=/opt/oracle/solstudio12.2/bin/sunCC
>> F77=/opt/oracle/solstudio12.2/bin/sunf77
>> FC=/opt/oracle/solstudio12.2/bin/sunf90
>>
>>compile statement was
>>
>> make all install 2>errors
>>
>>
>> error below is
>>
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -soname passed to ld, if ld is invoked, ignored
>> otherwise
>> /usr/bin/ld: unrecognized option '-path'
>> /usr/bin/ld: use the

Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Nehemiah Dacres
I believe the user specifically wishes to use the special debugging tools in
Solaris Studio. The flag in question seems to be -rpath, according to the
logs. It would be suspicious if this were a flag for the Solaris linker. I
don't have access to any Solaris machines, but I may try to make a virtual
install to investigate.



Hi Nehemiah

Hard to tell, I never tried Sun/Oracle Studio compilers.
However, the Intel compilers, for instance, require you to setup
environment variables that include PATH and LD_LIBRARY_PATH at least.
Would this be the case with Sun Studio?
Do you have its full environment set?
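
As a rough illustration of what "full environment" could mean here, the
sketch below prepends the Studio directories to PATH and LD_LIBRARY_PATH.
The install prefix is the one mentioned in this thread; the helper name
prepend_path is made up, and whether Sun Studio needs anything beyond these
two variables is not something this sketch claims to know.

```sh
# Prepend a directory to a colon-separated variable unless already present.
prepend_path() {
    var=$1
    dir=$2
    eval "cur=\${$var:-}"
    case ":$cur:" in
        *":$dir:"*) ;;   # already listed, nothing to do
        *) eval "$var=\"\$dir\${cur:+:\$cur}\"; export $var" ;;
    esac
}

STUDIO=/opt/oracle/solstudio12.2   # prefix taken from the configure line in this thread
prepend_path PATH "$STUDIO/bin"
prepend_path LD_LIBRARY_PATH "$STUDIO/lib"
```

Putting this in a file sourced from the shell startup scripts on every node
would make the same settings visible to remotely launched processes.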

As for the error message,
indeed, "man ld" doesn't show "-path" as a possible option.
Would this be a "Solaris thing", perhaps an option
to the Solaris linker?

For what it is worth, OpenMPI compiles with gcc,g++ and gfortran,
which may be a workaround for you, if you want to stick to free compilers.
Likewise, it also compiles with Open64 compilers, although later
on I had trouble with the Open64 Fortran compiler (not to compile OpenMPI,
but MPI applications).
Do you have any specific requirement for Sun/Oracle software?

OpenMPI also compiles with Intel and PGI compilers,
but those aren't free.

Finally, make sure you are passing the Sun compilers to the OpenMPI
configure script correctly.
Somehow your warning messages are labeled "f90", not "sunf90" as I
would expect, but this may be just the way Sun decided to spell their
own error messages.

If you are in Rocks, better install the compilers in /share/apps,
not in /opt as it is now.
That will make the Sun compilers and their possible shared libraries
available to all nodes.
/share/apps is the right place to install mostly anything that doesn't
come in the Rocks/CentOS distribution.

Good luck,


Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Nehemiah Dacres
I put the ld flag on the command line ( ./configure
--prefix=/state/partition1/apps/sunmpi/ --enable-mpi-threads --with-sge
--enable-static --enable-sparse-groups
CC=/opt/oracle/solstudio12.2/bin/suncc
CXX=/opt/oracle/solstudio12.2/bin/sunCC
F77=/opt/oracle/solstudio12.2/bin/sunf77
FC=/opt/oracle/solstudio12.2/bin/sunf90
LD_LIBRARY_PATH=/opt/oracle/solstudio12.2/lib/amd64/lib CFLAGS=-m64
CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64
LDFLAGS=-L/opt/oracle/solstudio12.2/lib/amd64/lib)
which may have been redundant but it still didn't work.

the last line before the same error is thus:

libtool: link: /opt/oracle/solstudio12.2/bin/sunf90 -G  .libs/mpi.o
.libs/mpi_sizeof.o .libs/mpi_comm_spawn_multiple_f90.o
.libs/mpi_testall_f90.o .libs/mpi_testsome_f90.o .libs/mpi_waitall_f90.o
.libs/mpi_waitsome_f90.o .libs/mpi_wtick_f90.o .libs/mpi_wtime_f90.o
-rpath /home/dacresni/openmpi/openmpi-1.4.2/ompi/.libs -rpath
/home/dacresni/openmpi/openmpi-1.4.2/orte/.libs -rpath
/home/dacresni/openmpi/openmpi-1.4.2/opal/.libs -rpath
/opt/oracle/solstudio12.2/lib
-L/home/dacresni/openmpi/openmpi-1.4.2/orte/.libs
-L/home/dacresni/openmpi/openmpi-1.4.2/opal/.libs
-L/opt/oracle/solstudio12.2/lib/amd64/lib ../../../ompi/.libs/libmpi.so
/home/dacresni/openmpi/openmpi-1.4.2/orte/.libs/libopen-rte.so
/home/dacresni/openmpi/openmpi-1.4.2/opal/.libs/libopen-pal.so -ldl -lnsl
-lutil -lm  -m64   -mt -soname libmpi_f90.so.0 -o .libs/libmpi_f90.so.0.0.0


which, if I'm not mistaken, is specifically what I told it NOT to do.

On Mon, Nov 29, 2010 at 3:26 PM, Nehemiah Dacres <dacre...@slu.edu> wrote:

> that looks about right. So the suggestion:
>
> ./configure LDFLAGS="-notpath ... ... ..."
>
> -notpath should be replaced by whatever the proper flag should be, in my case 
> -L ?
>
>
> On Mon, Nov 29, 2010 at 3:16 PM, Rolf vandeVaart <
> rolf.vandeva...@oracle.com> wrote:
>
>>  This problem looks a lot like a thread from earlier today.  Can you look
>> at this
>> ticket and see if it helps?  It has a workaround documented in it.
>>
>> https://svn.open-mpi.org/trac/ompi/ticket/2632
>>
>> Rolf
>>
>>
>> On 11/29/10 16:13, Prentice Bisbal wrote:
>>
>> No, it looks like ld is being called with the option -path, and your
>> linker doesn't use that switch. Grep you Makefile(s) for the string
>> "-path". It's probably in a statement defining LDFLAGS somewhere.
>>
>> When you find it, replace it with the equivalent switch for your
>> compiler. You may be able to override it's value on the configure
>> command-line, which is usually easiest/best:
>>
>> ./configure LDFLAGS="-notpath ... ... ..."
>>
>> --
>> Prentice
>>
>>
>> Nehemiah Dacres wrote:
>>
>>
>>  it may have been that  I didn't set ld_library_path
>>
>> On Mon, Nov 29, 2010 at 2:36 PM, Nehemiah Dacres <dacre...@slu.edu> wrote:
>>
>> thank you, you have been doubly helpful, but I am having linking
>> errors and I do not know what the solaris studio compiler's
>> preferred linker is. The
>>
>> the configure statement was
>>
>> ./configure --prefix=/state/partition1/apps/sunmpi/
>> --enable-mpi-threads --with-sge --enable-static
>> --enable-sparse-groups CC=/opt/oracle/solstudio12.2/bin/suncc
>> CXX=/opt/oracle/solstudio12.2/bin/sunCC
>> F77=/opt/oracle/solstudio12.2/bin/sunf77
>> FC=/opt/oracle/solstudio12.2/bin/sunf90
>>
>>compile statement was
>>
>> make all install 2>errors
>>
>>
>> error below is
>>
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
>> otherwise
>> f90: Warning: Option -soname passed to ld, if ld is invoked, ignored
>> otherwise
>> /usr/bin/ld: unrecognized option '-path'
>> /usr/bin/ld: use the --help option for usage information
>> make[4]: *** [libmpi_f90.la] Error 2
>> make[3]: *** [all-recursive] Error 1
>> make[2]: *** [all] Error 2
>> make[1]: *** [all-recursive] Error 1
>> make: *** [all-recursive] Error 1
>>
>> am I doing this wrong? are any of those configure flags unnecessary
>> or inappropriate
>>
>>
>>
>

Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Nehemiah Dacres
that looks about right. So the suggestion:

./configure LDFLAGS="-notpath ... ... ..."

-notpath should be replaced by whatever the proper flag should be, in
my case -L ?


On Mon, Nov 29, 2010 at 3:16 PM, Rolf vandeVaart <rolf.vandeva...@oracle.com
> wrote:

>  This problem looks a lot like a thread from earlier today.  Can you look
> at this
> ticket and see if it helps?  It has a workaround documented in it.
>
> https://svn.open-mpi.org/trac/ompi/ticket/2632
>
> Rolf
>
>
> On 11/29/10 16:13, Prentice Bisbal wrote:
>
> No, it looks like ld is being called with the option -path, and your
> linker doesn't use that switch. Grep you Makefile(s) for the string
> "-path". It's probably in a statement defining LDFLAGS somewhere.
>
> When you find it, replace it with the equivalent switch for your
> compiler. You may be able to override it's value on the configure
> command-line, which is usually easiest/best:
>
> ./configure LDFLAGS="-notpath ... ... ..."
>
> --
> Prentice
>
>
> Nehemiah Dacres wrote:
>
>
>  it may have been that  I didn't set ld_library_path
>
> On Mon, Nov 29, 2010 at 2:36 PM, Nehemiah Dacres <dacre...@slu.edu> wrote:
>
> thank you, you have been doubly helpful, but I am having linking
> errors and I do not know what the solaris studio compiler's
> preferred linker is. The
>
> the configure statement was
>
> ./configure --prefix=/state/partition1/apps/sunmpi/
> --enable-mpi-threads --with-sge --enable-static
> --enable-sparse-groups CC=/opt/oracle/solstudio12.2/bin/suncc
> CXX=/opt/oracle/solstudio12.2/bin/sunCC
> F77=/opt/oracle/solstudio12.2/bin/sunf77
> FC=/opt/oracle/solstudio12.2/bin/sunf90
>
>compile statement was
>
> make all install 2>errors
>
>
> error below is
>
> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
> otherwise
> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
> otherwise
> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
> otherwise
> f90: Warning: Option -path passed to ld, if ld is invoked, ignored
> otherwise
> f90: Warning: Option -soname passed to ld, if ld is invoked, ignored
> otherwise
> /usr/bin/ld: unrecognized option '-path'
> /usr/bin/ld: use the --help option for usage information
> make[4]: *** [libmpi_f90.la] Error 2
> make[3]: *** [all-recursive] Error 1
> make[2]: *** [all] Error 2
> make[1]: *** [all-recursive] Error 1
> make: *** [all-recursive] Error 1
>
> am I doing this wrong? are any of those configure flags unnecessary
> or inappropriate
>
>
>
> On Mon, Nov 29, 2010 at 2:06 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>
> Nehemiah Dacres wrote:
>
> I want to compile openmpi to work with the solaris studio
> express  or
> solaris studio. This is a different version than is installed on
> rockscluster 5.2  and would like to know if there any
> gotchas or configure
> flags I should use to get it working or portable to nodes on
> the cluster.
> Software-wise,  it is a fairly homogeneous environment with
> only slight
> variations on the hardware side which could be isolated
> (machinefile flag
> and what-not)
> Please advise
>
>
> Hi Nehemiah
> I just answered your email to the OpenMPI list.
> I want to add that if you build OpenMPI with Torque support,
> the machine file for each is not needed, it is provided by Torque.
> I believe the same is true for SGE (but I don't use SGE).
> Gus Correa
>
>
>
>
> --
> Nehemiah I. Dacres
> System Administrator
> Advanced Technology Group Saint Louis University
>
>
>
>
> --
> Nehemiah I. Dacres
> System Administrator
> Advanced Technology Group Saint Louis University
>
>
> 
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Nehemiah Dacres
Thank you, you have been doubly helpful, but I am having linking errors and
I do not know what the Solaris Studio compiler's preferred linker is.

The configure statement was

./configure --prefix=/state/partition1/apps/sunmpi/ --enable-mpi-threads
--with-sge --enable-static --enable-sparse-groups
CC=/opt/oracle/solstudio12.2/bin/suncc
CXX=/opt/oracle/solstudio12.2/bin/sunCC
F77=/opt/oracle/solstudio12.2/bin/sunf77
FC=/opt/oracle/solstudio12.2/bin/sunf90

The compile statement was

make all install 2>errors


The error is below:

f90: Warning: Option -path passed to ld, if ld is invoked, ignored otherwise
f90: Warning: Option -path passed to ld, if ld is invoked, ignored otherwise
f90: Warning: Option -path passed to ld, if ld is invoked, ignored otherwise
f90: Warning: Option -path passed to ld, if ld is invoked, ignored otherwise
f90: Warning: Option -soname passed to ld, if ld is invoked, ignored
otherwise
/usr/bin/ld: unrecognized option '-path'
/usr/bin/ld: use the --help option for usage information
make[4]: *** [libmpi_f90.la] Error 2
make[3]: *** [all-recursive] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

Am I doing this wrong? Are any of those configure flags unnecessary or
inappropriate?



On Mon, Nov 29, 2010 at 2:06 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Nehemiah Dacres wrote:
>
>> I want to compile openmpi to work with the solaris studio express  or
>> solaris studio. This is a different version than is installed on
>> rockscluster 5.2  and would like to know if there any gotchas or configure
>> flags I should use to get it working or portable to nodes on the cluster.
>> Software-wise,  it is a fairly homogeneous environment with only slight
>> variations on the hardware side which could be isolated (machinefile flag
>> and what-not)
>> Please advise
>>
>>
> Hi Nehemiah
> I just answered your email to the OpenMPI list.
> I want to add that if you build OpenMPI with Torque support,
> the machine file for each is not needed, it is provided by Torque.
> I believe the same is true for SGE (but I don't use SGE).
> Gus Correa
>



-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


[OMPI users] compiling Openmpi on solaris studio express

2010-11-29 Thread Nehemiah Dacres
I want to compile Open MPI to work with Solaris Studio Express or
Solaris Studio. This is a different version than is installed on
Rocks Cluster 5.2, and I would like to know if there are any gotchas or
configure flags I should use to get it working or portable to nodes on the
cluster. Software-wise, it is a fairly homogeneous environment with only
slight variations on the hardware side, which could be isolated (machinefile
flag and what-not).
Please advise.

-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University


[OMPI users] sun compilers

2010-11-19 Thread Nehemiah Dacres
Is there a searchable archive of this mailing list?

I am helping someone use Open MPI with Sun's compilers that came with
Solaris Studio. I used --showme with mpif90 and got this:

gfortran -I/opt/openmpi/include -pthread -I/opt/openmpi/lib ring_f90.f90
-L/opt/openmpi/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl
-Wl,--export-dynamic -lnsl -lutil -lm -ldl

That line compiles fine and so does the mpif90 command, but when I replace
gfortran with sunf90 or the absolute path to my Solaris Studio compilers I
get this:

$ f90 -I/opt/openmpi/include -pthread -I/opt/openmpi/lib ring_f90.f90
-L/opt/openmpi/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl
-Wl,--export-dynamic -lnsl -lutil -lm -ldl
f90: Warning: Option -pthread passed to ld, if ld is invoked, ignored
otherwise
f90: Warning: Option -Wl,--export-dynamic passed to ld, if ld is invoked,
ignored otherwise

  use mpi
  ^
"ring_f90.f90", Line = 10, Column = 7: ERROR: "MPI" is specified as the
module name on a USE statement, but the compiler cannot find it.

  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
 ^
"ring_f90.f90", Line = 17, Column = 22: ERROR: IMPLICIT NONE is specified in
the local scope, therefore an explicit type must be specified for data
object "MPI_COMM_WORLD".

 call MPI_SEND(message, 1, MPI_INTEGER, next, tag, MPI_COMM_WORLD, ierr)
   ^

"ring_f90.f90", Line = 34, Column = 32: ERROR: IMPLICIT NONE is specified in
the local scope, therefore an explicit type must be specified for data
object "MPI_INTEGER".

MPI_STATUS_IGNORE, ierr)
^
"ring_f90.f90", Line = 46, Column = 9: ERROR: IMPLICIT NONE is specified in
the local scope, therefore an explicit type must be specified for data
object "MPI_STATUS_IGNORE".

f90comp: 73 SOURCE LINES
f90comp: 4 ERRORS, 0 WARNINGS, 0 OTHER MESSAGES, 0 ANSI

and the file contains this (from cat ring_f90.f90 ):

!
! Copyright (c) 2004-2006 The Trustees of Indiana University and Indiana
! University Research and Technology
! Corporation.  All rights reserved.
! Copyright (c) 2006  Cisco Systems, Inc.  All rights reserved.
!
! Simple ring test program
!
program ring
  use mpi
  implicit none
  integer :: rank, size, tag, next, from, message, ierr

! Start up MPI

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)

! Calculate the rank of the next process in the ring.  Use the modulus
! operator so that the last process "wraps around" to rank zero.

  tag = 201
  next = mod((rank + 1), size)
  from = mod((rank + size - 1), size)

! If we are the "master" process (i.e., MPI_COMM_WORLD rank 0), put
! the number of times to go around the ring in the message.

  if (rank .eq. 0) then
     message = 10

     print *, 'Process 0 sending ', message, ' to ', next, ' tag ', tag, &
          ' (', size, ' processes in ring)'
     call MPI_SEND(message, 1, MPI_INTEGER, next, tag, MPI_COMM_WORLD, ierr)
     print *, 'Process 0 sent to ', next
  endif

! Pass the message around the ring.  The exit mechanism works as
! follows: the message (a positive integer) is passed around the ring.
! Each time it passes rank 0, it is decremented.  When each processes
! receives a message containing a 0 value, it passes the message on to
! the next process and then quits.  By passing the 0 message first,
! every process gets the 0 message and can quit normally.

10 call MPI_RECV(message, 1, MPI_INTEGER, from, tag, MPI_COMM_WORLD, &
        MPI_STATUS_IGNORE, ierr)

  if (rank .eq. 0) then
     message = message - 1
     print *, 'Process 0 decremented value:', message
  endif

  call MPI_SEND(message, 1, MPI_INTEGER, next, tag, MPI_COMM_WORLD, ierr)

  if (message .eq. 0) then
     print *, 'Process ', rank, ' exiting'
     goto 20
  endif
  goto 10

! The last process does one extra send to process 0, which needs to be
! received before the program can exit

20 if (rank .eq. 0) then
     call MPI_RECV(message, 1, MPI_INTEGER, from, tag, MPI_COMM_WORLD, &
          MPI_STATUS_IGNORE, ierr)
  endif

! All done

  call MPI_FINALIZE(ierr)
end program


Now, I must warn you: I don't know Fortran, but I am supporting someone who
does. I have them CC'd.
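
Some background that may explain the errors above: Fortran module files
(the mpi.mod that "use mpi" looks for) are specific to the compiler that
produced them, so an Open MPI built with gfortran generally cannot serve
"use mpi" to a different Fortran compiler. Flags such as -pthread and
-Wl,--export-dynamic are also GNU-style and not understood by Sun's f90.
The sketch below just asks the wrapper which compiler and flags it was
built for; the function name show_wrapper_compiler is made up, while the
--showme:command, --showme:compile and --showme:link options are the Open
MPI wrapper options as I understand them.

```sh
# Print the underlying compiler and flags of an Open MPI wrapper compiler.
show_wrapper_compiler() {
    wrapper=${1:-mpif90}
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper not found in PATH" >&2
        return 1
    fi
    echo "underlying compiler: $("$wrapper" --showme:command)"
    echo "compile flags:       $("$wrapper" --showme:compile)"
    echo "link flags:          $("$wrapper" --showme:link)"
}
```

If show_wrapper_compiler mpif90 reports gfortran, the usual way forward is a
separate Open MPI build configured with the Sun compilers, rather than
swapping the compiler name into the gfortran command line.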

-- 
Nehemiah I. Dacres
System Administrator
Advanced Technology Group Saint Louis University