[OMPI users] Open Mpi execution error

2014-04-07 Thread Kamal

Hello Open MPI,

I installed open-mpi with
./configure --prefix=/usr/local/
make all
make install

then I launched my sample code, which gave me the error below.
My LD_LIBRARY_PATH=/usr/local/

I have attached the output file to this mail.
Could someone please help me with this?


Thanks,

Kamal


[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_ess_slurmd: 
dlopen(/usr/local/lib/openmpi/mca_ess_slurmd.so, 9): Symbol not found: 
_orte_jmap_t_class

  Referenced from: /usr/local/lib/openmpi/mca_ess_slurmd.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_ess_slurmd.so (ignored)

[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_errmgr_default: 
dlopen(/usr/local/lib/openmpi/mca_errmgr_default.so, 9): Symbol not found: 
_orte_errmgr_base_error_abort

  Referenced from: /usr/local/lib/openmpi/mca_errmgr_default.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_errmgr_default.so (ignored)

[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_routed_cm: 
dlopen(/usr/local/lib/openmpi/mca_routed_cm.so, 9): Symbol not found: 
_orte_message_event_t_class

  Referenced from: /usr/local/lib/openmpi/mca_routed_cm.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_routed_cm.so (ignored)

[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_routed_linear: 
dlopen(/usr/local/lib/openmpi/mca_routed_linear.so, 9): Symbol not found: 
_orte_message_event_t_class

  Referenced from: /usr/local/lib/openmpi/mca_routed_linear.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_routed_linear.so (ignored)

[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_grpcomm_basic: 
dlopen(/usr/local/lib/openmpi/mca_grpcomm_basic.so, 9): Symbol not found: 
_opal_profile

  Referenced from: /usr/local/lib/openmpi/mca_grpcomm_basic.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_grpcomm_basic.so (ignored)

[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_grpcomm_hier: 
dlopen(/usr/local/lib/openmpi/mca_grpcomm_hier.so, 9): Symbol not found: 
_orte_daemon_cmd_processor

  Referenced from: /usr/local/lib/openmpi/mca_grpcomm_hier.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_grpcomm_hier.so (ignored)

[BOW.local:33757] mca: base: component_find: unable to open 
/usr/local/lib/openmpi/mca_filem_rsh: 
dlopen(/usr/local/lib/openmpi/mca_filem_rsh.so, 9): Symbol not found: 
_opal_uses_threads

  Referenced from: /usr/local/lib/openmpi/mca_filem_rsh.so

  Expected in: flat namespace

 in /usr/local/lib/openmpi/mca_filem_rsh.so (ignored)

[BOW:33757] *** Process received signal ***

[BOW:33757] Signal: Segmentation fault: 11 (11)

[BOW:33757] Signal code: Address not mapped (1)

[BOW:33757] Failing at address: 0x10013

[BOW:33757] [ 0] 0   libsystem_platform.dylib0x7fff843975aa 
_sigtramp + 26

[BOW:33757] [ 1] 0   ??? 0x7fff57eafee8 0x0 
+ 140734668406504

[BOW:33757] [ 2] 0   libopen-pal.6.dylib 0x000107e1790a 
opal_libevent2021_event_base_loop + 2218

[BOW:33757] [ 3] 0   mpiexec 0x000107d516f3 
orterun + 5843

[BOW:33757] [ 4] 0   mpiexec 0x000107d50002 
main + 34

[BOW:33757] [ 5] 0   libdyld.dylib   0x7fff89b6c5fd 
start + 1

[BOW:33757] [ 6] 0   ??? 0x0004 0x0 
+ 4

[BOW:33757] *** End of error message ***

./nekmpi: line 8: 33757 Segmentation fault: 11  mpiexec -np $2 ./nek5000





Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Ralph Castain
What version of OMPI are you attempting to install?

Also, using /usr/local as your prefix is a VERY, VERY BAD idea. Most OS 
distributions come with a (typically old) version of OMPI installed in the 
system area. Overlaying that with another version can easily lead to the errors 
you show.

You should always install to a user-level directory and then set your path and 
ld_library_path to start with that location.
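[Editor's note: Ralph's advice can be sketched in a few shell lines. The prefix below is a hypothetical example, not a path from this thread; adjust it to any directory you own.]

```shell
# Hypothetical user-level prefix; any directory you own will do.
PREFIX="$HOME/opt/openmpi"

# Build and install (commented out here; run these in the Open MPI source tree):
#   ./configure --prefix="$PREFIX"
#   make -j2
#   make install

# Put the new install FIRST on both search paths so it shadows any
# system-provided Open MPI instead of mixing components with it.
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib:$LD_LIBRARY_PATH"

# Sanity check: the first PATH entry should now be the new bin directory.
echo "${PATH%%:*}"
```

Mixing plugins from two installs is exactly what produces "Symbol not found" dlopen errors like the ones above, so the ordering of these exports matters.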


On Apr 6, 2014, at 8:30 AM, Kamal  wrote:

> Hello Open MPI,
> 
> I installed open-mpi with 
> ./configure --prefix=/usr/local/
> make all 
> make install 
> 
> then I launched my sample code which gave me this error 
> my LD_LIBRARY_PATH=/usr/local/
> 
> I have attached the output file with this mail 
> could some one please help me with this . 
> 
> 
> Thanks,
> 
> Kamal 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Hamid Saeed
Hello,

I also had the same problem, but it worked when I re-installed MPI using:

./configure --prefix=/usr/local/
make -j2
make install






On Sun, Apr 6, 2014 at 5:30 PM, Kamal  wrote:

>  Hello Open MPI,
>
>  I installed open-mpi with
> ./configure --prefix=/usr/local/
>  make all
> make install
>
>  then I launched my sample code which gave me this error
> my LD_LIBRARY_PATH=/usr/local/
>
>  I have attached the output file with this mail
> could some one please help me with this .
>
>
>  Thanks,
>
>  Kamal
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 

___

Hamid Saeed
CoSynth GmbH & Co. KG
Escherweg 2 - 26121 Oldenburg - Germany

Tel +49 441 9722 738 | Fax -278
http://www.cosynth.com

___


Re: [OMPI users] performance of MPI_Iallgatherv

2014-04-07 Thread Nathan Hjelm
There is no async progress in Open MPI at this time so this is the
expected behavior. We plan to fix this for the 1.9 release series.

-Nathan Hjelm
HPC-5, LANL

On Mon, Apr 07, 2014 at 11:12:06AM +0800, Zehan Cui wrote:
> Hi Matthieu,
> 
> Thanks for your suggestion. I tried MPI_Waitall(), but the results are
> the same. It seems the communication didn't overlap with computation.
> 
> Regards,
> Zehan
> 
> On 4/5/14, Matthieu Brucher  wrote:
> > Hi,
> >
> > Try waiting on all gathers at the same time, not one by one (this is
> > what non blocking collectives are made for!)
> >
> > Cheers,
> >
> > Matthieu
> >
> > 2014-04-05 10:35 GMT+01:00 Zehan Cui :
> >> Hi,
> >>
> >> I'm testing the non-blocking collective of OpenMPI-1.8.
> >>
> >> I have two nodes with Infiniband to perform allgather on totally 128MB
> >> data.
> >>
> >> I split the 128MB data into eight pieces, and perform computation and
> >> MPI_Iallgatherv() on one piece of data each iteration, hoping that the
> >> MPI_Iallgatherv() of last iteration can be overlapped with computation of
> >> current iteration. A MPI_Wait() is called at the end of last iteration.
> >>
> >> However, the total communication time (including the final wait time) is
> >> similar with that of the traditional blocking MPI_Allgatherv, even
> >> slightly
> >> higher.
> >>
> >>
> >> Following is the test pseudo-code, the source code are attached.
> >>
> >> ===
> >>
> >> Using MPI_Allgatherv:
> >>
> >> for( i=0; i<8; i++ )
> >> {
> >>   // computation
> >> mytime( t_begin );
> >> computation;
> >> mytime( t_end );
> >> comp_time += (t_end - t_begin);
> >>
> >>   // communication
> >> t_begin = t_end;
> >> MPI_Allgatherv();
> >> mytime( t_end );
> >> comm_time += (t_end - t_begin);
> >> }
> >> 
> >>
> >> Using MPI_Iallgatherv:
> >>
> >> for( i=0; i<8; i++ )
> >> {
> >>   // computation
> >> mytime( t_begin );
> >> computation;
> >> mytime( t_end );
> >> comp_time += (t_end - t_begin);
> >>
> >>   // communication
> >> t_begin = t_end;
> >> MPI_Iallgatherv();
> >> mytime( t_end );
> >> comm_time += (t_end - t_begin);
> >> }
> >>
> >> // wait for non-blocking allgather to complete
> >> mytime( t_begin );
> >> for( i=0; i<8; i++ )
> >> MPI_Wait;
> >> mytime( t_end );
> >> wait_time = t_end - t_begin;
> >>
> >> ==
> >>
> >> The results of Allgatherv is:
> >> [cmy@gnode102 test_nbc]$ /home3/cmy/czh/opt/ompi-1.8/bin/mpirun -n 2
> >> --host
> >> gnode102,gnode103 ./Allgatherv 128 2 | grep time
> >> Computation time  : 8481279 us
> >> Communication time: 319803 us
> >>
> >> The results of Iallgatherv is:
> >> [cmy@gnode102 test_nbc]$ /home3/cmy/czh/opt/ompi-1.8/bin/mpirun -n 2
> >> --host
> >> gnode102,gnode103 ./Iallgatherv 128 2 | grep time
> >> Computation time  : 8479177 us
> >> Communication time: 199046 us
> >> Wait time:  139841 us
> >>
> >>
> >> So, does this mean that current OpenMPI implementation of MPI_Iallgatherv
> >> doesn't support offloading of collective communication to dedicated cores
> >> or
> >> network interface?
> >>
> >> Best regards,
> >> Zehan
> >>
> >>
> >>
> >>
> >>
> >> ___
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> >
> > --
> > Information System Engineer, Ph.D.
> > Blog: http://matt.eifelle.com
> > LinkedIn: http://www.linkedin.com/in/matthieubrucher
> > Music band: http://liliejay.com/
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> 
> 
> -- 
> Best Regards
> Zehan Cui(崔泽汉)
> ---
> Institute of Computing Technology, Chinese Academy of Sciences.
> No.6 Kexueyuan South Road Zhongguancun,Haidian District Beijing,China
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Kamal

Hi Hamid,

So I can uninstall just by typing 'make uninstall', right?

What does 'make -j2' do?

Thanks for your reply,
Kamal

On 07/04/2014 17:38, Hamid Saeed wrote:

Hello,

I also had a same problem.

but when i re-installed MPI using

./configure --prefix=/usr/local/
make -j2
make install

it worked.





On Sun, Apr 6, 2014 at 5:30 PM, Kamal > wrote:


Hello Open MPI,

I installed open-mpi with
./configure --prefix=/usr/local/
make all
make install

then I launched my sample code which gave me this error
my LD_LIBRARY_PATH=/usr/local/

I have attached the output file with this mail
could some one please help me with this .


Thanks,

Kamal



___
users mailing list
us...@open-mpi.org 
http://www.open-mpi.org/mailman/listinfo.cgi/users




--

___

Hamid Saeed

CoSynth GmbH & Co. KG
Escherweg 2 - 26121 Oldenburg - Germany

Tel +49 441 9722 738 | Fax -278
http://www.cosynth.com 

___



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Openmpi 1.8 "rmaps seq" doesn't work

2014-04-07 Thread Ralph Castain
Looks like bit-rot has struck the sequential mapper support - I'll revive it 
for 1.8.1

On Apr 6, 2014, at 7:17 PM, Chen Bill  wrote:

> Hi ,
> 
> I just tried the openmpi 1.8, but I found the feature --mca rmaps seq doesn't 
> work.
> 
> for example,
> 
> >mpirun -np 4 -hostfile hostsfle --mca rmaps seq hostname
> 
> It shows below error,
> 
> --
> Your job failed to map. Either no mapper was available, or none
> of the available mappers was able to perform the requested
> mapping operation. This can happen if you request a map type
> (e.g., loadbalance) and the corresponding mapper was not built.
> --
> 
> but when I use ompi_info ,it shows has this feature
> 
> 
> >ompi_info |grep -i rmaps
>MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: mindist (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8)
>MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8)
> 
> Any suggestions?
> 
> Many thanks,
> Bill
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Kamal

Hi Ralph,

I use OMPI 1.8 for Macbook OS X mavericks.

As you said I will create a new directory to install my MPI files.

Thanks for your reply,

Kamal.

On 07/04/2014 17:37, Ralph Castain wrote:

What version of OMPI are you attempting to install?

Also, using /usr/local as your prefix is a VERY, VERY BAD idea. Most 
OS distributions come with a (typically old) version of OMPI installed 
in the system area. Overlaying that with another version can easily 
lead to the errors you show.


You should always install to a user-level directory and then set your 
path and ld_library_path to start with that location



On Apr 6, 2014, at 8:30 AM, Kamal > wrote:



Hello Open MPI,

I installed open-mpi with
./configure --prefix=/usr/local/
make all
make install

then I launched my sample code which gave me this error
my LD_LIBRARY_PATH=/usr/local/

I have attached the output file with this mail
could some one please help me with this .


Thanks,

Kamal


___
users mailing list
us...@open-mpi.org 
http://www.open-mpi.org/mailman/listinfo.cgi/users




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Ralph Castain
Nope - make uninstall will not clean everything out, which is one reason we 
don't recommend putting things in a system directory

On Apr 6, 2014, at 8:44 AM, Kamal  wrote:

> Hi Hamid, 
> 
> So I can uninstall just by typing 
> 
> ' make uninstall ' right ? 
> 
> what does ' make -j2 ' do ?
> 
> Thanks for your reply, 
> Kamal
> 
> On 07/04/2014 17:38, Hamid Saeed wrote:
>> Hello,
>> 
>> I also had a same problem.
>> 
>> but when i re-installed MPI using 
>> 
>>  ./configure --prefix=/usr/local/
>> make  -j2
>> make install 
>> 
>> it worked.
>> 
>> 
>> 
>> 
>> 
>> On Sun, Apr 6, 2014 at 5:30 PM, Kamal  wrote:
>> Hello Open MPI,
>> 
>> I installed open-mpi with 
>> ./configure --prefix=/usr/local/
>> make all 
>> make install 
>> 
>> then I launched my sample code which gave me this error 
>> my LD_LIBRARY_PATH=/usr/local/
>> 
>> I have attached the output file with this mail 
>> could some one please help me with this . 
>> 
>> 
>> Thanks,
>> 
>> Kamal 
>> 
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 
>> -- 
>> ___
>> Hamid Saeed
>> CoSynth GmbH & Co. KG
>> Escherweg 2 - 26121 Oldenburg - Germany
>> Tel +49 441 9722 738 | Fax -278
>> http://www.cosynth.com
>> ___
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Hamid Saeed
make -j2 simply runs the build with up to two jobs in parallel, so compilation finishes faster.

It is very simple to uninstall: go to /usr/local/, where you will find the lib, bin, etc. directories containing the MPI files, and just type

rm -r 

Also, next time you want to install, I recommend you use

./configure --prefix=/usr/local/mpi_installation
make -j2
make install

and include the following lines in your .bashrc file:

export PATH=/usr/local/mpi_installation/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/mpi_installation/lib:$LD_LIBRARY_PATH

Best of luck.


On Sun, Apr 6, 2014 at 5:45 PM, Kamal  wrote:

>  Hi Ralph,
>
> I use OMPI 1.8 for Macbook OS X mavericks.
>
> As you said I will create a new directory to install my MPI files.
>
> Thanks for your reply,
>
> Kamal.
>
> On 07/04/2014 17:37, Ralph Castain wrote:
>
> What version of OMPI are you attempting to install?
>
>  Also, using /usr/local as your prefix is a VERY, VERY BAD idea. Most OS
> distributions come with a (typically old) version of OMPI installed in the
> system area. Overlaying that with another version can easily lead to the
> errors you show.
>
>  You should always install to a user-level directory and then set your
> path and ld_library_path to start with that location
>
>
>  On Apr 6, 2014, at 8:30 AM, Kamal  wrote:
>
>  Hello Open MPI,
>
>  I installed open-mpi with
> ./configure --prefix=/usr/local/
>  make all
> make install
>
>  then I launched my sample code which gave me this error
> my LD_LIBRARY_PATH=/usr/local/
>
>  I have attached the output file with this mail
> could some one please help me with this .
>
>
>  Thanks,
>
>  Kamal
>
>
>  ___
>
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
>
> ___
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 

___

Hamid Saeed
CoSynth GmbH & Co. KG
Escherweg 2 - 26121 Oldenburg - Germany

Tel +49 441 9722 738 | Fax -278
http://www.cosynth.com

___


Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Ralph Castain
Deleting the install as you describe is a VERY bad idea. As I explained 
elsewhere, the system generally comes with an installation. Blowing things away 
can destabilize other areas of the system unless you are (a) very careful, and 
(b) very lucky

Just stay away from the system directories, please.


On Apr 7, 2014, at 8:50 AM, Hamid Saeed  wrote:

> make -j
> 
> 
> It is very simple to uninstall 
> 
> go to the
> /usr/local/
> here you will find
> lib,bin etc these are the file of MPI.
> just type
> rm -r 
> 
> and also next time when you want to install i will recommend you use
> 
> ./configure --prefix=/usr/local/mpi_installation
> make -j2
> make install 
> 
> include the following lines in your .bashrc file.
> 
> export PATH=/usr/local/mpi_installation/bin:$PATH
> export LD_LIBRARY_PATH=/usr/local/mpi_installation/lib:$LD_LIBRARY_PATH
> 
> best of luck.
> 
> 
> On Sun, Apr 6, 2014 at 5:45 PM, Kamal  wrote:
> Hi Ralph, 
> 
> I use OMPI 1.8 for Macbook OS X mavericks.
> 
> As you said I will create a new directory to install my MPI files.
> 
> Thanks for your reply, 
> 
> Kamal. 
> 
> On 07/04/2014 17:37, Ralph Castain wrote:
>> What version of OMPI are you attempting to install?
>> 
>> Also, using /usr/local as your prefix is a VERY, VERY BAD idea. Most OS 
>> distributions come with a (typically old) version of OMPI installed in the 
>> system area. Overlaying that with another version can easily lead to the 
>> errors you show.
>> 
>> You should always install to a user-level directory and then set your path 
>> and ld_library_path to start with that location
>> 
>> 
>> On Apr 6, 2014, at 8:30 AM, Kamal  wrote:
>> 
>>> Hello Open MPI,
>>> 
>>> I installed open-mpi with 
>>> ./configure --prefix=/usr/local/
>>> make all 
>>> make install 
>>> 
>>> then I launched my sample code which gave me this error 
>>> my LD_LIBRARY_PATH=/usr/local/
>>> 
>>> I have attached the output file with this mail 
>>> could some one please help me with this . 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Kamal 
>>> 
>>> 
>>> ___
>>> 
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> -- 
> ___
> Hamid Saeed
> CoSynth GmbH & Co. KG
> Escherweg 2 - 26121 Oldenburg - Germany
> Tel +49 441 9722 738 | Fax -278
> http://www.cosynth.com
> ___
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Open Mpi execution error

2014-04-07 Thread Kamal Selvam
Thanks a lot for your help; it finally worked.


On Mon, Apr 7, 2014 at 6:05 PM, Ralph Castain  wrote:

> Deleting the install as you describe is a VERY bad idea. As I explained
> elsewhere, the system generally comes with an installation. Blowing things
> away can destabilize other areas of the system unless you are (a) very
> careful, and (b) very lucky
>
> Just stay away from the system directories, please.
>
>
> On Apr 7, 2014, at 8:50 AM, Hamid Saeed  wrote:
>
> make -j
>
>
> It is very simple to uninstall
>
> go to the
> /usr/local/
> here you will find
> lib,bin etc these are the file of MPI.
> just type
> rm -r 
>
> and also next time when you want to install i will recommend you use
>
> ./configure --prefix=/usr/local/mpi_installation
> make -j2
> make install
>
> include the following lines in your .bashrc file.
>
> export PATH=/usr/local/mpi_installation/bin:$PATH
> export LD_LIBRARY_PATH=
> /usr/local/mpi_installation/lib:$LD_LIBRARY_PATH
>
> best of luck.
>
>
> On Sun, Apr 6, 2014 at 5:45 PM, Kamal  wrote:
>
>>  Hi Ralph,
>>
>> I use OMPI 1.8 for Macbook OS X mavericks.
>>
>> As you said I will create a new directory to install my MPI files.
>>
>> Thanks for your reply,
>>
>> Kamal.
>>
>> On 07/04/2014 17:37, Ralph Castain wrote:
>>
>> What version of OMPI are you attempting to install?
>>
>>  Also, using /usr/local as your prefix is a VERY, VERY BAD idea. Most OS
>> distributions come with a (typically old) version of OMPI installed in the
>> system area. Overlaying that with another version can easily lead to the
>> errors you show.
>>
>>  You should always install to a user-level directory and then set your
>> path and ld_library_path to start with that location
>>
>>
>>  On Apr 6, 2014, at 8:30 AM, Kamal  wrote:
>>
>>  Hello Open MPI,
>>
>>  I installed open-mpi with
>> ./configure --prefix=/usr/local/
>>  make all
>> make install
>>
>>  then I launched my sample code which gave me this error
>> my LD_LIBRARY_PATH=/usr/local/
>>
>>  I have attached the output file with this mail
>> could some one please help me with this .
>>
>>
>>  Thanks,
>>
>>  Kamal
>>
>>
>>  ___
>>
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>>
>> ___
>> users mailing list
>> users@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> ___
> Hamid Saeed
> CoSynth GmbH & Co. KG
> Escherweg 2 - 26121 Oldenburg - Germany
> Tel +49 441 9722 738 | Fax -278
> http://www.cosynth.com
> ___
>  ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] problem for multiple clusters using mpirun

2014-04-07 Thread Jeff Squyres (jsquyres)
I was out on vacation / fully disconnected last week, and am just getting to 
all the backlog now...

Are you saying that port 1024 was locked as well -- i.e., that we should set 
the minimum to 1025?


On Mar 31, 2014, at 4:32 AM, Hamid Saeed  wrote:

> Yes Jeff,
> You were right. The default value for btl_tcp_port_min_v4 is 1024.
> 
> I was facing problem in running my Algorithm on multiple processors (using 
> ssh).
> 
> Answer:
> The network administrator locked that port.
> :(
> 
> i changed the communication port by forcing mpi to use another.
> 
> mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include br0 
> --mca btl_tcp_port_min_v4 1 ./a.out
> 
> Thanks again for the nice and effective suggestions.
> 
> Regards. 
>  
> 
> 
> On Tue, Mar 25, 2014 at 1:27 PM, Jeff Squyres (jsquyres)  
> wrote:
> This is very odd -- the default value for btl_tcp_port_min_v4 is 1024.  So 
> unless you have overridden this value, you should not be getting a port less 
> than 1024.  You can run this to see:
> 
> ompi_info --level 9 --param  btl tcp --parsable | grep port_min_v4
> 
> Mine says this in a default 1.7.5 installation:
> 
> mca:btl:tcp:param:btl_tcp_port_min_v4:value:1024
> mca:btl:tcp:param:btl_tcp_port_min_v4:source:default
> mca:btl:tcp:param:btl_tcp_port_min_v4:status:writeable
> mca:btl:tcp:param:btl_tcp_port_min_v4:level:2
> mca:btl:tcp:param:btl_tcp_port_min_v4:help:The minimum port where the TCP BTL 
> will try to bind (default 1024)
> mca:btl:tcp:param:btl_tcp_port_min_v4:deprecated:no
> mca:btl:tcp:param:btl_tcp_port_min_v4:type:int
> mca:btl:tcp:param:btl_tcp_port_min_v4:disabled:false
> 
> 
> 
> On Mar 25, 2014, at 5:36 AM, Hamid Saeed  wrote:
> 
> > Hello,
> > Thanks i figured out what was the exact problem in my case.
> > Now i am using the following execution line.
> > it is directing the mpi comm port to start from 1...
> >
> > mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include 
> > br0 --mca btl_tcp_port_min_v4 1 ./a.out
> >
> > and every thing works again.
> >
> > Thanks.
> >
> > Best regards.
> >
> >
> >
> >
> > On Tue, Mar 25, 2014 at 10:23 AM, Hamid Saeed  
> > wrote:
> > Hello,
> > I am not sure what approach does the MPI communication follow but when i
> > use
> > --mca btl_base_verbose 30
> >
> > I observe the mentioned port.
> >
> > [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on 
> > port 4
> > [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> >  connect() to 134.106.3.252 failed: Connection refused (111)
> >
> >
> > the information on the
> > http://www.open-mpi.org/community/lists/users/2011/11/17732.php
> > is not enough could you kindly explain..
> >
> > How can restrict MPI communication to use the ports starting from 1025.
> > or use the port some what like
> > 59822...
> >
> > Regards.
> >
> >
> >
> > On Tue, Mar 25, 2014 at 9:15 AM, Reuti  wrote:
> > Hi,
> >
> > Am 25.03.2014 um 08:34 schrieb Hamid Saeed:
> >
> > > Is it possible to change the port number for the MPI communication?
> > >
> > > I can see that my program uses port 4 for the MPI communication.
> > >
> > > [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 
> > > on port 4
> > > [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> > >  connect() to 134.106.3.252 failed: Connection refused (111)
> > >
> > > In my case the ports from 1 to 1024 are reserved.
> > > MPI tries to use one of the reserve ports and prompts the connection 
> > > refused error.
> > >
> > > I will be very glade for the kind suggestions.
> >
> > There are certain parameters to set the range of used ports, but using any 
> > up to 1024 should not be the default:
> >
> > http://www.open-mpi.org/community/lists/users/2011/11/17732.php
> >
> > Are any of these set by accident beforehand by your environment?
> >
> > -- Reuti
> >
> >
> > > Regards.
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Mar 24, 2014 at 5:32 PM, Hamid Saeed  
> > > wrote:
> > > Hello Jeff,
> > >
> > > Thanks for your cooperation.
> > >
> > > --mca btl_tcp_if_include br0
> > >
> > > worked out of the box.
> > >
> > > The problem was from the network administrator. The machines on the 
> > > network side were halting the mpi...
> > >
> > > so cleaning and killing every thing worked.
> > >
> > > :)
> > >
> > > regards.
> > >
> > >
> > > On Mon, Mar 24, 2014 at 4:34 PM, Jeff Squyres (jsquyres) 
> > >  wrote:
> > > There is no "self" IP interface in the Linux kernel.
> > >
> > > Try using btl_tcp_if_include and list just the interface(s) that you want 
> > > to use.  From your prior email, I'm *guessing* it's just br2 (i.e., the 
> > > 10.x address inside your cluster).
> > >
> > > Also, it looks like you didn't setup your SSH keys properly for logging 
> > > in to remote notes automatically.
> > >
> > >
> > >
> > > On Mar 24, 2014, at 10:56 AM, Hamid Saeed  wrote:
> > >
> > > > Hello,
> > > >
> > > > I added the "self" 

Re: [OMPI users] problem for multiple clusters using mpirun

2014-04-07 Thread Hamid Saeed
Thanks for the reply.

No. In my case the problem was a misunderstanding by our network
administrator: our network should have only ports up to 1023 locked, but
someone put a ticket on 1024 too, which is why I wasn't able to
communicate with the other computers.
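[Editor's note: for context, on Unix systems ports 1-1023 are privileged and can only be bound by root, which is why Open MPI's btl_tcp_port_min_v4 defaults to 1024. A minimal sketch of checking and overriding the minimum follows; the hosts, interface, and binary names are the hypothetical ones from this thread.]

```shell
# Inspect the current minimum TCP port (command from Jeff's message above):
#   ompi_info --level 9 --param btl tcp --parsable | grep port_min_v4
#
# If a site firewall also blocks port 1024, raise the minimum explicitly:
#   mpiexec -n 2 --host karp,wirth \
#       --mca btl_tcp_if_include br0 \
#       --mca btl_tcp_port_min_v4 1025 ./a.out
#
# Pure-shell illustration of the privileged/unprivileged boundary:
MIN_PORT=1025
[ "$MIN_PORT" -gt 1024 ] && echo "port $MIN_PORT needs no root privileges"
```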





On Mon, Apr 7, 2014 at 9:52 PM, Jeff Squyres (jsquyres)
wrote:

> I was out on vacation / fully disconnected last week, and am just getting
> to all the backlog now...
>
> Are you saying that port 1024 was locked as well -- i.e., that we should
> set the minimum to 1025?
>
>
> On Mar 31, 2014, at 4:32 AM, Hamid Saeed  wrote:
>
> > Yes Jeff,
> > You were right. The default value for btl_tcp_port_min_v4 is 1024.
> >
> > I was facing problem in running my Algorithm on multiple processors
> (using ssh).
> >
> > Answer:
> > The network administrator locked that port.
> > :(
> >
> > i changed the communication port by forcing mpi to use another.
> >
> > mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca
> btl_tcp_if_include br0 --mca btl_tcp_port_min_v4 1 ./a.out
> >
> > Thanks again for the nice and effective suggestions.
> >
> > Regards.
> >
> >
> >
> > On Tue, Mar 25, 2014 at 1:27 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > This is very odd -- the default value for btl_tcp_port_min_v4 is 1024.
>  So unless you have overridden this value, you should not be getting a port
> less than 1024.  You can run this to see:
> >
> > ompi_info --level 9 --param  btl tcp --parsable | grep port_min_v4
> >
> > Mine says this in a default 1.7.5 installation:
> >
> > mca:btl:tcp:param:btl_tcp_port_min_v4:value:1024
> > mca:btl:tcp:param:btl_tcp_port_min_v4:source:default
> > mca:btl:tcp:param:btl_tcp_port_min_v4:status:writeable
> > mca:btl:tcp:param:btl_tcp_port_min_v4:level:2
> > mca:btl:tcp:param:btl_tcp_port_min_v4:help:The minimum port where the
> TCP BTL will try to bind (default 1024)
> > mca:btl:tcp:param:btl_tcp_port_min_v4:deprecated:no
> > mca:btl:tcp:param:btl_tcp_port_min_v4:type:int
> > mca:btl:tcp:param:btl_tcp_port_min_v4:disabled:false
> >
> >
> >
> > On Mar 25, 2014, at 5:36 AM, Hamid Saeed  wrote:
> >
> > > Hello,
> > > Thanks i figured out what was the exact problem in my case.
> > > Now i am using the following execution line.
> > > it is directing the mpi comm port to start from 1...
> > >
> > > mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca
> btl_tcp_if_include br0 --mca btl_tcp_port_min_v4 1 ./a.out
> > >
> > > and every thing works again.
> > >
> > > Thanks.
> > >
> > > Best regards.
> > >
> > >
> > >
> > >
> > > On Tue, Mar 25, 2014 at 10:23 AM, Hamid Saeed 
> wrote:
> > > Hello,
> > > I am not sure what approach does the MPI communication follow but when
> i
> > > use
> > > --mca btl_base_verbose 30
> > >
> > > I observe the mentioned port.
> > >
> > > [karp:23756] btl: tcp: attempting to connect() to address
> 134.106.3.252 on port 4
> > >
> [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> connect() to 134.106.3.252 failed: Connection refused (111)
> > >
> > >
> > > the information on the
> > > http://www.open-mpi.org/community/lists/users/2011/11/17732.php
> > > is not enough could you kindly explain..
> > >
> > > How can restrict MPI communication to use the ports starting from 1025.
> > > or use the port some what like
> > > 59822...
> > >
> > > Regards.
> > >
> > >
> > >
> > > On Tue, Mar 25, 2014 at 9:15 AM, Reuti 
> wrote:
> > > Hi,
> > >
> > > Am 25.03.2014 um 08:34 schrieb Hamid Saeed:
> > >
> > > > Is it possible to change the port number for the MPI communication?
> > > >
> > > > I can see that my program uses port 4 for the MPI communication.
> > > >
> > > > [karp:23756] btl: tcp: attempting to connect() to address
> 134.106.3.252 on port 4
> > > >
> [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> connect() to 134.106.3.252 failed: Connection refused (111)
> > > >
> > > > In my case the ports from 1 to 1024 are reserved.
> > > > MPI tries to use one of the reserve ports and prompts the connection
> refused error.
> > > >
> > > > I will be very glade for the kind suggestions.
> > >
> > > There are certain parameters to set the range of used ports, but using
> any up to 1024 should not be the default:
> > >
> > > http://www.open-mpi.org/community/lists/users/2011/11/17732.php
> > >
> > > Are any of these set by accident beforehand by your environment?
> > >
> > > -- Reuti
> > >
> > >
> > > > Regards.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Mar 24, 2014 at 5:32 PM, Hamid Saeed 
> wrote:
> > > > Hello Jeff,
> > > >
> > > > Thanks for your cooperation.
> > > >
> > > > --mca btl_tcp_if_include br0
> > > >
> > > > worked out of the box.
> > > >
> > > > The problem was from the network administrator. The machines on the
> network side were halting the mpi...
> > > >
> > > > so cleaning and killing everything worked.
> > > >
> > > > :)
> > > >
> > > > regards.
> > > >
> > > >
> > > > On Mon, M

[OMPI users] Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Blosch, Edwin L
I am submitting a job for execution under SGE.  My default shell is /bin/csh.  
The script that is submitted has #!/bin/bash at the top.  The script runs on 
the 1st node allocated to the job.  The script runs a Python wrapper that 
ultimately issues the following mpirun command:

/apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
/apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i flow.inp 
>& output

Just so there's no confusion, OpenMPI is built without support for SGE.  It 
should be using rsh to launch.

There are 4 nodes involved (each 12 cores, 48 processes total).  In the output 
file, I see 3 sets of messages as shown below.  I assume I am seeing 1 set of 
messages for each of the 3 remote nodes where processes need to be launched:

/bin/.: Permission denied.
OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
export: Command not found.
PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
 Command not found.
export: Command not found.
LD_LIBRARY_PATH: Undefined variable.

These look like errors you get when csh is trying to parse commands intended 
for bash.

Does anyone know what may be going on here?

Thanks,

Ed



Re: [OMPI users] problem for multiple clusters using mpirun

2014-04-07 Thread Jeff Squyres (jsquyres)
Ok, got it.  Thanks.


On Apr 7, 2014, at 4:04 PM, Hamid Saeed  wrote:

> Thanks for the reply.
> 
> no. 
> In my case the problem was a misunderstanding with our network 
> administrator.
> Our network should have only ports up to 1023 locked, but someone put a 
> ticket on 1024 too.
> Because of this I wasn't able to communicate with the other computers.
> 
> 
>  
> 
> 
> On Mon, Apr 7, 2014 at 9:52 PM, Jeff Squyres (jsquyres)  
> wrote:
> I was out on vacation / fully disconnected last week, and am just getting to 
> all the backlog now...
> 
> Are you saying that port 1024 was locked as well -- i.e., that we should set 
> the minimum to 1025?
> 
> 
> On Mar 31, 2014, at 4:32 AM, Hamid Saeed  wrote:
> 
> > Yes Jeff,
> > You were right. The default value for btl_tcp_port_min_v4 is 1024.
> >
> > I was facing a problem running my algorithm on multiple processors (using 
> > ssh).
> >
> > Answer:
> > The network administrator locked that port.
> > :(
> >
> > I changed the communication port by forcing MPI to use another.
> >
> > mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include 
> > br0 --mca btl_tcp_port_min_v4 1 ./a.out
> >
> > Thanks again for the nice and effective suggestions.
> >
> > Regards.
> >
> >
> >
> > On Tue, Mar 25, 2014 at 1:27 PM, Jeff Squyres (jsquyres) 
> >  wrote:
> > This is very odd -- the default value for btl_tcp_port_min_v4 is 1024.  So 
> > unless you have overridden this value, you should not be getting a port 
> > less than 1024.  You can run this to see:
> >
> > ompi_info --level 9 --param  btl tcp --parsable | grep port_min_v4
> >
> > Mine says this in a default 1.7.5 installation:
> >
> > mca:btl:tcp:param:btl_tcp_port_min_v4:value:1024
> > mca:btl:tcp:param:btl_tcp_port_min_v4:source:default
> > mca:btl:tcp:param:btl_tcp_port_min_v4:status:writeable
> > mca:btl:tcp:param:btl_tcp_port_min_v4:level:2
> > mca:btl:tcp:param:btl_tcp_port_min_v4:help:The minimum port where the TCP 
> > BTL will try to bind (default 1024)
> > mca:btl:tcp:param:btl_tcp_port_min_v4:deprecated:no
> > mca:btl:tcp:param:btl_tcp_port_min_v4:type:int
> > mca:btl:tcp:param:btl_tcp_port_min_v4:disabled:false
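
[Editor's note: a hedged sketch of restricting the TCP BTL to a specific range, using the parameter family shown above. `btl_tcp_port_range_v4` (the number of ports above the minimum) and the port values 1025/1000 are illustrative assumptions, not taken from this thread; the hosts and interface match the poster's earlier command line.]

```shell
# Sketch only: bind the TCP BTL in the range 1025-2024 instead of a
# blocked low range. btl_tcp_port_min_v4 is discussed above;
# btl_tcp_port_range_v4 (count of ports above the minimum) is the
# assumed companion parameter from the same MCA family.
mpiexec -n 2 --host karp,wirth \
  --mca btl ^openib \
  --mca btl_tcp_if_include br0 \
  --mca btl_tcp_port_min_v4 1025 \
  --mca btl_tcp_port_range_v4 1000 \
  ./a.out
```

(Requires a working Open MPI installation; shown as a command fragment, not a runnable test.)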
> >
> >
> >
> > On Mar 25, 2014, at 5:36 AM, Hamid Saeed  wrote:
> >
> > > Hello,
> > > Thanks, I figured out the exact problem in my case.
> > > Now I am using the following execution line;
> > > it directs the MPI communication port to start from 1...
> > >
> > > mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include 
> > > br0 --mca btl_tcp_port_min_v4 1 ./a.out
> > >
> > > and everything works again.
> > >
> > > Thanks.
> > >
> > > Best regards.
> > >
> > >
> > >
> > >
> > > On Tue, Mar 25, 2014 at 10:23 AM, Hamid Saeed  
> > > wrote:
> > > Hello,
> > > I am not sure what approach the MPI communication follows, but when I
> > > use
> > > --mca btl_base_verbose 30
> > >
> > > I observe the mentioned port.
> > >
> > > [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 
> > > on port 4
> > > [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> > >  connect() to 134.106.3.252 failed: Connection refused (111)
> > >
> > >
> > > the information on the
> > > http://www.open-mpi.org/community/lists/users/2011/11/17732.php
> > > is not enough; could you kindly explain?
> > >
> > > How can I restrict MPI communication to use ports starting from 1025,
> > > or to use a port somewhat like
> > > 59822...?
> > >
> > > Regards.
> > >
> > >
> > >
> > > On Tue, Mar 25, 2014 at 9:15 AM, Reuti  wrote:
> > > Hi,
> > >
> > > Am 25.03.2014 um 08:34 schrieb Hamid Saeed:
> > >
> > > > Is it possible to change the port number for the MPI communication?
> > > >
> > > > I can see that my program uses port 4 for the MPI communication.
> > > >
> > > > [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 
> > > > on port 4
> > > > [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> > > >  connect() to 134.106.3.252 failed: Connection refused (111)
> > > >
> > > > In my case the ports from 1 to 1024 are reserved.
> > > > MPI tries to use one of the reserved ports and prompts the connection 
> > > > refused error.
> > > >
> > > > I would be very glad for any kind suggestions.
> > >
> > > There are certain parameters to set the range of used ports, but using 
> > > any up to 1024 should not be the default:
> > >
> > > http://www.open-mpi.org/community/lists/users/2011/11/17732.php
> > >
> > > Are any of these set by accident beforehand by your environment?
> > >
> > > -- Reuti
> > >
> > >
> > > > Regards.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Mar 24, 2014 at 5:32 PM, Hamid Saeed  
> > > > wrote:
> > > > Hello Jeff,
> > > >
> > > > Thanks for your cooperation.
> > > >
> > > > --mca btl_tcp_if_include br0
> > > >
> > > > worked out of the box.
> > > >
> > > > The problem was from the network administrator. The machines on the 
> > > > network side

Re: [OMPI users] Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Ralph Castain
Looks to me like the problem is here:

/bin/.: Permission denied.

Appears you don't have permission to exec bash??


On Apr 7, 2014, at 1:04 PM, Blosch, Edwin L  wrote:

> I am submitting a job for execution under SGE.  My default shell is /bin/csh. 
>  The script that is submitted has #!/bin/bash at the top.  The script runs on 
> the 1st node allocated to the job.  The script runs a Python wrapper that 
> ultimately issues the following mpirun command:
>  
> /apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
> LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
> shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
> orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
> /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i 
> flow.inp >& output
>  
> Just so there’s no confusion, OpenMPI is built without support for SGE.  It 
> should be using rsh to launch.
>  
> There are 4 nodes involved (each 12 cores, 48 processes total).  In the 
> output file, I see 3 sets of messages as shown below.  I assume I am seeing 1 
> set of messages for each of the 3 remote nodes where processes need to be 
> launched:
>  
> /bin/.: Permission denied.
> OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
> export: Command not found.
> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>  Command not found.
> export: Command not found.
> LD_LIBRARY_PATH: Undefined variable.
>  
> These look like errors you get when csh is trying to parse commands intended 
> for bash. 
>  
> Does anyone know what may be going on here?
>  
> Thanks,
>  
> Ed
>  
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Blosch, Edwin L
If I create a program called hello which just contains the line "echo hello", 
then when I do

"/bin/. hello"  I get "Permission denied."

Is that what you mean?

I might be lost in esoteric corners of Linux.  What is "." under /bin ?  There 
is no program there by that name.  I've heard of "." as a shell built-in, but I 
haven't seen it prefixed by /bin before.

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Monday, April 07, 2014 3:10 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Problem with shell when launching jobs with 
OpenMPI 1.6.5 rsh

Looks to me like the problem is here:

/bin/.: Permission denied.

Appears you don't have permission to exec bash??


On Apr 7, 2014, at 1:04 PM, Blosch, Edwin L 
mailto:edwin.l.blo...@lmco.com>> wrote:


I am submitting a job for execution under SGE.  My default shell is /bin/csh.  
The script that is submitted has #!/bin/bash at the top.  The script runs on 
the 1st node allocated to the job.  The script runs a Python wrapper that 
ultimately issues the following mpirun command:

/apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
/apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i flow.inp 
>& output

Just so there's no confusion, OpenMPI is built without support for SGE.  It 
should be using rsh to launch.

There are 4 nodes involved (each 12 cores, 48 processes total).  In the output 
file, I see 3 sets of messages as shown below.  I assume I am seeing 1 set of 
messages for each of the 3 remote nodes where processes need to be launched:

/bin/.: Permission denied.
OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
export: Command not found.
PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
 Command not found.
export: Command not found.
LD_LIBRARY_PATH: Undefined variable.

These look like errors you get when csh is trying to parse commands intended 
for bash.

Does anyone know what may be going on here?

Thanks,

Ed




Re: [OMPI users] Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Reuti
Am 07.04.2014 um 22:04 schrieb Blosch, Edwin L:

> I am submitting a job for execution under SGE.  My default shell is /bin/csh.

Where is it set - in SGE, or is that what you get on the interactive command line?


>  The script that is submitted has #!/bin/bash at the top.  The script runs on 
> the 1st node allocated to the job.  The script runs a Python wrapper that 
> ultimately issues the following mpirun command:
>  
> /apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
> LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
> shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
> orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
> /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i 
> flow.inp >& output
>  
> Just so there’s no confusion, OpenMPI is built without support for SGE.  It 
> should be using rsh to launch.
>  
> There are 4 nodes involved (each 12 cores, 48 processes total).  In the 
> output file, I see 3 sets of messages as shown below.  I assume I am seeing 1 
> set of messages for each of the 3 remote nodes where processes need to be 
> launched:
>  
> /bin/.: Permission denied.
> OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
> export: Command not found.
> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>  Command not found.
> export: Command not found.
> LD_LIBRARY_PATH: Undefined variable.

This looks really like csh is trying to interpret bash commands. In case SGE's 
queue is set up to have "shell_start_mode posix_compliant" set, the first line 
of the script is not treated in a special way. You can change the shell only by 
"-S /bin/bash" then (or redefine the queue to have "shell_start_mode 
unix_behavior" set and get the expected behavior when starting a script [side 
effect: the shell is not started as login shell any longer. See also `man 
sge_conf` => "login_shells" for details]).

BTW: you don't want a tight integration by intention?

-- Reuti
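
[Editor's note: Reuti's two options, sketched as commands. The queue name `all.q` and script name `myjob.sh` are illustrative.]

```shell
# Option 1 (per-job): tell SGE to start the job script under bash even
# when the queue uses "shell_start_mode posix_compliant".
qsub -S /bin/bash myjob.sh

# Option 2 (queue-wide): edit the queue so the #! line is honored.
# qconf opens the queue definition in an editor; set:
#   shell_start_mode  unix_behavior
qconf -mq all.q
```

(Requires an SGE installation; shown as a command fragment, not a runnable test.)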


>  These look like errors you get when csh is trying to parse commands intended 
> for bash. 
>  
> Does anyone know what may be going on here?
>  
> Thanks,
>  
> Ed
>  



Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Blosch, Edwin L
I guess this is not OpenMPI related anymore.  I can repeat the essential 
problem interactively:

% echo $SHELL
/bin/csh

% echo $SHLVL
1

% cat hello
echo Hello

% /bin/bash hello
Hello

% /bin/csh hello
Hello

%  . hello
/bin/.: Permission denied

I think I need to hope the administrator can fix it.  Sorry for the bother...


-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti
Sent: Monday, April 07, 2014 3:27 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Problem with shell when launching jobs with 
OpenMPI 1.6.5 rsh

Am 07.04.2014 um 22:04 schrieb Blosch, Edwin L:

> I am submitting a job for execution under SGE.  My default shell is /bin/csh.

Where - in SGE or on the interactive command line you get?


>  The script that is submitted has #!/bin/bash at the top.  The script runs on 
> the 1st node allocated to the job.  The script runs a Python wrapper that 
> ultimately issues the following mpirun command:
>  
> /apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
> LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
> shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
> orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
> /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i 
> flow.inp >& output
>  
> Just so there's no confusion, OpenMPI is built without support for SGE.  It 
> should be using rsh to launch.
>  
> There are 4 nodes involved (each 12 cores, 48 processes total).  In the 
> output file, I see 3 sets of messages as shown below.  I assume I am seeing 1 
> set of messages for each of the 3 remote nodes where processes need to be 
> launched:
>  
> /bin/.: Permission denied.
> OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
> export: Command not found.
> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>  Command not found.
> export: Command not found.
> LD_LIBRARY_PATH: Undefined variable.

This looks really like csh is trying to interpret bash commands. In case SGE's 
queue is set up to have "shell_start_mode posix_compliant" set, the first line 
of the script is not treated in a special way. You can change the shell only by 
"-S /bin/bash" then (or redefine the queue to have "shell_start_mode 
unix_behavior" set and get the expected behavior when starting a script [side 
effect: the shell is not started as login shell any longer. See also `man 
sge_conf` => "login_shells" for details]).

BTW: you don't want a tight integration by intention?

-- Reuti


>  These look like errors you get when csh is trying to parse commands intended 
> for bash. 
>  
> Does anyone know what may be going on here?
>  
> Thanks,
>  
> Ed
>  


Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Noam Bernstein

On Apr 7, 2014, at 4:36 PM, Blosch, Edwin L  wrote:

> I guess this is not OpenMPI related anymore.  I can repeat the essential 
> problem interactively:
> 
> % echo $SHELL
> /bin/csh
> 
> % echo $SHLVL
> 1
> 
> % cat hello
> echo Hello
> 
> % /bin/bash hello
> Hello
> 
> % /bin/csh hello
> Hello
> 
> %  . hello
> /bin/.: Permission denied

. is a bash internal which evaluates the contents of the file in the current 
shell.  Since you’re running csh, it’s just looking for an executable named ., 
which does not exist (the csh analog of bash’s . is source). /bin/. _is_ in 
your path, but it’s a directory (namely /bin itself), which cannot be executed, 
hence the error. Perhaps you meant to do
   ./hello
which means (both in bash and csh) run the script hello in the current working 
directory (.), rather than looking for it in the list of directories in $PATH


Noam
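
[Editor's note: Noam's point can be reproduced in any POSIX shell. The file name /tmp/hello_env is illustrative.]

```shell
# "." is a sh/bash builtin that reads a file into the current shell;
# csh lacks it, searches $PATH instead, finds the directory "/bin/."
# (i.e., /bin itself), and fails with "Permission denied".
printf 'greeting=Hello\n' > /tmp/hello_env

. /tmp/hello_env       # sh/bash builtin (csh spelling: source)
echo "$greeting"       # -> Hello

test -d /bin/. && echo "/bin/. is a directory, not a program"
```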

Re: [OMPI users] Fortran MPI module and gfortran

2014-04-07 Thread Jeff Squyres (jsquyres)
On Mar 30, 2014, at 2:43 PM, W Spector  wrote:

> The mpi.mod file that is created from both the openmpi-1.7.4 and 
> openmpi-1.8rc1 tarballs does not seem to be generating interface blocks for 
> the Fortran API - whether the calls use choice buffers or not.

Can you be a bit more specific -- are there *no* interface blocks in the mpi 
module? Or just less than expected?

In 1.7.x (and 1.8), all versions of gfortran should be using the "tkr" 
implementation of the mpi module, which should only include MPI subroutines 
that have no choice buffers (e.g., MPI_INIT, MPI_FINALIZE, ... etc.).

> I initially tried the default gfortran on my system - 4.7.2.  The configure 
> commands are:
> 
> export CC=gcc
> export CXX=g++
> export FC=gfortran
> export F90=gfortran
> ./configure --prefix=/home/wws/openmpi_gfortran  \
>--enable-mpi-fortran --enable-mpi-thread-multiple \
>--enable-mpirun-prefix-by-default  \
>2>&1 | tee config.gfortran.out
> 
> The relevant configure output reads:
> 
> 
> checking if building Fortran mpif.h bindings... yes
> checking for Fortran compiler module include flag... -I
> checking Fortran compiler ignore TKR syntax... not cached; checking variants
> checking for Fortran compiler support of TYPE(*), DIMENSION(*)... no
> checking for Fortran compiler support of !DEC$ ATTRIBUTES NO_ARG_CHECK... no
> checking for Fortran compiler support of !$PRAGMA IGNORE_TKR... no
> checking for Fortran compiler support of !DIR$ IGNORE_TKR... no
> checking for Fortran compiler support of !IBM* IGNORE_TKR... no
> checking Fortran compiler ignore TKR syntax... 0:real:!
> checking if building Fortran 'use mpi' bindings... yes
> checking if building Fortran 'use mpi_f08' bindings... no
> 

That looks right.

> I have also tried using a version of the 4.9 trunk that I generated from a 
> March 18th, 2014 snapshot of the gcc trunk.  This latter compiler supports 
> some of the TS 29 features.  (I set the latter by setting PATH to find the 
> 4.9 compilers first.  I also set the F90 and FC environment variables to 
> point to the 4.9 compiler.)
> 
> make clean
> export PATH=/usr/local/gcc-trunk/bin:$PATH
> export CC=gcc
> export CXX=g++
> export FC=/usr/local/gcc-trunk/bin/gfortran
> export F90=/usr/local/gcc-trunk/bin/gfortran
> ./configure --prefix=/home/wws/openmpi_gfortran49  \
>--enable-mpi-fortran --enable-mpi-thread-multiple \
>--enable-mpirun-prefix-by-default  \
>2>&1 | tee config.gfortran49.out
> 
> The configure output is identical to the 4.7 compiler.  Note that it did NOT 
> recognize that gfortran now supports the !GCC$ ATTRIBUTE NO_ARG_CHECK 
> directive, nor did it recognize that gfortran also accepts 'TYPE(*), 
> DIMENSION(*)'.

That's correct, too, but for a few obscure reasons:

1. I think there's been some churn on the gfortran HEAD recently; I had an 
older version that worked (I'm afraid I don't know/remember the exact date of 
the checkout), but I was comparing notes with the MPICH guys doing the Fortran 
module stuff and they had a slightly newer gfortran checkout that *didn't* 
work.  Then I updated my gfortran checkout to be slightly newer than theirs, 
and it *did* work.

I realize this is a very fuzzy, anecdotal story with very few details, but 
the point is that I think there's been some instability at the gfortran 
development head (which is probably to be expected -- it's the development 
head, after all).

2. TYPE(*), DIMENSION(*) is not sufficient for MPI choice buffers -- it doesn't 
allow scalars.  So we don't use it.

3. There's currently a bug in OMPI since 1.7.5 that affects the new gfortran 
4.9 usage that I haven't had a chance to fix yet (it isn't super-high-priority 
because gfortran 4.9 isn't released yet).  Hence, OMPI still doesn't use the 
gfortran 4.9 goodness so that it avoids this bug.  :-\  See 
https://svn.open-mpi.org/trac/ompi/ticket/4157 for more details.

> I have also verified with strace that the proper mpi.mod file is being 
> accessed when I am trying to USE the mpi module.
> 
> I have not dug into the openmpi code yet.  Just wondering if this is a known 
> problem before I start?  Or did I do something wrong during configure?


If you're using subroutines like MPI_INIT and other non-choice-buffer 
interfaces, they should be there in mpi.mod.

Let me know if they're not -- we can dig further.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
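
[Editor's note: a minimal sanity check of the point above - MPI_INIT and MPI_FINALIZE take no choice buffers, so a "tkr" mpi module should provide their interfaces. The file name check_mod.f90 is illustrative.]

```shell
# Generate a minimal 'use mpi' test program exercising only
# non-choice-buffer interfaces (MPI_Init/MPI_Finalize).
cat > check_mod.f90 <<'EOF'
program check
  use mpi
  implicit none
  integer :: ierr
  call MPI_Init(ierr)
  call MPI_Finalize(ierr)
end program check
EOF
grep -c 'use mpi' check_mod.f90
# Then compile with the Open MPI wrapper (requires the install on PATH):
#   mpif90 check_mod.f90 -o check_mod
```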



Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Reuti
Am 07.04.2014 um 22:36 schrieb Blosch, Edwin L:

> I guess this is not OpenMPI related anymore.  I can repeat the essential 
> problem interactively:
> 
> % echo $SHELL
> /bin/csh
> 
> % echo $SHLVL
> 1
> 
> % cat hello
> echo Hello
> 
> % /bin/bash hello
> Hello
> 
> % /bin/csh hello
> Hello
> 
> %  . hello
> /bin/.: Permission denied

. as a bash shortcut for `source` will also be interpreted by `csh` and generate 
this error. You can try to change your interactive shell with `chsh`.

-- Reuti


> I think I need to hope the administrator can fix it.  Sorry for the bother...
> 
> 
> -Original Message-
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti
> Sent: Monday, April 07, 2014 3:27 PM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] Problem with shell when launching jobs 
> with OpenMPI 1.6.5 rsh
> 
> Am 07.04.2014 um 22:04 schrieb Blosch, Edwin L:
> 
>> I am submitting a job for execution under SGE.  My default shell is /bin/csh.
> 
> Where - in SGE or on the interactive command line you get?
> 
> 
>> The script that is submitted has #!/bin/bash at the top.  The script runs on 
>> the 1st node allocated to the job.  The script runs a Python wrapper that 
>> ultimately issues the following mpirun command:
>> 
>> /apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
>> LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
>> shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
>> orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
>> /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i 
>> flow.inp >& output
>> 
>> Just so there's no confusion, OpenMPI is built without support for SGE.  It 
>> should be using rsh to launch.
>> 
>> There are 4 nodes involved (each 12 cores, 48 processes total).  In the 
>> output file, I see 3 sets of messages as shown below.  I assume I am seeing 
>> 1 set of messages for each of the 3 remote nodes where processes need to be 
>> launched:
>> 
>> /bin/.: Permission denied.
>> OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
>> export: Command not found.
>> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>>  Command not found.
>> export: Command not found.
>> LD_LIBRARY_PATH: Undefined variable.
> 
> This looks really like csh is trying to interpret bash commands. In case 
> SGE's queue is set up to have "shell_start_mode posix_compliant" set, the 
> first line of the script is not treated in a special way. You can change the 
> shell only by "-S /bin/bash" then (or redefine the queue to have 
> "shell_start_mode unix_behavior" set and get the expected behavior when 
> starting a script [side effect: the shell is not started as login shell any 
> longer. See also `man sge_conf` => "login_shells" for details]).
> 
> BTW: you don't want a tight integration by intention?
> 
> -- Reuti
> 
> 
>> These look like errors you get when csh is trying to parse commands intended 
>> for bash. 
>> 
>> Does anyone know what may be going on here?
>> 
>> Thanks,
>> 
>> Ed
>> 



Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Bennet Fauber
The permission denied looks like it is being issued against

'/bin/.'

What do you get if you grep your own username from /etc/passwd?  That is,

% grep Edwin /etc/passwd

If your shell is listed as /bin/csh, then you need to use csh's
syntax, which would be

% source hello

(which will also work from bash).  The dot command is specific to
sh/bash, and csh sees it as you trying to run /bin/., which is the
same as typing any directory name as a command.

[roo]$ csh
[roo]% /bin/.
/bin/.: Permission denied.
[roo]% . foo.txt
/bin/.: Permission denied.
[roo]% /bin
/bin: Permission denied.

Try specifically invoking bash, then try it

% bash
$ . hello




On Mon, Apr 7, 2014 at 4:36 PM, Blosch, Edwin L  wrote:
> I guess this is not OpenMPI related anymore.  I can repeat the essential 
> problem interactively:
>
> % echo $SHELL
> /bin/csh
>
> % echo $SHLVL
> 1
>
> % cat hello
> echo Hello
>
> % /bin/bash hello
> Hello
>
> % /bin/csh hello
> Hello
>
> %  . hello
> /bin/.: Permission denied
>
> I think I need to hope the administrator can fix it.  Sorry for the bother...
>
>
> -Original Message-
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Reuti
> Sent: Monday, April 07, 2014 3:27 PM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] Problem with shell when launching jobs 
> with OpenMPI 1.6.5 rsh
>
> Am 07.04.2014 um 22:04 schrieb Blosch, Edwin L:
>
>> I am submitting a job for execution under SGE.  My default shell is /bin/csh.
>
> Where - in SGE or on the interactive command line you get?
>
>
>>  The script that is submitted has #!/bin/bash at the top.  The script runs 
>> on the 1st node allocated to the job.  The script runs a Python wrapper that 
>> ultimately issues the following mpirun command:
>>
>> /apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 -x 
>> LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 --mca btl ^tcp --mca 
>> shmem_mmap_relocate_backing_file -1 --bind-to-core --bycore --mca 
>> orte_rsh_agent /usr/bin/rsh --mca plm_rsh_disable_qrsh 1 
>> /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i 
>> flow.inp >& output
>>
>> Just so there's no confusion, OpenMPI is built without support for SGE.  It 
>> should be using rsh to launch.
>>
>> There are 4 nodes involved (each 12 cores, 48 processes total).  In the 
>> output file, I see 3 sets of messages as shown below.  I assume I am seeing 
>> 1 set of messages for each of the 3 remote nodes where processes need to be 
>> launched:
>>
>> /bin/.: Permission denied.
>> OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
>> export: Command not found.
>> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>>  Command not found.
>> export: Command not found.
>> LD_LIBRARY_PATH: Undefined variable.
>
> This looks really like csh is trying to interpret bash commands. In case 
> SGE's queue is set up to have "shell_start_mode posix_compliant" set, the 
> first line of the script is not treated in a special way. You can change the 
> shell only by "-S /bin/bash" then (or redefine the queue to have 
> "shell_start_mode unix_behavior" set and get the expected behavior when 
> starting a script [side effect: the shell is not started as login shell any 
> longer. See also `man sge_conf` => "login_shells" for details]).
>
> BTW: you don't want a tight integration by intention?
>
> -- Reuti
>
>
>>  These look like errors you get when csh is trying to parse commands 
>> intended for bash.
>>
>> Does anyone know what may be going on here?
>>
>> Thanks,
>>
>> Ed
>>


Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Blosch, Edwin L
Thanks Noam, that makes sense.

Yes, I did mean to do ". hello" (with space in between).  That was an attempt 
to replicate whatever OpenMPI is doing.  

In the first post I mentioned that my mpirun command actually gets executed 
from within a Python script using the subprocess module.  I don't know the 
details of the rsh launcher, but there are 3 remote hosts in the hosts file, 
and 3 sets of the error messages below.  Maybe the rsh launcher is getting 
confused, doing something that is only valid under bash even though my default 
login environment is /bin/csh.  

mpirun --machinefile mpihosts.914 -np 48 -x LD_LIBRARY_PATH --mca 
orte_rsh_agent /usr/bin/rsh  solver_openmpi  -i flow.inp >& output

% cat output

/bin/.: Permission denied.
OPAL_PREFIX=/apps/local/test/openmpi: Command not found.
export: Command not found.
PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
 Command not found.
export: Command not found.
LD_LIBRARY_PATH: Undefined variable.
/bin/.: Permission denied.
OPAL_PREFIX=/apps/local/test/openmpi: Command not found.
export: Command not found.
PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
 Command not found.
export: Command not found.
LD_LIBRARY_PATH: Undefined variable.
/bin/.: Permission denied.
OPAL_PREFIX=/apps/local/test/openmpi: Command not found.
export: Command not found.
PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
 Command not found.
export: Command not found.
LD_LIBRARY_PATH: Undefined variable.

-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Noam Bernstein
Sent: Monday, April 07, 2014 3:41 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs 
with OpenMPI 1.6.5 rsh


On Apr 7, 2014, at 4:36 PM, Blosch, Edwin L  wrote:

> I guess this is not OpenMPI related anymore.  I can repeat the essential 
> problem interactively:
> 
> % echo $SHELL
> /bin/csh
> 
> % echo $SHLVL
> 1
> 
> % cat hello
> echo Hello
> 
> % /bin/bash hello
> Hello
> 
> % /bin/csh hello
> Hello
> 
> %  . hello
> /bin/.: Permission denied

. is a bash builtin that evaluates the contents of the file in the current 
shell.  Since you're running csh, it's just looking for an executable named ., 
which does not exist (the csh analog of bash's . is source). /bin/. _is_ in 
your path, but it's a directory (namely /bin itself), which cannot be executed, 
hence the error. Perhaps you meant to do
   ./hello
which means (in both bash and csh) run the script hello in the current working 
directory (.), rather than looking for it in the list of directories in $PATH.
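The distinction can be sketched in bash alone (the file name hello and the 
scratch directory here are illustrative):

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'echo Hello\n' > hello

. ./hello      # "." sources the file: it runs in the current shell,
               # no execute permission required
bash hello     # an explicit interpreter also needs no execute bit
chmod +x hello
./hello        # a relative path runs the file directly; this one
               # does need the execute bit
```

In csh the first line would be "source ./hello" instead; a bare ". hello" 
makes csh search $PATH for a command literally named ".", which is exactly 
the /bin/. error above.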


Noam
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Ralph Castain
I doubt that the rsh launcher is getting confused by the cmd you show below. 
However, if that command is embedded in a script that changes the shell away 
from your default shell, then yes - it might get confused. When the rsh 
launcher spawns your remote orted, it attempts to set some envars to ensure 
things are correctly set up (e.g., that the path is right). Thus, it needs to 
know what the remote shell is going to be.

If given no other direction, it assumes that both the remote shell and your 
current shell are your default shell as reported by getpwuid (if available - 
otherwise, it falls back to the SHELL envar). If the remote shell can be 
something different, then you need to set the "plm_rsh_assume_same_shell=0" MCA 
param so it will check the remote shell.
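Concretely, that param can go on the command line or into an MCA parameter 
file (the hostfile and application names below are placeholders, not the 
original command):

```shell
# On the command line:
mpirun --mca plm_rsh_assume_same_shell 0 \
       --machinefile mpihosts -np 48 ./solver_openmpi -i flow.inp

# Or persistently, for every run, as a config fragment in
# $HOME/.openmpi/mca-params.conf:
#   plm_rsh_assume_same_shell = 0
```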





Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Blosch, Edwin L
That worked!

But still a mystery.

I tried printing the environment immediately before mpirun.  Inside the Python 
wrapper, I do os.system('env') immediately before the subprocess.Popen( 
['mpirun', ...], shell=False ) command.  This returns SHELL=/bin/csh, and I 
can confirm that getpwuid, if it works, would also have returned /bin/csh, as 
that is my default shell.
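One detail worth noting: with shell=False, subprocess passes the argv list 
straight to exec, so no local shell (bash or csh) is interposed at all - 
consistent with the bad syntax coming from the remote side. A quick generic 
check (an illustration, not the actual wrapper):

```python
import subprocess

# shell=False (the default for a list argv) means no shell expansion:
# "$SHELL" reaches echo as a literal string, showing that no local
# shell is involved in launching the child process.
result = subprocess.run(["echo", "$SHELL"], capture_output=True, text=True)
print(result.stdout.strip())  # prints the literal text: $SHELL
```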

It is also interesting that it does not matter whether the job-submission 
script is #!/bin/bash or #!/bin/tcsh (properly re-written, of course) -- I get 
the same errors either way. 

So why did the launcher use bash syntax on the remote host?  It does not seem 
to be behaving exactly as you described.

But telling it to check the remote shell did the trick.

Thanks



Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs with OpenMPI 1.6.5 rsh

2014-04-07 Thread Ralph Castain

On Apr 7, 2014, at 2:35 PM, Blosch, Edwin L  wrote:

> That worked!
> 
> But still a mystery.
> 
> I tried printing the environment immediately before mpirun.  Inside the 
> Python wrapper, I do os.system('env') immediately before the 
> subprocess.Popen( ['mpirun', ...], shell=False ) command.  This returns 
> SHELL=/bin/csh, and I can confirm that getpwuid, if it works, would also have 
> returned /bin/csh, as that is my default shell.
> 
> It is also interesting that it does not matter if the job-submission script 
> is #!/bin/bash or #!/bin/tcsh (properly re-written, of course) -- I get the 
> same errors either way. 
> 
> So why did the launcher use a bash syntax on the remote host?  It does not 
> seem to be behaving exactly as you described.

It is odd - my best guess is that something in the code incorrectly picks up 
on the local bash shell and uses it instead of csh. Perhaps getpwuid isn't 
available on your system?

Otherwise, I'll have to check, as it is possible the configure logic is 
incorrectly defaulting to "no getpwuid", which would result in us picking up 
your local shell as bash.
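An easy way to see what getpwuid would report, versus the SHELL fallback, is 
a two-line check (a sketch in Python, which exposes getpwuid via the pwd 
module):

```python
import os
import pwd

# What the launcher prefers: the passwd-database shell for this uid.
print("getpwuid shell:", pwd.getpwuid(os.getuid()).pw_shell)
# What it falls back to if getpwuid is unavailable.
print("SHELL envar  :", os.environ.get("SHELL", "(unset)"))
```

If the two lines disagree, the fallback path would explain the bash-vs-csh 
mismatch.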


> 
> But telling it to check the remote shell did the trick.
> 
> Thanks
> 
> 
> -Original Message-
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Monday, April 07, 2014 4:12 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching 
> jobs with OpenMPI 1.6.5 rsh
> 
> I doubt that the rsh launcher is getting confused by the cmd you show below. 
> However, if that command is embedded in a script that changes the shell away 
> from your default shell, then yes - it might get confused. When the rsh 
> launcher spawns your remote orted, it attempts to set some envars to ensure 
> things are correctly setup (e.g., that the path is right). Thus, it needs to 
> know what the remove shell is going to be.
> 
> If given no other direction, it assumes that both the remote shell and your 
> current shell are your default shell as reported by getpwuid (if available - 
> otherwise, it falls back to the SHELL envar). If the remote shell can be 
> something different, then you need to set the "plm_rsh_assume_same_shell=0" 
> MCA param so it will check the remote shell.
> 
> 
> On Apr 7, 2014, at 1:53 PM, Blosch, Edwin L  wrote:
> 
>> Thanks Noam, that makes sense.
>> 
>> Yes, I did mean to do ". hello" (with space in between).  That was an 
>> attempt to replicate whatever OpenMPI is doing.  
>> 
>> In the first post I mentioned that my mpirun command actually gets executed 
>> from within a Python script using the subprocess module.  I don't know the 
>> details of the rsh launcher, but there are 3 remote hosts in the hosts file, 
>> and 3 sets of the error messages below.  May be the rsh launcher is getting 
>> confused, doing something that is only valid under bash even though my 
>> default login environment is /bin/csh.  
>> 
>> mpirun --machinefile mpihosts.914 -np 48 -x LD_LIBRARY_PATH --mca 
>> orte_rsh_agent /usr/bin/rsh  solver_openmpi  -i flow.inp >& output
>> 
>> % cat output
>> 
>> /bin/.: Permission denied.
>> OPAL_PREFIX=/apps/local/test/openmpi: Command not found.
>> export: Command not found.
>> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>>  Command not found.
>> export: Command not found.
>> LD_LIBRARY_PATH: Undefined variable.
>> /bin/.: Permission denied.
>> OPAL_PREFIX=/apps/local/test/openmpi: Command not found.
>> export: Command not found.
>> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>>  Command not found.
>> export: Command not found.
>> LD_LIBRARY_PATH: Undefined variable.
>> /bin/.: Permission denied.
>> OPAL_PREFIX=/apps/local/test/openmpi: Command not found.
>> export: Command not found.
>> PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd:
>>  Command not found.
>> export: Command not found.
>> LD_LIBRARY_PATH: Undefined variable.
>> 
>> -Original Message-
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Noam Bernstein
>> Sent: Monday, April 07, 2014 3:41 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching 
>> jobs with OpenMPI 1.6.5 rsh
>> 
>> 
>> On Apr 7, 2014, at 4:36 PM, Blosch, Edwin L  wrote:
>> 
>>> I guess this is not OpenMPI related anymore.  I can repeat the essential 
>>> problem interactively:
>>> 
>>> % echo $SHELL
>>> /bin/csh
>>> 
>>> % echo $SHLVL
>>> 1
>>> 
>>> % cat hello
>>> echo Hello
>>> 
>>> % /bin/bash hello
>>> Hello
>>> 
>>> % /bin/csh hello
>>> Hello
>>> 
>>> %  . hello
>>> /bin/.: Permission denied
>> 
>> . is a bash internal which evaluates the contents of the file in the current 
>> shell.  Since you're running csh, it's j

Re: [OMPI users] openmpi query

2014-04-07 Thread Jeff Squyres (jsquyres)
Open MPI 1.4.3 is *ancient*.  Please upgrade -- we just released Open MPI 1.8 
last week.

Also, please look at this FAQ entry -- it walks you through a lot of basic 
troubleshooting for getting basic MPI programs working.  

http://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems

Once you get basic MPI programs working, then try with MPI Blast.



On Apr 5, 2014, at 3:11 AM, Nisha Dhankher -M.Tech(CSE) 
 wrote:

> Mpirun --mca btl ^openib --mca btl_tcp_if_include eth0  -np 16  -machinefile 
> mf mpiblast -d all.fas -p blastn -i query.fas -o out.txt
> 
> was the command i executed on cluster...
>  
> 
> 
> On Sat, Apr 5, 2014 at 12:34 PM, Nisha Dhankher -M.Tech(CSE) 
>  wrote:
> Sorry Ralph, my mistake, it's not "names"... it is "it does not happen on 
> the same nodes."
> 
> 
> On Sat, Apr 5, 2014 at 12:33 PM, Nisha Dhankher -M.Tech(CSE) 
>  wrote:
> same vm on all machines that is virt-manager
> 
> 
> On Sat, Apr 5, 2014 at 12:32 PM, Nisha Dhankher -M.Tech(CSE) 
>  wrote:
> opmpi version 1.4.3
> 
> 
> On Fri, Apr 4, 2014 at 8:13 PM, Ralph Castain  wrote:
> Okay, so if you run mpiBlast on all the non-name nodes, everything is okay? 
> What do you mean by "names nodes"?
> 
> 
> On Apr 4, 2014, at 7:32 AM, Nisha Dhankher -M.Tech(CSE) 
>  wrote:
> 
>> no it does not happen on names nodes 
>> 
>> 
>> On Fri, Apr 4, 2014 at 7:51 PM, Ralph Castain  wrote:
>> Hi Nisha
>> 
>> I'm sorry if my questions appear abrasive - I'm just a little frustrated at 
>> the communication bottleneck as I can't seem to get a clear picture of your 
>> situation. So you really don't need to keep calling me "sir" :-)
>> 
>> The error you are hitting is very unusual - it means that the processes are 
>> able to make a connection, but are failing to correctly complete a simple 
>> handshake exchange of their process identifications. There are only a few 
>> ways that can happen, and I'm trying to get you to test for them.
>> 
>> So let's try and see if we can narrow this down. You mention that it works 
>> on some machines, but not all. Is this consistent - i.e., is it always the 
>> same machines that work, and the same ones that generate the error? If you 
>> exclude the ones that show the error, does it work? If so, what is different 
>> about those nodes? Are they a different architecture?
>> 
>> 
>> On Apr 3, 2014, at 11:09 PM, Nisha Dhankher -M.Tech(CSE) 
>>  wrote:
>> 
>>> Sir,
>>> the same virt-manager is being used by all PCs. No, I didn't enable 
>>> openmpi-hetero. Yes, the Open MPI version is the same everywhere, through 
>>> the same kickstart file.
>>> Actually, sir, Rocks itself installed and configured Open MPI and MPICH on 
>>> its own through the HPC roll.
>>> 
>>> 
>>> On Fri, Apr 4, 2014 at 9:25 AM, Ralph Castain  wrote:
>>> 
>>> On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE) 
>>>  wrote:
>>> 
 Thank you, Ralph.
 Yes, the cluster is heterogeneous...
>>> 
>>> And did you configure OMPI --enable-heterogeneous? And are you running it 
>>> with ---hetero-nodes? What version of OMPI are you using anyway?
>>> 
>>> Note that we don't care if the host pc's are hetero - what we care about is 
>>> the VM. If all the VMs are the same, then it shouldn't matter. However, 
>>> most VM technologies don't handle hetero hardware very well - i.e., you 
>>> can't emulate an x86 architecture on top of a Sparc or Power chip or vice 
>>> versa.
>>> 
>>> 
 And I haven't made compute nodes directly on the physical nodes (PCs) 
 because in college it is not possible to take the whole lab of 32 PCs for 
 your work, so I ran on VMs.
>>> 
>>> Yes, but at least it would let you test the setup to run MPI across even a 
>>> couple of pc's - this is simple debugging practice.
>>> 
 In a Rocks cluster, the frontend gives the same kickstart to all the PCs, 
 so the Open MPI version should be the same, I guess.
>>> 
>>> Guess? or know? Makes a difference - might be worth testing.
>>> 
 Sir,
 mpiformatdb is a command to distribute database fragments to different 
 compute nodes after partitioning of the database.
 And sir, have you done mpiblast?
>>> 
>>> Nope - but that isn't the issue, is it? The issue is with the MPI setup.
>>> 
 
 
 On Fri, Apr 4, 2014 at 4:48 AM, Ralph Castain  wrote:
 What is "mpiformatdb"? We don't have an MPI database in our system, and I 
 have no idea what that command means
 
 As for that error - it means that the identifier we exchange between 
 processes is failing to be recognized. This could mean a couple of things:
 
 1. the OMPI version on the two ends is different - could be you aren't 
 getting the right paths set on the various machines
 
 2. the cluster is heterogeneous
 
 You say you have "virtual nodes" running on various PC's? That would be an 
 unusual setup - VM's can be problematic given the way they handle TCP 
 connections, so that might be another source of the problem if my 
 understanding of your setup is correct. Have you tried runn

Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6

2014-04-07 Thread Jeff Squyres (jsquyres)
Per Dave's comment: note that running autogen.pl (or autogen.sh -- they're 
symlinks to the same thing) is *only* necessary for SVN/hg/git checkouts of 
Open MPI.  

You should *not* run autogen.pl in an expanded Open MPI tarball unless you 
really know what you're doing (e.g., you made a change to Open MPI that 
requires re-generating the GNU Autotools scripts).  Meaning: if you are 
expanding an Open MPI tarball, there's usually no reason to run autogen.pl -- 
just run configure and make.

Sidenote: if you don't run autogen.pl, then it doesn't matter what version of 
GNU Autotools you have installed.  The Open MPI tarballs are pre-bootstrapped 
with the Right versions of the GNU Autotools so that end users don't have to 
worry about this kind of junk.  Having the Right versions of GNU Autotools is 
generally only an issue for Open MPI developers.



On Apr 2, 2014, at 5:12 PM, Tru Huynh  wrote:

> On Tue, Apr 01, 2014 at 03:26:00PM +, Blosch, Edwin L wrote:
>> I am getting some errors building 1.8 on RHEL6.  I tried autoreconf as
>> suggested, but it failed for the same reason.  Is there a minimum
>> version of m4 required that is newer than that provided by RHEL6?
>> 
> What kind of errors? I have built on CentOS-5 (5.10) and CentOS-6 (6.5) 
> x86_64 without any issue.
> tar xjf openmpi-1.8.tar.bz2
> cd openmpi-1.8/
> ./configure --prefix=/c6/shared/openmpi/1.8
> nice make -j 8 && make check && make install
> 
> Tru
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6

2014-04-07 Thread Blosch, Edwin L
Sorry for the confusion.  I am not building OpenMPI from the SVN source.  I 
downloaded the 1.8 tarball and ran configure, and that is what failed.  I was 
surprised that it didn't work on a vanilla Red Hat Enterprise Linux 6, 
out-of-the-box operating system installation.   

The error message suggested that I try autoreconf, so I tried it.  

I can try the autogen.sh script and see if that fixes it, but I'm noticing 
another thread right now where Jeff is saying that shouldn't be necessary.

-Original Message-
From: Dave Goodell (dgoodell) [mailto:dgood...@cisco.com] 
Sent: Tuesday, April 01, 2014 11:20 AM
To: Open MPI Users
Subject: Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6

On Apr 1, 2014, at 10:26 AM, "Blosch, Edwin L"  wrote:

> I am getting some errors building 1.8 on RHEL6.  I tried autoreconf as 
> suggested, but it failed for the same reason.  Is there a minimum version of 
> m4 required that is newer than that provided by RHEL6?

Don't run "autoreconf" by hand, make sure to run the "./autogen.sh" script that 
is packaged with OMPI.  It will also check your versions and warn you if they 
are out of date.

Do you need to build OMPI from the SVN source?  Or would a (pre-autogen'ed) 
release tarball work for you?

-Dave





Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6

2014-04-07 Thread Jeff Squyres (jsquyres)
On Apr 7, 2014, at 6:47 PM, "Blosch, Edwin L"  wrote:

> Sorry for the confusion.  I am not building OpenMPI from the SVN source.  I 
> downloaded the 1.8 tarball and ran configure, and that is what failed.  I was 
> surprised that it didn't work on a vanilla Red Hat Enterprise Linux 6, 
> out-of-the-box operating system installation.   

It should -- if it doesn't, please send us the info listed here:

http://www.open-mpi.org/community/help/

> The error message suggested that I try autoreconf, so I tried it.  

That may have been a generic GNU Autotools message.  Blech.  I hope we (OMPI) 
don't recommend that an end user run autogen.  :-)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/