Re: [OMPI devel] Is it possible to get BTL transport work directly with MPI level

2007-04-15 Thread Jeff Squyres
This is unfortunately not enough information to provide any help --  
the (lots of output) parts are pretty important.  Can you provide all  
the information cited here:


http://www.open-mpi.org/community/help/


On Apr 14, 2007, at 11:36 PM, po...@cc.gatech.edu wrote:


Hi!!!
Thanks for the help!!!

Right now I am just trying to install the normal Open MPI (without
using all the development header files).
But it is still giving me some errors.
I have downloaded the developer version from the openmpi.org site.
Then I gave:

./configure --prefix=/net/hc293/pooja/dev_openmpi
(lots of output)
make all install
(lots of output)

and got the error: ld returned 1 exit status
make[2]: *** [libopen-pal.la] Error 1
make[2]: Leaving directory `/net/hc293/pooja/openmpi-1.2.1a0r14362-dev/opal'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/net/hc293/pooja/openmpi-1.2.1a0r14362-dev/opal'
make: *** [all-recursive] Error 1



Also the dev_openmpi folder is empty.

So I am not able to compile the normal ring_c.c example either.

Please help

Thanks and Regards
Pooja


Configure with the --with-devel-headers switch.  This will install
all the developer headers.

If you care, check out "./configure --help" -- that shows all the
options available to the configure script (including --with-devel-
headers).
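
For example, reusing the prefix from your earlier mail:

./configure --prefix=/net/hc293/pooja/dev_openmpi --with-devel-headers
make all install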


On Apr 13, 2007, at 7:36 PM, po...@cc.gatech.edu wrote:


Hi

I have downloaded the developer version of the source code by
downloading a nightly Subversion snapshot tarball, and have installed
Open MPI using:

./configure --prefix=/usr/local
make all install

But I want to install with all the development headers, so that I can
write an application that uses OMPI's internal headers.


Thanks and Regards
Pooja


On Apr 1, 2007, at 3:12 PM, Ralph Castain wrote:


I can't help you with the BTL question. On the others:


Yes, you can "sorta" call BTL's directly from application programs
(are you trying to use MPI alongside other communication libraries,
and using the BTL components as a sample?), but there are issues
involved with this.

First, you need to install Open MPI with all the development headers.
Open MPI normally only installs "mpi.h" and a small number of other
headers; installing *all* the headers will allow you to write
applications that use OMPI's internal headers (such as btl.h) while
developing outside of the Open MPI source tree.

Second, you probably won't want to access the BTL's directly.  To
make this make sense, here's how the code is organized (even if the
specific call sequence is not exactly this layered for performance/
optimization reasons):

MPI layer (e.g., MPI_SEND)
  -> PML
-> BML
  -> BTL

You have two choices:

1. Go through the PML instead (this is what we do in the MPI
collectives, for example) -- but this imposes MPI semantics on
sending and receiving, which presumably you are trying to avoid.
Check out ompi/mca/pml/pml.h.
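
As a rough illustration, a send routed through the PML looks
something like the sketch below (a sketch only: the header paths and
the MCA_PML_CALL() macro are from the OMPI tree of this vintage and
may differ in other versions):

#include "ompi/communicator/communicator.h"
#include "ompi/datatype/datatype.h"
#include "ompi/mca/pml/pml.h"

/* Sketch: a point-to-point send routed through the PML, the way the
 * MPI collectives do it.  MCA_PML_CALL() dispatches to whichever PML
 * module (e.g., ob1) was selected at MPI_Init time.  Note the
 * MPI-like arguments: the PML imposes MPI matching and ordering
 * semantics on the transfer. */
static int send_through_pml(void *buf, size_t count,
                            ompi_datatype_t *dtype, int dst, int tag,
                            ompi_communicator_t *comm)
{
    return MCA_PML_CALL(send(buf, count, dtype, dst, tag,
                             MCA_PML_BASE_SEND_STANDARD, comm));
}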

2. Go through the BML instead -- the BTL Management Layer.  This is
essentially a multiplexor for all the BTLs that have been
instantiated.  I'm guessing that this is what you want to do
(remember that OMPI has true multi-device support; using the BML and
multiple BTLs is one of the ways that we do this).  Have a look at
ompi/mca/bml/bml.h for the interface.

There is also currently no mechanism to get the BML and BTL pointers
that were instantiated by the PML.  However, if you're just doing
proof-of-concept code, you can extract these directly from the MPI
layer's global variables to see how this stuff works.
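
For instance, a proof-of-concept hack along these lines can work (a
sketch only: proc_bml, btl_eager, and
mca_bml_base_btl_array_get_next() are taken from the 1.2-era tree and
may differ in your checkout):

#include <stdlib.h>
#include "ompi/proc/proc.h"
#include "ompi/mca/bml/bml.h"

/* Sketch: after MPI_Init, dig out the BML endpoint that the PML
 * cached on a peer process and pick one of its eager-capable BTLs. */
static struct mca_btl_base_module_t *peer_eager_btl(int world_rank)
{
    size_t nprocs;
    ompi_proc_t **procs = ompi_proc_world(&nprocs);
    mca_bml_base_endpoint_t *ep;
    mca_bml_base_btl_t *bml_btl;

    if (NULL == procs || (size_t) world_rank >= nprocs) {
        return NULL;
    }
    ep = (mca_bml_base_endpoint_t *) procs[world_rank]->proc_bml;
    free(procs);   /* ompi_proc_world() hands back an allocated array */
    if (NULL == ep) {
        return NULL;
    }
    /* round-robins over the BTLs eligible for eager sends */
    bml_btl = mca_bml_base_btl_array_get_next(&ep->btl_eager);
    return (NULL == bml_btl) ? NULL : bml_btl->btl;
}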

To have full interoperability of the underlying BTLs and between
multiple upper-layer communication libraries (e.g., between OMPI and
something else) is something that we have talked about a little, but
have not done much work on.

To see the BTL interface (just for completeness), see
ompi/mca/btl/btl.h.

You can probably see the pattern here...  In all of Open MPI's
frameworks, the public interface is in
<project>/mca/<framework>/<framework>.h, where <project> is one of
opal, orte, or ompi, and <framework> is the name of the framework.
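For example, the three interfaces mentioned in this mail all follow
that pattern: ompi/mca/pml/pml.h, ompi/mca/bml/bml.h, and
ompi/mca/btl/btl.h.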

1. States are reported via the orte/mca/smr framework. You will see
the states listed in orte/mca/smr/smr_types.h. We track both process
and job states. Hopefully, the state names will be somewhat
self-explanatory and indicative of the order in which they are
traversed. The job states are set when *all* of the processes in the
job reach the corresponding state.


Note that these are very coarse-grained process-level states (e.g.,
is a given process running or not?).  It's not clear what kind of
states you were asking about -- the Open MPI code base has many
internal state machines for various message passing and other
mechanisms.

What information are you looking for, specifically?

2. I'm not sure what you mean by mapping MPI processes to "physical"
processes, but I assume you mean how do we assign MPI ranks to
processes on specific nodes. You will find that done in the
orte/mca/rmaps framework. We currently only have one component

Re: [OMPI devel] Is it possible to get BTL transport work directly with MPI level

2007-04-15 Thread pooja
Hi!!
Thanks for the reply. Actually there was some problem with my
downloaded version of Open MPI. But when I downloaded everything
again and re-ran all the configure and make steps, it worked fine.

Thanks a lot.
And next time I will make sure that I give all the details.

Thanks
Pooja

> This is unfortunately not enough information to provide any help --
> the (lots of output) parts are pretty important.  Can you provide all
> the information cited here:
>
>  http://www.open-mpi.org/community/help/

[OMPI devel] SOS!! Run-time error

2007-04-15 Thread chaitali dherange

Hi,

I have downloaded the developer version of the source code by
downloading a nightly Subversion snapshot tarball, and have installed
Open MPI using:

./configure --prefix=/net/hc293/chaitali/openmpi_dev
(lots of output... without errors)
make all install
(lots of output... without errors)

Then I tried to run the example provided in this version of the
source code, the ring_c.c file. I first copied it to my home
directory, /net/hc293/chaitali. Now, inside my home directory, I did:

set path=($path /net.hc293/chaitali/openmpi_dev/bin)
set $LD_LIBRARY_PATH = ( /net/hc293/chaitali/dev_openmpi/lib )
mpicc -o chaitali_test ring_c.c
(This gave no errors at all)
mpirun --prefix /net/hc293/chaitali/openmpi_dev -np 3 --hostfile
/net/hc293/chaitali/machinefile ./test_chaitali
(This gave the following errors:)
[oolong:09783] *** Process received signal ***
[oolong:09783] Signal: Segmentation fault (11)
[oolong:09783] Signal code:  (128)
[oolong:09783] Failing at address: (nil)
[oolong:09783] [ 0] /lib64/tls/libpthread.so.0 [0x2a95e01430]
[oolong:09783] [ 1]
/net/hc293/chaitali/openmpi_dev/lib/libopen-pal.so.0(opal_event_init+0x166)
[0x2a957d9e16]
[oolong:09783] [ 2]
/net/hc293/chaitali/openmpi_dev/lib/libopen-rte.so.0(orte_init_stage1+0x168)
[0x2a95680638]
[oolong:09783] [ 3]
/net/hc293/chaitali/openmpi_dev/lib/libopen-rte.so.0(orte_system_init+0xa)
[0x2a9568375a]
[oolong:09783] [ 4]
/net/hc293/chaitali/openmpi_dev/lib/libopen-rte.so.0(orte_init+0x49)
[0x2a95680329]
[oolong:09783] [ 5] mpirun(orterun+0x155) [0x4029fd]
[oolong:09783] [ 6] mpirun(main+0x1b) [0x4028a3]
[oolong:09783] [ 7] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
[0x2a95f273fb]
[oolong:09783] [ 8] mpirun [0x4027fa]
[oolong:09783] *** End of error message ***
Segmentation fault

I understand that [5] and [6] are the actual errors, but I don't
understand why, or how to overcome this error.

Please find attached the following files:
- 'ring_c.c', the file which I am trying to run.
- 'config.log' from the openmpi-1.2.1a0r14362 folder.
- 'ompi_info --all.txt', the output of ompi_info --all. This contains
the above-mentioned errors.

Thanks and Regards,
Chaitali


doubt.rar
Description: Binary data


Re: [OMPI devel] SOS!! Run-time error

2007-04-15 Thread Adrian Knoth
On Sun, Apr 15, 2007 at 01:40:01PM -0400, chaitali dherange wrote:

> Hi,

Hi!

> I have downloaded the developer version of the source code by
> downloading a nightly Subversion snapshot tarball, and have
> installed Open MPI.

Things get much clearer when you compile Open MPI with
--enable-debug.
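
For example, with the same prefix as before:

./configure --prefix=/net/hc293/chaitali/openmpi_dev --enable-debug
make all install

That builds with debugging symbols, so gdb (or the backtrace itself)
has a chance of showing more than bare addresses and offsets.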


> [oolong:09783] *** Process received signal ***
> [oolong:09783] Signal: Segmentation fault (11)
> [oolong:09783] Signal code:  (128)
> [oolong:09783] Failing at address: (nil)

NULL-pointer dereference, so at least the segfault is correct ;)


HTH

-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


[OMPI devel] SOS... help needed :(

2007-04-15 Thread chaitali dherange

Hi,

Pooja and I are actually working on a course project whose main aim
is to schedule MPI and non-MPI calls, giving more priority to the MPI
calls over the non-MPI ones.

To make things simple, we are making this scheduling static to some
extent. By static I mean: we know that our clusters use InfiniBand
for MPI (from our study of the Open MPI source code, this precisely
uses mca_btl_openib_send() from the ompi/mca/btl/openib/btl_openib.c
file), so all the non-MPI communication can be assumed to be TCP
communication using mca_btl_tcp_send() from the
ompi/mca/btl/tcp/btl_tcp.c file.


To implement this, we plan to implement the following simple
algorithm (see the sketch after this list):

- before calling mca_btl_openib_send(): lock0(x);
- before calling mca_btl_tcp_send(): lock1(x);

Algorithm:

1. Allow lock0(x) -> lock0(x), meaning lock0(x) may be followed by
another lock0(x).
2. Allow lock1(x) -> lock1(x).
3. Do not allow lock0(x) -> lock1(x).
4. If lock1(x) -> lock0(x): since MPI calls are to have higher
priority over the non-MPI ones, in this case the non-MPI
communication should be paused and all the related data of course
needs to be put into a queue (meaning the status of this send should
be saved in a queue). All other non-MPI communications newer than
this one should also be added to this same queue. Now the MPI process
trying to perform lock0(x) should be allowed to complete, and only
when all the MPI communications are complete should the non-MPI
communication be allowed to proceed.
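
As a starting point, here is a minimal pthreads sketch of such a gate
(our own illustration, not Open MPI code): lock0()/unlock0() would
bracket mca_btl_openib_send(), and lock1()/unlock1() would bracket
mca_btl_tcp_send(). Note that it only defers *new* TCP sends; pausing
a TCP send that is already in flight (the hard part of rule 4) would
need cooperation from the TCP BTL itself.

#include <pthread.h>

typedef struct {
    pthread_mutex_t m;
    pthread_cond_t  cv;
    int mpi_active;   /* senders currently inside mca_btl_openib_send() */
} prio_gate_t;

static prio_gate_t gate = { PTHREAD_MUTEX_INITIALIZER,
                            PTHREAD_COND_INITIALIZER, 0 };

void lock0(void)    /* rule 1: MPI senders never wait, and may overlap */
{
    pthread_mutex_lock(&gate.m);
    gate.mpi_active++;
    pthread_mutex_unlock(&gate.m);
}

void unlock0(void)
{
    pthread_mutex_lock(&gate.m);
    if (0 == --gate.mpi_active) {
        /* last MPI send done: wake the queued TCP senders (rule 4) */
        pthread_cond_broadcast(&gate.cv);
    }
    pthread_mutex_unlock(&gate.m);
}

void lock1(void)    /* rules 3/4: TCP senders yield while MPI is active */
{
    pthread_mutex_lock(&gate.m);
    while (gate.mpi_active > 0) {
        pthread_cond_wait(&gate.cv, &gate.m);
    }
    pthread_mutex_unlock(&gate.m);
}

void unlock1(void)  /* rule 2: concurrent TCP sends are fine, nothing to do */
{
}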

Currently we are working on a simple scheduling algorithm, without
giving any priority to the MPI_send calls.

However, to implement the project fully, we have the following
queries :(
- Can we abort or pause the non-MPI/TCP communication in any way?
- Given the assumption that the non-MPI communication is TCP, can we
make use of the built-in structures (I mean the buffers already used)
in mca_btl_tcp_send() for the implementation of point 4 of the
above-mentioned algorithm? And more importantly, how?

Regards,
Chaitali