[OMPI devel] limit tcp fragment size?

2008-03-31 Thread Muhammad Atif
G'day
Just a quick, basic question: in the case of the TCP BTL, how do I limit the
fragment size?
I do not want MPI to send a fragment larger than, say, 16K.

If I am not mistaken, shouldn't btl_tcp_min_send_size do the trick?  If it
is supposed to, why do I see packets of length 64K?

Thanks in advance. 

Best Regards,
Muhammad Atif





  


[OMPI devel] segfault on host not found error.

2008-03-31 Thread Lenny Verkhovsky



I accidentally ran a job with a hostfile where one of the hosts was not
properly mounted. As a result I got an error and a segfault.


/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile
./mpi_p01 -t lt
bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or
directory

--
A daemon (pid 9753) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have
the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.

--

--
mpirun was unable to start the specified application as it encountered
an error.
More information may be available above.

--
[witch1:09745] *** Process received signal ***
[witch1:09745] Signal: Segmentation fault (11)
[witch1:09745] Signal code: Address not mapped (1)
[witch1:09745] Failing at address: 0x3c
[witch1:09745] [ 0] /lib64/libpthread.so.0 [0x2aff223ebc10]
[witch1:09745] [ 1]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cdfe21]
[witch1:09745] [ 2]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_rml_oob.so
[0x2aff22c398f1]
[witch1:09745] [ 3]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
[0x2aff22d426ee]
[witch1:09745] [ 4]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
[0x2aff22d433fb]
[witch1:09745] [ 5]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
[0x2aff22d4485b]
[witch1:09745] [ 6]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [ 7] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x403203]
[witch1:09745] [ 8]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [ 9]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
8b) [0x2aff21e060cb]
[witch1:09745] [10]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_trigger_eve
nt+0x20) [0x2aff21cc6940]
[witch1:09745] [11]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_wakeup+0x2d
) [0x2aff21cc776d]
[witch1:09745] [12]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
[0x2aff22b34756]
[witch1:09745] [13]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cc6ea7]
[witch1:09745] [14]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [15]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
8b) [0x2aff21e060cb]
[witch1:09745] [16]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_plm_base_da
emon_callback+0xad) [0x2aff21ce068d]
[witch1:09745] [17]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
[0x2aff22b34e5e]
[witch1:09745] [18] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x402e13]
[witch1:09745] [19] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x402873]
[witch1:09745] [20] /lib64/libc.so.6(__libc_start_main+0xf4)
[0x2aff22512154]
[witch1:09745] [21] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x4027c9]
[witch1:09745] *** End of error message ***
Segmentation fault (core dumped)


Best Regards,
Lenny.




Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-31 Thread Jeff Squyres

On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:



- I don't think we can delete the MCA param ompi_paffinity_alone; it
exists in the v1.2 series and has historical precedent.

It will not be deleted; it will just use the same infrastructure (the
slot_list parameter and opal_base functions). It will be transparent to
the user.

Users have three ways to set it up:
1.  mca opal_paffinity_alone 1
    This sets paffinity as it did before.
2.  mca opal_paffinity_slot_list "slot_list"
    Defines the slots that will be used for all ranks on all nodes.
3.  mca rmaps_rank_file_path rankfile
    Assigns ranks to CPUs according to the given rank file.
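
For concreteness, the three options above might look roughly like this on
the mpirun command line (a sketch only; the process count, application,
slot-list value, and rank-file name are placeholders):

  shell$ mpirun -mca opal_paffinity_alone 1 -np 4 ./app
  shell$ mpirun -mca opal_paffinity_slot_list "0,2" -np 4 ./app
  shell$ mpirun -mca rmaps_rank_file_path ./rankfile -np 4 ./app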



I don't see the MCA parameter "mpi_paffinity_alone" anymore:

-
[4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep  
paffinity_alone
MCA opal: parameter "opal_paffinity_alone" (current  
value: "0")

[4:54] svbu-mpi:~/svn/ompi2 %
-

My point is that I don't think we should delete this parameter; there
is historical precedent for it (and it has been documented on the web
page for a long, long time).  Perhaps it can now simply be a synonym
for opal_paffinity_alone (registered in the MPI layer, not opal).


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-31 Thread Jeff Squyres

Sorry, I missed this mail.

IIRC, the verbosity level for stream 0 is 0.  It probably would not be  
good to increase it; many places in the code use output stream 0.


Perhaps you could make a new stream with a different verbosity level  
to do what you want...?  See the docs in opal/util/output.h.
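
A rough sketch of that approach, using the opal_output_open(),
opal_output_set_verbosity(), and opal_output_verbose() calls declared in
opal/util/output.h (the stream variable, verbosity level, and message text
below are made-up placeholders, not code from the patch):

#include "opal/util/output.h"

static int slot_list_output = -1;   /* hypothetical dedicated stream */

static void open_slot_list_stream(void)
{
    /* NULL asks for a stream with default settings (output to stderr). */
    slot_list_output = opal_output_open(NULL);

    /* Messages at level <= 10 on this stream will be printed, without
     * touching the verbosity of the shared stream 0. */
    opal_output_set_verbosity(slot_list_output, 10);

    opal_output_verbose(1, slot_list_output,
                        "opal_paffinity_base_slot_list_set ver=%d", 1);
}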



On Mar 27, 2008, at 8:12 AM, Lenny Verkhovsky wrote:

No, I just tried to see some printouts during the run. I use the
following in the code:

opal_output_verbose(0, 0, "LNY100 opal_paffinity_base_slot_list_set ver=%d ", 0);
opal_output_verbose(1, 0, "LNY101 opal_paffinity_base_slot_list_set ver=%d ", 1);
OPAL_OUTPUT_VERBOSE((1, 0, "VERBOSE LNY102 opal_paffinity_base_slot_list_set ver=%d ", 1));

but all I see is the first line (since I used level 0).
I suppose that to see the second line I must configure with
--enable-debug, but that is not working for me either.




On Thu, Mar 27, 2008 at 2:02 PM, Jeff Squyres   
wrote:

Are you using BTL_OUTPUT or something else from btl_base_error.h?


On Mar 27, 2008, at 7:49 AM, Lenny Verkhovsky wrote:
> Hi,
> thanks for the comments. I will definitely implement all of them and
> commit the code as soon as I am finished.
>
> Also, I am experiencing a few problems with using opal_output_verbose;
> either there is a bug or I am doing something wrong.
>
>
> /home/USERS/lenny/OMPI_ORTE_DEBUG/bin/mpirun -mca mca_verbose 0 -mca
> paffinity_base_verbose 1 --byslot -np 2 -hostfile hostfile -mca
> btl_openib_max_lmc 1  -mca opal_paffinity_alone 1 -mca
> btl_openib_verbose 1  /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug - 
t lt

>
>
> /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug: symbol lookup error: /
> home/USERS/lenny/OMPI_ORTE_DEBUG//lib/openmpi/mca_btl_openib.so:
> undefined symbol: mca_btl_base_out
> /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug: symbol lookup error: /
> home/USERS/lenny/OMPI_ORTE_DEBUG//lib/openmpi/mca_btl_openib.so:
> undefined symbol: mca_btl_base_out
>  
--

> mpirun has exited due to process rank 1 with PID 5896 on
> node witch17 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
>
> On Wed, Mar 26, 2008 at 2:50 PM, Ralph H Castain   
wrote:

> I would tend to echo Tim's suggestions. I note that you do lookup
> that opal
> mca param in orte as well. I know you sent me a note about that off-
> list - I
> apologize for not getting to it yet, but was swamped yesterday.
>
> I think the solution suggested in #1 below is the right approach.
> Looking up
> opal params in orte or ompi is probably not a good idea. We have had
> problems in the past where params were looked up in multiple  
places as

> people -do- sometimes change the names (ahem...).
>
> Also, I would suggest using the macro version of verbose
> OPAL_OUTPUT_VERBOSE
> so that it compiles out for non-debug builds - up to you. Many of us
> use it
> as we don't need the output from optimized builds.
>
> Other than that, I think this looks fine. I do truly appreciate the
> cleanup
> of ompi_mpi_init.
>
> Ralph
>
>
>
> On 3/26/08 6:09 AM, "Tim Prins"  wrote:
>
> > Hi Lenny,
> >
> > This looks good. But I have a couple of suggestions (which others
> may
> > disagree with):
> >
> > 1. You register an opal mca parameter, but look it up in ompi, then
> > call an opal function with the result. What if you had a function
> > opal_paffinity_base_set_slots(long rank) (or some other name, I don't
> > care) which looked up the mca parameter and then set up the slots as
> > you are doing, if it is found. This would make things a bit cleaner
> > IMHO.

> >
> > 2. the functions in the paffinity base should be prefixed with
> > 'opal_paffinity_base_'
> >
> > 3. Why was the ompi_debug_flag added? It is not used anywhere.
> >
> > 4. You probably do not need to add the opal debug flag. There is
> already
> > a 'paffinity_base_verbose' flag which should suit your purposes
> fine. So
> > you should just be able to replace all of the conditional output
> > statements in paffinity with something like
> > opal_output_verbose(10, opal_paffinity_base_output, ...),
> > where 10 is the verbosity level number.
> >
> > Tim
> >
> >
> > Lenny Verkhovsky wrote:
> >>
> >>
> >> Hi, all
> >>
> >> Attached patch for modified Rank_File RMAPS component.
> >>
> >>
> >>
> >> 1.introduced new general purpose debug flags
> >>
> >>   mpi_debug
> >>
> >>   opal_debug
> >>
> >>
> >>
> >> 2.introduced new mca parameter opal_paffinity_slot_list
> >>
> >> 3.ompi_mpi_init cleaned from opal paffinity functions
> >>
> >> 4.opal paffinity functions moved to new file
> >> opal/mca/paffinity/base/paffinity_base_service.c
> >>
> >> 5.rank_file component files were renamed according to prefix
> policy
> >>
> >> 6.global variables renamed as well.
> >>
> >> 7.few bug fixes that were brought during previous  
discussions.

> >>
> >> 8.If user defines opal_paffinity_a

Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-31 Thread Terry Dontje

Jeff Squyres wrote:

On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
  

- I don't think we can delete the MCA param ompi_paffinity_alone; it
exists in the v1.2 series and has historical precedent.
  

It will not be deleted; it will just use the same infrastructure (the
slot_list parameter and opal_base functions). It will be transparent to
the user.

Users have three ways to set it up:
1.  mca opal_paffinity_alone 1
    This sets paffinity as it did before.
2.  mca opal_paffinity_slot_list "slot_list"
    Defines the slots that will be used for all ranks on all nodes.
3.  mca rmaps_rank_file_path rankfile
    Assigns ranks to CPUs according to the given rank file.




I don't see the MCA parameter "mpi_paffinity_alone" anymore:

-
[4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep  
paffinity_alone
 MCA opal: parameter "opal_paffinity_alone" (current  
value: "0")

[4:54] svbu-mpi:~/svn/ompi2 %
-

My point is that I don't think we should delete this parameter; there
is historical precedent for it (and it has been documented on the web
page for a long, long time).  Perhaps it can now simply be a synonym
for opal_paffinity_alone (registered in the MPI layer, not opal).


  
I agree with Jeff on the above.  This would cause a lot of busy work for 
our customers and internal setups.


--td


Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-31 Thread Lenny Verkhovsky
OK, 
I am putting it back.



> -Original Message-
> From: terry.don...@sun.com [mailto:terry.don...@sun.com]
> Sent: Monday, March 31, 2008 2:59 PM
> To: Open MPI Developers
> Cc: Lenny Verkhovsky; Sharon Melamed
> Subject: Re: [OMPI devel] RMAPS rank_file component patch and
> modifications for review
> 
> Jeff Squyres wrote:
> > On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
> >
> >>> - I don't think we can delete the MCA param ompi_paffinity_alone;
it
> >>> exists in the v1.2 series and has historical precedent.
> >>>
> >> It will not be deleted,
> >> It will just use the same infrastructure ( slot_list parameter and
> >> opal_base functions ). It will be transparent for the user.
> >>
> >> User have 3 ways to setup it
> >> 1. mca opal_paffinity_alone 1
> >>This will set paffinity as it did before
> >> 2. mca opal_paffinity_slot_list "slot_list"
> >>Used to define slots that will be used for all ranks on all
> >> nodes.
> >> 3. mca rmaps_rank_file_path rankfile
> >>Assigning ranks to CPUs according to the file
> >>
> >
> >
> > I don't see the MCA parameter "mpi_paffinity_alone" anymore:
> >
> > -
> > [4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep
> > paffinity_alone
> >  MCA opal: parameter "opal_paffinity_alone" (current
> > value: "0")
> > [4:54] svbu-mpi:~/svn/ompi2 %
> > -
> >
> > My point is that I don't think we should delete this parameter;
there
> > is historical precedence for it (and it has been documented on the
web
> > page for a long, long time).  Perhaps it can now simply be a synonym
> > for opal_paffinity_alone (registered in the MPI layer, not opal).
> >
> >
> I agree with Jeff on the above.  This would cause a lot of busy work
for
> our customers and internal setups.
> 
> --td



Re: [OMPI devel] Scalability of openib modex

2008-03-31 Thread Ralph H Castain
Thanks Jeff. It appears to me that the first approach to reducing modex data
makes the most sense and has the largest impact - I would advocate pursuing
it first. We can look at further refinements later.

Along that line, one thing we also exchange in the modex (not IB specific)
is hostname and arch. This is in the ompi/proc/proc.c code. It seems to me
that this is also wasteful and can be removed. The daemons already have that
info for the job and can easily "drop" it into each proc - there is no
reason to send it around.

I'll take a look at cleaning that up, ensuring we don't "break" daemonless
environments, along with the other things underway.

Ralph



On 3/28/08 11:37 AM, "Jeff Squyres"  wrote:

> I've had this conversation independently with several people now, so
> I'm sending it to the list rather than continuing to have the same
> conversation over and over.  :-)
> 
> --
> 
> As most of you know, Jon and I are working on the new openib
> "CPC" (connect pseudo-component) stuff in /tmp-public/openib-cpc2.
> There are two main reasons for it:
> 
> 1. Add support for RDMA CM (they need it for iWarp support)
> 2. Add support for IB CM (which will hopefully be a more scalable
> connect system as compared to the current RML/OOB-based method of
> making IB QPs)
> 
> When complete, there will be 4 CPCs: RDMA CM, IB CM, OOB, and XOOB
> (same as OOB but with ConnectX XRC extensions).
> 
> RDMA CM has some known scaling issues, and at least some known
> workarounds -- I won't discuss the merits/drawbacks of RDMA CM here.
> IB CM has unknown scaling characteristics, but seems to look good on
> paper (e.g., it uses UD for a 3-way handshake to make an IB QP).
> 
> On the trunk, it's a per-MPI process decision as to which CPC you'll
> use.  Per ticket #1191, one of the goals of the /tmp-public branch is
> to make CPC decision be a per-openib-BTL-module decision.  So you can
> mix iWarp and IB hardware in a single host, for example.  This fits in
> quite well with the "mpirun should work out of the box" philosophy of
> Open MPI.
> 
> In the openib BTL, each BTL module is paired with a specific HCA/NIC
> (verbs) port.  And depending on the interface hardware and software,
> one or more CPCs may be available for each.  Hence, for each BTL
> module in each MPI process, we may send one or more CPC connect
> information blobs in the modex (note that the oob and xoob CPCs don't
> need to send anything additional in the modex).
> 
> Jon and I are actually getting closer to completion on the branch, and
> it seems to be working.
> 
> In conjunction with several other scalability discussions that are
> ongoing right now, several of us have toyed with two basic ideas to
> improve scalability of job launch / startup:
> 
> 1. the possibility of eliminating the modex altogether (e.g., have
> ORTE dump enough information to each MPI process to figure out/
> calculate/locally lookup [in local files?] BTL addressing information
> for all peers in MPI_COMM_WORLD, etc.), a la Portals.
> 
> 2. reducing the amount of data in the modex.
> 
> One obvious idea for #2 is to have only one process on each host send
> all/the majority of openib BTL modex information for that host.  The
> rationale here is that all MPI processes on a single host will share
> much of the same BTL addressing information, so why send it N times?
> Local rank 0 can modex send all/the majority of the modex for the
> openib BTL modules; local ranks 1-N can either send nothing or a
> [very] small piece of differentiating information (e.g., IBCM service
> ID).
> 
> This effectively makes the modex info for the openib BTL scale with
> the number of nodes, not the number of processes.  This can be a big
> win in terms of overall modex size that needs to be both gathered and
> bcast.
> 
> I worked up a spreadsheet showing the current size of the modex in the
> openib-cpc2 branch right now (using some "somewhat" contrived machine
> size/ppn/port combinations), and then compared it to the size after
> implementing the #2 idea shown above (see attached PDF).
> 
> I also included a 3rd comparison for if Jon/I are able to reduce the
> CPC modex blob sizes -- we don't know yet if that'll work or not.  But
> the numbers show that reducing the blobs by a few bytes clearly has
> [much] less of an impact than the "eliminating redundant modex
> information" idea, so we'll work on that one first.
> 
> Additionally, reducing the modex size, paired with other ongoing ORTE
> scalability efforts, may obviate the need to eliminate the modex (at
> least for now...).  Or, more specifically, efforts for eliminating the
> modex can be pushed to beyond v1.3.
> 
> Of course, the same ideas can apply to other BTLs.  We're only working
> on the openib BTL for now.




Re: [OMPI devel] Scalability of openib modex

2008-03-31 Thread Jeff Squyres

On Mar 31, 2008, at 9:22 AM, Ralph H Castain wrote:
Thanks Jeff. It appears to me that the first approach to reducing  
modex data
makes the most sense and has the largest impact - I would advocate  
pursuing

it first. We can look at further refinements later.

Along that line, one thing we also exchange in the modex (not IB  
specific)
is hostname and arch. This is in the ompi/proc/proc.c code. It seems  
to me
that this is also wasteful and can be removed. The daemons already  
have that

info for the job and can easily "drop" it into each proc - there is no
reason to send it around.

I'll take a look at cleaning that up, ensuring we don't "break"  
daemonless

environments, along with the other things underway.


Sounds perfect.



Ralph



On 3/28/08 11:37 AM, "Jeff Squyres"  wrote:


I've had this conversation independently with several people now, so
I'm sending it to the list rather than continuing to have the same
conversation over and over.  :-)

--

As most of you know, Jon and I are working on the new openib
"CPC" (connect pseudo-component) stuff in /tmp-public/openib-cpc2.
There are two main reasons for it:

1. Add support for RDMA CM (they need it for iWarp support)
2. Add support for IB CM (which will hopefully be a more scalable
connect system as compared to the current RML/OOB-based method of
making IB QPs)

When complete, there will be 4 CPCs: RDMA CM, IB CM, OOB, and XOOB
(same as OOB but with ConnectX XRC extensions).

RDMA CM has some known scaling issues, and at least some known
workarounds -- I won't discuss the merits/drawbacks of RDMA CM here.
IB CM has unknown scaling characteristics, but seems to look good on
paper (e.g., it uses UD for a 3-way handshake to make an IB QP).

On the trunk, it's a per-MPI process decision as to which CPC you'll
use.  Per ticket #1191, one of the goals of the /tmp-public branch is
to make CPC decision be a per-openib-BTL-module decision.  So you can
mix iWarp and IB hardware in a single host, for example.  This fits  
in

quite well with the "mpirun should work out of the box" philosophy of
Open MPI.

In the openib BTL, each BTL module is paired with a specific HCA/NIC
(verbs) port.  And depending on the interface hardware and software,
one or more CPCs may be available for each.  Hence, for each BTL
module in each MPI process, we may send one or more CPC connect
information blobs in the modex (note that the oob and xoob CPCs don't
need to send anything additional in the modex).

Jon and I are actually getting closer to completion on the branch,  
and

it seems to be working.

In conjunction with several other scalability discussions that are
ongoing right now, several of us have toyed with two basic ideas to
improve scalability of job launch / startup:

1. the possibility of eliminating the modex altogether (e.g., have
ORTE dump enough information to each MPI process to figure out/
calculate/locally lookup [in local files?] BTL addressing information
for all peers in MPI_COMM_WORLD, etc.), a la Portals.

2. reducing the amount of data in the modex.

One obvious idea for #2 is to have only one process on each host send
all/the majority of openib BTL modex information for that host.  The
rationale here is that all MPI processes on a single host will share
much of the same BTL addressing information, so why send it N times?
Local rank 0 can modex send all/the majority of the modex for the
openib BTL modules; local ranks 1-N can either send nothing or a
[very] small piece of differentiating information (e.g., IBCM service
ID).

This effectively makes the modex info for the openib BTL scale with
the number of nodes, not the number of processes.  This can be a big
win in terms of overall modex size that needs to be both gathered and
bcast.

I worked up a spreadsheet showing the current size of the modex in  
the

openib-cpc2 branch right now (using some "somewhat" contrived machine
size/ppn/port combinations), and then compared it to the size after
implementing the #2 idea shown above (see attached PDF).

I also included a 3rd comparison for if Jon/I are able to reduce the
CPC modex blob sizes -- we don't know yet if that'll work or not.   
But

the numbers show that reducing the blobs by a few bytes clearly has
[much] less of an impact than the "eliminating redundant modex
information" idea, so we'll work on that one first.

Additionally, reducing the modex size, paired with other ongoing ORTE
scalability efforts, may obviate the need to eliminate the modex (at
least for now...).  Or, more specifically, efforts for eliminating  
the

modex can be pushed to beyond v1.3.

Of course, the same ideas can apply to other BTLs.  We're only  
working

on the openib BTL for now.






--
Jeff Squyres
Cisco Systems



[OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Josh Hursey

Ralph,

I've just noticed that the 'unity' routed component seems to be broken
when using more than one machine. I'm using Odin and r18028 of the
trunk, and have confirmed that this problem occurs with both SLURM and
rsh. I think this break came in on Friday, as that is when some of my
MTT tests started to hang and fail, but I cannot point to a specific
revision at this point. The backtraces (enclosed) of the processes
point to the grpcomm allgather routine.


The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.

RSH example from odin023 - so no SLURM variables:
These work:
 shell$ mpirun -np 2 -host odin023  noop -v 1
 shell$ mpirun -np 2 -host odin023,odin024  noop -v 1
 shell$ mpirun -np 2 -mca routed unity -host odin023  noop -v 1

This hangs:
 shell$ mpirun -np 2 -mca routed unity -host odin023,odin024  noop -v 1


If I attach to the 'noop' process on odin023 I get the following  
backtrace:


(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330,  
maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910,  
tv=0x7fbfffe840) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30,  
flags=5) at event.c:779

#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c: 
169
#6  0x002a958b9e48 in orte_grpcomm_base_allgather  
(sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/ 
grpcomm_base_allgather.c:238
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ 
grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58,  
requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c,  
argv=0x7fbfffec70) at pinit.c:88

#10 0x00400bf4 in main (argc=3, argv=0x7fbfffed58) at noop.c:39


The 'noop' process on odin024 has a similar backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b390,  
maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506cc0, arg=0x506c20,  
tv=0x7fbfffe9d0) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506cc0,  
flags=5) at event.c:779

#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c: 
169
#6  0x002a958b97c5 in orte_grpcomm_base_allgather  
(sbuf=0x7fbfffec70, rbuf=0x7fbfffec10) at base/ 
grpcomm_base_allgather.c:163
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ 
grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffeee8,  
requested=0, provided=0x7fbfffedc8) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffee0c,  
argv=0x7fbfffee00) at pinit.c:88

#10 0x00400bf4 in main (argc=3, argv=0x7fbfffeee8) at noop.c:39



Cheers,
Josh


Re: [OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Ralph H Castain
I figured out the issue - there is a simple and a hard way to fix this. So
before I do, let me see what makes sense.

The simple solution involves updating the daemons with contact info for the
procs so that they can send their collected modex info to the rank=0 proc.
This will measurably slow the launch when using unity.

The hard solution is to do a hybrid routed approach whereby the daemons
would route any daemon-to-proc communication while the procs continue to do
direct proc-to-proc messaging.

Is there some reason to be using the "unity" component? Do you care if jobs
using unity launch slower?

Thanks
Ralph



On 3/31/08 7:57 AM, "Josh Hursey"  wrote:

> Ralph,
> 
> I've just noticed that it seems that the 'unity' routed component
> seems to be broken when using more than one machine. I'm using Odin
> and r18028 of the trunk, and have confirmed that this problem occurs
> with SLURM and rsh. I think this break came in on Friday as that is
> when some of my MTT tests started to hang and fail, but I cannot point
> to a specific revision at this point. The backtraces (enclosed) of the
> processes point to the grpcomm allgather routine.
> 
> The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.
> 
> RSH example from odin023 - so no SLURM variables:
> These work:
>   shell$ mpirun -np 2 -host odin023  noop -v 1
>   shell$ mpirun -np 2 -host odin023,odin024  noop -v 1
>   shell$ mpirun -np 2 -mca routed unity -host odin023  noop -v 1
> 
> This hangs:
>   shell$ mpirun -np 2 -mca routed unity -host odin023,odin024  noop -v 1
> 
> 
> If I attach to the 'noop' process on odin023 I get the following
> backtrace:
> 
> (gdb) bt
> #0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
> #1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330,
> maxevents=1023, timeout=1000) at epoll_sub.c:61
> #2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910,
> tv=0x7fbfffe840) at epoll.c:210
> #3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30,
> flags=5) at event.c:779
> #4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
> #5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c:
> 169
> #6  0x002a958b9e48 in orte_grpcomm_base_allgather
> (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/
> grpcomm_base_allgather.c:238
> #7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/
> grpcomm_base_modex.c:413
> #8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58,
> requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510
> #9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c,
> argv=0x7fbfffec70) at pinit.c:88
> #10 0x00400bf4 in main (argc=3, argv=0x7fbfffed58) at noop.c:39
> 
> 
> The 'noop' process on odin024 has a similar backtrace:
> 
> (gdb) bt
> #0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
> #1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b390,
> maxevents=1023, timeout=1000) at epoll_sub.c:61
> #2  0x002a95a1e7f7 in epoll_dispatch (base=0x506cc0, arg=0x506c20,
> tv=0x7fbfffe9d0) at epoll.c:210
> #3  0x002a95a1c057 in opal_event_base_loop (base=0x506cc0,
> flags=5) at event.c:779
> #4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
> #5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c:
> 169
> #6  0x002a958b97c5 in orte_grpcomm_base_allgather
> (sbuf=0x7fbfffec70, rbuf=0x7fbfffec10) at base/
> grpcomm_base_allgather.c:163
> #7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/
> grpcomm_base_modex.c:413
> #8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffeee8,
> requested=0, provided=0x7fbfffedc8) at runtime/ompi_mpi_init.c:510
> #9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffee0c,
> argv=0x7fbfffee00) at pinit.c:88
> #10 0x00400bf4 in main (argc=3, argv=0x7fbfffeee8) at noop.c:39
> 
> 
> 
> Cheers,
> Josh




Re: [OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Josh Hursey
At the moment I only use unity with C/R, mostly because I have not
verified that the other components work properly under C/R conditions.
I can verify others, but that doesn't solve the problem with the unity
component. :/

It is not critical that these jobs launch quickly, only that they
launch correctly for the moment. When you say 'slow the launch', are
you talking severely, as in seconds/minutes for small nps? A follow-up
question is why this component broke in the first place; or, worded
differently, what changed in ORTE such that the unity component
suddenly deadlocks when it didn't before?


Thanks for looking into this,
Josh

On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote:

I figured out the issue - there is a simple and a hard way to fix  
this. So

before I do, let me see what makes sense.

The simple solution involves updating the daemons with contact info  
for the
procs so that they can send their collected modex info to the rank=0  
proc.

This will measurably slow the launch when using unity.

The hard solution is to do a hybrid routed approach whereby the  
daemons
would route any daemon-to-proc communication while the procs  
continue to do

direct proc-to-proc messaging.

Is there some reason to be using the "unity" component? Do you care  
if jobs

using unity launch slower?

Thanks
Ralph



On 3/31/08 7:57 AM, "Josh Hursey"  wrote:


Ralph,

I've just noticed that it seems that the 'unity' routed component
seems to be broken when using more than one machine. I'm using Odin
and r18028 of the trunk, and have confirmed that this problem occurs
with SLURM and rsh. I think this break came in on Friday as that is
when some of my MTT tests started to hang and fail, but I cannot  
point
to a specific revision at this point. The backtraces (enclosed) of  
the

processes point to the grpcomm allgather routine.

The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.

RSH example from odin023 - so no SLURM variables:
These work:
 shell$ mpirun -np 2 -host odin023  noop -v 1
 shell$ mpirun -np 2 -host odin023,odin024  noop -v 1
 shell$ mpirun -np 2 -mca routed unity -host odin023  noop -v 1

This hangs:
 shell$ mpirun -np 2 -mca routed unity -host odin023,odin024  noop - 
v 1



If I attach to the 'noop' process on odin023 I get the following
backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330,
maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30,  
arg=0x506910,

tv=0x7fbfffe840) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30,
flags=5) at event.c:779
#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/ 
opal_progress.c:

169
#6  0x002a958b9e48 in orte_grpcomm_base_allgather
(sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/
grpcomm_base_allgather.c:238
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at  
base/

grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58,
requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c,
argv=0x7fbfffec70) at pinit.c:88
#10 0x00400bf4 in main (argc=3, argv=0x7fbfffed58) at  
noop.c:39



The 'noop' process on odin024 has a similar backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b390,
maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506cc0,  
arg=0x506c20,

tv=0x7fbfffe9d0) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506cc0,
flags=5) at event.c:779
#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/ 
opal_progress.c:

169
#6  0x002a958b97c5 in orte_grpcomm_base_allgather
(sbuf=0x7fbfffec70, rbuf=0x7fbfffec10) at base/
grpcomm_base_allgather.c:163
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at  
base/

grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffeee8,
requested=0, provided=0x7fbfffedc8) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffee0c,
argv=0x7fbfffee00) at pinit.c:88
#10 0x00400bf4 in main (argc=3, argv=0x7fbfffeee8) at  
noop.c:39




Cheers,
Josh






Re: [OMPI devel] limit tcp fragment size?

2008-03-31 Thread George Bosilca
The btl_tcp_min_send_size parameter is not exactly what you expect it to
be. It drives only the send protocol (as implemented in Open MPI), not
the put protocol that the TCP BTL is using.


You can achieve what you want with two parameters:
1. btl_tcp_flags set to 9. This will force the send protocol over TCP
all the time.
2. btl_tcp_max_send_size set to 16K, which defines the size of a
fragment in the pipelined send protocol.


  george.
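
These settings might be passed on the mpirun command line roughly as
follows (a sketch only; the process count and application are
placeholders, and 16384 is 16K expressed in bytes):

  shell$ mpirun -mca btl_tcp_flags 9 -mca btl_tcp_max_send_size 16384 -np 4 ./app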

On Mar 31, 2008, at 2:46 AM, Muhammad Atif wrote:

G'day
Just a quick, basic question: in the case of the TCP BTL, how do I
limit the fragment size?
I do not want MPI to send a fragment larger than, say, 16K.


If I am not mistaken, shouldn't btl_tcp_min_send_size do the trick?
If it is supposed to, why do I see packets of length 64K?


Thanks in advance.

Best Regards,
Muhammad Atif









Re: [OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Ralph H Castain



On 3/31/08 9:28 AM, "Josh Hursey"  wrote:

> At the moment I only use unity with C/R. Mostly because I have not
> verified that the other components work properly under the C/R
> conditions. I can verify others, but that doesn't solve the problem
> with the unity component. :/
> 
> It is not critical that these jobs launch quickly, but that they
> launch correctly for the moment. When you say 'slow the launch' are
> you talking severely as in seconds/minutes for small nps?

I didn't say "severely" - I said "measurably". ;-)

It will require an additional communication to the daemons to let them know
how to talk to the procs. In the current unity component, the daemons never
talk to the procs themselves, and so they don't know contact info for
rank=0.

> I guess a  
> followup question is why did this component break in the first place?
> or worded differently, what changed in ORTE such that the unity
> component will suddenly deadlock when it didn't before?

We are trying to improve scalability. Biggest issue is the modex, which we
improved considerably by having the procs pass the modex info to the
daemons, letting the daemons collect all modex info from procs on their
node, and then having the daemons send that info along to the rank=0 proc
for collection and xcast.

Problem is that in the unity component, the local daemons don't know how to
send the modex to the rank=0 proc. So what I will now have to do is tell all
the daemons how to talk to the procs, and then we will have every daemon
opening a socket to rank=0. That's where the time will be lost.

Our original expectation was to get everyone off of unity as quickly as
possible - in fact, Brian and I had planned to completely remove that
component as quickly as possible as it (a) scales ugly and (b) gets in the
way of things. Very hard to keep it alive.

So for now, I'll just do the simple thing and hopefully that will be
adequate - let me know if/when you are able to get C/R working on other
routed components.

Thanks!
Ralph

> 
> Thanks for looking into this,
> Josh
> 
> On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote:
> 
>> I figured out the issue - there is a simple and a hard way to fix
>> this. So
>> before I do, let me see what makes sense.
>> 
>> The simple solution involves updating the daemons with contact info
>> for the
>> procs so that they can send their collected modex info to the rank=0
>> proc.
>> This will measurably slow the launch when using unity.
>> 
>> The hard solution is to do a hybrid routed approach whereby the
>> daemons
>> would route any daemon-to-proc communication while the procs
>> continue to do
>> direct proc-to-proc messaging.
>> 
>> Is there some reason to be using the "unity" component? Do you care
>> if jobs
>> using unity launch slower?
>> 
>> Thanks
>> Ralph
>> 
>> 
>> 
>> On 3/31/08 7:57 AM, "Josh Hursey"  wrote:
>> 
>>> Ralph,
>>> 
>>> I've just noticed that it seems that the 'unity' routed component
>>> seems to be broken when using more than one machine. I'm using Odin
>>> and r18028 of the trunk, and have confirmed that this problem occurs
>>> with SLURM and rsh. I think this break came in on Friday as that is
>>> when some of my MTT tests started to hang and fail, but I cannot
>>> point
>>> to a specific revision at this point. The backtraces (enclosed) of
>>> the
>>> processes point to the grpcomm allgather routine.
>>> 
>>> The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.
>>> 
>>> RSH example from odin023 - so no SLURM variables:
>>> These work:
>>>  shell$ mpirun -np 2 -host odin023  noop -v 1
>>>  shell$ mpirun -np 2 -host odin023,odin024  noop -v 1
>>>  shell$ mpirun -np 2 -mca routed unity -host odin023  noop -v 1
>>> 
>>> This hangs:
>>>  shell$ mpirun -np 2 -mca routed unity -host odin023,odin024  noop -
>>> v 1
>>> 
>>> 
>>> If I attach to the 'noop' process on odin023 I get the following
>>> backtrace:
>>> 
>>> (gdb) bt
>>> #0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
>>> #1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330,
>>> maxevents=1023, timeout=1000) at epoll_sub.c:61
>>> #2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30,
>>> arg=0x506910,
>>> tv=0x7fbfffe840) at epoll.c:210
>>> #3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30,
>>> flags=5) at event.c:779
>>> #4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
>>> #5  0x002a95a0bef8 in opal_progress () at runtime/
>>> opal_progress.c:
>>> 169
>>> #6  0x002a958b9e48 in orte_grpcomm_base_allgather
>>> (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/
>>> grpcomm_base_allgather.c:238
>>> #7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at
>>> base/
>>> grpcomm_base_modex.c:413
>>> #8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58,
>>> requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510
>>> #9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c,
>>> argv=0x7

Re: [OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Josh Hursey


On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote:





On 3/31/08 9:28 AM, "Josh Hursey"  wrote:


At the moment I only use unity with C/R. Mostly because I have not
verified that the other components work properly under the C/R
conditions. I can verify others, but that doesn't solve the problem
with the unity component. :/

It is not critical that these jobs launch quickly, but that they
launch correctly for the moment. When you say 'slow the launch' are
you talking severely as in seconds/minutes for small nps?


I didn't say "severely" - I said "measurably". ;-)

It will require an additional communication to the daemons to let  
them know
how to talk to the procs. In the current unity component, the  
daemons never

talk to the procs themselves, and so they don't know contact info for
rank=0.


ah I see.





I guess a
followup question is why did this component break in the first place?
or worded differently, what changed in ORTE such that the unity
component will suddenly deadlock when it didn't before?


We are trying to improve scalability. Biggest issue is the modex,  
which we

improved considerably by having the procs pass the modex info to the
daemons, letting the daemons collect all modex info from procs on  
their
node, and then having the daemons send that info along to the rank=0  
proc

for collection and xcast.

Problem is that in the unity component, the local daemons don't know  
how to
send the modex to the rank=0 proc. So what I will now have to do is  
tell all
the daemons how to talk to the procs, and then we will have every  
daemon

opening a socket to rank=0. That's where the time will be lost.

Our original expectation was to get everyone off of unity as quickly  
as

possible - in fact, Brian and I had planned to completely remove that
component as quickly as possible as it (a) scales ugly and (b) gets  
in the

way of things. Very hard to keep it alive.

So for now, I'll just do the simple thing and hopefully that will be
adequate - let me know if/when you are able to get C/R working on  
other

routed components.


Sounds good. I'll look into supporting the tree routed component, but  
that will probably take a couple weeks.


Thanks for the clarification.

Cheers,
Josh




Thanks!
Ralph



Thanks for looking into this,
Josh

On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote:


I figured out the issue - there is a simple and a hard way to fix
this. So
before I do, let me see what makes sense.

The simple solution involves updating the daemons with contact info
for the
procs so that they can send their collected modex info to the rank=0
proc.
This will measurably slow the launch when using unity.

The hard solution is to do a hybrid routed approach whereby the
daemons
would route any daemon-to-proc communication while the procs
continue to do
direct proc-to-proc messaging.

Is there some reason to be using the "unity" component? Do you care
if jobs
using unity launch slower?

Thanks
Ralph



On 3/31/08 7:57 AM, "Josh Hursey"  wrote:


Ralph,

I've just noticed that it seems that the 'unity' routed component
seems to be broken when using more than one machine. I'm using Odin
and r18028 of the trunk, and have confirmed that this problem  
occurs

with SLURM and rsh. I think this break came in on Friday as that is
when some of my MTT tests started to hang and fail, but I cannot
point
to a specific revision at this point. The backtraces (enclosed) of
the
processes point to the grpcomm allgather routine.

The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.

RSH example from odin023 - so no SLURM variables:
These work:
shell$ mpirun -np 2 -host odin023  noop -v 1
shell$ mpirun -np 2 -host odin023,odin024  noop -v 1
shell$ mpirun -np 2 -mca routed unity -host odin023  noop -v 1

This hangs:
shell$ mpirun -np 2 -mca routed unity -host odin023,odin024  noop -
v 1


If I attach to the 'noop' process on odin023 I get the following
backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330,
maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30,
arg=0x506910,
tv=0x7fbfffe840) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30,
flags=5) at event.c:779
#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/
opal_progress.c:
169
#6  0x002a958b9e48 in orte_grpcomm_base_allgather
(sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/
grpcomm_base_allgather.c:238
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at
base/
grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58,
requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c,
argv=0x7fbfffec70) at pinit.c:88
#10 0x00400bf4 in ma

Re: [OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Ralph H Castain
Okay - fixed with r18040

Thanks
Ralph


On 3/31/08 11:01 AM, "Josh Hursey"  wrote:

> 
> On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote:
> 
>> 
>> 
>> 
>> On 3/31/08 9:28 AM, "Josh Hursey"  wrote:
>> 
>>> At the moment I only use unity with C/R. Mostly because I have not
>>> verified that the other components work properly under the C/R
>>> conditions. I can verify others, but that doesn't solve the problem
>>> with the unity component. :/
>>> 
>>> It is not critical that these jobs launch quickly, but that they
>>> launch correctly for the moment. When you say 'slow the launch' are
>>> you talking severely as in seconds/minutes for small nps?
>> 
>> I didn't say "severely" - I said "measurably". ;-)
>> 
>> It will require an additional communication to the daemons to let
>> them know
>> how to talk to the procs. In the current unity component, the
>> daemons never
>> talk to the procs themselves, and so they don't know contact info for
>> rank=0.
> 
> ah I see.
> 
>> 
>> 
>>> I guess a
>>> followup question is why did this component break in the first place?
>>> or worded differently, what changed in ORTE such that the unity
>>> component will suddenly deadlock when it didn't before?
>> 
>> We are trying to improve scalability. Biggest issue is the modex,
>> which we
>> improved considerably by having the procs pass the modex info to the
>> daemons, letting the daemons collect all modex info from procs on
>> their
>> node, and then having the daemons send that info along to the rank=0
>> proc
>> for collection and xcast.
>> 
>> Problem is that in the unity component, the local daemons don't know
>> how to
>> send the modex to the rank=0 proc. So what I will now have to do is
>> tell all
>> the daemons how to talk to the procs, and then we will have every
>> daemon
>> opening a socket to rank=0. That's where the time will be lost.
>> 
>> Our original expectation was to get everyone off of unity as quickly
>> as
>> possible - in fact, Brian and I had planned to completely remove that
>> component as quickly as possible as it (a) scales ugly and (b) gets
>> in the
>> way of things. Very hard to keep it alive.
>> 
>> So for now, I'll just do the simple thing and hopefully that will be
>> adequate - let me know if/when you are able to get C/R working on
>> other
>> routed components.
> 
> Sounds good. I'll look into supporting the tree routed component, but
> that will probably take a couple weeks.
> 
> Thanks for the clarification.
> 
> Cheers,
> Josh
> 
>> 
>> 
>> Thanks!
>> Ralph
>> 
>>> 
>>> Thanks for looking into this,
>>> Josh
>>> 
>>> On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote:
>>> 
 I figured out the issue - there is a simple and a hard way to fix
 this. So
 before I do, let me see what makes sense.
 
 The simple solution involves updating the daemons with contact info
 for the
 procs so that they can send their collected modex info to the rank=0
 proc.
 This will measurably slow the launch when using unity.
 
 The hard solution is to do a hybrid routed approach whereby the
 daemons
 would route any daemon-to-proc communication while the procs
 continue to do
 direct proc-to-proc messaging.
 
 Is there some reason to be using the "unity" component? Do you care
 if jobs
 using unity launch slower?
 
 Thanks
 Ralph
 
 
 
 On 3/31/08 7:57 AM, "Josh Hursey"  wrote:
 
> Ralph,
> 
> I've just noticed that it seems that the 'unity' routed component
> seems to be broken when using more than one machine. I'm using Odin
> and r18028 of the trunk, and have confirmed that this problem
> occurs
> with SLURM and rsh. I think this break came in on Friday as that is
> when some of my MTT tests started to hang and fail, but I cannot
> point
> to a specific revision at this point. The backtraces (enclosed) of
> the
> processes point to the grpcomm allgather routine.
> 
> The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.
> 
> RSH example from odin023 - so no SLURM variables:
> These work:
> shell$ mpirun -np 2 -host odin023  noop -v 1
> shell$ mpirun -np 2 -host odin023,odin024  noop -v 1
> shell$ mpirun -np 2 -mca routed unity -host odin023  noop -v 1
> 
> This hangs:
> shell$ mpirun -np 2 -mca routed unity -host odin023,odin024  noop -
> v 1
> 
> 
> If I attach to the 'noop' process on odin023 I get the following
> backtrace:
> 
> (gdb) bt
> #0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
> #1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330,
> maxevents=1023, timeout=1000) at epoll_sub.c:61
> #2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30,
> arg=0x506910,
> tv=0x7fbfffe840) at epoll.c:210
> #3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30,

Re: [OMPI devel] segfault on host not found error.

2008-03-31 Thread Ralph H Castain
I am unable to replicate the segfault. However, I was able to get the job to
hang. I fixed that behavior with r18044.

Perhaps you can test this again and let me know what you see. A gdb stack
trace would be more helpful.

Thanks
Ralph
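
For reference, one way to get such a trace from the dumped core might be
(a sketch; the mpirun path is taken from the quoted report below, and the
core file name depends on the system's core pattern):

  shell$ gdb /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun core
  (gdb) bt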



On 3/31/08 5:13 AM, "Lenny Verkhovsky"  wrote:

> 
> 
> 
> I accidently run job on the hostfile where one of hosts was not properly
> mounted. As a result I got an error and a segfault.
> 
> 
> /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile
> ./mpi_p01 -t lt
> bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or
> directory
> 
> --
> A daemon (pid 9753) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> 
> There may be more information reported by the environment (see above).
> 
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> 
> --
> 
> --
> mpirun was unable to start the specified application as it encountered
> an error.
> More information may be available above.
> 
> --
> [witch1:09745] *** Process received signal ***
> [witch1:09745] Signal: Segmentation fault (11)
> [witch1:09745] Signal code: Address not mapped (1)
> [witch1:09745] Failing at address: 0x3c
> [witch1:09745] [ 0] /lib64/libpthread.so.0 [0x2aff223ebc10]
> [witch1:09745] [ 1]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cdfe21]
> [witch1:09745] [ 2]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_rml_oob.so
> [0x2aff22c398f1]
> [witch1:09745] [ 3]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
> [0x2aff22d426ee]
> [witch1:09745] [ 4]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
> [0x2aff22d433fb]
> [witch1:09745] [ 5]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
> [0x2aff22d4485b]
> [witch1:09745] [ 6]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
> [witch1:09745] [ 7] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> [0x403203]
> [witch1:09745] [ 8]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
> [witch1:09745] [ 9]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
> 8b) [0x2aff21e060cb]
> [witch1:09745] [10]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_trigger_eve
> nt+0x20) [0x2aff21cc6940]
> [witch1:09745] [11]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_wakeup+0x2d
> ) [0x2aff21cc776d]
> [witch1:09745] [12]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
> [0x2aff22b34756]
> [witch1:09745] [13]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cc6ea7]
> [witch1:09745] [14]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
> [witch1:09745] [15]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
> 8b) [0x2aff21e060cb]
> [witch1:09745] [16]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_plm_base_da
> emon_callback+0xad) [0x2aff21ce068d]
> [witch1:09745] [17]
> /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
> [0x2aff22b34e5e]
> [witch1:09745] [18] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> [0x402e13]
> [witch1:09745] [19] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> [0x402873]
> [witch1:09745] [20] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x2aff22512154]
> [witch1:09745] [21] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> [0x4027c9]
> [witch1:09745] *** End of error message ***
> Segmentation fault (core dumped)
> 
> 
> Best Regards,
> Lenny.
> 
> 




Re: [OMPI devel] Routed 'unity' broken on trunk

2008-03-31 Thread Josh Hursey

Looks good. Thanks for the fix.

Cheers,
Josh

On Mar 31, 2008, at 1:43 PM, Ralph H Castain wrote:

Okay - fixed with r18040

Thanks
Ralph


On 3/31/08 11:01 AM, "Josh Hursey"  wrote:



On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote:





On 3/31/08 9:28 AM, "Josh Hursey"  wrote:


At the moment I only use unity with C/R. Mostly because I have not
verified that the other components work properly under the C/R
conditions. I can verify others, but that doesn't solve the problem
with the unity component. :/

It is not critical that these jobs launch quickly, but that they
launch correctly for the moment. When you say 'slow the launch' are
you talking severely as in seconds/minutes for small nps?


I didn't say "severely" - I said "measurably". ;-)

It will require an additional communication to the daemons to let
them know
how to talk to the procs. In the current unity component, the
daemons never
talk to the procs themselves, and so they don't know contact info  
for

rank=0.


ah I see.





I guess a
followup question is why did this component break in the first  
place?

or worded differently, what changed in ORTE such that the unity
component will suddenly deadlock when it didn't before?


We are trying to improve scalability. Biggest issue is the modex, which we
improved considerably by having the procs pass the modex info to the
daemons, letting the daemons collect all modex info from procs on their
node, and then having the daemons send that info along to the rank=0 proc
for collection and xcast.

Problem is that in the unity component, the local daemons don't know how to
send the modex to the rank=0 proc. So what I will now have to do is tell
all the daemons how to talk to the procs, and then we will have every
daemon opening a socket to rank=0. That's where the time will be lost.

Our original expectation was to get everyone off of unity as quickly as
possible - in fact, Brian and I had planned to completely remove that
component as quickly as possible as it (a) scales ugly and (b) gets in the
way of things. Very hard to keep it alive.

So for now, I'll just do the simple thing and hopefully that will be
adequate - let me know if/when you are able to get C/R working on other
routed components.
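
The flow described above, reduced to a purely illustrative single-process C
sketch (the names and data layout here are made up for illustration and are
not ORTE code): each "daemon" batches up its node-local modex entries and
forwards them to rank=0, which then broadcasts the aggregate. Under unity,
the forward-to-rank=0 step is exactly what the daemons cannot do.

/* Illustration only: simulate "daemons gather per-node modex, forward to
 * rank=0, rank=0 xcasts the aggregate". Not Open MPI code. */
#include <stdio.h>

#define NNODES          2
#define PROCS_PER_NODE  2
#define NPROCS          (NNODES * PROCS_PER_NODE)

typedef struct { int rank; int contact; } modex_entry_t;

int main(void)
{
    modex_entry_t global[NPROCS];   /* what rank=0 ends up holding */
    int collected = 0;

    for (int node = 0; node < NNODES; node++) {
        /* each daemon collects the entries of its local procs */
        modex_entry_t local[PROCS_PER_NODE];
        for (int p = 0; p < PROCS_PER_NODE; p++) {
            local[p].rank = node * PROCS_PER_NODE + p;
            local[p].contact = 1000 + local[p].rank;  /* fake contact info */
        }
        /* "send" the node-local batch to rank=0 - the step that has no
         * route under the unity routed component */
        for (int p = 0; p < PROCS_PER_NODE; p++)
            global[collected++] = local[p];
    }

    /* rank=0 "xcasts" the aggregated modex back out (printed here) */
    for (int i = 0; i < NPROCS; i++)
        printf("rank %d -> contact %d\n", global[i].rank, global[i].contact);

    return 0;
}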


Sounds good. I'll look into supporting the tree routed component, but
that will probably take a couple weeks.

Thanks for the clarification.

Cheers,
Josh




Thanks!
Ralph



Thanks for looking into this,
Josh

On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote:


I figured out the issue - there is a simple and a hard way to fix this. So
before I do, let me see what makes sense.

The simple solution involves updating the daemons with contact info for the
procs so that they can send their collected modex info to the rank=0 proc.
This will measurably slow the launch when using unity.

The hard solution is to do a hybrid routed approach whereby the daemons
would route any daemon-to-proc communication while the procs continue to do
direct proc-to-proc messaging.

Is there some reason to be using the "unity" component? Do you care if jobs
using unity launch slower?

Thanks
Ralph



On 3/31/08 7:57 AM, "Josh Hursey"  wrote:


Ralph,

I've just noticed that the 'unity' routed component seems to be broken when
using more than one machine. I'm using Odin and r18028 of the trunk, and
have confirmed that this problem occurs with both SLURM and rsh. I think
this break came in on Friday, as that is when some of my MTT tests started
to hang and fail, but I cannot point to a specific revision at this point.
The backtraces (enclosed) of the processes point to the grpcomm allgather
routine.

The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.


RSH example from odin023 - so no SLURM variables:
These work:
shell$ mpirun -np 2 -host odin023 noop -v 1
shell$ mpirun -np 2 -host odin023,odin024 noop -v 1
shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1

This hangs:
shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop -v 1


If I attach to the 'noop' process on odin023 I get the following backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910, tv=0x7fbfffe840) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30, flags=5) at event.c:779
#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c:169
#6  0x002a958b9e48 in orte_grpcomm_base_allgather (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/grpcomm_base_allgather.c:238
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffe

[OMPI devel] Session directories in $HOME?

2008-03-31 Thread Josh Hursey
So does anyone know why the session directories are in $HOME instead  
of /tmp?


I'm using r18044, and every time I run, the session directories are
created in $HOME. George, does this have anything to do with your
commits from earlier?


-- Josh


Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread George Bosilca
I looked over the code and I don't see any problems with the changes.
The only thing I did was replace the getenv("HOME") call with
opal_home_directory() ...


Here is the logic for selecting the TMP directory:

if( NULL == (str = getenv("TMPDIR")) )
if( NULL == (str = getenv("TEMP")) )
if( NULL == (str = getenv("TMP")) )
if( NULL == (str = opal_home_directory()) )
str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment?

  george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Shipman, Galen M.


Slightly OT but along the same lines..

We currently have an argument to mpirun to set the HNP tmpdir (--tmpdir).
Why don't we have an MCA param to set the tmpdir for all the orteds and
such?


- Galen

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:

I looked over the code and I don't see any problems with the  
changes. The only think I did is replacing the getenv("HOME") by  
opal_home_directory ...


Here is the logic for selecting the TMP directory:

if( NULL == (str = getenv("TMPDIR")) )
if( NULL == (str = getenv("TEMP")) )
if( NULL == (str = getenv("TMP")) )
if( NULL == (str = opal_home_directory()) )
str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

  george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Josh Hursey
Nope. None of those environment variables are defined. Should they be? It
would seem that the last part of the logic should be (re-)extended to use
/tmp if it exists.


-- Josh

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:
I looked over the code and I don't see any problems with the  
changes. The only think I did is replacing the getenv("HOME") by  
opal_home_directory ...


Here is the logic for selecting the TMP directory:

if( NULL == (str = getenv("TMPDIR")) )
if( NULL == (str = getenv("TEMP")) )
if( NULL == (str = getenv("TMP")) )
if( NULL == (str = opal_home_directory()) )
str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

  george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Jeff Squyres

I confirm that this is new behavior.

Session directories have just started showing up in my $HOME as well,  
and TMPDIR, TEMP, TMP have never been set on my cluster (for  
interactive logins, anyway).



On Mar 31, 2008, at 4:01 PM, Josh Hursey wrote:

Nope. None of those environment variables are defined. Should they
be? It would seem that the last part of the logic should be (re-)
extended to use /tmp if it exists.

-- Josh

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:

I looked over the code and I don't see any problems with the
changes. The only think I did is replacing the getenv("HOME") by
opal_home_directory ...

Here is the logic for selecting the TMP directory:

   if( NULL == (str = getenv("TMPDIR")) )
   if( NULL == (str = getenv("TEMP")) )
   if( NULL == (str = getenv("TMP")) )
   if( NULL == (str = opal_home_directory()) )
   str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

 george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Josh Hursey
Taking a quick look at the commits, r18037 looks like the most likely
cause of this problem.

Previously the session directory was forced to "/tmp" if no environment
variables were set. This revision removes that logic and uses
opal_tmp_directory() instead. Though I agree with this change, I think the
logic for selecting the TMP directory should be extended to use '/tmp' if
it exists. If it does not, then the home directory should be a fine last
alternative.


How does that sound as a solution? This would prevent us from  
unexpectedly changing our running behavior in user environments in  
which none of those variables are set.
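
For concreteness, a minimal sketch of the fallback order being proposed
(TMPDIR, then TEMP, then TMP, then /tmp if it exists, then $HOME). The
helper opal_home_directory() is taken from the snippet quoted above;
pick_tmp_directory() and the rest are hypothetical, not the actual OPAL
implementation:

/* Sketch only - illustrates the proposed fallback, not OPAL's code. */
#include <stdlib.h>
#include <sys/stat.h>

extern char *opal_home_directory(void);   /* assumed, as in the snippet above */

static const char *pick_tmp_directory(void)
{
    const char *str;
    struct stat st;

    if( NULL != (str = getenv("TMPDIR")) ) return str;
    if( NULL != (str = getenv("TEMP")) )   return str;
    if( NULL != (str = getenv("TMP")) )    return str;

    /* proposed extra step: prefer /tmp when it is a real directory */
    if( 0 == stat("/tmp", &st) && S_ISDIR(st.st_mode) ) return "/tmp";

    if( NULL != (str = opal_home_directory()) ) return str;
    return ".";
}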


Cheers,
Josh

On Mar 31, 2008, at 4:01 PM, Josh Hursey wrote:

Nope. None of those environment variables are defined. Should they
be? It would seem that the last part of the logic should be (re-)
extended to use /tmp if it exists.

-- Josh

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:

I looked over the code and I don't see any problems with the
changes. The only think I did is replacing the getenv("HOME") by
opal_home_directory ...

Here is the logic for selecting the TMP directory:

if( NULL == (str = getenv("TMPDIR")) )
if( NULL == (str = getenv("TEMP")) )
if( NULL == (str = getenv("TMP")) )
if( NULL == (str = opal_home_directory()) )
str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

  george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread George Bosilca
TMPDIR and TMP are standard on Unix. If they are not defined ... one
cannot guess where the temporary files should be located. Unfortunately,
if we start using /tmp directly we might make the wrong guess.

What does mktemp return on your system?

  george.

On Mar 31, 2008, at 4:01 PM, Josh Hursey wrote:

Nope. None of those environment variables are defined. Should they
be? It would seem that the last part of the logic should be (re-)
extended to use /tmp if it exists.

-- Josh

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:

I looked over the code and I don't see any problems with the
changes. The only think I did is replacing the getenv("HOME") by
opal_home_directory ...

Here is the logic for selecting the TMP directory:

   if( NULL == (str = getenv("TMPDIR")) )
   if( NULL == (str = getenv("TEMP")) )
   if( NULL == (str = getenv("TMP")) )
   if( NULL == (str = opal_home_directory()) )
   str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

 george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Ralph H Castain
Here is the problem - the following code was changed in session_dir.c:

-#ifdef __WINDOWS__
-#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
-#else
-#define OMPI_DEFAULT_TMPDIR "/tmp"
-#endif
-
 #define OMPI_PRINTF_FIX_STRING(a) ((NULL == a) ? "(null)" : a)

 /
@@ -262,14 +257,8 @@
 else if( NULL != getenv("OMPI_PREFIX_ENV") ) { /* OMPI Environment var
*/
 prefix = strdup(getenv("OMPI_PREFIX_ENV"));
 }
-else if( NULL != getenv("TMPDIR") ) { /* General Environment var */
-prefix = strdup(getenv("TMPDIR"));
-}
-else if( NULL != getenv("TMP") ) { /* Another general environment var
*/
-prefix = strdup(getenv("TMP"));
-}
-else { /* ow. just use the default tmp directory */
-prefix = strdup(OMPI_DEFAULT_TMPDIR);
+else { /* General Environment var */
+prefix = strdup(opal_tmp_directory());
 }

I believe the problem is that opal_tmp_directory doesn't have
OMPI_DEFAULT_TMPDIR - it just defaults to $HOME.

This should probably be fixed.


On 3/31/08 2:01 PM, "Josh Hursey"  wrote:

> Nope. None of those environment variables are defined. Should they
> be? It would seem that the last part of the logic should be (re-)
> extended to use /tmp if it exists.
> 
> -- Josh
> 
> On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:
>> I looked over the code and I don't see any problems with the
>> changes. The only think I did is replacing the getenv("HOME") by
>> opal_home_directory ...
>> 
>> Here is the logic for selecting the TMP directory:
>> 
>> if( NULL == (str = getenv("TMPDIR")) )
>> if( NULL == (str = getenv("TEMP")) )
>> if( NULL == (str = getenv("TMP")) )
>> if( NULL == (str = opal_home_directory()) )
>> str = ".";
>> 
>> Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?
>> 
>>   george.
>> 
>> On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:
>>> So does anyone know why the session directories are in $HOME instead
>>> of /tmp?
>>> 
>>> I'm using r18044 and every time I run the session directories are
>>> created in $HOME. George does this have anything to do with your
>>> commits from earlier?
>>> 
>>> -- Josh
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Aurélien Bouteiller

I more than agree with Galen.

Aurelien
Le 31 mars 08 à 16:00, Shipman, Galen M. a écrit :


Slightly OT but along the same lines..

We currently have an argument to mpirun to set the HNP tmpdir (--
tmpdir).
Why don't we have an mca param to set the tmpdir for all the orted's
and such?

- Galen

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:


I looked over the code and I don't see any problems with the
changes. The only think I did is replacing the getenv("HOME") by
opal_home_directory ...

Here is the logic for selecting the TMP directory:

   if( NULL == (str = getenv("TMPDIR")) )
   if( NULL == (str = getenv("TEMP")) )
   if( NULL == (str = getenv("TMP")) )
   if( NULL == (str = opal_home_directory()) )
   str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

 george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:

So does anyone know why the session directories are in $HOME instead
of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel





Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread George Bosilca
Commit r18046 restores exactly the same logic as before r18037. It
redirects everything to /tmp if no special environment variable is set.


  george.

On Mar 31, 2008, at 4:09 PM, Josh Hursey wrote:

Taking a quick look at the commits it seems that r18037 looks like
the most likely cause of this problem.

Previously the session directory was forced to "/tmp" if no
environment variables were set. This revision removes this logic and
uses the opal_tmp_directory(). Though I agree with this change, I
think the logic for selecting the TMP directory should be extended to
use '/tmp' if it exists. If it does not then the home directory
should be a fine last alternative.

How does that sound as a solution? This would prevent us from
unexpectedly changing our running behavior in user environments in
which none of those variables are set.

Cheers,
Josh

On Mar 31, 2008, at 4:01 PM, Josh Hursey wrote:

Nope. None of those environment variables are defined. Should they
be? It would seem that the last part of the logic should be (re-)
extended to use /tmp if it exists.

-- Josh

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:

I looked over the code and I don't see any problems with the
changes. The only think I did is replacing the getenv("HOME") by
opal_home_directory ...

Here is the logic for selecting the TMP directory:

   if( NULL == (str = getenv("TMPDIR")) )
   if( NULL == (str = getenv("TEMP")) )
   if( NULL == (str = getenv("TMP")) )
   if( NULL == (str = opal_home_directory()) )
   str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment ?

 george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:
So does anyone know why the session directories are in $HOME  
instead

of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] Session directories in $HOME?

2008-03-31 Thread Josh Hursey

Thanks for the fix.

Cheers,
Josh

On Mar 31, 2008, at 4:17 PM, George Bosilca wrote:
Commit r18046 restores exactly the same logic as before r18037. It
redirects everything to /tmp if no special environment variable is set.


  george.

On Mar 31, 2008, at 4:09 PM, Josh Hursey wrote:

Taking a quick look at the commits it seems that r18037 looks like
the most likely cause of this problem.

Previously the session directory was forced to "/tmp" if no
environment variables were set. This revision removes this logic and
uses the opal_tmp_directory(). Though I agree with this change, I
think the logic for selecting the TMP directory should be extended to
use '/tmp' if it exists. If it does not then the home directory
should be a fine last alternative.

How does that sound as a solution? This would prevent us from
unexpectedly changing our running behavior in user environments in
which none of those variables are set.

Cheers,
Josh

On Mar 31, 2008, at 4:01 PM, Josh Hursey wrote:

Nope. None of those environment variables are defined. Should they
be? It would seem that the last part of the logic should be (re-)
extended to use /tmp if it exists.

-- Josh

On Mar 31, 2008, at 3:51 PM, George Bosilca wrote:

I looked over the code and I don't see any problems with the
changes. The only think I did is replacing the getenv("HOME") by
opal_home_directory ...

Here is the logic for selecting the TMP directory:

   if( NULL == (str = getenv("TMPDIR")) )
   if( NULL == (str = getenv("TEMP")) )
   if( NULL == (str = getenv("TMP")) )
   if( NULL == (str = opal_home_directory()) )
   str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your  
environment ?


 george.

On Mar 31, 2008, at 3:13 PM, Josh Hursey wrote:
So does anyone know why the session directories are in $HOME  
instead

of /tmp?

I'm using r18044 and every time I run the session directories are
created in $HOME. George does this have anything to do with your
commits from earlier?

-- Josh
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] [OMPI svn] svn:open-mpi r18046

2008-03-31 Thread Bert Wesarg
On Mon, Mar 31, 2008 at 10:15 PM,   wrote:
> Author: bosilca
>  Date: 2008-03-31 16:15:49 EDT (Mon, 31 Mar 2008)
>  New Revision: 18046
>  URL: https://svn.open-mpi.org/trac/ompi/changeset/18046
>
>  Modified: trunk/opal/util/opal_environ.c
>  +#ifdef __WINDOWS__
>  +#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
>  +#else
>  +#define OMPI_DEFAULT_TMPDIR "/tmp"
>  +#endif
>  +
Wrong prefix for this file?

Bert


Re: [OMPI devel] [OMPI svn] svn:open-mpi r18046

2008-03-31 Thread George Bosilca

You're right ... I'll make the change asap.
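
Presumably the change is just giving the macro the OPAL_ prefix, since it
now lives in opal/util/opal_environ.c - a sketch only, not necessarily the
exact follow-up commit:

/* assumed rename: OMPI_ -> OPAL_ to match the opal/ naming convention */
#ifdef __WINDOWS__
#define OPAL_DEFAULT_TMPDIR "C:\\TEMP"
#else
#define OPAL_DEFAULT_TMPDIR "/tmp"
#endif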

  Thanks,
george.

On Mar 31, 2008, at 5:39 PM, Bert Wesarg wrote:

On Mon, Mar 31, 2008 at 10:15 PM,   wrote:

Author: bosilca
Date: 2008-03-31 16:15:49 EDT (Mon, 31 Mar 2008)
New Revision: 18046
URL: https://svn.open-mpi.org/trac/ompi/changeset/18046

Modified: trunk/opal/util/opal_environ.c
+#ifdef __WINDOWS__
+#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
+#else
+#define OMPI_DEFAULT_TMPDIR "/tmp"
+#endif
+

Wrong prefix for this file?

Bert
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



