[OMPI devel] limit tcp fragment size?
G'day. Just a quick basic question: in the case of the TCP BTL, how do I limit the fragment size? I do not want MPI to send a fragment larger than, let's say, 16K. If I am not mistaken, shouldn't btl_tcp_min_send_size do the trick? If it is supposed to, why do I see packets of length 64K? Thanks in advance. Best Regards, Muhammad Atif
[OMPI devel] segfault on host not found error.
I accidentally ran a job with a hostfile where one of the hosts was not properly mounted. As a result I got an error and a segfault.

/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile ./mpi_p01 -t lt
bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 9753) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered an error.
More information may be available above.
--------------------------------------------------------------------------
[witch1:09745] *** Process received signal ***
[witch1:09745] Signal: Segmentation fault (11)
[witch1:09745] Signal code: Address not mapped (1)
[witch1:09745] Failing at address: 0x3c
[witch1:09745] [ 0] /lib64/libpthread.so.0 [0x2aff223ebc10]
[witch1:09745] [ 1] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cdfe21]
[witch1:09745] [ 2] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_rml_oob.so [0x2aff22c398f1]
[witch1:09745] [ 3] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so [0x2aff22d426ee]
[witch1:09745] [ 4] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so [0x2aff22d433fb]
[witch1:09745] [ 5] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so [0x2aff22d4485b]
[witch1:09745] [ 6] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [ 7] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun [0x403203]
[witch1:09745] [ 8] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [ 9] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x8b) [0x2aff21e060cb]
[witch1:09745] [10] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_trigger_event+0x20) [0x2aff21cc6940]
[witch1:09745] [11] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_wakeup+0x2d) [0x2aff21cc776d]
[witch1:09745] [12] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so [0x2aff22b34756]
[witch1:09745] [13] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cc6ea7]
[witch1:09745] [14] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [15] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x8b) [0x2aff21e060cb]
[witch1:09745] [16] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_plm_base_daemon_callback+0xad) [0x2aff21ce068d]
[witch1:09745] [17] /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so [0x2aff22b34e5e]
[witch1:09745] [18] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun [0x402e13]
[witch1:09745] [19] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun [0x402873]
[witch1:09745] [20] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2aff22512154]
[witch1:09745] [21] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun [0x4027c9]
[witch1:09745] *** End of error message ***
Segmentation fault (core dumped)

Best Regards,
Lenny.
Re: [OMPI devel] RMAPS rank_file component patch and modifications for review
On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:

>> - I don't think we can delete the MCA param ompi_paffinity_alone; it exists in the v1.2 series and has historical precedent.
>
> It will not be deleted; it will just use the same infrastructure (the slot_list parameter and opal_base functions). It will be transparent to the user.
>
> Users have 3 ways to set it up:
>
> 1. mca opal_paffinity_alone 1
>    This will set paffinity as it did before.
> 2. mca opal_paffinity_slot_list "slot_list"
>    Used to define the slots that will be used for all ranks on all nodes.
> 3. mca rmaps_rank_file_path rankfile
>    Assigns ranks to CPUs according to the file.

I don't see the MCA parameter "mpi_paffinity_alone" anymore:

-----
[4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep paffinity_alone
    MCA opal: parameter "opal_paffinity_alone" (current value: "0")
[4:54] svbu-mpi:~/svn/ompi2 %
-----

My point is that I don't think we should delete this parameter; there is historical precedent for it (and it has been documented on the web page for a long, long time). Perhaps it can now simply be a synonym for opal_paffinity_alone (registered in the MPI layer, not opal).

--
Jeff Squyres
Cisco Systems
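A rough sketch of what registering such a synonym could look like (illustrative only: the mca_base_param_reg_* calls and their exact signatures are assumptions to be checked against opal/mca/base/mca_base_param.h, and this is not the actual patch):

#include <stdbool.h>
#include "opal/mca/base/mca_base_param.h"

/* Sketch: register (or look up) the opal-level parameter, then make
 * "mpi_paffinity_alone" resolve to the same underlying value so the old,
 * documented name keeps working. */
static int register_paffinity_alone_synonym(void)
{
    int value = 0;
    int idx = mca_base_param_reg_int_name("opal", "paffinity_alone",
                                          "If nonzero, assume this job has its nodes to itself and bind processes",
                                          false, false, 0, &value);
    if (idx < 0) {
        return idx;
    }
    /* Assumed synonym-registration call; the last argument marks deprecation. */
    return mca_base_param_reg_syn_name(idx, "mpi", "paffinity_alone", false);
}

With something along those lines in place, ompi_info should show both names backed by the same value.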
Re: [OMPI devel] RMAPS rank_file component patch and modifications for review
Sorry, I missed this mail. IIRC, the verbosity level for stream 0 is 0. It probably would not be good to increase it; many places in the code use output stream 0. Perhaps you could make a new stream with a different verbosity level to do what you want...? See the docs in opal/util/output.h.

On Mar 27, 2008, at 8:12 AM, Lenny Verkhovsky wrote:

No, I just tried to see some printouts during the run. In the code I use:

opal_output_verbose(0, 0, "LNY100 opal_paffinity_base_slot_list_set ver=%d ", 0);
opal_output_verbose(1, 0, "LNY101 opal_paffinity_base_slot_list_set ver=%d ", 1);
OPAL_OUTPUT_VERBOSE((1, 0, "VERBOSE LNY102 opal_paffinity_base_slot_list_set ver=%d ", 1));

but all I see is the first line (since I put level 0). I suppose that to see the second line I must configure with --enable-debug, but this is not working for me either.

On Thu, Mar 27, 2008 at 2:02 PM, Jeff Squyres wrote:
Are you using BTL_OUTPUT or something else from btl_base_error.h?

On Mar 27, 2008, at 7:49 AM, Lenny Verkhovsky wrote:
> Hi,
> thanks for the comments. I will definitely implement all of them and commit the code as soon as I am finished.
>
> Also I experience a few problems with using opal_output_verbose; either there is a bug or I am doing something wrong.
>
> /home/USERS/lenny/OMPI_ORTE_DEBUG/bin/mpirun -mca mca_verbose 0 -mca paffinity_base_verbose 1 --byslot -np 2 -hostfile hostfile -mca btl_openib_max_lmc 1 -mca opal_paffinity_alone 1 -mca btl_openib_verbose 1 /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug -t lt
>
> /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug: symbol lookup error: /home/USERS/lenny/OMPI_ORTE_DEBUG//lib/openmpi/mca_btl_openib.so: undefined symbol: mca_btl_base_out
> /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug: symbol lookup error: /home/USERS/lenny/OMPI_ORTE_DEBUG//lib/openmpi/mca_btl_openib.so: undefined symbol: mca_btl_base_out
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 1 with PID 5896 on node witch17 exiting without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).
>
> On Wed, Mar 26, 2008 at 2:50 PM, Ralph H Castain wrote:
> I would tend to echo Tim's suggestions. I note that you do look up that opal mca param in orte as well. I know you sent me a note about that off-list - I apologize for not getting to it yet, but was swamped yesterday.
>
> I think the solution suggested in #1 below is the right approach. Looking up opal params in orte or ompi is probably not a good idea. We have had problems in the past where params were looked up in multiple places as people -do- sometimes change the names (ahem...).
>
> Also, I would suggest using the macro version of verbose, OPAL_OUTPUT_VERBOSE, so that it compiles out for non-debug builds - up to you. Many of us use it as we don't need the output from optimized builds.
>
> Other than that, I think this looks fine. I do truly appreciate the cleanup of ompi_mpi_init.
>
> Ralph
>
> On 3/26/08 6:09 AM, "Tim Prins" wrote:
> > Hi Lenny,
> >
> > This looks good. But I have a couple of suggestions (which others may disagree with):
> >
> > 1. You register an opal mca parameter, but look it up in ompi, then call an opal function with the result. What if you had a function opal_paffinity_base_set_slots(long rank) (or some other name, I don't care) which looked up the mca parameter and then set up the slots as you are doing if it is found. This would make things a bit cleaner IMHO.
> >
> > 2. The functions in the paffinity base should be prefixed with 'opal_paffinity_base_'.
> >
> > 3. Why was the ompi_debug_flag added? It is not used anywhere.
> >
> > 4. You probably do not need to add the opal debug flag. There is already a 'paffinity_base_verbose' flag which should suit your purposes fine. So you should just be able to replace all of the conditional output statements in paffinity with something like opal_output_verbose(10, opal_paffinity_base_output, ...), where 10 is the verbosity level number.
> >
> > Tim
> >
> > Lenny Verkhovsky wrote:
> >> Hi, all
> >>
> >> Attached patch for modified Rank_File RMAPS component.
> >>
> >> 1. introduced new general purpose debug flags: mpi_debug, opal_debug
> >> 2. introduced new mca parameter opal_paffinity_slot_list
> >> 3. ompi_mpi_init cleaned from opal paffinity functions
> >> 4. opal paffinity functions moved to new file opal/mca/paffinity/base/paffinity_base_service.c
> >> 5. rank_file component files were renamed according to prefix policy
> >> 6. global variables renamed as well.
> >> 7. few bug fixes that were brought up during previous discussions.
> >> 8. If user defines opal_paffinity_a
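Regarding the "make a new stream with a different verbosity level" suggestion above, a minimal sketch of what that looks like (illustrative only; see opal/util/output.h for the real interface):

#include "opal/util/output.h"

static int my_paffinity_output = -1;

static void my_debug_init(int verbose_level)
{
    /* Open a dedicated stream instead of reusing stream 0. */
    my_paffinity_output = opal_output_open(NULL);
    opal_output_set_verbosity(my_paffinity_output, verbose_level);
}

static void my_debug_example(void)
{
    /* Printed whenever the stream's verbosity is >= 1. */
    opal_output_verbose(1, my_paffinity_output, "slot_list set, level 1 message");

    /* Same idea, but compiled out entirely in non-debug builds. */
    OPAL_OUTPUT_VERBOSE((10, my_paffinity_output, "very chatty level 10 message"));
}

A message at a given level appears only when the stream's verbosity is at least that level, which is consistent with the observation above that only the level-0 call on stream 0 was printed.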
Re: [OMPI devel] RMAPS rank_file component patch and modifications for review
Jeff Squyres wrote: On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote: - I don't think we can delete the MCA param ompi_paffinity_alone; it exists in the v1.2 series and has historical precedent. It will not be deleted, It will just use the same infrastructure ( slot_list parameter and opal_base functions ). It will be transparent for the user. User have 3 ways to setup it 1. mca opal_paffinity_alone 1 This will set paffinity as it did before 2. mca opal_paffinity_slot_list "slot_list" Used to define slots that will be used for all ranks on all nodes. 3. mca rmaps_rank_file_path rankfile Assigning ranks to CPUs according to the file I don't see the MCA parameter "mpi_paffinity_alone" anymore: - [4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep paffinity_alone MCA opal: parameter "opal_paffinity_alone" (current value: "0") [4:54] svbu-mpi:~/svn/ompi2 % - My point is that I don't think we should delete this parameter; there is historical precedence for it (and it has been documented on the web page for a long, long time). Perhaps it can now simply be a synonym for opal_paffinity_alone (registered in the MPI layer, not opal). I agree with Jeff on the above. This would cause a lot of busy work for our customers and internal setups. --td
Re: [OMPI devel] RMAPS rank_file component patch and modifications for review
OK, I am putting it back. > -Original Message- > From: terry.don...@sun.com [mailto:terry.don...@sun.com] > Sent: Monday, March 31, 2008 2:59 PM > To: Open MPI Developers > Cc: Lenny Verkhovsky; Sharon Melamed > Subject: Re: [OMPI devel] RMAPS rank_file component patch and > modifications for review > > Jeff Squyres wrote: > > On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote: > > > >>> - I don't think we can delete the MCA param ompi_paffinity_alone; it > >>> exists in the v1.2 series and has historical precedent. > >>> > >> It will not be deleted, > >> It will just use the same infrastructure ( slot_list parameter and > >> opal_base functions ). It will be transparent for the user. > >> > >> User have 3 ways to setup it > >> 1. mca opal_paffinity_alone 1 > >>This will set paffinity as it did before > >> 2. mca opal_paffinity_slot_list "slot_list" > >>Used to define slots that will be used for all ranks on all > >> nodes. > >> 3. mca rmaps_rank_file_path rankfile > >>Assigning ranks to CPUs according to the file > >> > > > > > > I don't see the MCA parameter "mpi_paffinity_alone" anymore: > > > > - > > [4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep > > paffinity_alone > > MCA opal: parameter "opal_paffinity_alone" (current > > value: "0") > > [4:54] svbu-mpi:~/svn/ompi2 % > > - > > > > My point is that I don't think we should delete this parameter; there > > is historical precedence for it (and it has been documented on the web > > page for a long, long time). Perhaps it can now simply be a synonym > > for opal_paffinity_alone (registered in the MPI layer, not opal). > > > > > I agree with Jeff on the above. This would cause a lot of busy work for > our customers and internal setups. > > --td
Re: [OMPI devel] Scalability of openib modex
Thanks Jeff. It appears to me that the first approach to reducing modex data makes the most sense and has the largest impact - I would advocate pursuing it first. We can look at further refinements later. Along that line, one thing we also exchange in the modex (not IB specific) is hostname and arch. This is in the ompi/proc/proc.c code. It seems to me that this is also wasteful and can be removed. The daemons already have that info for the job and can easily "drop" it into each proc - there is no reason to send it around. I'll take a look at cleaning that up, ensuring we don't "break" daemonless environments, along with the other things underway. Ralph On 3/28/08 11:37 AM, "Jeff Squyres" wrote: > I've had this conversation independently with several people now, so > I'm sending it to the list rather than continuing to have the same > conversation over and over. :-) > > -- > > As most of you know, Jon and I are working on the new openib > "CPC" (connect pseudo-component) stuff in /tmp-public/openib-cpc2. > There are two main reasons for it: > > 1. Add support for RDMA CM (they need it for iWarp support) > 2. Add support for IB CM (which will hopefully be a more scalable > connect system as compared to the current RML/OOB-based method of > making IB QPs) > > When complete, there will be 4 CPCs: RDMA CM, IB CM, OOB, and XOOB > (same as OOB but with ConnectX XRC extensions). > > RDMA CM has some known scaling issues, and at least some known > workarounds -- I won't discuss the merits/drawbacks of RDMA CM here. > IB CM has unknown scaling characteristics, but seems to look good on > paper (e.g., it uses UD for a 3-way handshake to make an IB QP). > > On the trunk, it's a per-MPI process decision as to which CPC you'll > use. Per ticket #1191, one of the goals of the /tmp-public branch is > to make CPC decision be a per-openib-BTL-module decision. So you can > mix iWarp and IB hardware in a single host, for example. This fits in > quite well with the "mpirun should work out of the box" philosophy of > Open MPI. > > In the openib BTL, each BTL module is paired with a specific HCA/NIC > (verbs) port. And depending on the interface hardware and software, > one or more CPCs may be available for each. Hence, for each BTL > module in each MPI process, we may send one or more CPC connect > information blobs in the modex (note that the oob and xoob CPCs don't > need to send anything additional in the modex). > > Jon and I are actually getting closer to completion on the branch, and > it seems to be working. > > In conjunction with several other scalability discussions that are > ongoing right now, several of us have toyed with two basic ideas to > improve scalability of job launch / startup: > > 1. the possibility of eliminating the modex altogether (e.g., have > ORTE dump enough information to each MPI process to figure out/ > calculate/locally lookup [in local files?] BTL addressing information > for all peers in MPI_COMM_WORLD, etc.), a la Portals. > > 2. reducing the amount of data in the modex. > > One obvious idea for #2 is to have only one process on each host send > all/the majority of openib BTL modex information for that host. The > rationale here is that all MPI processes on a single host will share > much of the same BTL addressing information, so why send it N times? > Local rank 0 can modex send all/the majority of the modex for the > openib BTL modules; local ranks 1-N can either send nothing or a > [very] small piece of differentiating information (e.g., IBCM service > ID). 
> > This effectively makes the modex info for the openib BTL scale with > the number of nodes, not the number of processes. This can be a big > win in terms of overall modex size that needs to be both gathered and > bcast. > > I worked up a spreadsheet showing the current size of the modex in the > openib-cpc2 branch right now (using some "somewhat" contrived machine > size/ppn/port combinations), and then compared it to the size after > implementing the #2 idea shown above (see attached PDF). > > I also included a 3rd comparison for if Jon/I are able to reduce the > CPC modex blob sizes -- we don't know yet if that'll work or not. But > the numbers show that reducing the blobs by a few bytes clearly has > [much] less of an impact than the "eliminating redundant modex > information" idea, so we'll work on that one first. > > Additionally, reducing the modex size, paired with other ongoing ORTE > scalability efforts, may obviate the need to eliminate the modex (at > least for now...). Or, more specifically, efforts for eliminating the > modex can be pushed to beyond v1.3. > > Of course, the same ideas can apply to other BTLs. We're only working > on the openib BTL for now.
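To make idea #2 above concrete, here is a rough sketch of the per-node publish. This is not the branch code: the modex call, its header location, and the is_lowest_local_rank() helper are assumptions used only for illustration.

#include <stdbool.h>
#include <stddef.h>
#include "ompi/mca/btl/btl.h"
#include "ompi/runtime/ompi_module_exchange.h"   /* ompi_modex_send() -- assumed location */

/* Placeholder for however the openib BTL decides which local process
 * owns the host-wide blob (e.g., the lowest rank on the node). */
static bool is_lowest_local_rank(void)
{
    /* In real code this would consult the local-rank information. */
    return true;
}

/* Sketch of "send host-common openib data once per node": the chosen
 * process publishes the full addressing blob; everyone else publishes
 * only a small differentiating piece (e.g., an IB CM service ID). */
static int publish_openib_modex(mca_btl_base_component_t *component,
                                const void *host_blob, size_t host_len,
                                const void *small_blob, size_t small_len)
{
    if (is_lowest_local_rank()) {
        return ompi_modex_send(&component->btl_version, host_blob, host_len);
    }
    return ompi_modex_send(&component->btl_version, small_blob, small_len);
}

Every process still participates in the modex, so the gather/bcast machinery is unchanged; only the payload shrinks for the non-elected processes, which is what makes the total size scale with nodes rather than processes.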
Re: [OMPI devel] Scalability of openib modex
On Mar 31, 2008, at 9:22 AM, Ralph H Castain wrote: Thanks Jeff. It appears to me that the first approach to reducing modex data makes the most sense and has the largest impact - I would advocate pursuing it first. We can look at further refinements later. Along that line, one thing we also exchange in the modex (not IB specific) is hostname and arch. This is in the ompi/proc/proc.c code. It seems to me that this is also wasteful and can be removed. The daemons already have that info for the job and can easily "drop" it into each proc - there is no reason to send it around. I'll take a look at cleaning that up, ensuring we don't "break" daemonless environments, along with the other things underway. Sounds perfect. Ralph On 3/28/08 11:37 AM, "Jeff Squyres" wrote: I've had this conversation independently with several people now, so I'm sending it to the list rather than continuing to have the same conversation over and over. :-) -- As most of you know, Jon and I are working on the new openib "CPC" (connect pseudo-component) stuff in /tmp-public/openib-cpc2. There are two main reasons for it: 1. Add support for RDMA CM (they need it for iWarp support) 2. Add support for IB CM (which will hopefully be a more scalable connect system as compared to the current RML/OOB-based method of making IB QPs) When complete, there will be 4 CPCs: RDMA CM, IB CM, OOB, and XOOB (same as OOB but with ConnectX XRC extensions). RDMA CM has some known scaling issues, and at least some known workarounds -- I won't discuss the merits/drawbacks of RDMA CM here. IB CM has unknown scaling characteristics, but seems to look good on paper (e.g., it uses UD for a 3-way handshake to make an IB QP). On the trunk, it's a per-MPI process decision as to which CPC you'll use. Per ticket #1191, one of the goals of the /tmp-public branch is to make CPC decision be a per-openib-BTL-module decision. So you can mix iWarp and IB hardware in a single host, for example. This fits in quite well with the "mpirun should work out of the box" philosophy of Open MPI. In the openib BTL, each BTL module is paired with a specific HCA/NIC (verbs) port. And depending on the interface hardware and software, one or more CPCs may be available for each. Hence, for each BTL module in each MPI process, we may send one or more CPC connect information blobs in the modex (note that the oob and xoob CPCs don't need to send anything additional in the modex). Jon and I are actually getting closer to completion on the branch, and it seems to be working. In conjunction with several other scalability discussions that are ongoing right now, several of us have toyed with two basic ideas to improve scalability of job launch / startup: 1. the possibility of eliminating the modex altogether (e.g., have ORTE dump enough information to each MPI process to figure out/ calculate/locally lookup [in local files?] BTL addressing information for all peers in MPI_COMM_WORLD, etc.), a la Portals. 2. reducing the amount of data in the modex. One obvious idea for #2 is to have only one process on each host send all/the majority of openib BTL modex information for that host. The rationale here is that all MPI processes on a single host will share much of the same BTL addressing information, so why send it N times? Local rank 0 can modex send all/the majority of the modex for the openib BTL modules; local ranks 1-N can either send nothing or a [very] small piece of differentiating information (e.g., IBCM service ID). 
This effectively makes the modex info for the openib BTL scale with the number of nodes, not the number of processes. This can be a big win in terms of overall modex size that needs to be both gathered and bcast. I worked up a spreadsheet showing the current size of the modex in the openib-cpc2 branch right now (using some "somewhat" contrived machine size/ppn/port combinations), and then compared it to the size after implementing the #2 idea shown above (see attached PDF). I also included a 3rd comparison for if Jon/I are able to reduce the CPC modex blob sizes -- we don't know yet if that'll work or not. But the numbers show that reducing the blobs by a few bytes clearly has [much] less of an impact than the "eliminating redundant modex information" idea, so we'll work on that one first. Additionally, reducing the modex size, paired with other ongoing ORTE scalability efforts, may obviate the need to eliminate the modex (at least for now...). Or, more specifically, efforts for eliminating the modex can be pushed to beyond v1.3. Of course, the same ideas can apply to other BTLs. We're only working on the openib BTL for now. ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Jeff Squyres Cisco Systems
[OMPI devel] Routed 'unity' broken on trunk
Ralph,

I've just noticed that the 'unity' routed component seems to be broken when using more than one machine. I'm using Odin and r18028 of the trunk, and have confirmed that this problem occurs with SLURM and rsh. I think this break came in on Friday, as that is when some of my MTT tests started to hang and fail, but I cannot point to a specific revision at this point. The backtraces (enclosed) of the processes point to the grpcomm allgather routine.

The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize.

RSH example from odin023 - so no SLURM variables:
These work:
shell$ mpirun -np 2 -host odin023 noop -v 1
shell$ mpirun -np 2 -host odin023,odin024 noop -v 1
shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1

This hangs:
shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop -v 1

If I attach to the 'noop' process on odin023 I get the following backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910, tv=0x7fbfffe840) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506c30, flags=5) at event.c:779
#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c:169
#6  0x002a958b9e48 in orte_grpcomm_base_allgather (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/grpcomm_base_allgather.c:238
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58, requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c, argv=0x7fbfffec70) at pinit.c:88
#10 0x00400bf4 in main (argc=3, argv=0x7fbfffed58) at noop.c:39

The 'noop' process on odin024 has a similar backtrace:

(gdb) bt
#0  0x002a96226b39 in syscall () from /lib64/tls/libc.so.6
#1  0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b390, maxevents=1023, timeout=1000) at epoll_sub.c:61
#2  0x002a95a1e7f7 in epoll_dispatch (base=0x506cc0, arg=0x506c20, tv=0x7fbfffe9d0) at epoll.c:210
#3  0x002a95a1c057 in opal_event_base_loop (base=0x506cc0, flags=5) at event.c:779
#4  0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702
#5  0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c:169
#6  0x002a958b97c5 in orte_grpcomm_base_allgather (sbuf=0x7fbfffec70, rbuf=0x7fbfffec10) at base/grpcomm_base_allgather.c:163
#7  0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/grpcomm_base_modex.c:413
#8  0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffeee8, requested=0, provided=0x7fbfffedc8) at runtime/ompi_mpi_init.c:510
#9  0x002a956f2109 in PMPI_Init (argc=0x7fbfffee0c, argv=0x7fbfffee00) at pinit.c:88
#10 0x00400bf4 in main (argc=3, argv=0x7fbfffeee8) at noop.c:39

Cheers,
Josh
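For reference, the 'noop' reproducer described above amounts to something like this (a sketch; the real test presumably also parses its -v argument):

/* noop.c -- minimal reproducer: init, sleep, finalize. */
#include <unistd.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    sleep(10);          /* long enough to attach gdb to the running processes */
    MPI_Finalize();
    return 0;
}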
Re: [OMPI devel] Routed 'unity' broken on trunk
I figured out the issue - there is a simple and a hard way to fix this. So before I do, let me see what makes sense. The simple solution involves updating the daemons with contact info for the procs so that they can send their collected modex info to the rank=0 proc. This will measurably slow the launch when using unity. The hard solution is to do a hybrid routed approach whereby the daemons would route any daemon-to-proc communication while the procs continue to do direct proc-to-proc messaging. Is there some reason to be using the "unity" component? Do you care if jobs using unity launch slower? Thanks Ralph On 3/31/08 7:57 AM, "Josh Hursey" wrote: > Ralph, > > I've just noticed that it seems that the 'unity' routed component > seems to be broken when using more than one machine. I'm using Odin > and r18028 of the trunk, and have confirmed that this problem occurs > with SLURM and rsh. I think this break came in on Friday as that is > when some of my MTT tests started to hang and fail, but I cannot point > to a specific revision at this point. The backtraces (enclosed) of the > processes point to the grpcomm allgather routine. > > The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize. > > RSH example from odin023 - so no SLURM variables: > These work: > shell$ mpirun -np 2 -host odin023 noop -v 1 > shell$ mpirun -np 2 -host odin023,odin024 noop -v 1 > shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1 > > This hangs: > shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop -v 1 > > > If I attach to the 'noop' process on odin023 I get the following > backtrace: > > (gdb) bt > #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 > #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, > maxevents=1023, timeout=1000) at epoll_sub.c:61 > #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910, > tv=0x7fbfffe840) at epoll.c:210 > #3 0x002a95a1c057 in opal_event_base_loop (base=0x506c30, > flags=5) at event.c:779 > #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702 > #5 0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c: > 169 > #6 0x002a958b9e48 in orte_grpcomm_base_allgather > (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/ > grpcomm_base_allgather.c:238 > #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ > grpcomm_base_modex.c:413 > #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58, > requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510 > #9 0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c, > argv=0x7fbfffec70) at pinit.c:88 > #10 0x00400bf4 in main (argc=3, argv=0x7fbfffed58) at noop.c:39 > > > The 'noop' process on odin024 has a similar backtrace: > > (gdb) bt > #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 > #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b390, > maxevents=1023, timeout=1000) at epoll_sub.c:61 > #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506cc0, arg=0x506c20, > tv=0x7fbfffe9d0) at epoll.c:210 > #3 0x002a95a1c057 in opal_event_base_loop (base=0x506cc0, > flags=5) at event.c:779 > #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702 > #5 0x002a95a0bef8 in opal_progress () at runtime/opal_progress.c: > 169 > #6 0x002a958b97c5 in orte_grpcomm_base_allgather > (sbuf=0x7fbfffec70, rbuf=0x7fbfffec10) at base/ > grpcomm_base_allgather.c:163 > #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ > grpcomm_base_modex.c:413 > #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffeee8, > requested=0, provided=0x7fbfffedc8) at 
runtime/ompi_mpi_init.c:510 > #9 0x002a956f2109 in PMPI_Init (argc=0x7fbfffee0c, > argv=0x7fbfffee00) at pinit.c:88 > #10 0x00400bf4 in main (argc=3, argv=0x7fbfffeee8) at noop.c:39 > > > > Cheers, > Josh
Re: [OMPI devel] Routed 'unity' broken on trunk
At the moment I only use unity with C/R. Mostly because I have not verified that the other components work properly under the C/R conditions. I can verify others, but that doesn't solve the problem with the unity component. :/ It is not critical that these jobs launch quickly, but that they launch correctly for the moment. When you say 'slow the launch' are you talking severely as in seconds/minutes for small nps? I guess a followup question is why did this component break in the first place? or worded differently, what changed in ORTE such that the unity component will suddenly deadlock when it didn't before? Thanks for looking into this, Josh On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote: I figured out the issue - there is a simple and a hard way to fix this. So before I do, let me see what makes sense. The simple solution involves updating the daemons with contact info for the procs so that they can send their collected modex info to the rank=0 proc. This will measurably slow the launch when using unity. The hard solution is to do a hybrid routed approach whereby the daemons would route any daemon-to-proc communication while the procs continue to do direct proc-to-proc messaging. Is there some reason to be using the "unity" component? Do you care if jobs using unity launch slower? Thanks Ralph On 3/31/08 7:57 AM, "Josh Hursey" wrote: Ralph, I've just noticed that it seems that the 'unity' routed component seems to be broken when using more than one machine. I'm using Odin and r18028 of the trunk, and have confirmed that this problem occurs with SLURM and rsh. I think this break came in on Friday as that is when some of my MTT tests started to hang and fail, but I cannot point to a specific revision at this point. The backtraces (enclosed) of the processes point to the grpcomm allgather routine. The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize. 
RSH example from odin023 - so no SLURM variables: These work: shell$ mpirun -np 2 -host odin023 noop -v 1 shell$ mpirun -np 2 -host odin023,odin024 noop -v 1 shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1 This hangs: shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop - v 1 If I attach to the 'noop' process on odin023 I get the following backtrace: (gdb) bt #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, maxevents=1023, timeout=1000) at epoll_sub.c:61 #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910, tv=0x7fbfffe840) at epoll.c:210 #3 0x002a95a1c057 in opal_event_base_loop (base=0x506c30, flags=5) at event.c:779 #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702 #5 0x002a95a0bef8 in opal_progress () at runtime/ opal_progress.c: 169 #6 0x002a958b9e48 in orte_grpcomm_base_allgather (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/ grpcomm_base_allgather.c:238 #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ grpcomm_base_modex.c:413 #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58, requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510 #9 0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c, argv=0x7fbfffec70) at pinit.c:88 #10 0x00400bf4 in main (argc=3, argv=0x7fbfffed58) at noop.c:39 The 'noop' process on odin024 has a similar backtrace: (gdb) bt #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b390, maxevents=1023, timeout=1000) at epoll_sub.c:61 #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506cc0, arg=0x506c20, tv=0x7fbfffe9d0) at epoll.c:210 #3 0x002a95a1c057 in opal_event_base_loop (base=0x506cc0, flags=5) at event.c:779 #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702 #5 0x002a95a0bef8 in opal_progress () at runtime/ opal_progress.c: 169 #6 0x002a958b97c5 in orte_grpcomm_base_allgather (sbuf=0x7fbfffec70, rbuf=0x7fbfffec10) at base/ grpcomm_base_allgather.c:163 #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ grpcomm_base_modex.c:413 #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffeee8, requested=0, provided=0x7fbfffedc8) at runtime/ompi_mpi_init.c:510 #9 0x002a956f2109 in PMPI_Init (argc=0x7fbfffee0c, argv=0x7fbfffee00) at pinit.c:88 #10 0x00400bf4 in main (argc=3, argv=0x7fbfffeee8) at noop.c:39 Cheers, Josh
Re: [OMPI devel] limit tcp fragment size?
The btl_tcp_min_send_size parameter is not exactly what you expect it to be. It drives only the send protocol (as implemented in Open MPI), not the put protocol the TCP BTL is using. You can achieve what you want with 2 parameters:

1. btl_tcp_flags set to 9. This will force the send protocol over TCP all the time.
2. btl_tcp_max_send_size set to 16K, which will define the size of a fragment in the pipelined send protocol.

george.

On Mar 31, 2008, at 2:46 AM, Muhammad Atif wrote:

G'day. Just a quick basic question: in the case of the TCP BTL, how do I limit the fragment size? I do not want MPI to send a fragment larger than, let's say, 16K. If I am not mistaken, shouldn't btl_tcp_min_send_size do the trick? If it is supposed to, why do I see packets of length 64K? Thanks in advance.

Best Regards,
Muhammad Atif
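Putting the two settings together on a command line might look like this (illustrative only; mpi_app stands in for the real application, and 16384 = 16K):

shell$ mpirun -mca btl tcp,self -mca btl_tcp_flags 9 -mca btl_tcp_max_send_size 16384 -np 2 ./mpi_app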
Re: [OMPI devel] Routed 'unity' broken on trunk
On 3/31/08 9:28 AM, "Josh Hursey" wrote: > At the moment I only use unity with C/R. Mostly because I have not > verified that the other components work properly under the C/R > conditions. I can verify others, but that doesn't solve the problem > with the unity component. :/ > > It is not critical that these jobs launch quickly, but that they > launch correctly for the moment. When you say 'slow the launch' are > you talking severely as in seconds/minutes for small nps? I didn't say "severely" - I said "measurably". ;-) It will require an additional communication to the daemons to let them know how to talk to the procs. In the current unity component, the daemons never talk to the procs themselves, and so they don't know contact info for rank=0. > I guess a > followup question is why did this component break in the first place? > or worded differently, what changed in ORTE such that the unity > component will suddenly deadlock when it didn't before? We are trying to improve scalability. Biggest issue is the modex, which we improved considerably by having the procs pass the modex info to the daemons, letting the daemons collect all modex info from procs on their node, and then having the daemons send that info along to the rank=0 proc for collection and xcast. Problem is that in the unity component, the local daemons don't know how to send the modex to the rank=0 proc. So what I will now have to do is tell all the daemons how to talk to the procs, and then we will have every daemon opening a socket to rank=0. That's where the time will be lost. Our original expectation was to get everyone off of unity as quickly as possible - in fact, Brian and I had planned to completely remove that component as quickly as possible as it (a) scales ugly and (b) gets in the way of things. Very hard to keep it alive. So for now, I'll just do the simple thing and hopefully that will be adequate - let me know if/when you are able to get C/R working on other routed components. Thanks! Ralph > > Thanks for looking into this, > Josh > > On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote: > >> I figured out the issue - there is a simple and a hard way to fix >> this. So >> before I do, let me see what makes sense. >> >> The simple solution involves updating the daemons with contact info >> for the >> procs so that they can send their collected modex info to the rank=0 >> proc. >> This will measurably slow the launch when using unity. >> >> The hard solution is to do a hybrid routed approach whereby the >> daemons >> would route any daemon-to-proc communication while the procs >> continue to do >> direct proc-to-proc messaging. >> >> Is there some reason to be using the "unity" component? Do you care >> if jobs >> using unity launch slower? >> >> Thanks >> Ralph >> >> >> >> On 3/31/08 7:57 AM, "Josh Hursey" wrote: >> >>> Ralph, >>> >>> I've just noticed that it seems that the 'unity' routed component >>> seems to be broken when using more than one machine. I'm using Odin >>> and r18028 of the trunk, and have confirmed that this problem occurs >>> with SLURM and rsh. I think this break came in on Friday as that is >>> when some of my MTT tests started to hang and fail, but I cannot >>> point >>> to a specific revision at this point. The backtraces (enclosed) of >>> the >>> processes point to the grpcomm allgather routine. >>> >>> The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize. 
>>> >>> RSH example from odin023 - so no SLURM variables: >>> These work: >>> shell$ mpirun -np 2 -host odin023 noop -v 1 >>> shell$ mpirun -np 2 -host odin023,odin024 noop -v 1 >>> shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1 >>> >>> This hangs: >>> shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop - >>> v 1 >>> >>> >>> If I attach to the 'noop' process on odin023 I get the following >>> backtrace: >>> >>> (gdb) bt >>> #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 >>> #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, >>> maxevents=1023, timeout=1000) at epoll_sub.c:61 >>> #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, >>> arg=0x506910, >>> tv=0x7fbfffe840) at epoll.c:210 >>> #3 0x002a95a1c057 in opal_event_base_loop (base=0x506c30, >>> flags=5) at event.c:779 >>> #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702 >>> #5 0x002a95a0bef8 in opal_progress () at runtime/ >>> opal_progress.c: >>> 169 >>> #6 0x002a958b9e48 in orte_grpcomm_base_allgather >>> (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/ >>> grpcomm_base_allgather.c:238 >>> #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at >>> base/ >>> grpcomm_base_modex.c:413 >>> #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58, >>> requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510 >>> #9 0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c, >>> argv=0x7
Re: [OMPI devel] Routed 'unity' broken on trunk
On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote: On 3/31/08 9:28 AM, "Josh Hursey" wrote: At the moment I only use unity with C/R. Mostly because I have not verified that the other components work properly under the C/R conditions. I can verify others, but that doesn't solve the problem with the unity component. :/ It is not critical that these jobs launch quickly, but that they launch correctly for the moment. When you say 'slow the launch' are you talking severely as in seconds/minutes for small nps? I didn't say "severely" - I said "measurably". ;-) It will require an additional communication to the daemons to let them know how to talk to the procs. In the current unity component, the daemons never talk to the procs themselves, and so they don't know contact info for rank=0. ah I see. I guess a followup question is why did this component break in the first place? or worded differently, what changed in ORTE such that the unity component will suddenly deadlock when it didn't before? We are trying to improve scalability. Biggest issue is the modex, which we improved considerably by having the procs pass the modex info to the daemons, letting the daemons collect all modex info from procs on their node, and then having the daemons send that info along to the rank=0 proc for collection and xcast. Problem is that in the unity component, the local daemons don't know how to send the modex to the rank=0 proc. So what I will now have to do is tell all the daemons how to talk to the procs, and then we will have every daemon opening a socket to rank=0. That's where the time will be lost. Our original expectation was to get everyone off of unity as quickly as possible - in fact, Brian and I had planned to completely remove that component as quickly as possible as it (a) scales ugly and (b) gets in the way of things. Very hard to keep it alive. So for now, I'll just do the simple thing and hopefully that will be adequate - let me know if/when you are able to get C/R working on other routed components. Sounds good. I'll look into supporting the tree routed component, but that will probably take a couple weeks. Thanks for the clarification. Cheers, Josh Thanks! Ralph Thanks for looking into this, Josh On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote: I figured out the issue - there is a simple and a hard way to fix this. So before I do, let me see what makes sense. The simple solution involves updating the daemons with contact info for the procs so that they can send their collected modex info to the rank=0 proc. This will measurably slow the launch when using unity. The hard solution is to do a hybrid routed approach whereby the daemons would route any daemon-to-proc communication while the procs continue to do direct proc-to-proc messaging. Is there some reason to be using the "unity" component? Do you care if jobs using unity launch slower? Thanks Ralph On 3/31/08 7:57 AM, "Josh Hursey" wrote: Ralph, I've just noticed that it seems that the 'unity' routed component seems to be broken when using more than one machine. I'm using Odin and r18028 of the trunk, and have confirmed that this problem occurs with SLURM and rsh. I think this break came in on Friday as that is when some of my MTT tests started to hang and fail, but I cannot point to a specific revision at this point. The backtraces (enclosed) of the processes point to the grpcomm allgather routine. The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize. 
RSH example from odin023 - so no SLURM variables: These work: shell$ mpirun -np 2 -host odin023 noop -v 1 shell$ mpirun -np 2 -host odin023,odin024 noop -v 1 shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1 This hangs: shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop - v 1 If I attach to the 'noop' process on odin023 I get the following backtrace: (gdb) bt #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, maxevents=1023, timeout=1000) at epoll_sub.c:61 #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910, tv=0x7fbfffe840) at epoll.c:210 #3 0x002a95a1c057 in opal_event_base_loop (base=0x506c30, flags=5) at event.c:779 #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c:702 #5 0x002a95a0bef8 in opal_progress () at runtime/ opal_progress.c: 169 #6 0x002a958b9e48 in orte_grpcomm_base_allgather (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/ grpcomm_base_allgather.c:238 #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ grpcomm_base_modex.c:413 #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffed58, requested=0, provided=0x7fbfffec38) at runtime/ompi_mpi_init.c:510 #9 0x002a956f2109 in PMPI_Init (argc=0x7fbfffec7c, argv=0x7fbfffec70) at pinit.c:88 #10 0x00400bf4 in ma
Re: [OMPI devel] Routed 'unity' broken on trunk
Okay - fixed with r18040 Thanks Ralph On 3/31/08 11:01 AM, "Josh Hursey" wrote: > > On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote: > >> >> >> >> On 3/31/08 9:28 AM, "Josh Hursey" wrote: >> >>> At the moment I only use unity with C/R. Mostly because I have not >>> verified that the other components work properly under the C/R >>> conditions. I can verify others, but that doesn't solve the problem >>> with the unity component. :/ >>> >>> It is not critical that these jobs launch quickly, but that they >>> launch correctly for the moment. When you say 'slow the launch' are >>> you talking severely as in seconds/minutes for small nps? >> >> I didn't say "severely" - I said "measurably". ;-) >> >> It will require an additional communication to the daemons to let >> them know >> how to talk to the procs. In the current unity component, the >> daemons never >> talk to the procs themselves, and so they don't know contact info for >> rank=0. > > ah I see. > >> >> >>> I guess a >>> followup question is why did this component break in the first place? >>> or worded differently, what changed in ORTE such that the unity >>> component will suddenly deadlock when it didn't before? >> >> We are trying to improve scalability. Biggest issue is the modex, >> which we >> improved considerably by having the procs pass the modex info to the >> daemons, letting the daemons collect all modex info from procs on >> their >> node, and then having the daemons send that info along to the rank=0 >> proc >> for collection and xcast. >> >> Problem is that in the unity component, the local daemons don't know >> how to >> send the modex to the rank=0 proc. So what I will now have to do is >> tell all >> the daemons how to talk to the procs, and then we will have every >> daemon >> opening a socket to rank=0. That's where the time will be lost. >> >> Our original expectation was to get everyone off of unity as quickly >> as >> possible - in fact, Brian and I had planned to completely remove that >> component as quickly as possible as it (a) scales ugly and (b) gets >> in the >> way of things. Very hard to keep it alive. >> >> So for now, I'll just do the simple thing and hopefully that will be >> adequate - let me know if/when you are able to get C/R working on >> other >> routed components. > > Sounds good. I'll look into supporting the tree routed component, but > that will probably take a couple weeks. > > Thanks for the clarification. > > Cheers, > Josh > >> >> >> Thanks! >> Ralph >> >>> >>> Thanks for looking into this, >>> Josh >>> >>> On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote: >>> I figured out the issue - there is a simple and a hard way to fix this. So before I do, let me see what makes sense. The simple solution involves updating the daemons with contact info for the procs so that they can send their collected modex info to the rank=0 proc. This will measurably slow the launch when using unity. The hard solution is to do a hybrid routed approach whereby the daemons would route any daemon-to-proc communication while the procs continue to do direct proc-to-proc messaging. Is there some reason to be using the "unity" component? Do you care if jobs using unity launch slower? Thanks Ralph On 3/31/08 7:57 AM, "Josh Hursey" wrote: > Ralph, > > I've just noticed that it seems that the 'unity' routed component > seems to be broken when using more than one machine. I'm using Odin > and r18028 of the trunk, and have confirmed that this problem > occurs > with SLURM and rsh. 
I think this break came in on Friday as that is > when some of my MTT tests started to hang and fail, but I cannot > point > to a specific revision at this point. The backtraces (enclosed) of > the > processes point to the grpcomm allgather routine. > > The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize. > > RSH example from odin023 - so no SLURM variables: > These work: > shell$ mpirun -np 2 -host odin023 noop -v 1 > shell$ mpirun -np 2 -host odin023,odin024 noop -v 1 > shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1 > > This hangs: > shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop - > v 1 > > > If I attach to the 'noop' process on odin023 I get the following > backtrace: > > (gdb) bt > #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 > #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, > maxevents=1023, timeout=1000) at epoll_sub.c:61 > #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, > arg=0x506910, > tv=0x7fbfffe840) at epoll.c:210 > #3 0x002a95a1c057 in opal_event_base_loop (base=0x506c30,
Re: [OMPI devel] segfault on host not found error.
I am unable to replicate the segfault. However, I was able to get the job to hang. I fixed that behavior with r18044. Perhaps you can test this again and let me know what you see. A gdb stack trace would be more helpful. Thanks Ralph On 3/31/08 5:13 AM, "Lenny Verkhovsky" wrote: > > > > I accidently run job on the hostfile where one of hosts was not properly > mounted. As a result I got an error and a segfault. > > > /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile > ./mpi_p01 -t lt > bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or > directory > > -- > A daemon (pid 9753) died unexpectedly with status 127 while attempting > to launch so we are aborting. > > There may be more information reported by the environment (see above). > > This may be because the daemon was unable to find all the needed shared > libraries on the remote node. You may set your LD_LIBRARY_PATH to have > the > location of the shared libraries on the remote nodes and this will > automatically be forwarded to the remote nodes. > > -- > > -- > mpirun was unable to start the specified application as it encountered > an error. > More information may be available above. > > -- > [witch1:09745] *** Process received signal *** > [witch1:09745] Signal: Segmentation fault (11) > [witch1:09745] Signal code: Address not mapped (1) > [witch1:09745] Failing at address: 0x3c > [witch1:09745] [ 0] /lib64/libpthread.so.0 [0x2aff223ebc10] > [witch1:09745] [ 1] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cdfe21] > [witch1:09745] [ 2] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_rml_oob.so > [0x2aff22c398f1] > [witch1:09745] [ 3] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so > [0x2aff22d426ee] > [witch1:09745] [ 4] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so > [0x2aff22d433fb] > [witch1:09745] [ 5] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so > [0x2aff22d4485b] > [witch1:09745] [ 6] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b] > [witch1:09745] [ 7] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun > [0x403203] > [witch1:09745] [ 8] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b] > [witch1:09745] [ 9] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x > 8b) [0x2aff21e060cb] > [witch1:09745] [10] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_trigger_eve > nt+0x20) [0x2aff21cc6940] > [witch1:09745] [11] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_wakeup+0x2d > ) [0x2aff21cc776d] > [witch1:09745] [12] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so > [0x2aff22b34756] > [witch1:09745] [13] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cc6ea7] > [witch1:09745] [14] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b] > [witch1:09745] [15] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x > 8b) [0x2aff21e060cb] > [witch1:09745] [16] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_plm_base_da > emon_callback+0xad) [0x2aff21ce068d] > [witch1:09745] [17] > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so > [0x2aff22b34e5e] > [witch1:09745] [18] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun > [0x402e13] > [witch1:09745] [19] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun > [0x402873] > [witch1:09745] [20] /lib64/libc.so.6(__libc_start_main+0xf4) > [0x2aff22512154] > [witch1:09745] [21] 
/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun > [0x4027c9] > [witch1:09745] *** End of error message *** > Segmentation fault (core dumped) > > > Best Regards, > Lenny. > > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Routed 'unity' broken on trunk
Looks good. Thanks for the fix. Cheers, Josh On Mar 31, 2008, at 1:43 PM, Ralph H Castain wrote: Okay - fixed with r18040 Thanks Ralph On 3/31/08 11:01 AM, "Josh Hursey" wrote: On Mar 31, 2008, at 12:57 PM, Ralph H Castain wrote: On 3/31/08 9:28 AM, "Josh Hursey" wrote: At the moment I only use unity with C/R. Mostly because I have not verified that the other components work properly under the C/R conditions. I can verify others, but that doesn't solve the problem with the unity component. :/ It is not critical that these jobs launch quickly, but that they launch correctly for the moment. When you say 'slow the launch' are you talking severely as in seconds/minutes for small nps? I didn't say "severely" - I said "measurably". ;-) It will require an additional communication to the daemons to let them know how to talk to the procs. In the current unity component, the daemons never talk to the procs themselves, and so they don't know contact info for rank=0. ah I see. I guess a followup question is why did this component break in the first place? or worded differently, what changed in ORTE such that the unity component will suddenly deadlock when it didn't before? We are trying to improve scalability. Biggest issue is the modex, which we improved considerably by having the procs pass the modex info to the daemons, letting the daemons collect all modex info from procs on their node, and then having the daemons send that info along to the rank=0 proc for collection and xcast. Problem is that in the unity component, the local daemons don't know how to send the modex to the rank=0 proc. So what I will now have to do is tell all the daemons how to talk to the procs, and then we will have every daemon opening a socket to rank=0. That's where the time will be lost. Our original expectation was to get everyone off of unity as quickly as possible - in fact, Brian and I had planned to completely remove that component as quickly as possible as it (a) scales ugly and (b) gets in the way of things. Very hard to keep it alive. So for now, I'll just do the simple thing and hopefully that will be adequate - let me know if/when you are able to get C/R working on other routed components. Sounds good. I'll look into supporting the tree routed component, but that will probably take a couple weeks. Thanks for the clarification. Cheers, Josh Thanks! Ralph Thanks for looking into this, Josh On Mar 31, 2008, at 11:10 AM, Ralph H Castain wrote: I figured out the issue - there is a simple and a hard way to fix this. So before I do, let me see what makes sense. The simple solution involves updating the daemons with contact info for the procs so that they can send their collected modex info to the rank=0 proc. This will measurably slow the launch when using unity. The hard solution is to do a hybrid routed approach whereby the daemons would route any daemon-to-proc communication while the procs continue to do direct proc-to-proc messaging. Is there some reason to be using the "unity" component? Do you care if jobs using unity launch slower? Thanks Ralph On 3/31/08 7:57 AM, "Josh Hursey" wrote: Ralph, I've just noticed that it seems that the 'unity' routed component seems to be broken when using more than one machine. I'm using Odin and r18028 of the trunk, and have confirmed that this problem occurs with SLURM and rsh. I think this break came in on Friday as that is when some of my MTT tests started to hang and fail, but I cannot point to a specific revision at this point. 
The backtraces (enclosed) of the processes point to the grpcomm allgather routine. The 'noop' program calls MPI_Init, sleeps, then calls MPI_Finalize. RSH example from odin023 - so no SLURM variables: These work: shell$ mpirun -np 2 -host odin023 noop -v 1 shell$ mpirun -np 2 -host odin023,odin024 noop -v 1 shell$ mpirun -np 2 -mca routed unity -host odin023 noop -v 1 This hangs: shell$ mpirun -np 2 -mca routed unity -host odin023,odin024 noop - v 1 If I attach to the 'noop' process on odin023 I get the following backtrace: (gdb) bt #0 0x002a96226b39 in syscall () from /lib64/tls/libc.so.6 #1 0x002a95a1e485 in epoll_wait (epfd=3, events=0x50b330, maxevents=1023, timeout=1000) at epoll_sub.c:61 #2 0x002a95a1e7f7 in epoll_dispatch (base=0x506c30, arg=0x506910, tv=0x7fbfffe840) at epoll.c:210 #3 0x002a95a1c057 in opal_event_base_loop (base=0x506c30, flags=5) at event.c:779 #4 0x002a95a1be8f in opal_event_loop (flags=5) at event.c: 702 #5 0x002a95a0bef8 in opal_progress () at runtime/ opal_progress.c: 169 #6 0x002a958b9e48 in orte_grpcomm_base_allgather (sbuf=0x7fbfffeae0, rbuf=0x7fbfffea80) at base/ grpcomm_base_allgather.c:238 #7 0x002a958bd37c in orte_grpcomm_base_modex (procs=0x0) at base/ grpcomm_base_modex.c:413 #8 0x002a956b8416 in ompi_mpi_init (argc=3, argv=0x7fbfffe
[OMPI devel] Session directories in $HOME?
So does anyone know why the session directories are in $HOME instead of /tmp? I'm using r18044, and every time I run, the session directories are created in $HOME. George, does this have anything to do with your commits from earlier? -- Josh
Re: [OMPI devel] Session directories in $HOME?
I looked over the code and I don't see any problems with the changes. The only thing I did was replace the getenv("HOME") with opal_home_directory ... Here is the logic for selecting the TMP directory:

if( NULL == (str = getenv("TMPDIR")) )
    if( NULL == (str = getenv("TEMP")) )
        if( NULL == (str = getenv("TMP")) )
            if( NULL == (str = opal_home_directory()) )
                str = ".";

Do you have any of those (TMPDIR, TEMP or TMP) in your environment?

george.
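For anyone who wants to see why an unset environment ends up in the home directory, here is a small standalone program mirroring the selection order quoted above. It is an illustration only, assuming an ordinary POSIX getenv/getpwuid environment; it is not the actual opal_tmp_directory or opal_home_directory code.

/* tmpdir_chain.c - standalone illustration of the selection order quoted
 * above: TMPDIR, then TEMP, then TMP, then the home directory, then ".".
 * Not the actual OPAL implementation. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pwd.h>

/* Rough stand-in for opal_home_directory(): $HOME, else the passwd entry. */
static const char *home_directory(void)
{
    const char *home = getenv("HOME");
    if (NULL == home) {
        struct passwd *pw = getpwuid(getuid());
        home = (NULL != pw) ? pw->pw_dir : NULL;
    }
    return home;
}

int main(void)
{
    const char *str;

    if (NULL == (str = getenv("TMPDIR")))
        if (NULL == (str = getenv("TEMP")))
            if (NULL == (str = getenv("TMP")))
                if (NULL == (str = home_directory()))
                    str = ".";

    printf("session directories would land under: %s\n", str);
    return 0;
}

With TMPDIR, TEMP, and TMP all unset, as on the clusters described in this thread, this prints the home directory, which matches the behavior Josh reports.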
Re: [OMPI devel] Session directories in $HOME?
Slightly OT but along the same lines... We currently have an argument to mpirun to set the HNP tmpdir (--tmpdir). Why don't we have an MCA param to set the tmpdir for all the orteds and such?

- Galen
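Galen's suggestion would let the daemons pick the tmpdir up from an MCA parameter instead of only from mpirun's --tmpdir option. MCA parameters can be supplied on the mpirun command line with -mca or through the environment with the OMPI_MCA_ prefix, and mpirun forwards them to remote nodes. The sketch below only assumes that prefix convention; the parameter name orte_tmpdir_base is a hypothetical example here, not an option that exists at the time of this thread, and this is not how the real parameter framework is implemented internally.

/* tmpdir_param.c - rough sketch of how a daemon-side helper might honor a
 * tmpdir MCA parameter delivered through the environment.  The parameter
 * name "orte_tmpdir_base" is hypothetical; only the OMPI_MCA_ environment
 * prefix convention is assumed. */
#include <stdio.h>
#include <stdlib.h>

static const char *session_dir_base(void)
{
    /* e.g. set by "mpirun -mca orte_tmpdir_base /scratch/tmp ..." or by
     * "export OMPI_MCA_orte_tmpdir_base=/scratch/tmp" before launching */
    const char *base = getenv("OMPI_MCA_orte_tmpdir_base");

    if (NULL == base) base = getenv("TMPDIR");
    if (NULL == base) base = "/tmp";
    return base;
}

int main(void)
{
    printf("daemon would create session directories under: %s\n",
           session_dir_base());
    return 0;
}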
Re: [OMPI devel] Session directories in $HOME?
Nope. None of those environment variables are defined. Should they be? It would seem that the last part of the logic should be (re-)extended to use /tmp if it exists.

-- Josh
Re: [OMPI devel] Session directories in $HOME?
I confirm that this is new behavior. Session directories have just started showing up in my $HOME as well, and TMPDIR, TEMP, TMP have never been set on my cluster (for interactive logins, anyway).

-- Jeff Squyres
Cisco Systems
Re: [OMPI devel] Session directories in $HOME?
Taking a quick look at the commits, r18037 looks like the most likely cause of this problem. Previously the session directory was forced to "/tmp" if no environment variables were set; that revision removes this logic and uses opal_tmp_directory() instead. Though I agree with the change, I think the logic for selecting the TMP directory should be extended to use '/tmp' if it exists. If it does not, then the home directory should be a fine last alternative. How does that sound as a solution? It would prevent us from unexpectedly changing our runtime behavior in user environments where none of those variables are set.

Cheers,
Josh
Re: [OMPI devel] Session directories in $HOME?
TMPDIR and TMP are standard on Unix. If they are not defined ... one cannot guess where the temporary files should be located. Unfortunately, if we start using /tmp directly we might make the wrong guess. What does mktemp return on your system?

george.
Re: [OMPI devel] Session directories in $HOME?
Here is the problem - the following code was changed in session_dir.c:

-#ifdef __WINDOWS__
-#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
-#else
-#define OMPI_DEFAULT_TMPDIR "/tmp"
-#endif
-
 #define OMPI_PRINTF_FIX_STRING(a) ((NULL == a) ? "(null)" : a)

@@ -262,14 +257,8 @@
 else if( NULL != getenv("OMPI_PREFIX_ENV") ) { /* OMPI Environment var */
     prefix = strdup(getenv("OMPI_PREFIX_ENV"));
 }
-else if( NULL != getenv("TMPDIR") ) { /* General Environment var */
-    prefix = strdup(getenv("TMPDIR"));
-}
-else if( NULL != getenv("TMP") ) { /* Another general environment var */
-    prefix = strdup(getenv("TMP"));
-}
-else { /* ow. just use the default tmp directory */
-    prefix = strdup(OMPI_DEFAULT_TMPDIR);
+else { /* General Environment var */
+    prefix = strdup(opal_tmp_directory());
 }

I believe the problem is that opal_tmp_directory doesn't have OMPI_DEFAULT_TMPDIR - it just defaults to $HOME. This should probably be fixed.
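As a sketch of the fix being discussed, restoring a compiled-in default ahead of the home-directory fallback, the chain would look roughly like the following. This is only an illustration of the proposed ordering, reusing the OMPI_DEFAULT_TMPDIR macro from the diff above; it is not the exact code that goes into the repository.

/* tmpdir_chain_fixed.c - the same selection chain, with the compiled-in
 * default (/tmp, or C:\TEMP on Windows) tried before giving up.
 * Sketch of the proposed ordering only, not the committed fix. */
#include <stdio.h>
#include <stdlib.h>

#ifdef __WINDOWS__
#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
#else
#define OMPI_DEFAULT_TMPDIR "/tmp"
#endif

int main(void)
{
    const char *str;

    if (NULL == (str = getenv("TMPDIR")))
        if (NULL == (str = getenv("TEMP")))
            if (NULL == (str = getenv("TMP")))
                str = OMPI_DEFAULT_TMPDIR;   /* instead of falling back to $HOME */

    printf("session directories would land under: %s\n", str);
    return 0;
}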
Re: [OMPI devel] Session directories in $HOME?
I more than agree with Galen.

Aurelien
Re: [OMPI devel] Session directories in $HOME?
Commit r18046 restores exactly the same logic as before r18037: it redirects everything to /tmp if no special environment variable is set.

george.
Re: [OMPI devel] Session directories in $HOME?
Thanks for the fix.

Cheers,
Josh
Re: [OMPI devel] [OMPI svn] svn:open-mpi r18046
On Mon, Mar 31, 2008 at 10:15 PM, wrote:
> Author: bosilca
> Date: 2008-03-31 16:15:49 EDT (Mon, 31 Mar 2008)
> New Revision: 18046
> URL: https://svn.open-mpi.org/trac/ompi/changeset/18046
>
> Modified: trunk/opal/util/opal_environ.c
> +#ifdef __WINDOWS__
> +#define OMPI_DEFAULT_TMPDIR "C:\\TEMP"
> +#else
> +#define OMPI_DEFAULT_TMPDIR "/tmp"
> +#endif
> +

Wrong prefix for this file?

Bert
Re: [OMPI devel] [OMPI svn] svn:open-mpi r18046
You're right ... I'll make the change asap.

Thanks,
george.