Re: [OMPI devel] [RFC] mca_base_select()

2008-05-12 Thread Ralph Castain
I -think- I may have found the problem here, but don't have a real test case - try r18429 and see if it works. On 5/11/08 4:32 PM, "Josh Hursey" wrote: > From the stacktrace, this doesn't look like a problem with > base_select, but with 'orte_util_encode_pidmap'. You

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-11 Thread Josh Hursey
From the stacktrace, this doesn't look like a problem with base_select, but with 'orte_util_encode_pidmap'. You may want to start looking there. -- Josh On May 11, 2008, at 1:30 PM, Lenny Verkhovsky wrote: Hi, I tried r18423 with rank_file component and got segfault (I increase priority

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-11 Thread Lenny Verkhovsky
Hi, I tried r18423 with the rank_file component and got a segfault (I increase the priority of the component if rmaps_rank_file_path exists) /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun -np 4 -hostfile hostfile_ompi -mca rmaps_rank_file_path rankfile -mca paffinity_base_verbose 5 ./mpi_p_SMD -t bw -output

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-09 Thread Ralph Castain
Not quite, Josh - I fixed it in our branch. Will send you a revised patch that does the job off-list for your review. Thanks Ralph On 5/9/08 9:35 AM, "Josh Hursey" wrote: > Ok I think I understand the problem a bit better now. I attached a > patch that should fix this,

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-09 Thread Josh Hursey
Ok I think I understand the problem a bit better now. I attached a patch that should fix this, but I want you to check it out before I commit just to make sure. If you specify '-mca filter xml' on the command line then only the 'xml' component should be opened by mca_base_open. The problem

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-09 Thread Ralph Castain
Sure - take a look at the hg repository Jeff and I are working on: http://www.open-mpi.org/hg/hgwebdir.cgi/rhc/channel The opal/mca/filter framework illustrates the problem. I have one component in there right now, with a default module defined in the base. That component must only be selected if

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-09 Thread Josh Hursey
Ralph, Can you give me an example of a component that I can look at? It will allow me to test the fix before committing, and to better understand the problem. -- Josh On May 9, 2008, at 10:41 AM, Ralph Castain wrote: I just hit a problem with this logic - should be a minor change. We

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-09 Thread Ralph Castain
I just hit a problem with this logic - should be a minor change. We have several frameworks where we have components that are only allowed to be selected if the user specifically requests them by stating -mca foo bar. Because it is possible for there to be no other components that want to be

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-08 Thread Pak Lui
Thanks very much Josh! Will try it out soon. Josh Hursey wrote: Sorry about that. I didn't test that type of option. It should be working in r18418. Let me know if you see any more issues. -- Josh On May 8, 2008, at 6:04 PM, Pak Lui wrote: I think I have a problem but I am not sure. I used

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-08 Thread Josh Hursey
Sorry about that. I didn't test that type of option. It should be working in r18418. Let me know if you see any more issues. -- Josh On May 8, 2008, at 6:04 PM, Pak Lui wrote: I think I have a problem but I am not sure. I used to be able to use the circumflex (^) to switch between the

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-08 Thread Pak Lui
I think I have a problem but I am not sure. I used to be able to use the circumflex (^) to switch between the gridengine launcher and the ssh launchers by doing something like this, e.g. -mca plm ^gridengine, to exclude some of the components in plm (and also in ras). It doesn't seem like the

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-06 Thread Ralph Castain
Excellent! Thanks Josh - both for the original work/commit and for the quick fix! Ralph On 5/6/08 3:58 PM, "Josh Hursey" wrote: > Sorry about that. Looking back at the filem logic it seems that I > returned success even if select failed (and just use the 'none' >

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-06 Thread Josh Hursey
Sorry about that. Looking back at the filem logic, it seems that I returned success even if select failed (and just used the 'none' passthrough component). I committed a patch in r18389 that fixes this problem. This commit now has a warning that prints on the filem verbose stream so if a

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-06 Thread Ralph H Castain
Hmmm...well, I hit a problem (of course!). I have mca-no-build on the filem framework on my Mac. If I just mpirun -n 3 ./hello, I get the following error: -- It looks like orte_init failed for some reason; your parallel

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-06 Thread Josh Hursey
This has been committed in r18381. Please let me know if you have any problems with this commit. Cheers, Josh On May 5, 2008, at 10:41 AM, Josh Hursey wrote: Awesome. The branch is updated to the latest trunk head. I encourage folks to check out this repository and make sure that it builds

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-05 Thread Josh Hursey
Awesome. The branch is updated to the latest trunk head. I encourage folks to check out this repository and make sure that it builds on their system. A normal build of the branch should be enough to find out if there are any cut-n-paste problems (though I tried to be careful, mistakes do

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-05 Thread Jeff Squyres
This all sounds good to me! On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote: What: Add mca_base_select() and adjust frameworks & components to use it. Why: Consolidation of code for general goodness. Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play When: Code ready now.

[OMPI devel] [RFC] mca_base_select()

2008-04-29 Thread Josh Hursey
What: Add mca_base_select() and adjust frameworks & components to use it. Why: Consolidation of code for general goodness. Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play When: Code ready now. Documentation ready soon. Timeout: May 6, 2008 (After teleconf) [1 week]