Re: [OMPI devel] (loose) SGE Integration fails, why?
Hi Jeff, many thanks for your reply.

> 1. You might want to update your version of Open MPI if possible; the
> v1.1.1 version is quite old. We have added many new bug fixes and
> features since v1.1.1 (including tight SGE integration). There is
> nothing special about the Open MPI that is included in the OFED
> distribution; you can download a new version from the Open MPI web
> site (the current stable version is v1.2.3), configure, compile, and
> install it with your current OFED installation. You should be able
> to configure Open MPI with:

Hmm, I've heard about conflicts between OMPI 1.2.x and OFED 1.1 (sorry, no reference here), and I've had no luck producing a working OMPI installation ("mpirun --help" runs, and ./IMB-MPI compiles and runs too, but "mpirun -np 2 node03,node14 IMB-MPI1" doesn't: segmentation fault)... (Besides that, I know that OFED 1.1 is quite old too.) So I tested it with OMPI 1.1.5 => same error.

> 2. I know little/nothing about SGE, but I'm assuming that you need to
> have SGE pass the proper memory lock limits to new processes. In an
> interactive login, you showed that the max limit is "8162952" -- you
> might just want to make it unlimited, unless you have a reason for
> limiting it. See http://www.open-mpi.org/faq/?

Yes, I already read the FAQ, and even setting the limits to unlimited has not helped. In SGE one can set the limits for SGE jobs with e.g. the qmon tool (configure queues > select queue > modify > limits), but there everything is set to infinity. (Besides that, the job runs with a static machinefile -- is this a "noninteractive" job?) How could I test the ulimits of interactive and noninteractive jobs?

Thank you for your great help.
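One straightforward way to compare the limits seen by interactive logins and by SGE batch jobs is to submit a trivial job that only prints its resource limits. A minimal sketch, assuming a Bourne-shell job and default queue settings (the job name is arbitrary and no PE is requested):

    #!/bin/sh
    #$ -N limit_check
    #$ -cwd
    #$ -j y
    # Print the limits the SGE execution daemon actually passes to batch
    # jobs on this node; "max locked memory" is the one InfiniBand needs.
    hostname
    ulimit -l
    ulimit -a

Submitting this with qsub and comparing its output file against "ulimit -l" from an interactive ssh login to the same node shows whether SGE is lowering the limits for batch jobs.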
Re: [OMPI devel] (loose) SGE Integration fails, why?
Hi,

I think it is not necessary to specify the hosts via a hostfile when using SGE and Open MPI; even $NSLOTS is not necessary. Just running "mpirun <executable>" works very well.

To your memory problem: I had similar problems when I specified the h_vmem option in SGE. Without SGE everything worked, but starting through SGE gave such memory errors. You can easily check this with 'qconf -sc'. If you have used this option, try without it. The problem in my case was that Open MPI sometimes allocates a lot of memory and the job gets immediately killed by SGE, and one gets such error messages; see my posting some days ago. I am not sure if this helps in your case, but it could be an explanation.

Markus

On Thursday, 21 June 2007 at 15:26, sad...@gmx.net wrote:
> Hi,
>
> I'm having some really strange error causing me some serious headaches.
> I want to integrate OpenMPI version 1.1.1 from the OFED package version
> 1.1 with SGE version 6.0. For mvapich all works, but for OpenMPI not ;(.
> Here is my jobfile and error message:
>
> #!/bin/csh -f
> #$ -N MPI_Job
> #$ -pe mpi 4
> export PATH=$PATH:/usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/ofed/mpi/gcc/openmpi-1.1.1.-1/lib64
> /usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpirun -np $NSLOTS -hostfile $TMPDIR/machines /usr/ofed/mpi/gcc/openmpi-1.1.1-1/tests/IMB-2.3/IMB-MPI1
>
> ERRORMESSAGE:
> [node04:25768] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25768] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25768] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25768] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25769] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25769] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25769] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25769] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25770] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25770] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25770] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25770] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25771] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25771] mca_mpool_openib_register: ibv_reg_mr(0x584000,102400) failed with error: Cannot allocate memory
> [node04:25771] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [node04:25771] mca_mpool_openib_register: ibv_reg_mr(0x584000,528384) failed with error: Cannot allocate memory
> [0,1,1][btl_openib.c:808:mca_btl_openib_create_cq_srq] error creating low priority cq for mthca0 errno says Cannot allocate memory
>
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> PML add procs failed
> --> Returned "Error" (-1) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> MPI_Job.e111975 (END)
>
> If I run the OMPI job just without SGE, everything works, e.g. the following command:
> /usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpirun -v -np 4 -H node04,node04,node04,node04 /usr/ofed/mpi/gcc/openmpi-1.1.1-1/tests/IMB-2.3/IMB-MPI1
>
> If I do this with static machinefiles, it works too:
> $ cat /tmp/machines
> node04
> node04
> node04
> node04
>
> /usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpirun -v -np 4 -hostfile /tmp/machines /usr/ofed/mpi/gcc/openmpi-1.1.1-1/tests/IMB-2.3/IMB-MPI1
>
> And if I run this in a jobscript it works even with a static machinefile (not shown below):
> #!/bin/csh -f
> #$ -N MPI_Job
> #$ -pe mpi 4
> export PATH=$PATH:/usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/ofed/mpi/gcc/openmpi-1.1.1.-1/lib64
> /usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpirun -v -np 4 -H node04,no
Re: [OMPI devel] (loose) SGE Integration fails, why?
Markus Daene wrote:
> Hi.
>
> I think it is not necessary to specify the hosts via the hostfile using SGE
> and OpenMPI, even the $NSLOTS is not necessary, just run
> mpirun <executable>; this works very well.

This produces the same error, but thanks for your suggestion. (For the sake of interest: how does OMPI then determine how many slots it may use?)

> to your memory problem:
> I had similar problems when I specified the h_vmem option to use in SGE.
> Without SGE everything works, but starting with SGE gives such memory errors.
> You can easily check this with 'qconf -sc'. If you have used this option, try
> without it. The problem in my case was that OpenMPI allocates sometimes a lot
> of memory and the job gets immediately killed by SGE, and one gets such error
> messages, see my posting some days ago. I am not sure if this helps in your
> case but it could be an explanation.

Hmm, it seems that I'm not using such an option (for my queue the h_vmem and s_vmem values are set to infinity). Here is the output of the qconf -sc command. (Sorry for posting SGE-related stuff on this mailing list):

[~]# qconf -sc
#name               shortcut  type      relop  requestable  consumable  default  urgency
#---------------------------------------------------------------------------------------
arch                a         RESTRING  ==     YES          NO          NONE     0
calendar            c         RESTRING  ==     YES          NO          NONE     0
cpu                 cpu       DOUBLE    >=     YES          NO          0        0
h_core              h_core    MEMORY    <=     YES          NO          0        0
h_cpu               h_cpu     TIME      <=     YES          NO          0:0:0    0
h_data              h_data    MEMORY    <=     YES          NO          0        0
h_fsize             h_fsize   MEMORY    <=     YES          NO          0        0
h_rss               h_rss     MEMORY    <=     YES          NO          0        0
h_rt                h_rt      TIME      <=     YES          NO          0:0:0    0
h_stack             h_stack   MEMORY    <=     YES          NO          0        0
h_vmem              h_vmem    MEMORY    <=     YES          NO          0        0
hostname            h         HOST      ==     YES          NO          NONE     0
load_avg            la        DOUBLE    >=     NO           NO          0        0
load_long           ll        DOUBLE    >=     NO           NO          0        0
load_medium         lm        DOUBLE    >=     NO           NO          0        0
load_short          ls        DOUBLE    >=     NO           NO          0        0
mem_free            mf        MEMORY    <=     YES          NO          0        0
mem_total           mt        MEMORY    <=     YES          NO          0        0
mem_used            mu        MEMORY    >=     YES          NO          0        0
min_cpu_interval    mci       TIME      <=     NO           NO          0:0:0    0
np_load_avg         nla       DOUBLE    >=     NO           NO          0        0
np_load_long        nll       DOUBLE    >=     NO           NO          0        0
np_load_medium      nlm       DOUBLE    >=     NO           NO          0        0
np_load_short       nls       DOUBLE    >=     NO           NO          0        0
num_proc            p         INT       ==     YES          NO          0        0
qname               q         RESTRING  ==     YES          NO          NONE     0
rerun               re        BOOL      ==     NO           NO          0        0
s_core              s_core    MEMORY    <=     YES          NO          0        0
s_cpu               s_cpu     TIME      <=     YES          NO          0:0:0    0
s_data              s_data    MEMORY    <=     YES          NO          0        0
s_fsize             s_fsize   MEMORY    <=     YES          NO          0        0
s_rss               s_rss     MEMORY    <=     YES          NO          0        0
s_rt                s_rt      TIME      <=     YES          NO          0:0:0    0
s_stack             s_stack   MEMORY    <=     YES          NO          0        0
s_vmem              s_vmem    MEMORY    <=     YES          NO          0        0
seq_no              seq       INT       ==     NO           NO          0        0
slots               s         INT       <=     YES          YES         1        1000
swap_free           sf        MEMORY    <=     YES          NO          0        0
swap_rate           sr        MEMORY    >=     YES          NO          0        0
swap_rsvd           srsv      MEMORY    >=     YES          NO          0        0
swap_total          st        MEMORY    <=     YES          NO          0        0
swap_used           su        MEMORY    >=     YES          NO          0        0
tmpdir              tmp       RESTRING  ==     NO           NO          NONE     0
virtual_free        vf        MEMORY    <=     YES          NO          0        0
virtual_total       vt        MEMORY    <=     YES          NO          0        0
virtual_used        vu        MEMORY    >=     YES          NO          0        0
# >#< starts a comment but comments are not saved across edits

Thanks for your help.
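Besides the global complex list above, the per-queue limits can be checked the same way. A small sketch of the relevant queries, assuming the default queue name all.q (adjust for the actual queue):

    # Show how the h_vmem complex is defined: type, relational operator,
    # whether it is requestable/consumable, and its default value.
    qconf -sc | grep h_vmem

    # Show the hard/soft memory limits configured on one queue.
    qconf -sq all.q | grep -E 'vmem|rss|stack'

Comparing the complex definition (consumable, default) with the queue limits usually makes it clear whether a job ends up with 0, infinity, or some explicit value.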
Re: [OMPI devel] create new btl
It couldn't be easier. Thanks a lot!

Pablo

On Friday 22 June 2007 00:32:13 George Bosilca wrote:
> Rerun the autogen.sh script and the new BTL will get auto-magically
> included in the build. You don't have to modify anything, just run
> the script.
>
> Once you get it compiled, you can specify --mca btl <your btl>,self
> on your mpirun command line to get access at runtime to your BTL.
>
> george.
>
> On Jun 21, 2007, at 3:36 PM, pcas...@atc.ugr.es wrote:
> > Hello all,
> > I just arrived at Open MPI. I'm trying to create a new BTL. The goal is
> > to use Open MPI with a library that sends/receives packets with a
> > network-processor (IXP) based board. Since it's an Ethernet board, I
> > thought the best way to start is to reproduce the TCP BTL, so I made
> > a copy of the directory ompi/mca/btl/tcp/ just to have something to
> > start from. But then I don't know how to include this "new" BTL in the
> > build system (./configure; make all install). My knowledge of the
> > GNU autotools is not good enough, I guess. I believe the first step is
> > to modify the configure script through 'autoconf', but I'm not sure how
> > to do this. I've been searching for information about that on this
> > mailing list with no luck. What would be the steps to create a basic BTL?
> > What's the best way to integrate the code into the whole Open MPI?
> > Thanks a lot for reading :)
> >
> > Regards
> > Pablo
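For reference, the workflow George describes boils down to a few shell steps. A sketch under the assumption that the component is called "mybtl" and the install prefix and test binary are placeholders (the file and symbol renames inside the copied directory are not shown):

    # From the top of an Open MPI source tree:
    # copy the TCP BTL as a template for the new component.
    cp -r ompi/mca/btl/tcp ompi/mca/btl/mybtl

    # Regenerate the build system; components are discovered by directory,
    # so configure does not need to be edited by hand.
    ./autogen.sh
    ./configure --prefix=$HOME/ompi-dev
    make all install

    # Select only the new BTL (plus "self" for loopback) at run time.
    mpirun --mca btl mybtl,self -np 2 ./my_mpi_test

The key point from the thread is that autogen.sh does the discovery, so no manual autoconf work is required beyond rerunning it.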
Re: [OMPI devel] (loose) SGE Integration fails, why?
On Jun 22, 2007, at 3:52 AM, sad...@gmx.net wrote:

>> 1. You might want to update your version of Open MPI if possible; the
>> v1.1.1 version is quite old. We have added many new bug fixes and
>> features since v1.1.1 (including tight SGE integration). There is
>> nothing special about the Open MPI that is included in the OFED
>> distribution; you can download a new version from the Open MPI web
>> site (the current stable version is v1.2.3), configure, compile, and
>> install it with your current OFED installation. You should be able
>> to configure Open MPI with:
>
> Hmm, I've heard about conflicts between OMPI 1.2.x and OFED 1.1 (sorry,
> no reference here),

I'm unaware of any problems with OMPI 1.2.x and OFED 1.1. I run OFED 1.1 on my cluster at Cisco and have many different versions of OMPI installed (1.2, trunk, etc.).

> and I've had no luck producing a working OMPI installation ("mpirun --help"
> runs, and ./IMB-MPI compiles and runs too, but "mpirun -np 2 node03,node14
> IMB-MPI1" doesn't: segmentation fault)...

Can you send more information on this? See http://www.open-mpi.org/community/help/

> (Besides that, I know that OFED 1.1 is quite old too.) So I tested it with
> OMPI 1.1.5 => same error.

*IF* all goes well, OFED 1.2 should be released today (famous last words).

>> 2. I know little/nothing about SGE, but I'm assuming that you need to
>> have SGE pass the proper memory lock limits to new processes. In an
>> interactive login, you showed that the max limit is "8162952" -- you
>> might just want to make it unlimited, unless you have a reason for
>> limiting it. See http://www.open-mpi.org/faq/?
>
> Yes, I already read the FAQ, and even setting them to unlimited has not
> helped. In SGE one can set the limits for SGE jobs with e.g. the qmon tool
> (configure queues > select queue > modify > limits), but there everything
> is set to infinity. (Besides that, the job runs with a static machinefile
> -- is this a "noninteractive" job?) How could I test the ulimits of
> interactive and noninteractive jobs?

Launch an SGE job that calls the shell command "limit" (if you run C-shell variants) or "ulimit -l" (if you run Bourne shell variants). Ensure that the output is "unlimited".

What are the limits of the user that launches the SGE daemons? I.e., did the SGE daemons get started with proper "unlimited" limits? If not, that could hamper SGE's ability to set the limits that you told it to via qmon (remember my disclaimer: I know nothing about SGE, so this is speculation).

--
Jeff Squyres
Cisco Systems
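One way to answer the second question is to inspect the limits of the running execution daemon directly. A sketch, assuming a Linux node with a kernel recent enough to expose /proc/<pid>/limits and assuming the daemon is named sge_execd (the usual name, but it can differ per installation):

    # Find the SGE execution daemon and print the limits it was started
    # with; "Max locked memory" is the one that matters for ibv_reg_mr().
    pid=$(pgrep -x sge_execd | head -n1)
    grep -i 'locked memory' /proc/$pid/limits

If the daemon itself was started with a small memlock limit, the jobs it spawns will normally inherit that limit no matter what qmon is configured to allow.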
Re: [OMPI devel] (loose) SGE Integration fails, why?
> Markus Daene wrote:
> > Hi.
> >
> > I think it is not necessary to specify the hosts via the hostfile using
> > SGE and OpenMPI, even the $NSLOTS is not necessary, just run
> > mpirun <executable>; this works very well.
>
> This produces the same error, but thanks for your suggestion. (For the
> sake of interest: how does OMPI then determine how many slots it may use?)

It just knows it; I think the developers could answer this question.

> > to your memory problem:
> > I had similar problems when I specified the h_vmem option to use in SGE.
> > Without SGE everything works, but starting with SGE gives such memory
> > errors. You can easily check this with 'qconf -sc'. If you have used this
> > option, try without it. The problem in my case was that OpenMPI allocates
> > sometimes a lot of memory and the job gets immediately killed by SGE, and
> > one gets such error messages, see my posting some days ago. I am not sure
> > if this helps in your case but it could be an explanation.

I am sorry to discuss SGE stuff here as well, but there was this question, and one should make clear that this is not just related to OMPI.

I think your output shows exactly the problem: you have set h_vmem as requestable and the default value to 0, so the job has no memory at all. OMPI somehow knows that it has only this memory granted by SGE, so it cannot allocate any memory in this case. Of course you get the errors. You should either set h_vmem to not requestable, or set a proper default value, e.g. 2.0G, or specify the memory consumption in your job script like

#$ -l h_vmem=2000M

It is not important that your queue has set h_vmem to infinity; that only gives you the maximum which you can request.

Markus

> Hmm, it seems that I'm not using such an option (for my queue the h_vmem
> and s_vmem values are set to infinity). Here is the output of the qconf -sc
> command. (Sorry for posting SGE-related stuff on this mailing list):
> [full qconf -sc output quoted in the previous message]
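Put together with the job script posted earlier in the thread, Markus's suggestion amounts to requesting an explicit per-slot memory limit so the consumable does not fall back to its default of 0. A sketch using the paths from that script (the 2000M figure is only an example value, and a Bourne shell is used here so that "export" works as written):

    #!/bin/sh
    #$ -N MPI_Job
    #$ -pe mpi 4
    # Request an explicit per-slot virtual-memory limit so the consumable
    # h_vmem is not applied with its configured default of 0.
    #$ -l h_vmem=2000M
    export PATH=$PATH:/usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64
    /usr/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpirun -np $NSLOTS -hostfile $TMPDIR/machines \
        /usr/ofed/mpi/gcc/openmpi-1.1.1-1/tests/IMB-2.3/IMB-MPI1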
Re: [OMPI devel] (loose) SGE Integration fails, why?
Markus Daene wrote:
>>> to your memory problem:
>>> I had similar problems when I specified the h_vmem option to use in SGE.
>>> Without SGE everything works, but starting with SGE gives such memory
>>> errors. You can easily check this with 'qconf -sc'. If you have used this
>>> option, try without it. The problem in my case was that OpenMPI allocates
>>> sometimes a lot of memory and the job gets immediately killed by SGE, and
>>> one gets such error messages, see my posting some days ago. I am not sure
>>> if this helps in your case but it could be an explanation.
>
> I am sorry to discuss SGE stuff here as well, but there was this question and
> one should make clear that this is not just related to OMPI.
>
> I think your output shows exactly the problem: you have set h_vmem as
> requestable and the default value to 0, the job has no memory at all. OMPI

(I thought that zero meant infinity.)

> somehow knows that it has just this memory granted by SGE, so it cannot
> allocate any memory in this case. Of course you get the errors.
> You should either set h_vmem to not requestable, or set a proper default
> value, e.g. 2.0G, or specify the memory consumption in your job script like
> #$ -l h_vmem=2000M
> it is not important that your queue has set h_vmem to infinity, this gives
> you just the maximum which you can request.

If I use the h_vmem option I get a slightly different error, but if I mark h_vmem as not requestable => same error. Below is the slightly different error message:

[node17:02861] mca: base: component_find: unable to open: libsysfs.so.1: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_pml_ob1.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_basic.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_hierarch.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_self.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_sm.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_tuned.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02861] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_osc_pt2pt.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: libsysfs.so.1: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_pml_ob1.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_basic.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_hierarch.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_self.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_sm.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_coll_tuned.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02862] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_osc_pt2pt.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02863] mca: base: component_find: unable to open: libsysfs.so.1: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02863] mca: base: component_find: unable to open: /usr/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/openmpi/mca_pml_ob1.so: failed to map segment from shared object: Cannot allocate memory (ignored)
[node17:02863] mca: base: componen
Re: [OMPI devel] (loose) SGE Integration fails, why?
Jeff Squyres wrote:
>>> 2. I know little/nothing about SGE, but I'm assuming that you need to
>>> have SGE pass the proper memory lock limits to new processes. In an
>>> interactive login, you showed that the max limit is "8162952" -- you
>>> might just want to make it unlimited, unless you have a reason for
>>> limiting it. See http://www.open-mpi.org/faq/?
>>
>> Yes, I already read the FAQ, and even setting them to unlimited has not
>> helped. In SGE one can set the limits for SGE jobs with e.g. the qmon
>> tool (configure queues > select queue > modify > limits), but there
>> everything is set to infinity. (Besides that, the job runs with a static
>> machinefile -- is this a "noninteractive" job?) How could I test the
>> ulimits of interactive and noninteractive jobs?
>
> Launch an SGE job that calls the shell command "limit" (if you run C-shell
> variants) or "ulimit -l" (if you run Bourne shell variants). Ensure that
> the output is "unlimited".
>
> What are the limits of the user that launches the SGE daemons? I.e.,
> did the SGE daemons get started with proper "unlimited" limits? If
> not, that could hamper SGE's ability to set the limits that you told
> it to via qmon (remember my disclaimer: I know nothing about SGE, so
> this is speculation).

I am assuming you have tried launching your job without SGE (via ssh or other means) and that works correctly? If yes, then you should compare the outputs of "limit" as Jeff suggested to see if there are any differences between the two (with and without SGE).

I know of a similar problem with SGE's limitation that it cannot set the file descriptor limit for the user processes (and I believe the SGE folks are aware of the problem). The workaround was to put the setting into ~/.tcshrc. So if SGE is not setting another resource limit correctly, or doesn't provide the option, you may have to put the workaround into ~/.tcshrc or a similar settings file for your shell. Otherwise it will probably fall back to the system default.

--
Pak Lui
pak@sun.com
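If the workaround has to live in the user's shell startup file, the tcsh "limit" built-in can raise the limits for every process that shell spawns. A sketch for ~/.tcshrc (the values are examples, and the hard limits configured on the node must already allow them):

    # ~/.tcshrc (tcsh/csh syntax)
    # Raise the locked-memory limit needed for InfiniBand buffer
    # registration, plus the file-descriptor limit Pak mentions.
    limit memorylocked unlimited
    limit descriptors 1024

For Bourne-style shells, the equivalent lines in ~/.profile would be "ulimit -l unlimited" and "ulimit -n 1024".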
Re: [OMPI devel] (loose) SGE Integration fails, why?
Hi Pak,

> Jeff Squyres wrote:
>>>> 2. I know little/nothing about SGE, but I'm assuming that you need to
>>>> have SGE pass the proper memory lock limits to new processes. In an
>>>> interactive login, you showed that the max limit is "8162952" -- you
>>>> might just want to make it unlimited, unless you have a reason for
>>>> limiting it. See http://www.open-mpi.org/faq/?
>>>
>>> Yes, I already read the FAQ, and even setting them to unlimited has not
>>> helped. In SGE one can set the limits for SGE jobs with e.g. the qmon
>>> tool (configure queues > select queue > modify > limits), but there
>>> everything is set to infinity. (Besides that, the job runs with a static
>>> machinefile -- is this a "noninteractive" job?) How could I test the
>>> ulimits of interactive and noninteractive jobs?
>>
>> Launch an SGE job that calls the shell command "limit" (if you run C-shell
>> variants) or "ulimit -l" (if you run Bourne shell variants). Ensure that
>> the output is "unlimited".
>>
>> What are the limits of the user that launches the SGE daemons? I.e.,
>> did the SGE daemons get started with proper "unlimited" limits? If
>> not, that could hamper SGE's ability to set the limits that you told
>> it to via qmon (remember my disclaimer: I know nothing about SGE, so
>> this is speculation).
>
> I am assuming you have tried without using SGE (like via ssh or others)
> to launch your job and that works correctly? If yes, then you should
> compare the outputs of limit as Jeff suggested to see if there are any
> differences between the two (with and without using SGE).

Yes, without SGE everything works, and with SGE it works too if I use a static machinefile (see my initial post); -H h1,...,hn works as well! Only with SGE's generated $TMPDIR/machines file (which is valid -- I checked this) does the job not run. And the ulimits are unlimited in all three cases, every time:

pos1: pdsh -R ssh -w node[XX-YY] ulimit -a => unlimited (loosely coupled)
pos2: qsub jobscript, where the jobscript just calls the command as in pos1 (tightly coupled?)
pos3: qsub jobscript, where the jobscript calls another script (containing the same command as in pos1), additionally passing $TMPDIR/machines as an argument to it.

Thanks for your help.
Re: [OMPI devel] (loose) SGE Integration fails, why?
Jeff Squyres wrote:
>> Hmm, I've heard about conflicts between OMPI 1.2.x and OFED 1.1 (sorry,
>> no reference here),
>
> I'm unaware of any problems with OMPI 1.2.x and OFED 1.1. I run OFED
> 1.1 on my cluster at Cisco and have many different versions of OMPI
> installed (1.2, trunk, etc.).

Yes, you are right, I misread it (the OMPI 1.2 changelog (README) says it is OFED 1.0 that isn't considered to work with OMPI 1.2. Sorry.).

>> and I've had no luck producing a working OMPI installation
>> ("mpirun --help" runs, and ./IMB-MPI compiles and runs too, but
>> "mpirun -np 2 node03,node14 IMB-MPI1" doesn't (segmentation fault))...
>
> Can you send more information on this? See
> http://www.open-mpi.org/community/help/

-sh-3.00$ ompi/bin/mpirun -d -np 2 -H node03,node06 hostname
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] connect_uni: connection not allowed
[headnode:23178] [0,0,0] setting up session dir with
[headnode:23178]    universe default-universe-23178
[headnode:23178]    user me
[headnode:23178]    host headnode
[headnode:23178]    jobid 0
[headnode:23178]    procid 0
[headnode:23178] procdir: /tmp/openmpi-sessions-me@headnode_0/default-universe-23178/0/0
[headnode:23178] jobdir: /tmp/openmpi-sessions-me@headnode_0/default-universe-23178/0
[headnode:23178] unidir: /tmp/openmpi-sessions-me@headnode_0/default-universe-23178
[headnode:23178] top: openmpi-sessions-me@headnode_0
[headnode:23178] tmp: /tmp
[headnode:23178] [0,0,0] contact_file /tmp/openmpi-sessions-me@headnode_0/default-universe-23178/universe-setup.txt
[headnode:23178] [0,0,0] wrote setup file
[headnode:23178] *** Process received signal ***
[headnode:23178] Signal: Segmentation fault (11)
[headnode:23178] Signal code: Address not mapped (1)
[headnode:23178] Failing at address: 0x1
[headnode:23178] [ 0] /lib64/tls/libpthread.so.0 [0x39ed80c430]
[headnode:23178] [ 1] /lib64/tls/libc.so.6(strcmp+0) [0x39ecf6ff00]
[headnode:23178] [ 2] /home/me/ompi/lib/openmpi/mca_pls_rsh.so(orte_pls_rsh_launch+0x24f) [0x2a9723cc7f]
[headnode:23178] [ 3] /home/me/ompi/lib/openmpi/mca_rmgr_urm.so [0x2a9764fa90]
[headnode:23178] [ 4] /home/me/ompi/bin/mpirun(orterun+0x35b) [0x402ca3]
[headnode:23178] [ 5] /home/me/ompi/bin/mpirun(main+0x1b) [0x402943]
[headnode:23178] [ 6] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x39ecf1c3fb]
[headnode:23178] [ 7] /home/me/ompi/bin/mpirun [0x40289a]
[headnode:23178] *** End of error message ***
Segmentation fault

>> Yes, I already read the FAQ, and even setting them to unlimited has not
>> helped. In SGE one can set the limits for SGE jobs with e.g. the qmon
>> tool (configure queues > select queue > modify > limits), but there
>> everything is set to infinity. (Besides that, the job runs with a static
>> machinefile -- is this a "noninteractive" job?) How could I test the
>> ulimits of interactive and noninteractive jobs?
>
> Launch an SGE job that calls the shell command "limit" (if you run C-shell
> variants) or "ulimit -l" (if you run Bourne shell variants).
> Ensure that the output is "unlimited".
I've done that already, but how do I distinguish between tightly coupled job ulimits and loosely coupled job ulimits? I tested passing $TMPDIR/machines to a shell script which in turn runs "ulimit -a", *assuming* this is considered a tightly coupled job, but each node returned unlimited -- and it does so without $TMPDIR/machines too. Even the headnode is set to unlimited.

> What are the limits of the user that launches the SGE daemons? I.e.,
> did the SGE daemons get started with proper "unlimited" limits? If
> not, that could hamper SGE's ability to set the limits that you told

The limits in /etc/security/limits.conf apply to all users (using a '*'), hence the SGE processes and daemons shouldn't have any limits.

> it to via qmon (remember my disclaimer: I know nothing about SGE, so
> this is speculation).

But thanks anyway => I will post this issue to an SGE mailing list soon.

The config.log and the `ompi_info --all` output are attached.

Thanks again to all of you.

[Attachment: logs.tbz -- application/bzip-compressed-tar]
Re: [OMPI devel] (loose) SGE Integration fails, why?
On Jun 22, 2007, at 10:44 AM, sad...@gmx.net wrote:

>> Can you send more information on this? See
>> http://www.open-mpi.org/community/help/
>
> -sh-3.00$ ompi/bin/mpirun -d -np 2 -H node03,node06 hostname
> [mpirun -d output and segmentation-fault backtrace quoted in full in the
> previous message]
> Segmentation fault

This should not happen -- this is [obviously] even before any MPI processing starts. Are you inside an SGE job here?

Pak/Ralph: any ideas?

>> Launch an SGE job that calls the shell command "limit" (if you run C-shell
>> variants) or "ulimit -l" (if you run Bourne shell variants). Ensure that
>> the output is "unlimited".
>
> I've done that already, but how do I distinguish between tightly coupled
> job ulimits and loosely coupled job ulimits? I tested passing
> $TMPDIR/machines to a shell script which in turn runs "ulimit -a",
> *assuming* this is considered a tightly coupled job, but each node
> returned unlimited -- and it does so without $TMPDIR/machines too. Even
> the headnode is set to unlimited.

I don't really know what this means. People have explained "loose" vs. "tight" integration to me before, but since I'm not an SGE user, the definitions always fall away.

Based on your prior e-mail, it looks like you are always invoking "ulimit" via "pdsh", even under SGE jobs. This is incorrect. Can't you just submit an SGE job script that runs "ulimit"?

>> What are the limits of the user that launches the SGE daemons? I.e.,
>> did the SGE daemons get started with proper "unlimited" limits?
>> If not, that could hamper SGE's ability to set the limits that you told
>
> The limits in /etc/security/limits.conf apply to all users (using a '*'),
> hence the SGE processes and daemons shouldn't have any limits.

Not really. limits.conf is not universally applied; it's a PAM entity. So for daemons that start via /etc/init.d scripts (or whatever the equivalent is on your system), PAM limits are not necessarily applied. For example, I had to manually insert a "ulimit -Hl unlimited" in the startup script for my SLURM daemons.

--
Jeff Squyres
Cisco Systems
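The corresponding tweak for SGE would be to raise the hard limit in the execution daemon's init script before the daemon is started. A sketch under the assumption that the daemon is launched from an init.d-style script (the script path and exact start line vary by installation and are placeholders here):

    # In the init script that starts the SGE execution daemon
    # (e.g. /etc/init.d/sgeexecd or similar), before the daemon launches:
    ulimit -Hl unlimited   # raise the hard locked-memory limit
    ulimit -l unlimited    # and the soft limit inherited by job shells
    # ... existing command that starts sge_execd ...

Since limits.conf is applied through PAM, daemons started from init scripts do not pick it up, which is why the ulimit calls have to live in the script itself rather than in limits.conf.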
Re: [OMPI devel] PML/BTL MCA params review
On Jun 20, 2007, at 8:29 AM, Jeff Squyres wrote:

> 1. btl_*_min_send_size is used to decide when to stop striping a message
> across multiple BTLs. Is there a reason that we don't just use eager_limit
> for this value? It seems weird to say "this message is short enough to go
> across 1 BTL, even though it'll take multiple sends if min_send_size >
> eager_limit". If no one has any objections, we suggest eliminating this
> MCA parameter (!!) and corresponding value and just using the BTL's eager
> limit for this value (this value is set by every BTL, but only used in
> exactly 1 place in OB1).
>
> Len: please put this on the agenda for next Tuesday (just so that there's
> a deadline to ensure progress).

No one has commented on this, so I assume we'll discuss it on Tuesday. :-)

> 2. rdma_pipeline_offset is a bad name; it is not an accurate description
> of what this value represents. See the attached figure for what this value
> is: it is the length that is sent/received after the eager match before
> the RDMA (it happens to be at the end of the message, but that's
> irrelevant). Specifically: it is a length, not an offset. We should change
> this name. Here are some suggestions we came up with:
>
> rdma_pipeline_send_length (this is our favorite)

Gleb made this change in the code. I've attached a new slide showing the new name.

--
Jeff Squyres
Cisco Systems

[Attachment: pml-btl-values.pdf -- Adobe PDF document]
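For anyone following along, the values under discussion are ordinary MCA parameters, so they can be inspected and overridden from the command line. A sketch using the openib BTL as an example (the exact parameter set differs between releases, e.g. rdma_pipeline_offset before the rename and rdma_pipeline_send_length after it, and the application binary is a placeholder):

    # List the eager/send-size/pipeline parameters of one BTL together
    # with their current defaults and help strings.
    ompi_info --param btl openib | grep -E 'eager_limit|send_size|rdma_pipeline'

    # Override a value for a single run to experiment with striping behaviour
    # (12288 is only an example value).
    mpirun --mca btl_openib_eager_limit 12288 -np 4 ./my_mpi_app

This is also the quickest way to check which of the two names a given installed version actually exposes.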