Hi,

On 20.08.2014 at 06:26, Tetsuya Mishima wrote:
> Reuti and Oscar,
>
> I'm a Torque user and I myself have never used SGE, so I hesitated to join
> the discussion.
>
> From my experience with Torque, the openmpi 1.8 series has already resolved
> the issue you pointed out in combining MPI with OpenMP.
>
> Please try to add the --map-by slot:pe=8 option if you want to use 8 threads.
> Then openmpi 1.8 should allocate processes properly without any modification
> of the hostfile provided by Torque.
>
> In your case (8 threads and 10 procs):
>
> # you have to request 80 slots using SGE commands before mpirun
> mpirun --map-by slot:pe=8 -np 10 ./inverse.exe

Thanks for pointing me to this option; for now I can't get it working though
(in fact, I essentially want to use it without binding). This allows telling
Open MPI to bind more cores to each of the MPI processes - ok, but does it
lower the slot count granted by Torque too? I mean, was your submission
command something like

$ qsub -l nodes=10:ppn=8 ...

so that Torque knows that it should grant and remember this slot count of 80
in total for the correct accounting?

-- Reuti

> where you can omit the --bind-to option because --bind-to core is assumed
> as the default when pe=N is provided by the user.
>
> Regards,
> Tetsuya
>
>> Hi,
>>
>> On 19.08.2014 at 19:06, Oscar Mojica wrote:
>>
>>> I discovered what the error was. I had forgotten to include '-fopenmp'
>>> when I compiled the objects in the Makefile, so the program worked but it
>>> didn't divide the job into threads. Now the program is working and I can
>>> use up to 15 cores per machine in the queue one.q.
>>>
>>> Anyway, I would like to try to implement your advice. Well, I'm not alone
>>> in the cluster, so I must implement your second suggestion. The steps are:
>>>
>>> a) Use '$ qconf -mp orte' to change the allocation rule to 8
>>
>> Was the number of slots defined in your used one.q also increased to 8
>> (`qconf -sq one.q`)?
>>
>>> b) Set '#$ -pe orte 80' in the script
>>
>> Fine.
>>
>>> c) I'm not sure how to do this step. I'd appreciate your help here. I can
>>> add some lines to the script to determine the PE_HOSTFILE path and
>>> contents, but I don't know how to alter it.
>>
>> For now you can put this in your jobscript (just after OMP_NUM_THREADS is
>> exported):
>>
>> awk -v omp_num_threads=$OMP_NUM_THREADS '{ $2/=omp_num_threads; print }' $PE_HOSTFILE > $TMPDIR/machines
>> export PE_HOSTFILE=$TMPDIR/machines
>>
>> =============
>>
>> Unfortunately no one stepped into this discussion, as in my opinion it's a
>> much broader issue which targets all users who want to combine MPI with
>> OpenMP. The queuing system should get a proper request for the overall
>> amount of slots the user needs. For now this is forwarded to Open MPI,
>> which will use this information to start the appropriate number of
>> processes (which was an achievement for the out-of-the-box Tight
>> Integration, of course) and ignore any setting of OMP_NUM_THREADS. So,
>> where should the generated list of machines be adjusted? There are several
>> options:
>>
>> a) The PE of the queuing system should do it:
>>
>> + a one-time setup for the admin
>> + in SGE the "start_proc_args" of the PE could alter the $PE_HOSTFILE
>> - the "start_proc_args" would need to know the number of threads, i.e.
>>   OMP_NUM_THREADS must be defined by "qsub -v ..." outside of the jobscript
>>   (tricky scanning of the submitted jobscript for OMP_NUM_THREADS would be
>>   too nasty)
>> - limits the jobscript to calls to libraries behaving in the same way as
>>   Open MPI only
>>
>> b) The particular queue should do it in a queue prolog:
>>
>> same as a), I think
>>
>> c) The user should do it:
>>
>> + no change in the SGE installation
>> - each and every user must include it in all the jobscripts to adjust the
>>   list and export the pointer to the $PE_HOSTFILE, but they could change it
>>   back and forth for different steps of the jobscript
>>
>> d) Open MPI should do it:
>>
>> + no change in the SGE installation
>> + no change to the jobscript
>> + OMP_NUM_THREADS can be altered for different steps of the jobscript while
>>   staying inside the granted allocation automatically
>> o should MKL_NUM_THREADS be covered too (does it use OMP_NUM_THREADS
>>   already)?
>>
>> -- Reuti
>>
>>> echo "PE_HOSTFILE:"
>>> echo $PE_HOSTFILE
>>> echo
>>> echo "cat PE_HOSTFILE:"
>>> cat $PE_HOSTFILE
>>>
>>> Thanks for taking the time to answer these emails; your advice has been
>>> very useful.
>>>
>>> PS: The version of SGE is OGS/GE 2011.11p1
>>>
>>> Oscar Fabian Mojica Ladino
>>> Geologist M.S. in Geophysics
>>>
>>>> From: re...@staff.uni-marburg.de
>>>> Date: Fri, 15 Aug 2014 20:38:12 +0200
>>>> To: us...@open-mpi.org
>>>> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
>>>>
>>>> Hi,
>>>>
>>>> On 15.08.2014 at 19:56, Oscar Mojica wrote:
>>>>
>>>>> Yes, my installation of Open MPI is SGE-aware. I got the following:
>>>>>
>>>>> [oscar@compute-1-2 ~]$ ompi_info | grep grid
>>>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.2)
>>>>
>>>> Fine.
>>>>
>>>>> I'm a bit slow and I didn't understand the last part of your message,
>>>>> so I made a test trying to solve my doubts.
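Pulling Reuti's steps a)-c) together in one place, a jobscript for the SGE side could look roughly like the sketch below. This is only a sketch: the PE name "orte", the queue "one.q", the 10 x 8 process/thread split and the binary name are simply taken from this thread, and it has not been verified on an OGS/GE 2011.11p1 installation.

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe orte 80      # overall slot count = 10 MPI processes x 8 threads each
#$ -q one.q
#$ -N job

export OMP_NUM_THREADS=8

# divide the granted slots per host by the thread count, as suggested above,
# so that a Tight-Integration mpirun starts only one process per 8 granted cores
awk -v omp_num_threads=$OMP_NUM_THREADS '{ $2/=omp_num_threads; print }' $PE_HOSTFILE > $TMPDIR/machines
export PE_HOSTFILE=$TMPDIR/machines

/usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np 10 ./inverse.exe

On the Torque side, Tetsuya's approach needs no hostfile editing: request the full slot count at submission time (e.g. "qsub -l nodes=10:ppn=8 ...") and let "mpirun --map-by slot:pe=8 -np 10 ./inverse.exe" reserve 8 cores per MPI process.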
>>>>> This is the cluster configuration. There are some machines turned off,
>>>>> but that is no problem:
>>>>>
>>>>> [oscar@aguia free-noise]$ qhost
>>>>> HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
>>>>> -------------------------------------------------------------------------------
>>>>> global - - - - - - -
>>>>> compute-1-10 linux-x64 16 0.97 23.6G 558.6M 996.2M 0.0
>>>>> compute-1-11 linux-x64 16 - 23.6G - 996.2M -
>>>>> compute-1-12 linux-x64 16 0.97 23.6G 561.1M 996.2M 0.0
>>>>> compute-1-13 linux-x64 16 0.99 23.6G 558.7M 996.2M 0.0
>>>>> compute-1-14 linux-x64 16 1.00 23.6G 555.1M 996.2M 0.0
>>>>> compute-1-15 linux-x64 16 0.97 23.6G 555.5M 996.2M 0.0
>>>>> compute-1-16 linux-x64 8 0.00 15.7G 296.9M 1000.0M 0.0
>>>>> compute-1-17 linux-x64 8 0.00 15.7G 299.4M 1000.0M 0.0
>>>>> compute-1-18 linux-x64 8 - 15.7G - 1000.0M -
>>>>> compute-1-19 linux-x64 8 - 15.7G - 996.2M -
>>>>> compute-1-2 linux-x64 16 1.19 23.6G 468.1M 1000.0M 0.0
>>>>> compute-1-20 linux-x64 8 0.04 15.7G 297.2M 1000.0M 0.0
>>>>> compute-1-21 linux-x64 8 - 15.7G - 1000.0M -
>>>>> compute-1-22 linux-x64 8 0.00 15.7G 297.2M 1000.0M 0.0
>>>>> compute-1-23 linux-x64 8 0.16 15.7G 299.6M 1000.0M 0.0
>>>>> compute-1-24 linux-x64 8 0.00 15.7G 291.5M 996.2M 0.0
>>>>> compute-1-25 linux-x64 8 0.04 15.7G 293.4M 996.2M 0.0
>>>>> compute-1-26 linux-x64 8 - 15.7G - 1000.0M -
>>>>> compute-1-27 linux-x64 8 0.00 15.7G 297.0M 1000.0M 0.0
>>>>> compute-1-29 linux-x64 8 - 15.7G - 1000.0M -
>>>>> compute-1-3 linux-x64 16 - 23.6G - 996.2M -
>>>>> compute-1-30 linux-x64 16 - 23.6G - 996.2M -
>>>>> compute-1-4 linux-x64 16 0.97 23.6G 571.6M 996.2M 0.0
>>>>> compute-1-5 linux-x64 16 1.00 23.6G 559.6M 996.2M 0.0
>>>>> compute-1-6 linux-x64 16 0.66 23.6G 403.1M 996.2M 0.0
>>>>> compute-1-7 linux-x64 16 0.95 23.6G 402.7M 996.2M 0.0
>>>>> compute-1-8 linux-x64 16 0.97 23.6G 556.8M 996.2M 0.0
>>>>> compute-1-9 linux-x64 16 1.02 23.6G 566.0M 1000.0M 0.0
>>>>>
>>>>> I ran my program using only MPI with 10 processors of the queue one.q,
>>>>> which has 14 machines (compute-1-2 to compute-1-15).
>>>>> With 'qstat -t' I got:
>>>>>
>>>>> [oscar@aguia free-noise]$ qstat -t
>>>>> job-ID prior name user state submit/start at queue master ja-task-ID
>>>>> task-ID state cpu mem io stat failed
>>>>> -------------------------------------------------------------------------------------------------------
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-2.local MASTER r 00:49:12 554.13753 0.09163
>>>>> one.q@compute-1-2.local SLAVE
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-5.local SLAVE 1.compute-1-5 r 00:48:53 551.49022 0.09410
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-9.local SLAVE 1.compute-1-9 r 00:50:00 564.22764 0.09409
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-12.local SLAVE 1.compute-1-12 r 00:47:30 535.30379 0.09379
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-13.local SLAVE 1.compute-1-13 r 00:49:51 561.69868 0.09379
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-14.local SLAVE 1.compute-1-14 r 00:49:14 554.60818 0.09379
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-10.local SLAVE 1.compute-1-10 r 00:49:59 562.95487 0.09349
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-15.local SLAVE 1.compute-1-15 r 00:50:01 563.27221 0.09361
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-8.local SLAVE 1.compute-1-8 r 00:49:26 556.68431 0.09349
>>>>> 2726 0.50500 job oscar r 08/15/2014 12:38:21 one.q@compute-1-4.local SLAVE 1.compute-1-4 r 00:49:27 556.87510 0.04967
>>>>
>>>> Yes, here you got 10 slots (= cores) granted by SGE. So there is no free
>>>> core left inside the allocation of SGE to allow the use of additional
>>>> cores for your threads. If you use more cores than granted by SGE, it
>>>> will oversubscribe the machines.
>>>>
>>>> The issue is now:
>>>>
>>>> a) If you want 8 threads per MPI process, your job will use 80 cores in
>>>> total - for now SGE isn't aware of it.
>>>>
>>>> b) Although you specified $fill_up as allocation rule, it looks like
>>>> $round_robin. Is there more than one slot defined in the queue definition
>>>> of one.q to get exclusive access?
>>>>
>>>> c) What version of SGE are you using? Certain ones use cgroups or bind
>>>> processes directly to cores (although it usually needs to be requested by
>>>> the job: first line of `qconf -help`).
>>>>
>>>> In case you are alone in the cluster, you could bypass the allocation
>>>> with b) (unless you are hit by c)). But with a mixture of users and jobs,
>>>> a different approach would be necessary to handle this in a proper way
>>>> IMO:
>>>>
>>>> a) having a PE with a fixed allocation rule of 8
>>>>
>>>> b) requesting this PE with an overall slot count of 80
>>>>
>>>> c) copy and alter the $PE_HOSTFILE to show only (granted core count per
>>>> machine) divided by (OMP_NUM_THREADS) per entry, and change $PE_HOSTFILE
>>>> so that it points to the altered file
>>>>
>>>> d) Open MPI with a Tight Integration will now start only N processes per
>>>> machine according to the altered hostfile, in your case one
>>>>
>>>> e) Your application can start the desired threads and you stay inside the
>>>> granted allocation
>>>>
>>>> -- Reuti
>>>>
>>>>> I accessed the MASTER processor with 'ssh compute-1-2.local', and with
>>>>> '$ ps -e f' got this; I'm showing only the last lines:
>>>>>
>>>>> 2506 ? Ss 0:00 /usr/sbin/atd
>>>>> 2548 tty1 Ss+ 0:00 /sbin/mingetty /dev/tty1
>>>>> 2550 tty2 Ss+ 0:00 /sbin/mingetty /dev/tty2
>>>>> 2552 tty3 Ss+ 0:00 /sbin/mingetty /dev/tty3
>>>>> 2554 tty4 Ss+ 0:00 /sbin/mingetty /dev/tty4
>>>>> 2556 tty5 Ss+ 0:00 /sbin/mingetty /dev/tty5
>>>>> 2558 tty6 Ss+ 0:00 /sbin/mingetty /dev/tty6
>>>>> 3325 ? Sl 0:04 /opt/gridengine/bin/linux-x64/sge_execd
>>>>> 17688 ? S 0:00 \_ sge_shepherd-2726 -bg
>>>>> 17695 ? Ss 0:00 \_ -bash /opt/gridengine/default/spool/compute-1-2/job_scripts/2726
>>>>> 17797 ? S 0:00 \_ /usr/bin/time -f %E /opt/openmpi/bin/mpirun -v -np 10 ./inverse.exe
>>>>> 17798 ? S 0:01 \_ /opt/openmpi/bin/mpirun -v -np 10 ./inverse.exe
>>>>> 17799 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-5.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>> 17800 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-9.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>> 17801 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-12.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>> 17802 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-13.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>> 17803 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-14.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>> 17804 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-10.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>> 17805 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-15.local PATH=/opt/openmpi/bin:$PATH ; exp
>>>>> 17806 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-8.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>> 17807 ? Sl 0:00 \_ /opt/gridengine/bin/linux-x64/qrsh -inherit -nostdin -V compute-1-4.local PATH=/opt/openmpi/bin:$PATH ; expo
>>>>> 17826 ? R 31:36 \_ ./inverse.exe
>>>>> 3429 ? Ssl 0:00 automount --pid-file /var/run/autofs.pid
>>>>>
>>>>> So the job is using the 10 machines; up to here everything is all right.
>>>>> Do you think that by changing the "allocation_rule" to a number instead
>>>>> of $fill_up, the MPI processes would divide the work into that number of
>>>>> threads?
>>>>>
>>>>> Thanks a lot
>>>>>
>>>>> Oscar Fabian Mojica Ladino
>>>>> Geologist M.S. in Geophysics
>>>>>
>>>>> PS: I have another doubt: what is a slot? Is it a physical core?
>>>>>
>>>>>> From: re...@staff.uni-marburg.de
>>>>>> Date: Thu, 14 Aug 2014 23:54:22 +0200
>>>>>> To: us...@open-mpi.org
>>>>>> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think this is a broader issue in case an MPI library is used in
>>>>>> conjunction with threads while running inside a queuing system. First:
>>>>>> whether your actual installation of Open MPI is SGE-aware you can check
>>>>>> with:
>>>>>>
>>>>>> $ ompi_info | grep grid
>>>>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.5)
>>>>>>
>>>>>> Then we can look at the definition of your PE: "allocation_rule
>>>>>> $fill_up". This means that SGE will grant you 14 slots in total in any
>>>>>> combination on the available machines, meaning an 8+4+2 slot allocation
>>>>>> is an allowed combination, like 4+4+3+3 and so on.
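For illustration only - the exact contents are site-dependent and these hostnames are simply reused from this thread - a 14-slot $fill_up grant could produce a $PE_HOSTFILE along these lines (host, granted slots, queue instance, processor range):

compute-1-2.local 8 one.q@compute-1-2.local UNDEFINED
compute-1-5.local 4 one.q@compute-1-5.local UNDEFINED
compute-1-9.local 2 one.q@compute-1-9.local UNDEFINED

With a fixed "allocation_rule 8" and 80 requested slots, every entry would instead carry an 8, and the awk rewrite quoted earlier in this thread would turn each 8 into a 1, so that a Tight-Integration mpirun starts exactly one process per host and leaves the remaining cores to the OpenMP threads.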
>>>>>> Depending on the SGE-awareness it's a question: will your application
>>>>>> just start processes on all nodes and completely disregard the granted
>>>>>> allocation, or, as the other extreme, does it stay on one and the same
>>>>>> machine for all started processes? On the master node of the parallel
>>>>>> job you can issue:
>>>>>>
>>>>>> $ ps -e f
>>>>>>
>>>>>> (f without -) to have a look whether `ssh` or `qrsh -inherit ...` is
>>>>>> used to reach the other machines and their requested process count.
>>>>>>
>>>>>> Now to the common problem in such a setup:
>>>>>>
>>>>>> AFAICS, for now there is no way in the Open MPI + SGE combination to
>>>>>> specify the number of MPI processes and the intended number of threads
>>>>>> so that they are automatically read by Open MPI while staying inside
>>>>>> the granted slot count and allocation. So it seems to be necessary to
>>>>>> have the intended number of threads honored by Open MPI too.
>>>>>>
>>>>>> Hence specifying e.g. "allocation_rule 8" in such a setup while
>>>>>> requesting 32 processes would for now already start 32 MPI processes,
>>>>>> as Open MPI reads the $PE_HOSTFILE and acts accordingly.
>>>>>>
>>>>>> Open MPI would have to read the generated machine file in a slightly
>>>>>> different way regarding threads: a) read the $PE_HOSTFILE, b) divide
>>>>>> the granted slots per machine by OMP_NUM_THREADS, c) throw an error in
>>>>>> case it's not divisible by OMP_NUM_THREADS. Then start one process per
>>>>>> quotient.
>>>>>>
>>>>>> Would this work for you?
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>>> PS: This would also mean having a couple of PEs in SGE with a fixed
>>>>>> "allocation_rule". While this works right now, an extension in SGE
>>>>>> could be "$fill_up_omp"/"$round_robin_omp", using OMP_NUM_THREADS there
>>>>>> too; hence it must not be specified as an `export` in the job script
>>>>>> but either on the command line or inside the job script in #$ lines as
>>>>>> job requests. This would mean collecting slots in bunches of
>>>>>> OMP_NUM_THREADS on each machine to reach the overall specified slot
>>>>>> count. Whether OMP_NUM_THREADS or n times OMP_NUM_THREADS is allowed
>>>>>> per machine needs to be discussed.
>>>>>>
>>>>>> PS2: As Univa SGE can also supply a list of granted cores in the
>>>>>> $PE_HOSTFILE, it would be an extension to feed this to Open MPI to
>>>>>> allow any UGE-aware binding.
>>>>>>
>>>>>> On 14.08.2014 at 21:52, Oscar Mojica wrote:
>>>>>>
>>>>>>> Guys
>>>>>>>
>>>>>>> I changed the line that runs the program in the script to each of
>>>>>>> these two options:
>>>>>>> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-none -np $NSLOTS ./inverse.exe
>>>>>>> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v --bind-to-socket -np $NSLOTS ./inverse.exe
>>>>>>>
>>>>>>> but I got the same results. When I use 'man mpirun' it shows:
>>>>>>>
>>>>>>> -bind-to-none, --bind-to-none
>>>>>>> Do not bind processes. (Default.)
>>>>>>>
>>>>>>> and the output of 'qconf -sp orte' is
>>>>>>>
>>>>>>> pe_name orte
>>>>>>> slots 9999
>>>>>>> user_lists NONE
>>>>>>> xuser_lists NONE
>>>>>>> start_proc_args /bin/true
>>>>>>> stop_proc_args /bin/true
>>>>>>> allocation_rule $fill_up
>>>>>>> control_slaves TRUE
>>>>>>> job_is_first_task FALSE
>>>>>>> urgency_slots min
>>>>>>> accounting_summary TRUE
>>>>>>>
>>>>>>> I don't know if the installed Open MPI was compiled with '--with-sge'.
>>>>>>> How can I know that?
>>>>>>> Before thinking of a hybrid application I was using only MPI, and the
>>>>>>> program used few processors (14).
>>>>>>> The cluster possesses 28 machines, 15 with 16 cores and 13 with 8
>>>>>>> cores, totaling 344 processing units. When I submitted the job (only
>>>>>>> MPI), the MPI processes were spread across the cores directly; for
>>>>>>> that reason I created a new queue with 14 machines, trying to gain
>>>>>>> more time. The results were the same in both cases. In the last case
>>>>>>> I could prove that the processes were distributed to all machines
>>>>>>> correctly.
>>>>>>>
>>>>>>> What must I do?
>>>>>>> Thanks
>>>>>>>
>>>>>>> Oscar Fabian Mojica Ladino
>>>>>>> Geologist M.S. in Geophysics
>>>>>>>
>>>>>>>> Date: Thu, 14 Aug 2014 10:10:17 -0400
>>>>>>>> From: maxime.boissonnea...@calculquebec.ca
>>>>>>>> To: us...@open-mpi.org
>>>>>>>> Subject: Re: [OMPI users] Running a hybrid MPI+openMP program
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> You DEFINITELY need to disable OpenMPI's new default binding.
>>>>>>>> Otherwise, your N threads will run on a single core. --bind-to socket
>>>>>>>> would be my recommendation for hybrid jobs.
>>>>>>>>
>>>>>>>> Maxime
>>>>>>>>
>>>>>>>> On 2014-08-14 10:04, Jeff Squyres (jsquyres) wrote:
>>>>>>>>> I don't know much about OpenMP, but do you need to disable Open
>>>>>>>>> MPI's default bind-to-core functionality (I'm assuming you're using
>>>>>>>>> Open MPI 1.8.x)?
>>>>>>>>>
>>>>>>>>> You can try "mpirun --bind-to none ...", which will have Open MPI
>>>>>>>>> not bind MPI processes to cores, which might allow OpenMP to think
>>>>>>>>> that it can use all the cores, and therefore it will spawn
>>>>>>>>> num_cores threads...?
>>>>>>>>>
>>>>>>>>> On Aug 14, 2014, at 9:50 AM, Oscar Mojica <o_moji...@hotmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello everybody
>>>>>>>>>>
>>>>>>>>>> I am trying to run a hybrid MPI + OpenMP program in a cluster. I
>>>>>>>>>> created a queue with 14 machines, each one with 16 cores. The
>>>>>>>>>> program divides the work among the 14 processors with MPI, and
>>>>>>>>>> within each processor a loop is also divided into 8 threads, for
>>>>>>>>>> example, using OpenMP. The problem is that when I submit the job to
>>>>>>>>>> the queue, the MPI processes don't divide the work into threads,
>>>>>>>>>> and the program prints the number of threads that are working
>>>>>>>>>> within each process as one.
>>>>>>>>>>
>>>>>>>>>> I made a simple test program that uses OpenMP and I logged in to
>>>>>>>>>> one machine of the fourteen.
>>>>>>>>>> I compiled it using gfortran -fopenmp program.f -o exe, set the
>>>>>>>>>> OMP_NUM_THREADS environment variable equal to 8, and when I ran it
>>>>>>>>>> directly in the terminal the loop was effectively divided among the
>>>>>>>>>> cores; for example, in this case the program printed the number of
>>>>>>>>>> threads equal to 8.
>>>>>>>>>>
>>>>>>>>>> This is my Makefile:
>>>>>>>>>>
>>>>>>>>>> # Start of the makefile
>>>>>>>>>> # Defining variables
>>>>>>>>>> objects = inv_grav3d.o funcpdf.o gr3dprm.o fdjac.o dsvd.o
>>>>>>>>>> #f90comp = /opt/openmpi/bin/mpif90
>>>>>>>>>> f90comp = /usr/bin/mpif90
>>>>>>>>>> #switch = -O3
>>>>>>>>>> executable = inverse.exe
>>>>>>>>>> # Makefile
>>>>>>>>>> all : $(executable)
>>>>>>>>>> $(executable) : $(objects)
>>>>>>>>>> 	$(f90comp) -fopenmp -g -O -o $(executable) $(objects)
>>>>>>>>>> 	rm $(objects)
>>>>>>>>>> %.o: %.f
>>>>>>>>>> 	$(f90comp) -c $<
>>>>>>>>>> # Cleaning everything
>>>>>>>>>> clean:
>>>>>>>>>> 	rm $(executable)
>>>>>>>>>> #	rm $(objects)
>>>>>>>>>> # End of the makefile
>>>>>>>>>>
>>>>>>>>>> and the script that I am using is:
>>>>>>>>>>
>>>>>>>>>> #!/bin/bash
>>>>>>>>>> #$ -cwd
>>>>>>>>>> #$ -j y
>>>>>>>>>> #$ -S /bin/bash
>>>>>>>>>> #$ -pe orte 14
>>>>>>>>>> #$ -N job
>>>>>>>>>> #$ -q new.q
>>>>>>>>>>
>>>>>>>>>> export OMP_NUM_THREADS=8
>>>>>>>>>> /usr/bin/time -f "%E" /opt/openmpi/bin/mpirun -v -np $NSLOTS ./inverse.exe
>>>>>>>>>>
>>>>>>>>>> Am I forgetting something?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Oscar Fabian Mojica Ladino
>>>>>>>>>> Geologist M.S. in Geophysics
>>>>>>>>
>>>>>>>> --
>>>>>>>> ---------------------------------
>>>>>>>> Maxime Boissonneault
>>>>>>>> Analyste de calcul - Calcul Québec, Université Laval
>>>>>>>> Ph. D. en physique
> ----
> Tetsuya Mishima tmish...@jcity.maeda.co.jp