Hi Satish,

First of all, I forgot to inform you that I've changed m and n to 800. I would like to see whether the larger problem size improves the scaling. If required, I can redo the test with m,n = 600.
I can install MPICH, but I don't think I can choose to run on a single machine using 1 to 8 procs. To run the code, I usually have to use these commands:

  bsub -o log -q linux64 ./a.out                                     (single proc)
  bsub -o log -q mcore_parallel -n $ -a mvapich mpirun.lsf ./a.out   (multiple procs, where $ = no. of procs)

After the job starts running, I'm told which servers it runs on, e.g. atlas3-c10 (1 proc), or 2*atlas3-c10 + 2*atlas3-c12 (4 procs), or 2*atlas3-c10 + 2*atlas3-c12 + 2*atlas3-c11 + 2*atlas3-c13 (8 procs). I was told that 2*atlas3-c10 does not mean the job is running on a dual-core single CPU.

By the way, are you saying that I should first install the latest MPICH2 build with the option

  ./configure --with-device=ch3:nemesis:newtcp -with-pm=gforker

and then install PETSc with that MPICH2? After that, could you explain how to do what you've suggested on my servers? I don't really understand what you mean. Am I supposed to run 4 jobs on 1 quad core, or 1 job using 4 cores on 1 quad core? I do know that atlas3-c00 to c03 are where the quad cores are located, and I can force the job onto them with

  bsub -o log -q mcore_parallel -n $ -m quadcore -a mvapich mpirun.lsf ./a.out

(I've sketched what I think you mean, both the install and the runs, after the quoted message below; please correct me if I've got it wrong.)

Lastly, I made a mistake about the different times reported by the same compiler. Sorry about that.

Thank you very much.

Satish Balay wrote:
> On Sat, 19 Apr 2008, Ben Tay wrote:
>
>> Btw, I'm not able to try the latest mpich2 because I do not have
>> administrator rights. I was told that some special configuration is
>> required.
>
> You don't need admin rights to install/use MPICH with the options I
> mentioned. I was suggesting just running in SMP mode on a single
> machine [from 1-8 procs on a Quad-Core Intel Xeon X5355, to compare
> with my SMP runs] with:
>
> ./configure --with-device=ch3:nemesis:newtcp -with-pm=gforker
>
>> Btw, should there be any difference in speed whether I use mpiuni and
>> ifort or mpi and mpif90? I tried ex2f (below) and there's only a
>> small difference. If there is a large difference (mpi being slower),
>> does that mean there's something wrong in the code?
>
> For one - you are not using MPIUNI. You are using
> --with-mpi-dir=/lsftmp/g0306332/mpich2. However - if the compilers are
> the same & the compiler options are the same, I would expect the same
> performance in both cases. Do you get such different times for
> different runs of the same binary?
>
> MatMult 384 vs 423
>
> What if you run both of the binaries on the same machine? [as a
> single job?]
>
> If you are using the PBS scheduler - suggest doing:
> - qsub -I [to get interactive access to the nodes]
> - log in to each node - to check no one else is using the scheduled nodes.
> - run multiple jobs during this single allocation for comparison.
>
> These are general tips to help you debug performance on your cluster.
>
> BTW: I get:
> ex2f-600-1p.log:MatMult 1192 1.0 9.7109e+00 1.0 3.86e+09 1.0
> 0.0e+00 0.0e+00 0.0e+00 14 11 0 0 0 14 11 0 0 0 397
>
> You get:
> log.1:MatMult 1879 1.0 2.8137e+01 1.0 3.84e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 12 11 0 0 0 12 11 0 0 0 384
>
> There is a difference in the number of iterations. Are you sure you are
> using the same ex2f with -m 600 -n 600 options?
>
> Satish
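P.S. Just to check that I've understood the MPICH2 part correctly, is this roughly the sequence you mean? The source directory and install prefix below are only placeholders for whatever I end up using in my home directory, and I'm guessing at the exact PETSc configure line:

  # build MPICH2 locally (no admin rights needed), with the device/pm options you suggested
  cd mpich2-src                                        # placeholder for the unpacked MPICH2 source
  ./configure --with-device=ch3:nemesis:newtcp -with-pm=gforker --prefix=$HOME/mpich2-install
  make
  make install

  # then rebuild PETSc against this MPICH2 instead of /lsftmp/g0306332/mpich2
  cd $PETSC_DIR
  ./config/configure.py --with-mpi-dir=$HOME/mpich2-install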
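And once that is in place, am I right that the comparison runs you have in mind are launched directly with the new mpiexec on a single machine, so that bsub/mpirun.lsf are not involved at all? Something like the following, keeping the -m 800 -n 800 size and the -log_summary option I've been using for the timing tables (the mpiexec path is again just the placeholder install prefix from above):

  # SMP comparison on one node: same binary, 1/2/4/8 processes
  $HOME/mpich2-install/bin/mpiexec -n 1 ./ex2f -m 800 -n 800 -log_summary
  $HOME/mpich2-install/bin/mpiexec -n 2 ./ex2f -m 800 -n 800 -log_summary
  $HOME/mpich2-install/bin/mpiexec -n 4 ./ex2f -m 800 -n 800 -log_summary
  $HOME/mpich2-install/bin/mpiexec -n 8 ./ex2f -m 800 -n 800 -log_summary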
