Hi,

Lately I was using 2D examples, and when I changed to 3D I noticed bad performance. When I checked the code, it was not allocating enough memory for 3D: instead of the 6 nonzeros per row it had preallocated, each row actually had 7, and performance went down very significantly, as mentioned in the PETSc manual.
Billy.

Quoting Barry Smith <bsmith at mcs.anl.gov>:

>
> On Fri, 9 Feb 2007, Shi Jin wrote:
>
> > MatGetRow are used to build the right hand side
> > vector.
>        ^^^^^^
>
> Huh?
>
> > We use it in order to get the number of nonzero cols,
> > global col indices and values in a row.
>
> Huh? What do you do with all this information? Maybe
> we can do what you do with this information much more
> efficiently, without all the calls to MatGetRow()?
>
> Barry
>
> > The reason it is time consuming is that it is called
> > for each row of the matrix. I am not sure how I can
> > get away without it.
> > Thanks.
> >
> > Shi
> >
> > --- Barry Smith <bsmith at mcs.anl.gov> wrote:
> > >
> > > What are all the calls to MatGetRow() for? They
> > > are consuming a great deal of time. Is there any
> > > way to get rid of them?
> > >
> > > Barry
> > >
> > > On Fri, 9 Feb 2007, Shi Jin wrote:
> > >
> > > > Sorry, that is not informative.
> > > > So I decided to attach the 5 files for
> > > > NP=1,2,4,8,16 for the 400,000 finite element case.
> > > >
> > > > Please note that the simulation runs over 100
> > > > steps. The 1st step is a first order update, named
> > > > stage 1. The remaining 99 steps are second order
> > > > updates. Within those, stages 2-9 are created for
> > > > the 8 stages of a second order update. We should
> > > > concentrate on the second order updates, so the
> > > > four calls to KSPSolve in the log file are the
> > > > important ones, in stages 4, 5, 6, and 8
> > > > separately.
> > > > Please let me know if you need any other
> > > > information or explanation.
> > > > Thank you very much.
> > > >
> > > > Shi
> > > >
> > > > --- Matthew Knepley <knepley at gmail.com> wrote:
> > > > >
> > > > > You really have to give us the log summary
> > > > > output. None of the relevant numbers are in
> > > > > your summary.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Matt
> > > > >
> > > > > On 2/9/07, Shi Jin <jinzishuai at yahoo.com> wrote:
> > > > > >
> > > > > > Dear Barry,
> > > > > >
> > > > > > Thank you.
> > > > > > I actually have done the staging already.
> > > > > > I summarized the timing of the runs in Google
> > > > > > online spreadsheets. I have two runs.
> > > > > > 1. with 400,000 finite elements:
> > > > > > http://spreadsheets.google.com/pub?key=pZHoqlL60quZeDZlucTjEIA
> > > > > > 2. with 1,600,000 finite elements:
> > > > > > http://spreadsheets.google.com/pub?key=pZHoqlL60quZcCVLAqmzqQQ
> > > > > >
> > > > > > If you can take a look at them and give me
> > > > > > some advice, I will be deeply grateful.
> > > > > >
> > > > > > Shi
> > > > > >
> > > > > > --- Barry Smith <bsmith at mcs.anl.gov> wrote:
> > > > > > >
> > > > > > > NO, NO, don't spend time stripping your
> > > > > > > code! Unproductive.
> > > > > > >
> > > > > > > See the manual pages for
> > > > > > > PetscLogStageRegister(), PetscLogStagePush()
> > > > > > > and PetscLogStagePop(). All you need to do
> > > > > > > is maintain a separate stage for each of
> > > > > > > your KSPSolves; in your case you'll create
> > > > > > > 3 stages.
> > > > > > >
> > > > > > > Barry
> > > > > > >
> > > > > > > On Fri, 9 Feb 2007, Shi Jin wrote:
> > > > > > >
> > > > > > > > Thank you.
> > > > > > > > But my code has 10 calls to KSPSolve on
> > > > > > > > three different linear systems at each
> > > > > > > > time update. Should I strip it down to a
> > > > > > > > single KSPSolve so that it is easier to
> > > > > > > > analyze? I might have the code dump the
> > > > > > > > matrix and vector and write another code
> > > > > > > > to read them in and call KSPSolve. I
> > > > > > > > don't know whether this is worth doing,
> > > > > > > > or should I just send in the messy log
> > > > > > > > file of the whole run?
> > > > > > > > Thanks for any advice.
> > > > > > > >
> > > > > > > > Shi
> > > > > > > >
> > > > > > > > --- Barry Smith <bsmith at mcs.anl.gov> wrote:
> > > > > > > > >
> > > > > > > > > Shi,
> > > > > > > > >
> > > > > > > > > There is never a better test problem
> > > > > > > > > than your actual problem.
> > > > > > > > > Send the results from running on 1, 4,
> > > > > > > > > and 8 processes with the options
> > > > > > > > > -log_summary -ksp_view (use the
> > > > > > > > > optimized version of PETSc, running
> > > > > > > > > config/configure.py --with-debugging=0).
> > > > > > > > >
> > > > > > > > > Barry
> > > > > > > > >
> > > > > > > > > On Fri, 9 Feb 2007, Shi Jin wrote:
> > > > > > > > >
> > > > > > > > > > Hi there,
> > > > > > > > > >
> > > > > > > > > > I am tuning our 3D FEM CFD code
> > > > > > > > > > written with PETSc.
> > > > > > > > > > The code doesn't scale very well. For
> > > > > > > > > > example, with 8 processes on a Linux
> > > > > > > > > > cluster, the speedup we achieve with
> > > > > > > > > > a fairly large problem size (millions
> > > > > > > > > > of elements) is only 3 to 4 using the
> > > > > > > > > > Conjugate Gradient solver. We can
> > > > > > > > > > achieve a speedup of 6.5 using a
> > > > > > > > > > GMRes solver, but the wall clock time
> > > > > > > > > > of GMRes is longer than that of CG,
> > > > > > > > > > which indicates that CG is the faster
> > > > > > > > > > solver but does not scale as well as
> > > > > > > > > > GMRes. Is this generally true?
> > > > > > > > > >
> > > > > > > > > > I then went to the examples and found
> > > > > > > > > > a 2D example of KSPSolve (ex2.c). I
> > > > > > > > > > let the code run with a 1000x1000
> > > > > > > > > > mesh and got linear scaling of the CG
> > > > > > > > > > solver and super-linear scaling of
> > > > > > > > > > GMRes. These are both much better
> > > > > > > > > > than our code. However, I think the
> > > > > > > > > > 2D nature

=== message truncated ===
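Expanding on Barry's PetscLogStageRegister()/PetscLogStagePush()/PetscLogStagePop() suggestion above, here is a minimal sketch of what the per-solve stages might look like. It needs a PETSc build to compile; the stage names, the solver variables, and which solves get wrapped are hypothetical, not taken from the actual code.

```c
#include <petscksp.h>

/* Hypothetical: one logging stage per linear solve, so each KSPSolve
   gets its own section in the -log_summary output. */
static PetscLogStage stagePressure, stageVelocity, stageScalar;

/* Call once, after PetscInitialize(). */
void RegisterSolveStages(void)
{
  PetscLogStageRegister("Pressure solve", &stagePressure);
  PetscLogStageRegister("Velocity solve", &stageVelocity);
  PetscLogStageRegister("Scalar solve",   &stageScalar);
}

/* Inside the time loop: wrap each solve in its stage. */
void SolvePressure(KSP ksp, Vec b, Vec x)
{
  PetscLogStagePush(stagePressure);  /* time/flops now logged to this stage */
  KSPSolve(ksp, b, x);
  PetscLogStagePop();
}
```

The same push/solve/pop pattern around the velocity and scalar solves gives three separate stage tables in -log_summary, so the cost of each system can be compared directly.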
