> On Jan 7, 2017, at 2:36 PM, Łukasz Kasza <rpgw...@wp.pl> wrote:
>
> I am unable to locate the source of the issue. The problem is the same with
> the newest PETSc version. Gprof does not profile shared libraries (petsc)

   That's odd. Run ./configure with --with-shared-libraries=0 to get a non-shared-library build of PETSc.

> and there is nothing suspicious in the profile of my code. Sprof does not
> work due to a known issue. When I run my code in callgrind this issue does not
> occur, i.e. VecAXPY takes approximately the same time on every call! Nothing
> meaningful in the PETSc log either. I will have to find a workaround or try
> another BLAS as you mentioned.
>
> On Friday, 6 January 2017 23:40, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
>>
>>   The second one should absolutely be slower than the first (because it
>> actually iterates through the indices you pass in with an indirection) and
>> the first should not get slower the more you run it.
>>
>>   Depending on your environment I recommend using a profiling tool on
>> the code and looking at where it is spending its time within VecAXPY. The basic
>> Linux/Unix profiling tool is gprof, but you can use Instruments on macOS
>> (part of Xcode) or Intel's VTune if you have that.
>>
>>   You can also try a different BLAS to see if that matters. For example
>> --download-fblaslapack or don't use MKL if you are using it.
>>
>>   Barry
>>
>>> On Jan 6, 2017, at 4:31 PM, Łukasz Kasza <rpgw...@wp.pl> wrote:
>>>
>>> Dear PETSc Users,
>>>
>>> Please consider the following 2 snippets which do exactly the same thing
>>> (calculate a sum of two vectors):
>>> 1.
>>> VecAXPY(amg_level_x[level],1.0,amg_level_residuals[level]);
>>>
>>> 2.
>>> VecGetArray(amg_level_residuals[level], &values);
>>> VecSetValues(amg_level_x[level],size,indices,values,ADD_VALUES);
>>> VecRestoreArray(amg_level_residuals[level], &values);
>>> VecAssemblyBegin(amg_level_x[level]);
>>> VecAssemblyEnd(amg_level_x[level]);
>>>
>>> In my program I have both of the snippets executed in a loop. The problem
>>> with the first one is that the longer the program runs the longer it takes
>>> to execute it.
>>> At the same time the execution time of the second snippet is
>>> more or less constant. I don't know why, but after a few hundred
>>> iterations VecAXPY takes longer than MatMult on a matrix and vector of the
>>> same size, and after that it still grows! It always returns a correct value,
>>> though. I am using version 4.5.3; the vectors are
>>> sequential. In that case VecAXPY is just a wrapper for BLAS. Do you have
>>> any idea why the execution time of this function constantly grows?
>>>
>>> Best regards.
>>>
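
For reference, the difference between the two snippets can be sketched in plain C without PETSc or BLAS. This is only an illustration, assuming sequential vectors stored as double arrays; the names axpy_direct, add_indexed, max_abs_diff and time_axpy are hypothetical helpers made up for this sketch and are not PETSc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: direct update y[i] += alpha * x[i], the operation
   snippet 1 (VecAXPY) asks for. */
void axpy_direct(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] += alpha * x[i];
    }
}

/* Hypothetical helper: indexed update y[idx[i]] += v[i], in the spirit of
   snippet 2 (VecSetValues with ADD_VALUES). The lookup through idx is the
   extra indirection. */
void add_indexed(size_t n, const int *idx, const double *v, double *y)
{
    for (size_t i = 0; i < n; i++) {
        y[idx[i]] += v[i];
    }
}

/* Largest absolute difference between two arrays of length n. */
double max_abs_diff(size_t n, const double *a, const double *b)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = a[i] - b[i];
        if (d < 0.0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}

/* Average CPU seconds per call over reps calls of axpy_direct.
   Note that y keeps accumulating across the calls. */
double time_axpy(size_t n, int reps, const double *x, double *y)
{
    assert(reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        axpy_direct(n, 1.0, x, y);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC / reps;
}
```

Calling time_axpy on successive batches and printing the per-call time is one way to see whether a plain loop shows the same growth. If it does not, that would point toward the BLAS library or the build rather than the operation itself.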