[petsc-dev] [GPU] Performance on Fermi
On Fri, 27 Aug 2010 16:06:30 -0500, Keita Teranishi keita at cray.com wrote:

Barry, The CPU timing I reported was after recompiling the code (I removed PETSC_USE_DEBUG and GDB macros from petscconf.h).

Unless you were manually overriding compiler flags, it still wasn't optimized. Please just reconfigure a new PETSC_ARCH --with-debugging=0. It's as easy as

  foo-dbg/conf/reconfigure-foo-dbg.py --with-debugging=0 PETSC_ARCH=foo-opt
  make PETSC_ARCH=foo-opt

Jed
[petsc-dev] [GPU] Performance on Fermi
On Fri, 27 Aug 2010 16:18:43 -0500, Keita Teranishi keita at cray.com wrote:

Yes, I replaced all the compiler flags by -O3.

petsc-maint doesn't come to me, but if the snippet that Barry quoted was from your log_summary, then PETSC_USE_DEBUG was definitely defined when plog.c was compiled. It's really much easier to have two separate builds and always use the optimized one when profiling.

Jed
[petsc-dev] [GPU] Performance on Fermi
On Fri, 27 Aug 2010 16:34:45 -0500, Keita Teranishi keita at cray.com wrote:

Jed, I usually manually edit petscconf.h and petscvariables to change the installation configurations for Cray XT/XE. The problem is configure script of PETSc picks up wrong variables and #define macros because the OS and library setting on the login node is different from the compute node. This particular case is just a mistake in configure script (and it's not a big deal to fix), but it will be great if you have any ideas to avoid picking up wrong settings.

If it's behaving incorrectly when you configure --with-batch, it is a configure bug, so please submit the full error.

Jed
[petsc-dev] [GPU] Performance on Fermi
Keita,

I'd just like to echo what Barry says. I probably build petsc-dev on Jaguar more than any other person, and I generally don't have to manually edit any files generated by configure.py. When I do, I either find and fix the problem in BuildSystem, or work with the petsc-maint folks to fix it. If you will report problems to petsc-maint, we can work to ensure that you don't have to do these manual edits.

Best regards,
Richard

On 8/27/2010 8:00 PM, Barry Smith wrote:

On Aug 27, 2010, at 4:34 PM, Keita Teranishi wrote:

Jed, I usually manually edit petscconf.h and petscvariables to change the installation configurations for Cray XT/XE. The problem is configure script of PETSc picks up wrong variables and #define macros because the OS and library setting on the login node is different from the compute node.

Keita, We would prefer that you complain to petsc-maint at mcs.anl.gov so that we can fix configure problems and not have anyone editing the generated files.

Barry

1) so that it works for all users not just those that know how to edit those files. We cannot fix problems we don't know about
2) editing those files repeatedly is fragile and it is easy to make a slight mistake that's hard to track down.

This particular case is just a mistake in configure script (and it's not a big deal to fix), but it will be great if you have any ideas to avoid picking up wrong settings.

Thanks,
Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com

-----Original Message-----
From: Jed Brown [mailto:five9a2 at gmail.com] On Behalf Of Jed Brown
Sent: Friday, August 27, 2010 4:29 PM
To: Keita Teranishi; For users of the development version of PETSc
Subject: RE: [petsc-dev] [GPU] Performance on Fermi

On Fri, 27 Aug 2010 16:18:43 -0500, Keita Teranishi keita at cray.com wrote:

Yes, I replaced all the compiler flags by -O3.
petsc-maint doesn't come to me, but if the snippet that Barry quoted was from your log_summary, then PETSC_USE_DEBUG was definitely defined when plog.c was compiled. It's really much easier to have two separate builds and always use the optimized one when profiling.

Jed

--
Richard Tran Mills, Ph.D.          | E-mail: rmills at climate.ornl.gov
Computational Scientist            | Phone:  (865) 241-3198
Computational Earth Sciences Group | Fax:    (865) 574-0405
Oak Ridge National Laboratory      | http://climate.ornl.gov/~rmills
[petsc-dev] What's the point of D(A/M)GetGlobalVector?
I would support a name change to Create(). However, we should be really sure before we do stuff like that since nothing pisses people off (Wolfgang) like name changes.

Matt

On Fri, Aug 27, 2010 at 5:49 PM, Barry Smith bsmith at mcs.anl.gov wrote:

Hmmm,

petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetColoring(DM,ISColoringType,const MatType,ISColoring*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetMatrix(DM, const MatType,Mat*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetInterpolation(DM,DM,Mat*,Vec*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetInterpolationScale(DM,DM,Mat,Vec*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetAggregates(DM,DM,Mat*);

should all of these be Create?

In my mind usually Get means get something intrinsic to the underlying object (some property of it for example); Create means generate a new thing that while it may be associated with the DA is not owned or controlled by the DA.

Another way to organize is Create() implies you later Destroy() that object, while for things you Get you do something else (like restore).

I'm inclined to change all of these ones to Create() since they are all Destroyed().

Barry

On Aug 27, 2010, at 11:10 AM, Jed Brown wrote:

On Fri, 27 Aug 2010 12:00:32 -0400, Kai Germaschewski kai.germaschewski at unh.edu wrote:

And it also requires some more memory management framework which would call upon caches to expire long-unused objects when memory is running low.

How would you detect this? Note that further allocation may be done external to PETSc, and perhaps even in a separate process. We're not in a managed environment, we can't get a reliable time to GC. If we could get that sort of signal, then I would be for such caching at all times, but I don't think we can, in which case I still think managed/pooled access versus owned creation needs to be explicitly different.
Jed

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
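The Get/Restore versus Create/Destroy distinction argued in this thread can be sketched outside PETSc. The following is only an illustration in Python: `VectorPool` and its methods are hypothetical names, plain lists stand in for vectors, and nothing here is PETSc or petsc4py API.

```python
class VectorPool:
    """Toy stand-in for a DA-like object that can lend or create work vectors.

    Hypothetical illustration only; not part of PETSc or petsc4py.
    """

    def __init__(self, size):
        self.size = size
        self._free = []  # vectors owned by the pool that are currently unused

    def get(self):
        """Borrow a vector; the pool keeps ownership and may reuse it later."""
        return self._free.pop() if self._free else [0.0] * self.size

    def restore(self, vec):
        """Hand a borrowed vector back so that a later get() can reuse it."""
        self._free.append(vec)

    def create(self):
        """Make a new vector that the caller owns and must dispose of itself."""
        return [0.0] * self.size


pool = VectorPool(4)

borrowed = pool.get()
pool.restore(borrowed)
reused = pool.get()      # the pooled object comes back again

owned_a = pool.create()
owned_b = pool.create()  # always a fresh object, never pooled
```

Under this reading, anything the caller later destroys would be named Create, and anything handed back to the owning object would be named Get, which seems to be the rule Barry is proposing.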
[petsc-dev] [petsc4py] Vec.getArray()
On 28 August 2010 06:23, Matthew Knepley knepley at gmail.com wrote:

On Fri, Aug 27, 2010 at 6:14 PM, Lisandro Dalcin dalcinl at gmail.com wrote:

I cannot figure out how to implement a copy-free and safe VecGetArray()/VecRestoreArray() pattern in Python (not even by using the 'with' statement, it leaks the target variable).

1) Provide a 100% safe but slow, copy-based way:

  a = x.getArray()  # gives you a copy.

It is implemented with VecGetArrayRead(x, p), memcpy p -> a.data, VecRestoreArray(x,p) on a freshly allocated numpy array that is returned to the user.

  a.base is None  # True, the array owns its memory buffer
  x.setArray(a)   # writes array on the vector.

It is implemented with VecGetArray(x,p) and memcpy a.data -> p, VecRestoreArray(x,p)

2) Provide an unsafe but fast, copy-free way to get a numpy array sharing memory with PETSc vectors:

  a = numpy.asarray(x)  # gives you a numpy array that shares mem with the vec

It is implemented with VecGetArray() and special Python/NumPy protocols for buffer sharing.

  a.base is x  # True, the base attr holds a ref to the Vec instance, the array does not own its memory buffer.
  del a        # force garbage collection explicitly, then VecRestoreArray() will be called when a gets deallocated.

Relying on explicit use of del for garbage collection is not reliable. NumPy is designed to support array views, these views hold references to the base array. So users have to be very careful about how the arrays obtained the fast way are used.

Comments? Suggestions? Complaints?

I am for 2). PETSc users generally want to sacrifice safety for performance.

Matt

Matt, that was not a list to choose from. I'm proposing to provide both models.

  a = x.getArray()
  x.setArray(a)

will be the slow and safe, and

  a = numpy.asarray(x)

will be the fast and unsafe. Additionally, we could add with statement support for (2), and other syntax sugar currently supported, that is a = x[...]
-- Lisandro Dalcin --- CIMEC (INTEC/CONICET-UNL) Predio CONICET-Santa Fe Colectora RN 168 Km 472, Paraje El Pozo Tel: +54-342-4511594 (ext 1011) Tel/Fax: +54-342-4511169
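The two access models proposed above can be illustrated with the standard library alone. This is a rough analogy, not petsc4py code: an `array.array` stands in for the memory behind a Vec, a copy stands in for `x.getArray()`, and a `memoryview` stands in for the shared-memory array that `numpy.asarray(x)` would return.

```python
from array import array

storage = array('d', [1.0, 2.0, 3.0])  # stands in for the memory behind a Vec

# Model 1: safe but slow. The caller gets its own copy of the data.
copy = array('d', storage)
copy[0] = -1.0                          # does not touch storage

# Model 2: fast but unsafe. The caller gets a view that shares the memory.
view = memoryview(storage)
view[1] = 20.0                          # writes straight through to storage

# While the view is alive the owner is locked against resizing, loosely
# comparable to a Vec between VecGetArray() and VecRestoreArray().
try:
    storage.append(4.0)
    resize_blocked = False
except BufferError:
    resize_blocked = True

view.release()                          # plays the role of "restore"
storage.append(4.0)                     # allowed again after the release
```

The analogy is imperfect in one useful way: a released `memoryview` raises on further access, whereas a NumPy array cannot be invalidated like that, which is the core difficulty this thread is about.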
[petsc-dev] [petsc4py] Vec.getArray()
On Sat, 28 Aug 2010 13:43:34 +0200, Jed Brown jed at 59A2.org wrote:

On Fri, 27 Aug 2010 15:14:02 -0300, Lisandro Dalcin dalcinl at gmail.com wrote:

I cannot figure out how to implement a copy-free and safe VecGetArray()/VecRestoreArray() pattern in Python (not even by using the 'with' statement, it leaks the target variable).

What exactly leaks?

We discussed this on GChat, thanks to Lisandro for pointing out lots of gotchas.

  with open('/tmp/tmp.txt','w') as f:
      g = f
  f, g  # Both f and g are in scope, Python's with is not scoped like in Lisp or Haskell

This is sort of okay because use of the closed f will raise an exception, but with numpy arrays, there is no way to invalidate an array. A first step would be to nullify the pointer so that invalid access would seg-fault instead of silently corrupting memory.

numpy.array is an extension type, there is one vtable per class (like C++), not one per object (like PETSc). So it would not be okay to overwrite methods in the vtable. But there is still a type pointer (much like a C++ vptr) in each object that could perhaps be overwritten. This would allow

  with X as x: pass
  x[0] = 1  # x is not valid

to actually raise an exception instead of doing something bad. Now consider

  with X as x:
      y = x[1:]
  y[0] = 1

Since we don't have control of y's vtable, we can't invalidate it. If x and y are native numpy arrays, then after y=x[1:], y must carry a reference to x (or rather, to the memory that backs x). The gc package is not aware of this reference, maybe it can't be queried from Python at all, but it must be accessible from C. I assume it won't hold full backward-links, so you couldn't use gc.get_referrers() to rewrite the vptr of all array views. But perhaps it is still possible to verify that only one exclusive reference remains. This would cause

  with X as x:
      y = x[1:]
  # Exception in __exit__ because the reference is not exclusive, use of
  # y would be unsafe (could cause silent corruption).

  with X as x:
      y = x[1:]
      # use y
      del y
  # Good, no hanging references

Perhaps this is somewhat ugly, but I think it's better than silently corrupting memory.

Attached is a short example code, it outputs

  $ python3 with.py
  __exit__: [102], nrefs=3, refs=[frame object at 0x23b16a0, {'val': [102]}, [[100], [102]]]
  __exit__: [101], nrefs=2, refs=[frame object at 0x23b16a0, {'val': [101]}]
  __exit__: [100], nrefs=3, refs=[frame object at 0x23b16a0, {'val': [100]}, [[100], [102]]]

The opaque frame object is for the scope containing the with statement, the next is the reference saved by the Dispenser object, the third is the hanging reference (holding just a and c). Note that gc.get_referrers(), used in this example, is probably not available so we'd be raising the exception based purely on a reference count.

Is there a case where a leak is unavoidable or where the reference count would otherwise be artificially high at __exit__(), so that it would be unacceptable to raise a usage exception when a reference is leaked?

  with Vec.getArrays(X,Y,Z) as (x,y,z):
      x = y + z  # Numpy vectorized addition

This can be written

  with X as x, Y as y, Z as z:

in Python 2.7 and 3.1+

Jed

-- next part --
A non-text attachment was scrubbed...
Name: with.py
Type: text/x-python
Size: 713 bytes
Desc: not available
URL: http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20100828/7a369a5c/attachment.py
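The idea of raising in __exit__ purely on a reference count could look roughly like the sketch below. It is not the attached with.py, which is not reproduced in this archive. `ArrayLease` is a hypothetical name, a plain list stands in for the array, and the check assumes CPython's reference counting as exposed by `sys.getrefcount`.

```python
import sys


class ArrayLease:
    """Hand out an object for the duration of a with block (hypothetical sketch).

    On exit, complain if more references exist than the with target itself.
    Relies on CPython reference counting; other interpreters may differ.
    """

    def __init__(self, data):
        self._data = data
        self._baseline = None

    def __enter__(self):
        # Count the references that exist before the caller receives the object.
        self._baseline = sys.getrefcount(self._data)
        return self._data

    def __exit__(self, exc_type, exc, tb):
        extra = sys.getrefcount(self._data) - self._baseline
        # The "as" target accounts for one extra reference; more means a leak.
        # Skip the check when the body raised, since tracebacks hold references.
        if exc_type is None and extra > 1:
            raise RuntimeError("leaked %d reference(s) to leased data" % (extra - 1))
        return False


# Well-behaved use: only the with target refers to the leased object.
with ArrayLease([100]) as a:
    a[0] += 1

# Leaky use: a second reference survives the block, so __exit__ raises.
leaked = []
try:
    with ArrayLease([102]) as c:
        leaked.append(c)
    leak_detected = False
except RuntimeError:
    leak_detected = True
```

A with block that omits the `as` target simply shows zero extra references and passes. Whether a NumPy view created by `y = x[1:]` would raise the count of `x` in the same way is an assumption that would need checking against NumPy itself.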
[petsc-dev] [GPU] Performance on Fermi
On Fri, Aug 27, 2010 at 7:19 PM, Keita Teranishi keita at cray.com wrote:

Barry, Yes. It improves the performance dramatically, but the execution time for KSPSolve stays the same.

MatMult 5.2 Gflops

I will note that to put the matvec on the GPU you will also need -mat_type aijcuda.

Matt

Thanks,
Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com

-----Original Message-----
From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Friday, August 27, 2010 2:15 PM
To: For users of the development version of PETSc
Subject: [petsc-dev] [GPU] Performance on Fermi

PETSc-dev folks, Please prepend all messages to petsc-dev that involve GPUs with [GPU] so they can be easily filtered.

Keita, To run src/ksp/ksp/examples/tutorials/ex2.c with CUDA you need the flag -vec_type cuda

Note also that this example is fine for simple ONE processor tests but should not be used for parallel testing because it does not do a proper parallel partitioning for performance

Barry

On Aug 27, 2010, at 2:04 PM, Keita Teranishi wrote:

Hi, I ran ex2.c with a matrix from 512x512 grid. I set CG and Jacobi for the solver and preconditioner. GCC-4.4.4 and CUDA-3.1 are used to compile the code. BLAS and LAPACK are not optimized.

MatMult
  Fermi:           1142 MFlops
  1 core Istanbul:  420 MFlops
KSPSolve:
  Fermi:           1.5 Sec
  1 core Istanbul: 1.7 Sec

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com

-----Original Message-----
From: petsc-dev-bounces at mcs.anl.gov [mailto: petsc-dev-bounces at mcs.anl.gov] On Behalf Of Satish Balay
Sent: Friday, August 27, 2010 1:49 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] Problem with petsc-dev

On Fri, 27 Aug 2010, Satish Balay wrote:

There was a problem with tarball creation for the past few days. Will try to respin manually today - and update you.

the petsc-dev tarball is now updated on the website..
Satish

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
[petsc-dev] When __FUNCT__ is wrong
On 27 August 2010 17:54, Jed Brown jed at 59a2.org wrote:

On Fri, 27 Aug 2010 12:45:46 -0500, Barry Smith bsmith at mcs.anl.gov wrote:

Jed, You are certainly welcome to add it.

fdbdc93647ff

This just writes inconsistencies via PetscErrorPrintf, only in debug mode. I think I've gotten all the major inconsistencies in PETSc proper, Sieve might have more, but I don't have a current build of that.

Note that this might be noisy for user code that redefine __FUNCT__, but not everywhere. If this bothers anyone, we could add a configure option to turn this, and only this, on and off.

Jed

running test_mat_fact
[0]PETSC ERROR: src/mat/impls/baij/seq/baijfact3.c:80: __FUNCT__=MatSeqBAIJSetNumericFactorization does not agree with __func__=MatSeqBAIJSetNumericFactorization_inplace
..[0]PETSC ERROR: src/mat/impls/baij/seq/baijfact3.c:80: __FUNCT__=MatSeqBAIJSetNumericFactorization does not agree with __func__=MatSeqBAIJSetNumericFactorization_inplace
[0]PETSC ERROR: src/mat/impls/sbaij/seq//u/dalcinl/Devel/PETSc/petsc-dev/include/../src/mat/impls/sbaij/seq/relax.h:80: __FUNCT__=MatMult_SeqSBAIJ_1 does not agree with __func__=MatMult_SeqSBAIJ_1_ushort
.[0]PETSC ERROR: src/mat/impls/sbaij/seq//u/dalcinl/Devel/PETSc/petsc-dev/include/../src/mat/impls/sbaij/seq/relax.h:80: __FUNCT__=MatMult_SeqSBAIJ_1 does not agree with __func__=MatMult_SeqSBAIJ_1_ushort
.

No chance right now to look at these...

--
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169
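The check discussed in this thread runs inside PETSc at run time and compares the __FUNCT__ macro with the compiler's __func__. A static approximation of the same comparison can be sketched as a small source scanner. This is a hypothetical helper, not part of PETSc or its build system: the function name `find_funct_mismatches`, the regular expressions, and the sample source (including `VecNorm_Example`) are all assumptions made for illustration, and it assumes the usual layout where the function definition directly follows the `#define __FUNCT__` line.

```python
import re

# Matches lines such as:  #define __FUNCT__ "MatMult_SeqSBAIJ_1"
DEFINE_RE = re.compile(r'^\s*#\s*define\s+__FUNCT__\s+"([^"]+)"', re.MULTILINE)
# Matches the first identifier that is directly followed by "(".
FUNC_RE = re.compile(r'\b([A-Za-z_]\w*)\s*\(')


def find_funct_mismatches(source):
    """Return (declared, actual) pairs where __FUNCT__ disagrees with the
    name of the function defined right after it."""
    mismatches = []
    for match in DEFINE_RE.finditer(source):
        declared = match.group(1)
        for line in source[match.end():].splitlines():
            stripped = line.strip()
            if not stripped or stripped.startswith('#'):
                continue  # skip blank lines and other preprocessor lines
            found = FUNC_RE.search(stripped)
            if found and found.group(1) != declared:
                mismatches.append((declared, found.group(1)))
            break  # only the first code line after the define is inspected
    return mismatches


sample = '''
#undef __FUNCT__
#define __FUNCT__ "MatMult_SeqSBAIJ_1"
PetscErrorCode MatMult_SeqSBAIJ_1_ushort(Mat A,Vec xx,Vec zz)
{
  return 0;
}

#undef __FUNCT__
#define __FUNCT__ "VecNorm_Example"
PetscErrorCode VecNorm_Example(Vec x)
{
  return 0;
}
'''

mismatches = find_funct_mismatches(sample)
```

A scanner like this cannot see function names that are produced by macros or by including a header several times with different definitions, which appears to be the relax.h case reported above, so it would complement the run-time check rather than replace it.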
[petsc-dev] When __FUNCT__ is wrong
Fixed the MatSeqBAIJSetNumericFactorization_inplace() on it was missing Don't know why MatMult_SeqSBAIJ_1_ushort doesn't work. I guess someone who cares like Jed will have to figure it out. Or maybe you have outdated petsc-dev? Barry On Aug 28, 2010, at 5:24 PM, Lisandro Dalcin wrote: On 27 August 2010 17:54, Jed Brown jed at 59a2.org wrote: On Fri, 27 Aug 2010 12:45:46 -0500, Barry Smith bsmith at mcs.anl.gov wrote: Jed, You are certainly welcome to add it. fdbdc93647ff This just writes inconsistencies via PetscErrorPrintf, only in debug mode. I think I've gotten all the major inconsistencies in PETSc proper, Sieve might have more, but I don't have a current build of that. Note that this might be noisy for user code that redefine __FUNCT__, but not everywhere. If this bothers anyone, we could add a configure option to turn this, and only this, on and off. Jed running test_mat_fact [0]PETSC ERROR: src/mat/impls/baij/seq/baijfact3.c:80: __FUNCT__=MatSeqBAIJSetNumericFactorization does not agree with __func__=MatSeqBAIJSetNumericFactorization_inplace ..[0]PETSC ERROR: src/mat/impls/baij/seq/baijfact3.c:80: __FUNCT__=MatSeqBAIJSetNumericFactorization does not agree with __func__=MatSeqBAIJSetNumericFactorization_inplace [0]PETSC ERROR: src/mat/impls/sbaij/seq//u/dalcinl/Devel/PETSc/petsc-dev/include/../src/mat/impls/sbaij/seq/relax.h:80: __FUNCT__=MatMult_SeqSBAIJ_1 does not agree with __func__=MatMult_SeqSBAIJ_1_ushort .[0]PETSC ERROR: src/mat/impls/sbaij/seq//u/dalcinl/Devel/PETSc/petsc-dev/include/../src/mat/impls/sbaij/seq/relax.h:80: __FUNCT__=MatMult_SeqSBAIJ_1 does not agree with __func__=MatMult_SeqSBAIJ_1_ushort . No chance right now to look at these... -- Lisandro Dalcin --- CIMEC (INTEC/CONICET-UNL) Predio CONICET-Santa Fe Colectora RN 168 Km 472, Paraje El Pozo Tel: +54-342-4511594 (ext 1011) Tel/Fax: +54-342-4511169
[petsc-dev] When __FUNCT__ is wrong
On Fri, 27 Aug 2010, Jed Brown wrote:

On Fri, 27 Aug 2010 12:45:46 -0500, Barry Smith bsmith at mcs.anl.gov wrote:

Jed, You are certainly welcome to add it.

fdbdc93647ff

This just writes inconsistencies via PetscErrorPrintf, only in debug mode. I think I've gotten all the major inconsistencies in PETSc proper, Sieve might have more, but I don't have a current build of that.

Note that this might be noisy for user code that redefine __FUNCT__, but not everywhere. If this bothers anyone, we could add a configure option to turn this, and only this, on and off.

If compiler supports the equivalent of __FUNCT__ - then configure should set things in such a way that all macro automatically use that one [and ignore __FUNCT__]

The error check option [for __FUNCT__ being correct] should just be a special case test for us - or users - so an explicit configure can be used for it.

Satish
[petsc-dev] When __FUNCT__ is wrong
On Aug 28, 2010, at 6:25 PM, Satish Balay wrote:

On Fri, 27 Aug 2010, Jed Brown wrote:

On Fri, 27 Aug 2010 12:45:46 -0500, Barry Smith bsmith at mcs.anl.gov wrote:

Jed, You are certainly welcome to add it.

fdbdc93647ff

This just writes inconsistencies via PetscErrorPrintf, only in debug mode. I think I've gotten all the major inconsistencies in PETSc proper, Sieve might have more, but I don't have a current build of that.

Note that this might be noisy for user code that redefine __FUNCT__, but not everywhere. If this bothers anyone, we could add a configure option to turn this, and only this, on and off.

If compiler supports the equivalent of __FUNCT__ - then configure should set things in such a way that all macro automatically use that one [and ignore __FUNCT__]

This may not work properly for our weird templated functions in .h files for SOR for SBAIJ etc and for VecScatter. I'd like to keep using our macro for now and only use the compiler version for checking. Then in the releases we don't have Jed's check so it doesn't bother people.

Barry

The error check option [for __FUNCT__ being correct] should just be a special case test for us - or users - so an explicit configure can be used for it.

Satish
[petsc-dev] When __FUNCT__ is wrong
On Sat, 28 Aug 2010, Barry Smith wrote:

On Aug 28, 2010, at 6:25 PM, Satish Balay wrote:

On Fri, 27 Aug 2010, Jed Brown wrote:

On Fri, 27 Aug 2010 12:45:46 -0500, Barry Smith bsmith at mcs.anl.gov wrote:

Jed, You are certainly welcome to add it.

fdbdc93647ff

This just writes inconsistencies via PetscErrorPrintf, only in debug mode. I think I've gotten all the major inconsistencies in PETSc proper, Sieve might have more, but I don't have a current build of that.

Note that this might be noisy for user code that redefine __FUNCT__, but not everywhere. If this bothers anyone, we could add a configure option to turn this, and only this, on and off.

If compiler supports the equivalent of __FUNCT__ - then configure should set things in such a way that all macro automatically use that one [and ignore __FUNCT__]

This may not work properly for our weird templated functions in .h files for SOR for SBAIJ etc and for VecScatter. I'd like to keep using our macro for now and only use the compiler version for checking. Then in the releases we don't have Jed's check so it doesn't bother people.

Looks like Jed's current change already does this. So it's currently broken?

The addition in my suggestion was to have a configure option to enable-disable PetscCheck__FUNCT__()

Satish

Barry

The error check option [for __FUNCT__ being correct] should just be a special case test for us - or users - so an explicit configure can be used for it.

Satish
[petsc-dev] When __FUNCT__ is wrong
On Aug 28, 2010, at 7:57 PM, Satish Balay wrote:

The addition in my suggestion was to have a configure option to enable-disable PetscCheck__FUNCT__()

It is fine to have that configure option. My concern is the default: if we default off then no one will remember to turn it on and test; if we default on then it will annoy many users. I say default off and make some nightly builds have it on.

Barry

Satish

Barry

The error check option [for __FUNCT__ being correct] should just be a special case test for us - or users - so an explicit configure can be used for it.

Satish
[petsc-dev] When __FUNCT__ is wrong
I see Jed has a User provided function exception in PetscCheck__FUNCT__(). I was thinking of this issue [user code warnings with no funct usage] - when I suggested the configure option.

So - at this point - I'm fine with current code..

Satish

On Sat, 28 Aug 2010, Barry Smith wrote:

On Aug 28, 2010, at 7:57 PM, Satish Balay wrote:

The addition in my suggestion was to have a configure option to enable-disable PetscCheck__FUNCT__()

It is fine to have that configure option. My concern is the default: if we default off then no one will remember to turn it on and test; if we default on then it will annoy many users. I say default off and make some nightly builds have it on.

Barry

Satish

Barry

The error check option [for __FUNCT__ being correct] should just be a special case test for us - or users - so an explicit configure can be used for it.

Satish