date:20100827

I'm posting this to -dev since I'm working in -dev; let me know if I should
be pushing this sort of question to -users.  I was debugging some code today
and noticed the following note in DAGetGlobalVector:

The vector values are NOT initialized and may have garbage in them, so you
may need to zero them.

What exactly is the purpose of these routines then?  Is there a global
Vector associated with a DA?  If so, why are the values uninitialized?  On
the other hand, if there isn't one, what's the sense of 'get/restore'?  The
following code does NOT work the way I'd expect:

DAGetGlobalVector(da, &x);

/* modify x */

DARestoreGlobalVector(da, &x);

DAGetGlobalVector(da, &x);

/* access previous values, except everything is zero */

I guess I'm not grokking something about the concept of a DA or PETSc
objects.  Could somebody explain the purpose or correct usage of Get/Restore
here?

Thanks in advance,
Aron

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 14:13:01 +0300, Aron Ahmadia aron.ahmadia at kaust.edu.sa 
wrote:
 What exactly is the purpose of these routines then?  Is there a global
 Vector associated with a DA?  If so, why are the values uninitialized? 

It's common to need work vectors in places like residual evaluation and
Jacobian assembly.  There is a little bit of setup cost to allocate a
new vector each time, so usually we'd prefer that they be persistent and
just reuse them.  One option would be to make the user manage this
themselves, but that's error prone because it's easy to accidentally
alias the work vectors, so instead the DA keeps a cache of vectors.  It
starts out empty, and each time you call DAGetGlobalVector(), the cache
is searched for an available vector.  If none are found, a new one is
allocated and the cache grows by one.  DARestoreGlobalVector() checks a
vector back in so it may be used elsewhere.  These vectors are destroyed
in DADestroy().
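
The checkout/checkin behaviour described above can be sketched with a toy
model. This is only an illustration in Python, not PETSc's implementation:
the names Vec, WorkVectorPool, get_global_vector and restore_global_vector
are made up here, and the search order is an assumption.

```python
class Vec:
    """Stand-in for a PETSc Vec: just a list of floats (hypothetical)."""
    def __init__(self, n):
        self.values = [0.0] * n


class WorkVectorPool:
    """Toy model of the cache of work vectors; starts out empty."""
    def __init__(self, n):
        self.n = n
        self.entries = []            # [vec, checked_out] pairs

    def get_global_vector(self):
        # Search the cache for a vector that is not checked out.
        for entry in self.entries:
            if not entry[1]:
                entry[1] = True
                return entry[0]      # contents are whatever the last user left
        # None available: allocate a new one and grow the cache by one.
        vec = Vec(self.n)
        self.entries.append([vec, True])
        return vec

    def restore_global_vector(self, vec):
        # Check the vector back in so it may be handed out elsewhere.
        for entry in self.entries:
            if entry[0] is vec:
                entry[1] = False
                return
        raise ValueError("vector does not belong to this pool")


pool = WorkVectorPool(4)
x = pool.get_global_vector()
y = pool.get_global_vector()         # x is still out, so the cache grows to 2
x.values[0] = 42.0
pool.restore_global_vector(x)
z = pool.get_global_vector()         # happens to reuse x, stale 42.0 included
print(z is x, z.values[0], len(pool.entries))
```

In this model a restored vector keeps whatever values its last user wrote,
which is why the caller should zero it if that matters. Which vector comes
back from the next Get is a detail of the pool, not something to rely on.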

Jed

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

2010-08-27 Thread Matthew Knepley

Simply, in PETSc, getFoo() and restoreFoo() operate an object pool.

   Matt

-- 
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

Thanks Matt and Jed,

I think I'm straight on usage/philosophy here.


[petsc-dev] What's the point of D(A/M)GetGlobalVector?

2010-08-27 Thread Dmitry Karpeev

Except VecGetArray, etc, which operate a pool of one object.
I think this may be the root cause of confusion.

Dmitry.


[petsc-dev] What's the point of D(A/M)GetGlobalVector?

Not to mention the various Get routines that are actually used to create
things, such as DAGetMatrix. Still, the idea of a pool of work vectors
makes sense; I was just trying to wrap my head around the actual use for
those routines.


[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 16:13:14 +0300, Aron Ahmadia aron.ahmadia at kaust.edu.sa 
wrote:
 Not to mention the various Get routines that are actually used to create
 things, such as DAGetMatrix.

I think that should have been named DACreateMatrix().  Other XGetY() are
just accessors which create a managed object if needed.  When there is a
Get/Restore pair, it implies that the access to that managed resource
has some sort of exclusivity.  I think
{Get,Restore}{Local,Global}Vector() are actually the only functions with
that naming scheme that really manage a pool.
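
To make the three naming patterns concrete, here is a toy Python model.
Everything in it (Grid, create_vector, get_coordinates, get_array,
restore_array) is hypothetical and only mimics the conventions discussed
above; it does not correspond to real PETSc routines.

```python
class Grid:
    """Hypothetical managing object, loosely playing the role of a DA."""
    def __init__(self, n):
        self.n = n
        self._coords = None          # built lazily by the accessor
        self._array = [0.0] * n
        self._array_out = False      # True while the array is checked out

    def create_vector(self):
        # Create: a fresh object that the caller owns and disposes of.
        return [0.0] * self.n

    def get_coordinates(self):
        # Plain accessor: builds the managed object on first use and keeps
        # returning the same one afterwards. No Restore is involved.
        if self._coords is None:
            self._coords = [float(i) for i in range(self.n)]
        return self._coords

    def get_array(self):
        # Get/Restore pair: access is exclusive until it is restored.
        if self._array_out:
            raise RuntimeError("array is already checked out")
        self._array_out = True
        return self._array

    def restore_array(self, array):
        if array is not self._array or not self._array_out:
            raise RuntimeError("array was not checked out from this object")
        self._array_out = False


grid = Grid(3)
owned = grid.create_vector()         # caller's own object
coords = grid.get_coordinates()      # managed, shared, no Restore needed
data = grid.get_array()              # exclusive until restored
data[0] = 1.0
grid.restore_array(data)
```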

Jed

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

2010-08-27 Thread Matthew Knepley

I agree with DACreateMatrix().

  Matt


[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Aug 27, 2010, at 8:13 AM, Aron Ahmadia wrote:

Not to mention the various Get routines that are actually used to create
things, such as DAGetMatrix. Still, the idea of a pool of work vectors makes
sense, I was just trying to wrap my head around the actual use for those
routines.

Aron,

DAGetMatrix() is actually a bug and should be DACreateMatrix() (or maybe
better DACreateMat(), while DACreateGlobalVector() and friends should really
be DACreateGlobalVec()).

Are there others besides DAGetMatrix() that are incorrectly named Get when
they should be Create?

Thanks

Barry

It would actually be nice if we made DACreateGlobal/LocalVector() so
light-weight that it could be used for work vectors (instead of needing a
different set of light-weight Get routines), but then we would need
DADestroyGlobal/Vector() to handle putting it back in the free list (or need
to modify VecDestroy() to handle not actually destroying but managing a free
list). And there is also the issue of zeroing or not zeroing the Vec initially.
This is why we still have the Create and Get versions.


[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On 27 August 2010 10:27, Barry Smith bsmith at mcs.anl.gov wrote:

 On Aug 27, 2010, at 8:13 AM, Aron Ahmadia wrote:

 Not to mention the various Get routines that are actually used to create
 things, such as DAGetMatrix. Still, the idea of a pool of work vectors
 makes sense, I was just trying to wrap my head around the actual use for
 those routines.

 Aron,
 DAGetMatrix() is actually a bug and should be DACreateMatrix() (or
 maybe better DACreateMat() while the DACreateGlobalVector() and friends
 should really be DACreateGlobalVec()).

I think you are right.


 Are there others beside DAGetMatrix() that are incorrect with gets that
 should be creates?

MatGetVecs()


-- 
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

Barry,

I was being snippy; DAGetMatrix is the only one I know of that acts
unexpectedly.

Still, when working with Vec, you expect Get/Restore to modify data in the
Vec itself, which breaks for DAs since a DA has no internal Vec storage.
Unless you are willing to append 'Work' somewhere into the names of the
work vector routines, I don't see an obvious solution.


[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 08:27:22 -0500, Barry Smith bsmith at mcs.anl.gov wrote:
  It would actually be nice if we made DACreateGlobal/LocalVector()
 so light-weight that it could be used for work vectors (instead of
 needing a different set of light weight get routines) but then we
 would need DADestroyGlobal/Vector() to handle putting back in the free
 list or need to modify VecDestroy() to handle not actually destroying
 but managing a free list).

Hmm, I'm not sure that only having Create versions would be a good
thing.  Overhead could be reduced to a VecDuplicate, but you still have
to allocate the memory.  Malloc is plenty fast (~400 cycles everywhere
I've checked) if you don't touch the memory, but traversing it extra
times is not ideal, yet the reproducibility of always zeroing newly
created vectors is handy.  Maybe there are places that malloc() is more
expensive, or that fragmentation (if you have some very small vectors,
or are using huge pages) would become a factor.

If the vectors are going to be persistent, their life has to be managed
somehow, in which case the Get versions are needed.  It also helps (me)
to clarify intent: if I see a Get, then I know the vector is temporary
without needing to check for a matching Destroy.

Jed

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On 27 August 2010 10:47, Jed Brown jed at 59a2.org wrote:

 It also helps (me)
 to clarify intent: if I see a Get, then I know the vector is temporary
 without needing to check for a matching Destroy.


You need a matching Restore()



[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 10:56:56 -0300, Lisandro Dalcin dalcinl at gmail.com 
wrote:
 On 27 August 2010 10:47, Jed Brown jed at 59a2.org wrote:
 
  It also helps (me)
  to clarify intent: if I see a Get, then I know the vector is temporary
  without needing to check for a matching Destroy.
 
 
 You need a matching Restore()

Right, but I know when I see the Get that there will be a matching
Restore (I don't have to infer from context or looking ahead that the
object isn't handed off in some other way).

Jed

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

2010-08-27 Thread Dmitry Karpeev

I think DAGetXXX etc should really be thought of as constructors,
that under the hood manage a pool to amortize the construction time.
Perhaps more precisely, DA acts as a factory.
It would be natural to rename them DACreateXXX, except that
then XXXDestroy is the wrong thing to do (unless the destructor
is overloaded in the corresponding object).

Dmitry.



[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 09:45:36 -0500, Dmitry Karpeev karpeev at mcs.anl.gov 
wrote:
 I think DAGetXXX etc should really be thought of as constructors,
 that under the hood manage a pool to amortize the construction time.
 Perhaps more precisely, DA acts as a factory.
 It would be natural to rename them DACreateXXX, except that
 then XXXDestroy is the wrong thing to do (unless the destructor
 is overloaded in the corresponding object).

Hmm, I think there is an important distinction, in terms of overall
memory use, between objects that live a long time and those that do not.
I might allocate a bunch of memory in a preprocessing stage, release it,
and then build solver objects.  If all that preprocessing memory stayed
alive for the life of the program, I would run out of memory.

You can't overload Destroy in place of Restore unless you maintain
upward links or guarantee that the managing object will never need to do
anything when you restore.  One way to do this would be to
double-reference gotten objects and consider them to be checked in any
time the reference count drops to 1.  But this excludes more elaborate
data structures and extra consistency checks.
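
The double-reference idea can be sketched as follows. This is a toy Python
model with an explicit counter, under the assumption that the pool holds one
reference and the user holds the other; CountedVec, destroy and RefCountPool
are invented names, not PETSc API.

```python
class CountedVec:
    """Stand-in for a reference-counted Vec (hypothetical)."""
    def __init__(self, n):
        self.values = [0.0] * n
        self.refcount = 1            # held by whoever created it


def destroy(vec):
    """Drop one reference; free the storage only when none remain."""
    vec.refcount -= 1
    if vec.refcount == 0:
        vec.values = None


class RefCountPool:
    def __init__(self, n):
        self.n = n
        self.vectors = []            # the pool keeps one reference to each

    def get(self):
        for vec in self.vectors:
            if vec.refcount == 1:    # only the pool's reference is left
                vec.refcount += 1    # hand out a second reference
                return vec
        vec = CountedVec(self.n)     # refcount 1: the pool's reference
        self.vectors.append(vec)
        vec.refcount += 1            # refcount 2: the caller's reference
        return vec


pool = RefCountPool(4)
a = pool.get()
b = pool.get()
destroy(a)                           # 2 -> 1: implicitly checked back in
c = pool.get()                       # finds a again; the pool was never told
print(c is a, len(pool.vectors), a.refcount, b.refcount)
```

Note that the pool only learns a vector is free by scanning reference counts
on the next get(). It has no hook at check-in time, which is the limitation
mentioned above for more elaborate data structures and consistency checks.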

I don't see any great nastiness of Get/Restore for managed objects and
Create/Destroy for objects that the user wants to own.

Jed

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

2010-08-27 Thread Dmitry Karpeev

Maybe Create/Destroy isn't the right solution, but there appears to be
some confusion about the meaning of Get/Restore in this context.  It
definitely differs from VecGet/RestoreArray.  For example: there is no
guarantee that subsequent DAGetXXXs, punctuated by DARestoreXXXs, will
give one the same object.

Dmitry.


[petsc-dev] Problem with petsc-dev

Satish,

I still have the same error with the tar ball I downloaded several minutes ago.

Regards,

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com


-Original Message-
From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-boun...@mcs.anl.gov] 
On Behalf Of Satish Balay
Sent: Thursday, August 26, 2010 11:15 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] Problem with petsc-dev

tonights tarball should have a fix for this.

satish

On Thu, 26 Aug 2010, Keita Teranishi wrote:

 I downloaded a nightly tar ball.  
 
 
 Keita Teranishi
 Scientific Library Group
 Cray, Inc.
 keita at cray.com
 
 
 -Original Message-
 From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at 
 mcs.anl.gov] On Behalf Of Satish Balay
 Sent: Thursday, August 26, 2010 10:33 PM
 To: For users of the development version of PETSc
 Subject: Re: [petsc-dev] Problem with petsc-dev
 
 Do you obtain petsc-dev via nightly tarball - and not mercurial?
 
 
 On Thu, 26 Aug 2010, Keita Teranishi wrote:
 
  Hi,
  
  I haven't been able to run the configure script of petsc-dev.  The error 
  message is, No module named cmakegen.  What does the message mean?
  
  Thanks,
  
   Keita Teranishi
   Scientific Library Group
   Cray, Inc.
   keita at cray.com

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 10:14:30 -0500, Dmitry Karpeev karpeev at mcs.anl.gov 
wrote:
 Maybe Create/Destroy isn't the right solution, but there appears to be
 some confusion
 about the meaning of Get/Restore in this context.  It definitely
 differs from VecGet/RestoreArray.
 For example: there is no guarantee that subsequent DAGetXXXs,
 punctuated by DARestoreXXXs,
 will give one the same object.

Indeed, managing a pool has different semantics.  Is it worth looking
for a less ambiguous/overloaded name?

  Checkout/Checkin
  Borrow/Return
  Claim/Release

Jed

[petsc-dev] Problem with petsc-dev

2010-08-27 Thread Satish Balay

There was a problem with tarball creation for the past few days. Will
try to respin manually today - and update you.

If using petsc-dev, it's best to use Mercurial, though.

Satish


[petsc-dev] Problem with petsc-dev

Satish,

Thanks.  I do not see any Mercurial package for SUSE; let me try it and see if it works.

Regards,

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com




[petsc-dev] What's the point of D(A/M)GetGlobalVector?


On Aug 27, 2010, at 8:47 AM, Jed Brown wrote:

 On Fri, 27 Aug 2010 08:27:22 -0500, Barry Smith bsmith at mcs.anl.gov wrote:
 It would actually be nice if we made DACreateGlobal/LocalVector()
 so light-weight that it could be used for work vectors (instead of
 needing a different set of light weight get routines) but then we
 would need DADestroyGlobal/Vector() to handle putting back in the free
 list or need to modify VecDestroy() to handle not actually destroying
 but managing a free list).
 
 Hmm, I'm not sure that only having Create versions would be a good
 thing.  Overhead could be reduced to a VecDuplicate, but you still have
 to allocate the memory.  Malloc is plenty fast (~400 cycles everywhere
 I've checked) if you don't touch the memory, but traversing it extra
 times is not ideal, yet the reproducibility of always zeroing newly
 created vectors is handy.  Maybe there are places that malloc() is more
 expensive, or that fragmentation (if you have some very small vectors,
 or are using huge pages) would become a factor.

If the DACreateVec managed a pool and the destroy put it back in the pool,
then you would not have all this overhead.

It is fine to have a DACreate and DAGet but I think it is possible to have 
just one.

Barry

 

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 10:50:22 -0500, Barry Smith bsmith at mcs.anl.gov wrote:
 If the DACreateVec managed a pool and the destroy put it back in the pool,
 then you would not have all this overhead.

The problem with this is that it doesn't actually release the memory.
So in the preprocessing scenario where I create, say, 50 local vectors,
use them all temporarily, then destroy them and build a solver, I would
run out of memory.  I think there needs to be some way for the user to
guarantee that the memory is actually released.
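
A toy model of that scenario, in Python for brevity. PoolingFactory and its
methods are hypothetical, and release_all() in particular is only a guess at
what an explicit release might look like; nothing of that name is proposed
in this thread.

```python
class PoolingFactory:
    """Toy model: create() draws from a free list, destroy() refills it."""
    def __init__(self, n):
        self.n = n
        self.free = []

    def create(self):
        if self.free:
            return self.free.pop()
        return [0.0] * self.n

    def destroy(self, vec):
        self.free.append(vec)        # storage stays alive inside the pool

    def release_all(self):
        # Hypothetical explicit release so the memory can really go away.
        self.free = []

    def retained_entries(self):
        return len(self.free) * self.n


factory = PoolingFactory(1000)
work = [factory.create() for _ in range(50)]     # preprocessing stage
for vec in work:
    factory.destroy(vec)
del work, vec
held_after_destroy = factory.retained_entries()  # all 50 vectors still held
factory.release_all()
held_after_release = factory.retained_entries()
print(held_after_destroy, held_after_release)
```

With a pooled destroy, the 50 vectors stay resident until something like
release_all() is called, so the user needs some explicit way to say the
preprocessing memory is no longer wanted.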

Jed

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

2010-08-27 Thread Kai Germaschewski

I think it's conceptually not right to have Get()/Restore() pairs to do
caching, but provide that feature only in special cases. Shouldn't other
places that need temp work vectors have a similar facility? I think a
better way of handling this would be to get rid of the special
Get()/Restore() caching pairs, and rather make caching a feature of the
object itself. I.e., VecCreate() might have a pool of allocated vector data
structures + associated data storage arrays. VecDestroy() would not really
destroy a Vec but return it to the pool. Unfortunately that doesn't mix too
well with how the real setup of a vector is done later, after options and
potentially command line options have been processed, so by the time you
know what kind of vector you want, you've pretty much already filled the
data structure. And it also requires some more memory management framework
which would call upon caches to expire long-unused objects when memory is
running low.

I think a more consistent user interface would be to just have
DACreateVec(), and something like MatCreateVecs(), and then VecDestroy() the
Vec when you're done, no matter where it came from. Whether it's cached or
not is an implementation detail. Probably the first thing to figure out
would be whether caching is making an actual difference in the real world,
and if not, there's a pretty straightforward solution...

--Kai


--
Kai Germaschewski
Assistant Professor, Dept of Physics / Space Science Center
University of New Hampshire, Durham, NH 03824
office: Morse Hall 245E
phone: +1-603-862-2912
fax: +1-603-862-2771

[petsc-dev] What's the point of D(A/M)GetGlobalVector?

On Fri, 27 Aug 2010 12:00:32 -0400, Kai Germaschewski kai.germaschewski at 
unh.edu wrote:
 And it also requires some more memory management framework which would
 call upon caches to expire long-unused objects when memory is running
 low.

How would you detect this?  Note that further allocation may be done
external to PETSc, and perhaps even in a separate process.  We're not in
a managed environment, we can't get a reliable time to GC.  If we
could get that sort of signal, then I would be for such caching at all
times, but I don't think we can, in which case I still think
managed/pooled access versus owned creation needs to be explicitly
different.

Jed

[petsc-dev] Problem with petsc-dev

2010-08-27 Thread Satish Balay

On Fri, 27 Aug 2010, Keita Teranishi wrote:

 Satish,
 
 Thanks.  I do not see any mercurial package for SUSE, let me try if it works.

source install of mercurial is pretty easy

cd mercurial
python setup.py install --prefix=/foo/bar

[instructions say - use PYTHONPATH - but I like to hardcode it in 'hg'
script] i.e. edit /foo/bar/bin/hg and add the following at the very
beginning [where mercurial uses a different value of pythonXX -
depending on the python version on your machine]

import sys
sys.path.insert(0,'/foo/bar/lib/pythonXX/site-packages/')
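
If you would rather not hardcode pythonXX, the directory name can be
derived from the interpreter that runs the script. This is only a sketch:
/foo/bar is the example prefix from above, and the
lib/pythonX.Y/site-packages layout is an assumption (some distributions
use lib64 instead).

```python
import sys

prefix = '/foo/bar'  # example prefix used above, adjust to your install

# Build e.g. 'python2.6' from the interpreter that is running the hg script.
pyver = 'python%d.%d' % (sys.version_info[0], sys.version_info[1])

# Assumed layout: <prefix>/lib/pythonX.Y/site-packages/
site_packages = '%s/lib/%s/site-packages/' % (prefix, pyver)

sys.path.insert(0, site_packages)
```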

Satish

[petsc-dev] When FUNCT is wrong

C99 mandates __func__, but unfortunately C++ and earlier C standards
mandate no such thing.  Even so, most compilers support __FUNCTION__,
__PRETTY_FUNCTION__ (distinct in C++), or __func__.  Inaccurate traces
annoy me greatly and it's inevitable that __FUNCT__ is occasionally
incorrect.  It also seems a shame for traces to just say "User provided
function" when the compiler supports something better.  So how about
having configure check for the existence of a compiler-supported name,
and perhaps also add an assertion to PetscFunctionBegin to error if
__FUNCT__ does not match __FUNCTION__ (and is defined to something other
than "User provided function", so that this wouldn't break user code
that ignores __FUNCT__ entirely).
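
To illustrate the logic of the proposed check, here is a toy version in
Python rather than C. The names check_funct, DEFAULT_NAME,
compute_residual, compute_jacobian and user_callback are made up for this
sketch and are not PETSc API; in C the comparison would be a string
compare of __FUNCT__ against __func__ or __FUNCTION__ inside
PetscFunctionBegin.

```python
import functools

DEFAULT_NAME = "User provided function"  # placeholder meaning "not set"


def check_funct(declared_name=DEFAULT_NAME):
    """Decorator: complain if a hand-written name disagrees with the real one."""
    def wrap(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            # Only enforce the check when the author bothered to set a name.
            if declared_name != DEFAULT_NAME and declared_name != func.__name__:
                raise RuntimeError(
                    "declared name %r does not match actual name %r"
                    % (declared_name, func.__name__))
            return func(*args, **kwargs)
        return inner
    return wrap


@check_funct("compute_residual")      # correct annotation
def compute_residual(x):
    return abs(x)


@check_funct("compute_residual")      # stale annotation, copied from above
def compute_jacobian(x):
    return 2 * x


@check_funct()                        # annotation ignored entirely: no error
def user_callback(x):
    return x
```

Calling compute_jacobian() raises, while user_callback(), which never set
a name, keeps working. That is the "don't break user code" part of the
proposal.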

Jed

[petsc-dev] When FUNCT is wrong


  Jed,

  You are certainly welcome to add it.

Barry


[petsc-dev] What's the point of D(A/M)GetGlobalVector?


  Hmmm,

petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetColoring(DM,ISColoringType,const MatType,ISColoring*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetMatrix(DM,const MatType,Mat*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetInterpolation(DM,DM,Mat*,Vec*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetInterpolationScale(DM,DM,Mat,Vec*);
petscda.h:EXTERN PetscErrorCode PETSCDM_DLLEXPORT DMGetAggregates(DM,DM,Mat*);

   Should all of these be Create?

   In my mind, Get usually means getting something intrinsic to the
underlying object (some property of it, for example); Create means
generating a new thing that, while it may be associated with the DA, is not
owned or controlled by the DA.

Another way to organize it: Create() implies you later Destroy() that
object, while for things you Get you do something else (like Restore).

I'm inclined to change all of these to Create(), since they are all
Destroy()ed.
   Barry




[petsc-dev] Problem with petsc-dev

Satish,

Now I got the latest copy using mercurial.  Thanks!
I am going to check the performance with Fermi.  Is there any command line
option available to switch to CUSP? Or do I have to apply MatConvert() with
PETSc function calls?

Thanks,

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com




[petsc-dev] Problem with petsc-dev


   You can run, for example, src/snes/examples/tutorials ex19 with the options 
-pc_type jacobi -dmmg_nlevels 5  -da_vec_type cuda -da_mat_type aijcuda  and 
run without those last two options to NOT use the GPU and compare the results 
with -log_summary. We'd be interested in seeing those numbers also.

   Barry


[petsc-dev] [petsc4py] Vec.getArray()

I cannot figure out how to implement a copy-free and safe
VecGetArray()/VecRestoreArray() pattern in Python (not even by using
the 'with' statement, it leaks the target variable).

1) Provide a 100% safe but slow, copy-based way:

a = x.getArray()   # gives you a copy. It is implemented with
                   # VecGetArrayRead(x, p), memcpy p -> a.data,
                   # VecRestoreArray(x, p) on a freshly allocated numpy
                   # array that is returned to the user.
a.base is None     # True, the array owns its memory buffer
x.setArray(a)      # writes the array to the vector. It is implemented with
                   # VecGetArray(x, p), memcpy a.data -> p,
                   # VecRestoreArray(x, p)


2) Provide an unsafe but fast, copy-free way to get a numpy array
sharing memory with PETSc vectors:

a = numpy.asarray(x)  # gives you a numpy array that shares memory with
                      # the vec. It is implemented with VecGetArray() and
                      # special Python/NumPy protocols for buffer sharing.
a.base is x           # True, the base attr holds a ref to the Vec instance;
                      # the array does not own its memory buffer.
del a                 # force garbage collection explicitly; VecRestoreArray()
                      # will then be called when a gets deallocated.

Relying on explicit use of del for garbage collection is not reliable.
NumPy is designed to support array views, these views hold references
to the base array. So users have to be very careful about how the
arrays obtained the fast way are used.
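
The hazard can be shown without PETSc or NumPy at all. The sketch below
uses only the standard library: a bytearray stands in for the Vec storage
and memoryview for the shared-memory array. It is an analogy under those
assumptions, not petsc4py code.

```python
buf = bytearray(8)            # stands in for the Vec's storage

with memoryview(buf) as a:    # "get": a shares memory with buf, no copy
    a[0] = 42                 # writes go straight into buf
    b = a[2:4]                # derived view, comparable to a NumPy slice
# leaving the block released a ("restore"), but the name a is still bound

try:
    a[0]
    a_usable = True
except ValueError:            # released views refuse any further access
    a_usable = False

try:
    buf.append(0)             # b still pins the storage
    resized_while_viewed = True
except BufferError:
    resized_while_viewed = False

b.release()                   # only now is the storage really given back
buf.append(0)
```

The 'with' block does leak the target variable a, and the derived view b
keeps the buffer pinned after the block ends, which is exactly the kind of
lingering reference that makes the fast way easy to misuse.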


Comments ? Suggestions? Complaints?


-- 
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

[petsc-dev] Problem with petsc-dev

Barry,

I already see a big difference in MatMult routine of 
ksp/ksp/examples/tutorials/ex2.c, and I am very happy to try that example 
program.

Thanks,

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com




[petsc-dev] Problem with petsc-dev

On 27 August 2010 15:28, Keita Teranishi keita at cray.com wrote:
 Barry,

 I already see a big difference in MatMult routine of 
 ksp/ksp/examples/tutorials/ex2.c, and I am very happy to try that example 
 program.


Could you post your numbers?

-- 
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

[petsc-dev] Problem with petsc-dev


  Keita,

Make sure you ALWAYS use the flag -cuda_synchronize when you run with 
-log_summary

Otherwise you get misleading numbers. (Which I am guessing you got).

 Barry




[petsc-dev] Problem with petsc-dev

2010-08-27 Thread Satish Balay

On Fri, 27 Aug 2010, Satish Balay wrote:

 There was a problem with tarball creation for the past few days. Will
 try to respin manually today - and update you.

the petsc-dev tarball is now updated on the website..

Satish

[petsc-dev] [GPU] Performance on Fermi


   PETSc-dev folks,

  Please prepend all messages to petsc-dev that involve GPUs with [GPU] so 
they can be easily filtered.

Keita,

  To run src/ksp/ksp/examples/tutorials/ex2.c with CUDA you need the flag 
-vec_type cuda

  Note also that this example is fine for simple ONE processor tests but 
should not be used for parallel testing because it does not do a proper 
parallel partitioning for performance

Barry

On Aug 27, 2010, at 2:04 PM, Keita Teranishi wrote:

 Hi,
 
 I ran ex2.c with a matrix from 512x512 grid. 
 I set CG and Jacobi for the solver and preconditioner. 
 GCC-4.4.4 and CUDA-3.1 are used to compile the code.
 BLAS and LAPACK are not optimized.
 
 MatMult
 Fermi:            1142 MFlops
 1 core Istanbul:   420 MFlops
 
 KSPSolve:
 Fermi:            1.5 Sec
 1 core Istanbul:  1.7 Sec
 
 
 
  Keita Teranishi
  Scientific Library Group
  Cray, Inc.
  keita at cray.com
 
 
 

[petsc-dev] [GPU] Performance on Fermi

Barry,

CPU version takes another digit. So it is 1.6 sec on Fermi and 17 sec 1 core 
CPU.
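
For reference, the ratios implied by the numbers in this thread work out as
below. This is plain arithmetic on the reported figures; the MatMult ratio
assumes the 420 MFlops CPU rate from the first run still applies, which
was not re-measured.

```python
# Figures as reported in this thread.
ksp_solve_sec = {"Fermi": 1.6, "1 core Istanbul": 17.0}
mat_mult_mflops = {"Fermi": 5200.0, "1 core Istanbul": 420.0}

# Time ratio: how many times longer the CPU run took.
ksp_speedup = ksp_solve_sec["1 core Istanbul"] / ksp_solve_sec["Fermi"]

# Rate ratio: how many times more flops per second the GPU run sustained.
mat_mult_speedup = mat_mult_mflops["Fermi"] / mat_mult_mflops["1 core Istanbul"]

print("KSPSolve speedup: %.1fx" % ksp_speedup)
print("MatMult speedup:  %.1fx" % mat_mult_speedup)
```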

Thanks,

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com



-Original Message-
From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-boun...@mcs.anl.gov] 
On Behalf Of Keita Teranishi
Sent: Friday, August 27, 2010 2:20 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] [GPU] Performance on Fermi

Barry,

Yes. It improves the performance dramatically, but the execution time for 
KSPSolve stays the same.

MatMult 5.2 Gflops

Thanks,


Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com




[petsc-dev] [GPU] Problem with petsc-dev

Barry,

The SNES example program fails at DACreate_2D().  I am not using MPI to build
the program (it's a Linux white box). Do I need MPI to run the code?

Thanks,

[0]PETSC ERROR: DACreate_2D() line 1338 in src/dm/da/src/da2.c

[0]PETSC ERROR: - Error Message 

[0]PETSC ERROR: Argument out of range!
[0]PETSC ERROR: Given Bad partition!
[0]PETSC ERROR: 


Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com




[petsc-dev] [GPU] Performance on Fermi


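  ##########################################################
  #                                                        #
  #                          WARNING!!!                    #
  #                                                        #
  #   This code was compiled with a debugging option,      #
  #   To get timing results run ./configure                #
  #   using --with-debugging=no, the performance will      #
  #   be generally two or three times faster.              #
  #                                                        #
  ##########################################################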


  You need to build the code with ./configure --with-debugging=0 to make a fair
comparison. This will speed up the CPU version.

   Barry



[petsc-dev] [GPU] Performance on Fermi

Yes, I replaced all the compiler flags by -O3.


Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com



-Original Message-
From: Jed Brown [mailto:five...@gmail.com] On Behalf Of Jed Brown
Sent: Friday, August 27, 2010 4:16 PM
To: Keita Teranishi; For users of the development version of PETSc
Subject: Re: [petsc-dev] [GPU] Performance on Fermi

On Fri, 27 Aug 2010 16:06:30 -0500, Keita Teranishi keita at cray.com wrote:
 Barry,
 
 The CPU timing I reported was after recompiling the code (I removed 
 PETSC_USE_DEBUG and GDB macros from petscconf.h).  

Unless you were manually overriding compiler flags, it still wasn't
optimized.  Please just reconfigure a new PETSC_ARCH --with-debugging=0.
It's as easy as

  foo-dbg/conf/reconfigure-foo-dbg.py --with-debugging=0 PETSC_ARCH=foo-opt
  make PETSC_ARCH=foo-opt

Jed

[petsc-dev] [GPU] Performance on Fermi

Jed,

I usually edit petscconf.h and petscvariables by hand to change the
installation configuration for Cray XT/XE.  The problem is that PETSc's
configure script picks up the wrong variables and #define macros because the
OS and library settings on the login node differ from those on the compute
node.

This particular case is just a mistake in the configure script (and it's not
a big deal to fix), but it would be great if you have any ideas for avoiding
picking up the wrong settings.

Thanks,

Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com



-Original Message-
From: Jed Brown [mailto:five...@gmail.com] On Behalf Of Jed Brown
Sent: Friday, August 27, 2010 4:29 PM
To: Keita Teranishi; For users of the development version of PETSc
Subject: RE: [petsc-dev] [GPU] Performance on Fermi

On Fri, 27 Aug 2010 16:18:43 -0500, Keita Teranishi keita at cray.com wrote:
 Yes, I replaced all the compiler flags by -O3.

petsc-maint doesn't come to me, but if the snippet that Barry quoted was
from your log_summary, then PETSC_USE_DEBUG was definitely defined when
plog.c was compiled.  It's really much easier to have two separate
builds and always use the optimized one when profiling.

Jed

[petsc-dev] [GPU] Performance on Fermi