Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Jed Brown via petsc-users
"Smith, Barry F. via petsc-users"  writes:

>   So it sounds like spack is still mostly a "package manager" where people 
> use "static" packages and don't hack the package's code. This is not 
> unreasonable; presumably no other package manager supports hacking a 
> package's code easily either. The problem is that in the HPC world a 
> "packaged" code is always incomplete and may need hacking or application of 
> newly generated patches, and this is painful with static package managers, so 
> people want to use the git repository directly and manage the build 
> themselves, which negates the advantages of using a package manager.

I don't think people "want" to hack the packages that they don't
contribute to.  Spack provides pretty rapid distribution of patches.

What if PETSc had

  ./configure --with-mumps=spack

or some alternative that would check with spack to find a suitable
MUMPS, installing it (with Spack's dependency resolution) if not
available?  Then you could hack on PETSc with multiple PETSC_ARCH,
branches, incremental rebuilds, and testing, but not need to deal with
PETSc's crude package installation and upgrade mechanism.

Upon completion, the build could offer a yaml snippet for packages.yaml
in case the user wanted other Spack packages to use that PETSc.
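A sketch of how the proposed handoff might work. The '--with-mumps=spack' flag is hypothetical (Jed's suggestion, not an existing option); 'spack location' and PETSc's '--with-mumps-dir' do exist:

```shell
# Hypothetical: what ./configure --with-mumps=spack might do internally.
spack install mumps                  # let Spack resolve and build MUMPS + deps
MUMPS_DIR=$(spack location -i mumps) # query the install prefix from Spack
./configure --with-mumps-dir="$MUMPS_DIR"
```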


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Balay, Satish via petsc-users
On Wed, 17 Apr 2019, Smith, Barry F. wrote:

> 
>   So it sounds like spack is still mostly a "package manager" where people 
> use "static" packages and don't hack the package's code. This is not 
> unreasonable; presumably no other package manager supports hacking a 
> package's code easily either. The problem is that in the HPC world a 
> "packaged" code is always incomplete and may need hacking or application of 
> newly generated patches, and this is painful with static package managers, so 
> people want to use the git repository directly and manage the build 
> themselves, which negates the advantages of using a package manager.
> 
>Thanks
> 
> Barry
> 
> Perhaps if spack had an easier mechanism to allow the user to "point to" 
> local git clones it could get closer to the best of both worlds. Maybe spack 
> could support a list of local repositories and branches in the yaml file. But 
> yes the issue of rerunning the "./configure" stage still comes up.

$ spack help --all | grep diy
  diy   do-it-yourself: build from an existing source directory

I haven't explored this mode though. [but useful for packages that are
not already represented in the repo]

This mode was in the instructions for one of the packages - but then I
couldn't figure out the equivalent of 'spack spec' vs 'spack install'
[query and check dependencies before installing] with diy.
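For reference, a sketch of how the diy mode is typically invoked (untested, as noted above; syntax from spack of this era, paths illustrative):

```shell
# Build an existing local working tree as a given package/version.
cd ~/petsc                  # a clone with local modifications
spack diy petsc@develop     # use this directory as the source for petsc@develop
```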

It's not ideal - but keeping local changes in our spack clones (change
the git url, add appropriate version lines for the branches one is working
on) is possible [for a group working in this mode]. But it might not be
ideal for folks who want to easily do a one-off PR in this
mode.

Satish


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
>Perhaps if spack had an easier mechanism to allow the user to "point to"
local git clones it could get closer to the best of both worlds. Maybe
spack could support a list of local repositories and branches in the yaml
file.

I wonder if a local git clone of petsc can become a "mirror" for petsc
spack package, though this is not the intended use of mirrors. Refer to
https://spack.readthedocs.io/en/latest/mirrors.html
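A sketch of the mirror idea (paths illustrative). Note that mirrors serve archives, so a git clone would first need to be packed into the layout spack expects, e.g. mirror-dir/petsc/petsc-version.tar.gz:

```shell
# Register a local directory as a spack mirror and verify it is listed.
spack mirror add local-petsc file:///home/user/spack-mirror
spack mirror list
```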


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users


  So it sounds like spack is still mostly a "package manager" where people use 
"static" packages and don't hack the package's code. This is not unreasonable; 
presumably no other package manager supports hacking a package's code easily 
either. The problem is that in the HPC world a "packaged" code is always 
incomplete and may need hacking or application of newly generated patches, and 
this is painful with static package managers, so people want to use the git 
repository directly and manage the build themselves, which negates the 
advantages of using a package manager.

   Thanks

Barry

Perhaps if spack had an easier mechanism to allow the user to "point to" local 
git clones it could get closer to the best of both worlds. Maybe spack could 
support a list of local repositories and branches in the yaml file. But yes the 
issue of rerunning the "./configure" stage still comes up.
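A sketch of what such a yaml entry could look like. This is not an existing spack feature; the file name and keys below are invented purely for illustration:

```shell
# Hypothetical config: a list of local repositories/branches spack could honor.
cat >> ~/.spack/local-repos.yaml <<'EOF'
local_repos:
  petsc:
    path: /home/user/petsc   # local git clone to build from
    branch: my-bugfix        # branch to use instead of the packaged version
EOF
```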






> On Apr 17, 2019, at 12:00 AM, Balay, Satish via petsc-users 
>  wrote:
> 
> On Tue, 16 Apr 2019, Sajid Ali via petsc-users wrote:
> 
>>> develop > 3.11.99 > 3.10.xx > maint (or other strings)
>> Just discovered this issue when trying to build with my fork of spack at [1
>> ].
>> 
>> 
>> So, ideally each developer has to have their develop point to the branch
>> they want to build? That would make communication a little confusing since
>> spack's develop version is some package's master, and now everyone wants a
>> different develop so as to not let spack apply patches for a string
>> version sorted lower than the lowest numeric version.
> 
> There is an issue filed [with a PR?] regarding this sorting order of
> string versions vs numerical versions. This might improve in the
> future. But for now 'bugfix-vecduplicate-fftw-vec' will be lower than
> version 0.1.
> 
> Also 'develop' might not be appropriate for all branches.
> 
> For ex: - petsc has maint, maint-3.10 etc branches. - so if one is
> creating a bugfix for maint - (i.e start a branch off maint) it would
> be inappropriate to call it 'develop' - as it will be marked > version
> 3.11.99 and break some of the version comparisons.
> 
>> 
>>> Even if you change the commit from 'abc' to 'def' spack won't recognize this
>> change and use the cached tarball.
>> True, but since checksum changes and the user has to constantly zip and
>> unzip, I personally find git cloning easier to deal with so it's just a
>> matter of preference.
>> 
> 
> Here you are referring to tarballs - where the sha256sum is listed.
> 
> url = "http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.5.3.tar.gz"
> version('3.11.1', 'cb627f99f7ce1540ebbbf338189f89a5f1ecf3ab3b5b0e357f9e46c209f1fb23')
> 
> However - one can also say:
> 
> git = "https://bitbucket.org/sajid__ali/petsc.git"
> version('3.11.1', commit='f3d32574624d5351549675da8733a2646265404f')
> 
> Here - spack downloads the git snapshot as a tarball (saves it in the tarball
> cache as petsc-3.11.1.tar.gz - and reuses it) - and there is no
> sha256sum listed here to check. If you change this to some other
> commit (perhaps to test a fix) - spack will use the cached tarball -
> and not download the snapshot corresponding to the changed commit.
> 
> Satish



Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Balay, Satish via petsc-users
On Tue, 16 Apr 2019, Sajid Ali via petsc-users wrote:

> > develop > 3.11.99 > 3.10.xx > maint (or other strings)
> Just discovered this issue when trying to build with my fork of spack at [1
> ].
> 
> 
> So, ideally each developer has to have their develop point to the branch
> they want to build? That would make communication a little confusing since
> spack's develop version is some package's master, and now everyone wants a
> different develop so as to not let spack apply patches for a string
> version sorted lower than the lowest numeric version.

There is an issue filed [with a PR?] regarding this sorting order of
string versions vs numerical versions. This might improve in the
future. But for now 'bugfix-vecduplicate-fftw-vec' will be lower than
version 0.1.

Also 'develop' might not be appropriate for all branches.

For ex: - petsc has maint, maint-3.10 etc branches. - so if one is
creating a bugfix for maint - (i.e start a branch off maint) it would
be inappropriate to call it 'develop' - as it will be marked > version
3.11.99 and break some of the version comparisons.

> 
> > Even if you change the commit from 'abc' to 'def' spack won't recognize this
> change and use the cached tarball.
> True, but since checksum changes and the user has to constantly zip and
> unzip, I personally find git cloning easier to deal with so it's just a
> matter of preference.
> 

Here you are referring to tarballs - where the sha256sum is listed.

 url = "http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.5.3.tar.gz"
 version('3.11.1', 'cb627f99f7ce1540ebbbf338189f89a5f1ecf3ab3b5b0e357f9e46c209f1fb23')

However - one can also say:

 git = "https://bitbucket.org/sajid__ali/petsc.git"
 version('3.11.1', commit='f3d32574624d5351549675da8733a2646265404f')

Here - spack downloads the git snapshot as a tarball (saves it in the tarball
cache as petsc-3.11.1.tar.gz - and reuses it) - and there is no
sha256sum listed here to check. If you change this to some other
commit (perhaps to test a fix) - spack will use the cached tarball -
and not download the snapshot corresponding to the changed commit.
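One workaround when the pinned commit changes is to clear the download cache so the next install fetches a fresh snapshot instead of reusing the stale petsc-3.11.1.tar.gz (the exact flag spelling may vary across spack versions):

```shell
spack clean --downloads      # drop cached source tarballs/snapshots
spack install petsc@3.11.1   # re-fetches the snapshot for the new commit
```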

Satish


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
> develop > 3.11.99 > 3.10.xx > maint (or other strings)
Just discovered this issue when trying to build with my fork of spack at [1
].


So, ideally each developer has to have their develop point to the branch
they want to build? That would make communication a little confusing since
spack's develop version is some package's master, and now everyone wants a
different develop so as to not let spack apply patches for a string
version sorted lower than the lowest numeric version.

> Even if you change the commit from 'abc' to 'def' spack won't recognize this
change and use the cached tarball.
True, but since checksum changes and the user has to constantly zip and
unzip, I personally find git cloning easier to deal with so it's just a
matter of preference.


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Balay, Satish via petsc-users
On Wed, 17 Apr 2019, Balay, Satish via petsc-users wrote:

> On Tue, 16 Apr 2019, Smith, Barry F. wrote:
> 
> > 
> > 
> > > On Apr 16, 2019, at 10:30 PM, Sajid Ali 
> > >  wrote:
> > > 
> > > @Barry: Thanks for the bugfix! 
> > > 
> > > @Satish: Thanks for pointing out this method!
> > > 
> > > My preferred way previously was to download the source code, unzip, edit, 
> > > zip. Now ask spack to not checksum (because my edit has changed stuff) 
> > > and build. Lately, spack has added git support and now I create a branch 
> > > of spack where I add my bugfix branch as the default build git repo 
> > > instead of master to not deal with checksum headaches. 
> > 
> >With the PETSc build system directly it handles dependencies, that is if 
> > you use a PETSC_ARCH and edit one PETSc file it will only recompile that 
> > one file and add it to the library instead of insisting on recompiling all 
> > of PETSc (as developers of course we rely on this or we'd go insane waiting 
> > for builds to complete when we are adding code).
> 
> Yeah but this is within a single package - and only if we don't redo a 
> configure.
> 
> And some of our code to avoid rebuilding external packages has corner cases 
> - so we have to occasionally ask users to do 'rm -rf PETSC_ARCH'
> 
> > 
> >  Is this possible with spack?
> 
> Spack tries to do this [avoid rebuilds] at a package level.
> 
> However within a package - it doesn't keep build files. [and if the
> user forces spack to not delete them with '--dont-restage
> --keep-stage' - it doesn't check if the package needs to run configure
> again or not etc..] I'm not sure if this is possible to do
> consistently without error cases across the package collection spack
> has.

One additional note: spack has a way to get into a mode where one can
do the builds manually - primarily for debugging [when builds fail]

spack build-env petsc bash
spack cd petsc
make

But I haven't checked how to replicate all the steps [configure;
make all; make install; delete] exactly as spack would do them in
'spack install'.

However - in this 'build-env' mode - one could use the incremental build
feature provided by any given package.

Satish


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Balay, Satish via petsc-users
On Tue, 16 Apr 2019, Smith, Barry F. wrote:

> 
> 
> > On Apr 16, 2019, at 10:30 PM, Sajid Ali  
> > wrote:
> > 
> > @Barry: Thanks for the bugfix! 
> > 
> > @Satish: Thanks for pointing out this method!
> > 
> > My preferred way previously was to download the source code, unzip, edit, 
> > zip. Now ask spack to not checksum (because my edit has changed stuff) and 
> > build. Lately, spack has added git support and now I create a branch of 
> > spack where I add my bugfix branch as the default build git repo instead of 
> > master to not deal with checksum headaches. 
> 
>With the PETSc build system directly it handles dependencies, that is if 
> you use a PETSC_ARCH and edit one PETSc file it will only recompile that one 
> file and add it to the library instead of insisting on recompiling all of 
> PETSc (as developers of course we rely on this or we'd go insane waiting for 
> builds to complete when we are adding code).

Yeah but this is within a single package - and only if we don't redo a 
configure.

And some of our code to avoid rebuilding external packages has corner cases - 
so we have to occasionally ask users to do 'rm -rf PETSC_ARCH'

> 
>  Is this possible with spack?

Spack tries to do this [avoid rebuilds] at a package level.

However within a package - it doesn't keep build files. [and if the
user forces spack to not delete them with '--dont-restage
--keep-stage' - it doesn't check if the package need to run configure
again or not etc..] I'm not sure if this is possible to do
consistently without error cases across the package collection spack
has.

Satish


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Balay, Satish via petsc-users
On Tue, 16 Apr 2019, Sajid Ali wrote:

> Lately, spack has added git support and now I create a branch of
> spack where I add my bugfix branch as the default build git repo instead of
> master to not deal with checksum headaches.

Some good and bad here..

version('develop', branch='master')
version('3.11.99', branch='maint')
version('maint', branch='maint')

A git branch is the only way to get a rebuild to pick up package changes (in 
that branch).

However it's best to set appropriate version numbers here, i.e.
'3.11.99' should be preferable over 'maint'. Otherwise spack's version
comparison logic will give unwanted results. It does stuff like:

develop > 3.11.99 > 3.10.xx > maint (or other strings)

Wrt tarballs and commit-ids - spack saves them as tarballs in cache
and reuses them. For ex: - the download below will be saved as
petsc-3.10.1.tar.gz.  Even if you change commit from 'abc' to 'def'
spack won't recognize this change and use the cached tarball.

However - the bad part wrt a branch is - each time you do a 'spack
install' - it does a full git clone. [i.e. there is no persistent local
clone which does a 'fetch' to minimize the clone overhead]

Satish


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users



> On Apr 16, 2019, at 10:30 PM, Sajid Ali  
> wrote:
> 
> @Barry: Thanks for the bugfix! 
> 
> @Satish: Thanks for pointing out this method!
> 
> My preferred way previously was to download the source code, unzip, edit, 
> zip. Now ask spack to not checksum (because my edit has changed stuff) and 
> build. Lately, spack has added git support and now I create a branch of spack 
> where I add my bugfix branch as the default build git repo instead of master 
> to not deal with checksum headaches. 

   The PETSc build system handles dependencies directly; that is, if you 
use a PETSC_ARCH and edit one PETSc file it will only recompile that one file 
and add it to the library instead of insisting on recompiling all of PETSc (as 
developers we of course rely on this or we'd go insane waiting for builds to 
complete when we are adding code).

  Satish,

 Is this possible with spack?

  Barry


> 



Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
@Barry: Thanks for the bugfix!

@Satish: Thanks for pointing out this method!

My preferred way previously was to download the source code, unzip, edit,
zip. Now ask spack to not checksum (because my edit has changed stuff) and
build. Lately, spack has added git support and now I create a branch of
spack where I add my bugfix branch as the default build git repo instead of
master to not deal with checksum headaches.


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users


  Satish,

Thanks for the instructions. Since eventually we want everyone to use spack 
(but with the git repository of PETSc), is there a place we should 
document this? Perhaps at the bottom of installation.html?

  Barry

Note: of course we want them to be able to painlessly make pull requests from 
the thing spack downloads. 


> On Apr 16, 2019, at 10:23 PM, Balay, Satish  wrote:
> 
> On Wed, 17 Apr 2019, Smith, Barry F. via petsc-users wrote:
> 
>> 
>>  Funny you should ask, I just found the bug. 
>> 
>>> On Apr 16, 2019, at 9:47 PM, Sajid Ali  
>>> wrote:
>>> 
>>> Quick question : To drop a print statement at the required location, I need 
>>> to modify the source code, build petsc from source and compile with this 
>>> new version of petsc, right or is there an easier way? (Just to confirm 
>>> before putting in the effort)
>> 
>>   Yes. But perhaps spack has a way to handle this as well; it should. 
>> Satish? If you can get spack to use the git repository then you could edit 
>> in that and somehow have spack rebuild using your edited repository.
> 
> 
> $ spack help install |grep stage
>  --keep-stage  don't remove the build stage if installation succeeds
>  --dont-restageif a partial install is detected, don't delete prior 
> state
> 
> Here is how it works.
> 
> - By default - spack downloads the tarball/git-snapshots and saves them in 
> var/spack/cache
> - and it stages them for build in var/spack/stage [i.e untar and ready to 
> compile]
> - after the build is complete - it installs in opt/.. and deletes the 
> staged/build files.
> [if the build breaks - it leaves the stage alone]
> 
> So if we want to add some modifications to a broken build and rebuild - I 
> would:
> 
> - 'spack stage' or 'spack install --keep-stage' [to get the package files 
> staged but not deleted]
> - edit files in stage
> - 'spack install --dont-restage --keep-stage'
>  i.e use the currently staged files and build from it. And don't delete them 
> even if the build succeeds
> 
> Satish



Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Balay, Satish via petsc-users
On Wed, 17 Apr 2019, Smith, Barry F. via petsc-users wrote:

> 
>   Funny you should ask, I just found the bug. 
> 
> > On Apr 16, 2019, at 9:47 PM, Sajid Ali  
> > wrote:
> > 
> > Quick question : To drop a print statement at the required location, I need 
> > to modify the source code, build petsc from source and compile with this 
> > new version of petsc, right or is there an easier way? (Just to confirm 
> > before putting in the effort)
> 
>Yes. But perhaps spack has a way to handle this as well; it should. 
> Satish? If you can get spack to use the git repository then you could edit in 
> that and somehow have spack rebuild using your edited repository.


$ spack help install | grep stage
  --keep-stage  don't remove the build stage if installation succeeds
  --dont-restageif a partial install is detected, don't delete prior 
state

Here is how it works.

- By default - spack downloads the tarball/git-snapshots and saves them in 
var/spack/cache
- and it stages them for build in var/spack/stage [i.e untar and ready to 
compile]
- after the build is complete - it installs in opt/.. and deletes the 
staged/build files.
 [if the build breaks - it leaves the stage alone]

So if we want to add some modifications to a broken build and rebuild - I would:

- 'spack stage' or 'spack install --keep-stage' [to get the package files 
staged but not deleted]
- edit files in stage
- 'spack install --dont-restage --keep-stage'
  i.e use the currently staged files and build from it. And don't delete them 
even if the build succeeds
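The steps above, as a command sequence (a sketch; 'spack cd' jumps to the stage directory):

```shell
spack stage petsc                    # download and unpack, but do not build
spack cd petsc                       # move into the staged source tree
# ... edit files in place ...
spack install --dont-restage --keep-stage petsc
```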

Satish


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users
https://bitbucket.org/petsc/petsc/pull-requests/1551/chunksize-could-overflow-and-become/diff

With this fix I can run with your vector size on 1 process. With 2 processes I 
get

$ petscmpiexec -n 2 ./ex1 
Assertion failed in file adio/common/ad_write_coll.c at line 904: 
(curr_to_proc[p] + len - done_to_proc[p]) == (unsigned) (curr_to_proc[p] + len 
- done_to_proc[p])
0   libpmpi.0.dylib 0x000111241f3e backtrace_libc + 62
1   libpmpi.0.dylib 0x000111241ef5 MPL_backtrace_show + 
21
2   libpmpi.0.dylib 0x0009f85a MPIR_Assert_fail + 90
3   libpmpi.0.dylib 0x000a15f3 MPIR_Ext_assert_fail 
+ 35
4   libmpi.0.dylib  0x000110eee16e 
ADIOI_Fill_send_buffer + 1134
5   libmpi.0.dylib  0x000110eefe74 
ADIOI_W_Exchange_data + 2980
6   libmpi.0.dylib  0x000110eed7ad ADIOI_Exch_and_write 
+ 3197
7   libmpi.0.dylib  0x000110eec854 
ADIOI_GEN_WriteStridedColl + 2004
8   libpmpi.0.dylib 0x00011128ad4b MPIOI_File_write_all 
+ 1179
9   libmpi.0.dylib  0x000110ec382b 
MPI_File_write_at_all + 91
10  libhdf5.10.dylib0x0001108b982a H5FD_mpio_write + 
1466
11  libhdf5.10.dylib0x0001108b127a H5FD_write + 634
12  li

Looks like an int overflow in MPIIO. (It is scary to see plain ints in the 
ADIO code as opposed to 64 bit integers, but I guess somehow it works; maybe 
this is a strange corner case, and I don't know if the problem is with HDF5 or 
MPIIO.) 

 On 4 and 8 processes it runs. 

Note that you are playing with a very dangerous size. 32768 * 32768 * 2 equals 
2^31, which overflows to a negative number in a signed 32-bit int. So this is 
essentially the largest problem you can run before switching to 64 bit indices 
for PETSc. 
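Barry's arithmetic can be checked directly; python3 is used here only to emulate 32-bit wraparound:

```shell
# 32768 * 32768 * 2 == 2**31, one past the largest signed 32-bit int (2**31 - 1),
# so it wraps to a negative value when stored in a 32-bit int.
python3 - <<'EOF'
import ctypes
n = 32768 * 32768 * 2
print(n)                        # 2147483648
print(ctypes.c_int32(n).value)  # -2147483648
EOF
```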

  Barry



> On Apr 16, 2019, at 9:32 AM, Sajid Ali via petsc-users 
>  wrote:
> 
> Hi PETSc developers,
> 
> I’m trying to write a large vector created with VecCreateMPI (size 
> 32768x32768) concurrently from 4 nodes (+32 tasks per node, total 128 
> mpi-ranks) and I see the following (indicative) error : [Full error log is 
> here : https://file.io/CdjUfe] 
> 
> HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 52:
>   #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
> major: Dataset
> minor: Unable to initialize object
>   #001: H5Dint.c line 329 in H5D__create_named(): unable to create and link 
> to dataset
> major: Dataset
> minor: Unable to initialize object
>   #002: H5L.c line 1557 in H5L_link_object(): unable to create new link to 
> object
> major: Links
> minor: Unable to initialize object
>   #003: H5L.c line 1798 in H5L__create_real(): can't insert link
> major: Links
> minor: Unable to insert object
>   #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal 
> failed
> major: Symbol table
> HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 59:
>   #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
> major: Dataset
> minor: Unable to initialize object
>   #001: H5Dint.c line 329 in H5D__create_named(): unable to create and link 
> to dataset
> major: Dataset
> minor: Unable to initialize object
>   #002: H5L.c line 1557 in H5L_link_object(): unable to create new link to 
> object
> major: Links
> minor: Unable to initialize object
>   #003: H5L.c line 1798 in H5L__create_real(): can't insert link
> major: Links
> minor: Unable to insert object
>   #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal 
> failed
> major: Symbol table
> minor: Object not found
>   #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal operator 
> failed
> major: Symbol table
> minor: Callback failed
>   #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
> major: Links
> minor: Unable to initialize object

Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users


  Funny you should ask, I just found the bug. 

> On Apr 16, 2019, at 9:47 PM, Sajid Ali  
> wrote:
> 
> Quick question : To drop a print statement at the required location, I need 
> to modify the source code, build petsc from source and compile with this new 
> version of petsc, right or is there an easier way? (Just to confirm before 
> putting in the effort)

   Yes. But perhaps spack has a way to handle this as well; it should. Satish? 
If you can get spack to use the git repository then you could edit in that and 
somehow have spack rebuild using your edited repository.

   Barry
> 
> On Tue, Apr 16, 2019 at 8:42 PM Smith, Barry F.  wrote:
> 
>   Dang, I  ranted too soon.
> 
>   I built mpich  using spack (master branch) and a very old Gnu C compiler 
> and it produced valgrind clean code. Spack definitely is not passing the 
> --enable-g=meminit to MPICH ./configure so this version of MPICH valgrind 
> must be clean by default? MPICH's ./configure has
> 
> meminit  - Preinitialize memory associated structures and unions to
>eliminate access warnings from programs like valgrind
> 
> The default for enable-g is most and
> 
> most|yes)
> perform_memtracing=yes
> enable_append_g=yes
> perform_meminit=yes
> perform_dbgmutex=yes
> perform_mutexnesting=yes
> perform_handlealloc=yes
> perform_handle=yes
> 
> So it appears that at least some releases of MPICH are supposed to be valgrind 
> clean by default ;).
> 
> Looking back at Sajid's valgrind output more carefully
> 
> Conditional jump or move depends on uninitialised value(s)
> ==15359==at 0x1331069A: __intel_sse4_strncmp (in 
> /opt/intel/compilers_and_libraries_2019.1.144/linux/compiler/lib/intel64_lin/libintlc.so.5)
> 
> is the only valgrind error. Which I remember seeing from using Intel 
> compilers for a long time, nothing to do with MPICH
> 
> Thus I conclude that Sajid's code is actually valgrind clean; and I withdraw 
> my rant about MPICH/spack
> 
> Barry
> 
> 
> 
> > On Apr 16, 2019, at 5:13 PM, Smith, Barry F.  wrote:
> >
> >
> >  So valgrind is printing all kinds of juicy information about uninitialized 
> > values but it is all worthless because MPICH was not built by spack to be 
> > valgrind clean. We can't know if any of the problems valgrind flags are 
> > real. MPICH needs to be configured with the option --enable-g=meminit to be 
> > valgrind clean. PETSc's --download-mpich always installs a valgrind clean 
> > MPI.
> >
> > It is unfortunate Spack doesn't provide a variant of MPICH that is valgrind 
> > clean; actually it should default to valgrind clean MPICH.
> >
> >  Barry
> >
> >
> >
> >
> >> On Apr 16, 2019, at 2:43 PM, Sajid Ali via petsc-users 
> >>  wrote:
> >>
> >> So, I tried running the debug version with valgrind to see if I can find 
> >> the chunk size that's being set but I don't see it. Is there a better way 
> >> to do it ?
> >>
> >> `$ mpirun -np 32 valgrind ./ex_ms -prop_steps 1 -info &> out`. [The out 
> >> file is attached.]
> >> 
> >
> 
> 
> 
> -- 
> Sajid Ali
> Applied Physics
> Northwestern University



Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
Quick question : To drop a print statement at the required location, I need
to modify the source code, build petsc from source and compile with this
new version of petsc, right or is there an easier way? (Just to confirm
before putting in the effort)

On Tue, Apr 16, 2019 at 8:42 PM Smith, Barry F.  wrote:

>
>   Dang, I  ranted too soon.
>
>   I built mpich  using spack (master branch) and a very old Gnu C compiler
> and it produced valgrind clean code. Spack definitely is not passing the
> --enable-g=meminit to MPICH ./configure so this version of MPICH valgrind
> must be clean by default? MPICH's ./configure has
>
> meminit  - Preinitialize memory associated structures and unions to
>eliminate access warnings from programs like valgrind
>
> The default for enable-g is most and
>
> most|yes)
> perform_memtracing=yes
> enable_append_g=yes
> perform_meminit=yes
> perform_dbgmutex=yes
> perform_mutexnesting=yes
> perform_handlealloc=yes
> perform_handle=yes
>
> So it appears that at least some releases of MPICH are supposed to be
> valgrind clean by default ;).
>
> Looking back at Sajid's valgrind output more carefully
>
> Conditional jump or move depends on uninitialised value(s)
> ==15359==at 0x1331069A: __intel_sse4_strncmp (in
> /opt/intel/compilers_and_libraries_2019.1.144/linux/compiler/lib/intel64_lin/libintlc.so.5)
>
> is the only valgrind error. Which I remember seeing from using Intel
> compilers for a long time, nothing to do with MPICH
>
> Thus I conclude that Sajid's code is actually valgrind clean; and I
> withdraw my rant about MPICH/spack
>
> Barry
>
>
>
> > On Apr 16, 2019, at 5:13 PM, Smith, Barry F.  wrote:
> >
> >
> >  So valgrind is printing all kinds of juicy information about
> uninitialized values but it is all worthless because MPICH was not built by
> spack to be valgrind clean. We can't know if any of the problems valgrind
> flags are real. MPICH needs to be configured with the option
> --enable-g=meminit to be valgrind clean. PETSc's --download-mpich always
> installs a valgrind clean MPI.
> >
> > It is unfortunate Spack doesn't provide a variant of MPICH that is
> valgrind clean; actually it should default to valgrind clean MPICH.
> >
> >  Barry
> >
> >
> >
> >
> >> On Apr 16, 2019, at 2:43 PM, Sajid Ali via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
> >>
> >> So, I tried running the debug version with valgrind to see if I can
> find the chunk size that's being set but I don't see it. Is there a better
> way to do it ?
> >>
> >> `$ mpirun -np 32 valgrind ./ex_ms -prop_steps 1 -info &> out`. [The out
> file is attached.]
> >> 
> >
>
>

-- 
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users


  Dang, I ranted too soon. 

  I built mpich using spack (master branch) and a very old GNU C compiler and 
it produced valgrind-clean code. Spack definitely is not passing 
--enable-g=meminit to MPICH's ./configure, so this version of MPICH must be 
valgrind clean by default? MPICH's ./configure has 

meminit  - Preinitialize memory associated structures and unions to
   eliminate access warnings from programs like valgrind

The default for enable-g is most and 

most|yes)
perform_memtracing=yes
enable_append_g=yes
perform_meminit=yes
perform_dbgmutex=yes
perform_mutexnesting=yes
perform_handlealloc=yes
perform_handle=yes

So it appears that at least some releases of MPICH are supposed to be valgrind 
clean by default ;).

Looking back at Sajid's valgrind output more carefully 

Conditional jump or move depends on uninitialised value(s)
==15359==at 0x1331069A: __intel_sse4_strncmp (in 
/opt/intel/compilers_and_libraries_2019.1.144/linux/compiler/lib/intel64_lin/libintlc.so.5)

is the only valgrind error. Which I remember seeing from using Intel compilers 
for a long time, nothing to do with MPICH

Thus I conclude that Sajid's code is actually valgrind clean; and I withdraw my 
rant about MPICH/spack

Barry



> On Apr 16, 2019, at 5:13 PM, Smith, Barry F.  wrote:
> 
> 
>  So valgrind is printing all kinds of juicy information about uninitialized 
> values but it is all worthless because MPICH was not built by spack to be 
> valgrind clean. We can't know if any of the problems valgrind flags are real. 
> MPICH needs to be configured with the option --enable-g=meminit to be 
> valgrind clean. PETSc's --download-mpich always installs a valgrind clean 
> MPI. 
> 
> It is unfortunate Spack doesn't provide a variant of MPICH that is valgrind 
> clean; actually it should default to valgrind clean MPICH.
> 
>  Barry
> 
> 
> 
> 
>> On Apr 16, 2019, at 2:43 PM, Sajid Ali via petsc-users 
>>  wrote:
>> 
>> So, I tried running the debug version with valgrind to see if I can find the 
>> chunk size that's being set but I don't see it. Is there a better way to do 
>> it ?
>> 
>> `$ mpirun -np 32 valgrind ./ex_ms -prop_steps 1 -info &> out`. [The out file 
>> is attached.]
>> 
> 



Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users

   You can run with -start_in_debugger -debugger_nodes 0, and in that one 
debugger window put a breakpoint in the needed vector routine to print the 
chunk sizes at the appropriate line.
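A sketch of that session follows. The routine name VecView_MPI_HDF5 and the chunkDims variable are my best guesses at the relevant code (check src/vec/vec/impls/mpi/pdvec.c in your PETSc checkout for the exact names before setting the breakpoint); the executable and options match the run reported earlier in the thread.

```shell
# Launch with rank 0 under a debugger; the other ranks run normally.
mpirun -np 128 ./ex_ms -prop_steps 1 \
    -start_in_debugger -debugger_nodes 0

# In the gdb window that appears for rank 0:
#   (gdb) break VecView_MPI_HDF5     # guessed routine name
#   (gdb) continue
#   (gdb) print chunkDims[0]         # guessed variable name
#   (gdb) print chunkDims[1]
```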

  Barry


> On Apr 16, 2019, at 9:32 AM, Sajid Ali via petsc-users 
>  wrote:
> 
> Hi PETSc developers,
> 
> I’m trying to write a large vector created with VecCreateMPI (size 
> 32768x32768) concurrently from 4 nodes (+32 tasks per node, total 128 
> mpi-ranks) and I see the following (indicative) error : [Full error log is 
> here : https://file.io/CdjUfe] 
> 
> HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 52:
>   #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
> major: Dataset
> minor: Unable to initialize object
>   #001: H5Dint.c line 329 in H5D__create_named(): unable to create and link 
> to dataset
> major: Dataset
> minor: Unable to initialize object
>   #002: H5L.c line 1557 in H5L_link_object(): unable to create new link to 
> object
> major: Links
> minor: Unable to initialize object
>   #003: H5L.c line 1798 in H5L__create_real(): can't insert link
> major: Links
> minor: Unable to insert object
>   #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal 
> failed
> major: Symbol table
> HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 59:
>   #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
> major: Dataset
> minor: Unable to initialize object
>   #001: H5Dint.c line 329 in H5D__create_named(): unable to create and link to dataset
> major: Dataset
> minor: Unable to initialize object
>   #002: H5L.c line 1557 in H5L_link_object(): unable to create new link to object
> major: Links
> minor: Unable to initialize object
>   #003: H5L.c line 1798 in H5L__create_real(): can't insert link
> major: Links
> minor: Unable to insert object
>   #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
> major: Symbol table
> minor: Object not found
>   #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal operator failed
> major: Symbol table
> minor: Callback failed
>   #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
> major: Links
> minor: Unable to initialize object
>   #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
> major: Object header
> minor: Can't open object
>   #008: H5Doh.c line 300 in H5O__dset_create(): unable to create dataset
> minor: Object not found
>   #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal operator failed
> major: Symbol table
> minor: Callback failed
>   #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
> major: Links
> minor: Unable to initialize object
>   #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
> major: Object header
> minor: Can't open object
>   #008: H5Doh.c line 300 in H5O__dset_create(): unable to create dataset
> major: Dataset
> minor: Unable to initialize object
>   #009: H5Dint.c line 1274 in H5D__create(): 

Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Smith, Barry F. via petsc-users


  So valgrind is printing all kinds of juicy information about uninitialized 
values but it is all worthless because MPICH was not built by spack to be 
valgrind clean. We can't know if any of the problems valgrind flags are real. 
MPICH needs to be configured with the option --enable-g=meminit to be valgrind 
clean. PETSc's --download-mpich always installs a valgrind clean MPI. 

It is unfortunate Spack doesn't provide a variant of MPICH that is valgrind 
clean; actually it should default to valgrind clean MPICH.

  Barry




> On Apr 16, 2019, at 2:43 PM, Sajid Ali via petsc-users 
>  wrote:
> 
> So, I tried running the debug version with valgrind to see if I can find the 
> chunk size that's being set but I don't see it. Is there a better way to do 
> it ?
> 
> `$ mpirun -np 32 valgrind ./ex_ms -prop_steps 1 -info &> out`. [The out file 
> is attached.]
> 



Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Matthew Knepley via petsc-users
On Tue, Apr 16, 2019 at 3:44 PM Sajid Ali 
wrote:

> So, I tried running the debug version with valgrind to see if I can find
> the chunk size that's being set but I don't see it. Is there a better way
> to do it ?
>
> `$ mpirun -np 32 valgrind ./ex_ms -prop_steps 1 -info &> out`. [The out
> file is attached.]
>

valgrind does not do anything here. Just put a breakpoint in the code
section I shared and print chunksize.

  Thanks,

 Matt
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-16 Thread Sajid Ali via petsc-users
Hi Barry/Matt,

Since VecDuplicate calls v->ops->duplicate, can't we just add custom
duplicate ops to the (f_in/f_out/b_out) vectors when they are created via
MatCreateFFTW? (just like the custom destroy ops are defined)

Also, what is the PetscObjectStateIncrease function doing ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
Hi Matt,

I tried running the same example with a smaller grid on a workstation and I
see that for a grid size of 8192x8192 (vector write dims 67108864, 2), the
output file has a chunk size of (16777215, 2).

I can’t see HDF5_INT_MAX in the spack build-log (which includes configure).
Is there a better way to look it up?

[sajid@xrmlite .spack]$ cat build.out | grep "HDF"
#define PETSC_HAVE_HDF5 1
#define PETSC_HAVE_LIBHDF5HL_FORTRAN 1
#define PETSC_HAVE_LIBHDF5 1
#define PETSC_HAVE_LIBHDF5_HL 1
#define PETSC_HAVE_LIBHDF5_FORTRAN 1
#define PETSC_HAVE_HDF5_RELEASE_VERSION 5
#define PETSC_HAVE_HDF5_MINOR_VERSION 10
#define PETSC_HAVE_HDF5_MAJOR_VERSION 1
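One reason the grep above comes up empty: HDF5_INT_MAX is a compile-time cap on the PETSc side, not an HDF5 configure setting, so it would not appear in the build log at all. A hedged way to look it up directly in the source tree (assuming $PETSC_DIR points at the checkout; the file and identifier names below are best guesses, so the broad grep is the reliable one):

```shell
# Find the definition and any chunk-size logic that uses it.
grep -rn "HDF5_INT_MAX" $PETSC_DIR/src $PETSC_DIR/include

# The DMDA HDF5 writer (where the fix reportedly landed) is worth
# comparing against the plain-Vec path.
grep -rn "chunk" $PETSC_DIR/src/vec/vec/impls/mpi/pdvec.c
```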

Thank You,
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
Hi PETSc developers,

I’m trying to write a large vector created with VecCreateMPI (size
32768x32768) concurrently from 4 nodes (+32 tasks per node, total 128
mpi-ranks) and I see the following (indicative) error : [Full error log is
here : https://file.io/CdjUfe]

HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 52:
  #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
major: Dataset
minor: Unable to initialize object
  #001: H5Dint.c line 329 in H5D__create_named(): unable to create and
link to dataset
major: Dataset
minor: Unable to initialize object
  #002: H5L.c line 1557 in H5L_link_object(): unable to create new
link to object
major: Links
minor: Unable to initialize object
  #003: H5L.c line 1798 in H5L__create_real(): can't insert link
major: Links
minor: Unable to insert object
  #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
major: Symbol table
HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 59:
  #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
major: Dataset
minor: Unable to initialize object
  #001: H5Dint.c line 329 in H5D__create_named(): unable to create and
link to dataset
major: Dataset
minor: Unable to initialize object
  #002: H5L.c line 1557 in H5L_link_object(): unable to create new
link to object
major: Links
minor: Unable to initialize object
  #003: H5L.c line 1798 in H5L__create_real(): can't insert link
major: Links
minor: Unable to insert object
  #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path
traversal failed
major: Symbol table
minor: Object not found
  #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal
operator failed
major: Symbol table
minor: Callback failed
  #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
major: Links
minor: Unable to initialize object
  #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
major: Object header
minor: Can't open object
  #008: H5Doh.c line 300 in H5O__dset_create(): unable to create
dataset
minor: Object not found
  #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal
operator failed
major: Symbol table
minor: Callback failed
  #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
major: Links
minor: Unable to initialize object
  #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
major: Object header
minor: Can't open object
  #008: H5Doh.c line 300 in H5O__dset_create(): unable to create
dataset
major: Dataset
minor: Unable to initialize object
  #009: H5Dint.c line 1274 in H5D__create(): unable to construct
layout information
major: Dataset
minor: Unable to initialize object
  #010: H5Dchunk.c line 872 in H5D__chunk_construct(): unable to set chunk sizes
major: Dataset
minor: Bad value
  #011: H5Dchunk.c line 831 in H5D__chunk_set_sizes(): chunk size must be < 4GB
major: Dataset
minor: Unable to initialize object
major: Dataset
minor: Unable to initialize object
  #009: H5Dint.c line 1274 in H5D__create(): unable to construct
layout information
major: Dataset
minor: Unable to initialize object
  #010: H5Dchunk.c line 872 in H5D__chunk_construct(): unable to set chunk sizes
major: Dataset
minor: Bad value
  #011: H5Dchunk.c line 831 in H5D__chunk_set_sizes(): chunk size must be < 4GB
major: Dataset
minor: Unable to initialize object
...

I spoke to Barry last evening, who said that this is a known error that was
fixed for DMDA Vecs but is still broken for non-DMDA Vecs.

Could this be fixed?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Using -malloc_dump to examine memory leak

2019-04-16 Thread Yuyun Yang via petsc-users
Great, thank you for the advice!

Best regards,
Yuyun


From: Smith, Barry F. 
Sent: Tuesday, April 16, 2019 5:54:15 AM
To: Yuyun Yang
Cc: petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] Using -malloc_dump to examine memory leak


  Please try the flag -objects_dump; this tries to give a much more concise view 
of what objects have not been freed. For example, I commented
out the last VecDestroy() in src/snes/examples/tutorials/ex19.c and then 
obtained:

./ex19 -objects_dump
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
Number of SNES iterations = 2
The following objects were never freed
-
[0] DM da DM_0x8400_0
  [0]  DMDACreate2d() in /Users/barrysmith/Src/petsc/src/dm/impls/da/da2.c
  [0]  main() in 
/Users/barrysmith/Src/petsc/src/snes/examples/tutorials/ex19.c
[0] Vec seq Vec_0x8400_6
  [0]  DMCreateGlobalVector() in 
/Users/barrysmith/Src/petsc/src/dm/interface/dm.c
  [0]  main() in 
/Users/barrysmith/Src/petsc/src/snes/examples/tutorials/ex19.c

Now I just need to look at the calls to DMCreateGlobalVector and DMDACreate2d 
in main to see what I did not free.

Note that since PETSc objects may hold references to other PETSc objects, some 
objects you DID call destroy on may still appear in the list.
For example, because the unfreed vector holds a reference to the DM, the DM is 
listed as not freed. Once you properly destroy the vector you'll
note that the DM is no longer listed as not freed.

It can be a little overwhelming at first to figure out what objects have not 
been freed. We recommend setting the environment variable
export PETSC_OPTIONS=-malloc_test so that every run of your code reports memory 
issues and you can keep them under control from
the beginning (when the code is small and growing) rather than wait until the 
code is large and there are many unfreed objects to chase down.

   Good luck



   Barry


> On Apr 16, 2019, at 1:14 AM, Yuyun Yang via petsc-users 
>  wrote:
>
> Hello team,
>
> I’m trying to use the options -malloc_dump and -malloc_debug to examine 
> memory leaks. The messages however, are quite generic, and don’t really tell 
> me where the problems occur, for example:
>
> [ 0]1520 bytes VecCreate() line 35 in 
> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
>   [0]  PetscMallocA() line 35 in 
> /home/yyy910805/petsc/src/sys/memory/mal.c
>   [0]  VecCreate() line 30 in 
> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
>   [0]  VecDuplicate_Seq() line 804 in 
> /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c
>   [0]  VecDuplicate() line 375 in 
> /home/yyy910805/petsc/src/vec/vec/interface/vector.c
>
> The code is huge, so going through every single VecCreate/VecDuplicate and 
> VecDestroy is going to be time-consuming. Meanwhile, running valgrind gave me 
> some uninitialized values errors that don’t seem to be related to the above 
> message (or maybe they are?).
>
> How can I use this option to debug effectively?
>
> Thanks a lot,
> Yuyun



Re: [petsc-users] Using -malloc_dump to examine memory leak

2019-04-16 Thread Smith, Barry F. via petsc-users

  Please try the flag -objects_dump; this tries to give a much more concise view 
of what objects have not been freed. For example, I commented
out the last VecDestroy() in src/snes/examples/tutorials/ex19.c and then 
obtained:

./ex19 -objects_dump
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
Number of SNES iterations = 2
The following objects were never freed
-
[0] DM da DM_0x8400_0
  [0]  DMDACreate2d() in /Users/barrysmith/Src/petsc/src/dm/impls/da/da2.c
  [0]  main() in 
/Users/barrysmith/Src/petsc/src/snes/examples/tutorials/ex19.c
[0] Vec seq Vec_0x8400_6
  [0]  DMCreateGlobalVector() in 
/Users/barrysmith/Src/petsc/src/dm/interface/dm.c
  [0]  main() in 
/Users/barrysmith/Src/petsc/src/snes/examples/tutorials/ex19.c

Now I just need to look at the calls to DMCreateGlobalVector and DMDACreate2d 
in main to see what I did not free. 

Note that since PETSc objects may hold references to other PETSc objects, some 
objects you DID call destroy on may still appear in the list.
For example, because the unfreed vector holds a reference to the DM, the DM is 
listed as not freed. Once you properly destroy the vector you'll 
note that the DM is no longer listed as not freed.

It can be a little overwhelming at first to figure out what objects have not 
been freed. We recommend setting the environment variable
export PETSC_OPTIONS=-malloc_test so that every run of your code reports memory 
issues and you can keep them under control from
the beginning (when the code is small and growing) rather than wait until the 
code is large and there are many unfreed objects to chase down.
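Concretely, that setup is one line in a shell startup file; the program names below are placeholders for whatever you are building.

```shell
# Put this in ~/.bashrc (or equivalent) so every PETSc run checks for
# unfreed memory from day one.
export PETSC_OPTIONS=-malloc_test

./myapp                  # hypothetical program; now reports leaks on exit
./ex19 -objects_dump     # per-run options still combine as usual
```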

   Good luck



   Barry


> On Apr 16, 2019, at 1:14 AM, Yuyun Yang via petsc-users 
>  wrote:
> 
> Hello team,
>  
> I’m trying to use the options -malloc_dump and -malloc_debug to examine 
> memory leaks. The messages however, are quite generic, and don’t really tell 
> me where the problems occur, for example:
>  
> [ 0]1520 bytes VecCreate() line 35 in 
> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
>   [0]  PetscMallocA() line 35 in 
> /home/yyy910805/petsc/src/sys/memory/mal.c
>   [0]  VecCreate() line 30 in 
> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
>   [0]  VecDuplicate_Seq() line 804 in 
> /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c
>   [0]  VecDuplicate() line 375 in 
> /home/yyy910805/petsc/src/vec/vec/interface/vector.c
>  
> The code is huge, so going through every single VecCreate/VecDuplicate and 
> VecDestroy is going to be time-consuming. Meanwhile, running valgrind gave me 
> some uninitialized values errors that don’t seem to be related to the above 
> message (or maybe they are?).
>  
> How can I use this option to debug effectively?
>  
> Thanks a lot,
> Yuyun



Re: [petsc-users] Using -malloc_dump to examine memory leak

2019-04-16 Thread Mark Adams via petsc-users
Use valgrind with --leak-check=yes

This should give a stack trace at the end of the run.
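A typical invocation, with the program name as a placeholder: --leak-check=full prints a stack trace for each leaked block, and --track-origins=yes helps with the uninitialized-value warnings mentioned elsewhere in this thread, at some extra runtime cost.

```shell
# Run the application under Memcheck with full leak reporting.
valgrind --leak-check=full --track-origins=yes ./myapp -malloc_dump
```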

On Tue, Apr 16, 2019 at 2:14 AM Yuyun Yang via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Hello team,
>
>
>
> I’m trying to use the options -malloc_dump and -malloc_debug to examine
> memory leaks. The messages however, are quite generic, and don’t really
> tell me where the problems occur, for example:
>
>
>
> [ 0]1520 bytes VecCreate() line 35 in
> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
>
>   [0]  PetscMallocA() line 35 in
> /home/yyy910805/petsc/src/sys/memory/mal.c
>
>   [0]  VecCreate() line 30 in
> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
>
>   [0]  VecDuplicate_Seq() line 804 in
> /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c
>
>   [0]  VecDuplicate() line 375 in
> /home/yyy910805/petsc/src/vec/vec/interface/vector.c
>
>
>
> The code is huge, so going through every single VecCreate/VecDuplicate and
> VecDestroy is going to be time-consuming. Meanwhile, running valgrind gave
> me some uninitialized values errors that don’t seem to be related to the
> above message (or maybe they are?).
>
>
>
> How can I use this option to debug effectively?
>
>
>
> Thanks a lot,
>
> Yuyun
>


[petsc-users] Using -malloc_dump to examine memory leak

2019-04-16 Thread Yuyun Yang via petsc-users
Hello team,

I'm trying to use the options -malloc_dump and -malloc_debug to examine memory 
leaks. The messages however, are quite generic, and don't really tell me where 
the problems occur, for example:

[ 0]1520 bytes VecCreate() line 35 in 
/home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
  [0]  PetscMallocA() line 35 in /home/yyy910805/petsc/src/sys/memory/mal.c
  [0]  VecCreate() line 30 in 
/home/yyy910805/petsc/src/vec/vec/interface/veccreate.c
  [0]  VecDuplicate_Seq() line 804 in 
/home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c
  [0]  VecDuplicate() line 375 in 
/home/yyy910805/petsc/src/vec/vec/interface/vector.c

The code is huge, so going through every single VecCreate/VecDuplicate and 
VecDestroy is going to be time-consuming. Meanwhile, running valgrind gave me 
some uninitialized values errors that don't seem to be related to the above 
message (or maybe they are?).

How can I use this option to debug effectively?

Thanks a lot,
Yuyun