still observing similar crashes.
I am attaching the trace of the latest crash (with PETSc-3.20.0) for
reference.
Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io
2_gpu_crash
exit from
PetscCallAbort).
Is this usage pattern not recommended? Should I be manually checking for
success of the `function_returning_petsc_error_code` and throw instead of
relying on PetscCallAbort?
Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning
Via a checkpoint in `PetscOptionsCheckInitial_Private`, I can confirm that
`checkstack` is set to `PETSC_TRUE` and this leads to no (additional)
information about erroneous stack handling.
Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi
Hi Matt,
This is a trace from the same crash, but with `-checkstack` included in
.petscrc : https://gist.github.com/s-sajid-ali/455b3982d47a31bff9e7ee211dd43991
I don't see any additional information regarding the possible cause.
Thank You,
Sajid Ali (he/him) | Research Associate
Data
Hi Matt,
Adding `-checkstack` does not prevent the crash, either on my laptop or on the
cluster.
What does prevent the crash (on my laptop at least) is changing
`PETSCSTACKSIZE` from 64 to 256 here :
https://github.com/petsc/petsc/blob/main/include/petscerror.h#L1153
Thank You,
Sajid Ali (he
). The crash finally
occurs on the 43rd call to KSP_solve.
Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io
From: Saj
Hi Barry,
Thanks a lot for fixing this issue. I ran the same problem on a linux machine
and have the following trace for the same crash (with ASAN turned on for both
PETSc (on the latest commit of the branch) and the application) :
https://gist.github.com/s-sajid-ali
I’ve also printed out the head struct in the debugger, and it looks like this:
(lldb) print (TRSPACE)*head
(TRSPACE) $7 = {
  size = 16
  rsize = 16
  id = 12063
  lineno = 217
  filename = 0x0001167fd865 "/Users/sasyed/Documents/packages/petsc/src/sys/dll/reg.c"
  functionname =
. Logs for this configuration and the error
trace with this build are attached with this email.
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github
The configuration log is attached with this email.
configure_log_tail
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io
ksp_crash_log
red with CUDA. With this,
I'm seeing that each call to PCApply/MatSolve involves one GPU->CPU
transfer. Is it possible to avoid this?
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io
r events a bit.
>
> Barry
>
>
> On Oct 5, 2022, at 4:47 PM, Sajid Ali
> wrote:
>
> Hi PETSc-developers,
>
> I'm having trouble with getting performance logs from an application that
> uses PETSc. There are no issues when I run it on a CPU, but every time a
> GP
to PETSC_VIEWER_STDOUT_WORLD :
https://github.com/fnalacceleratormodeling/synergia2/blob/devel3/src/synergia/utils/utils.h
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io
log-gpu
log-cpu
,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io
From: Matthew Knepley
Sent: Thursday, March 17, 2022 7:25 PM
To: Mark Adams
Cc: Sajid Ali Syed ;
Hi George,
That clarifies all my questions. Thanks for taking the time to answer them!
ould use GPU-aware MPI to populate
off-process values) ?
If this is not currently supported, is supporting this on the roadmap? Thanks
in advance!
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://
Hi George,
Thanks a lot for the confirmation!
When one uses the deprecated `--bind-to-core` option, is OpenMPI silently
ignoring this on OS X? Would this be indicated with increased verbosity
when using mpiexec?
Thank You,
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern
whether OpenMPI supports process binding on OS X and
also comment on why --bind-to-core works but --bind-to core doesn’t? Thanks
in advance!
Thank You,
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
Hi Mark,
Thanks for the information.
@Junchao: Given that there are known issues with GPU aware MPI, it might be
best to wait until there is an updated version of cray-mpich (which hopefully
contains the relevant fixes).
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing
Solvers Available In PETSc (PETSc v3.16.2-540-g1213a6437a documentation):
https://petsc.org/main/overview/linear_solve_table/
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi Na
; but I do not like the approach of having a second matrix as temporary
> > storage space. Are there more efficient approaches possible using
> > PETSc-functions?
> >
> > Thanks!
> >
> > Regards,
> >
> > Roland Richter
> >
--
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
Hi everybody,
Thank you very much for allowing me to join the group. I’m Sajid from the
United Kingdom. I do have an interest in recording with audio. I hope this
question is not too long-winded for you. I’m very keen to use my phone to
capture audio in conjunction with my computer. Basically I
to a natural
ordered slice by using an AO associated with a temporary 2D DMDA object
that lives only on the subset of ranks where the slice vector lives). I've
attached the code for the same should it be of interest to anyone who reads
this.
--
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern
shown below)
|-------|
| A | C |
| B | D |
|-------|
Thank You,
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
ex53.c
of ompi_info are available here (https://we.tl/t-CaiOt7OefS), should it be of
any help.
Thank You,
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
g preconditioner options database keys and vice versa for clarity.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
is optional).
Thanks for the insight into the cause of this bug.
--
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
*mat->ops->getrowmax)(mat,v,idx);CHKERRQ(ierr);
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
0x760ca248 in MatGetRowMax_SeqAIJ (A=0x8d1450, v=0x9b1220,
idx=0x99a620) at
/home/sajid/packages/petsc/src/mat/impls/aij/seq/aij.c:3195
3195 x[i] = *aa; if (idx) id
the IS ?)
I've seen on earlier threads that XDMF can be used to create a map of where
data is present in hdf5 files, is there an example for doing this with
regular vectors to select subvectors as described above ?
Also, is it possible to have different sub-comms read different hdf5 groups
?
Thank You,
Sajid
(cudavec, ) )
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
should’ve created valid memory on the host.
Could someone explain if this is the correct approach to take and what the
above error means ?
(PS : I’ve run ksp tutorial-ex2 with -vec_type cuda -mat_type aijcusparse
to test the installation and everything works as expected.)
Thank You,
Sajid Ali | PhD
en
running with 1 mpi rank). Is this the correct approach to take and if yes
what does "number of local columns" mean when combining the seqaij matrices
?
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
Hi Barry,
All entries for a row are available together, but there is no requirement
to compute them in order.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
ell, implementing a matrix free method is
something we might pursue in the future.
--
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
->ops->replacearray = NULL; at line 1286
in the rvector.c file if one of you can confirm that the above logic is
correct. The example attached in the last email could be used as a test for
the same if necessary.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s
the subvector
data back to the parent vector regardless of how the subvector is modified.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
ex_subvecio.c
`testload.h5` to be a vector of size 100 with the first 50 being 1 and the
rest being the input values.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
testinput.h5
ex2.c
the
interpolations at each time step.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
differentiation (b) Multigrid solvers being
fast/optimal (c) PDE models being more accurate on downsampled data.
PS : @Alp : Could you share the slides/manuscript from the SIAM PP20
meeting that describes the new multi-objective minimization features in TAO
?
Thank You,
Sajid Ali | PhD Candidate
Applied
le pointer types
and the example crashes at runtime)
Is there any plan to support TAO with complex scalars ?
I had planned to re-use the TS object in an optimization loop with the F
vector defined both as a parameter in TS and as the independent variable in
the outer TAO loop.
Thank You,
Sajid
would
mean a loss of well conditioned forward solve (and increase in solving time
itself), I was wondering if it would be better to keep the complex PDE
formulation and write an optimization loop in PETSc while defining the
regularizer via a cost integrand.
Thank You,
Sajid Ali | PhD Candidate
Applied
as to what I could do to make the real formulation well
conditioned ? Or should I not bother with this for now and implement a
first order gradient descent method in PETSc (while approximating the
regularizer as a cost integrand) ?
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern
=/home/path2"
fi
```
Note: This was tested on my laptop using a set of docker containers using
this configuration : https://github.com/SciDAS/slurm-in-docker .
--
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
starts to run, (specifically `SLURM_JOB_NODELIST` and
`MODULEPATH`), should I do so via getenv (in allocator context for
`srun`/`sbatch`) or via the `spank_job_control_getenv/setenv` functions in
the PrologSlurmctld stage ?
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
? I've already
added a citation for the Comput. Mech. (2007) 39: 497–507 as a reference
for the general idea of applying agglomeration type multigrid
preconditioning to Helmholtz operators.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
late that ? Setting the initialization for
adjoint vector with respect to initial conditions to be NULL in
TSSetCostGradients doesn't work.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
nodelist to set the appropriate modulepath.
PS: I'm trying to do the same thing with spack!
--
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
that it will be a relatively simple task
to add a small script to determine nodelist and prepend the modulepath env
var.
Alternatively, if someone could point out how they do this at their sites
it would be useful as well.
Thanks in advance for the advice!
--
Sajid Ali | PhD Candidate
Applied
TAO would take care of the
constraints) or do I also have to take the constraints into account (since
I'd also have to differentiate the regularizers) ?
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
he correct initialization
for the adjoint variables when calling TSSetCostGradients. The
initialization for mu vector is given to be dΦi/dp at t=tF. If p is
time dependent, does one evaluate this derivative with respect to p(t) at
t=tF ?
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
No
numpy arrays to PETSc matrices via
petsc4py, I’d switch to that instead of directly creating hdf5 files.
Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io
. If it's a complicated fix, this issue can serve as a note to
future users.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io
ex10.c
Hi Barry,
Looking at the current code, am I right in assuming that the change is only
in naming conventions and not in logic? I'll make an MR soon.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io
t)0));
size_t size = sizeof(hsize_t);
printf("size = %zu\n", size);
}
(ipy3) [sajid@xrmlite misc]$ h5cc ex.c
(ipy3) [sajid@xrmlite misc]$ ./a.out
ref=18446744073709551615
size = 8
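The `ref` value printed above is what an all-ones bit pattern looks like in an 8-byte unsigned type, which agrees with `size = 8`. A quick Python check of that arithmetic follows; it is only a sketch, `max_unsigned` is a hypothetical helper, and the truncated C program above is not reproduced here.

```python
import struct


def max_unsigned(nbytes):
    """Largest value representable by an unsigned integer of nbytes bytes."""
    return (1 << (8 * nbytes)) - 1


# An all-ones 8-byte pattern decoded as a little-endian unsigned 64-bit integer.
all_ones = struct.unpack("<Q", b"\xff" * 8)[0]

print(max_unsigned(8))              # same digits as the `ref=` line above
print(all_ones == max_unsigned(8))  # True
```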
Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io
Also, both versions of PETSc were built with ^hdf5@1.10.5 ^mpich@3.3
%gcc@8.3.0 so the error is most likely not from hdf5.
ks like the chunking logic somehow got broken in
3.12.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io
log
Hi PETSc-developers,
Has this bug been fixed in the new 3.12 release ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io
with
this, is there a way to tell PETSc to pick fftw functions from MKL ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
piling...configure: error: in
`/gpfs/mira-home/sajid/packages/petsc/complex_int_64_fftw_debug/externalpackages/fftw-3.3.8':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
***
```
Thank You,
Sajid Ali
Applied Physics
Northwestern University
attached the
backtrace which shows that there is some error with mpi-fftw being called.
I also attach the output with -start_in_debugger command option.
What could possibly cause this error and how do I fix it ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
sajid@thetamom1:/gpfs/mira
options to ensure that
it converges in one run, but since there's no guarantee that what works for
one problem will work for another (or for the same problem at a different
grid size), I'll stick with GMRES+gamg for now.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
,
Sajid Ali
Applied Physics
Northwestern University
Hi Hong,
The solution has the right characteristics but it's off by many orders of
magnitude. It is ~3.5x faster than before.
Am I supposed to keep the TSRHSJacobianSetReuse function or not?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
using
TSComputeRHSFunctionLinear()" and now I'm even more confused.
PS : Doing the simple switch is as slow as the original code and the answer
is wrong as well.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
Hi Barry,
Thanks a lot for pointing this out. I'm seeing ~3X speedup in time !
Attached are the new log files. Does everything look right ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
out_50
out_100
, the time spent in TSJacobianEval also increases
with decreasing time-step (or increasing number of steps).
For reference, I attach the log files for two cases which were run with
different time steps and the source code.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
ex_dmda.c
(hopefully
making it more efficient). Does calling vecscatter on each rank with the
local index set take care of the necessary communication behind the scenes
then?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
ex_modify.c
this with VecScatter and two index sets, one shifted and one un-shifted.
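For readers unfamiliar with the pattern: a forward scatter with insert semantics copies x[ix[i]] into y[iy[i]]. The plain-Python sketch below mimics that with one un-shifted and one shifted index list. The function names, the shift amount, and the periodic wrap-around are assumptions made for illustration and are not taken from the attached code.

```python
def scatter_insert(x, ix, y, iy):
    """Mimic an insert-mode forward scatter: y[iy[i]] = x[ix[i]]."""
    if len(ix) != len(iy):
        raise ValueError("index sets must have the same length")
    for src, dst in zip(ix, iy):
        y[dst] = x[src]
    return y


def shifted_copy(x, shift):
    """Copy x into a new list with every entry moved forward by `shift` slots."""
    n = len(x)
    unshifted = list(range(n))                       # source index set
    shifted = [(i + shift) % n for i in unshifted]   # destination index set (assumed periodic)
    return scatter_insert(x, unshifted, [0] * n, shifted)


print(shifted_copy([10, 20, 30, 40], 1))  # [40, 10, 20, 30]
```

In PETSc itself the two lists would be IS objects handed to VecScatterCreate together with the source and destination vectors.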
Thank You,
Sajid Ali
Applied Physics
Northwestern University
This can be tracked down to n vs N being used. The vector in the second
loop is created using N while MatCreateVecsFFTW uses n (for real numbers).
n!=N and hence the error.
If the lines 50/51 and line 91 are switched to MatCreateVecsFFTW instead of
MatGetVecs and VecCreateSeq respectively, the
Hi Barry,
I'm not sure why MatCreateVecsFFTW is not used at line 50/51.
The error occurs at line 94 because in the second loop, the example
manually creates the x vector instead of the one created using the A
matrix. For complex numbers this is not an issue but for real numbers the
dimensions
Apologies for the post. I didn't see that it was for complex vectors only.
On Mon, Apr 22, 2019 at 5:00 PM Sajid Ali
wrote:
> Hi,
>
> I see that src/mat/examples/tests/ex112.c is failing for petsc@3.11.1
> configured without complex scalars. With complex scalars everything
```
I came across this because I saw that MatMult was failing for a new test
related to a PR I was working on. Is this a bug ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
add a VecDuplicate/VecDestroy pair in the tests.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
>Perhaps if spack had an easier mechanism to allow the user to "point to"
local git clones it could get closer to the best of both worlds. Maybe
spack could support a list of local repositories and branches in the yaml
file.
I wonder if a local git clone of petsc can become a "mirror" for petsc
> develop > 3.11.99 > 3.10.xx > maint (or other strings)
Just discovered this issue when trying to build with my fork of spack at [1
<https://github.com/s-sajid-ali/spack/commit/05e499571b428f37b8cd1c7d39013e3dec08e5c8>].
So, ideally each developer has to have their develop p
@Barry: Thanks for the bugfix!
@Satish: Thanks for pointing out this method!
My preferred way previously was to download the source code, unzip, edit,
zip. Now ask spack to not checksum (because my edit has changed stuff) and
build. Lately, spack has added git support and now I create a branch
oblems valgrind
> flags are real. MPICH needs to be configured with the option
> --enable-g=meminit to be valgrind clean. PETSc's --download-mpich always
> installs a valgrind clean MPI.
> >
> > It is unfortunate Spack doesn't provide a variant of MPICH that is
> valgrind
nk You,
Sajid Ali
Applied Physics
Northwestern University
5
#define PETSC_HAVE_HDF5_MINOR_VERSION 10
#define PETSC_HAVE_HDF5_MAJOR_VERSION 1
Thank You,
Sajid Ali
Applied Physics
Northwestern University
error that was
fixed for DMDA vecs but is broken for non-dmda vecs.
Could this be fixed ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
igns to PETSC_MEMALIGN ?)
PS: I've only ever used FFTW via the python interface (and automated the
build, but couldn't automate testing of pyfftw-mpi since cython coverage
reporting is confusing).
Thank You,
Sajid Ali
Applied Physics
Northwestern University
Thanks for the temporary fix.
(PS: I was wondering if it would be trivial to just extend the code to have
four mallocs and create a new function but it looks like the logic is much
more complicated.)
,
Sajid Ali
Applied Physics
Northwestern University
ex_modify.c
enough to point it out to me. The issue is at line 87/88.
With 87, the program crashes, with 88 it works fine.
Thanks in advance for the help!
--
Sajid Ali
Applied Physics
Northwestern University
ex_ms.c
Hi Balay,
Confirming that the spack variant works. Thanks for adding it.
mit/19eeeca592f63413698f23dd02b9961f22581803
Thank You,
Sajid Ali
Applied Physics
Northwestern University
One last question I have is : Does PETSc automatically choose a good chunk
size for the size of the vector it has and use it to write the dataset ? Or
is this something I shouldn't really worry about (not that it affects me
now but it would be good to not have a slow read from a python script for
?
Because I see that if I write to hdf5 from a complex vector created using
DMDA, I get a vector that has dimensions (dim_x,dim_y,2) but before I saw
the dimension of the same to be (dim_x*dim_y,2).
--
Sajid Ali
Applied Physics
Northwestern University
n an error that depends on number of
mpi processes.
I'm attaching the code (which works if the matrix is created without using
the DA, i.e. comment out line 159 and uncomment 161/162; I'm doing this on
a small grid to catch errors).
Thanks in advance for the help.
--
Sajid Ali
Applied Physics
No
vector, another 50 GB. For the matrix I need ~250 GB and some overhead for
the solver.
How do I estimate this overhead (and estimate how many nodes I would need
to run this given the maximum memory per node (as specified by slurm's
--mem option)) ?
Thanks in advance for the help!
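As a rough way to frame the node-count question, the arithmetic can be sketched as below. The 25% solver overhead and the 128 GB per node are made-up placeholders, not measured values, and the list of object sizes assumes two 50 GB vectors plus the ~250 GB matrix mentioned above; `nodes_needed` is a hypothetical helper.

```python
import math


def nodes_needed(object_sizes_gb, overhead_fraction, mem_per_node_gb):
    """Smallest node count whose combined memory holds all objects plus overhead."""
    if mem_per_node_gb <= 0:
        raise ValueError("memory per node must be positive")
    total_gb = sum(object_sizes_gb) * (1.0 + overhead_fraction)
    return math.ceil(total_gb / mem_per_node_gb)


sizes_gb = [50, 50, 250]   # two vectors and the matrix (assumed breakdown)
overhead = 0.25            # hypothetical solver overhead, to be replaced by a measurement
per_node_gb = 128          # hypothetical value passed to slurm's --mem

print(nodes_needed(sizes_gb, overhead, per_node_gb))  # 4
```

The overhead fraction is the unknown here; running a smaller problem and comparing measured memory against the size of the stored objects is one way to calibrate it.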
--
Sajid Ali
?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
detailed analysis via Intel Vtune if
needed.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
submit_script
intel_aps_report
knl_petsc
://software.intel.com/en-us/mpi-developer-guide-linux-fabrics-control
Thank You,
Sajid Ali
Applied Physics
Northwestern University
step for job 916208: More processors
requested than permitted
I’m following the advice as given at slide 33 of
https://www.nersc.gov/assets/Uploads/02-using-cori-knl-nodes-20170609.pdf
For further info, I’m using LCRC at ANL.
Thank You,
Sajid Ali
Applied Physics
Northwestern University
Hi,
The links to the Jan 2019 presentations at
https://www.mcs.anl.gov/petsc/documentation/tutorials/index.html are
broken. Could these be fixed ?
Thank You,
Sajid Ali
Applied Physics
Northwestern University
The vector is essentially snapshots in time of a data array. I should
probably store this as a 2D dense matrix of dimensions (dim_x*dim_y) *
dim_z. Now I can pick one column at a time and use it for my TS Jacobian.
Apologies for being a little unclear.
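To make the layout concrete, here is a minimal sketch of how one snapshot maps to one column, assuming the dense matrix is stored column by column and the long vector is ordered snapshot after snapshot. Both are assumptions, and `column_bounds` and `snapshot` are hypothetical helpers, not PETSc calls.

```python
def column_bounds(dim_x, dim_y, k):
    """Start and stop offsets of column k in a column-major (dim_x*dim_y) x dim_z array."""
    rows = dim_x * dim_y
    return k * rows, (k + 1) * rows


def snapshot(flat, dim_x, dim_y, dim_z, k):
    """Return the k-th time snapshot (one matrix column) from the flat data."""
    if len(flat) != dim_x * dim_y * dim_z:
        raise ValueError("flat data does not match the given dimensions")
    if not 0 <= k < dim_z:
        raise IndexError("snapshot index out of range")
    start, stop = column_bounds(dim_x, dim_y, k)
    return flat[start:stop]


flat = list(range(2 * 3 * 4))        # dim_x=2, dim_y=3, dim_z=4
print(snapshot(flat, 2, 3, 4, 2))    # [12, 13, 14, 15, 16, 17]
```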
--
Sajid Ali
Applied Physics
Northwestern
I think I understand what's happening here, when I look at a data file I
create similar to the aforementioned example, I see a complex=1 attribute
that I'm missing when I make my hdf5 file.
Column 1 contains the real value and column 2 contains the imaginary value,
correct?
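The layout being asked about (real part in the first column, imaginary part in the second) can be sketched in plain Python as below. This only illustrates the assumed dim x 2 arrangement; it does not touch HDF5 or the `complex=1` attribute, and `to_pairs` and `from_pairs` are hypothetical helpers.

```python
def to_pairs(values):
    """Turn a list of complex numbers into rows of [real, imaginary]."""
    return [[z.real, z.imag] for z in values]


def from_pairs(pairs):
    """Rebuild complex numbers from rows of [real, imaginary]."""
    return [complex(re, im) for re, im in pairs]


data = [1 + 2j, 3.5 - 1j, 0j]
pairs = to_pairs(data)

print(pairs)                      # three rows, two columns each
print(from_pairs(pairs) == data)  # True: the round trip is lossless
```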
I did that last time as well (and opened it using h5py just to be sure that
the shape is indeed dim x 2 and the datatype is f8), yet I get the error.
The error comes from these lines in PETSc :
#if