…that KSPSolve time is not part of the line search time).
Barry
> On Feb 8, 2023, at 7:56 AM, Matteo Semplice wrote:
Dear all,
I am trying to optimize the nonlinear solvers in a code of mine,
but I am having a hard time at interpreting the profiling data from the
SNES. In particular, if I run with -snesCorr_snes_lag_jacobian 5
-snesCorr_snes_linesearch_monitor -snesCorr_snes_monitor
ctionReturnVoid\\(\\)'
Hope this helps.
Satish
On Mon, 21 Feb 2022, Shourong Hao wrote:
Hello, I'm using PETSc for a finite element computation. Now I want to use TAU (https://www.cs.uoregon.edu/research/tau/home.php) for performance profiling after I used the -log_view option. However, I encountered many errors in the compiling stage, and there is little inform…
…profile numbers unreliable.
--Richard
> …that I compiled PETSc with debugging = 0 could cause the profiler numbers to be
> unreliable?
>
> Thanks in advance
> Pietro Incardona
>
>
> From: Barry Smith
> Sent: Thursday, June 15, 2017 23:16:50
> To: Pietro Incardona
> Cc: petsc-users@mcs.anl.gov
>
23:15:04
To: Pietro Incardona; petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] PETSC profiling on 1536 cores
Using no preconditioner is a bad bad idea and anyone with the gall to do this
deserves to be spanked.
For the Poisson equation, why not use PETSc's native algebraic multigrid solver…
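For concreteness, a hedged sketch of what that suggestion might look like on the command line. The executable name is a placeholder; the options are PETSc's standard -ksp_type/-pc_type switches:

```shell
# CG with PETSc's native algebraic multigrid (GAMG) as preconditioner,
# instead of CG with no preconditioner at all.
mpiexec -n 1536 ./my_app -ksp_type cg -pc_type gamg -ksp_monitor -log_view
```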
…equation.
>
> I am using Conjugate gradient (the matrix is symmetric) with no
> preconditioner. Visualizing the solution is reasonable.
> Unfortunately the Conjugate-Gradient does not scale at all and I am extremely
> concerned about this problem, in particular about the profiling numbers.
…Conjugate-Gradient does not scale at all and I am extremely
concerned about this problem, in particular about the profiling numbers.
Looking at the profiler it seems that
1536 cores = 24 cores x 64
VecScatterBegin 348 1.0 2.3975e-01 1.8 0.00e+00 0.0 7.7e+06 3.1e+04
0.0e+00 0 0 85 99 0 0 0 85 99
Thanks Barry for the reply. When I use the option -log_view, I got the
PETSc API function profiling posted as below. But I wonder whether there is any
way for me to get the performance/execution profiling of the functions in the
code, like in Xcode profiling, so that I would know which part needs special
optimization? I now badly nee…
…running slowly, and that gives a good idea of what needs to be optimized.
Barry
> On Oct 19, 2016, at 3:15 PM, Sharp Stone wrote:
Dear all,
Now I'm using a PETSc code which needs to be optimized. But after trying, I
still don't know how to get the profiling for each function in each process.
I mean, for each process, how should I know the execution time of each
function?
Thanks!
--
Best regards,
Feng
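One standard answer (a sketch, not from this thread; "MyFunction" is an illustrative name) is to register a PETSc log event around the code of interest; -log_view then reports max/ratio/average times for that event across all processes:

```c
#include <petscsys.h>

static PetscLogEvent USER_EVENT; /* custom event that shows up in -log_view */

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  /* Register once; the string is the name printed in the -log_view table */
  ierr = PetscLogEventRegister("MyFunction", 0, &USER_EVENT);CHKERRQ(ierr);

  ierr = PetscLogEventBegin(USER_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);
  /* ... the function or code section to be timed goes here ... */
  ierr = PetscLogEventEnd(USER_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);

  ierr = PetscFinalize(); /* run with -log_view to see the event row */
  return ierr;
}
```

Requires linking against PETSc; the per-process spread shows up in the Max/Ratio columns of the event's -log_view row.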
On Tue, Aug 4, 2015 at 6:33 PM, Barry Smith wrote:
> On Aug 4, 2015, at 6:20 PM, Patrick Sanan wrote:
And note that it is possible to run gdb/lldb on each of several MPI processes,
useful when you hit a bug that only appears in parallel. For example, this FAQ
describes a couple of ways to do this:
https://www.open-mpi.org/faq/?category=debugging#serial-debuggers
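As a sketch of the two common recipes (Open MPI-style mpiexec and an X display assumed; "./my_app" is a placeholder):

```shell
# 1) One xterm running gdb per MPI rank:
mpiexec -n 4 xterm -e gdb --args ./my_app -ksp_type cg

# 2) Let PETSc itself launch a debugger on each rank:
mpiexec -n 4 ./my_app -start_in_debugger
# (recent PETSc also accepts -debugger_ranks 0,2 to attach to selected ranks)
```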
> On 05.08.2015 at 00:36, … wrote:
Correction, even in parallel you should be able to use a 0 for the viewer for
calls to KSPView() etc; just make sure you do the same call on each process
that shares the object.
To change the viewer format you do need to use
PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD),
P
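A sketch of the pattern being described. PetscViewerSetFormat was the call at the time; newer PETSc releases use PetscViewerPushFormat/PetscViewerPopFormat, which is what is shown here:

```c
#include <petscksp.h>

/* View a KSP from every process that shares it, using the communicator's
   shared stdout viewer; make the same call on each such process. */
PetscErrorCode ViewSolverDetails(KSP ksp)
{
  PetscErrorCode ierr;
  PetscViewer    viewer = PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD);

  ierr = PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_INFO_DETAIL);CHKERRQ(ierr);
  ierr = KSPView(ksp, viewer);CHKERRQ(ierr);
  ierr = PetscViewerPopFormat(viewer);CHKERRQ(ierr);
  return 0;
}
```

Requires linking against PETSc; the function name is illustrative.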
I do this by running in the debugger and putting in breakpoints. At the
breakpoint you can look directly at variables like the n in call to VecMDot()
you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it
will print out the information about the object right then. Cal
Justin Chang writes:
Hi all,
Not sure what to title this mail, but let me begin with an analogy of what
I am looking for:
In MATLAB, we could insert breakpoints into the code, such that when we run
the program, we could pause the execution and see what the variables
contain and what is going on exactly within your fu
On Tue, Jun 21, 2011 at 07:34, khalid ashraf wrote:
Hi Jed,
In order to find where the extra time is consumed, I started from
ksp/ksp/example/tutorials/ex22.c and changed it one line at a time. I found
that the time is consumed in the call:
ierr = MatSetValuesStencil(B,7,row,7,col,&val[0][0],INSERT_VALUES);CHKERRQ(ierr);
The same with the…
On Thu, Jun 16, 2011 at 11:14, khalid ashraf wrote:
> When I look at the breakdown of the time required by the stages, the totals
> add up to ~7s; however, the main stage time is ~350s.
Two possibilities:
1. The time is not in PETSc.
2. The matrix is not preallocated correctly.
http://www.mcs.anl.gov
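Possibility 2 can be ruled out by preallocating before assembly. A sketch for an AIJ matrix with a 7-point stencil (the off-diagonal estimate of 3 is an illustrative guess; codes built on DMDA, like ex22.c, get this preallocation automatically):

```c
#include <petscmat.h>

/* Preallocate an AIJ matrix so that MatSetValues()/MatSetValuesStencil()
   trigger no mallocs during assembly (check with -info or -log_view). */
PetscErrorCode CreatePreallocatedMatrix(MPI_Comm comm, PetscInt nlocal, PetscInt N, Mat *A)
{
  PetscErrorCode ierr;

  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, nlocal, nlocal, N, N);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
  /* 7 nonzeros per row in the diagonal block (7-point stencil),
     up to 3 per row in the off-process block (illustrative) */
  ierr = MatSeqAIJSetPreallocation(*A, 7, NULL);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(*A, 7, NULL, 3, NULL);CHKERRQ(ierr);
  return 0;
}
```

Requires linking against PETSc; only the preallocation call matching the matrix type takes effect.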
Hi,
I am trying to benchmark the performance of my code on 8 processors and am
trying to find where most of the time is used. When I look at the breakdown of
the time required by the stages, the totals add up to ~7s; however, the main
stage time is ~350s. I am not able to find out the stage which i…
On Tue, Sep 21, 2010 at 13:49, Leo van Kampenhout
wrote:
This is really an MPI thing, you are using a different
P
Thanks for the helpful response Jed. I was not aware of the possibility to
run separate PETSC_COMM_WORLDs in the same program; at least this is not
clear from the documentation (e.g.
http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Sys/PetscInitialize.html)
I'll probably
On Tue, Sep 21, 2010 at 10:41, Leo van Kampenhout
wrote:
> At the cluster I am currently working on, each node is made up of 12 PEs with
> shared memory. When I would just reserve 1 PE for my job, the other 11
> processors are given to other users, therefore giving dynamic load on the
> memory
Dear all,
in order to calculate speedup (Sp = T1/Tp) I need an accurate measurement of
T1, the time to solve on 1 processor. I will be using the parallel algorithm
for that, but there seems to be a hiccup.
At the cluster I am currently working on, each node is made up of 12 PEs with
shared m…
Hi,
I am trying to profile my PETSc code. In particular, I am trying to get a
graph of the dependence on the processors.
Thank you for helping me.
Guy
On Tue, Aug 26, 2008 at 3:50 AM, wrote:
Not sure exactly what you want to plot. However, all the profiling information
can be printed at the end of the run using -lo…
I would like to create two communicator subgroups of PETSC_COMM_WORLD.
Is it possible to use the petsc profiling utilities to profile the two
communicator sub-groups individually?
thank you,
David Fuentes
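PETSc's -log_view reporting is organized by stages rather than by communicator, so one common workaround (a sketch under that assumption, not an official recipe) is to split PETSC_COMM_WORLD and run each subgroup's work under its own log stage:

```c
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;
  MPI_Comm       subcomm;
  PetscLogStage  stages[2];

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);

  /* Split the ranks into two subgroups by parity */
  ierr = MPI_Comm_split(PETSC_COMM_WORLD, rank % 2, rank, &subcomm);CHKERRQ(ierr);

  /* Register both stages on every rank so the -log_view tables line up */
  ierr = PetscLogStageRegister("GroupA", &stages[0]);CHKERRQ(ierr);
  ierr = PetscLogStageRegister("GroupB", &stages[1]);CHKERRQ(ierr);

  ierr = PetscLogStagePush(stages[rank % 2]);CHKERRQ(ierr);
  /* ... work on PETSc objects created on subcomm ... */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = MPI_Comm_free(&subcomm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
```

Requires linking against PETSc; -log_view then reports each subgroup's work in its own stage section.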
Yes - we limit the e-mail sizes on the mailing list - as we don't want
to flood all list participants with multi-megabyte emails.
Issues that require such interaction should be done at
petsc-maint at mcs.anl.gov, not petsc-users at mcs.anl.gov.
Satish
On Tue, 15 Aug 2006, Matt Funk wrote:
Yes, I got it. You are correct, the matrix partitions are exactly the same
size. I guess you have a bad network, since not only are the ILU times
unbalanced, but vector operations as well.
Matt
On 8/15/06, Matt Funk wrote:
Is there a limit to how big an attachment can be?
The file is 1.3Mb big. I tried to send it twice and neither of the emails went
through. I also sent it directly to Barry's and Matthew's email. I hope it got
through?
mat
On Tuesday 15 August 2006 14:44, Matthew Knepley wrote:
I don't think it matters initially since the problem is BIG imbalances.
Matt
On 8/15/06, Matt Funk wrote:
Please send the entire -info output as an attachment to me. (Not
in the email) I'll study it in more detail.
Barry
On Tue, 15 Aug 2006, Matt Funk wrote:
Do you want me to use the debug version or the optimized version of PETSc?
mat
On Tuesday 15 August 2006 13:56, Barry Smith wrote:
> MatSolve 16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
^
balance
Hmmm, I would guess that the matrix entries are not so well balanced?
One process takes 1.4 times as long for the triangul…
On Tuesday 15 August 2006 11:52, Barry Smith wrote:
Hi Matt,
sorry for the delay since the last email, but there were some other things I
needed to do.
Anyway, I hope that maybe I can get some more help from you guys with respect
to the load imbalance problem I have. Here is the situation:
I run my code on 2 procs. I profile my KSPSolve call and
On 8/2/06, Matt Funk wrote:
So you h
Hi Matt,
It could be a bad load imbalance because I don't let PETSc decide. I need to
fix that anyway, so I think I'll try that first and then let you know.
Thanks though for the quick response and helping me to interpret those
numbers ...
mat
On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
Hi Matt,
thanks for all the help so far. The -info option is really very helpful. So I
think I straightened the actual errors out. However, now I am back to the
original question I had, that is, why it takes so much longer on 4 procs than
on 1 proc.
I profiled the KSPSolve(...) as stage 2:
For
On 8/1/06, Matt Funk wrote:
> Actually the errors occur on my calls to PETSc functions after calling
> PetscInitialize.
Yes, it is the error I pointed out in the last message.
Matt
> mat
--
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness
On 8/1/06, Matt Funk wrote:
> Hi,
>
> I don't think it is the mallocs since it says things like:
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2912 X 2912; storage space: 0
> unneeded, 2912 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
This is only on one processor.
> How
…Descendants' Mem.

--- Event Stage 0: Main Stage

     Index Set     3        3      35976     0
           Vec   109      109    2458360     0
        Matrix     2        …          0     0
Preconditioner     1        1        168     0
========================================================================
Average time to get PetscTime(): 9.53674e-08
Compiled wi…
…with full precision matrices (default)
>
> ...
>
> am I using the push and pop calls in a manner they are not intended to be
> used?
Not exactly. You need to register a stage first before pushing it.
http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/P
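The register-before-push pattern that reply refers to, as a minimal sketch (modern PetscLogStageRegister signature; the stage name is illustrative):

```c
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscLogStage  stage;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Register first; pushing an unregistered stage is exactly the error above */
  ierr = PetscLogStageRegister("Stage 2 - KSPSolve", &stage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  /* ... the code to be timed as its own stage in the -log_view summary ... */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = PetscFinalize();
  return ierr;
}
```

Requires linking against PETSc.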
…tal
> Avg %Total counts %Total
> 0: Main Stage: 4.1647e+01 99.9% 2.0937e+08 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.402e+04 100.0%
>
> See the 'Profiling' chapter of the users' manual for details on interpret…
Hi,
well, now I do get a summary:
...
#                      WARNING!!!                      #
#                                                      #
# This code was run without the PreLoadBegin()         #
# macros. To get timing results we always recommend    #
…2e+04 100.0%
------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting
output.
Phase summary info:
  Count: number of times phase was executed
  Time and Flops/sec: Max - m…