Hi,
Table 2 reports negative latencies. This doesn't look right to me ;-)
If it's the outcome of a parameter fit to the performance model, then
use a parameter name (e.g. alpha) instead of the term 'latency'.
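To illustrate why a fitted parameter should not be labeled 'latency', here is a minimal sketch. It assumes the performance model is a simple linear one, t = alpha + beta * n, which may not be the model used in the paper. The function name and the data points are made up for illustration. A least-squares fit can return a negative alpha even though no real latency is negative:

```python
def fit_linear_model(sizes, times):
    """Ordinary least-squares fit of t = alpha + beta * n (assumed model)."""
    count = len(sizes)
    if count < 2 or count != len(times):
        raise ValueError("need at least two (size, time) pairs")
    mean_n = sum(sizes) / count
    mean_t = sum(times) / count
    # Sum of squared deviations of the sizes and the size/time cross term.
    s_nn = sum((n - mean_n) ** 2 for n in sizes)
    if s_nn == 0:
        raise ValueError("sizes must not all be equal")
    s_nt = sum((n - mean_n) * (t - mean_t) for n, t in zip(sizes, times))
    beta = s_nt / s_nn
    alpha = mean_t - beta * mean_n
    return alpha, beta


# Made-up measurements, not taken from Table 2.
sizes = [1.0, 2.0, 3.0]
times = [1.0, 3.0, 5.0]
alpha, beta = fit_linear_model(sizes, times)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")  # alpha = -1.000, beta = 2.000
```

The intercept is just whatever value minimizes the residual, so calling it alpha rather than 'latency' avoids suggesting a physical meaning it may not have.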
Figure 11 has a very narrow range in the y-coordinate and thus
exaggerates the varia
libaxb approach is in place.
I should be able to provide a good playground on time for the Summit
hackathon. In the meantime you can try the matrix market reader of
nsparse directly and see what you get, especially compared to cuSPARSE
and MKL.
Best regards,
Karli
Karl Rupp via pet
nd then easily try it out and compare against the other packages. In
the end it doesn't matter which package provides the best performance;
we just want to leverage it :-)
Best regards,
Karli
Karl Rupp via petsc-dev writes:
Hi Richard,
CPU spGEMM is about twice as fast even on the
Hi Richard,
CPU spGEMM is about twice as fast even on the GPU-friendly case of a
single rank: http://viennacl.sourceforge.net/viennacl-benchmarks-spmm.html
I agree that it would be good to have a GPU-MatMatMult for the sake of
experiments. Under these performance constraints it's not top prio
Hi Junchao,
I recall that Jed already suggested making this a bitmask ~7 years ago ;-)
On the other hand: If we touch valid_GPU_array, then we should also use
a better name or refactor completely. Code like
(V->valid_GPU_array & PETSC_OFFLOAD_GPU)
simply isn't intuitive (nor does it make s
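As a rough illustration of the readability point, here is a minimal Python sketch of a bitmask with named predicates. The class name, the helper names and the numeric values are assumptions made for this example and are not PETSc's actual definitions. Only the GPU flag mirrors the PETSC_OFFLOAD_GPU identifier quoted above:

```python
from enum import IntFlag


class OffloadMask(IntFlag):
    """Hypothetical mask recording where a valid copy of the data lives."""
    UNALLOCATED = 0
    CPU = 1
    GPU = 2
    BOTH = CPU | GPU


def valid_on_gpu(mask):
    """True if the GPU copy is up to date (wraps the raw bit test)."""
    return bool(mask & OffloadMask.GPU)


def valid_on_cpu(mask):
    """True if the CPU copy is up to date."""
    return bool(mask & OffloadMask.CPU)


state = OffloadMask.BOTH
print(valid_on_gpu(state), valid_on_cpu(state))  # True True
```

A named predicate states the intent at the call site, whereas the bare `mask & GPU` test leaves the reader to work out what a non-zero result means.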
Hi Mark,
OK, so now the problem has shifted somewhat in that it now manifests
itself on small cases. In an earlier investigation I was drawn to
MatTranspose but had a hard time pinning it down. The bug seems more
stable now or you probably fixed what looks like all the other bugs.
I added print
On Wed, Sep 25, 2019 at 5:26 AM Karl Rupp via petsc-dev
<petsc-dev@mcs.anl.gov> wrote:
On 9/25/19 11:12 AM, Mark Adams via petsc-dev wrote:
> I am using karlrupp/fix-cuda-streams, merged with master, and I get this
> error:
>
> Could not execu
On 9/25/19 11:12 AM, Mark Adams via petsc-dev wrote:
I am using karlrupp/fix-cuda-streams, merged with master, and I get this
error:
Could not execute "['jsrun -g\\ 1 -c\\ 1 -a\\ 1 --oversubscribe -n 1
printenv']":
Error, invalid argument: 1
My branch mark/fix-cuda-with-gamg-pintocpu see
Hi Mark, Richard, Junchao, et al.,
here we go:
https://gitlab.com/petsc/petsc/merge_requests/2091
This indeed fixes all the inconsistencies in test results for SNES ex19
and even ex56. A priori I wasn't sure about the latter, but it looks
like this was the only missing piece.
Mark, this shou
Hi,
`git grep cudaStreamCreate` reports that vectors, matrices and scatters
create their own streams. This will almost inevitably create races
(there is no synchronization mechanism implemented), unless one calls
WaitForGPU() after each operation. Some of the non-deterministic tests
can likel
On 9/22/19 6:15 AM, Jed Brown wrote:
Karl Rupp via petsc-dev writes:
Hi Junchao,
thanks, these numbers are interesting.
Do you have an easy way to evaluate the benefits of a CUDA-aware MPI vs.
a non-CUDA-aware MPI that still keeps the benefits of your
packing/unpacking routines?
I
Hi Junchao,
thanks, these numbers are interesting.
Do you have an easy way to evaluate the benefits of a CUDA-aware MPI vs.
a non-CUDA-aware MPI that still keeps the benefits of your
packing/unpacking routines?
I'd like to get a feeling for where the performance gains come from. Is
it due to
Hi,
one way to test is to run a sequential example through nvprof:
$> nvprof ./ex56 ...
https://devblogs.nvidia.com/cuda-pro-tip-nvprof-your-handy-universal-gpu-profiler/
If it uses the GPU, then you will get some information on the GPU
kernels called. If it doesn't use the GPU, the list wil
Hi all,
let me propose the following schedule for the next release:
* until Sunday, September 15: New pull requests are considered for the
upcoming release.
* from Monday, September 16, to Sunday, September 22: Fixing and merging
of open pull requests received by September 15. Extended testi
t as per Barry
c26191aaa4 use non-collective VecSet
12042c4bfa removing ViennaCL fix to GAMG
3c46958f6d fix bug with empty processor
8bcb2d50b7 fixed MPI lock from call to collective method
54cfeb1831 added missing settypes
9508265e8e adding support for MatTranspose
e5a6000419 adding fix for Vien
Hi Mark,
most of the CUDA-related fixes from your PR are now in master. Thank you!
The pinning of GPU-matrices to CPUs is not in master because it had
several issues:
https://bitbucket.org/petsc/petsc/pull-requests/1954/cuda-fixes-to-pinning-onto-cpu/diff
The ViennaCL-related changes in mark
Hi Mark,
it's fine if you just double-check that all your fixes are in master
when you're back :-)
Best regards and enjoy your vacation,
Karli
On 8/3/19 8:47 PM, Mark Adams wrote:
Karl,
Did you want me to do anything at this point? (on vacation this week) I
will verify that master is all f
If you ignore the initial ViennaCL-related commits and check against
current master (that just received cherry-picked updates from your PR),
then there are really only a few commits left that are not yet integrated.
(I'll extract two more PRs on Monday, so master will soon have your
fixes in.)
You should be able to just cherry-pick the commits from Barry's branch
as well as the two other branches.
On 8/2/19 8:13 PM, Mark Adams wrote:
I picked these two into Barry's branch and it built.
I would like to get them into my cuda branch. Should I just pick them?
And not worry about Barr
FYI: The two branches are currently testing in `next-tmp` and are likely
to be merged to master in ~5 hours.
Best regards,
Karli
On 8/2/19 4:53 PM, Smith, Barry F. via petsc-dev wrote:
Yes, these are bugs in Stefano's work that got into master because we didn't
have comprehensive testing
Hi Mark,
feel free to submit a fresh pull request now. I looked at your latest
commit in the repository in order to cherry-pick it, but it looked like
it had a few other bits in it as well.
Best regards,
Karli
On 7/28/19 6:27 PM, Mark Adams via petsc-dev wrote:
This is looking good. I'm not
Hi Stefano,
I have just noticed we have different occurrences of the
valid_GPU_matrix flag in src/mat/interface and src/mat/utils
I think that how they are used now is wrong, as they assume that all
those operations can only be executed on the CPU, irrespective of the
specific type.
Is there a
That's just a manifestation of Satish merging really well today ;-)
Best regards,
Karli
On 5/30/19 1:11 AM, Smith, Barry F. via petsc-dev wrote:
I just got this same merged message sent to me three times.
In recent days I've received several sent to me twice.
It's not like we d
Using alt files for testing is painful. Whenever you add, for example, a
new variable to be output in a viewer, it changes the output files and you need
to regenerate the alt files for all the test configurations, even though the
run behavior of the code hasn't changed.
I'm look
Hi,
Scott and PETSc folks,
Using alt files for testing is painful. Whenever you add, for example, a
new variable to be output in a viewer, it changes the output files and you need
to regenerate the alt files for all the test configurations, even though the
run behavior of the code h
Hi,
I fixed this warning after merge.
Best regards,
Karli
On 4/26/19 2:28 PM, PETSc checkBuilds via petsc-checkbuilds wrote:
Dear PETSc developer,
This email contains listings of contributions attributed to you by
`git blame` that caused compiler errors or warnings in PETSc automated
testi
On 4/25/19 6:53 PM, Jed Brown wrote:
Karl Rupp via petsc-dev writes:
With some effort we can certainly address 1.) and to some extent 3.),
probably 4.) as well, but I don't know how to solve 2.) and 5.) with
Jenkins. Given that a significant effort is required for 1.), 3.) and
4.) a
Dear PETSc developers,
the current Jenkins server went live last summer. Since then, the
stability of master and next has indeed improved. Who would have thought
three years ago that `next` is almost as stable as `master`?
However, over the weeks and months some weaknesses of our current
con
Hi Matt,
(...)
His slides have more,
"
PETSc is a widely used library for large sparse iterative solves.
Excellent and comprehensive library of solvers
It is the basis of a significant number of home-made
simulation codes
It is notoriously
Hi Richard,
the check for the GNU compilers is mostly a historic relic. We haven't
done any systematic tests with other compilers, so that test has just
remained in place.
It would certainly be good if you could update the check to also work
well with the default environment on Summit.
Tha
en after March 22 [i.e
anything that would be acceptable in our maint work-flow shouldn't
be frozen]
- And we should be able to drop troublesome PRs if they are blocking
the release.
full ack :-)
Best regards,
Karli
Satish
On Tue, 5 Mar 2019, Karl Rupp via petsc-dev wrote:
De
Dear PETSc developers,
let me suggest Friday, March 22, as the cut-off-date for new Pull
Requests for the upcoming release. This allows for 7 days to iron out
any remaining glitches. (It only took us a few days to release after the
cut-off date last September, so this should be fine)
Also, a
Hi Fabian,
I just merged the patch to master and maint. Please let us know whether
this solves the issue.
Best regards,
Karli
On 2/4/19 8:51 PM, Matthew Knepley via petsc-dev wrote:
On Mon, Feb 4, 2019 at 2:42 PM Fabian.Jakub via petsc-dev
<petsc-dev@mcs.anl.gov> wrote:
Dear Pe
I have not quickly found how that "VTK ordering" is defined, but
hopefully it's a well-defined, unambiguous cell-local numbering. I will
try to find out soon and get back to you.
Hope this helps:
https://www.vtk.org/wp-content/uploads/2015/04/file-formats.pdf
(page 9)
Best regards,
Ka
Hi Mark,
ah, I was confused by the Python information at the beginning of
configure.log. So it is picking up the correct compiler.
Have you tried commenting out the check for GNU?
Best regards,
Karli
On 10/31/18 11:40 AM, Mark Adams wrote:
It looks like configure is not finding the correct cc
Hi Mark,
please comment out or remove lines 83 and 84 in
config/BuildSystem/config/packages/cuda.py
Is there a compiler newer than GCC 4.3 available?
Best regards,
Karli
On 10/31/18 8:15 AM, Mark Adams via petsc-dev wrote:
After loading a cuda module ...
On Wed, Oct 31, 2018 at 2:58 AM Mark A