On Feb 13, 2020, at 5:39 PM, Zhang, Hong wrote:

> On Feb 13, 2020, at 7:39 AM, Smith, Barry F. wrote:
>
> How are the two being compiled and linked? The same way, one with the PETSc
> library in the path and the other without? Or does the PETSc one have lots of
> flags and stuff while the non-PETSc one is just simple by hand?

PETSc was
Hi Hong,
have you tried running the code through gprof and looking at the output
(e.g. with kcachegrind)?
(apologies if this has been suggested already)
Best regards,
Karli
On Feb 12, 2020, at 7:29 PM, Zhang, Hong wrote:

> On Feb 12, 2020, at 5:11 PM, Smith, Barry F. wrote:
>
> ldd -o on the petsc program (static) and the non petsc program (static),
> what are the differences?

There is no difference in the outputs.

> nm -o both executables | grep cudaFree()

Non petsc program:
[hongzh@login3.summit
On Feb 12, 2020, at 1:51 PM, Munson, Todd via petsc-dev wrote:

There are some side effects when loading shared libraries, such as
initializations of static variables, etc. Is something like that happening?

Another place is the initial runtime library that gets linked (libcrt0 maybe?).
I think some MPI compilers insert their own version.

Todd.
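For illustration, a minimal sketch of the kind of load-time side effect Todd describes: a constructor in a shared library runs when the dynamic loader maps the library, before main(), so any expensive work there shows up as apparent startup overhead. The file and function names below are hypothetical, not taken from the thread.

    /* libdemo_init.c -- hypothetical; build into a shared library (e.g. libdemo.so).
     * The constructor runs when the loader maps the library, before main(), so any
     * expensive work here appears as startup overhead in every linked executable. */
    #include <stdio.h>

    __attribute__((constructor))
    static void demo_library_init(void)
    {
        /* imagine expensive initialization here (reading files, touching devices, ...) */
        fprintf(stderr, "libdemo: constructor ran at load time\n");
    }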
On Wed, Feb 12, 2020 at 11:06 AM Zhang, Hong via petsc-dev wrote:

Sorry for the long post. Here are replies I have got from OLCF so far. We still
don’t know how to solve the problem.

One interesting thing that Tom noticed is that PetscInitialize() may have called
cudaFree(0) 32 times as NVPROF shows, and they all run very fast. These calls
may be triggered by some
On Feb 10, 2020, at 11:18 AM, Zhang, Hong via petsc-dev wrote:

-cuda_initialize 0 does not make any difference. Actually this issue has
nothing to do with PetscInitialize(). I tried to call cudaFree(0) before
PetscInitialize(), and it still took 7.5 seconds.

Hong
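A minimal sketch of that kind of test, for reference (my own illustration, not the actual code from the thread): time the first cudaFree(0), which forces CUDA context creation, before PetscInitialize() is ever reached.

    /* first_cuda_call.c -- illustrative only; link against the CUDA runtime and PETSc. */
    #include <stdio.h>
    #include <time.h>
    #include <cuda_runtime.h>
    #include <petscsys.h>

    int main(int argc, char **argv)
    {
        struct timespec t0, t1;
        PetscErrorCode  ierr;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        cudaFree(0);                       /* first CUDA call: creates the CUDA context */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("first cudaFree(0): %.3f s\n",
               (t1.tv_sec - t0.tv_sec) + 1e-9 * (t1.tv_nsec - t0.tv_nsec));

        ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
        ierr = PetscFinalize();
        return ierr;
    }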
On Feb 10, 2020, at 10:44 AM, Zhang, Junchao wrote:

As I mentioned, have you tried -cuda_initialize 0? Also, PetscCUDAInitialize
contains

  ierr = PetscCUBLASInitializeHandle();CHKERRQ(ierr);
  ierr = PetscCUSOLVERDnInitializeHandle();CHKERRQ(ierr);

Have you tried to comment them out and test again?

--Junchao Zhang
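Those handle initializers presumably boil down to cublasCreate() and cusolverDnCreate(); a standalone sketch like the one below (outside PETSc, my own illustration) can show how much of the startup cost comes from creating the handles themselves.

    /* handle_timing.c -- illustrative sketch; link with -lcublas -lcusolver -lcudart. */
    #include <stdio.h>
    #include <time.h>
    #include <cublas_v2.h>
    #include <cusolverDn.h>

    static double elapsed(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) + 1e-9 * (b.tv_nsec - a.tv_nsec);
    }

    int main(void)
    {
        struct timespec    t0, t1;
        cublasHandle_t     blas;
        cusolverDnHandle_t solver;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        cublasCreate(&blas);               /* also creates the CUDA context if needed */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("cublasCreate:     %.3f s\n", elapsed(t0, t1));

        clock_gettime(CLOCK_MONOTONIC, &t0);
        cusolverDnCreate(&solver);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("cusolverDnCreate: %.3f s\n", elapsed(t0, t1));

        cusolverDnDestroy(solver);
        cublasDestroy(blas);
        return 0;
    }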
On Sat, Feb 8, 2020 at 4:34 PM Zhang, Hong via petsc-dev wrote:

I did some further investigation. The overhead persists for both the PETSc
shared library and the static library. In the previous example, which does not
call any PETSc function, the first CUDA function becomes very slow when it is
linked to the petsc so. This indicates that the slowdown occurs if
Given that OLCF filesystems are the issue, have you engaged their support
personnel about it?
Jeff
On Fri, Feb 7, 2020 at 6:37 PM Junchao Zhang via petsc-dev wrote:

Have you tried passing -cuda_initialize 0 to petsc?

--Junchao Zhang
On Fri, Feb 7, 2020 at 5:16 PM Zhang, Hong via petsc-dev wrote:

I tried to install the PETSc shared library in /gpfs/alpine/scratch, which
should be faster than the home directory. But the same overhead still persists.

Hong
On Feb 7, 2020, at 4:32 PM, Smith, Barry F. wrote:

Perhaps the intent is that you build or install (--prefix) your libraries in
a different place than /autofs/nccs-svm1_home1
On Feb 7, 2020, at 3:09 PM, Zhang, Hong wrote:

Note that the overhead was triggered by the first call to a CUDA function. So
it seems that the first CUDA function triggered loading the petsc so (if petsc
so is linked), which is slow on the summit file system.

Hong
On Feb 7, 2020, at 2:54 PM, Zhang, Hong via petsc-dev wrote:

Linking any other shared library does not slow down the execution. The PETSc
shared library is the only one causing trouble.

Here is the ldd output for two different versions. For the first version, I
removed -lpetsc and it ran very fast. The second (slow) version was linked to
the petsc so.
ldd -o on the executable of both linkings of your code.
My guess is that without PETSc it is linking the static version of the needed
libraries and with PETSc the shared. And, in typical fashion, the shared
libraries are off on some super slow file system so take a long time to be
loaded.
Statically linked executable works fine. The dynamic linker is probably broken.
Hong
On Fri, Feb 7, 2020 at 1:23 PM Zhang, Hong via petsc-dev wrote:

Hi all,

Previously I have noticed that the first call to a CUDA function such as
cudaMalloc and cudaFree in PETSc takes a long time (7.5 seconds) on summit.
Then I prepared a simple example as attached to help OLCF reproduce the
problem. It turned out that the problem was caused by PETSc. The
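The attachment itself is not preserved in this archive; a hypothetical stand-in for that kind of reproducer (one CUDA call, no PETSc calls, built once with and once without -lpetsc on the link line) might look like the following.

    /* repro.c -- hypothetical stand-in for the attached example: it makes a single
     * CUDA call and times it, without calling any PETSc function. Build it twice,
     * once plain and once with -lpetsc added to the link line, and compare times. */
    #include <stdio.h>
    #include <time.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        struct timespec t0, t1;
        void *p = NULL;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        cudaMalloc(&p, 1024);              /* first CUDA call in the process */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("first cudaMalloc: %.3f s\n",
               (t1.tv_sec - t0.tv_sec) + 1e-9 * (t1.tv_nsec - t0.tv_nsec));

        cudaFree(p);
        return 0;
    }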