> The memory overhead (for both CPU and GPU) of PyTorch is getting worse and
> worse as it evolves. A conjecture is that the CUDA kernels in the library are
> responsible for this. But the overhead for Tensorflow2 is just around 300MB
> (compared to 1.5GB for PyTorch).
I read through the thread.
Here is an interesting thread discussing the memory issue for PyTorch (which I
think is also relevant to PETSc):
https://github.com/pytorch/pytorch/issues/12873
cuda-memcheck is a valgrind clone, but like valgrind it does not report
memory usage as it goes, only in a summary report at the end.
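If a running snapshot is what's wanted, one crude option is to have the process report its own resident set size around an expensive step. A minimal sketch using only the Python standard library (assumptions: Linux or macOS; `json` stands in for the library being measured, e.g. `torch`; note ru_maxrss units differ by platform):

```python
import resource

def rss():
    # ru_maxrss is the peak resident set size of this process:
    # kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = rss()
import json  # substitute the library whose overhead you want, e.g. torch
after = rss()
print(f"RSS growth from import: {after - before} (ru_maxrss units)")
```

Sampling like this between allocations gives the "as it goes" view that cuda-memcheck's end-of-run report does not, though only for host memory; device memory would need a CUDA-side query instead.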
On Fri, Jan 7, 2022 at 10:23 PM Barry Smith wrote:
>
> Doesn't Nvidia supply a "valgrind"-like tool that will allow tracking
> memory usage? I'm pretty sure I've seen one; it