> You need to go into the PetscInitialize() routine, find where it loads 
> cuBLAS and cuSOLVER, comment out those lines, and then run with -log_view.

Comment out

#if (PetscDefined(HAVE_CUDA) || PetscDefined(HAVE_HIP) || PetscDefined(HAVE_SYCL))
  ierr = PetscDeviceInitializeFromOptions_Internal(PETSC_COMM_WORLD);CHKERRQ(ierr);
#endif

At src/sys/objects/pinit.c:956
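
To compare memory use before and after the change, the measurement from the
test script quoted below could be wrapped in a small helper. This is only a
sketch: used_gb, report_delta and measure_petsc_init are hypothetical names,
and it assumes petsc4py is already importable (the original script adds
$PETSC_DIR/$PETSC_ARCH/lib to sys.path first). The memory reader is passed in
as a callable so the helper itself does not need a GPU.

```python
def used_gb(read_used_bytes):
    """Return used device memory in GB (1e9 bytes, as in the test script)."""
    return read_used_bytes() / 1e9


def report_delta(read_used_bytes, init):
    """Run init() and return (before, after, delta) of used memory in GB."""
    before = used_gb(read_used_bytes)
    init()
    after = used_gb(read_used_bytes)
    return before, after, after - before


def measure_petsc_init(argv):
    """Hypothetical driver: measure what petsc4py.init(argv) costs on GPU 0."""
    # Imported lazily so the helpers above can be used without these packages.
    import nvidia_smi  # same NVML binding as the test script
    import petsc4py

    nvidia_smi.nvmlInit()
    handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)

    def read_used_bytes():
        return nvidia_smi.nvmlDeviceGetMemoryInfo(handle).used

    before, after, delta = report_delta(
        read_used_bytes, lambda: petsc4py.init(argv)
    )
    print('CUDA memory before PETSc %.3fGB' % before)
    print('CUDA memory after PETSc %.3fGB' % after)
    print('CUDA memory taken by PETSc init %.3fGB' % delta)
```

Calling measure_petsc_init(sys.argv) once with -log_view and once without,
before and after commenting out the block above, should show whether the
device initialization accounts for the difference.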

Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)

> On Jan 7, 2022, at 11:24, Barry Smith <bsm...@petsc.dev> wrote:
> 
> 
> Without -log_view it does not load any cuBLAS/cuSOLVER immediately; with 
> -log_view it loads all that stuff at startup. You need to go into the 
> PetscInitialize() routine, find where it loads cuBLAS and cuSOLVER, 
> comment out those lines, and then run with -log_view.
> 
> 
>> On Jan 7, 2022, at 11:14 AM, Zhang, Hong via petsc-dev 
>> <petsc-dev@mcs.anl.gov> wrote:
>> 
>> When PETSc is initialized, it takes about 2GB CUDA memory. This is way too 
>> much for doing nothing. A test script is attached to reproduce the issue. If 
>> I remove the first line "import torch", PETSc consumes about 0.73GB, which 
>> is still significant. Does anyone have any idea about this behavior?
>> 
>> Thanks,
>> Hong
>> 
>> hongzhang@gpu02:/gpfs/jlse-fs0/users/hongzhang/Projects/pnode/examples (caidao22/update-examples)$ python3 test.py
>> CUDA memory before PETSc 0.000GB
>> CUDA memory after PETSc 0.004GB
>> hongzhang@gpu02:/gpfs/jlse-fs0/users/hongzhang/Projects/pnode/examples (caidao22/update-examples)$ python3 test.py -log_view :0.txt
>> CUDA memory before PETSc 0.000GB
>> CUDA memory after PETSc 1.936GB
>> 
>> import torch
>> import sys
>> import os
>> 
>> import nvidia_smi
>> nvidia_smi.nvmlInit()
>> handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
>> info = nvidia_smi.nvmlDeviceGetMemoryInfo(handle)
>> print('CUDA memory before PETSc %.3fGB' % (info.used/1e9))
>> 
>> petsc4py_path = os.path.join(os.environ['PETSC_DIR'], os.environ['PETSC_ARCH'], 'lib')
>> sys.path.append(petsc4py_path)
>> import petsc4py
>> petsc4py.init(sys.argv)
>> handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
>> info = nvidia_smi.nvmlDeviceGetMemoryInfo(handle)
>> print('CUDA memory after PETSc %.3fGB' % (info.used/1e9))
>> 
> 
