> You need to go into the PetscInitialize() routine find where it loads the
> cublas and cusolve and comment out those lines then run with -log_view

Comment out

#if (PetscDefined(HAVE_CUDA) || PetscDefined(HAVE_HIP) || PetscDefined(HAVE_SYCL))
  ierr = PetscDeviceInitializeFromOptions_Internal(PETSC_COMM_WORLD);CHKERRQ(ierr);
#endif

At src/sys/objects/pinit.c:956

Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)

> On Jan 7, 2022, at 11:24, Barry Smith <bsm...@petsc.dev> wrote:
>
>
> Without log_view it does not load any cuBLAS/cuSolve immediately with
> -log_view it loads all that stuff at startup. You need to go into the
> PetscInitialize() routine find where it loads the cublas and cusolve and
> comment out those lines then run with -log_view
>
>
>> On Jan 7, 2022, at 11:14 AM, Zhang, Hong via petsc-dev
>> <petsc-dev@mcs.anl.gov> wrote:
>>
>> When PETSc is initialized, it takes about 2GB CUDA memory. This is way too
>> much for doing nothing. A test script is attached to reproduce the issue. If
>> I remove the first line "import torch", PETSc consumes about 0.73GB, which
>> is still significant. Does anyone have any idea about this behavior?
>>
>> Thanks,
>> Hong
>>
>> hongzhang@gpu02:/gpfs/jlse-fs0/users/hongzhang/Projects/pnode/examples (caidao22/update-examples)$ python3 test.py
>> CUDA memory before PETSc 0.000GB
>> CUDA memory after PETSc 0.004GB
>> hongzhang@gpu02:/gpfs/jlse-fs0/users/hongzhang/Projects/pnode/examples (caidao22/update-examples)$ python3 test.py -log_view :0.txt
>> CUDA memory before PETSc 0.000GB
>> CUDA memory after PETSc 1.936GB
>>
>> import torch
>> import sys
>> import os
>>
>> import nvidia_smi
>> nvidia_smi.nvmlInit()
>> handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
>> info = nvidia_smi.nvmlDeviceGetMemoryInfo(handle)
>> print('CUDA memory before PETSc %.3fGB' % (info.used/1e9))
>>
>> petsc4py_path = os.path.join(os.environ['PETSC_DIR'],os.environ['PETSC_ARCH'],'lib')
>> sys.path.append(petsc4py_path)
>> import petsc4py
>> petsc4py.init(sys.argv)
>> handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
>> info = nvidia_smi.nvmlDeviceGetMemoryInfo(handle)
>> print('CUDA memory after PETSc %.3fGB' % (info.used/1e9))
>>
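
For rerunning the comparison after commenting out the block above, the before/after measurement in the quoted script could be wrapped in a small helper. This is only a sketch: `memory_delta_gb`, `used_gb`, and `nvml_used_bytes` are hypothetical names that do not appear in the thread, and the NVML part simply reuses the `nvidia_smi` calls from the quoted script. Keep in mind that NVML reports memory in use on the whole device, so other processes on the same GPU would show up in the numbers too.

```python
def used_gb(query_used_bytes):
    """Return used device memory in GB (1e9 bytes, as in the quoted script)."""
    return query_used_bytes() / 1e9


def memory_delta_gb(query_used_bytes, action):
    """Run action() and return (before, after, delta) of used memory in GB.

    query_used_bytes is any zero-argument callable returning used bytes;
    passing it in keeps this helper independent of a real GPU.
    """
    before = used_gb(query_used_bytes)
    action()
    after = used_gb(query_used_bytes)
    return before, after, after - before


def nvml_used_bytes(index=0):
    """Used bytes on GPU `index`, via the same nvidia_smi calls as the script."""
    import nvidia_smi  # imported lazily so the helpers above work without NVML

    nvidia_smi.nvmlInit()
    handle = nvidia_smi.nvmlDeviceGetHandleByIndex(index)
    return nvidia_smi.nvmlDeviceGetMemoryInfo(handle).used
```

With petsc4py importable as in the quoted script, something like `memory_delta_gb(nvml_used_bytes, lambda: petsc4py.init(sys.argv))` should give the same before/after figures, once with and once without -log_view on the command line.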