Is there any reason why PETSc, compiled to link against the GotoBLAS shared libraries, would not run multi-threaded by default?
I've set OMP_NUM_THREADS=8 and GOTO_NUM_THREADS=8, but when I call dgemm through PETSc I can't get it to run on multiple cores (top shows <= 100% CPU usage). I checked that the test routines installed with the GotoBLAS build, run without PETSc, are multi-threaded (~600% CPU usage in top).

I'm calling MatMatMult with dense matrices from petsc4py:

    from petsc4py import PETSc
    import numpy

    n = 10000
    J1 = PETSc.Mat().createDense([n, n], array=numpy.random.rand(n, n), comm=PETSc.COMM_WORLD)
    J1.assemblyBegin(); J1.assemblyEnd()
    J2 = PETSc.Mat().createDense([n, n], array=numpy.random.rand(n, n), comm=PETSc.COMM_WORLD)
    J2.assemblyBegin(); J2.assemblyEnd()
    X = J1.matMult(J2)
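
In case it helps anyone reproduce this without watching top, here is a rough sketch of how the threading could be checked from inside Python. It compares process CPU time (summed over all threads) with wall-clock time, so a ratio near 1 suggests a single busy core and a ratio near 8 suggests eight. The helper name `cpu_to_wall_ratio` is made up for illustration. Setting the variables through `os.environ` before the first import of petsc4py is an assumption about when the BLAS library reads them, not something I have confirmed.

```python
import os
import time

# Assumption: the BLAS library reads these when it is first loaded, so they
# are set here, before petsc4py (and with it the BLAS) is imported.
os.environ.setdefault("OMP_NUM_THREADS", "8")
os.environ.setdefault("GOTO_NUM_THREADS", "8")


def cpu_to_wall_ratio(fn, *args):
    """Run fn(*args) and return (process CPU time) / (wall-clock time).

    process_time() sums CPU time over all threads of the process, so the
    ratio roughly estimates how many cores were busy during the call.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0
```

With the matrices from the snippet above, the call would be `cpu_to_wall_ratio(J1.matMult, J2)`. The measurement only means much if the call runs long enough to dominate timer resolution, which should be the case for n = 10000.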