The cost of the initial matrix creation to get the row starts and ends is trivial compared to the later computations, so is fine to retain.
> On Aug 19, 2023, at 10:51 AM, Erik Kneller via petsc-users > <petsc-users@mcs.anl.gov> wrote: > > Hi All, > > Thank you for the recommendations. The first option provided by Matthew works > in petsc4py but required some additional computations to re-organize > information. I developed an approach that was easier to build within my > current system using the option provided by Hong along with some other posts > I found on the user forum. See below for an example of how I integrated Numba > and petcs4py so the filling of matrix elements on each processor is done > using the optimized machine code produced by Numba (you can eliminate the > cleaning step if the number of non-zero elements for each row is well > defined). Scaling and overall performance is now satisfactory. > > I do have another question. In order to make this work I create the Petsc > matrix 'A' twice, first to get local ownership and second to define the local > elements for a particular processor: > > Algorithm > ------------ > (1) Create a Petsc matrix 'A' and set size and type > (2) Get the local row start and end for matrix 'A' > (3) Define the local non-zero coefficients for the rows owned by processor > using a Numba JIT-compiled loop and store result in a csr matrix defined > using Scipy. > (4) Redefine Petsc matix 'A' using the the local csr elements define in step > 3. > (5) Begin and end assembly > (6) Define RHS and initial solution vectors > (7) Solve system > (see code example below for details) > > Steps (1) and (4) appear redundant and potentially sub-optimal from a > performance perspective (perhaps due to my limited experience with > petscy4py). Is there a better approach in terms of elegance and performance? > Also, if this is the optimal syntax could someone elaborate on what exactly > is occurring behind the seen in terms of memory allocation? > > Thank you all again for your guidance and insights. > > Modified Version of Dalcin's 2D Poisson Example with Numba > ---------------------------------------------------------------------------------- > import sys > from typing import Tuple > import numpy.typing as npt > import numpy as np > from numba import njit > import petsc4py > petsc4py.init(sys.argv) > from petsc4py import PETSc > import scipy.sparse as sps > > > def main() -> None: > xnodes = 8000 > nnz_max = 10 > > ynodes = xnodes > N = xnodes*ynodes > dx = 1.0/(xnodes + 1) > > A = PETSc.Mat() > A.create(comm=PETSc.COMM_WORLD) > A.setSizes((xnodes*ynodes, xnodes*ynodes)) > A.setType(PETSc.Mat.Type.AIJ) > > rstart, rend = A.getOwnershipRange() > Ascipy = build_csr_matrix( > rstart, rend, nnz_max, dx, xnodes, ynodes) > csr=( > Ascipy.indptr[rstart:rend+1] - Ascipy.indptr[rstart], > Ascipy.indices[Ascipy.indptr[rstart]:Ascipy.indptr[rend]], > Ascipy.data[Ascipy.indptr[rstart]:Ascipy.indptr[rend]] > ) > A = PETSc.Mat().createAIJ(size=(N,N), csr=csr) > > A.assemblyBegin() > A.assemblyEnd() > > ksp = PETSc.KSP() > ksp.create(comm=A.getComm()) > ksp.setType(PETSc.KSP.Type.CG) > ksp.getPC().setType(PETSc.PC.Type.GAMG) > > ksp.setOperators(A) > ksp.setFromOptions() > > x, b = A.createVecs() > b.set(1.0) > > ksp.solve(b, x) > > > def build_csr_matrix( > rstart:int, > rend:int, > nnz_max:int, > dx:float, > xnodes:int, > ynodes:int > ): > Anz, Arow, Acol = build_nonzero_arrays( > rstart, rend, nnz_max, dx, xnodes, ynodes > ) > N = xnodes*ynodes > Ls = sps.csr_matrix((Anz, (Arow,Acol)), shape=(N,N), dtype=np.float64) > return Ls > > > def build_nonzero_arrays( > rstart:int, > rend:int, > nnz_max:int, > dx:float, > xnodes:int, > ynodes:int > ) -> Tuple[ > npt.NDArray[np.float64], > npt.NDArray[np.float64], > npt.NDArray[np.float64] > ]: > nrows_local = (rend - rstart) + 1 > Anz_ini = np.zeros((nnz_max*nrows_local), dtype=np.float64) > Arow_ini = np.zeros((nnz_max*nrows_local), dtype=np.int32) > Acol_ini = np.zeros((nnz_max*nrows_local), dtype=np.int32) > icount_nz = define_nonzero_values( > rstart, rend, dx, xnodes, ynodes, Anz_ini, Arow_ini, Acol_ini > ) > ( > Anz, Arow, Acol > ) = clean_nonzero_arrays(icount_nz, Anz_ini, Arow_ini, Acol_ini) > return Anz, Arow, Acol > > > @njit > def define_nonzero_values( > rstart:int, > rend:int, > dx:float, > xnodes:int, > ynodes:int, > Anz:npt.NDArray[np.float64], > Arow:npt.NDArray[np.int64], > Acol:npt.NDArray[np.int64] > ) -> int: > """ Fill matrix A > """ > icount_nz = 0 > for row in range(rstart, rend): > i, j = index_to_grid(row, xnodes) > #A[row, row] = 4.0/dx**2 > Anz[icount_nz] = 4.0/dx**2 > Arow[icount_nz] = row > Acol[icount_nz] = row > icount_nz += 1 > if i > 0: > column = row - xnodes > #A[row, column] = -1.0/dx**2 > Anz[icount_nz] = -1.0/dx**2 > Arow[icount_nz] = row > Acol[icount_nz] = column > icount_nz += 1 > if i < xnodes - 1: > column = row + xnodes > #A[row, column] = -1.0/dx**2 > Anz[icount_nz] = -1.0/dx**2 > Arow[icount_nz] = row > Acol[icount_nz] = column > icount_nz += 1 > if j > 0: > column = row - 1 > #A[row, column] = -1.0/dx**2 > Anz[icount_nz] = -1.0/dx**2 > Arow[icount_nz] = row > Acol[icount_nz] = column > icount_nz += 1 > if j < xnodes - 1: > column = row + 1 > #A[row, column] = -1.0/dx**2 > Anz[icount_nz] = -1.0/dx**2 > Arow[icount_nz] = row > Acol[icount_nz] = column > icount_nz += 1 > return icount_nz > > > @njit > def clean_nonzero_arrays( > icount_nz:int, > Anz_ini:npt.NDArray[np.float64], > Arow_ini:npt.NDArray[np.float64], > Acol_ini:npt.NDArray[np.float64] > ) -> Tuple[ > npt.NDArray[np.float64], > npt.NDArray[np.float64], > npt.NDArray[np.float64] > ]: > Anz = np.zeros((icount_nz), dtype=np.float64) > Arow = np.zeros((icount_nz), dtype=np.int32) > Acol = np.zeros((icount_nz), dtype=np.int32) > for i in range(icount_nz): > Anz[i] = Anz_ini[i] > Arow[i] = Arow_ini[i] > Acol[i] = Acol_ini[i] > return Anz, Arow, Acol > > > @njit > def index_to_grid(r:int, n:int) -> Tuple[int,int]: > """Convert a row number into a grid point.""" > return (r//n, r%n) > > > if __name__ == "__main__": > main() > > > On Friday, August 18, 2023 at 11:24:36 AM CDT, Zhang, Hong > <hongzh...@anl.gov> wrote: > > > You can use this to build a PETSc matrix with the index arrays ai,aj and the > value array aa: > PETSc.Mat().createAIJ(size=(nrows,ncols), csr=(ai,aj,aa)) > > Hong (Mr.) > > > On Aug 17, 2023, at 7:37 AM, Erik Kneller via petsc-users > > <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>> wrote: > > > > Hi All, > > > > I need to fill non-zero values of a Petsc matrix via petsc4py for the > > domain defined by A.getOwnershipRange() using three Numpy arrays: (1) array > > containing row indices of non-zero value, (2) array containing column > > indices of non-zero values and (3) array containing the non-zero matrix > > values. How can one perform this type of filling operation in petsc4py? The > > method A.setValues does not appear to allow this since it only works on an > > individual matrix element or a block of matrix elements. > > > > I am using Numpy arrays since they can be computed in loops optimized using > > Numba on each processor. I also cannot pass the Petsc matrix to a Numba > > compiled function since type information cannot be inferred. I absolutely > > need to avoid looping in standard Python to define Petsc matrix elements > > due to performance issues. I also need to use a standard petscy4py method > > and avoid writing new C or Fortran wrappers to minimize language complexity. > > > > Example Algorithm Building on Lisandro Dalcin's 2D Poisson Example: > > ---------------------------------------------------------------------------------------------- > > comm = PETSc.COMM_WORLD > > rank = comm.getRank() > > > > dx = 1.0/(xnodes + 1) # xnodes is the number of nodes in the x and > > y-directions of the grid > > nnz_max = 5 # max number of non-zero values per row > > > > A = PETSc.Mat() > > A.create(comm=PETSc.COMM_WORLD) > > A.setSizes((xnodes*ynodes, xnodes*ynodes)) > > A.setType(PETSc.Mat.Type.AIJ) > > A.setPreallocationNNZ(nnz_max) > > > > rstart, rend = A.getOwnershipRange() > > > > # Here Anz, Arow and Acol are vectors with size equal to the number of > > non-zero values > > Anz, Arow, Acol = build_nonzero_numpy_arrays_using_numba(rstart, rend, > > nnz_max, dx, xnodes, ynodes) > > > > A.setValues(Arow, Acol, Anz) # <--- This does not work. > > > > A.assemblyBegin() > > A.assemblyEnd() > > ksp = PETSc.KSP() > > ksp.create(comm=A.getComm()) > > ksp.setType(PETSc.KSP.Type.CG) > > ksp.getPC().setType(PETSc.PC.Type.GAMG) > > ksp.setOperators(A) > > ksp.setFromOptions() > > x, b = A.createVecs() > > b.set(1.0) > > > > ksp.solve(b, x) > > > > Regards, > > Erik Kneller, Ph.D. > > >