[petsc-users] Postdoctoral position at Argonne: Numerical Solvers for Next Generation High Performance Computing Architectures

2020-08-24 Thread Mills, Richard Tran via petsc-users
Dear PETSc Users and Developers, The PETSc/TAO team at Argonne National Laboratory has an opening for a postdoctoral researcher to work on development of robust and efficient algebraic solvers and related technologies targeting exascale-class supercomputers -- such as the Aurora machine slated

[petsc-users] Argonne National Laboratory hiring for staff position in Numerical PDEs and Scientific Computing

2020-07-29 Thread Mills, Richard Tran via petsc-users
Dear PETSc Users and Developers, The Laboratory for Applied Mathematics, Numerical Software, and Statistics (LANS, https://www.anl.gov/mcs/lans) in the Mathematics and Computer Science Division at Argonne National Laboratory -- which has served as the "home" for PETSc development for over two

Re: [petsc-users] Gather and Broadcast Parallel Vectors in k-means algorithm

2020-05-22 Thread Mills, Richard Tran via petsc-users
Hi Eda, If you are using the MATLAB k-means function, calling it like idx = kmeans(X,k) will give you the index set, but if you do [idx,C] = kmeans(X,k) then you will also get a matrix C which contains the cluster centroids. Is this not what you need? --Richard On 5/22/20 10:38 AM, Eda

Re: [petsc-users] Possible bug PETSc+Complex+CUDA

2020-05-22 Thread Mills, Richard Tran via petsc-users
Yes, Junchao said he gets the segfault, but it works for Karl. Sounds like this may be a case of some compilers liking the definitions for complex that Thrust uses and others not, as Stefano says. Karl and Junchao, can you please share the version of the compilers (and maybe associated settings)

Re: [petsc-users] Gather and Broadcast Parallel Vectors in k-means algorithm

2020-04-29 Thread Mills, Richard Tran via petsc-users
Hi Eda, Thanks for your reply. I'm still trying to understand why you say you need to duplicate the row vectors across all processes. When I have implemented parallel k-means, I don't duplicate the row vectors. (This would be very unscalable and largely defeat the point of doing this with MPI
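
For readers following this thread, a minimal sketch of the kind of k-means update step that avoids duplicating the row vectors (this is an editorial illustration, not necessarily the implementation referred to above): each rank keeps only its own rows of X, and only the small k x dim centroid array plus per-cluster sums and counts are combined across ranks with MPI_Allreduce. The helper name kmeans_update_step is hypothetical.

    /* One k-means update step over locally owned rows only; the centroid
       array (k x dim) is the only replicated data. */
    #include <stdlib.h>
    #include <float.h>
    #include <mpi.h>

    static void kmeans_update_step(const double *Xlocal, int nlocal, int dim,
                                   double *centroids, int k, MPI_Comm comm)
    {
      double *sums   = calloc((size_t)k * dim, sizeof(double));
      double *counts = calloc((size_t)k, sizeof(double));

      /* Assign each local row to its nearest centroid; accumulate local sums */
      for (int i = 0; i < nlocal; i++) {
        int best = 0; double bestd = DBL_MAX;
        for (int c = 0; c < k; c++) {
          double d = 0.0;
          for (int j = 0; j < dim; j++) {
            double diff = Xlocal[i*dim + j] - centroids[c*dim + j];
            d += diff * diff;
          }
          if (d < bestd) { bestd = d; best = c; }
        }
        for (int j = 0; j < dim; j++) sums[best*dim + j] += Xlocal[i*dim + j];
        counts[best] += 1.0;
      }

      /* Combine the small per-cluster sums and counts across all ranks */
      MPI_Allreduce(MPI_IN_PLACE, sums, k*dim, MPI_DOUBLE, MPI_SUM, comm);
      MPI_Allreduce(MPI_IN_PLACE, counts, k, MPI_DOUBLE, MPI_SUM, comm);

      /* Every rank recomputes the same centroids from the reduced quantities */
      for (int c = 0; c < k; c++)
        if (counts[c] > 0.0)
          for (int j = 0; j < dim; j++)
            centroids[c*dim + j] = sums[c*dim + j] / counts[c];

      free(sums); free(counts);
    }

Only the k x dim centroid data ever crosses ranks, so the communication volume is independent of the number of data rows.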

Re: [petsc-users] Gather and Broadcast Parallel Vectors in k-means algorithm

2020-04-06 Thread Mills, Richard Tran via petsc-users
Hi Eda, I think that you probably want to use the VecScatter routines, as Junchao has suggested, instead of the lower-level star forest for this. I believe that VecScatterCreateToZero() is what you want for the broadcast problem you describe in the second part of your question. I'm not sure what
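
A minimal sketch of the VecScatterCreateToZero() usage suggested here (error handling abbreviated, and the helper name GatherThenBroadcast is only illustrative): the same scatter context gathers the parallel Vec onto rank 0 in forward mode and pushes rank 0's values back out in reverse mode.

    #include <petscvec.h>

    PetscErrorCode GatherThenBroadcast(Vec xpar)
    {
      PetscErrorCode ierr;
      VecScatter     ctx;
      Vec            xseq;   /* sequential Vec; has the full length only on rank 0 */

      ierr = VecScatterCreateToZero(xpar,&ctx,&xseq);CHKERRQ(ierr);

      /* Gather the parallel vector onto rank 0 */
      ierr = VecScatterBegin(ctx,xpar,xseq,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
      ierr = VecScatterEnd(ctx,xpar,xseq,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);

      /* ... rank 0 works on xseq here ... */

      /* Push rank 0's values back into the parallel layout (reverse scatter) */
      ierr = VecScatterBegin(ctx,xseq,xpar,INSERT_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
      ierr = VecScatterEnd(ctx,xseq,xpar,INSERT_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);

      ierr = VecScatterDestroy(&ctx);CHKERRQ(ierr);
      ierr = VecDestroy(&xseq);CHKERRQ(ierr);
      return 0;
    }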

Re: [petsc-users] chowiluviennacl

2020-01-20 Thread Mills, Richard Tran via petsc-users
Hi Xiangdong, Maybe I am misunderstanding you, but it sounds like you want an exact direct solution, so I don't understand why you are using an incomplete factorization solver for this. SuperLU_DIST (as Mark has suggested) and MUMPS are two packages that provide MPI-parallel sparse LU
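
For reference, a sketch of how one might request an exact MPI-parallel LU solve through one of these packages (assuming a PETSc build configured with SuperLU_DIST; MUMPS works analogously with MATSOLVERMUMPS; the helper name DirectSolve is illustrative):

    #include <petscksp.h>

    PetscErrorCode DirectSolve(Mat A, Vec b, Vec x)
    {
      PetscErrorCode ierr;
      KSP            ksp;
      PC             pc;

      ierr = KSPCreate(PetscObjectComm((PetscObject)A),&ksp);CHKERRQ(ierr);
      ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
      ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);   /* no Krylov iterations */
      ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
      ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);           /* full (exact) LU factorization */
      ierr = PCFactorSetMatSolverType(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
      ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
      return 0;
    }

The same configuration can be selected at run time with -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type superlu_dist (or mumps).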

Re: [petsc-users] BAIJCUSPARSE?

2019-10-29 Thread Mills, Richard Tran via petsc-users
We will let you know when this is ready, Xiangdong. Let me address a part of your original question that I don't think anyone else noticed: In my current code, the Jacobian matrix is preallocated and assembled in BAIJ format. Do I have to rewrite this part of code to preallocate and assemble the
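
One way to keep preallocation and assembly code independent of the final matrix type (a hedged sketch, not necessarily the exact advice in the full message; the helper name CreateJacobian and the nonzero-count arguments are placeholders) is to preallocate through MatXAIJSetPreallocation() and let -mat_type choose the format at run time:

    #include <petscmat.h>

    PetscErrorCode CreateJacobian(MPI_Comm comm,PetscInt mlocal,PetscInt nlocal,
                                  PetscInt bs,const PetscInt dnnz[],const PetscInt onnz[],
                                  Mat *J)
    {
      PetscErrorCode ierr;

      ierr = MatCreate(comm,J);CHKERRQ(ierr);
      ierr = MatSetSizes(*J,mlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
      ierr = MatSetBlockSize(*J,bs);CHKERRQ(ierr);
      ierr = MatSetFromOptions(*J);CHKERRQ(ierr);   /* honors -mat_type aij, baij, aijcusparse, ... */
      /* dnnz/onnz hold the per-(block-)row nonzero counts, assumed set up elsewhere */
      ierr = MatXAIJSetPreallocation(*J,bs,dnnz,onnz,NULL,NULL);CHKERRQ(ierr);
      /* MatSetValuesBlocked()/MatSetValues() and MatAssemblyBegin/End proceed as before */
      return 0;
    }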

Re: [petsc-users] BAIJCUSPARSE?

2019-10-25 Thread Mills, Richard Tran via petsc-users
Xiangdong, cuSPARSE does support the block compressed sparse row (BAIJ) format, but we don't currently support that cuSPARSE functionality in PETSc. It should be easy to add, but we are in the middle of refactoring the way we interface with third-party GPU libraries such as cuSPARSE, and it would

Re: [petsc-users] MatMultTranspose memory usage

2019-07-30 Thread Mills, Richard Tran via petsc-users
ta in one process, I got a crash and error saying > object too big. Thank you for any insight. > > 1) Always send the complete error. > > 2) It sounds like you got an out of memory error for that process. > >Matt > > Regards, > > Karl > > On Thu, Jul 18, 2019 a

Re: [petsc-users] MatMultTranspose memory usage

2019-07-18 Thread Mills, Richard Tran via petsc-users
Hi Kun and Karl, If you are using the AIJMKL matrix types and have a recent version of MKL, the AIJMKL code uses MKL's inspector-executor sparse BLAS routines, which are described at https://software.intel.com/en-us/mkl-developer-reference-c-inspector-executor-sparse-blas-routines The
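
For illustration, a hedged sketch of one way to route MatMult()/MatMultTranspose() through the MKL-backed type discussed here: convert an assembled AIJ matrix to AIJMKL (this assumes a PETSc build configured with a recent MKL; the function name SwitchToAijmkl is illustrative). If the matrix is instead created with MatSetFromOptions(), the same effect is available at run time via -mat_type aijmkl.

    #include <petscmat.h>

    PetscErrorCode SwitchToAijmkl(Mat A)
    {
      PetscErrorCode ierr;

      /* In-place conversion; subsequent sparse kernels should then go through
         MKL's inspector-executor routines when the build supports them. */
      ierr = MatConvert(A,MATAIJMKL,MAT_INPLACE_MATRIX,&A);CHKERRQ(ierr);
      return 0;
    }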

Re: [petsc-users] Communication during MatAssemblyEnd

2019-06-24 Thread Mills, Richard Tran via petsc-users
Hi Ale, I don't know if this has anything to do with the strange performance you are seeing, but I notice that some of your Intel MPI settings are inconsistent and I'm not sure what you are intending. You have specified a value for I_MPI_PIN_DOMAIN and also a value for

Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4

2019-03-26 Thread Mills, Richard Tran via petsc-users
Hi Kun, I'm the author of most of the AIJMKL stuff in PETSc. My apologies for having inadvertently omitted the function prototypes for these interfaces; I'm glad that Satish's patch has fixed this. I want to point out that -- though I can envision some scenarios in which one would want to
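
As an illustration (not necessarily the recommendation that follows in the full message), an MPIAIJMKL matrix can also be obtained without the explicit MatCreateMPIAIJMKL() constructor by setting the type on a generically created matrix; the preallocation counts below are placeholders.

    #include <petscmat.h>

    PetscErrorCode CreateAijmklMatrix(MPI_Comm comm,PetscInt mlocal,PetscInt nlocal,Mat *A)
    {
      PetscErrorCode ierr;

      ierr = MatCreate(comm,A);CHKERRQ(ierr);
      ierr = MatSetSizes(*A,mlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
      ierr = MatSetType(*A,MATMPIAIJMKL);CHKERRQ(ierr);   /* same interface as MPIAIJ */
      /* Preallocation and assembly proceed exactly as for a standard MPIAIJ matrix */
      ierr = MatMPIAIJSetPreallocation(*A,5,NULL,2,NULL);CHKERRQ(ierr);
      return 0;
    }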