Glad it worked!
--Junchao Zhang

On Wed, Dec 6, 2023 at 1:20 PM Sreeram R Venkat <srven...@utexas.edu> wrote:

> Thank you for your help. It turned out the problem was that I forgot to
> assemble the "x" vector before doing the scatter. It seems to be working
> now.
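>
> For reference, the fix boils down to assembling "x" before applying the
> scatter; roughly this pattern (using the same names as in your example
> below):
>
> /* values set with VecSetValue()/VecSetValues() are only visible after
>    assembly, so assemble x before the scatter */
> PetscCall(VecAssemblyBegin(x));
> PetscCall(VecAssemblyEnd(x));
> PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
> PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));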
>
> Thanks,
> Sreeram
>
> On Wed, Dec 6, 2023 at 11:12 AM Junchao Zhang <junchao.zh...@gmail.com>
> wrote:
>
>> Hi, Sreeram,
>>   I made an example with your approach. It works fine, as you can see from
>> the output at the end:
>>
>> #include <petscvec.h>
>> int main(int argc, char **argv)
>> {
>>   PetscInt    i, j, rstart, rend, n, N, *indices;
>>   PetscMPIInt size, rank;
>>   IS          ix;
>>   VecScatter  vscat;
>>   Vec         x, y;
>>
>>   PetscFunctionBeginUser;
>>   PetscCall(PetscInitialize(&argc, &argv, (char *)0, NULL));
>>   PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
>>   PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
>>
>>   /* x: the source vector; all entries live on the first 4 ranks, 9 per rank */
>>   PetscCall(VecCreate(PETSC_COMM_WORLD, &x));
>>   PetscCall(VecSetFromOptions(x));
>>   PetscCall(PetscObjectSetName((PetscObject)x, "Vec X"));
>>   n = (rank < 4) ? 9 : 0;
>>   PetscCall(VecSetSizes(x, n, PETSC_DECIDE));
>>
>>   /* fill x with its global indices and assemble it before scattering */
>>   PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
>>   for (i = rstart; i < rend; i++) PetscCall(VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES));
>>   PetscCall(VecAssemblyBegin(x));
>>   PetscCall(VecAssemblyEnd(x));
>>   PetscCall(VecGetSize(x, &N));
>>
>>   /* y: the target vector; the N entries are spread evenly over all ranks */
>>   PetscCall(VecCreate(PETSC_COMM_WORLD, &y));
>>   PetscCall(VecSetFromOptions(y));
>>   PetscCall(PetscObjectSetName((PetscObject)y, "Vec Y"));
>>   PetscCall(VecSetSizes(y, PETSC_DECIDE, N));
>>
>>   /* each rank lists the permuted x entries it wants: rank, rank+size, rank+2*size, ... */
>>   PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
>>   PetscCall(PetscMalloc1(rend - rstart, &indices));
>>   for (i = rstart, j = 0; i < rend; i++, j++) indices[j] = rank + size * j;
>>
>>   PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, indices, PETSC_OWN_POINTER, &ix));
>>   PetscCall(VecScatterCreate(x, ix, y, NULL, &vscat));
>>
>>   PetscCall(VecScatterBegin(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
>>   PetscCall(VecScatterEnd(vscat, x, y, INSERT_VALUES, SCATTER_FORWARD));
>>
>>   PetscCall(ISView(ix, PETSC_VIEWER_STDOUT_WORLD));
>>   PetscCall(VecView(x, PETSC_VIEWER_STDERR_WORLD));
>>   PetscCall(VecView(y, PETSC_VIEWER_STDERR_WORLD));
>>
>>   PetscCall(VecScatterDestroy(&vscat));
>>   PetscCall(ISDestroy(&ix));
>>   PetscCall(VecDestroy(&x));
>>   PetscCall(VecDestroy(&y));
>>   PetscCall(PetscFinalize());
>>   return 0;
>> }
>>
>> $ mpirun -n 12 ./ex100
>> IS Object: 12 MPI processes
>>   type: general
>> [0] Number of indices in set 3
>> [0] 0 0
>> [0] 1 12
>> [0] 2 24
>> [1] Number of indices in set 3
>> [1] 0 1
>> [1] 1 13
>> [1] 2 25
>> [2] Number of indices in set 3
>> [2] 0 2
>> [2] 1 14
>> [2] 2 26
>> [3] Number of indices in set 3
>> [3] 0 3
>> [3] 1 15
>> [3] 2 27
>> [4] Number of indices in set 3
>> [4] 0 4
>> [4] 1 16
>> [4] 2 28
>> [5] Number of indices in set 3
>> [5] 0 5
>> [5] 1 17
>> [5] 2 29
>> [6] Number of indices in set 3
>> [6] 0 6
>> [6] 1 18
>> [6] 2 30
>> [7] Number of indices in set 3
>> [7] 0 7
>> [7] 1 19
>> [7] 2 31
>> [8] Number of indices in set 3
>> [8] 0 8
>> [8] 1 20
>> [8] 2 32
>> [9] Number of indices in set 3
>> [9] 0 9
>> [9] 1 21
>> [9] 2 33
>> [10] Number of indices in set 3
>> [10] 0 10
>> [10] 1 22
>> [10] 2 34
>> [11] Number of indices in set 3
>> [11] 0 11
>> [11] 1 23
>> [11] 2 35
>> Vec Object: Vec X 12 MPI processes
>>   type: mpi
>> Process [0]
>> 0.
>> 1.
>> 2.
>> 3.
>> 4.
>> 5.
>> 6.
>> 7.
>> 8.
>> Process [1]
>> 9.
>> 10.
>> 11.
>> 12.
>> 13.
>> 14.
>> 15.
>> 16.
>> 17.
>> Process [2]
>> 18.
>> 19.
>> 20.
>> 21.
>> 22.
>> 23.
>> 24.
>> 25.
>> 26.
>> Process [3]
>> 27.
>> 28.
>> 29.
>> 30.
>> 31.
>> 32.
>> 33.
>> 34.
>> 35.
>> Process [4]
>> Process [5]
>> Process [6]
>> Process [7]
>> Process [8]
>> Process [9]
>> Process [10]
>> Process [11]
>> Vec Object: Vec Y 12 MPI processes
>>   type: mpi
>> Process [0]
>> 0.
>> 12.
>> 24.
>> Process [1]
>> 1.
>> 13.
>> 25.
>> Process [2]
>> 2.
>> 14.
>> 26.
>> Process [3]
>> 3.
>> 15.
>> 27.
>> Process [4]
>> 4.
>> 16.
>> 28.
>> Process [5]
>> 5.
>> 17.
>> 29.
>> Process [6]
>> 6.
>> 18.
>> 30.
>> Process [7]
>> 7.
>> 19.
>> 31.
>> Process [8]
>> 8.
>> 20.
>> 32.
>> Process [9]
>> 9.
>> 21.
>> 33.
>> Process [10]
>> 10.
>> 22.
>> 34.
>> Process [11]
>> 11.
>> 23.
>> 35.
>>
>> --Junchao Zhang
>>
>>
>> On Tue, Dec 5, 2023 at 10:09 PM Sreeram R Venkat <srven...@utexas.edu>
>> wrote:
>>
>>> Yes, I have an example code at github.com/s769/petsc-test. One thing to
>>> note: when I described the example before, I simplified the actual use case
>>> in the code. Here are the extra details relevant to this code:
>>>
>>>    - We assume a 2D processor grid, given by the command-line args
>>>    -proc_rows and -proc_cols.
>>>    - The total length of the vector is n_m*n_t, given by the command-line
>>>    args -nm and -nt. n_m corresponds to a space index and n_t to a time
>>>    index.
>>>    - In the "Start" phase, the vector is divided into n_m blocks, each of
>>>    size n_t (indexed space->time). The blocks are partitioned over the
>>>    first row of processors: each processor in the first row gets n_m_local
>>>    blocks of size n_t, where the n_m_local values sum to n_m over the first
>>>    row. For example, if -nm = 4 and -proc_cols = 4, each processor in the
>>>    first row gets one block of size n_t.
>>>    - In the "Finish" phase, the vector is divided into n_t blocks, each of
>>>    size n_m (indexed time->space; this is the reason for the permutation of
>>>    indices). The blocks are partitioned over all processors: each processor
>>>    gets n_t_local blocks of size n_m, where the n_t_local values sum to n_t
>>>    over all processors.
>>>
>>> I think the basic idea is similar to the previous example, but these
>>> details make things a bit more complicated. Please let me know if anything
>>> is unclear, and I can try to explain more.
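>>>
>>> In code, the index mapping I have in mind is roughly the following. This is
>>> just a sketch (not the exact code from the repo), and it assumes the global
>>> "Finish" ordering is t*n_m + m and the global "Start" ordering is m*n_t + t:
>>>
>>> /* Sketch: build the IS that pulls each locally owned entry of the
>>>    time-major vector y from the space-major vector x. */
>>> static PetscErrorCode BuildPermutationIS(Vec y, PetscInt n_m, PetscInt n_t, IS *ix)
>>> {
>>>   PetscInt rstart, rend, k, j, *idx;
>>>
>>>   PetscFunctionBeginUser;
>>>   PetscCall(VecGetOwnershipRange(y, &rstart, &rend));
>>>   PetscCall(PetscMalloc1(rend - rstart, &idx));
>>>   for (k = rstart, j = 0; k < rend; k++, j++) {
>>>     PetscInt t = k / n_m;     /* time index of Finish position k  */
>>>     PetscInt m = k % n_m;     /* space index of Finish position k */
>>>     idx[j]     = m * n_t + t; /* where (m, t) sits in the space-major x */
>>>   }
>>>   PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)y), rend - rstart, idx, PETSC_OWN_POINTER, ix));
>>>   PetscFunctionReturn(0);
>>> }
>>>
>>> The resulting IS is what I then pass to VecScatterCreate(x, ix, y, NULL,
>>> &vscat).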
>>>
>>> Thanks for your help,
>>> Sreeram
>>>
>>> On Tue, Dec 5, 2023 at 9:30 PM Junchao Zhang <junchao.zh...@gmail.com>
>>> wrote:
>>>
>>>> I think your approach is correct.  Do you have an example code?
>>>>
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Tue, Dec 5, 2023 at 5:15 PM Sreeram R Venkat <srven...@utexas.edu>
>>>> wrote:
>>>>
>>>>> Hi, I have a follow up question on this.
>>>>>
>>>>> Now, I'm trying to do a scatter and permutation of the vector. Under
>>>>> the same setup as the original example, here are the new Start and Finish
>>>>> states I want to achieve:
>>>>>      Start                        Finish
>>>>> Proc | Entries              Proc | Entries
>>>>>   0  | 0,...,8                0  | 0, 12, 24
>>>>>   1  | 9,...,17               1  | 1, 13, 25
>>>>>   2  | 18,...,26              2  | 2, 14, 26
>>>>>   3  | 27,...,35              3  | 3, 15, 27
>>>>>   4  | None                   4  | 4, 16, 28
>>>>>   5  | None                   5  | 5, 17, 29
>>>>>   6  | None                   6  | 6, 18, 30
>>>>>   7  | None                   7  | 7, 19, 31
>>>>>   8  | None                   8  | 8, 20, 32
>>>>>   9  | None                   9  | 9, 21, 33
>>>>>  10  | None                  10  | 10, 22, 34
>>>>>  11  | None                  11  | 11, 23, 35
>>>>>
>>>>> So far, I've tried to use ISCreateGeneral(), with each process giving
>>>>> an idx array corresponding to the indices it wants (i.e., idx on P0 looks
>>>>> like [0, 12, 24], on P1 like [1, 13, 25], and so on). Then I use this IS
>>>>> to create the VecScatter with VecScatterCreate(x, is, y, NULL, &scatter).
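>>>>>
>>>>> The relevant part looks roughly like this (a simplified sketch, not my
>>>>> exact code; x and y are the source and target vectors):
>>>>>
>>>>> PetscInt    nloc, j, *idx;
>>>>> PetscMPIInt rank, size;
>>>>> IS          is;
>>>>> VecScatter  scatter;
>>>>>
>>>>> PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
>>>>> PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
>>>>> PetscCall(VecGetLocalSize(y, &nloc));                 /* 3 on every rank here */
>>>>> PetscCall(PetscMalloc1(nloc, &idx));
>>>>> for (j = 0; j < nloc; j++) idx[j] = rank + j * size;  /* P0: 0,12,24; P1: 1,13,25; ... */
>>>>> PetscCall(ISCreateGeneral(PETSC_COMM_WORLD, nloc, idx, PETSC_OWN_POINTER, &is));
>>>>> PetscCall(VecScatterCreate(x, is, y, NULL, &scatter));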
>>>>>
>>>>> However, when I try to do the scatter, I get some illegal memory
>>>>> access errors.
>>>>>
>>>>> Is there something wrong with how I define the index sets?
>>>>>
>>>>> Thanks,
>>>>> Sreeram
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Oct 5, 2023 at 12:57 PM Sreeram R Venkat <srven...@utexas.edu>
>>>>> wrote:
>>>>>
>>>>>> Thank you. This works for me.
>>>>>>
>>>>>> Sreeram
>>>>>>
>>>>>> On Wed, Oct 4, 2023 at 6:41 PM Junchao Zhang <junchao.zh...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi, Sreeram,
>>>>>>> You can try this code. Since x and y are both MPI vectors, we just
>>>>>>> need to say we want to scatter x[0:N] to y[0:N]. With your example, the
>>>>>>> 12 index sets on the 12 processes would be [0..8], [9..17], [18..26],
>>>>>>> [27..35], [], ..., []. Actually, you can split them arbitrarily, say,
>>>>>>> into 12 index sets [0..17], [18..35], [], ..., []. PETSc will figure out
>>>>>>> how to do the communication.
>>>>>>>
>>>>>>> PetscInt rstart, rend, N;
>>>>>>> IS ix;
>>>>>>> VecScatter vscat;
>>>>>>> Vec y;
>>>>>>> MPI_Comm comm;
>>>>>>> VecType type;
>>>>>>>
>>>>>>> /* y: same global size and type as x, but with PETSc's default (even) layout */
>>>>>>> PetscObjectGetComm((PetscObject)x, &comm);
>>>>>>> VecGetType(x, &type);
>>>>>>> VecGetSize(x, &N);
>>>>>>> VecGetOwnershipRange(x, &rstart, &rend);
>>>>>>>
>>>>>>> VecCreate(comm, &y);
>>>>>>> VecSetSizes(y, PETSC_DECIDE, N);
>>>>>>> VecSetType(y, type);
>>>>>>>
>>>>>>> /* each rank contributes the contiguous range it owns in x; using the same
>>>>>>>    index set for x and y makes this an identity map between the two layouts */
>>>>>>> ISCreateStride(comm, rend - rstart, rstart, 1, &ix);
>>>>>>> VecScatterCreate(x, ix, y, ix, &vscat);
>>>>>>>
>>>>>>> --Junchao Zhang
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 4, 2023 at 6:03 PM Sreeram R Venkat <srven...@utexas.edu>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Suppose I am running on 12 processors, and I have a vector "v" of
>>>>>>>> size 36 partitioned over the first 4. v still uses PETSC_COMM_WORLD,
>>>>>>>> so it has a layout of (9, 9, 9, 9, 0, 0, ..., 0). Now, I would like to
>>>>>>>> repartition it over all 12 processors, so that the layout becomes
>>>>>>>> (3, 3, 3, ..., 3). I've been trying to use VecScatter to do this, but
>>>>>>>> I'm not sure what index sets to use for the sender and receiver.
>>>>>>>>
>>>>>>>> The result I am trying to achieve is this:
>>>>>>>>
>>>>>>>> Assume the vector is v = <0, 1, 2, ..., 35>
>>>>>>>>
>>>>>>>>      Start                        Finish
>>>>>>>> Proc | Entries              Proc | Entries
>>>>>>>>   0  | 0,...,8                0  | 0, 1, 2
>>>>>>>>   1  | 9,...,17               1  | 3, 4, 5
>>>>>>>>   2  | 18,...,26              2  | 6, 7, 8
>>>>>>>>   3  | 27,...,35              3  | 9, 10, 11
>>>>>>>>   4  | None                   4  | 12, 13, 14
>>>>>>>>   5  | None                   5  | 15, 16, 17
>>>>>>>>   6  | None                   6  | 18, 19, 20
>>>>>>>>   7  | None                   7  | 21, 22, 23
>>>>>>>>   8  | None                   8  | 24, 25, 26
>>>>>>>>   9  | None                   9  | 27, 28, 29
>>>>>>>>  10  | None                  10  | 30, 31, 32
>>>>>>>>  11  | None                  11  | 33, 34, 35
>>>>>>>>
>>>>>>>> Appreciate any help you can provide on this.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sreeram
>>>>>>>>
>>>>>>>
