[OMPI users] 1D and 2D arrays allocated with malloc(): MPI_Send and MPI_Recv problem
Hello all

Sorry, maybe this is a stupid question, but I have a big problem with malloc() and matrix arrays. I want to write a program that does a very simple thing: matrixA * matrixB = matrixC. Because I need matrices larger than 100x100 (up to 5000x5000), I have to use malloc() for memory allocation.

First I tried this way. The typical form for dynamically allocating an NxM array of type T is:

    T **a = malloc(sizeof *a * N);
    if (a)
    {
        for (i = 0; i < N; i++)
        {
            a[i] = malloc(sizeof *a[i] * M);
        }
    }
    // the arrays are created before the work is split across the nodes

There is no problem creating and filling the arrays, but the problem starts when I have to send and receive them. Of course, before the send I calculate "count" for MPI_Send. To be sure that the count for MPI_Send and MPI_Recv is the same, I also send "count":

    count = rows*matrix_size*sizeof (double);  // part of matrix
    MPI_Send(&count, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
    MPI_Send(&matrixA[offset][0], count, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);

On the worker side the code looks like:

    MPI_Recv(&countA, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
    MPI_Recv(&matrixA[0][0], countA, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);

The error looks like:

[pawcioj-VirtualBox:01700] *** Process received signal ***
[pawcioj-VirtualBox:01700] Signal: Segmentation fault (11)
[pawcioj-VirtualBox:01700] Signal code: Address not mapped (1)
[pawcioj-VirtualBox:01700] Failing at address: 0x88fa000
[pawcioj-VirtualBox:01700] [ 0] [0xc2740c]
[pawcioj-VirtualBox:01700] [ 1] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x906c) [0x17606c]
[pawcioj-VirtualBox:01700] [ 2] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x6a1b) [0x173a1b]
[pawcioj-VirtualBox:01700] [ 3] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(+0x3ae6) [0x7b7ae6]
[pawcioj-VirtualBox:01700] [ 4] /usr/lib/libopen-pal.so.0(opal_progress+0x81) [0x406fa1]
[pawcioj-VirtualBox:01700] [ 5] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x48e5) [0x1718e5]
[pawcioj-VirtualBox:01700] [ 6] /usr/lib/libmpi.so.0(MPI_Recv+0x165) [0x1ef9d5]
[pawcioj-VirtualBox:01700] [ 7] macierz_V.02(main+0x927) [0x8049870]
[pawcioj-VirtualBox:01700] [ 8] /lib/libc.so.6(__libc_start_main+0xe7) [0xddfce7]
[pawcioj-VirtualBox:01700] [ 9] macierz_V.02() [0x8048b71]
[pawcioj-VirtualBox:01700] *** End of error message ***
--
mpirun noticed that process rank 1 with PID 1700 on node pawcioj-VirtualBox exited on signal 11 (Segmentation fault).

Because I got no result, I tried to do the same thing with a 1D array, but the problem seems similar.

I am probably doing something wrong, so I would like to ask for advice on how to do this properly, or maybe for a link to a useful tutorial. I have spent two weeks trying to figure it out, unfortunately without result :(.

--
regards,

Paweł Jaromin
Re: [OMPI users] 1D and 2D arrays allocated with malloc(): MPI_Send and MPI_Recv problem
All MPI operations (including MPI_Send and MPI_Recv) consider any type of buffer (input and output) as a contiguous entity. Therefore, you have two options:

1. Have a loop around MPI_Send & MPI_Recv similar to the allocation section.

2. Build an MPI datatype representing the non-contiguous memory layout you really intend to send between peers.

  george.
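(A minimal sketch of option 1, not taken from the original messages: it reuses the row-by-row allocation from the first post and assumes the same variable names, i.e. rows, matrix_size, offset, dest, source, mtype and status.)

    /* sketch: send the block of rows one row at a time, since each malloc'ed
       row is contiguous on its own but the rows are not contiguous with
       each other */
    for (i = 0; i < rows; i++) {
        MPI_Send(&matrixA[offset + i][0], matrix_size, MPI_DOUBLE,
                 dest, mtype, MPI_COMM_WORLD);
    }

    /* worker side: receive into the locally allocated rows, one at a time */
    for (i = 0; i < rows; i++) {
        MPI_Recv(&matrixA[i][0], matrix_size, MPI_DOUBLE,
                 source, mtype, MPI_COMM_WORLD, &status);
    }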
Re: [OMPI users] 1D and 2D arrays allocated with malloc(): MPI_Send and MPI_Recv problem
Hi

2012/8/7 George Bosilca:
> All MPI operations (including MPI_Send and MPI_Recv) consider any type of
> buffer (input and output) as a contiguous entity.

I tried using a 1D array (instead of a 2D one) to get contiguous data, but the result was the same :(

> Therefore, you have two options:
>
> 1. Have a loop around MPI_Send & MPI_Recv similar to the allocation section.

How do you see this loop? Byte by byte?

> 2. Build an MPI datatype representing the non-contiguous memory layout you
> really intend to send between peers.

I am not a good enough programmer to do that :( As you can see, I am a beginner. Could you give more details on how to do it? An example?

--
regards,

Paweł Jaromin
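(Not from the thread itself, just an illustrative sketch of another common approach: allocate the matrix as one contiguous block and keep an array of row pointers into it, so a single MPI_Send/MPI_Recv can cover several rows at once. The names N, M, rows, offset, dest, source, mtype and status are assumed to follow the quoted code.)

    /* sketch: one contiguous block of N*M doubles, plus row pointers so the
       a[i][j] syntax still works; consecutive rows now follow each other
       in memory */
    double *data = malloc(sizeof *data * N * M);
    double **a   = malloc(sizeof *a * N);
    if (data && a) {
        for (i = 0; i < N; i++)
            a[i] = data + (size_t)i * M;
    }

    /* master: "rows" consecutive rows starting at "offset" are now one
       contiguous span of rows*M doubles */
    MPI_Send(&a[offset][0], rows * M, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);

    /* worker: receive them into the start of its own contiguous block */
    MPI_Recv(&a[0][0], rows * M, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);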
Re: [OMPI users] 1D and 2D arrays allocated with malloc(): MPI_Send and MPI_Recv problem
Look at this declaration:

    int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest,
                 int tag, MPI_Comm comm)

Here "count" is the number of elements (not bytes!) in the send buffer (a nonnegative integer).

Your "count" was defined as

    count = rows*matrix_size*sizeof (double);

and seems to be erroneous; the variable "count" cannot depend on the size of the matrix element!

Z Koza
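(A sketch of the corrected count, following the remark above; it keeps the variable names from the original post and assumes the matrix data being sent is contiguous in memory, otherwise the row-by-row loop discussed earlier in the thread is still needed.)

    count = rows * matrix_size;   /* number of MPI_DOUBLE elements, no sizeof here */
    MPI_Send(&count, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
    MPI_Send(&matrixA[offset][0], count, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);

    /* worker side */
    MPI_Recv(&countA, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
    MPI_Recv(&matrixA[0][0], countA, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);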