[OMPI devel] Collective communications may be abend when it use over 2GiB buffer
Dear All,

The next feedback is about collective communications.
A collective communication may terminate abnormally (abend) when it uses a buffer larger than 2 GiB.
The problem occurs under the following condition:

  communicator_size * count (scount/rcount) >= 2 GiB

It can occur even on a small PC cluster.
The following is one of the suspicious parts (there is a lot of similar code in ompi/coll/tuned/*.c).

--- in ompi/coll/tuned/coll_tuned_allgather.c, line 398 (V1.4.x) ---
    tmprecv = (char*) rbuf + rank * rcount * rext;

If the condition above is met, "rank * rcount" overflows, because both operands are int and their product is evaluated as an int before it is multiplied by rext.
So we fixed it tentatively as follows (casting int to size_t):

--- in ompi/coll/tuned/coll_tuned_allgather.c ---
    tmprecv = (char*) rbuf + (size_t)rank * rcount * rext;

Not only "ompi/coll/tuned" but also other code has to be fixed to solve this problem.
We are trying to fix it, but the following functions still have a problem (their arguments may overflow):

- "ompi_coll_tuned_sendrecv" may be called with "scount/rcount" over 2 GiB.
- "ompi_datatype_copy_content_same_ddt" may be called with "count" over 2 GiB.
- "basic_linear" in Allgather: Bcast may be called with "count" over 2 GiB.

Best Regards,
Yuki Matsumoto
MPI development team, Fujitsu
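The overflow described in this message can be sketched in isolation. The following is only a minimal illustration, not Open MPI code: the function names are hypothetical, rext is taken as a size_t for simplicity, and a size_t wider than 32 bits is assumed when the offset really exceeds 4 GiB.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if rank * rcount does not fit in an int,
 * i.e. if the original expression would overflow before rext is applied. */
int int_product_overflows(int rank, int rcount)
{
    assert(rank >= 0 && rcount >= 0);
    return rcount != 0 && rank > INT_MAX / rcount;
}

/* Hypothetical helper: byte offset of one rank's block, with every
 * multiplication carried out in size_t (the idea of the tentative fix). */
size_t block_offset(int rank, int rcount, size_t rext)
{
    assert(rank >= 0 && rcount >= 0);
    return (size_t)rank * (size_t)rcount * rext;
}
```

With rank = 4 and rcount = 536870912 (2^29), the int product would be 2^31, one more than INT_MAX, while block_offset returns the intended offset. Casting the first operand is enough in the original expression because the remaining multiplications are then performed in the wider type.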
[OMPI devel] [PATCH]Incorrect algorithm choice using coll_tuned_dynamic_rules_filename (over 2GiB message)
Dear All,

The next feedback is about "coll_tuned_dynamic_rules_filename".
An incorrect algorithm is selected when all of the following conditions hold:

1: "--mca coll_tuned_use_dynamic_rules 1" is set.
2: "--mca coll_tuned_dynamic_rules_filename" is set.
3: A collective communication that is listed in the rules file from 2 is called with a message of 2 GiB or more.
   (ex) MPI_Bcast: data type size * count >= 2 GiB
        MPI_Allgather: data type size * count * communicator size >= 2 GiB

Please see the attached patch (the patch is for V1.4.x).
However, we also found a problem when a message size over 2 GiB is written in the rules file as the "message size": such a value cannot be read correctly. We have not fixed that yet.

Best Regards,
Yuki Matsumoto
MPI development team, Fujitsu

Copyright (c) 2011-2012 FUJITSU LIMITED. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

The copyright holders provide no reassurances that the source code provided does not infringe any patent, copyright, or any other intellectual property rights of third parties. The copyright holders disclaim any liability to any recipient for claims brought against recipient by any third party for infringement of that parties intellectual property rights.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Index: ompi/mca/coll/tuned/coll_tuned_dynamic_rules.c
===
--- ompi/mca/coll/tuned/coll_tuned_dynamic_rules.c (revision 25978)
+++ ompi/mca/coll/tuned/coll_tuned_dynamic_rules.c (working copy)
@@ -350,7 +350,7 @@
  *
  */
-int ompi_coll_tuned_get_target_method_params (ompi_coll_com_rule_t* base_com_rule, int mpi_msgsize, int *result_topo_faninout,
+int ompi_coll_tuned_get_target_method_params (ompi_coll_com_rule_t* base_com_rule, size_t mpi_msgsize, int *result_topo_faninout,
                                               int* result_segsize, int* max_requests)
 {
     ompi_coll_msg_rule_t* msg_p = (ompi_coll_msg_rule_t*) NULL;
Index: ompi/mca/coll/tuned/coll_tuned_dynamic_rules.h
===
--- ompi/mca/coll/tuned/coll_tuned_dynamic_rules.h (revision 25978)
+++ ompi/mca/coll/tuned/coll_tuned_dynamic_rules.h (working copy)
@@ -37,7 +37,7 @@
     int msg_rule_id; /* unique msg rule id */
     /* RULE */
-    int msg_size;/* message size */
+    size_t msg_size; /* message size */
     /* RESULT */
     int result_alg; /* result algorithm to use */
@@ -95,7 +95,7 @@
 ompi_coll_com_rule_t* ompi_coll_tuned_get_com_rule_ptr (ompi_coll_alg_rule_t* rules, int alg_id, int mpi_comsize);
-int ompi_coll_tuned_get_target_method_params (ompi_coll_com_rule_t* base_com_rule, int mpi_msgsize,
+int ompi_coll_tuned_get_target_method_params (ompi_coll_com_rule_t* base_com_rule, size_t mpi_msgsize,
                                               int* result_topo_faninout, int* result_segsize, int* max_requests);
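Why the type change in the patch above matters can be shown with a small stand-alone sketch. It is not the real tuned component: msg_rule_t is a simplified stand-in for ompi_coll_msg_rule_t, both function names are hypothetical, and the lookup policy (take the last rule whose threshold does not exceed the message size, with rules sorted in ascending order) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical rule record (not the real ompi_coll_msg_rule_t). */
typedef struct {
    size_t msg_size;   /* rule applies from this many bytes upward */
    int    result_alg; /* algorithm id selected by this rule */
} msg_rule_t;

/* Total payload in bytes, computed in size_t so it can exceed INT_MAX. */
size_t total_msg_size(size_t type_size, int count, int comm_size)
{
    assert(count >= 0 && comm_size >= 0);
    return type_size * (size_t)count * (size_t)comm_size;
}

/* Assumed policy: rules are sorted by msg_size in ascending order and the
 * last rule whose threshold is <= msg_size wins. Returns 0 if none match. */
int select_alg(const msg_rule_t *rules, size_t n_rules, size_t msg_size)
{
    int alg = 0;
    for (size_t i = 0; i < n_rules; i++) {
        if (rules[i].msg_size <= msg_size) {
            alg = rules[i].result_alg;
        }
    }
    return alg;
}
```

If the message size were carried in an int, a 2 GiB payload could not be represented and would be compared against the thresholds as some other value; keeping both the rule field and the argument in size_t avoids that in this sketch.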
[OMPI devel] [PATCH]Segmentation Fault occurs when the function called from MPI_Comm_spawn_multiple fails
Dear All,

The next feedback is about "MPI_Comm_spawn_multiple".
When a function called from MPI_Comm_spawn_multiple fails, a segmentation fault occurs.
In that case "newcomp" is still NULL, but a member of "newcomp" is dereferenced in the following part (ompi/mpi/c/comm_spawn_multiple.c, lines 176-181):

    /* set array of errorcodes */
    if (MPI_ERRCODES_IGNORE != array_of_errcodes) {
        for ( i=0; i < newcomp->c_remote_group->grp_proc_count; i++ ) {
            array_of_errcodes[i]=rc;
        }
    }

The attached patch fixes it (the patch is for V1.4.x).

Best regards,
Yuki MATSUMOTO
MPI development team, Fujitsu

Index: ompi/mpi/c/comm_spawn_multiple.c
===
--- ompi/mpi/c/comm_spawn_multiple.c (revision 25723)
+++ ompi/mpi/c/comm_spawn_multiple.c (working copy)
@@ -42,7 +42,7 @@
                             int root, MPI_Comm comm, MPI_Comm *intercomm,
                             int *array_of_errcodes)
 {
-    int i=0, rc=0, rank=0, flag;
+    int i=0, rc=0, rank=0, size=0, flag;
     ompi_communicator_t *newcomp=NULL;
     bool send_first=false; /* they are contacting us first */
     char port_name[MPI_MAX_PORT_NAME];
@@ -175,8 +175,18 @@
     /* set array of errorcodes */
     if (MPI_ERRCODES_IGNORE != array_of_errcodes) {
-        for ( i=0; i < newcomp->c_remote_group->grp_proc_count; i++ ) {
-            array_of_errcodes[i]=rc;
+        if (NULL != newcomp) {
+            for ( i=0; i < newcomp->c_remote_group->grp_proc_count; i++ ) {
+                array_of_errcodes[i]=rc;
+            }
+        } else {
+            for ( i=0; i < count; i++) {
+                size = size + array_of_maxprocs[i];
+            }
+
+            for ( i=0; i < size; i++) {
+                array_of_errcodes[i]=rc;
+            }
         }
     }
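The fallback branch of the patch above (used when no communicator was created) can be sketched without MPI. This is only an illustration under assumptions: both function names are hypothetical, and a NULL pointer stands in for MPI_ERRCODES_IGNORE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of error-code slots, i.e. the sum of
 * maxprocs over all commands passed to the spawn call. */
int total_maxprocs(int count, const int array_of_maxprocs[])
{
    int total = 0;
    assert(count >= 0 && array_of_maxprocs != NULL);
    for (int i = 0; i < count; i++) {
        total += array_of_maxprocs[i];
    }
    return total;
}

/* Hypothetical helper: store rc in every slot. A NULL array stands in
 * for MPI_ERRCODES_IGNORE here and is simply skipped. */
void fill_errcodes(int *array_of_errcodes, int n, int rc)
{
    if (array_of_errcodes == NULL) {
        return;
    }
    for (int i = 0; i < n; i++) {
        array_of_errcodes[i] = rc;
    }
}
```

The point of the design is that the slot count is derived only from the caller's own arguments, so it stays available even when the spawn itself failed and there is no remote group to query.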
[OMPI devel] [PATCH]Some typos in error code, func_name and man
Dear All,

We found some typos in an error code string, in function names, and in man pages.
The three attached patches fix them (the patches are for V1.4.x).

Best regards,
Yuki MATSUMOTO
MPI development team, Fujitsu

Index: ompi/errhandler/errcode-internal.c
===
--- ompi/errhandler/errcode-internal.c (revision 25448)
+++ ompi/errhandler/errcode-internal.c (working copy)
@@ -95,7 +95,7 @@
     ompi_err_temp_out_of_resource.code = OMPI_ERR_TEMP_OUT_OF_RESOURCE;
     ompi_err_temp_out_of_resource.mpi_code = MPI_ERR_INTERN;
     ompi_err_temp_out_of_resource.index = pos++;
-    strncpy(ompi_err_temp_out_of_resource.errstring, "MPI_ERR_TEMP_OUT_OF_RESOURCE", OMPI_MAX_ERROR_STRING);
+    strncpy(ompi_err_temp_out_of_resource.errstring, "OMPI_ERR_TEMP_OUT_OF_RESOURCE", OMPI_MAX_ERROR_STRING);
     opal_pointer_array_set_item(&ompi_errcodes_intern, ompi_err_temp_out_of_resource.index,
                                 &ompi_err_temp_out_of_resource);
Index: ompi/mpi/man/man3/MPI_Comm_delete_attr.3in
===
--- ompi/mpi/man/man3/MPI_Comm_delete_attr.3in (revision 25723)
+++ ompi/mpi/man/man3/MPI_Comm_delete_attr.3in (working copy)
@@ -15,7 +15,7 @@
 .SH Fortran Syntax
 .nf
 INCLUDE 'mpif.h'
-MPI_Comm_delete_attr(\fICOMM, COMM_KEYVAL, IERROR\fP)
+MPI_COMM_DELETE_ATTR(\fICOMM, COMM_KEYVAL, IERROR\fP)
 INTEGER \fICOMM, COMM_KEYVAL, IERROR \fP
 .fi
Index: ompi/mpi/man/man3/MPI_Init_thread.3in
===
--- ompi/mpi/man/man3/MPI_Init_thread.3in (revision 25723)
+++ ompi/mpi/man/man3/MPI_Init_thread.3in (working copy)
@@ -20,7 +20,7 @@
 .SH Fortran Syntax
 .nf
 INCLUDE 'mpif.h'
-MPI_INIT(\fIREQUIRED, PROVIDED, IERROR\fP)
+MPI_INIT_THREAD(\fIREQUIRED, PROVIDED, IERROR\fP)
 INTEGER \fIREQUIRED, PROVIDED, IERROR\fP
 .fi
Index: ompi/mpi/man/man3/MPI_Comm_split.3in
===
--- ompi/mpi/man/man3/MPI_Comm_split.3in (revision 25723)
+++ ompi/mpi/man/man3/MPI_Comm_split.3in (working copy)
@@ -54,7 +54,7 @@
 .ft R
 This function partitions the group associated with comm into disjoint subgroups, one for each value of color. Each subgroup contains all processes of the same color. Within each subgroup, the processes are ranked in the order defined by the value of the argument key, with ties broken according to their rank in the old group. A new communicator is created for each subgroup and returned in newcomm. A process may supply the color value MPI_UNDEFINED, in which case newcomm returns MPI_COMM_NULL. This is a collective call, but each process is permitted to provide different values for color and key.
 .sp
-When you call MPI_Comm_split on an inter-communicator, the processes on the left with the same color as those on the right combine to create a new inter-communicator. The key argument describes the relative rank of processes on each side of the inter-communicator. The function returns MPI_COMM_NULL for those colors that are specified on only one side of the inter-communicator, or for those that specify MPI_UNEDEFINED as the color.
+When you call MPI_Comm_split on an inter-communicator, the processes on the left with the same color as those on the right combine to create a new inter-communicator. The key argument describes the relative rank of processes on each side of the inter-communicator. The function returns MPI_COMM_NULL for those colors that are specified on only one side of the inter-communicator, or for those that specify MPI_UNDEFINED as the color.
 .sp
 A call to MPI_Comm_create(\fIcomm\fP, \fIgroup\fP, \fInewcomm\fP) is equivalent to a call to MPI_Comm_split(\fIcomm\fP, \fIcolor\fP,\fI key\fP, \fInewcomm\fP), where all members of \fIgroup\fP provide \fIcolor\fP = 0 and \fIkey\fP = rank in group, and all processes that are not members of \fIgroup\fP provide \fIcolor\fP = MPI_UNDEFINED. The function MPI_Comm_split allows more general partitioning of a group into one or more subgroups with optional reordering.
 .sp
Index: ompi/mpi/man/man3/MPI_Comm_free_keyval.3in
===
--- ompi/mpi/man/man3/MPI_Comm_free_keyval.3in (revision 25723)
+++ ompi/mpi/man/man3/MPI_Comm_free_keyval.3in (working copy)
@@ -39,7 +39,7 @@
 .SH DESCRIPTION
 .ft R
-MPI_Comm_free_keyval frees an extant attribute key. This function sets the value of \fIkeyval\fP to MPI_KEYVAL_INVALID. Note that it is not erroneous to free an attribute key that is in use, because the actual free does not transpire until after all references (in other communicators on the process) to the key have been freed. These references need to be explictly freed by the program, either via calls to MPI_Comm_delete_attr that free one attribute instance, or by calls to MPI_Comm_free that free all attribute instances associated
[OMPI devel] [PATCH] MPI_FILE_SEEK_SHARED is wrong in Fortran
Dear All,

Next is about "MPI_FILE_SEEK_SHARED" in Fortran.
When MPI_FILE_SEEK_SHARED is called from a Fortran program, the shared file pointer is not updated.
The incorrect function call is in the following part, where the wrapper calls MPI_File_seek instead of MPI_File_seek_shared:

--- ompi/mpi/f77/file_seek_shared_f.c (lines 60-67) ---
void mpi_file_seek_shared_f(MPI_Fint *fh, MPI_Offset *offset,
                            MPI_Fint *whence, MPI_Fint *ierr)
{
    MPI_File c_fh = MPI_File_f2c(*fh);

    *ierr = OMPI_INT_2_FINT(MPI_File_seek(c_fh, (MPI_Offset) *offset,
                                          OMPI_FINT_2_INT(*whence)));
}
--- ompi/mpi/f77/file_seek_shared_f.c ---

The attached patch fixes it (the patch is for V1.4.x).

Best regards,
Yuki MATSUMOTO
MPI development team, Fujitsu

Index: ompi/mpi/f77/file_seek_shared_f.c
===
--- ompi/mpi/f77/file_seek_shared_f.c (revision 25723)
+++ ompi/mpi/f77/file_seek_shared_f.c (working copy)
@@ -62,6 +62,6 @@
 {
     MPI_File c_fh = MPI_File_f2c(*fh);
-    *ierr = OMPI_INT_2_FINT(MPI_File_seek(c_fh, (MPI_Offset) *offset,
+    *ierr = OMPI_INT_2_FINT(MPI_File_seek_shared(c_fh, (MPI_Offset) *offset,
                                           OMPI_FINT_2_INT(*whence)));
 }
[OMPI devel] Violating standard in MPI_Close_port
Dear All,

Next is a question about "MPI_Close_port".
According to the MPI-2.2 standard, the "port_name" argument of MPI_Close_port() is marked as 'IN'.
However, in Open MPI (both trunk and 1.4.x), the content of "port_name" is modified inside MPI_Close_port().
This seems to violate the MPI standard.
The following is the suspicious part.

--- ompi/mca/dpm/orte/dpm_orte.c (lines 919-924) ---
static int close_port(char *port_name)
{
    /* the port name is a pointer to an array - DO NOT FREE IT! */
    memset(port_name, 0, MPI_MAX_PORT_NAME);
    return OMPI_SUCCESS;
}
--- ompi/mca/dpm/orte/dpm_orte.c ---

This memset effectively makes "port_name" an 'INOUT' argument.
Could you tell me why this memset is called?

Best regards,
Yuki MATSUMOTO
MPI development team, Fujitsu
Re: [OMPI devel] Incorrect and undefined return code/function/data type at C++ header
Dear All,

I have fixed the patch (MPI::Fint etc.), so please replace the previous patch with this one.

Best regards.
---
Yuki MATSUMOTO
MPI development team, Fujitsu

(2011/12/09 11:35), Y.MATSUMOTO wrote:

Dear Jeff and all,

Thank you for your comment. I'm sorry for not replying sooner.

1: MPI::Fint
We checked the C++ headers against the MPI-2.1 standard, so the MPI::Fint definition is not needed. (Please remove it!)

2: MPI::Grequest::Start
Sorry! I sent you an incorrect list.

Best regards.
---
Yuki MATSUMOTO
MPI development team, Fujitsu

(2011/12/06 1:35), Jeff Squyres wrote:

Many thanks for the patch! Two minor points:

1. I do not believe that MPI::Fint exists. It's surprising, but I'm pretty sure we double checked this back in the MPI-2.2 timeframe and came to the conclusions that a) it does not exist, and b) it should not exist, because all C++<--> Fortran interaction is supposed to go through the C translation routines.

2. Grequest::Start is a static function on the MPI namespace -- it is not marked "const" in MPI 2.1 or 2.2 (I don't see it in the patch, either).

On Dec 4, 2011, at 9:31 PM, Y.MATSUMOTO wrote:

Dear all,

Here is our next feedback. It is about the C++ header files.
In ompi/mpi/cxx/*.h, some definitions of return codes, data types, and functions are missing or incorrect.
The attached patch fixes them (this patch is for V1.4.x).
The following lists show what is missing and what is incorrect.

* Undefined return codes
--
MPI::ERR_ACCESS
MPI::ERR_AMODE
MPI::ERR_ASSERT
MPI::ERR_BAD_FILE
MPI::ERR_CONVERSION
MPI::ERR_DISP
MPI::ERR_DUP_DATAREP
MPI::ERR_FILE_EXISTS
MPI::ERR_FILE_IN_USE
MPI::ERR_FILE
MPI::ERR_INFO
MPI::ERR_IO
MPI::ERR_LOCKTYPE
MPI::ERR_NOT_SAME
MPI::ERR_NO_SPACE
MPI::ERR_NO_SUCH_FILE
MPI::ERR_PORT
MPI::ERR_QUOTA
MPI::ERR_READ_ONLY
MPI::ERR_RMA_CONFLICT
MPI::ERR_RMA_SYNC
MPI::ERR_SIZE
MPI::ERR_UNSUPPORTED_DATAREP
MPI::ERR_UNSUPPORTED_OPERATION
--
* Undefined data types
--
MPI::LONG_LONG_INT
MPI::Fint
MPI::F_DOUBLE_COMPLEX
--
* Undefined functions
--
MPI::Datatype::Create_darray
MPI::Datatype::Pack_external
MPI::Datatype::Pack_external_size
MPI::Datatype::Unpack_external
MPI::Add_error_class
MPI::Add_error_code
MPI::Add_error_string
MPI::Datatype::Create_f90_complex
MPI::Datatype::Create_f90_integer
MPI::Datatype::Create_f90_real
MPI::Datatype::Match_size
--
* Incorrect definitions (the MPI-2.1 standard defines these as "const", but they are not "const" in the code)
--
MPI::Intercomm::Merge
MPI::Cartcomm::Sub
MPI::Grequest::Start
--
* Incorrect definitions (the MPI-2.1 standard defines these as not "const", but they are "const" in the code)
--
MPI::Comm::Set_errhandler
MPI::File::Set_errhandler
MPI::Win::Set_errhandler
--

Best regards.
--
Yuki MATSUMOTO
MPI development team, Fujitsu

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

Index: ompi/mpi/cxx/comm.h
===
--- ompi/mpi/cxx/comm.h (revision 25570)
+++ ompi/mpi/cxx/comm.h (working copy)
@@ -382,7 +382,7 @@
   static Errhandler Create_errhandler(Comm::Errhandler_fn* function);
-  virtual void Set_errhandler(const Errhandler& errhandler) const;
+  virtual void Set_errhandler(const Errhandler& errhandler);
   virtual Errhandler Get_errhandler() const;
Index: ompi/mpi/cxx/topology_inln.h
===
--- ompi/mpi/cxx/topology_inln.h (revision 25570)
+++ ompi/mpi/cxx/topology_inln.h (working copy)
@@ -99,7 +99,7 @@
 }
 inline MPI::Cartcomm
-MPI::Cartcomm::Sub(const bool remain_dims[])
+MPI::Cartcomm::Sub(const bool remain_dims[]) const
 {
   int ndims;
   MPI_Cartdim_get(mpi_comm, &ndims);
Index: ompi/mpi/cxx/intercomm.h
===
--- ompi/mpi/cxx/intercomm.h (revision 25570)
+++ ompi/mpi/cxx/intercomm.h (working copy)
@@ -77,7 +77,7 @@
   virtual Group Get_remote_group() const;
-  virtual Intracomm Merge(bool high);
+  virtual Intracomm Merge(bool high) const;
   virtual Intercomm Create(const Group& group) const;
Index: ompi/mpi/cxx/mpicxx.cc
===
--- ompi/mpi/cxx/mpicxx.cc (revision 25570)
+++ ompi/mpi/cxx/mpicxx.cc (workin
Re: [OMPI devel] Incorrect and undefined return code/function/data type at C++ header
Dear Jeff and all,

Thank you for your comment. I'm sorry for not replying sooner.

1: MPI::Fint
We checked the C++ headers against the MPI-2.1 standard, so the MPI::Fint definition is not needed. (Please remove it!)

2: MPI::Grequest::Start
Sorry! I sent you an incorrect list.

Best regards.
---
Yuki MATSUMOTO
MPI development team, Fujitsu

(2011/12/06 1:35), Jeff Squyres wrote:

Many thanks for the patch! Two minor points:

1. I do not believe that MPI::Fint exists. It's surprising, but I'm pretty sure we double checked this back in the MPI-2.2 timeframe and came to the conclusions that a) it does not exist, and b) it should not exist, because all C++<--> Fortran interaction is supposed to go through the C translation routines.

2. Grequest::Start is a static function on the MPI namespace -- it is not marked "const" in MPI 2.1 or 2.2 (I don't see it in the patch, either).

On Dec 4, 2011, at 9:31 PM, Y.MATSUMOTO wrote:

Dear all,

Here is our next feedback. It is about the C++ header files.
In ompi/mpi/cxx/*.h, some definitions of return codes, data types, and functions are missing or incorrect.
The attached patch fixes them (this patch is for V1.4.x).
The following lists show what is missing and what is incorrect.

* Undefined return codes
--
MPI::ERR_ACCESS
MPI::ERR_AMODE
MPI::ERR_ASSERT
MPI::ERR_BAD_FILE
MPI::ERR_CONVERSION
MPI::ERR_DISP
MPI::ERR_DUP_DATAREP
MPI::ERR_FILE_EXISTS
MPI::ERR_FILE_IN_USE
MPI::ERR_FILE
MPI::ERR_INFO
MPI::ERR_IO
MPI::ERR_LOCKTYPE
MPI::ERR_NOT_SAME
MPI::ERR_NO_SPACE
MPI::ERR_NO_SUCH_FILE
MPI::ERR_PORT
MPI::ERR_QUOTA
MPI::ERR_READ_ONLY
MPI::ERR_RMA_CONFLICT
MPI::ERR_RMA_SYNC
MPI::ERR_SIZE
MPI::ERR_UNSUPPORTED_DATAREP
MPI::ERR_UNSUPPORTED_OPERATION
--
* Undefined data types
--
MPI::LONG_LONG_INT
MPI::Fint
MPI::F_DOUBLE_COMPLEX
--
* Undefined functions
--
MPI::Datatype::Create_darray
MPI::Datatype::Pack_external
MPI::Datatype::Pack_external_size
MPI::Datatype::Unpack_external
MPI::Add_error_class
MPI::Add_error_code
MPI::Add_error_string
MPI::Datatype::Create_f90_complex
MPI::Datatype::Create_f90_integer
MPI::Datatype::Create_f90_real
MPI::Datatype::Match_size
--
* Incorrect definitions (the MPI-2.1 standard defines these as "const", but they are not "const" in the code)
--
MPI::Intercomm::Merge
MPI::Cartcomm::Sub
MPI::Grequest::Start
--
* Incorrect definitions (the MPI-2.1 standard defines these as not "const", but they are "const" in the code)
--
MPI::Comm::Set_errhandler
MPI::File::Set_errhandler
MPI::Win::Set_errhandler
--

Best regards.
--
Yuki MATSUMOTO
MPI development team, Fujitsu
[OMPI devel] Incorrect and undefined return code/function/data type at C++ header
Dear all,

Here is our next feedback. It is about the C++ header files.
In ompi/mpi/cxx/*.h, some definitions of return codes, data types, and functions are missing or incorrect.
The attached patch fixes them (this patch is for V1.4.x).
The following lists show what is missing and what is incorrect.

* Undefined return codes
--
MPI::ERR_ACCESS
MPI::ERR_AMODE
MPI::ERR_ASSERT
MPI::ERR_BAD_FILE
MPI::ERR_CONVERSION
MPI::ERR_DISP
MPI::ERR_DUP_DATAREP
MPI::ERR_FILE_EXISTS
MPI::ERR_FILE_IN_USE
MPI::ERR_FILE
MPI::ERR_INFO
MPI::ERR_IO
MPI::ERR_LOCKTYPE
MPI::ERR_NOT_SAME
MPI::ERR_NO_SPACE
MPI::ERR_NO_SUCH_FILE
MPI::ERR_PORT
MPI::ERR_QUOTA
MPI::ERR_READ_ONLY
MPI::ERR_RMA_CONFLICT
MPI::ERR_RMA_SYNC
MPI::ERR_SIZE
MPI::ERR_UNSUPPORTED_DATAREP
MPI::ERR_UNSUPPORTED_OPERATION
--
* Undefined data types
--
MPI::LONG_LONG_INT
MPI::Fint
MPI::F_DOUBLE_COMPLEX
--
* Undefined functions
--
MPI::Datatype::Create_darray
MPI::Datatype::Pack_external
MPI::Datatype::Pack_external_size
MPI::Datatype::Unpack_external
MPI::Add_error_class
MPI::Add_error_code
MPI::Add_error_string
MPI::Datatype::Create_f90_complex
MPI::Datatype::Create_f90_integer
MPI::Datatype::Create_f90_real
MPI::Datatype::Match_size
--
* Incorrect definitions (the MPI-2.1 standard defines these as "const", but they are not "const" in the code)
--
MPI::Intercomm::Merge
MPI::Cartcomm::Sub
MPI::Grequest::Start
--
* Incorrect definitions (the MPI-2.1 standard defines these as not "const", but they are "const" in the code)
--
MPI::Comm::Set_errhandler
MPI::File::Set_errhandler
MPI::Win::Set_errhandler
--

Best regards.
--
Yuki MATSUMOTO
MPI development team, Fujitsu

Index: ompi/mpi/cxx/comm.h
===
--- ompi/mpi/cxx/comm.h (revision 25518)
+++ ompi/mpi/cxx/comm.h (working copy)
@@ -11,6 +11,7 @@
 // Copyright (c) 2004-2005 The Regents of the University of California.
 // All rights reserved.
 // Copyright (c) 2006-2008 Cisco Systems, Inc. All rights reserved.
+// Copyright (c) 2011 FUJITSU LIMITED. All rights reserved.
 // $COPYRIGHT$
 //
 // Additional copyrights may follow
@@ -382,7 +383,7 @@
   static Errhandler Create_errhandler(Comm::Errhandler_fn* function);
-  virtual void Set_errhandler(const Errhandler& errhandler) const;
+  virtual void Set_errhandler(const Errhandler& errhandler);
   virtual Errhandler Get_errhandler() const;
Index: ompi/mpi/cxx/topology_inln.h
===
--- ompi/mpi/cxx/topology_inln.h (revision 25518)
+++ ompi/mpi/cxx/topology_inln.h (working copy)
@@ -11,6 +11,7 @@
 // Copyright (c) 2004-2005 The Regents of the University of California.
 // All rights reserved.
 // Copyright (c) 2007 Sun Microsystems, Inc. All rights reserved.
+// Copyright (c) 2011 FUJITSU LIMITED. All rights reserved.
 // $COPYRIGHT$
 //
 // Additional copyrights may follow
@@ -99,7 +100,7 @@
 }
 inline MPI::Cartcomm
-MPI::Cartcomm::Sub(const bool remain_dims[])
+MPI::Cartcomm::Sub(const bool remain_dims[]) const
 {
   int ndims;
   MPI_Cartdim_get(mpi_comm, &ndims);
Index: ompi/mpi/cxx/intercomm.h
===
--- ompi/mpi/cxx/intercomm.h (revision 25518)
+++ ompi/mpi/cxx/intercomm.h (working copy)
@@ -11,6 +11,7 @@
 // Copyright (c) 2004-2005 The Regents of the University of California.
 // All rights reserved.
 // Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
+// Copyright (c) 2011 FUJITSU LIMITED. All rights reserved.
 // $COPYRIGHT$
 //
 // Additional copyrights may follow
@@ -77,7 +78,7 @@
   virtual Group Get_remote_group() const;
-  virtual Intracomm Merge(bool high);
+  virtual Intracomm Merge(bool high) const;
   virtual Intercomm Create(const Group& group) const;
Index: ompi/mpi/cxx/mpicxx.cc
===
--- ompi/mpi/cxx/mpicxx.cc (revision 25518)
+++ ompi/mpi/cxx/mpicxx.cc (working copy)
@@ -12,6 +12,7 @@
 // All rights reserved.
 // Copyright (c) 2007-2009 Cisco Systems, Inc. All rights reserved.
 // Copyright (c) 2007 Sun Microsystems, Inc. All rights reserved.
+// Copyright (c) 2011 FUJITSU LIMITED. All rights reserved.
 // $COPYRIGHT$
 //
 // Additional copyrights may follow
@@ -102,11 +103,13 @@
 // optional datatype (C / C++)
 const Datatype UNSIGNED_LONG_LONG(MPI_UNSIGNED_LONG_LONG);
 const Datatype
Re: [OMPI devel] "Open MPI"-based MPI library used by K computer
Dear Open MPI community,

I'm a member of the MPI library development team at Fujitsu. Takahiro Kawashima, who sent mail before, is my colleague.
We are starting to feed back our fixes.

First, we fixed a problem involving MPI_LB/MPI_UB and data packing.

A program crashes when it meets all of the following conditions:
a: The type of the data being sent is a contiguous derived type.
b: Either or both of MPI_LB and MPI_UB are used in the data type.
c: The size of the data type is smaller than its extent (the data type has a gap).
d: The send count is bigger than 1.
e: The total size of the data is bigger than the "eager limit".

The attached C program reproduces the problem.

An access to an incorrect address occurs because "done" takes an unintended value and "max_allowed" becomes negative at the following place in ompi/datatype/datatype_pack.c (lines 188-198 in version 1.4.3):

    packed_buffer = (unsigned char *) iov[iov_count].iov_base;
    done = pConv->bConverted - i * pData->size;  /* partial data from last pack */
    if( done != 0 ) {  /* still some data to copy from the last time */
        done = pData->size - done;
        OMPI_DDT_SAFEGUARD_POINTER( user_memory, done, pConv->pBaseBuf, pData, pConv->count );
        MEMCPY_CSUM( packed_buffer, user_memory, done, pConv );
        packed_buffer += done;
        max_allowed -= done;
        total_bytes_converted += done;
        user_memory += (extent - pData->size + done);
    }

This code assumes that "done" is the size of the partial data left over from the last pack.
However, when the program crashes, "done" equals the sum of all the data transmitted so far, which makes "max_allowed" negative.

We modified the code as follows, and it passed our test suite.
But we are not sure this fix is correct. Can anyone review it?
The patch (against the Open MPI 1.4 branch) is attached to this mail.

-if( done != 0 ) { /* still some data to copy from the last time */
+if( (done + max_allowed) >= pData->size ) { /* still some data to copy from the last time */

Best regards,
Yuki MATSUMOTO
MPI development team, Fujitsu

(2011/06/28 10:58), Takahiro Kawashima wrote:

Dear Open MPI community,

I'm a member of the MPI library development team at Fujitsu. Shinji Sumimoto, whose name appears in Jeff's blog, is one of our bosses.

As Rayson and Jeff noted, the K computer, the world's most powerful HPC system, developed by RIKEN and Fujitsu, uses Open MPI as the base of its MPI library. We, Fujitsu, are pleased to announce that, and we also give special thanks to the Open MPI community.
We are sorry for the late announcement!

Our MPI library is based on the Open MPI 1.4 series, and has a new point-to-point component (BTL) and new topology-aware collective communication algorithms (COLL). Also, it is adapted to our runtime environment (ESS, PLM, GRPCOMM etc).

The K computer connects 68,544 nodes by our custom interconnect.
Its runtime environment is our proprietary one, so we don't use orted. We cannot tell the start-up time yet because of disclosure restrictions, sorry.

We are surprised by the extensibility of Open MPI, and have proved that Open MPI is scalable to the 68,000-process level! We are pleased to use such great open-source software.

We cannot tell the details of our technology yet because of our contract with RIKEN AICS; however, we plan to feed back our improvements and bug fixes. We can contribute some bug fixes soon; contribution of our improvements, however, will be next year, with the Open MPI agreement.

Best regards,

MPI development team,
Fujitsu

I got more information:

http://blogs.cisco.com/performance/open-mpi-powers-8-petaflops/

Short version: yes, Open MPI is used on K and was used to power the 8PF runs.

w00t!

On Jun 24, 2011, at 7:16 PM, Jeff Squyres wrote:

w00t!

OMPI powers 8 petaflops!
(at least I'm guessing that -- does anyone know if that's true?)

On Jun 24, 2011, at 7:03 PM, Rayson Ho wrote:

Interesting... page 11:

http://www.fujitsu.com/downloads/TC/sc10/programming-on-k-computer.pdf

Open MPI based:

* Open Standard, Open Source, Multi-Platform including PC Cluster.
* Adding extension to Open MPI for "Tofu" interconnect

Rayson

Index: ompi/datatype/datatype_pack.c
===
--- ompi/datatype/datatype_pack.c (revision 25474)
+++ ompi/datatype/datatype_pack.c (working copy)
@@ -187,7 +187,7 @@
     packed_buffer = (unsigned char *) iov[iov_count].iov_base;
     done = pConv->bConverted - i * pData->size; /* partial data from last pack */
-if( done != 0 ) { /* still some data to copy from the last time */
+if( (done + max_allowed) >= pData->size ) { /* still some data to