[hwloc-devel] Create success (hwloc git 1.7.2-26-gaaaf369)

2014-04-23 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc 1.7.2-26-gaaaf369
Start time: Wed Apr 23 21:03:25 EDT 2014
End time:   Wed Apr 23 21:05:43 EDT 2014

Your friendly daemon,
Cyrador


[hwloc-devel] Create success (hwloc git dev-156-g7489287)

2014-04-23 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc dev-156-g7489287
Start time: Wed Apr 23 21:01:01 EDT 2014
End time:   Wed Apr 23 21:03:24 EDT 2014

Your friendly daemon,
Cyrador


Re: [hwloc-devel] PATCH: Mark fd as close-on-exec

2014-04-23 Thread Paul Hargrove
On Wed, Apr 23, 2014 at 4:14 PM, Brice Goglin  wrote:

> This code is only built on Linux


Yes, of course!
I neglected to look at the name of the file in question.

No guard is needed for even my oldest Linux systems.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] Bug report: non-blocking allreduce with user-defined operation gives segfault

2014-04-23 Thread George Bosilca
Rupert,

You are right, none of the non-blocking reduce code was built with
user-level ops in mind. However, I'm not sure about your patch. One
reason is that ompi_3buff is doing target = source1 op source2, while
ompi_2buff is doing target op= source (notice the op=).

Thus you can't replace ompi_3buff by two ompi_2buff calls, because you
would basically replace target = source1 op source2 by target op= source1 op
source2.

Moreover, a much nicer solution would be to patch the
ompi_3buff_op_reduce function in op.h directly so that it falls back to a
user-defined function when necessary.

  George.

On Wed, Apr 23, 2014 at 12:52 PM, Rupert Nash  wrote:
> Hello devel list
>
> I've been trying to use a non-blocking MPI_Iallreduce in a CFD application 
> I'm working on, but it kept segfaulting on me. I have reduced it to a simple 
> test case - see the gist here for the full code
> https://gist.github.com/rupertnash/1182
> build and run with:
> mpicc test.c -o test && mpirun -n 2 ./test
>
> I am working on OS X Mavericks with open-mpi 1.8 built from the source 
> tarball.
>
> Through some debugging I have narrowed the problem down:
> In ompi/mca/coll/libnbc/nbc.c, in NBC_Start_round, where the code switches on 
> which type of operation has been put in the schedule:
>
>   case OP:
>     NBC_DEBUG(5, "  OP   (offset %li) ", (long)ptr-(long)myschedule);
>     NBC_GET_BYTES(ptr,opargs);
>     NBC_DEBUG(5, "*buf1: %p, buf2: %p, count: %i, type: %lu)\n",
>               opargs.buf1, opargs.buf2, opargs.count,
>               (unsigned long)opargs.datatype);
>     /* get buffers */
>     /* SNIP */
> --->ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3, opargs.count,
>                          opargs.datatype);
>     break;
>
> The line marked with an arrow --> is the problem. Looking at the comments 
> describing ompi_3buff_op_reduce, it states "This function will *only* be 
> invoked on intrinsic MPI_Ops." Examining the code bears this out as it's 
> clearly indexing into a table of function pointers, which are all null for a 
> user-defined MPI_Op.
>
> Presumably the fix will be to replace the use of the 3buffer version with the 
> usual ompi_op_reduce, at least for non-intrinsic operations. I have made a 
> temporary patch by replacing the arrowed line with the following:
> if (0 != (opargs.op->o_flags & OMPI_OP_FLAGS_INTRINSIC)) {
>   ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3, opargs.count, 
> opargs.datatype);
> } else {
>   ompi_op_reduce(opargs.op, buf1, buf3, opargs.count, 
> opargs.datatype);
>   ompi_op_reduce(opargs.op, buf2, buf3, opargs.count, 
> opargs.datatype);
> }
> However this is the first time I've looked under the hood of OpenMPI. 
> Hopefully you can patch it properly soon.
>
> Best wishes,
>
> Rupert
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/04/14586.php


Re: [hwloc-devel] PATCH: Mark fd as close-on-exec

2014-04-23 Thread Brice Goglin
This code is only built on Linux so I am not sure we're more portable than OMPI 
here. The oldest Linux we've tested hwloc on is likely your machines ;)
Brice


On 24 April 2014 00:48:46 UTC+02:00, Paul Hargrove  wrote:
>Since I suspect hwloc may run on *more* platforms than ompi, I'd
>recommend
>the guards.
>The X11 sources actually go as far as the following (Stevens notes that
>older systems used '1' before FD_CLOEXEC was specified).
>
>#ifdef F_SETFD
>#ifdef FD_CLOEXEC
>ret = fcntl (fd, F_SETFD, FD_CLOEXEC);
>#else
>ret = fcntl (fd, F_SETFD, 1);
>#endif /* FD_CLOEXEC */
>#endif /* F_SETFD */
>
>-Paul
>
>
>On Wed, Apr 23, 2014 at 3:07 PM, Jeff Squyres (jsquyres)
>> wrote:
>
>> Actually, I just checked around: we have some unprotected FD_CLOEXEC code
>> in OMPI that was committed 2010-08-24 and has never caused a problem.
>>
>> So I'm not thinking it should be necessary here, either.
>>
>>
>> On Apr 23, 2014, at 5:55 PM, Jeff Squyres (jsquyres)
>
>> wrote:
>>
>> > Will do.
>> >
>> > On Apr 23, 2014, at 5:52 PM, Samuel Thibault
>
>> wrote:
>> >
>> >> Jeff Squyres (jsquyres), on Wed 23 Apr 2014 21:05:55 +, wrote:
>> >>> Any objections to this patch?  In OMPI, we're seeing this fd leak
>into
>> child processes.
>> >>>
>> >>> diff --git a/src/topology-linux.c b/src/topology-linux.c
>> >>> index e934d4c..8c5fba1 100644
>> >>> --- a/src/topology-linux.c
>> >>> +++ b/src/topology-linux.c
>> >>> @@ -4601,6 +4601,13 @@ hwloc_linux_component_instantiate(struct
>> hwloc_disc_compo
>> >>>data->is_real_fsroot = 0;
>> >>>  }
>> >>>
>> >>
>> >> We probably want an #ifdef FD_CLOEXEC here, not all systems have
>it.
>> >>
>> >>> +  /* Since this fd stays open after hwloc returns, mark it as
>> >>> + close-on-exec so that children don't inherit it */
>> >>> +  if (fcntl(root, F_SETFD, FD_CLOEXEC) == -1) {
>> >>> +  close(root);
>> >>> +  root = -1;
>> >>> +  goto out_with_data;
>> >>> +  }
>> >>> #else
>> >>>  if (strcmp(fsroot_path, "/")) {
>> >>>errno = ENOSYS;
>> >>>
>> >>> --
>> >>> Jeff Squyres
>> >>> jsquy...@cisco.com
>> >>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >>>
>> >>> ___
>> >>> hwloc-devel mailing list
>> >>> hwloc-de...@open-mpi.org
>> >>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel
>> >>>
>> >>
>> >> --
>> >> Samuel
>> >> I am now the owner of a Compaq Armada 1592DT laptop with an
>> >> infrared port. Do you know of any programs that could use this
>> >> port as a remote control?
>> >> -+- JN in NPC: well, isn't that what it's for?
>> >
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com
>> > For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>
>
>
>-- 
>Paul H. Hargrove  phhargr...@lbl.gov
>Future Technologies Group
>Computer and Data Sciences Department Tel: +1-510-495-2352
>Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
>
>
>


Re: [hwloc-devel] PATCH: Mark fd as close-on-exec

2014-04-23 Thread Paul Hargrove
Currently, POSIX defines exactly one flag accessed via F_GETFD/F_SETFD, and
that is FD_CLOEXEC.
However, it does not prohibit a conforming implementation from defining
additional bits.

So, a portable program should assume other bits may be set and try to
preserve them.

Quoting from section 3.14 of Stevens' Advanced Programming in the UNIX
Environment:

When we modify either the file descriptor flags or the file status flags,
we must be careful to fetch the existing value, modify it as desired, and
then set the new flag value.  We can't simply issue an F_SETFD or F_SETFL
command, as this could turn off flag bits that were previously set.

See also the example in
http://www.gnu.org/software/libc/manual/html_node/Descriptor-Flags.html

-Paul [Who always does what the late W. Richard Stevens says to.]


On Wed, Apr 23, 2014 at 3:11 PM, Jeff Squyres (jsquyres)  wrote:

> We opened the fd a few lines above with default flags -- is the addition
> GETFD necessary?
>
>
> https://github.com/open-mpi/hwloc/blob/master/src/topology-linux.c#L4595
>
>
> On Apr 23, 2014, at 6:04 PM, Paul Hargrove  wrote:
>
> > In order to preserve any existing flags, shouldn't this be more like:
> >   int prev;
> >   if ((-1 == (prev = fcntl(root, F_GETFD, 0))) ||
> >   (-1 == fcntl(root, F_SETFD, FD_CLOEXEC | prev)))
> >
> >
> >
> >
> > On Wed, Apr 23, 2014 at 2:55 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > Will do.
> >
> > On Apr 23, 2014, at 5:52 PM, Samuel Thibault 
> wrote:
> >
> > > Jeff Squyres (jsquyres), on Wed 23 Apr 2014 21:05:55 +, wrote:
> > >> Any objections to this patch?  In OMPI, we're seeing this fd leak
> into child processes.
> > >>
> > >> diff --git a/src/topology-linux.c b/src/topology-linux.c
> > >> index e934d4c..8c5fba1 100644
> > >> --- a/src/topology-linux.c
> > >> +++ b/src/topology-linux.c
> > >> @@ -4601,6 +4601,13 @@ hwloc_linux_component_instantiate(struct
> hwloc_disc_compo
> > >> data->is_real_fsroot = 0;
> > >>   }
> > >>
> > >
> > > We probably want an #ifdef FD_CLOEXEC here, not all systems have it.
> > >
> > >> +  /* Since this fd stays open after hwloc returns, mark it as
> > >> + close-on-exec so that children don't inherit it */
> > >> +  if (fcntl(root, F_SETFD, FD_CLOEXEC) == -1) {
> > >> +  close(root);
> > >> +  root = -1;
> > >> +  goto out_with_data;
> > >> +  }
> > >> #else
> > >>   if (strcmp(fsroot_path, "/")) {
> > >> errno = ENOSYS;
> > >>
> > >> --
> > >> Jeff Squyres
> > >> jsquy...@cisco.com
> > >> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> > >>
> > >>
> > >
> > > --
> > > Samuel
> > > I am now the owner of a Compaq Armada 1592DT laptop with an
> > > infrared port. Do you know of any programs that could use this
> > > port as a remote control?
> > > -+- JN in NPC: well, isn't that what it's for?
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> >
> >
> > --
> > Paul H. Hargrove  phhargr...@lbl.gov
> > Future Technologies Group
> > Computer and Data Sciences Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[hwloc-devel] PATCH: Mark fd as close-on-exec

2014-04-23 Thread Jeff Squyres (jsquyres)
Any objections to this patch?  In OMPI, we're seeing this fd leak into child 
processes.

diff --git a/src/topology-linux.c b/src/topology-linux.c
index e934d4c..8c5fba1 100644
--- a/src/topology-linux.c
+++ b/src/topology-linux.c
@@ -4601,6 +4601,13 @@ hwloc_linux_component_instantiate(struct hwloc_disc_compo
 data->is_real_fsroot = 0;
   }
 
+  /* Since this fd stays open after hwloc returns, mark it as
+ close-on-exec so that children don't inherit it */
+  if (fcntl(root, F_SETFD, FD_CLOEXEC) == -1) {
+  close(root);
+  root = -1;
+  goto out_with_data;
+  }
 #else
   if (strcmp(fsroot_path, "/")) {
 errno = ENOSYS;

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] openmpi and XRC API from ofed-3.12

2014-04-23 Thread Paul Hargrove
As maintainer of a different communications library over Verbs I'd like to
ask:
  Where can one find information on the new APIs and their use?

-Paul


On Wed, Apr 23, 2014 at 7:01 AM, Nathan Hjelm  wrote:

> Yes, we plan to fix support for XRC due to the changes in 3.12. It will
> probably not happen before 1.8.2 though.
>
> -Nathan
>
> On Wed, Apr 23, 2014 at 02:58:49PM +0200, Piotr Lesnicki wrote:
> > Hi,
> >
> > In OFED-3.12 the API for XRC has changed. I did not find
> > corresponding changes in Open MPI: for example the function
> > 'ibv_create_xrc_rcv_qp()' queried in openmpi configure script no
> > longer exists in ofed-3.12-rc1.
> >
> > Are there any plans to support the new XRC API ?
> >
> >
> > --
> > Piotr
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/04/14583.php
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/04/14585.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] Bug report: non-blocking allreduce with user-defined operation gives segfault

2014-04-23 Thread Rupert Nash
Hello devel list 

I've been trying to use a non-blocking MPI_Iallreduce in a CFD application I'm 
working on, but it kept segfaulting on me. I have reduced it to a simple test 
case - see the gist here for the full code
https://gist.github.com/rupertnash/1182
build and run with:
mpicc test.c -o test && mpirun -n 2 ./test

I am working on OS X Mavericks with open-mpi 1.8 built from the source tarball. 

Through some debugging I have narrowed the problem down:
In ompi/mca/coll/libnbc/nbc.c, in NBC_Start_round, where the code switches on 
which type of operation has been put in the schedule:

  case OP:
    NBC_DEBUG(5, "  OP   (offset %li) ", (long)ptr-(long)myschedule);
    NBC_GET_BYTES(ptr,opargs);
    NBC_DEBUG(5, "*buf1: %p, buf2: %p, count: %i, type: %lu)\n",
              opargs.buf1, opargs.buf2, opargs.count,
              (unsigned long)opargs.datatype);
    /* get buffers */
    /* SNIP */
--->ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3, opargs.count,
                         opargs.datatype);
    break;

The line marked with an arrow --> is the problem. Looking at the comments 
describing ompi_3buff_op_reduce, it states "This function will *only* be 
invoked on intrinsic MPI_Ops." Examining the code bears this out as it's 
clearly indexing into a table of function pointers, which are all null for a 
user-defined MPI_Op.

Presumably the fix will be to replace the use of the 3buffer version with the 
usual ompi_op_reduce, at least for non-intrinsic operations. I have made a 
temporary patch by replacing the arrowed line with the following:
if (0 != (opargs.op->o_flags & OMPI_OP_FLAGS_INTRINSIC)) {
  ompi_3buff_op_reduce(opargs.op, buf1, buf2, buf3, opargs.count, 
opargs.datatype);
} else {
  ompi_op_reduce(opargs.op, buf1, buf3, opargs.count, opargs.datatype);
  ompi_op_reduce(opargs.op, buf2, buf3, opargs.count, opargs.datatype);
}
However this is the first time I've looked under the hood of OpenMPI. Hopefully 
you can patch it properly soon.

Best wishes,

Rupert
-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [OMPI devel] openmpi and XRC API from ofed-3.12

2014-04-23 Thread Nathan Hjelm
Yes, we plan to fix support for XRC due to the changes in 3.12. It will
probably not happen before 1.8.2 though.

-Nathan

On Wed, Apr 23, 2014 at 02:58:49PM +0200, Piotr Lesnicki wrote:
> Hi,
> 
> In OFED-3.12 the API for XRC has changed. I did not find
> corresponding changes in Open MPI: for example the function
> 'ibv_create_xrc_rcv_qp()' queried in openmpi configure script no
> longer exists in ofed-3.12-rc1.
> 
> Are there any plans to support the new XRC API ?
> 
> 
> --
> Piotr
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/04/14583.php




[OMPI devel] openmpi and XRC API from ofed-3.12

2014-04-23 Thread Piotr Lesnicki

Hi,

In OFED-3.12 the API for XRC has changed. I did not find
corresponding changes in Open MPI: for example the function
'ibv_create_xrc_rcv_qp()' queried in openmpi configure script no
longer exists in ofed-3.12-rc1.

Are there any plans to support the new XRC API ?


--
Piotr


Re: [OMPI devel] coll/tuned MPI_Bcast can crash or silently fail when using distinct datatypes across tasks

2014-04-23 Thread Gilles Gouaillardet
my bad :-(

this has just been fixed

Gilles

On 2014/04/23 14:55, Nathan Hjelm wrote:
> The ompi_datatype_flatten.c file appears to be missing. Let me know once
> it is committed and I will take a look. I will see if I can write the
> RMA code using it over the next week or so.
>



Re: [OMPI devel] coll/tuned MPI_Bcast can crash or silently fail when using distinct datatypes across tasks

2014-04-23 Thread Nathan Hjelm
The ompi_datatype_flatten.c file appears to be missing. Let me know once
it is committed and I will take a look. I will see if I can write the
RMA code using it over the next week or so.

-Nathan

On Wed, Apr 23, 2014 at 02:43:12PM +0900, Gilles Gouaillardet wrote:
> Nathan,
> 
> i uploaded this part to github :
> https://github.com/ggouaillardet/ompi-svn-mirror/tree/flatten-datatype
> 
> you really need to check the last commit :
> https://github.com/ggouaillardet/ompi-svn-mirror/commit/a8d014c6f144fa5732bdd25f8b6b05b07ea8
> 
> please consider this as experimental and poorly tested.
> that being said, this is only an addition to existing code, so it does not
> break anything and could be pushed to the trunk.
> 
> Gilles
> 
> On 2014/04/23 0:05, Hjelm, Nathan T wrote:
> > I need the flatten datatype call for handling true rdma in the one-sided 
> > code as well. Is there a plan to implement this feature soon?
> >
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/04/14579.php




Re: [OMPI devel] coll/tuned MPI_Bcast can crash or silently fail when using distinct datatypes across tasks

2014-04-23 Thread Gilles Gouaillardet
George,

i am sorry, i cannot see how the flatten datatype can be helpful here :-(

in this example, the master must broadcast a long vector. this datatype
is contiguous, so the flattened datatype *is* the type provided by the
MPI application.

how would pipelining happen in this case (e.g. who has to cut the long
vector into pieces, and how) ?

should a temporary buffer be used ? and then should it be sent in
pieces of type MPI_PACKED ?
(and if yes, would this be safe in a heterogeneous communicator ?)

Thanks in advance for your insights,

Gilles

On 2014/04/22 12:04, George Bosilca wrote:
> Indeed there are many potential solutions, but all require too much
> intervention on the code to be generic enough. As we discussed
> privately mid last year, the "flatten datatype" approach seems to me
> to be the most profitable. It is simple to implement and it is also
> generic: a simple change will make all pipelined collectives work (not
> only tuned but all the others as well).



Re: [OMPI devel] coll/tuned MPI_Bcast can crash or silently fail when using distinct datatypes across tasks

2014-04-23 Thread Gilles Gouaillardet
Nathan,

i uploaded this part to github :
https://github.com/ggouaillardet/ompi-svn-mirror/tree/flatten-datatype

you really need to check the last commit :
https://github.com/ggouaillardet/ompi-svn-mirror/commit/a8d014c6f144fa5732bdd25f8b6b05b07ea8

please consider this as experimental and poorly tested.
that being said, this is only an addition to existing code, so it does not
break anything and could be pushed to the trunk.

Gilles

On 2014/04/23 0:05, Hjelm, Nathan T wrote:
> I need the flatten datatype call for handling true rdma in the one-sided code 
> as well. Is there a plan to implement this feature soon?
>



Re: [OMPI devel] Issues with MPI_Add_error_class()

2014-04-23 Thread George Bosilca
Both these issues are fixed in the trunk and are scheduled for the
1.8. The commit you need to check is r28584 and the corresponding
ticket is
https://svn.open-mpi.org/trac/ompi/ticket/4554

  George.


On Mon, Apr 21, 2014 at 8:45 AM, Lisandro Dalcin  wrote:
> It seems the implementation of MPI_Add_error_class() is out of sync
> with the definition of MPI_ERR_LASTCODE.
>
> Please review the list of error classes in mpi.h and the code in this
> file: 
> https://bitbucket.org/ompiteam/ompi-svn-mirror/src/v1.8/ompi/errhandler/errcode.c
>
> BTW, in that file, all the MPI_T_ERR_XXX are not handled. The MPI-3
> standard says they should be treated as other MPI error classes.
> Trying to get an error string out of them (eg. MPI_T_ERR_MEMORY)
> generates an error.
>
>
>
> [dalcinl@kw2060 openmpi]$ cat add_error_class.c
> #include <mpi.h>
> #include <stdio.h>
> int main(int argc, char *argv[])
> {
>   int errorclass,*lastused,flag;
>   MPI_Init(&argc, &argv);
>   MPI_Add_error_class(&errorclass);
>   MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_LASTUSEDCODE, &lastused, &flag);
>   printf("errorclass:%d lastused:%d MPI_ERR_LASTCODE:%d\n",
> errorclass, *lastused, MPI_ERR_LASTCODE);
>   MPI_Finalize();
>   return 0;
> }
> [dalcinl@kw2060 openmpi]$ mpicc add_error_class.c
> [dalcinl@kw2060 openmpi]$ ./a.out
> errorclass:54 lastused:54 MPI_ERR_LASTCODE:71
>
>
> [dalcinl@kw2060 openmpi]$ cat error_string.c
> #include <mpi.h>
> #include <stdio.h>
> int main(int argc, char *argv[])
> {
>   char errorstring[MPI_MAX_ERROR_STRING];
>   int slen;
>   MPI_Init(&argc, &argv);
>   MPI_Error_string(MPI_T_ERR_MEMORY, errorstring, &slen);
>   printf("errorclass:%d errorstring:%s\n", MPI_T_ERR_MEMORY, errorstring);
>   MPI_Finalize();
>   return 0;
> }
> [dalcinl@kw2060 openmpi]$ mpicc error_string.c
> [dalcinl@kw2060 openmpi]$ ./a.out
> [kw2060:20883] *** An error occurred in MPI_Error_string
> [kw2060:20883] *** reported by process [140737332576257,0]
> [kw2060:20883] *** on communicator MPI_COMM_WORLD
> [kw2060:20883] *** MPI_ERR_ARG: invalid argument of some other kind
> [kw2060:20883] *** MPI_ERRORS_ARE_FATAL (processes in this
> communicator will now abort,
> [kw2060:20883] ***and potentially your MPI job)
>
> --
> Lisandro Dalcin
> ---
> CIMEC (UNL/CONICET)
> Predio CONICET-Santa Fe
> Colectora RN 168 Km 472, Paraje El Pozo
> 3000 Santa Fe, Argentina
> Tel: +54-342-4511594 (ext 1016)
> Tel/Fax: +54-342-4511169
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/04/14564.php