This may be me miscommunicating with Mike off-list. I had suggested he add
this "feature" to help catch a rare race condition in his MTT runs.
However, I had expected him to do it on his private branch, not commit it
to the main repo.

I agree that I'm not sure what I think about it for the trunk. Hitting that
code path is indicative of a bug, but if someone hits that bug at scale,
generating core files from every affected process can be really bad.
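
For illustration only, here is a minimal sketch of one way to keep the
default non-fatal behavior while still letting a debugging run drop a core
file on demand. Everything in it is an assumption made for the example: the
list types, the function names, and the DEBUG_LIST_ABORT environment variable
are hypothetical and are not the real opal_list API or an existing Open MPI
setting.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical minimal singly linked list; NOT the real opal_list_t. */
typedef struct item {
    struct item *next;
} item_t;

typedef struct list {
    item_t *head;
} list_t;

/* Hypothetical opt-in switch: returns nonzero only when the (made-up)
 * DEBUG_LIST_ABORT environment variable is set to something other
 * than an empty string or a value starting with '0'. */
static int abort_on_missing_item(void)
{
    const char *v = getenv("DEBUG_LIST_ABORT");
    return v != NULL && v[0] != '\0' && v[0] != '0';
}

/* Push an item onto the front of the list. */
static void list_prepend(list_t *list, item_t *item)
{
    item->next = list->head;
    list->head = item;
}

/* Remove an item from the list.
 * Returns the item on success, or NULL if it was not on the list. */
static item_t *list_remove_item(list_t *list, item_t *item)
{
    assert(list != NULL && item != NULL);

    /* Walk the chain of "next" pointers looking for the item. */
    item_t **link = &list->head;
    while (*link != NULL) {
        if (*link == item) {
            *link = item->next;   /* unlink it */
            item->next = NULL;
            return item;
        }
        link = &(*link)->next;
    }

    /* Not found: always warn. */
    fprintf(stderr, "Warning: list_remove_item: item %p is not on list %p\n",
            (void *) item, (void *) list);
    fflush(stderr);

    if (abort_on_missing_item()) {
        abort();      /* opt-in only: terminate so a core file can be written */
    }
    return NULL;      /* default: report the problem but keep running */
}
```

With the variable unset, removing an item that is not on the list only prints
the warning and returns NULL, so a large job does not generate a core file
per process. A compile-time debug flag would be another way to gate the
abort(); the environment variable is used here only because it can be turned
on for a single test run without rebuilding.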


On Tue, Oct 7, 2014 at 5:54 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

> I'm not sure how I feel about this commit:
>
> 1. It blindly ignores the "return" statement.  I.e., if the intent for
> this commit was to kill the process, that "return" statement should have
> been deleted, too.
>
> 2. We clearly decided a long time ago that removing an item from a list
> from which it does not belong is NOT a fatal error.  This commit is a
> fundamental change in behavior that really should have been RFC'ed (e.g., I
> RFC'ed the calloc-vs-malloc idea last week).
>
> I'm not saying that this is a bad change in core behavior, but I would
> have appreciated a little heads-up and a chance to think about it before it
> was made (I'm still not sure what I think about this).
>
>
>
> On Oct 7, 2014, at 7:09 AM, <git...@crest.iu.edu> <git...@crest.iu.edu>
> wrote:
>
> > This is an automated email from the git hooks/post-receive script. It was
> > generated because a ref change was pushed to the repository containing
> > the project "open-mpi/ompi".
> >
> > The branch, master has been updated
> >       via  86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c (commit)
> >      from  cd48fbeec67f1a511a9cf5ce890fef6cc535ef60 (commit)
> >
> > Those revisions listed above that are new to this repository have
> > not appeared on any other notification email; so we list those
> > revisions in full, below.
> >
> > - Log -----------------------------------------------------------------
> >
> > https://github.com/open-mpi/ompi/commit/86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c
> >
> > commit 86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c
> > Author: Mike Dubman <mi...@mellanox.com>
> > Date:   Tue Oct 7 14:07:41 2014 +0300
> >
> >    OPAL: drop dead with core on bad flow. rarely happens with helloworld on large scale.
> >
> > diff --git a/opal/class/opal_list.h b/opal/class/opal_list.h
> > index b66438e..bad4cbf 100644
> > --- a/opal/class/opal_list.h
> > +++ b/opal/class/opal_list.h
> > @@ -486,6 +486,7 @@ static inline opal_list_item_t *opal_list_remove_item
> >     if (!found) {
> >         fprintf(stderr," Warning :: opal_list_remove_item - the item %p is not on the list %p \n",(void*) item, (void*) list);
> >         fflush(stderr);
> > +        abort();
> >         return (opal_list_item_t *)NULL;
> >     }
> >
> >
> >
> > -----------------------------------------------------------------------
> >
> > Summary of changes:
> > opal/class/opal_list.h | 1 +
> > 1 file changed, 1 insertion(+)
> >
> >
> > hooks/post-receive
> > --
> > open-mpi/ompi
> > _______________________________________________
> > ompi-commits mailing list
> > ompi-comm...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/ompi-commits
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16019.php
>
