If you have positive confirmation that such things have happened, this will go a long way. I will not trust the code until this has also been done with multiple independent network paths.

I very rarely express such strong opinions, even if I don't agree with what is being done, but this is the core of correct MPI functionality, and first-hand experience has shown that when just thinking through the logic, I can miss some of the race conditions. The code here has been running for 8+ years in two production MPIs on very large clusters, so I am very reluctant to make changes for what seems to amount to people's taste. Maintenance is not an issue in this case. Had this not been such a key bit of code, I would not even bat an eye.

I suppose if you can go through some formal verification, this would also be good, actually better than hoping that one will hit out-of-order situations.

Rich

On 12/14/07 2:20 AM, "Gleb Natapov" <gl...@voltaire.com> wrote:

> On Thu, Dec 13, 2007 at 06:16:49PM -0500, Richard Graham wrote:
>> The situation that needs to be triggered, just as George has mentioned, is
>> where we have a lot of unexpected messages, to make sure that when one that
>> we can match against comes in, all the unexpected messages that can be
>> matched with pre-posted receives are matched. Since we attempt to match
>> only when a new fragment comes in, we need to make sure that we don't leave
>> other unexpected messages that can be matched in the unexpected queue, as
>> these (if the out of order scenario is just right) would block any new
>> matches from occurring.
>>
>> For example: Say the next expected message is 25
>>
>> Unexpected message queue has: 26 28 29 ..
>>
>> If 25 comes in, and is handled, if 26 is not pulled off the unexpected
>> message queue, when 27 comes in it won't be able to be matched, as 26 is
>> sitting in the unexpected queue, and will never be looked at again ...
> This situation is triggered constantly with openib BTL. OpenIB BTL has
> two ways to receive a packet: over a send queue or over an eager RDMA path.
> The receiver polls both of them and may reorder packets locally. Actually,
> there is currently a bug in openib BTL where one channel may starve the other
> at the receiver, so if a match fragment with the next sequence number is in the
> starved path, tens of thousands of fragments can be reordered. The test case
> attached to ticket #1158 triggers this case and my patch handles all
> reordered packets.
>
> And, by the way, the code is much simpler now and can be reviewed easily ;)
>
> --
> Gleb.
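
The 25 / 26 28 29 example in the quoted text can be sketched as a small model. This is only an illustration of the drain step being discussed, written in Python for brevity. It is not the Open MPI code or Gleb's patch, it models sequence ordering only (not tag/source matching against posted receives), and the name `deliver_in_order` and the set-based queue are assumptions made for the example.

```python
def deliver_in_order(next_expected, unexpected, arrived):
    """Handle one arriving fragment, identified by its sequence number.

    Hypothetical model: `unexpected` is a set of parked out-of-order
    sequence numbers. Returns (next_expected, matched), where `matched`
    lists the sequence numbers released for matching, in order.
    """
    matched = []
    if arrived != next_expected:
        # Out of order: park it until its predecessors show up.
        unexpected.add(arrived)
        return next_expected, matched

    matched.append(arrived)
    next_expected += 1
    # Drain step: keep pulling parked fragments that are now in sequence.
    # Skipping this loop is the failure mode described above, where 26
    # stays parked and 27 can never be matched.
    while next_expected in unexpected:
        unexpected.remove(next_expected)
        matched.append(next_expected)
        next_expected += 1
    return next_expected, matched


# Next expected is 25; the unexpected queue holds 26, 28, 29.
unexpected = {26, 28, 29}
nxt, matched = deliver_in_order(25, unexpected, 25)   # releases 25 and 26
nxt, matched2 = deliver_in_order(nxt, unexpected, 27)  # releases 27, 28, 29
```

With the drain loop in place, the arrival of 25 also releases 26, so the later arrival of 27 finds the expected sequence number at 27 and releases 28 and 29 along with it.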