Re: [OMPI devel] IOF repair

2008-07-11 Thread Bogdan Costescu
On Thu, 10 Jul 2008, Ralph Castain wrote: We would appreciate it if people could test this to the extent possible over the next few days. Please let us know (good or bad) so we can decide whether or not to move it to the 1.3 release branch. I've tested with r18878 and the strange behaviour me

Re: [OMPI devel] IOF repair

2008-07-10 Thread Ralph Castain
Just an update. Jeff and I have completed and checked in a fix to this problem (see the trunk, r18873). Please note that this fix has only been lightly tested, and we don't know for certain that it hasn't opened another hole somewhere else in the dyke. We would appreciate it if people could test t

Re: [OMPI devel] IOF repair

2008-07-10 Thread Jeff Squyres
[Finally] Fixed in https://svn.open-mpi.org/trac/ompi/changeset/18873. On Jul 10, 2008, at 11:29 AM, Jeff Squyres wrote: Ya, no worries -- we're working on a fix. We're just debating exactly *how* to fix it. See https://svn.open-mpi.org/trac/ompi/ticket/1135 if you want to keep up with th

Re: [OMPI devel] IOF repair

2008-07-10 Thread Jeff Squyres
Ya, no worries -- we're working on a fix. We're just debating exactly *how* to fix it. See https://svn.open-mpi.org/trac/ompi/ticket/1135 if you want to keep up with the conversation. On Jul 10, 2008, at 11:20 AM, Bogdan Costescu wrote: On Wed, 9 Jul 2008, Ralph Castain wrote: stdin is

Re: [OMPI devel] IOF repair

2008-07-10 Thread Bogdan Costescu
On Wed, 9 Jul 2008, Ralph Castain wrote: "stdin is read twice if rank=0 shares the node with mpirun" I consider this to be a very serious regression. Many Fortran scientific programs (at least many that I know) read their input from stdin. This comes as a result of them being (or started to be

Re: [OMPI devel] IOF repair

2008-07-10 Thread Ralph Castain
Can't argue with that...when Jeff gets back from his meeting he forgot about, we'll chat and see what makes sense to recommend. The current code is "worse" in the sense that we have this new bad behavior on stdin. It is "better" in that Rolf and Jeff -did- plug a hole or two from the 1.2 days. We'

Re: [OMPI devel] IOF repair

2008-07-10 Thread Terry Dontje
This all seems like a six of one, half a dozen of the other decision. Both solutions suck because there are holes. So, it comes down to whether we think the current code is worse than 1.2 or not. If they are the same I'd be inclined to stay with what we have now for fear of inadvertently borking

Re: [OMPI devel] IOF repair

2008-07-10 Thread Ralph Castain
I believe the changes were all pretty much related to an attempt to fix the iof_flush problem and to correct a different problem affecting the reading of stdin. Unfortunately, the iof_flush problem still remains, albeit perhaps in a different form, and we now have a new problem in the stdin behavior.

Re: [OMPI devel] IOF repair

2008-07-10 Thread Terry Dontje
I see that Jeff has updated the ticket saying that he is looking at the code to see if he can generate a fix, so the below may be superfluous. Anyways, what were the issues fixed in 1.3? It really comes down to how much more pain we are giving our users by rolling back to 1.2 or not. Note, I a

Re: [OMPI devel] IOF repair

2008-07-09 Thread Jeff Squyres
I'd like to have a look at the diff between the two, but I can't do so until tomorrow at the earliest. On Jul 9, 2008, at 7:26 PM, Ralph Castain wrote: I have been investigating Ticket #1135 - stdin is read twice if rank=0 shares the node with mpirun. Repairing this problem is going to be q

[OMPI devel] IOF repair

2008-07-09 Thread Ralph Castain
I have been investigating Ticket #1135 - stdin is read twice if rank=0 shares the node with mpirun. Repairing this problem is going to be quite difficult due to the rather terrible spaghetti code in the IOF, and the fact that the IOF in the HNP actually rml.sends the IO to itself multiple times as