This patch finally made its way back into the trunk. I had to modify it to fit the current source again, but hopefully I managed to do it right. I did some testing, and it does not seem to harm anything. I split it into several commits so that each commit corresponds to one particular patch. They span revisions r14923 through r14928.

  george.

On Jun 6, 2007, at 9:51 AM, Tim Prins wrote:

I hate to go back to this, but...

The original commits also included changes to gpr_replica_dict_fn.c
(r14331 and r14336). This change shows some performance improvement for me
(about 8% on mpi hello, 123 nodes, 4ppn), and cleans up some ugliness in
the gpr. Again, this is an algorithmic change, so as the job scales the
performance improvement should become more noticeable.

I vote that this be put back in.

On a related topic, a small memory leak was fixed in r14328, and then
reverted. This change should be put back in.

Tim

George Bosilca wrote:
Commit r14791 applies this patch to the trunk. Let me know if you
encounter any trouble.

  Thanks,
    george.

On May 29, 2007, at 2:28 PM, Ralph Castain wrote:

After some work off-list with Tim, it appears that something has been
broken again on the OMPI trunk with respect to comm_spawn. It was working
two weeks ago, but...sigh.

Anyway, it doesn't appear to have any bearing either way on George's
patch(es), so whoever wants to commit them is welcome to do so.

Thanks
Ralph


On 5/29/07 11:44 AM, "Ralph Castain" <r...@lanl.gov> wrote:




On 5/29/07 11:02 AM, "Tim Prins" <tpr...@open-mpi.org> wrote:

Well, after fixing many of the tests...

Interesting - they worked fine for me. Perhaps a difference in
environment.

It passes all the tests except the spawn tests. However, the spawn tests
are seriously broken without this patch as well, and the ibm mpi spawn
tests seem to work fine.

Then something is seriously wrong. The spawn tests were working as of my
last commit - that is a test I religiously run. If the spawn test here
doesn't work, then it is hard to understand how the mpi spawn can work,
since the call is identical.

Let me see what's wrong first...


As far as I'm concerned, this should assuage any fear of problems
with these changes and they should now go in.

Tim

On May 29, 2007, at 11:34 AM, Ralph Castain wrote:

Well, I'll be the voice of caution again...

Tim: did you run all of the orte tests in the orte/test/system directory?
If so, and they all run correctly, then I have no issue with doing the
commit. If not, then I would ask that we not do the commit until that has
been done.

In running those tests, you need to run them on a multi-node system, both
using mpirun and as singletons (you'll have to look at the tests to see
which ones make sense in the latter case). This will ensure that we have
at least some degree of coverage.

Thanks
Ralph



On 5/29/07 9:23 AM, "George Bosilca" <bosi...@cs.utk.edu> wrote:

I'd be happy to commit the patch into the trunk. But after what
happened last time, I'm more than cautious. If the community thinks the
patch is worth having, let me know and I'll push it into the trunk asap.

   Thanks,
     george.

On May 29, 2007, at 10:56 AM, Tim Prins wrote:

I think both patches should be put in immediately. I have done some
simple testing, and with 128 nodes of odin, with 1024 processes
running mpi hello, these decrease our running time from about 14.2 seconds to 10.9 seconds. This is a significant decrease, and as the
scale increases there should be increasing benefit.

I'd be happy to commit these changes if no one objects.

Tim

On May 24, 2007, at 8:39 AM, Ralph H Castain wrote:

Thanks - I'll take a look at this (and the prior ones!) in the next
couple of weeks when time permits and get back to you.

Ralph


On 5/23/07 1:11 PM, "George Bosilca" <bosi...@cs.utk.edu> wrote:

Attached is another patch to the ORTE layer, more specifically the
replica. The idea is to decrease the number of strcmp calls by using a
small hash function before doing the strcmp. The hash key for each
registry entry is computed when it is added to the registry. When we're
doing a query, instead of comparing the 2 strings we first check whether
the hash keys match, and only if they do match do we compare the 2
strings, in order to make sure we eliminate collisions from our answers.
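
The scheme described above can be sketched roughly as follows. This is only an illustration and not the code from the patch: the names (dict_t, dict_entry_t, small_hash, dict_add, dict_find), the fixed-size array, and the choice of a djb2-style hash are all assumptions made for the example. The strcmp_calls counter is there only to make the saving observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the ORTE gpr replica code. */
#define DICT_MAX 64

typedef struct {
    const char *name;  /* key string, owned by the caller */
    uint32_t    hash;  /* computed once, when the entry is added */
} dict_entry_t;

typedef struct {
    dict_entry_t entries[DICT_MAX];
    size_t       count;
    size_t       strcmp_calls;  /* instrumentation only */
} dict_t;

/* Small, cheap string hash (djb2 style); any reasonable hash would do. */
static uint32_t small_hash(const char *s)
{
    uint32_t h = 5381u;
    while (*s != '\0') {
        h = h * 33u + (unsigned char)*s++;
    }
    return h;
}

/* Add a key and cache its hash. Returns the new index, or -1 if full. */
static int dict_add(dict_t *d, const char *name)
{
    assert(d != NULL && name != NULL);
    if (d->count >= DICT_MAX) {
        return -1;
    }
    d->entries[d->count].name = name;
    d->entries[d->count].hash = small_hash(name);
    return (int)d->count++;
}

/* Look up a key. Hashes are compared first; strcmp runs only when the
 * hashes match, which rejects entries that merely collide. */
static int dict_find(dict_t *d, const char *name)
{
    assert(d != NULL && name != NULL);
    uint32_t h = small_hash(name);
    for (size_t i = 0; i < d->count; i++) {
        if (d->entries[i].hash != h) {
            continue;  /* cheap integer test, no strcmp needed */
        }
        d->strcmp_calls++;
        if (strcmp(d->entries[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

With a zero-initialized dict_t holding keys such as "orte-job-0" and "orte-job-1", a dict_find for "orte-job-1" skips the first entry on the integer comparison alone and calls strcmp only once, on the entry that actually matches.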

There is some benefit in terms of performance. It's hardly visible for a
few processes, but it starts showing up when the number of processes
increases. In fact, the number of strcmp calls in the trace file
decreases drastically. The main reason it works well is that most of the
keys start with basically the same chars (such as orte-blahblah), which
turns each strcmp into a loop over several chars.

Ralph, please consider it for inclusion in the ORTE layer.

   Thanks,
     george.


_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


