Re: [OMPI devel] Bug with MPI_TYPE_CREATE_STRUCT, MPI_BOTTOM and MPI_BCAST in fortran

2007-06-13 Thread Daniel Spångberg
Dear Rainer, Thanks for your (and George's) very quick answer and the fast solution! The patch works great. Best regards Daniel Spångberg On Wed, 13 Jun 2007 01:38:26 +0200, Rainer Keller wrote: Dear Daniel, well, this definitely would be an issue for the us...@open-mpi.org list. Your rep

[OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Gleb Natapov
Hello everyone, I encountered a problem with openib on demand connection code. Basically it works only by pure luck if you have more than one endpoint for the same proc and sometimes breaks in mysterious ways. The algo works like this: A wants to connect to B so it creates QP and sends it to B.

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Jeff Squyres
Hey Gleb -- Can you explain the rationale for this change? Is there a reason why the bandwidths reported by the IBV API are not sufficient? Are you trying to do creative things with multi-LID scenarios (perhaps QOS-like things)? If so, this looks like a good idea, but I'm not sure that

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 11:03:09AM -0400, Jeff Squyres wrote: > Hey Gleb -- > > Can you explain the rationale for this change? Is there a reason why > the bandwidths reported by the IBV API are not sufficient? Are you > trying to do creative things with multi-LID scenarios (perhaps QOS- > l

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread George Bosilca
On Wed, 13 Jun 2007, Gleb Natapov wrote: I'm not particularly fond of creating variable MCA parameters after the btl open call because they won't show up in ompi_info. Should we do something else if you want to override bandwidths, perhaps something similar to the HCA params file? If you recal

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Galen Shipman
Hi Gleb, As we have discussed before I am working on adding support for multiple QPs with either per peer resources or shared resources. As a result of this I am trying to clean up a lot of the OpenIB code. It has grown up organically over the years and needs some attention. Perhaps we can co

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Torsten Hoefler
Hi Galen, Gleb, there is also something weird going on if I call the basic alltoall during the module_init() of a collective module (I need to wire up my own QPs in my coll component). It takes 7 seconds for 4 nodes and more than 30 minutes for 120 nodes. It seems to be an OpenIB wireup issue becaus

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 11:32 AM, George Bosilca wrote: Right ... blame me :) The problem is that we have to know the number of interfaces in order to be able to generate the MCA parameters, and the number of interfaces will only be known inside the init call (and I really don't think it's a goo

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 9:49 AM, Torsten Hoefler wrote: Hi Galen, Gleb, there is also something weird going on if I call the basic alltoall during the module_init() of a collective module (I need to wire up my own QPs in my coll component). It takes 7 seconds for 4 nodes and more than 30 minutes for

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread George Bosilca
On Wed, 13 Jun 2007, Jeff Squyres wrote: I don't mind having some MCA parameters that are never shown by ompi_info (we already have the hidden ones). Anyway, for TCP by default there is the btl_tcp_latency and btl_tcp_bandwidth which will be used as a default value for all NICs. For the others,

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 09:38:21AM -0600, Galen Shipman wrote: > Hi Gleb, > > As we have discussed before I am working on adding support for > multiple QPs with either per peer resources or shared resources. > As a result of this I am trying to clean up a lot of the OpenIB code. > It has grown

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 12:03 PM, George Bosilca wrote: I think the "hidden" MCA parameters are a different issue; they were created for a different purpose (users are not supposed to see/set them). These variable parameters would be intended to be used by the users, but they would have no way of f

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote: I am not committing this yet. I want people to review my logic and the patch. If the change is OK with everyone who cares then I want this change to go into 1.2 branch. I don't care how this change will get to the trunk. I can use patched versio

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Jeff Squyres
I wonder if this is bringing up the point that there are several of us working in the openib code base -- I wonder if it would be worthwhile to have a [short] teleconference to discuss what we're all doing in openib, where we're doing it (trunk, branch, whatever), when we expect to have it

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 10:48 AM, Jeff Squyres wrote: I wonder if this is bringing up the point that there are several of us working in the openib code base -- I wonder if it would be worthwhile to have a [short] teleconference to discuss what we're all doing in openib, where we're doing it (trunk,

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Nysal Jan
I was just bitten yesterday by a problem that I've known about for a while but had never gotten around to looking into (I could have sworn that there was an open trac ticket on this, but I can't find one anywhere). I have 2 hosts: one with 3 active ports and one with 2 active ports. If I run an

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 11:15 AM, Nysal Jan wrote: I was just bitten yesterday by a problem that I've known about for a while but had never gotten around to looking into (I could have sworn that there was an open trac ticket on this, but I can't find one anywhere). I have 2 hosts: one with 3 act

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 1:15 PM, Nysal Jan wrote: There is a ticket (closed) here: https://svn.open-mpi.org/trac/ompi/ticket/548 It was fixed by Galen for 1.2. Ah -- I forgot to look at closed tickets. I think we broke it again; it certainly fails on the trunk (perhaps related to what Gleb

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 12:45:01PM -0400, Jeff Squyres wrote: > On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote: > > > I am not committing this yet. I want people to review my logic and the > > patch. If the change is OK with everyone who cares then I want this > > change to go into 1.2 branch. >

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 11:33 AM, Jeff Squyres wrote: On Jun 13, 2007, at 1:15 PM, Nysal Jan wrote: There is a ticket (closed) here: https://svn.open-mpi.org/trac/ompi/ticket/548 It was fixed by Galen for 1.2. Ah -- I forgot to look at closed tickets. I think we broke it again; it certainly f

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 10:52:53AM -0600, Galen Shipman wrote: > > On Jun 13, 2007, at 10:48 AM, Jeff Squyres wrote: > > > I wonder if this is bringing up the point that there are several of > > us working in the openib code base -- I wonder if it would be > > worthwhile to have a [short] telecon

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 12:35:55PM -0400, Jeff Squyres wrote: > On Jun 13, 2007, at 12:03 PM, George Bosilca wrote: > > >> I think the "hidden" MCA parameters are a different issue; they were > >> created for a different purpose (users are not supposed to see/set > >> them). These variable parame

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 1:37 PM, Gleb Natapov wrote: I have 2 hosts: one with 3 active ports and one with 2 active ports. If I run an MPI job between them, the openib BTL wireup goes badly and it aborts. So a heterogeneous number of ports is not currently handled properly in the code. Are

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote: With today's trunk, I still see the problem: Same thing happens on v1.2 branch. I'll re-open #548. -- Jeff Squyres Cisco Systems

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 02:05:00PM -0400, Jeff Squyres wrote: > On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote: > > > With today's trunk, I still see the problem: > > Same thing happens on v1.2 branch. I'll re-open #548. > I am sure it was never tested with multiple subnets. I'll try to get su

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 12:07 PM, Gleb Natapov wrote: On Wed, Jun 13, 2007 at 02:05:00PM -0400, Jeff Squyres wrote: On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote: With today's trunk, I still see the problem: Same thing happens on v1.2 branch. I'll re-open #548. I am sure it was never test

[OMPI devel] openib coord teleconf (was: Problem with openib on demand connection bring up)

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 1:40 PM, Gleb Natapov wrote: [snip] coordination kind of teleconference. If people think this is a good idea, I can set up the call. sounds good to me. Sounds good to me too. Pasha also works on async event thread. This patch is not something I planned to work on. This pr

Re: [OMPI devel] openib coord teleconf (was: Problem with openib on demand connection bring up)

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 12:23 PM, Jeff Squyres wrote: On Jun 13, 2007, at 1:40 PM, Gleb Natapov wrote: [snip] coordination kind of teleconference. If people think this is a good idea, I can set up the call. sounds good to me. Sounds good to me too. Pasha also works on async event thread. Thi

Re: [OMPI devel] openib coord teleconf (was: Problem with openib on demand connection bring up)

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 02:23:37PM -0400, Jeff Squyres wrote: > On Jun 13, 2007, at 1:40 PM, Gleb Natapov wrote: > > >>> [snip] > >>> coordination kind of teleconference. If people think this is a good > >>> idea, I can set up the call. > >> > >> sounds good to me. > > Sounds good to me too. Pasha

Re: [OMPI devel] openib coord teleconf (was: Problem with openib on demand connection bring up)

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote: Pasha tells me that the best times for Ishai and him are: - 2000-2030 Israel time - 1300-1330 US Eastern - 1100-1130 US Mountain - 2230-2300 India (Bangalore) Although they could also do the preceding half hour as well. Depends on the date. Th

Re: [OMPI devel] openib coord teleconf (was: Problem with openib on demand connection bring up)

2007-06-13 Thread Gleb Natapov
On Wed, Jun 13, 2007 at 02:48:02PM -0400, Jeff Squyres wrote: > On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote: > > >> Pasha tells me that the best times for Ishai and him are: > >> > >> - 2000-2030 Israel time > >> - 1300-1330 US Eastern > >> - 1100-1130 US Mountain > >> - 2230-2300 India (Banga

Re: [OMPI devel] openib coord teleconf (was: Problem with openib on demand connection bring up)

2007-06-13 Thread Galen Shipman
On Jun 13, 2007, at 12:52 PM, Gleb Natapov wrote: On Wed, Jun 13, 2007 at 02:48:02PM -0400, Jeff Squyres wrote: On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote: Pasha tells me that the best times for Ishai and him are: - 2000-2030 Israel time - 1300-1330 US Eastern - 1100-1130 US Mountain -

Re: [OMPI devel] openib coord teleconf

2007-06-13 Thread Pavel Shamis (Pasha)
Jeff Squyres wrote: On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote: Pasha tells me that the best times for Ishai and him are: - 2000-2030 Israel time - 1300-1330 US Eastern - 1100-1130 US Mountain - 2230-2300 India (Bangalore) Although they could also do the preceding half hour as well.

Re: [OMPI devel] openib coord teleconf

2007-06-13 Thread Andrew Friedley
I'd like to call in as some of this applies to UD as well, is that okay? Andrew Jeff Squyres wrote: On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote: Pasha tells me that the best times for Ishai and him are: - 2000-2030 Israel time - 1300-1330 US Eastern - 1100-1130 US Mountain - 2230-2300 In

Re: [OMPI devel] openib coord teleconf

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 3:38 PM, Andrew Friedley wrote: I'd like to call in as some of this applies to UD as well, is that okay? Sounds good. Andrew Jeff Squyres wrote: On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote: Pasha tells me that the best times for Ishai and him are: - 2000-2030 Is

Re: [OMPI devel] openib coord teleconf

2007-06-13 Thread Gil Bloch
I will probably join the call as well. Regards, Gil Bloch > -----Original Message----- > From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On > Behalf Of Pavel Shamis (Pasha) > Sent: Wed 13 June 2007 12:15 > To: Open MPI Developers > Subject: Re: [OMPI devel] openib coord teleco

[OMPI devel] openib connection semantics

2007-06-13 Thread Jeff Squyres
Gleb's post earlier today inspired me to [finally] work on modularizing the openib btl connection semantics. We need this to add the RDMA CM support anyway, and it seemed like a natural time to actually start something on it. I used Gleb's patch as a starting point. The idea is to s

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 1:48 PM, Gleb Natapov wrote: 3. Use a file to convey this information, because it's better suited to what we're trying to do (vs. MCA parameters). Seriously, why is a file a bad thing? The file can list interfaces by hostname. For example, if you have a heterogeneous setup

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Jeff Squyres
On Jun 13, 2007, at 9:43 PM, Jeff Squyres wrote: More specifically, I'm proposing two things: 1. The MCA system itself accept this ini-style file that keys off hostnames so that this works across all of Open MPI. 2. The bandwidth/latency MCA params accept values in two forms: - a single in
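
[Editor's sketch] The ini-style file keyed off hostnames that is proposed above might look roughly like the following. This is only an illustration: the section layout, the key names (eth0_bandwidth and so on), and the helper load_host_params are assumptions made up for this sketch, not an Open MPI format or API. The snippet reads the file with Python's standard configparser and returns the values for one host.

```python
import configparser

# Hypothetical layout (an assumption, not an Open MPI file format):
# one section per hostname, one key per interface parameter.
SAMPLE = """
[node01]
eth0_bandwidth = 1000
eth1_bandwidth = 100

[node02]
eth0_bandwidth = 1000
"""


def load_host_params(text, hostname):
    """Return the integer parameters listed under `hostname`, or {} if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(hostname):
        # Hosts that are not listed would fall back to built-in defaults.
        return {}
    return {key: int(value) for key, value in parser.items(hostname)}


if __name__ == "__main__":
    print(load_host_params(SAMPLE, "node01"))
```

Keying on the hostname lets one shared file describe a heterogeneous cluster, with each process reading only its own section.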

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Patrick Geoffray
Jeff Squyres wrote: Let's take a step back and see exactly what we *want*. Then we can talk about how to have an interface for it. I must be missing something but why is the bandwidth/latency passed by the user (by whatever means)? Would it be easier to automagically get these values by pr