The patch applies to ib_multifrag as is without a conflict. But the branch
doesn't compile with or without the patch so I was not able to test it.
Do you have some uncommitted changes that may generate a conflict? Can
you commit them so they can be resolved? If there is no conflict between
On Jun 14, 2007, at 7:11 AM, Jeff Squyres wrote:
Now I see that my fix was in the right place, but still a little bit
wrong. I committed a fix to my fix in r15073. Can you check it?
My cluster is still running MTT from last night; I'll need to wait
for several jobs to finish. I'll check it la
On Jun 14, 2007, at 6:32 AM, Gleb Natapov wrote:
794:mca_btl_openib_endpoint_recv] can't find suitable endpoint for
this peer
Now I see that my fix was in the right place, but still a little bit
wrong. I committed a fix to my fix in r15073. Can you check it?
My cluster is still running MTT f
On Wed, Jun 13, 2007 at 07:08:51PM +0300, Gleb Natapov wrote:
> On Wed, Jun 13, 2007 at 09:38:21AM -0600, Galen Shipman wrote:
> > Hi Gleb,
> >
> > As we have discussed before I am working on adding support for
> > multiple QPs with either per peer resources or shared resources.
> > As a result
On Wed, Jun 13, 2007 at 01:54:28PM -0400, Jeff Squyres wrote:
> On Jun 13, 2007, at 1:37 PM, Gleb Natapov wrote:
>
> >> I have 2 hosts: one with 3 active ports and one with 2 active ports.
> >> If I run an MPI job between them, the openib BTL wireup goes badly and
> >> it aborts. So handling a het
On Jun 13, 2007, at 12:07 PM, Gleb Natapov wrote:
On Wed, Jun 13, 2007 at 02:05:00PM -0400, Jeff Squyres wrote:
On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote:
With today's trunk, I still see the problem:
Same thing happens on v1.2 branch. I'll re-open #548.
I am sure it was never test
On Wed, Jun 13, 2007 at 02:05:00PM -0400, Jeff Squyres wrote:
> On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote:
>
> > With today's trunk, I still see the problem:
>
> Same thing happens on v1.2 branch. I'll re-open #548.
>
I am sure it was never tested with multiple subnets. I'll try to get
su
On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote:
With today's trunk, I still see the problem:
Same thing happens on v1.2 branch. I'll re-open #548.
--
Jeff Squyres
Cisco Systems
On Jun 13, 2007, at 1:37 PM, Gleb Natapov wrote:
I have 2 hosts: one with 3 active ports and one with 2 active ports.
If I run an MPI job between them, the openib BTL wireup goes badly and
it aborts. So handling a heterogeneous number of ports is not
currently handled properly in the code.
Are
On Wed, Jun 13, 2007 at 10:52:53AM -0600, Galen Shipman wrote:
>
> On Jun 13, 2007, at 10:48 AM, Jeff Squyres wrote:
>
> > I wonder if this is bringing up the point that there are several of
> > us working in the openib code base -- I wonder if it would be
> > worthwhile to have a [short] telecon
On Jun 13, 2007, at 11:33 AM, Jeff Squyres wrote:
On Jun 13, 2007, at 1:15 PM, Nysal Jan wrote:
There is a ticket (closed) here:
https://svn.open-mpi.org/trac/ompi/ticket/548
It was fixed by Galen for 1.2.
Ah -- I forgot to look at closed tickets. I think we broke it again;
it certainly f
On Wed, Jun 13, 2007 at 12:45:01PM -0400, Jeff Squyres wrote:
> On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote:
>
> > I am not committing this yet. I want people to review my logic and the
> > patch. If the change is OK with everyone who cares then I want this
> > change to go into 1.2 branch.
>
On Jun 13, 2007, at 1:15 PM, Nysal Jan wrote:
There is a ticket (closed) here:
https://svn.open-mpi.org/trac/ompi/ticket/548
It was fixed by Galen for 1.2.
Ah -- I forgot to look at closed tickets. I think we broke it again;
it certainly fails on the trunk (perhaps related to what Gleb
On Jun 13, 2007, at 11:15 AM, Nysal Jan wrote:
I was just bitten yesterday by a problem that I've known about for a
while but had never gotten around to looking into (I could have sworn
that there was an open trac ticket on this, but I can't find one
anywhere).
I have 2 hosts: one with 3 act
I was just bitten yesterday by a problem that I've known about for a
while but had never gotten around to looking into (I could have sworn
that there was an open trac ticket on this, but I can't find one
anywhere).
I have 2 hosts: one with 3 active ports and one with 2 active ports.
If I run an
On Jun 13, 2007, at 10:48 AM, Jeff Squyres wrote:
I wonder if this is bringing up the point that there are several of
us working in the openib code base -- I wonder if it would be
worthwhile to have a [short] teleconference to discuss what we're all
doing in openib, where we're doing it (trunk,
I wonder if this is bringing up the point that there are several of
us working in the openib code base -- I wonder if it would be
worthwhile to have a [short] teleconference to discuss what we're all
doing in openib, where we're doing it (trunk, branch, whatever), when
we expect to have it
On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote:
I am not committing this yet. I want people to review my logic and the
patch. If the change is OK with everyone who cares then I want this
change to go into 1.2 branch.
I don't care how this change will get to the trunk. I can use patched
versio
On Wed, Jun 13, 2007 at 09:38:21AM -0600, Galen Shipman wrote:
> Hi Gleb,
>
> As we have discussed before I am working on adding support for
> multiple QPs with either per peer resources or shared resources.
> As a result of this I am trying to clean up a lot of the OpenIB code.
> It has grown
On Jun 13, 2007, at 9:49 AM, Torsten Hoefler wrote:
Hi Galen, Gleb,
there is also something weird going on if I call the basic alltoall
during the module_init() of a collective module (I need to wire up my
own QPs in my coll component). It takes 7 seconds for 4 nodes and more
than 30 minutes for
Hi Galen, Gleb,
there is also something weird going on if I call the basic alltoall
during the module_init() of a collective module (I need to wire up my
own QPs in my coll component). It takes 7 seconds for 4 nodes and more
than 30 minutes for 120 nodes. It seems to be an OpenIB wireup issue
becaus
Hi Gleb,
As we have discussed before I am working on adding support for
multiple QPs with either per peer resources or shared resources.
As a result of this I am trying to clean up a lot of the OpenIB code.
It has grown up organically over the years and needs some attention.
Perhaps we can co
Hello everyone,
I encountered a problem with the openib on-demand connection code. Basically
it works only by pure luck if you have more than one endpoint for the same
proc and sometimes breaks in mysterious ways.
The algo works like this: A wants to connect to B so it creates a QP and
sends it to B.