It looks like a new issue to me, Pasha. Possibly a side consequence of the
IOF change made by Jeff and me the other day. From what I can see, it looks
like your app was a simple "hello" - correct?
If you look at the error, the problem occurs when mpirun is trying to route
a message. Since the app is
On Jul 14, 2008, at 5:48 PM, Sean Hefty wrote:
Is there a service ID range that is guaranteed to be available for
user apps?
I need to check on this. You may want to look at section A3.2.3 of
the spec.
If you set the first byte (network order) to 0x00, and the 2nd byte
to 0x01,
then you h
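
The byte layout described above can be sketched in C. This is only an illustration, not IBCM or Open MPI code: the helper names are hypothetical, and putting a PID in the low 32 bits is an assumption made for the example. The message is cut off, so what this prefix guarantees is not stated here; section A3.2.3 of the spec is the place to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a 64-bit service ID whose first byte
 * (most significant, i.e. first in network order) is 0x00 and whose
 * second byte is 0x01. The low 32 bits carry a caller-supplied value
 * such as a PID (an assumption for this sketch). */
static uint64_t sketch_service_id(uint32_t low_bits)
{
    return ((uint64_t)0x00 << 56) | ((uint64_t)0x01 << 48) | (uint64_t)low_bits;
}

/* Hypothetical helper: serialize the ID in network (big-endian) order,
 * independent of host endianness. */
static void sketch_to_network_order(uint64_t id, uint8_t out[8])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(id >> (56 - 8 * i));
    }
}
```

Serializing byte by byte avoids depending on the host's byte order, which matters because the prefix is defined in network order.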
>Ah! I did not realize that there were other services on the machine
>that were using / reserving IBCM service ID's.
Intel MPI hit a similar problem a long, long time ago.
>Is there a service ID range that is guaranteed to be available for
>user apps?
I need to check on this. You may want to l
On Jul 14, 2008, at 5:18 PM, Sean Hefty wrote:
Open MPI certainly could be buggy with IBCM, of course -- but it's
fishy that the same exact "mpirun ..." command line works one time and
fails the next (it's kinda random when the problem occurs).
I just want to make sure that service ID colli
>The service ID that it uses is its PID and the mask is always 0.
>There will only be one call to ib_cm_listen() per device per MPI
>process.
>
>Open MPI certainly could be buggy with IBCM, of course -- but it's
>fishy that the same exact "mpirun ..." command line works one time and
>fails the next
On Jul 14, 2008, at 1:17 PM, Sean Hefty wrote:
I talked to Sean Hefty about it, but we never figured out a definitive
cause or solution. My best guess is that there is something wonky
about multiple processes simultaneously interacting with the IBCM
kernel driver from userspace; but I don't k
I've been quietly following this discussion, but now feel a need to jump
in here. I really must disagree with the idea of building either IBCM or
RDMACM support by default. Neither of these has been proven to reliably
work, or to be advantageous. Our own experiences in testing them have been
slight
>I talked to Sean Hefty about it, but we never figured out a definitive
>cause or solution. My best guess is that there is something wonky
>about multiple processes simultaneously interacting with the IBCM
>kernel driver from userspace; but I don't know jack about kernel
>stuff, so that's a total
On Jul 14, 2008, at 9:21 AM, Pavel Shamis (Pasha) wrote:
Should we not even build support for it?
I think IBCM CPC build should be enabled by default. The IBCM is
supplied with OFED so it should not be any problem during install.
Ok. But remember that there are at least some OS's where /dev
Should we not even build support for it?
I think IBCM CPC build should be enabled by default. The IBCM is
supplied with OFED so it should not be any problem during install.
PRO: don't even allow the possibility of running with it, because we
know that there are issues with the ibcm userspac
George Bosilca wrote:
I've tracked the SM performance over the last couple of months and I
didn't notice any major change on the performance side. I guess there
is an architecture factor involved in this. My tests are performed on
a PPC (Mac OS X) and on a dual core AMD.
What is the architect
On Jul 14, 2008, at 7:55 AM, Pavel Shamis (Pasha) wrote:
I can add something like this at the head of the query function:
    if (!mca_btl_openib_component.cpc_explicitly_defined)
        return OMPI_ERR_NOT_SUPPORTED;
That sounds reasonable until the ibcm userspace library issues can be
sorted out. Then perha
I can add something like this at the head of the query function:
    if (!mca_btl_openib_component.cpc_explicitly_defined)
        return OMPI_ERR_NOT_SUPPORTED;
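
A minimal, self-contained sketch of that guard follows. The struct, the function name, and the return codes are hypothetical stand-ins so the example compiles on its own; only the flag name cpc_explicitly_defined and the idea of returning a "not supported" code come from the snippet above. The real Open MPI query function has a different signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes; not Open MPI's actual values. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_NOT_SUPPORTED = -1 };

/* Hypothetical stand-in for the component structure. */
struct sketch_component {
    bool cpc_explicitly_defined; /* true only if the user asked for this CPC */
};

/* Hypothetical query function: refuse to be selected unless the user
 * explicitly requested this connection method. */
static int sketch_ibcm_query(const struct sketch_component *component)
{
    assert(component != NULL);
    if (!component->cpc_explicitly_defined) {
        return SKETCH_ERR_NOT_SUPPORTED;
    }
    /* ... the normal capability checks would follow here ... */
    return SKETCH_SUCCESS;
}
```

Returning early keeps IBCM built and available while making it opt-in, so a default run never selects it.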
Jeff Squyres wrote:
On Jul 14, 2008, at 3:59 AM, Lenny Verkhovsky wrote:
Seems to be fixed.
Well, it's "fixed" in that Pasha turned off the error me
On Jul 14, 2008, at 3:59 AM, Lenny Verkhovsky wrote:
Seems to be fixed.
Well, it's "fixed" in that Pasha turned off the error message. But
the same issue is undoubtedly happening.
I was asking for something a little stronger: perhaps we should
actually have IBCM not try to be used unles
Right about when Brad and I discovered that issue, I ran out of time.
This made IBCM more-or-less unusable for many installations -- we were
kinda hoping for an OpenFabrics fix...
On Jul 13, 2008, at 12:43 PM, Pavel Shamis (Pasha) wrote:
Fixed in https://svn.open-mpi.org/trac/ompi/changes
Seems to be fixed.
On 7/14/08, Lenny Verkhovsky wrote:
>
> ../configure --with-memory-manager=ptmalloc2 --with-openib
>
> I guess not. I always use the same configure line, and only recently started
> to see this error.
>
> On 7/13/08, Jeff Squyres wrote:
>>
>> I think you said opposite things: Le
Please see http://www.open-mpi.org/mtt/index.php?do_redir=764
The error is not consistent. It takes a lot of iterations to reproduce it.
In my MTT testing I have seen it a few times.
Is it a known issue?
Regards,
Pasha
../configure --with-memory-manager=ptmalloc2 --with-openib
I guess not. I always use the same configure line, and only recently started
to see this error.
On 7/13/08, Jeff Squyres wrote:
>
> I think you said opposite things: Lenny's command line did not specifically
> ask for ibcm, but it was used