port or Intel's support at
> ibsupp...@intel.com. They might have seen this problem before. Since
> you're running the RHEL versions of PSM and related software, one thing you
> could try is IFS. I think I was running IFS 7.3.0, so that's a difference
> bet
L versions of PSM and related software, one thing
> > >> you could try is IFS. I think I was running IFS 7.3.0, so that's a
> > >> difference between your setup and mine. At the least, it may help
> > >> support nail down the issue.
> > >>
>
> >> -Original Message-
> >> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
> >> Castain
> >> Sent: Tuesday, November 11, 2014 2:23 PM
> >> To: Open MPI Developers
> >> Subject: Re: [OMPI devel] 1.8.3 and PSM errors
>
PSM and OMPI 1.6.5; it fails on 1.8.1
> and 1.8.3.
> >
> > Andrew
> >
> >> -Original Message-
> >> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
> >> Castain
> >> Sent: Tuesday, November 11, 2014 2:23 PM
> >> To
Andrew
>
>> -Original Message-
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
>> Castain
>> Sent: Tuesday, November 11, 2014 2:23 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] 1.8.3 and PSM errors
>>
>> I
Hi Folks,
I remember that in the psm provider for libfabric there is a check in the
av_insert method for endpoints that had previously been inserted into the av.
In the libfabric psm provider, a mask array is created and fed in to the
psm_ep_connect call to handle ep's that were already
"connect
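
A minimal sketch of the mask idea described above, not the actual libfabric or
Open MPI code. It assumes endpoint IDs can be compared as plain 64-bit
integers and that a nonzero mask entry means "connect this peer", which is how
I understand the mask argument of psm_ep_connect to work. The type
sketch_epid_t and both function names are hypothetical, chosen so the example
compiles without the PSM headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PSM endpoint ID (assumption: comparable as an integer). */
typedef uint64_t sketch_epid_t;

/* Return 1 if epid is among the already-connected peers, 0 otherwise. */
static int sketch_is_connected(const sketch_epid_t *known, size_t n_known,
                               sketch_epid_t epid)
{
    for (size_t i = 0; i < n_known; i++) {
        if (known[i] == epid) {
            return 1;
        }
    }
    return 0;
}

/*
 * Fill mask[i] with 1 for every requested peer that still needs a connection
 * and 0 for every peer that was connected by an earlier call.
 * Returns the number of peers that still need connecting.
 * The mask would then be handed to the connect call together with the
 * requested list, so already-connected peers are skipped.
 */
static size_t sketch_build_connect_mask(const sketch_epid_t *known, size_t n_known,
                                        const sketch_epid_t *requested,
                                        size_t n_requested, int *mask)
{
    size_t n_new = 0;
    for (size_t i = 0; i < n_requested; i++) {
        mask[i] = !sketch_is_connected(known, n_known, requested[i]);
        n_new += (size_t)mask[i];
    }
    return n_new;
}
```

In the scenario from this thread, a second add_procs call that includes a
process from the first call would get a 0 in the mask for that process, so
only the new peers are passed on for connection. The sketch does not handle
duplicates inside a single request list.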
> On Nov 11, 2014, at 17:13 , Jeff Squyres (jsquyres)
> wrote:
>
>> More particularly, it looks like add_procs is being called a second time
>> during MPI_Intercomm_create and being passed a process that is already
>> connected (passed into the first add_procs call). Is that right? Should
> Sent: Tuesday, November 11, 2014 2:23 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] 1.8.3 and PSM errors
>
> I thought PSM didn’t support dynamic operations such as Intercomm_create
> - yes? The PSM security key wouldn’t match between the two jobs, and so
> there i
I thought PSM didn’t support dynamic operations such as Intercomm_create - yes?
The PSM security key wouldn’t match between the two jobs, and so there is no
way for them to communicate.
Which is why I thought PSM can’t be used for dynamic operations at all,
including comm_spawn and connect/acce
On Nov 11, 2014, at 4:56 PM, Friedley, Andrew wrote:
> OK, I'm able to reproduce this now, not sure why I couldn't before. I took a
> look at the diff of the PSM MTL from 1.6.5 to 1.8.1, and nothing is standing
> out to me.
>
> Question more for the general group: Did anything related to the
OK, I'm able to reproduce this now, not sure why I couldn't before. I took a
look at the diff of the PSM MTL from 1.6.5 to 1.8.1, and nothing is standing
out to me.
Question more for the general group: Did anything related to the
behavior/usage of MTL add_procs() change in this time window?
d software, one thing
> >> you could try is IFS. I think I was running IFS 7.3.0, so that's a
> >> difference between your setup and mine. At the least, it may help support
> >> nail down the issue.
> >>
> >> Andrew
> >>
> >>> -Original Messag
, it may help support nail down
>> the issue.
>>
>> Andrew
>>
>>> -Original Message-
>>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
>>> Reber
>>> Sent: Monday, November 10, 2014 12:39 PM
>>> To:
help support nail down the issue.
>
> Andrew
>
> > -Original Message-
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
> > Reber
> > Sent: Monday, November 10, 2014 12:39 PM
> > To: Open MPI Developers
> > Subject: Re:
g] On Behalf Of Adrian
> Reber
> Sent: Monday, November 10, 2014 1:19 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] 1.8.3 and PSM errors
>
> What is IFS?
>
> On Mon, Nov 10, 2014 at 09:12:41PM +, Friedley, Andrew wrote:
> > Hi Adrian,
> >
> > Y
l [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
> > Reber
> > Sent: Monday, November 10, 2014 12:39 PM
> > To: Open MPI Developers
> > Subject: Re: [OMPI devel] 1.8.3 and PSM errors
> >
> > Andrew,
> >
> > thanks for looking into this. I w
h various np from 8 to 32. Your original case:
> >
> > $ mpirun -np 32 ./mpi_test_suite -t "All,^io,^one-sided"
> >
> > Runs for a while and eventually hits send cancellation errors.
> >
> > Any chance you could try updating your infinipath libraries?
> >
> >
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
> > Reber
> > Sent: Monday, October 27, 2014 9:11 AM
> > To: Open MPI Developers
> > Subject: Re: [OMPI devel] 1.8.3 and PSM errors
> >
> > This is a simpler test setup:
> >
>
rs.
>
> Any chance you could try updating your infinipath libraries?
>
> Andrew
>
> > -Original Message-
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
> > Reber
> > Sent: Monday, October 27, 2014 9:11 AM
> > To: Open MPI Developers
>
d try updating your infinipath libraries?
Andrew
> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
> Reber
> Sent: Monday, October 27, 2014 9:11 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] 1.8.3 and PSM errors
>
> This
Andrew@Intel is looking into it - he has some PSM patches coming that may
resolve this already.
> On Oct 27, 2014, at 9:10 AM, Adrian Reber wrote:
>
> This is a simpler test setup:
>
> On 8 core machines this works:
>
> $ mpirun -np 8 mpi_test_suite -t "environment"
> [...]
> Number of fai
This is a simpler test setup:
On 8 core machines this works:
$ mpirun -np 8 mpi_test_suite -t "environment"
[...]
Number of failed tests:0
Using 9 or more cores it fails:
$ mpirun -np 9 mpi_test_suite -t "environment"
mpi_test_suite:20293 terminated with signal 11 at PC=2b6d107fa9a4
SP=7f
I’m afraid I can’t quite decipher from all this what actually fails. Of course,
PSM doesn’t support dynamic operations like comm_spawn or connect_accept, so if
you are running those tests that just won’t work. Is that the heart of the
problem here?
> On Oct 27, 2014, at 1:40 AM, Adrian Reber