Paul,
i tried PSM_RCVTHREAD=0 but it did not help
Jeff,
you did not read too much ... but my words were not quite accurate.
yes, the signal handlers are set in the library constructor.
by reading the source code, i found that can be avoided by setting
the yet undocumented IPATH_NO_BACKTRACE en
On Aug 26, 2015, at 11:29 AM, Ralph Castain wrote:
>
>> ...but only when the PSM MTL is not compiled directly into libmpi, e.g., via
>> using --disable-dlopen, or --enable-static (neither of which are the
>> default, but it's worth mentioning).
>
> Is that true? If the problem lies in not “nic
> On Aug 26, 2015, at 8:23 AM, Jeff Squyres (jsquyres)
> wrote:
>
> On Aug 26, 2015, at 11:17 AM, Ralph Castain wrote:
>>
>> Meantime, the proposed workaround to use the MCA param to ignore PSM should
>> work.
>
> ...but only when the PSM MTL is not compiled directly into libmpi, e.g., via
On Aug 26, 2015, at 11:17 AM, Ralph Castain wrote:
>
> Meantime, the proposed workaround to use the MCA param to ignore PSM should
> work.
...but only when the PSM MTL is not compiled directly into libmpi, e.g., via
using --disable-dlopen, or --enable-static (neither of which are the default,
Yeah, my bad for being cryptic - very busy day.
The PSM team is doing some internal review of the problem and coming up with
solutions. Since this involves a product, the discussion has to go thru some
standard review and approval procedures before we can publicly comment on it.
Our hamsters ar
I think it would be good to get some hard facts here:
- is the infinipath library hijacking signal handlers?
- is the infinipath library not resetting those signal handlers when it is done?
- is there a way to make the infinipath library release its use of signal
handlers upon demand? (e.g., via
Sorry - but there are some discussions that cannot and should not take place on
a public mailing list. As a former corporate person yourself, you should
understand :-)
> On Aug 25, 2015, at 6:56 PM, Howard Pritchard wrote:
>
> which off-list are we talking about?
> very annoying.
>
>
> 2015-
which off-list are we talking about?
very annoying.
2015-08-25 10:38 GMT-06:00 Ralph Castain :
> We’re looking at this off-list. It would be preferable not to disable PSM
> if we can avoid it
>
> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham
> wrote:
>
> What if we modify the mpirun script to i
Thanks Paul,
I will give it a try
Cheers,
Gilles
On Wednesday, August 26, 2015, Paul Hargrove wrote:
> Gilles,
>
> Is the conflict over "SIG32"?
> If so, I believe setenv PSM_RCVTHREAD=0 in the environment will disable
> InfiniPath's use of that signal.
>
> -Paul
>
> On Tue, Aug 25, 2015 at 6
Gilles,
Is the conflict over "SIG32"?
If so, I believe setenv PSM_RCVTHREAD=0 in the environment will disable
InfiniPath's use of that signal.
-Paul
On Tue, Aug 25, 2015 at 6:02 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:
> i run on a centos 7 vm, and with the OFED that come
i run on a centos 7 vm, and with the OFED that comes with centos
(I will send full details tomorrow)
there is no psm hardware, just infinipath libs
a first trivial workaround in ompi would be to
putenv("OMPI_MCA_mtl_psm_priority=0")
in the java binding before invoking ompi_mpi_init,
but that canno
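
A minimal sketch of the environment-variable workaround described above, assuming it runs before MPI initialization reads its MCA parameters. The helper name deprioritize_psm_mtl is hypothetical, and setenv() is used here in place of the putenv() call mentioned in the message. Only the variable name and value come from the post.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Hypothetical helper: export OMPI_MCA_mtl_psm_priority=0 into the
 * process environment. It must be called before MPI is initialized,
 * otherwise the setting is read too late to have any effect.
 * Returns 0 on success, -1 on failure (as setenv does). */
static int deprioritize_psm_mtl(void)
{
    /* overwrite=1 replaces any value already present, which matches
     * what putenv("OMPI_MCA_mtl_psm_priority=0") would do. */
    return setenv("OMPI_MCA_mtl_psm_priority", "0", 1);
}
```

Passing 0 instead of 1 as the last argument would leave a value the user already exported untouched, which may be preferable if users are expected to tune the priority themselves.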
Gilles: what version of PSM were you using? and with which cards?
> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham wrote:
>
> What if we modify the mpirun script to include the --mca mtl ^psm tag if java
> is in the run string?
>
> -Nathan
>
> On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard
We’re looking at this off-list. It would be preferable not to disable PSM if we
can avoid it
> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham wrote:
>
> What if we modify the mpirun script to include the --mca mtl ^psm tag if java
> is in the run string?
>
> -Nathan
>
> On Tue, Aug 25, 2015 a
What if we modify the mpirun script to include the --mca mtl ^psm tag if
java is in the run string?
-Nathan
On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard
wrote:
> I'll update the java FAQ.
>
> 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) :
>
>> On Aug 25, 2015, at 10:00 AM, Howard Prit
I'll update the java FAQ.
2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) :
> On Aug 25, 2015, at 10:00 AM, Howard Pritchard
> wrote:
> >
> > I think rather than trying workarounds of dubious robustness inside open
> mpi we
> >
> > - document the issue on either the somewhat aged open mpi webs
On Aug 25, 2015, at 10:00 AM, Howard Pritchard wrote:
>
> I think rather than trying workarounds of dubious robustness inside open mpi
> we
>
> - document the issue on either the somewhat aged open mpi website faq or add
> it to a wiki page on github
It should probably be documented in the RE
I think rather than trying workarounds of dubious robustness inside open
mpi we
- document the issue on either the somewhat aged open mpi website faq or
add it to a wiki page on github
- file a bug against intel psm
--
sent from my smart phonr so no good type.
Howard
On Aug 25, 2015 6:
Intel folks: can you comment on this? It appears that the libinfinipath signal
handler is interfering with the java garbage collector.
> On Aug 25, 2015, at 8:01 AM, Gilles Gouaillardet
> wrote:
>
> i do not know if this can be runtime detected ...
> note we should report this to intel folks
i do not know if this can be runtime detected ...
note we should report this to intel folks and ask them to advise.
ideally, they would provide a way to make sure libinfinipath.so does not
conflict with the jvm signal handlers.
my idea is to dlopen libinfinipath only if java bindings are not used.
Is it possible to run-time detect this situation? E.g., probe the signal
handler, or somesuch.
Rationale: I'd rather have something run-time disabled than not built.
Would dlopen'ing libinfinipath actually change its signal handler
behavior?
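
One way the "probe the signal handler" idea could look, as a sketch only: POSIX sigaction() can be called with a NULL new action to read the current disposition without changing it. The function name signal_has_custom_handler is hypothetical, and which signal numbers to probe is left to the caller, since the thread does not settle that.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: returns true if `signum` currently has a handler
 * installed, false if its disposition is SIG_DFL or SIG_IGN or if the
 * query itself fails (for example, an invalid signal number). */
static bool signal_has_custom_handler(int signum)
{
    struct sigaction current;

    /* A NULL new action makes this a pure query; nothing is modified. */
    if (sigaction(signum, NULL, &current) != 0) {
        return false;
    }
    /* With SA_SIGINFO set, a three-argument handler is installed. */
    if (current.sa_flags & SA_SIGINFO) {
        return true;
    }
    return current.sa_handler != SIG_DFL && current.sa_handler != SIG_IGN;
}
```

This only reports that some handler is present. It cannot tell who installed it, so on its own it does not distinguish a JVM handler from a libinfinipath one.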
> On Aug 25, 2015, at 4:27 AM, Gilles Gouai
Folks,
some time ago, some crashes were reported when using java bindings.
one of them was caused by mca_mtl_psm.so.
the root cause is that the libinfinipath.so initializer sets its own signal
handler, which
conflicts with the signal handler set by the jvm.
the only workaround is to disable
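
To illustrate the mechanism described above, here is a sketch of how a library initializer can claim a signal before the application gets a chance to. It is not libinfinipath's code: the names library_handler, library_init and library_owns_sigusr1 are hypothetical, SIGUSR1 is an arbitrary choice, and __attribute__((constructor)) is a GCC/Clang extension.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Stand-in for a handler a library might install; it does nothing here. */
static void library_handler(int signum)
{
    (void)signum;
}

/* Runs when the object is loaded, before main() (or during dlopen()).
 * Any handler the application installs later replaces this one, and
 * any handler installed earlier is replaced by it. */
__attribute__((constructor))
static void library_init(void)
{
    struct sigaction sa;

    sa.sa_handler = library_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 if SIGUSR1 is currently routed to library_handler. */
static int library_owns_sigusr1(void)
{
    struct sigaction current;

    if (sigaction(SIGUSR1, NULL, &current) != 0) {
        return 0;
    }
    return current.sa_handler == library_handler;
}
```

Because the constructor runs at load time, the order in which the library and the other signal user are initialized decides whose handler ends up installed.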