Howard,
i made PR 593 https://github.com/open-mpi/ompi/pull/593 in order to fix
this.
George,
could you please review this ?
Cheers,
Gilles
On 5/20/2015 12:57 PM, Howard Pritchard wrote:
Hi Gilles,
First a disclaimer - I do not know what the intended design was nor
where the design
me.
however, there is a grey area for small communicators and i think it
should be cleared up.
Cheers,
Gilles
On 5/21/2015 1:04 AM, George Bosilca wrote:
Each rule defines an interval with the previous rule, and everything in
an interval will be bound to the rule with the next message size.
r;
}
Cheers,
Gilles
On Thu, May 21, 2015 at 12:04 PM, George Bosilca
wrote:
> Gilles,
>
> There is no need to define a rule for zero-sized messages, it is
> implicitly matched by the first rule. To be extremely pedantic the
> selection logic for the communicator s
local addresses (that
requires some extra devel)
as far as i am concerned, i am fine with 1) because i think it is very
unlikely a user will ever want to use link-local addresses.
Thanks in advance for your feedback so we can move forward.
Cheers,
Gilles
,
Gilles
Edgar,
i am sorry about that.
i fixed some memory leaks (some memory was leaking in some error cases).
i also moved (up) some malloc in order to group them and simplify the
handling
of error cases.
per your comment, one move was incorrect indeed :-(
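
The idea of grouping the mallocs so that error cases are handled in one place can be sketched as below. This is only a minimal illustration and not the Open MPI code: `pair_t`, `pair_create` and `pair_destroy` are hypothetical names made up for the example.

```c
#include <stdlib.h>

/* Hypothetical object owning two buffers (illustration only). */
typedef struct {
    int    *counts;
    double *values;
} pair_t;

/* Allocate everything up front so there is a single error path.
 * Returns NULL on failure without leaking a partial allocation. */
pair_t *pair_create(size_t n)
{
    pair_t *p = calloc(1, sizeof(*p));
    if (NULL == p) {
        return NULL;
    }
    p->counts = malloc(n * sizeof(*p->counts));
    p->values = malloc(n * sizeof(*p->values));
    if (NULL == p->counts || NULL == p->values) {
        /* free(NULL) is a no-op, so one cleanup covers every case */
        free(p->counts);
        free(p->values);
        free(p);
        return NULL;
    }
    return p;
}

/* Release the object and both buffers; accepts NULL. */
void pair_destroy(pair_t *p)
{
    if (NULL != p) {
        free(p->counts);
        free(p->values);
        free(p);
    }
}
```

Note that the sketch assumes n > 0: malloc(0) is allowed to return NULL, in which case pair_create(0) would report a failure.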
Cheers,
Gilles
On 5/28/2015 12:14 PM
this is good enough for 1.8.6 we might want to hardcode the
mca value in this branch.
I do not remember anything being committed to master.
if I get a positive review, I will do the backport and open a PR against v1.8 on Monday
cheers,
Gilles
On Friday, May 29, 2015, Ralph Castain wrote:
> Hi fo
i do not know :-(
all I can tell is I did not run into any issue involving the oob tcp
that being said, I am not sure link-local addresses are considered as local
addresses by oob/tcp since these IPv6 are part of the orted command line
Cheers,
Gilles
On Saturday, May 30, 2015, Ralph Castain
t;,
"ERROR",
"WARN",
"NOTICE",
"INFO",
"DEBUG",
"TRACE",
"NOTSET",
"UNKNOWN"
};
Cheers,
Gilles
On 5/30/2015 1:32 AM, Nathan Hjelm wrote:
At the moment we have a loosely enforced standard for verb
so what about :
static const char* const priorities[] = {
"ERROR",
"WARN",
"INFO",
"DEBUG",
"TRACE"
};
and merge debug and trace if there should be only 4
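
For what it is worth, a lookup on top of the five-entry table could look like the sketch below. `priority_name` is a hypothetical helper (it is not an existing function), and falling back to "UNKNOWN" for out-of-range levels is an assumption borrowed from the longer list quoted earlier.

```c
#include <stddef.h>

static const char* const priorities[] = {
    "ERROR",
    "WARN",
    "INFO",
    "DEBUG",
    "TRACE"
};

/* Hypothetical helper: map a numeric level to its name, and fall back
 * to "UNKNOWN" when the level is outside the table. */
const char *priority_name(int level)
{
    const int count = (int)(sizeof(priorities) / sizeof(priorities[0]));
    if (level < 0 || level >= count) {
        return "UNKNOWN";
    }
    return priorities[level];
}
```

Computing the count from sizeof keeps the bounds check correct if debug and trace are merged later.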
Cheers,
Gilles
On Monday, June 8, 2015, Ralph Castain wrote:
>
Will do tomorrow.
proc is only used in heterogeneous mode, hence the warning
On Sunday, June 14, 2015, Ralph Castain wrote:
> *pml_ob1_recvreq.c:* In function '*mca_pml_ob1_recv_request_put_frag*':
> *pml_ob1_recvreq.c:397:18:* *warning: *unused variable '*proc*'
> [-Wunused-variable]
> omp
Ralph and all,
this is fixed at
https://github.com/open-mpi/ompi/commit/ee3a1da28a3c018115bad82e0a9e7d1e04d35148
Cheers,
Gilles
On 6/14/2015 10:43 AM, Gilles Gouaillardet wrote:
Will do tomorrow.
proc is only used in heterogeneous mode, hence the warning
On Sunday, June 14, 2015, Ralph
Jeff,
currently, the github "v2.0" branch is called "v2.x"
was this intended ?
Cheers,
Gilles
On 6/21/2015 2:00 AM, Jeff Squyres (jsquyres) wrote:
The v2.0 branch has been created on the github ompi-release repo. Let the pull
requests commence.
Just so that we develo
mentioned messages coming from
coll_libnbc_ireduce_scatter_block.c
there might be a bug left in this area, but i was unable to reproduce it.
could you please post the code you used initially ?
when all is fixed, i will make the PR for v1.8, v1.10 and v2.x/v2.0
Cheers,
Gilles
On 6/21/2015 1:46 AM
Thanks Jeff,
that is a bit too subtle for me :-)
do you mean that for example v2.1 will not be forked from v2.0 ?
Cheers,
Gilles
On Monday, June 22, 2015, Jeff Squyres (jsquyres)
wrote:
> Yes.
>
> I actually created it as 2.0; I then deleted it and re-created it as 2.x.
> It
(coll_libnbc_ireduce_scatter_block.c, 67)
at first glance, i could not find how to reproduce this
(e.g. coll_libnbc_ireduce_scatter_block.c malloc 0 bytes)
if you still have the test program that can do that, could you please
post it ?
Cheers,
Gilles
On 6/22/2015 11:28 PM, Lisandro Dalcin wrote:
On 21 June 2015 at 19:50
on
the other hand, i do not think that, as a community, we are interested in
mpi4py bugs.
i will let other folks comment on that.
Cheers,
Gilles
On 6/23/2015 9:49 AM, Lisandro Dalcin wrote:
On 22 June 2015 at 18:26, Gilles Gouaillardet wrote:
if you still have the test program that can do that
Lisandro,
this is related to your previous report :
some bugs were introduced when silencing zero size mallocs
here is attached a patch (to be applied as well as the previous one)
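
To illustrate the kind of pitfall involved (this is not the attached patch), here is a minimal sketch of an allocation helper that skips zero-byte requests. `alloc_or_empty` is a hypothetical name; the point is that once malloc(0) is no longer called, a NULL buffer is a valid result and callers must not treat it as an error.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate count elements of elem_size bytes.
 * A zero-byte request is never passed to malloc(): *buf is set to NULL
 * and the call still succeeds.
 * Returns 0 on success, -1 on overflow or allocation failure. */
int alloc_or_empty(size_t count, size_t elem_size, void **buf)
{
    *buf = NULL;
    if (0 == count || 0 == elem_size) {
        return 0;               /* empty buffer, not an error */
    }
    if (count > (size_t)-1 / elem_size) {
        return -1;              /* count * elem_size would overflow */
    }
    *buf = malloc(count * elem_size);
    return (NULL == *buf) ? -1 : 0;
}
```

Returning a status code separately from the pointer is what lets the caller distinguish "nothing to allocate" from "out of memory".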
Cheers,
Gilles
On 6/23/2015 12:23 AM, Lisandro Dalcin wrote:
The attached test code used to work in 1.8.5 and
file to blacklist the coll_ml module to
ensure this is working.
Mike and Mellanox folks, could you please comment on that ?
Cheers,
Gilles
On 6/24/2015 5:23 PM, Daniel Letai wrote:
Gilles,
Attached the two output logs.
Thanks,
Daniel
On 06/22/2015 08:08 AM, Gilles Gouaillardet wrote
a coll ^ml
otherwise, it might crash (if coll_ml is loaded before coll_hcoll, which
is really system dependent)
Cheers,
Gilles
On 6/25/2015 10:46 AM, Gilles Gouaillardet wrote:
Daniel,
thanks for the logs.
another workaround is to
mpirun --mca coll ^hcoll ...
i was able to reproduce the iss
Jeff,
this is exactly what happens.
I will send a stack trace later
Cheers,
Gilles
On Thursday, June 25, 2015, Jeff Squyres (jsquyres)
wrote:
> Gilles --
>
> Can you send a stack trace from one of these crashes?
>
> I am *guessing* that the following is happening:
>
sure (i blame my poor understanding of linkers) this is an error if
Open MPI is configure'd with --disable-dlopen
Cheers,
Gilles
On 6/26/2015 8:12 AM, Paul Hargrove wrote:
I can see cloning of existing component's source as a starting point
for a new one as a common occurrence (at leas
issues ?
Cheers,
Gilles
On 6/26/2015 12:31 PM, Paul Hargrove wrote:
On Thu, Jun 25, 2015 at 5:05 PM, Paul Hargrove wrote:
On Thu, Jun 25, 2015 at 4:59 PM, Gilles Gouaillardet wrote:
In this case, mca_co
Ralph,
my bad, I will fix this today
sorry for the inconvenience
Gilles
On Monday, June 29, 2015, Ralph Castain wrote:
> Hey folks
>
> I don’t know who has been working on the Java bindings, but they are
> totally broken in the master repo - cannot compile. I tried fixing a
Ralph and all,
master is now fixed
Cheers,
Gilles
On 6/29/2015 7:07 AM, Gilles Gouaillardet wrote:
Ralph,
my bad, I will fix this today
sorry for the inconvenience
Gilles
On Monday, June 29, 2015, Ralph Castain wrote:
Hey folks
I don’t kno
bug ?
(and not an anticipated errata)
If a bug, then I can fix it tomorrow
Cheers,
Gilles
I think Paul's concern was about cross compilation
(e.g. no AC_TRY_RUN ...)
fwiw, fortran bindings cannot be built "as is" when cross compiling ompi
Cheers,
Gilles
On Wednesday, July 1, 2015, Ralph Castain wrote:
> Given the description, I suspect that any MPI applicat
In other places, initialization looks like
opal_mutex_t mutex = {{0}};
Btw, opal_condition is a standalone binary (i.e. not part of the ompi library),
so I do not think an uninitialized common symbol hurts here.
Cheers,
Gilles
On Wednesday, July 1, 2015, Nathan Hjelm wrote:
>
> PGI no longer supri
.
i left some #if 0 in the code since i do not know if something needs to
be done about rdma fragments
Cheers,
Gilles
On 7/2/2015 6:04 AM, Nathan Hjelm wrote:
Don't see the leak on master with OS X using the leaks command. Will see
what valgrind finds on linux.
-Nathan
On Wed, Jul 01, 20
ake, make install
i am now double checking this
Cheers,
Gilles
On 7/9/2015 11:25 AM, Paul Hargrove wrote:
I just gave the whole 1.8 series a spin and it looks like "ConnectX
XRC" configure logic has been broken since 1.8.5, but worked in 1.8.4:
$ grep 'ConnectX XRC support
Paul,
can you please compress and post your config.log ?
what is the OFED version you are running ?
on master, that fix did the trick on mellanox test cluster (recent OFED
version) but did not
enable XRC on lanl test clusters (my best bet is an old OFED library)
Thanks
Gilles
On 7/10/2015
Thanks Paul,
i just found another bug ...
(and i should be blamed for it)
here is attached a patch.
basically, xrc was incorrectly disabled on "older" ofed stacks
Cheers,
Gilles
On 7/10/2015 10:06 AM, Paul Hargrove wrote:
Gilles,
A bzip2-compressed config.log is attached.
I
Paul,
i just applied the patch on the tarball, and it worked for me.
anyway, the IBV_SRQT_XRC test was misplaced (and i just read you already
found out ...)
we need it for XRC_DOMAINS and *not* for XRC
the newly attached patch will (hopefully) fix this
Cheers,
Gilles
On 7/10/2015 11:06 AM
which is incorrect
i will fix that too
Cheers,
Gilles
On 7/10/2015 1:16 PM, Paul Hargrove wrote:
Gilles,
I've made another observation about what I believe is an error in the
XRC configure probe.
If I am following the code below correctly, then *both* ConnectX and
Connec
cray one)
Is it possible to have it installed ?
Cheers,
Gilles
On Friday, July 10, 2015, Jeff Squyres (jsquyres)
wrote:
> On Jul 10, 2015, at 2:12 AM, Paul Hargrove > wrote:
> >
> > The only "new" (non-cosmetic) problem I observed was the failure to
> detect "
Ralph,
(Some) things got broken when adding support for XRC domains / OFED 3.12.
In 1.8.4 there is no XRC support with OFED 3.12
As far as I am concerned, reverting opening btl to 1.8.4 is not a
good option.
Cheers,
Gilles
On Friday, July 10, 2015, Ralph Castain wrote:
> Given that 1.8.
Paul,
Here is a revised patch to be applied vs the 1.8.7-rc1 tarball
Could you please give it a try ?
Cheers,
Gilles
On 7/11/2015 4:22 AM, Paul Hargrove wrote:
The timing on this is less than ideal for me.
To accommodate work on some high-voltage switching equipment, our
building will be
Paul,
thanks for the report,
i made ConnectX XRC (aka XRC) and ConnectIb XRC (aka XRC domains) exclusive,
so yes, you got the desired behavior.
Cheers,
Gilles
On 7/13/2015 3:11 PM, Paul Hargrove wrote:
Gilles,
With this latest patch on my "new" system I see
checking if Co
Hi Chris,
i pushed my tarball into a gist :
git clone https://gist.github.com/ec20f77ec35533fa575a.git
and then the tarball is in ec20f77ec35533fa575a/openmpi-gitclone.tar.bz2
Cheers,
Gilles
On 7/13/2015 4:59 PM, Chris Samuel wrote:
Hi Gilles,
On Mon, 13 Jul 2015 03:16:57 PM Gilles
intel compilers.
generally speaking, should we revert the fortran initialization part and
leave these common symbols uninitialized ?
I realize this is very confusing for end users ...
I think Jeff is the one who understands this part best, but he might not be
available this week.
Cheers,
Gilles
?
Cheers,
Gilles
On Monday, July 13, 2015, Ralph Castain wrote:
> Gilles - just to confirm, the patch you provided here is the one in the
> updated PRs, yes? If so, I’ll consider those PRs as confirmed and commit
> them
>
>
> On Jul 13, 2015, at 7:20 AM, Gilles Gouaillardet
Hi Ralph,
you are right.
the f08 warnings have kind of always been there.
master has a few extra warnings (caused by initialization of common
symbols) but the changes have not been PR'ed to v1.8
i made PR 719 https://github.com/open-mpi/ompi/pull/719 to fix this.
Cheers,
Gilles
On
scenario in which the progress thread (aka thread 2)
is still dealing with some memory that was just freed/unmapped/corrupted by
the main thread.
I empirically noticed the error is more likely to occur when there are many
tasks on one node
e.g. mpirun --oversubscribe -np 32 a.out
Cheers,
Gilles
Ralph,
I will try to reproduce this.
I guess you already checked the output of ompi_info to confirm params are
checked at runtime.
Cheers,
Gilles
On Saturday, July 18, 2015, Ralph Castain wrote:
> Hi folks
>
> I keep getting segfault errors when testing 1.10, while others say the
Ralph,
based on the source code (ompi_mpi_params.c:91) I was expecting a Boolean
ompi_mpi_param_check
Cheers,
Gilles
On Saturday, July 18, 2015, Ralph Castain wrote:
> Yep, I checked:
>
> MPI parameter check: runtime
>
>
>
> On Jul 17, 2015, at 8:00 PM
Ralph,
it seems (google) that MPI_CHECK_ARGS is specific to (at least) cray and
sgi mpi
for openmpi, we need to set
OMPI_MCA_mpi_param_check=1
i updated the onesided test suite and pushed it to the ompi-tests repo
Cheers,
Gilles
On 7/18/2015 11:57 PM, Ralph Castain wrote:
Ah, I found the
t, next, &orte_rml_base.posted_recvs,
orte_rml_posted_recv_t) {
/* since names could include wildcards, must use
* the more generalized comparison function
*/
i hope this helps,
Gilles
On 7/17/2015 11:04 PM, Ralph Castain wrote:
It’s probably a race condition ca
ain, i was unable to reproduce any crash.
Cheers,
Gilles
On 7/22/2015 12:48 AM, Ralph Castain wrote:
I believe I have this fixed - please see if this solves the problem:
https://github.com/open-mpi/ompi/pull/730
On Jul 21, 2015, at 12:22 AM, Gilles Gouaillardet
ow to fix this is welcome.
if not, the test can be made optional via a MCA param, or be simply removed
Cheers,
Gilles
On Saturday, July 25, 2015, Paul Hargrove wrote:
> I know Gilles and I went to a fair amount of effort to get configure
> detection of "older" XRC working again f
if no dlopen or static libs only.
but I am afraid I cannot get a working solution if both static and dynamic
libs are built.
Cheers,
Gilles
On Saturday, July 25, 2015, Paul Hargrove wrote:
> Gilles,
>
> I can confirm that it is not an environment problem, since the strace
> command
Paul,
where do you run mpirun ?
on a compute node ?
on a login node with no infiniband interface ?
if on a login node, are the infiniband libraries at least available ?
Cheers,
Gilles
On Saturday, July 25, 2015, Paul Hargrove wrote:
> I know Gilles and I went to a fair amount of effort
Lisandro,
I think I see what is going wrong and will fix it
Thanks for the report,
Gilles
On Saturday, July 25, 2015, Lisandro Dalcin wrote:
> Using a debug build of 1.8.7, I'm still getting this malloc(0) warning:
>
> malloc debug: Request for 0 bytes (coll_libnbc_ireduce_s
Lisandro,
i fixed it on master at
https://github.com/open-mpi/ompi/commit/318a1a40a4ab345f417b8932326d4dd2e68d82bc
could you give it a try ?
Cheers,
Gilles
On 7/26/2015 9:26 AM, Gilles Gouaillardet wrote:
Lisandro,
I think I see what is going wrong and will fix it
Thanks for the report
not even sure stdout is a
tty.
Cheers,
Gilles
On Monday, July 27, 2015, Christoph Niethammer wrote:
> Hello,
>
> I know, using stdout and stderr within MPI programs is in no way good.
> Nevertheless I found that - and now wonder why - isatty inside an MPI
> program reports diffe
Hartmut,
yes this is a bug ...
we are still working on a proper fix.
in the meantime, you can comment out the dlsym test in the openib btl
(otherwise, openmpi falls back to tcp ...)
Cheers,
Gilles
On Tuesday, August 4, 2015, Hartmut Häfner (SCC)
wrote:
> Dear developers,
>
> we have
Hi Howard,
it looks like i pushed my branch to the ompi repo instead of my clone ...
that was clearly a mistake and i deleted the branch
Cheers,
Gilles
On 8/6/2015 12:14 AM, Howard Pritchard wrote:
Hi Folks,
There's a new branch on open-mpi/ompi repo.
Is this intentional?
H
the PRs from now
Cheers,
Gilles
On 8/14/2015 3:20 AM, Paul Hargrove wrote:
On Thu, Aug 13, 2015 at 7:42 AM, Ralph Castain wrote:
Please take one last look around to see if anything else is
missing. I'd like to get this released next week.
urgent, i assigned them to you.
this simply removes a bogus test (OFED version used at runtime
vs compile time)
note i made a PR for master but i did not push my changes
Cheers,
Gilles
On 8/14/2015 8:44 AM, Gilles Gouaillardet wrote:
Paul,
i tried to fix this test, and at this stage, i do not
round the issue.
Cheers,
Gilles
On Sunday, August 23, 2015, Paul Hargrove wrote:
> Having seen problems with mtl:ofi with "--enable-static --disable-shared",
> I tried mtl:psm and mtl:mxm with those options as well.
>
> The good news is that mtl:psm was fine, but the bad new
, and this is
specific to each system.
so at this stage, I cannot tell whether this is a different issue or not.
if the crash still occurs with .ompi_ignore in coll ml, then I could
conclude this is a different issue.
Cheers,
Gilles
On Sunday, August 23, 2015, Paul Hargrove wrote:
> Gilles,
>
with
--verbs and OFED is available.
any thoughts ?
Cheers,
Gilles
al_dl_open() NULL and then look for symbols
> that are unique to libnl and libnl3, but a) when to do that, and b) it's
> not guaranteed to work in all cases.
>
>
>
>
> > On Aug 24, 2015, at 7:36 AM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com > w
f both libnl and libnl3 are present
> in the same process (e.g., if some of OMPI's dependent libraries pull them
> both in). We could try to opal_dl_open() NULL and then look for symbols
> that are unique to libnl and libnl3, but a) when to do that, and b) it's
> not guaranteed
a first step could be adding a --disable-libnl3 option to configure, which
means components should not even try to use libnl3
makes sense ?
On Monday, August 24, 2015, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:
> iirc, librdmacm uses libnl
>
> I am not sure if h
Thanks Adrian,
i fixed this in PR #831 https://github.com/open-mpi/ompi/pull/831 and
will push it shortly to master
Best regards,
Gilles
On 8/25/2015 4:47 PM, Adrian Reber wrote:
On Mon, Aug 24, 2015 at 09:47:22PM +, Jeff Squyres (jsquyres) wrote:
Who runs the esslingen MTT?
You
not to build the psm mtl if java bindings are built
and another option is to revamp mca_mtl_psm.so so it does not link with
libinfinipath.so
(use an intermediate component, or dlopen libinfinipath)
any thoughts ?
Cheers,
Gilles
ipath change actually change its signal handler
> behavior?
>
>
> > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet > wrote:
> >
> > Folks,
> >
> > some time ago, some crashes were reported when using java bindings.
> > one of them was caused was cause
the PMPI_* symbols
3. we add a configure option to call PMPI_* symbols instead of the MPI_*
ones
any thoughts ?
Cheers,
Gilles
,
but that cannot work because libinfinipath is dlopen'ed and its signal
handler is set
also, I guess putenv("OMPI_MCA_mtl=^psm") would not work if ompi was
configure'd with --disable-dlopen
Cheers,
Gilles
On Wednesday, August 26, 2015, Ralph Castain wrote:
> Gilles: w
$
Gilles
On Wednesday, August 26, 2015, Jeff Squyres (jsquyres)
wrote:
> Fair point.
>
> I don't know if there's an easy way to fix that, though.
>
>
> > On Aug 25, 2015, at 6:01 PM, Cabral, Matias A > wrote:
> >
> > Hi,
> >
> >
>
Thanks Paul,
I will give it a try
Cheers,
Gilles
On Wednesday, August 26, 2015, Paul Hargrove wrote:
> Gilles,
>
> Is the conflict over "SIG32"?
> If so, I believe setenv PSM_RCVTHREAD=0 in the environment will disable
> InfiniPath's use of that signal.
>
or fail at build or runtime)
i will also shut up from now and let the fine folks at Intel implement a
definitive solution :-D
Cheers,
Gilles
On 8/27/2015 12:41 AM, Jeff Squyres (jsquyres) wrote:
On Aug 26, 2015, at 11:29 AM, Ralph Castain wrote:
...but only when the PSM MTL is not compile
fine its own MPI_Alltoall subroutine, then
PMPI_Alltoall is invoked directly since MPI_Alltoall is a weak
symbol pointing to
PMPI_Alltoall.
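
The weak-symbol mechanism can be sketched in isolation as follows. This is only an illustration: `pdemo_call` and `demo_call` are hypothetical stand-ins for PMPI_Alltoall and MPI_Alltoall, and the attribute syntax assumes GCC or Clang on an ELF platform such as Linux.

```c
/* Strong definition, playing the role of PMPI_Alltoall. */
int pdemo_call(int x)
{
    return x + 1;
}

/* Weak alias, playing the role of MPI_Alltoall: as long as no other
 * object file provides a strong demo_call, calls to demo_call end up
 * in pdemo_call. */
int demo_call(int x) __attribute__((weak, alias("pdemo_call")));
```

If another object file linked into the program defines its own strong demo_call, the linker picks that one instead, and it can still reach the original through pdemo_call. That is how a profiling tool intercepts MPI_* calls.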
Cheers,
Gilles
On 8/26/2015 9:39 AM, Jeff Squyres (jsquyres) wrote:
On Aug 25, 2015, at 11:03 AM, George Bosilca wrote:
This seems to be the case only wi
iirc, the MPI_Win_detach discrepancy with the standard is intentional in
fortran 2008,
there is a comment in the source code to explain this.
On Thursday, August 27, 2015, Kawashima, Takahiro <
t-kawash...@jp.fujitsu.com> wrote:
> Oh, I also noticed it yesterday and was about to report it.
>
> An
Kawashima-san,
you are right, I mixed MPI_Buffer_detach and MPI_Win_detach
sorry for the confusion
Cheers,
Gilles
On Thursday, August 27, 2015, Kawashima, Takahiro <
t-kawash...@jp.fujitsu.com> wrote:
> Gilles,
>
> > there is a comment in the source code to explain this.
Ralph,
what about :
- if only one interface is specified (e.g. *_if_include eth0), then bind
to that interface
- otherwise, bind to all interfaces
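
The proposed rule could be sketched as below. `bind_to_single_interface` is a hypothetical helper, not an Open MPI function, and the sketch assumes the *_if_include value is a plain comma-separated list of interface names (it does not try to handle CIDR notation).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 when if_include names exactly one
 * interface, in which case the caller would bind to that interface.
 * Return 0 otherwise, meaning "bind to all interfaces". */
int bind_to_single_interface(const char *if_include)
{
    if (NULL == if_include || '\0' == if_include[0]) {
        return 0;               /* nothing specified */
    }
    /* entries are comma-separated, so a comma means several interfaces */
    return NULL == strchr(if_include, ',');
}
```

With this rule, "eth0" selects a single-interface bind, while "eth0,eth1" or an unset parameter falls back to binding to all interfaces.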
Mark, would that solve your issue ?
Cheers,
Gilles
On 8/28/2015 9:50 AM, Ralph Castain wrote:
I committed the change that prevents orte-submit
Thanks Michael and Kawashima-san,
i made PR #838 to fix this
it is currently available at https://github.com/open-mpi/ompi/pull/838
Cheers,
Gilles
On 8/27/2015 6:29 PM, Michael Knobloch wrote:
Dear OpenMPI developers,
I noticed a bug in the definition of the 3 MPI-3 RMA functions
*_f files are impacted, and for
mpif-h only,
so i'd rather ask before I file the PR, even if a sed command will do
most of the job */
Cheers,
Gilles
On Saturday, August 29, 2015, Jeff Squyres (jsquyres)
wrote:
> On Aug 27, 2015, at 3:25 AM, Gilles Gouaillardet > wro
Jeff,
i filed PR #845 https://github.com/open-mpi/ompi/pull/845
could you please have a look ?
Cheers,
Gilles
On 8/30/2015 9:20 PM, Gilles Gouaillardet wrote:
ok, will do
basically, I simply have to
#include "ompi/mpi/c/profile/defines.h"
if configure set the WANT_MPI_PROFI
Brice,
as a side note, what is the rationale for defining the distance as a
floating point number ?
i remember i had to fix a bug in ompi a while ago
/* e.g. replace if (d1 == d2) with if((d1-d2) < epsilon) */
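
A tolerance-based comparison along those lines might look like the sketch below. `distance_equal` is a hypothetical helper and the choice of epsilon is left to the caller; note that the absolute value of the difference is needed, otherwise any d1 smaller than d2 would compare as equal.

```c
/* Hypothetical helper: treat two distances as equal when they differ
 * by less than epsilon.  Taking the absolute difference makes the test
 * symmetric, so the result does not depend on which value is larger. */
int distance_equal(double d1, double d2, double epsilon)
{
    double diff = d1 - d2;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff < epsilon;
}
```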
Cheers,
Gilles
On 9/1/2015 5:28 AM, Brice Goglin wrote:
The locality is mlx
Hi,
this part has been revamped recently.
at first, i would recommend you make a fresh install
remove the install directory, and the build directory if you use VPATH,
re-run configure && make && make install
that should hopefully fix the issue
Cheers,
Gilles
On 9/1/2015
ly do nothing
(the end user might know what he/she is doing, and there will be nothing
to do on the ompi side
when this gets fixed by the PSM folks)
Cheers,
Gilles
On 9/3/2015 10:21 AM, Ralph Castain wrote:
Hi folks
I regret to say that 1.10.0 is hitting an issue with at least one upstream
d
the option to choose which PSM version
(if any) should be used ?
Cheers,
Gilles
On 9/3/2015 12:47 PM, Ralph Castain wrote:
I’m afraid that won’t solve the problem - the distro will still feel the need
to release -two- versions of OMPI, one with PSM and one with PSM2. Ordinarily,
I wouldn’t
George,
about your third point :
some libraries do stuff in their constructors, so "mtl = ^psm" might
also not work if OMPI was configure'd with --disable-dlopen.
as far as i know, --disable-dlopen is quite popular (and
--disable-shared --enable-static is not so much)
Cheers,
Michael,
if a solution with two packages is acceptable,
then another, simpler option is to configure
openmpi for PSM with --without-psm2,
and openmpi for PSM2 with --without-psm
this is safe for --disable-dlopen or --enable-static, and you do not need
to tweak the conf files
Cheers,
Gilles
Jeff,
on second thought, wouldn't it be better to simply disable both PSM and
PSM2 in openmpi,
and let libfabric handle these conflicts ?
does that make any sense ?
Cheers,
Gilles
On Thursday, September 3, 2015, Jeff Squyres (jsquyres)
wrote:
> I agree with what George says.
>
(known not to
support PSM), or a mpirun-psm2 wrapper, or a release note (e.g. use --mca
mtl ^psm or a psm2 param file)
I still do not get how removing PSM2 makes things better
(and the same result can be achieved by configuring with --without-psm2)
Cheers,
Gilles
On Thursday, September 3, 2015
ation ?
for example, on a Fujitsu FX10 node (single socket, 16 cores), hwloc reports
16 sockets with one core each and no cache. though this is not correct,
that can be seen as equivalent to the real config by ompi, so this is not
really an issue for ompi.
Cheers,
Gilles
On Friday, September 4,
Thanks Brice,
bottom line, even if hwloc is not fully ported, it should build and ompi
should get something usable.
in this case, i have no objection to removing the --without-hwloc configure
option.
you can contact me off-list regarding the FX10 specific issue
Cheers,
Gilles
On 9/4/2015 2
generate the MPI_* bindings
- another time to generate the PMPI_* bindings */
any thoughts or objections to the removal of the --enable-mpi-profile
configure option ?
Cheers,
Gilles
ot to add the const modifier to
MPI_User_function
as i wrote earlier, the change is quite massive.
i plan to commit it by the end of next week, unless there are any
objections.
(and then i will PR for v2.x, and v1.10 but only if there is a request)
Cheers,
Gilles
per of Fujitsu MPI for K computer and Fujitsu
> PRIMEHPC FX10/FX100 (SPARC-based CPU).
>
> Though I'm not familiar with the hwloc code and didn't know
> the issue reported by Gilles, I also would be able to help
> you to fix the issue.
>
> Takahiro Kawashima,
> MPI
Pasha,
i fixed that in
https://github.com/open-mpi/ompi/commit/c404e98dced4104cd3abe7485846368325c3d150
but forgot to post it to the ML ...
Cheers,
Gilles
On 9/11/2015 7:31 AM, Shamis, Pavel wrote:
Ralph,
I don't see these warnings on my fedora box with gcc 5.1.1.
I will try to f
Ralph,
will do
i think these new warnings are a consequence of the changes i pushed recently
(e.g. add the const keyword)
Cheers,
Gilles
On 9/11/2015 12:47 PM, Ralph Castain wrote:
FWIW: I’m still seeing these on CentOS7 using gcc 4.8.3 in a debug build:
*coll_ml_allocation.c:20:13
Ralph,
this is fixed in
https://github.com/open-mpi/ompi/commit/a1627feaf74d8562146a1afbfabec60651496c06
Cheers,
Gilles
On 9/11/2015 1:02 PM, Gilles Gouaillardet wrote:
Ralph,
will do
i think these new warnings are a consequence of the changes i pushed
recently
(e.g. add the const
t ompi_proc_t *) 0xf8010010f540
what about using the lower bit instead ?
my assumption is that ompi_proc_t objects are aligned (static or
malloc'ed one) on at least a pointer size (4 in x86) so the lower bit
should always be zero.
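
The low-bit idea can be sketched as follows. This is a minimal illustration under the alignment assumption stated above, not the ompi implementation: `demo_proc_t`, `tag_pointer`, `is_tagged` and `untag_pointer` are hypothetical names, and what the tag means is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an aligned object such as ompi_proc_t. */
typedef struct { int rank; } demo_proc_t;

/* Set the low bit to mark the value as tagged. */
static inline uintptr_t tag_pointer(demo_proc_t *p)
{
    uintptr_t v = (uintptr_t)p;
    assert(0 == (v & 0x1));     /* relies on the alignment assumption */
    return v | 0x1;
}

/* A value is tagged when its low bit is set. */
static inline int is_tagged(uintptr_t v)
{
    return (int)(v & 0x1);
}

/* Clear the low bit to recover the original pointer. */
static inline demo_proc_t *untag_pointer(uintptr_t v)
{
    return (demo_proc_t *)(v & ~(uintptr_t)0x1);
}
```

Unlike a high-bit tag, this does not depend on which address ranges the OS hands out, only on the object alignment.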
any thoughts ?
Cheers,
Gilles
clean your nodes when it
was fixed)
the neighbor_allgather_self failure is discussed at
https://github.com/open-mpi/ompi/pull/790
I will have a look at the op related failure on Monday
(looks like a MPI conformance issue unrelated to PMIx)
Cheers,
Gilles
On Saturday, September 12, 2015, Ralph
w ompi was configure'd when built outside mtt.
as a side note...
ideally, the configure command line would be available from ompi_info.
but unfortunately, it seems there is no reliable way to capture the
configure command line.
Cheers,
Gilles
On Sunday, September 13, 2015, Ralph Castain wrote:
is set, then force the environment variable but do
not propagate it)
random/attr-error-code only checks mpi_param_check at configure time, and
i will fix that
for now, i suggest you comment out the mpi_param_check = 0 line from your
linux.conf file
Cheers,
Gilles
On 9/12/2015 9:51 AM
George,
I will revisit this.
if I added the const modifier when not required by the standard, this was not
intentional, this was a mistake.
thanks for the report
Gilles
On Wednesday, September 16, 2015, George Bosilca
wrote:
> Gilles,
>
> Your commit 6e6a3e96 is only partially correct.