Re: [OMPI devel] Open MPI collectives algorithm selection

2015-05-20 Thread Gilles Gouaillardet
Howard, i made PR 593 https://github.com/open-mpi/ompi/pull/593 in order to fix this. George, could you please review this ? Cheers, Gilles On 5/20/2015 12:57 PM, Howard Pritchard wrote: HI Gilles, First a disclaimer - I do not know what the intended design was nor where the design

Re: [OMPI devel] Open MPI collectives algorithm selection

2015-05-20 Thread Gilles Gouaillardet
me. however, there is a grey area for small communicators and i think it should be cleared. Cheers, Gilles On 5/21/2015 1:04 AM, George Bosilca wrote: Each rule defines an interval with the previous rule, and everything in an interval will be bound to the rule with the next message size.

Re: [OMPI devel] Open MPI collectives algorithm selection

2015-05-20 Thread Gilles Gouaillardet
r; } Cheers, Gilles On Thu, May 21, 2015 at 12:04 PM, George Bosilca wrote: > Gilles, > > There is no need to define a rule for zero-sized messages, it is > implicitly matched by the first rule. To be extremely pedantic the > selection logic for the communicator s

[OMPI devel] RFC: how should Open MPI handle link-local addresses

2015-05-21 Thread Gilles Gouaillardet
local addresses (that requires some extra devel) as far as i am concerned, i am fine with 1) because i think it is very unlikely a user ever wants to use link-local addresses. Thanks in advance for your feedback so we can move forward. Cheers, Gilles

[OMPI devel] Jenkins and coverity logs

2015-05-25 Thread Gilles Gouaillardet
, Gilles

Re: [OMPI devel] change in io_ompio.c

2015-05-27 Thread Gilles Gouaillardet
Edgar, i am sorry about that. i fixed some memory leaks (some memory was leaking in some error cases). i also moved (up) some malloc in order to group them and simplify the handling of error cases. per your comment, one move was incorrect indeed :-( Cheers, Gilles On 5/28/2015 12:14 PM

Re: [OMPI devel] v1.8.6 release

2015-05-29 Thread Gilles Gouaillardet
this is good enough for 1.8.6 we might want to hardcode the mca value in this branch. I dont remember anything was committed to the master. if I get a positive review, I will do the back port and pr vs v1.8 on Monday cheers, Gilles On Friday, May 29, 2015, Ralph Castain wrote: > Hi fo

Re: [OMPI devel] v1.8.6 release

2015-05-29 Thread Gilles Gouaillardet
i do not know :-( all I can tell is I did not run into any issue involving the oob tcp that being said, I am not sure link-local addresses are considered as local addresses by oob/tcp since these IPv6 are part of the orted command line Cheers, Gilles On Saturday, May 30, 2015, Ralph Castain

Re: [OMPI devel] RFC: standardize verbosity values

2015-06-08 Thread Gilles Gouaillardet
t;, "ERROR", "WARN", "NOTICE", "INFO", "DEBUG", "TRACE", "NOTSET", "UNKNOWN" }; Cheers, Gilles On 5/30/2015 1:32 AM, Nathan Hjelm wrote: At the moment we have a loosely enforced standard for verb

Re: [OMPI devel] RFC: standardize verbosity values

2015-06-08 Thread Gilles Gouaillardet
so what about : static const char* const priorities[] = { "ERROR", "WARN", "INFO", "DEBUG", "TRACE" }; and merge debug and trace if there should be only 4 Cheers, Gilles On Monday, June 8, 2015, Ralph Castain wrote: >
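A minimal sketch of how the five-level table proposed above could be used, assuming a zero-based level index. `demo_priority_name` and `demo_priority_level` are hypothetical helpers, not Open MPI functions, and the "UNKNOWN" fallback is an assumption borrowed from the longer list quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The table as proposed in the message above. */
static const char *const priorities[] = {
    "ERROR", "WARN", "INFO", "DEBUG", "TRACE"
};

#define DEMO_PRIORITY_COUNT (sizeof(priorities) / sizeof(priorities[0]))

/* Hypothetical helper: map a zero-based level to its name.
 * Out-of-range levels fall back to "UNKNOWN" (an assumption). */
const char *demo_priority_name(int level)
{
    if (level < 0 || (size_t)level >= DEMO_PRIORITY_COUNT) {
        return "UNKNOWN";
    }
    return priorities[level];
}

/* Hypothetical helper: map a name back to its level, or -1 if absent. */
int demo_priority_level(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < DEMO_PRIORITY_COUNT; i++) {
        if (strcmp(priorities[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

Merging debug and trace, as suggested if only four levels are wanted, would amount to dropping the last entry; the bounds check follows the array size automatically.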

Re: [OMPI devel] Unused var in OB1 on master

2015-06-13 Thread Gilles Gouaillardet
Will do tomorrow. proc is only used in heterogeneous mode, hence the warning On Sunday, June 14, 2015, Ralph Castain wrote: > *pml_ob1_recvreq.c:* In function '*mca_pml_ob1_recv_request_put_frag*': > *pml_ob1_recvreq.c:397:18:* *warning: *unused variable '*proc*' > [-Wunused-variable] > omp

Re: [OMPI devel] Unused var in OB1 on master

2015-06-14 Thread Gilles Gouaillardet
Ralph and all, this is fixed at https://github.com/open-mpi/ompi/commit/ee3a1da28a3c018115bad82e0a9e7d1e04d35148 Cheers, Gilles On 6/14/2015 10:43 AM, Gilles Gouaillardet wrote: Will do tomorrow. proc is only used in heterogeneous mode, hence the warning On Sunday, June 14, 2015, Ralph

Re: [OMPI devel] v2.0 branch has been created

2015-06-21 Thread Gilles Gouaillardet
Jeff, currently, the github "v2.0" branch is called "v2.x" was this intended ? Cheers, Gilles On 6/21/2015 2:00 AM, Jeff Squyres (jsquyres) wrote: The v2.0 branch has been created on the github ompi-release repo. Let the pull requests commence. Just so that we develo

Re: [OMPI devel] Bug

2015-06-21 Thread Gilles Gouaillardet
mentioned messages coming from coll_libnbc_ireduce_scatter_block.c there might be a bug left in this area, but i was unable to reproduce it. could you please post the code you used initially ? when all is fixed, i will make the PR for v1.8, v1.10 and v2.x/v2.0 Cheers, Gilles On 6/21/2015 1:46 AM

Re: [OMPI devel] v2.0 branch has been created

2015-06-22 Thread Gilles Gouaillardet
Thanks Jeff, that is a bit too subtle for me :-) do you mean that for example v2.1 will not be forked from v2.0 ? Cheers, Gilles On Monday, June 22, 2015, Jeff Squyres (jsquyres) wrote: > Yes. > > I actually created it as 2.0; I then deleted it and re-created it as 2.x. > It

Re: [OMPI devel] Bug

2015-06-22 Thread Gilles Gouaillardet
(coll_libnbc_ireduce_scatter_block.c, 67) at first glance, i could not find how to reproduce this (e.g. coll_libnbc_ireduce_scatter_block.c malloc 0 bytes) if you still have the test program that can do that, could you please post it ? Cheers, Gilles On 6/22/2015 11:28 PM, Lisandro Dalcin wrote: On 21 June 2015 at 19:50

Re: [OMPI devel] Bug

2015-06-22 Thread Gilles Gouaillardet
on the other hand, i do not think as a community, we are interested in mpi4py bugs. i will let other folks comment on that. Cheers, Gilles On 6/23/2015 9:49 AM, Lisandro Dalcin wrote: On 22 June 2015 at 18:26, Gilles Gouaillardet wrote: if you still have the test program that can do that

Re: [OMPI devel] Regressions: MPI_Win_{start|post}() with MPI_GROUP_EMPTY

2015-06-22 Thread Gilles Gouaillardet
Lisandro, this is related to your previous report : some bugs were introduced when silencing zero size mallocs here is attached a patch (to be applied as well as the previous one) Cheers, Gilles On 6/23/2015 12:23 AM, Lisandro Dalcin wrote: The attached test code used to work in 1.8.5 and

Re: [OMPI devel] [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-24 Thread Gilles Gouaillardet
file to blacklist the coll_ml module to ensure this is working. Mike and Mellanox folks, could you please comment on that ? Cheers, Gilles On 6/24/2015 5:23 PM, Daniel Letai wrote: Gilles, Attached the two output logs. Thanks, Daniel On 06/22/2015 08:08 AM, Gilles Gouaillardet wrote

Re: [OMPI devel] [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-25 Thread Gilles Gouaillardet
a coll ^ml otherwise, it might crash (if coll_ml is loaded before coll_hcoll, which is really system dependent) Cheers, Gilles On 6/25/2015 10:46 AM, Gilles Gouaillardet wrote: Daniel, thanks for the logs. another workaround is to mpirun --mca coll ^hcoll ... i was able to reproduce the iss

Re: [OMPI devel] [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-25 Thread Gilles Gouaillardet
Jeff, this is exactly what happens. I will send a stack trace later Cheers, Gilles On Thursday, June 25, 2015, Jeff Squyres (jsquyres) wrote: > Gilles -- > > Can you send a stack trace from one of these crashes? > > I am *guessing* that the following is happening: >

Re: [OMPI devel] [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-25 Thread Gilles Gouaillardet
sure (i blame my poor understanding of linkers) this is an error if Open MPI is configure'd with --disable-dlopen Cheers, Gilles On 6/26/2015 8:12 AM, Paul Hargrove wrote: I can see cloning of existing component's source as a starting point for a new one as a common occurrence (at leas

Re: [OMPI devel] [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-26 Thread Gilles Gouaillardet
issues ? Cheers, Gilles On 6/26/2015 12:31 PM, Paul Hargrove wrote: On Thu, Jun 25, 2015 at 5:05 PM, Paul Hargrove <mailto:phhargr...@lbl.gov>> wrote: On Thu, Jun 25, 2015 at 4:59 PM, Gilles Gouaillardet mailto:gil...@rist.or.jp>> wrote: In this case, mca_co

Re: [OMPI devel] Java bindings are completely broken

2015-06-28 Thread Gilles Gouaillardet
Ralph, my bad, I will fix this today sorry for the inconvenience Gilles On Monday, June 29, 2015, Ralph Castain wrote: > Hey folks > > I don’t know who has been working on the Java bindings, but they are > totally broken in the master repo - cannot compile. I tried fixing a

Re: [OMPI devel] Java bindings are completely broken

2015-06-28 Thread Gilles Gouaillardet
Ralph and all, master is now fixed Cheers, Gilles On 6/29/2015 7:07 AM, Gilles Gouaillardet wrote: Ralph, my bad, I will fix this today sorry for the inconvenience Gilles On Monday, June 29, 2015, Ralph Castain <mailto:r...@open-mpi.org>> wrote: Hey folks I don’t kno

[OMPI devel] MPI_Buffer_detach fortran binding

2015-06-29 Thread Gilles Gouaillardet
bug ? (and not an anticipated errata) If a bug, then I can fix it tomorrow Cheers, Gilles

Re: [OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Gilles Gouaillardet
I think Paul's concern was about cross compilation (e.g. no AC_TRY_RUN ...) fwiw, fortran bindings cannot be built "as is" when cross compiling ompi Cheers, Gilles On Wednesday, July 1, 2015, Ralph Castain wrote: > Given the description, I suspect that any MPI applicat

Re: [OMPI devel] error in test/threads/opal_condition.c

2015-07-01 Thread Gilles Gouaillardet
In other places, initialization looks like opal_mutex_t mutex = {{0}}; Btw, opal_condition is a standalone binary (e.g. Not part of ompi library), so I do not think uninitialized common hurts here. Cheers, Gilles On Wednesday, July 1, 2015, Nathan Hjelm wrote: > > PGI no longer supri

Re: [OMPI devel] Open MPI 1.8.6 memory leak

2015-07-01 Thread Gilles Gouaillardet
. i left some #if 0 in the code since i do not know if something needs to be done about rdma fragments Cheers, Gilles On 7/2/2015 6:04 AM, Nathan Hjelm wrote: Don't see the leak on master with OS X using the leaks command. Will see what valgrind finds on linux. -Nathan On Wed, Jul 01, 20

Re: [OMPI devel] XRC Support

2015-07-09 Thread Gilles Gouaillardet
ake, make install i am now double checking this Cheers, Gilles On 7/9/2015 11:25 AM, Paul Hargrove wrote: I just gave the whole 1.8 series a spin and it looks like "ConnectX XRC" configure logic has been broken since 1.8.5, but worked in 1.8.4: $ grep 'ConnectX XRC support

Re: [OMPI devel] 1.8.7 rc1 out for review

2015-07-09 Thread Gilles Gouaillardet
Paul, can you please compress and post your config.log ? what is the OFED version you are running ? on master, that fix did the trick on mellanox test cluster (recent OFED version) but did not enable XRC on lanl test clusters (my best bet is an old OFED library) Thanks Gilles On 7/10/2015

Re: [OMPI devel] 1.8.7 rc1 out for review

2015-07-09 Thread Gilles Gouaillardet
Thanks Paul, i just found another bug ... (and i should be blamed for it) here is attached a patch. basically, xrc was incorrectly disabled on "older" ofed stacks Cheers, Gilles On 7/10/2015 10:06 AM, Paul Hargrove wrote: Gilles, A bzip2-compressed config.log is attached. I

Re: [OMPI devel] 1.8.7 rc1 out for review

2015-07-10 Thread Gilles Gouaillardet
Paul, i just applied the patch on the tarball, and it worked for me. anyway, the IBV_SRQT_XRC test was misplaced (and i just read you already found out ...) we need it for XRC_DOMAINS and *not* for XRC the newly attached patch will (hopefully) fix this Cheers, Gilles On 7/10/2015 11:06 AM

Re: [OMPI devel] 1.8.7 rc1 out for review

2015-07-10 Thread Gilles Gouaillardet
which is incorrect i will fix that too Cheers, Gilles On 7/10/2015 1:16 PM, Paul Hargrove wrote: Gilles, I've made another observation about what I believe is an error in the XRC configure probe. If I am following the code below correctly, then *both* ConnectX and Connec

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-10 Thread Gilles Gouaillardet
cray one) Is it possible to have it installed ? Cheers, Gilles On Friday, July 10, 2015, Jeff Squyres (jsquyres) wrote: > On Jul 10, 2015, at 2:12 AM, Paul Hargrove > wrote: > > > > The only "new" (non-cosmetic) problem I observed was the failure to > detect "

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-10 Thread Gilles Gouaillardet
Ralph, (Some) things got broken when adding support for XRC domains / OFED 3.12. In 1.8.4 there is no XRC support with OFED 3.12 As far as I am concerned, reverting opening btl to 1.8.4 is not a good option. Cheers, Gilles On Friday, July 10, 2015, Ralph Castain wrote: > Given that 1.8.

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-12 Thread Gilles Gouaillardet
Paul, Here is a revised patch to be applied vs the 1.8.7-rc1 tarball Could you please give it a try ? Cheers, Gilles On 7/11/2015 4:22 AM, Paul Hargrove wrote: The timing on this is less than ideal for me. To accommodate work on some high-voltage switching equipment, our building will be

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-13 Thread Gilles Gouaillardet
Paul, thanks for the report, i made ConnectX XRC (aka XRC) and ConnectIb XRC (aka XRC domains) exclusive, so yes, you got the desired behavior. Cheers, Gilles On 7/13/2015 3:11 PM, Paul Hargrove wrote: Giles, With this latest patch on my "new" system I see checking if Co

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-13 Thread Gilles Gouaillardet
Hi Chris, i pushed my tarball into a gist : git clone https://gist.github.com/ec20f77ec35533fa575a.git and then the tarball is in ec20f77ec35533fa575a/openmpi-gitclone.tar.bz2 Cheers, Gilles On 7/13/2015 4:59 PM, Chris Samuel wrote: Hi Gilles, On Mon, 13 Jul 2015 03:16:57 PM Gilles

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-13 Thread Gilles Gouaillardet
intel compilers. generally speaking, should we revert the fortran initialization part and let these common symbols uninitialized ? I realize this is very confusing for end users ... I think Jeff is the one who understands this part best, but he might not be available this week. Cheers, Gilles

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-13 Thread Gilles Gouaillardet
? Cheers, Gilles On Monday, July 13, 2015, Ralph Castain wrote: > Gilles - just to confirm, the patch you provided here is the one in the > updated PRs, yes? If so, I’ll consider those PRs as confirmed and commit > them > > > On Jul 13, 2015, at 7:20 AM, Gilles Gouaillardet

Re: [OMPI devel] 1.8.7rc1 testing results

2015-07-14 Thread Gilles Gouaillardet
Hi Ralph, you are right. the f08 warnings have kind of always been there. master has a few extra warnings (caused by initialization of common symbols) but the changes have not been PR'ed to v1.8 i made PR 719 https://github.com/open-mpi/ompi/pull/719 to fix this. Cheers, Gilles On

[OMPI devel] race condition in finalize

2015-07-17 Thread Gilles Gouaillardet
scenario in which the progress thread (aka thread 2) is still dealing with some memory that was just freed/unmapped/corrupted by the main thread. I empirically noticed the error is more likely to occur when there are many tasks on one node e.g. mpirun --oversubscribe -np 32 a.out Cheers, Gilles

Re: [OMPI devel] Erroneous test?

2015-07-17 Thread Gilles Gouaillardet
Ralph, I will try to reproduce this. I guess you already checked the output of ompi_info to confirm params are checked at runtime. Cheers, Gilles On Saturday, July 18, 2015, Ralph Castain wrote: > Hi folks > > I keep getting segfault errors when testing 1.10, while others say the

Re: [OMPI devel] Erroneous test?

2015-07-17 Thread Gilles Gouaillardet
Ralph, based on the source code (ompi_mpi_params.c:91) I was expecting a Boolean ompi_mpi_param_check Cheers, Gilles On Saturday, July 18, 2015, Ralph Castain wrote: > Yep, I checked: > > MPI parameter check: runtime > > > > On Jul 17, 2015, at 8:00 PM

Re: [OMPI devel] Erroneous test?

2015-07-20 Thread Gilles Gouaillardet
Ralph, it seems (google) that MPI_CHECK_ARGS is specific to (at least) cray and sgi mpi for openmpi, we need to set OMPI_MCA_mpi_param_check=1 i updated the onesided test suite and pushed it to the ompi-tests repo Cheers, Gilles On 7/18/2015 11:57 PM, Ralph Castain wrote: Ah, I found the

Re: [OMPI devel] race condition in finalize

2015-07-21 Thread Gilles Gouaillardet
t, next, &orte_rml_base.posted_recvs, orte_rml_posted_recv_t) { /* since names could include wildcards, must use * the more generalized comparison function */ i hope this helps, Gilles On 7/17/2015 11:04 PM, Ralph Castain wrote: It’s probably a race condition ca

Re: [OMPI devel] race condition in finalize

2015-07-22 Thread Gilles Gouaillardet
ain, i was unable to reproduce any crash. Cheers, Gilles On 7/22/2015 12:48 AM, Ralph Castain wrote: I believe I have this fixed - please see if this solves the problem: https://github.com/open-mpi/ompi/pull/730 On Jul 21, 2015, at 12:22 AM, Gilles Gouaillardet <mailto:gil...@rist.or.jp>

Re: [OMPI devel] MAYBE problem w/ XRC with OFED pre-3.12

2015-07-25 Thread Gilles Gouaillardet
ow to fix this is welcome. if not, the test can be made optional via a MCA param, or be simply removed Cheers, Gilles On Saturday, July 25, 2015, Paul Hargrove wrote: > I know Gilles and I went to a fair amount of effort to get configure > detection of "older" XRC working again f

Re: [OMPI devel] MAYBE problem w/ XRC with OFED pre-3.12

2015-07-25 Thread Gilles Gouaillardet
if no dlopen or static libs only. but I am afraid I cannot get a working solution if both static and dynamic libs are built. Cheers, Gilles On Saturday, July 25, 2015, Paul Hargrove wrote: > Gilles, > > I can confirm that it is not an environment problem, since the strace > command

Re: [OMPI devel] MAYBE problem w/ XRC with OFED pre-3.12

2015-07-25 Thread Gilles Gouaillardet
Paul, where do you run mpirun ? on a compute node ? on a login node with no infiniband interface ? if on a login node, are the infiniband libraries at least available ? Cheers, Gilles On Saturday, July 25, 2015, Paul Hargrove wrote: > I know Gilles and I went to a fair amount of effort

Re: [OMPI devel] malloc(0) warning with 1.8.7

2015-07-25 Thread Gilles Gouaillardet
Lisandro, I think I see what is going wrong and will fix it Thanks for the report, Gilles On Saturday, July 25, 2015, Lisandro Dalcin wrote: > Using a debug build of 1.8.7, I'm still getting this malloc(0) warning: > > malloc debug: Request for 0 bytes (coll_libnbc_ireduce_s

Re: [OMPI devel] malloc(0) warning with 1.8.7

2015-07-27 Thread Gilles Gouaillardet
Lisandro, i fixed it on master at https://github.com/open-mpi/ompi/commit/318a1a40a4ab345f417b8932326d4dd2e68d82bc could you give it a try ? Cheers, Gilles On 7/26/2015 9:26 AM, Gilles Gouaillardet wrote: Lisandro, I think I see what is going wrong and will fix it Thanks for the report

Re: [OMPI devel] stdout, stderr reporting different values for isatty

2015-07-27 Thread Gilles Gouaillardet
not even sure stdout is a tty. Cheers, Gilles On Monday, July 27, 2015, Christoph Niethammer wrote: > Hello, > > I know, using stdout and stderr within MPI programs is in no way good. > Nevertheless I found that - and now wonder why - isatty inside an MPI > program reports diffe

Re: [OMPI devel] Error in version 1.8.7?

2015-08-04 Thread Gilles Gouaillardet
Hartmut, yes this is a bug ... we are still working on a proper fix. in the meantime, you can comment out the dlsym test in the openib btl (otherwise, openmpi falls back to tcp ...) Cheers, Gilles On Tuesday, August 4, 2015, Hartmut Häfner (SCC) wrote: > Dear developers, > > we have

Re: [OMPI devel] new branch on open-mpi/ompi?

2015-08-06 Thread Gilles Gouaillardet
Hi Howard, it looks like i pushed my branch to ompi repo instead of my clone ... that was clearly a mistake and i deleted the branch Cheers, Gilles On 8/6/2015 12:14 AM, Howard Pritchard wrote: HI Folks, There's a new branch on open-mpi/ompi repo. Is this intentional? H

Re: [OMPI devel] v1.10.0 release

2015-08-13 Thread Gilles Gouaillardet
the PRs from now Cheers, Gilles On 8/14/2015 3:20 AM, Paul Hargrove wrote: On Thu, Aug 13, 2015 at 7:42 AM, Ralph Castain <mailto:r...@open-mpi.org>> wrote: Please take one last look around to see if anything else is missing. I'd like to get this released next week.

Re: [OMPI devel] v1.10.0 release

2015-08-14 Thread Gilles Gouaillardet
urgent, i assigned them to you. this simply remove a bogus test (OFED version used at runtime vs compile time) note i made a PR for master but i did not push my changes Cheers, Gilles On 8/14/2015 8:44 AM, Gilles Gouaillardet wrote: Paul, i tried to fix this test, and at this stage, i do not

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-22 Thread Gilles Gouaillardet
round the issue. Cheers, Gilles On Sunday, August 23, 2015, Paul Hargrove wrote: > Having seen problems with mtl:ofi with "--enable-static --disable-shared", > I tried mtl:psm and mtl:mxm with those options as well. > > The good news is that mtl:psm was fine, but the bad new

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Gilles Gouaillardet
, and this is specific to each system. so at this stage, I cannot suspect this is a different issue or not. if the crash still occurs with .ompi_ignore in coll ml, then I could conclude this is a different issue. Cheers, Gilles On Sunday, August 23, 2015, Paul Hargrove wrote: > Gilles, >

[OMPI devel] reachable_netlink mca, libnl and libnl3

2015-08-24 Thread Gilles Gouaillardet
with --verbs and OFED is available. any thoughts ? Cheers, Gilles

Re: [OMPI devel] reachable_netlink mca, libnl and libnl3

2015-08-24 Thread Gilles Gouaillardet
al_dl_open() NULL and then look for symbols > that are unique to libnl and libnl3, but a) when to do that, and b) it's > not guaranteed to work in all cases. > > > > > > On Aug 24, 2015, at 7:36 AM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com > w

Re: [OMPI devel] reachable_netlink mca, libnl and libnl3

2015-08-24 Thread Gilles Gouaillardet
f both libnl and libnl3 are present > in the same process (e.g., if some of OMPI's dependent libraries pull them > both in). We could try to opal_dl_open() NULL and then look for symbols > that are unique to libnl and libnl3, but a) when to do that, and b) it's > not guaranteed

Re: [OMPI devel] reachable_netlink mca, libnl and libnl3

2015-08-24 Thread Gilles Gouaillardet
a first step could be adding a --disable-libnl3 option to configure, which means components should not even try to use libnl3 makes sense ? On Monday, August 24, 2015, Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote: > iirc, librdmacm uses libnl > > I am not sure if h

Re: [OMPI devel] esslingen MTT?

2015-08-25 Thread Gilles Gouaillardet
Thanks Adrian, i fixed this in PR #831 https://github.com/open-mpi/ompi/pull/831 and will push it shortly to master Best regards, Gilles On 8/25/2015 4:47 PM, Adrian Reber wrote: On Mon, Aug 24, 2015 at 09:47:22PM +, Jeff Squyres (jsquyres) wrote: Who runs the esslingen MTT? You

[OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
not to build the psm mtl if java bindings are built and another option is to revamp mca_mtl_psm.so so it does not link with libinfinipath.so (use an intermediate component, or dlopen libinfinipath) any thoughts ? Cheers, Gilles

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
ipath change actually change its signal handler > behavior? > > > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet > wrote: > > > > Folks, > > > > some time ago, some crashes were reported when using java bindings. > > one of them was caused was cause

[OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-25 Thread Gilles Gouaillardet
the PMPI_* symbols 3. we add a configure option to call PMPI_* symbols instead of the MPI_* ones any thoughts ? Cheers, Gilles

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
, but that cannot work because libinfinipath is dlopen'ed and its signal handler is set also, I guess putenv("OMPI_MCA_mtl=^psm") would not work if ompi was configure'd with --disable-dlopen Cheers, Gilles On Wednesday, August 26, 2015, Ralph Castain wrote: > Gilles: w

Re: [OMPI devel] cosmetic misleading mpirun error message

2015-08-25 Thread Gilles Gouaillardet
$ Gilles On Wednesday, August 26, 2015, Jeff Squyres (jsquyres) wrote: > Fair point. > > I don't know if there's an easy way to fix that, though. > > > > On Aug 25, 2015, at 6:01 PM, Cabral, Matias A > wrote: > > > > Hi, > > > > >

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
Thanks Paul, I will give it a try Cheers, Gilles On Wednesday, August 26, 2015, Paul Hargrove wrote: > Gilles, > > Is the conflict over "SIG32"? > If so, I believe setenv PSM_RCVTHREAD=0 in the environment will disable > InfiniPath's use of that signal. >

Re: [OMPI devel] mca_mtl_psm and java

2015-08-26 Thread Gilles Gouaillardet
or fail at build or runtime) i will also shut up from now and let the fine folks at Intel implement a definitive solution :-D Cheers, Gilles On 8/27/2015 12:41 AM, Jeff Squyres (jsquyres) wrote: On Aug 26, 2015, at 11:29 AM, Ralph Castain wrote: ...but only when the PSM MTL is not compile

Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-27 Thread Gilles Gouaillardet
fine its own MPI_Alltoall subroutine, then PMPI_Alltoall is invoked directly since MPI_Alltoall is a weak symbol pointing to PMPI_Alltoall. Cheers, Gilles On 8/26/2015 9:39 AM, Jeff Squyres (jsquyres) wrote: On Aug 25, 2015, at 11:03 AM, George Bosilca wrote: This seems to be the case only wi
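A rough sketch of the weak-symbol arrangement described in the message above, assuming GCC or Clang on an ELF platform (the `weak` and `alias` attributes are compiler extensions, not standard C). `demo_pmpi_alltoall` and `demo_mpi_alltoall` are hypothetical stand-ins for PMPI_Alltoall and MPI_Alltoall, and the function body is a placeholder rather than anything Open MPI does.

```c
#include <assert.h>

/* Stand-in for the "real" implementation (the PMPI_ entry point).
 * The body is a placeholder so the call path can be observed. */
int demo_pmpi_alltoall(int count)
{
    assert(count >= 0);
    return count * 2;
}

/* Stand-in for the public name: a weak alias of the function above.
 * If no other object file provides a strong demo_mpi_alltoall,
 * calls to it land directly in demo_pmpi_alltoall. A profiling
 * library could supply a strong definition to intercept the call. */
int demo_mpi_alltoall(int count)
    __attribute__((weak, alias("demo_pmpi_alltoall")));
```

With nothing overriding the weak name, `demo_mpi_alltoall(4)` and `demo_pmpi_alltoall(4)` run the same code.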

Re: [OMPI devel] OpenMPI 1.8 Bug Report

2015-08-27 Thread Gilles Gouaillardet
iirc, the MPI_Win_detach discrepancy with the standard is intentional in fortran 2008, there is a comment in the source code to explain this. On Thursday, August 27, 2015, Kawashima, Takahiro < t-kawash...@jp.fujitsu.com> wrote: > Oh, I also noticed it yesterday and was about to report it. > > An

Re: [OMPI devel] OpenMPI 1.8 Bug Report

2015-08-27 Thread Gilles Gouaillardet
Kawashima-san, you are right, I mixed up MPI_Buffer_detach and MPI_Win_detach sorry for the confusion Cheers, Gilles On Thursday, August 27, 2015, Kawashima, Takahiro < t-kawash...@jp.fujitsu.com> wrote: > Gilles, > > > there is a comment in the source code to explain this.

Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()

2015-08-27 Thread Gilles Gouaillardet
Ralph, what about : - if only one interface is specified (e.g. *_if_include eth0), then bind to that interface - otherwise, bind to all interfaces Mark, would that solve your issue ? Cheers, Gilles On 8/28/2015 9:50 AM, Ralph Castain wrote: I committed the change that prevents orte-submit

Re: [OMPI devel] OpenMPI 1.8 Bug Report

2015-08-27 Thread Gilles Gouaillardet
Thanks Michael and Kawashima-san, i made PR #838 to fix this it is currently available at https://github.com/open-mpi/ompi/pull/838 Cheers, Gilles On 8/27/2015 6:29 PM, Michael Knobloch wrote: Dear OpenMPI developers, I noticed a bug in the definition of the 3 MPI-3 RMA functions

Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-30 Thread Gilles Gouaillardet
*_f files are impacted, and for mpif-h only, so i'd rather ask before I fill the pr, and even if a sed command will do most of the job */ Cheers, Gilles On Saturday, August 29, 2015, Jeff Squyres (jsquyres) wrote: > On Aug 27, 2015, at 3:25 AM, Gilles Gouaillardet > wro

Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-31 Thread Gilles Gouaillardet
Jeff, i filed PR #845 https://github.com/open-mpi/ompi/pull/845 could you please have a look ? Cheers, Gilles On 8/30/2015 9:20 PM, Gilles Gouaillardet wrote: ok, will do basically, I simply have to #include "ompi/mpi/c/profile/defines.h" if configure set the WANT_MPI_PROFI

Re: [OMPI devel] Dual rail IB card problem

2015-08-31 Thread Gilles Gouaillardet
Brice, as a side note, what is the rationale for defining the distance as a floating point number ? i remember i had to fix a bug in ompi a while ago /* e.g. replace if (d1 == d2) with if((d1-d2) < epsilon) */ Cheers, Gilles On 9/1/2015 5:28 AM, Brice Goglin wrote: The locality is mlx
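A minimal sketch of the tolerance-based comparison alluded to above. `demo_distance_equal` is a hypothetical helper, not an hwloc or Open MPI function, and the choice of epsilon is left to the caller. Note that the sketch compares the absolute difference, which the shorthand in the message leaves implicit: a signed `(d1-d2) < epsilon` would also be true whenever d1 is much smaller than d2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: treat two floating point distances as equal
 * when they differ by no more than epsilon in either direction. */
bool demo_distance_equal(float d1, float d2, float epsilon)
{
    assert(epsilon >= 0.0f);
    /* Absolute difference, computed without needing libm. */
    float diff = (d1 > d2) ? (d1 - d2) : (d2 - d1);
    return diff <= epsilon;
}
```

A call such as `demo_distance_equal(d1, d2, 1e-6f)` would replace the exact `d1 == d2` test; the right tolerance depends on how the distances were computed.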

Re: [OMPI devel] Problem running from ompi master

2015-08-31 Thread Gilles Gouaillardet
Hi, this part has been revamped recently. at first, i would recommend you make a fresh install remove the install directory, and the build directory if you use VPATH, re-run configure && make && make install that should hopefully fix the issue Cheers, Gilles On 9/1/2015

Re: [OMPI devel] 1.10.0 issue

2015-09-02 Thread Gilles Gouaillardet
ly do nothing (the end user might know what he/she is doing, and there will be nothing to do on the ompi side when this gets fixed by the PSM folks) Cheers, Gilles On 9/3/2015 10:21 AM, Ralph Castain wrote: Hi folks I regret to say that 1.10.0 is hitting an issue with at least one upstream d

Re: [OMPI devel] 1.10.0 issue

2015-09-03 Thread Gilles Gouaillardet
the option to choose which PSM version (if any) should be used ? Cheers, Gilles On 9/3/2015 12:47 PM, Ralph Castain wrote: I’m afraid that won’t solve the problem - the distro will still feel the need to release -two- versions of OMPI, one with PSM and one with PSM2. Ordinarily, I wouldn’t

Re: [OMPI devel] 1.10.0 issue

2015-09-03 Thread Gilles Gouaillardet
George, about your third point : some libraries do stuff in the constructors, so "mtl = ^psm" might also not work if OMPI was configure'd with --disable-dlopen. as far as i know, --disable-dlopen is quite popular (and --disable-shared --enable-static is not so much) Cheers,

Re: [OMPI devel] 1.10.0 issue

2015-09-03 Thread Gilles Gouaillardet
Michael, if a solution with two packages is acceptable, then another, simpler option is to configure openmpi for PSM with --without-psm2, and openmpi for PSM2 with --without-psm this is safe for --disable-dlopen or --enable-static, and you do not need to tweak the conf files Cheers, Gilles

Re: [OMPI devel] 1.10.0 issue

2015-09-03 Thread Gilles Gouaillardet
Jeff, on second thought, wouldn't it be better to simply disable both PSM and PSM2 in openmpi, and let libfabric handle these conflicts ? does that make any sense ? Cheers, Gilles On Thursday, September 3, 2015, Jeff Squyres (jsquyres) wrote: > I agree with what George says. >

Re: [OMPI devel] 1.10.0 issue

2015-09-03 Thread Gilles Gouaillardet
(known not to support PSM), or a mpirun-psm2 wrapper, or a release note (e.g. use --mca mtl ^psm or a psm2 param file) I still do not get how removing PSM2 makes things better (and the same result can be achieved by configuring with --without-psm2) Cheers, Gilles On Thursday, September 3, 2015

Re: [OMPI devel] RFC: Remove --without-hwloc configure option

2015-09-03 Thread Gilles Gouaillardet
ation ? for example, on Fujitsu FX10 node (single socket, 16 cores), hwloc reports 16 sockets with one core each and no cache. though this is not correct, that can be seen as equivalent to the real config by ompi, so this is not really an issue for ompi. Cheers, Gilles On Friday, September 4,

Re: [OMPI devel] RFC: Remove --without-hwloc configure option

2015-09-04 Thread Gilles Gouaillardet
Thanks Brice, bottom line, even if hwloc is not fully ported, it should build and ompi should get something usable. in this case, i have no objection removing the --without-hwloc configure option. you can contact me off-list regarding the FX10 specific issue Cheers, Gilles On 9/4/2015 2

[OMPI devel] RFC: Remove the --enable-mpi-profile option

2015-09-04 Thread Gilles Gouaillardet
generate the MPI_* bindings - an other time to generate the PMPI_* bindings */ any thoughts or objections to the removal of the --enable-mpi-profile configure option ? Cheers, Gilles

[OMPI devel] no more cast away const

2015-09-04 Thread Gilles Gouaillardet
ot to add the const modifier to MPI_User_function as i wrote earlier, the change is quite massive. i plan to commit it by the end of next week, unless there are any objections. (and then i will PR for v2.x, and v1.10 but only if there is a request) Cheers, Gilles

Re: [OMPI devel] RFC: Remove --without-hwloc configure option

2015-09-04 Thread Gilles Gouaillardet
per of Fujitsu MPI for K computer and Fujitsu > PRIMEHPC FX10/FX100 (SPARC-based CPU). > > Though I'm not familiar with the hwloc code and didn't know > the issue reported by Gilles, I also would be able to help > you to fix the issue. > > Takahiro Kawashima, > MPI

Re: [OMPI devel] New master warnings

2015-09-10 Thread Gilles Gouaillardet
Pasha, i fixed that in https://github.com/open-mpi/ompi/commit/c404e98dced4104cd3abe7485846368325c3d150 but forgot to post it to the ML ... Cheers, Gilles On 9/11/2015 7:31 AM, Shamis, Pavel wrote: Ralph, I don't see these warnings on my fedora box with gcc 5.1.1. I will try to f

Re: [OMPI devel] New master warnings

2015-09-11 Thread Gilles Gouaillardet
Ralph, will do i think these new warnings are a consequence of the changes i pushed recently (e.g. add the const keyword) Cheers, Gilles On 9/11/2015 12:47 PM, Ralph Castain wrote: FWIW: I’m still seeing these on CentOS7 using gcc 4.8.3 in a debug build: *coll_ml_allocation.c:20:13

Re: [OMPI devel] New master warnings

2015-09-11 Thread Gilles Gouaillardet
Ralph, this is fixed in https://github.com/open-mpi/ompi/commit/a1627feaf74d8562146a1afbfabec60651496c06 Cheers, Gilles On 9/11/2015 1:02 PM, Gilles Gouaillardet wrote: Ralph, will do i think these new warnings are a consequence of the changes i pushed recently (e.g. add the const

[OMPI devel] issue with group sentinel values

2015-09-11 Thread Gilles Gouaillardet
t ompi_proc_t *) 0xf8010010f540 what about using the lower bit instead ? my assumption is that ompi_proc_t objects are aligned (static or malloc'ed one) on at least a pointer size (4 in x86) so the lower bit should always be zero. any thoughts ? Cheers, Gilles
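A minimal sketch of the low-bit idea raised above, assuming (as the message does) that real objects are at least 2-byte aligned so bit 0 of a valid pointer is always zero. `demo_proc_t` is a hypothetical stand-in for ompi_proc_t, and the helper names are illustrative only, not the Open MPI implementation. Converting between integers and pointers this way is implementation-defined in C, though it behaves as expected on common platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ompi_proc_t; any struct holding an int
 * is at least 2-byte aligned on common ABIs. */
typedef struct demo_proc {
    int rank;
} demo_proc_t;

/* A value is a sentinel when bit 0 is set; real pointers never are. */
bool demo_is_sentinel(const demo_proc_t *p)
{
    return ((uintptr_t)p & (uintptr_t)1) != 0;
}

/* Pack an integer payload into a sentinel "pointer". */
demo_proc_t *demo_make_sentinel(uintptr_t value)
{
    /* The payload must leave room for the tag bit. */
    assert((value >> (sizeof(uintptr_t) * 8 - 1)) == 0);
    return (demo_proc_t *)((value << 1) | (uintptr_t)1);
}

/* Recover the payload from a sentinel. */
uintptr_t demo_sentinel_value(const demo_proc_t *p)
{
    assert(demo_is_sentinel(p));
    return (uintptr_t)p >> 1;
}
```

Unlike a tag in the upper bits, this cannot collide with a high address such as the one shown above, because the test depends only on alignment.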

Re: [OMPI devel] Remaining MTT errors

2015-09-12 Thread Gilles Gouaillardet
clean your nodes when it was fixed) the neighbor_allgather_self failure is discussed at https://github.com/open-mpi/ompi/pull/790 I will have a look at the op related failure on Monday (looks like a MPI conformance issue unrelated to PMIx) Cheers, Gilles On Saturday, September 12, 2015, Ralph

Re: [OMPI devel] Remaining MTT errors

2015-09-13 Thread Gilles Gouaillardet
w ompi was configure'd when built outside mtt. as a side note... ideally, the configure command line would be available from ompi_info. but unfortunately, it seems there is no reliable way to capture the configure command line. Cheers, Gilles On Sunday, September 13, 2015, Ralph Castain wrote:

Re: [OMPI devel] Remaining MTT errors

2015-09-14 Thread Gilles Gouaillardet
is set, then force the environment variable but do not propagate it) random/attr-error-code only check mpi_param_check at configure time, and i will fix that from now for now, i suggest you comment the mpi_param_check = 0 line from your linux.conf file Cheers, Gilles On 9/12/2015 9:51 AM

Re: [OMPI devel] Commit 6e6a3e96

2015-09-16 Thread Gilles Gouaillardet
George, I will revisit this. if I added const modifier when not required by the standard, this was not intentional, this was a mistake. thanks for the report Gilles On Wednesday, September 16, 2015, George Bosilca wrote: > Gilles, > > Your commit 6e6a3e96 is only partially correct.
