Larry,
currently, Open MPI generates mpif-sizeof.h with up to 15 dimensions with
Intel compilers, but only up to 7 dimensions with "recent" gcc (for example
gcc 5.2 and higher).
So I guess the logic behind this is "give the compiler all it can
handle", so if Intel somehow "extended" the standard to
Dave,
you should not expect anything to work when mixing Fortran compilers
(and to be on the safe side, you should not expect much when mixing C/C++
compilers either;
for example, if you built OMPI with Intel and use gcc for your app, gcc
might complain about unresolved symbols from the Intel runtime)
if
ng about debug builds being used for performance testing
>
+1
I’m increasingly feeling that we shouldn’t output that message every time
> someone executes a debug-based operation, even if we add a param to turn
> off the warning.
>
+1
>
> On Mar 2, 2016, at 5:48 AM, Gilles Gouaill
pers who
> don’t would be nice.
>
> > On Mar 2, 2016, at 4:51 AM, Jeff Squyres (jsquyres) > wrote:
> >
> > On Mar 2, 2016, at 6:30 AM, Mark Santcroos > wrote:
> >>
> >>> On 02 Mar 2016, at 5:06 , Gilles Gouaillardet > wrote:
> >>
performance benchmark, then I will not get the warning I need (and yes,
I will be the only one to blame ... but isn't that something we want to
avoid here?)
Cheers,
Gilles
On 3/2/2016 1:43 PM, George Bosilca wrote:
On Mar 1, 2016, at 22:27 , Gilles Gouaillardet wrote:
be "me-frien
ng opinion, and I am fine with setting a parameter (which I
will likely soon forget I set) in a config file.
Cheers,
Gilles
On 3/2/2016 1:21 PM, Jeff Squyres (jsquyres) wrote:
On Mar 1, 2016, at 10:17 PM, Gilles Gouaillardet wrote:
In this case, should we only display the warning if debug
(jsquyres) wrote:
On Mar 1, 2016, at 10:06 PM, Gilles Gouaillardet wrote:
What about *not* issuing this warning if Open MPI is built from git?
That would be friendlier for OMPI developers,
and should basically *not* affect end users, since they typically build OMPI
from a tarball.
We
Jeff,
What about *not* issuing this warning if Open MPI is built from git?
That would be friendlier for OMPI developers,
and should basically *not* affect end users, since they typically
build OMPI from a tarball.
Cheers,
Gilles
On 3/2/2016 1:00 PM, Jeff Squyres (jsquyres) wrote:
WHAT: Ha
FWIW,
in a previous thread, Jeff Hammond explained this is why MPICH relies
on C89 instead of C99,
since C89 appears to be a subset of C++11.
Cheers,
Gilles
On 3/2/2016 1:02 AM, Nathan Hjelm wrote:
I will add to how crazy this is. The C standard has been very careful
to not break existin
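As a minimal illustration of the C89/C99 vs. C++ point above (my own example, not code from this thread): a C99 designated initializer is rejected by a C++11 compiler, while the C89-style equivalent is accepted by both.

struct point { int x; int y; };

/* C99 only: designated initializers were not added to C++ until C++20,
   so a C++11 compiler rejects this line */
struct point p99 = { .x = 1, .y = 2 };

/* C89 style: accepted by C89, C99 and C++11 alike */
struct point p89 = { 1, 2 };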
Adrian,
About bitness, it is correctly set when the MPI install succeeds.
See https://mtt.open-mpi.org/index.php?do_redir or even your successful
install on x86_64.
I suspect it is queried once the installation is successful, and I'll try
to have a look at it.
Cheers,
Gilles
On Tuesday, March 1, 20
Ralph,
The goal here is to allow vendors to distribute binary ORTE frameworks
(on top of the binary components they can already distribute) that can be used
by a user-compiled "stock" Open MPI library.
Did I get that right so far?
I gave it some thought and found that it could be simplified.
My unders
Monika,
Can you send all the information listed here:
https://www.open-mpi.org/community/help/
BTW, are you using a cross-compiler?
Can you try to compile this simple program:
typedef struct xxx xxx;
struct xxx {
int i;
xxx *p;
};
void yyy(xxx *x) {
x->i = 0;
x->p =
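(The snippet above is cut off by the archive; a complete, compilable version of this kind of self-referential struct test would look like the following, where the final assignment is my assumption.)

typedef struct xxx xxx;
struct xxx {
    int i;
    xxx *p;
};

void yyy(xxx *x) {
    x->i = 0;
    x->p = x;   /* assumed completion: any assignment to the self-referential pointer */
}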
t_sources = $(sources)
>> else
>> lib = libmca_btl_lf.la
>> lib_sources = $(sources)
>> component =
>> component_sources =
>> endif
>>
>> mcacomponentdir = $(opallibdir)
>> mcacomponent_LTLIBRARIES = $(component)
>> mca_btl_lf_la_SOURCES = $(co
Aurelien,
I guess you should also have
noinst_LTLIBRARIES += libmpiext_blabla_usempi.la
in your Makefile.am
Is your extension available somewhere on GitHub so we can have a look?
Cheers,
Gilles
On Wednesday, February 24, 2016, Aurélien Bouteiller
wrote:
> I am making an MPI extension in la
Ralph,
my 0.02 US$:
if I understand correctly, we put non-ORTE processes into a different
process group because
ORTE *might* have grandchildren and their progeny, which ORTE does not /
cannot know about.
/* note we assume here these processes are all well raised and do not
create yet another
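(For illustration only, and not the actual ORTE code: a minimal sketch of how a launcher can detach a child into its own process group, so the child's progeny is isolated from the launcher's group.)

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (0 == pid) {
        /* child: become the leader of a new process group, so signals
           sent to the parent's group do not reach this child's progeny */
        if (setpgid(0, 0) != 0) {
            perror("setpgid");
            exit(1);
        }
        execlp("sleep", "sleep", "10", (char *)NULL);
        perror("execlp");
        exit(1);
    }
    waitpid(pid, NULL, 0);
    return 0;
}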
Folks,
I made https://github.com/open-mpi/ompi/pull/1376 to fix this issue.
Note it also reverts the changes previously introduced in
ompi/runtime/ompi_mpi_init.c
Cheers,
Gilles
On 2/18/2016 8:37 AM, Gilles Gouaillardet wrote:
Jeff,
this commit only fixes MPI_Init() and not the openib btl
problem.
So whatever was done missed those precautions and introduced this symbol
regardless
of the configuration.
On Feb 15, 2016, at 8:39 PM, Gilles Gouaillardet >
wrote:
Ralph,
this is being discussed at https://github.com/open-mpi/ompi/pull/1351
btw, how do you get this warning ? i do not s
Ralph,
this is being discussed at https://github.com/open-mpi/ompi/pull/1351
BTW, how do you get this warning? I do not see it.
FWIW, the abstraction violation was kind of already there, so I am
surprised it only pops up now
Cheers,
Gilles
On 2/16/2016 1:17 PM, Ralph Castain wrote:
Looks l
geneous system and sender and receiver will always
> be using their native format.
> i.e, exactly the same as MPI_Pack and MPI_Unpack.
>
> kindest regards
> Mike
>
> On 12/02/2016, at 9:25 PM, Gilles Gouaillardet wrote:
>
> Michael,
>
> byte swapping only occurs if y
ks for your prompt and most helpful responses.
>
> warmest regards
> MIke
>
> On 12/02/2016, at 7:03 PM, Gilles Gouaillardet wrote:
>
> Michael,
>
> i'd like to correct what i wrote earlier
>
> in heterogeneous clusters, data is sent "as is" (e.g.
onfigure'd with --enable-debug, you would
have run into an assert error (e.g. a crash).
I will work on a fix, but it might take some time before it is ready.
Cheers,
Gilles
On 2/11/2016 6:16 PM, Gilles Gouaillardet wrote:
Michael,
MPI_Pack_external must convert data to big endian, so it can be dump
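(A minimal, self-contained illustration of that point, my own example rather than code from this thread: data packed with the "external32" representation is big endian regardless of the native byte order.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int values[2] = {1, 2};
    char buffer[64];
    MPI_Aint position = 0, size;

    MPI_Init(&argc, &argv);

    /* external32 is a fixed, big-endian representation */
    MPI_Pack_external_size("external32", 2, MPI_INT, &size);
    MPI_Pack_external("external32", values, 2, MPI_INT,
                      buffer, sizeof(buffer), &position);
    printf("packed %ld bytes in external32 (big endian) format\n",
           (long)position);

    MPI_Finalize();
    return 0;
}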
I am now at home, this same problem also exists with
> the Ubuntu 15.10 Open MPI packages
> which surprisingly are still at 1.6.5, same as 14.04.
>
> Again, downloading, building, and using the latest stable version of
> Open MPI solved the problem.
>
> kindest regards
> Mike
>
cked ints are correct.
>>
>> So, this problem still exists in heterogeneous builds with OpenMPI
>> version 1.10.2.
>>
>> kindest regards
>> Mike
>>
>> On 11 February 2016 at 14:48, Gilles Gouaillardet <
>> gilles.gouaillar...@gmail.com
>> >
Michael,
do your two systems have the same endianness?
Do you know how Open MPI was configure'd on both systems?
(Is --enable-heterogeneous enabled or disabled on both systems?)
FWIW, Open MPI 1.6.5 is old now and no longer maintained.
I strongly encourage you to use Open MPI 1.10.2.
Cheers,
Gi
ere is another macro for reassembling the
> jobid from the two pieces. If you use those, we’ll avoid any issues with
> future modifications to the fields.
>
>
> On Feb 5, 2016, at 8:17 PM, Gilles Gouaillardet
> wrote:
>
> Thanks Ralph,
>
> I will implement the second o
tely
> wouldn’t advise it.
>
>
> On Feb 5, 2016, at 7:48 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com
> > wrote:
>
> Thanks George,
>
> I will definitely try that !
>
> back to the initial question, has someone any thoughts on which bit(s) we
>
to cope
> with the MSB during the shifting operations?
>
> George
> On Feb 5, 2016 10:08 AM, "Jeff Squyres (jsquyres)" > wrote:
>
>> On Feb 5, 2016, at 9:26 AM, Gilles Gouaillardet <
>> gilles.gouaillar...@gmail.com
>> > wrote:
>> >
>>
, February 6, 2016, Jeff Squyres (jsquyres)
wrote:
> On Feb 5, 2016, at 9:26 AM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com > wrote:
> >
> > static inline opal_process_name_t ompi_proc_sentinel_to_name (intptr_t
> sentinel)
> > {
> > sentinel >
Folks,
I was unable to start a simple MPI job using the TCP btl on a
heterogeneous cluster with --mca mpi_procs_cutoff 0.
The root cause was that the most significant bit of the jobid was set on
some nodes but not on others.
This is what we have:
from opal/dss/dss_types.h
typedef uint32_t opa
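(A hedged sketch of the kind of encoding under discussion, not the actual OMPI code: a 64-bit sentinel has to carry a 32-bit jobid and a 32-bit vpid plus a flag bit that distinguishes it from a real pointer, so one bit must be stolen somewhere; which bit, and what happens to the jobid MSB during the shifts, is exactly the open question in this thread.)

#include <stdint.h>

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} example_name_t;

/* naive packing: flag in bit 0, vpid (assumed to fit in 31 bits) in
   bits 1..31, jobid in bits 32..63 */
static inline uint64_t example_name_to_sentinel(example_name_t name)
{
    return ((uint64_t)name.jobid << 32) |
           ((uint64_t)(name.vpid & 0x7fffffffu) << 1) | 1u;
}

static inline example_name_t example_sentinel_to_name(uint64_t sentinel)
{
    example_name_t name;
    name.jobid = (uint32_t)(sentinel >> 32);
    name.vpid  = (uint32_t)((sentinel >> 1) & 0x7fffffffu);
    return name;
}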
+1
should we also enable sparse groups by default?
(or at least on master first, and then on v2.x later)
Cheers,
Gilles
On Thursday, February 4, 2016, Joshua Ladd wrote:
> +1
>
>
> On Wed, Feb 3, 2016 at 9:54 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com >
> wrote:
>
>> WHAT: Decrease default va
Hi,
it is difficult to answer such a generic request.
MPI symbols (MPI_Bcast, ...) are defined as weak symbols, so the simplest
option is to redefine them and implement them the way you like. You are
always able to invoke PMPI_Bcast if you want to invoke the Open MPI
implementation.
a more ompi-
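(As a minimal, self-contained sketch of the weak-symbol point above, my own example and not anything from this thread: linking the following translation unit into an application intercepts every MPI_Bcast call, and the Open MPI implementation is still reachable through PMPI_Bcast.)

#include <mpi.h>
#include <stdio.h>

int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
              int root, MPI_Comm comm)
{
    int rank;
    PMPI_Comm_rank(comm, &rank);
    printf("rank %d: intercepted MPI_Bcast of %d element(s)\n", rank, count);
    /* forward to the real implementation via the PMPI entry point */
    return PMPI_Bcast(buffer, count, datatype, root, comm);
}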
Durga,
did you confuse PML and MTL?
Basically, a BTL (Byte Transfer Layer) is used with "primitive"
interconnects that can only send bytes
(e.g. if you need to transmit a tagged message, it is up to you to
send/recv the tag and manually match the tag on the receiver side so you
can put the
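(A hedged sketch of that idea, not Open MPI code: over a transport that can only move bytes, the upper layer prepends its own header carrying the tag and length, and the receiver does the tag matching itself.)

#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t tag;
    uint32_t length;
} msg_header_t;

/* pack header + payload into one contiguous byte buffer for a
   byte-only transport; returns the number of bytes to put on the wire */
size_t pack_message(uint32_t tag, const void *payload, uint32_t length,
                    void *wire_buf)
{
    msg_header_t hdr;
    hdr.tag = tag;
    hdr.length = length;
    memcpy(wire_buf, &hdr, sizeof(hdr));
    memcpy((char *)wire_buf + sizeof(hdr), payload, length);
    return sizeof(hdr) + length;
}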
It
should be sufficient to add the PID of the current process to the
filename to ensure it is unique.
-Nathan
On Tue, Feb 02, 2016 at 09:33:29PM +0900, Gilles Gouaillardet wrote:
Nathan,
the sm osc component uses communicator CID to name the file that will be
used to create sha
sufficient to add the PID of the current process to the
filename to ensure it is unique.
-Nathan
On Tue, Feb 02, 2016 at 09:33:29PM +0900, Gilles Gouaillardet wrote:
Nathan,
the sm osc component uses communicator CID to name the file that will be
used to create shared memory segments
Nathan,
the sm osc component uses communicator CID to name the file that will be
used to create shared memory segments.
If I understand correctly, two different communicators coming from the
same MPI_Comm_split might share the same CID, so the CID (alone) cannot be used
to generate a unique per co
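(A hedged sketch of the fix being discussed, not the actual osc/sm code: derive the backing-file name from the CID *and* the PID of the creating process, so two communicators that happen to reuse a CID cannot collide on the same file. The path and naming scheme below are illustrative only.)

#include <stdio.h>
#include <unistd.h>

void build_segment_name(char *buf, size_t len, unsigned int cid)
{
    /* the CID alone may be reused; adding the creator's PID disambiguates */
    snprintf(buf, len, "/dev/shm/osc_sm_example.%u.%d",
             cid, (int)getpid());
}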
Lisandro,
attached is a patch (master does things differently, so this has to
be a one-off patch anyway);
could you please give it a try?
BTW, how do you get these warnings automatically?
Cheers,
Gilles
On 2/2/2016 12:02 AM, Lisandro Dalcin wrote:
You might argue that the attached te
-Post: devel@lists.open-mpi.org
Date: Tue, 2 Feb 2016 10:26:53 +0900
From: Gilles Gouaillardet
To: Open MPI Users
Simon,
this is a usnic requirement
(mca/common/verbs_usnic to be more specific).
As a workaround (and assuming you do not need usnic support on any of your
nodes) you can
IIRC, there are pipes between orted and the app for IOF (I/O
forwarding): stdin, stdout and stderr.
On Tuesday, January 26, 2016, Gianmario Pozzi wrote:
> Thank you, Ralph.
>
> What about ORTE_DAEMON_MESSAGE_LOCAL_PROCS case into orte_comm.c? I see it
> calls orte_odls.deliver_message() and the com
Thanks Paul,
it seems a "git add" was missed in the upstream PMIx repo;
I will make a PR for that.
Cheers,
Gilles
On 1/26/2016 9:50 AM, Paul Hargrove wrote:
Using last night's master tarball I am seeing the following at
configure time:
[path-to]/openmpi-dev-3397-g70787d1/opal/mca/pmix/pmix12
enough to take an optimized path when
> doing a loopback as opposed to inter-node communication.
>
>
> On Mon, Jan 25, 2016 at 4:28 AM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com
> > wrote:
>
>> Federico,
>>
>> I did not expect 0% degradation,
MPI_Alltoall, MPI_Gather, MPI_Scatter, MPI_Scan, MPI_Send/Recv
>
> Cheers,
> Federico
> __
> Federico Reghenzani
> M.Eng. Student @ Politecnico di Milano
> Computer Science and Engineering
>
>
>
> 2016-01-25 12:17 GMT+01:00 Gilles Gouaillardet <
> gilles.goua
Federico,
unless you already took care of that, I would guess all 16 orted daemons
bound their children MPI tasks to socket 0.
Can you try
mpirun --bind-to none ...
BTW, is your benchmark application CPU bound? memory bound? MPI bound?
Cheers,
Gilles
On Monday, January 25, 2016, Federico Reghenzani
Folks,
there was a question about mtt on the mtt mailing list
http://www.open-mpi.org/community/lists/mtt-users/2016/01/0840.php
After a few emails (some offline), it seems that was a configuration issue:
the user is running PBSPro and it seems Open MPI was not configured with
the tm module
(e.
Ralph,
I noticed a file descriptor leak with current master
that can be easily reproduced with the loop_spawn test from the
ibm/dynamic test suite:
mpirun -np 1 ./loop_spawn
After a few seconds, you can see the leak via
lsof -p $(pidof mpirun)
There is a bunch of files such as
mpirun 20791
This is now fixed in master
Thanks for the report !
Gilles
On Saturday, January 9, 2016, Shamis, Pavel wrote:
> Hey Folks
>
> OpenMPI master appears to be broken for a non-debug build:
> ---
> make[2]: Entering directory `ompi/build/opal'
> CC runtime/opal_progress.lo
> ../../opal/runt
;
>> Those revisions listed above that are new to this repository have
>> not appeared on any other notification email; so we list those
>> revisions in full, below.
>>
>> - Log -
>> https://github.co
I did forget that indeed ... and I just pushed it.
Cheers,
Gilles
On 1/7/2016 12:33 AM, Ralph Castain wrote:
Hmmm…I don’t see a second commit message anywhere. Did you perhaps
forget to push it?
Thanks for the explanation!
Ralph
On Jan 6, 2016, at 2:30 AM, Gilles Gouaillardet
https://github.com/open-mpi/ompi/commit/213b2abde47cf02ba3152a301d3ec0ffeec54438
> >
> > commit 213b2abde47cf02ba3152a301d3ec0ffeec54438
> > Author: Gilles Gouaillardet >
> > Date: Wed Jan 6 16:21:13 2016 +0900
> >
> >dpm: correctly handle procs_cutoff in
Paul,
generally speaking, when using Mellanox stuff (MXM, hcoll, FCA),
these libraries must be accessible, either via LD_LIBRARY_PATH or via
ld.so.conf.
I do not know the config of these clusters, but you might have to use the
Mellanox libraries, which could be in a non-standard location.
Cheers,
Gilles
Marco,
If I understand correctly, PMIx is mandatory, regardless of whether you run
on a laptop or an exascale system.
Cheers,
Gilles
On Thursday, December 24, 2015, Marco Atzeri wrote:
> On 24/12/2015 06:10, Gilles Gouaillardet wrote:
>
>> Marco,
>>
>> Thanks for the patch,
Marco,
Thanks for the patch; I will apply the changes related to the missing
include files to master and PR them to v2.x.
On Linux, libpmix.so does not depend on libopen-pal.
That being said, libpmix.so has undefined symbols related to hwloc and
libevent, and these symbols are defined in libopen-pa
and 2.x branch all work that way too.
On Dec 22, 2015, at 12:49 AM, Gilles Gouaillardet wrote:
Ralph,
I (re)discovered an old and odd behaviour in v1.10, which was discussed in
https://github.com/open-mpi/ompi-release/pull/664
when running
mpirun --host xxx ...
mpirun v1.10 assumes one slo
at the master and 2.x branch all work that way too.
>
>
> > On Dec 22, 2015, at 12:49 AM, Gilles Gouaillardet > wrote:
> >
> > Ralph,
> >
> > i (re)discovered an old and odd behaviour in v1.10, which was discussed
> in https://github.com/open-mpi/ompi-release
Ralph,
I (re)discovered an old and odd behaviour in v1.10, which was discussed
in https://github.com/open-mpi/ompi-release/pull/664
when running
mpirun --host xxx ...
mpirun v1.10 assumes one slot per host.
Consequently, on my VM with 4 cores
mpirun -np 2 ./helloworld_mpi
works fine
but
mpiru
Thanks Paul !
I will review this and make the PRs.
Cheers,
Gilles
On 12/20/2015 9:44 AM, Paul Hargrove wrote:
On my Solaris 11.2 system, alloca() is a macro defined in alloca.h.
So, the following is needed to avoid link failures:
--- ompi/mca/pml/cm/pml_cm.h~ Sat Dec 19 16:25:54 2015
+++ om
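(A hedged sketch of the kind of fix this patch implies, not the actual diff: make sure alloca() gets its macro/declaration on platforms such as Solaris, where it lives in <alloca.h>, while still building where it is declared elsewhere. HAVE_ALLOCA_H is the usual autoconf symbol; whether Open MPI uses exactly this guard is my assumption.)

#include <stddef.h>
#if defined(HAVE_ALLOCA_H)
#include <alloca.h>
#else
#include <stdlib.h>   /* some platforms declare alloca() here */
#endif

static inline void example_use_alloca(size_t n)
{
    char *tmp = alloca(n);   /* stack allocation, released on return */
    tmp[0] = '\0';
}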
,
Gilles
On 12/22/2015 7:38 AM, Paul Hargrove wrote:
Gilles,
It looked to me like PR 857 includes this fix.
Are you saying you are going to spilt if off from that one (to speed
up the review)?
-Paul
On Mon, Dec 21, 2015 at 2:26 PM, Gilles Gouaillardet
mailto:gilles.gouaillar...@gmail.com
Paul and Orion,
the fix has been merged into v1.10.
I will issue a separate PR for v2.x since this issue is impacting quite a
lot of Open MPI users.
Sorry for the inconvenience,
Gilles
On Tuesday, December 22, 2015, Paul Hargrove wrote:
> Orion,
>
> The FCFLAGS_save issue has been fixed in mast
ors are still shared.
>
> BR Justin
>
> On 15. 12. 2015 14:55, Gilles Gouaillardet wrote:
>
> Justin,
>
> at first glance, vader should be symmetric (e.g.
> call opal_shmem_segment_dettach() instead of munmap()
> Nathan, can you please comment ?
>
> using tid instea
>were really small and elegant). So while there are no real processes, new
>binary / ELF file is loaded at a different address than the rest of the OS - so it
>has separate global variables, and a separate environ too. Other resources like
>file descriptors are still shared.
>
>BR Just
Justin,
at first glance, vader should be symmetric (e.g.
call opal_shmem_segment_dettach() instead of munmap());
Nathan, can you please comment?
Using the tid instead of the pid should also do the trick.
That being said, a more elegant approach would be to create a new module in
the shmem framework;
basica
Ralph,
IIRC, we are using a slightly patched version of libevent; is this correct?
I guess removing the internal versions is the way to go.
That being said, could/should we do this one step at a time?
I mean, a first step could be to update the configure default option to
configure --with-hwloc=
Federico,
you also need to update orte/tools/Makefile.am
Cheers,
Gilles
On Wednesday, December 9, 2015, Federico Reghenzani <
federico1.reghenz...@mail.polimi.it> wrote:
> Hi!
>
> I'm trying to add a new tool under /orte/tools/, I've followed as example
> the orte-ps and created my orted-resto
Folks,
as discussed off-list, and for the record,
https://github.com/open-mpi/ompi-release/commit/8d658a734f352dfa104d1794330f44e3c52c4a76
must also be applied in order to fix v1.8
Cheers,
Gilles
On Wed, Dec 9, 2015 at 2:08 AM, Baldassari Caroline wrote:
> Gilles, Chris,
>
> Thank you for yo
Jeff,
the following program does not compile :
$ mpifort -c mpi_displacement_current_usempi.f90
mpi_displacement_current_usempi.f90:6:64:
& ,MPI_DATATYPE_NULL , "native", MPI_INFO_NULL, ierr )
1
Error: There is no specifi
However, what Howard is doing helps resolve
> it by breaking out the Jenkins runs into categories. So instead of one
> massive test session, setup one Jenkins server for each category. Then we
> can set the specific tags according to the test category.
>
> Make sense?
> Ralph
>
>
Ralph and all,
My 0.02 US$:
we are kind of limited by the GitHub API
https://developer.github.com/v3/repos/statuses/
Basically, a status is pending, success, error or failure, plus a string.
A possible workaround is to have Jenkins set labels on the PR.
If only valgrind fails, the status could be
Thanks,
>
> Howard
>
> Von meinem iPhone gesendet
>
> > Am 10.11.2015 um 19:57 schrieb Gilles Gouaillardet >:
> >
> > Nathan,
> >
> > a simple MPI_Win_create test hangs on my non uniform cluster
> (ibm/onesided/c_create)
> >
> > one node ha
Nathan,
a simple MPI_Win_create test hangs on my non-uniform cluster
(ibm/onesided/c_create):
one node has an IB card but not the other one.
The node with the IB card selects the rdma osc module, but the other node
selects the pt2pt module,
and then it hangs because both ends do not try to initi
Jeff,
OK, will do
Cheers,
Gilles
On Saturday, October 31, 2015, Jeff Squyres (jsquyres)
wrote:
> On Oct 30, 2015, at 12:09 PM, Barrett, Brian > wrote:
> >
> > However, I do like Gilles' suggestion to make autogen.pl be a little
> smarter. If I recall correctly (and it's been a couple years
hat the pmix_server.h
includes pmix/pmix_common.h and not pmix_common.h. If you want to
figure this one out, that's a good starting point. BTW, why do we have 3
headers with the same name (it's so confusing)?
George.
On Wed, Oct 28, 2015 at 1:08 AM, Gilles Gouaillardet
mailto:gil...@rist.or.jp&
wonder how your compiler gets to know the definition of
the PMIX_ERR_SILENT without the pmix_common.h. I just pushed a fix.
George.
On Wed, Oct 28, 2015 at 12:43 AM, Gilles Gouaillardet
mailto:gil...@rist.or.jp>> wrote:
George,
i am unable to reproduce the issue.
if build
George,
I am unable to reproduce the issue.
If the build still breaks for you, could you send me your configure command
line?
Cheers,
Gilles
On 10/28/2015 1:04 PM, Gilles Gouaillardet wrote:
George,
PMIX_ERR_SILENT is defined in
opal/mca/pmix/pmix1xx/pmix/include/pmix/pmix_common.h.in
i
George,
PMIX_ERR_SILENT is defined in
opal/mca/pmix/pmix1xx/pmix/include/pmix/pmix_common.h.in
I'll have a look at it now.
Cheers,
Gilles
On 10/28/2015 12:02 PM, George Bosilca wrote:
We get a nice compiler complaint:
../../../../../../ompi/opal/mca/pmix/pmix1xx/pmix/src/server/pmix_s
FWIW
before Jeff fixed that, the build was successful on my RHEL7 box
(stdio.h is included from verbs_exp.h, which is included from verbs.h)
but failed on my RHEL6 box
(verbs.h does *not* include stdio.h),
so there was some room for Jenkins not to fail.
Cheers,
Gilles
On 10/27/2015 9:17 PM, Jeff Squy
Jeff and all,
my 0.02 US$ ...
- autogen.pl was recently used with v1.10 on a PowerPC Little Endian arch
(that was mandatory since the libtool we use to generate the v1.10 series does
not yet support PPC LE)
- if we remove autogen.pl (from the tarball), should we also remove
configure.ac?
and w
Federico,
in order to build a single component, just cd into the component directory (in
the build directory if you are using VPATH) and run make (install).
Components and frameworks depend on other frameworks, so it is generally
safer to run make from the top build directory.
Cheers,
Gilles
On Tuesday, O
?
Cheers,
Federico
__
Federico Reghenzani
M.Eng. Student @ Politecnico di Milano
Computer Science and Engineering
2015-10-23 11:45 GMT+02:00 Gilles Gouaillardet
mailto:gilles.gouaillar...@gmail.com>>:
Gianmario,
IIRC, there is one pipe between orted and each child's stderr.
George,
Then you cannot use https, otherwise the certificate check will fail.
Note that if you have a proxy, you can tunnel to the proxy and that should be fine.
The main drawback is that the ssh connection must be active when contacting IU, and
if a batch manager is used, no one knows when that will be needed.
al ip, you
could reuse existing infrastructure at least to migrate orted and its tcp/ip
connections
Cheers,
Gilles
Federico Reghenzani wrote:
>Hi Adrian and Gilles,
>
>
>first of all thank you for your responses. I'm working with Gianmario on this
>ambitious project.
>
>
Howard,
that has already been raised in
http://www.open-mpi.org/community/lists/mtt-users/2014/10/0820.php
In the end, Christoph claimed he could achieve that with mtt-relay
(but provided no details on how ...).
You might want to check the full thread and/or ask Christoph directly.
Ralph,
IIRC
Ralph,
I made PR #711 https://github.com/open-mpi/ompi-release/pull/711 to fix
this issue.
Cheers,
Gilles
On 10/23/2015 7:39 AM, Gilles Gouaillardet wrote:
Ralph,
these are MPI-3 functions that have not yet landed in the v1.10 series.
Only the MPI_Aint arithmetic functions landed in v1.10, so
Ralph,
these are MPI-3 functions that have not yet landed in the v1.10 series.
Only the MPI_Aint arithmetic functions landed in v1.10, so it seems configure
is confused
(e.g. this test was previously not built, and now it is ...).
I'll try to backport the missing functions.
Cheers,
Gilles
On Friday
Gianmario,
there was C/R (checkpoint/restart) support in the v1.6 series, but it has been removed.
The current trend is to do application-level checkpointing
(much more efficient and with a much smaller checkpoint file size).
IIRC, OMPI took care of closing/restoring all communications, and a third
party checkpoint was require
Scott and all,
two BTLs are optimized for (and only work for) intra-node communications: sm
and vader.
By "sm" I am not sure whether you mean the sm btl, or either/both of the sm and vader btls.
From a user point of view, and to disambiguate this, maybe we should use
the term "shm"
(which means the sm and/or vader btl for
Andrej,
a load average of 700 is very curious.
I guess you already made sure the load average is zero when the system is idle ...
Are you running a hybrid app (e.g. MPI + OpenMP)?
One possible explanation is that you run 48 MPI tasks and each task has 48
OpenMP threads, and that kills performance.
wh
Andrej,
by "running on the head node", shall I understand you mean
"running the mpirun command *and* all MPI tasks on the head node"?
By "running on the compute node", shall I understand you mean
"running mpirun on the compute node *and* all MPI tasks on the *same*
compute node"?
Or do you mean
>> wrote:
>>
>> Hi Gilles,
>>
>> as for that, recompiling OpenMPI works, but causes no change here.
>>
>> -Tobias
>>
>> --
>> Dr.-Ing. Tobias Hilbrich
>> Research Assistant
>>
>> Technische Universitaet Dresden, Germany
>
Tobias,
BTW, did you recompile OMPI with this Xcode?
IIRC, we do similar comparisons in OMPI itself.
Cheers,
Gilles
Tobias Hilbrich wrote:
>Hi all,
>
>a wonderful puzzle for the OSX folks in your team (Reproducer attached):
>
>Attached source file builds with Xcode 7.0.0, but fails since the r
Tobias,
FWIW, MPI_Comm_compare can be used to compare communicators.
Hopefully, this is also compiler-friendly.
Cheers,
Gilles
On Tuesday, October 20, 2015, Tobias Hilbrich
wrote:
> Hi all,
>
> a wonderful puzzle for the OSX folks in your team (Reproducer attached):
>
> Attached source file b
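(A minimal illustration of the MPI_Comm_compare suggestion above, my own example and not Tobias' reproducer: compare communicators through the API instead of comparing handles directly.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int result;

    MPI_Init(&argc, &argv);
    MPI_Comm_compare(MPI_COMM_WORLD, MPI_COMM_WORLD, &result);
    if (MPI_IDENT == result) {
        printf("the two handles refer to the same communicator\n");
    }
    MPI_Finalize();
    return 0;
}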
amatic mechanism to launch a process in your containers? I.e., can
> mpirun programmatically launch MPI processes in OSv containers?
>
>
>
> > On Oct 16, 2015, at 6:48 AM, Justin Cinkelj > wrote:
> >
> > Thank you. At least its clear now that for the immediate pr
Justin,
IOF stands for Input/Output (aka I/O) Forwarding.
Here is a very high-level overview of a quite simple case:
on host A, you run
mpirun -host B,C -np 2 a.out
without any batch manager and with the TCP interconnect.
First, mpirun will fork&exec
ssh B orted ...
ssh C orted ...
The orted daemons will
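(For illustration only, not the actual orted IOF code: a minimal sketch of the pipe-based forwarding idea, where the parent plays the orted role, the child plays the MPI task, and the child's stdout is read from a pipe and forwarded.)

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (0 == pid) {                     /* child: the "MPI task" */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);    /* stdout now goes into the pipe */
        execlp("echo", "echo", "hello from the child", (char *)NULL);
        perror("execlp");
        exit(1);
    }

    close(fds[1]);                      /* parent: the "orted" side */
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, (size_t)n, stdout);   /* forward the child's output */
    }
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}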
one (this is the one used) is set by
opal/mca/pmix/pmix1xx/configure.m4)
Cheers,
Gilles
On 10/14/2015 3:37 PM, Gilles Gouaillardet wrote:
Folks,
i was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my
configure command line.
here is what happens :
opal/mca/pmix/pmix1xx/configure.m4
Folks,
I was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my
configure command line.
Here is what happens:
opal/mca/pmix/pmix1xx/configure.m4 sets the CPPFLAGS environment
variable with -I/tmp and the include paths for hwloc and libevent;
then opal/mca/pmix/pmix1xx/pmix/configure is invoked
be great if you can point out some main OMPI files and
functions that are involved in the process.
You might want to step through the selection process with a debugger
to see what happens. Set a breakpoint on mca_coll_base_comm_select()
and step through from there.
> Dahai
>
>
>
First, you can check the priorities of the various coll modules
with ompi_info:
$ ompi_info --all | grep \"coll_ | grep priority
    MCA coll: parameter "coll_basic_priority" (current value: "10", data source: default, level: 9 dev/all, type: int)
    MCA coll: parameter
Jeff,
the minor distinction includes the fact that the web archive does not
include email addresses,
but the mbox does.
I am fine with handing them the mbox, with a note asking them not to
redistribute it and to keep it in a secure place, because no one likes being
spammed.
Cheers,
Gilles
On Saturday
Paul,
the latest master nightly snapshot does include the fix, and I made PRs
for v2.x and v1.10.
Cheers,
Gilles
On 9/28/2015 6:29 PM, Gilles Gouaillardet wrote:
Thanks Brice,
I will do the PRs for the various OMPI branches starting tomorrow.
Cheers,
Gilles
Brice Goglin wrote:
Sorry, I
il client to avoid missing hwloc-related things
>among OMPI mails.
>
>Brice
>
>
>
>
>Le 28/09/2015 06:23, Gilles Gouaillardet a écrit :
>
>Paul and Brice,
>
>the error message is displayed by libpciaccess when hwloc invokes
>pci_system_init
>
>on Solaris
what the doctor ordered!
On Sep 23, 2015, at 5:45 PM, Gilles Gouaillardet
mailto:gil...@rist.or.jp>> wrote:
Ralph,
the root cause is that
getsockopt(..., SOL_SOCKET, SO_RCVTIMEO, ...)
fails with errno ENOPROTOOPT on Solaris 11.2;
the attached patch is a proof of conc
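(A hedged sketch of the failure mode described above, not the actual proof-of-concept patch: query SO_RCVTIMEO and treat ENOPROTOOPT, which Solaris 11.2 reportedly returns, as "option not supported" rather than a fatal error.)

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

int query_rcvtimeo(int sd)
{
    struct timeval tv;
    socklen_t len = sizeof(tv);

    if (getsockopt(sd, SOL_SOCKET, SO_RCVTIMEO, &tv, &len) != 0) {
        if (ENOPROTOOPT == errno) {
            /* option not supported on this platform; fall back gracefully */
            return 0;
        }
        fprintf(stderr, "getsockopt(SO_RCVTIMEO) failed: %s\n",
                strerror(errno));
        return -1;
    }
    return 0;
}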
mpirun a
level of 10 for the pmix_base_verbose param? This output
isn’t what I would have expected from that level - it
looks more like the verbosity was set to 5, and so the
error number isn’t printed.
Thanks
Ralph