Siegmar,
how did you configure openmpi? which java version did you use?
i just found a regression and you currently have to explicitly add
CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT
to your configure command line
if you want to debug this issue (i cannot reproduce it on a Solaris 11 x86
Can you also check there is no CPU binding issue (several MPI tasks and/or
OpenMP threads, if any, bound to the same core and doing time sharing)?
A simple way to check that is to log into a compute node, run top and then
press 1 f j
If some cores have higher usage than others, you are likely
Hi Siegmar,
You might need to configure with --enable-debug and add -g -O0 to your CFLAGS
and LDFLAGS
Then once you attach with gdb, you have to find the thread that is polling:
thread 1
bt
thread 2
bt
and so on until you find the right thread
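with a recent gdb you can also run
thread apply all bt
to dump all the stacks at once; once you are in the right frame, you should
be able to run
set var _dbg = 0
to let the task leave the polling loop.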
If _dbg is a local variable, you need to select the
It looks like we faced a similar issue :
opal_process_name_t is 64-bit aligned whereas orte_process_name_t is 32-bit
aligned. If you run on an alignment-sensitive CPU such as SPARC and you are
not lucky (so to speak) you can run into this issue.
I will make a patch for this shortly.
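a hypothetical illustration of the failure mode (not the actual Open MPI code):
uint32_t buf[4];
uint64_t *p = (uint64_t *)&buf[1]; /* address is only 4-byte aligned */
uint64_t v = *p; /* raises SIGBUS on alignment-sensitive CPUs like SPARC */
x86 tolerates such accesses silently, which is why the crash only shows up on SPARC.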
Ralph Castain
variable declaration
only.
Any thoughts?
Ralph Castain <r...@open-mpi.org> wrote:
>Will PR#249 solve it? If so, we should just go with it as I suspect that is
>the long-term solution.
>
>> On Oct 26, 2014, at 4:25 PM, Gilles Gouaillardet
>> <gilles.gouaillar...@gma
es to your branch, I can pass you a patch with my suggested
> alterations.
>
>> On Oct 26, 2014, at 5:55 PM, Gilles Gouaillardet
>> <gilles.gouaillar...@gmail.com> wrote:
>>
>> No :-(
>> I need some extra work to stop declaring orte_process_name_t an
>>>>>> while
>>> (_dbg) poll(NULL, 0, 1);
>>>>>> tyr java 400 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i _dbg
>>>>>> tyr java 401 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i
>>>>>> JNI_
Hi,
i tested on a RedHat 6-like Linux server and could not observe any
memory leak.
BTW, are you running 32 or 64 bit Cygwin? and what is your configure
command line?
Thanks,
Gilles
On 2014/10/27 18:26, Marco Atzeri wrote:
> On 10/27/2014 8:30 AM, maxinator333 wrote:
>> Hello,
>>
>> I
Thanks Marco,
I could reproduce the issue even with one node sending/receiving to itself.
I will investigate this tomorrow
Cheers,
Gilles
Marco Atzeri <marco.atz...@gmail.com> wrote:
>
>
>On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
>>
>> i teste
Michael,
Could you please run
mpirun -np 1 df -h
mpirun -np 1 df -hi
on both compute and login nodes
Thanks
Gilles
michael.rach...@dlr.de wrote:
>Dear developers of OPENMPI,
>
>We have now installed and tested the bugfixed OPENMPI Nightly Tarball of
>2014-10-24
Michael,
The available space must be greater than the requested size + 5%
From the logs, the error message makes sense to me: there is not enough space
in /tmp.
Since the compute nodes have a lot of memory, you might want to try using
/dev/shm instead of /tmp for the backing files
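for example, and assuming your build honors the orte_tmpdir_base parameter,
something like
mpirun --mca orte_tmpdir_base /dev/shm -np 16 ./your_app
should relocate the session directory, and the backing files with it.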
Cheers,
Ralph,
On 2014/10/28 0:46, Ralph Castain wrote:
> Actually, I propose to also remove that issue. Simple enough to use a
> hash_table_32 to handle the jobids, and let that point to a
> hash_table_32 of vpids. Since we rarely have more than one jobid
> anyway, the memory overhead actually decreases
Marco,
here is attached a patch that fixes the issue
/* i could not yet find why this does not occur on Linux ... */
could you please give it a try ?
Cheers,
Gilles
On 2014/10/27 18:45, Marco Atzeri wrote:
>
>
> On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
Hi Siegmar,
From the JVM logs, there is an alignment error in native_get_attr, but I could
not find it by reading the source code.
Could you please do
ulimit -c unlimited
mpiexec ...
and then
gdb /bin/java core
and run bt on all threads until you get a line number in native_get_attr
Thanks
Thanks Marco,
pthread_mutex_init calls calloc under Cygwin but does not allocate memory under
Linux, so not invoking pthread_mutex_destroy causes a memory leak only under
Cygwin.
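a minimal sketch of the required pairing (hypothetical mutex):
pthread_mutex_t m;
pthread_mutex_init(&m, NULL);  /* may allocate memory, e.g. via calloc on Cygwin */
/* ... use the mutex ... */
pthread_mutex_destroy(&m);     /* releases whatever init allocated */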
Gilles
Marco Atzeri <marco.atz...@gmail.com> wrote:
>On 10/28/2014 12:04 PM, Gilles Gouaillardet wrote:
Yep, will do today
Ralph Castain <r...@open-mpi.org> wrote:
>Gilles: will you be committing this to trunk and PR to 1.8?
>
>
>> On Oct 28, 2014, at 11:05 AM, Marco Atzeri <marco.atz...@gmail.com> wrote:
>>
>> On 10/28/2014 4:41 PM, Gill
1 (LWP 1)]
>>> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to
>>> satisfy query
>>> (gdb) bt
>>> #0 0x7f6173d0 in rtld_db_dlactivity () from
>>> /usr/lib/sparcv9/ld.so.1
>>> #1 0x7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
>>&g
Michael,
could you please share your test program so we can investigate it ?
Cheers,
Gilles
On 2014/10/31 18:53, michael.rach...@dlr.de wrote:
> Dear developers of OPENMPI,
>
> There remains a hanging observed in MPI_WIN_ALLOCATE_SHARED.
>
> But first:
> Thank you for your advices to employ
ved with our large CFD-code.
>
> Are OPENMPI-developers nevertheless interested in that testprogram?
>
> Greetings
> Michael
>
>
>
>
>
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
> Gouaillar
Michael,
the root cause is that openmpi was compiled with the gnu compiler, not with
the intel compilers.
fortran modules are not binary compatible, so openmpi and your
application must be compiled
with the same compiler.
Cheers,
Gilles
On 2014/11/05 18:25, michael.rach...@dlr.de wrote:
> Dear OPENMPI
an
>mpi.mod file, because the user can look inside the module
>and can directly see if something is missing or possibly wrongly coded.
>
>Greetings
> Michael Rachner
>
>
>-----Original Message-----
>From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gill
Brock,
Is your post related to ib0/eoib0 being used at all, or being used with load
balancing?
let me clarify this:
--mca btl ^openib
disables the openib btl, aka *native* InfiniBand.
This does not disable ib0 and eoib0, which are handled by the tcp btl.
As you already figured out,
Ralph,
IIRC there is load balancing across all the btls, for example
between vader and scif.
So load balancing between ib0 and eoib0 is just a particular case that might
not necessarily be handled by the tcp btl.
Cheers,
Gilles
Ralph Castain wrote:
>OMPI discovers all
Hi,
IIRC there were some bug fixes between 1.8.1 and 1.8.2 in order to really
use all the published interfaces.
by any chance, are you running a firewall on your head node?
one possible explanation is the compute node tries to access the public
interface of the head node, and packets get
Could you please send the output of netstat -nr on both the head and compute
nodes?
no problem obfuscating the IP of the head node, I am only interested in
netmasks and routes.
Ralph Castain wrote:
>
>> On Nov 12, 2014, at 2:45 PM, Reuti wrote:
>>
>>
Hi,
it seems you messed up the command line
could you try
$ mpirun --mca btl ^openib --host compute-01-01,compute-01-06 ring_c
can you also try to run mpirun from a compute node instead of the head
node ?
Cheers,
Gilles
On 2014/11/13 16:07, Syed Ahsan Ali wrote:
> Here is what I see when
9
> [compute-01-01.private.dns.zone][[11064,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
>
>
> On Thu, Nov 13, 2014 at 12:11 PM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wrote:
.0 b) TX bytes:0 (0.0 b)
>
>
>
> So the point is why mpirun is following the ib path while it has
> been disabled. Possible solutions?
>
> On Thu, Nov 13, 2014 at 12:32 PM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wrote:
>> mpirun complains about the
>>> ib0 Link encap:InfiniBand HWaddr
>>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>>> inet addr:192.168.108.14 Bcast:192.168.108.255
>>> Mask:255.255.255.0
>>> UP BROADCAST MULTICAST MTU:65520 Metric:1
>>
.0 255.0.0.0 U 0 0 0 eth0
> 0.0.0.0 10.0.0.1 0.0.0.0 UG 0 0 0 eth0
> [pmdtest@compute-01-06 ~]$
>
>
> On Thu, Nov 13, 2014 at 12:56 PM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wro
My 0.02 US$
first, the root cause of the problem was that a default gateway was
configured on the node,
but this gateway was unreachable.
imho, this is an incorrect system setting that can lead to unpredictable
results:
- openmpi 1.8.1 works (you are lucky, good for you)
- openmpi 1.8.3 fails (no luck
Siegmar,
This is correct, --enable-heterogeneous is now fixed in the trunk.
Please also note that -D_REENTRANT is now automatically set on Solaris.
Cheers
Gilles
Siegmar Gross wrote:
>Hi Jeff, hi Ralph,
>
>> This issue should now be fixed, too.
>
>Yes, it
Hi John,
do you call MPI_Init() or MPI_Init_thread(MPI_THREAD_MULTIPLE)?
does your program call MPI anywhere from an OpenMP region?
does your program call MPI only within an !$OMP MASTER section?
or does your program not invoke MPI at all from any OpenMP region?
can you reproduce this
Daniel,
you can run
$ ompi_info --parseable --all | grep _algorithm: | grep enumerator
that will give you the list of supported algorithms for the collectives;
here is a sample output:
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:0:ignore
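for example, to force one given algorithm (if I remember correctly, the dynamic
rules flag must be set for the per-collective parameter to take effect):
$ mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_allreduce_algorithm 1 -np 4 ./a.out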
Hi Ghislain,
that sounds like a bug in MPI_Dist_graph_create :-(
you can use MPI_Dist_graph_create_adjacent instead :
MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, [0],
[0],
degrees, [0], [0], info,
rankReordering, );
it does not crash and as far as i
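for reference, the C prototype of this function is:
int MPI_Dist_graph_create_adjacent(MPI_Comm comm_old, int indegree,
        const int sources[], const int sourceweights[],
        int outdegree, const int destinations[], const int destweights[],
        MPI_Info info, int reorder, MPI_Comm *comm_dist_graph);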
t regards,
> Ghislain
>
> 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org
>> :
>> Hi Ghislain,
>>
>> that sounds like a bug in MPI_Dist_graph_create :-(
>>
>> you can use MPI_Dist_graph_create_adjacent instead :
>
based on prior knowledge.
>
> George.
>
>
> On Fri, Nov 21, 2014 at 3:48 AM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
>> Ghislain,
>>
>> i can confirm there is a bug in mca_topo_base_dist_graph_distribute
>>
>> FYI a proof of
It could be because configure did not find the knem headers, hence knem is
not supported and this mca parameter is read-only.
My 0.02 US$ ...
Dave Love wrote:
>Why can't I set parameters like this (not the only one) with 1.8.3?
>
> WARNING: A user-supplied value
Folks,
FWIW, I observe a similar behaviour on my system.
imho, the root cause is OFED has been upgraded from a (quite) older
version to the latest 3.12 version.
here is the relevant part of the code (btl_openib.c from the master):
static uint64_t calculate_max_reg (void)
{
if (0 ==
Luca,
your email mentions openmpi 1.6.5
but gdb output points to openmpi 1.8.1.
could the root cause be a mix of versions that does not occur with the root
account?
which openmpi version are you expecting ?
you can run
pmap
when your binary is running and/or under gdb to confirm the openmpi
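for example, with a hypothetical pid:
pmap 12345 | grep -i mpi
will show which libmpi is actually mapped into the process.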
the max locked memory size should be
>> unlimited.
>> Check /etc/security/limits.conf and "ulimit -a".
>>
>> I hope this helps,
>> Gus Correa
>>
>> On 12/10/2014 08:28 AM, Gilles Gouaillardet wrote:
>>> Luca,
>>>
>>> your
Alex,
can you try something like
call system("sh -c 'env -i /.../mpirun -np 2 /.../app_name'")
(env -i starts with an empty environment)
that being said, you might need to set a few environment variables
manually :
env -i PATH=/bin ...
and that being said, this "trick" could just be a bad idea:
ize
> getting passed over a job scheduler with this approach might not work at
> all...
>
> I have looked at the MPI_Comm_spawn call but I failed to understand how it
> could help here. For instance, can I use it to launch an mpi app with the
> option "-n 5&quo
MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status)
> enddo
>
> I do get 15 instances of the 'hello_world' app running: 5 for each parent
> rank 1, 2 and 3.
>
> Thanks a lot, Gilles.
>
> Best regards,
>
> Alex
>
>
>
>
> 2014-12-12 1:32 GMT-02:00 Gilles Goua
just
>a front end to use those, but since we have a lot of data to process
>
>it also benefits from a parallel environment.
>
>
>Alex
>
>
>
>
>2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
>
>Alex,
>
>just to m
call back to the scheduler
>queue. How would I track each one for their completion?
>
>Alex
>
>
>2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet <gilles.gouaillar...@gmail.com>:
>
>Alex,
>
>You need MPI_Comm_disconnect at least.
>I am not sure if
n running these codes in serial mode. No need to say that
>we could do a lot better if they could be executed in parallel.
>
>I am not familiar with DRMAA but it seems to be the right choice to deal with
>job schedulers as it covers the ones I am interested in (pbs/torque and
>loadl
Eric,
can you make your test case (source + input file + howto) available so I
can try to reproduce and fix this?
Based on the stack trace, I assume this is a complete end user application.
have you tried/been able to reproduce the same kind of crash with a
trimmed test program?
BTW, what
Eric,
I checked the source code (v1.8) and the limit for the shared_fp_fname
is 256 characters (hard coded).
I am now checking whether the overflow is correctly detected (that could
explain the one-byte overflow reported by valgrind).
Cheers,
Gilles
On 2014/12/15 11:52, Eric Chamberland wrote:
> Hi again,
>
>
Hi Siegmar,
a similar issue was reported in mpich with xlf compilers :
http://trac.mpich.org/projects/mpich/ticket/2144
They concluded this is a compiler issue (e.g. the compiler does not
implement TS 29113 subclause 8.1)
Jeff,
I made PR 315: https://github.com/open-mpi/ompi/pull/315
f08
Eric,
thanks for the simple test program.
I think I see what is going wrong and I will make some changes to avoid
the memory overflow.
that being said, there is a hard coded limit of 256 characters, and your
path is longer than 300 characters.
bottom line, and even if there is no more memory
.
Cheers,
Gilles
On 2014/12/16 12:43, Gilles Gouaillardet wrote:
> Eric,
>
> thanks for the simple test program.
>
> i think i see what is going wrong and i will make some changes to avoid
> the memory overflow.
>
> that being said, there is a hard coded limit of 256 cha
>
>So far, I could not find anything about how to set a stdin file for a
>spawnee process.
>Specifying it in an app context file doesn't seem to work. Can it be done?
>Maybe through
>an MCA parameter?
>
>
>Alex
>
>
>
>
>
>
>2014-12-15 2:43 GM
FWIW
I faced a similar issue on my Linux VirtualBox.
My shared folder is a vboxsf filesystem, but statfs returns the NFS magic id.
That causes some mess and the test fails.
At this stage I cannot tell whether I should blame the glibc, the kernel, a
VirtualBox driver or myself.
Cheers,
Gilles
Siegmar,
could you please give the attached patch a try?
/* and keep in mind this is just a workaround that happens to work */
Cheers,
Gilles
On 2014/12/22 22:48, Siegmar Gross wrote:
> Hi,
>
> today I installed openmpi-dev-602-g82c02b4 on my machines (Solaris 10 Sparc,
> Solaris 10 x86_64,
Kawashima-san,
i'd rather consider this as a bug in the README (!)
heterogeneous support has been broken for some time, but it was
eventually fixed.
truth is there are *very* limited resources (both human and hardware)
maintaining heterogeneous
support, but that does not mean heterogeneous
Where does the error occur?
MPI_Init?
MPI_Finalize?
In between?
In the first case, the bug is likely a mishandled error case,
which means Open MPI is unlikely to be the root cause of the crash.
Did you check InfiniBand is up and running on your cluster?
Cheers,
Gilles
Saliya Ekanayake
FWIW ompi does not yet support XRC with OFED 3.12.
Cheers,
Gilles
Deva wrote:
>Hi Waleed,
>
>
>It is highly recommended to upgrade to the latest OFED. Meanwhile, can you try
>the latest OMPI release (v1.8.4), where this warning is ignored on older OFEDs
>
>
>-Devendar
>
Diego,
First, I recommend you redefine tParticle and add a padding integer so
everything is aligned.
Before invoking MPI_Type_create_struct, you need to
call MPI_Get_address(dummy, base, MPI%err)
displacements = displacements - base
MPI_Type_create_resized might be unnecessary if tParticle
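a minimal C sketch of this pattern (type and field names are hypothetical):
typedef struct {
    double x[3]; /* coordinates */
    int    ip;   /* particle id */
    int    pad;  /* padding so the struct size is a multiple of 8 */
} tParticle;

tParticle dummy;
MPI_Aint base, disp[2];
int blocklen[2] = {3, 2};
MPI_Datatype types[2] = {MPI_DOUBLE, MPI_INT}, newtype;

MPI_Get_address(&dummy, &base);
MPI_Get_address(&dummy.x[0], &disp[0]);
MPI_Get_address(&dummy.ip, &disp[1]);
disp[0] -= base; /* displacements must be relative to the struct base */
disp[1] -= base;
MPI_Type_create_struct(2, blocklen, disp, types, &newtype);
MPI_Type_commit(&newtype);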
DRESS(dummy[1]), newt) ""
>
>
>What do you think?
>
>George, Did i miss something?
>
>
>Thanks a lot
>
>
>
>
>Diego
>
>
>On 2 January 2015 at 12:51, Gilles Gouaillardet
><gilles.gouaillar...@gmail.com> wrote:
>
>Diego,
>
>Fi
nt you can find the program.
>
> What do you mean by "remove mpi_get_address(dummy) from all displacements"?
>
> Thanks for all your help
>
> Diego
>
>
>
> Diego
>
>
> On 3 January 2015 at 00:45, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com>
ACEMENTS*
> * ENDIF*
>
> and the results is:
>
>*139835891001320 -139835852218120 -139835852213832*
> * -139835852195016 8030673735967299609*
>
> I am not able to understand it.
>
> Thanks a lot.
>
> In the attachment you can find the program
>
>
>
>
>
>Why do I have 16 spaces in displacements(2), I have only an integer in
>dummy%ip?
>
>Why do you use dummy(1) and dummy(2)?
>
>
>Thanks a lot
>
>
>
>Diego
>
>
>On 5 January 2015 at 02:44, Gilles Gouaillardet
><gilles.gouaillar...@iferc.
Diego,
my bad, I should have passed displacements(1) to MPI_Type_create_struct
here is an updated version
(note you have to use a REQUEST integer for MPI_Isend and MPI_Irecv,
and you also have to call MPI_Wait to ensure the requests complete)
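in C, the pattern looks like this (buffer and peer arguments are placeholders):
MPI_Request req;
MPI_Isend(buf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req);
/* ... overlap computation here ... */
MPI_Wait(&req, MPI_STATUS_IGNORE);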
Cheers,
Gilles
On 2015/01/08 8:23, Diego Avesani
Well, per the source code, this is not a bug but a feature:
from the publish function in ompi/mca/pubsub/orte/pubsub_orte.c:
ompi_info_get_bool(info, "ompi_unique", &unique, &flag);
if (0 == flag) {
    /* uniqueness not specified - overwrite by default */
    unique = false;
}
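so if you want the publish to fail instead of overwriting, you should be able
to set the info key explicitly; a minimal sketch in C (service and port names
are placeholders):
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "ompi_unique", "true");
MPI_Publish_name("my_service", info, port_name);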
fwiw, and
b) it seemed just as
> reasonable as the alternative (I believe we flipped a coin)
>
>
>> On Jan 7, 2015, at 6:47 PM, Gilles Gouaillardet
>> <gilles.gouaillar...@iferc.org> wrote:
>>
>> Well, per the source code, this is not a bug but a feature :
>&
the program run in your case?
>
> Thanks again
>
>
>
> Diego
>
>
> On 8 January 2015 at 03:02, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
>> Diego,
>>
>> my bad, i should have passed displacements(1) to MPI_Type_create_
ed is my copy of your program with fixes for the above-mentioned issues.
>
> BTW, I missed the beginning of this thread -- I assume that this is an
> artificial use of mpi_type_create_resized for the purposes of a small
> example. The specific use of it in this program appears to
Hi Siegmar,
could you please try again after adding '-D_STDC_C99' to your CFLAGS?
Thanks and regards,
Gilles
On 2015/01/12 20:54, Siegmar Gross wrote:
> Hi,
>
> today I tried to build openmpi-dev-685-g881b1dc on my machines
> (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1
>
Ryan,
this issue has already been reported.
please refer to
http://www.open-mpi.org/community/lists/users/2015/01/26134.php for a
workaround
Cheers,
Gilles
On 2015/01/14 16:35, Novosielski, Ryan wrote:
> OpenMPI 1.8.4 does not appear to be buildable with GCC 4.9.2. The output, as
> requested
Alexander,
i was able to reproduce this behaviour.
basically, bad things happen when the garbage collector is invoked ...
i was even able to reproduce some crashes (that happen at random
stages) very early in the code
by manually inserting calls to the garbage collector (e.g. System.gc();)
Dave,
the QDR InfiniBand uses the openib btl (by default:
btl_openib_exclusivity=1024)
I assume the RoCE 10Gbps card is using the tcp btl (by default:
btl_tcp_exclusivity=100)
that means that by default, when both the openib and tcp btls could be used,
the tcp btl is discarded.
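if you really want to use both at the same time, one option (use with care)
might be to raise the tcp exclusivity to the same level:
mpirun --mca btl_tcp_exclusivity 1024 ...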
could you give a
Simona,
On 2015/02/08 20:45, simona bellavista wrote:
> I have two systems A (aka Host) and B (aka Target). On A a compiler suite
> is installed (intel 14.0.2), on B there is no compiler. I want to compile
> openmpi on A for running it on system B (in particular, I want to use
> mpirun and
Khalid,
i am not aware of such a mechanism.
/* there might be a way to use the MPI_T_* mechanisms to force the algorithm,
and I will let other folks comment on that */
you definitely cannot directly invoke ompi_coll_tuned_bcast_intra_binomial
(abstraction violation, non portable, and you miss the
you know what you are doing, you can try mpirun -mca sec basic)
on blue waters, that would mean ompi does not run out of the box, but
fails with an understandable message.
that would be less user friendly, but more secure
any thoughts ?
Cheers,
Gilles
[gouaillardet@node0 ~]$ echo c
On 2015/03/26 13:00, Ralph Castain wrote:
> Well, I did some digging around, and this PR looks like the right solution.
ok then :-)
the following stuff is not directly related to ompi, but you might want to
comment on it anyway ...
> Second, the running of munge on the IO nodes is not only okay but
can see Munge is/can be used by both SLURM and
> TORQUE.
> (http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/1-installConfig/serverConfig.htm#usingMUNGEAuth)
>
> If I misunderstood the drift, please ignore ;-)
>
> Mark
>
>
>> On 26 Mar 2015, at 5:38 , Gil
Xing,
another approach is to use ompi-server and Publish_name / Lookup_name:
run ompi-server and pass its uri to the two jobs (one per user),
then you will have to "merge" the two jobs.
this is obviously a bit more effort, but it is a cleaner approach imho.
while sharing accounts is generally
Andy,
what about reconfiguring Open MPI with
LDFLAGS="-Wl,-rpath,/opt/intel/15.0/composer_xe_2015.2.164/compiler/lib/mic"
?
IIRC, another option is: LDFLAGS="-static-intel"
last but not least, you can always replace orted with a simple script
that sets the LD_LIBRARY_PATH and exec the
this
option could overwhelm it and cause failures.
I’d try the static method first, or perhaps the LDFLAGS Gilles suggested.
On Apr 14, 2015, at 5:11 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
Andy,
what about reconfiguring Open MPI with
LDFLAGS
This is a known limitation of the sm btl.
FWIW, the vader btl (available in Open MPI 1.8) has the same limitation,
though I heard there is some work in progress to get rid of this
limitation.
Cheers,
Gilles
On 5/14/2015 3:52 PM, Radoslaw Martyniszyn wrote:
Dear developers of Open MPI,
Siegmar,
do sunpc0 and sunpc1 run the same java version?
from sunpc1, can you run
mpiexec -np 1 java InitFinalizeMain ?
Cheers,
Gilles
On Friday, May 15, 2015, Siegmar Gross
wrote:
> Hi,
>
> I successfully installed openmpi-1.8.5 on my machines
Siegmar,
can you run
LD_LIBRARY_PATH= LD_LIBRARY_PATH64= /usr/bin/ssh
on all your boxes ?
the root cause could be you try to run ssh on box A with the env of box B
can you also run with the -output-tag (or -tag-output) so we can figure out
on which box ssh is failing
Cheers,
Gilles
On
Hi Khalid,
I checked the source code and it turns out the rules must be ordered:
- first by communicator size
- second by message size
Here is attached an updated version of the ompi_tuned_file.conf you
should use
Cheers,
Gilles
On 5/20/2015 8:39 AM, Khalid Hasanov wrote:
Hello,
I am trying
fig for the communicator size 16 (the second one). I am writing
this just in case it is not expected behaviour.
Thanks again.
Best regards,
Khalid
On Wed, May 20, 2015 at 2:12 AM, Gilles Gouaillardet
<gil...@rist.or.jp> wrote:
Hi Khalid,
i che
Hi Mohammad,
the error message is self explanatory.
you cannot invoke MPI functions before invoking MPI_Init or after
MPI_Finalize
the easiest way to solve your problem is to move the MPI_Init call to the
beginning of your program.
Cheers,
Gilles
On Wednesday, May 20, 2015, #MOHAMMAD ASIF
Bill,
the root cause is likely that there is not enough free space in /tmp.
the simplest, but slowest, option is to run mpirun --mca btl tcp ...
if you cannot make enough space under /tmp (maybe you run diskless)
there are some options to create these kinds of files under /dev/shm
Cheers,
Gilles
Hi Xing,
iirc, Open MPI's default behavior is to bind to cores (vs hyperthreads),
hence the error message.
I cannot remember the option to bind to threads, but you can mpirun
--oversubscribe if you are currently stuck
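(if I recall correctly, recent Open MPI accepts
mpirun --bind-to hwthread ...
to bind to hardware threads instead of cores)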
Cheers,
Gilles
On Sunday, May 24, 2015, XingFENG
ubscribe, would the
> performance be influenced?
>
> On Sun, May 24, 2015 at 7:24 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> Hi Xing,
>>
>> iirc, open MPI d
Rahul,
per the logs, it seems the /sys pseudo filesystem is not mounted in your
chroot.
at first, can you make sure it is mounted and try again?
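for example, from outside the chroot (the chroot path is hypothetical):
mount -t sysfs sysfs /path/to/chroot/sys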
Cheers,
Gilles
On 5/26/2015 12:51 PM, Rahul Yadav wrote:
We were able to solve ssh problem.
But now MPI is not able to use component yalla.
At first glance, it seems all mpi tasks believe they are rank zero and
comm world size is 1 (!)
Did you compile xhpl with OpenMPI (and not a stub library for serial
version only) ?
can you make sure there is nothing wrong with your LD_LIBRARY_PATH and
you do not mix MPI libraries
(e.g.
entry
*From:*users [mailto:users-boun...@open-mpi.org] *On Behalf Of *Gilles
Gouaillardet
*Sent:* Tuesday, May 26, 2015 8:14 PM
*To:* Open MPI Users
*Subject:* Re: [OMPI users] Running HPL on RPi cluster, seems like MPI
is somehow not configured properly since it work with 1 node but not more
Jeff,
shall I assume you made a typo and wrote CCFLAGS instead of CFLAGS?
also, can you double check the flags are correctly passed to the assembler
with:
cd opal/asm
make -n atomic-asm.lo
Cheers,
Gilles
On Friday, May 29, 2015, Jeff Layton wrote:
> Good morning,
>
> I'm
Walt,
can you disable firewall and network if possible and give it another try?
Cheers,
Gilles
On Saturday, May 30, 2015, Walt Brainerd wrote:
> It behaved this way with the Cygwin version (very recent update)
> and with 1.8.5 that I built from source.
>
> On
Steve,
MCA_BTL_OPENIB_MODEX_MSG_{HTON,NTOH} do not convert all the fields of the
mca_btl_openib_modex_message_t struct.
I would start here ...
Cheers,
Gilles
On Wednesday, June 3, 2015, Jeff Squyres (jsquyres)
wrote:
> Steve --
>
> I think that this falls directly in
Jeff,
imho, this is a grey area ...
99.999% of the time, posix_memalign is a "pure" function.
"pure" means it has no side effects.
unfortunately, this part of the code is the 0.001% case in which we
explicitly rely on a side effect
(e.g. posix_memalign calls an Open MPI wrapper that updates a
de how to move forward on this.
>
> George.
>
>
> > On Jun 4, 2015, at 22:47, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> >
> > Jeff,
> >
> > imho, this is a grey area ...
> >
> > 99.999% of the ti
i wrote a reproducer i sent to the GCC folks
https://gcc.gnu.org/ml/gcc-bugs/2015-06/msg00757.html
Cheers,
Gilles
On Tue, Jun 9, 2015 at 3:20 AM, Jeff Squyres (jsquyres)
wrote:
> On Jun 8, 2015, at 11:27 AM, Dave Goodell (dgoodell)
> wrote:
>>
>> My
Jeff,
dmb is available only on ARMv7 (Pi 2)
if i remember correctly, you are building Open MPI on ARMv7 as well (Pi 2),
so this is not a cross compilation issue.
if you configure with -march=armv7, the relevant log is
libtool: compile: gcc -std=gnu99 -DHAVE_CONFIG_H -I.
-I../../opal/include
Jeff,
can you run
gcc -march=armv7-a foo.c
Cheers,
Gilles
On Tuesday, June 9, 2015, Jeff Layton wrote:
> Gilles,
>
> I'm not cross-compiling - I'm building on the Pi 2.
>
> I'm not sure how to check if gcc can generate armv7 code.
> I'm using Raspbian and I'm just using the