014, at 7:51 AM, Ralph Castain <r...@open-mpi.org> wrote:
Forwarding this for Paul until his email address gets updated on the User list:
Begin forwarded message:
Date: October 17, 2014 at 6:35:31 AM PDT
From: Paul Kapinos <kapi...@itc.rwth-aachen.de>
To: Open MPI Users <us...@ope
el6.x86_64 #1 SMP Tue Sep 9
13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
pk224850@cluster:~[510]$ cat /etc/issue
Scientific Linux release 6.5 (Carbon)
Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our environment.
Best
Paul Kapinos
P.S. Is there a configure flag, wh
to get in-depth feeling?
Best
Paul Kapinos
Attached: some logs from the installation on 27.05 and today's try, and quota.h
(changed at 29.09). Note that also the kernel changed (and maybe the Scientific
Linux version from 6.4 to 6.5?)
pk224850@cluster:~[502]$ ls -la /usr/include/sys/quota.h
-rw-r--r
P.S. We also have Sun/Oracle Studio:
$ module avail studio
On 12/11/14 19:45, Jeff Squyres (jsquyres) wrote:
Ok.
FWIW: I test with gcc and the intel compiler suite. I do not have a PGI
license to test with.
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen
th more experience in linking libs and especially Open MPI take a
look at this? (Sorry for pushing this, but all this smells to me like a
general linking problem rooted somewhere in Open MPI and '--disable-dlopen'; see
"fun fact" above.)
best
Paul Kapinos
P.S. Never used "
%+ bandwidth loss?
Best
Paul Kapinos
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
MCA btl: parameter "btl_openib_verbose" (current value:
"fals
d news is that if this fixes your problem, the fix is already included
in the upcoming v1.10.1 release.
Indeed, that was it. Fixed!
Many thanks for the support!
Best
Paul
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074
(89 MB!)
Best wishes
Paul Kapinos
1) https://www.open-mpi.org/community/lists/devel/2014/10/16106.php
2) https://www.open-mpi.org/community/lists/devel/2014/10/16109.php
https://github.com/hppritcha/ompi/commit/53fd425a6a0843a5de0a8c544901fbf01246ed31
3)
https://rwth-aachen.sciebo.de/index.
Hello all, JFYI and for log purposes:
*In short: 'caddr_t' issue is known and is addressed in new(er) ROMIO releases.*
Below the (off-list) answer (snippet) from Rob Latham.
On 12/08/15 13:16, Paul Kapinos wrote:
In short: ROMIO in current Open MPI versions cannot be configured using old versions
ssue...
Have a nice day,
Paul Kapinos
..
access("/opt/MPI/openmpi-1.10.2/linux/intel_16.0.2.181/include/mpp/shmem.fh",
R_OK) = 0
stat("/opt/MPI/openmpi-1.10.2/linux/intel_16.0.2.181/include/mpp/shmem.fh",
{st_mode=S_IFREG|0644, st_size=205, ...}) = 0
open("
day,
Paul Kapinos
pk224850@linuxc2:/opt/MPI/openmpi-1.8.1/linux/intel/include[519]$ ls -la
mpp/shmem.fh
lrwxrwxrwx 1 pk224850 pk224850 11 Jul 13 13:20 mpp/shmem.fh -> ../shmem.fh
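For the log, a relative link like the one in the listing above can be recreated with a few lines of Python. This is only a minimal sketch: the helper name make_shmem_link and the temporary directory are assumptions for illustration; only the link name (mpp/shmem.fh) and its target (../shmem.fh) are taken from the `ls` output.

```python
import os
import tempfile


def make_shmem_link(include_dir):
    """Create include_dir/mpp/shmem.fh as a relative symlink to ../shmem.fh.

    Hypothetical helper; only the link name and target come from the listing.
    """
    mpp_dir = os.path.join(include_dir, "mpp")
    os.makedirs(mpp_dir, exist_ok=True)
    link = os.path.join(mpp_dir, "shmem.fh")
    # lexists() is also true for a dangling link, so we never try to create it twice
    if not os.path.lexists(link):
        os.symlink("../shmem.fh", link)
    return link


if __name__ == "__main__":
    # Work in a scratch directory instead of a real Open MPI installation.
    with tempfile.TemporaryDirectory() as include_dir:
        with open(os.path.join(include_dir, "shmem.fh"), "w") as fh:
            fh.write("! placeholder header\n")
        link = make_shmem_link(include_dir)
        print(os.readlink(link), os.path.exists(link))
```

The link target is stored relative to the mpp directory, so it keeps working if the whole include tree is moved.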
Cheers,
Gilles
On Wednesday, July 13, 2016, Paul Kapinos <kapi...@itc.rwth-aachen.de
<mailto:kapi...
in reproduce this?
Best,
Paul Kapinos
P.S.: The same test with Intel MPI cannot run using DAPL, but runs fine over
'ofa' (= native verbs, as Open MPI uses them). So I believe the problem is rooted in
the communication pattern of the program; it sends very LARGE messages to a lot
of/all other processes
oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0
Nevertheless, I cannot reproduce your initial issue with 1.6.1rc2 in our
environment.
Best
Paul Kapinos
$ time /opt/MPI/openmpi-1.6.1rc2mt/linux/intel/bin/mpiexec -mca
oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0 -np 4 -H
linuxscc005,linuxscc004 a.out
s):
$ mpiexec a.out 108000 108001
Well, we know about the need to raise the value of one of these parameters, but
I wanted to let you know that your workaround for the problem is still not
100% perfect, only 99%.
Best,
Paul Kapinos
P.S: A note about the inform
so currently I cannot check it again. Remind me again once the link issue is
fixed!
Best,
Paul
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
istered could be a good idea.
Does this make sense?
Best,
Paul Kapinos
P.S. The example program used is of course synthetic, but it is
closely modeled on the Serpent software. (However, Serpent usually uses
chunks, whereas the actual error arises if all 8 GB are sent in one piece.)
P.
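The difference between sending in chunks and sending everything in one piece can be sketched as follows. This is neither Serpent's nor Open MPI's code: the helper name chunk_ranges and the 1 GiB chunk size are assumptions for illustration, and only the 8 GB total comes from the message above.

```python
def chunk_ranges(total_bytes, chunk_bytes):
    """Yield (offset, length) pairs that cover total_bytes in pieces of at
    most chunk_bytes each. Hypothetical helper for illustration only."""
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_bytes > 0")
    offset = 0
    while offset < total_bytes:
        length = min(chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


if __name__ == "__main__":
    GIB = 1 << 30
    total = 8 * GIB  # the 8 GB buffer from the example program
    pieces = list(chunk_ranges(total, GIB))  # assumed chunk size: 1 GiB
    print(len(pieces), "pieces, largest:", max(n for _, n in pieces), "bytes")
```

Each (offset, length) pair would correspond to one send call, so no single message has to cover the whole buffer at once.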
fallback to work around these scenarios in the future.
Maybe a bit more verbosity here would be a good idea?
Best,
Paul Kapinos
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel
Paul's input on this first.
Did it work with log_num_mtt=26?
I don't have that kind of machine to test this.
-- YK
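For orientation, the amount of memory that can be registered with given mlx4_core settings can be estimated as (2^log_num_mtt) * (2^log_mtts_per_seg) * page size, as described in the Open MPI FAQ. A minimal sketch of that arithmetic follows; the 4 KiB page size and the log_mtts_per_seg values are assumptions, since only log_num_mtt=26 is mentioned in this thread.

```python
def max_registered_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Estimate registerable memory in bytes:
    2**log_num_mtt * 2**log_mtts_per_seg * page_size."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


if __name__ == "__main__":
    GIB = 1 << 30
    # log_num_mtt=26 is the value asked about; log_mtts_per_seg is assumed here.
    for log_mtts_per_seg in (0, 1, 3):
        size = max_registered_bytes(26, log_mtts_per_seg)
        print(f"log_num_mtt=26 log_mtts_per_seg={log_mtts_per_seg}: "
              f"{size // GIB} GiB")
```

The usual recommendation is to allow registering roughly twice the physical memory of the node, so the right values depend on the machine.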
On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:
Hi Paul,
On 10/31/2012 10:22 PM, Paul Kapinos wrote:
Hello Yevgeny, hello all,
Yevgeny, first of all thanks for explaining
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80
fs+ufs+nfs+lustre'
--enable-mpi-ext ..
(adding paths, compiler-specific optimisation things and -m32 or -m64)
A config.log file is attached FYI
Best
Paul
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter
utomatically?
Background: Currently we're using the 'carto' framework on our rather special
'Bull BCS' nodes. Each such node consists of 4 boards, each with its own IB card,
which together form a shared memory system. Clearly, communication should go over
the nearest IB interface; for this we use 'carto' now.
best
in advance
Mohammad
this frontline? Or, can I activate more verbosity
/ what did I do wrong with the path? (see attached file)
Best!
Paul Kapinos
*) the nodes used for testing are also Bull BCS nodes, but consisting of just two
boards instead of 4
--
Dipl.-Inform. Paul Kapinos - High Performance Comput
On 12/03/13 23:27, Jeff Squyres (jsquyres) wrote:
On Nov 22, 2013, at 1:19 PM, Paul Kapinos <kapi...@rz.rwth-aachen.de> wrote:
Well, I've tried this patch on the current 1.7.3 (where the code has moved by some 12
lines - beginning at 2700).
!! - no output "skipping device"! Also wh
On 12/04/13 14:53, Jeff Squyres (jsquyres) wrote:
On Dec 4, 2013, at 4:31 AM, Paul Kapinos <kapi...@rz.rwth-aachen.de> wrote:
Argh - what a shame not to see "btl:usnic" :-|
What a shame you don't have Cisco hardware to use the usnic BTL! :-p
Well, this is far above my d
--
No idea why the file share/openmpi/help-oob-tcp.txt was not installed in
1.7.4, as we compile this version in pretty much the same way as previous versions.
Best,
Paul Kapinos
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter
se
ib0 if it is present and specified in if_include - we should be doing it.
For now, can you run this with "-mca oob_base_verbose 100" on your cmd line and
send me the output? Might help debug the behavior.
Thanks
Ralph
On Feb 11, 2014, at 1:22 AM, Paul Kapinos <kapi...@rz.rwth-aachen
a bunch of diagnostic statements that should help me
track it down.
Thanks
Ralph
On Feb 12, 2014, at 1:26 AM, Paul Kapinos <kapi...@rz.rwth-aachen.de> wrote:
As said, the change in behaviour is new in 1.7.4 - all previous versions worked.
Moreover, setting "-mca oob_tcp_if_include ib
e uncomfortable, huh)
Best
Paul Kapinos
P.S. Tested versions: 1.6.5, 1.7.4
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
#!/usr/bin/perl
use Sys::Hostname;
open (MYFIL
P.S. btl_openib_get_alignment and btl_openib_put_alignment are '0' by default -
setting them high did not change the behaviour...
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
mproperly?
Best,
Paul Kapinos
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
+ OpenMPI combination do not like this?
If NO, why the hell do none of the other compiler+MPI combinations complain about this?
:o)
Have a nice day,
Paul Kapinos
P.S. Did you also notice this one?
https://www.mail-archive.com/users@lists.open-mpi.org//msg30320.html
- 1
ude/lustre/liblustreapi.h file, included from
'openmpi-2.0.1/ompi/mca/fs/lustre/fs_lustre.c' file (line 46, doh).
Well, it is up to you whether to change or keep the way the Lustre headers are
included in Open MPI. Just my $2%.
Have a nice day,
Paul Kapinos
pk224850@lnm001:/w0/tmp/pk224850/OpenMPI/2.0.
or MPI_Init_thread, must be called before any other MPI
>routine (apart from MPI_Initialized) is called. MPI can be initialized
>at most once; subsequent calls to MPI_Init or MPI_Init_thread are erroneous.
--
Dipl.-Inform. Paul Kapinos - High Performance
I'm fighting
with some ISVs to get them to update their software to 1.10.x NOW; we know about one who
just managed to go from 1.6.x to 1.8.x half a year ago...)
Thank you very much!
Paul Kapinos
On 10/12/2017 09:31 AM, Gilles Gouaillardet wrote:
> Paul,
>
>
> i made PR #4331 https://github.com
- likely because you develop on a much, much
newer version of Open MPI.
Q1: To *which* release should patch 4331 be applied?
Q2: I assume it is unlikely that this patch will be back-ported to 1.10.x?
Best
Paul Kapinos
On 10/12/2017 09:31 AM, Gilles Gouaillardet wrote:
> Paul,
>
>
estion: is there a way/a chance to effectively disable the busy wait
using Open MPI?
Best,
Paul Kapinos
[1] http://www.open-mpi.de/faq/?category=running#force-aggressive-degraded
[2]
http://blogs.cisco.com/performance/polling-vs-blocking-message-passingprogress
[3]
https://www.paraview.or
one of type
https://github.com/open-mpi/ompi/issues/4466
The most amazing thing is that only one version of Open MPI (the patched 3.0.0 one)
stopped working instead of all of them. Seems we're lucky. WOW.
Will report on the results of the 3.0.0p rebuild.
best,
Paul Kapinos
$ objdump -S /usr/lib64/libmemkind.so.0.0.1
JFYI: the same issue is also in Open MPI 4.1.1.
I cannot open a Gitlab issue due to lack of an account(*), so I would kindly
ask somebody to open one, if possible.
Have a nice day
Paul Kapinos
(* too many accounts in my life. )
On 4/16/21 6:02 PM, Paul Kapinos wrote:
Dear Open MPI
rt(external)... no
> ...
> PMIx support: Internal
This is surprising and feels like an error. Could you have a look at this? Thank
you!
Have a nice day,
Paul Kapinos
P.S. grep for 'PMIx' in config-log
https://rwth-aachen.sciebo.de/s/xtNIx2dJlTy2Ams
(pastebin and gist both need accounts and I hate account
of information)
Have a nice day,
Paul Kapinos
[1] https://developer.nvidia.com/hpc-compilers
FCLD libmpi_usempif08.la
/usr/bin/ld: .libs/comm_spawn_multiple_f08.o: relocation R_X86_64_32S against
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld