Unless I am mistaken, the text quoted below from README no longer reflects
the current behavior.
The text appears to be the same in master and v1.8.
-Paul
--with-libltdl(=value)
This option specifies where to find the GNU Libtool libltdl support
library. The following values are permitted:
On 4/22/2015 12:43 AM, Jeff Squyres (jsquyres) wrote:
In the usual location:
http://www.open-mpi.org/software/ompi/v1.8/
Making all in mpi/fortran/use-mpi-f08
make[2]: Entering directory
'/cygdrive/e/cyg_pub/devel/openmpi/openmpi-1.8.5rc2-1.x86_64/build/ompi/mpi/fortran/use-mpi-f08'
We are experiencing a bug in OpenMPI in 1.8.4 which happens also on
master: if locked memory limits are too low, a segfault happens
in openib/udcm because some memory is not correctly deallocated.
To reproduce it, modify /etc/security/limits.conf with:
* soft memlock 64
* hard memlock 64
and launc
Hi Raphael,
Thanks very much for the patches.
Would one of the developers on the list have a system where they
can make these kernel limit changes and which have HCAs installed?
I don't have access to any system where I have such permissions.
Howard
2015-04-22 8:55 GMT-06:00 Raphaël Fouassier
Howard,
Unless there is some reason the settings must be global, you should be able
to set the limits w/o root privs:
Bourne shells:
$ ulimit -l 64
C shells:
% limit -h memorylocked 64
I would have thought these lines might need to go in a .profile or .cshrc
to affect the application pro
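
For anyone who would rather lower the limit from inside a test program than from the shell, here is a minimal sketch in C. It assumes a platform that provides RLIMIT_MEMLOCK (Linux and the BSDs do; it is not in POSIX), and the helper names are made up for illustration. Note that getrlimit/setrlimit work in bytes, while bash's "ulimit -l" takes KiB, so "ulimit -l 64" corresponds to 65536 bytes here.

```c
#include <sys/resource.h>

/* Hypothetical helper: read the soft and hard locked-memory limits (bytes).
 * Returns 0 on success, -1 on error (errno is set by getrlimit). */
static int memlock_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        return -1;
    }
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Hypothetical helper: set the soft limit to `bytes`, clamped to the hard
 * limit. Moving the soft limit at or below the hard limit needs no root. */
static int set_memlock_soft(rlim_t bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        return -1;
    }
    rl.rlim_cur = (bytes > rl.rlim_max) ? rl.rlim_max : bytes;
    return setrlimit(RLIMIT_MEMLOCK, &rl);
}
```

Calling set_memlock_soft(64 * 1024) before anything registers memory should roughly mimic "ulimit -l 64" for that process and the children it starts afterwards.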
And here is the backtrace I probably should have provided in the previous
email.
-Paul
#0 0x2b4107ce9265 in raise () from /lib64/libc.so.6
#1 0x2b4107ceaeb8 in abort () from /lib64/libc.so.6
#2 0x2b4107ce26e6 in __assert_fail () from /lib64/libc.so.6
#3 0x0044e8b3 in udcm_m
Hi Paul,
silly me. forgot this was a ulimit thing. I'll test on carver.
Howard
2015-04-22 10:45 GMT-06:00 Paul Hargrove :
> And here is the backtrace I probably should have provided in the previous
> email.
> -Paul
>
> #0 0x2b4107ce9265 in raise () from /lib64/libc.so.6
> #1 0x2b41
When building from a git clone of master I encountered the following:
checking for flex... no
checking for lex... no
configure: WARNING: *** Could not find GNU Flex on your system.
configure: WARNING: *** GNU Flex required for developer builds of Open MPI.
configure: WARNING: *** Other versions of
Umm, why are you cleaning up this way. The allocated resources *should*
be freed by the udcm_module_finalize call. If there is a bug in that
path it should be fixed there NOT by adding a bunch of gotos (ick).
I will take a look now and apply the appropriate fix.
-Nathan
On Wed, Apr 22, 2015 at
I had reason to look at the linux timer code today and noticed what I
believe to be a subtle error.
This is in both 'master' and v1.8.5rc2
Since casts bind tighter than multiplication in C, I believe that the
1-line patch below is required to produce the desired result of conversion
to an integer
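
To illustrate the precedence point in isolation, here is a small C sketch. The function names and the MHz-to-Hz scaling are assumptions chosen for the example, not the actual timer code: without parentheses the cast applies to the floating-point operand alone, so the fraction is dropped before the multiplication happens.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; not taken from the Open MPI source. */

/* The cast binds to mhz alone: the fraction is truncated before scaling. */
static uint64_t hz_cast_first(double mhz)
{
    return (uint64_t) mhz * 1000000;
}

/* Parenthesized product: scale first, then convert to an integer. */
static uint64_t hz_scale_first(double mhz)
{
    return (uint64_t) (mhz * 1000000);
}

/* Small self-check showing the difference for 2400.5 MHz. */
static void check_example(void)
{
    assert(hz_cast_first(2400.5) == 2400000000ULL);
    assert(hz_scale_first(2400.5) == 2400500000ULL);
}
```

The two versions agree only when the input has no fractional part, which is why this kind of error is easy to miss.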
I will commit the log messages with my fix. Will PR the fix for 1.8.5.
-Nathan
On Wed, Apr 22, 2015 at 12:43:56PM -0600, Nathan Hjelm wrote:
>
> Umm, why are you cleaning up this way. The allocated resources *should*
> be freed by the udcm_module_finalize call. If there is a bug in that
> path
Hi Raphaël,
I give you an A+ for effort. We always appreciate patches.
Howard
2015-04-22 12:43 GMT-06:00 Nathan Hjelm :
>
> Umm, why are you cleaning up this way. The allocated resources *should*
> be freed by the udcm_module_finalize call. If there is a bug in that
> path it should be fixed
I see the problem. I thought I fixed this a while ago but apparently
not. The various OBJ_CONSTRUCT lines should be at the top of the
udcm_module_init to ensure that they are always called. Fixing.
-Nathan
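
The idea of constructing everything up front so that finalize is always safe can be sketched in plain C, as an analogy only. This does not use the OPAL object macros, and the struct, the function names, and the fail_second flag are all hypothetical: every member is put into a known state at the top of init, so a failure partway through can simply call finalize.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical module; illustrative only, not the real udcm code. */
struct module {
    void *buf_a;
    void *buf_b;
};

/* Put every member into a known state before anything can fail. */
static void module_construct(struct module *m)
{
    m->buf_a = NULL;
    m->buf_b = NULL;
}

/* Safe on a partially initialized module: free(NULL) is a no-op. */
static void module_finalize(struct module *m)
{
    free(m->buf_a);
    free(m->buf_b);
    m->buf_a = NULL;
    m->buf_b = NULL;
}

/* fail_second simulates running out of a resource partway through init. */
static int module_init(struct module *m, bool fail_second)
{
    module_construct(m);            /* always runs first */

    m->buf_a = malloc(64);
    if (NULL == m->buf_a) {
        module_finalize(m);
        return -1;
    }

    m->buf_b = fail_second ? NULL : malloc(64);
    if (NULL == m->buf_b) {
        module_finalize(m);         /* one cleanup path, no gotos */
        return -1;
    }
    return 0;
}
```

Because construction never fails and always happens, the error paths need no knowledge of how far init got.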
On Wed, Apr 22, 2015 at 01:13:08PM -0600, Nathan Hjelm wrote:
>
> I will commit the log me
Agreed. goto's just make me grumpy.
-Nathan
On Wed, Apr 22, 2015 at 01:17:11PM -0600, Howard Pritchard wrote:
>Hi Rafael,
>I give you an A+ for effort. We always appreciate patches.
>Howard
>2015-04-22 12:43 GMT-06:00 Nathan Hjelm :
>
> Umm, why are you cleaning up this w
Well, still not working. I compiled on the AMD and for 1.6.4 I get:
(note: ideally, at this point, we really want 1.6.4, not 1.8.4 (yet)).
1.6.4 using --bind-to-socket --bind-to-core
--
An attempt to set processor affinit
PR https://github.com/open-mpi/ompi-release/pull/250. Raphaël, can you
please confirm this fixes your issue.
-Nathan
On Wed, Apr 22, 2015 at 04:55:57PM +0200, Raphaël Fouassier wrote:
> We are experiencing a bug in OpenMPI in 1.8.4 which happens also on
> master: if locked memory limits are too
Fixed in
https://github.com/open-mpi/ompi/commit/46aa20a9191db2f5cc1850c0f4f881ac51653cb4.
Thanks!
> On Apr 22, 2015, at 3:01 PM, Paul Hargrove wrote:
>
> I had reason to look at the linux timer code today and noticed what I believe
> to be a subtle error.
> This is in both 'master' and v1.8.
Fixed -- thanks:
https://github.com/open-mpi/ompi/commit/4b8fa246824418f8bd46419286bb1bcb8ce6e941
FWIW, it *did* print a list of files for me on my Mac when I faked it out and
forced it to *not* find flex.
Shrug.
So I just took that part of the error message out -- let devs always install
fle
On Wed, Apr 22, 2015 at 2:02 PM, Jeff Squyres (jsquyres) wrote:
> FWIW, it *did* print a list of files for me on my Mac when I faked it out
> and forced it to *not* find flex.
>
A quick look at the commit shows
for lfile in `find . -name \*.l -print`; do
Notice the find is rooted at ".".
Good catch.
But I'm ok forcing devs to install flex. Or be creative -- without any help --
to generate .c files from .l files themselves. :-)
> On Apr 22, 2015, at 5:08 PM, Paul Hargrove wrote:
>
>
> On Wed, Apr 22, 2015 at 2:02 PM, Jeff Squyres (jsquyres)
> wrote:
> FWIW, it *did* print
I had an opportunity to try the 1.8.5rc2 tarball on a little-endian POWER8
(aka ppc64el or powerpc64le).
The good news is that things "just worked" as they did when I tried ARMv8
(aka aarch64).
However, I see a little room for improvement with almost no work at all.
I noticed:
checking for __syn
On Apr 22, 2015, at 10:05 AM, Marco Atzeri wrote:
>
> Making all in mpi/fortran/use-mpi-f08
> make[2]: Entering directory
> '/cygdrive/e/cyg_pub/devel/openmpi/openmpi-1.8.5rc2-1.x86_64/build/ompi/mpi/fortran/use-mpi-f08'
> FCLD libmpi_usempif08.la
> .libs/abort_f08.o: In function `mpi_abort
Compilation of OpenMPI 1.8.4 using Intel compiler version 14.0.4.211 results in
usable code but has the following "remarks":
thanks
tom
make[2]: Entering directory
`/home02/tom/src/openmpi-1.8.4_intel_1404211/ompi/mpi/fortran/use-mpi-f08'
PPFC mpi-f08-types.lo
GENERATE sizeof_f08.h
On Apr 22, 2015, at 5:19 PM, Paul Hargrove wrote:
>
> I had an opportunity to try the 1.8.5rc2 tarball on a little-endian POWER8
> (aka ppc64el or powerpc64le).
> The existing powerpc64 inline asm should work.
Sweet -- I put your patch in here:
https://github.com/open-mpi/ompi/pull/550
Jus
This is actually expected.
We use compiler pragmas in the Fortran code that are not recognized by all
compilers. But they're safely ignored (even though they're noisy). :-\
> On Apr 22, 2015, at 5:22 PM, Tom Wurgler wrote:
>
>
> Compilation of OpenMPI 1.8.4 using Intel compiler version 14.
Oops -- missed this when I reviewed / updated README for v1.8. Will fix --
thanks!
> On Apr 22, 2015, at 12:02 AM, Paul Hargrove wrote:
>
> Unless I am mistaken, the text quoted below from README no longer reflects
> the current behavior.
> The text appears to be the same in master and v1.8.
Well, I tried rc2 on just about everything except my phone and my linksys.
For me the configure failure (dlopen() not found) on {Free,Net,Open}BSD is
the only problem.
Since it works on 'master' I am confident Jeff will sort this out.
-Paul
--
Paul H. Hargrove phhargr..
On 4/22/2015 11:19 PM, Jeff Squyres (jsquyres) wrote:
Question:
what is the scope of the new two shared libs
usr/bin/cygmpi_usempi_ignore_tkr-0.dll
usr/bin/cygmpi_usempif08-0.dll
in comparison to previous
usr/bin/cygmpi_mpifh-2.dll
usr/bin/cygmpi_usempi-1.dll
already present in 1.8.4 ?
A
On Apr 22, 2015, at 6:18 PM, Marco Atzeri wrote:
>
>> I'm guessing you upgraded your fortran compiler?
>
> eventually just from 4.8.x to 4.9x
Yep -- that would do it. gfortran 4.8.x is "old enough" Fortran, gfortran
4.9.x is "new enough" Fortran.
>> The usempif08 library is the "use mpi_f
I think we missed 2 commits on v1.8. Filed PR
https://github.com/open-mpi/ompi-release/pull/254 to fix the problem.
bot:hargrove -- can you test?
> On Apr 21, 2015, at 8:40 PM, Paul Hargrove wrote:
>
>
>
> On Tue, Apr 21, 2015 at 5:33 PM, Jeff Squyres (jsquyres)
> wrote:
> What happens w
On Wed, Apr 22, 2015 at 2:43 PM, Jeff Squyres (jsquyres) wrote:
> > In addition to the one-line patch below, I needed to run autogen.pl
> with a new enough config/config.{guess,sub}.
> > Along the way I noticed
> > opal/mca/common/libfabric/libfabric/config/config.guess
> > opal/m
On Wed, Apr 22, 2015 at 4:20 PM, Jeff Squyres (jsquyres) wrote:
> I think we missed 2 commits on v1.8. Filed PR
> https://github.com/open-mpi/ompi-release/pull/254 to fix the problem.
>
> bot:hargrove -- can you test?
>
Initial testing failed autogen.pl (as did Jenkins).
I am past that point an
> On Apr 22, 2015, at 12:27 PM, Tom Wurgler wrote:
>
> Well, still not working. I compiled on the AMD and for 1.6.4 I get:
> (note: ideally, at this point, we really want 1.6.4 ,not 1.8.4 (yet)).
>
> 1.6.4 using --bind-to-socket --bind-to-core
>
> -
Here is what I see on my machine:
07:59:55 (v1.8) /home/common/openmpi/ompi-release$ mpirun -np 8
--display-devel-map --report-bindings --map-by core -host bend001 --bind-to
core hostname
Data for JOB [45531,1] offset 0
Mapper requested: NULL Last mapper: round_robin Mapping policy: BYCORE