On Dec 14, 2011, at 6:51 PM, George Bosilca wrote:
> To be honest I'm totally lost in the naming scheme, which got me confused
> about the RFC you're referring to. We had an MCA parameter to start a vm, so
> I thought VM is some kind of special virtualized environment and not the
> entire ORTE
On Dec 14, 2011, at 6:44 PM, George Bosilca wrote:
> A comment in the commit suggests that the symbols were not linked into the
> orterun if they were not accessed there. I guess this was the trick to make
> sure MPIR_Breakpoint is in there.
>
> Now that you pointed me to this commit I have to
To be honest I'm totally lost in the naming scheme, which got me confused about
the RFC you're referring to. We had an MCA parameter to start a vm, so I
thought VM is some kind of special virtualized environment and not the entire
ORTE. Based on the behavior of the trunk and the RFC you referred
A comment in the commit suggests that the symbols were not linked into the
orterun if they were not accessed there. I guess this was the trick to make
sure MPIR_Breakpoint is in there.
Now that you pointed me to this commit I have to disagree with it. Why have the
MPI debugging symbols been delete
Yes - we were having problems making symbols in orterun visible for the "stat"
debugger when built dynamically. The symbols are actually instantiated in the
debugger base, but they need to be "seen" in orterun prior to us calling
orte_init. So, we had to explicitly reference them.
It was workin
Looks like that line came over in
https://svn.open-mpi.org/trac/ompi/changeset/24561, which was bringing over the
debugger ORTE framework from the trunk
(https://svn.open-mpi.org/trac/ompi/ticket/2688).
Ralph -- do you remember why that line is there?
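For readers following the MPIR_Breakpoint discussion, here is a minimal sketch in C of what "explicitly referencing" a symbol can look like. The names demo_breakpoint, keep_alive, and reference_debugger_symbol are hypothetical and are not taken from orterun.c. In the real code the symbols are defined in the debugger base; this sketch defines a local stand-in so that it compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for a debugger hook such as MPIR_Breakpoint.
 * In the real code base the definition lives elsewhere. */
void demo_breakpoint(void)
{
    /* A debugger would place a breakpoint on this function. */
}

/* Volatile, so the compiler cannot optimize away the store below. */
static void (* volatile keep_alive)(void) = NULL;

/* Take the function's address so that this translation unit carries an
 * explicit reference to the symbol. Returns 1 once the reference is stored. */
int reference_debugger_symbol(void)
{
    keep_alive = demo_breakpoint;
    return keep_alive != NULL;
}
```

Calling reference_debugger_symbol() early in main, before any initialization routine runs, mirrors the ordering described in this thread. Whether such a reference helps or confuses a given debugger depends on the build and the tool, as the TotalView report here suggests.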
On Dec 14, 2011, at 7:21 PM, Nathan Hjelm
On 15/12/11 08:33, Ralph Castain wrote:
> That param was intended to catch user-level mistakes
> whereby the user specified a tmpdir location via the
> tmpdir_base MCA param that the system admin wanted to
> protect. It was not intended for someone to
There still seems to be an issue with using mpirun --debug with totalview. For
some reason totalview is not breaking on MPIR_Breakpoint. Removing the foo =
MPIR_Breakpoint line from orterun.c fixes this issue.
Is there any reason I shouldn't remove that line? Any other debuggers that
might bre
On Tue, 13 Dec 2011 20:27:00 -0500
Jeff Squyres wrote:
> On Dec 13, 2011, at 7:59 PM, Christopher Yeoh wrote:
>
> > Sorry, late to the discussion. This is a spurious warning caused by
> > passing the NULL pointer to the opal free function which is
> > actually ok. It was fixed by #2884 - this is
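As background for the NULL-pointer remark quoted above: the C standard defines free(NULL) as a no-op, so passing NULL is harmless. A minimal sketch follows. The function demo_free is a hypothetical wrapper written for illustration only; it is not Open MPI's free routine and makes no claim about how that routine is implemented.

```c
#include <stdlib.h>

/* Hypothetical wrapper, for illustration only.
 * Returns 0 when given NULL (nothing to release),
 * and 1 when a non-NULL block was released. */
int demo_free(void *ptr)
{
    if (ptr == NULL) {
        return 0;   /* free(NULL) would do nothing anyway */
    }
    free(ptr);
    return 1;
}
```

A wrapper along these lines has no reason to warn on NULL, since the underlying call is well defined for that input.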
Well, I actually have to eat my words here. This code is alive and well.
However, I don't think it does what you wanted or perhaps expected.
That param was intended to catch user-level mistakes whereby the user specified
a tmpdir location via the tmpdir_base MCA param that the system admin wante
Thanks for all the testing, Paul!
On Dec 14, 2011, at 12:37 AM, Paul H. Hargrove wrote:
> On one of the same "System 2" that I used to check compilation against
> Quadrics Elan, I have multiple versions of the Myrinet GM headers/libs.
>
> System 2: Linux/x86
>> $ cat /etc/redhat-release
>> Red
Hello George and all.
Sorry for the late answer, but I was doing some tracing to see where
MPI_ERROR is set. I took a look at ompi_request_default_wait and tried to see
what happens with the request.
Well, I've noticed that all requests that are not immediately completed go
to ompi_request_wait_completio
This is amusing - reviewing the code quickly, it appears that the supporting
code for orte_no_session_dir was mistakenly removed at some point.
I'll restore that functionality. Thanks for pointing it out!
On Dec 12, 2011, at 11:10 PM, Christopher Samuel wrote:
> -BEGIN PGP SIGNED MESSAGE---
On Dec 13, 2011, at 9:10 PM, George Bosilca wrote:
> I noticed today a drastic change in how ORTE deals with the hostfile between
> trunk and 1.5.
>
> 1. 1.5 and prior used the hostfile as a suggestion, a placeholder where to
> pick the requested number of daemons during the launch. The current t
I took the liberty of GK ratcheting that CMR through, in the interest of
expediency...
On Dec 14, 2011, at 8:15 AM, Shiqing Fan wrote:
> I see the real problem now, the .windows file is not added into the tarball.
>
> On 2011-12-14 1:48 PM, George Bosilca wrote:
>> Shiqing,
>>
>> This file see
I see the real problem now, the .windows file is not added into the tarball.
On 2011-12-14 1:48 PM, George Bosilca wrote:
Shiqing,
This file seems to be there.
$ pwd
/home/bosilca/unstable/1.5/ompi
$ svn info opal/mca/shmem/windows/.windows
Path: opal/mca/shmem/windows/.windows
Name: .windows
Hi George,
Right, I was testing RC1 which has this problem. But now it shouldn't
matter.
Thanks,
Shiqing
On 2011-12-14 1:48 PM, George Bosilca wrote:
Shiqing,
This file seems to be there.
$ pwd
/home/bosilca/unstable/1.5/ompi
$ svn info opal/mca/shmem/windows/.windows
Path: opal/mca/shme
Shiqing,
This file seems to be there.
$ pwd
/home/bosilca/unstable/1.5/ompi
$ svn info opal/mca/shmem/windows/.windows
Path: opal/mca/shmem/windows/.windows
Name: .windows
URL:
https://svn.open-mpi.org/svn/ompi/branches/v1.5/opal/mca/shmem/windows/.windows
Repository Root: https://svn.open-mp
Thanks for the hint, Paul.
This build issue is fixed by CMR #2938.
Matthias
On Wednesday 14 December 2011 07:44:48 Paul H. Hargrove wrote:
> OK, Jeff probably wants to choke me for all these emails, but here comes
> another...
>
> I am now configuring my 5 BSD systems with "--without-hwloc
> --d
Hi George,
A .windows file still seems to be missing in opal/mca/shmem/windows/. Could
you also svn add it (from the patch in shmem ticket)?
It is not a source file, but rather a CMake required configuration file.
Probably this change doesn't need another rc. :-) Thanks a lot.
Regards,
Shiqing
And a hwloc problem with very old sched_setaffinity on redhat 8, we're
looking at it.
Brice
On 14/12/2011 11:14, Paul H. Hargrove wrote:
> Summary of my 1.5.5rc1 testing findings:
>
> + generated config.h in tarball breaks hwloc on non-linux platforms:
> http://www.open-mpi.org/community/list
Summary of my 1.5.5rc1 testing findings:
+ generated config.h in tarball breaks hwloc on non-linux platforms:
http://www.open-mpi.org/community/lists/devel/2011/12/10106.php
+ multiply defined symbols problem on MacOS 10.4 (PPC only):
http://www.open-mpi.org/community/lists/devel/2011/12/10103.p
I have been working w/ Brice off-list and we have found the root cause
of ALL those problems I've reported with linux-specific hwloc symbols on
non-linux systems.
Somehow the 1.5.5rc1 tarball contains a GENERATED file from a Linux system!
$ find openmpi-1.5.5rc1 -name autogen | xargs ls
openm
Grumble. This is getting old.
Add Solaris 11 on x86-64 to the list of platforms where OMPI is
incorrectly trying to link Linux-specific hwloc symbols, even though I
can build a stand-alone hwloc w/o problems:
$ uname -a
SunOS pcp-j-20 5.11 snv_151a i86pc i386 i86pc Solaris
$ gcc --version | h
On 12/13/2011 11:50 PM, Brice Goglin wrote:
On 14/12/2011 08:29, Paul H. Hargrove wrote:
I've attempted the build on MacOS 10.4 (Tiger) on x86-64, I hit the
same hwloc issue I've encountered on {Free,Open,Net}BSD.
The build fails with
CCLD opal_wrapper
/usr/bin/ld: Undefined symbols:
I've attempted to reproduce the failure reported below for MacOS 10.4
for PPC on an X86-64 system.
First, I've realized that while I reported "make check" as the source of
the problem, it occurs at "make".
Regardless of that mistake in my reporting, I was unable to reproduce
the problem, makin
On 14/12/2011 08:29, Paul H. Hargrove wrote:
> I've attempted the build on MacOS 10.4 (Tiger) on x86-64, I hit the
> same hwloc issue I've encountered on {Free,Open,Net}BSD.
> The build fails with
>> CCLD opal_wrapper
>> /usr/bin/ld: Undefined symbols:
>> _opal_hwloc122_hwloc_backend_sysfs_e
On 14/12/2011 08:01, Paul H. Hargrove wrote:
> I cannot even *build* OpenMPI on {Free,Open,Net}BSD systems unless I
> configure with --without-hwloc.
> Thus I cannot agree w/ Brice's suggestion that I ignore this warning.
Please try building hwloc (1.2.2 if you want the same one as OMPI
current
I've attempted the build on MacOS 10.4 (Tiger) on x86-64, I hit the same
hwloc issue I've encountered on {Free,Open,Net}BSD.
The build fails with
CCLD opal_wrapper
/usr/bin/ld: Undefined symbols:
_opal_hwloc122_hwloc_backend_sysfs_exit
_opal_hwloc122_hwloc_backend_sysfs_init
_opal_hwloc122_h
I can "make all install clean" ompi-1.5.5rc1 on the following 2 systems
which have Quadrics Elan headers/libs.
System 1: Linux/x86-64
$ cat /etc/redhat-release
CentOS release 4.2 (Final)
$ uname -a
Linux [hostname] 2.6.9-22.EL #1 Thu Feb 23 16:23:18 EST 2006 x86_64
x86_64 x86_64 GNU/Linux
$ g
On 12/13/2011 10:53 PM, Brice Goglin wrote:
On 14/12/2011 07:17, Paul H. Hargrove wrote:
My OpenBSD and NetBSD testers have the same behavior, but now I see
that I was warned...
On all the affected systems I found the following (modulo the system
tuple) in the configure output:
checkin
On 14/12/2011 07:12, Paul H. Hargrove wrote:
> I cannot build hwloc in 1.5.5rc1 on the following system:
>
> System 2: Linux/x86
>> $ cat /etc/redhat-release
>> Red Hat Linux release 8.0 (Psyche)
>> $ uname -a
>> Linux [hostname] 2.4.21-60.ELsmp #1 SMP Fri Aug 28 06:45:10 EDT 2009
>> i686 i6
On 14/12/2011 07:17, Paul H. Hargrove wrote:
> My OpenBSD and NetBSD testers have the same behavior, but now I see
> that I was warned...
>
> On all the affected systems I found the following (modulo the system
> tuple) in the configure output:
>> checking which OS support to include... Unsup
OK, Jeff probably wants to choke me for all these emails, but here comes
another...
I am now configuring my 5 BSD systems with "--without-hwloc
--disable-io-romio".
The systems (all using /usr/bin/gcc) are:
FreeBSD-8.2-RELEASE on amd64:
gcc (GCC) 4.2.1 20070719 [FreeBSD]
FreeBSD-7.2-RELEA
My OpenBSD and NetBSD testers have the same behavior, but now I see that
I was warned...
On all the affected systems I found the following (modulo the system
tuple) in the configure output:
checking which OS support to include... Unsupported!
(x86_64-unknown-openbsd5.0)
configure: WARNING:
I cannot build hwloc in 1.5.5rc1 on the following system:
System 2: Linux/x86
$ cat /etc/redhat-release
Red Hat Linux release 8.0 (Psyche)
$ uname -a
Linux [hostname] 2.4.21-60.ELsmp #1 SMP Fri Aug 28 06:45:10 EDT 2009
i686 i686 i386 GNU/Linux
$ gcc --version | head -1
gcc (GCC) 3.4.0
On one of the same "System 2" that I used to check compilation against
Quadrics Elan, I have multiple versions of the Myrinet GM headers/libs.
System 2: Linux/x86
$ cat /etc/redhat-release
Red Hat Linux release 8.0 (Psyche)
$ uname -a
Linux [hostname] 2.4.21-60.ELsmp #1 SMP Fri Aug 28 06:45:10
I am seeing build failures on the following:
FreeBSD-8.2-RELEASE on amd64
FreeBSD-7.2-RELEASE on amd64
FreeBSD-6.3-RELEASE on amd64
All three fail with the same error:
CCLD opal_wrapper
../../../opal/.libs/libopen-pal.so: undefined reference to
`opal_hwloc122_hwloc_backend_sysfs
Using the 1.5.5rc1 tarball, I've repeated tests on the following
platforms for which I recently reported 1.4.5rc1 results:
MacOS 10.5 (Leopard) on PPC:
powerpc-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5488)
MacOS 10.4 (Tiger) on PPC:
powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1