Dear George,
The patch got malformed when posted. But I did figure out what was
meant.
It turns out that 3 files had to be fixed:
opal/runtime/opal_init.c
orte/runtime/orte_init_stage1.c
orte/runtime/orte_init_stage2.c
in the same way:
[mighell@asterix openmpi-1.0rc4]$ diff -u opal/runtime/opal_init.c_original opal/runtime/opal_init.c
--- opal/runtime/opal_init.c_original Fri Oct 21 13:25:52 2005
+++ opal/runtime/opal_init.c Fri Oct 21 13:48:51 2005
@@ -123,7 +123,7 @@
error:
if (ret != OPAL_SUCCESS) {
opal_show_help("help-opal-runtime",
- "opal_init:startup:internal-failure",
+ "opal_init:startup:internal-failure", true,
error, ret);
}
[mighell@asterix openmpi-1.0rc4]$ diff -u orte/runtime/orte_init_stage1.c_original orte/runtime/orte_init_stage1.c
--- orte/runtime/orte_init_stage1.c_original Fri Oct 21 13:51:41 2005
+++ orte/runtime/orte_init_stage1.c Fri Oct 21 13:52:08 2005
@@ -536,7 +536,7 @@
error:
if (ret != ORTE_SUCCESS) {
opal_show_help("help-orte-runtime",
- "orte_init:startup:internal-failure",
+ "orte_init:startup:internal-failure", true,
error, ret);
}
[mighell@asterix openmpi-1.0rc4]$ diff -u orte/runtime/orte_init_stage2.c_original orte/runtime/orte_init_stage2.c
--- orte/runtime/orte_init_stage2.c_original Fri Oct 21 13:53:15 2005
+++ orte/runtime/orte_init_stage2.c Fri Oct 21 13:53:32 2005
@@ -81,7 +81,7 @@
error:
if (ret != ORTE_SUCCESS) {
opal_show_help("help-orte-runtime",
- "orte_init:startup:internal-failure",
+ "orte_init:startup:internal-failure", true,
error, ret);
}
The system seems to build.
However, the run times for my qlwfpc2 job are now very slow. Jobs end
with comments like
mpirun noticed that job rank 0 with PID 10837 on node "localhost"
exited on signal 25.
3 processes killed (possibly by Open MPI)
-Ken
Ken,
Please apply the following patch (from your
/home/mighell/pkg/ompi/openmpi-1.0rc4/ base directory).
Index: opal/runtime/opal_init.c
===================================================================
--- opal/runtime/opal_init.c (revision 7831)
+++ opal/runtime/opal_init.c (working copy)
@@ -123,7 +123,7 @@
error:
if (ret != OPAL_SUCCESS) {
opal_show_help("help-opal-runtime",
- "opal_init:startup:internal-failure",
+ "opal_init:startup:internal-failure", true,
error, ret);
}
It should solve this issue. I don't know which compiler you use, but
mine never catches this, as it thinks that an int is a bool and so
manages to match the show_help prototype.
Thanks,
george.
On Oct 21, 2005, at 3:37 PM, Ken Mighell wrote:
> Dear OpenMPI,
>
> I tried to build 1.0rc4 on a 3 year old 5-node Beowulf cluster
> running RedHat Linux 7.3. The build failed during
> make all; the last few lines of the log file are:
>
> mkdir .libs
> gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../src/event -I../../include -I../.. -I../.. -I../../include -I../../opal -I../../orte -I../../ompi -O3 -DNDEBUG -fno-strict-aliasing -pthread -MT opal_progress.lo -MD -MP -MF .deps/opal_progress.Tpo -c opal_progress.c -fPIC -DPIC -o .libs/opal_progress.o
> depbase=`echo opal_finalize.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`; \
> if /bin/sh ../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../src/event -I../../include -I../.. -I../.. -I../../include -I../../opal -I../../orte -I../../ompi -O3 -DNDEBUG -fno-strict-aliasing -pthread -MT opal_finalize.lo -MD -MP -MF "$depbase.Tpo" -c -o opal_finalize.lo opal_finalize.c; \
> then mv -f "$depbase.Tpo" "$depbase.Plo"; else rm -f "$depbase.Tpo"; exit 1; fi
> gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../src/event -I../../include -I../.. -I../.. -I../../include -I../../opal -I../../orte -I../../ompi -O3 -DNDEBUG -fno-strict-aliasing -pthread -MT opal_finalize.lo -MD -MP -MF .deps/opal_finalize.Tpo -c opal_finalize.c -fPIC -DPIC -o .libs/opal_finalize.o
> depbase=`echo opal_init.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`; \
> if /bin/sh ../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../src/event -I../../include -I../.. -I../.. -I../../include -I../../opal -I../../orte -I../../ompi -O3 -DNDEBUG -fno-strict-aliasing -pthread -MT opal_init.lo -MD -MP -MF "$depbase.Tpo" -c -o opal_init.lo opal_init.c; \
> then mv -f "$depbase.Tpo" "$depbase.Plo"; else rm -f "$depbase.Tpo"; exit 1; fi
> gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../src/event -I../../include -I../.. -I../.. -I../../include -I../../opal -I../../orte -I../../ompi -O3 -DNDEBUG -fno-strict-aliasing -pthread -MT opal_init.lo -MD -MP -MF .deps/opal_init.Tpo -c opal_init.c -fPIC -DPIC -o .libs/opal_init.o
> opal_init.c: In function `opal_init':
> opal_init.c:127: incompatible type for argument 3 of `opal_show_help'
> make[2]: *** [opal_init.lo] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all-recursive] Error 1
> make[2]: Leaving directory `/home/mighell/pkg/ompi/openmpi-1.0rc4/opal/runtime'
> make[1]: Leaving directory `/home/mighell/pkg/ompi/openmpi-1.0rc4/opal'
>
> I have included gzipped versions of config.log and the result of
> make all:
>
> <config.log.gz>
> <make_all.log.gz>
>
> I was able to build this same package on my Apple dual G5 tower
> today without any problems.
>
> Keep up the good work!
>
> Best regards,
>
> -Ken Mighell
>
>
> -------------------------------------------------------------------------------
> Kenneth Mighell, Associate Scientist       E-mail: mighell_at_[hidden]
> Kitt Peak National Observatory             Phone: 520-318-8391
> National Optical Astronomy Observatory     Fax: 520-318-8360
> P.O. Box 26732, Tucson, AZ 85726-6732      URL: http://www.noao.edu/staff/mighell
"Half of what I say is meaningless; but I say it so that the other
half may reach you"
Kahlil Gibran