A gcc-based build is fine.

So, I think this is similar to issue #3992
<https://github.com/open-mpi/ompi/issues/3992> in which we seem to have
decided that /usr/bin/cc (clang) is not to be trusted on this platform.

-Paul

On Wed, Aug 30, 2017 at 4:49 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Ralph,
>
> See my response to Larry.  The impossibly large value was a figment of
> gdb's imagination.
>
> This system worked for Open MPI when it was still at FreeBSD-11.0.
> I cannot say if the current problem is w/ FreeBSD-11.1 (e.g. its compiler)
> or with Open MPI.
>
> I am trying a gcc-based build now.
>
> -Paul
>
>
> On Wed, Aug 30, 2017 at 4:22 PM, r...@open-mpi.org <r...@open-mpi.org>
> wrote:
>
>> Yeah, that caught my eye too as that is impossibly large. We only have a
>> handful of active queues - looks to me like there is some kind of alignment
>> issue.
>>
>> Paul - has this configuration worked with prior versions of OMPI? Or is
>> this something new?
>>
>> Ralph
>>
>> On Aug 30, 2017, at 4:17 PM, Larry Baker <ba...@usgs.gov> wrote:
>>
>> Paul,
>>
>> (gdb) print base->nactivequeues
>> $3 = 106201992
>>
>> seems like an extraordinarily large number to me.  I don't know what the
>> implications of the --enable-debug clang option are.  Any chance the
>> SEGFAULT is a debugging trap when an uninitialized value is encountered?
>>
>> The other thought I had is an alignment trap if, for example,
>> nactivequeues is a 64-bit int but is not 64-bit aligned.  As far as I can
>> tell, nactivequeues is a plain int.  But, what that is on FreeBSD/amd64, I
>> do not know.
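
A minimal sketch of the alignment question raised above, assuming nothing about libevent's real layout: `struct demo_base`, `is_aligned`, and `default_priority` are hypothetical names invented for illustration, not Open MPI or libevent code. It only shows how one might check that a plain `int` member sits at a naturally aligned address before it is read in a statement shaped like the faulting line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for libevent's struct event_base: only the field
 * discussed in the thread is modelled. The odd-sized pad makes the
 * compiler insert padding so the int still lands on an aligned offset. */
struct demo_base {
    char pad[3];
    int  nactivequeues;   /* a plain int, as noted above */
};

/* Returns 1 when address p is a multiple of align, 0 otherwise. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Mirrors the shape of the faulting statement
 *     ev->ev_pri = base->nactivequeues / 2;
 * after asserting that the operand is naturally aligned. */
int default_priority(const struct demo_base *base)
{
    assert(is_aligned(&base->nactivequeues, _Alignof(int)));
    return base->nactivequeues / 2;
}
```

Calling `default_priority` on a `struct demo_base` whose `nactivequeues` is 4 returns 2. Printing `sizeof(int)` and `_Alignof(int)` on the FreeBSD/amd64 host would answer directly what a plain `int` is there.
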
>>
>> Should there be more information in dmesg or a system log file with the
>> trap code so you can identify whether it is an instruction fetch (VERY
>> unlikely), an operand fetch, or a store that caused the trap?
>>
>> Larry Baker
>> US Geological Survey
>> 650-329-5608
>> ba...@usgs.gov
>>
>>
>>
>> On 30 Aug 2017, at 3:17:05 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> I am testing the 2.1.2rc3 tarball on FreeBSD-11.1, configured with
>>    --prefix=[...] --enable-debug CC=clang CXX=clang++
>> --disable-mpi-fortran --with-hwloc=/usr/local
>>
>> The CC/CXX settings are to use the system default compilers (rather than
>> gcc/g++ in /usr/local/bin).
>> The --with-hwloc is to avoid issue #3992
>> <https://github.com/open-mpi/ompi/issues/3992> (though I have not
>> determined if that impacts this RC).
>>
>> When running ring_c I get a SEGV from orterun, for which a gdb backtrace
>> is given below.
>> The one surprising thing (highlighted) in the backtrace is that both the
>> RHS and LHS of the assignment appear to be valid memory locations.
>> So, if the backtrace is accurate then I am at a loss as to why a SEGV
>> occurs.
>>
>> -Paul
>>
>>
>> Program terminated with signal 11, Segmentation fault.
>> [...]
>> #0  opal_libevent2022_event_assign (ev=0x8065482c0, base=<value optimized out>, fd=<value optimized out>,
>>     events=2, callback=<value optimized out>, arg=0x0)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/opal/mca/event/libevent2022/libevent/event.c:1779
>> 1779                    ev->ev_pri = base->nactivequeues / 2;
>> (gdb) print base->nactivequeues
>> $3 = 106201992
>> (gdb) print ev->ev_pri
>> $4 = 0 '\0'
>> (gdb) where
>> #0  opal_libevent2022_event_assign (ev=0x8065482c0, base=<value optimized out>, fd=<value optimized out>,
>>     events=2, callback=<value optimized out>, arg=0x0)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/opal/mca/event/libevent2022/libevent/event.c:1779
>> #1  0x00000008062e1fd2 in pmix_start_progress_thread ()
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/opal/mca/pmix/pmix112/pmix/src/util/progress_threads.c:83
>> #2  0x00000008063047e4 in PMIx_server_init (module=0x806545be8, info=0x802e16a00, ninfo=2)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c:310
>> #3  0x00000008062c12f6 in pmix1_server_init (module=0x800b106a0, info=0x7fffffffe290)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/opal/mca/pmix/pmix112/pmix1_server_south.c:140
>> #4  0x0000000800889f43 in pmix_server_init ()
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/orte/orted/pmix/pmix_server.c:261
>> #5  0x0000000803e22d87 in rte_init ()
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/orte/mca/ess/hnp/ess_hnp_module.c:666
>> #6  0x000000080084a45e in orte_init (pargc=0x7fffffffe988, pargv=0x7fffffffe980, flags=4)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/orte/runtime/orte_init.c:226
>> #7  0x00000000004046a4 in orterun (argc=7, argv=0x7fffffffea18)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/orte/tools/orterun/orterun.c:831
>> #8  0x0000000000403bc2 in main (argc=7, argv=0x7fffffffea18)
>>     at /home/phargrov/OMPI/openmpi-2.1.2rc3-freebsd11-amd64/openmpi-2.1.2rc3/orte/tools/orterun/main.c:13
>>
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> devel@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/devel
>>
>
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department               Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
