Allan,

A likely possibility is that some kernel feature that Open MPI assumes is
present is missing from the new kernel.
That includes not only "kernel modules", as you mention, but also features
configured in (or out of) the base kernel.
For instance, some embedded kernels omit UNIX-domain sockets and SysV IPC
support.
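
As a rough illustration, a quick runtime probe for those two features might
look like the sketch below. It assumes Python 3 is available on the target,
which may not hold on an embedded system, and the function names are my own
(hypothetical). Note that a missing /proc/sysvipc only suggests that SysV IPC
is compiled out; it could also mean procfs is not mounted.

```python
import os
import socket


def unix_sockets_available():
    """Return True if the kernel lets us create an AF_UNIX socket pair."""
    if not hasattr(socket, "AF_UNIX"):
        return False
    try:
        a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    except OSError:
        # Typically "address family not supported" when the feature is absent.
        return False
    a.close()
    b.close()
    return True


def sysv_ipc_visible(proc_dir="/proc/sysvipc"):
    """Return True if the kernel exposes SysV IPC state under procfs."""
    return os.path.isdir(proc_dir)


if __name__ == "__main__":
    print("UNIX-domain sockets:", unix_sockets_available())
    print("SysV IPC (/proc/sysvipc):", sysv_ipc_visible())
```

Running it on both the 3.8.0 and 3.15.0 systems and comparing the two lines
of output would be one cheap first check.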

If you can send me (preferably off-list) the kernel config files for the
old and new kernels, I may be able to spot something.
If present, the file you are looking for is /boot/config-[VERSION].
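
If you would like to compare the two config files yourself first, something
like the following sketch could help. It assumes the usual kernel config
syntax ("CONFIG_FOO=y" and "# CONFIG_FOO is not set"); the helper names are
hypothetical. It lists options that were enabled in the old config but are
disabled or absent in the new one. CONFIG_UNIX and CONFIG_SYSVIPC are the
symbols for the two features I mentioned above.

```python
import re
import sys

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ option to its value; unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def lost_options(old, new):
    """Options enabled in old but disabled or missing in new, sorted."""
    return sorted(
        name for name, value in old.items()
        if value != "n" and new.get(name, "n") == "n"
    )


# Usage: python3 this_script.py OLD_CONFIG NEW_CONFIG
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f:
        old_cfg = parse_config(f.read())
    with open(sys.argv[2]) as f:
        new_cfg = parse_config(f.read())
    for name in lost_options(old_cfg, new_cfg):
        print(name)
```

The list may be long after a jump from 3.8 to 3.15, since options get renamed
or removed between releases, so not every entry indicates a problem.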

-Paul

On Tue, Nov 25, 2014 at 10:25 AM, Allan Wu <al...@cs.ucla.edu> wrote:

> I'm sorry, I forgot to change the subject when I replied to the digest
> issue. Please find my original email below.
>
> Regards,
> Di
>
> On Tue, Nov 25, 2014 at 10:19 AM, Allan Wu <al...@cs.ucla.edu> wrote:
>
>> Thanks Ralph for the reply. Sorry about the log file, I think I forgot to
>> put an extension to the file. Please find a new one attached with this
>> email.
>>
>> I'm sorry for not providing enough debugging information, but 'ompi_info'
>> and '--debug-devel' are the only ways I know of to collect information.
>> Is there anything else I can try in order to provide more info?
>>
>> When I execute 'mpirun --debug-devel -np 1 ./helloworld', all the output
>> is the logging information in my last email. It got stuck at
>> "[fpga1:00718] tmp: /tmp", and nothing from my helloworld program is
>> printed out to the screen. So I think it is mpirun failing to start my
>> executable, not failing to terminate.
>>
>> I was wondering if this has anything to do with my newer kernel version,
>> since it works well in the old case.
>>
>> Thanks,
>> --
>> Di Wu (Allan)
>> PhD student, VAST Laboratory <http://vast.cs.ucla.edu/>,
>> Department of Computer Science, UC Los Angeles
>> Email: al...@cs.ucla.edu
>>
>>
>> Date: Tue, 25 Nov 2014 07:29:51 -0800
>> From: Ralph Castain <r...@open-mpi.org>
>> To: Open MPI Developers <de...@open-mpi.org>
>> Subject: Re: [OMPI devel] OpenMPI v1.8 and v1.8.3 mpirun hangs at
>>         execution       on an embedded ARM Linux kernel version 3.15.0
>> Message-ID: <898cb117-f6a6-4569-89c3-49b75d65b...@open-mpi.org>
>> Content-Type: text/plain; charset="utf-8"
>>
>> I don't know what you put in that log file, but it was an executable and
>> I'm not feeling that trusting :-)
>>
>> I'm afraid there isn't enough debug output there to really tell anything.
>> From what little I can see, I'm guessing that the application ran fine and
>> you got the usual "hello" output and the helloworld process exited safely -
>> is that correct? And so it is solely mpirun that is failing to cleanly
>> terminate?
>>
>>
>> > On Nov 24, 2014, at 11:24 PM, Allan Wu <al...@cs.ucla.edu> wrote:
>> >
>> > Hello everyone,
>> >
>> > I have cross-compiled OpenMPI for an embedded ARM Linux. Everything
>> > works fine for my system based on Linux 3.8.0. I have previously
>> > submitted a post related to my compilation, which can be found here:
>> > <http://www.open-mpi.org/community/lists/devel/2014/04/14440.php>.
>> > When I recently upgraded my Linux kernel to 3.15.0, mpirun began to
>> > hang even on the helloworld program. The program consists only of
>> > simple API calls: MPI_Init, MPI_Comm_size, MPI_Comm_rank, MPI_Finalize.
>> > The problem occurs even with 'mpirun -np 1 ./helloworld', and below is
>> > the output with --debug-devel (before it got stuck):
>> > [fpga1:00716] sess_dir_finalize: job session dir not empty - leaving
>> > [fpga1:00716] procdir: /tmp/openmpi-sessions-root@fpga1_0/63813/0/0
>> > [fpga1:00716] jobdir: /tmp/openmpi-sessions-root@fpga1_0/63813/0
>> > [fpga1:00716] top: openmpi-sessions-root@fpga1_0
>> > [fpga1:00716] tmp: /tmp
>> > [fpga1:00718] procdir: /tmp/openmpi-sessions-root@fpga1_0/63813/1/0
>> > [fpga1:00718] jobdir: /tmp/openmpi-sessions-root@fpga1_0/63813/1
>> > [fpga1:00718] top: openmpi-sessions-root@fpga1_0
>> > [fpga1:00718] tmp: /tmp
>> >
>> > I suspect it may be due to an incompatible kernel version or some
>> > missing kernel modules. I also tried the latest version, 1.8.3, and had
>> > the same problem. Does anyone have any thoughts? I have attached the
>> > output of 'ompi_info --all' with this email.
>> >
>> > Please let me know if I need to provide more information. Thanks in
>> > advance!
>> >
>> > Regards,
>> > --
>> > Di Wu (Allan)
>> > PhD student, VAST Laboratory <http://vast.cs.ucla.edu/>,
>> > Department of Computer Science, UC Los Angeles
>> > Email: al...@cs.ucla.edu <mailto:al...@cs.ucla.edu>
>> > <log.tar.gz>
>> > _______________________________________________
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> > http://www.open-mpi.org/community/lists/devel/2014/11/16330.php
>>
>>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16341.php
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
