Re: [Xenomai-core] Bug in taskSuspend for user mode vxworks

Niklaus Giger Sat, 28 Oct 2006 11:31:32 -0700

On Saturday, 28 October 2006 20:00, Philippe Gerum wrote:
> On Sat, 2006-10-28 at 19:40 +0200, Niklaus Giger wrote:
> > > I recompiled the kernel with CONFIG_XENO_OPT_DEBUG disabled and the
> > > error went away.
> >
> > Actually I think the problem did not go away: I saw that the following
> > counter in /proc/xenomai/faults is incremented when I run the attached
> > simple program.
> > TRAP         CPU0
> >   0:            5    (Data or instruction access)
> > (Btw, which exception vector does this correspond to on a PPC405 system?)
>
> 0x400, e.g. page fault.
>
> > Here is the stack trace of the simplified example attached as seen by the
> > BDI with a hardware breakpoint at 0x300
> > #0  0x00000300 in ?? ()
> > No symbol table info available.
> > #1  0x100b8c48 in ?? ()
> > No symbol table info available.
> > #2  0x100b8c48 in ?? ()
> > No symbol table info available.
> > Previous frame inner to this frame (corrupt stack?)
> > (gdb)
> >
> > Setting a breakpoint at xnpod_fault_handler and a full backtrace gives me
> > (gdb) bt full
> > #0  xnpod_fault_handler (fltinfo=0xc1839e18)
> >     at /mnt/data.ng/hcu/kernel/ppc/linux-2.6.14/include/asm-generic/xenomai/system.h:200
> >         thread = (xnthread_t *) 0xc0214f40
> > #1  0xc0048e90 in xnpod_trap_fault (fltinfo=0xc1839e18)
> >     at /mnt/data.ng/hcu/kernel/ppc/linux-2.6.14/kernel/xenomai/nucleus/pod.c:2907
> > No locals.
> > #2  0xc00438f4 in xnarch_trap_fault (event=3246628376, domid=1480937039,
> > data=0xc1839f50) at include2/asm/xenomai/bits/init.h:46
> >         fltinfo = {exception = 0, regs = 0xc1839f50}
> > #3  0xc011ffb8 in exception_event (event=3221520296, ipd=0x58454e4f,
> > data=0xc1839f50)
> >     at /mnt/data.ng/hcu/kernel/ppc/linux-2.6.14/arch/ppc/xenomai/hal.c:385
> > No locals.
> > #4  0xc003fecc in __ipipe_dispatch_event (event=0, data=0xc1839f50)
> > at /mnt/data.ng/hcu/kernel/ppc/linux-2.6.14/kernel/ipipe/core.c:668
> >         start_domain = (struct ipipe_domain *) 0xc0214f40
> >         this_domain = (struct ipipe_domain *) 0xc0214f40
> >         evhand = (ipipe_event_handler_t) 0xc0048e90 <xnpod_trap_fault+100>
> >         pos = (struct list_head *) 0xc0214f40
> >         npos = (struct list_head *) 0xc01c6540
> >         flags = 167984
> >         propagate = 1
> > #5  0xc000b02c in do_page_fault (regs=0xc1839f50, address=266719224,
> > error_code=0)
> >     at /mnt/data.ng/hcu/kernel/ppc/linux-2.6.14/arch/ppc/mm/fault.c:119
> >         vma = (struct vm_area_struct *) 0xff86120
> >         mm = (struct mm_struct *) 0xc0200260
> >         info = {si_signo = 1, si_errno = -1071644672, si_code =
> > -1071579136, _sifields = {_pad = {0, -1048338784, -1048338608,
> >       -1071554848, -1070595192, -1048338768, -1073423884, -1048338608,
> > -1070595192, -1048338752, -1073422700, 0, 1, -1048338704,
> >       -1073418440, -1071710208, 14, 1, -1071733764, -1071710208,
> > -1071880896, 167984, 0, 16384, -1071880896, -1048338640, -1073479988,
> >       0, 0}, _kill = {_pid = 0, _uid = 3246628512}, _timer = {_tid = 0,
> > _overrun = -1048338784,
> >       _pad =
> > 0xc1839e94
> > "Á\203\237PÀ!^àÀ0\003\210Á\203\236°À\004ÙôÁ\203\237PÀ0\003\210Á\203\236ÀÀ
> >\004Þ\224", _sigval = {
> >         sival_int = -1048338608, sival_ptr = 0xc1839f50}, _sys_private
> > = -1071554848}, _rt = {_pid = 0, _uid = 3246628512, _sigval = {
> >         sival_int = -1048338608, sival_ptr = 0xc1839f50}}, _sigchld =
> > {_pid = 0, _uid = 3246628512, _status = -1048338608,
> >       _utime = -1071554848, _stime = -1070595192}, _sigfault = {_addr =
> > 0x0}, _sigpoll = {_band = 0, _fd = -1048338784}}}
> >         code = 196609
> >         is_write = 0
> >         __func__ = "do_page_fault"
> > #6  0xc0003258 in handle_page_fault ()
> > No locals.
> > (gdb)
> >
> > But I think it has something to do with my toolchain/compiler or my root
> > file system setup. I just found out that when I compile it with the same
> > gcc 3.4 compiler for my PowerBook and link it statically, the error goes
> > away.
>
> I tried to reproduce the issue on a lite5200 here, to no avail. I'm
> using gcc 4.0 from Denx's ELDK 4.0, but I've never had such problem when
> using gcc 3.3.3 from ELDK 3.1 either.
>
> I wonder if something fishy is not happening with the code gcc generates
> to emit syscalls in some place of the library support.
>
Could be. I only have a gdbserver running on the PPC405 system. I compiled
again without ld -static and then did the following:
> This GDB was configured as "powerpc-linux-gnu"...Using host libthread_db
> library "/lib/tls/libthread_db.so.1".
>
> (gdb) set solib-absolute-prefix /mnt/data.ng/hcu/rootfs
> (gdb) dir /mnt/data.ng/hcu/rootfs
> Source directories searched: /mnt/data.ng/hcu/rootfs:$cdir:$cwd
> (gdb) target remote 172.25.1.5:2345
> Remote debugging using 172.25.1.5:2345
> 0x3000fa18 in ?? ()
> (gdb) cont
> Continuing.
> [New thread 16384]
>
> Program received signal SIGILL, Illegal instruction.
> [Switching to thread 16384]
> 0x3000ca1c in _dl_name_match_p () from /mnt/data.ng/hcu/rootfs/lib/ld.so.1
> (gdb) bt fu
> #0  0x3000ca1c in _dl_name_match_p () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #1  0x30008878 in do_lookup_x () from /mnt/data.ng/hcu/rootfs/lib/ld.so.1
> No symbol table info available.
> #2  0x30008cd8 in _dl_lookup_symbol_x () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #3  0x3000b2c8 in fixup () from /mnt/data.ng/hcu/rootfs/lib/ld.so.1
> No symbol table info available.
> #4  0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #5  0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #6  0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #7  0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #8  0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #9  0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> #10 0x3000b528 in _dl_runtime_resolve () from
> /mnt/data.ng/hcu/rootfs/lib/ld.so.1 No symbol table info available.
> Previous frame inner to this frame (corrupt stack?)
> (gdb)
But I have no idea why I get this behaviour.
My toolchain is not an ELDK one, but was also built using crosstool.
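
To check whether the counter in /proc/xenomai/faults still moves with a
given build, something like the following could be used. It is only a
sketch: the helper names (parse_fault_line, total_faults) are made up for
illustration and are not part of Xenomai, and it assumes the single-CPU
layout quoted earlier in this mail (a "TRAP CPU0" header followed by lines
such as "0: 5 (Data or instruction access)").

```c
#include <stdio.h>

/* Hypothetical helper types, not part of Xenomai. */
struct fault_line {
    int trap;              /* trap number in the first column */
    unsigned long count;   /* CPU0 column only (single-CPU assumption) */
    char desc[64];         /* text between the parentheses, may be empty */
};

/* Parse one counter line, e.g.
 *   "  0:            5    (Data or instruction access)"
 * Returns 0 on success, -1 if the line is not a counter line
 * (for instance the "TRAP CPU0" header). */
int parse_fault_line(const char *line, struct fault_line *out)
{
    out->desc[0] = '\0';
    if (sscanf(line, " %d: %lu (%63[^)])",
               &out->trap, &out->count, out->desc) < 2)
        return -1;
    return 0;
}

/* Sum the CPU0 counters of all counter lines readable from fp. */
unsigned long total_faults(FILE *fp)
{
    char buf[256];
    struct fault_line fl;
    unsigned long total = 0;

    while (fgets(buf, sizeof buf, fp))
        if (parse_fault_line(buf, &fl) == 0)
            total += fl.count;
    return total;
}
```

Calling total_faults() on the opened file before and after running the test
program and comparing the two values would show whether a fault was counted.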


Setting a breakpoint at _dl_runtime_resolve gives:
> 0x3000fa18 in ?? ()
> (gdb) break _dl_runtime_resolve
> Function "_dl_runtime_resolve" not defined.
> Make breakpoint pending on future shared library load? (y or [n]) y
> Breakpoint 1 (_dl_runtime_resolve) pending.
> (gdb) cont
> Continuing.
> Breakpoint 2 at 0x3000b4f4
> Pending breakpoint "_dl_runtime_resolve" resolved
> [New thread 16384]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to thread 16384]
> 0x00000000 in ?? ()
> (gdb) info stack
> #0  0x00000000 in ?? ()
> Cannot access memory at address 0x4
> (gdb) bt full
> #0  0x00000000 in ?? ()
> No symbol table info available.
> Cannot access memory at address 0x4
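
A side note on reading the kernel backtrace quoted further up: gdb prints
some arguments in decimal, which hides what they are. Converted to hex,
domid=1480937039 in frame #2 is 0x58454e4f, the same value as ipd in frame
#3, and its four bytes spell "XENO"; event=3246628376 in frame #2 is
0xc1839e18, the same value as fltinfo in frames #0 and #1. A minimal sketch
of such a conversion follows; the function names are made up for
illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Render a 32-bit value as a four-character tag, most significant
 * byte first; non-printable bytes are shown as '.'. */
void tag_from_u32(uint32_t v, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char c = (v >> (24 - 8 * i)) & 0xff;
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[4] = '\0';
}

/* Print a value the way it is easier to compare against pointers
 * and magic numbers: decimal, hex, and as a character tag. */
void print_decoded(uint32_t v)
{
    char tag[5];

    tag_from_u32(v, tag);
    printf("%lu = 0x%08lx \"%s\"\n",
           (unsigned long)v, (unsigned long)v, tag);
}
```

In gdb itself, "print/x" on the decimal value gives the hex form directly.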

Best regards

-- 
Niklaus Giger

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
