Jan,

I left my application running over the weekend with the new RT-Socket-CAN driver, and the system is still running stably today. Thank you for pointing out that bug; you've made my December vacation!
Sincerely,
Joshua Karch

________________________________________
From: Jan Kiszka [[email protected]]
Sent: Friday, December 11, 2009 1:43 AM
To: Josh Karch
Cc: [email protected]
Subject: Re: Xenomai scheduling while atomic bug--debugging parameters

Josh Karch wrote:
> Jan,
>
> Our program ran without interruption last night with no bug, except it
> appears one still exists possibly regarding either ethernet (Intel e100),
> sshd, nfsd, or as a result of running top. It seems excessive network
> requests and calls to top and dmesg can trigger it.
>
> Today I will recompile the kernel with all debugging options enabled, make
> the system run until the bug occurs, send you that .config file as well as
> the current .config that still crashes on occasion (though the xenomai
> application runs rock solid in spite of the linux scheduling while atomic
> bugs). since the logs may be big, is there an ftp I can send them to or
> should I email those directly?
>

From the "fulldebugging" trace you sent to Gilles and me:

> [ 412.520716] | # func 0 ipipe_trace_panic_freeze+0x4 (ipipe_check_context+0x41)
> [ 412.520716] | # func 0 ipipe_check_context+0x5 (add_preempt_count+0x10)
> [ 412.520716] | # func 0 delay_tsc+0x9 (__const_udelay+0x1d)
> [ 412.520716] | # func -1 __const_udelay+0x3 (sja1000_irqhandler_common+0x2bd [pcan])

The pcan driver invokes a Linux service that is not supposed to be used
from RT context (udelay). Better switch to the CAN driver in upstream
Xenomai (it also comes with a standard API for better portability across
CAN vendors).

> [ 412.520716] | # func -9 xnarch_tsc_to_ns+0x5 (xnarch_get_cpu_time+0xf)
> [ 412.520716] | # func -9 xnarch_get_cpu_time+0x3 (sja1000_irqhandler_common+0x102 [pcan])
> [ 412.520716] | # func -10 xnintr_edge_shirq_handler+0x9 (__ipipe_dispatch_wired_nocheck+0x3e)
> [ 412.520716] | +func -11 __ipipe_dispatch_wired_nocheck+0x6 (__ipipe_dispatch_wired+0x4f)
> [ 412.520716] | +func -11 __ipipe_dispatch_wired+0x5 (__ipipe_handle_irq+0x93)
> [ 412.520716] | +func -11 native_apic_mem_write+0x3 (ack_apic_level+0x141)
> [ 412.520716] | +func -13 native_apic_mem_read+0x3 (ack_apic_level+0x3f)
> [ 412.520716] | +func -13 ack_apic_level+0x6 (__ipipe_ack_fasteoi_irq+0xe)
> [ 412.520716] | +func -13 __ipipe_ack_fasteoi_irq+0x3 (__ipipe_handle_irq+0x8a)
> [ 412.520716] | +func -14 __ipipe_handle_irq+0x9 (common_interrupt+0x36)
> [ 412.520716] | +begin 0xffffffb6 -14 common_interrupt+0x2f ()

Besides that, there might be a bug in our bug detection infrastructure
which generates a confusing output instead of a proper warning about the
core issue. Will have a look.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help
