On Thu, 20 Dec 2018 10:10:29 +0100,
Jan Kiszka <jan.kis...@siemens.com> wrote:

> On 20.12.18 09:28, Mauro Salvini via Xenomai wrote:
> > Hi all,
> > 
> > I'm testing Xenomai 3 on an Intel Braswell board (Atom x5-E8000).
> > I'm using the ipipe kernel at the latest commit from [1], branch
> > ipipe-4.9.y, as a 64-bit build on 64-bit Debian Stretch 9.6.
> > The Xenomai library is from [2], branch stable/v3.0.x, at commit
> > bc53d03f (I don't have the last two commits, but they seem
> > unrelated). I tried both 32-bit (mixed ABI) and 64-bit builds, with
> > the same result, described below.
> > 
> > I launch:
> > 
> > xeno-test -l "dohell -s xxx -p yyy -m xxx 90000" -T 90000
> > 
> > After a variable time (from minutes to hours) from the start of the
> > latency test, I get a few overruns, which I found are caused by a
> > kernel stack dump (the dmesg tail is attached to this mail). The
> > latency test doesn't stop, and after this stack dump it never
> > reports further overruns or latency peaks (it seems I need to
> > reboot to reproduce the stack dump).
> > 
> > I read on this mailing list that much work has been done on the FPU
> > part in recent patches; could it be related?
> > 
> > Glad to provide more information if you need it.
> 
> Thanks for reporting. Maybe we need your config later on, but I'm
> first of all looking at the code, see below.
> 
> In general, it's better to run the kernel with frame-pointers enabled
> to get more reliable backtraces, at least when an error occurs.
> 
> > 
> > Thanks in advance, regards.
> > 
> > Mauro
> > 
> > [1] https://gitlab.denx.de/Xenomai/ipipe
> > [2] https://gitlab.denx.de/Xenomai/xenomai
> > -------------- next part --------------
> > [  233.205940] /home/build-user/develop/linux-ipipe-4.9.y/arch/x86/xenomai/include/asm/xenomai/fptest.h:43: Warning: Linux is compiled to use FPU in kernel-space. For this reason, switchtest can not test using FPU in Linux kernel-space.
> > [  295.660454] ------------[ cut here ]------------
> > [  295.660461] WARNING: CPU: 0 PID: 139 at /home/build-user/develop/linux-ipipe-4.9.y/arch/x86/include/asm/fpu/internal.h:502 xnarch_leave_root+0x1a4/0x1b0
> 
> Henning, the kernel checks fpu->fpregs_active here and finds it off
> while Xenomai looks at fpu.fpstate_active - intentionally or by
> accident?

That was by accident. In eager mode they should mostly be in sync,
except when playing the nasty tricks ipipe has to play. I guess there
is a path where we tried to "unown" a task that we had already unowned
shortly before. I did not try to find that path; I just sent a patch.
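
To make the suspected failure mode concrete, here is a toy Python model
of it. This is only a sketch: it is neither the kernel code nor the
patch, and every name in it (ToyFpu, deactivate_regs, the two unown_*
helpers) is hypothetical. It assumes a "deactivate" step that warns when
the registers are already inactive, and compares guarding that step with
fpstate_active against guarding it with fpregs_active when a task is
"unowned" twice in a row.

```python
from dataclasses import dataclass


@dataclass
class ToyFpu:
    """Toy stand-in for the two flags discussed above (not the kernel struct)."""
    fpstate_active: bool = True   # the task has an FPU state at all
    fpregs_active: bool = True    # that state is currently live in the registers
    warnings: int = 0             # how many times the sanity check fired


def deactivate_regs(fpu: ToyFpu) -> None:
    """Hypothetical deactivate step: warn if the registers are already off."""
    if not fpu.fpregs_active:
        fpu.warnings += 1
    fpu.fpregs_active = False


def unown_checking_fpstate(fpu: ToyFpu) -> None:
    """Guard on fpstate_active: stays true after the first unown."""
    if fpu.fpstate_active:
        deactivate_regs(fpu)


def unown_checking_fpregs(fpu: ToyFpu) -> None:
    """Guard on fpregs_active: the second unown becomes a no-op."""
    if fpu.fpregs_active:
        deactivate_regs(fpu)


def warnings_after_double_unown(unown) -> int:
    """Unown the same toy task twice and report how many warnings fired."""
    fpu = ToyFpu()
    unown(fpu)
    unown(fpu)
    return fpu.warnings
```

In this model, the fpstate_active guard lets the second unown reach the
sanity check, while the fpregs_active guard skips it. Whether the real
path looks like this is exactly what I did not verify.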

Mauro, since you can reproduce the problem you can probably tell if the
patch fixes it.

Henning

> Jan
> 
> > [  295.660465] Modules linked in: binfmt_misc msr iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw snd_pcm gf128mul glue_helper i915 ablk_helper snd_timer cryptd snd soundcore pcspkr drm_kms_helper drm evdev lpc_ich mfd_core fb_sys_fops syscopyarea sysfillrect sysimgblt sg shpchp video button ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache sd_mod hid_generic usbhid hid crc32c_intel ahci igb i2c_i801 libahci i2c_smbus i2c_algo_bit xhci_pci dca ptp pps_core libata xhci_hcd sdhci_pci sdhci usbcore scsi_mod mmc_core fjes [last unloaded: rtnet]
> > [  295.660660] CPU: 0 PID: 139 Comm: jbd2/sda1-8 Tainted: G        W       4.9.135+ #1
> > [  295.660663] Hardware name: Default string Default string/Q7-BW, BIOS V1.20#KW050220A 03/16/2018
> > [  295.660666] I-pipe domain: Xenomai
> > [  295.660668]  0000000000000000 ffffffff823402c2 0000000000000000 0000000000000000
> > [  295.660680]  ffffffff829cd400 ffffffff8206cb68 ffff985636908040 ffff985636504080
> > [  295.660706]  0000000000099f50 0000000000000000 ffffbc1580563040 ffff98567ac9b940
> > [  295.660718] Call Trace:
> > [  295.660721]  [<ffffffff823402c2>] ? dump_stack+0xb5/0xd3
> > [  295.660724]  [<ffffffff8206cb68>] ? __warn+0xc8/0xf0
> > [  295.660727]  [<ffffffff82068734>] ? xnarch_leave_root+0x1a4/0x1b0
> > [  295.660729]  [<ffffffff82173b56>] ? ___xnsched_run+0x3f6/0x4b0
> > [  295.660732]  [<ffffffff8219fbf6>] ? timerfd_handler+0x36/0x50
> > [  295.660735]  [<ffffffff8217b075>] ? xntimerq_insert+0x5/0xa0
> > [  295.660738]  [<ffffffff8216a7c9>] ? xnclock_tick+0x1a9/0x2c0
> > [  295.660740]  [<ffffffff8216cd8a>] ? xnintr_core_clock_handler+0x2fa/0x310
> > [  295.660743]  [<ffffffff82119b1a>] ? dispatch_irq_head+0x8a/0x120
> > [  295.660746]  [<ffffffff8203ad75>] ? __ipipe_handle_irq+0x85/0x1b0
> > [  295.660749]  [<ffffffff82605ba9>] ? apic_timer_interrupt+0x89/0xb0
> > [  295.660752]  [<ffffffffc01502a6>] ? continue_block+0x22/0x54 [crc32c_intel]
> > [  295.660754]  [<ffffffffc0150233>] ? crc32c_pcl_intel_update+0x53/0x60 [crc32c_intel]
> > [  295.660757]  [<ffffffffc031fa90>] ? jbd2_journal_commit_transaction+0x9e0/0x1850 [jbd2]
> > [  295.660760]  [<ffffffff82603d60>] ? __switch_to_asm+0x40/0x70
> > [  295.660763]  [<ffffffff820d19af>] ? try_to_del_timer_sync+0x4f/0x80
> > [  295.660766]  [<ffffffffc0324c92>] ? kjournald2+0xe2/0x290 [jbd2]
> > [  295.660768]  [<ffffffff820b0320>] ? wake_up_atomic_t+0x30/0x30
> > [  295.660771]  [<ffffffffc0324bb0>] ? commit_timeout+0x10/0x10 [jbd2]
> > [  295.660774]  [<ffffffff8208df65>] ? kthread+0xf5/0x110
> > [  295.660776]  [<ffffffff82603d60>] ? __switch_to_asm+0x40/0x70
> > [  295.660779]  [<ffffffff8208de70>] ? kthread_park+0x60/0x60
> > [  295.660782]  [<ffffffff82603de5>] ? ret_from_fork+0x55/0x60
> > [  295.660785] ---[ end trace b1bfa97fc17a203c ]---
> > [  705.768058] perf: interrupt took too long (2543 > 2500), lowering kernel.perf_event_max_sample_rate to 78500
> > [  865.772646] perf: interrupt took too long (3207 > 3178), lowering kernel.perf_event_max_sample_rate to 62250

