On Thu, 20 Dec 2018 10:10:29 +0100, Jan Kiszka <jan.kis...@siemens.com> wrote:
> On 20.12.18 09:28, Mauro Salvini via Xenomai wrote:
> > Hi all,
> >
> > I'm testing Xenomai 3 on an Intel Braswell board (Atom x5-E8000).
> > I'm using ipipe kernel at last commit from [1], branch ipipe-4.9.y,
> > 64bit build on a Debian Stretch 9.6 64bit.
> > Xenomai library is from [2], branch stable/v3.0.x on commit
> > bc53d03f (I haven't two last commits but seems not related). I
> > tried both 32bit (mixed ABI) and 64bit builds with same following
> > result.
> >
> > I launch:
> >
> > xeno-test -l "dohell -s xxx -p yyy -m xxx 90000" -T 90000
> >
> > After a variable time (from minutes to hours) from latency test
> > start I get a few overruns that I discovered are generated by a
> > kernel stack dump (attached to mail the dmesg tail). Latency test
> > doesn't stop, and after this stackdump never reports other overruns
> > or latency peaks (seems I need to reboot to reproduce stack).
> >
> > I read in this mailing list that on last patches much work has done
> > on FPU part, should it be related?
> >
> > Glad to give other infos if you need.
>
> Thanks for reporting. Maybe we need your config later on, but I'm
> first of all looking at the code, see below.
>
> In general, it's better to run the kernel with frame-pointers enabled
> to get more reliable backtraces, at least when an error occurs.
>
> >
> >
> > Thanks in advance, regards.
> >
> > Mauro
> >
> > [1] https://gitlab.denx.de/Xenomai/ipipe
> > [2] https://gitlab.denx.de/Xenomai/xenomai
> > -------------- next part --------------
> > [ 233.205940] /home/build-user/develop/linux-ipipe-4.9.y/arch/x86/xenomai/include/asm/xenomai/fptest.h:43: Warning: Linux is compiled to use FPU in kernel-space. For this reason, switchtest can not test using FPU in Linux kernel-space.
> > [ 295.660454] ------------[ cut here ]------------
> > [ 295.660461] WARNING: CPU: 0 PID: 139 at /home/build-user/develop/linux-ipipe-4.9.y/arch/x86/include/asm/fpu/internal.h:502 xnarch_leave_root+0x1a4/0x1b0
>
> Henning, the kernel checks fpu->fpregs_active here and finds it off
> while Xenomai looks at fpu.fpstate_active - intentionally or by
> accident?

That was by accident. In eager mode the two flags should mostly be in
sync, except when playing the nasty tricks ipipe has to play. I guess
there is a path where we tried to "unown" a task that we had already
unowned shortly before. I did not try to find that path, I just sent a
patch.

Mauro, since you can reproduce the problem, you can probably tell
whether the patch fixes it.

Henning

> Jan
>
> > [ 295.660465] Modules linked in: binfmt_misc msr iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw snd_pcm gf128mul glue_helper i915 ablk_helper snd_timer cryptd snd soundcore pcspkr drm_kms_helper drm evdev lpc_ich mfd_core fb_sys_fops syscopyarea sysfillrect sysimgblt sg shpchp video button ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache sd_mod hid_generic usbhid hid crc32c_intel ahci igb i2c_i801 libahci i2c_smbus i2c_algo_bit xhci_pci dca ptp pps_core libata xhci_hcd sdhci_pci sdhci usbcore scsi_mod mmc_core fjes [last unloaded: rtnet]
> > [ 295.660660] CPU: 0 PID: 139 Comm: jbd2/sda1-8 Tainted: G W 4.9.135+ #1
> > [ 295.660663] Hardware name: Default string Default string/Q7-BW, BIOS V1.20#KW050220A 03/16/2018
> > [ 295.660666] I-pipe domain: Xenomai
> > [ 295.660668] 0000000000000000 ffffffff823402c2 0000000000000000 0000000000000000
> > [ 295.660680] ffffffff829cd400 ffffffff8206cb68 ffff985636908040 ffff985636504080
> > [ 295.660706] 0000000000099f50 0000000000000000 ffffbc1580563040 ffff98567ac9b940
> > [ 295.660718] Call Trace:
> > [ 295.660721] [<ffffffff823402c2>] ? dump_stack+0xb5/0xd3
> > [ 295.660724] [<ffffffff8206cb68>] ? __warn+0xc8/0xf0
> > [ 295.660727] [<ffffffff82068734>] ? xnarch_leave_root+0x1a4/0x1b0
> > [ 295.660729] [<ffffffff82173b56>] ? ___xnsched_run+0x3f6/0x4b0
> > [ 295.660732] [<ffffffff8219fbf6>] ? timerfd_handler+0x36/0x50
> > [ 295.660735] [<ffffffff8217b075>] ? xntimerq_insert+0x5/0xa0
> > [ 295.660738] [<ffffffff8216a7c9>] ? xnclock_tick+0x1a9/0x2c0
> > [ 295.660740] [<ffffffff8216cd8a>] ? xnintr_core_clock_handler+0x2fa/0x310
> > [ 295.660743] [<ffffffff82119b1a>] ? dispatch_irq_head+0x8a/0x120
> > [ 295.660746] [<ffffffff8203ad75>] ? __ipipe_handle_irq+0x85/0x1b0
> > [ 295.660749] [<ffffffff82605ba9>] ? apic_timer_interrupt+0x89/0xb0
> > [ 295.660752] [<ffffffffc01502a6>] ? continue_block+0x22/0x54 [crc32c_intel]
> > [ 295.660754] [<ffffffffc0150233>] ? crc32c_pcl_intel_update+0x53/0x60 [crc32c_intel]
> > [ 295.660757] [<ffffffffc031fa90>] ? jbd2_journal_commit_transaction+0x9e0/0x1850 [jbd2]
> > [ 295.660760] [<ffffffff82603d60>] ? __switch_to_asm+0x40/0x70
> > [ 295.660763] [<ffffffff820d19af>] ? try_to_del_timer_sync+0x4f/0x80
> > [ 295.660766] [<ffffffffc0324c92>] ? kjournald2+0xe2/0x290 [jbd2]
> > [ 295.660768] [<ffffffff820b0320>] ? wake_up_atomic_t+0x30/0x30
> > [ 295.660771] [<ffffffffc0324bb0>] ? commit_timeout+0x10/0x10 [jbd2]
> > [ 295.660774] [<ffffffff8208df65>] ? kthread+0xf5/0x110
> > [ 295.660776] [<ffffffff82603d60>] ? __switch_to_asm+0x40/0x70
> > [ 295.660779] [<ffffffff8208de70>] ? kthread_park+0x60/0x60
> > [ 295.660782] [<ffffffff82603de5>] ? ret_from_fork+0x55/0x60
> > [ 295.660785] ---[ end trace b1bfa97fc17a203c ]---
> > [ 705.768058] perf: interrupt took too long (2543 > 2500), lowering kernel.perf_event_max_sample_rate to 78500
> > [ 865.772646] perf: interrupt took too long (3207 > 3178), lowering kernel.perf_event_max_sample_rate to 62250