On Fri, 11 Jan 2019 14:47:13 +0100,
Mauro Salvini <mauro.salv...@smigroup.net> wrote:

> On Fri, 2019-01-11 at 10:40 +0100, Henning Schild wrote:
> > On Fri, 11 Jan 2019 09:57:50 +0100,
> > Mauro Salvini via Xenomai <xenomai@xenomai.org> wrote:
> >   
> > > Hi all,
> > > 
> > > I'm testing the same hardware as in [1], with kernel 4.9.146 from
> > > ipipe-4.9.y with [2] applied, compiled with ARCH=i386, and
> > > Xenomai 3.0.7.
> > 
> > To be honest, i386 is not really tested anymore; in fact, in 4.14 it
> > is not even supported at the moment. If you can, you should go for
> > x86_64.
> >   
> 
> Hi Henning,
> 
> Thank you. I'm trying the i386 version because of legacy 32-bit code
> that uses rtnet (which cannot be used with a mixed ABI).

OK, that may be something you want to fix in the future.

> > > Launching
> > > 
> > > xeno-test -l "dohell -s xxx -p yyy -m xxx 90000" -T 90000
> > > 
> > > I got this dump in dmesg, sometimes just after latency starts,
> > > sometimes after a few seconds (the side effect is an increase in
> > > the max latency value):
> > > 
> > > [  167.914184] ------------[ cut here ]------------
> > > [  167.914208] WARNING: CPU: 0 PID: 606 at /home/build-ws/develop/linux-4.9.146/arch/x86/include/asm/fpu/internal.h:511 fpu__restore+0x1eb/0x2b0
> > > [  167.914216] Modules linked in: intel_rapl intel_powerclamp iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crc32_pclmul aesni_intel xts aes_i586 lrw gf128mul ablk_helper cryptd snd_pcm intel_cstate snd_timer evdev snd soundcore i915 pcspkr drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp video lpc_ich mfd_core button ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid mmc_block crc32c_intel i2c_i801 i2c_smbus igb i2c_algo_bit xhci_pci ptp pps_core xhci_hcd sdhci_pci sdhci usbcore mmc_core fjes [last unloaded: rtnet]
> > > [  167.914768] CPU: 0 PID: 606 Comm: dohell Not tainted 4.9.146+ #1
> > > [  167.914772] Hardware name: Default string Default string/Q7-BW, BIOS V1.20#KW050220A 03/16/2018
> > > [  167.914775] I-pipe domain: Linux
> > > [  167.914778]  f42e5e44 daeffa2d 00000000 db335030 dac1ff3b f42e5e74 dac59dea db34504c
> > > [  167.914800]  00000000 0000025e db335030 000001ff dac1ff3b 000001ff f4291bc0 00000246
> > > [  167.914822]  f4291c00 f42e5e88 dac59efb 00000009 00000000 00000000 f42e5ea4 dac1ff3b
> > > [  167.914843] Call Trace:
> > > [  167.914846]  [<daeffa2d>] dump_stack+0x9f/0xc2
> > > [  167.914849]  [<dac1ff3b>] ? fpu__restore+0x1eb/0x2b0
> > > [  167.914865]  [<dac59dea>] __warn+0xea/0x110
> > > [  167.914868]  [<dac1ff3b>] ? fpu__restore+0x1eb/0x2b0
> > > [  167.914871]  [<dac59efb>] warn_slowpath_null+0x2b/0x30
> > > [  167.914874]  [<dac1ff3b>] fpu__restore+0x1eb/0x2b0
> > > [  167.914877]  [<dac21b0a>] __fpu__restore_sig+0x2ba/0x680
> > > [  167.914879]  [<dac22141>] fpu__restore_sig+0x31/0x50
> > > [  167.914882]  [<dac13f52>] restore_sigcontext.isra.9+0xf2/0x110
> > > [  167.914885]  [<dac149b9>] sys_sigreturn+0xa9/0xc0
> > > [  167.914888]  [<dac019f5>] do_int80_syscall_32+0x85/0x190
> > > [  167.914891]  [<db1a56d5>] entry_INT80_32+0x31/0x31
> > > [  167.914898] ---[ end trace e57344f10f300a76 ]---
> > 
> > I am not sure which path leads you there, but it could well be a
> > state that was caused by the ipipe patch.
> > 
> > Could you try this:
> > 
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -426,6 +426,10 @@ void fpu__restore(struct fpu *fpu)
> >         /* Avoid __kernel_fpu_begin() right after fpregs_activate() */
> >         kernel_fpu_disable();
> >         trace_x86_fpu_before_restore(fpu);
> > +       if (fpregs_activate(fpu)) {  
> 
> This statement does not compile because fpregs_activate() returns
> void; did you perhaps mean "if (fpregs_active(fpu))"?
> Given that fpregs_active() takes no arguments, I tried this:
> 
> if (fpu->fpregs_active)

I did not test what I wrote there, and your fix is what I meant.

> and the warning is no longer raised (nor is the warning added by
> this patch).

In that case a similar patch should probably be included upstream. I
will prepare a patch for that.

Henning
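
For readers following the exchange above, here is a minimal userspace
sketch of the guard being discussed. It is not kernel code: struct
fpu_model, warn_count and all model_* functions are hypothetical
stand-ins, and it assumes that the warning in the trace comes from an
"already active" check when the FPU registers are activated. It also
leaves out the fpu_fpregs_owner_ctx comparison from the proposed patch.
It only illustrates why the guard has to test the fpu->fpregs_active
field rather than the return value of a void function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct fpu; only the flag is modeled. */
struct fpu_model {
    bool fpregs_active;
};

/* Counts simulated warnings, standing in for WARN_ON_FPU() hits in dmesg. */
static int warn_count;

/* Assumption: activation warns when the registers are already marked active. */
static void model_fpregs_activate(struct fpu_model *fpu)
{
    if (fpu->fpregs_active)
        warn_count++;
    fpu->fpregs_active = true;
}

/* Assumption: deactivation warns when the registers are not marked active. */
static void model_fpregs_deactivate(struct fpu_model *fpu)
{
    if (!fpu->fpregs_active)
        warn_count++;
    fpu->fpregs_active = false;
}

/* Restore path without the guard: activates unconditionally. */
static void model_restore_unguarded(struct fpu_model *fpu)
{
    model_fpregs_activate(fpu);
}

/* Restore path with the guard: tests the flag field, then deactivates first. */
static void model_restore_guarded(struct fpu_model *fpu)
{
    if (fpu->fpregs_active)
        model_fpregs_deactivate(fpu);
    model_fpregs_activate(fpu);
}

/* Returns how many warnings one restore produces on an already-active FPU. */
int count_restore_warnings(bool guarded)
{
    struct fpu_model fpu = { .fpregs_active = true };

    warn_count = 0;
    if (guarded)
        model_restore_guarded(&fpu);
    else
        model_restore_unguarded(&fpu);
    assert(fpu.fpregs_active); /* either path ends with the registers active */
    return warn_count;
}
```

Under these assumptions, count_restore_warnings(false) returns 1 and
count_restore_warnings(true) returns 0, which matches the observation
that the warning disappears once the guard is in place.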

> > +               WARN_ON_FPU(fpu != this_cpu_read_stable(fpu_fpregs_owner_ctx));
> > +               fpregs_deactivate(fpu);
> > +       }
> >         fpregs_activate(fpu);
> >         copy_kernel_to_fpregs(&fpu->state);
> >         trace_x86_fpu_after_restore(fpu);
> > 
> > This would not be a proper fix, especially if you end up seeing that
> > warning ...
> > 
> > Henning
> >   
> > > I found the discussion at [3] and applied the patch at [4] that
> > > came out of it, but the result is the same.
> > > 
> > > Starting xeno-test without the -l argument gives the same result.
> > > When dohell is launched alone (with the same arguments as when
> > > launched from xeno-test -l), the dump does not appear.
> > > 
> > > Could this be a Xenomai-related problem (though the stack does not
> > > seem to involve Xenomai), or is it better to post it on LKML?
> > > 
> > > Thanks in advance, regards
> > > 
> > > Mauro
> > > 
> > > [1] https://xenomai.org/pipermail/xenomai/2018-December/040142.html
> > > [2] https://xenomai.org/pipermail/xenomai/2019-January/040172.html
> > > [3] https://lore.kernel.org/lkml/20181120102635.ddv3fvavxajjlfqk@linutronix.de/
> > > [4] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.9.y&id=d3741e0390287056011950493a641524f49fa05a
> > > 
> > >   
> > 
> >   

