* Jason A. Donenfeld <ja...@zx2c4.com> wrote:

> Intel 3820QM, but inside VMWare Workstation 12.
> 
> > Third, could you post such a problematic stack trace?
> 
> Sure: https://paste.kde.org/pfhhdchs9/7mmtvb

So it's:

    [  187.194226] CPU: 0 PID: 1165 Comm: iperf3 Tainted: G           O    4.2.3-1-ARCH #1
    [  187.194229] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
    [  187.194231]  0000000000000000 0000000062ca03ad ffff88003b82f0d0 ffffffff8156c0ca
    [  187.194233]  ffff88003bfa0dc0 0000000000000090 ffff88003b82f260 ffffffffa03fc27e
    [  187.194234]  0000000000000010 ffff88003be05300 0000000000000000 ffff88003b82f3e0
    [  187.194235] Call Trace:
    [  187.194244]  [<ffffffff8156c0ca>] dump_stack+0x4c/0x6e
    [  187.194248]  [<ffffffffa03fc27e>] chacha20_avx+0x23e/0x250 [wireguard]
    [  187.194253]  [<ffffffff8101de03>] ? nommu_map_page+0x43/0x80
    [  187.194257]  [<ffffffffa0344161>] ? e1000_xmit_frame+0xdf1/0x11c0 [e1000]
    [  187.194259]  [<ffffffffa03fbe6e>] ? poly1305_update_asm+0x11e/0x1b0 [wireguard]
    [  187.194260]  [<ffffffffa03fcd0d>] chacha20_finish+0x3d/0x60 [wireguard]
    [  187.194262]  [<ffffffffa03f8eae>] chacha20poly1305_encrypt_finish+0x2e/0xf0 [wireguard]
    [  187.194263]  [<ffffffffa03efa32>] noise_message_encrypt+0x162/0x180 [wireguard]
    [  187.194269]  [<ffffffff811b60e5>] ? __kmalloc_node_track_caller+0x35/0x2e0
    [  187.194274]  [<ffffffff81460af7>] ? __alloc_skb+0x87/0x210
    [  187.194275]  [<ffffffff81460a11>] ? __kmalloc_reserve.isra.5+0x31/0x90
    [  187.194276]  [<ffffffff81460acb>] ? __alloc_skb+0x5b/0x210
    [  187.194278]  [<ffffffff81460b0b>] ? __alloc_skb+0x9b/0x210
    [  187.194279]  [<ffffffffa03f2a65>] noise_message_create_data+0x55/0x80 [wireguard]
    [  187.194280]  [<ffffffffa03e9708>] packet_send_queue+0x1f8/0x4d0 [wireguard]
    [  187.194285]  [<ffffffff810a8219>] ? dequeue_entity+0x149/0x690
    [  187.194287]  [<ffffffff810a9051>] ? put_prev_entity+0x31/0x420
    [  187.194289]  [<ffffffff810146ec>] ? __switch_to+0x25c/0x4a0
    [  187.194291]  [<ffffffff81099ce2>] ? finish_task_switch+0x62/0x1b0
    [  187.194292]  [<ffffffff8156d500>] ? __schedule+0x340/0xa00
    [  187.194296]  [<ffffffff810ddf19>] ? hrtimer_try_to_cancel+0x29/0x120
    [  187.194298]  [<ffffffff810b4464>] ? add_wait_queue+0x44/0x50
    [  187.194299]  [<ffffffff811b60e5>] ? __kmalloc_node_track_caller+0x35/0x2e0
    [  187.194302]  [<ffffffff811e33ce>] ? __pollwait+0x7e/0xe0
    [  187.194303]  [<ffffffff81460af7>] ? __alloc_skb+0x87/0x210
    [  187.194304]  [<ffffffff81460a11>] ? __kmalloc_reserve.isra.5+0x31/0x90
    [  187.194305]  [<ffffffffa03e861f>] xmit+0x8f/0xe0 [wireguard]
    [  187.194308]  [<ffffffff8147588f>] dev_hard_start_xmit+0x24f/0x3f0
    [  187.194309]  [<ffffffff814753be>] ? validate_xmit_skb.isra.34.part.35+0x1e/0x2a0
    [  187.194310]  [<ffffffff81476042>] __dev_queue_xmit+0x4d2/0x540
    [  187.194311]  [<ffffffff814760c3>] dev_queue_xmit_sk+0x13/0x20
    [  187.194313]  [<ffffffff8147d9c2>] neigh_direct_output+0x12/0x20
    [  187.194315]  [<ffffffff814b1756>] ip_finish_output2+0x1b6/0x3c0
    [  187.194317]  [<ffffffff814b309e>] ? __ip_append_data.isra.3+0x6ae/0xac0
    [  187.194317]  [<ffffffff814b376c>] ip_finish_output+0x13c/0x1d0
    [  187.194318]  [<ffffffff814b3b75>] ip_output+0x75/0xe0
    [  187.194319]  [<ffffffff814b468d>] ? ip_make_skb+0x10d/0x130
    [  187.194320]  [<ffffffff814b1381>] ip_local_out_sk+0x31/0x40
    [  187.194321]  [<ffffffff814b44ea>] ip_send_skb+0x1a/0x50
    [  187.194323]  [<ffffffff814dc221>] udp_send_skb+0x151/0x280
    [  187.194325]  [<ffffffff814dd7f5>] udp_sendmsg+0x305/0x9d0
    [  187.194327]  [<ffffffff8157115e>] ? _raw_spin_unlock_bh+0xe/0x10
    [  187.194328]  [<ffffffff814e8daf>] inet_sendmsg+0x7f/0xb0
    [  187.194329]  [<ffffffff81457227>] sock_sendmsg+0x17/0x30
    [  187.194330]  [<ffffffff814572c5>] sock_write_iter+0x85/0xf0
    [  187.194332]  [<ffffffff811d028c>] __vfs_write+0xcc/0x100
    [  187.194333]  [<ffffffff811d0b04>] vfs_write+0xa4/0x1a0
    [  187.194334]  [<ffffffff811d1815>] SyS_write+0x55/0xc0
    [  187.194335]  [<ffffffff8157162e>] entry_SYSCALL_64_fastpath+0x12/0x71

so this does not seem to be a very complex stack trace: we are trying to use the 
FPU from a regular process, from a regular system call path. No interrupts, no 
kernel threads, no complications.
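
(For reference, and only as a minimal sketch of the generic x86 pattern rather 
than the actual WireGuard code: in-kernel SIMD use from process context is 
expected to be bracketed by kernel_fpu_begin()/kernel_fpu_end(), with 
irq_fpu_usable() guarding the contexts where the FPU cannot be used. The 
function names and the fallback below are made up for illustration.)

    #include <linux/types.h>
    #include <asm/fpu/api.h>   /* kernel_fpu_begin(), kernel_fpu_end(), irq_fpu_usable() */

    static void do_encrypt_simd(u8 *dst, const u8 *src, size_t len)
    {
            if (!irq_fpu_usable()) {
                    /* FPU not usable in this context: fall back to plain C code */
                    do_encrypt_generic(dst, src, len);      /* hypothetical fallback */
                    return;
            }

            kernel_fpu_begin();     /* disables preemption, saves the task's FPU state */
            do_encrypt_avx(dst, src, len);          /* may clobber XMM/YMM registers */
            kernel_fpu_end();
    }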

We possibly context switched recently:

    [  187.194285]  [<ffffffff810a8219>] ? dequeue_entity+0x149/0x690
    [  187.194287]  [<ffffffff810a9051>] ? put_prev_entity+0x31/0x420
    [  187.194289]  [<ffffffff810146ec>] ? __switch_to+0x25c/0x4a0
    [  187.194291]  [<ffffffff81099ce2>] ? finish_task_switch+0x62/0x1b0
    [  187.194292]  [<ffffffff8156d500>] ? __schedule+0x340/0xa00

but that's all that I can see in the trace.

So as a first step I'd try Linus's very latest kernel, to make sure it's not a bug 
that got fixed meanwhile. If it still occurs, try to report it to the VMware 
virtualization folks. Maybe it's some host kernel activity that changes the state 
of the FPU. I don't know ...

Thanks,

        Ingo