----- On Jun 24, 2015, at 1:00 PM, Linus Torvalds torva...@linux-foundation.org wrote:
> On Wed, Jun 24, 2015 at 9:14 AM, Mathieu Desnoyers
> <mathieu.desnoy...@efficios.com> wrote:
>> When trying to change memory allocation from kmalloc to vmalloc to
>> handle memory fragmentation for reallocation of a growing string within
>> a kernel module, our testsuite started to trigger kernel OOPS. It
>> triggers when the string is copied into a ring buffer using memcpy,
>> piece-wise.
>
> I hate your patch, just because it doesn't make sense. The "when
> non-aligned, don't do movsq" might make sense for performance, but it
> does *not* make sense for correctness.
>
> Why would "rep movsq" trigger the oops, but memcpy_orig not? I think
> the fundamental bug is something else.
>
> I don't see *what* the bug is, though.
>
> Very odd.
>
> x86 people, can you see anything there? It does look like
> vmalloc_fault() *should* have triggered, so why didn't it? The address
> is definitely in the VMALLOC_START/END range, and the error code is
> 0000, so how come didn't vmalloc_fault() handle this?
>
>> This points to arch/x86/lib/memcpy_64.S:__memcpy rep movsq instruction.
>> This could be reproduced on my Lenovo x240 laptop (i7 CPU), and within a
>> virtual machine running on a Intel(R) Xeon(R) CPU E5-2630 v3 host.
>> Interestingly, with the VM having the rep_good flag (but not erms), the issue
>> triggers. However, if the VM has both rep_good and erms flags, the issue does
>> not trigger.
>
> With ERMS, I think we end up using just "rep movsb" instead. But there
> should be absolutely no difference in fault patterns.
>
> I see the QEMU part, is this just regular kvm?

Yes, this is just regular kvm.

> Could you add a debug
> printk to the vmalloc_fault() caller and then reproduce the oops? It
> shouldn't trigger enough to be a horrible logging problem.

Here is the output. I added the printk just after the initial range check
within vmalloc_fault. What is weird is that the fault happens on an aligned
source address. It's the destination which is unaligned.
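
As a sanity check on the alignment remark, here is a small Python sketch (not
kernel code) that redoes the arithmetic visible in the Code: bytes of the oops
in this message: "mov %rdx,%rcx; shr $3,%rcx; and $7,%edx; rep movsq". The
register values are copied from the dump. Treating 0xfdb (RBX/R14/R15) as the
length passed to memcpy is an assumption; the dump suggests it but does not
prove it.

```python
# Register values copied from the oops dump in this message.
LENGTH = 0xFDB              # RBX/R14/R15; ASSUMED to be the memcpy length
RCX = 0x1FB                 # remaining qword count at the faulting rep movsq
RDX = 0x3                   # tail byte count kept for the later rep movsb
RSI = 0xFFFFC9000746E000    # source pointer, identical to CR2
RDI = 0xFFFF8802355B3025    # destination pointer
RAX = 0xFFFF8802355B3025    # return value, i.e. the original destination


def split_copy(length):
    """Mirror 'shr $3,%rcx' and 'and $7,%edx': (qwords, tail bytes)."""
    return length >> 3, length & 7


qwords, tail = split_copy(LENGTH)

# If RCX still holds the full qword count and RDI has not moved away from
# the original destination, no qword has been copied yet.
first_iteration = (RCX == qwords) and (RDI == RAX)

src_page_offset = RSI & 0xFFF   # offset of the source within its 4 KiB page
dst_misalignment = RDI & 0x7    # destination offset from an 8-byte boundary

print(f"qwords={qwords:#x} tail={tail} first_iteration={first_iteration}")
print(f"src page offset={src_page_offset:#x} dst misalignment={dst_misalignment}")
```

Under that assumption, RCX and RDX match the split of 0xfdb exactly and RDI
still equals RAX, so the fault would be on the very first 8-byte load from a
page-aligned source, with the destination 5 bytes past an 8-byte boundary.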
Let me know if you need more info.

[ 53.084521] DEBUG: vmalloc_fault at address 0xffffc9000746e000
[ 53.085460] BUG: unable to handle kernel paging request at ffffc9000746e000
[ 53.085460] IP:
[ 53.090220] [<ffffffff81316f12>] __memcpy+0x12/0x20
[ 53.090220] PGD 236c92067 PUD 236c93067 PMD 22e840067 PTE 0
[ 53.090220] Oops: 0000 [#1] SMP
[ 53.090220] Modules linked in: lttng_probe_workqueue(O) lttng_probe_vmscan(O) lttng_probe_udp(O) lttng_probe_timer(O) lttng_probe_sunrpc(O) lttng_probe_statedump(O) lttng_probe_sock(O) lttng_probe_skb(O) lttng_probe_signal(O) lttng_probe_scsi(O) lttng_probe_sched(O) lttng_probe_regmap(O) lttng_probe_rcu(O) lttng_probe_random(O) lttng_probe_power(O) lttng_probe_net(O) lttng_probe_napi(O) lttng_probe_module(O) lttng_probe_kmem(O) lttng_probe_jbd2(O) lttng_probe_irq(O) lttng_probe_ext4(O) lttng_probe_compaction(O) lttng_probe_block(O) lttng_types(O) lttng_ring_buffer_metadata_mmap_client(O) lttng_ring_buffer_client_mmap_overwrite(O) lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) lttng_tracer(O) lttng_statedump(O) lttng_kprobes(O) lttng_lib_ring_buffer(O) lttng_kretprobes(O) virtio_blk virtio_net virtio_pci virtio_ring virtio [last unloaded: lttng_statedump]
[ 53.090220] CPU: 4 PID: 3532 Comm: lttng-consumerd Tainted: G O 4.1.0+ #10
[ 53.090220] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 53.090220] task: ffff880235355aa0 ti: ffff8800bb6d0000 task.ti: ffff8800bb6d0000
[ 53.090220] RIP: 0010:[<ffffffff81316f12>] [<ffffffff81316f12>] __memcpy+0x12/0x20
[ 53.090220] RSP: 0018:ffff8800bb6d3da0 EFLAGS: 00010206
[ 53.090220] RAX: ffff8802355b3025 RBX: 0000000000000fdb RCX: 00000000000001fb
[ 53.090220] RDX: 0000000000000003 RSI: ffffc9000746e000 RDI: ffff8802355b3025
[ 53.090220] RBP: ffff8800bb6d3db8 R08: ffff880231cd7200 R09: 0000000000000025
[ 53.090220] R10: 0000000000000000 R11: 0000000000001000 R12: ffff8800bb6d3dc8
[ 53.090220] R13: ffff88022e437400 R14: 0000000000000fdb R15: 0000000000000fdb
[ 53.090220] FS: 00007f24d8bbc700(0000) GS:ffff880237280000(0000) knlGS:0000000000000000
[ 53.090220] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 53.090220] CR2: ffffc9000746e000 CR3: 00000000ba6d6000 CR4: 00000000000006e0
[ 53.090220] Stack:
[ 53.090220] ffffffffa05ac797 ffff8802334fb300 ffff8802334fb350 ffff8800bb6d3e48
[ 53.090220] ffffffffa0473060 ffff88022e437400 0000000000000000 0000000000000fdb
[ 53.090220] ffffffff00000001 ffff880231cd7200 0000000000000fdb 0000000000000025
[ 53.090220] Call Trace:
[ 53.090220] [<ffffffffa05ac797>] ? lttng_event_write+0x87/0xb0 [lttng_ring_buffer_metadata_client]
[ 53.090220] [<ffffffffa0473060>] lttng_metadata_output_channel+0xd0/0x120 [lttng_tracer]
[ 53.090220] [<ffffffffa04755f9>] lttng_metadata_ring_buffer_ioctl+0x79/0xd0 [lttng_tracer]
[ 53.090220] [<ffffffff8117ba10>] do_vfs_ioctl+0x2e0/0x4e0
[ 53.090220] [<ffffffff812b35c7>] ? file_has_perm+0x87/0xa0
[ 53.090220] [<ffffffff8117bc91>] SyS_ioctl+0x81/0xa0
[ 53.090220] [<ffffffff818bbd37>] tracesys_phase2+0x84/0x89
[ 53.090220] Code: 5b 5d c3 66 0f 1f 44 00 00 e8 6b fc ff ff eb e1 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3
[ 53.090220] RIP [<ffffffff81316f12>] __memcpy+0x12/0x20
[ 53.090220] RSP <ffff8800bb6d3da0>
[ 53.090220] CR2: ffffc9000746e000
[ 53.090220] ---[ end trace 850d7bf1b41647ee ]---

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com