Re: ip_rcv_finish() NULL pointer and possibly related Oopses

2015-09-04 Thread Shaun Crampton
On 03/09/2015 13:10, "Eric Dumazet" wrote: >On Thu, 2015-09-03 at 10:09 +, Shaun Crampton wrote: >> >... >> >> Is there anything I can do on a running system to help figure this out? >> >> Some sort of kernel equivalent to pmap to find

Re: ip_rcv_finish() NULL pointer and possibly related Oopses

2015-09-03 Thread Shaun Crampton
>... >> Is there anything I can do on a running system to help figure this out? >> Some sort of kernel equivalent to pmap to find out what module or device >> owns that chunk of memory? > >Hmm, perhaps /proc/kallsyms could point to something. 0xa0087d81 >and 0xa008772b could be fro
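
The /proc/kallsyms suggestion above can be sketched as a small lookup: find the nearest symbol at or below a faulting address and report which module, if any, it belongs to. This is only an illustrative sketch; the function names `parse_kallsyms` and `find_owner` are made up for this example, and it assumes the usual "address type name [module]" line layout. Note that /proc/kallsyms typically shows all-zero addresses to unprivileged readers, and an address that belongs to no symbol at all will still map to the closest lower symbol, so a large offset should be treated as "probably not this symbol".

```python
import bisect


def parse_kallsyms(lines):
    """Parse kallsyms-style lines into a list of (addr, name, module) sorted by addr.

    Expected layout per line (an assumption of this sketch):
        <hex address> <type> <symbol name> [<module>]
    Built-in kernel symbols have no trailing [module] field.
    """
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        addr = int(parts[0], 16)
        name = parts[2]
        module = parts[3].strip("[]") if len(parts) > 3 else None
        symbols.append((addr, name, module))
    symbols.sort(key=lambda s: s[0])
    return symbols


def find_owner(symbols, addr):
    """Return (name, module, offset) for the nearest symbol at or below addr.

    Returns None if addr is lower than every known symbol.
    """
    addrs = [s[0] for s in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    sym_addr, name, module = symbols[i]
    return name, module, addr - sym_addr
```

On a live system this would be fed with the contents of /proc/kallsyms (read as root), for example `find_owner(parse_kallsyms(open("/proc/kallsyms")), addr)` with `addr` set to the full 64-bit address from the Oops.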

Re: ip_rcv_finish() NULL pointer and possibly related Oopses

2015-09-03 Thread Shaun Crampton
>Looking at this one, I am still puzzled where 0xa008772b and >0xa008772b come from ... some driver, bridge ...? Is there anything I can do on a running system to help figure this out? Some sort of kernel equivalent to pmap to find out what module or device owns that chunk of me

Re: ip_rcv_finish() NULL pointer and possibly related Oopses

2015-09-02 Thread Shaun Crampton
> Make sure you backported commit > 10e2eb878f3ca07ac2f05fa5ca5e6c4c9174a27a > ("udp: fix dst races with multicast early demux") I just tried the latest CoreOS alpha, which had that patch. Sadly, I saw just as many reboots. Here's a sample of the different types of Oopses I see (I've put the re

Re: ip_rcv_finish() NULL pointer and possibly related Oopses

2015-08-26 Thread Shaun Crampton
>And the kernel thinks it's >outside of any normal text section, so it does not try to dump any >code from before the instruction pointer. > > 0: 48 8b 88 40 03 00 00 mov 0x340(%rax),%rcx > 7: e8 1d dd dd ff callq 0xff29 > c: 5d pop %rbp

ip_rcv_finish() NULL pointer and possibly related Oopses

2015-08-26 Thread Shaun Crampton
Please CC me in any responses, thanks. Testing our app at scale on Google's GCE, running ~1000 CoreOS hosts: over approximately 1 hour, I see about 1 in 50 hosts hit one of the Oopses below and then reboot (I'm not sure if the different oopses are related to each other). The app is Project Calico

Re: veths often slow to come up

2015-08-06 Thread Shaun Crampton
> Take a look at linkwatch_urgent_event at net/core/link_watch.c, and all of > link_watch.c in general. That's where the 1s delay comes from. Thanks for the diagnosis, I'll take a look. -Shaun

veths often slow to come up

2015-08-04 Thread Shaun Crampton
Please CC me on any responses, thanks. Setting both ends of a veth to be oper UP completes very quickly but I find that pings only start flowing over the veth after about a second. This seems to correlate with the NO-CARRIER flag being set or the interface being in "state UNKNOWN" or "state DOWN"
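
One way to measure the delay described above is to poll the interface's operational state through sysfs until it reports "up". The following is a rough sketch rather than anything from the thread: the helper names `read_operstate` and `wait_until_up` are invented for illustration, and the `sysfs_root` parameter exists only so the code can be pointed at a fake directory for testing.

```python
import os
import time


def read_operstate(iface, sysfs_root="/sys/class/net"):
    """Return the interface's operstate string, e.g. 'up', 'down' or 'unknown'."""
    with open(os.path.join(sysfs_root, iface, "operstate")) as f:
        return f.read().strip()


def wait_until_up(iface, timeout=5.0, interval=0.05, sysfs_root="/sys/class/net"):
    """Poll operstate until it reads 'up' or the timeout expires.

    Returns True if the interface reached 'up' in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if read_operstate(iface, sysfs_root) == "up":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Timing a call such as `wait_until_up("veth0")` (the interface name here is hypothetical) right after bringing both ends up would show how long the link stays in the "unknown" or "down" state before traffic can flow.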