On 05/10/16 19:53, Jonathan Morton wrote:
I wonder what it was that caused yesterday's issues?  I really
must try again when I've more time to get proper access.

I’m having trouble reproducing it here.  I know one of my boxes
froze the very first time I loaded it, but it’s been running fine
ever since.  Another machine is currently refusing to insert the
module, claiming a wrong exec format.  It’s all a bit bizarre.

I do have a few more avenues of enquiry to explore, though.

Aha - I managed to capture a kernel panic, which appears to trace to
the lookup in the accelerator array.  It’s a read-only access, so it
only panics if it hits unpaged memory, rather than corrupting
anything.  Of course, if it reads outside the array, it’ll increment
the deficit by a random value, but that usually won’t prevent traffic
flowing.

The lookup is indexed on the host refcnt, which I’m using as the
count of flows attached to that host.  It seems likely that it isn’t
being maintained correctly in all cases, so it can wrap around past
zero very soon after being attached, without needing much traffic.

I’ll try to fix that, and put a sanity check in as well to be
certain.
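
For illustration only, here is a minimal sketch of the kind of guard described above. It is not the actual Cake code: `host_entry`, `accel_table`, `MAX_FLOWS`, the table contents and the helper names are all hypothetical. The sketch assumes an unsigned 16-bit refcnt, so an unbalanced decrement from zero would wrap to 65535 and index far outside the table. The detach helper refuses to go below zero, and the lookup clamps the index as a second line of defence.

```c
#include <stdint.h>

#define MAX_FLOWS 1024u  /* hypothetical table size, not taken from the thread */

/* Hypothetical per-host state: refcnt doubles as the count of attached flows. */
struct host_entry {
    uint16_t refcnt;
};

/* Hypothetical lookup table with one entry per possible flow count. */
static uint16_t accel_table[MAX_FLOWS + 1];

/* Fill the table with a stand-in value (a scaled reciprocal of the count). */
static void accel_table_init(void)
{
    accel_table[0] = UINT16_MAX;
    for (uint32_t i = 1; i <= MAX_FLOWS; i++)
        accel_table[i] = (uint16_t)(UINT16_MAX / i);
}

/* Attach a flow: saturate at the table size instead of counting past it. */
static void host_attach(struct host_entry *h)
{
    if (h->refcnt < MAX_FLOWS)
        h->refcnt++;
}

/* Detach a flow: never decrement below zero, so an unbalanced
 * detach cannot wrap the unsigned counter around to 65535. */
static void host_detach(struct host_entry *h)
{
    if (h->refcnt > 0)
        h->refcnt--;
}

/* Sanity-checked lookup: clamp the index so that even a corrupted
 * refcnt can only ever read inside the table. */
static uint16_t accel_lookup(const struct host_entry *h)
{
    uint32_t idx = h->refcnt;

    if (idx > MAX_FLOWS)
        idx = MAX_FLOWS;
    return accel_table[idx];
}
```

In a real qdisc the saturating paths would probably also log a warning, since reaching them means the attach/detach accounting is unbalanced somewhere.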

It's now OK... so far :-)
