>> I wonder what it was that caused yesterday's issues?  I really must try 
>> again when I've more time to get proper access.
> 
> I’m having trouble reproducing it here.  I know one of my boxes froze the 
> very first time I loaded it, but it’s been running fine ever since.  Another 
> machine is currently refusing to insert the module, claiming a wrong exec 
> format.  It’s all a bit bizarre.
> 
> I do have a few more avenues of enquiry to explore, though.

Aha - I managed to capture a kernel panic, which appears to trace to the lookup 
in the accelerator array.  It’s a read-only access, so rather than corrupting 
anything, it only panics if it hits unmapped memory.  Of course, if it reads 
outside the array without faulting, it’ll increment the deficit by an arbitrary 
value, but that usually won’t prevent traffic flowing.

The lookup is indexed on the host refcnt, which I’m using as the count of flows 
attached to that host.  It seems likely that it isn’t being maintained 
correctly in all cases, so it can underflow past zero (wrapping around to a 
very large index) very soon after being attached, without needing much traffic.

I’ll try to fix that, and put a sanity check in as well to be certain.
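
As a rough sketch of the failure mode and the sort of sanity check meant here 
(the names, the table size and its contents are all made up for illustration 
and are not taken from the Cake source; a 16-bit unsigned count is assumed):

```c
#include <stdint.h>

#define TABLE_SIZE 8 /* hypothetical table size */

/* Hypothetical read-only accelerator table, indexed by flows per host.
 * The values are placeholders, not the real ones. */
static const uint16_t accel_table[TABLE_SIZE] = {
    65535, 65535, 32768, 21845, 16384, 13107, 10923, 9362
};

/* Hypothetical per-host state: refcnt doubles as the attached-flow count. */
struct host {
    uint16_t refcnt;
};

/* Unbalanced release: with an unsigned count, 0 - 1 wraps to 65535. */
static void host_put_unchecked(struct host *h)
{
    h->refcnt--;
}

/* Guarded release: refuse to go below zero. */
static void host_put_checked(struct host *h)
{
    if (h->refcnt > 0)
        h->refcnt--;
}

/* Sanity-checked lookup: clamp the index so a bad count can never
 * read outside the table. */
static uint16_t accel_lookup(const struct host *h)
{
    uint16_t idx = h->refcnt;

    if (idx >= TABLE_SIZE)
        idx = TABLE_SIZE - 1;
    return accel_table[idx];
}
```

With the unchecked release, a single extra decrement on an idle host turns the 
count into 65535, which is far outside an 8-entry table; the clamp keeps the 
read in bounds even if the accounting bug is still there.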

 - Jonathan Morton

_______________________________________________
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake
