Hi Alex,

I have been testing your 6lowpan_pending branch with the fragmentation
updates (commit 5e47d2f516a6cf1acf9927125392484835d93552; I don't think
this has the spin_lock patch below) using 2 devices pinging each other,
one with a 1000-byte ping and the other with a 777-byte ping. The good
news is that it worked fine for about 3/4 of an hour. The bad news is
that it then gets to a point where something happens and only sporadic
pings get through on one device; on the other I've seen just one ping
get through in about 10 minutes. Could this be a symptom of what you're
describing here, or would it completely freeze? If I stop the pings and
start them both again with default packet sizes, it works fine again.
As soon as I try a larger ping payload, it becomes sporadic again.
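
For what it's worth, here is how I understand the deadlock David
describes below, as a rough userspace analogy with pthreads rather than
the real kernel code. Only flist_lock is a name from the patch; the
function names are made up for illustration, pthread_join() stands in
for del_timer_sync(), and a mutex stands in for the spinlock. The
handler uses trylock so the sketch reports the conflict instead of
actually hanging.

```c
#include <pthread.h>

/* Userspace stand-in for the list lock named in the patch. */
static pthread_mutex_t flist_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for the timer handler with the patch applied:
 * it needs flist_lock before it can unlink its entry.
 * trylock is used so the demo reports a conflict rather than blocking.
 */
static void *timer_handler(void *arg)
{
    int *would_block = arg;

    if (pthread_mutex_trylock(&flist_lock) == 0) {
        *would_block = 0;               /* lock was free, handler can finish */
        pthread_mutex_unlock(&flist_lock);
    } else {
        *would_block = 1;               /* a real lock call would wait here */
    }
    return NULL;
}

/*
 * The problematic ordering: hold flist_lock, then wait for the handler
 * to return (pthread_join plays the role of del_timer_sync).
 * Returns 1 if the handler found the lock taken, 0 if not, -1 on error.
 */
int timer_blocked_while_lock_held(void)
{
    pthread_t timer;
    int would_block = 0;

    pthread_mutex_lock(&flist_lock);
    if (pthread_create(&timer, NULL, timer_handler, &would_block) != 0) {
        pthread_mutex_unlock(&flist_lock);
        return -1;
    }
    pthread_join(timer, NULL);          /* waiter holds what the handler needs */
    pthread_mutex_unlock(&flist_lock);
    return would_block;
}

/*
 * One common way to avoid the cycle: finish the locked work and drop
 * the lock before waiting for the handler.
 */
int timer_blocked_after_unlock(void)
{
    pthread_t timer;
    int would_block = 0;

    pthread_mutex_lock(&flist_lock);
    /* ... list manipulation would go here ... */
    pthread_mutex_unlock(&flist_lock);

    if (pthread_create(&timer, NULL, timer_handler, &would_block) != 0)
        return -1;
    pthread_join(timer, NULL);
    return would_block;
}
```

In the first function the handler always finds the lock taken, so with
a blocking lock each side would wait on the other forever. I am not
claiming the second ordering is the right fix for the 6lowpan code; it
only shows the general pattern. Build with -pthread.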

I'll try and get some sort of tracing working to see if this shows 
anything.  If you have a branch with the inet_frag api changes I don't 
mind trying them out too.

Cheers,
Martin.


On 05/02/14 07:47, Alexander Aring wrote:
> Hi David,
>
> thanks for your reply.
>
> On Tue, Feb 04, 2014 at 08:32:03PM -0800, David Miller wrote:
>> From: Alexander Aring<alex.ar...@gmail.com>
>> Date: Tue,  4 Feb 2014 11:57:53 +0100
>>
>>> @@ -197,7 +197,9 @@ static void lowpan_fragment_timer_expired(unsigned long entry_addr)
>>>   
>>>     pr_debug("timer expired for frame with tag %d\n", entry->tag);
>>>   
>>> +   spin_lock_bh(&flist_lock);
>>>     list_del(&entry->list);
>>> +   spin_unlock_bh(&flist_lock);
>>>     dev_kfree_skb(entry->skb);
>>>     kfree(entry);
>>>   }
>> This will deadlock, because the other code path holding flist_lock calls
>> del_timer_sync() to wait for this timer to return.
>>
> OK. I noticed this some months ago and talked with Werner Almesberger
> about it. He mentioned del_timer_sync too, and other issues, but I
> didn't understand that I was opening up a new deadlock case. Now I
> have learned something new, thanks. :-)
>
>> The synchronization in this code is really a big mess.
> That is one thing I also noticed, so I decided to write a new
> implementation based on net/ipv6/reassembly.c, which also uses the
> inet_frag api.
>
> I will bring these patches mainline and I hope that will remove most
> of the race conditions. But then it's only solved in the net-next branch.
>
> - Alex
>
> ------------------------------------------------------------------------------
> Managing the Performance of Cloud-Based Applications
> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
> Read the Whitepaper.
> http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
> _______________________________________________
> Linux-zigbee-devel mailing list
> Linux-zigbee-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-zigbee-devel


