OK, I have completed the refactor of edge to remove the second thread. It no
longer segfaults under load. I have been hammering on it and cannot make it
crash with any of the tests that previously crashed it within seconds.

I can only test on Linux.

Don - please let me know if this works for you.


P.S. I used select() instead of poll() because select() was already there and
working, and I'm not sure that poll() works on all the target platforms.
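
For anyone following along, here is a minimal sketch of the general shape of a
single-threaded select() loop over two descriptors. It is not the actual edge
code: poll_once, fd_handler and count_bytes are hypothetical names made up for
illustration, and the real handlers would do the N2N packet processing. The
point is only that both handlers run one after the other in the same thread,
so they cannot trample on shared state.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical handler type: called with the descriptor that is readable. */
typedef void (*fd_handler)(int fd, void *ctx);

/* Illustrative handler (an assumption, not from edge): reads once from fd
 * and adds the number of bytes read to the size_t that ctx points to. */
static void count_bytes(int fd, void *ctx)
{
    char buf[2048];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        *(size_t *)ctx += (size_t)n;
}

/*
 * Wait up to timeout_ms for either descriptor to become readable, then run
 * the matching handlers sequentially in the calling thread.
 * Returns the number of handlers run (0 on timeout) or -1 if select() fails
 * (including EINTR, in which case the caller may simply call again).
 */
static int poll_once(int udp_fd, int tap_fd, int timeout_ms,
                     fd_handler on_udp, fd_handler on_tap, void *ctx)
{
    fd_set readfds;
    struct timeval tv;
    int max_fd = (udp_fd > tap_fd) ? udp_fd : tap_fd;
    int handled = 0;

    /* select() modifies both the set and the timeout, so rebuild them
     * on every call. */
    FD_ZERO(&readfds);
    FD_SET(udp_fd, &readfds);
    FD_SET(tap_fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(max_fd + 1, &readfds, NULL, NULL, &tv) < 0)
        return -1;

    /* Only one handler is ever running at a time. */
    if (FD_ISSET(udp_fd, &readfds)) {
        on_udp(udp_fd, ctx);
        ++handled;
    }
    if (FD_ISSET(tap_fd, &readfds)) {
        on_tap(tap_fd, ctx);
        ++handled;
    }
    return handled;
}
```

A main loop would just call poll_once() repeatedly with the UDP socket and the
TAP descriptor. Note that select() is limited to descriptors below FD_SETSIZE,
which is not a concern with only two of them.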


--- Richard Andrews <[EMAIL PROTECTED]> wrote:
> I can reproduce the crash reliably with flood ping but only with large
> packets (1200 bytes). So I ran edge in gdb. The stack trace on segv is
> garbage so it tells me nothing. However, the last couple of entries in the
> edge trace (in verbose mode) show that it is processing both a packet from
> UDP and a packet from the TAP interface simultaneously.
> 
> 06/May/2008 22:21:48 [     edge.c: 961] ### Rx N2N Msg network -> tun
> 06/May/2008 22:21:48 [     edge.c: 974] Received packet from 61.xx.xx.xx:54652
> 06/May/2008 22:21:48 [     edge.c: 978] Received message [msg_type=MSG_TYPE_PACKET] from peer [dst mac=6A:43:A8:61:56:BC]
> 06/May/2008 22:21:48 [     edge.c: 693] ### Rx L2 Msg (1242) tun -> network
> 06/May/2008 22:21:48 [     edge.c:1032] ### Tx L2 Msg -> tun
> 06/May/2008 22:21:48 [      n2n.c:  51] Unmarshalled  hdr: public_ip=(2)0.0.0.0:0, private_ip=(2)0.0.0.0:38665
> 06/May/2008 22:21:48 [      n2n.c: 576] 1335 bytes decompressed into 1242
> 06/May/2008 22:21:48 [      n
> 
> Note the two different Rx lines without intervening Tx.
> 
> Uh-oh, I smell a race condition! I notice there are two threads in edge, one
> for reading the UDP socket and one for reading the tuntap interface. I think
> that the two threads are trampling on each other in the case where one has
> to be scheduled off CPU to allow the other to run, or both run concurrently
> (i.e. under load).
> 
> IMO due to the amount of shared state edge should be refactored to run in a
> single thread using poll() as the scheduler. This is a small job - it'll take
> me an hour maybe. I should get to it within 24hrs. This will also serve to
> remove the select() call. Sorry if you're a select() fan, I'm not.

 
> --- Don Bindner <[EMAIL PROTECTED]> wrote:
> > Interestingly, the edge process seems to crash at either end 
> > now; I've had one at the "server" end and one at the "client".
> 




_______________________________________________
Ntop-dev mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-dev
