On Tuesday 03 March 2009 22:11:11 Steve wrote:
> The real problem is we need a way to set a timeout on the connection
> attempt in the background without making it blocking.

Yes, this is what I've done :-) OK, not without sucking CPU, but I did say
"The cost at present is higher CPU usage than would be ideal". What I didn't
make clear is that I can also see how we resolve that point.

Regarding your concerns around WSAEINVAL, you may wish to be aware that what
I'm doing mirrors what twisted does inside
twisted.internet.tcp.BaseClient.doConnect. Furthermore, there's an
explanatory comment there:

    # on Windows EINVAL means sometimes that we should keep trying:
    # http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/connect_2.asp

If you follow this back, you find the rationale it refers to:

    Until the connection attempt completes on a nonblocking socket, all
    subsequent calls to connect on the same socket will fail with the error
    code WSAEALREADY, and WSAEISCONN when the connection completes
    successfully. Due to ambiguities in version 1.1 of the Windows Sockets
    specification, error codes returned from connect while a connection is
    already pending may vary among implementations. As a result, it is not
    recommended that applications use multiple calls to connect to detect
    connection completion. If they do, they must be prepared to handle
    WSAEINVAL and WSAEWOULDBLOCK error values the same way that they handle
    WSAEALREADY, to assure robust operation.

This tracks with what I've seen in the past (this code was added a long
while back - 6 line fix - IIRC by a colleague, 4 years ago? :-).

The underlying issue you keep banging up against is that sockets in blocking
mode don't really provide for timeouts. For example, the code in the Python
socket module that you're seeking to use looks like this:

    http://pastebin.com/m1e2171fd

In order for that code to work, the underlying C code does this:

    if (defaulttimeout >= 0.0)
        internal_setblocking(s, 0);

or, for sock_settimeout, this line:

    internal_setblocking(s, timeout < 0.0);

The upshot being this: if you set a timeout, internally Python changes the
socket to non-blocking.
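
If you want to see that for yourself, here's a minimal sketch that inspects the OS-level O_NONBLOCK flag before and after calling settimeout(). It assumes a Unix-like platform, since it leans on the fcntl module, which isn't available on Windows. The names os_level_nonblocking and demo are made up for illustration.

```python
import fcntl
import os
import socket


def os_level_nonblocking(sock):
    """Return True if the underlying descriptor has O_NONBLOCK set (Unix only)."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


def demo():
    """Report the O_NONBLOCK state before, during and after a timeout is set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        before = os_level_nonblocking(sock)        # fresh socket, no timeout
        sock.settimeout(5.0)                       # ask Python for a timeout
        with_timeout = os_level_nonblocking(sock)
        sock.settimeout(None)                      # back to plain blocking mode
        after_reset = os_level_nonblocking(sock)
        return before, with_timeout, after_reset
    finally:
        sock.close()


if __name__ == "__main__":
    print(demo())
```

With no default timeout configured, demo() should return (False, True, False): the descriptor only goes non-blocking while a timeout is in force.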
Once the socket is non-blocking, any operation that can fail - for example
connection - results (on Windows) in entering a select call to check when
the operation has completed - cf:

    res = select(s->sock_fd+1, NULL, &fds, &fds_exc, &tv);

That &tv is the actual timeout you set originally, and the call then blocks
on select.

Now the way we'd do this properly in Kamaelia is for the TCPClient to get
access to the Selector service, and to ask the Selector to let the TCPClient
know when the socket is ready to read. BUT in the error case - which is the
one we're dealing with - the Selector would never re-awaken the TCPClient,
precisely because it's an error case. So waiting for the Selector would
itself need a timeout mechanism, ie fundamentally we would still need the
timeout mechanism I added.

That's the sort of thing that twisted implements with deferreds, and in
threaded components you can implement both aspects with self.pause(). The
nice thing about doing that with self.pause() is that it would then give
self.pause() the same sort of semantics for generator components as it has
for threaded components.

But beyond that, when we fail we also need to tell the selector that we no
longer want it to notify us, because we're done using it. That's easy enough
to do btw, but all of this is a significant complexity jump over what we
currently have, which is why I've initially gone for the simpler timeout
mechanism (ie to get something working correctly before optimising it -
which is what this would be).

This is admittedly a little complex, and not something most users ever have
to deal with, and it's fundamentally an optimisation really. However it
makes sense to address that now since there's a real case that needs it
fixed :-)

Oh, as for this which has come in as I was typing:

> I don't understand why the TCPClient code only
> sees an infinite set of WSAEINVALIDs.

Sheer speed.
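
To make that concrete, here's a rough sketch of a brute-force polling connect written as a generator. It is not the actual TCPClient code: polling_connect, the PENDING set and the result tuples are made up for illustration. Treating WSAEINVAL as "still pending" is an assumption based on the MSDN note quoted earlier.

```python
import errno
import socket
import time

# Error codes taken to mean "connect still pending, ask again".
# Assumption: WSAEINVAL is treated as pending, following the MSDN note.
# Where errno has no WSAEINVAL (non-Windows) this falls back to EINVAL,
# so a genuine EINVAL would simply spin until the deadline.
PENDING = {
    errno.EINPROGRESS,
    errno.EALREADY,
    errno.EWOULDBLOCK,
    getattr(errno, "WSAEINVAL", errno.EINVAL),
}


def polling_connect(address, timeout):
    """Yield None while the connect is pending, then one (status, polls) tuple.

    status is "connected", "timeout", or the errno of a hard failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    deadline = time.monotonic() + timeout
    polls = 0
    try:
        while True:
            err = sock.connect_ex(address)   # never blocks on this socket
            polls += 1
            if err in (0, errno.EISCONN):
                yield ("connected", polls)
                return
            if err not in PENDING:
                yield (err, polls)           # hard failure, e.g. ECONNREFUSED
                return
            if time.monotonic() >= deadline:
                yield ("timeout", polls)
                return
            yield None                       # give the scheduler a turn, then retry
    finally:
        sock.close()                         # demo only: real code would keep it


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    for result in polling_connect(server.getsockname(), timeout=2.0):
        if result is not None:
            print(result)
    server.close()
```

Against a host that never answers, nothing slows this loop down between retries, so it spins flat out until the deadline. That is where both the CPU cost and the flood of identical error codes come from.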
As fast as you can type, you're unlikely to be able to repeat anything
manually faster than once every 8 to 20 ms (best case). That's at least 2-3
orders of magnitude slower than Python will retry.

Please bear in mind that the code you're critiquing does in fact critique
itself: "Rather brute force".

My personal view on dealing with this is this:

   * Get the code working as it should - ie allow timeouts to occur.
   * Get the code working such that it doesn't suck your CPU as it's doing
     (ie performance improvement)
   * Then refactor that "not sucking CPU" code into nicer, more readable,
     more reusable, modular code.

I think we've got to stage 1, and are now on stage 2. :)


Michael
--
http://yeoldeclue.com/blog
http://twitter.com/kamaelian
http://www.kamaelia.org/Home