Re: Building on MSVC [was Re: [Libevent-users] libevent win32 issues]
Nick Mathewson wrote:
> Or possibly the next few days. :) I downloaded MSVC++ Express 2005 last night, along with the platform SDK, and gave them a try. The license issues were not as bad as I thought: they put some tricky restrictions on redistribution of binaries, but so long as I'm just using it for development, it should be good enough. Thanks for the tip, Toby!

Well, to be honest, I was thinking of the command line compiler - I've given up using source level debugging, I think it's bad for code quality.
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkeymail.org/mailman/listinfo/libevent-users
Re: [Libevent-users] libevent win32 issues
Nick Mathewson wrote:
> On Tue, Nov 06, 2007 at 05:09:07PM +0100, Marc Lehmann wrote:
> There are two factors that keep the select() implementation on win32
> from using the same strategy/code as the one :
>
>   1) win32's select() doesn't use bitfields; it uses an array of
>      sockets. This is because...
>
>   2) win32 sockets, unlike unix fds, are not consecutively numbered
>      integers starting at 0. Thus, it is _NOT_ a good idea to use an
>      array to map fds to events like select.c does; your array would
>      be enormous and sparse.
>
> A balanced tree implementation might be an improvement here. It would be
> nice if somebody would step up and write one.

select() under win32 only works on sockets, so its only purpose is socket selection. Given that, select() is a poor choice - I/O completion ports are much better; for one thing, they scale. libiocp exists, although I don't know at all what would be involved in the integration with libevent.

>> Looking around with google it seems that indeed, evdns et al. does not
>> build on windows (except when using e.g. cygwin, but that's trivial).
>>
>> So I wonder if libevent as a whole is supported under windows at all in
>> current versions?
>
> I'd like to have everything work on win32. Trunk compiles on mingw
> fine. I would like it also to compile under MSVC, but I don't have a
> copy of MSVC. That's why diffs would be nice. :)

MS offer a free command line compiler and SDK. Given they're free and compatible with the commercial build environments, would they be a viable choice of Win32 build environment?
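To make the "enormous and sparse" point concrete, here is a minimal sketch of a lookup keyed by handle value rather than indexed by it. It is not libevent code: `sock_handle`, `sock_entry` and `sock_map` are hypothetical names, and a sorted array with binary search stands in for the balanced tree suggested above (same O(log n) lookup, but O(n) insertion).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a win32 SOCKET: an opaque value, not a small dense index. */
typedef uintptr_t sock_handle;

/* Hypothetical per-socket record; "events" is a placeholder payload. */
struct sock_entry {
    sock_handle handle;
    int events;
};

#define SOCK_MAP_CAP 64

/* Hypothetical fixed-capacity map, kept sorted by handle value. */
struct sock_map {
    struct sock_entry entries[SOCK_MAP_CAP];
    size_t count;
};

/* bsearch comparator: first argument is the key, second an array element. */
static int cmp_handle(const void *key, const void *elem)
{
    sock_handle h = *(const sock_handle *)key;
    const struct sock_entry *e = elem;
    return (h > e->handle) - (h < e->handle);
}

/* Returns the entry for h, or NULL if h was never added. */
struct sock_entry *sock_map_find(struct sock_map *m, sock_handle h)
{
    assert(m != NULL);
    return bsearch(&h, m->entries, m->count, sizeof m->entries[0], cmp_handle);
}

/* Inserts in sorted position; returns 0 on success, -1 if full or duplicate. */
int sock_map_add(struct sock_map *m, sock_handle h, int events)
{
    size_t i = 0;

    assert(m != NULL);
    if (m->count == SOCK_MAP_CAP)
        return -1;
    while (i < m->count && m->entries[i].handle < h)
        i++;
    if (i < m->count && m->entries[i].handle == h)
        return -1;
    /* Shift the tail up by one slot to open a gap at position i. */
    memmove(&m->entries[i + 1], &m->entries[i],
            (m->count - i) * sizeof m->entries[0]);
    m->entries[i].handle = h;
    m->entries[i].events = events;
    m->count++;
    return 0;
}
```

Storage here grows with the number of sockets being watched, not with the largest handle value, which is the property the array in select.c lacks on win32.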
[Libevent-users] Re: IOCP API
>> I now need to finish the test/example application.
>> When that's done, I'll upload. Might even be this weekend.

I've been working away. Got a weird bug which is holding things up.

>> Prototypes
>> ==
>> 1. int socket_iocp_new( void **socket_iocp_state, size_t read_buffer_size,
>>      void (*accept_callback_function)(SOCKET socket, struct sockaddr *local_address, struct sockaddr *remote_address, void *listen_socket_user_state, void **per_socket_user_state),
>>      void (*recv_callback_function)(SOCKET socket, void *read_buffer, DWORD byte_count, void *per_socket_user_state),
>>      void (*send_callback_function)( SOCKET socket, void *per_socket_user_state, void *per_write_user_state ),
>>      void (*error_callback_function)(SOCKET socket, void *per_socket_user_state) );

> It looks to me like all your callbacks are asynchronous.

Yes.

> Is that the case? If so you're probably working off the standard iocp
> example code which spawns num_cpus/2 or num_cpus*2 or
> f(num_cpus) threads,

Yup, CPU count * 2.

> however that's the sort of choice that
> I think should be left to user, exposing instead an interface
> int iocp_wait(long timeout) which calls the callbacks.

Do you mean that the user goes off in his code, does his stuff, and when he's ready to deal with IOCP stuff, he calls iocp_wait() and the API is then free to call the callbacks, but it will be using the user's thread to do this, and after "long timeout" has expired, iocp_wait() returns and the user continues with his code till he next calls iocp_wait()?

> This
> lets the user choose the "asynchronicity" of their app,
> i.e. they can write singlethreaded mutex-less iocp code.

Would it be possible to have a single threaded app which is trying to do C10K?

> As an interface socket_iocp_new is a tiny bit "busy" as it
> wraps almost every function that iocps understand.

Yes. In fact, it's the longest prototype I've ever written. OTOH, when you actually write it, it's only six arguments.
> A bit of
> factoring might make it more friendly. A control-block style
> interface could also help as it cuts down the argument list
> and makes the "per socket user state" implicit (it actually
> makes it per iocp operation user state). It also gives you
> two way communication (user -> iocp library -> user in callback)
>
> typedef struct {
>     SOCKET s;       /* <- */
>     char* buf;      /* <-, -> */
>     long nbytes;    /* <-, -> */
>     void (*iocp_read_callback)(iocp_state* st, iocp_read* control_block); /* <- */
> } iocp_read;
>
> The user can piggy back per operation data like so:
> typedef struct {
>     iocp_read control_block;
>     /* extra stuff here */
> } my_iocp_read;
>
> void
> my_iocp_read_callback(iocp_state* st, iocp_read* control_block) {
>     my_iocp_read* my_cb = (my_iocp_read*)control_block;
>     /* respond, requeue, whatever */
> }

This converts a large one-off complexity in the new() function into a smaller complexity which is required for every function call.

Separately, would it be useful to have per read state in the way that per write state exists? It seems to me that since reads are serial (only one read at a time per socket), that state will be kept by the user in the per socket state.
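As a rough illustration of the iocp_wait(timeout) model discussed in this message, the sketch below queues completions and only runs the callbacks when the user asks for them, on the user's own thread. Everything in it is hypothetical (`completion_queue`, `queue_post`, `sketch_iocp_wait`); a real version would presumably sit on top of GetQueuedCompletionStatus() with its timeout argument instead of an in-memory ring, and would block when nothing is pending.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: user state plus the byte count of the operation. */
typedef void (*completion_fn)(void *user_state, long nbytes);

struct completion {
    completion_fn fn;
    void *user_state;
    long nbytes;
};

#define QUEUE_CAP 32

/* Hypothetical stand-in for the completion port: a small ring buffer. */
struct completion_queue {
    struct completion items[QUEUE_CAP];
    size_t head;   /* index of the oldest pending completion */
    size_t count;  /* number of pending completions */
};

void queue_init(struct completion_queue *q)
{
    q->head = 0;
    q->count = 0;
}

/* "Library side": record a finished operation. Returns -1 when full. */
int queue_post(struct completion_queue *q, completion_fn fn,
               void *user_state, long nbytes)
{
    size_t tail;

    if (q->count == QUEUE_CAP)
        return -1;
    tail = (q->head + q->count) % QUEUE_CAP;
    q->items[tail].fn = fn;
    q->items[tail].user_state = user_state;
    q->items[tail].nbytes = nbytes;
    q->count++;
    return 0;
}

/* "User side": run every pending callback on the calling thread and
 * return how many were dispatched. This sketch never blocks; it simply
 * returns 0 when nothing is pending. */
int sketch_iocp_wait(struct completion_queue *q)
{
    int dispatched = 0;

    while (q->count > 0) {
        struct completion c = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        c.fn(c.user_state, c.nbytes);
        dispatched++;
    }
    return dispatched;
}

/* Example callback: accumulate byte counts into a long owned by the user. */
void add_bytes(void *user_state, long nbytes)
{
    *(long *)user_state += nbytes;
}
```

Because the callbacks only ever run inside sketch_iocp_wait(), user state touched by them needs no locking in a single-threaded program; the cost is that nothing is serviced between calls.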
[Libevent-users] IOCP
Well, getting there. Have written the test app, currently debugging it and the API.
[Libevent-users] IOCP API
Okay, so I've finished writing the IOCP code. I now need to finish the test/example application. When that's done, I'll upload. Might even be this weekend.

Prototypes
==========

1. int socket_iocp_new( void **socket_iocp_state, size_t read_buffer_size,
     void (*accept_callback_function)(SOCKET socket, struct sockaddr *local_address, struct sockaddr *remote_address, void *listen_socket_user_state, void **per_socket_user_state),
     void (*recv_callback_function)(SOCKET socket, void *read_buffer, DWORD byte_count, void *per_socket_user_state),
     void (*send_callback_function)( SOCKET socket, void *per_socket_user_state, void *per_write_user_state ),
     void (*error_callback_function)(SOCKET socket, void *per_socket_user_state) );
2. void socket_iocp_delete( void *socket_iocp_state );
3. int socket_iocp_add_listen_socket( void *socket_iocp_state, SOCKET socket, size_t accept_socket_backlog, void *listen_user_state );
4. int socket_iocp_add_connected_socket( void *socket_iocp_state, SOCKET socket, void *per_socket_user_state );
5. int socket_iocp_issue_recv( void *socket_iocp_state, void *socket_iocp_socket_state );
6. int socket_iocp_issue_sync_send( void *socket_iocp_state, void *socket_iocp_socket_state, void *write_buffer, DWORD number_bytes_to_write, void *per_write_user_state );
7. int socket_iocp_issue_async_send( void *socket_iocp_state, void *socket_iocp_socket_state, void *write_buffer, DWORD number_bytes_to_write, void *per_write_user_state );

API Usage
=========

First, call socket_iocp_new(). This returns the basic state the API uses. Each state represents a single IOCP object. When calling this function, you pass in four callback functions; one for accept, one for send, one for recv and one for error.

Once you have this state, you can then add either listen sockets or connected sockets to the IOCP. If you add a listen socket, you specify how many ready sockets should be kept allocated on that listen socket. The API maintains that number of ready sockets.
When an incoming connection occurs, the accept callback is called. At this point, the user can associate state of his own with that socket.

If you add a connected socket, it's simply added to the IOCP. The user can pass in his own state to be associated with that socket.

Sockets, once added, have no reads or writes issued on them; they'll do nothing until the user issues one of three IOCP send/recv functions in the API.

The user can issue the IOCP API recv function on a socket. When that recv completes, the recv callback is called with the read buffer and the user's state. The user must call the IOCP recv function again for another read to be present and ready and outstanding on that socket.

The user can issue either a "synchronous" or "asynchronous" write on a socket.

A sync write requires no malloc, but requires the user to ensure he only has one write occurring at a time on a socket; he cannot issue another write until the current write has completed. When the write is issued, the user can provide state for that write. When the write completes, the write complete function is called, with both the state the user associated with that socket and the state the user associated with that write.

An async write requires one malloc per call and permits the user to have an arbitrary number of concurrent writes occurring on a single socket. As with the sync write, the user can pass in a per-write state, and when a write completes, the write callback is called, with the state the user associated with the socket and the state associated with that write.

Overall Usage
=============

The idea is that the user calls the new() function once at program init. Then the user adds/removes sockets as necessary, and issues reads/writes as necessary, and services those reads/writes in the callback functions, using the per socket/per write state information. The user has to issue a new read on a socket, in the read callback, to continue reading from a socket.

So there you go! Comments and thoughts, please.
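The difference between the two write flavours described above can be sketched without any Windows calls. This is only an illustration of the contract, not the library's internals: `sketch_socket`, `sketch_write` and the `sketch_*` functions are hypothetical names, and no I/O is actually performed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-socket record; not the library's real layout. */
struct sketch_socket {
    int sync_write_busy;       /* 1 while the single sync write is outstanding */
    int async_writes_pending;  /* number of heap-allocated writes outstanding */
};

/* Hypothetical per-write record, allocated once per async send. */
struct sketch_write {
    struct sketch_socket *sock;
    void *per_write_user_state;
};

/* Sync send: no allocation, but a second write is refused (-1) while
 * one is still outstanding on the same socket. */
int sketch_issue_sync_send(struct sketch_socket *s)
{
    if (s->sync_write_busy)
        return -1;
    s->sync_write_busy = 1;
    return 0;
}

/* Called when the sync write completes; the socket may be written again. */
void sketch_sync_send_complete(struct sketch_socket *s)
{
    s->sync_write_busy = 0;
}

/* Async send: one malloc per call, any number may be outstanding.
 * Returns NULL if the allocation fails. */
struct sketch_write *sketch_issue_async_send(struct sketch_socket *s,
                                             void *per_write_user_state)
{
    struct sketch_write *w = malloc(sizeof *w);

    if (w == NULL)
        return NULL;
    w->sock = s;
    w->per_write_user_state = per_write_user_state;
    s->async_writes_pending++;
    return w;
}

/* Completion hands the per-write state back and frees the record. */
void *sketch_async_send_complete(struct sketch_write *w)
{
    void *state = w->per_write_user_state;

    w->sock->async_writes_pending--;
    free(w);
    return state;
}
```

The sync path trades flexibility for zero allocations; the async path pays one malloc per write so that each completion can be matched to its own state.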
Re: [Libevent-users] Re: IOCP writes
William Ahern wrote:
> I admittedly don't know anything about IOCP. What I do know is that it doesn't make sense to use any other function call than send() or sendto() to write out UDP data. So whatever API you're writing around IOCP, it should short-circuit for UDP writes; just call send() or sendto(). But, that's all I'll say for now, since eventually I'm just going to look foolish for my lack of IOCP knowledge.

Well, I think here your lack of knowledge has led you to point out something that I simply hadn't thought of because I've been thinking about everything in terms of IOCP.
Re: [Libevent-users] Re: IOCP writes
Toby Douglass wrote:
> William Ahern wrote:
>> On Thu, Feb 22, 2007 at 01:43:48PM -, Toby Douglass wrote:
>>> However, the example I had in mind (which is similar) is P2P apps, which use writes on a single local UDP socket to send peering information to peers. They would benefit from async writes on that socket.
>>
>> Writing to a UDP socket should never block, period. If the output buffer is full, and you write another message, it should force the stack to drop a message.
>
> Good point. I'd forgotten about that.

However, I've just realized that it may make no difference.

Writing to a UDP socket should never block; agreed. However, multiple concurrent writes (one socket, many writer threads) can occur, and so the internal mechanics of the IOCP API must support that behaviour (assume we have enough bandwidth and the write rate is low enough that we're not dropping packets).

If I only have a single overlapped structure for writes, I cannot support that behaviour. So I have to malloc the overlapped structure per write (or have the user pass one in).
Re: [Libevent-users] Re: IOCP writes
William Ahern wrote:
> On Thu, Feb 22, 2007 at 01:43:48PM -, Toby Douglass wrote:
>> However, the example I had in mind (which is similar) is P2P apps, which use writes on a single local UDP socket to send peering information to peers. They would benefit from async writes on that socket.
>
> Writing to a UDP socket should never block, period. If the output buffer is full, and you write another message, it should force the stack to drop a message.

Good point. I'd forgotten about that.

> But, that could just as well have happened on the wire, so what's the difference?

It would be more reliable not to fail locally if you could avoid it. The fact it *could* happen on the wire doesn't give us license to drop things when we could have avoided it.

> Likely what the NIC can push out and what the link can handle is different, so trying to cater to the output buffer is a futile exercise, as far as I can see.

So, the final implication is that users will ONLY ever perform serial writes on a socket, no matter what it is?
[Libevent-users] Re: IOCP writes
> From: "Toby Douglass" <[EMAIL PROTECTED]>
>> As it is, it's easy to support serial writes with IOCP, it's supporting
>> multiple concurrent writes which is awkward for a convenient-to-use API
>> over IOCP - and this I would think is absolutely required for some uses
>> of UDP sockets.
>
> I haven't tried any UDP async socket stuff, not with IOCPs nor anything.
> Tell us how it goes!

Well, it's something I've rarely needed to do - in fact, only for P2P VOIP.

However, the example I had in mind (which is similar) is P2P apps, which use writes on a single local UDP socket to send peering information to peers. They would benefit from async writes on that socket.

> Still, I think I'm missing the point of multiple
> concurrent writes - wouldn't the output be undefined?

Ah, no, because the destination on a UDP socket will be specified for each write.

For a TCP socket, your concern is correct, multiple async writes would only be useful if each write was independent of every other write, so the ordering of dispatch didn't matter. (We would also need to assume each send() function call wrote its payload atomically with regard to other send() function calls, which I think is true, although I don't know that it is actually guaranteed).
[Libevent-users] Re: IOCP writes
Rhythmic Fistman wrote:
> On 2/22/07, Toby Douglass <[EMAIL PROTECTED]> wrote:
>> Yup. I'll be passing in a struct which contains that overlapped as its first member. When it comes back to me, I'll cast the overlapped pointer to my struct type. I think this means though I need a malloc per write, since I have to create the overlapped structure. That's bad, since the point of this is high throughput.
>
> I don't think malloc's going to eat that much into your io-bound code's performance, but if you're worried about it just reuse the same overlapped structure instead of re-mallocing it.

I don't want to have to be in a situation where I have to cache my allocations - it's extra code to write and debug and it impacts upon the cleanliness and simplicity of the design.

It's a shame, really - with serial async writes, the only malloc you do is up front, once, when the socket is handed over to the API.
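For a sense of how much extra code "caching the allocations" would mean, here is a minimal free-list sketch. It is an assumption about one way such a cache could look, not anything from the library: `write_record` and `write_cache` are hypothetical, and a real record would embed the OVERLAPPED structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-write record; a real one would embed an OVERLAPPED. */
struct write_record {
    struct write_record *next_free;  /* link used only while on the free list */
    void *per_write_user_state;
};

/* Hypothetical cache: a singly linked free list plus a malloc counter.
 * Not thread-safe; with several threads completing writes it would need
 * a lock or an interlocked list, which is part of the added complexity. */
struct write_cache {
    struct write_record *free_list;
    size_t mallocs;
};

/* Reuse a cached record if one exists, otherwise fall back to malloc. */
struct write_record *write_cache_get(struct write_cache *c)
{
    struct write_record *r = c->free_list;

    if (r != NULL) {
        c->free_list = r->next_free;
    } else {
        r = malloc(sizeof *r);
        if (r == NULL)
            return NULL;
        c->mallocs++;
    }
    r->next_free = NULL;
    r->per_write_user_state = NULL;
    return r;
}

/* Return a record to the cache once its write has completed. */
void write_cache_put(struct write_cache *c, struct write_record *r)
{
    r->next_free = c->free_list;
    c->free_list = r;
}

/* Free every cached record, e.g. when the IOCP state is deleted. */
void write_cache_destroy(struct write_cache *c)
{
    while (c->free_list != NULL) {
        struct write_record *r = c->free_list;
        c->free_list = r->next_free;
        free(r);
    }
}
```

After warm-up, the number of mallocs equals the peak number of concurrent writes rather than the total number of writes, at the price of the bookkeeping shown here.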
[Libevent-users] Re: IOCP writes
Rhythmic Fistman wrote:
> From: "Toby Douglass" <[EMAIL PROTECTED]>
>> If that's actually untouched, I could use it for my own purposes. The alternative is probably a malloc() per write operation :-/
>
> You shouldn't have to touch that stuff. What are you trying to do?

Keep state on a per read/write basis. Currently I keep state on a per socket basis, but that's not enough to let you know which of an arbitrary number of writes has just completed.

However, I'm not going to use the abovementioned rather awful method (which I only considered from lack of alternatives) because I'm going to hijack the OVERLAPPED pointer, which is the right way to do this (or the least wrong way).

> Issuing simultaneous asynchronous reads and writes on a socket is a tiny bit rare and special.

I agree, but there are so many possible users that people are going to want to do it. Also, it's common enough I think - HTTP with keep-alive, you could want to implement that using async IO for both the read and the write on the socket.

> Are you sure that's what you want to do? If so, great - io completion queues can handle it. Bu-ut if libevent was modelled on the unix select-style interfaces, most of which don't easily support this kind of thing, how is this situation cropping up, assuming that this is just an IOCP port of libevent?

That's a wider question. I'm simply implementing a dead-easy-to-use API on IOCP.

> The unixy case is trickier - I think only kqueues can handle two separate read/write readiness requests being queued up in separate invocations. With all the other interfaces (epoll, event ports, select) you can request read AND write readiness notification, but you have to do it with the one system call (the select interface makes this obvious, the others less so).

UNIX, I know nothing. Well, not quite true, I know POSIX somewhat, but true enough here for the better IO mechanisms.
[Libevent-users] Re: IOCP writes
Rhythmic Fistman wrote:
> I don't think you should be passing the same overlapped structure twice to WSARecv/Send. I'm pretty sure that the overlapped isn't "yours" until GetQueuedCompletionStatus says so. Try a unique overlapped structure for each overlappable call.

Don't worry, I'm not re-using overlapped's like that. I've had just the one overlapped struct till now, because I only did reads, and you only have one read outstanding at a time.
[Libevent-users] Re: IOCP writes
Rhythmic Fistman wrote:
> From: "Toby Douglass" <[EMAIL PROTECTED]>
>> I've realised, though, that if you issue a write on a socket which currently has an outstanding read, when the IOCP completes, you don't immediately know which operation has completed.
>
> You use two different OVERLAPPED structures. The pointers returned by GetQueuedCompletionStatus will be different.

Yup. I'll be passing in a struct which contains that overlapped as its first member. When it comes back to me, I'll cast the overlapped pointer to my struct type. I think this means though I need a malloc per write, since I have to create the overlapped structure. That's bad, since the point of this is high throughput.

>> The obvious answer is to use select() to check the socket when an IOCP completes, but that's very awkward, because of the race conditions, for the other operation could complete in the time between the IOCP complete and select() and you have multiple threads calling GQCS concurrently. Locking would be required, which would be Bad.
>
> That doesn't sound right..., IOCP type progs really shouldn't need select.

Quite right.
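The "overlapped as first member" trick can be sketched portably. Here `fake_overlapped` is a stand-in for the Win32 OVERLAPPED structure so the example builds anywhere, and `io_op`, `op_kind` and `dispatch_completion` are hypothetical names; in real code the pointer would be the one GetQueuedCompletionStatus() hands back.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Win32 OVERLAPPED structure (assumption: only its
 * position as the first member matters for this sketch). */
struct fake_overlapped {
    void *internal;
    void *event;
};

enum op_kind { OP_READ, OP_WRITE };

/* Hypothetical per-operation record. The overlapped is the first member,
 * so a pointer to it is also a valid pointer to the whole record. */
struct io_op {
    struct fake_overlapped overlapped;
    enum op_kind kind;
    void *user_state;
};

/* What a completion thread would do with the pointer the port returns. */
struct io_op *op_from_overlapped(struct fake_overlapped *ov)
{
    return (struct io_op *)ov;
}

/* Tallies completions by kind; stands in for calling the recv or send
 * callback in a real library. */
struct completion_counts {
    int reads;
    int writes;
};

void dispatch_completion(struct fake_overlapped *ov,
                         struct completion_counts *counts)
{
    struct io_op *op = op_from_overlapped(ov);

    if (op->kind == OP_READ)
        counts->reads++;
    else
        counts->writes++;
}
```

With one record per outstanding operation, a read and a write on the same socket are told apart by the record itself, so neither select() nor the completion key is needed for that purpose.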
Re: [Libevent-users] IOCP writes
I wrote:
> As it is, it's easy to support serial writes with IOCP, it's supporting
> multiple concurrent writes which is awkward for a convenient-to-use API
> over IOCP - and this I would think is absolutely required for some uses of
> UDP sockets.

I've figured out a decent way of doing it, a permutation I've never used before, upon the old trick of passing in your own pointer instead of the one you're supposed to use.
Re: [Libevent-users] IOCP writes
I wrote:
> On a related note, did someone say that the event handle in the overlapped
> structure used by the IO functions (ReadFile, etc) is not actually used?
>
> If that's actually untouched, I could use it for my own purposes. The
> alternative is probably a malloc() per write operation :-/

As it is, it's easy to support serial writes with IOCP, it's supporting multiple concurrent writes which is awkward for a convenient-to-use API over IOCP - and this I would think is absolutely required for some uses of UDP sockets.
Re: [Libevent-users] IOCP writes
>> I'm adding support for IOCP writes.
>>
>> I've realised, though, that if you issue a write on a socket which
>> currently has an outstanding read, when the IOCP completes, you don't
>> immediately know which operation has completed.
>
> Obviously I used the completion key.

Actually, that's wrong.

On a related note, did someone say that the event handle in the overlapped structure used by the IO functions (ReadFile, etc) is not actually used?

If that's actually untouched, I could use it for my own purposes. The alternative is probably a malloc() per write operation :-/
Re: [Libevent-users] IOCP writes
> I'm adding support for IOCP writes.
>
> I've realised, though, that if you issue a write on a socket which
> currently has an outstanding read, when the IOCP completes, you don't
> immediately know which operation has completed.

Obviously I used the completion key. I didn't see it immediately because I'm already using that for something else, bah, humbug.
[Libevent-users] IOCP writes
I'm adding support for IOCP writes.

I've realised, though, that if you issue a write on a socket which currently has an outstanding read, when the IOCP completes, you don't immediately know which operation has completed.

The obvious answer is to use select() to check the socket when an IOCP completes, but that's very awkward, because of the race conditions, for the other operation could complete in the time between the IOCP complete and select() and you have multiple threads calling GQCS concurrently. Locking would be required, which would be Bad.
[Libevent-users] IOCP
Okay, getting some work done. You'll be amused to know that I've now moved away from the VC build environment (although obviously still using the MS Platform SDK and compilers) and I'm using gnumake and a makefile :-) I found when I learned kernel development that the process of requiring a reboot between tests induced me to take much greater care, and so to have a much firmer comprehension, of the code; to write it correctly, and then run it, rather than to write it and then run it till it seems correct. I suspect (and this is my motive) that moving away from source level debugging will have a similar effect.
Re: [Libevent-users] iocp status?
Rhythmic Fistman wrote:
> How's the IOCP support going? I don't see anything about it on http://monkey.org/~provos/libevent/

My PC is back up. I reinstalled Windows last weekend. I've just finished installing the platform SDK. I'll be working on it tomorrow. (I lost the PSU, the motherboard, the SCSI card and one of the SCSI disks. Ouch!)
[Libevent-users] IOCP update
Sorry for the delay. My PC power supply has died. Replacement being ordered tomorrow.
[Libevent-users] IOCP
My graphics card died Saturday afternoon, so I've not been able to finish the IOCP stuff. Next weekend, hopefully.
[Libevent-users] IOCP
Just a heads-up; I intend to finish the IOCP API and example application this weekend. Apologies for the intermission! :-)
Re: [Libevent-users] Add support for WINDOWS Overlapped IO
Kevin Sanders wrote:
> I haven't looked at your code yet, sorry I don't know the function names. I'm referring to when the IOCP thread calls GQCS and sees the read completion, it calls a read completion callback I believe? If someone made a blocking call from this callback (the callback is running on the IOCP thread) then that would be bad (performance wise). So the IOCP threads aren't totally internal because they run the callbacks.

Ah, yes, quite so.

> Unless these IOCP threads are just triggering an event somewhere that the user threads are waiting on (please say no)?

No, they're not. :-)
Re: [Libevent-users] Add support for WINDOWS Overlapped IO
Niels Provos wrote:
> Please, think about that before flaming.

I don't think anyone is flaming anyone here right now. It seems to be a pretty scholarly, polite and reasonable discussion.
Re: [Libevent-users] Add support for WINDOWS Overlapped IO
Zhu Han wrote:
> I made a mistake when I typed the above. I just mean you should issue socket_internal_iocp_read first and then let the callbacks consume the data. However, in your code, you invoke the callbacks before socket_internal_iocp_read. See the following snippet:
>
>     siss->callback_function( siss->socket, SOCKET_IOCP_SOCKET_RECV_SUCCESS, siss->read_buffer, byte_count, (void *) siss->user_state );
>     socket_internal_iocp_read( siss );

Ah. Go look at socket_iocp_add_socket(). You'll find the very first call to socket_internal_iocp_read() occurs at the end of *that* function, which gets the read ball rolling.

> As others have pointed out, async WRITE is important for a high-performance server.

Yes, it's clear. I'm going to add it. Note though that with GetQueuedCompletionStatus(), the user has no synchronization work to do. The API handles it all behind the scenes.

> Single-thread event-driven model just means the event loop is running in one thread's context. There are a lot of ways to combine it with the multi-thread worker.

Ye-e-e-s-s...and no. I think I understand you to mean that we could, for example, arrange it so we have 2xCPU threads running GQCS and when one of those threads returns with a complete, we trigger or signal the main event loop thread? The problem with that surely is going to be the large amount of thread switching.
Re: [Libevent-users] Add support for WINDOWS Overlapped IO
Kevin Sanders wrote:
> On 12/10/06, Toby Douglass <[EMAIL PROTECTED]> wrote:
>> I've not added support for writes, because I think people generally issue blocking writes, since non-blocking means that if you return from a function which issues a send you have to ensure the lifetime of the buffer you've sent is non-local.
>
> If you're doing even a small amount of writing, you're going to have dismal performance (at best) using blocking writes. If you're talking to a peer socket (real world) that is no longer responding, this write may take more than a minute to error out.

True. I non-block and select and give it a couple of seconds, but of course even a couple of seconds is still hugely long for some uses.

> One of my coworkers recently observed that handles associated with an IOCP seem to have CPU affinity, at least sometimes. In a read completion callback, he posted another read (which is fine and encouraged) and then went off and did a lot of processing which prevented it from calling GQCS for about 20 seconds (very bad). Even though there were 3 other threads waiting on GQCS, they couldn't pop the completion status for the read from the IOCP even though the read had completed. Finally, as soon as the original thread came back around and called GQCS, it popped the completion instead of the other threads which had been waiting the whole time.

Oh man...!

> The lesson here is you don't want the IOCP threads doing anything except issuing async IO, popping completions and a quick state machine change (see below) and issue another async IO if needed. If more processing is needed, put a work item in a queue for another thread to process. That work thread can call your state machine callback when it is finished, and that may in turn cause further async IO.

True. But you realise in the code I've written the IOCP threads are internal to the library itself - the user doesn't see them or touch them. All they do is call GQCS. If the user did blocking writes, they would be outside the IOCP mechanism and they wouldn't be using the IOCP threads.

>> Single-thread event-driven seems to me to basically mean state machine. State machines are wonderful things for achieving simple, bug-free code, but they have a cost; they are implicitly single-threaded. This can mean you cannot use them in some situations, because you will inherently block whenever you perform work.
>
> I'm not sure I follow this. Are you saying you can't use state machines in a multi-threaded application because they cause threads to block?

No - I'm saying state machines inherently serialise, and so if you need parallelism, you have a problem.
[Libevent-users] IOCP rewrite
For those of you who are interested, I'm finishing off a rewrite of the IOCP code, such that it handles connect()/accept() behaviour.

Which is to say, you init the IOCP state, and all your socket behaviour is then routed through the callback function, which is simply called when an accept, connect or read occurs. You can add listen, to-be-connected or connected sockets to the IOCP.

When a listen socket is added, all the user sees after that is that the accept callback is called when an incoming connection occurs and then the recv callback is called whenever a read occurs. The user does *no* other work - for example, he can just return from these functions should he wish.

When a to-be-connected socket is added, the user passes in the socket and the remote address, and then the connect callback is called when the connect is successful and then after that the read callback every time a read successfully occurs.

Similarly, with a connected socket, all that happens is the read callback is called whenever a read occurs.

So it's basically five functions;

iocp_new()
iocp_delete()
iocp_add_listen_socket()
iocp_connect_and_add_socket()
iocp_add_connected_socket()

And then all the user sees are the callbacks being called. There's one for accept, one for connect, one for recv and one for error.

Typical use I think is to add the listen socket, then in the accept allocate your own internal state for that connection (you can in the accept function specify a user state pointer for the accept socket), in the recv callback deal with the data.
Re: [Libevent-users] Win32 I/O completion ports code release
Kevin Sanders wrote:
> Use AcceptEx.

Interesting. In the AcceptEx() docs, we find this;

"As with all overlapped Windows functions, either Windows events or completion ports can be used as a completion notification mechanism."
Re: [Libevent-users] Win32 I/O completion ports code release
>> The docs say the following functions can be used with an IOCP; >> >> ConnectNamedPipe >> DeviceIoControl >> LockFileEx >> ReadDirectoryChangesW >> ReadFile >> TransactNamedPipe >> WaitCommEvent >> WriteFile > Use AcceptEx. The docs sayeth; "The following functions can be used to start I/O operations that complete using completion ports." (And then the above list is given). This to me means that it is these and only these functions which can be used with IOCP, which in turn means having an overlapped structure doesn't automatically mean you can use that function with an IOCP. Have you used AcceptEx() with an IOCP and found it works? ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] Win32 I/O completion ports code release
Gordon Scott wrote: One technical note; I believe that I/O completion ports cannot handle accept() type behaviour, What do you mean by this? With an IOCP, you can give the IOCP a *connected* socket and the IOCP will call you back when a read occurs. But you cannot pass in an *unconnected* socket in the listen state, for the IOCP is not capable of calling you back when an incoming connection occurs and you need to call accept(). The docs say the following functions can be used with an IOCP; ConnectNamedPipe DeviceIoControl LockFileEx ReadDirectoryChangesW ReadFile TransactNamedPipe WaitCommEvent WriteFile So, as you can see, it doesn't seem possible to use an IOCP to handle incoming connections. FYI, libiocp 1.03 was put on the site last night. I'm going to add some example code next, and then maybe integrate the listen/accept solution I've written. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] Win32 I/O completion ports code release
Hi. I've written a tiny library which provides a simple front end for IO completion ports. The ultimate purpose of this is to provide excellent socket handling under Win32 in libevent so that the socket problem Tor faces is solved.

My code base is written around a resource and error handling framework and so the original code that I've written and tested was permeated with that framework. Having completed testing, I've removed my framework code from the library and what remains is what I'm releasing here.

I originally intended to create a decent Win32 build of the libevent library and to integrate this code. However, last night when I finally had a proper stab at this, I found the libevent code badly disorganised (poor header file/code structure) and after an hour or so, I gave up. As such, I'm releasing my code here and hoping either a libevent expert will integrate it or Tor will use it directly.

I will be supporting this code. If people have problems with it or need changes, I will make those changes.

One technical note; I believe that I/O completion ports cannot handle accept() type behaviour, only recv() and send(). I have however written a WaitForMultipleObjects() based listen/accept handling library, which might be particularly useful in conjunction with this IOCP code.

Technical Information
=====================

The API has three functions and is best explained by stating these functions.

1. void socket_iocp_new( void **socket_iocp_state );

2. void socket_iocp_delete( void *socket_iocp_state );

3. void socket_iocp_add_socket( void *socket_iocp_state, SOCKET socket, void *user_state, size_t read_buffer_size, void (*callback_function)(SOCKET socket, int event_type, void *read_buffer, DWORD byte_count, void *user_state) );

The purpose of the first two functions is obvious. From the user's view, the API works like this; you give a socket, a callback function and a pointer to state of your own (if you have any) to the API and it will call that callback when a read has occurred.

The callback function is;

  void callback_function( SOCKET socket, int event_type, void *read_buffer, DWORD byte_count, void *user_state );

So, you are given the socket which has completed a read (or errored), the event type (read or error), the read buffer, the number of bytes in the buffer, and a pointer to your own state. When the socket closes, the OS automatically removes it from the IOCP socket list, which is why there is no delete_socket() function.

Build and Link
==============

I use VC6 and I've arranged the code as a statically linked library. The file and directory structure is;

  libiocp/libiocp.h // public header file to be used with the .lib
  libiocp/libiocp/
  libiocp/libiocp/libiocp.dsp
  libiocp/libiocp/libiocp.dsw
  libiocp/libiocp/libiocp.opt
  libiocp/libiocp/iocp.c
  libiocp/libiocp/iocp_internal.h

The debug and release libraries are created to these pathnames;

  libiocp/libiocp (debug).lib
  libiocp/libiocp (release).lib

So the basic arrangement is that the public files (libraries and the public header) are in the root, while the code and build files are in the subdirectory libiocp. The code compiles without error with level 4 warnings and warnings as errors. I've had to pragma warnings 4005, 4201 and 4305 so that the MS headers will compile with warnings at level 4.

(I've just been discovering some bad behaviour in WinZip, while making up the archive. If you have empty directories in the archive, *they will not be shown in the archive listing in WinZip*, but they WILL be created - so you have invisible directories! Also, if you delete all the files in the archive, *the archive file is deleted*, so you then need to remake it to add files to the archive - which is annoying, because I added the wrong files, so I deleted them all, went to add the files I did want...and found I couldn't add them. I had to go remake the archive first, despite having *just* made the archive in the first place).
Homepage and Direct Download http://www.summerblue.net/computing/libiocp/index.html http://www.summerblue.net/computing/libiocp/zips/libiocp%201.00%203-Dec-2006.zip ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] Windows build impossible
Kalin Nakov wrote: Hi, I saw that libevent has some .dsp and .dsw files, but the sources are completely unbuildable. First HAVE_CONFIG is not defined in the project, also some WIN32 defines are re-defined in the code, some types are undefined and have to be manually defined, functions are used which are unavailable in windows (gettimeofday, etc). Are there any document with steps on how to build libevent on windows, or the windows port is unsupported? If it is unsupported, I could make a contribution and make it buildable on windows if you want since I intend to use it. Unfortunately for the Windows developers, VC isn't the supported Win32 build platform for libevent, which to some extent is why the build is in the shape it's in. There's actually a general problem here, which is simply that the UNIX bods don't have a Win32 build environment easily available to them and the Win32 bods don't have a UNIX build environment easily available to them. (Easily available in both the sense of physical availability and the knowledge required to operate the build effectively). My view in this is that the Win32 and UNIX builds should be entirely independent. There should be three source directories, one for common code, one for UNIX specific code, one for Windows specific code. There should be two separate builds, one for UNIX (up to the UNIX bods how best to do that) and one for Windows (which should be VC - preferably version 6, since not everyone has or wants the later versions, but this is a difficult request, since it's not strictly possible to now obtain older versions of VC). I've some intention to sort out the build along these lines, but I've been distracted by IOCP work. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] How does libevent deal with more events than a particular syscall can handle?
Roger Clark wrote: Has there ever been any mention of using IOCP or something on Windows? The Win32 implementation currently uses select() and still imposes the limit, which was mainly why I was asking. I've written IOCP code. It's in the hands of another list reader to get some testing; I'm not sure if he's had time or not, I emailed him last night and said if it's been tested, or can be tested in the next week, I'll wait, otherwise I intend to release the code for peer review. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] IOCP
I've released the first version of this code to a mailing list member who's going to try to compile it and use it. Once he's had a go and we've sorted the problems out, the code will be released. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] GetQueuedCompletionStatus return values question/problem
Gordon Scott wrote: Now - here's the rub - what's to prevent *lpNumberOfBytes being 0 in case 3? the docs say in case 3, the value in *lpNumberOfBytes is not set by GetQueuedCompletionStatus(). So, it would retain whatever value I set it to originally. I'm thinking then that I need to set *lpNumberOfBytes to a value, say 1, and then if it's different, I'll be able to *know* it was set to 0. Not true. As per the documentation, condition 3 sets the number of bytes read. If there is an error and 0 bytes have been read, it will be set to 0. Hmm. Re-reading the docs, I'm starting to think that the final sentence in the return values section about socket closure is a red herring. It *reads* like it's telling you a fourth unique return value case; but actually, maybe all it's doing is pointing out to you what you'll see when a socket closes - and failing to mention, as it were, that it's only a specific instance of the general case 3. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] GetQueuedCompletionStatus return values question/problem
Gordon Scott wrote: The two cases are differentiated by the fact that in case 4, *lpNumberOfBytes == 0. Now - here's the rub - what's to prevent *lpNumberOfBytes being 0 in case 3? the docs say in case 3, the value in *lpNumberOfBytes is not set by GetQueuedCompletionStatus(). So, it would retain whatever value I set it to originally. I'm thinking then that I need to set *lpNumberOfBytes to a value, say 1, and then if it's different, I'll be able to *know* it was set to 0. Not true. As per the documentation in condition 3 sets the number of bytes read. If there is an error and 0 bytes have been read, it will be set to 0. Thankyou - I had been thinking this in the past, but so many different things have been said in the list I'd lost sight of it! Setting a value of 1 would lead to problems for the same reason, what if 1 byte was read? -1 could be an option It's a DWORD, so -1 is a theoretically valid value, although it's fine in practical terms. But I'm looking for a better way - which basically means treating case 3 and 4 as the same and either not differentiating between them, or using GLE to differentiate. Really though, the difference between case 3 and 4 is that GetLastError() returns an error code in case 3, and it does not return an error code in case 4 You mean to say that it returns ERROR_SUCCESS in case 4 but never will in case 3? is ERROR_SUCCESS what you get from say recv() when the other end has done a graceful close? I'm thinking that the GLE you get is actually the GLE set by the socket operation. Also, the problem again is that the possible GLE codes are *not* documented. We may assert we *believe* we'll get ERROR_SUCCESS in case 4 and never in case 3, but how can we *know*? ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] GetQueuedCompletionStatus return values question/problem
Kevin Sanders wrote: Toby, the GLE error codes are documented here; http://msdn.microsoft.com/library/default.asp?url=/library/en-us/debug/base/system_error_codes.asp The problem is that for a given function, the docs for that function don't tell you which errors you could possibly see. So there's the huge list of errors (said URL) and...which of those will I need to manage? I can't know, either now, or with regard to future changes in behaviour. That means GLE is only guaranteed useful for handling specifically documented GLE error codes and for differentiating between no error and *an* error. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] GetQueuedCompletionStatus return values question/problem
Kevin Sanders wrote: On 11/1/06, Toby Douglass <[EMAIL PROTECTED]> wrote: Kevin Sanders wrote: > Call GetLastError(). I could be wrong, but I don't think that helps you, because you would have to know the full list of possible errors for both cases, so that you could be certain that you were correctly differentiating between them. I have no idea what errors either case might return, nor any idea of where to look to find the full list of possible errors for both cases and indeed I suspect that information is not documented. In case 3, GetLastError will not be 0, in case 4, GetLastError will be 0. In case 4, will GLE always be 0 - even if the other end has for example reset rather than disconnected gracefully? The problem we face here is that although we can empirically find the GLE values for our current platforms, the possible GLE values are not documented! So we can't be certain we've got it right, that there are no unusual cases we simply haven't seen, or that it won't change in the future. I'm happy to go with this now, however, since it seems basically reasonable, although I do wonder at GLE being 0 for both a reset and a graceful disconnect! Thank you for your help, Kevin. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
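The rule Kevin describes (a non-zero GetLastError() in case 3, zero in case 4) can be written down as a small classifier. What follows is only a sketch under assumptions, not code from libiocp: the enum, the function name and the demo_dword stand-in type are all made up for illustration, and the caller is assumed to pass the GetLastError() value captured straight after a failed GetQueuedCompletionStatus() call, or 0 if the call succeeded. It is kept free of Win32 headers so the logic can be read and compiled anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Win32 DWORD type, so the sketch compiles anywhere. */
typedef unsigned long demo_dword;

/* Hypothetical names for the outcomes discussed in this thread. */
enum completion_kind {
    COMPLETION_OK,           /* a successful dequeue carrying data         */
    COMPLETION_NO_PACKET,    /* nothing was dequeued (overlapped is NULL)  */
    COMPLETION_FAILED_IO,    /* case 3: a failed I/O operation was dequeued */
    COMPLETION_REMOTE_CLOSE  /* case 4: zero bytes and no error code        */
};

/*
 * ok         - return value of GetQueuedCompletionStatus()
 * overlapped - the OVERLAPPED pointer it handed back
 * byte_count - the byte count it handed back
 * last_error - GetLastError() if ok is zero, otherwise 0 (an assumption
 *              about how the caller fills this in)
 */
enum completion_kind classify_completion(int ok, const void *overlapped,
                                         demo_dword byte_count,
                                         demo_dword last_error)
{
    if (overlapped == NULL)
        return COMPLETION_NO_PACKET;
    if (last_error != 0)
        return COMPLETION_FAILED_IO;     /* Kevin's rule for case 3 */
    if (byte_count == 0)
        return COMPLETION_REMOTE_CLOSE;  /* Kevin's rule for case 4 */
    return ok ? COMPLETION_OK : COMPLETION_FAILED_IO;
}
```

The concern raised in this thread still applies: the sketch only distinguishes "no error" from "an error", because the set of error codes that can come back here is not documented.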
Re: [Libevent-users] GetQueuedCompletionStatus return values question/problem
Kevin Sanders wrote: Now, the problem is this: ERROR_SUCCESS is #defined as zero! In other words, I don't think it's possible to differentiate between case 3 and 4, because in case 3, *lpNumberOfBytes might be 0 just by chance. Call GetLastError(). I could be wrong, but I don't think that helps you, because you would have to know the full list of possible errors for both cases, so that you could be certain that you were correctly differentiating between them. I have no idea what errors either case might return, nor any idea of where to look to find the full list of possible errors for both cases and indeed I suspect that information is not documented. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] GetQueuedCompletionStatus return values question/problem
This is for those of you who've done some IO completion port work. According to my reading of the docs, GetQueuedCompletionStatus has four return statuses;

---
1. return value != 0 - all went well, we've just returned from a successful dequeuing

2. return value != 0 and lpOverlapped == NULL - all handles associated with the IO completion object have been closed and the IO completion object itself has been closed (time to quit)

3. return value == 0 and lpOverlapped != NULL - GetQueuedCompletionStatus() was okay, but we dequeued a *failed* IO operation (note that lpNumberOfBytes is not set in this case)

4. return value == ERROR_SUCCESS and lpOverlapped != NULL and *lpNumberOfBytes == 0 - the handle we've come back on is actually a socket, and the remote host has closed
---

Now, the problem is this: ERROR_SUCCESS is #defined as zero! In other words, I don't think it's possible to differentiate between case 3 and 4, because in case 3, *lpNumberOfBytes might be 0 just by chance. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] IO Completion Ports
Gordon Scott wrote: Unfortunately the code belongs to my employer so I can't share it, but (as Toby says it's late ) I can give you a general gist of the way it works tomorrow. Also if you can provide the code you have I can take a look and see if I see any glaring errors. Thankyou - that sort of checking is extremely valuable. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] IO Completion Ports
Kevin Sanders wrote: On 10/31/06, Toby Douglass <[EMAIL PROTECTED]> wrote: This is with MaxUserPort set to 2 (decimal), although I've not rebooted... Best hope for this to work is rebooting after changing the setting. Yup. Getting between 8200 and 8600 sockets now, with the server failing first, rather than the client. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
Re: [Libevent-users] IO Completion Ports
Ycrux wrote: Hi Gordon and Toby! I'm interested to test your code. Could you please share it with us? Bed time now. Will post code to list tomorrow morning. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] Re: Libevent-users Digest, Vol 13, Issue 8
<[EMAIL PROTECTED]> wrote: It's pretty spectacular stuff, I had a fairly average xp box doing > 10k sockets (through an external interface, no cheatingly going through localhost) Did you get to 10k+ sockets in your first effort, or did you have to tweak the OS config? ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] IO Completion Ports
First testing using my real IP address, connecting to myself, gives a perfectly repeatable 3973 sockets before failure. This is with MaxUserPort set to 2 (decimal), although I've not rebooted... I have 512mb, running w2k sp4. Anyone interested in running the test app on their machine? Same binary is client and server; Do "test s" first to start the server, then "test c" in another window to run the client. Client will display number of connected sockets, as it continually connects. Client will eventually fail and print message why. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] API review (function prototype typo)
Rats... The third function API is actually this;

3. void socket_iocp_add_socket( void *socket_iocp_state, SOCKET socket, void *user_state, size_t read_buffer_size, void (*callback_function)(SOCKET socket, int event_type, void *read_buffer, DWORD byte_count, void *user_state) );

Note the change to the first arguments of the callback function - I accidentally left in my own framework stuff.

The event types are;

  #define SOCKET_IOCP_SOCKET_RECV_SUCCESS 1
  #define SOCKET_IOCP_SOCKET_LOCAL_CLOSE 2
  #define SOCKET_IOCP_SOCKET_REMOTE_CLOSE 3

The user, in his callback function, is expected to switch on the event type and behave accordingly. He has access to the socket and his own user state through the arguments to the callback. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
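As an illustration of "switch on the event type", here is a minimal sketch of what a user callback might look like. The three event constants and the argument order are taken from the message above; everything else is an assumption made for the example: demo_socket and demo_dword are portable stand-ins for SOCKET and DWORD, and struct connection_stats is a made-up piece of user state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-ins for the Win32 SOCKET and DWORD types (illustration only). */
typedef uintptr_t demo_socket;
typedef unsigned long demo_dword;

#define SOCKET_IOCP_SOCKET_RECV_SUCCESS 1
#define SOCKET_IOCP_SOCKET_LOCAL_CLOSE  2
#define SOCKET_IOCP_SOCKET_REMOTE_CLOSE 3

/* Hypothetical per-connection user state, passed back as user_state. */
struct connection_stats {
    unsigned long bytes_received;
    int closed;
};

/* Example callback: count received bytes and remember when the socket closes. */
void example_callback(demo_socket socket, int event_type, void *read_buffer,
                      demo_dword byte_count, void *user_state)
{
    struct connection_stats *stats = user_state;

    (void)socket;       /* unused in this sketch */
    (void)read_buffer;  /* a real callback would consume the data here */
    assert(stats != NULL);

    switch (event_type) {
    case SOCKET_IOCP_SOCKET_RECV_SUCCESS:
        stats->bytes_received += byte_count;
        break;
    case SOCKET_IOCP_SOCKET_LOCAL_CLOSE:
    case SOCKET_IOCP_SOCKET_REMOTE_CLOSE:
        stats->closed = 1;
        break;
    default:
        break;          /* unknown event types are ignored */
    }
}
```

In real use a pointer to such a function would be the last argument to socket_iocp_add_socket(), with the address of the state structure passed as user_state.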
[Libevent-users] API review
Okay, I've finished off the IO completion port code. I'd appreciate feedback on the API design. There are three functions;

1. void socket_iocp_new( void **socket_iocp_state );

This function takes a pointer to a void pointer and malloc's some memory onto it. This memory holds the state for this instance of my IOCP handling object. This function also starts up a single thread (more on that later).

2. void socket_iocp_delete( void *socket_iocp_state );

This function takes an existing IOCP object and shuts it down, freeing up all resources.

3. void socket_iocp_add_socket( void *socket_iocp_state, SOCKET socket, void *user_state, size_t read_buffer_size, void (*callback_function)(void *ess, void *socket_state, int event_type, void *read_buffer, DWORD byte_count, void *user_state) );

This function passes a connected socket to an IOCP object and causes reads to begin on that socket. (The user specifies the size of the buffer to be used for reads and can pass in a pointer to his own state, which comes back to him as an argument in the callback function). The callback function is called every time a read successfully occurs. The callback function event argument is one of success, local close or remote close. The callback function provides the buffer which holds the read data and specifies the length of that data.

The user does no other work - he simply passes the socket and callback function in and that's it. Everything else is automatic - his callback function will be called when things happen. To remove a socket from the IOCP object, simply call CloseHandle() on that socket. The closing of the socket handle causes the OS to remove it from the IOCP object. Note that the socket must already have been connected before being added.

The thread that socket_iocp_new() starts is responsible for waiting on the IOCP object and calling the callback functions. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
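The "pointer to a void pointer" convention used by the first two functions can be shown with a small portable sketch. This is not the libiocp implementation: the demo_ names and the single socket_count field are invented for the example, and the worker thread that the real socket_iocp_new() starts is left out entirely. It only shows how an opaque state block is allocated onto the caller's pointer and later released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical internal state; the real library's contents are not shown in this thread. */
struct demo_iocp_state {
    size_t socket_count;
};

/* Allocate the state and hand it back through the caller's void pointer.
 * On allocation failure *state is set to NULL. */
void demo_iocp_new(void **state)
{
    struct demo_iocp_state *s;

    assert(state != NULL);
    s = malloc(sizeof *s);
    if (s != NULL)
        s->socket_count = 0;
    *state = s;
}

/* Release everything owned by the state; passing NULL is harmless. */
void demo_iocp_delete(void *state)
{
    free(state);
}
```

The caller never sees the structure definition, only a void pointer, which is what lets the library change its internals without touching user code.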
Re: [Libevent-users] API review
> Sounds interesting...what about writing data to the socket? Hmm. If an overlapped write is issued upon an IOCP'd socket, the callback function will be called when that write has completed. If I add a function to the API through which overlapped writes are issued, it will be possible to tell if a write has completed or if a read has completed (e.g. the event type will be useful and correct) with a negligible overhead (e.g. no memory allocation, just trivial stuff). This would mean that all socket I/O would occur through a single function, the callback function. It would switch on the event, the user receives the socket and the event that has occurred, and, according to earlier discussion, very large numbers of sockets can so be handled. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] yet more sockets under Win32
So, I've written an API to use IO completion ports. It's very simple - far easier than the framework necessary to handle multiple threads and WaitForMultipleObjects(). The API is; new_iocp() delete_iocp() add_socket() Currently, there's one callback function for all sockets. When the callback is called, an event type is indicated (successful read, failed read, socket closed), the number of bytes available is given and a pointer to user state is provided. Obviously, the user can keep track of the socket handle in his user state, but I'm thinking of having the socket passed as an argument to the callback function, for convenience. Also, I think I'll change it so there is a callback function on a per socket basis. There's no delete_socket() function - it appears simply closing the socket handle removes it from the IO completion list. I need to separate the code from my coding framework, so it can be integrated into libevent. I also need to test with large numbers of sockets. I can do that locally, but I can't do that properly, as I lack a second machine. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] sockets under Win32
I've almost finished code to handle arbitrary numbers of sockets under Win32 using WaitForMultipleObjects(). I expect to complete it with one more weekend of work and then move to test. I wrote the code from scratch last weekend. There's a socket object, a listen thread object and a listen thread manager object. The exposed API is that of the listen thread manager and consists of four functions; new() delete() // this is for the listen thread manager object add_socket() remove_socket() Each socket has an associated callback function which is called when data arrives. Each listen thread handles up to 61 sockets (three objects are used internally). The listen thread manager load balances sockets between threads as necessary, deleting excess threads. The listen threads re-order their socket lists on every read, so that the interrupting socket is moved to the end of their list of objects. This is because the list is examined in-order by WaitForMultipleObjects(), so rapidly firing objects can mask objects further down the list. There exists some investigation which indicates that there is a limit of about 4k sockets for Windows, due to the necessity of allocating a certain amount of non-paged pool per socket combined with a hard limit on the amount of non-paged pool. This equates to about 65 threads of 61 sockets. A question is whether it's better to fire up all these threads at init, and forget about load balancing. Bear in mind the fact that WaitForMultipleObjects() reads its handle list in-order. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
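Two details of the scheme above lend themselves to a short sketch: the arithmetic behind "61 sockets per thread" and the re-ordering that moves the socket which just fired to the end of the wait list. The code below is an illustration only, not the code described in the message; the function names are made up, and plain ints stand in for the Win32 handles so it compiles anywhere. The figure of 64 is the Win32 MAXIMUM_WAIT_OBJECTS limit on handles per WaitForMultipleObjects() call.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAIT_HANDLES   64  /* MAXIMUM_WAIT_OBJECTS on Win32 */
#define INTERNAL_HANDLES    3  /* objects each listen thread uses internally */
#define SOCKETS_PER_THREAD (MAX_WAIT_HANDLES - INTERNAL_HANDLES)  /* 61 */

/* Number of listen threads needed for socket_count sockets (rounded up). */
size_t threads_needed(size_t socket_count)
{
    return (socket_count + SOCKETS_PER_THREAD - 1) / SOCKETS_PER_THREAD;
}

/* Move the entry at index 'fired' to the end of the list, shifting the
 * later entries down by one, so a busy socket cannot keep masking the
 * entries that come after it in the wait order. */
void move_to_end(int *handles, size_t count, size_t fired)
{
    int tmp;
    size_t i;

    assert(handles != NULL);
    assert(fired < count);

    tmp = handles[fired];
    for (i = fired; i + 1 < count; i++)
        handles[i] = handles[i + 1];
    handles[count - 1] = tmp;
}
```

With these numbers 65 threads cover 3965 sockets, which is where the "about 65 threads" for roughly 4k sockets comes from.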
Re: [Libevent-users] replacement of select on Win32
Nick Mathewson wrote: On Sat, Sep 23, 2006 at 04:51:57PM +0100, Toby Douglass wrote: Is anyone working on implementing native-style (e.g. non-select()) waiting on Windows for socket events, as requested by Tor? Actually, we (the Tor project) have got an undergraduate working on one possible approach to it. We'll let you know how it turns out. I've got existing, debugged code, written in C, which offers a TCP server framework. There's a single thread which waits on an array of events associated with sockets. The API permits the addition and removal of ports to be listened upon (adding a port means the socket is created, an event is associated with it, and the array of events passed to WaitForMultipleObjects() is rebuilt). When a connect occurs, a (per port) callback function is called, being passed the socket accept() returns. I've downloaded the libevent code, but I've not looked at it yet. It's a bit awkward because it doesn't seem really to be for WIN32 at all and I'm not much of a UNIX programmer. ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users
[Libevent-users] replacement of select on Win32
Is anyone working on implementing native-style (e.g. non-select()) waiting on Windows for socket events, as requested by Tor? ___ Libevent-users mailing list Libevent-users@monkey.org http://monkey.org/mailman/listinfo/libevent-users