On Thu, May 08, 2008 at 10:33:21AM -0400, Nick Mathewson wrote:
> On Wed, May 07, 2008 at 11:47:09PM -0700, Niels Provos wrote:
> > Hi Manual,
> >
> > this is a good suggestion. Nick and I are currently working on how
> > buffers and http work in libevent 2.0. You might want to check out
> > trunk to see some of the progress there. In any case, it seems that
> > your sendfile changes would be a good fit there. BTW, sendfile is
> > available on a large number of platforms now.
>
> In fact, this fits pretty well into the newer evbuffer implementation.
> Whereas the old implementation used a big chunk of memory for the
> entire buffer, the new code uses a linked list of small chunks. (This
> removes the need for big memmov operations, and generally makes
> writing big files more efficient.)
>
> All we'd need to do to support sendfile is add another chunk type
> whose contents were a file rather than a chunk of ram. We'd probably
> want at least two ways to "add a file at this point in the stream":
> one taking a filename and one taking an fd. Of course, we'd need to
> make sure to fall back on mmap() for systems lacking sendfile(), and
> maybe on a series of read() operations on systems lacking mmap().
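To make the quoted "file chunk" idea concrete, here is a rough sketch of
a chunk list in which a chunk is either memory or a file range. Every
name in it (struct chunk, chunk_init_mem, chunk_init_fd,
chunk_init_path, chain_flush) is made up for illustration and is not
libevent's evbuffer API. It assumes Linux sendfile() where available and
falls back to pread() plus write(); the mmap() fallback mentioned above
is left out to keep it short.

```c
#define _XOPEN_SOURCE 700            /* for pread() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/sendfile.h>
#endif

/* Hypothetical layout for illustration; not libevent's evbuffer structs. */
struct chunk {
    struct chunk *next;
    const char *mem;   /* memory chunk: next unsent byte (caller-owned) */
    int fd;            /* file chunk: backing descriptor, else -1 */
    off_t off;         /* file chunk: next offset to send */
    size_t len;        /* bytes still unsent */
};

void chunk_init_mem(struct chunk *c, const char *data, size_t len)
{
    c->next = NULL; c->mem = data; c->fd = -1; c->off = 0; c->len = len;
}

/* "Add a file" variant taking an fd; the caller closes it afterwards. */
void chunk_init_fd(struct chunk *c, int fd, off_t off, size_t len)
{
    c->next = NULL; c->mem = NULL; c->fd = fd; c->off = off; c->len = len;
}

/* "Add a file" variant taking a filename; covers the whole file. */
int chunk_init_path(struct chunk *c, const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    chunk_init_fd(c, fd, 0, (size_t)st.st_size);
    return 0;
}

/* Send part of one chunk; returns bytes written, 0 at EOF, -1 on error. */
static ssize_t chunk_write(int out, struct chunk *c)
{
    char buf[4096];
    ssize_t n;
    assert(c->fd >= 0 || c->mem != NULL);
    if (c->fd < 0) {
        n = write(out, c->mem, c->len);
        if (n > 0) { c->mem += n; c->len -= (size_t)n; }
        return n;
    }
#ifdef __linux__
    n = sendfile(out, c->fd, &c->off, c->len);   /* advances c->off */
    if (n > 0)
        c->len -= (size_t)n;
    if (n >= 0 || (errno != EINVAL && errno != ENOSYS))
        return n;
#endif
    /* Fallback: copy through a small buffer with pread() + write(). */
    n = pread(c->fd, buf, c->len < sizeof(buf) ? c->len : sizeof(buf), c->off);
    if (n > 0)
        n = write(out, buf, (size_t)n);
    if (n > 0) { c->off += n; c->len -= (size_t)n; }
    return n;
}

/* Write chunks in order until drained or until a write makes no progress. */
ssize_t chain_flush(struct chunk *head, int out)
{
    ssize_t total = 0;
    for (struct chunk *c = head; c != NULL; c = c->next) {
        while (c->len > 0) {
            ssize_t n = chunk_write(out, c);
            if (n <= 0)
                return total > 0 ? total : n;
            total += n;
        }
    }
    return total;
}
```

A caller would link a header chunk to a file chunk through next and call
chain_flush() whenever the output descriptor becomes writable. Because
each chunk tracks its own remaining length and offset, a short write
simply resumes at the same place on the next call.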
On Linux, I'd bet that AIO+splice can beat sendfile. The problem with
sendfile is that it can still block on file I/O. This is one major
reason why sendfile is no good for multimedia servers, _unless_ you're
willing to juggle and manage multiple threads (with the headache of
profiling access patterns and adjusting the number of threads).

On Linux, though, you could have one thread doing kernel AIO from a
file into page buffers and handing those to a pipe with vmsplice, and a
second thread running the event loop (or N+M threads, where N is your
number of cores--an easier calculation).

This leads into the second issue, which is how best to manage vmsplice
buffers, which need to be page aligned and have certain ownership
restrictions. The upside is that it's the only feasible way in any unix
(that I'm aware of) to actually get a page of file data into a buffer
without worrying about (a) blocking, (b) dealing with POSIX AIO
readiness signalling, or (c) copying the data.

_Also_, this allows a potential single code pathway for either
encrypted or plaintext I/O; for encrypted you'd just insert a filter
and write the new data to a new page buffer. (Given all the recent
political "issues", it's disconcerting that more people aren't giving
thought to moving the net toward ubiquitous encryption, including
reducing the overhead so there are few excuses.)

One caveat with the non-blocking splice setup is how to handle seeking.
But that can be dealt with as a matter of course.

Anyhow, next month I'll have some code and hopefully some numbers.
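For illustration only, a minimal Linux-only sketch of the path described
above: file data lands in a page-aligned buffer, vmsplice() maps that
page into a pipe, and splice() moves it from the pipe to the output
descriptor. The names page_pump, page_pump_init and page_pump_send are
hypothetical, and a plain pread() stands in for the kernel AIO
completion, so this sketch can still block on file I/O. It also does not
solve the buffer ownership problem: the page must not be rewritten until
the kernel is done with it.

```c
#define _GNU_SOURCE                  /* for splice() and vmsplice() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper state: one pipe plus one page-aligned buffer. */
struct page_pump {
    int pipefd[2];
    void *page;
    size_t page_size;
};

int page_pump_init(struct page_pump *pp)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(pp != NULL);
    if (ps <= 0 || pipe(pp->pipefd) < 0)
        return -1;
    pp->page_size = (size_t)ps;
    /* vmsplice() buffers should be page aligned. */
    if (posix_memalign(&pp->page, pp->page_size, pp->page_size) != 0) {
        close(pp->pipefd[0]);
        close(pp->pipefd[1]);
        return -1;
    }
    return 0;
}

/* Move up to one page of `in`, starting at `off`, to `out`.
 * Returns bytes moved, 0 at end of file, -1 on error. */
ssize_t page_pump_send(struct page_pump *pp, int in, off_t off, int out)
{
    struct iovec iov;
    ssize_t n, sent = 0;

    /* Stand-in for an AIO read completing into the page buffer. */
    n = pread(in, pp->page, pp->page_size, off);
    if (n <= 0)
        return n;

    /* Map the user page into the pipe without copying it. One page
     * fits in an empty pipe, so a short vmsplice() is not handled. */
    iov.iov_base = pp->page;
    iov.iov_len = (size_t)n;
    n = vmsplice(pp->pipefd[1], &iov, 1, 0);
    if (n <= 0)
        return n;

    /* Drain the pipe into the destination (socket, pipe or file). */
    while (sent < n) {
        ssize_t k = splice(pp->pipefd[0], NULL, out, NULL,
                           (size_t)(n - sent), SPLICE_F_MOVE);
        if (k <= 0)
            return sent > 0 ? sent : k;
        sent += k;
    }
    return sent;
}
```

An encryption filter would sit between the read and the vmsplice() call,
writing its output into a second page buffer and splicing that one
instead, which is what keeps the plaintext and encrypted paths the same.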