> Now, as I understand it, sendfile() will perform zero-copy IO; since the
> contents of the file will undoubtedly be in the page cache, it should in
> theory DMA the data straight from the (single copy of the) data in RAM to
> the NIC buffers.
> 
> It should also handle refcounting for you - you unlink the filename after
> obtaining a descriptor, and close() the FD once you've called sendfile, and
> the kernel *should* in theory free the inode and page containing file data
> once all TCP ACKs have been received.
> 
> You'll still have to make 100k syscalls, and you may find the kernel
> chooses to copy the data anyway.

I see.

So using sendfile .. probably with the message as a file on RAMFS .. or using
the Linux syscalls you mention below, it _might_ be possible to avoid the copy
overhead, but not the context switching overhead .. ok.
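
A rough sketch of that idea, in Python rather than Twisted. It assumes
blocking stream sockets and uses the standard library's socket.sendfile(),
which uses os.sendfile() where it can and falls back to plain send()
otherwise; it does not support non-blocking sockets. The name send_to_all
is made up for illustration, and whether the kernel really avoids the copy
is not something this code can guarantee. Note that it is still one call
(at least one syscall) per client:

```python
import socket
import tempfile


def send_to_all(payload, socks):
    """Write payload to one temp file, then sendfile it to every socket.

    Hypothetical helper: returns the number of bytes sent per socket.
    """
    # On POSIX, TemporaryFile() is unlinked right after creation, so only
    # the open descriptor keeps the data alive.
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()  # make sure the file size is visible before sending
        # An explicit offset means every client gets the file from byte 0.
        return [s.sendfile(f, offset=0) for s in socks]
```

With a tmpfs/RAMFS directory one could pass dir=... to TemporaryFile() so
the file never touches disk; that part is left out here.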

> 
> However - AFAIK Twisted does not support sendfile(), and it can be tricky to
> make it work with non-blocking IO.

;(

Apart from that, we're on FreeBSD .. I guess there are similar syscalls (maybe
with slightly different semantics) there as well.

> 
> :o(
> 
> You may also want to look at the splice(), vmsplice() and tee() syscalls
> added to recent Linux kernels. tee() in particular can copy data from pipe
> to pipe without consuming it, so it can be repeated multiple times. It may be
> possible to assemble something that will do this task efficiently from those
> building blocks, but the APIs aren't available in Twisted.

Thanks a lot! This is all very interesting .. from the "tee" man page:

"""
Though we talk of copying, actual copies are generally avoided. The kernel does 
this by implementing a pipe buffer as a set of reference-counted pointers to 
pages of kernel memory. The kernel creates "copies" of pages in a buffer by 
creating new pointers (for the output buffer) referring to the pages, and 
increasing the reference counts for the pages: only pointers are copied, not 
the pages of the buffer. 
"""

Which sounds a lot like what you described in your other reply about refcounting etc ..
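
To see the "copy without consuming" behaviour, here is a small Linux-only
sketch. Python's os module does not expose tee(), so this calls it through
ctypes; the wrapper tee() and the helper fan_out() are made up for this
example, and loading libc via ctypes.CDLL(None) is an assumption about the
platform. It only shuffles data between pipes, no sockets involved:

```python
import ctypes
import os

# Assumption: Linux with a libc that exports tee(2).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.tee.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_size_t, ctypes.c_uint]
_libc.tee.restype = ctypes.c_ssize_t


def tee(fd_in, fd_out, length, flags=0):
    """Duplicate up to length bytes from pipe fd_in into pipe fd_out."""
    n = _libc.tee(fd_in, fd_out, length, flags)
    if n < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return n


def fan_out(payload, n_copies):
    """Put payload in one pipe, tee it n_copies times, read everything back."""
    src_r, src_w = os.pipe()
    os.write(src_w, payload)  # payload must fit in the pipe buffer
    copies = []
    for _ in range(n_copies):
        r, w = os.pipe()
        tee(src_r, w, len(payload))  # source pipe is not drained
        copies.append(os.read(r, len(payload)))
        os.close(r)
        os.close(w)
    remaining = os.read(src_r, len(payload))  # data is still in the source
    os.close(src_r)
    os.close(src_w)
    return copies, remaining
```

In a real server each duplicate would presumably be splice()d into a client
socket instead of read back, which is still one or two syscalls per client.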

For ref., these guys are talking about PACKET_MMAP

http://www.linuxquestions.org/questions/programming-9/vectored-write-to-many-sockets-with-tee-splice-915702/
http://dank.qemfd.net/dankwiki/index.php/Fast_UNIX_Servers

The former (at the very end of the page) claims that it achieves zero-copy
(which I get), and also claims that you could reduce context switch overhead
for the case of transmitting 1 message to many clients .. I can't see how
that's done.

> 
> >> and not useful.
> >
> > When using VM pages (_if_ that would be possible) and thus no data
> > duplication, then why not useful?
> 
> Sorry, I should have been more precise - it's probably not often useful.
> There are not very many applications where sending the same TCP stream to
> that many clients at the same time is helpful - realtime video/audio over TCP
> spring to mind, and typically those need to adapt to slow clients by dropping
> them to a lower rate i.e. not the same stream any more.
> 
> As Glyph has mentioned, encryption is also a factor in today's internet.
> 
> I'm kind of curious about what your application is!

The application is PubSub over WebSockets with massive numbers of clients ..

Application message payloads are short (<1k) and JSON/UTF-8. Those are then
framed into WebSocket messages (which basically means prepending a WS
frame header).
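
To make the framing step concrete, a minimal sketch assuming unfragmented,
unmasked server-to-client text frames as laid out in RFC 6455 (the function
name ws_text_frame is made up here). The point is that the frame is built
once per message, and the same bytes can then go out to every subscriber:

```python
import struct


def ws_text_frame(payload):
    """Prepend a WebSocket header for a single unmasked text frame."""
    header = bytearray([0x81])  # FIN=1, opcode=0x1 (text)
    n = len(payload)
    if n < 126:
        header.append(n)  # length fits in the 7-bit field
    elif n < (1 << 16):
        header.append(126)  # 16-bit extended length follows
        header += struct.pack("!H", n)
    else:
        header.append(127)  # 64-bit extended length follows
        header += struct.pack("!Q", n)
    return bytes(header) + payload
```

For payloads under 1k the header is only 2 or 4 bytes, so the framed message
is a small immutable bytes object that is shared across all clients.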

> 
> _______________________________________________
> Twisted-Python mailing list
> Twisted-Python@twistedmatrix.com
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
