Hi Peter,

Thanks for the response.
Peter Memishian wrote:
> > 1. We are currently using the sockfs interface in the kernel to perform
> > our various network tasks, similar to the approach used by the Solaris
> > iSCSI initiator and CIFS server. New connections are handled by
> > soaccept and then we use sosendmsg/sorecvmsg to transfer data on the new
> > iSCSI connection. It's been suggested to me that it would be better to
> > accept and manage connections in user-space. Realizing that ultimately
> > we need to be able to use the resulting connection in the kernel I'd
> > like to understand more about this. Is it possible to accept a
> > connection in user-space and then operate on that connection in a kernel
> > driver?
>
> This has traditionally been done by interposing a STREAMS module between
> the transport and the socket head and speaking TPI -- e.g., you can look
> at what in.telnetd and in.rlogind do with telmod/rlmod/logindmux. There
> are similar dances with NFS and rpcmod. That said, we've been moving away
> from that model, and projects like Volo are designed assuming that
> interposers are unusual (though there has been talk of providing a hook
> API to allow similar functionality to be implemented).
>

I remember hearing at one point that pushing an additional STREAMS module
onto a socket would disable a number of performance optimizations in the
TCP code, and looking at the source code seems to confirm this. True?

Between that and the switch to TPI, implementing an interposer would be a
fairly big deal for us, but perhaps it isn't necessary, since we only
really need the socket data in the kernel (in other words, we don't really
need anything interposed; we just need our socket from the new
connection). Looking at socksyscalls.c, it appears that if we can get the
userland socket handle into kernel space we can easily get the sonode by
calling getsonode() (a rough sketch of what I'm picturing is below).

We are very interested in the Volo project (primarily because they are
introducing a committed kernel socket interface), but right now it looks
like it will not be available in the timeframe we need. And of course that
would still leave us operating in the kernel.

> > This would require us to add an associated user-space daemon that we
> > don't currently have but I'm not opposed to that if it's the correct
> > way to handle things. If this is a viable approach, what are the
> > advantages?
>
> An interesting set of questions. Most of the existing cases have come at
> this from the other way around: they've had a userland daemon and wanted
> to speed it up, so the performance-critical paths got moved into the
> kernel (though NFS client support is all in the kernel). The main
> advantages I see with userland are (a) visibility with traditional tools
> (e.g., netstat) and (b) reducing the odds that a bug will lead to a
> significant security breach.
>

It looks like the userland socket calls are a *very* thin layer on top of
the sockfs calls. Are we really missing a lot of functionality?
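Here is the rough sketch I mentioned above. It is untested, the names
(mydrv_adopt_socket, mydrv_so) are just placeholders, and the reference
handling across releasef() is an assumption on my part. The idea is that
our userland daemon would accept() the connection and hand the resulting
fd to the driver through an ioctl, and the driver would open-code the same
fd-to-sonode steps getsonode() performs:

    #include <sys/types.h>
    #include <sys/errno.h>
    #include <sys/file.h>
    #include <sys/vnode.h>
    #include <sys/socketvar.h>

    static struct sonode *mydrv_so;  /* placeholder per-connection state */

    /*
     * Hypothetical ioctl backend: 'fd' is the connected socket descriptor
     * passed down by our userland daemon after accept().
     */
    static int
    mydrv_adopt_socket(int fd)
    {
        file_t *fp;
        vnode_t *vp;

        if ((fp = getf(fd)) == NULL)
            return (EBADF);

        vp = fp->f_vnode;
        if (vp->v_type != VSOCK) {
            releasef(fd);
            return (ENOTSOCK);
        }

        /* The basic fd -> sonode conversion getsonode() does. */
        mydrv_so = VTOSO(vp);

        /*
         * Assumption: take our own hold on the vnode before releasef()
         * so the sonode survives the daemon closing its descriptor.
         */
        VN_HOLD(vp);
        releasef(fd);

        /* The driver then uses sosendmsg()/sorecvmsg() on mydrv_so. */
        return (0);
    }

If that approach is sound, the daemon would close its copy of the
descriptor afterward, and the kernel side would keep driving the
connection with sosendmsg()/sorecvmsg() just as we do today.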
I was curious since you mentioned netstat, and our kernel connections
actually do show up in its output:

hpc-x4200-3-vm1 $ netstat

TCP: IPv4
   Local Address        Remote Address     Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
hpc-x4200-3-vm1.3260 hpc-x4200-3-vm2.43157 64240      0 64240      0 ESTABLISHED
[snip]

hpc-x4200-3-vm2 $ netstat

TCP: IPv4
   Local Address        Remote Address     Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
hpc-x4200-3-vm2.43157 hpc-x4200-3-vm1.3260 64240      0 64240      0 ESTABLISHED
[snip]

Do you have a recommendation for us on whether we should pursue the
approach of adding a userland component?

Thank you very much for the information.

-Peter

_______________________________________________
networking-discuss mailing list
[email protected]
