Roch - PAE wrote:
Brian Utterback writes:
> In the interests of open development, I wanted to get the opinions
> of the OpenSolaris developers on this mailing list.
>
> In Solaris 10, Sun introduced the concept of "fused" TCP connections.
> The idea is that most of the TCP algorithms are designed to deal with
> unreliable network wires. However, there is no need for all of that
> baggage when both ends of the connection are on the same machine,
> since there are no unreliable wires between them. There is no reason
> to limit the packet flow because of Nagle, or silly window syndrome,
> or anything else; just put the data directly into the receive buffer
> and have done with it.
>
> This was a great idea; however, there was a slight modification to
> the standard STREAMS flow control added for fused connections. This
> modification placed a restriction on the number of unread data blocks
> on the queue. In the context of TCP and the kernel, a data block
> amounts to the data written in a single write syscall, and the queue
> is the receive buffer. What this means in practical terms is that the
> producer process can only do 7 write calls without the consumer doing
> a read. The 8th write will block until the consumer reads.
>
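> A quick way to see what I mean is something like this untested
> sketch (the 20 writes and the one-byte chunks are arbitrary, and
> O_NONBLOCK is only there so a stalled write shows up as an error
> instead of a hang):
>
> #include <sys/socket.h>
> #include <netinet/in.h>
> #include <arpa/inet.h>
> #include <fcntl.h>
> #include <errno.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
>
> int
> main(void)
> {
> 	int lsn, snd, rcv, i;
> 	struct sockaddr_in sin;
> 	socklen_t len = sizeof (sin);
> 	char byte = 'x';
>
> 	/* Loopback TCP connection, so the fused code path applies. */
> 	lsn = socket(AF_INET, SOCK_STREAM, 0);
> 	memset(&sin, 0, sizeof (sin));
> 	sin.sin_family = AF_INET;
> 	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
> 	sin.sin_port = 0;	/* any free port */
> 	bind(lsn, (struct sockaddr *)&sin, sizeof (sin));
> 	listen(lsn, 1);
> 	getsockname(lsn, (struct sockaddr *)&sin, &len);
> 	snd = socket(AF_INET, SOCK_STREAM, 0);
> 	connect(snd, (struct sockaddr *)&sin, sizeof (sin));
> 	rcv = accept(lsn, NULL, NULL);	/* never read from rcv */
>
> 	/* Non-blocking, so a stalled write returns instead of hanging. */
> 	fcntl(snd, F_SETFL, fcntl(snd, F_GETFL, 0) | O_NONBLOCK);
>
> 	for (i = 1; i <= 20; i++) {
> 		if (write(snd, &byte, 1) < 0) {
> 			printf("write %d failed, errno %d\n", i, errno);
> 			break;
> 		}
> 		printf("write %d ok\n", i);
> 	}
> 	return (0);
> }
>
> If the restriction works as described, this should report a failure
> on the 8th write even though the receive buffer is essentially empty.
>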
> This is done to balance the process scheduling and prevent the producer
> from starving the consumer for cycles to read the data. The number was
> determined experimentally by tuning to get good results on an important
> benchmark.
>
> I am distrustful of the reasoning, and very distrustful of the results.
> You can see how it might improve performance by reducing the latency.
> If your benchmark has a producer and a consumer, you want the consumer
> to start consuming as soon as possible, otherwise the startup cost gets
> high. Also, by having a producer produce a bunch of data and then having
> the consumer consume it, you have to allocate more data buffers than
> might otherwise be necessary. But I am not convinced that it should be
> up to TCP/IP to enforce that. It seems like it should be the job of
> the scheduler, or the application itself. And tuning to a particular
> benchmark strikes me as particularly troublesome.
>
> Furthermore, it introduces a deadlock situation that did not exist
> before. Applications that have some knowledge of the size of the
> records that they deal with often use MSG_PEEK or FIONREAD to query
> the available data and wait until a full record arrives before reading
> the data. If the data is written in 8 or more chunks by the
> producer, then the producer will block waiting for the consumer, while
> the consumer never reads because it is still waiting for the rest of
> the record to arrive.
>
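> For instance (again untested; the 16-byte record and the one-byte
> writes are arbitrary, and a real consumer would poll or sleep rather
> than spin on FIONREAD):
>
> #include <sys/socket.h>
> #include <sys/ioctl.h>
> #include <sys/filio.h>	/* FIONREAD */
> #include <netinet/in.h>
> #include <arpa/inet.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
>
> #define	RECSIZE	16	/* any record needing 8 or more writes will do */
>
> int
> main(void)
> {
> 	int lsn, snd, rcv, i, avail = 0;
> 	struct sockaddr_in sin;
> 	socklen_t len = sizeof (sin);
> 	char c = 'x', rec[RECSIZE];
>
> 	/* Same loopback setup as in the previous sketch. */
> 	lsn = socket(AF_INET, SOCK_STREAM, 0);
> 	memset(&sin, 0, sizeof (sin));
> 	sin.sin_family = AF_INET;
> 	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
> 	bind(lsn, (struct sockaddr *)&sin, sizeof (sin));
> 	listen(lsn, 1);
> 	getsockname(lsn, (struct sockaddr *)&sin, &len);
> 	snd = socket(AF_INET, SOCK_STREAM, 0);
> 	connect(snd, (struct sockaddr *)&sin, sizeof (sin));
> 	rcv = accept(lsn, NULL, NULL);
>
> 	if (fork() == 0) {
> 		/* Producer: writes the record one byte at a time and
> 		 * should block in write() on the 8th byte. */
> 		for (i = 0; i < RECSIZE; i++)
> 			write(snd, &c, 1);
> 		_exit(0);
> 	}
>
> 	/* Consumer: wait for a whole record before reading it. */
> 	while (avail < RECSIZE)
> 		ioctl(rcv, FIONREAD, &avail);
> 	read(rcv, rec, RECSIZE);
> 	printf("got a full record\n");	/* never reached if the claim holds */
> 	return (0);
> }
>
> The producer stalls with only 7 bytes queued, FIONREAD never reports
> a full record, and neither side makes progress.
>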
> Now this same deadlock was always a possibility with the flow control,
> but as long as the record size was considerably smaller than the receive
> buffer size, the application never had to worry about it. With this type
> of blocking, the receive buffer can effectively be as small as 8 bytes,
> making the
> deadlock a very real possibility.
>
> So, I am open to discussion on this. Is this a reasonable approach to
> context switching between a producer and consumer, or should the
> scheduler do this better? Perhaps instead of blocking, the process
> should just lose the rest of its time slice? (I don't know if that
> is feasible) Any thoughts on the subject?
...
What you say appears quite reasonable.
I don't understand why we should block before having
buffered data equal to the sum of the socket receive buffer
and the socket transmit buffer. On a single-CPU system I can
imagine having a provision for the transmitter to yield() to
a runnable receiver based on some threshold such as N chunks
or M bytes.
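Something like this, purely as a sketch of the policy (the names and
the constants N and M are invented here; none of it is actual
tcp_fuse code):

#include <stdio.h>

#define	N_CHUNKS	8	/* yield once this many chunks are unread... */
#define	M_BYTES		16384	/* ...or this many bytes */

typedef enum { TX_CONTINUE, TX_YIELD, TX_BLOCK } tx_action_t;

/* Decision the transmit side would make on each fused write. */
static tx_action_t
fused_tx_policy(size_t unread_bytes, int unread_chunks,
    size_t rcvbuf, size_t sndbuf, int rx_runnable)
{
	if (unread_bytes >= rcvbuf + sndbuf)
		return (TX_BLOCK);	/* genuine flow control */
	if (rx_runnable &&
	    (unread_chunks >= N_CHUNKS || unread_bytes >= M_BYTES))
		return (TX_YIELD);	/* hand the CPU to the reader */
	return (TX_CONTINUE);
}

int
main(void)
{
	/* 9 small chunks unread, 48K buffers each way: yield, don't block. */
	printf("action = %d\n", fused_tx_policy(9, 9, 49152, 49152, 1));
	return (0);
}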
I'd rather see the yield() happen every N chunks (or even every
X ticks or some such) rather than on every write once a threshold
is reached. Otherwise the degenerate case of writing one byte at
a time becomes very expensive once you pass the threshold.
E.g. if the reader is waiting for 1k and the threshold is 8 (as
it is now), that's 1016 yield()s after the threshold is crossed.
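To put numbers on the difference (a toy count, nothing to do with
the real code):

#include <stdio.h>

int
main(void)
{
	int bytes = 1024, thresh = 8, w;
	int past_thresh = 0, every_n = 0;

	for (w = 1; w <= bytes; w++) {
		if (w > thresh)
			past_thresh++;	/* yield on every write past the threshold */
		if (w % thresh == 0)
			every_n++;	/* yield once per 8 writes */
	}
	printf("past-threshold policy: %d yields\n", past_thresh);	/* 1016 */
	printf("every-N-chunks policy: %d yields\n", every_n);		/* 128 */
	return (0);
}

That's 128 yield()s for the same 1k instead of 1016, and the reader
still gets the CPU regularly.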
Darren
_______________________________________________
networking-discuss mailing list
[email protected]