Re: Van Jacobson's net channels and real-time
On Tuesday 02 May 2006 14:41, Vojtech Pavlik wrote:
> You seem to be missing the fact that most of today's interrupts are
> delivered through the APIC bus, which isn't fast at all.

You mean slow, right? Modern x86s (anything newer than a P3) generally don't have a separate APIC bus anymore but just send messages over their main processor connection.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Van Jacobson's net channels and real-time
On Tue, Apr 25, 2006 at 07:29:40AM -0400, linux-os (Dick Johnson) wrote: > >> Message signaled interrupts are just a kudge to save a trace on a > >> PC board (read make junk cheaper still). > > > > yes. Also in PCI-Express there is no physical interrupt line anymore due to > > the architecture, so even classical interrupts are sent as "message" over > > the bus. > > > >> They are not faster and may even be slower. > > > > thus in the case of PCI-Express, MSI interrupts are just as fast as the > > ordinary ones. I have no numbers on whether MSI is faster or not then e.g. > > interrupts on PCI-X, but generally speaking, the PCI-Express bus is not > > designed to be "low latency" at all, at best it gives you X latency, where X > > is something like microseconds. The MSI message itself only takes 10-20 > > nanoseconds though, but all the handling probably adds a large factor to > > that > > (1000 or so). No clue on classical interrupt line latency - anyone? > > About 9 nanosecond per foot of FR-4 (G10) trace, plus the access time > through the gate-arrays (about 20 ns) so, from the time a device needs > the CPU, until it hits the interrupt pin, you have typically 30 to > 50 nanoseconds. Of course the CPU is __much__ slower. However, these > physical latencies are in series, cannot be compensated for because > the CPU can't see into the future. You seem to be missing the fact that most of todays interrupts are delivered through the APIC bus, which isn't fast at all. -- Vojtech Pavlik Director SuSE Labs - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Van Jacobson's net channels and real-time
On Mon, 24 Apr 2006, Auke Kok wrote: > linux-os (Dick Johnson) wrote: >> On Mon, 24 Apr 2006, Auke Kok wrote: >> >>> Ingo Oeser wrote: On Saturday, 22. April 2006 15:49, Jörn Engel wrote: > That was another main point, yes. And the endpoints should be as > little burden on the bottlenecks as possible. One bottleneck is the > receive interrupt, which shouldn't wait for cachelines from other cpus > too much. Thats right. This will be made a non issue with early demuxing on the NIC and MSI (or was it MSI-X?) which will select the right CPU based on hardware channels. >>> MSI-X. with MSI you still have only one cpu handling all MSI interrupts and >>> that doesn't look any different than ordinary interrupts. MSI-X will allow >>> much better interrupt handling across several cpu's. >>> >>> Auke >>> - >> >> Message signaled interrupts are just a kudge to save a trace on a >> PC board (read make junk cheaper still). > > yes. Also in PCI-Express there is no physical interrupt line anymore due to > the architecture, so even classical interrupts are sent as "message" over the > bus. > >> They are not faster and may even be slower. > > thus in the case of PCI-Express, MSI interrupts are just as fast as the > ordinary ones. I have no numbers on whether MSI is faster or not then e.g. > interrupts on PCI-X, but generally speaking, the PCI-Express bus is not > designed to be "low latency" at all, at best it gives you X latency, where X > is something like microseconds. The MSI message itself only takes 10-20 > nanoseconds though, but all the handling probably adds a large factor to that > (1000 or so). No clue on classical interrupt line latency - anyone? > About 9 nanosecond per foot of FR-4 (G10) trace, plus the access time through the gate-arrays (about 20 ns) so, from the time a device needs the CPU, until it hits the interrupt pin, you have typically 30 to 50 nanoseconds. Of course the CPU is __much__ slower. 
However, these physical latencies are in series, cannot be compensated for because the CPU can't see into the future. >> They will not be the salvation of any interrupt latency problems. > > This is also not the problem - we really don't care that our 100.000 packets > arrive 20usec slower per packet, just as long as the bus is not idle for those > intervals. We would care a lot if 25.000 of those arrive directly at the > proper CPU, without the need for one of the cpu's to arbitrate on every > interrupt. That's the idea anyway. It forces driver-writers to loop in ISRs to handle new status changes that happened before an asserted interrupt even got to the CPU. This is bad. You end up polled in the ISR, with the interrupts off. Turning on the interrupts exacerbates the problem, you may never leave the ISR! It becomes the new "idle task". To properly use interrupts, the hardware latency must be less than the CPUs response to the hardware stimulus. > > Nowadays with irq throttling we introduce a lot of designed latency anyway, > especially with network devices. > >> The solutions for increasing networking speed, >> where the bit-rate on the wire gets close to the bit-rate on the >> bus, is to put more and more of the networking code inside the >> network board. The CPU get interrupted after most things (like >> network handshakes) are complete. > > That is a limited vision of the situation. You could argue that the current > CPU's have so much power that they can easily do a lot of the processing > instead of the hardware, and thus warm caches for userspace, setup sockets > etc. This is the whole idea of Van Jacobsen's net channels. Putting more > offloading into the hardware just brings so much problems with itself, that > are just far easier solved in the OS. > > > Cheers, > > Auke > Cheers, Dick Johnson Penguin : Linux version 2.6.16.4 on an i686 machine (5592.89 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. 
Re: Van Jacobson's net channels and real-time
linux-os (Dick Johnson) wrote: On Mon, 24 Apr 2006, Auke Kok wrote: Ingo Oeser wrote: On Saturday, 22. April 2006 15:49, Jörn Engel wrote: That was another main point, yes. And the endpoints should be as little burden on the bottlenecks as possible. One bottleneck is the receive interrupt, which shouldn't wait for cachelines from other cpus too much. Thats right. This will be made a non issue with early demuxing on the NIC and MSI (or was it MSI-X?) which will select the right CPU based on hardware channels. MSI-X. with MSI you still have only one cpu handling all MSI interrupts and that doesn't look any different than ordinary interrupts. MSI-X will allow much better interrupt handling across several cpu's. Auke - Message signaled interrupts are just a kudge to save a trace on a PC board (read make junk cheaper still). yes. Also in PCI-Express there is no physical interrupt line anymore due to the architecture, so even classical interrupts are sent as "message" over the bus. They are not faster and may even be slower. thus in the case of PCI-Express, MSI interrupts are just as fast as the ordinary ones. I have no numbers on whether MSI is faster or not then e.g. interrupts on PCI-X, but generally speaking, the PCI-Express bus is not designed to be "low latency" at all, at best it gives you X latency, where X is something like microseconds. The MSI message itself only takes 10-20 nanoseconds though, but all the handling probably adds a large factor to that (1000 or so). No clue on classical interrupt line latency - anyone? They will not be the salvation of any interrupt latency problems. This is also not the problem - we really don't care that our 100.000 packets arrive 20usec slower per packet, just as long as the bus is not idle for those intervals. We would care a lot if 25.000 of those arrive directly at the proper CPU, without the need for one of the cpu's to arbitrate on every interrupt. That's the idea anyway. 
Nowadays with irq throttling we introduce a lot of designed latency anyway, especially with network devices. The solutions for increasing networking speed, where the bit-rate on the wire gets close to the bit-rate on the bus, is to put more and more of the networking code inside the network board. The CPU get interrupted after most things (like network handshakes) are complete. That is a limited vision of the situation. You could argue that the current CPU's have so much power that they can easily do a lot of the processing instead of the hardware, and thus warm caches for userspace, setup sockets etc. This is the whole idea of Van Jacobsen's net channels. Putting more offloading into the hardware just brings so much problems with itself, that are just far easier solved in the OS. Cheers, Auke - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Van Jacobson's net channels and real-time
On Mon, 24 Apr 2006, Rick Jones wrote: Thats right. This will be made a non issue with early demuxing on the NIC and MSI (or was it MSI-X?) which will select the right CPU based on hardware channels. >>> >>> MSI-X. with MSI you still have only one cpu handling all MSI interrupts and >>> that doesn't look any different than ordinary interrupts. MSI-X will allow >>> much better interrupt handling across several cpu's. >>> >>> Auke >>> - >> >> >> Message signaled interrupts are just a kudge to save a trace on a >> PC board (read make junk cheaper still). They are not faster and >> may even be slower. They will not be the salvation of any interrupt >> latency problems. The solutions for increasing networking speed, >> where the bit-rate on the wire gets close to the bit-rate on the >> bus, is to put more and more of the networking code inside the >> network board. The CPU get interrupted after most things (like >> network handshakes) are complete. > > if the issue is bus vs network bitrates would offloading really buy that > much? i suppose that for minimum sized packets not DMA'ing the headers > across the bus would be a decent win, but down at small packet sizes > where headers would be 1/3 to 1/2 the stuff DMA'd around, I would think > one is talking more about CPU path lengths than bus bitrates. > > and up and "full size" segments, since everyone is so fond of bulk > transfer tests, the transfer saved by not shovig headers across the bus > is what 54/1448 or ~3.75% > > spreading interrupts via MSI-X seems nice and all, but i keep wondering > if the header field-based distribution that is (will be) done by the > NICs is putting the cart before the horse - should the NIC essentially > be telling the system the CPU on which to run the application, or should > the CPU on which the application runs be telling "networking" where it > should be happening? 
> > rick jones > Ideally, TCP/IP is so mature that one should be able to tell some hardware state-machine "Connect with 123.555.44.333, port 23" and it signals via interrupt when that happens. Then one should be able to say "send these data to that address" or "fill this buffer with data from that address". All the networking could be done on the board, perhaps with a dedicated CPU (as is now done) or all in silicon. So, the driver end of the networking software just handles buffers. There are interrupts that show status such as completions or time-outs, trivial stuff. Cheers, Dick Johnson Penguin : Linux version 2.6.16.4 on an i686 machine (5592.89 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. _ The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [EMAIL PROTECTED] - and destroy all copies of this information, including any attachments, without reading or disclosing them. Thank you. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Van Jacobson's net channels and real-time
[EMAIL PROTECTED] wrote: > Subject: Re: Van Jacobson's net channels and real-time > > > On Mon, 24 Apr 2006, Auke Kok wrote: > >> Ingo Oeser wrote: >>> On Saturday, 22. April 2006 15:49, Jörn Engel wrote: >>>> That was another main point, yes. And the endpoints should be as >>>> little burden on the bottlenecks as possible. One bottleneck is >>>> the receive interrupt, which shouldn't wait for cachelines from >>>> other cpus too much. >>> >>> Thats right. This will be made a non issue with early demuxing on >>> the NIC and MSI (or was it MSI-X?) which will select the right CPU >>> based on hardware channels. >> >> MSI-X. with MSI you still have only one cpu handling all MSI >> interrupts and that doesn't look any different than ordinary >> interrupts. MSI-X will allow much better interrupt handling across >> several cpu's. >> >> Auke >> - > > Message signaled interrupts are just a kudge to save a trace > on a PC board (read make junk cheaper still). They are not > faster and may even be slower. They will not be the salvation > of any interrupt latency problems. The solutions for > increasing networking speed, where the bit-rate on the wire > gets close to the bit-rate on the bus, is to put more and > more of the networking code inside the network board. The CPU > get interrupted after most things (like network handshakes) > are complete. > The number of hardware interrupts supported is a bit out of scope. Whatever the capacity is, the key is to have as few meaningless interrupts as possible. In the context of netchannels this would mean that an interrupt should only be fired when there is a sufficient number of packets for the user-mode code to process. Fully offloading the protocol to the hardware is certainly one option, that I also thinks make sense, but the goal of netchannels is to try to optimize performance while keeping TCP processing on host. More hardware offload is distinctly possible and relevant in this context. 
Stateful offloads, such as TSO, are fully relevant. Going directly from the NIC to the channel is also possible (after the channel is set up by the kernel, of course). If the NIC is aware of the channels directly, then interrupts can be limited to packets that cross per-channel thresholds configured directly by the ring consumer.
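The per-channel threshold idea in the last sentence can be sketched in plain C. This is only a toy model under stated assumptions: `struct chan_irq`, `chan_packet_arrived()` and `chan_drained()` are hypothetical names that do not appear in the thread or in any driver, and "crossing" the threshold is taken to mean the backlog reaching it exactly once per burst.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-channel state; none of these names come from the thread. */
struct chan_irq {
	unsigned long queued;     /* packets queued since the consumer last drained */
	unsigned long threshold;  /* configured by the ring consumer, must be >= 1 */
};

/*
 * Called once for every packet placed on the channel. Returns true only
 * for the packet that makes the backlog reach the threshold, so a burst
 * raises one interrupt instead of one per packet.
 */
bool chan_packet_arrived(struct chan_irq *c)
{
	assert(c->threshold >= 1);
	c->queued++;
	return c->queued == c->threshold;
}

/* Called by the consumer once it has emptied the channel. */
void chan_drained(struct chan_irq *c)
{
	c->queued = 0;
}
```

With a threshold of 4, ten back-to-back packets would signal once; after the consumer drains the channel, the next four packets would signal once more. A real design would presumably also need a timeout so that a short burst below the threshold is not left waiting, which this sketch leaves out.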
Re: Van Jacobson's net channels and real-time
>>> That's right. This will be made a non-issue with early demuxing
>>> on the NIC and MSI (or was it MSI-X?) which will select
>>> the right CPU based on hardware channels.
>>
>> MSI-X. With MSI you still have only one cpu handling all MSI interrupts and
>> that doesn't look any different than ordinary interrupts. MSI-X will allow
>> much better interrupt handling across several cpu's.
>>
>> Auke
>
> Message signaled interrupts are just a kludge to save a trace on a
> PC board (read make junk cheaper still). They are not faster and
> may even be slower. They will not be the salvation of any interrupt
> latency problems. The solution for increasing networking speed,
> where the bit-rate on the wire gets close to the bit-rate on the
> bus, is to put more and more of the networking code inside the
> network board. The CPU gets interrupted after most things (like
> network handshakes) are complete.

if the issue is bus vs network bitrates would offloading really buy that much? i suppose that for minimum sized packets not DMA'ing the headers across the bus would be a decent win, but down at small packet sizes where headers would be 1/3 to 1/2 the stuff DMA'd around, I would think one is talking more about CPU path lengths than bus bitrates.

and up at "full size" segments, since everyone is so fond of bulk transfer tests, the transfer saved by not shoving headers across the bus is what 54/1448 or ~3.75%

spreading interrupts via MSI-X seems nice and all, but i keep wondering if the header field-based distribution that is (will be) done by the NICs is putting the cart before the horse - should the NIC essentially be telling the system the CPU on which to run the application, or should the CPU on which the application runs be telling "networking" where it should be happening?

rick jones
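The header arithmetic in this message can be checked with a few lines of C. This is a sketch only: the 54 and 1448 byte figures are taken from the message as-is, while the function names and the small-packet payload sizes (54 and 108 bytes) are assumptions picked to land on the "1/3 to 1/2" range quoted above.

```c
#include <assert.h>

/* Header bytes relative to payload bytes, the ratio used in the message. */
double header_ratio(unsigned header_bytes, unsigned payload_bytes)
{
	assert(payload_bytes > 0);
	return (double)header_bytes / (double)payload_bytes;
}

/* Header bytes as a share of everything moved across the bus. */
double header_share_of_transfer(unsigned header_bytes, unsigned payload_bytes)
{
	assert(header_bytes + payload_bytes > 0);
	return (double)header_bytes / (double)(header_bytes + payload_bytes);
}
```

`header_ratio(54, 1448)` comes out at roughly 0.0373, slightly under the rounded ~3.75% in the message. `header_share_of_transfer(54, 54)` is 0.5 and `header_share_of_transfer(54, 108)` is about 0.33, which matches the small-packet range.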
Re: Van Jacobson's net channels and real-time
On Mon, 24 Apr 2006, Auke Kok wrote:

> Ingo Oeser wrote:
>> On Saturday, 22. April 2006 15:49, Jörn Engel wrote:
>>> That was another main point, yes. And the endpoints should be as
>>> little burden on the bottlenecks as possible. One bottleneck is the
>>> receive interrupt, which shouldn't wait for cachelines from other cpus
>>> too much.
>>
>> That's right. This will be made a non-issue with early demuxing
>> on the NIC and MSI (or was it MSI-X?) which will select
>> the right CPU based on hardware channels.
>
> MSI-X. With MSI you still have only one cpu handling all MSI interrupts and
> that doesn't look any different than ordinary interrupts. MSI-X will allow
> much better interrupt handling across several cpu's.
>
> Auke

Message signaled interrupts are just a kludge to save a trace on a PC board (read make junk cheaper still). They are not faster and may even be slower. They will not be the salvation of any interrupt latency problems. The solution for increasing networking speed, where the bit-rate on the wire gets close to the bit-rate on the bus, is to put more and more of the networking code inside the network board. The CPU gets interrupted after most things (like network handshakes) are complete.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.4 on an i686 machine (5592.89 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.
Re: Van Jacobson's net channels and real-time
Ingo Oeser wrote:
> On Saturday, 22. April 2006 15:49, Jörn Engel wrote:
>> That was another main point, yes. And the endpoints should be as
>> little burden on the bottlenecks as possible. One bottleneck is the
>> receive interrupt, which shouldn't wait for cachelines from other cpus
>> too much.
>
> That's right. This will be made a non-issue with early demuxing
> on the NIC and MSI (or was it MSI-X?) which will select
> the right CPU based on hardware channels.

MSI-X. With MSI you still have only one cpu handling all MSI interrupts and that doesn't look any different than ordinary interrupts. MSI-X will allow much better interrupt handling across several cpu's.

Auke
Re: Van Jacobson's net channels and real-time
Hi Dave,

On Sunday, 23. April 2006 07:56, David S. Miller wrote:
> > If cacheline bouncing because of the shared filled_entries becomes an issue,
> > you are receiving or sending a lot.
>
> Cacheline bouncing is the core issue being addressed by this
> data structure, so we really can't consider your idea seriously.

Ok, I can see it now more clearly. Many thanks for clearing that up in the other replies. I had a major misunderstanding there.

> I've just got an off-by-one error, no need to wreck the entire
> data structure just to solve that :-)

Yes, you are right. But even then I can still implement the reserve/commit once you provide the helpers for producer_space and consumer_space.

Regards

Ingo Oeser
Re: Van Jacobson's net channels and real-time
Ingo Oeser wrote:
> Hi Jörn,
>
> On Saturday, 22. April 2006 13:48, Jörn Engel wrote:
>> Unless I completely misunderstand something, one of the main points of
>> the netchannels is to have *zero* fields written to by both producer
>> and consumer.
>
> Hmm, for me the main point was to keep the complete processing
> of a single packet within one CPU/Core where this is a non-issue.

But the interrupt for a packet can be received by cpu 0 whereas the rest of processing proceeds on cpu 1; so it still helps to keep the producer index and consumer index on separate cachelines.

--
error compiling committee.c: too many arguments to function
Re: Van Jacobson's net channels and real-time
From: Ingo Oeser <[EMAIL PROTECTED]>
Date: Fri, 21 Apr 2006 18:52:47 +0200

> nice to see you getting started with it.

Thanks for reviewing.

> I'm not sure about the queue logic there.
>
> /* Caller must have exclusive producer access to the netchannel. */
> int netchannel_enqueue(struct netchannel *np, struct netchannel_buftrailer *bp)
> {
> 	unsigned long tail;
>
> 	tail = np->netchan_tail;
> 	if (tail == np->netchan_head)
> 		return -ENOMEM;
>
> This looks wrong, since empty and full are the same condition in your
> case.

Thanks, that's obviously wrong. I'll try to fix this up.

> What about sth. like
>
> struct netchannel {
> 	/* This is only read/written by the writer (producer) */
> 	unsigned long write_ptr;
> 	struct netchannel_buftrailer *netchan_queue[NET_CHANNEL_ENTRIES];
>
> 	/* This is modified by both */
> 	atomic_t filled_entries; /* cache_line_align this? */
>
> 	/* This is only read/written by the reader (consumer) */
> 	unsigned long read_ptr;
> }

As stated elsewhere, if you add atomic operations you break the entire idea of net channels. They are meant to be SMP efficient data structures where the producer has one cache line that only it dirties and the consumer has one cache line that likewise only it dirties.

> If cacheline bouncing because of the shared filled_entries becomes an issue,
> you are receiving or sending a lot.

Cacheline bouncing is the core issue being addressed by this data structure, so we really can't consider your idea seriously.

I've just got an off-by-one error, no need to wreck the entire data structure just to solve that :-)
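For illustration, here is a minimal userspace sketch of a single-producer/single-consumer ring in which "full" and "empty" are distinct conditions and no field is written by both sides. It is not the netchannel patch and not Van Jacobson's code. `struct ring`, `RING_ENTRIES`, the 64-byte `CACHE_LINE` and all function names are assumptions. The C11 acquire/release loads and stores stand in for whatever barriers kernel code would use; they are plain loads and stores with ordering constraints, not the atomic read-modify-write counter objected to above.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_ENTRIES 8   /* hypothetical queue size */
#define CACHE_LINE   64  /* assumed cache line size in bytes */

static_assert(RING_ENTRIES >= 2, "one slot is always left unused");

struct ring {
	/* Written only by the producer, read by the consumer. */
	_Alignas(CACHE_LINE) atomic_ulong tail;
	/* Written only by the consumer, read by the producer. */
	_Alignas(CACHE_LINE) atomic_ulong head;
	/* Slots are filled by the producer and emptied by the consumer. */
	_Alignas(CACHE_LINE) void *slot[RING_ENTRIES];
};

void ring_init(struct ring *r)
{
	atomic_init(&r->tail, 0);
	atomic_init(&r->head, 0);
}

/* Caller must be the only producer. Returns 0, or -ENOMEM when full. */
int ring_enqueue(struct ring *r, void *item)
{
	unsigned long tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned long next = (tail + 1) % RING_ENTRIES;

	/* Full when advancing tail would reach head; head == tail means empty. */
	if (next == atomic_load_explicit(&r->head, memory_order_acquire))
		return -ENOMEM;

	r->slot[tail] = item;
	/* Make the slot contents visible before the new tail. */
	atomic_store_explicit(&r->tail, next, memory_order_release);
	return 0;
}

/* Caller must be the only consumer. Returns NULL when empty. */
void *ring_dequeue(struct ring *r)
{
	unsigned long head = atomic_load_explicit(&r->head, memory_order_relaxed);
	void *item;

	if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
		return NULL;

	item = r->slot[head];
	atomic_store_explicit(&r->head, (head + 1) % RING_ENTRIES,
			      memory_order_release);
	return item;
}
```

The cost of telling full from empty this way is one unused slot: a ring of `RING_ENTRIES` holds at most `RING_ENTRIES - 1` items. The producer only ever stores to `tail` and the consumer only ever stores to `head`, each on its own cache line.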
Re: Van Jacobson's net channels and real-time
From: bert hubert <[EMAIL PROTECTED]>
Date: Sat, 22 Apr 2006 21:30:24 +0200

> On Thu, Apr 20, 2006 at 12:09:55PM -0700, David S. Miller wrote:
> > Going all the way to the socket is a large endeavor and will require a
> > lot of restructuring to do it right, so expect this to take on the
> > order of months.
>
> That's what you said about Niagara too :-)

I'm just trying to keep the expectations low so it's easier to exceed them :-)
Re: Van Jacobson's net channels and real-time
From: Ingo Oeser <[EMAIL PROTECTED]>
Date: Sat, 22 Apr 2006 15:29:58 +0200

> On Saturday, 22. April 2006 13:48, Jörn Engel wrote:
> > Unless I completely misunderstand something, one of the main points of
> > the netchannels is to have *zero* fields written to by both producer
> > and consumer.
>
> Hmm, for me the main point was to keep the complete processing
> of a single packet within one CPU/Core where this is a non-issue.

Both are the important issues.

You move the bulk of the packet processing work to the end cores of the system, yes. But you do so with an enormously SMP friendly queue data structure so that it does not matter at all that the packet is received on one cpu, yet processed in socket context on another.

If you elide either part of the implementation, you miss the entire point of net channels.
Re: Van Jacobson's net channels and real-time
From: Jörn Engel <[EMAIL PROTECTED]>
Date: Sat, 22 Apr 2006 13:48:46 +0200

> Unless I completely misunderstand something, one of the main points of
> the netchannels is to have *zero* fields written to by both producer
> and consumer. Receiving and sending a lot can be expected to be the
> common case, so taking a performance hit in this case is hardly a good
> idea.

That's absolutely correct, this is absolutely critical to the implementation.

If you're doing any atomic operations, or any write operations by both consumer and producer to the same cacheline, you've broken things :-)
Re: Van Jacobson's net channels and real-time
From: Ingo Oeser <[EMAIL PROTECTED]>
Date: Sun, 23 Apr 2006 02:05:32 +0200

> On Saturday, 22. April 2006 15:49, Jörn Engel wrote:
> > That was another main point, yes. And the endpoints should be as
> > little burden on the bottlenecks as possible. One bottleneck is the
> > receive interrupt, which shouldn't wait for cachelines from other cpus
> > too much.
>
> That's right. This will be made a non-issue with early demuxing
> on the NIC and MSI (or was it MSI-X?) which will select
> the right CPU based on hardware channels.

It is not clear that MSI'ing the RX interrupt to multiple cpus is the answer.

Consider the fact that by doing so you're reducing the amount of batch work each interrupt does by a factor N. One of the biggest gains of NAPI btw is that it batches packet receive. If you don't believe the benefits of this, put a simple cycle counter sample around netif_receive_skb() calls, and note the difference between the first packet processed and subsequent ones: it's several orders of magnitude faster to process subsequent packets within a batch. I've done this before on tg3 with sparc64 and posted the numbers on netdev about a year or so ago.

If you are doing something like netchannels, it helps to batch so that the demuxing table stays hot in the cpu cache.

There is even talk of dedicating a thread on enormously multi-threaded cpus just to the NIC hardware interrupt, so it could net channel to the socket processes running on the other strands.
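The effect of batch size on the number of interrupt-driven passes can be sketched as a toy model in plain C. This is not NAPI code: `struct rx_queue`, `rx_poll()` and `rx_drain()` are hypothetical names, and decrementing a counter stands in for actually receiving a packet. The model only counts how many passes are needed; it does not reproduce the cache-warmth measurement described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive queue; none of these names are kernel APIs. */
struct rx_queue {
	size_t pending;    /* packets waiting in the RX ring */
	size_t processed;  /* packets handed up the stack so far */
	size_t polls;      /* poll passes taken, one per "interrupt" */
};

/* Handle at most `budget` packets in one pass; returns how many were handled. */
size_t rx_poll(struct rx_queue *q, size_t budget)
{
	size_t done = 0;

	assert(budget > 0);
	q->polls++;
	while (done < budget && q->pending > 0) {
		q->pending--;      /* stand-in for receiving one packet */
		q->processed++;
		done++;
	}
	return done;
}

/* Poll until the queue is empty; returns the number of passes it took. */
size_t rx_drain(struct rx_queue *q, size_t budget)
{
	while (q->pending > 0)
		rx_poll(q, budget);
	return q->polls;
}
```

In this model, 100 pending packets take 100 passes with a budget of 1 and 2 passes with a budget of 64. Splitting the same 100 packets evenly over four queues caps each pass at 25 packets, which is the "factor N" reduction in batch work the message warns about.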
Re: Van Jacobson's net channels and real-time
On Saturday, 22. April 2006 15:49, Jörn Engel wrote:
> That was another main point, yes. And the endpoints should be as
> little burden on the bottlenecks as possible. One bottleneck is the
> receive interrupt, which shouldn't wait for cachelines from other cpus
> too much.

That's right. This will be made a non-issue with early demuxing on the NIC and MSI (or was it MSI-X?) which will select the right CPU based on hardware channels.

In the meantime I would reduce the effects by only committing on a full buffer or on leaving the interrupt handler.

This would be ok, because here you have to wake up the process anyway on a full buffer and if it slept because of an empty buffer.

You lose only if your application didn't sleep yet and you need to leave the interrupt handler because there is no more work. In this case the atomic_add would be significant.

All this is quite similar to how we do page_vec stuff in mm/ already.

Regards

Ingo Oeser
Re: Van Jacobson's net channels and real-time
On Fri, 21 April 2006 18:52:47 +0200, Ingo Oeser wrote:
> What about sth. like
>
> struct netchannel {
> 	/* This is only read/written by the writer (producer) */
> 	unsigned long write_ptr;
> 	struct netchannel_buftrailer *netchan_queue[NET_CHANNEL_ENTRIES];
>
> 	/* This is modified by both */
> 	atomic_t filled_entries; /* cache_line_align this? */
>
> 	/* This is only read/written by the reader (consumer) */
> 	unsigned long read_ptr;
> }
>
> This would prevent this bug from the beginning and let us still use the
> full queue size.
>
> If cacheline bouncing because of the shared filled_entries becomes an issue,
> you are receiving or sending a lot.

Unless I completely misunderstand something, one of the main points of the netchannels is to have *zero* fields written to by both producer and consumer. Receiving and sending a lot can be expected to be the common case, so taking a performance hit in this case is hardly a good idea.

I haven't looked at Davem's implementation at all, but Van simply separated fields into consumer-written and producer-written, with proper alignment between them. Some consumer-written fields are also read by the producer and vice versa. But none of this results in cacheline pingpong.

If your description of the problem is correct, it should only mean that the implementation has a problem, not the concept.

Jörn

--
Time? What's that? Time is only worth what you do with it.
-- Theo de Raadt
Re: Van Jacobson's net channels and real-time
On Sat, 22 April 2006 15:29:58 +0200, Ingo Oeser wrote:
> On Saturday, 22. April 2006 13:48, Jörn Engel wrote:
> > Unless I completely misunderstand something, one of the main points of
> > the netchannels is to have *zero* fields written to by both producer
> > and consumer.
>
> Hmm, for me the main point was to keep the complete processing
> of a single packet within one CPU/Core where this is a non-issue.

That was another main point, yes. And the endpoints should be as little burden on the bottlenecks as possible. One bottleneck is the receive interrupt, which shouldn't wait for cachelines from other cpus too much.

Jörn

--
Why do musicians compose symphonies and poets write poems? They do it because life wouldn't have any meaning for them if they didn't. That's why I draw cartoons. It's my life.
-- Charles Schulz
Re: Van Jacobson's net channels and real-time
Hi Jörn,

On Saturday, 22. April 2006 13:48, Jörn Engel wrote:
> Unless I completely misunderstand something, one of the main points of
> the netchannels is to have *zero* fields written to by both producer
> and consumer.

Hmm, for me the main point was to keep the complete processing
of a single packet within one CPU/Core where this is a non-issue.

> Receiving and sending a lot can be expected to be the
> common case, so taking a performance hit in this case is hardly a good
> idea.

There is no hit. If you receive/send in bursts you can simply aggregate
them up to a certain queueing threshold.

The queue design outlined can split the queueing into reserve and commit
stages, where the producer is told how much it can produce and the
consumer is told how much it can consume. Within their areas the producer
and consumer can freely move around. So this is not exactly a queue, but
a dynamic double buffer :-)

So maybe doing queueing with the classic head/tail variant is better here,
but the other variant might replace it without problems and allows for
some nice improvements.

Regards

Ingo Oeser
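One possible reading of the reserve/commit split is sketched below in user-space C11. It is an assumption about what such a design could look like, not code from the thread: `struct bq` and all `bq_*` functions are hypothetical names. The shared counter is touched once per batch, while the per-entry work uses only fields private to one side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define BQ_ENTRIES 8  /* hypothetical queue size */

struct bq {
    /* Producer-private state. */
    unsigned long write_ptr;
    unsigned long produced;        /* filled but not yet committed */
    void *slot[BQ_ENTRIES];
    /* Shared: committed entries not yet released by the consumer. */
    atomic_ulong filled_entries;
    /* Consumer-private state. */
    unsigned long read_ptr;
    unsigned long consumed;        /* read but not yet committed */
};

void bq_init(struct bq *q)
{
    q->write_ptr = q->read_ptr = 0;
    q->produced = q->consumed = 0;
    atomic_init(&q->filled_entries, 0);
}

/* Reserve stage, producer: how many more slots may be filled right now. */
unsigned long bq_reserve_put(struct bq *q)
{
    unsigned long filled =
        atomic_load_explicit(&q->filled_entries, memory_order_acquire);
    return BQ_ENTRIES - filled - q->produced;
}

/* Fill one reserved slot. The caller must stay within the reserved count. */
void bq_put(struct bq *q, void *item)
{
    assert(q->produced < BQ_ENTRIES);
    q->slot[q->write_ptr] = item;
    q->write_ptr = (q->write_ptr + 1) % BQ_ENTRIES;
    q->produced++;
}

/* Commit stage, producer: publish the whole batch with one atomic add. */
void bq_commit_put(struct bq *q)
{
    atomic_fetch_add_explicit(&q->filled_entries, q->produced,
                              memory_order_release);
    q->produced = 0;
}

/* Reserve stage, consumer: how many more entries may be read right now. */
unsigned long bq_reserve_get(struct bq *q)
{
    unsigned long filled =
        atomic_load_explicit(&q->filled_entries, memory_order_acquire);
    return filled - q->consumed;
}

/* Read one reserved entry. The caller must stay within the reserved count. */
void *bq_get(struct bq *q)
{
    void *item = q->slot[q->read_ptr];
    q->read_ptr = (q->read_ptr + 1) % BQ_ENTRIES;
    q->consumed++;
    return item;
}

/* Commit stage, consumer: release the whole batch with one atomic sub. */
void bq_commit_get(struct bq *q)
{
    atomic_fetch_sub_explicit(&q->filled_entries, q->consumed,
                              memory_order_release);
    q->consumed = 0;
}
```

Entries become visible to the other side only at commit time, so the commit threshold trades latency against the number of accesses to the shared counter. The counter is still written by both sides, which is the point Jörn objects to.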
Re: Van Jacobson's net channels and real-time
On Thu, Apr 20, 2006 at 12:09:55PM -0700, David S. Miller wrote:
> Going all the way to the socket is a large endeavor and will require a
> lot of restructuring to do it right, so expect this to take on the
> order of months.

That's what you said about Niagara too :-)

Good luck!

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl         Open and Closed source services
Re: Van Jacobson's net channels and real-time
Hi David,

nice to see you getting started with it.

I'm not sure about the queue logic there.

    /* Caller must have exclusive producer access to the netchannel. */
    int netchannel_enqueue(struct netchannel *np, struct netchannel_buftrailer *bp)
    {
            unsigned long tail;

            tail = np->netchan_tail;
            if (tail == np->netchan_head)
                    return -ENOMEM;

This looks wrong, since empty and full are the same condition in your case.

    struct netchannel_buftrailer *__netchannel_dequeue(struct netchannel *np)
    {
            unsigned long head = np->netchan_head;
            struct netchannel_buftrailer *bp = np->netchan_queue[head];

            BUG_ON(np->netchan_tail == head);

See?

What about sth. like

    struct netchannel {
            /* This is only read/written by the writer (producer) */
            unsigned long write_ptr;
            struct netchannel_buftrailer *netchan_queue[NET_CHANNEL_ENTRIES];

            /* This is modified by both */
            atomic_t filled_entries; /* cache_line_align this? */

            /* This is only read/written by the reader (consumer) */
            unsigned long read_ptr;
    };

This would prevent this bug from the beginning and let us still use the
full queue size.

If cacheline bouncing because of the shared filled_entries becomes an issue,
you are receiving or sending a lot.

Then you can enqueue and dequeue multiple entries and commit the counts
later. This can be done with an atomic_read, atomic_add and atomic_sub on
filled_entries.

Maybe even cheaper with local_t instead of atomic_t later on.

But I guess the cacheline bouncing will be a non-issue, since the whole
point of netchannels was to keep traffic as local to a cpu as possible, right?

Would you like to see a sample patch relative to your tree,
to show you what I mean?

Regards

Ingo Oeser
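The counter-based layout proposed above can be sketched in user-space C11 to show why it removes the empty/full ambiguity. This is a hypothetical illustration, not a patch against the vj-2.6 tree: `struct nc`, `nc_enqueue`, `nc_dequeue` and `NC_ENTRIES` are made-up stand-ins, and C11 atomics replace the kernel's `atomic_t`. The point is that `write_ptr == read_ptr` holds both when the queue is empty and when it is completely full, while `filled_entries` tells the two cases apart.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define NC_ENTRIES 4  /* hypothetical stand-in for NET_CHANNEL_ENTRIES */

struct nc {
    unsigned long write_ptr;       /* read/written by the producer only */
    void *queue[NC_ENTRIES];
    atomic_int filled_entries;     /* modified by both sides */
    unsigned long read_ptr;        /* read/written by the consumer only */
};

void nc_init(struct nc *q)
{
    q->write_ptr = 0;
    q->read_ptr = 0;
    atomic_init(&q->filled_entries, 0);
}

/* Producer side. Returns 0 on success, -ENOMEM if all slots are in use. */
int nc_enqueue(struct nc *q, void *item)
{
    assert(item != NULL);          /* NULL is reserved for "empty" */
    if (atomic_load_explicit(&q->filled_entries,
                             memory_order_acquire) == NC_ENTRIES)
        return -ENOMEM;            /* full, decided by the count */
    q->queue[q->write_ptr] = item;
    q->write_ptr = (q->write_ptr + 1) % NC_ENTRIES;
    /* Make the slot visible before the count goes up. */
    atomic_fetch_add_explicit(&q->filled_entries, 1, memory_order_release);
    return 0;
}

/* Consumer side. Returns NULL if nothing is queued. */
void *nc_dequeue(struct nc *q)
{
    void *item;

    if (atomic_load_explicit(&q->filled_entries,
                             memory_order_acquire) == 0)
        return NULL;               /* empty, decided by the count */
    item = q->queue[q->read_ptr];
    q->read_ptr = (q->read_ptr + 1) % NC_ENTRIES;
    /* Free the slot only after it has been read. */
    atomic_fetch_sub_explicit(&q->filled_entries, 1, memory_order_release);
    return item;
}
```

All `NC_ENTRIES` slots are usable here, at the cost of one atomic read-modify-write on a shared field per operation.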
Re: Van Jacobson's net channels and real-time
[ Maybe ask questions like this on "netdev" where the networking
  developers hang out? Added to CC: ]

Van fell off the face of the planet after giving his presentation and
never published his code, only his slides.

I've started to make a slow attempt at implementing his ideas, nothing
but pure infrastructure so far, but you can look at what I have here:

        kernel.org:/pub/scm/linux/kernel/git/davem/vj-2.6.git

Don't expect major progress and don't expect anything beyond a simple
channel to softint packet processing on receive any time soon.

Going all the way to the socket is a large endeavor and will require a
lot of restructuring to do it right, so expect this to take on the
order of months.