Re: DP performance
Danial Thom wrote: What do you think my word is? My only point was that I use the usage level at which a machine starts dropping packets to determine its point of capacity. I don't see how I can be wrong about anything, since it's hard to argue against that point. And what do you think Matt's point was? I don't even think it's relevant. .. Whoever you are, you may well be knowledgeable; I am personally not able to judge. But your problem is psychological, not technical. I don't know what you are looking for by harassing people on technical forums such as this one. But whatever you are looking for, nobody can ever give it to you. At this point, there are enough elements here and elsewhere for you to realise the problem lies within you. Get the appropriate help (maybe even a psychiatrist for a starter, though a good one is tough to find). Best of luck, Raphaël
Re: DP performance
On Mon, Dec 12, 2005 at 11:07:08AM +0100, Raphael Marmier wrote: : Danial Thom wrote: : : What do you think my word is? My only point was : that I use the usage level at which a machine : starts dropping packets to determine its point of : capacity. I don't see how I can be wrong about : anything, since its hard to argue against that What OS do you choose to run that comes closest to meeting your standards? Jonathon McKitrick -- My other computer is your Windows box.
Re: DP performance
--- Martin P. Hellwig [EMAIL PROTECTED] wrote: Danial Thom wrote: --- Martin P. Hellwig [EMAIL PROTECTED] wrote: cut all Okay, so when you're the expert on practical implementation, what do you make of the surfnet internet2 test results (they did actually test current normal hardware too) that prove your practical hypothesis wrong? Or do you just deny the results and continue on trolling? What test are you referring to, and what do you think it disproves? Dear Danial, Your lack of knowledge on this topic is not a valid excuse for being irrationally stubborn. Are you related to Edgar Allan Poe by some chance? I'm not sure I know which topic you're referring to, since all you do is make vague references to things that don't seem related to anything. First you cited switch specs without an example or what part of the spec made your point, and then you made some vague reference to a large organization that's in bed with Cisco, without mentioning any specific test or what point the test proves. I understand that the last time you said something you were badly beaten down, so maybe you're just afraid to say anything as it may tarnish your reputation as someone who knows something?
Re: DP performance
Danial Thom wrote: cut Are you related to Edgar Allan Poe by some chance? I'm not sure I know which topic you're referring to, since all you do is make vague references to things that don't seem related to anything. First you cited switch specs without an example or what part of the spec made your point, and then you made some vague reference to a large organization that's in bed with Cisco, without mentioning any specific test or what point the test proves. I understand that the last time you said something you were badly beaten down, so maybe you're just afraid to say anything as it may tarnish your reputation as someone who knows something? You pretty much fell for it: by using the same discussion technique you use yourself, you reacted exactly the way you expect others to react. Normally, once that situation is created, a bit more poking is still needed before the opposite party reveals enough information for a certain identification; in your case, however, two posts were enough. So without further reservation I may without any doubt welcome back EM1897, this time with a valid AOL e-mail account. -- mph
Re: DP performance
Why won't you answer any questions or provide details of your ideas? I keep asking, but you never actually say anything. --- Martin P. Hellwig [EMAIL PROTECTED] wrote: Danial Thom wrote: cut Are you related to Edgar Allan Poe by some chance? I'm not sure I know which topic you're referring to, since all you do is make vague references to things that don't seem related to anything. First you cited switch specs without an example or what part of the spec made your point, and then you made some vague reference to a large organization that's in bed with Cisco, without mentioning any specific test or what point the test proves. I understand that the last time you said something you were badly beaten down, so maybe you're just afraid to say anything as it may tarnish your reputation as someone who knows something? You pretty much fell for it: by using the same discussion technique you use yourself, you reacted exactly the way you expect others to react. Normally, once that situation is created, a bit more poking is still needed before the opposite party reveals enough information for a certain identification; in your case, however, two posts were enough. So without further reservation I may without any doubt welcome back EM1897, this time with a valid AOL e-mail account. -- mph
Re: DP performance
I am terminating this thread on Monday at noon. People have till then to say their piece. -Matt
Re: DP performance
Perhaps I have a stake in one of the beta-quality OSes (FreeBSD 5.x+, DragonFly, etc.) actually becoming useful? It's very frustrating to see a strong, solid OS with a strong management team fragment into a bunch of mini-teams, none of which have a wide enough range of expertise to be ultimately successful. You obviously have no one on your team who is strong in networking, because you don't even have the ability to understand the issues (apparently), much less come up with viable solutions. If the performance of makeworld is your criterion, and you have no large network exposure (since you are in denial of packet loss and think that flow controlling a loaded GigE switch is a solution), I don't see how there is any chance for success beyond being a neat desktop OS. I'm feeling a bit like Charlton Heston in The Planet of the Apes here. Long Live Matt! Long Live Matt! --- Hiten Pandya [EMAIL PROTECTED] wrote: Dude, if you have made and are making millions of benjamins, I don't think a person in his right mind would be spending his time arguing Gig-E performance speeds on a mailing list for a beta-quality OS. :-) If I was a person that made millions, I would definitely be planning on buying an AMG convertible and thinking of buying some Cuban cigars. So please, lay off the shit of how much you make and talk some sense here. No, we don't think Matt is all singing, all dancing and all knowing, BUT, he does talk sense most of the time. Anyway, this feels a lot like troll targeting. If you really think Matt is right, or people who agree with his viewpoints, then do provide some numbers and we will take it from there. Kind Regards, -- Hiten Pandya hmp at dragonflybsd.org Danial Thom wrote: I, on the other hand, have made millions of $$ designing and selling network equipment based on unix-like OSes, so I'm not only qualified to lecture on the subject, but I'm also qualified to tell Matt that he's dead wrong about just about everything he's said in this thread. If you think
Re: DP performance
--- Erik Wikström [EMAIL PROTECTED] wrote: On 2005-12-02 19:16, Danial Thom wrote: All of the empirical evidence points to Matt being wrong. If you still can't accept that then DFLY is more of a religion than a project, which is a damn shame. DT Since I don't know anything about networking at GigE-speed I find this whole discussion very interesting and I hope to learn something new. However, as always when two people both believe they are right, it's hard for me to really choose whom to trust. However Matt has provided some arguments that I find very convincing (calculations and reasoning). Now, since you say that all empirical evidence suggests that he is wrong I suppose that you have some other numbers (wouldn't be very empirical otherwise would it?) that you could show me (like benchmarking or such). Then I might decide what to think when I've seen both sides' arguments. Matt admits that he's not sure how the PCI bus works, so why do you find his calculations convincing? He can't possibly do calculations if he doesn't understand the impact of the bus on the operations in question. Try profiling a kernel that's doing nothing but bridging packets. His argument that hardly any CPU is used is foolish and wrong. There are millions of operations required to move a packet from one interface to another. You can't explain that in A+B=C math. You have to do real testing. Why do you think that flow-controlling a gigabit switch is an ok solution? What do you think the switch is going to do with the traffic? It's going to dump it. So you've moved the dropped packets from one box to another; the result is still packet loss. You've solved nothing. None of his answers address the original question, which is when will MP be as fast or close to as fast as UP. His answer is that networking takes no CPU cycles. It's the dumbest thing I've ever heard.
Re: DP performance
cut What do you think the switch is going to do with the traffic? It's going to dump it. The only argument you gave is false; read the full specs of any modern switch (i.e., all 1Gb switches) -- mph
Re: DP performance
--- Martin P. Hellwig [EMAIL PROTECTED] wrote: cut What do you think the switch is going to do with the traffic? It's going to dump it. The only argument you gave is false; read the full specs of any modern switch (i.e., all 1Gb switches) -- mph If I relied on specs for my info I'd be in the same boat you guys are in, so I don't. Specs are a nice upper limit but you can hardly come to any conclusions based on specs alone. Why don't you post a snippet from one of your specs to illustrate how long you can flow control a switch before it starts dumping packets. And remember that a spec is the max a box can do, so that's with only 2 ports and large packets. So then you can interpolate out (assuming you want to use math to solve your problems) to much smaller packets and many more ports contending for bus bandwidth. Like I said, it's time to get out of college-mode and get yourself a test bed. It's the only way to learn how things really work, rather than how they're supposed to work. The truth is that switches under load don't like to be flow controlled, and they drop packets when their queues are at relatively low watermarks. Christ, some switches drop packets at 300K pps when they're not flow controlled. Besides, flow control isn't part of the argument. Performance isn't about how gracefully you can fail to perform a task; it's about being able to perform the task without having to resort to using flow control. To me, a box that is issuing flow control is no better than one that drops packets. Both have failed to do the job required.
Re: DP performance
--- Martin P. Hellwig [EMAIL PROTECTED] wrote: cut all Okay, so when you're the expert on practical implementation, what do you make of the surfnet internet2 test results (they did actually test current normal hardware too) that prove your practical hypothesis wrong? Or do you just deny the results and continue on trolling? What test are you referring to, and what do you think it disproves? DT
Re: DP performance
Vinicius Santos wrote: I wonder why it is that important 'who' Danial Thom is, or even who Matthew Dillon is, in this kind of discussion. I thought that theory, reasoning and results were what mattered and that the rest was just decorative fallacy, which might be annoying when it's in the field of personal insult. it shouldn't be important, and there is a simple solution to this problem... Anonymity counters vanity. On a forum where registration is required, or even where people give themselves names, a clique of the elite users develops, and posts deal as much with who you are as with what you are posting. On an anonymous forum, if you can't tell who posts what, logic will overrule vanity. As Hiroyuki, the administrator of 2ch, writes: If there is a user ID attached to a user, a discussion tends to become a criticizing game. On the other hand, under the anonymous system, even though your opinion/information is criticized, you don't know with whom to be upset. Also with a user ID, those who participate in the site for a long time tend to have authority, and it becomes difficult for a user to disagree with them. Under a perfectly anonymous system, you can say, it's boring, if it is actually boring. All information is treated equally; only an accurate argument will work.
Re: DP performance
I wonder why it is that important 'who' Danial Thom is, or even who Matthew Dillon is, in this kind of discussion. [...] it shouldn't be important, and there is a simple solution to this problem... [...] It's important only in this particular discussion, especially when one party has thus far not provided any technical details. Tim
Re: DP performance
Guys'n'girls, Just Google for Danial Thom. All I found are messages on *BSD forums/lists with the same proofless and abusive words. Maybe he knows something about the subject, but he is definitely unable to talk about it. PS. I'm just about to mark him as a twit. 2005/12/3, [EMAIL PROTECTED] [EMAIL PROTECTED]: Hiten Pandya wrote: You obviously did not research on who Matthew Dillon is, otherwise you would know that he has plenty of real world experience. The guy wrote a packet rate controller inspired by basic laws of physics, give him credit instead of being rude. Time will tell whether he was wrong about his arguments on PCI-X or not, and whether our effort with DragonFly is just plain useless; but there is absolutely no need for animosity on the lists. Should you wish to continue debating performance issues then do so in a civil manner. Kind regards, May I second you, Hiten. The DragonFly lists are particularly interesting and you always learn things here. Personally I don't know anything on the subject of this thread and I have enjoyed observing and trying to understand the arguments. Let's hope people make an effort to be civil, for the benefit of everybody. I will add my grain of salt: on the same subject I have enjoyed reading the papers describing the program of André Oppermann for FreeBSD, notably http://people.freebsd.org/~andre/Optimizing%20the%20FreeBSD%20IP%20and%20TCP%20Stack.pdf which intersects with points that have been discussed here. Finally let me congratulate Matt for his work and wish him the best chance of success. -- Michel Talon -- Dennis Melentyev
Re: DP performance
On 12/3/05, Dennis Melentyev [EMAIL PROTECTED] wrote: Guys'n'girls, Just Google for Danial Thom. All I found are messages on *BSD forums/lists with the same proofless and abusive words. Maybe he knows something about the subject, but he is definitely unable to talk about it. PS. I'm just about to mark him as a twit. Dennis Melentyev [snip] I wonder why it is that important 'who' Danial Thom is, or even who Matthew Dillon is, in this kind of discussion. I thought that theory, reasoning and results were what mattered and that the rest was just decorative fallacy, which might be annoying when it's in the field of personal insult. A bunch of people make $$ with software for network hardware, but the business environment is very different from the open software one. That's why Windows XP Home isn't the safest/stablest operating system in the market; the target consumer is happy with it (open to discussion :) ). Now back on thread, I see only reasoning by Matt, since generic Danial statements don't seem to propose any approach, but then we are not paying him $$.
Re: DP performance
[EMAIL PROTECTED] wrote: Hiten Pandya wrote: [snip] Kind regards, May i second you Hiten. The DragonFly lists are particularly interesting and you always learn things here. Personnally i don't know anything on the subject of this thread and i have enjoyed observing and trying to understand the arguments. Let's hope people make an effort to be civil, for the benefit of everybody. I will add my grain of salt: [snip] Ditto for me. You pretty much wrote what I was thinking. I do not have the education or knowledge level but my main enjoyment in life is learning, however and when/where I can. I enjoy when people who know more than myself take the time in these discussions because I find the material interesting, in that I can learn from it. Just wanted to take a minute and express thanks and gratitude that such discussions are available for perusal, I enjoy them immensely. -Mike
Re: DP performance
Matthew Dillon wrote: Well, if you think they're so provably wrong you are welcome to put forth an actual technical argument to disprove them, rather than throw out derogatory comments which contain no data value whatsoever. I've done my best to explain the technical issues to you, but frankly you have not answered with a single technical argument or explanation of your own. If you are expecting accolades, you aren't going to get them from any of us. My friends, I can't help noticing that Danial's argumentation style (you know, the 50 column look, always starting with a little insult, like, I see you haven't done much empirical testing..., etc.) bears a striking resemblance to our old friend em1897. So before you jump into the discussion think again if you don't have more important stuff to do. Just my 2¢... Sascha
Re: DP performance
You obviously did not research on who Matthew Dillon is, otherwise you would know that he has plenty of real world experience. The guy wrote a packet rate controller inspired by basic laws of physics, give him credit instead of being rude. Time will tell whether he was wrong about his arguments on PCI-X or not, and whether our effort with DragonFly is just plain useless; but there is absolutely no need for animosity on the lists. Should you wish to continue debating performance issues then do so in a civil manner. Kind regards, -- Hiten Pandya hmp at dragonflybsd.org Danial Thom wrote: You obviously have forgotten the original premise of this (which is how do we get past the wall of UP networking performance), and you also obviously have no practical experience with heavily utilized network devices, because you seem to have no grasp on the real issues. Being smart is not about knowing everything; it's about recognizing when you don't and making an effort to learn. I seem to remember you saying that there was no performance advantage to PCI-X as well not so long ago. It's really quite amazing to me that you can continue to stick to arguments that are so easily proven wrong. Perhaps someday you'll trade in your slide rule and get yourself a good test bed. College is over. Time to enter reality, where the results are almost never whole numbers.
Re: DP performance
At 6:48 PM -0800 12/1/05, Danial Thom wrote: --- Matthew Dillon [EMAIL PROTECTED] wrote: : [various observations based on years of : real-world experience, as anyone could : find out via a competent google search] ..., and you also obviously have no practical experience with heavily utilized network devices, because you seem to have no grasp on the real issues. A-hahahahahahahahahahahaha. Thanks. It's always nice when I can start my day by reading such a funny line. ...I suspect a troll is present in these here woods. -- Garance Alistair Drosehn = [EMAIL PROTECTED] Senior Systems Programmer or [EMAIL PROTECTED] Rensselaer Polytechnic Institute or [EMAIL PROTECTED]
Re: DP performance
--- Hiten Pandya [EMAIL PROTECTED] wrote: You obviously did not research on who Matthew Dillon is, otherwise you would know that he has plenty of real world experience. The guy wrote a packet rate controller inspired by basic laws of physics, give him credit instead of being rude. Time will tell whether he was wrong about his arguments on PCI-X or not, and whether our effort with DragonFly is just plain useless; but there is absolutely no need for animosity on the lists. Should you wish to continue debating performance issues then do so in a civil manner. Kind regards, -- Hiten Pandya hmp at dragonflybsd.org Danial Thom wrote: You obviously have forgotten the original premise of this (which is how do we get past the wall of UP networking performance), and you also obviously have no practical experience with heavily utilized network devices, because you seem to have no grasp on the real issues. Being smart is not about knowing everything; it's about recognizing when you don't and making an effort to learn. I seem to remember you saying that there was no performance advantage to PCI-X as well not so long ago. It's really quite amazing to me that you can continue to stick to arguments that are so easily proven wrong. Perhaps someday you'll trade in your slide rule and get yourself a good test bed. College is over. Time to enter reality, where the results are almost never whole numbers. I know Matt very well, and he knows a lot about a lot of things. I generally have respect for his analysis. But he doesn't know everything, and he's particularly weak in networking. Judging by his comments on this thread, I'd say that he's a lot weaker than I expected. He admittedly doesn't even understand how PCI bursting works or how flow control works, and he's in complete denial about packet loss issues, so how can he lecture anyone on anything related to network processing? I, on the other hand, have made millions of $$ designing and selling network equipment based on unix-like OSes, so I'm not only qualified to lecture on the subject, but I'm also qualified to tell Matt that he's dead wrong about just about everything he's said in this thread. If you think that convincing 1000s of people to spend $1000s on systems running a Free OS is not because of results (as opposed to conjecture), then you haven't tried to do it. Results trump theory every time. Research will show that DragonFlyBSD exists because Matt's peers in the FreeBSD camp disagreed with his ideas, so it's not like he's Jesus Christ as you guys portray him. He created DFLY so he could again be the one-eyed man in the land of the blind. Lots of brilliant economists are wrong much of the time. Does DragonFly still use 10K as the default interrupt moderation for the em device? Wow, that means that just about everyone running DragonFly with em devices is getting 1 interrupt per packet, since I doubt many of you are pushing more than 10K pps. What brilliance came up with RAISING the default of 8K to 10K? Faulty analysis by Matt is the answer. Google a thread with the subject serious networking (em) performance (ggate and NFS) problem, and you'll see he's maintained the same, stupid position about packet loss a full year ago. He hasn't learned a friggin thing in a year, because he thinks he's already got the answer. That, my friend, is the definition of a fool. If you think that ANYONE is an authority on every subject you are a fool.
But here is Matt, who again and again admits that he's not sure about things like flow control and how the bus works, yet you follow his theories without question. It's absolutely mindless. All of the empirical evidence points to Matt being wrong. If you still can't accept that then DFLY is more of a religion than a project, which is a damn shame. DT
Re: DP performance
:All of the empirical evidence points to Matt :being wrong. If you still can't accept that then :DFLY is more of a religion than a project, which :is damn shame. : :DT Well, again, all I can say is that if 'all of the empirical evidence' points to me being wrong, I welcome actually *hearing* the evidence. Because so far all I have heard is... well, nothing at all really, other than childish name-calling. -Matt Matthew Dillon [EMAIL PROTECTED]
Re: DP performance
On 2005-12-02 19:16, Danial Thom wrote: All of the empirical evidence points to Matt being wrong. If you still can't accept that then DFLY is more of a religion than a project, which is a damn shame. DT Since I don't know anything about networking at GigE-speed I find this whole discussion very interesting and I hope to learn something new. However, as always when two people both believe they are right, it's hard for me to really choose whom to trust. However Matt has provided some arguments that I find very convincing (calculations and reasoning). Now, since you say that all empirical evidence suggests that he is wrong I suppose that you have some other numbers (wouldn't be very empirical otherwise would it?) that you could show me (like benchmarking or such). Then I might decide what to think when I've seen both sides' arguments. Erik Wikström -- I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone -- Bjarne Stroustrup
Re: DP performance
Dude, if you have made and are making millions of benjamins, I don't think a person in his right mind would be spending his time arguing Gig-E performance speeds on a mailing list for a beta-quality OS. :-) If I was a person that made millions, I would definitely be planning on buying an AMG convertible and thinking of buying some Cuban cigars. So please, lay off the shit of how much you make and talk some sense here. No, we don't think Matt is all singing, all dancing and all knowing, BUT, he does talk sense most of the time. Anyway, this feels a lot like troll targeting. If you really think Matt is right, or people who agree with his viewpoints, then do provide some numbers and we will take it from there. Kind Regards, -- Hiten Pandya hmp at dragonflybsd.org Danial Thom wrote: I, on the other hand, have made millions of $$ designing and selling network equipment based on unix-like OSes, so I'm not only qualified to lecture on the subject, but I'm also qualified to tell Matt that he's dead wrong about just about everything he's said in this thread. If you think
Re: DP performance
Danial Thom wrote: cut I, on the other hand, have made millions of $$ designing and selling network equipment based on unix-like OSes, so I'm not only qualified to cut What company? Your name doesn't ring a bell to me. -- mph
Re: DP performance
Hiten Pandya wrote: You obviously did not research on who Matthew Dillon is, otherwise you would know that he has plenty of real world experience. The guy wrote a packet rate controller inspired by basic laws of physics, give him credit instead of being rude. Time will tell whether he was wrong about his arguments on PCI-X or not, and whether our effort with DragonFly is just plain useless; but there is absolutely no need for animosity on the lists. Should you wish to continue debating performance issues then do so in a civil manner. Kind regards, May I second you, Hiten. The DragonFly lists are particularly interesting and you always learn things here. Personally I don't know anything on the subject of this thread and I have enjoyed observing and trying to understand the arguments. Let's hope people make an effort to be civil, for the benefit of everybody. I will add my grain of salt: on the same subject I have enjoyed reading the papers describing the program of André Oppermann for FreeBSD, notably http://people.freebsd.org/~andre/Optimizing%20the%20FreeBSD%20IP%20and%20TCP%20Stack.pdf which intersects with points that have been discussed here. Finally let me congratulate Matt for his work and wish him the best chance of success. -- Michel Talon
Re: DP performance
Marko Zec wrote: On Wednesday 30 November 2005 16:18, Danial Thom wrote: --- Hiten Pandya [EMAIL PROTECTED] wrote: Marko Zec wrote: Should we be really that pessimistic about potential MP performance, even with two NICs only? Typically packet flows are bi-directional, and if we could have one CPU/core taking care of one direction, then there should be at least some room for parallelism, especially once the parallelized routing tables see the light. Of course provided that each NIC is handled by a separate core, and that IPC doesn't become the actual bottleneck. On a similar note, it is important that we add the *hardware* support for binding a set of CPUs to particular interrupt lines. I believe that the API support for CPU-affinitized interrupt threads is already there so only the hard work is left of converting the APIC code from physical to logical access mode. I am not sure how the AMD64 platform handles CPU affinity, by that I mean if the same infrastructure put in place for i386 would work or not with a few modifications here and there. The recent untangling of the interrupt code should make it simpler for others to dig into adding interrupt affinity support. This, by itself, is not enough, albeit useful. What you need to do is separate transmit and receive (which use the same interrupts, of course). The only way to increase capacity for a single stream with MP is to separate tx and rx. Unless doing fancy outbound queuing, which typically doesn't make much sense at 1Gbit/s speeds and above, I'd bet that significantly more CPU cycles are spent in the RX part than in the TX, which basically only has to enqueue a packet into the devices' DMA ring, and recycle already transmitted mbufs. The other issue with having separate CPUs handling RX and TX parts of the same interface would be the locking mess - you would end up with the per-data-structure locking model of FreeBSD 5.0 and later, which DragonFly diverged from. And what about using the same CPU for both RX and TX? That is, bind a packet to a CPU for both RX and TX? Cheers -- Alfredo Beaumont. GPG: http://aintel.bi.ehu.es/~jtbbesaa/jtbbesaa.gpg.asc Elektronika eta Telekomunikazioak Saila (Ingeniaritza Telematikoa) Euskal Herriko Unibertsitatea, Bilbao (Basque Country). http://www.ehu.es
Re: DP performance
Alfredo Beaumont Sainz wrote: And what about using the same CPU for both RX and TX? That is, bind a packet to a CPU for both RX and TX? Cheers I am not sure about binding packets to CPUs, but binding by protocol families would be quite nice I think. Starting by binding specific IRQs to a certain set of CPUs would be a much better way to get there since it would go well with our cpu-locality concept, starting from the interrupt and right all the way up to a process. ithread_irq11 -> netisr_cpu1 -> tcpthread_cpu1 -> process on cpu1 -- Hiten Pandya hmp at dragonflybsd.org
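For illustration, here is a minimal C sketch of the flow-to-CPU dispatch idea behind that chain: hash each flow to one CPU so the interrupt thread, the protocol thread and (ideally) the consuming process all stay on the same core. The structure and function names are invented for this example; this is not DragonFly's actual API.

  /* Hypothetical sketch of the cpu-locality chain above: hash a flow to a
   * per-CPU protocol thread so interrupt, protocol processing and process
   * stay on one core.  Names are invented for illustration; this is not
   * DragonFly's actual API. */
  #include <stdio.h>
  #include <stdint.h>

  #define NCPUS 2

  struct pkt {
      uint32_t src_ip, dst_ip;
      uint16_t src_port, dst_port;
  };

  /* stand-in for sending a message to the protocol thread bound to 'cpu' */
  static void
  cpu_msgport_send(int cpu, struct pkt *p)
  {
      printf("packet %u:%u -> handled on cpu%d\n",
             (unsigned)p->src_ip, (unsigned)p->src_port, cpu);
  }

  static int
  flow_to_cpu(const struct pkt *p)
  {
      /* simple 4-tuple hash; a real stack would use a better mix */
      uint32_t h = p->src_ip ^ p->dst_ip ^
                   ((uint32_t)p->src_port << 16 | p->dst_port);
      h ^= h >> 16;
      return (int)(h % NCPUS);
  }

  /* called from the NIC's interrupt/poll context: no locks needed because
   * a given flow is always queued to the same CPU's message port */
  static void
  netisr_dispatch(struct pkt *p)
  {
      cpu_msgport_send(flow_to_cpu(p), p);
  }

  int
  main(void)
  {
      struct pkt a = { 0x0a000001, 0x0a000002, 12345, 80 };
      struct pkt b = { 0x0a000003, 0x0a000002, 23456, 80 };
      netisr_dispatch(&a);
      netisr_dispatch(&b);
      return 0;
  }

Because a flow never migrates between CPUs, per-data-structure locking can be avoided, which is the point of the locality argument made in this sub-thread.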
Re: DP performance
--- Marko Zec [EMAIL PROTECTED] wrote: On Wednesday 30 November 2005 16:18, Danial Thom wrote: --- Hiten Pandya [EMAIL PROTECTED] wrote: Marko Zec wrote: Should we be really that pessimistic about potential MP performance, even with two NICs only? Typically packet flows are bi-directional, and if we could have one CPU/core taking care of one direction, then there should be at least some room for parallelism, especially once the parallelized routing tables see the light. Of course provided that each NIC is handled by a separate core, and that IPC doesn't become the actual bottleneck. On a similar note, it is important that we add the *hardware* support for binding a set of CPUs to particular interrupt lines. I believe that the API support for CPU-affinitized interrupt threads is already there so only the hard work is left of converting the APIC code from physical to logical access mode. I am not sure how the AMD64 platform handles CPU affinity, by that I mean if the same infrastructure put in place for i386 would work or not with a few modifications here and there. The recent untangling of the interrupt code should make it simpler for others to dig into adding interrupt affinity support. This, by itself, is not enough, albeit useful. What you need to do is separate transmit and receive (which use the same interrupts, of course). The only way to increase capacity for a single stream with MP is to separate tx and rx. Unless doing fancy outbound queuing, which typically doesn't make much sense at 1Gbit/s speeds and above, I'd bet that significantly more CPU cycles are spent in the RX part than in the TX, which basically only has to enqueue a packet into the devices' DMA ring, and recycle already transmitted mbufs. The other issue with having separate CPUs handling RX and TX parts of the same interface would be the locking mess - you would end up with the per-data-structure locking model of FreeBSD 5.0 and later, which DragonFly diverged from. Cheers, Marko The issue is that RX is absolute, as you cannot decide to delay or selectively drop since you don't know what's coming. Better to have some latency than dropped packets. But if you don't dedicate to RX, then you have an unknown amount of cpu resources doing other stuff. The capacity issues always first manifest themselves as rx overruns, and they always happen a lot sooner on MP machines than UP machines. The LINUX camp made the mistake of not making RX important enough, and now their 2.6 kernels drop packets all over the ranch. But audio is nice and smooth... How to do it or why it's difficult is a designer's issue. I've not yet been convinced that MP is something that's suitable for a network intensive environment as I've never seen an MP OS that has come close to FreeBSD 4.x UP performance. DT
Re: DP performance
:The issue is that RX is absolute, as you cannot :decide to delay or selectively drop since you :don't know what's coming. Better to have some :latency than dropped packets. But if you don't :dedicate to RX, then you have an unknown amount :of cpu resources doing other stuff. The :capacity issues always first manifest themselves :as rx overruns, and they always happen a lot :sooner on MP machines than UP machines. The LINUX :camp made the mistake of not making RX important :enough, and now their 2.6 kernels drop packets :all over the ranch. But audio is nice and :smooth... : :How to do it or why it's difficult is a designer's :issue. I've not yet been convinced that MP is :something that's suitable for a network intensive :environment as I've never seen an MP OS that has :come close to FreeBSD 4.x UP performance. : :DT RX interrupts can be hardware moderated just like TX interrupts. The EM device, for example, allows you to set the minimum delay between RX interrupts. For example, let's say a packet comes in and EM interrupts immediately, resulting in a single packet processed on that interrupt. Once the interrupt has occurred EM will not generate another interrupt for N microseconds, no matter how many packets come in, where N is programmable. Of course N is programmed to a value that will not result in the RX ring overflowing. The result is that further RX interrupts may bundle 10-50 receive packets on each interrupt depending on the packet size. This aggregation feature of (nearly all) GiGE ethernet devices reduces the effective overhead of interrupt entry and exit to near zero, which means that the device doesn't need to be polled even under the most adverse circumstances. I don't know what the issue you bring up with the Linux kernels is, but at GigE speeds the ethernet hardware is actually flow-controlled. There should not be any packet loss even if the cpu cannot keep up with a full-bandwidth packet stream. There is certainly no need to fast-path the network interrupt, it simply needs to be processed in a reasonable period of time. A few milliseconds of latency occurring every once in a while would not have any adverse effect. -Matt Matthew Dillon [EMAIL PROTECTED]
Re: DP performance
On Thursday 01 December 2005 15:27, Danial Thom wrote: The issue is that RX is absolute, as you cannot decide to delay or selectively drop since you don't know what's coming. Better to have some latency than dropped packets. No, if the system can't cope with the inbound traffic, it's much better to drop or flow-control the inbound packets early (in the hardware) than to waste other system resources (bus time, CPU cycles) on useless processing if the same packets will never be forwarded. Giving absolute priority to RX processing leads to livelock at high traffic loads, a phenomenon which was well known and studied for at least a decade. Cheers, Marko But if you don't dedicate to RX, then you have an unknown amount of cpu resources doing other stuff. The capacity issues always first manifest themselves as rx overruns, and they always happen a lot sooner on MP machines than UP machines. The LINUX camp made the mistake of not making RX important enough, and now their 2.6 kernels drop packets all over the ranch. But audio is nice and smooth... How to do it or why it's difficult is a designer's issue. I've not yet been convinced that MP is something that's suitable for a network intensive environment as I've never seen an MP OS that has come close to FreeBSD 4.x UP performance. DT
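As an aside, the usual countermeasure to the receive livelock Marko mentions is to bound the amount of RX work done per pass so that input processing cannot starve everything else. The sketch below is only a hypothetical illustration (the ring is simulated, the names are invented), not code from any of the systems discussed here.

  /* Minimal, hypothetical sketch of bounding RX work: drain at most
   * RX_BUDGET descriptors per pass, then yield; under overload the
   * excess stays in (and eventually overflows) the NIC ring instead of
   * monopolizing the CPU. */
  #include <stdio.h>

  #define RX_BUDGET 64

  static int rx_ring_pending = 200;           /* simulated backlog */

  static int  nic_rx_ring_nonempty(void) { return rx_ring_pending > 0; }
  static void nic_rx_pull_and_process(void) { rx_ring_pending--; }
  static void schedule_rx_softirq(void)
  {
      printf("reschedule: %d packets still pending\n", rx_ring_pending);
  }

  static void
  rx_poll(void)
  {
      int work = 0;

      /* process at most RX_BUDGET packets, then give the CPU back */
      while (nic_rx_ring_nonempty() && work < RX_BUDGET) {
          nic_rx_pull_and_process();
          work++;
      }
      if (work == RX_BUDGET && nic_rx_ring_nonempty())
          schedule_rx_softirq();              /* come back later, don't spin */
  }

  int
  main(void)
  {
      while (rx_ring_pending > 0)
          rx_poll();
      return 0;
  }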
Re: DP performance
:But, and this is where you are misinterpreting the features, the problem :is that there is ANY latency, it is simply that there is too MUCH latency Er, I meant 'is NOT that there is ANY latency, ...'. -Matt
Re: DP performance
On Thursday 01 December 2005 22:19, Danial Thom wrote: I see you haven't done much empirical testing; the assumption that all is well because Intel has it all figured out is not a sound one. Interrupt moderation is given but at some point you hit a wall, and my point is that you hit a wall a lot sooner with MP than with UP, because you have to get back to the ring, no matter what the intervals are, before they wrap. As you increase the intervals (and thus decrease the ints/second) you'll lose even more packets, because there is less space in the ring when the interrupt is generated and less time for the cpu to get to it. Gig-E line rate is around 1.44 Mpps max. If you set up the interrupt coalescing timers to trigger interrupts with a max frequency of 15000 int/s (a pretty standard setting), then the maximum number of packets buffered due to interrupt delaying will never exceed 100. Given that even the cheapest NICs have 256 RX slots or more, it is evident that the issue you are raising is non-existent. Flow control isn't like XON/OFF where we say hey our buffer is almost full so let's send some flow control at 9600 baud. By the time you're flow controlling you've already lost enough packets to piss off your customer base. No, flow control really works, if thresholds are set up properly then there's no reason for it to fail, i.e., lose packets at all. Plus flow controlling a big switch will just result in the switch dropping the packets instead of you, so what have you really gained? So what's your point? If a system cannot cope with the incoming traffic, _somewhere_ a certain amount of packets will have to be dropped. The real issue is how to make a system more efficiently handle the offered traffic, not to cry about lost packets. Cheers, Marko Packet loss is real, no matter how much you deny it. If you don't believe it, then you need a better traffic generator. DT
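To check the arithmetic above, here is a small, purely illustrative C program using the same figures Marko quotes (roughly 1.44 Mpps worst case on GigE, a 15000 int/s coalescing ceiling, a 256-slot RX ring):

  /* Back-of-the-envelope check of the interrupt-moderation figures above. */
  #include <stdio.h>

  int
  main(void)
  {
      double line_rate_pps = 1.44e6;   /* ~minimum-size frames on GigE     */
      double int_per_sec   = 15000.0;  /* typical coalescing ceiling       */
      int    rx_ring_slots = 256;      /* a cheap NIC's RX descriptor ring */

      double pkts_per_interval = line_rate_pps / int_per_sec;

      printf("packets arriving per interrupt interval: %.0f\n",
             pkts_per_interval);                 /* ~96, well under 256    */
      printf("ring headroom left: %.0f slots\n",
             rx_ring_slots - pkts_per_interval);
      return 0;
  }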
Re: DP performance
:... : of latency occurring every once in a while would not have any adverse : effect. : :A few milliseconds of latency / jitter can sometimes completely kill TCP :throughput at gigabit speeds. A few microseconds won't matter, though. : :Cheers, : :Marko Not any more, not with scaled TCP windows and SACK. A few milliseconds doesn't matter. The only effect is that you need a larger transmit buffer to hold the data until the round-trip ack arrives. So, e.g., a 1 Megabyte buffer would allow you to have 10mS of round-trip latency. That's an edge case, of course, so to be safe one would want to cut it in half and say 5 mS with a 1 megabyte buffer. TCP isn't really the problem, anyway, because it can tolerate any amount of latency without 'losing' packets. So if you have a TCP link and you suffer, say, 15 ms of delay once every few seconds, the aggregate bandwidth is still pretty much maintained. The real problem with TCP is packet backlogs appearing at choke points. For example, if you have a GigE LAN and a 45 MBit WAN, an incoming TCP stream from a host with an awful TCP stack (such as a Windows server) might build up a megabyte worth of packets on your network provider's border router all trying to squeeze down into 45 MBits. NewReno, RED, and other algorithms try to deal with it but the best solution is for the server to not try to push out so much data in the first place if the target's *PHYSICAL* infrastructure doesn't have the bandwidth. But that's another issue. -Matt Matthew Dillon [EMAIL PROTECTED]
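The buffer-versus-latency figure above is just the bandwidth-delay product; a small illustrative calculation, assuming a 1 Gbit/s link and a 1 MiB send buffer:

  /* Bandwidth-delay arithmetic behind the 1 MB / ~10 ms figure above. */
  #include <stdio.h>

  int
  main(void)
  {
      double gige_bytes_per_sec = 1e9 / 8.0;          /* ~125 MB/s         */
      double sndbuf_bytes       = 1024.0 * 1024.0;    /* 1 MiB send buffer */

      /* RTT that a 1 MiB window can cover at full GigE rate */
      double rtt_sec = sndbuf_bytes / gige_bytes_per_sec;
      printf("1 MiB window keeps GigE busy up to ~%.1f ms RTT\n",
             rtt_sec * 1e3);                          /* ~8 ms, in line with
                                                         the rounded figure */
      return 0;
  }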
Re: DP performance
On Thursday 01 December 2005 23:13, Matthew Dillon wrote: :... : : of latency occurring every once in a while would not have any : adverse effect. : :A few milliseconds of latency / jitter can sometimes completely kill : TCP throughput at gigabit speeds. A few microseconds won't matter, : though. : :Cheers, : :Marko Not any more, not with scaled TCP windows and SACK. A few milliseconds doesn't matter. The only effect is that you need a larger transmit buffer to hold the data until the round-trip ack arrives. So, e.g., a 1 Megabyte buffer would allow you to have 10mS of round-trip latency. That's an edge case, of course, so to be safe one would want to cut it in half and say 5 mS with a 1 megabyte buffer. Mostly true, but having TCP window sizes as large as a megabyte doesn't come at zero cost, just as you described later in your note (there may be other problems as well). But I don't think that today's gigabit cards ever delay interrupts for more than a few dozen microseconds (unless explicitly misconfigured ;), so probably we have a non-issue here. Cheers, Marko TCP isn't really the problem, anyway, because it can tolerate any amount of latency without 'losing' packets. So if you have a TCP link and you suffer, say, 15 ms of delay once every few seconds, the aggregate bandwidth is still pretty much maintained. The real problem with TCP is packet backlogs appearing at choke points. For example, if you have a GigE LAN and a 45 MBit WAN, an incoming TCP stream from a host with an awful TCP stack (such as a Windows server) might build up a megabyte worth of packets on your network provider's border router all trying to squeeze down into 45 MBits. NewReno, RED, and other algorithms try to deal with it but the best solution is for the server to not try to push out so much data in the first place if the target's *PHYSICAL* infrastructure doesn't have the bandwidth. But that's another issue. -Matt Matthew Dillon [EMAIL PROTECTED]
Re: DP performance
--- Marko Zec [EMAIL PROTECTED] wrote: On Thursday 01 December 2005 22:19, Danial Thom wrote: I see you haven't done much empirical testing; the assumption that all is well because Intel has it all figured out is not a sound one. Interrupt moderation is given but at some point you hit a wall, and my point is that you hit a wall a lot sooner with MP than with UP, because you have to get back to the ring, no matter what the intervals are, before they wrap. As you increase the intervals (and thus decrease the ints/second) you'll lose even more packets, because there is less space in the ring when the interrupt is generated and less time for the cpu to get to it. Gig-E line rate is around 1.44 Mpps max. If you set up the interrupt coalescing timers to trigger interrupts with a max frequency of 15000 int/s (a pretty standard setting), then the maximum number of packets buffered due to interrupt delaying will never exceed 100. Given that even the cheapest NICs have 256 RX slots or more, it is evident that the issue you are raising is non-existent. no box in existence can handle 1.44 Mpps, so why are you arguing a case that can't possibly be made? Flow control isn't like XON/OFF where we say hey our buffer is almost full so let's send some flow control at 9600 baud. By the time you're flow controlling you've already lost enough packets to piss off your customer base. No, flow control really works, if thresholds are set up properly then there's no reason for it to fail, i.e., lose packets at all. Intel boxes don't send out flow control unless the bus has been saturated (i.e., doing GigE on a 32-bit bus) or until the ring has been breached. In both cases it's too late as you've lost 100s of packets. Flow control is not an issue here anyway; it's a stupid point. The goal is to be able to handle the traffic without flow control. It's like saying that it doesn't matter how fast memory is because programs will wait. It's just plain stupid. Plus flow controlling a big switch will just result in the switch dropping the packets instead of you, so what have you really gained? So what's your point? If a system cannot cope with the incoming traffic, _somewhere_ a certain amount of packets will have to be dropped. The real issue is how to make a system more efficiently handle the offered traffic, not to cry about lost packets. The point is that if you are receiving some # of packets, it means that the device before you can handle it and you can't. It means that you are the bottleneck, and you don't belong in the path. It means that your device is unsuitable to be used as a networking device on such a network. That's all it means. DT
Re: DP performance
--- Matthew Dillon [EMAIL PROTECTED] wrote: :... :wall a lot sooner with MP than with UP, because :you have to get back to the ring, no matter what :the intervals are, before they wrap. As you :increase the intervals (and thus decrease the :ints/second) you'll lose even more packets, :because there is less space in the ring when the :interrupt is generated and less time for the cpu :to get to it. : :Flow control isn't like XON/OFF where we say hey :our buffer is almost full so let's send some flow :control at 9600 baud. By the time you're flow :controlling you've already lost enough packets to :piss off your customer base. Plus flow :controlling a big switch will just result in the :switch dropping the packets instead of you, so :what have you really gained? : :Packet loss is real, no matter how much you deny :it. If you don't believe it, then you need a :better traffic generator. : :DT Now you need to think carefully about what you are actually arguing about over here. You are making generalizations that are an incorrect interpretation of what features such as interrupt moderation and flow control are intended to provide. What I have said, several times now, is that a reasonably modern cpu is NO LONGER the bottleneck for routing packets. In other words, there is going to be *plenty* of cpu suds available. The problem isn't cpu suds, it's the latency after new data becomes available before the cpu is able to clear the RX ring. But, and this is where you are misinterpreting the features, the problem is that there is ANY latency, it is simply that there is too MUCH latency *SOMETIMES*. The whole point of having a receive ring and flow control is to INCREASE the amount of latency that can be tolerated before packets are lost. This in turn gives the operating system a far better ability to manage its processing latencies. All the operating system has to guarantee to avoid losing packets is that latency does not exceed a certain calculated value, NOT that the latency has to be minimized. There is a big difference between those two concepts. Let's take an example. Including required on-the-wire PAD I think the minimum packet size is around 64-128 bytes. I'd have to look up the standard to know for sure (but it's 64 bytes worth on 100BaseT, and probably something similar for GigE). So let's just say it's 64 bytes. That is approximately 576 to 650 bits on the wire. Let's say 576 bits. Now let's say you have a 256 entry receive ring and your interrupt moderation is set so you get around 12 packets per interrupt. So your effective receive ring is 256 - 12 or 244 entries for 576 bits per entry if minimally sized packets are being routed. That's around 140,000 bits, or 140 uS. So in such a configuration routing minimally sized packets the operating system must respond to an interrupt within 140 uS to avoid flow control being activated. If we take a more likely scenario... packets with an average size of, say, 256 bytes (remember that a TCP/IP header is 40 bytes just in itself), you wind up with around 550,000 bits to fill the receive ring or a required interrupt latency of no more than 550 uS. 550 uS is a very long time. Even 140 uS is quite a long time (keep in mind that most system calls take less than 5 uS to execute, and many take less than 1). Most interrupt service routines take only a few microseconds to execute. Even clearing 200 entries out of a receive ring would not take more than 10-15 uS. So 140 uS is likely to be achievable WITHOUT having to resort to real time scheduling or other methods.
If you have a bunch of interfaces all having to clear nearly whole rings (200+ entries) then it can start to get a little iffy, but even there it would take quite a few interfaces to saturate even a single cpu, let alone multiple cpus in an MP setup. The key thing to remember here is that the goal here is NOT to minimize interrupt latency, but instead to simply guarantee that interrupt processing latency does not exceed the ring calculation. That's the ONLY thing we care about. While it is true in one sense that minimizing interrupt latency gives you a bit more margin, the problem with that sort of reasoning is that minimizing interrupt latency often makes you far less cpu-efficient, which means you actually wind up being able to handle FEWER network interfaces instead of the greater number of network interfaces you thought you'd be able to handle. I can think of a number of polling schemes that would be able to improve overall throughput in a dedicated routing environment, but they
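The ring-latency budget Matt works through above can be expressed as a small helper. This is only an illustration using his own figures (a 1 Gbit/s link, a 256-entry ring, roughly 12 packets already drained per moderated interrupt):

  /* The latency-budget calculation above: how many microseconds of
   * interrupt latency the RX ring can absorb before it wraps. */
  #include <stdio.h>

  static double
  max_latency_us(int ring_entries, int pkts_per_intr, double wire_bits_per_pkt)
  {
      double free_slots = ring_entries - pkts_per_intr;
      /* time to fill the remaining slots at 1 Gbit/s, in microseconds */
      return free_slots * wire_bits_per_pkt / 1e9 * 1e6;
  }

  int
  main(void)
  {
      printf("64-byte packets:  %.0f us\n", max_latency_us(256, 12, 576));
      printf("256-byte packets: %.0f us\n", max_latency_us(256, 12, 2048));
      /* ~140 us and ~500-550 us, matching the figures in the post */
      return 0;
  }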
Re: DP performance
Well, if you think they're so provably wrong you are welcome to put forth an actual technical argument to disprove them, rather than throw out derogatory comments which contain no data value whatsoever. I've done my best to explain the technical issues to you, but frankly you have not answered with a single technical argument or explanation of your own. If you are expecting accolades, you aren't going to get them from any of us. -Matt Matthew Dillon [EMAIL PROTECTED]
Re: DP performance
You obviously have forgotten the original premise of this (which is how do we get past the wall of UP networking performance), and you also obviously have no practical experience with heavily utilized network devices, because you seem to have no grasp on the real issues. Why don't you Google Matt Dillon (not the actor) and do some research before shooting your mouth off on HIS mailing list. Try adding the keywords Best Internet if you are search engine challenged. Tim
Re: DP performance
On Wednesday 30 November 2005 16:18, Danial Thom wrote: --- Hiten Pandya [EMAIL PROTECTED] wrote: Marko Zec wrote: Should we be really that pessimistic about potential MP performance, even with two NICs only? Typically packet flows are bi-directional, and if we could have one CPU/core taking care of one direction, then there should be at least some room for parallelism, especially once the parallelized routing tables see the light. Of course provided that each NIC is handled by a separate core, and that IPC doesn't become the actual bottleneck. On a similar note, it is important that we add the *hardware* support for binding a set of CPUs to particular interrupt lines. I believe that the API support for CPU-affinitized interrupt threads is already there so only the hard work is left of converting the APIC code from physical to logical access mode. I am not sure how the AMD64 platform handles CPU affinity, by that I mean if the same infrastructure put in place for i386 would work or not with a few modifications here and there. The recent untangling of the interrupt code should make it simpler for others to dig into adding interrupt affinity support. This, by itself, is not enough, albeit useful. What you need to do is separate transmit and receive (which use the same interrupts, of course). The only way to increase capacity for a single stream with MP is to separate tx and rx. Unless doing fancy outbound queuing, which typically doesn't make much sense at 1Gbit/s speeds and above, I'd bet that significantly more CPU cycles are spent in the RX part than in the TX, which basically only has to enqueue a packet into the devices' DMA ring, and recycle already transmitted mbufs. The other issue with having separate CPUs handling RX and TX parts of the same interface would be the locking mess - you would end up with the per-data-structure locking model of FreeBSD 5.0 and later, which DragonFly diverged from. Cheers, Marko You'll still have higher latency than UP, but you may be able to increase capacity by dedicating cycles to processing the receive ring. If you can eliminate overruns then you can selectively manage transmit.
Re: DP performance
On Wednesday 30 November 2005 03:08, Hiten Pandya wrote: Marko Zec wrote: Should we be really that pessimistic about potential MP performance, even with two NICs only? Typically packet flows are bi-directional, and if we could have one CPU/core taking care of one direction, then there should be at least some room for parallelism, especially once the parallelized routing tables see the light. Of course provided that each NIC is handled by a separate core, and that IPC doesn't become the actual bottleneck. On a similar note, it is important that we add the *hardware* support for binding a set of CPUs to particular interrupt lines. Yes, that would be nice. Alternatively one could have separate polling threads on a per-CPU/core basis for handling different interfaces. Maybe such a framework could even allow for migration of interfaces between polling threads, in order to dynamically adapt to different workloads / traffic patterns. With hardware interrupts such an idea would be very difficult if not completely impossible to implement. Marko I believe that the API support for CPU-affinitized interrupt threads is already there so only the hard work is left of converting the APIC code from physical to logical access mode. I am not sure how the AMD64 platform handles CPU affinity, by that I mean if the same infrastructure put in place for i386 would work or not with a few modifications here and there. The recent untangling of the interrupt code should make it simpler for others to dig into adding interrupt affinity support.
Re: DP performance
On Monday 28 November 2005 22:13, Matthew Dillon wrote: If we are talking about maxing out a machine in the packet routing role, then there are two major issues that have to be considered: * Bus bandwidth. e.g. PCI, PCIX, PCIE, etc etc etc. A standard PCI bus is limited to ~120 MBytes/sec, not enough for even a single GiGE link going full duplex at full speed. More recent busses can do better. * Workload separation. So e.g. if one has four interfaces and two cpus, each cpu could handle two interfaces. An MP system would not reap any real gains over UP until one had three or more network interfaces, since two interfaces is no different from one interface from the point of view of trying to route packets. Should we be really that pessimistic about potential MP performance, even with two NICs only? Typically packet flows are bi-directional, and if we could have one CPU/core taking care of one direction, then there should be at least some room for parallelism, especially once the parallelized routing tables see the light. Of course provided that each NIC is handled by a separate core, and that IPC doesn't become the actual bottleneck. Main memory bandwidth used to be an issue but isn't so much any more. The memory bandwidth isn't, but latency _is_ now the major performance bottleneck, IMO. DRAM access latencies are now in the 50 ns range and will not noticeably decrease in the foreseeable future. Consider the amount of independent memory accesses that need to be performed on a per-packet basis: DMA RX descriptor read, DMA RX buffer write, DMA RX descriptor update, RX descriptor update/refill, TX descriptor update, DMA TX descriptor read, DMA TX buffer read, DMA TX descriptor update... Without doing any smart work at all we have to waste a few hundreds of ns of DRAM bus time per packet, provided we are lucky and the memory bus is not congested. So to improve the forwarding performance anywhere above 1Mpps, UP or MP, having the CPU touch the DRAM in the forwarding path has to be avoided like the plague. The stack parallelization seems to be the right step in this direction. Cheers Marko Insofar as DragonFly goes, we can almost handle the workload separation case now, but not quite. We will be able to handle it with the work going in after the release. Even so, it will probably only matter if the majority of packets being routed are tiny. Bigger packets eat far less cpu for the amount of data transferred. -Matt
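To put rough numbers on the per-packet DRAM cost Marko lists (eight independent accesses at ~50 ns each), here is a small, purely illustrative calculation; it assumes the accesses are fully serialized, which is the pessimistic case:

  /* Rough arithmetic behind the "a few hundreds of ns per packet" claim. */
  #include <stdio.h>

  int
  main(void)
  {
      int    accesses_per_pkt = 8;      /* descriptor reads/writes + buffers */
      double dram_latency_ns  = 50.0;   /* per independent access            */

      double ns_per_pkt = accesses_per_pkt * dram_latency_ns;   /* ~400 ns   */
      printf("DRAM time per packet: ~%.0f ns\n", ns_per_pkt);
      printf("serialized ceiling:   ~%.1f Mpps\n", 1e3 / ns_per_pkt);
      printf("at 1 Mpps, share of memory time consumed: ~%.0f%%\n",
             1e6 * ns_per_pkt / 1e9 * 100.0);
      return 0;
  }

At roughly 400 ns per packet, 1 Mpps already eats a large fraction of the available DRAM time, which is why Marko argues that the CPU should avoid touching DRAM in the forwarding path.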
Re: DP performance
:Should we really be that pessimistic about potential MP performance, :even with two NICs only? Typically packet flows are bi-directional, :and if we could have one CPU/core taking care of one direction, then :there should be at least some room for parallelism, especially once the :parallelized routing tables see the light. Of course, provided that :each NIC is handled by a separate core, and that IPC doesn't become the :actual bottleneck.

The problem is that if you only have two interfaces, every incoming packet being routed has to go through both interfaces, which means that there will be significant memory contention between the two cpus no matter what you do. This won't degrade the 2xCPUs by 50%... it's probably more like 20%, but if you only have two ethernet interfaces and the purpose of the box is to route packets, there isn't much of a reason to make it an SMP box. cpus these days are far, far faster than two measly GigE ethernet interfaces that can only do 200 MBytes/sec each. Even more to the point, if you have two interfaces you still only have 200 MBytes/sec worth of packets to contend with, even though each incoming packet is being shoved out the other interface (for 400 MBytes/sec of total network traffic). It is still only *one* packet that the cpu is routing. Even cheap modern cpus can shove around several GBytes/sec without DMA so 200 MBytes/sec is really nothing to them.

: Main memory bandwidth used to be an issue but isn't so much any : more. : :Memory bandwidth isn't, but latency _is_ now the major performance :bottleneck, IMO. DRAM access latencies are now in the 50 ns range and will :not noticeably decrease in the foreseeable future. Consider the number :of independent memory accesses that need to be performed on a per-packet :... :Cheers : :Marko

No, this is irrelevant. All modern ethernet devices (for the last decade or more) have DMA engines and fairly significant FIFOs, which means that nearly all memory accesses are going to be burst accesses capable of getting fairly close to the maximum burst bandwidth of the memory. I can't say for sure that this is actually happening without putting a logic analyzer on the memory bus, but I'm fairly sure it is. I seem to recall that the PCI (PCIx, PCIe, etc) bus DMA protocols are all burst capable protocols.

-Matt Matthew Dillon [EMAIL PROTECTED]
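A rough illustration of the difference burst DMA makes, under assumed figures (50 ns first-word latency, ~3.2 GB/s sustained burst bandwidth; both are assumptions chosen for illustration, not measurements of any particular chipset):

    /*
     * Contrast eight scattered DRAM accesses per packet with one burst
     * transfer of the same data.  Latency and bandwidth figures are
     * assumptions for illustration only.
     */
    #include <stdio.h>

    int main(void)
    {
        double first_word_ns = 50.0;    /* assumed access latency       */
        double burst_gbs     = 3.2;     /* assumed sustained burst rate */
        int    bytes_moved   = 64 + 16; /* small packet + descriptor    */

        double scattered_ns = 8 * first_word_ns;                       /* ~400 ns */
        double burst_ns     = first_word_ns + bytes_moved / burst_gbs; /* ~75 ns  */

        printf("scattered accesses: ~%.0f ns/packet\n", scattered_ns);
        printf("single burst:       ~%.0f ns/packet\n", burst_ns);
        return 0;
    }

The point is only that a burst amortizes the latency over many bytes, whereas scattered descriptor touches pay it every time.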
Re: DP performance
Marko Zec wrote: Should we really be that pessimistic about potential MP performance, even with two NICs only? Typically packet flows are bi-directional, and if we could have one CPU/core taking care of one direction, then there should be at least some room for parallelism, especially once the parallelized routing tables see the light. Of course, provided that each NIC is handled by a separate core, and that IPC doesn't become the actual bottleneck.

On a similar note, it is important that we add the *hardware* support for binding a set of CPUs to particular interrupt lines. I believe that the API support for CPU-affinitized interrupt threads is already there, so all that is left is the hard work of converting the APIC code from physical to logical access mode. I am not sure how the AMD64 platform handles CPU affinity, by which I mean whether the same infrastructure put in place for i386 would work, perhaps with a few modifications here and there. The recent untangling of the interrupt code should make it simpler for others to dig into adding interrupt affinity support.

-- Hiten Pandya hmp at dragonflybsd.org
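A hypothetical sketch of the kind of IRQ-to-CPU binding such affinity support implies. None of these names (irq_bind_cpu, apic_set_dest) exist in DragonFly or FreeBSD; they only illustrate the idea of steering an interrupt line to a chosen CPU once the APIC code uses logical destination mode.

    /*
     * Hypothetical IRQ -> CPU binding table; names are illustrative only,
     * not an existing kernel API.
     */
    #define MAX_IRQS 24

    static int irq_to_cpu[MAX_IRQS];   /* which CPU services each IRQ */

    static void
    apic_set_dest(int irq, int cpu)
    {
        /*
         * In a real kernel this would rewrite the I/O APIC redirection
         * entry for 'irq' with the logical APIC ID of 'cpu'.  Stub here.
         */
        (void)irq;
        (void)cpu;
    }

    static int
    irq_bind_cpu(int irq, int cpu)
    {
        if (irq < 0 || irq >= MAX_IRQS)
            return (-1);
        irq_to_cpu[irq] = cpu;
        apic_set_dest(irq, cpu);  /* hardware does the steering from here on */
        return (0);
    }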
Re: DP performance
:It seems most of the banter for the past few :months is userland related. What is the state of :the kernel in terms of DP/MP kernel performance? :Has any work been done or is DFLY still in the :cleaning up stages? I'm still desperately seeking :a good reason to move to Dual-core processors : :DT

It's getting better but there is still a lot of work to do. After this upcoming release (middle of December) I intend to bring in Jeff's parallel routing code and make the TCP and UDP protocol threads MP safe. Even in its current state (and, in fact, even using the old FreeBSD 4.x kernels), you will reap major benefits on a dual-core cpu. I have been very impressed with AMD's Athlon X2 in these Shuttle XPC boxes, despite having to buy a PCI GiGE ethernet card due to motherboard issues.

-Matt Matthew Dillon [EMAIL PROTECTED]
Re: DP performance
--- Matthew Dillon [EMAIL PROTECTED] wrote: :It seems most of the banter for the past few :months is userland related. What is the state of :the kernel in terms of DP/MP kernel performance? :Has any work been done or is DFLY still in the :cleaning up stages? I'm still desperately seeking :a good reason to move to Dual-core processors : :DT It's getting better but there is still a lot of work to do. After this upcoming release (middle of December) I intend to bring in Jeff's parallel routing code and make the TCP and UDP protocol threads MP safe. Even in its current state (and, in fact, even using the old FreeBSD 4.x kernels), you will reap major benefits on a dual-core cpu. I have been very impressed with AMD's Athlon X2 in these Shuttle XPC boxes, despite having to buy a PCI GiGE ethernet card due to motherboard issues.

What kind of benefits would be realized for systems being used primarily as a router/bridge, given that it's almost 100% kernel usage?

DT
Re: DP performance
:What kind of benefits would be realized for :systems being used primarily as a router/bridge, :given that it's almost 100% kernel usage? : :DT

Routing packets doesn't take much cpu unless you are running a gigabit of actual bandwidth (or more). If you aren't doing anything else with the machine then the cheapest AMD XP will do the job.

-Matt Matthew Dillon [EMAIL PROTECTED]
Re: DP performance
On Mon, Nov 28, 2005 at 10:15:55AM -0800, Matthew Dillon wrote: :What kind of benefits would be realized for :systems being used primarily as a router/bridge, :given that it's almost 100% kernel usage? : :DT Routing packets doesn't take much cpu unless you are running a gigabit of actual bandwidth (or more). If you aren't doing anything else with the machine then the cheapest AMD XP will do the job.

We've found the bottleneck for routers is the CPU cycles necessary to process NIC hardware interrupts, at least for OBSD. Interrupt mitigation, and I suppose POLLING on DragonFly, may help, but it isn't supported on all hardware AFAIK. What kind of parallelism, as far as processing separate NIC hardware interrupts on separate CPUs, can DragonFly currently support?

-steve
Re: DP performance
--- Steve Shorter [EMAIL PROTECTED] wrote: On Mon, Nov 28, 2005 at 10:15:55AM -0800, Matthew Dillon wrote: :What kind of benefits would be realized for :systems being used primarily as a router/bridge, :given that it's almost 100% kernel usage? : :DT Routing packets doesn't take much cpu unless you are running a gigabit of actual bandwidth (or more). If you aren't doing anything else with the machine then the cheapest AMD XP will do the job. We've found the bottleneck for routers is the CPU cycles necessary to process NIC hardware interrupts, at least for OBSD. Interrupt mitigation, and I suppose POLLING on DragonFly, may help, but it isn't supported on all hardware AFAIK.

Polling is pretty dumb with modern NICs, as most have built-in interrupt moderation that does the work of polling without all of the overhead (by generating interrupts with user-definable forced separation). At least Intels do; if others don't then that's enough reason not to use them. Doing 500K pps with a 10K interrupts/second setting is better than you could ever do with polling, and the results are quite good. Dealing with the NICs (processing packets, I/Os, etc.) and the stack is what uses the cycles (a bridge machine can do twice as many packets as a router, for example). For network processing, true separation of transmit and receive is probably the only way to realize networking gains for non-TCP/UDP operations. Slicing up the stack will only slow things down compared to UP (think of a full-speed relay race against a guy who doesn't get tired... the guy without the hand-offs will always win). The best you can probably do is match the UP performance, but ideally have a bunch of cpu power left over. So maybe you'd have a UP machine that can do 800K pps and be on the edge of livelock, and a DP machine that can do 750K pps but is still usable at the user level.

Danial
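Some rough arithmetic behind the 10K interrupts/second figure above. The 256 ns granularity used for the interrupt-throttle (ITR) value is an assumption based on common Intel gigabit parts; check the datasheet for a specific NIC:

    /*
     * Back-of-the-envelope numbers for NIC interrupt moderation as
     * described above.  The 256 ns ITR granularity is an assumption.
     */
    #include <stdio.h>

    int main(void)
    {
        double ints_per_sec = 10000.0;   /* moderation target from the post */
        double pps          = 500000.0;  /* routed packet rate from the post */

        double pkts_per_int   = pps / ints_per_sec;           /* ~50 packets   */
        double max_latency_us = 1e6 / ints_per_sec;           /* ~100 us added */
        double itr_value      = (1e9 / ints_per_sec) / 256.0; /* ~390 (assumed
                                                                  256 ns units) */

        printf("packets per interrupt: %.0f\n", pkts_per_int);
        printf("worst-case added latency: %.0f us\n", max_latency_us);
        printf("approx throttle register value: %.0f\n", itr_value);
        return 0;
    }

At 500K pps that works out to roughly 50 packets serviced per interrupt with at most ~100 us of added latency, which is essentially the same amortization trade-off that polling makes.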
Re: DP performance
If we are talking about maxing out a machine in the packet routing role, then there are two major issues that have to be considered:

* Bus bandwidth. e.g. PCI, PCIX, PCIE, etc etc etc. A standard PCI bus is limited to ~120 MBytes/sec, not enough for even a single GiGE link going full duplex at full speed. More recent busses can do better.

* Workload separation. So e.g. if one has four interfaces and two cpus, each cpu could handle two interfaces. An MP system would not reap any real gains over UP until one had three or more network interfaces, since two interfaces is no different from one interface from the point of view of trying to route packets.

Main memory bandwidth used to be an issue but isn't so much any more.

Insofar as DragonFly goes, we can almost handle the workload separation case now, but not quite. We will be able to handle it with the work going in after the release. Even so, it will probably only matter if the majority of packets being routed are tiny. Bigger packets eat far less cpu for the amount of data transferred.

-Matt
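Two quick sanity checks on the numbers above, using theoretical maxima only (real buses and NICs do worse): classic 32-bit/33 MHz PCI versus a full-duplex GigE link, and line-rate packet counts at minimum versus maximum frame size, since the packet count is what actually costs cpu.

    /*
     * Sanity checks on the bus-bandwidth and packet-size points above.
     * All figures are theoretical maxima.
     */
    #include <stdio.h>

    static double gige_pps(int frame_bytes)
    {
        /* 8-byte preamble + 12-byte inter-frame gap per frame on the wire */
        return 1e9 / 8.0 / (frame_bytes + 20);
    }

    int main(void)
    {
        double pci_mbs  = 33e6 * 4 / 1e6;  /* 32-bit @ 33 MHz: ~132 MB/s */
        double gige_mbs = 2 * 125.0;       /* full-duplex GigE: 250 MB/s */

        printf("PCI (32/33): ~%.0f MB/s vs full-duplex GigE: %.0f MB/s\n",
               pci_mbs, gige_mbs);
        printf("64-byte frames:   %.0f pps\n", gige_pps(64));   /* ~1.49 Mpps */
        printf("1518-byte frames: %.0f pps\n", gige_pps(1518)); /* ~81 Kpps   */
        return 0;
    }

The ~18x spread in packets per second between minimum and maximum-size frames is why the workload-separation gains only matter when most routed packets are tiny.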