Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hi, Please move any further comments on this thread to: https://reviews.freebsd.org/D1438 See r277528. Discussion ends here. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hans: We (netflix) run in production 35% of the internet with these very things you identify no lock an all. We *do* have some issue we are looking at but so far I have *never* connected the dots the way you were claiming that would cause a crash. I can see where TCP would do incorrect retransmissions but I did *not* see a crash. Now granted my look was quick at this, but that was due to time constraints and the holidays. I am going to put myself full-time on this to see if I can understand both how you got at “there is a panic in tcp” and it must fully be the callout-subsystem thus we need to re-write large parts of it. You *may* be correct in a re-write is needed, you *may* be completely incorrect. In either case I plan to dig into this and find out. R On Jan 22, 2015, at 3:39 AM, Hans Petter Selasky h...@selasky.org wrote: On 01/22/15 09:10, Konstantin Belousov wrote: On Thu, Jan 22, 2015 at 08:14:26AM +0100, Hans Petter Selasky wrote: On 01/22/15 06:26, Warner Losh wrote: The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Radically changing the performance characteristics is breaking the code. Performance regression in the TCP stack is urgent to fix. Not being able to enumerate what all the consumers are that use this and provide an analysis about why they aren?t important to fix is a bug in your process, and in your interaction with the project. We simply do not operate that way. Right, I completely agree with this statement. Hi, My plan is to work out a patch for the TCP stack today, which only change the callout_init() call or its function. This should not need any particular review. I'll let adrian test and review, because I think he is closer to me timezone wise and you're standing on my head saying its urgent. If he is still not happy, I can back my change out. Else it remains in -current AS-IS. TCP regresssion was noted, so it is brought in front. There is nothing else which makes TCP issue different from other (hidden) issues. === MFC to 10-stable I can delay for sure until all issues you report to me are fixed. === Sigh, you still do not understand. It is your duty to identify all pieces which break after your change. After that, we can argue whether each of them is critical or not to allow the migration. But this must have been done before the KPI change hit the tree. Hi, Are you saying that pieces of code that runs completely unlocked using volatile as only synchronization mechanism is better than what I would call a temporary and hopefully short TCP stack performance loss? I don't understand? How frequently do you reboot your boxes? Maybe one every day? And you don't care? --HPS - Randall Stewart rand...@lakerest.net ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Let me re-send my email.. my silly mac sent my first try from the wrong address.. sigh (sorry moderator where ever you are ;-o) All: I have finally pulled my head out of the sands of TLS and had some time to look at this interesting long thread. I agree with Warner and Adrian on this.. Lets back it out and then in a branch chew this over piece by piece.. R As an addition I have decided to get my head back into this, I was one of the ones on Hann’s original email and I had asked him to wait until *after* the Holiday’s to do anything (thinking on continuing the discussion) I did *not* realize he planned on roto-tilling the callout system.. sigh On Jan 21, 2015, at 7:10 PM, Adrian Chadd adr...@freebsd.org wrote: On 21 January 2015 at 16:07, K. Macy km...@freebsd.org wrote: HPS: Your change failed to meet these guidelines. Some of us are upset because these guidelines are fairly fundamental for the on-going viability of FreeBSD. Due to linguistic / time zone / cultural differences these expectations have not been adequately communicated to you. You are not in the USB sandbox where others need for your support outweighs the inconvenience of random breakage. It sounds like you are making progress towards updating the concerns that have been voiced. If kib's observations are in fact comprehensive then adding a callout_init_cpu function and updating all clients so that their callouts continue to be scheduled on a CPU other than the BSP will suffice and we can all move on. Is there some reason that we can’t back things out, break things down into smaller pieces and have everything pass through phabric with a wide ranging review? Given the fundamental nature of these changes, they really need better review and doing it after the fact seems to be to be too risky. I’m not debating that this “fixes” some issues, but given the performance regression, it sure seems like we may need a different solution to be implemented and hashing that out in a branch might be the best approach. Thank you. A more incremental approach would be appreciated by many of us. To avoid the bystander effect we can permit explicit timeouts for review-to-commit (72 hours?) so that we don't collectively end up sandbagging him. I'm +1 for this. -a -- Randall Stewart 803-317-4952 (cell) ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hans Thats great, could you please open a project branch that we can look at it in too? I would very much appreciate that. Sometimes I like to look at the whole code with it all in place (not just patches) and a project branch really helps with that. R On Jan 22, 2015, at 3:39 AM, Hans Petter Selasky h...@selasky.org wrote: On 01/22/15 09:10, Konstantin Belousov wrote: On Thu, Jan 22, 2015 at 08:14:26AM +0100, Hans Petter Selasky wrote: On 01/22/15 06:26, Warner Losh wrote: The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Radically changing the performance characteristics is breaking the code. Performance regression in the TCP stack is urgent to fix. Not being able to enumerate what all the consumers are that use this and provide an analysis about why they aren?t important to fix is a bug in your process, and in your interaction with the project. We simply do not operate that way. Right, I completely agree with this statement. Hi, My plan is to work out a patch for the TCP stack today, which only change the callout_init() call or its function. This should not need any particular review. I'll let adrian test and review, because I think he is closer to me timezone wise and you're standing on my head saying its urgent. If he is still not happy, I can back my change out. Else it remains in -current AS-IS. TCP regresssion was noted, so it is brought in front. There is nothing else which makes TCP issue different from other (hidden) issues. === MFC to 10-stable I can delay for sure until all issues you report to me are fixed. === Sigh, you still do not understand. It is your duty to identify all pieces which break after your change. After that, we can argue whether each of them is critical or not to allow the migration. But this must have been done before the KPI change hit the tree. Hi, Are you saying that pieces of code that runs completely unlocked using volatile as only synchronization mechanism is better than what I would call a temporary and hopefully short TCP stack performance loss? I don't understand? How frequently do you reboot your boxes? Maybe one every day? And you don't care? --HPS -- Randall Stewart 803-317-4952 (cell) ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hans: I think this is the wrong approach. You should: a) Back out the commit you did to head b) create a project branch and put your changes in there c) fix *everything* you break in these subtle ways d) Discuss through a review process if your changes are correct e) When consensus is reached then and only then put it into head with all the fixes. If you are not willing to do the above, then we need this taken up to core and have them act on it. I think enough folks have voiced concern that you should be willing to do the above steps. R On Jan 22, 2015, at 12:19 AM, Randy Stewart rand...@lakerest.net wrote: All: I have finally pulled my head out of the sands of TLS and had some time to look at this interesting long thread. I agree with Warner and Adrian on this.. Lets back it out and then in a branch chew this over piece by piece.. R - Randall Stewart rand...@lakerest.net ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/22/15 12:26, Randy Stewart wrote: Hans: We (netflix) run in production 35% of the internet with these very things you identify no lock an all. We *do* have some issue we are looking at but so far I have *never* connected the dots the way you were claiming that would cause a crash. I can see where TCP would do incorrect retransmissions but I did *not* see a crash. Now granted my look was quick at this, but that was due to time constraints and the holidays. I am going to put myself full-time on this to see if I can understand both how you got at “there is a panic in tcp” and it must fully be the callout-subsystem thus we need to re-write large parts of it. You *may* be correct in a re-write is needed, you *may* be completely incorrect. In either case I plan to dig into this and find out. Hi, There are multiple issues in the bag here ... --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
All: I have finally pulled my head out of the sands of TLS and had some time to look at this interesting long thread. I agree with Warner and Adrian on this.. Lets back it out and then in a branch chew this over piece by piece.. R On Jan 21, 2015, at 7:10 PM, Adrian Chadd adr...@freebsd.org wrote: On 21 January 2015 at 16:07, K. Macy km...@freebsd.org wrote: HPS: Your change failed to meet these guidelines. Some of us are upset because these guidelines are fairly fundamental for the on-going viability of FreeBSD. Due to linguistic / time zone / cultural differences these expectations have not been adequately communicated to you. You are not in the USB sandbox where others need for your support outweighs the inconvenience of random breakage. It sounds like you are making progress towards updating the concerns that have been voiced. If kib's observations are in fact comprehensive then adding a callout_init_cpu function and updating all clients so that their callouts continue to be scheduled on a CPU other than the BSP will suffice and we can all move on. Is there some reason that we can’t back things out, break things down into smaller pieces and have everything pass through phabric with a wide ranging review? Given the fundamental nature of these changes, they really need better review and doing it after the fact seems to be to be too risky. I’m not debating that this “fixes” some issues, but given the performance regression, it sure seems like we may need a different solution to be implemented and hashing that out in a branch might be the best approach. Thank you. A more incremental approach would be appreciated by many of us. To avoid the bystander effect we can permit explicit timeouts for review-to-commit (72 hours?) so that we don't collectively end up sandbagging him. I'm +1 for this. -a - Randall Stewart rand...@lakerest.net ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Thu, Jan 22, 2015 at 08:14:26AM +0100, Hans Petter Selasky wrote: On 01/22/15 06:26, Warner Losh wrote: The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Radically changing the performance characteristics is breaking the code. Performance regression in the TCP stack is urgent to fix. Not being able to enumerate what all the consumers are that use this and provide an analysis about why they aren?t important to fix is a bug in your process, and in your interaction with the project. We simply do not operate that way. Right, I completely agree with this statement. Hi, My plan is to work out a patch for the TCP stack today, which only change the callout_init() call or its function. This should not need any particular review. I'll let adrian test and review, because I think he is closer to me timezone wise and you're standing on my head saying its urgent. If he is still not happy, I can back my change out. Else it remains in -current AS-IS. TCP regresssion was noted, so it is brought in front. There is nothing else which makes TCP issue different from other (hidden) issues. === MFC to 10-stable I can delay for sure until all issues you report to me are fixed. === Sigh, you still do not understand. It is your duty to identify all pieces which break after your change. After that, we can argue whether each of them is critical or not to allow the migration. But this must have been done before the KPI change hit the tree. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/22/15 09:10, Konstantin Belousov wrote: On Thu, Jan 22, 2015 at 08:14:26AM +0100, Hans Petter Selasky wrote: On 01/22/15 06:26, Warner Losh wrote: The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Radically changing the performance characteristics is breaking the code. Performance regression in the TCP stack is urgent to fix. Not being able to enumerate what all the consumers are that use this and provide an analysis about why they aren?t important to fix is a bug in your process, and in your interaction with the project. We simply do not operate that way. Right, I completely agree with this statement. Hi, My plan is to work out a patch for the TCP stack today, which only change the callout_init() call or its function. This should not need any particular review. I'll let adrian test and review, because I think he is closer to me timezone wise and you're standing on my head saying its urgent. If he is still not happy, I can back my change out. Else it remains in -current AS-IS. TCP regresssion was noted, so it is brought in front. There is nothing else which makes TCP issue different from other (hidden) issues. === MFC to 10-stable I can delay for sure until all issues you report to me are fixed. === Sigh, you still do not understand. It is your duty to identify all pieces which break after your change. After that, we can argue whether each of them is critical or not to allow the migration. But this must have been done before the KPI change hit the tree. Hi, Are you saying that pieces of code that runs completely unlocked using volatile as only synchronization mechanism is better than what I would call a temporary and hopefully short TCP stack performance loss? I don't understand? How frequently do you reboot your boxes? Maybe one every day? And you don't care? --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/22/15 10:49, K. Macy wrote: Sigh, you still do not understand. It is your duty to identify all pieces which break after your change. After that, we can argue whether each of them is critical or not to allow the migration. But this must have been done before the KPI change hit the tree. Hi, Are you saying that pieces of code that runs completely unlocked using volatile as only synchronization mechanism is better than what I would call a temporary and hopefully short TCP stack performance loss? Hans - The project has long standing expectations about how changes are made to core subsystems. When you hear understand your ego intercedes - put that aside. I told you this first this afternoon and others have repeated it several times. When you change a KPI, consumers are updated at the same time - _period_. This protocol is not up for debate. I'm glad that others have the presence of mind and fortitude to insist on this. Your work is appreciated, but whether or not you agree about this is not relevant. We're all sorry if this upsets you but this is only a temporary setback. Channelling this work through phabricator will go a long way towards smoothing over the current friction. Think about the greater goal here, not whether this is done now or in a week. Hi Kip, That is fine by me. I didn't know about the protocol you refer to until now. I will revert my callout patch and hopefully without causing any build issues and then we can have another round in the Phabricator to iron out the TCP stack issues and possibly others. Sounds good. Please give me some hours to ensure that the pullout doesn't cause any build breakages. Thank you! --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
http://www.mrs-realestate.blogspot.in/ 2400sq.ft 2400*200=48 only near (sez-3) tamilnadu 6km only sez-3 after 10 year land valu w contact:7373730713(sms only) ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Sigh, you still do not understand. It is your duty to identify all pieces which break after your change. After that, we can argue whether each of them is critical or not to allow the migration. But this must have been done before the KPI change hit the tree. Hi, Are you saying that pieces of code that runs completely unlocked using volatile as only synchronization mechanism is better than what I would call a temporary and hopefully short TCP stack performance loss? Hans - The project has long standing expectations about how changes are made to core subsystems. When you hear understand your ego intercedes - put that aside. I told you this first this afternoon and others have repeated it several times. When you change a KPI, consumers are updated at the same time - _period_. This protocol is not up for debate. I'm glad that others have the presence of mind and fortitude to insist on this. Your work is appreciated, but whether or not you agree about this is not relevant. We're all sorry if this upsets you but this is only a temporary setback. Channelling this work through phabricator will go a long way towards smoothing over the current friction. Think about the greater goal here, not whether this is done now or in a week. -K ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Sean, On Tue, Jan 20, 2015 at 04:22:44PM -0800, Sean Bruno wrote: S In our universe, this commit (right or wrong) resolved our panics. I S think that there is some room for improvement based on the commentary S in this thread, but some people do indeed prefer stability over S performance. I hope we can come to a middle ground somewhere here. Sorry, but this sounds very much like alchemy. We poured this stuff into that stuff and yield in gold precipitate. We don't understand what's going on, but let's record the recipe into our tome of aclhemy wisdom. So alchemy never came to a scientific level, and chemistry evolved as science only when researchers started to measure, explain and understand. If we treat our precious kernel in alchemy way, we will follow the path of alchemy, except that it took centuries for alchemy to die, and for a software product it would take a few years. So, for me Kip ideas sound very sensible. There could be a race somewhere else. You tweak callout subsystem in any direction, timings of events in kernel shift, your race is hidden. If we fix problems w/o understanding them, we are going alchemy way. -- Totus tuus, Glebius. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 21 January 2015 at 10:15, Gleb Smirnoff gleb...@freebsd.org wrote: Sean, On Tue, Jan 20, 2015 at 04:22:44PM -0800, Sean Bruno wrote: S In our universe, this commit (right or wrong) resolved our panics. I S think that there is some room for improvement based on the commentary S in this thread, but some people do indeed prefer stability over S performance. I hope we can come to a middle ground somewhere here. Sorry, but this sounds very much like alchemy. We poured this stuff into that stuff and yield in gold precipitate. We don't understand what's going on, but let's record the recipe into our tome of aclhemy wisdom. So alchemy never came to a scientific level, and chemistry evolved as science only when researchers started to measure, explain and understand. If we treat our precious kernel in alchemy way, we will follow the path of alchemy, except that it took centuries for alchemy to die, and for a software product it would take a few years. So, for me Kip ideas sound very sensible. There could be a race somewhere else. You tweak callout subsystem in any direction, timings of events in kernel shift, your race is hidden. If we fix problems w/o understanding them, we are going alchemy way. Hi, I don't think it's quite this bad. They originally found that things were spinning for way too long. Hans found something similar and determined/concluded that the migration code in callouts was racy-by-design and dramatically simplified it and also put very hard constraints on what is a valid situation to support migrating from one callwheel to another. Now we have fallout which we can either address or back out until the callout stuff is again reviewed/fixed. I don't think it's as alchemic as is being promoted. -a ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 21 January 2015 at 06:00, Lawrence Stewart lstew...@freebsd.org wrote: On 01/20/15 09:22, Adrian Chadd wrote: Yeah, it looks like you set c_cpu to timeout_cpu in _callout_init_locked(), but then you only handle the case of the CPU being changed in certain circumstances. You aren't handling the CPU being initialised when the callout is first added. And, callout_restart_async() calls callout_lock(), which calls CC_LOCK() on the callout CPU, which initially is zero. Then, it never seems to be 'moved' into the correct CPU, even though it's being called with a CPU id. So, if I am reading this all correctly, you aren't really handling multi CPU callwheels at all. ;) In my instance, I'm seeing quite a lot of lock contention between the userland threads, the network RX threads and the clock thread, all contending on a single callout wheel being used for TCP timers. One of the early goals for fixing up the RSS stuff in -HEAD was to make per-CPU (Well, per-RSS-bucket really) TCP timers not contend with the NIC RX processing path, and now it does. :( To clarify, you're seeing this with net.inet.tcp.per_cpu_timers=1? Yup, RSS enables that by default. -adrian ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
HPS: Your change failed to meet these guidelines. Some of us are upset because these guidelines are fairly fundamental for the on-going viability of FreeBSD. Due to linguistic / time zone / cultural differences these expectations have not been adequately communicated to you. You are not in the USB sandbox where others need for your support outweighs the inconvenience of random breakage. It sounds like you are making progress towards updating the concerns that have been voiced. If kib's observations are in fact comprehensive then adding a callout_init_cpu function and updating all clients so that their callouts continue to be scheduled on a CPU other than the BSP will suffice and we can all move on. Is there some reason that we can’t back things out, break things down into smaller pieces and have everything pass through phabric with a wide ranging review? Given the fundamental nature of these changes, they really need better review and doing it after the fact seems to be to be too risky. I’m not debating that this “fixes” some issues, but given the performance regression, it sure seems like we may need a different solution to be implemented and hashing that out in a branch might be the best approach. Thank you. A more incremental approach would be appreciated by many of us. To avoid the bystander effect we can permit explicit timeouts for review-to-commit (72 hours?) so that we don't collectively end up sandbagging him. -K ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 21 January 2015 at 16:07, K. Macy km...@freebsd.org wrote: HPS: Your change failed to meet these guidelines. Some of us are upset because these guidelines are fairly fundamental for the on-going viability of FreeBSD. Due to linguistic / time zone / cultural differences these expectations have not been adequately communicated to you. You are not in the USB sandbox where others need for your support outweighs the inconvenience of random breakage. It sounds like you are making progress towards updating the concerns that have been voiced. If kib's observations are in fact comprehensive then adding a callout_init_cpu function and updating all clients so that their callouts continue to be scheduled on a CPU other than the BSP will suffice and we can all move on. Is there some reason that we can’t back things out, break things down into smaller pieces and have everything pass through phabric with a wide ranging review? Given the fundamental nature of these changes, they really need better review and doing it after the fact seems to be to be too risky. I’m not debating that this “fixes” some issues, but given the performance regression, it sure seems like we may need a different solution to be implemented and hashing that out in a branch might be the best approach. Thank you. A more incremental approach would be appreciated by many of us. To avoid the bystander effect we can permit explicit timeouts for review-to-commit (72 hours?) so that we don't collectively end up sandbagging him. I'm +1 for this. -a ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
They originally found that things were spinning for way too long. Hans found something similar and determined/concluded that the migration code in callouts was racy-by-design and dramatically simplified it and also put very hard constraints on what is a valid situation to support migrating from one callwheel to another. Now we have fallout which we can either address or back out until the callout stuff is again reviewed/fixed. I don't think it's as alchemic as is being promoted. Let's not get drawn into semantic debates. To me alchemy implies voodoo debugging, whereas this is, in part, disabling functionality that exposes races - which is more like disabling preemption or fine grained locking. Normally I would advocate backing this out on the basis of precedence. That is to say, back it out so that developers will get it right the first time. However, it appears that hps@ and some others don't understand what was done wrong. So I'm going to state some basic premises and then the implied basic guidelines required of a commit that were not met. If these guidelines are violated by a change in the future I will agitate for a rapid back out. Premises: A) A performance regression is a bug every bit as much as a locking race. Stability OR performance (where we are talking about _current_ performance) is a false dichotomy. - This means that a change that fixes a bug by introducing a substantial performance regression does not constitute a fix. B) Existing behavior and performance characteristics should not be adversely changed unless there is widespread consensus. - Sometimes it is necessary to break a KPI but that should not be discovered by svn or scanning back through the mailing lists. Guidelines: A) Performance should not measurably regress. B) If a KPI changes behavior e.g. callout_reset_on being crippled, all clients need to be updated so that they maintain their current behavior by the committer changing the KPI. The client code maintainers aren't responsible for reacting to these types of changes. That is OK for marginal architectures like sun4v or pc98. C) If there is a non-zero possibility that these changes break the client code, committers who have touched that code in the last few years should be notified. HPS: Your change failed to meet these guidelines. Some of us are upset because these guidelines are fairly fundamental for the on-going viability of FreeBSD. Due to linguistic / time zone / cultural differences these expectations have not been adequately communicated to you. You are not in the USB sandbox where others need for your support outweighs the inconvenience of random breakage. It sounds like you are making progress towards updating the concerns that have been voiced. If kib's observations are in fact comprehensive then adding a callout_init_cpu function and updating all clients so that their callouts continue to be scheduled on a CPU other than the BSP will suffice and we can all move on. Thanks in advance. -K ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Jan 21, 2015, at 4:38 PM, K. Macy km...@freebsd.org wrote: They originally found that things were spinning for way too long. Hans found something similar and determined/concluded that the migration code in callouts was racy-by-design and dramatically simplified it and also put very hard constraints on what is a valid situation to support migrating from one callwheel to another. Now we have fallout which we can either address or back out until the callout stuff is again reviewed/fixed. I don't think it's as alchemic as is being promoted. Let's not get drawn into semantic debates. To me alchemy implies voodoo debugging, whereas this is, in part, disabling functionality that exposes races - which is more like disabling preemption or fine grained locking. Normally I would advocate backing this out on the basis of precedence. That is to say, back it out so that developers will get it right the first time. However, it appears that hps@ and some others don't understand what was done wrong. So I'm going to state some basic premises and then the implied basic guidelines required of a commit that were not met. If these guidelines are violated by a change in the future I will agitate for a rapid back out. Premises: A) A performance regression is a bug every bit as much as a locking race. Stability OR performance (where we are talking about _current_ performance) is a false dichotomy. - This means that a change that fixes a bug by introducing a substantial performance regression does not constitute a fix. B) Existing behavior and performance characteristics should not be adversely changed unless there is widespread consensus. - Sometimes it is necessary to break a KPI but that should not be discovered by svn or scanning back through the mailing lists. Guidelines: A) Performance should not measurably regress. B) If a KPI changes behavior e.g. callout_reset_on being crippled, all clients need to be updated so that they maintain their current behavior by the committer changing the KPI. The client code maintainers aren't responsible for reacting to these types of changes. That is OK for marginal architectures like sun4v or pc98. C) If there is a non-zero possibility that these changes break the client code, committers who have touched that code in the last few years should be notified. HPS: Your change failed to meet these guidelines. Some of us are upset because these guidelines are fairly fundamental for the on-going viability of FreeBSD. Due to linguistic / time zone / cultural differences these expectations have not been adequately communicated to you. You are not in the USB sandbox where others need for your support outweighs the inconvenience of random breakage. It sounds like you are making progress towards updating the concerns that have been voiced. If kib's observations are in fact comprehensive then adding a callout_init_cpu function and updating all clients so that their callouts continue to be scheduled on a CPU other than the BSP will suffice and we can all move on. Is there some reason that we can’t back things out, break things down into smaller pieces and have everything pass through phabric with a wide ranging review? Given the fundamental nature of these changes, they really need better review and doing it after the fact seems to be to be too risky. I’m not debating that this “fixes” some issues, but given the performance regression, it sure seems like we may need a different solution to be implemented and hashing that out in a branch might be the best approach. Warner ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Jan 20, 2015, at 12:51 AM, Konstantin Belousov kostik...@gmail.com wrote: Do other people consider this situation acceptable ? Not even a little. Warner ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Jan 20, 2015, at 12:35 AM, Hans Petter Selasky h...@selasky.org wrote: On 01/20/15 06:22, Adrian Chadd wrote: Sweet, thanks. I'l test it, but anything that changes the locking to TCP is going to need a more thorough review. The there be dragons disclaimer is appropriate.:) No changes in locking - simply some minor code reordering. This isn’t entirely true. You changed the INFO_WLOCK protocol, and also drop the WLOCK to acquire the INFO_WLOCK in places, and it isn’t clear to me at all why this is safe to do. Please document the analysis you did to show that was safe. Warner ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Jan 20, 2015, at 2:37 AM, Hans Petter Selasky h...@selasky.org wrote: On 01/20/15 10:00, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote: On 01/20/15 08:51, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. Hi, It is not very hard to update existing callout clients and you can do it too, if you need the extra bits of performance. Are there more API's than the TCP stack which you think needs an update and are performance critical? As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? For the TCP stack - yes, but not for other clients like cv_timedwait() and such. If you think you have a better way to solve the callout problems, please tell me! In order for a callout to change its CPU you need a lock to protect which CPU the callout is on. Instead of introducing a third lock in the callout path, which will be a congestion point, to protect against changing the CPU number, I decided that we will use the client's mutex and the MPSAFE implies the client doesn't have any mutex. So it won't work with callout clients which use the CALLOUT_MPSAFE flag. Honestly CALLOUT_MPSAFE should not be used, because it leads to extra complexity in the clients catching the race when tearing down the callouts and any pending callbacks. Then it is incumbent on you to fix them. You can’t just fix one instance and wash your hands of the problem. Maybe this is a real and legitimate bug. However, until you’ve followed your solution through by actually fixing the abusers of it, my confidence that another issue won’t present itself is quite low. The code seems half baked to me. And from reading this thread, it seems like perhaps I’m not the only one. Please read the callout 9 manual page first. Assume I read it. How this changes any of my points above ? A change in the CPU selection cannot happen if this function is re-scheduled inside a callout function. Else the callback function given by the func argument will be executed on the same CPU like previously done. You cannot do this without fixing consumers. The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Radically changing the performance characteristics is breaking the code. Performance regression in the TCP stack is urgent to fix. Not being able to enumerate what all the consumers are that use this and provide an analysis about why they aren’t important to fix is a bug in your process, and in your interaction with the project. We simply do not operate that way. Warner ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/22/15 06:26, Warner Losh wrote: The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Radically changing the performance characteristics is breaking the code. Performance regression in the TCP stack is urgent to fix. Not being able to enumerate what all the consumers are that use this and provide an analysis about why they aren’t important to fix is a bug in your process, and in your interaction with the project. We simply do not operate that way. Hi, My plan is to work out a patch for the TCP stack today, which only change the callout_init() call or its function. This should not need any particular review. I'll let adrian test and review, because I think he is closer to me timezone wise and you're standing on my head saying its urgent. If he is still not happy, I can back my change out. Else it remains in -current AS-IS. MFC to 10-stable I can delay for sure until all issues you report to me are fixed. Thank you for your patience! --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/21/15 01:49, Adrian Chadd wrote: You should totally try say, 100,000 active TCP connections on a box. See what happens to swi0 (clock). TL;DR - the lock contention sucks and it takes a chunk of the core up. The lock contention is highly not good. That's why I'd like to see both the callout stuff in its slightly-better-defined-and-sane state from you/and/ make it so TCP can use it. I'll have to double-check to see if the RSS stuff is all lined up correctly so we can use it when we create the callouts (well, at inpcb creation time, right), rather than when we first schedule them. Then we can experiment with having the initial CPU be specified at callout create time rather than expecting to be able to move it when we first schedule it. Or, hm, maybe have it so we don't have a CPU chosen until the first time we schedule the timeout, and if it hasn't been scheduled before, allow the CPU to be set? Because at that point we aren't migrating it off f timeout_cpu - it's never been added to it in the first place. Hi Adrian, What you are saying is correct. If you set the initial c_cpu value when the callout is initialized it will run on SWI#X instead of SWI#0. This is fully allowed, so maybe a callout_init_cpu() would be appropriate, to set the initial CPU number for callouts. With regard to the callout the c_cpu value can be different from zero, only the it must remain fixed/constant when there is no lock protecting updates to it! Kip and Gleb: Does adding a callout_init_cpu() function which can be used for existing callouts and in conjunction with CALLOUT_MPSAFE change anything? --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Wed, Jan 21, 2015 at 09:32:11AM +0100, Hans Petter Selasky wrote: On 01/21/15 00:53, Sean Bruno wrote: Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. sean Hi, I've seen this with Mellanox hardware when running some special tests, but not during regular use yet. That was the reason for going into the callout subsystem in the first place. 40GE. Also I would like to mention during the heat of this discussion, that during X-mas this year, I had a very heavy discussion with Attilio and a few other FreeBSD developers, who's name was on a patch (r220456) that changed how the return value of callout_active() works. callout_active() is heavily used inside the TCP stack and what was found is there is a potential race related to migrating the callout from one CPU to the other, which in turn might give other symptoms than a spinlock hang. FYI: https://svnweb.freebsd.org/base?view=revisionrevision=225057 Cite: If the newly scheduled thread wants to acquire the old queue it will just spin forever. This description reminds me very much of what Jason Wolfe, others and myself have seen. Konstantin, you're responsible for r220456 (Approved by: kib). I would I definitely do not see anything related to my freefall login in the log message for r220456, nor I participated in any way in the work which lead to that revision. If you mean r225057, note that approval by re != review. like to ask what investigation you did to ensure that you solved the problem as described in the commit message and didn't introduce a new one? In r220456 the callout_reset_on() function was changed in a way that directly conflicts with how the TCP stack works, by not always ensuring that callout_active() returns non-zero after a callout is restarted! See return at line 821: https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?revision=225057view=markuppathrev=225057#l821 Kib: Any comments? With the re hat on, explanation for the proposed commit looked reasonable, and committer provided enough evidence that change got adequate testing. Since change fixed a bug, and this is exactly what re wants to see during release cycle, I see no reason why commit should be denied. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/21/15 09:51, Konstantin Belousov wrote: On Wed, Jan 21, 2015 at 09:32:11AM +0100, Hans Petter Selasky wrote: On 01/21/15 00:53, Sean Bruno wrote: Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. sean Hi, I've seen this with Mellanox hardware when running some special tests, but not during regular use yet. That was the reason for going into the callout subsystem in the first place. 40GE. Also I would like to mention during the heat of this discussion, that during X-mas this year, I had a very heavy discussion with Attilio and a few other FreeBSD developers, who's name was on a patch (r220456) that changed how the return value of callout_active() works. callout_active() is heavily used inside the TCP stack and what was found is there is a potential race related to migrating the callout from one CPU to the other, which in turn might give other symptoms than a spinlock hang. FYI: https://svnweb.freebsd.org/base?view=revisionrevision=225057 Cite: If the newly scheduled thread wants to acquire the old queue it will just spin forever. This description reminds me very much of what Jason Wolfe, others and myself have seen. Konstantin, you're responsible for r220456 (Approved by: kib). I would I definitely do not see anything related to my freefall login in the log message for r220456, nor I participated in any way in the work which lead to that revision. If you mean r225057, note that approval by re != review. Yes, I meant r225057. like to ask what investigation you did to ensure that you solved the problem as described in the commit message and didn't introduce a new one? In r220456 the callout_reset_on() function was changed in a way that directly conflicts with how the TCP stack works, by not always ensuring that callout_active() returns non-zero after a callout is restarted! See return at line 821: https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?revision=225057view=markuppathrev=225057#l821 Kib: Any comments? With the re hat on, explanation for the proposed commit looked reasonable, and committer provided enough evidence that change got adequate testing. Since change fixed a bug, and this is exactly what re wants to see during release cycle, I see no reason why commit should be denied. The problem is Attilio is no longer an active committer and he was not been very willing to do more work in this area. When people writing code in an area no longer respond - what should we do? --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 04:37:44PM -0800, K. Macy wrote: I would pick stability over performance any day. However, it _seems_ to me, and maybe I simply don't understand some key details, that the fix consisted of largely single-threading the callout system. And as I say I may simply not understand the specifics, but this sort of large scale disabling does not constitute a fix but is a workaround. It's more like disabling preemption because it fixes a panic. Yes, it might fix a whole array of bugs that crop up but it could not be seen as a fix to an otherwise working system. Well, this is not a complete truth. Let me try to explain my understanding, I spent some time actually looking at the new code. Would be nice if corrections to my understanding is posted, if needed. Now, when callout_reset() is directed to change cpu, the change only happens when the callout is associated with a lock, and that lock is owned by the caller of callout_reset(). There are two consequences. One, which is good, is significant simplification of the migration code, together with the drain. This is exactly the place where there bugs which make my head hurt, see e.g. r234952 and r243901. Note that some callouts follow this pattern already, e.g. process timers after r243869. Another consequence, which is very visible and which causes the roar, is that all lockless callers of callout_reset_on(9) now silently changed the behaviour to callout_reset(9). There is no audit performed to identify such callers, and there is no even a plan to fix them. The answer to the complains seems to be 'if you think that the fix is needed, go and fix'. My impression is that some set of vocal active developers consider the second consequence unacceptable, myself included. IMO, committing the change, however good the consequence one is, without fixing the consequence two, is inappropriate. And the onus of doing this is on the person doing the change. Yes, it is very interesting is the change actually good WRT fixing migration, since indeed there is serious reduction in the migration amount due to locked callout_reset() being not universally used. It is possible that the bugs are only covered. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/21/15 00:53, Sean Bruno wrote: Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. sean Hi, I've seen this with Mellanox hardware when running some special tests, but not during regular use yet. That was the reason for going into the callout subsystem in the first place. 40GE. Also I would like to mention during the heat of this discussion, that during X-mas this year, I had a very heavy discussion with Attilio and a few other FreeBSD developers, who's name was on a patch (r220456) that changed how the return value of callout_active() works. callout_active() is heavily used inside the TCP stack and what was found is there is a potential race related to migrating the callout from one CPU to the other, which in turn might give other symptoms than a spinlock hang. FYI: https://svnweb.freebsd.org/base?view=revisionrevision=225057 Cite: If the newly scheduled thread wants to acquire the old queue it will just spin forever. This description reminds me very much of what Jason Wolfe, others and myself have seen. Konstantin, you're responsible for r220456 (Approved by: kib). I would like to ask what investigation you did to ensure that you solved the problem as described in the commit message and didn't introduce a new one? In r220456 the callout_reset_on() function was changed in a way that directly conflicts with how the TCP stack works, by not always ensuring that callout_active() returns non-zero after a callout is restarted! See return at line 821: https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?revision=225057view=markuppathrev=225057#l821 Kib: Any comments? --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/20/15 09:22, Adrian Chadd wrote: Yeah, it looks like you set c_cpu to timeout_cpu in _callout_init_locked(), but then you only handle the case of the CPU being changed in certain circumstances. You aren't handling the CPU being initialised when the callout is first added. And, callout_restart_async() calls callout_lock(), which calls CC_LOCK() on the callout CPU, which initially is zero. Then, it never seems to be 'moved' into the correct CPU, even though it's being called with a CPU id. So, if I am reading this all correctly, you aren't really handling multi CPU callwheels at all. ;) In my instance, I'm seeing quite a lot of lock contention between the userland threads, the network RX threads and the clock thread, all contending on a single callout wheel being used for TCP timers. One of the early goals for fixing up the RSS stuff in -HEAD was to make per-CPU (Well, per-RSS-bucket really) TCP timers not contend with the NIC RX processing path, and now it does. :( To clarify, you're seeing this with net.inet.tcp.per_cpu_timers=1? Cheers, Lawrence ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote: On 01/20/15 08:51, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? Hi Konstantin, Please read the callout 9 manual page first. Assume I read it. How this changes any of my points above ? A change in the CPU selection cannot happen if this function is re-scheduled inside a callout function. Else the callback function given by the func argument will be executed on the same CPU like previously done. You cannot do this without fixing consumers. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/20/15 10:00, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote: On 01/20/15 08:51, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. Hi, It is not very hard to update existing callout clients and you can do it too, if you need the extra bits of performance. Are there more API's than the TCP stack which you think needs an update and are performance critical? As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? For the TCP stack - yes, but not for other clients like cv_timedwait() and such. If you think you have a better way to solve the callout problems, please tell me! In order for a callout to change its CPU you need a lock to protect which CPU the callout is on. Instead of introducing a third lock in the callout path, which will be a congestion point, to protect against changing the CPU number, I decided that we will use the client's mutex and the MPSAFE implies the client doesn't have any mutex. So it won't work with callout clients which use the CALLOUT_MPSAFE flag. Honestly CALLOUT_MPSAFE should not be used, because it leads to extra complexity in the clients catching the race when tearing down the callouts and any pending callbacks. Please read the callout 9 manual page first. Assume I read it. How this changes any of my points above ? A change in the CPU selection cannot happen if this function is re-scheduled inside a callout function. Else the callback function given by the func argument will be executed on the same CPU like previously done. You cannot do this without fixing consumers. The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 20 January 2015 at 04:37, Hans Petter Selasky h...@selasky.org wrote: It is not very hard to update existing callout clients and you can do it too, if you need the extra bits of performance. Hi Hans Petter, The issue here is that the onus is on the one changing a KPI to support its consumers through the transition. This doesn't necessarily mean doing all of the work. But the KPI changes, and techniques for adapting to them, need to be communicated very well in advance of the change arriving. I appreciate that you have a patch for TCP in review now - but having Adrian encounter a huge TCP performance regression is an unfortunate way to discover this issue. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 15 January 2015 at 13:53, John Baldwin j...@freebsd.org wrote: I think it's been a clear practice with all other changes reviewed in phabric to date that the committer only lists people in 'Reviewed by' who actually signed off on the patch, not just the list of people asked to review it. This point is worth repeating. Phabricator is an add-on tool to aid in pre-commit review, but is not a required part of our process and doesn't change the meaning of commit message metadata fields. Reviewed by in a commit message means exactly what it always has: that those listed have reasonably carefully reviewed the change and are willing to put their name on it. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 10:37:52AM +0100, Hans Petter Selasky wrote: On 01/20/15 10:00, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote: On 01/20/15 08:51, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. Hi, It is not very hard to update existing callout clients and you can do it too, if you need the extra bits of performance. I want to avoid regressions, and avoid breaking other' people work. Are there more API's than the TCP stack which you think needs an update and are performance critical? I did not performed any analysis. More, I naturally expect that such analysis and demonstration that there is no regression, is the duty of the person who proposes the change. As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? For the TCP stack - yes, but not for other clients like cv_timedwait() and such. If you think you have a better way to solve the callout problems, please tell me! In order for a callout to change its CPU you need a lock to protect which CPU the callout is on. Instead of introducing a third lock in the callout path, which will be a congestion point, to protect against changing the CPU number, I decided that we will use the client's mutex and the MPSAFE implies the client doesn't have any mutex. So it won't work with callout clients which use the CALLOUT_MPSAFE flag. Honestly CALLOUT_MPSAFE should not be used, because it leads to extra complexity in the clients catching the race when tearing down the callouts and any pending callbacks. This is your opinion. I did fixed some bugs in the callout migration code, and I am not sure that requiring rototiling of almost all KPI consumers (and leaving unconverted consumers to pre-cpu state) is the only possible solution. But again, since it is you who brought the change into the tree, it is your duty to present a valid proof why this is the only possible way to solve bugs (which bugs ?). Please read the callout 9 manual page first. Assume I read it. How this changes any of my points above ? A change in the CPU selection cannot happen if this function is re-scheduled inside a callout function. Else the callback function given by the func argument will be executed on the same CPU like previously done. You cannot do this without fixing consumers. The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. Isn't it obvious ? If callouts no longer migrate to non-BSP, this is the regression. I am sorry for you attitude. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 20 January 2015 at 06:25, Ed Maste ema...@freebsd.org wrote: On 20 January 2015 at 04:37, Hans Petter Selasky h...@selasky.org wrote: It is not very hard to update existing callout clients and you can do it too, if you need the extra bits of performance. Hi Hans Petter, The issue here is that the onus is on the one changing a KPI to support its consumers through the transition. This doesn't necessarily mean doing all of the work. But the KPI changes, and techniques for adapting to them, need to be communicated very well in advance of the change arriving. I appreciate that you have a patch for TCP in review now - but having Adrian encounter a huge TCP performance regression is an unfortunate way to discover this issue. I'm +1 on the think it's the right, clean solution for the callout stuff as I've done this stuff in userland a few times and it gets hairy very quickly when you try to support multiple ways to schedule, cancel and migrate events from arbitrary CPUs. I'm -1 on the rapid change without fixing other consumers, but I'm definitely +1 on the quick help from Hans on it. The TCP patch will need some closer review by people who have recently worked on the TCP stack and locking. I'll try to get the Verisign developers looped in as they touched this stuff recently. I do think we could do with a debugging print for now to catch situations which need migrating. The callout API doesn't tell you that it did or didn't migrate an event to another CPU and it could make for some performance unpredictability. -a ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tuesday, January 20, 2015 4:37:52 am Hans Petter Selasky wrote: On 01/20/15 10:00, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote: On 01/20/15 08:51, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. Hi, It is not very hard to update existing callout clients and you can do it too, if you need the extra bits of performance. Are there more API's than the TCP stack which you think needs an update and are performance critical? As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? For the TCP stack - yes, but not for other clients like cv_timedwait() and such. If you think you have a better way to solve the callout problems, please tell me! In order for a callout to change its CPU you need a lock to protect which CPU the callout is on. Instead of introducing a third lock in the callout path, which will be a congestion point, to protect against changing the CPU number, I decided that we will use the client's mutex and the MPSAFE implies the client doesn't have any mutex. So it won't work with callout clients which use the CALLOUT_MPSAFE flag. Honestly CALLOUT_MPSAFE should not be used, because it leads to extra complexity in the clients catching the race when tearing down the callouts and any pending callbacks. Please read the callout 9 manual page first. Assume I read it. How this changes any of my points above ? A change in the CPU selection cannot happen if this function is re-scheduled inside a callout function. Else the callback function given by the func argument will be executed on the same CPU like previously done. You cannot do this without fixing consumers. The code simply needs an update. It is not broken in any ways - right? If it is not broken, fixing it is not that urgent. This is not at all acceptable. TCP callouts were the largest potential user of multi-cpu callouts and you've just broken them. Your proposed change to handle inp locks is not necessarily correct either since dropping the inp lock inside a callout introduces new races (now callout_stop doesn't have quite the same semantics as it does for other callout_init_*()). Given this, it seems that your fix just mostly disabled multi-CPU callouts, so it is not at all clear that you've actually fixed anything. :( -- John Baldwin ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQF8BAEBCgBmBQJUvujmXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kv2MH/1F15x/lgwWq5fE/cZ3n9HlV
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:39, Navdeep Parhar wrote: Sean, Was this really Reviewed by: sbruno@ or just Tested by: sbruno@? I was very happy to see so many reviewers on the original commit but you seem to be the only one still left on the list. Regards, Navdeep I doubt that I would qualify as a Reviewer in this code by any stretch of the imagination. My contribution to the testing was in house and general review of code flow in out freebsd10 environment. sean On 01/20/15 15:35, Sean Bruno wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQF8BAEBCgBmBQJUvuhjXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kz74H/RVMvCmstmep+rXvwYNXwQvQ M45uwBUYe27qL1Z11qbMwghV0tkLWPkko7Nwgb+BwDhNcqsgxvGxH/8i8ohElHs2
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 20 January 2015 at 14:30, Hans Petter Selasky h...@selasky.org wrote: Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. You should totally try say, 100,000 active TCP connections on a box. See what happens to swi0 (clock). TL;DR - the lock contention sucks and it takes a chunk of the core up. The lock contention is highly not good. That's why I'd like to see both the callout stuff in its slightly-better-defined-and-sane state from you /and/ make it so TCP can use it. I'll have to double-check to see if the RSS stuff is all lined up correctly so we can use it when we create the callouts (well, at inpcb creation time, right), rather than when we first schedule them. Then we can experiment with having the initial CPU be specified at callout create time rather than expecting to be able to move it when we first schedule it. Or, hm, maybe have it so we don't have a CPU chosen until the first time we schedule the timeout, and if it hasn't been scheduled before, allow the CPU to be set? Because at that point we aren't migrating it off f timeout_cpu - it's never been added to it in the first place. -a ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Please back this out now. It was a substantial interface change without review. This should not be up for debate. I hope the others have the fortitude to insist upon this. -K On Thu, Jan 15, 2015 at 7:32 AM, Hans Petter Selasky hsela...@freebsd.org wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: - Close a migration race where callout_reset() failed to set the CALLOUT_ACTIVE flag. - Callout callback functions are now allowed to be protected by spinlocks. - Switching the callout CPU number cannot always be done on a per-callout basis. See the updated timeout(9) manual page for more information. - The timeout(9) manual page has been updated to reflect how all the functions inside the callout API are working. The manual page has been made function oriented to make it easier to deduce how each of the functions making up the callout API are working without having to first read the whole manual page. Group all functions into a handful of sections which should give a quick top-level overview when the different functions should be used. - The CALLOUT_SHAREDLOCK flag and its functionality has been removed to reduce the complexity in the callout code and to avoid problems about atomically stopping callouts via callout_stop(). If someone needs it, it can be re-added. From my quick grep there are no CALLOUT_SHAREDLOCK clients in the kernel. - A new callout API function named callout_drain_async() has been added. See the updated timeout(9) manual page for a complete description. - Update the callout clients in the kern/ folder to use the callout API properly, like cv_timedwait(). Previously there was some custom sleepqueue code in the callout subsystem, which has been removed, because we now allow callouts to be protected by spinlocks. This allows us to tear down the callout like done with regular mutexes, and a td_slpmutex has been added to struct thread to atomically teardown the td_slpcallout. Further the TDF_TIMOFAIL and SWT_SLEEPQTIMO states can now be completely removed. Currently they are marked as available and will be cleaned up in a follow up commit. - Bump the __FreeBSD_version to indicate kernel modules need recompilation. - There has been several reports that this patch seems to squash a serious bug leading to a callout timeout and panic. Kernel build testing: all architectures were built MFC after:2 weeks Differential Revision:https://reviews.freebsd.org/D1438 Sponsored by: Mellanox Technologies Reviewed by: jhb, adrian, sbruno and emaste Modified: head/share/man/man9/Makefile head/share/man/man9/timeout.9 head/sys/kern/init_main.c head/sys/kern/kern_condvar.c head/sys/kern/kern_lock.c head/sys/kern/kern_switch.c head/sys/kern/kern_synch.c head/sys/kern/kern_thread.c head/sys/kern/kern_timeout.c head/sys/kern/subr_sleepqueue.c head/sys/ofed/include/linux/completion.h head/sys/sys/_callout.h head/sys/sys/callout.h head/sys/sys/param.h head/sys/sys/proc.h Modified: head/share/man/man9/Makefile == --- head/share/man/man9/MakefileThu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/MakefileThu Jan 15 15:32:30 2015 (r277213) @@ -1570,6 +1570,7 @@ MLINKS+=timeout.9 callout.9 \ timeout.9 callout_active.9 \ timeout.9 callout_deactivate.9 \ timeout.9 callout_drain.9 \ + timeout.9 callout_drain_async.9 \ timeout.9 callout_handle_init.9 \ timeout.9 callout_init.9 \ timeout.9 callout_init_mtx.9 \ Modified: head/share/man/man9/timeout.9 == --- head/share/man/man9/timeout.9 Thu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/timeout.9 Thu Jan 15 15:32:30 2015 (r277213) @@ -29,13 +29,14 @@ .\ .\ $FreeBSD$ .\ -.Dd October 8, 2014 +.Dd January 14, 2015 .Dt TIMEOUT 9 .Os .Sh NAME .Nm callout_active , .Nm callout_deactivate , .Nm callout_drain , +.Nm callout_drain_async , .Nm callout_handle_init , .Nm callout_init , .Nm callout_init_mtx , @@ -63,279 +64,232 @@ .In sys/systm.h .Bd -literal typedef void timeout_t (void *); +typedef void callout_func_t (void *); .Ed -.Ft int -.Fn callout_active struct callout *c -.Ft void -.Fn callout_deactivate struct callout *c -.Ft int -.Fn callout_drain struct callout *c -.Ft void -.Fn callout_handle_init struct callout_handle *handle -.Bd -literal -struct callout_handle handle = CALLOUT_HANDLE_INITIALIZER(handle); -.Ed -.Ft void -.Fn callout_init
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:48, K. Macy wrote: Are any other drivers hitting this? e.g. cxgb/cxgbe? -K Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. sean On Tue, Jan 20, 2015 at 3:46 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 3:48 PM, K. Macy km...@freebsd.org wrote: Are any other drivers hitting this? e.g. cxgb/cxgbe? The evidence is simply past experience of recurring locking and control flow bugs in the intel drivers. -K -K On Tue, Jan 20, 2015 at 3:46 PM, Sean Bruno sbr...@ignoranthack.me wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:59, K. Macy wrote: On Tue, Jan 20, 2015 at 3:53 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 15:48, K. Macy wrote: Are any other drivers hitting this? e.g. cxgb/cxgbe? -K Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. For an ill characterized problem isolated to one environment this seems like a workaround that should not be part of the general code base. -K There was never any indication in our testing that the driver was involved in any way. In our universe, this commit (right or wrong) resolved our panics. I think that there is some room for improvement based on the commentary in this thread, but some people do indeed prefer stability over performance. I hope we can come to a middle ground somewhere here. sean sean On Tue, Jan 20, 2015 at 3:46 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang.
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 4:22 PM, Sean Bruno sbr...@ignoranthack.me wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:59, K. Macy wrote: On Tue, Jan 20, 2015 at 3:53 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 15:48, K. Macy wrote: Are any other drivers hitting this? e.g. cxgb/cxgbe? -K Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. For an ill characterized problem isolated to one environment this seems like a workaround that should not be part of the general code base. -K There was never any indication in our testing that the driver was involved in any way. In our universe, this commit (right or wrong) resolved our panics. I think that there is some room for improvement based on the commentary in this thread, but some people do indeed prefer stability over performance. I hope we can come to a middle ground somewhere here. I would pick stability over performance any day. However, it _seems_ to me, and maybe I simply don't understand some key details, that the fix consisted of largely single-threading the callout system. And as I say I may simply not understand the specifics, but this sort of large scale disabling does not constitute a fix but is a workaround. It's more like disabling preemption because it fixes a panic. Yes, it might fix a whole array of bugs that crop up but it could not be seen as a fix to an otherwise working system. -K sean sean On Tue, Jan 20, 2015 at 3:46 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. -- Totus tuus, Glebius. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
I think you're working around driver locking bugs by crippling the callout code. -K On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQF8BAEBCgBmBQJUvuYrXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kJTMIAMfh6ghV/AwQauY+a44q1hjJ WC7E3u69FK0opgSYg71kk6HckbyB+sTWND6HdXnpyrcMLXUt74zlB8c48wbUUV5+ EwKNYzGNNnDNhoc0WtPMect8e9Y1kBRvSGfHBdVrqATXfPOyZEa+i4lQAXpiFKIt nndqVrAH7bJM6143YDpnIg7vaR+8IQnC2ztSP4ogJzh03DZ7zVsg4BsoCPg50eVZ kr46cXcE+SP/TsQBsVNVwRJD5NFie6QJdLoTnwkd0XfQGJMIWivNgUcE4tIxqPIf nGQ0xMJCotpNLuPtzYzCIurSaDHdHmL6bjkhjTtBmWsMNfdGH8TKoih79GDxkTg= =Y3rd -END PGP SIGNATURE- ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 3:53 PM, Sean Bruno sbr...@ignoranthack.me wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:48, K. Macy wrote: Are any other drivers hitting this? e.g. cxgb/cxgbe? -K Unkown to me. Nor am I aware of anyone else who ever hit our panics either. Our environment, and the failure, was only seen in the Intel 10GE space (ixgbe). This is an artifact of our use cases, and hasn't been expanded nor tested in our environment with other vendor interfaces. For an ill characterized problem isolated to one environment this seems like a workaround that should not be part of the general code base. -K sean On Tue, Jan 20, 2015 at 3:46 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hans, On Tue, Jan 20, 2015 at 10:37:52AM +0100, Hans Petter Selasky wrote: H It is not very hard to update existing callout clients and you can do it H too, if you need the extra bits of performance. If it is not very hard, then you should have done that as part of your change. H Are there more API's than the TCP stack which you think needs an update H and are performance critical? This is the question that you should have invetigated yourself before pushing the change in. -- Totus tuus, Glebius. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Sean, Was this really Reviewed by: sbruno@ or just Tested by: sbruno@? I was very happy to see so many reviewers on the original commit but you seem to be the only one still left on the list. Regards, Navdeep On 01/20/15 15:35, Sean Bruno wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQF8BAEBCgBmBQJUvuYrXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kJTMIAMfh6ghV/AwQauY+a44q1hjJ WC7E3u69FK0opgSYg71kk6HckbyB+sTWND6HdXnpyrcMLXUt74zlB8c48wbUUV5+ EwKNYzGNNnDNhoc0WtPMect8e9Y1kBRvSGfHBdVrqATXfPOyZEa+i4lQAXpiFKIt nndqVrAH7bJM6143YDpnIg7vaR+8IQnC2ztSP4ogJzh03DZ7zVsg4BsoCPg50eVZ kr46cXcE+SP/TsQBsVNVwRJD5NFie6QJdLoTnwkd0XfQGJMIWivNgUcE4tIxqPIf nGQ0xMJCotpNLuPtzYzCIurSaDHdHmL6bjkhjTtBmWsMNfdGH8TKoih79GDxkTg= =Y3rd -END PGP SIGNATURE- ___ svn-src-head@freebsd.org mailing list
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQF8BAEBCgBmBQJUvuYrXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kJTMIAMfh6ghV/AwQauY+a44q1hjJ WC7E3u69FK0opgSYg71kk6HckbyB+sTWND6HdXnpyrcMLXUt74zlB8c48wbUUV5+ EwKNYzGNNnDNhoc0WtPMect8e9Y1kBRvSGfHBdVrqATXfPOyZEa+i4lQAXpiFKIt nndqVrAH7bJM6143YDpnIg7vaR+8IQnC2ztSP4ogJzh03DZ7zVsg4BsoCPg50eVZ kr46cXcE+SP/TsQBsVNVwRJD5NFie6QJdLoTnwkd0XfQGJMIWivNgUcE4tIxqPIf nGQ0xMJCotpNLuPtzYzCIurSaDHdHmL6bjkhjTtBmWsMNfdGH8TKoih79GDxkTg= =Y3rd -END PGP SIGNATURE- ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Are any other drivers hitting this? e.g. cxgb/cxgbe? -K On Tue, Jan 20, 2015 at 3:46 PM, Sean Bruno sbr...@ignoranthack.me wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 01/20/15 15:40, K. Macy wrote: I think you're working around driver locking bugs by crippling the callout code. -K We had zero evidence of this. What leads you down that path? I'm totally open to being wrong, e.g. yeah, you slowed down things so that you don't hit a race condition sean On Tue, Jan 20, 2015 at 3:35 PM, Sean Bruno sbr...@ignoranthack.me wrote: On 01/20/15 14:30, Hans Petter Selasky wrote: On 01/20/15 22:11, Gleb Smirnoff wrote: On Tue, Jan 20, 2015 at 09:51:26AM +0200, Konstantin Belousov wrote: K Like stated in the manual page, callout_reset_curcpu/on() does not work K with MPSAFE callouts any more! K I.e. you 'fixed' some undeterminate bugs in callout migration by not K doing migration at all anymore. K K K You need to use callout_init_{mtx,rm,rw} and remove the custom locking K inside the callback in the TCP stack to get it working like before! K K No, you need to do this, if you think that whole callout KPI must be K rototiled. It is up to the person who modifies the KPI, to ensure that K existing code is not broken. K K As I understand, currently we are back to the one-cpu callouts. K Do other people consider this situation acceptable ? I think this isn't acceptable. The commit to a complex subsystem lacked a review from persons involved in the system before. The commit to subsystem broke consumers of the subsystem and this was even done not accidentially, but due to Hans not caring about it. As for me this is enough to request a backout, and let the change back in only after proper review. Hi Gleb, Backing out my callout API patch means we will for sure re-introduce an unknown callout spinlock hang, as noted to me by several people. What do you think about that? dram Maybe Jason Wolfe CC'ed can add to 10-stable w/o my patches: Jason picked up this patch for work and it resolved our instability issues that had remained unsolved for quite some time as reported to freebsd-net: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html This had gone undiagnosed for some time (even with the gracious help of jhb in offline emails, thanks btw!). There's some diagnostics in that email thread that may be of value to you folks for determination of the validity of changing the callout API or at least understanding why we were involved in diagnostics. While I'd sure love to tune performance, the fact that our machines were basically going out to lunch without these changes, probably means that others were seeing it and didn't know what else to do. As much as I enjoy a good break out the pitch forks and torches email thread, this increased stability for us and is allowing us to upgrade from freebsd8 to freebsd10. Bear this in mind when you throw your voice in favor of reverting. int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; + cpu = timeout_cpu; /* XXX test code XXX */ cancelled = 0; Jason or I would have to run this in production, which would be problematic I fear. We never had a deterministic test case that would exhibit the reported failure. We merely tested in production and saw that panics ceased. We didn't note a dropoff in our traffic either, perhaps we are not as efficient as others in this corner case, but we were consistently seeing the spinlock hangs after a day or so of traffic. And see if he observes a callout spinlock hang or not on his test setup. The patch above should force all callouts to the same thread basically. Then we could maybe see if single threading the callouts has anything to do with solving the spinlock hang. The rewritten callout API still has all the features and capabilities the old one had, when used as described in man 9 callout. At the present moment I'm not technically convinced a backout is correct. Neither am I, to be honest. Just based on *results*. Gleb: I think we would see far better results with high speed internet links using TCP if we could extend the LRO (large receive offload) code to accumulate more than 64KBytes worth of data per call to the TCP stack instead of complaining about some callouts ending up on the same thread! Actually I have a patch for that. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQF8BAEBCgBmBQJUvujmXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 19 January 2015 at 20:30, Hans Petter Selasky h...@selasky.org wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! Hm! How many places in the kernel did you leave like this? :P I mean, I'm glad to have stuff be forced to be cleaned up, but you didn't even leave a KASSERT or a debug warning that something unsupported is being done. I'm sure I'm not going to be the first person to be caught out like this. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! Would you please give me a hand with this? I've sunk a lot of (unpaid, personal) spare time into getting the RSS stuff into shape and now a lot of it just plainly doesn't do anything. :( -adrian ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! Thank you! --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/19/15 23:22, Adrian Chadd wrote: In my instance, I'm seeing quite a lot of lock contention between the userland threads, the network RX threads and the clock thread, all contending on a single callout wheel being used for TCP timers. One of the early goals for fixing up the RSS stuff in -HEAD was to make per-CPU (Well, per-RSS-bucket really) TCP timers not contend with the NIC RX processing path, and now it does.:( man 9 callout --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/20/15 06:22, Adrian Chadd wrote: Sweet, thanks. I'l test it, but anything that changes the locking to TCP is going to need a more thorough review. The there be dragons disclaimer is appropriate.:) No changes in locking - simply some minor code reordering. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/20/15 08:51, Konstantin Belousov wrote: On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? Hi Konstantin, Please read the callout 9 manual page first. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hi, Have a look here: https://reviews.freebsd.org/D1563 Give me a hand and test and review this patch properly! --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! I.e. you 'fixed' some undeterminate bugs in callout migration by not doing migration at all anymore. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! No, you need to do this, if you think that whole callout KPI must be rototiled. It is up to the person who modifies the KPI, to ensure that existing code is not broken. As I understand, currently we are back to the one-cpu callouts. Do other people consider this situation acceptable ? ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/20/15 06:04, Adrian Chadd wrote: On 19 January 2015 at 20:30, Hans Petter Selasky h...@selasky.org wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! Hm! Hi Adrian, How many places in the kernel did you leave like this? :P :-) I mean, I'm glad to have stuff be forced to be cleaned up, but you didn't even leave a KASSERT or a debug warning that something unsupported is being done. I'm sure I'm not going to be the first person to be caught out like this. MPSAFE is still valid and fully useable and can be used with callout_reset_curcpu/on(), but the callout CPU will remain at zero. There is no need for a KASSERT() yet. You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! Would you please give me a hand with this? I've sunk a lot of (unpaid, personal) spare time into getting the RSS stuff into shape and now a lot of it just plainly doesn't do anything. :( I'll send you a patch in an hours time from now for 11-current. This should be fairly trivial and then you can test and review it! --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 19 January 2015 at 21:20, Hans Petter Selasky h...@selasky.org wrote: On 01/20/15 06:04, Adrian Chadd wrote: On 19 January 2015 at 20:30, Hans Petter Selasky h...@selasky.org wrote: On 01/19/15 22:59, Adrian Chadd wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, Hi, Like stated in the manual page, callout_reset_curcpu/on() does not work with MPSAFE callouts any more! Hm! Hi Adrian, How many places in the kernel did you leave like this? :P :-) I mean, I'm glad to have stuff be forced to be cleaned up, but you didn't even leave a KASSERT or a debug warning that something unsupported is being done. I'm sure I'm not going to be the first person to be caught out like this. MPSAFE is still valid and fully useable and can be used with callout_reset_curcpu/on(), but the callout CPU will remain at zero. There is no need for a KASSERT() yet. Right, but people won't know that their callout won't be scheduled on the CPU they've asked for and there's no way to get notified of that. So something debug-y may be useful, even to make it easy to track down situations where sometimes it succeeds and sometimes it fails (because well, there's lots of places in the kernel where locking is .. suboptimal.) You need to use callout_init_{mtx,rm,rw} and remove the custom locking inside the callback in the TCP stack to get it working like before! Would you please give me a hand with this? I've sunk a lot of (unpaid, personal) spare time into getting the RSS stuff into shape and now a lot of it just plainly doesn't do anything. :( I'll send you a patch in an hours time from now for 11-current. This should be fairly trivial and then you can test and review it! Sweet, thanks. I'l test it, but anything that changes the locking to TCP is going to need a more thorough review. The there be dragons disclaimer is appropriate. :) -adrian ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, -adrian On 15 January 2015 at 07:32, Hans Petter Selasky hsela...@freebsd.org wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: - Close a migration race where callout_reset() failed to set the CALLOUT_ACTIVE flag. - Callout callback functions are now allowed to be protected by spinlocks. - Switching the callout CPU number cannot always be done on a per-callout basis. See the updated timeout(9) manual page for more information. - The timeout(9) manual page has been updated to reflect how all the functions inside the callout API are working. The manual page has been made function oriented to make it easier to deduce how each of the functions making up the callout API are working without having to first read the whole manual page. Group all functions into a handful of sections which should give a quick top-level overview when the different functions should be used. - The CALLOUT_SHAREDLOCK flag and its functionality has been removed to reduce the complexity in the callout code and to avoid problems about atomically stopping callouts via callout_stop(). If someone needs it, it can be re-added. From my quick grep there are no CALLOUT_SHAREDLOCK clients in the kernel. - A new callout API function named callout_drain_async() has been added. See the updated timeout(9) manual page for a complete description. - Update the callout clients in the kern/ folder to use the callout API properly, like cv_timedwait(). Previously there was some custom sleepqueue code in the callout subsystem, which has been removed, because we now allow callouts to be protected by spinlocks. This allows us to tear down the callout like done with regular mutexes, and a td_slpmutex has been added to struct thread to atomically teardown the td_slpcallout. Further the TDF_TIMOFAIL and SWT_SLEEPQTIMO states can now be completely removed. Currently they are marked as available and will be cleaned up in a follow up commit. - Bump the __FreeBSD_version to indicate kernel modules need recompilation. - There has been several reports that this patch seems to squash a serious bug leading to a callout timeout and panic. Kernel build testing: all architectures were built MFC after:2 weeks Differential Revision:https://reviews.freebsd.org/D1438 Sponsored by: Mellanox Technologies Reviewed by: jhb, adrian, sbruno and emaste Modified: head/share/man/man9/Makefile head/share/man/man9/timeout.9 head/sys/kern/init_main.c head/sys/kern/kern_condvar.c head/sys/kern/kern_lock.c head/sys/kern/kern_switch.c head/sys/kern/kern_synch.c head/sys/kern/kern_thread.c head/sys/kern/kern_timeout.c head/sys/kern/subr_sleepqueue.c head/sys/ofed/include/linux/completion.h head/sys/sys/_callout.h head/sys/sys/callout.h head/sys/sys/param.h head/sys/sys/proc.h Modified: head/share/man/man9/Makefile == --- head/share/man/man9/MakefileThu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/MakefileThu Jan 15 15:32:30 2015 (r277213) @@ -1570,6 +1570,7 @@ MLINKS+=timeout.9 callout.9 \ timeout.9 callout_active.9 \ timeout.9 callout_deactivate.9 \ timeout.9 callout_drain.9 \ + timeout.9 callout_drain_async.9 \ timeout.9 callout_handle_init.9 \ timeout.9 callout_init.9 \ timeout.9 callout_init_mtx.9 \ Modified: head/share/man/man9/timeout.9 == --- head/share/man/man9/timeout.9 Thu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/timeout.9 Thu Jan 15 15:32:30 2015 (r277213) @@ -29,13 +29,14 @@ .\ .\ $FreeBSD$ .\ -.Dd October 8, 2014 +.Dd January 14, 2015 .Dt TIMEOUT 9 .Os .Sh NAME .Nm callout_active , .Nm callout_deactivate , .Nm callout_drain , +.Nm callout_drain_async , .Nm callout_handle_init , .Nm callout_init , .Nm callout_init_mtx , @@ -63,279 +64,232 @@ .In sys/systm.h .Bd -literal typedef void timeout_t (void *); +typedef void callout_func_t (void *); .Ed -.Ft int -.Fn callout_active struct callout *c -.Ft void -.Fn callout_deactivate struct callout *c -.Ft int -.Fn callout_drain struct callout *c -.Ft void -.Fn
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Yeah, it looks like you set c_cpu to timeout_cpu in _callout_init_locked(), but then you only handle the case of the CPU being changed in certain circumstances. You aren't handling the CPU being initialised when the callout is first added. And, callout_restart_async() calls callout_lock(), which calls CC_LOCK() on the callout CPU, which initially is zero. Then, it never seems to be 'moved' into the correct CPU, even though it's being called with a CPU id. So, if I am reading this all correctly, you aren't really handling multi CPU callwheels at all. ;) In my instance, I'm seeing quite a lot of lock contention between the userland threads, the network RX threads and the clock thread, all contending on a single callout wheel being used for TCP timers. One of the early goals for fixing up the RSS stuff in -HEAD was to make per-CPU (Well, per-RSS-bucket really) TCP timers not contend with the NIC RX processing path, and now it does. :( -adrian On 19 January 2015 at 13:59, Adrian Chadd adr...@freebsd.org wrote: Hi, Would you please check what the results of this are with CPU specific callwheels? I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on ixgbe) and with this setup, the per-CPU TCP callwheel stuff is enabled. But all the callwheels are now back on clock(0) and so is the lock contention. :( Thanks, -adrian On 15 January 2015 at 07:32, Hans Petter Selasky hsela...@freebsd.org wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: - Close a migration race where callout_reset() failed to set the CALLOUT_ACTIVE flag. - Callout callback functions are now allowed to be protected by spinlocks. - Switching the callout CPU number cannot always be done on a per-callout basis. See the updated timeout(9) manual page for more information. - The timeout(9) manual page has been updated to reflect how all the functions inside the callout API are working. The manual page has been made function oriented to make it easier to deduce how each of the functions making up the callout API are working without having to first read the whole manual page. Group all functions into a handful of sections which should give a quick top-level overview when the different functions should be used. - The CALLOUT_SHAREDLOCK flag and its functionality has been removed to reduce the complexity in the callout code and to avoid problems about atomically stopping callouts via callout_stop(). If someone needs it, it can be re-added. From my quick grep there are no CALLOUT_SHAREDLOCK clients in the kernel. - A new callout API function named callout_drain_async() has been added. See the updated timeout(9) manual page for a complete description. - Update the callout clients in the kern/ folder to use the callout API properly, like cv_timedwait(). Previously there was some custom sleepqueue code in the callout subsystem, which has been removed, because we now allow callouts to be protected by spinlocks. This allows us to tear down the callout like done with regular mutexes, and a td_slpmutex has been added to struct thread to atomically teardown the td_slpcallout. Further the TDF_TIMOFAIL and SWT_SLEEPQTIMO states can now be completely removed. Currently they are marked as available and will be cleaned up in a follow up commit. - Bump the __FreeBSD_version to indicate kernel modules need recompilation. - There has been several reports that this patch seems to squash a serious bug leading to a callout timeout and panic. Kernel build testing: all architectures were built MFC after:2 weeks Differential Revision:https://reviews.freebsd.org/D1438 Sponsored by: Mellanox Technologies Reviewed by: jhb, adrian, sbruno and emaste Modified: head/share/man/man9/Makefile head/share/man/man9/timeout.9 head/sys/kern/init_main.c head/sys/kern/kern_condvar.c head/sys/kern/kern_lock.c head/sys/kern/kern_switch.c head/sys/kern/kern_synch.c head/sys/kern/kern_thread.c head/sys/kern/kern_timeout.c head/sys/kern/subr_sleepqueue.c head/sys/ofed/include/linux/completion.h head/sys/sys/_callout.h head/sys/sys/callout.h head/sys/sys/param.h head/sys/sys/proc.h Modified: head/share/man/man9/Makefile == --- head/share/man/man9/MakefileThu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/MakefileThu Jan 15 15:32:30 2015 (r277213) @@ -1570,6 +1570,7 @@ MLINKS+=timeout.9 callout.9 \ timeout.9 callout_active.9 \ timeout.9 callout_deactivate.9 \ timeout.9 callout_drain.9 \ + timeout.9
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 15 Jan 2015, at 15:32 , Hans Petter Selasky hsela...@freebsd.org wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: I see this for i386 and amd64 LINT* kernels. Is that a local problem or did you miss something? linking kernel addr.o: In function `set_timeout': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/addr.c:(.text+0xd37): undefined reference to `_callout_stop_safe' mad.o: In function `ib_unregister_mad_agent': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad.c:(.text+0x1ed4): undefined reference to `_callout_stop_safe' mad.o: In function `ib_mad_complete_send_wr': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad.c:(.text+0x3ea0): undefined reference to `_callout_stop_safe' cm.o: In function `ib_cm_cleanup': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/cm.c:(.text+0x40cf): undefined reference to `_callout_stop_safe' mad_rmpp.o: In function `ib_cancel_rmpp_recvs': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad_rmpp.c:(.text+0xc2): undefined reference to `_callout_stop_safe' mad_rmpp.o:/scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad_rmpp.c:(.text+0xe9): more undefined references to `_callout_stop_safe' follow --- kernel --- *** [kernel] Error code 1 — Bjoern A. Zeeb Charles Haddon Spurgeon: Friendship is one of the sweetest joys of life. Many might have failed beneath the bitterness of their trial had they not found a friend. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Thu, Jan 15, 2015 at 05:05:58PM +0100, Hans Petter Selasky wrote: The Reviewed by was simply a CP of the review list in from the Differential Revision. When you mention it should probably simply have said: Reviewed by: sbruno @ Due to the meaning of Reviewed by in commit messages. Sorry. BTW: Nice if people respond quickly or remove themselves from the differential reviews if they don't plan to do any reviews. I think it's kind of wrong. Instead, lack of reviews should be treated as lack of reviews (and thus blocking commit), not that people are not interested. ./danfe P.S. BTW, I still didn't receive your answer on an email I sent Jan 8th. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Thu, Jan 15, 2015 at 08:14:30PM +0300, Gleb Smirnoff wrote: T I'd dare to say that such important change simply cannot be committed T with a thorough review from at least two people very confident in this T area. Of course, I typoed. I meant cannot be committed withOUT a thorough review. -- Totus tuus, Glebius. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On Thu, Jan 15, 2015 at 7:32 AM, Hans Petter Selasky hsela...@freebsd.org wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: I plan to review this as well -- although I didn't have time for it lately. Probably I'll be able to take a closer look during the weekend. I missed it because I'm terribly late with my backlog and I wasn't listed as reviewer so the mail didn't hit my mailbox. I think this is a huge improvement, for many reason I'll try to outline in my following review. OTOH, considering how complex is the subsystem, I would like you to take a look at Peter's (pho@) stress suite and run on that. He has an handful of scenarios that cause problems in migration an callout in general. Thanks for undertaking this, -- Davide There are no solved problems; there are only problems that are more or less solved -- Henri Poincare ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 1/15/15 12:15 PM, Gleb Smirnoff wrote: On Thu, Jan 15, 2015 at 08:14:30PM +0300, Gleb Smirnoff wrote: T I'd dare to say that such important change simply cannot be committed T with a thorough review from at least two people very confident in this T area. Of course, I typoed. I meant cannot be committed withOUT a thorough review. There is something to be said for people taking time to review things and not wanting to be held up forever. However, I think it's been a clear practice with all other changes reviewed in phabric to date that the committer only lists people in 'Reviewed by' who actually signed off on the patch, not just the list of people asked to review it. -- John Baldwin ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Hans, On Thu, Jan 15, 2015 at 05:05:58PM +0100, Hans Petter Selasky wrote: H Eh, I have not reviewed this at all. (I still plan to though.) H H The Reviewed by was simply a CP of the review list in from the H Differential Revision. When you mention it should probably simply have said: H H Reviewed by: sbruno @ H H Due to the meaning of Reviewed by in commit messages. Sorry. H H BTW: Nice if people respond quickly or remove themselves from the H differential reviews if they don't plan to do any reviews. I'd dare to say that such important change simply cannot be committed with a thorough review from at least two people very confident in this area. Look at r24. This is the way it should be done. -- Totus tuus, Glebius. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: - Close a migration race where callout_reset() failed to set the CALLOUT_ACTIVE flag. - Callout callback functions are now allowed to be protected by spinlocks. - Switching the callout CPU number cannot always be done on a per-callout basis. See the updated timeout(9) manual page for more information. - The timeout(9) manual page has been updated to reflect how all the functions inside the callout API are working. The manual page has been made function oriented to make it easier to deduce how each of the functions making up the callout API are working without having to first read the whole manual page. Group all functions into a handful of sections which should give a quick top-level overview when the different functions should be used. - The CALLOUT_SHAREDLOCK flag and its functionality has been removed to reduce the complexity in the callout code and to avoid problems about atomically stopping callouts via callout_stop(). If someone needs it, it can be re-added. From my quick grep there are no CALLOUT_SHAREDLOCK clients in the kernel. - A new callout API function named callout_drain_async() has been added. See the updated timeout(9) manual page for a complete description. - Update the callout clients in the kern/ folder to use the callout API properly, like cv_timedwait(). Previously there was some custom sleepqueue code in the callout subsystem, which has been removed, because we now allow callouts to be protected by spinlocks. This allows us to tear down the callout like done with regular mutexes, and a td_slpmutex has been added to struct thread to atomically teardown the td_slpcallout. Further the TDF_TIMOFAIL and SWT_SLEEPQTIMO states can now be completely removed. Currently they are marked as available and will be cleaned up in a follow up commit. - Bump the __FreeBSD_version to indicate kernel modules need recompilation. - There has been several reports that this patch seems to squash a serious bug leading to a callout timeout and panic. Kernel build testing: all architectures were built MFC after:2 weeks Differential Revision:https://reviews.freebsd.org/D1438 Sponsored by: Mellanox Technologies Reviewed by: jhb, adrian, sbruno and emaste Modified: head/share/man/man9/Makefile head/share/man/man9/timeout.9 head/sys/kern/init_main.c head/sys/kern/kern_condvar.c head/sys/kern/kern_lock.c head/sys/kern/kern_switch.c head/sys/kern/kern_synch.c head/sys/kern/kern_thread.c head/sys/kern/kern_timeout.c head/sys/kern/subr_sleepqueue.c head/sys/ofed/include/linux/completion.h head/sys/sys/_callout.h head/sys/sys/callout.h head/sys/sys/param.h head/sys/sys/proc.h Modified: head/share/man/man9/Makefile == --- head/share/man/man9/MakefileThu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/MakefileThu Jan 15 15:32:30 2015 (r277213) @@ -1570,6 +1570,7 @@ MLINKS+=timeout.9 callout.9 \ timeout.9 callout_active.9 \ timeout.9 callout_deactivate.9 \ timeout.9 callout_drain.9 \ + timeout.9 callout_drain_async.9 \ timeout.9 callout_handle_init.9 \ timeout.9 callout_init.9 \ timeout.9 callout_init_mtx.9 \ Modified: head/share/man/man9/timeout.9 == --- head/share/man/man9/timeout.9 Thu Jan 15 14:47:48 2015 (r277212) +++ head/share/man/man9/timeout.9 Thu Jan 15 15:32:30 2015 (r277213) @@ -29,13 +29,14 @@ .\ .\ $FreeBSD$ .\ -.Dd October 8, 2014 +.Dd January 14, 2015 .Dt TIMEOUT 9 .Os .Sh NAME .Nm callout_active , .Nm callout_deactivate , .Nm callout_drain , +.Nm callout_drain_async , .Nm callout_handle_init , .Nm callout_init , .Nm callout_init_mtx , @@ -63,279 +64,232 @@ .In sys/systm.h .Bd -literal typedef void timeout_t (void *); +typedef void callout_func_t (void *); .Ed -.Ft int -.Fn callout_active struct callout *c -.Ft void -.Fn callout_deactivate struct callout *c -.Ft int -.Fn callout_drain struct callout *c -.Ft void -.Fn callout_handle_init struct callout_handle *handle -.Bd -literal -struct callout_handle handle = CALLOUT_HANDLE_INITIALIZER(handle); -.Ed -.Ft void -.Fn callout_init struct callout *c int mpsafe -.Ft void -.Fn callout_init_mtx struct callout *c struct mtx *mtx int flags -.Ft void -.Fn callout_init_rm struct callout *c struct rmlock *rm int flags -.Ft void -.Fn callout_init_rw struct callout *c struct rwlock *rw int flags -.Ft int -.Fn callout_pending struct callout *c -.Ft int -.Fn callout_reset struct callout *c int ticks timeout_t
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 15 January 2015 at 10:53, John Baldwin j...@freebsd.org wrote: On 1/15/15 10:32 AM, Hans Petter Selasky wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Eh, I have not reviewed this at all. (I still plan to though.) Yes, the list is people listed as reviewers or subscribers on the Phabricator review. I don't believe anyone in the list has examined it closely enough to count as Reviewed-by. I certainly haven't. ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/15/15 16:53, John Baldwin wrote: On 1/15/15 10:32 AM, Hans Petter Selasky wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Hi, Eh, I have not reviewed this at all. (I still plan to though.) The Reviewed by was simply a CP of the review list in from the Differential Revision. When you mention it should probably simply have said: Reviewed by: sbruno @ Due to the meaning of Reviewed by in commit messages. Sorry. BTW: Nice if people respond quickly or remove themselves from the differential reviews if they don't plan to do any reviews. --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 1/15/15 10:32 AM, Hans Petter Selasky wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: - Close a migration race where callout_reset() failed to set the CALLOUT_ACTIVE flag. - Callout callback functions are now allowed to be protected by spinlocks. - Switching the callout CPU number cannot always be done on a per-callout basis. See the updated timeout(9) manual page for more information. - The timeout(9) manual page has been updated to reflect how all the functions inside the callout API are working. The manual page has been made function oriented to make it easier to deduce how each of the functions making up the callout API are working without having to first read the whole manual page. Group all functions into a handful of sections which should give a quick top-level overview when the different functions should be used. - The CALLOUT_SHAREDLOCK flag and its functionality has been removed to reduce the complexity in the callout code and to avoid problems about atomically stopping callouts via callout_stop(). If someone needs it, it can be re-added. From my quick grep there are no CALLOUT_SHAREDLOCK clients in the kernel. - A new callout API function named callout_drain_async() has been added. See the updated timeout(9) manual page for a complete description. - Update the callout clients in the kern/ folder to use the callout API properly, like cv_timedwait(). Previously there was some custom sleepqueue code in the callout subsystem, which has been removed, because we now allow callouts to be protected by spinlocks. This allows us to tear down the callout like done with regular mutexes, and a td_slpmutex has been added to struct thread to atomically teardown the td_slpcallout. Further the TDF_TIMOFAIL and SWT_SLEEPQTIMO states can now be completely removed. Currently they are marked as available and will be cleaned up in a follow up commit. - Bump the __FreeBSD_version to indicate kernel modules need recompilation. - There has been several reports that this patch seems to squash a serious bug leading to a callout timeout and panic. Kernel build testing: all architectures were built MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D1438 Sponsored by: Mellanox Technologies Reviewed by:jhb, adrian, sbruno and emaste Eh, I have not reviewed this at all. (I still plan to though.) -- John Baldwin ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org
Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
On 01/15/15 19:46, Bjoern A. Zeeb wrote: On 15 Jan 2015, at 15:32 , Hans Petter Selasky hsela...@freebsd.org wrote: Author: hselasky Date: Thu Jan 15 15:32:30 2015 New Revision: 277213 URL: https://svnweb.freebsd.org/changeset/base/277213 Log: Major callout subsystem cleanup and rewrite: I see this for i386 and amd64 LINT* kernels. Is that a local problem or did you miss something? linking kernel addr.o: In function `set_timeout': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/addr.c:(.text+0xd37): undefined reference to `_callout_stop_safe' mad.o: In function `ib_unregister_mad_agent': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad.c:(.text+0x1ed4): undefined reference to `_callout_stop_safe' mad.o: In function `ib_mad_complete_send_wr': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad.c:(.text+0x3ea0): undefined reference to `_callout_stop_safe' cm.o: In function `ib_cm_cleanup': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/cm.c:(.text+0x40cf): undefined reference to `_callout_stop_safe' mad_rmpp.o: In function `ib_cancel_rmpp_recvs': /scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad_rmpp.c:(.text+0xc2): undefined reference to `_callout_stop_safe' mad_rmpp.o:/scratch/tmp/bz/head.svn/sys/ofed/drivers/infiniband/core/mad_rmpp.c:(.text+0xe9): more undefined references to `_callout_stop_safe' follow --- kernel --- *** [kernel] Error code 1 Hi, I think this is a local problem at your side. I get at r277215: --- kernel --- linking kernel objcopy --strip-debug kernel text data bssdec hex filename 28873399 10373165 1857376 41103940 0x2733244 kernel -- Kernel build for LINT completed on Thu Jan 15 23:07:58 IST 2015 -- When running: make -m $PWD/share/mk buildkernel WITH_OFED=YES TARGET_ARCH=i386 KERNCONF=LINT --HPS ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org