Re: Survey results very helpful, thanks! (was: Re: net.inet.tcp.timer_race: does anyone have a non-zero value?)

2010-03-09 Thread Doug Hardie

On 8 March 2010, at 12:33, Robert Watson wrote:

> 
> On Mon, 8 Mar 2010, Doug Hardie wrote:
> 
>> I run a number of 4 core systems with em interfaces.  These are production 
>> systems that are unmanned and located a long way from me.  Under unusual 
>> conditions it can take up to 6 hours to get there.  I have been waiting to 
>> switch to 8.0 because of the discussions on the em device and now it sounds 
>> like I had better just skip 8.x and wait for 9.  7.2 is working just fine.
> 
> Not sure that any information in this survey thread should be relevant to 
> that decision.  This race has existed since before FreeBSD, having appeared 
> in the original BSD network stack, and is just as present in FreeBSD 7.x as 
> 8.x or 9.x.  When I learned about the race during the early 7.x development 
> cycle, I added a counter/statistic to measure how much it happened in 
> practice, but was not able to exercise it in my testing, and so left the 
> counter in to appear in 7.0 and later so that we could perform this survey as 
> core counts/etc increase.
> 
> The two likely outcomes were "it is never exercised" and "it is exercised but 
> only very infrequently", neither really justifying the quite complex change 
> to correct it given requirements at the time.  Ongoing development work on 
> the virtual network stack is what justifies correcting the bug at this point, 
> moving from detecting and handling the race to preventing it from occurring as 
> an invariant.  The motivation here, BTW, is that we'd like to eliminate the 
> type-stable storage requirement for connection state (which ensures that 
> memory once used for a connection block is only ever used for connection 
> blocks in the future), allowing memory to be fully freed when a virtual 
> network stack is destroyed.  Using type-stable storage helped address this 
> bug, but was primarily present to reduce the overhead of monitoring using 
> netstat(1).  We'll now need to use a slightly more expensive solution (true 
> reference counts) in that context, although in practice it will almost 
> certainly be an unmeasurable cost.
> 
> Which is to say that while there might be something in the em/altq/... thread 
> to reasonably lead you to avoid 8.0, nothing in the TCP timer race thread 
> should do so, since it affects 7.2 just as much as 8.0.  Even if you do see a 
> non-zero counter, that's not a matter for operational concern, just useful 
> from the perspective of a network stack developer for understanding timing and 
> behaviors in the stack.  :-)


Thanks for the complete explanation.  I don't believe the ALTQ issue will 
affect me.  I am not currently using it and do not expect to in the near 
future.  In addition, there was a posting that a fix for at least part of that 
will be added in a week or so.  Given all that it appears its time to start the 
planning/testing process for 8.


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Survey results very helpful, thanks! (was: Re: net.inet.tcp.timer_race: does anyone have a non-zero value?)

2010-03-08 Thread Robert Watson


On Mon, 8 Mar 2010, Doug Hardie wrote:

I run a number of 4 core systems with em interfaces.  These are production 
systems that are unmanned and located a long way from me.  Under unusual 
conditions it can take up to 6 hours to get there.  I have been waiting to 
switch to 8.0 because of the discussions on the em device and now it sounds 
like I had better just skip 8.x and wait for 9.  7.2 is working just fine.


Not sure that any information in this survey thread should be relevant to that 
decision.  This race has existed since before FreeBSD, having appeared in the 
original BSD network stack, and is just as present in FreeBSD 7.x as 8.x or 
9.x.  When I learned about the race during the early 7.x development cycle, I 
added a counter/statistic to measure how much it happened in practice, but was 
not able to exercise it in my testing, and so left the counter in to appear in 
7.0 and later so that we could perform this survey as core counts/etc 
increase.


The two likely outcomes were "it is never exercised" and "it is exercised but 
only very infrequently", neither really justifying the quite complex change to 
correct it given requirements at the time.  Ongoing development work on the 
virtual network stack is what justifies correcting the bug at this point, 
moving from detecting and handling the race to preventing it from occurring as 
an invariant.  The motivation here, BTW, is that we'd like to eliminate the 
type-stable storage requirement for connection state (which ensures that 
memory once used for a connection block is only ever used for connection 
blocks in the future), allowing memory to be fully freed when a virtual 
network stack is destroyed.  Using type-stable storage helped address this 
bug, but was primarily present to reduce the overhead of monitoring using 
netstat(1).  We'll now need to use a slightly more expensive solution (true 
reference counts) in that context, although in practice it will almost 
certainly be an unmeasurable cost.
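As a rough userland illustration of the trade-off described above, a true reference count lets the memory be genuinely freed once the last holder (e.g. a timer or a monitoring pass) drops its reference, rather than being recycled from a type-stable pool. All names here are hypothetical and invented for the sketch; this is not the real tcpcb/inpcb API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical connection block -- illustrative only, not the real
 * FreeBSD connection-state layout. */
struct conn {
	int refcount;		/* true reference count, replacing type-stable reuse */
	int local_port;
};

static struct conn *
conn_alloc(int port)
{
	struct conn *c = malloc(sizeof(*c));

	c->refcount = 1;	/* the connection table holds the first reference */
	c->local_port = port;
	return (c);
}

/* Acquire an extra reference before handing the block to a timer
 * or a monitoring consumer. */
static void
conn_ref(struct conn *c)
{
	c->refcount++;
}

/* Drop a reference; the memory is truly freed only at zero, so it can
 * be returned to the system when its owner (e.g. a virtual network
 * stack) is destroyed.  Returns 1 if the block was freed. */
static int
conn_unref(struct conn *c)
{
	if (--c->refcount == 0) {
		free(c);
		return (1);
	}
	return (0);
}
```

A real kernel version would of course use atomic operations or locking around the count; the sketch only shows the lifetime rule.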


Which is to say that while there might be something in the em/altq/... thread 
to reasonably lead you to avoid 8.0, nothing in the TCP timer race thread 
should do so, since it affects 7.2 just as much as 8.0.  Even if you do see a 
non-zero counter, that's not a matter for operational concern, just useful 
from the perspective of a network stack developer for understanding timing and 
behaviors in the stack.  :-)


Robert


Re: Survey results very helpful, thanks! (was: Re: net.inet.tcp.timer_race: does anyone have a non-zero value?)

2010-03-08 Thread Doug Hardie

On 8 March 2010, at 06:53, Robert Watson wrote:

> 
> On Sun, 7 Mar 2010, Robert Watson wrote:
> 
>> If your system shows a non-zero value, please send me a *private e-mail* 
>> with the output of that command, plus also the output of "sysctl kern.smp", 
>> "uptime", and a brief description of the workload and network interface 
>> configuration.  For example: it's a busy 8-core web server with roughly X 
>> connections/second, and that has three em network interfaces used to load 
>> balance from an upstream source.  IPSEC is used for management purposes (but 
>> not bulk traffic), and there's a local MySQL database.
> 
> I've now received a number of reports that confirm our suspicion that the 
> race does occur, albeit very rarely, and particularly on systems with many 
> cores or multiple network interfaces.  Fixing it is definitely on the TODO 
> for 9.0, both to improve our ability to do multiple virtual network stacks, 
> but with an appropriately scalable fix in mind given our improved TCP 
> scalability for 9.0 as well.

I run a number of 4 core systems with em interfaces.  These are production 
systems that are unmanned and located a long way from me.  Under unusual 
conditions it can take up to 6 hours to get there.  I have been waiting to 
switch to 8.0 because of the discussions on the em device and now it sounds 
like I had better just skip 8.x and wait for 9.  7.2 is working just fine.


Survey results very helpful, thanks! (was: Re: net.inet.tcp.timer_race: does anyone have a non-zero value?)

2010-03-08 Thread Robert Watson


On Sun, 7 Mar 2010, Robert Watson wrote:

If your system shows a non-zero value, please send me a *private e-mail* 
with the output of that command, plus also the output of "sysctl kern.smp", 
"uptime", and a brief description of the workload and network interface 
configuration.  For example: it's a busy 8-core web server with roughly X 
connections/second, and that has three em network interfaces used to load 
balance from an upstream source.  IPSEC is used for management purposes (but 
not bulk traffic), and there's a local MySQL database.


I've now received a number of reports that confirm our suspicion that the race 
does occur, albeit very rarely, and particularly on systems with many cores or 
multiple network interfaces.  Fixing it is definitely on the TODO for 9.0, 
both to improve our ability to do multiple virtual network stacks, but with an 
appropriately scalable fix in mind given our improved TCP scalability for 9.0 
as well.


Thanks for all the responses,

Robert N M Watson
Computer Laboratory
University of Cambridge


Re: net.inet.tcp.timer_race: does anyone have a non-zero value?

2010-03-07 Thread Robert N. M. Watson

On Mar 7, 2010, at 12:33 PM, Mikolaj Golub wrote:

> On Sun, 7 Mar 2010 11:59:35 + (GMT) Robert Watson wrote:
> 
>> Please check the results of the following command:
>> 
>>  % sysctl net.inet.tcp.timer_race
>>  net.inet.tcp.timer_race: 0
> 
> Would the results for FreeBSD 7 be interesting to you? We currently have
> mostly FreeBSD 7.1 hosts in production, and I observe non-zero values on 8
> hosts (about 15%). I can send more details to you privately if you are
> interested.

Yes, 7.x is also of interest, thanks!

Robert


Re: net.inet.tcp.timer_race: does anyone have a non-zero value?

2010-03-07 Thread Mikolaj Golub
On Sun, 7 Mar 2010 11:59:35 + (GMT) Robert Watson wrote:

> Please check the results of the following command:
>
>   % sysctl net.inet.tcp.timer_race
>   net.inet.tcp.timer_race: 0

Would the results for FreeBSD 7 be interesting to you? We currently have
mostly FreeBSD 7.1 hosts in production, and I observe non-zero values on 8
hosts (about 15%). I can send more details to you privately if you are
interested.

-- 
Mikolaj Golub


net.inet.tcp.timer_race: does anyone have a non-zero value?

2010-03-07 Thread Robert Watson


Dear all:

I'm embarking on some new network stack locking work, which requires me to 
address a number of loose ends in the current model.  A few years ago, my 
attention was drawn to a largely theoretical race, which had existed in the BSD 
code since inception.  It is detected and handled in practice, but relies on 
type stability of TCP connection data structures, which will need to change in 
the future due to on-going virtualization work.  I didn't fix it at the time, 
but did add a counter so that we could see if it was happening in the field -- 
that counter, net.inet.tcp.timer_race, indicates whether or not the stack has 
detected it happening (and then handled it).  This e-mail is to collect the 
results of that in-the-field survey.
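For illustration, the detect-and-handle approach the counter measures can be sketched in userland C. Because the connection memory is type-stable, a timer that fires while its connection is being torn down can still safely inspect the block, notice the connection is gone, bump a counter, and bail out. The names here (struct tcb, tcb_valid, timer_race) are invented for this example and do not match the real stack:

```c
#include <assert.h>

/* Hypothetical connection block; in a type-stable pool this memory is
 * only ever reused for other connection blocks, so a stale pointer
 * still points at a block of this shape. */
struct tcb {
	int tcb_valid;		/* cleared when the connection is dropped */
};

/* Mirrors the role of the net.inet.tcp.timer_race counter: it records
 * how often the race was detected and handled, not a failure. */
static unsigned long timer_race;

/* Timer callback: detect a connection that raced away and handle it
 * by doing nothing except counting the occurrence. */
static void
tcp_timer_fires(struct tcb *tp)
{
	if (!tp->tcb_valid) {
		timer_race++;	/* the race happened; count it and bail */
		return;
	}
	/* ... normal retransmit/keepalive processing would go here ... */
}
```

Preventing the race as an invariant, as planned for the virtualized stack, would instead guarantee the timer can never observe a torn-down block at all.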


Please check the results of the following command:

  % sysctl net.inet.tcp.timer_race
  net.inet.tcp.timer_race: 0

If your system shows a non-zero value, please send me a *private e-mail* with 
the output of that command, plus also the output of "sysctl kern.smp", 
"uptime", and a brief description of the workload and network interface 
configuration.  For example: it's a busy 8-core web server with roughly X 
connections/second, and that has three em network interfaces used to load 
balance from an upstream source.  IPSEC is used for management purposes (but 
not bulk traffic), and there's a local MySQL database.


I've already seen one non-zero report, but would be interested in knowing a 
bit more about the kinds of situations where it's happening so that I can 
prioritize fixing it appropriately, but also reason about the frequency at 
which it happens so we can select a fix that avoids adding significant 
overhead in the common case.


Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge