Re: svn commit: r302998 - head/sys/kern

2016-07-19 Thread Randall Stewart via svn-src-all
Well 

The code itself I had up on machines for probably about 2 months. But then
I switched over to Gleb’s changes here just recently .. which caused me all
kinds of fun :)

I had to go back into Mercurial to pull back my changes.. I have had the
resurrected changes running on my netflix machines for about 20 or so
hours generating about anywhere from 14Gbps to 32Gbps depending on the machine
type.

I plan on waiting until tomorrow to sync it down into the NF code base. 

Note that if you do decide instead to roll back to the 10.x kern_timeout.c you
will need to roll back a bunch of tcp changes as well that use the new
async_drain() interface.

I am game either way for you to proceed.. I will commit this current code
to head as long as I hear no objections (from Gleb or others)….

R

> On Jul 19, 2016, at 3:56 PM, Glen Barber  wrote:
> 
> On Tue, Jul 19, 2016 at 03:46:54PM +0200, Randall Stewart wrote:
>> Glen:
>> 
>> My changes work.. I have them running in NF in  at least 1/2 dozen machines.
>> 
> 
> For how long?  What are the uptimes on these machines?
> 
> This is the blocker for 11.0-BETA2, and I don't want to see more
> regressions being introduced at this point of the cycle.
> 
> Glen
> 
>> I am more than willing to commit them.. they actually are not much different 
>> than
>> whats in stable 10.. though I don’t know if the async-drain was MFC’d 
>> there.. it
>> needs to be in for TCP.. or else you will have yet another mess in that
>> respect (TCP depends on ASYNC-drain).
>> 
>> I can commit what I have.. if you like.. or not.. I really don’t care (I 
>> hate kern_timeout.c :-o)
>> 
>> R
>>> On Jul 19, 2016, at 2:25 PM, Glen Barber  wrote:
>>> 
>>> On Tue, Jul 19, 2016 at 01:43:16PM +0200, Randall Stewart wrote:
 Gleb
 
 Ok
 
 I have now updated
 
 https://reviews.freebsd.org/D7135
 
 You can take this or not… I really don’t care either way… (you are welcome 
 to
 own the kern_timeout.c code I hate it) :-)
 
 Basically when you went off and re-factored kern_timeout.c I had worked in 
 parallel on fixing
 the bugs you were seeing.. There were three distinct problems that I 
 fixed… but then
 you had refactored the stop() routine.. and I thought ok.. thats fine. I 
 had actually thought about
 doing something similar to what you did and was too chicken to poke that 
 much at it.. it has
 always had a nasty habit of biting back when you make a lot of changes :-D
 
 I know my version has worked for quite some time in my testing so I 
 brought it back.
 Complete with its 3 return codes (I only recently switched to your version 
 and thus
 started having difficulties with leaks and crashes)….
 
 You are welcome not to use this..  I know it works (it ran
 on a number of machines at NF last night.. and we will of course continue 
 testing
 it as we finish our dev testing for the upcoming OCA software release).. 
 For now
 this is what will be going out into the OCA’s at least :-)
 
>>> 
>>> I'm honestly done with this topic, and at the point now where I'm
>>> considering backing out all changes to callout(9) and related changes to
>>> the state they were at in stable/10.
>>> 
>>> This changes the KBI, and if it needs to be done, it needs to happen
>>> now.  We cannot wait for RC1 phase for this, and the amount of churn to
>>> get things into a working state with the current implementation far
>>> outweighs the benefit of the dangers.
>>> 
>>> Glen
>>> 
>> 
>> 
>> Randall Stewart
>> r...@netflix.com
>> 803-317-4952
>> 
>> 
>> 
>> 
>> 


Randall Stewart
r...@netflix.com
803-317-4952





___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r302998 - head/sys/kern

2016-07-19 Thread Glen Barber
On Tue, Jul 19, 2016 at 03:46:54PM +0200, Randall Stewart wrote:
> Glen:
> 
> My changes work.. I have them running in NF in  at least 1/2 dozen machines.
> 

For how long?  What are the uptimes on these machines?

This is the blocker for 11.0-BETA2, and I don't want to see more
regressions being introduced at this point of the cycle.

Glen

> I am more than willing to commit them.. they actually are not much different 
> than
> whats in stable 10.. though I don’t know if the async-drain was MFC’d there.. 
> it
> needs to be in for TCP.. or else you will have yet another mess in that
> respect (TCP depends on ASYNC-drain).
> 
> I can commit what I have.. if you like.. or not.. I really don’t care (I hate 
> kern_timeout.c :-o)
> 
> R
> > On Jul 19, 2016, at 2:25 PM, Glen Barber  wrote:
> > 
> > On Tue, Jul 19, 2016 at 01:43:16PM +0200, Randall Stewart wrote:
> >> Gleb
> >> 
> >> Ok
> >> 
> >> I have now updated
> >> 
> >> https://reviews.freebsd.org/D7135
> >> 
> >> You can take this or not… I really don’t care either way… (you are welcome 
> >> to
> >> own the kern_timeout.c code I hate it) :-)
> >> 
> >> Basically when you went off and re-factored kern_timeout.c I had worked in 
> >> parallel on fixing
> >> the bugs you were seeing.. There were three distinct problems that I 
> >> fixed… but then
> >> you had refactored the stop() routine.. and I thought ok.. thats fine. I 
> >> had actually thought about
> >> doing something similar to what you did and was too chicken to poke that 
> >> much at it.. it has
> >> always had a nasty habit of biting back when you make a lot of changes :-D
> >> 
> >> I know my version has worked for quite some time in my testing so I 
> >> brought it back.
> >> Complete with its 3 return codes (I only recently switched to your version 
> >> and thus
> >> started having difficulties with leaks and crashes)….
> >> 
> >> You are welcome not to use this..  I know it works (it ran
> >> on a number of machines at NF last night.. and we will of course continue 
> >> testing
> >> it as we finish our dev testing for the upcoming OCA software release).. 
> >> For now
> >> this is what will be going out into the OCA’s at least :-)
> >> 
> > 
> > I'm honestly done with this topic, and at the point now where I'm
> > considering backing out all changes to callout(9) and related changes to
> > the state they were at in stable/10.
> > 
> > This changes the KBI, and if it needs to be done, it needs to happen
> > now.  We cannot wait for RC1 phase for this, and the amount of churn to
> > get things into a working state with the current implementation far
> > outweighs the benefit of the dangers.
> > 
> > Glen
> > 
> 
> 
> Randall Stewart
> r...@netflix.com
> 803-317-4952
> 
> 
> 
> 
> 


signature.asc
Description: PGP signature


Re: svn commit: r302998 - head/sys/kern

2016-07-19 Thread Randall Stewart via svn-src-all
Glen:

My changes work.. I have them running in NF in  at least 1/2 dozen machines.

I am more than willing to commit them.. they actually are not much different 
than
whats in stable 10.. though I don’t know if the async-drain was MFC’d there.. it
needs to be in for TCP.. or else you will have yet another mess in that
respect (TCP depends on ASYNC-drain).

I can commit what I have.. if you like.. or not.. I really don’t care (I hate 
kern_timeout.c :-o)

R
> On Jul 19, 2016, at 2:25 PM, Glen Barber  wrote:
> 
> On Tue, Jul 19, 2016 at 01:43:16PM +0200, Randall Stewart wrote:
>> Gleb
>> 
>> Ok
>> 
>> I have now updated
>> 
>> https://reviews.freebsd.org/D7135
>> 
>> You can take this or not… I really don’t care either way… (you are welcome to
>> own the kern_timeout.c code I hate it) :-)
>> 
>> Basically when you went off and re-factored kern_timeout.c I had worked in 
>> parallel on fixing
>> the bugs you were seeing.. There were three distinct problems that I fixed… 
>> but then
>> you had refactored the stop() routine.. and I thought ok.. thats fine. I had 
>> actually thought about
>> doing something similar to what you did and was too chicken to poke that 
>> much at it.. it has
>> always had a nasty habit of biting back when you make a lot of changes :-D
>> 
>> I know my version has worked for quite some time in my testing so I brought 
>> it back.
>> Complete with its 3 return codes (I only recently switched to your version 
>> and thus
>> started having difficulties with leaks and crashes)….
>> 
>> You are welcome not to use this..  I know it works (it ran
>> on a number of machines at NF last night.. and we will of course continue 
>> testing
>> it as we finish our dev testing for the upcoming OCA software release).. For 
>> now
>> this is what will be going out into the OCA’s at least :-)
>> 
> 
> I'm honestly done with this topic, and at the point now where I'm
> considering backing out all changes to callout(9) and related changes to
> the state they were at in stable/10.
> 
> This changes the KBI, and if it needs to be done, it needs to happen
> now.  We cannot wait for RC1 phase for this, and the amount of churn to
> get things into a working state with the current implementation far
> outweighs the benefit of the dangers.
> 
> Glen
> 


Randall Stewart
r...@netflix.com
803-317-4952





___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r302998 - head/sys/kern

2016-07-19 Thread Glen Barber
On Tue, Jul 19, 2016 at 01:43:16PM +0200, Randall Stewart wrote:
> Gleb
> 
> Ok
> 
> I have now updated
> 
> https://reviews.freebsd.org/D7135
> 
> You can take this or not… I really don’t care either way… (you are welcome to
> own the kern_timeout.c code I hate it) :-)
> 
> Basically when you went off and re-factored kern_timeout.c I had worked in 
> parallel on fixing
> the bugs you were seeing.. There were three distinct problems that I fixed… 
> but then
> you had refactored the stop() routine.. and I thought ok.. thats fine. I had 
> actually thought about
> doing something similar to what you did and was too chicken to poke that much 
> at it.. it has
> always had a nasty habit of biting back when you make a lot of changes :-D
> 
> I know my version has worked for quite some time in my testing so I brought 
> it back.
> Complete with its 3 return codes (I only recently switched to your version 
> and thus
> started having difficulties with leaks and crashes)….
> 
> You are welcome not to use this..  I know it works (it ran
> on a number of machines at NF last night.. and we will of course continue 
> testing
> it as we finish our dev testing for the upcoming OCA software release).. For 
> now
> this is what will be going out into the OCA’s at least :-)
> 

I'm honestly done with this topic, and at the point now where I'm
considering backing out all changes to callout(9) and related changes to
the state they were at in stable/10.

This changes the KBI, and if it needs to be done, it needs to happen
now.  We cannot wait for RC1 phase for this, and the amount of churn to
get things into a working state with the current implementation far
outweighs the benefit of the dangers.

Glen



signature.asc
Description: PGP signature


Re: svn commit: r302998 - head/sys/kern

2016-07-19 Thread Randall Stewart via svn-src-all
Gleb

Ok

I have now updated

https://reviews.freebsd.org/D7135

You can take this or not… I really don’t care either way… (you are welcome to
own the kern_timeout.c code I hate it) :-)

Basically when you went off and re-factored kern_timeout.c I had worked in 
parallel on fixing
the bugs you were seeing.. There were three distinct problems that I fixed… but 
then
you had refactored the stop() routine.. and I thought ok.. thats fine. I had 
actually thought about
doing something similar to what you did and was too chicken to poke that much 
at it.. it has
always had a nasty habit of biting back when you make a lot of changes :-D

I know my version has worked for quite some time in my testing so I brought it 
back.
Complete with its 3 return codes (I only recently switched to your version and 
thus
started having difficulties with leaks and crashes)….

You are welcome not to use this..  I know it works (it ran
on a number of machines at NF last night.. and we will of course continue 
testing
it as we finish our dev testing for the upcoming OCA software release).. For now
this is what will be going out into the OCA’s at least :-)

R

> On Jul 18, 2016, at 6:19 PM, Randall Stewart  wrote:
> 
> I have worked out a fix of this in Netflix code base (I have the same code 
> running there). I
> will get that tested tonight I will get the fixes in to restore the behavior.
> 
> I will setup a phabricator shortly.. most likely I will update the one I 
> already
> have on the one problem your earlier patch did not fix.
> 
> R
>> On Jul 18, 2016, at 5:44 PM, Randall Stewart  wrote:
>> 
>> Gleb:
>> 
>> This now leaks TCP-PCB’s since you have broken the return codes with all your
>> fixes that used to be in here.
>> 
>> It was
>> 
>> return 1 — You stopped the callout
>> return 0 — The callout could not be stopped
>> return -1 — The callout was not running.
>> 
>> The LLRef code that was crashing in in.c depended on this to know to free
>> the memory.. i.e. if was > 0 then they needed to free the memory.
>> 
>> TCP depends on a return 0 to indicate the async-drain function will be 
>> called back and
>> thus increments a refcnt and waits for the callback.
>> 
>> You now return 0 when no timer was active.. which makes the stack then wait
>> for the not forth coming async-drain call.
>> 
>> R
>>> On Jul 18, 2016, at 11:29 AM, Gleb Smirnoff  wrote:
>>> 
>>> Author: glebius
>>> Date: Mon Jul 18 09:29:08 2016
>>> New Revision: 302998
>>> URL: https://svnweb.freebsd.org/changeset/base/302998
>>> 
>>> Log:
>>> Revert the last commit. It must get more review and testing first.
>>> 
>>> Modified:
>>> head/sys/kern/kern_timeout.c
>>> 
>>> Modified: head/sys/kern/kern_timeout.c
>>> ==
>>> --- head/sys/kern/kern_timeout.cMon Jul 18 09:26:06 2016
>>> (r302997)
>>> +++ head/sys/kern/kern_timeout.cMon Jul 18 09:29:08 2016
>>> (r302998)
>>> @@ -1381,7 +1381,7 @@ again:
>>> CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>> c, c->c_func, c->c_arg);
>>> CC_UNLOCK(cc);
>>> -   return (-1);
>>> +   return (0);
>>> }
>>> 
>>> c->c_iflags &= ~CALLOUT_PENDING;
>>> 
>> 
>> 
>> Randall Stewart
>> r...@netflix.com
>> 803-317-4952
>> 
>> 
>> 
>> 
>> 
> 
> 
> Randall Stewart
> r...@netflix.com
> 803-317-4952
> 
> 
> 
> 
> 


Randall Stewart
r...@netflix.com
803-317-4952





___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r302998 - head/sys/kern

2016-07-18 Thread Randall Stewart via svn-src-all
I have worked out a fix of this in Netflix code base (I have the same code 
running there). I
will get that tested tonight I will get the fixes in to restore the behavior.

I will setup a phabricator shortly.. most likely I will update the one I already
have on the one problem your earlier patch did not fix.

R
> On Jul 18, 2016, at 5:44 PM, Randall Stewart  wrote:
> 
> Gleb:
> 
> This now leaks TCP-PCB’s since you have broken the return codes with all your
> fixes that used to be in here.
> 
> It was
> 
> return 1 — You stopped the callout
> return 0 — The callout could not be stopped
> return -1 — The callout was not running.
> 
> The LLRef code that was crashing in in.c depended on this to know to free
> the memory.. i.e. if was > 0 then they needed to free the memory.
> 
> TCP depends on a return 0 to indicate the async-drain function will be called 
> back and
> thus increments a refcnt and waits for the callback.
> 
> You now return 0 when no timer was active.. which makes the stack then wait
> for the not forth coming async-drain call.
> 
> R
>> On Jul 18, 2016, at 11:29 AM, Gleb Smirnoff  wrote:
>> 
>> Author: glebius
>> Date: Mon Jul 18 09:29:08 2016
>> New Revision: 302998
>> URL: https://svnweb.freebsd.org/changeset/base/302998
>> 
>> Log:
>> Revert the last commit. It must get more review and testing first.
>> 
>> Modified:
>> head/sys/kern/kern_timeout.c
>> 
>> Modified: head/sys/kern/kern_timeout.c
>> ==
>> --- head/sys/kern/kern_timeout.c Mon Jul 18 09:26:06 2016
>> (r302997)
>> +++ head/sys/kern/kern_timeout.c Mon Jul 18 09:29:08 2016
>> (r302998)
>> @@ -1381,7 +1381,7 @@ again:
>>  CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>  c, c->c_func, c->c_arg);
>>  CC_UNLOCK(cc);
>> -return (-1);
>> +return (0);
>>  }
>> 
>>  c->c_iflags &= ~CALLOUT_PENDING;
>> 
> 
> 
> Randall Stewart
> r...@netflix.com
> 803-317-4952
> 
> 
> 
> 
> 


Randall Stewart
r...@netflix.com
803-317-4952





___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r302998 - head/sys/kern

2016-07-18 Thread Randall Stewart via svn-src-all
Gleb:

This now leaks TCP-PCB’s since you have broken the return codes with all your
fixes that used to be in here.

It was

return 1 — You stopped the callout
return 0 — The callout could not be stopped
return -1 — The callout was not running.

The LLRef code that was crashing in in.c depended on this to know to free
the memory.. i.e. if was > 0 then they needed to free the memory.

TCP depends on a return 0 to indicate the async-drain function will be called 
back and
thus increments a refcnt and waits for the callback.

You now return 0 when no timer was active.. which makes the stack then wait
for the not forth coming async-drain call.

R
> On Jul 18, 2016, at 11:29 AM, Gleb Smirnoff  wrote:
> 
> Author: glebius
> Date: Mon Jul 18 09:29:08 2016
> New Revision: 302998
> URL: https://svnweb.freebsd.org/changeset/base/302998
> 
> Log:
>  Revert the last commit. It must get more review and testing first.
> 
> Modified:
>  head/sys/kern/kern_timeout.c
> 
> Modified: head/sys/kern/kern_timeout.c
> ==
> --- head/sys/kern/kern_timeout.c  Mon Jul 18 09:26:06 2016
> (r302997)
> +++ head/sys/kern/kern_timeout.c  Mon Jul 18 09:29:08 2016
> (r302998)
> @@ -1381,7 +1381,7 @@ again:
>   CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>   c, c->c_func, c->c_arg);
>   CC_UNLOCK(cc);
> - return (-1);
> + return (0);
>   }
> 
>   c->c_iflags &= ~CALLOUT_PENDING;
> 


Randall Stewart
r...@netflix.com
803-317-4952





___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

svn commit: r302998 - head/sys/kern

2016-07-18 Thread Gleb Smirnoff
Author: glebius
Date: Mon Jul 18 09:29:08 2016
New Revision: 302998
URL: https://svnweb.freebsd.org/changeset/base/302998

Log:
  Revert the last commit. It must get more review and testing first.

Modified:
  head/sys/kern/kern_timeout.c

Modified: head/sys/kern/kern_timeout.c
==
--- head/sys/kern/kern_timeout.cMon Jul 18 09:26:06 2016
(r302997)
+++ head/sys/kern/kern_timeout.cMon Jul 18 09:29:08 2016
(r302998)
@@ -1381,7 +1381,7 @@ again:
CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
c, c->c_func, c->c_arg);
CC_UNLOCK(cc);
-   return (-1);
+   return (0);
}
 
c->c_iflags &= ~CALLOUT_PENDING;
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"