RE: A question about timecounters
>be open source. It's a simulated web client and web server, running >inside the kernel. It's good for load-testing and performance-testing >many kinds of network devices. With two 1-GHz PIII boxes (one acting >as the client and the other acting as the server) it can generate >around 5 (actually I think it's more than that) full web sessions >per second. Also, you can dial in any rate you want, and it will >generate that rate very precisely. Lots of fun! ... yea yea :-) It reminds me an 2 years old feature in our product (was QARobot originally, now it's a part of RouterTester) where we had an Session Storm module generates HTTP traffic with kernel support too. I did remember how it was killing any httpd server in secs even from one slave with possible up to 20 with thousands ssm's running on each cpu. Unfortunately it's obsolete now (qa and development, but code and qbol language support still be inside new product for free :). I may try to find some performance data tomorrow if it's interesting for you off cause... Igor. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message: <[EMAIL PROTECTED]> John Polstra <[EMAIL PROTECTED]> writes: : I'm testing that now. But for how long would microuptime have to : be interrupted to make this happen? Surely not 7.81 seconds! On : this same machine I have a curses application running which is : updating the screen once a second. It never misses a beat, and : userland is very responsive. Silly hypothesis. It isn't losing 7.81 seconds of time. Rather it is being interrupt at the wrong time and the wrap detection code works badly when interrupted at that point. So you aren't blocked for 7.81 seconds, but rather for 0.02 seconds since you are seeing the 7.79 jump. You are basically catching the wrong edge of the phase of the TSC. Warner To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
John Polstra wrote: > After 25 minutes testing that with NTIMECOUNTER=5, I haven't > gotten any microuptime messages. So it appears that my problem was > just that the current timecounter wrapped all the way around the ring > while microuptime was interrupted, due to the high HZ value and the > heavy interrupt load. I'm sorry I didn't try this sooner, when you > suggested it. > > After lunch I'll try NTIMECOUNTER=HZ (1 in my case). That sounds > like a nice default value to me. 5 * HZ @ 10,000 HZ * sizeof(struct timecounter) = 5 * 1 * 23*4 = 4,600,000 = 4.5M in timecounter structures Even with your NTIMECOUNTER=HZ (assuming it even works), that is 1M of memory burned on timecounter structures to prevent wrap around. If the interrupts are happening fast enough that wrap is this bad a problem, then I suggest another counter that divides the interupt frequency down to a (much smaller) update frequency for the timecounter. Is having this many of these things really worthwhile? I can't see how off the top of my head, but I can imagine a situation where it miht be referenced with a higher than wrap frequency... but the question is, what's the highest access frequency? That should limit the update frequency to twice that, right, which would guarantee adequate update resolution to satisfy that use? If it is, could we maybe just count wraps, instead, and add 7.7? -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> Is C a great language, or what? ;-) Nah, just mediocre even when it comes to obfuscation! Have you played with unlambda?! > The way I always remember it is that you read the declaration > inside-out: starting with the variable name and then heading toward > the outside while obeying the precedence rules. When you hit a "*", > you say "pointer to"; when you hit "[]", you say "array of"; and when > you hit "()" you say "function returning." For example: I remember something about switching declaration reading direction when you hit a bracket; but why bother once you have cdecl? cdecl> declare f as array of pointer to function returning pointer to function returning int int (*(*f[])())() It is not clear to me how to apply your rule. It doesn't matter though, it is gotten to the point where I can only store ptrs to ptrs to information in my ever shrinking brain! To the people who pointed out the cdecl port, I did look in /usr/ports/devel but missed cdecl somehow. Sigh... :-) -- bakul To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> PS: Chances are most people don't have cdecl any more. You > can get it like this: cd /usr/ports/devel/cdecl && make install :) -Anthony. msg31489/pgp0.pgp Description: PGP signature
Re: A question about timecounters
On Tue, Feb 05, 2002 at 02:42:38PM -0800, Bakul Shah wrote: > > PS: Chances are most people don't have cdecl any more. You > can get it like this: > You can also get it like this: cd /usr/ports/devel/cdecl ; make install which I just went and did. Pretty helpful utility :) --K To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Bakul Shah <[EMAIL PROTECTED]> wrote: > [I see that jdp has answered your question but] cdecl is your friend! > > $ cdecl > Type `help' or `?' for help > cdecl> explain volatile struct timecounter *timecounter > declare timecounter as pointer to volatile struct timecounter > cdecl> declare timecounter as volatile pointer to struct timecounter > struct timecounter * volatile timecounter Is C a great language, or what? ;-) The way I always remember it is that you read the declaration inside-out: starting with the variable name and then heading toward the outside while obeying the precedence rules. When you hit a "*", you say "pointer to"; when you hit "[]", you say "array of"; and when you hit "()" you say "function returning." For example: struct timecounter * volatile timecounter; /* "Timecounter is a volatile pointer to a struct timecounter." */ volatile struct timecounter *timecounter; /* "Timecounter is a pointer to a struct timecounter which is volatile." */ The reason for the awkward "which is" in that last one is just because C lets you get sloppy with the ordering of the outermost keywords. The pedantically correct way to declare a pointer to volatile struct is like this: struct timecounter volatile *timecounter; /* "Timecounter is a pointer to a volatile struct timecounter." */ John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> Btw, regarding the volatile thing: > > If I do > extern volatile struct timecounter *timecounter; > > microtime() > { > struct timecounter *tc; > > tc = timecounter; > > The compiler complains about loosing the volatile thing. > > How do I tell it that it is the contents of the "timecounter" pointer which > is volatile, but now what it points at ? I don't want the "tc" pointer to > be volatile because it obviously isn't. Do I really need to cast it ? > > tc = (struct timecounter *)timecounter; [I see that jdp has answered your question but] cdecl is your friend! $ cdecl Type `help' or `?' for help cdecl> explain volatile struct timecounter *timecounter declare timecounter as pointer to volatile struct timecounter cdecl> declare timecounter as volatile pointer to struct timecounter struct timecounter * volatile timecounter -- bakul PS: Chances are most people don't have cdecl any more. You can get it like this: mkdir cdecl;cd cdecl fetch ftp://gatekeeper.dec.com/pub/usenet/comp.sources.unix/volume14/cdecl2/part0{1,2}.Z gzcat part01.Z | gunshar gzcat part02.Z | gunshar patch <<'EOF' diff -ru ../cdecl-orig/cdecl.c ./cdecl.c --- ../cdecl-orig/cdecl.c Tue Feb 5 14:24:23 2002 +++ ./cdecl.c Tue Feb 5 12:12:30 2002 @@ -57,6 +57,9 @@ # include # include # include +#ifdef BSD +#include +#endif #else # ifndef NOVARARGS # include @@ -110,6 +113,9 @@ void docast(char*, char*, char*, char*); void dodexplain(char*, char*, char*, char*); void docexplain(char*, char*, char*, char*); +#ifdef __FreeBSD__ +#define setprogname _bad_bad_bad_FreeBSD +#endif void setprogname(char *); int dotmpfile(int, char**), dofileargs(int, char**); #else diff -ru ../cdecl-orig/makefile ./makefile --- ../cdecl-orig/makefile Tue Feb 5 14:24:19 2002 +++ ./makefile Tue Feb 5 12:10:10 2002 @@ -13,7 +13,7 @@ # add -DdodebugTo compile in debugging trace statements. # add -Ddoyydebug To compile in yacc trace statements. -CFLAGS= -g -Ddodebug -Ddoyydebug +CFLAGS= -g -Ddodebug -Ddoyydebug -DBSD CC= cc ALLFILES= makefile cdgram.y cdlex.l cdecl.c cdecl.1 testset testset++ BIN= /usr/lubin EOF make # as root: make install BIN=/usr/local/bin No idea if c++decl is valid any more! To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > > Well, either way I will commit the volatile and this NTIMECOUNTER to > -current now, it's certainly better than what is there now. Great, thanks. > Thanks for the help, I owe you one at BSDcon! I'll look forward to it! > Ohh, and btw: do I need to say that I'm dying to know what the heck > you are doing with that box ? :-) Making the room a good bit warmer, that's for sure. :-) First I should mention for the benefit of those listening in that it isn't something I can release publicly, and it will probably never be open source. It's a simulated web client and web server, running inside the kernel. It's good for load-testing and performance-testing many kinds of network devices. With two 1-GHz PIII boxes (one acting as the client and the other acting as the server) it can generate around 5 (actually I think it's more than that) full web sessions per second. Also, you can dial in any rate you want, and it will generate that rate very precisely. Lots of fun! John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >After 25 minutes testing that with NTIMECOUNTER=5, I haven't >gotten any microuptime messages. So it appears that my problem was >just that the current timecounter wrapped all the way around the ring >while microuptime was interrupted, due to the high HZ value and the >heavy interrupt load. I'm sorry I didn't try this sooner, when you >suggested it. > >After lunch I'll try NTIMECOUNTER=HZ (1 in my case). That sounds >like a nice default value to me. Yup, if we preempt things for more than a second we have other problems too I think. My fault for not letting it depend in HZ in the first place. >It would be interesting to see whether this same fix also works in >-current. I'm not sure yet if I can test that here or not. I haven't >tried building my netgraph module under -current yet. Well, either way I will commit the volatile and this NTIMECOUNTER to -current now, it's certainly better than what is there now. Thanks for the help, I owe you one at BSDcon! Poul-Henning Ohh, and btw: do I need to say that I'm dying to know what the heck you are doing with that box ? :-) -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > Could you try this combination: > > NTIMECOUNTER = HZ (or even 5 * HZ) > tco_method = 0 > no splhigh protection for microuptime() ? After 25 minutes testing that with NTIMECOUNTER=5, I haven't gotten any microuptime messages. So it appears that my problem was just that the current timecounter wrapped all the way around the ring while microuptime was interrupted, due to the high HZ value and the heavy interrupt load. I'm sorry I didn't try this sooner, when you suggested it. After lunch I'll try NTIMECOUNTER=HZ (1 in my case). That sounds like a nice default value to me. It would be interesting to see whether this same fix also works in -current. I'm not sure yet if I can test that here or not. I haven't tried building my netgraph module under -current yet. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > >I don't follow that. As I read the code, the "current" timecounter > >is only advanced every second -- not every 1/HZ seconds. Why should > >more of them be needed when HZ is large? > > No, only if you have set tco_method to one, if tco_method is zero (default) > we update the timecounter every HZ. Oh, you're right. Sorry, I misread the code. My tco_method is 0, so it will do the update every HZ. > Could you try this combination: > > NTIMECOUNTER = HZ (or even 5 * HZ) > tco_method = 0 > no splhigh protection for microuptime() ? Yep, I'll try that. > If I do > extern volatile struct timecounter *timecounter; > > microtime() > { > struct timecounter *tc; > > tc = timecounter; > > The compiler complains about loosing the volatile thing. > > How do I tell it that it is the contents of the "timecounter" pointer which > is volatile, but now what it points at ? You want: extern struct timecounter *volatile timecounter; and also change the definition of the variable in the .c file. You won't get any warnings from that. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >OK, adding the splhigh() around the body of microuptime seems to have >solved the problem. After 45 minutes of running the same test as >before, I haven't gotten a single message. If I get one later, I'll >let you know. Ok, so we know where the bogotism happens, now to identify it... >I don't follow that. As I read the code, the "current" timecounter >is only advanced every second -- not every 1/HZ seconds. Why should >more of them be needed when HZ is large? No, only if you have set tco_method to one, if tco_method is zero (default) we update the timecounter every HZ. >> You didn't say if you ran with standard NTIMECOUNTER right now, >> but 5 would be awfully short time at HZ=1: 500 usec... > >Well, microseconds aren't what they used to be ... :-) But isn't it >true that the current timecounter only advances every second? I think >I have 5 seconds, not 5/HZ seconds. Depends on your tco_method... Could you try this combination: NTIMECOUNTER = HZ (or even 5 * HZ) tco_method = 0 no splhigh protection for microuptime() ? The reason why tco_method=1 isn't nice is that we loose the ability to use 1/hz precision cached timestamps in the get*() funtions. Btw, regarding the volatile thing: If I do extern volatile struct timecounter *timecounter; microtime() { struct timecounter *tc; tc = timecounter; The compiler complains about loosing the volatile thing. How do I tell it that it is the contents of the "timecounter" pointer which is volatile, but now what it points at ? I don't want the "tc" pointer to be volatile because it obviously isn't. Do I really need to cast it ? tc = (struct timecounter *)timecounter; That looks silly to me... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
OK, adding the splhigh() around the body of microuptime seems to have solved the problem. After 45 minutes of running the same test as before, I haven't gotten a single message. If I get one later, I'll let you know. > >I'm testing that now. But for how long would microuptime have to > >be interrupted to make this happen? Surely not 7.81 seconds! On > >this same machine I have a curses application running which is > >updating the screen once a second. It never misses a beat, and > >userland is very responsive. > > Well, that is what I don't understand yet either :-) > > The fact that the delta is not exactly 2^32 * cpu clock is probably > blindingly obviously indicative of "why", but I havn't solved the > puzzle yet... I've been staring at the code, but I haven't gotten it yet either. > Since you are running with a 1 HZ, NTIMECOUNTER should probably > be considerably increased. (Actually it might be a good idea to > simply set NTIMECOUNTER == hz ... hmmm...) I don't follow that. As I read the code, the "current" timecounter is only advanced every second -- not every 1/HZ seconds. Why should more of them be needed when HZ is large? > You didn't say if you ran with standard NTIMECOUNTER right now, > but 5 would be awfully short time at HZ=1: 500 usec... Well, microseconds aren't what they used to be ... :-) But isn't it true that the current timecounter only advances every second? I think I have 5 seconds, not 5/HZ seconds. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> >How are issues (1) and (3) above different? > > > >ps. I'm just trying to understand, and am *NOT* trying to start a > >flame-war. :) :) :) > > If the starvation happens to hardclock() or rather tc_windup() the effect > will be cummulative and show up in permanent jumps in the output of date > for instance. In stable hardclock() is spl-protected so this would be > _really_ bad news. > > If the starvation happens in any of {micro|nano}[up]time() (but not the > "get&" variants!) the it will result in a single spurious reading. Ok, the bulb is starting to grow from dim to bright. :) > The premise for avoiding locking in the access functions to timecounters > where precisely that we could trust them to not be pre-empted for long > enough for the hardware to roll over, if this is not the case we loose > because the overflow in the hardware counter means that the timecounter > we calculate from is not valid for the delta we get from the hardware. > > I'm not sure this answers your question, if not it is not bad will, just > me not understanding the question :-) *grin* I think I understand the problem. Let me try to rephrase to make sure. 1) If we have an interrupt lockout (*NOT* due to time-counting code), then we'd have a problem since the hardclock would never get run. 2) If however, the locking done to protect the timecounter code happens to make getting/setting the timecounter take too long, we'd get similar results, but for *completely* different reasons. Let me be more precise. (1) cli(); /* Take a really long time doing something */ sti(); (2) /* Do something */ gettime(); /* Takes a really long time to complete */ The first is harder to track down/fix, simply because you don't know *who* the offender is. The latter is essentially the same problem to fix, but *may* be easier to fix since the offending code *IS* the timecounter code. Am I even close to understanding? Nate To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >In article <[EMAIL PROTECTED]>, >Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: >> In message <[EMAIL PROTECTED]>, John Polstra writes: >> >Yes, I think you're onto something now. It's a 550 MHz. machine, so >> >the TSC increments every 1.82 nsec. And 1.82 nsec * 2^32 is 7.81 >> >seconds. :-) >> >> In that case I'm almost willing to put an AnchorSteam on microuptime() >> being interrupted for more than good is in which case the splhigh() should >> cure it. > >I'm testing that now. But for how long would microuptime have to >be interrupted to make this happen? Surely not 7.81 seconds! On >this same machine I have a curses application running which is >updating the screen once a second. It never misses a beat, and >userland is very responsive. Well, that is what I don't understand yet either :-) The fact that the delta is not exactly 2^32 * cpu clock is probably blindingly obviously indicative of "why", but I havn't solved the puzzle yet... Since you are running with a 1 HZ, NTIMECOUNTER should probably be considerably increased. (Actually it might be a good idea to simply set NTIMECOUNTER == hz ... hmmm...) We could be seing a situation where a process is preempted in microuptime() and the timecounter ring being recycled *and* the hardware counter overflowing before it completes, that might give this problem. You didn't say if you ran with standard NTIMECOUNTER right now, but 5 would be awfully short time at HZ=1: 500 usec... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, Nate Williams writes: >How are issues (1) and (3) above different? > >ps. I'm just trying to understand, and am *NOT* trying to start a >flame-war. :) :) :) If the starvation happens to hardclock() or rather tc_windup() the effect will be cummulative and show up in permanent jumps in the output of date for instance. In stable hardclock() is spl-protected so this would be _really_ bad news. If the starvation happens in any of {micro|nano}[up]time() (but not the "get&" variants!) the it will result in a single spurious reading. The premise for avoiding locking in the access functions to timecounters where precisely that we could trust them to not be pre-empted for long enough for the hardware to roll over, if this is not the case we loose because the overflow in the hardware counter means that the timecounter we calculate from is not valid for the delta we get from the hardware. I'm not sure this answers your question, if not it is not bad will, just me not understanding the question :-) -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > >Yes, I think you're onto something now. It's a 550 MHz. machine, so > >the TSC increments every 1.82 nsec. And 1.82 nsec * 2^32 is 7.81 > >seconds. :-) > > In that case I'm almost willing to put an AnchorSteam on microuptime() > being interrupted for more than good is in which case the splhigh() should > cure it. I'm testing that now. But for how long would microuptime have to be interrupted to make this happen? Surely not 7.81 seconds! On this same machine I have a curses application running which is updating the screen once a second. It never misses a beat, and userland is very responsive. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> >> >> Can you try to MFC rev 1.111 and see if that changes anything ? > >> > > >> >That produced some interesting results. I am still testing under > >> >very heavy network interrupt load. With the change from 1.111, I > >> >still get the microuptime messages about as often. But look how > >> >much larger the reported backwards jumps are: > >> > > >> >microuptime() went backwards (896.225603 -> 888.463636) > >> >microuptime() went backwards (896.225603 -> 888.494440) > >> >microuptime() went backwards (896.225603 -> 888.500875) > >> >microuptime() went backwards (1184.392277 -> 1176.603001) > >> >microuptime() went backwards (1184.392277 -> 1176.603749) > >> > >> (Ok, I'll MFC 1.111) > > > >Huh? It appears that 1.111 makes things worse, not better (larger > >jumps). > > No, 1.111 makes the jumps report more correctly I think. Now, if that ain't a glowing reason to MFC it, I don't know one (I think). :) :) > They will maybe save your meal in less bad cases than yours, but in > yours they just make sure that we don't get invalid number of > microseconds in a timeval, and consequently we get more honest output. How can you verify that this is the case? > >> We now have three options left: > >>hardclock interrupt starvation > > > >This is Bruce's hypothesis, right? > > Also mine for that matter. > > >>scheduling related anomaly wrt to the use of microuptime(). > >>arithmetic overflow because the call to microuptime() gets > >>interrupted for too long. > > > >'Interrupted for too long'. Do you mean 'not interrupted enough', aka > >a long interrupt blockage? (I'm trying to understand here.) > > See my previous email, I just explained it there. I still didn't understand, hence the reason for the question. (The explanation was in the email I originall responded to). I understand the 'overflow' issue, but it would seem to my naive thinking that it would occur only when interrupts are blocked for a period of time, which is the same as hardclock interrupt starvation in my mind. How are issues (1) and (3) above different? Nate ps. I'm just trying to understand, and am *NOT* trying to start a flame-war. :) :) :) To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >In article <[EMAIL PROTECTED]>, >Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: >> In message <[EMAIL PROTECTED]>, John Polstra writes: >> >In article <[EMAIL PROTECTED]>, >> >John Polstra <[EMAIL PROTECTED]> wrote: >> > >> >Another interesting thing is that the jumps are always 7.7x seconds >> >back -- usually 7.79 seconds. This is even true with more data points >> >from two different machines. >> >> Yes, I noticed, but didn't dare draw conclusions based on two data points. > >It's pretty consistent -- always 7.7somthing. > >> This points to an arithmetic overflow (ie: point 3 in my previous email) > >Yes, I think you're onto something now. It's a 550 MHz. machine, so >the TSC increments every 1.82 nsec. And 1.82 nsec * 2^32 is 7.81 >seconds. :-) In that case I'm almost willing to put an AnchorSteam on microuptime() being interrupted for more than good is in which case the splhigh() should cure it. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, Nate Williams writes: >> >> Can you try to MFC rev 1.111 and see if that changes anything ? >> > >> >That produced some interesting results. I am still testing under >> >very heavy network interrupt load. With the change from 1.111, I >> >still get the microuptime messages about as often. But look how >> >much larger the reported backwards jumps are: >> > >> >microuptime() went backwards (896.225603 -> 888.463636) >> >microuptime() went backwards (896.225603 -> 888.494440) >> >microuptime() went backwards (896.225603 -> 888.500875) >> >microuptime() went backwards (1184.392277 -> 1176.603001) >> >microuptime() went backwards (1184.392277 -> 1176.603749) >> >> (Ok, I'll MFC 1.111) > >Huh? It appears that 1.111 makes things worse, not better (larger >jumps). No, 1.111 makes the jumps report more correctly I think. They will maybe save your meal in less bad cases than yours, but in yours they just make sure that we don't get invalid number of microseconds in a timeval, and consequently we get more honest output. >> We now have three options left: >> hardclock interrupt starvation > >This is Bruce's hypothesis, right? Also mine for that matter. >> scheduling related anomaly wrt to the use of microuptime(). >> arithmetic overflow because the call to microuptime() gets >> interrupted for too long. > >'Interrupted for too long'. Do you mean 'not interrupted enough', aka >a long interrupt blockage? (I'm trying to understand here.) See my previous email, I just explained it there. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >In article <[EMAIL PROTECTED]>, >> This may be a problem, I have yet to see GCC make different code for >> that but I should probably have committed the "volatile" anyway. > >It should be committed, but it is not causing the problem in this >case. I changed it and then compared MD5s of the object files. The >only changes that resulted were unimportant. Will do. >> >I also noticed this in tco_forward(): >> > >> >tco = timecounter; >> >tc = sync_other_counter(); >> >[...] >> >if (tco->tc_poll_pps) >> >> This code is actually correct, the tc_poll_pps needs to be done on >> the "old" timecounter, because that would be the reference for any >> captured hardware timestamps, if I did it on the new timecounter I >> might get negative deltas which would complicate things. Also the >> new timecounter may have a changed frequency/offset (tickadj/ntpd >> and all that). > >I don't think I follow your reasoning here. If the call to >sync_other_counter were inlined, we'd have something like this: > >tco = timecounter; >tco_in_sync_other_counter = timecounter; >[...] >if (tco->tc_poll_pps) > >Obviously tco and tco_in_sync_other_counter will have the same value >almost all of the time, so the code can't be relying on them being >different. It doesn't rely on them being different, they are used for two different purposes. sync_other_counter() sets up the next counter in the ring to be used for timekeeping, tco/tc_poll_pps is used to calculate any hardware captured timestamps which occured _before_ now. sync_other_counter() returns the next timecounter after fiddling it but we cannot use that for tc_poll_pps() purposes, we need the old (or if you will: still present one) for that. Trust me, this is not your problem because this is used only with the xrpu.c timecounter code anyway and you don't have the hardware for that :-) >Many of the systems where I see this problem are using the TSC as >the timecounter. They don't have APM in the kernel, and they aren't >running ntpd. I.e., it's not only the i8254 that's the problem. The >fastest of these systems is a 1.13 GHz PIII, and it would take the 32 >bits of the TSC which are actually used 3.8 seconds to wrap around. It would be a problem if the kernel is punched hard enough that code doesn't complete in the time it takes for the used timecounter to wrap, ie: if you can have something like: call microuptime() tc = timecounter; (interrupt and do other stuff for several seconds) ... overflow in arithmetic But you would have to pummel your kernel pretty bad for that. On the other hand, between a 1 Hz and the network load you describe, that may be exactly what is happening... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > > Sanity-check: this is NOT a multi-CPU system, right ? Right. These are all single-CPU systems with non-SMP -stable kernels. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> >> Can you try to MFC rev 1.111 and see if that changes anything ? > > > >That produced some interesting results. I am still testing under > >very heavy network interrupt load. With the change from 1.111, I > >still get the microuptime messages about as often. But look how > >much larger the reported backwards jumps are: > > > >microuptime() went backwards (896.225603 -> 888.463636) > >microuptime() went backwards (896.225603 -> 888.494440) > >microuptime() went backwards (896.225603 -> 888.500875) > >microuptime() went backwards (1184.392277 -> 1176.603001) > >microuptime() went backwards (1184.392277 -> 1176.603749) > > (Ok, I'll MFC 1.111) Huh? It appears that 1.111 makes things worse, not better (larger jumps). Can you explain why you think this is a good things, since it seems to be a bad thing to me. > Sanity-check: this is NOT a multi-CPU system, right ? As stated before, both are > 1Ghz single-CPU systems running -stable, although I'm sure John is capable of a answering this on his own. :) > We now have three options left: > hardclock interrupt starvation This is Bruce's hypothesis, right? > scheduling related anomaly wrt to the use of microuptime(). > arithmetic overflow because the call to microuptime() gets > interrupted for too long. 'Interrupted for too long'. Do you mean 'not interrupted enough', aka a long interrupt blockage? (I'm trying to understand here.) Nate To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > >In article <[EMAIL PROTECTED]>, > >John Polstra <[EMAIL PROTECTED]> wrote: > > > >Another interesting thing is that the jumps are always 7.7x seconds > >back -- usually 7.79 seconds. This is even true with more data points > >from two different machines. > > Yes, I noticed, but didn't dare draw conclusions based on two data points. It's pretty consistent -- always 7.7somthing. > This points to an arithmetic overflow (ie: point 3 in my previous email) Yes, I think you're onto something now. It's a 550 MHz. machine, so the TSC increments every 1.82 nsec. And 1.82 nsec * 2^32 is 7.81 seconds. :-) I'll try the things you suggested in your other mail. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >In article <[EMAIL PROTECTED]>, >John Polstra <[EMAIL PROTECTED]> wrote: >> In article <[EMAIL PROTECTED]>, >> Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: >> > In message <[EMAIL PROTECTED]>, John Polstra writes: >> > >> > Can you try to MFC rev 1.111 and see if that changes anything ? >> >> That produced some interesting results. I am still testing under >> very heavy network interrupt load. With the change from 1.111, I >> still get the microuptime messages about as often. But look how >> much larger the reported backwards jumps are: >> >> microuptime() went backwards (896.225603 -> 888.463636) >> microuptime() went backwards (896.225603 -> 888.494440) >> microuptime() went backwards (896.225603 -> 888.500875) >> microuptime() went backwards (1184.392277 -> 1176.603001) >> microuptime() went backwards (1184.392277 -> 1176.603749) > >Another interesting thing is that the jumps are always 7.7x seconds >back -- usually 7.79 seconds. This is even true with more data points >from two different machines. Yes, I noticed, but didn't dare draw conclusions based on two data points. This points to an arithmetic overflow (ie: point 3 in my previous email) -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >In article <[EMAIL PROTECTED]>, >Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: >> In message <[EMAIL PROTECTED]>, John Polstra writes: >> >> Can you try to MFC rev 1.111 and see if that changes anything ? > >That produced some interesting results. I am still testing under >very heavy network interrupt load. With the change from 1.111, I >still get the microuptime messages about as often. But look how >much larger the reported backwards jumps are: > >microuptime() went backwards (896.225603 -> 888.463636) >microuptime() went backwards (896.225603 -> 888.494440) >microuptime() went backwards (896.225603 -> 888.500875) >microuptime() went backwards (1184.392277 -> 1176.603001) >microuptime() went backwards (1184.392277 -> 1176.603749) (Ok, I'll MFC 1.111) Sanity-check: this is NOT a multi-CPU system, right ? We now have three options left: hardclock interrupt starvation scheduling related anomaly wrt to the use of microuptime(). arithmetic overflow because the call to microuptime() gets interrupted for too long. Looking for the first, add this to hardclock: static u_int64_t lasttsc; u_int64_t delta, now; now = rdtsc(); delta = now - lasttsc; lasttsc = now; if (delta > (tsc_freq * 3) / (2 * hz)) { printf("delta = %llu \n", delta); } In theory, you get no printfs. If you do, hopefully they have small delta values. Looking for the third: Add splhigh() protection around the stuff in microuptime(); -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, John Polstra <[EMAIL PROTECTED]> wrote: > In article <[EMAIL PROTECTED]>, > Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > > In message <[EMAIL PROTECTED]>, John Polstra writes: > > > > Can you try to MFC rev 1.111 and see if that changes anything ? > > That produced some interesting results. I am still testing under > very heavy network interrupt load. With the change from 1.111, I > still get the microuptime messages about as often. But look how > much larger the reported backwards jumps are: > > microuptime() went backwards (896.225603 -> 888.463636) > microuptime() went backwards (896.225603 -> 888.494440) > microuptime() went backwards (896.225603 -> 888.500875) > microuptime() went backwards (1184.392277 -> 1176.603001) > microuptime() went backwards (1184.392277 -> 1176.603749) Another interesting thing is that the jumps are always 7.7x seconds back -- usually 7.79 seconds. This is even true with more data points from two different machines. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > > Can you try to MFC rev 1.111 and see if that changes anything ? That produced some interesting results. I am still testing under very heavy network interrupt load. With the change from 1.111, I still get the microuptime messages about as often. But look how much larger the reported backwards jumps are: microuptime() went backwards (896.225603 -> 888.463636) microuptime() went backwards (896.225603 -> 888.494440) microuptime() went backwards (896.225603 -> 888.500875) microuptime() went backwards (1184.392277 -> 1176.603001) microuptime() went backwards (1184.392277 -> 1176.603749) John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > > >Agreed. But in the cases I'm worrying about right now, the > >timecounter is the TSC. > > Now, *that* is very interesting, how reproducible is it ? I can reproduce it pretty easily. This is from a PIII/550: web1# uptime 5:23PM up 37 mins, 2 users, load averages: 0.08, 0.05, 0.01 web1# dmesg | grep microuptime microuptime() went backwards (1992.7347381 -> 1991.587620) microuptime() went backwards (1992.7347381 -> 1991.620385) microuptime() went backwards (1992.7347381 -> 1991.621582) microuptime() went backwards (2016.7063298 -> 2015.272466) microuptime() went backwards (2286.7346482 -> 2285.587587) microuptime() went backwards (2286.7346482 -> 2285.593646) microuptime() went backwards (2286.7346482 -> 2285.595103) microuptime() went backwards (2286.7346482 -> 2285.617023) Now, two things are unusual about this system. First, I have it pumping packets from inside the kernel (with a special netgraph node) at a rate which is keeping the system about 95% busy processing interrupts. (That's about 330K packets/second.) That definitely makes it happen more often. But I also see it occasionally when the system is sitting idle. Second, I have HZ set to 1. I think that also contributes to the problem. It shouldn't, though. Not even a PII/400 breathes hard at HZ=1. On another system which is mostly idle, I see it about 4 times a day. That system also has HZ set to 1. It's a PIII/1.13 GHz: microuptime() went backwards (200671.582473 -> 200671.582472) microuptime() went backwards (222622.358518 -> 222622.358517) microuptime() went backwards (228934.897839 -> 228934.897838) microuptime() went backwards (246166.386301 -> 246166.386300) microuptime() went backwards (263484.488575 -> 263484.488574) Notice the difference on this system that's fairly idle. On the other system, it was the seconds going backward. Here it's the microseconds. These may be two entirely different problems. > Can you try to MFC rev 1.111 and see if that changes anything ? Sure. I'll let you know. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >Agreed. But in the cases I'm worrying about right now, the >timecounter is the TSC. Now, *that* is very interesting, how reproducible is it ? Can you try to MFC rev 1.111 and see if that changes anything ? -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > >In article <[EMAIL PROTECTED]>, > >John Baldwin <[EMAIL PROTECTED]> wrote: > >> > >> > like, "If X is never locked out for longer than Y, this problem > >> > cannot happen." I'm looking for definitions of X and Y. X might be > >> > hardclock() or softclock() or non-interrupt kernel processing. Y > >> > would be some measure of time, probably a function of HZ and/or the > >> > timecounter frequency. > >> > >> X is hardclock I think, since hardclock() calls tc_windup(). > > > >That makes sense, but on the other hand hardclock seems unlikely to be > >delayed by much. The only thing that can block hardclock is another > >hardclock, an splclock, or an splhigh. And, maybe, splstatclock. I'm > >talking about -stable here, which is where I'm doing my experiments. > > Try swapping so you use the RTC for hardclock & statclock. > > Let the i8254 run with 65536 divisor and do only timecounter service. > > That would be a very interresting experiment. Agreed. But in the cases I'm worrying about right now, the timecounter is the TSC. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote: > In message <[EMAIL PROTECTED]>, John Polstra writes: > >That's the global variable named "timecounter", right? I did notice > >one potential problem: that variable is not declared volatile. So > >in this part ... > > This may be a problem, I have yet to see GCC make different code for > that but I should probably have committed the "volatile" anyway. It should be committed, but it is not causing the problem in this case. I changed it and then compared MD5s of the object files. The only changes that resulted were unimportant. > >I also noticed this in tco_forward(): > > > >tco = timecounter; > >tc = sync_other_counter(); > > [...] > >if (tco->tc_poll_pps) > > > >But sync_other_counter() loads its own copy of "timecounter", > >and there's no guarantee it hasn't changed from the value that > >tco_forward() saved in its local variable. I'm not sure yet if > >that's a potential problem. It could corrected by passing "tco" as > >an argument to sync_other_counter. I'll try that too. > > This code is actually correct, the tc_poll_pps needs to be done on > the "old" timecounter, because that would be the reference for any > captured hardware timestamps, if I did it on the new timecounter I > might get negative deltas which would complicate things. Also the > new timecounter may have a changed frequency/offset (tickadj/ntpd > and all that). I don't think I follow your reasoning here. If the call to sync_other_counter were inlined, we'd have something like this: tco = timecounter; tco_in_sync_other_counter = timecounter; [...] if (tco->tc_poll_pps) Obviously tco and tco_in_sync_other_counter will have the same value almost all of the time, so the code can't be relying on them being different. Anyway, I realize now that this also isn't the problem, because tco_forward is only ever called at splclock. It can't be interrupted or re-entered, at least not on the uniprocessor -stable systems I'm looking at. > There is one more failure mode which you have overlooked: The individual > timecounters maintain a binary counter of a certain width, if interrupt > latency gets too bad, this may overflow. > > This is a non-issue for the TSC, which is 64bit wide in hardware. Many of the systems where I see this problem are using the TSC as the timecounter. They don't have APM in the kernel, and they aren't running ntpd. I.e., it's not only the i8254 that's the problem. The fastest of these systems is a 1.13 GHz PIII, and it would take the 32 bits of the TSC which are actually used 3.8 seconds to wrap around. > Hope this helps... Yep, thanks. I have some ideas of other things to try. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >In article <[EMAIL PROTECTED]>, >John Baldwin <[EMAIL PROTECTED]> wrote: >> >> > like, "If X is never locked out for longer than Y, this problem >> > cannot happen." I'm looking for definitions of X and Y. X might be >> > hardclock() or softclock() or non-interrupt kernel processing. Y >> > would be some measure of time, probably a function of HZ and/or the >> > timecounter frequency. >> >> X is hardclock I think, since hardclock() calls tc_windup(). > >That makes sense, but on the other hand hardclock seems unlikely to be >delayed by much. The only thing that can block hardclock is another >hardclock, an splclock, or an splhigh. And, maybe, splstatclock. I'm >talking about -stable here, which is where I'm doing my experiments. Try swapping so you use the RTC for hardclock & statclock. Let the i8254 run with 65536 divisor and do only timecounter service. That would be a very interresting experiment. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, John Baldwin <[EMAIL PROTECTED]> wrote: > > > like, "If X is never locked out for longer than Y, this problem > > cannot happen." I'm looking for definitions of X and Y. X might be > > hardclock() or softclock() or non-interrupt kernel processing. Y > > would be some measure of time, probably a function of HZ and/or the > > timecounter frequency. > > X is hardclock I think, since hardclock() calls tc_windup(). That makes sense, but on the other hand hardclock seems unlikely to be delayed by much. The only thing that can block hardclock is another hardclock, an splclock, or an splhigh. And, maybe, splstatclock. I'm talking about -stable here, which is where I'm doing my experiments. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, "M. Warner Losh" writes: >In message: <[EMAIL PROTECTED]> >Poul-Henning Kamp <[EMAIL PROTECTED]> writes: >: But the i8254 is a piece of shit in this context, and due to >: circumstances (apm being enabled0 most machines end up using the >: i8254 by default. >: >: My (and I belive Bruce's) diagnosis so far is that most problems >: come from the i8254 timecounter. > >We measured pps interrupts with the i8254 timecounter in a fast >interrupt handler via the parallel port (yes, we hacked it to give us >a fast interrupt). We found lots of outliers on the order of a few >milliseconds in the data that we had to discard because they were >obviously bogus. We don't know if this is because of interrupt >latency or because of bugs in the 8254 timecounter code/hardware. At >the time, it wasn't important enough to do a detailed numerology on to >see if more data couldn't be mined from it or not. And the data that >we saw the outliers in was somewhat processed from the original >data... I have not tried to measure the i8254 against my hardware solution, but by now I belive that certain bogus chipsets may have bummed the 'latch' command or more than that maybe. Anyway, some, but not all of the i8254 issues could be eased up a bit if we lost the pcaudio crap and used the RTC's 128 Hz signal for Hz and let the i8254 run at a 65536 count all the time. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message: <[EMAIL PROTECTED]> Poul-Henning Kamp <[EMAIL PROTECTED]> writes: : But the i8254 is a piece of shit in this context, and due to : circumstances (apm being enabled0 most machines end up using the : i8254 by default. : : My (and I belive Bruce's) diagnosis so far is that most problems : come from the i8254 timecounter. We measured pps interrupts with the i8254 timecounter in a fast interrupt handler via the parallel port (yes, we hacked it to give us a fast interrupt). We found lots of outliers on the order of a few milliseconds in the data that we had to discard because they were obviously bogus. We don't know if this is because of interrupt latency or because of bugs in the 8254 timecounter code/hardware. At the time, it wasn't important enough to do a detailed numerology on to see if more data couldn't be mined from it or not. And the data that we saw the outliers in was somewhat processed from the original data... Warner To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In message <[EMAIL PROTECTED]>, John Polstra writes: >Mike Smith <[EMAIL PROTECTED]> wrote: >> >> It's not necessarily caused by interrupt latency. Here's the assumption >> that's being made. >[...] > >Thanks for the superb explanation! I appreciate it. My apologies for never getting the timecounter paper finished and published, I should really do that... >That's the global variable named "timecounter", right? I did notice >one potential problem: that variable is not declared volatile. So >in this part ... This may be a problem, I have yet to see GCC make different code for that but I should probably have committed the "volatile" anyway. >I also noticed this in tco_forward(): > >tco = timecounter; >tc = sync_other_counter(); > [...] >if (tco->tc_poll_pps) > >But sync_other_counter() loads its own copy of "timecounter", >and there's no guarantee it hasn't changed from the value that >tco_forward() saved in its local variable. I'm not sure yet if >that's a potential problem. It could corrected by passing "tco" as >an argument to sync_other_counter. I'll try that too. This code is actually correct, the tc_poll_pps needs to be done on the "old" timecounter, because that would be the reference for any captured hardware timestamps, if I did it on the new timecounter I might get negative deltas which would complicate things. Also the new timecounter may have a changed frequency/offset (tickadj/ntpd and all that). >> There are a couple of possible problems with this mechanism. >> >> One is that the ring "catches up" with your saved copy of the >> "current" pointer, ie. inbetween fetching the pointer and reading the >> timecounter contents, the "next" pointer passes over you again in such >> a fashion that you get garbage out of the structure. This is unlikely. >> Another is that there is a race between multiple updaters of the >> timecounter; if two parties are both updating the "next" timecounter >> along with another party trying to get the "current" time, this could >> cause corruption. I have worries in this respect with SMPng, but have yet to see a corrupted timecounter ring. There is one more failure mode which you have overlooked: The individual timecounters maintain a binary counter of a certain width, if interrupt latency gets too bad, this may overflow. This is a non-issue for the TSC, which is 64bit wide in hardware. It should also be a non-issue for the PIIX which is at least 24 bits in hardware. But the i8254 is a piece of shit in this context, and due to circumstances (apm being enabled0 most machines end up using the i8254 by default. My (and I belive Bruce's) diagnosis so far is that most problems come from the i8254 timecounter. I made a commit recently which made the core-code more robust to bad interrupt jitter/latency, basically it would return timestamps with too many microseconds or nanoseconds because it only tried to roll over to seconds ones. Now it while()'s over that loop. Hope this helps... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
On 04-Feb-02 John Polstra wrote: > In article <[EMAIL PROTECTED]>, > Dominic Marks <[EMAIL PROTECTED]> wrote: >> On Mon, Feb 04, 2002 at 01:21:25PM -0800, John Polstra wrote: >> > I'm trying to understand the timecounter code, and in particular the >> > reason for the "microuptime went backwards" messages which I see on >> > just about every machine I have, whether running -stable or -current. >> >> I see them everywhere with -CURRENT, but not at all with -STABLE. This is >> with two seperate machines. Perhaps that may add clues. > > I'm looking for something less empirical than that. When somebody > says this problem is caused by too much interrupt latency, I assume > they have a mental model of what is going wrong when this excessive > latency occurs. Given that, it should be possible to make a statement > like, "If X is never locked out for longer than Y, this problem > cannot happen." I'm looking for definitions of X and Y. X might be > hardclock() or softclock() or non-interrupt kernel processing. Y > would be some measure of time, probably a function of HZ and/or the > timecounter frequency. X is hardclock I think, since hardclock() calls tc_windup(). I'm not sure what Y is except that it is indeed a known value. phk should know as he is Mr. Timecounter. > John > -- > John Polstra > John D. Polstra & Co., Inc.Seattle, Washington USA > "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa -- John Baldwin <[EMAIL PROTECTED]> <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Mike Smith <[EMAIL PROTECTED]> wrote: > > It's not necessarily caused by interrupt latency. Here's the assumption > that's being made. [...] Thanks for the superb explanation! I appreciate it. > There is a ring of timecounter structures, of some size. In testing, > I've used sizes of a thousand or more, but still seen this problem. > > There is a pointer to the "current" timecounter structure. That's the global variable named "timecounter", right? I did notice one potential problem: that variable is not declared volatile. So in this part ... > When one wishes to read the current time, one proceeds as follows: > > - Get the "current" pointer and save it locally. > - Read the timecounter structure via the local "current" pointer. ... the compiler is perfectly free to reread the global multiple times in the function rather than using the saved local copy. If the "current" pointer has moved in that time, we'll an inconsistent view of the timecounter. In looking at the generated code I haven't found any actual instances of that. But I'll try making it volatile just to make sure. Even if it doesn't cause any problems currently, I think we should change it to volatile since it could start to cause problems some day. I also noticed this in tco_forward(): tco = timecounter; tc = sync_other_counter(); [...] if (tco->tc_poll_pps) But sync_other_counter() loads its own copy of "timecounter", and there's no guarantee it hasn't changed from the value that tco_forward() saved in its local variable. I'm not sure yet if that's a potential problem. It could corrected by passing "tco" as an argument to sync_other_counter. I'll try that too. > There are a couple of possible problems with this mechanism. > > One is that the ring "catches up" with your saved copy of the > "current" pointer, ie. inbetween fetching the pointer and reading the > timecounter contents, the "next" pointer passes over you again in such > a fashion that you get garbage out of the structure. As you mentioned, with a large enough ring this should be impossible. If I read the code correctly, the "current" pointer is only moved once per second. So in the current ring of 4 counters (number 0 is special), it would take 4 seconds to wrap around the ring. That's a pretty long time. > Another is that there is a race between multiple updaters of the > timecounter; if two parties are both updating the "next" timecounter > along with another party trying to get the "current" time, this could > cause corruption. > > All that interrupt latency will do is make the updates late; I can't > actually see how it could cause corruption. Corruption has to be > caused by mishandling of the timecounter ring in some fashion. I agree. > Note that you can probably eliminate the ring loop theory by > allocating a very large number of entries in the ring by setting > NTIMECOUNTER (kern/kern_tc.c) higher. The structures are small; try > 100,000 or so. OK, but even the thousand you tried should give a cushion of more than 16 minutes. > If you can reproduce under these circumstances, try adding some checks > to make sure the "current" timecounter pointer is behaving > monotonically; just save the last timecounter pointer in microtime() > et. al. > > Another test worth performing is to look at the tco_delta function for > the timecounter and make sure that it returns a sane value, and one > that doesn't behave out of synch with the interrupt handler that updates > the timecounter proper. If you save the delta value in the timecounter > and zero it when it's updated, you can catch this. > > You can rule this out by using getmicroptime() rather than > microuptime(); it may return the same value twice, which isn't > desirable, but that would be better than nothing. > > Hope this helps a bit. Yep, thanks again. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
> In article <[EMAIL PROTECTED]>, > Dominic Marks <[EMAIL PROTECTED]> wrote: > > On Mon, Feb 04, 2002 at 01:21:25PM -0800, John Polstra wrote: > > > I'm trying to understand the timecounter code, and in particular the > > > reason for the "microuptime went backwards" messages which I see on > > > just about every machine I have, whether running -stable or -current. > > > > I see them everywhere with -CURRENT, but not at all with -STABLE. This is > > with two seperate machines. Perhaps that may add clues. > > I'm looking for something less empirical than that. When somebody > says this problem is caused by too much interrupt latency, I assume > they have a mental model of what is going wrong when this excessive > latency occurs. It's not necessarily caused by interrupt latency. Here's the assumption that's being made. There is a ring of timecounter structures, of some size. In testing, I've used sizes of a thousand or more, but still seen this problem. There is a pointer to the "current" timecounter structure. When the "current" time is updated, the following procedure is followed: - Find the "next" timecounter in the ring. - Update its contents with the new current time. - Move the "current" pointer. When one wishes to read the current time, one proceeds as follows: - Get the "current" pointer and save it locally. - Read the timecounter structure via the local "current" pointer. Since the operations on the "current" pointer are atomic, there is no need to lock the structure. There are a couple of possible problems with this mechanism. One is that the ring "catches up" with your saved copy of the "current" pointer, ie. inbetween fetching the pointer and reading the timecounter contents, the "next" pointer passes over you again in such a fashion that you get garbage out of the structure. Another is that there is a race between multiple updaters of the timecounter; if two parties are both updating the "next" timecounter along with another party trying to get the "current" time, this could cause corruption. All that interrupt latency will do is make the updates late; I can't actually see how it could cause corruption. Corruption has to be caused by mishandling of the timecounter ring in some fashion. Note that you can probably eliminate the ring loop theory by allocating a very large number of entries in the ring by setting NTIMECOUNTER (kern/kern_tc.c) higher. The structures are small; try 100,000 or so. If you can reproduce under these circumstances, try adding some checks to make sure the "current" timecounter pointer is behaving monotonically; just save the last timecounter pointer in microtime() et. al. Another test worth performing is to look at the tco_delta function for the timecounter and make sure that it returns a sane value, and one that doesn't behave out of synch with the interrupt handler that updates the timecounter proper. If you save the delta value in the timecounter and zero it when it's updated, you can catch this. You can rule this out by using getmicroptime() rather than microuptime(); it may return the same value twice, which isn't desirable, but that would be better than nothing. Hope this helps a bit. Regards, Mike To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
In article <[EMAIL PROTECTED]>, Dominic Marks <[EMAIL PROTECTED]> wrote: > On Mon, Feb 04, 2002 at 01:21:25PM -0800, John Polstra wrote: > > I'm trying to understand the timecounter code, and in particular the > > reason for the "microuptime went backwards" messages which I see on > > just about every machine I have, whether running -stable or -current. > > I see them everywhere with -CURRENT, but not at all with -STABLE. This is > with two seperate machines. Perhaps that may add clues. I'm looking for something less empirical than that. When somebody says this problem is caused by too much interrupt latency, I assume they have a mental model of what is going wrong when this excessive latency occurs. Given that, it should be possible to make a statement like, "If X is never locked out for longer than Y, this problem cannot happen." I'm looking for definitions of X and Y. X might be hardclock() or softclock() or non-interrupt kernel processing. Y would be some measure of time, probably a function of HZ and/or the timecounter frequency. John -- John Polstra John D. Polstra & Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: A question about timecounters
On Mon, Feb 04, 2002 at 01:21:25PM -0800, John Polstra wrote: > I'm trying to understand the timecounter code, and in particular the > reason for the "microuptime went backwards" messages which I see on > just about every machine I have, whether running -stable or -current. I see them everywhere with -CURRENT, but not at all with -STABLE. This is with two seperate machines. Perhaps that may add clues. > This problem is usually attributed to too much interrupt latency. My > question is, how much latency is "too much"? Which interrupt has to > be locked out for how long in order to see these messages? > > John > > To Unsubscribe: send mail to [EMAIL PROTECTED] > with "unsubscribe freebsd-hackers" in the body of the message -- Dominic To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
A question about timecounters
I'm trying to understand the timecounter code, and in particular the reason for the "microuptime went backwards" messages which I see on just about every machine I have, whether running -stable or -current. This problem is usually attributed to too much interrupt latency. My question is, how much latency is "too much"? Which interrupt has to be locked out for how long in order to see these messages? John To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message