Re: Stop breaking the CSRNG
On Wed, Oct 02, 2019 at 11:36:55PM -0400, Theodore Y. Ts'o wrote:
> On Wed, Oct 02, 2019 at 06:55:33PM +0200, Kurt Roeckx wrote:
> >
> > But it seems people are now thinking about breaking getrandom() too,
> > to let it return data when it's not initialized by default. Please
> > don't.
>
> "It's complicated"
>
> The problem is that whether a CRNG can be considered secure is a
> property of the entire system, including the hardware, and given the
> large number of hardware configurations which the kernel and OpenSSL
> can be used, in practice, we can't assure that getrandom(2) is
> "secure" without making certain assumptions.

I'm not saying it's easy. But getrandom() is documented as only returning data after it has been initialized, which is an important property of that interface and the main reason to switch to it.

And it seems that someone's laptop hanging during boot because it can't find enough entropy is enough to break the security for everyone else. It seems that the only important thing is that applications don't stop working, because then it's clearly visible that something is not working. Returning data before the CRNG has been initialized isn't visibly broken, but it's just as broken, which is in my opinion worse.

> But if you assume that there is no hardware random number generator,
> and everything is driven from a single master oscillator, with no
> external input, and the CPU is utterly simple, with speculation or
> anything else that might be non-deterministic, AND if we assume that
> the idiots who make an IOT device use the same random seed across
> millions of devices all cloned off of the same master image, there
> is ***absolutely*** nothing the kernel can do to guarantee, with 100%
> certainty, that the CRNG will be initialized. (This is especially
> true if the idiots who design the IOT device call OpenSSL to generate
> their long-term private key the moment the device is first plugged in,
> before any networking device is brought on-line.)

And returning data before it's been initialized will only make that situation worse. We can only hope that by refusing to return data, the idiot will be forced to fix it properly. If the hardware can't provide the entropy, the kernel shouldn't just pretend the hardware did provide it.

> There really are no good choices here. The one thing which Linus has
> made very clear is that hanging at boot is Not Acceptable.

And I think it's not a kernel problem but a combination of hardware, configuration and user space problems. The kernel can of course be improved, and I'm sure it will be.

I wonder if it's useful to extend getrandom() with an option where the application can indicate that it doesn't care about security and just wants some number, like what /dev/urandom provides but as a system call. Another option could be a mode where you're happy to get data once an estimated 64 bits of entropy have been collected.

> And given that many users are just installing some kind of userspace
> jitter entropy to square this particular circle, even though I don't
> trust a jitter entropy scheme, even if it is insecure, we're also
> using RDRAND, and ultimately I'll trust RDRAND more than I trust a
> jitter entropy scheme. And that's where we are right now. Linus has
> introduced a simple in-kernel jitter entropy system

I don't trust it much either. And I think we should at least try to estimate how much entropy it actually provides on various systems, knowing that there will probably be systems where it provides much less than we think it does.

I'm willing to help analyze data if people can provide a list of the TSC values that are being added. The more samples the better. I think you want to collect them on an idle system.

Kurt
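To make the "visibly not ready" behaviour discussed above concrete, here is a minimal sketch, assuming Linux with a C library that ships <sys/random.h> (glibc 2.25 or later). It asks getrandom() for a single byte with GRND_NONBLOCK, so an uninitialized CRNG shows up as EAGAIN instead of a hang or silently weak output. The function name crng_is_ready() is made up for this illustration.

```c
#include <errno.h>
#include <sys/random.h>   /* getrandom(), GRND_NONBLOCK */
#include <sys/types.h>    /* ssize_t */

/*
 * Hypothetical helper: report whether the kernel CRNG is initialized.
 * Returns 1 if it is, 0 if it is not yet, and -1 on any other error
 * (errno is left as set by getrandom()).
 */
int crng_is_ready(void)
{
    unsigned char probe;
    ssize_t n;

    do {
        /* GRND_NONBLOCK: fail with EAGAIN instead of waiting for entropy. */
        n = getrandom(&probe, sizeof probe, GRND_NONBLOCK);
    } while (n < 0 && errno == EINTR);

    if (n == (ssize_t)sizeof probe)
        return 1;
    if (errno == EAGAIN)
        return 0;
    return -1;
}
```

A caller that gets 0 can report the problem or retry later; what it cannot do is mistake uninitialized output for good random data.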
Stop breaking the CSRNG
Hi,

As OpenSSL, we want cryptographically secure random numbers. Before getrandom(), Linux never provided a good API for that; both /dev/random and /dev/urandom have problems. getrandom() fixed that, so we switched to it where available.

It used to be possible to combine /dev/random and /dev/urandom and get something that worked properly. You could call select() on /dev/random and know that both were initialized when it returned. But then select() started returning before /dev/random was initialized, so that if you then switched to /dev/urandom, it could still be uninitialized. A workaround was to read 1 byte from /dev/random instead, and then switch to /dev/urandom. But that also stopped working: /dev/urandom can still be uninitialized at a point where you can read from /dev/random. So there is no longer a way to wait for /dev/urandom to be initialized.

As a result, we now refuse to use /dev/urandom on recent kernels, and require the use of getrandom(). (To make this work with older userspace, this means we need to import all the different __NR_getrandom defines, and do the system call ourselves.)

But it seems people are now thinking about breaking getrandom() too, to let it return data when it's not initialized by default. Please don't. If you think such a mode is useful for some applications, let them set a flag, instead of the reverse.

Kurt
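Doing the system call directly, as mentioned above for older userspace, could look roughly like the following. This is only a sketch and not OpenSSL's actual code: it assumes the kernel headers define SYS_getrandom, and the name fill_random() is invented for the example. With flags set to 0 the call blocks until the CRNG has been initialized, which is the property this message asks to keep.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_getrandom */
#include <unistd.h>       /* syscall() */

/*
 * Hypothetical helper: fill buf with len random bytes using the raw
 * getrandom system call. Returns 0 on success, -1 on error with errno set.
 */
int fill_random(unsigned char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        /* flags == 0: wait until the CRNG is initialized, then return data. */
        long n = syscall(SYS_getrandom, buf + done, len - done, 0u);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* e.g. ENOSYS on kernels without getrandom */
        }
        done += (size_t)n;      /* short reads are possible for large requests */
    }
    return 0;
}
```

An ENOSYS result is the point where a library has to decide between falling back to the device files and refusing to run.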
Re: temperature standard - global config option?
On Thu, Jun 21, 2001 at 12:18:12PM +0100, Jonathan Morton wrote:
> I've been taught by every Maths, Engineering and Physics
> teacher/lecturer I've encountered to write down significant figures
> consistent with the precision of the value. So blindly writing down
> a value of 59.42886726469 ±2°C is obviously ludicrous, even if that's
> what my calculator gives me. I should instead write 59 ±2°C, since
> that is the most precision I can possibly know it to. With some
> advanced measuring techniques it *may* be acceptable to write 59.43
> ±2°C *at most*, and then only if you really know why you need the
> extra information.

What they taught me in school is about the same. But the rule for the precision was to use two significant digits for the uncertainty. So you end up with 59.4 ± 2.0 °C or something.

Also note that you have to round the uncertainty up, so the way you wrote it, it couldn't have been 2.01 (that would round up to 3), but it could have been 1.01.

Kurt

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
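The rounding rule described above (two significant digits for the uncertainty, always rounded up, with the value printed to the same number of decimals) can be sketched in a few lines of C. The names struct rounded, round_uncertainty() and format_measurement() are invented for this example, "+/-" stands in for the ± sign, and uncertainties of 100 or more are simply rounded up to a whole number.

```c
#include <stdio.h>

struct rounded {
    int decimals;        /* digits to print after the decimal point */
    double uncertainty;  /* uncertainty rounded up to that many decimals */
};

/* Keep two significant digits of the uncertainty, rounding up. */
struct rounded round_uncertainty(double uncertainty)
{
    struct rounded r = { 0, uncertainty };
    double scale = 1.0;
    double scaled;
    long digits;

    if (uncertainty <= 0.0)
        return r;               /* nothing sensible to round */

    /* Shift the decimal point until two digits sit to its left. */
    while (uncertainty * scale < 10.0 && r.decimals < 15) {
        scale *= 10.0;
        r.decimals++;
    }

    /* Ceiling without libm: truncate, then bump if anything was cut off. */
    scaled = uncertainty * scale;
    digits = (long)scaled;
    if ((double)digits < scaled)
        digits++;

    r.uncertainty = (double)digits / scale;
    return r;
}

/* Writes e.g. "59.4 +/- 2.0" into out; returns what snprintf() returns. */
int format_measurement(char *out, size_t out_len, double value, double uncertainty)
{
    struct rounded r = round_uncertainty(uncertainty);

    return snprintf(out, out_len, "%.*f +/- %.*f",
                    r.decimals, value, r.decimals, r.uncertainty);
}
```

With the numbers from this thread, 59.42886726469 with an uncertainty of 2.0 is written as "59.4 +/- 2.0", and an uncertainty of 1.01 is rounded up to 1.1.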
Re: Break 2.4 VM in five easy steps
On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
>
> > For large memory boxes, this is ridiculous. Should I have 8GB of swap?
>
> Do I understand you correctly?
> ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> drives.

Maybe you really should reread the statements people made about this before. One of them being that if you're not using swap in 2.2, you won't need any in 2.4 either.

When 2.4 does use swap, it will use more of it. It now works more like other UNIX variants, where the rule is that swap = 2 * RAM.

That swap = 2 * RAM is just a guideline; you really should look at what applications you run, and how much memory they use. If you choose your RAM so that all applications can be in memory at all times, there is no need for swap. If they can't be, the rule might help you.

I think someone said that the swap should be large enough to hold all running applications in swap space, that is, in case you want to use swap.

Disk may be a lot cheaper than RAM, but it's also a lot slower.

Kurt
Re: [QUESTION] which routines must be re-entrant?
On Thu, May 31, 2001 at 04:01:34PM -0700, Dawson Engler wrote:
> Is there an easy way to tell which routines must be re-entrant?
> (it doesn't have to be exhaustive, even an incomplete set is useful)
>
> I was going to write a checker to make sure supposedly re-entrant
> routines actually were, but was having a hard time figuring out which
> ones were supposed to be...

There was a post on bugtraq a few days ago about this; it had a list of all system calls which are reentrant-safe under OpenBSD. The paper was about signals, and is available at http://razor.bindview.com/publish/papers/signals.txt

OpenBSD has a manpage which lists all the functions which should be safe to call from a signal handler. It might be a nice place to start. You should only look at those from section 2, of course.

http://www.openbsd.org/cgi-bin/man.cgi?query=sigaction

Kurt
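As a user-space illustration of why such a list matters, here is a minimal sketch of a signal handler that restricts itself to operations that are safe in that context: setting a volatile sig_atomic_t flag and calling write(2). The names got_signal, on_signal() and install_handler() are invented for the example; the handler deliberately avoids printf(), which is not safe to call from a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Written by the handler, polled by the main program. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    static const char msg[] = "signal received\n";
    int saved_errno = errno;    /* write() may clobber errno */
    ssize_t ignored;

    (void)signo;
    got_signal = 1;

    /* write(2) is async-signal-safe; stdio functions are not. */
    ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    errno = saved_errno;
}

/* Hypothetical helper: install on_signal() for signo; 0 on success. */
int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

A checker of the kind asked about could start from the same idea: flag any call made from a handler that is not on the documented safe list.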
Re: Potenitial security hole in the kernel
On Tue, May 29, 2001 at 01:30:30AM +0200, Kurt Roeckx wrote:
> You should never "return" from userspace to kernelspace. The
> only way to go from user space to kernel space should be by using
> a system call.

If you were able to return to kernel space, it would mean you were already running as the kernel. There is no reason to even do the return in that case.

Kurt
Re: Potenitial security hole in the kernel
On Tue, May 29, 2001 at 12:30:03AM +0200, Vadim Lebedev wrote:
> Kurt,
>
> Maybe i'm missing something but it seems that during execution of the signal
> handler, user mode stack contains kernel mode context...
> Hence the security hole

How this works is rather complicated. Both the user and the kernel stack are changed.

On the user stack we add a frame for the calling function, so the signal handler just looks like a function call. On the kernel stack we change the last frame so that we "return" from the kernel to the signal handler. The signal handler then "returns" to the place where the process did the system call.

You do not return to the kernel.

I hope this helps you understand things better.

Kurt
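The frame on the user stack can be observed from user space. The sketch below is an assumption-laden illustration, not a description of the kernel code: it installs a handler with SA_SIGINFO and checks that the saved context it is handed lives close to the handler's own stack frame, that is, on the ordinary user stack. The names inspect_context() and probe_signal_frame() are made up, the 64 KiB threshold is arbitrary, and the check assumes no alternate signal stack is in use.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Set by the handler: 1 if the saved context sits near the handler's frame. */
static volatile sig_atomic_t context_near_handler_frame = 0;

static void inspect_context(int signo, siginfo_t *info, void *ucontext)
{
    int local;                              /* lives in the handler's own frame */
    uintptr_t frame = (uintptr_t)&local;
    uintptr_t ctx = (uintptr_t)ucontext;    /* saved user-mode register state */
    uintptr_t distance = ctx > frame ? ctx - frame : frame - ctx;

    (void)signo;
    (void)info;

    /* Arbitrary threshold: "same stack, a few frames away". */
    context_near_handler_frame = distance < 65536;
}

/* Hypothetical helper: returns 1 if the context was found on the user stack. */
int probe_signal_frame(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = inspect_context;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (raise(SIGUSR1) != 0)
        return -1;
    return context_near_handler_frame;
}
```

What is stored there is the interrupted user-mode state of the process itself, so the handler can see nothing it could not already reach as ordinary user code.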
Re: Potenitial security hole in the kernel
On Mon, May 28, 2001 at 11:43:38PM +0200, Vadim Lebedev wrote:
> Hi folks,
>
> Please correct me if i'm wrong but it seems to me that i've stumbled on
> really BIG security hole in the signal handling code.
> The problem IMO is that the signal handling code stores a processor context
> on the user-mode stack frame which is active while

And how is that different from any other function call?

Kurt
Re: Kernel bug with UNIX sockets not detecting other end gone?
On Fri, May 18, 2001 at 09:02:51PM +0100, Alan Cox wrote:
> > What I'm seeing however in an other program is that select says I
> > can read from the socket, and that read returns 0, with errno set
> > to EAGAIN. I call select() again, which returns and says I can read
>
> No no no. If the read does not return -1 it does not change errno. EOF isnt
> an error.

Of course, how stupid of me.

Kurt
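The rule quoted above, that errno only means something when read() returned -1, can be captured in a small wrapper. This is a sketch with invented names (enum read_status, read_once()); it should be called with len greater than 0, since a zero-length read also returns 0.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

enum read_status { READ_DATA, READ_EOF, READ_WOULD_BLOCK, READ_ERROR };

/*
 * Hypothetical helper: one read() attempt, classified by its return value.
 * errno is consulted only when read() returned -1.
 */
enum read_status read_once(int fd, void *buf, size_t len, ssize_t *nread)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);

    *nread = n;

    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;        /* end of file: errno is stale, ignore it */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;
    return READ_ERROR;
}
```

With this, a stale EAGAIN left over from an earlier call cannot be mistaken for the state of a read that actually hit end of file.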
Re: Kernel bug with UNIX sockets not detecting other end gone?
On Thu, May 17, 2001 at 11:57:45PM +0100, Chris Evans wrote:
>
> Hi,
>
> I wonder if the following is a bug? It certainly differs from FreeBSD 4.2
> behaviour, which gives the behaviour I would expect.
>
> The following program blocks indefinitely on Linux (2.2, 2.4 not tested).
> Since the other end is clearly gone, I would expect some sort of error
> condition. Indeed, FreeBSD gives ECONNRESET.

I'm having a similar problem, but somehow can't recreate it.

The difference is that I set the sockets to non-blocking, and expect read() to return some error. When the other end is gone, read() returns 0 with errno set to 0; if the socket is still open, it returns -1 with errno set to 11 (EAGAIN). I can understand those behaviours.

What I'm seeing, however, in another program is that select() says I can read from the socket, and that read() returns 0 with errno set to EAGAIN. I call select() again, which returns and says I can read from that socket ..., and this keeps going on. It stops the moment there is any I/O on another socket.

Is there any way to detect that the other side is gone without using write()? write() should return EPIPE. Should I be able to detect it with read(), or some other system call?

Kurt
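One way to answer the closing question without consuming data or calling write() is to peek at the socket. The following is a sketch for stream sockets only, assuming a system that supports the MSG_DONTWAIT flag (Linux and the BSDs do); the name peer_has_closed() is invented. A return of 0 from recv() means the peer has closed or shut down its sending side. Data that is still queued is reported first, so the function answers "not closed" until that data has been read.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper for stream sockets.
 * Returns 1 if the peer has closed (end of file is next),
 *         0 if the connection is still open or data is still pending,
 *        -1 on a real error such as ECONNRESET (errno is set).
 */
int peer_has_closed(int fd)
{
    char probe;
    ssize_t n;

    do {
        /* MSG_PEEK leaves any data in place; MSG_DONTWAIT never blocks. */
        n = recv(fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT);
    } while (n == -1 && errno == EINTR);

    if (n == 0)
        return 1;
    if (n > 0)
        return 0;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;
    return -1;
}
```

In a select() loop, a socket that keeps being reported as readable while this returns 1 has reached end of file and should be closed and removed from the read set.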
Re: Athlon possible fixes
On Sat, May 05, 2001 at 06:26:30PM +0200, Rogier Wolff wrote:
>
> As all this is trying to avoid bus turnarounds (i.e. switching from
> reading to writing), wouldn't it be fastest to just trust that the CPU
> has at least 4k worth of cache? (and hope for the best that we don't
> get interrupted in the meanwhile).
>
> void copy_page (char *dest, char *source)
> {
>         long *dst = (long *)dest,
>              *src=(long *)source,
>              *end= (long *)(source+PAGE_SIZE);
> #if 1
>         register int i;
>         long t=0;
>         static long tt;
>
>         for (i=0;i<PAGE_SIZE/sizeof (long);i += cache_line_size()/sizeof(long))
>                 /* Actually the innards of this loop should be:
>                         (void) from[i];
>                    however, the compiler will probably optimize that away. */
>                 t += src[i];
>
>         tt = t;
> #endif
>         while (src < end)
>                 *dst++ = *src++;
>
> }
>
> So, this is 15 lines of C, and it'd be interesting to benchmark this
> against the assembly.
>
> I'm assuming that the "loop variable handling" is not going to
> influence the overall performance: that would run at 500 - 1000MHz,
> and around 1 clock cycle (1-2ns) per loop. Set this against the stalls
> against the memory unit whose output buffer is full, and memory writes
> that take on the order of 30 ns per 64bits.

Can't you use volatile to prevent the compiler from optimizing it away?

Kurt
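What the volatile suggestion could look like is sketched below. This is a user-space illustration, not kernel code: PAGE_BYTES and CACHE_LINE_BYTES are assumed constants standing in for PAGE_SIZE and cache_line_size(), the name copy_page_prefetched() is invented, and both buffers are assumed to be suitably aligned for long. Reading through a pointer to volatile forces the compiler to perform each load, so the dummy sum and the static variable from the quoted code are no longer needed.

```c
#include <stddef.h>

#define PAGE_BYTES       4096u  /* assumed page size */
#define CACHE_LINE_BYTES 64u    /* assumed cache line size */

/* Touch one word per cache line first, then copy the whole page. */
void copy_page_prefetched(void *dest, const void *source)
{
    const volatile long *touch = (const volatile long *)source;
    const long *src = (const long *)source;
    long *dst = (long *)dest;
    size_t i;

    /* Read phase: each volatile access must be emitted as a real load. */
    for (i = 0; i < PAGE_BYTES / sizeof(long); i += CACHE_LINE_BYTES / sizeof(long))
        (void)touch[i];

    /* Write phase: the source page should now be in the cache. */
    for (i = 0; i < PAGE_BYTES / sizeof(long); i++)
        dst[i] = src[i];
}
```

Whether this actually beats the assembly version is exactly the benchmark question raised in the quoted message; the sketch only shows that the loads can be kept without a side effect.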
Re: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks
On Sat, Apr 14, 2001 at 04:42:54PM +0200, Kurt Roeckx wrote:
> While running 2.4.3, I saw the following message a few times:
>
> KERNEL: assertion (tp->lost_out == 0) failed at
> tcp_input.c(1202):tcp_remove_reno_sacks

I've been running tcpdump for some time, and got the message 2 more times today.

Apr 19 19:05:17 thunderbird kernel: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks
Apr 19 19:07:18 thunderbird kernel: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks

I'm going to start with the second one, because there was a lot less traffic at that time.

19:07:17.571150 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: . 1921:3141(1220) ack 1811 win 5680 (len 1240, hlim 64)
19:07:17.571163 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: P 3141:3341(200) ack 1811 win 5680 (len 220, hlim 64)
19:07:17.572431 3ffe:401:0:1::16:2 > 3ffe:80c0:220::b: icmp6: too big 1280 (len 1240, hlim 63)
19:07:17.645807 3ffe:8010:91::26.2237 > 3ffe:80c0:220::b.6667: S [tcp sum ok] 2268475160:2268475160(0) win 32660 <mss 1420,sackOK,timestamp 54007992 0,nop,wscale 0> (len 40, hlim 61)
19:07:17.816319 3ffe:1001:211:80:baba:beba:deca:ceca.33258 > 3ffe:80c0:220::b.6667: . [tcp sum ok] 290:290(0) ack 14134 win 34160 (len 20, hlim 60)
19:07:18.186433 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060 > 3ffe:80c0:220::b.6667: . [tcp sum ok] 1811:1811(0) ack 3341 win 15620 (len 20, hlim 59)
19:07:18.186465 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: . 3341:4561(1220) ack 1811 win 5680 (len 1240, hlim 64)
19:07:18.886979 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060 > 3ffe:80c0:220::b.6667: . [tcp sum ok] 1811:1811(0) ack 4561 win 17040 (len 20, hlim 59)
19:07:18.887047 3ffe:80c0:220::b.6667 > 3ffe:400:290:100:2a0:c9ff:feaa:635e.1060: P 4561:4761(200) ack 1811 win 5680 (len 220, hlim 64)
19:07:19.236653 3ffe:8010:14::1:dead:beef.3207 > 3ffe:80c0:220::b.6667: S [tcp sum ok] 2702352776:2702352776(0) win 31680 <mss 1440,sackOK,timestamp 113753265 0,nop,wscale 0> (len 40, hlim 60)

As you can see, during that second there was only traffic on one connection.

Some part of the tcpdump around the time of the first:

19:05:16.783871 3ffe:8010:7:43:1000:dead:dead:2.3292 > 3ffe:80c0:220::b.6667: P [tcp sum ok] 134:152(18) ack 1104 win 31520 (len 38, hlim 60)
19:05:16.783923 3ffe:80c0:220::b.6667 > 3ffe:8010:7:43:1000:dead:dead:2.3292: . [tcp sum ok] 3321:3321(0) ack 152 win 5680 (len 20, hlim 64)
19:05:16.849145 3ffe:400:680::::15.1117 > 3ffe:80c0:220::b.6667: . [tcp sum ok] 124:124(0) ack 38670 win 32660 (len 20, hlim 61)
19:05:16.921394 3ffe:8060:100::26:2 > 3ffe:80c0:220::b: icmp6: too big 1280 (len 1240, hlim 63)
19:05:16.972044 3ffe:8191::2.1044 > 3ffe:80c0:220::b.6667: . [tcp sum ok] 73:73(0) ack 8784 win 17040 (len 20, hlim 60)
19:05:16.972143 3ffe:80c0:220::b.6667 > 3ffe:8191::2.1044: P 8784:8984(200) ack 73 win 5680 (len 220, hlim 64)
19:05:17.030129 3ffe:80c0:220::b.6667 > 3ffe:b00:4011:a::3.1880: P 76:1163(1087) ack 213 win 5680 (len 1107, hlim 64)
19:05:17.062691 3ffe:80c0:220::b.6667 > 3ffe:b00:4011:a::3.1880: P 1163:2383(1220) ack 213 win 5680 (len 1240, hlim 64)
19:05:17.097973 3ffe:80c0:220::b. > 3ffe:1200:3028:82ca:4:4:4:6.2160: P 205:819(614) ack 256 win 5680 (len 634, hlim 64)
19:05:17.098080 3ffe:80c0:220::b. > 3ffe:8114:2000:1d0::4.2856: P 3811:4198(387) ack 85 win 5680 (len 407, hlim 64)
19:05:17.098135 3ffe:80c0:220::b.6667 > 3ffe:400:680::::15.1117: . 38670:40090(1420) ack 124 win 5680 (len 1440, hlim 64)
19:05:17.098151 3ffe:80c0:220::b.6667 > 3ffe:400:680::::15.1117: P 40090:40197(107) ack 124 win 5680 (len 127, hlim 64)
19:05:17.098860 3ffe:80c0:220::b.6667 > 3ffe:80e8:140:200::1.3899: P 158:1049(891) ack 85 win 5680 (len 911, hlim 64)
19:05:17.106040 3ffe:80c0:220::b.6667 > 3ffe:80c0:220::19.1998: P 5851:6543(692) ack 475 win 5680 (len 712, hlim 64)
19:05:17.108239 3ffe:80c0:220::b.4126 > 3ffe:1001:340::6.113: S [tcp sum ok] 2552352896:2552352896(0) win 5680 <mss 1420,sackOK,timestamp 11315112 0,nop,wscale 0> (len 40, hlim 64)
19:05:17.258572 3ffe:401:0:1::16:2 > 3ffe:80c0:220::b: icmp6: too big 1280 (len 1240, hlim 63)
19:05:17.258633 3ffe:80c0:220::b.6667 > 3ffe:400:680::::15.1117: . 38670:39890(1220) ack 124 win 5680 (len 1240, hlim 64)
19:05:17.321612 3ffe:8010:7:43:1000:dead:dead:2.3292 > 3ffe:80c0:220::b.6667: P 152:244(92) ack 1104 win 31520 (len 112, hlim 60)
19:05:17.321636 3ffe:80c0:220::b.6667 > 3ffe:8010:7:43:1000:dead:dead:2.3292: . [tcp sum ok] 3321:3321(0) ack 244 win 5680 (len 20, hlim 64)
19:05:17.364448 3ffe:80c0:220::b.6667 > 3ffe:400:680:11:::aa15.3452: P [tcp sum ok] 770:789(19) ack 67 win 5680 (len 39, hlim 64)
19:05:17.370740 3ffe:400:680:11:::aa15.3452 > 3ffe:80c0:220::b.6667: P [tcp sum ok] 51:67(16) ack 770 win 48800 (len 36, hlim 60)
19:05:17.370761 3ffe:80c0:220::b.6667 > 3ffe:400:680:11:::aa15.3452: . [tcp sum ok] 789:789(0) ack 67 win
Re: Athlon problem report summary
On Mon, Apr 16, 2001 at 01:30:14PM +0100, Alan Cox wrote:
>
> 2. 'My athlon box is fine until I am swapping' {and using DMA}
>
> Compiler independant, CPU version independant. All victims have a VIA chipset.
> This one may be linked to the reported problems with VIA PCI. Two of the
> reporters found disabling IDE DMA fixed this one

That's interesting. I have had no problems at all.

hda and hdc are using udma33 here. hdb contains the swap, and gets this error on boot:

hdb: Conner Peripherals 340MB - CP30344, ATA DISK drive
hdb: set_drive_speed_status: status=0x51 { DriveReady SeekComplete Error }
hdb: set_drive_speed_status: error=0x04 { DriveStatusError }
ide0: Drive 1 didn't accept speed setting. Oh, well.
[...]
[same error message]
hdb: 670320 sectors (343 MB) w/64KiB Cache, CHS=665/16/63, DMA

hdparm -iv /dev/hdb output:

/dev/hdb:
 multcount = 0 (off)
 I/O support = 1 (32-bit)
 unmaskirq = 1 (on)
 using_dma = 1 (on)
 keepsettings = 0 (off)
 nowerr = 0 (off)
 readonly = 0 (off)
 readahead = 8 (on)
 geometry = 665/16/63, sectors = 670320, start = 0

 Model=Conner Peripherals 340MB - CP30344, FwRev=6FT1.67, SerialNo=BQB2B7G
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=665/16/63, TrkSize=40887, SectSize=649, ECCbytes=4
 BuffType=3(DualPortCache), BuffSize=64kB, MaxMultSect=64, MultSect=off
 DblWordIO=no, OldPIO=1, DMA=yes, OldDMA=1
 CurCHS=665/16/63, CurSects=980418570, LBA=no
 DMA modes: *mword0

Hope this helps.

Kurt

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks
On Sat, Apr 14, 2001 at 04:42:54PM +0200, Kurt Roeckx wrote:
> While running 2.4.3, I saw the following message a few times:
>
> KERNEL: assertion (tp->lost_out == 0) failed at
> tcp_input.c(1202):tcp_remove_reno_sacks

Nobody seems to be interested in fixing this bug?

Anyway, I was looking at some statistics of the box, which I think might be related to this problem.

netstat -s shows this under TCP:

Tcp:
    11681 active connections openings
    0 passive connection openings
    84689 failed connection attempts
    0 connection resets received
    94 connections established
    10963047 segments received
    11476087 segments send out
    392891 segments retransmited
    772 bad segments received.
    24083 resets sent

It seems it has to retransmit 3.4% of the TCP segments, which is rather high.

The box has only been up for 10 days, which means it has to retransmit about .45 segments / second, and the rate seems to be going up.

I hope this helps. If there is anything else I can do, please ask.

Kurt
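For anyone who wants to check those two figures, here is a small sketch of the arithmetic. The counters are copied from the netstat -s output above; treating the uptime as exactly 10 days is an assumption, since the message only says "10 days".

```python
# Counters copied from the netstat -s output quoted in the message.
segments_sent = 11476087
segments_retransmitted = 392891

# Assumption: the uptime is taken as exactly 10 days.
uptime_seconds = 10 * 24 * 60 * 60

# Share of all sent segments that were retransmissions.
retransmit_ratio = segments_retransmitted / segments_sent

# Average number of retransmitted segments per second of uptime.
retransmits_per_second = segments_retransmitted / uptime_seconds

print(f"retransmitted: {retransmit_ratio:.1%} of segments sent")
print(f"average rate: {retransmits_per_second:.2f} retransmits/second")
```

This is only an average over the whole uptime, so it cannot show whether the rate is actually going up; that would need two samples of the counters taken some time apart.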
Re: Athlon runtime problems
On Sat, Apr 14, 2001 at 04:12:09PM +0100, Alan Cox wrote:
> Can the folks who are seeing crashes running athlon optimised kernels all mail
> me

Just trying to provide you with useful info. I'm NOT seeing any crashes at all.

> - CPU model/stepping

processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 4
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 807.190
cache size : 256 KB

> - Chipset

VIA KT133/KM133 (An Asus A7V)

> - Amount of RAM

128 MiB

> - /proc/mtrr output

No support for it compiled in.

> - compiler used

gcc version 2.95.3 20010315 (release) (compiled myself)

Kurt
KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks
While running 2.4.3, I saw the following message a few times:

KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks

Is that bad, or should I just ignore it?

Kurt
Re: SW-RAID0 Performance problems
On Sat, Apr 14, 2001 at 11:38:06AM +0200, Andreas Peter wrote:
> On Saturday, 14 April 2001 09:04, David Rees wrote:
>
> > OK, so it's not the RAID setup. There's two things that can cause this.
> > One is that DMA is turned off (what does hdparm /dev/hda and hdparm
> > /dev/hdc show?), the second was that the drives are on the same channel
> > (which obviously isn't the case here). Can you verify that the drives are
> > in DMA mode?
>
> hdparm /dev/hda
>
> /dev/hda:
> multcount= 16 (on)
> I/O support = 0 (default 16-bit)
> unmaskirq= 0 (off)
> using_dma= 1 (on)

Does turning unmaskirq on help?

Kurt
Re: Let init know user wants to shutdown
On Wed, Apr 11, 2001 at 01:38:30AM +0200, Kurt Roeckx wrote:
> On Tue, Apr 10, 2001 at 11:20:24PM +, Miquel van Smoorenburg wrote:
> >
> > the shutdown scripts
> > include "kill -15 -1; sleep 2; kill -9 -1". The "-1" means
> > "all processes except me". That means init will get hit with
> > SIGTERM occasionally during shutdown, and that might cause
> > weird things to happen.
>
> -1 mean everything but init.

Oh, maybe you mean killall5 -TERM? That would send a SIGTERM to all processes except the ones in its own session.

(Hey look, you wrote that manpage.)

Kurt
Re: Let init know user wants to shutdown
On Tue, Apr 10, 2001 at 11:20:24PM +, Miquel van Smoorenburg wrote:
>
> the shutdown scripts
> include "kill -15 -1; sleep 2; kill -9 -1". The "-1" means
> "all processes except me". That means init will get hit with
> SIGTERM occasionally during shutdown, and that might cause
> weird things to happen.

-1 means everything but init.

From the manpage:

    If pid equals -1, then sig is sent to every process except
    for the first one, from higher numbers in the process table
    to lower.

And later:

    BUGS
    It is impossible to send a signal to task number one, the
    init process, for which it has not installed a signal handler.
    This is done to assure the system is not brought down
    accidentally.

Kurt
Re: Linux Kernel IRC Room?
On Wed, Mar 28, 2001 at 08:37:49PM -0500, Alexander Valys wrote:
> Is there a kernel development irc room anywhere? If not, does anyone think
> it might be useful?

I'd just like to point out that it's called a channel, not a room. Room is a term introduced by AOL, and I don't think it has much to do with IRC.

Kurt
Re: Linux 2.4.2ac11
On Sat, Mar 03, 2001 at 07:32:13PM +, Alan Cox wrote:
>
> 2.4.2-ac11
> o Add ALi15x3 to the list of isa dma hangs(Angelo Di Filippo)

What does this mean?

Kurt
OOPS with 2.4.1-ac8
I suddenly started to get these oopses. They didn't seem to cause any problems, though.

I hope this output from ksymoops is useful.

Kurt

ksymoops 2.3.7 on i586 2.4.1-ac8. Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.4.1-ac8/ (default)
 -m /System.map (specified)

Error (regular_file): read_ksyms stat /proc/ksyms failed
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Feb 11 15:04:01 Q kernel: c013d4a6
Feb 11 15:04:01 Q kernel: Oops:
Feb 11 15:04:01 Q kernel: CPU:0
Feb 11 15:04:01 Q kernel: EIP:0010:[d_lookup+98/252]
Feb 11 15:04:01 Q kernel: EFLAGS: 00010213
Feb 11 15:04:01 Q kernel: eax: c3fe8dc8 ebx: ffe8 ecx: 001a edx: 00704020
Feb 11 15:04:01 Q kernel: esi: edi: c3fca320 ebp: esp: c1437f04
Feb 11 15:04:01 Q kernel: ds: 0018 es: 0018 ss: 0018
Feb 11 15:04:01 Q kernel: Process sh (pid: 13077, stackpage=c1437000)
Feb 11 15:04:01 Q kernel: Stack: c1437f68 c3fca320 c1437fa4 c3fe8dc8 c2957005 00704020 0005
Feb 11 15:04:01 Q kernel: c01359f2 c3fca320 c1437f68 c1437f68 c0136161 c3fca320 c1437f68
Feb 11 15:04:01 Q kernel: c2957000 c1437fa4 080b91c4 c2957000 080b91c4 c013581b 0009
Feb 11 15:04:01 Q kernel: Call Trace: [cached_lookup+14/80] [path_walk+1337/1944] [getname+91/152] [__user_walk+58/84] [sys_newstat+21/108] [system_call+51/64]
Feb 11 15:04:01 Q kernel: Code: 8b 6d 00 8b 54 24 18 39 53 48 75 76 8b 44 24 24 39 43 0c 75
Using defaults from ksymoops -t elf32-i386 -a i386

Code; Before first symbol <_EIP>:
Code; Before first symbol 0: 8b 6d 00 mov 0x0(%ebp),%ebp
Code; 0003 Before first symbol 3: 8b 54 24 18 mov 0x18(%esp,1),%edx
Code; 0007 Before first symbol 7: 39 53 48 cmp %edx,0x48(%ebx)
Code; 000a Before first symbol a: 75 76 jne 82 <_EIP+0x82> 0082 Before first symbol
Code; 000c Before first symbol c: 8b 44 24 24 mov 0x24(%esp,1),%eax
Code; 0010 Before first symbol 10: 39 43 0c cmp %eax,0xc(%ebx)
Code; 0013 Before first symbol 13: 75 00 jne 15 <_EIP+0x15> 0015 Before first symbol

Feb 11 15:05:01 Q kernel: c013d4a6
Feb 11 15:05:01 Q kernel: Oops:
Feb 11 15:05:01 Q kernel: CPU:0
Feb 11 15:05:01 Q kernel: EIP:0010:[d_lookup+98/252]
Feb 11 15:05:01 Q kernel: EFLAGS: 00010213
Feb 11 15:05:01 Q kernel: eax: c3fe8dc8 ebx: ffe8 ecx: 001a edx: 00704020
Feb 11 15:05:01 Q kernel: esi: edi: c3fca320 ebp: esp: c1437f04
Feb 11 15:05:01 Q kernel: ds: 0018 es: 0018 ss: 0018
Feb 11 15:05:01 Q kernel: Process sh (pid: 13079, stackpage=c1437000)
Feb 11 15:05:01 Q kernel: Stack: c1437f68 c3fca320 c1437fa4 c3fe8dc8 c2c41005 00704020 0005
Feb 11 15:05:01 Q kernel: c01359f2 c3fca320 c1437f68 c1437f68 c0136161 c3fca320 c1437f68
Feb 11 15:05:01 Q kernel: c2c41000 c1437fa4 080b91c4 c2c41000 080b91c4 c013581b 0009
Feb 11 15:05:01 Q kernel: Call Trace: [cached_lookup+14/80] [path_walk+1337/1944] [getname+91/152] [__user_walk+58/84] [sys_newstat+21/108] [system_call+51/64]
Feb 11 15:05:01 Q kernel: Code: 8b 6d 00 8b 54 24 18 39 53 48 75 76 8b 44 24 24 39 43 0c 75

Code; Before first symbol <_EIP>:
Code; Before first symbol 0: 8b 6d 00 mov 0x0(%ebp),%ebp
Code; 0003 Before first symbol 3: 8b 54 24 18 mov 0x18(%esp,1),%edx
Code; 0007 Before first symbol 7: 39 53 48 cmp %edx,0x48(%ebx)
Code; 000a Before first symbol a: 75 76 jne 82 <_EIP+0x82> 0082 Before first symbol
Code; 000c Before first symbol c: 8b 44 24 24 mov 0x24(%esp,1),%eax
Code; 0010 Before first symbol 10: 39 43 0c cmp %eax,0xc(%ebx)
Code; 0013 Before first symbol 13: 75 00 jne 15 <_EIP+0x15> 0015 Before first symbol

Feb 11 15:06:01 Q kernel: c013d4a6
Feb 11 15:06:01 Q kernel: Oops:
Feb 11 15:06:01 Q kernel: CPU:0
Feb 11 15:06:01 Q kernel: EIP:0010:[d_lookup+98/252]
Feb 11 15:06:01 Q kernel: EFLAGS: 00010213
Feb 11 15:06:01 Q kernel: eax: c3fe8dc8 ebx: ffe8 ecx: 001a edx: 00704020
Feb 11 15:06:01 Q kernel: esi: edi: c3fca320 ebp: esp: c1437f04
Feb 11 15:06:01 Q kernel: ds: 0018 es: 0018 ss: 0018
Feb 11 15:06:01 Q kernel: Process sh (pid: 13081, stackpage=c1437000)
Feb 11 15:06:01 Q kernel: Stack: c1437f68 c3fca320 c1437fa4 c3fe8dc8 c1603005 00704020 0005
Feb 11 15:06:01 Q kernel: c01359f2 c3fca320 c1437f68 c1437f68 c0136161 c3fca320 c1437f68
Feb 11
Re: Linux 2.4.1-ac7
On Thu, Feb 08, 2001 at 08:12:39PM -0200, Rik van Riel wrote:
> On Thu, 8 Feb 2001, Alan Cox wrote:
>
> > ftp://ftp.kernel.org/pub/linux/kernel/people/alan/2.4/
> >
> > 2.4.1-ac7
> > o Rebalance the 2.4.1 VM (Rik van Riel)
> > | This should make things feel a lot faster especially
> > | on small boxes .. feedback to Rik
>
> I'd really like feedback from people when it comes to this
> change. The change /should/ fix most paging performance bugs
> because it makes kswapd do the right amount of work in order
> to solve the free memory shortage every time it is run.

I just tested ac8.

If I run this test, the system gets really slow. It takes about a second between the time I press a key and the time it appears on the screen. The load goes way up. Everything seems to block.

This is a box with 64 MB of RAM and 32 MB of swap. There isn't much running on the box while doing this, only dnetc.

It starts to get slow once the process reaches about 70 MB. Then you really hear the disk work. It ended up at about 75 MB, where it got killed by the OOM killer. (For once it killed the right thing!)

I ran vmstat 1 while doing this, and have attached the output.

It ran for several minutes. The process itself took about 1 minute of CPU time, and so did kswapd. It took at least 5 minutes of real time.

I once did just the same with 2.4.0. It took more like 30 minutes then, and I ended up killing the process myself.
Kurt

procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
1 0 0 16580 1808 1200 43336 0 0 834 130 25 93 2 5
1 0 0 16704 1804 1200 43460 0 000 1164 100 0 0
1 0 0 16704 1804 1200 43460 0 000 1204 100 0 0
1 0 0 16784 1808 1200 43540 0 000 1026 99 1 0
1 0 0 16904 1808 1200 43660 0 000 1017 99 1 0
1 0 0 16988 1808 1200 43744 0 000 130 32 99 1 0
1 0 0 16988 1808 1200 43744 0 000 136 18 100 0 0
1 0 0 17152 1808 1200 43908 0 000 110 14 100 0 0
1 0 0 17152 1808 1200 43908 0 003 110 10 100 0 0
2 0 0 17112 1428 1204 43960 0 0 3844 142 63 98 2 0
2 0 0 17268 1428 1204 43016 0 0 1300 133 46 100 0 0
2 0 0 17344 1428 1204 41980 0 0 1280 110 47 97 3 0
2 0 0 17548 1428 1204 41104 0 0 1290 116 48 98 2 0
2 0 0 17604 1428 1204 40056 0 0 1280 102 45 97 3 0
2 0 0 17820 1428 1208 39188 0 0 1290 106 49 97 3 0
2 0 0 17972 1428 1208 38288 0 0 1280 108 48 99 1 0
2 0 0 18136 1428 1208 37352 0 0 1290 104 46 95 5 0
2 0 0 18140 1428 1208 36296 0 0 1280 107 49 98 2 0
2 0 0 18140 1428 1208 35212 0 0 1290 108 54 96 4 0
2 0 0 18140 1428 1208 34108 0 0 1280 105 44 99 1 0
2 0 0 18140 1428 1208 33208 0 000 126 47 96 4 0
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
2 0 0 18116 1456 1168 32144 0 0 1290 127 49 96 4 0
2 0 0 18116 1368 1100 31236 0 0 1280 110 46 97 3 0
2 0 0 18752 1304 1016 30944 0 0 1291 107 51 97 3 0
2 0 0 18752 2020 908 29944 0 0 1280 109 49 100 0 0
2 0 0 18752 1296 904 29596 0 0 1290 104 45 95 5 0
2 0 0 18752 1300 880 28788 0 248 128 62 106 42 97 3 0
2 1 0 18752 1236 880 27864 0 1896 73 531 162 52 95 5 0
2 0 0 18752 948 880 27056 0 0 1840 128 43 96 4 0
2 0 0 18752 948 856 26008 0 000 105 41 98 2 0
2 0 0 19688 980 840 26024 0 1360 129 340 133 50 97 3 0
2 0 0 19688 948 828 25028 0 624 128 156 114 46 98 2 0
2 0 0 20628 948 812 25008 0 192 129 48 108 45 95 5 0
2 0 0 23420 948 784 26956 0 1140 128 285 125 49 97 3 0
2 0 0 23464 948 676 26604 0 0 1300 105 49 99 1 0
2 0 0 23560 948 564 25780 0 820 128 214 132 43 95 5 0
2 1 0 23840 948 560 25036 0 244 73 61 109 47 96 4 0
2 0 0 23940 948 556 24184 0 1768 56 445 152 61 94 6 0
2 0 0 23928 948 540 23144 0 0 1430 111 55 98 2 0
2 0 0 24168 948 532 22352 0 164 129 41 106 48 96 4 0
2 0 0 29044 948 504 26240 0 120 128 30 105 42 97 3 0
2 0 0 31476 948 412 28256 0 316 129 79 126 48 96 4 0
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
3 0 0 31476 948 416 27300 0 1580 193 413 269 68 97 3 0
2 0 0 32580 948
Re: gprof cannot profile multi-threaded programs
On Tue, Jan 30, 2001 at 11:31:13PM -0600, Mohit Aron wrote:
> I analyzed the problem to be the following. Linux uses periodic SIGPROF signals
> for profiling (Linux doesn't use the profil system call used in other OS's like
> Solaris where the kernel does the profiling on behalf of the process). All
> profile information is collected in the context of the signal handler for the
> SIGPROF signal in Linux. Unfortunately, any thread that's created using
> pthread_create() does not get these periodic SIGPROF signals. Hence any thread
> other than the first thread is not profiled. The fix is to use setitimer()
> system call immediately in the thread startup function for any new thread to
> make the SIGPROF signal to be delivered at the designated interrupt frequency
> (every 10ms). With this fix, the profile produced by gprof reflects the overall
> computation done by all threads in the process. A more general fix would be
> to fix the kernel to make any new threads inherit the setitimer() settings
> for the parent thread.

You have the same problem when doing fork(). Only the parent will get CPU usage info. I have to call setitimer() in the child too, to make it work properly.

I complained about it a few days ago, but haven't gotten a reply yet.

Kurt
setitimer() and fork()
I'm having a problem when I try to profile a program that fork()s.

The problem is that it does count how many times I'm in a function, but
nothing seems to use any CPU time at all. If I call
setitimer(ITIMER_PROF, ...) again after the fork, it works as expected.
fork() doesn't seem to copy the timer(s).

On other OSes, I don't seem to have to do this.

I'm having this problem with both 2.2 and 2.4. I think it used to work
in older versions.

Is this a bug, or is this intentional?

Kurt
Re: `rmdir .` doesn't work in 2.4
On Mon, Jan 08, 2001 at 11:50:44PM +0100, [EMAIL PROTECTED] wrote:
> From: Andrea Arcangeli <[EMAIL PROTECTED]>
>
> > But in fact it fails with EINVAL, and
> >
> > [EINVAL]: The path argument contains a last component that is dot.
>
> I can't confirm. The specs I'm checking are here:
>
> http://www.opengroup.org/onlinepubs/007908799/xsh/rmdir.html
>
> That is the SUSv2 text, one of the ingredients for the new
> POSIX standard. I quoted the current Austin draft, the current
> draft for the next version of the POSIX standard.
>
> Quoting a text fragment:
>
> The rmdir( ) function shall remove a directory whose name is given by
> path. The directory is removed only if it is an empty directory.
> If the directory is the root directory or the current working
> directory of any process, it is unspecified whether the function
> succeeds, or whether it shall fail and set errno to [EBUSY].
> If path names a symbolic link, then rmdir( ) shall fail and
> set errno to [ENOTDIR]. If the path argument refers to a path
> whose final component is either dot or dot-dot, rmdir( ) shall
> fail. ...

At the bottom of Andrea Arcangeli's URL, it says:
Derived from the POSIX.1-1988 standard.

I think it makes sense that if POSIX changed it, we should follow
POSIX and not SUSv2, especially if it simplifies things in the kernel.

Kurt
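The behaviour under discussion is easy to probe from userspace. The following is a minimal sketch, assuming nothing beyond rmdir() itself; the helper name rmdir_trailing_dot() is hypothetical. It only reports which errno a given system returns for a path whose final component is dot and makes no claim about which value a particular kernel version uses.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Call rmdir() on "<dir>/." and return the resulting errno, or 0 if the
 * call unexpectedly succeeded. Returns ENAMETOOLONG if the combined
 * path does not fit in the local buffer. */
int rmdir_trailing_dot(const char *dir)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/.", dir);

    if (n < 0 || (size_t)n >= sizeof path)
        return ENAMETOOLONG;
    if (rmdir(path) == 0)
        return 0;
    return errno;
}
```

Running it against an empty scratch directory and printing strerror() of the result shows whether the system follows the quoted Austin draft text (fail with [EINVAL]) or does something else.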
Re: [PATCH] new bug report script
On Sun, Jan 07, 2001 at 08:43:12PM +1100, Brett wrote:
>
> Taking a guess here
>
> strings /lib/libc* | grep "release version"
>
> I'm not sure how reliable this method is either :)

That returns nothing here.

I do find this in it:
"@(#) The Linux C library 5.4.46"

Kurt
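Scraping strings out of the library binary depends on what each libc happens to embed, as the reply shows. On glibc there is a runtime call for this, gnu_get_libc_version(), declared in <gnu/libc-version.h>. The sketch below assumes glibc and falls back to a placeholder elsewhere; it would not have helped on the libc5 system in this message, and the wrapper name libc_version_string() is hypothetical.

```c
#include <stdio.h>              /* any libc header; defines __GLIBC__ on glibc */
#ifdef __GLIBC__
#include <gnu/libc-version.h>   /* gnu_get_libc_version() */
#endif

/* Return the C library version as a string when it can be determined
 * at run time, or a fixed placeholder when not running on glibc. */
const char *libc_version_string(void)
{
#ifdef __GLIBC__
    return gnu_get_libc_version();
#else
    return "unknown (not glibc)";
#endif
}
```

A bug report script written in shell would still need a separate fallback for non-glibc systems.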
Re: 500 ms offset in i386 Real Time Clock setting
On Sat, Jan 06, 2001 at 11:35:52AM -0800, [EMAIL PROTECTED] wrote:
> Neither hwclock nor the /dev/rtc driver takes the following comment from
> set_rtc_mmss() in arch/i386/kernel/time.c into account. As a result, using
> hwclock --systohc or --adjust always leaves the Hardware Clock 500 ms ahead of
> the System Clock:

I mailed a patch to Andries Brouwer yesterday for exactly the same
problem when hwclock writes to the CMOS directly, and said to check
whether other places have the same problem.

I added a usleep() of 500 ms in cmos.c.

Kurt
Re: Happy new year^H^H^H^Hkernel..
On Tue, Jan 02, 2001 at 03:51:34AM +0100, Gerold Jury wrote:
> The ISDN changes for the HISAX drivers
> that came in since test12 have introduced a bug that causes a
> AIEE-something and a complete kernel hang when i hangup the isdn line.
> I have reversed the patch for all occurences of INIT_LIST_HEAD in the
> isdn patch part and it works for me now.
>
> The relevant part is attached. Please back it out for 2.4.0.

I'm using the hisax driver too (built in), and it works perfectly for me.

Kurt
Re: Linus's include file strategy redux
On Fri, Dec 15, 2000 at 12:14:04AM +, Miquel van Smoorenburg wrote:
> In article <[EMAIL PROTECTED]>,
> LA Walsh <[EMAIL PROTECTED]> wrote:
> >Which works because in a normal compile environment they have /usr/include
> >in their include path and /usr/include/linux points to the directory
> >under /usr/src/linux/include.
>
> No, that a redhat-ism.
>
> Sane distributions simply include a known good copy of
> /usr/src/linux/include/{asm,linux} verbatim in their libc6-dev package.

The glibc FAQ still has this in it:

2.17. I have /usr/include/net and /usr/include/scsi as symlinks
into my Linux source tree. Is that wrong?

{PB} This was necessary for libc5, but is not correct when using glibc.
Including the kernel header files directly in user programs usually does not
work (see question 3.5). glibc provides its own <net/*> and <scsi/*> header
files to replace them, and you may have to remove any symlink that you have
in place before you install glibc. However, /usr/include/asm and
/usr/include/linux should remain as they were.

It's the version that's in CVS; I just did a cvs update. It's been
in there for ages.

If it's wrong, someone *please* correct it.

Kurt
chroot [Was: Re: Linux 2.2.18pre21]
On Thu, Nov 16, 2000 at 11:52:49AM -0800, jesse wrote:
> On Thu, Nov 16, 2000 at 05:16:18PM +0100, Andrea Arcangeli wrote:
> > On Thu, Nov 16, 2000 at 03:07:04PM +0100, Matthias Andree wrote:
> > > It shows a program that saves the cwd -- open(".",...) in an open file,
> > > then chroots [..]
> >
> > This is known behaviour (I know Alan knows about it too), solution is to close
> > open directories filedescriptors before chrooting.
> >
> > Everything that happens before chroot(2) is trusted, so it's secure to rely
> > on it to close directories first.
> >
> > If this is not well documented and people doesn't know about it and so they
> > writes unsafe code that's another issue...
>
> But the problem is because you can call chroot when you're already chrooted.

Only if you're root. There are other ways to break out of a chroot() if
you're root too.

Kurt
Re: sb.o support in 2.4-broken?
On Wed, Nov 08, 2000 at 12:53:18PM -0800, Jim Bonnet wrote:
> I am using the 2.4.0-test10 kernel. I have a sound blaster 16 which
> works fine under 2.2.17.
>
> I see that a while back someone posted on this problem previously but
> there were no answers I can find..
>
> Is support for soundblaster16 ISA broken in the 2.4 kernel? Compiled in
> or used as a module I can not get it to work. I have passed sb=220,5,1,5
> during boot when compiled in and also sent those during insmod.

Use sb=0x220,5,1,5

Kurt
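Why the 0x prefix matters can be shown with a userspace analogue. The sketch below uses strtoul() with base 0, which treats a "0x" prefix as hexadecimal and a bare number as decimal. It is an assumption that the kernel's option parsing follows the same prefix convention; the helper name parse_io_port() is hypothetical.

```c
#include <stdlib.h>

/* Parse an I/O port number with prefix-based radix detection:
 * "0x..." is hexadecimal, a leading "0" is octal, anything else is
 * decimal. */
unsigned long parse_io_port(const char *s)
{
    return strtoul(s, NULL, 0);
}
```

With this convention "220" is read as decimal 220 (0xdc), while "0x220" gives 544, the port address 0x220 used in the sb= option above.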
_stext and _etext in 2.2.18pre20
I once complained about the s390 port not compiling because _stext had
conflicting types. You seem to have changed include/asm/irq.h then,
adding [] to it, like it is in kernel/ksyms.c.

I just did a little grepping, and saw this:

./init/main.c:extern char _stext, _etext;
./kernel/ksyms.c:extern char _stext[], _etext[];
./include/asm-s390/irq.h:extern char _stext[];
./arch/i386/kernel/irq.h:extern char _stext, _etext;
./arch/alpha/kernel/traps.c:extern unsigned long _stext, _etext;
./arch/alpha/kernel/irq.h:extern char _stext;
./arch/sparc/kernel/sun4d_smp.c:extern int _stext;
./arch/sparc/kernel/sun4m_smp.c:extern int _stext;
./arch/sparc/mm/btfixup.c:extern unsigned int _stext[], _end[], __start___ksymtab[], __stop___ksymtab[];
./arch/sparc/ap1000/timer.c:extern int _stext;
./arch/mips/kernel/traps.c: extern char _stext, _etext;
./arch/mips/kernel/time.c: extern int _stext;
./arch/ppc/mm/init.c:extern char etext[], _stext[];
./arch/m68k/kernel/time.c: extern int _stext;
./arch/sparc64/kernel/smp.c:extern int _stext;
./arch/arm/kernel/setup.c:extern int _stext, _text, _etext, _edata, _end;
./arch/arm/kernel/time.c: extern int _stext;
./arch/arm/mm/init.c:extern char _etext, _stext, _edata, __bss_start, _end;

As you can see, most of them don't have the [], but some do. Others are
(still?) signed or unsigned, ints or longs.

I think all of them should be pointers; it doesn't make much sense for
them to be a char, although most are just chars.

Doing the same in 2.4.0-test10 shows about the same.

What should be done with this?

Kurt
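The same question comes up in userspace with the linker-provided symbols etext, edata and end (see end(3)), so the array form can be illustrated there. This is a sketch under the assumption of a Linux toolchain whose linker provides those three symbols; the helper names text_end_address() and bss_size() are hypothetical, and the kernel's own _stext and _etext are not used here.

```c
#include <stdint.h>

/* Linker-provided symbols: only their addresses are meaningful, there
 * is no object of a particular type stored at them. Declaring them as
 * arrays of unknown size makes the bare name evaluate to the address.
 * With a scalar declaration such as "extern char etext;" the same
 * address would have to be written as &etext, and mixing the two forms
 * in one translation unit gives a conflicting-types error. */
extern char etext[], edata[], end[];

/* Address of the first byte past the text segment. */
uintptr_t text_end_address(void)
{
    return (uintptr_t)etext;
}

/* Number of bytes between the end of initialized data and the end of
 * the uninitialized data (bss). */
uintptr_t bss_size(void)
{
    return (uintptr_t)end - (uintptr_t)edata;
}
```

Whichever form is chosen, using one declaration everywhere is what prevents the kind of type conflict that broke the s390 build.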
conflicting types for `mktime' in userspace programs using libc5
When trying to compile something using libc5 with the 2.4.0-test10
kernel, I get this:

/usr/include/time.h:85: conflicting types for `mktime'
/usr/include/linux/time.h:69: previous declaration of `mktime'

A simple diff is attached.

--- include/linux/time.h~	Fri Nov 3 20:22:14 2000
+++ include/linux/time.h	Fri Nov 3 20:21:22 2000
@@ -46,6 +46,7 @@
 value->tv_sec = jiffies / HZ;
 }
+#ifdef __KERNEL__
 /* Converts Gregorian date to seconds since 1970-01-01 00:00:00.
  * Assumes input in normal date format, i.e. 1980-12-31 23:59:59
  * => year=1980, mon=12, day=31, hour=23, min=59, sec=59.
@@ -78,6 +79,7 @@
 )*60 + min /* now have minutes */
 )*60 + sec; /* finally seconds */
 }
+#endif
 struct timeval {
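For readers who want to see what the guarded helper computes, here is a userspace sketch of the Gregorian-date-to-seconds conversion that the diff comments describe. The function body is a reconstruction based on those comment fragments and is an assumption, not text copied from the patch; the name gregorian_to_seconds() is hypothetical and was chosen so that it does not clash with the libc mktime() declared in <time.h>, which is exactly the clash the #ifdef __KERNEL__ guard avoids.

```c
#include <time.h>   /* libc's mktime(struct tm *), the declaration that clashed */

/* Convert a Gregorian date to seconds since 1970-01-01 00:00:00.
 * Input is in normal date format, e.g. 1980-12-31 23:59:59 is
 * year=1980, mon=12, day=31, hour=23, min=59, sec=59. */
unsigned long gregorian_to_seconds(unsigned int year, unsigned int mon,
                                   unsigned int day, unsigned int hour,
                                   unsigned int min, unsigned int sec)
{
    /* Map months 1..12 to 11,12,1..10 so that February, which holds
     * the leap day, becomes the last month of the shifted year. */
    if (0 >= (int)(mon -= 2)) {
        mon += 12;
        year -= 1;
    }

    return (((
             (unsigned long)(year / 4 - year / 100 + year / 400
                             + 367 * mon / 12 + day)
             + year * 365 - 719499
            ) * 24 + hour   /* now have hours */
           ) * 60 + min     /* now have minutes */
          ) * 60 + sec;     /* finally seconds */
}
```

Because the function has a different name and signature from the libc one, it compiles next to <time.h> without any guard; the kernel header needs the guard only because it reuses the name mktime.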
Re: [ANNOUNCE] Withdrawl of Open Source NDS Project/NTFS/M2FS for Linux
On Tue, Sep 05, 2000 at 05:30:46PM +0200, Ingo Molnar wrote:
>
> On Tue, 5 Sep 2000, Jeff V. Merkey wrote:
>
> > A kernel debugger will reduce development costs. No one cares what's
> > underneath, [...]
>
> this is the point where IMO your argument gets flawed, and where you are
> apparently ignoring our arguments. With utmost respect, we *do* care about
> what's underneath. The health of what's underneath fueled and fuels the
> growth of Linux. I'd like to repeat it again: we cannot optimize for both
> the speed of FS driver development (your goal - correct me if i got it
> wrong), and long term health of the kernel proper itself (our goal),
> because in some areas (such as the inclusion of high complexity debugging
> facilities) they do contradict. (mostly they dont contradict) Long term
> health has priority - i cannot put it any simpler for you. Anyone who
> thinks driver development is the most important for Linux then thats a
> pretty shortsighted view IMO.

A (better?) kernel debugger could help (certain) people to improve the
long-term health, because they can't (or don't want to) use what's
available, or just think they can't easily do it with the existing tools.

It could help certain people who are used to debugging a problem in
certain ways to debug it, where they have no idea how to do it with the
current tools, because nobody explained to them how to use those tools
for that problem.

That debugger shouldn't be part of the main kernel, especially if it's
so complex.

Trying to get the debugger to work might even find problems you wouldn't
easily find otherwise.

Kurt