Re: Version of XFree86 in FreeBSD Release 4.4
David O'Brien wrote:
> On Sun, Sep 23, 2001 at 04:05:27PM +0200, Cyrille Lefevre wrote:
> > David O'Brien wrote:
> > > On Mon, Sep 17, 2001 at 05:42:23PM -0700, Jordan Hubbard wrote:
> > > > We're still waiting for 4.0's support footprint to widen a bit more before subjecting people to it by default. Hopefully by 4.5.
> > > Are you really considering using XFree86 4.x in FreeBSD-4.5? When I asked you about this in the past, you had said you wanted to keep the same X in RELENG_4 (presumably to not rock the boat mid-branch).
> > Isn't it possible to provide both versions and let the user choose between them, with an information box describing the problems found in one or the other?
> There are issues for the pre-compiled packages due to differences between the two versions of XFree86.

What kind of issues? I'm using both XFree86-4 and ports in package form (pre-compiled stuff) without any problems.

Cyrille.
-- 
Cyrille Lefevre mailto:[EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: Version of XFree86 in FreeBSD Release 4.4
On Mon, Sep 24, 2001 at 08:56:08AM +0200, Cyrille Lefevre wrote:
> what kind of issues ? I'm using both XFree86-4 and ports in package form (pre-compiled stuffs) w/o any problems.

Please RTF /usr/ports/Mk/bsd.port.mk and look at what XFREE86_VERSION does.

-- 
-- David ([EMAIL PROTECTED])
Second set of stable buildworld results w/ vmiodirenable nameileafonly combos
Ok, here is the second set of results. I didn't run all the tests because nothing I did appeared to really have much of an effect.

In this set of tests I set MAXMEM to 128M. As you can see the buildworld took longer versus 512M (no surprise), and vmiodirenable still helped versus otherwise. If one takes into consideration the standard deviation, the directory vnode reclamation parameters made absolutely no difference in the tests.

The primary differentiator in all the tests is 'block input ops'. With vmiodirenable turned on it sits at around 51000. With it off it sits at around 56000. In the 512M tests the pass-1 numbers were 26000 with vmiodirenable turned on and 33000 with it off. Pass-2 numbers were 9000 with it on and 18000 with it off. The directory leaf reuse parameters had almost no effect on either the 128M or 512M numbers.

I'm not sure why test2 wound up doing a better job than test1 in the 128M tests with vmiodirenable disabled. Both machines are configured identically with only some extra junk on test1's /usr from prior tests. In any case, the differences point to a rather significant error spread in regards to possible outcomes, at least with vmiodirenable=0.

My conclusion from all of this is:

* vmiodirenable should be turned on by default.
* We should rip out the cache_purgeleafdirs() code entirely and use my simpler version to fix the vnode-growth problem.
* We can probably also rip out my cache_leaf_test() .. we do not need to add any sophistication to reuse only directory vnodes without subdirectories in the cache. If it had been a problem we would have seen it.

I can leave the sysctls in place on the commit to allow further testing, and I can leave it conditional on vmiodirenable. I'll set the default vmiodirenable to 1 (which will also enable directory vnode reuse) and the default nameileafonly to 0 (i.e. to use the less sophisticated check). In a few weeks I will rip out nameileafonly and cache_leaf_test().

-Matt

WIDE TERMINAL WINDOW REQUIRED!
--- TEST SUITE 2 (128M ram)

buildworld of -stable. DELL2550 (Dual PIII-1.2GHz / 128M ram (via MAXMEM) / SCSI)
23 September 2001
SMP kernel, softupdates-enabled, dirpref'd local /usr/src (no nfs), make -j 12 buildworld
UFS_DIRHASH. 2 identical machines tested in parallel (test1, test2)
/usr/bin/time -l timings
note: atime updates left enabled in all tests

REUSE LEAF DIR VNODES: directory vnodes with no subdirectories in the namei cache can be reused
REUSE ALL DIR VNODES:  directory vnodes can be reused (namei cache ignored)
DO NOT REUSE DIR...:   (Poul's original 1995 algo) directory vnode can only be reused if no subdirectories or files in the namei cache

I stopped bothering with pass-2 after it became evident that the numbers were not changing significantly.

VMIODIRENABLE ENABLED (REBOOT between each column group)

                            [ A ]                     [ B ]                     [ C ]
                         [BEST CASE]               [BEST CASE]               [BEST CASE]
machine             test1 test2 test1 test2   test1 test2 test1 test2   test1 test2 test1 test2
pass (2)                1     1     2     2       1     1     2     2       1     1     2     2
vfs.vmiodirenable       1     1     1     1       1     1     1     1       1     1     1     1
vfs.nameileafonly       1     1     1     1       0     0     0     0      -1    -1    -1    -1
                    REUSE LEAF DIR VNODES     REUSE ALL DIR VNODES      DO NOT REUSE DIR VNODES W/ACTIVE NAMEI

                    26:49  26:30  26:41  26:24
real                 1609   1590   1601   1584
user                 1361   1354   1361   1356
sys                   617    615    617    614
max resident        16264  16256  16260  16264
avg shared mem       1030   1030   1030   1030
avg unshared data    1004   1005   1006   1004
avg unshared stack    129    129    129    129
page reclaims      11.16M 11.16M 11.15M 11.15M
page faults          3321   3674   2940   2801
swaps                   0      0      0      0
block input ops     51748  51881  50777  50690
block output ops     5532   6497   5680   6089
messages sent       35847  35848  35789  35715
messages received   35848  35852  35792
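To put the 'block input ops' figures from the summary side by side, here is a minimal Python sketch. It uses only the rounded numbers quoted in the message; the helper name `reduction_pct` is made up for illustration and is not part of any FreeBSD tool.

```python
def reduction_pct(off, on):
    """Percentage drop in block input ops going from vmiodirenable=0 to 1."""
    return 100.0 * (off - on) / off


# Rounded 'block input ops' figures from the message:
# label -> (vmiodirenable off, vmiodirenable on)
figures = {
    "128M": (56000, 51000),
    "512M pass-1": (33000, 26000),
    "512M pass-2": (18000, 9000),
}

for label, (off, on) in figures.items():
    print(f"{label}: {reduction_pct(off, on):.1f}% fewer block input ops")
```

On these rounded inputs the relative benefit is smallest in the 128M runs and largest in the 512M pass-2 runs.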
Re: Disk based file system cache
On Mon, Sep 24, 2001 at 01:07:00PM +0200, Attila Nagy wrote:
> Hello,
>
> I'm just curious: is it possible to set up an NFS server and a client where the client has very big (28 GB maximum for FreeBSD?) swap area on multiple disks and caches the NFS exported data on it? This could save a lot of bandwidth on the NFS server and also reduces load on that.

I'm not familiar with nfsiod (was it?), but I think that NFS runs in kernel mode and uses kernel malloc(9) memory for caching. And kernel memory is quite different from user space memory ... Correct me if I'm wrong.

Even if it worked, you would possibly get REAL problems due to synchronisation. If your client machines are Linux, Solaris or (;-)) FreeBSD, you can set up CODA from the ports collection; it's much more suitable for this.

Peter
dump files too large, nodump related??
Hi,

I have noticed some strange behaviour with 4.3-RELEASE and dump. I have been dumping my filesystems through gzip into a compressed dumpfile. Some of the resulting dumps have been MUCH larger than I would expect.

As an example, I have just dumped my /home partition. Note that lots of directories on this partition are marked nodump, e.g. /home/ftp, which is one of the biggest users of diskspace.

Building 8 level dump of /home and writing it to /var/dumps//home8.gz (gzipped)
  DUMP: Date of this level 8 dump: Mon Sep 24 21:13:55 2001
  DUMP: Date of last level 1 dump: Tue Sep 18 20:15:43 2001
  DUMP: Dumping /dev/ad0s1h (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 360780 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 30.76% done, finished in 0:11
  DUMP: 60.89% done, finished in 0:06
  DUMP: DUMP: 360664 tape blocks
  DUMP: finished in 849 seconds, throughput 424 KBytes/sec
  DUMP: level 8 dump on Mon Sep 24 21:13:55 2001
  DUMP: DUMP IS DONE

The GZIPPED dumpfile is 289 MB!!!

I wrote a little perl script to check the table of contents and estimate how big the dump should be (see attached) and this gives an interesting result.

doorway:~ proj/dumpsize/dumpsize.pl /home /var/dumps/home8.gz
Level 8 dump of /home on doorway.home.lan:/dev/ad0s1h
Label: none
The level 0 dump of /home partition written to /var/dumps/home8.gz
contains 689 files totalling 146450 KB, cf size of dumpfile = 282063 ( 360660 ) KB

The following files are larger than 1024 KB in size:
  163264   ./mark/.netscape/xover-cache/host-news/athome.aus.service.snm
  1343488  ./mark/.netscape/xover-cache/host-news/athome.aus.support.snm
  2097152  ./mark/.netscape/xover-cache/host-news/athome.aus.users.linux.snm
  1754819  ./mark/.netscape/xover-cache/host-news/hostinfo.dat
  1122336  ./samba/profile.9x/mark/USER.DAT
  1441792  ./samba/profile.9x/tuija/History/History.IE5/index.dat
  92440996 ./tuija/Mail/Archive/Sent Items 2001
  2985510  ./tuija/My Documents/gas1.JPG
  2528914  ./tuija/My Documents/gas2.JPG

The interesting thing here is that the sum of all the file sizes in the dump is only 147MB cf the 361MB uncompressed dump size!!! This is a discrepancy of 210MB. (This would line up with the 180MB ISO image plus other dribs and drabs that I have stored in a nodump flagged directory since my last dump.)

Any ideas of what is wrong? Are the nodumped files stored on the dump for some reason (even though they don't appear in the restore table of contents)?

Regards/Mark

dumpsize.pl
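The size discrepancy can be checked directly from the two totals the script printed. This is a minimal Python sketch using only the figures in the message; the function name `unexplained_kb` is hypothetical and has nothing to do with the attached dumpsize.pl.

```python
def unexplained_kb(dump_kb, files_kb):
    """KB in the uncompressed dump not accounted for by the listed files."""
    return dump_kb - files_kb


FILES_KB = 146450   # "689 files totalling 146450 KB"
DUMP_KB = 360660    # uncompressed size of the dumpfile, in KB

extra = unexplained_kb(DUMP_KB, FILES_KB)
print(f"unaccounted for: {extra} KB (about {extra / 1024:.0f} MB)")
print(f"dump is {DUMP_KB / FILES_KB:.2f}x the size of the files it lists")
```

The gap is in the same range as the roughly 210MB quoted in the message, which is what makes the nodump-flagged ISO image a plausible suspect.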
Re: Disk based file system cache
On Mon, Sep 24, 2001 at 01:07:00PM +0200, Attila Nagy wrote:
> I'm just curious: is it possible to set up an NFS server and a client where the client has very big (28 GB maximum for FreeBSD?) swap area on multiple disks and caches the NFS exported data on it? This could save a lot of bandwidth on the NFS server and also reduces load on that.

This would really be more than NFS is supposed to do. There are other filesystems which can do this sort of thing - I think Coda might be one of them.

David.
Re: Boot proccess
Hello,

> | In short, which program gives enough knowledge to the microprocessor (?)
> | and allow him to use kern.flp mfsroot.flp in order to boot and make the
> | operating system running.
>
> your BIOS reads the first sector from your floppy, which consists of a boot loader, which usually loads the 2nd step boot loader, and this one loads the kernel.

Tell me if I am wrong, but from the floppy, the files kern.flp and mfsroot.flp are compressed and then uncompressed into memory. If so, that means that the FreeBSD box is running these programs from RAM and not from the floppy, right?

If so, is it possible to do the same but from a hard disk instead of the floppy?
RE: Disk based file system cache
As a side note, Irix and Solaris provide cachefs for this purpose and use NFS filesystems as examples (other examples may include CD-ROM, etc).

Charles

-----Original Message-----
From: David Malone [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 24, 2001 8:26 AM
To: Attila Nagy
Cc: [EMAIL PROTECTED]
Subject: Re: Disk based file system cache

On Mon, Sep 24, 2001 at 01:07:00PM +0200, Attila Nagy wrote:
> I'm just curious: is it possible to set up an NFS server and a client where the client has very big (28 GB maximum for FreeBSD?) swap area on multiple disks and caches the NFS exported data on it? This could save a lot of bandwidth on the NFS server and also reduces load on that.

This would really be more than NFS is supposed to do. There are other filesystems which can do this sort of thing - I think Coda might be one of them.

David.
Re: ipfw and dummynet
Thanks for the responses. As expected, it was an operator head space problem: I did not understand how the default queues and bw would make ping look. Apparently, enough delay is introduced merely by adding a pipe that the ping client times out waiting for the response. The response was actually returning, which became visible when I upped the timeout. I also didn't realize that the counters reflect the input to the pipe and not the output, which is why I didn't see any change when I added a bw clamp.

Luigi Rizzo wrote:
> hi, can you show me the output of
>
>   ipfw show
>
> and
>
>   ipfw pipe show
>
> Reading your questions, i have the feeling you are doing something wrong in the commands. For the last one, the client will keep generating its stream of data, it is just after going through the pipe that you will see the limitation in effect.
>
> cheers
> luigi
> --+-
> Luigi RIZZO, [EMAIL PROTECTED] . ACIRI/ICSI (on leave from Univ. di Pisa)
> http://www.iet.unipi.it/~luigi/ . 1947 Center St, Berkeley CA 94704
> Phone (510) 666 2927 .
> --+-
>
> > I tried questions for this but no answer.
> >
> > I am attempting to use ipfw and dummynet to instrument some network traffic tests. I am running freebsd 4.3 release and have built the kernel with ipfirewall, dummynet, and default to enabled. For a simple test, I added a pipe: ipfw add pipe 1 icmp from any to any. When I ping this machine, I can do ipfw pipe 1 show and watch the counters increment, but the machine doing the pinging does not see a response to the ping. That's my first question.
> >
> > Next, when I try to delete the pipe, ipfw pipe 1 delete, it won't delete. The only way I can get rid of it is to do a flush. That's the second question.
> >
> > Third question: if I type ipfw pipe 1 config bw 10Bytes/s, I would expect the bw to be limited and the counters to reflect this limit. The counters indicate no change in the 64 byte/s generated by my windows client.
> >
> > I have read the man pages for ipfw, dummynet, and ipfirewall. If these are obvious questions, I would appreciate a pointer to a good reference.
> >
> > Thanks

-- 
Logically speaking, logic is not the answer.
Rick Norman [EMAIL PROTECTED] 408 742 1619
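The two observations in this thread (counters that track the input side of the pipe, and pings that only appear to be lost) can be illustrated with a toy Python model. This is a sketch under loud assumptions, not dummynet's implementation: it ignores queue limits and drops, and both function names are invented for the example.

```python
def pipe_counters(arrival_bytes_per_s, bw_bytes_per_s, seconds):
    """Toy model: bytes offered to a pipe vs. bytes that have left it.

    Assumes an unbounded queue (no drops), which real dummynet pipes
    do not have.
    """
    offered = arrival_bytes_per_s * seconds
    drained = min(offered, bw_bytes_per_s * seconds)
    return offered, drained


def serialization_delay(packet_bytes, bw_bytes_per_s):
    """Seconds needed to push one packet through the bandwidth clamp."""
    return packet_bytes / bw_bytes_per_s


# Figures from the message: a client sending 64 bytes/s into a pipe
# configured with "bw 10Bytes/s".
offered, drained = pipe_counters(64, 10, seconds=10)
print(f"after 10 s: {offered} bytes entered the pipe, {drained} bytes left it")
print(f"one 64-byte packet needs {serialization_delay(64, 10):.1f} s to drain")
```

In this model the input-side total keeps growing at the sender's rate regardless of the clamp, and a single packet takes several seconds to emerge, which is consistent with a ping client giving up before the reply arrives.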
CVSup4.Freebsd.org
Seems to have still S1G bug:

Connected to cvsup4.freebsd.org
Server cvsup4.freebsd.org has the S1G bug

-- 
Regards, Ulf.
-
Ulf Zimmermann, 1525 Pacific Ave., Alameda, CA-94501, #: 510-865-0204
Re: CVSup4.Freebsd.org
++ 24/09/01 11:30 -0700 - Ulf Zimmermann:
| Seems to have still S1G bug:
|
| Connected to cvsup4.freebsd.org
| Server cvsup4.freebsd.org has the S1G bug

This should go to the maintainer of cvsup4.freebsd.org, available at:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/cvsup.html#CVSUP-MIRRORS

And also CC:'d.

-pete

-- 
Pete Fritchman [petef@(databits.net|freebsd.org|csh.rit.edu)]
finger [EMAIL PROTECTED] for PGP key
RE: panic on mount
On 23-Sep-01 Evan Sarmiento wrote:
> Hello,
>
> After compiling a new kernel, installing it, when my laptop tries to mount its drive, it panics with this message:
>
> panic: lock (sleep mutex) vnode interlock not locked @ ../../../kern/vfs_default.c:460
>
> which is:
>
>         if (ap->a_flags & LK_INTERLOCK)
>                 mtx_unlock(&ap->a_vp->v_interlock);
>
> within the function vop_nolock.

Can you get a stack trace to see where vop_nolock is being called from?

-- 
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
Power Users Use the Power to Serve! - http://www.FreeBSD.org/
termcap sources
I saw a duplicate in one of the capabilities that were submitted to -bugs earlier. This had me thinking. What happens when a duplicate capability exists in termcap? Are there any other duplicates in termcap.src? If yes, which?

The first attachment is a perl script that strips all cruft from termcap.src (fed through stdin) and makes every terminal entry occupy a single line of text. This is necessary for the second attachment to work correctly.

The second attachment is a perl script that splits the capability names of each input line into an array, and performs the duplicate check (boring for a human to do by reading through the termcap sources) on all the elements of the array.

The third attachment is the output of the command (termcap.src,v 1.109):

% cat termcap.src | ./tstrip.pl | ./tdupcheck.pl

As you can see there are quite a few terminals that have capabilities defined more than once! I don't have THAT many terminals to check, but I'm open to suggestions. Should we do something about this? If yes, what?

-giorgos

tstrip.pl
tdupcheck.pl

q101 - cl
5410 - k4
AT386 - IC
h1510 - do
h1520 - do
ibm3163 - ds:es:fs:hs
abm80 - do
d132 - ic
tec400 - do
f200 - ds:ts
sol - ho
it2 - do
mdl110 - cd
wsiris - cl:ho
intext - le
c100 - us:ue
dtterm - op
h29 - do
z29a - mb:mr
ztx - sr
adm5 - do
mime - do
sexidy - le
ttyWilliams - do
tvi950 - do
tvi955 - do
abm85 - kd
fos - bs
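The two-step check described above (join each entry onto one line, then look for repeated capability names) can be sketched in Python. This is not a port of the attached tstrip.pl or tdupcheck.pl, whose contents are not shown here. The function names and the sample entry are hypothetical, and the parser makes simplifying assumptions: fields are split on every ":", and a capability name is whatever precedes the first "=", "#" or "@".

```python
import re


def join_entries(text):
    """Collapse termcap source into one logical line per terminal entry.

    Comment lines and blank lines are dropped; lines ending in a
    backslash are joined with the line that follows.
    """
    entries, current = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            current += line[:-1]
        else:
            entries.append(current + line)
            current = ""
    if current:
        entries.append(current)
    return entries


def duplicate_caps(entry):
    """Return (terminal name, capability names that occur more than once)."""
    fields = entry.split(":")
    name = fields[0].split("|")[0]
    seen, dups = set(), []
    for field in fields[1:]:
        field = field.strip()
        if not field:
            continue
        cap = re.split(r"[=#@]", field, maxsplit=1)[0]
        if cap in seen and cap not in dups:
            dups.append(cap)
        seen.add(cap)
    return name, dups


# Made-up entry for illustration; it is not taken from termcap.src.
sample = "# demo\nq101|demo terminal:\\\n\t:cl=\\E[H:co#80:\\\n\t:cl=^L:\n"
for entry in join_entries(sample):
    name, dups = duplicate_caps(entry)
    if dups:
        print(f"{name} - {':'.join(dups)}")
```

The report line mirrors the `name - cap:cap` shape of the output listed in the message.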
Re: VM Corruption - stumped, anyone have any ideas?
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:>
:>$8 = 58630
:>(kgdb) print vm_page_buckets[$8]
:
:What is vm_page_hash_mask? The chunk of memory you printed out below
:looks alright; it is consistent with vm_page_array == 0xc051c000. Is
:it just the vm_page_buckets[] pointer that is corrupt?
:
:The address 0xc08428cc is (char *)&vm_page_array[55060] + 28, and
:sizeof(struct vm_page) is 60, so 0xc08428cc is in the middle of
:a vm_page within vm_page_array[].
:
:Ian

(kgdb) print vm_page_buckets[58630]
$5 = (struct vm_page *) 0xc08428cc
(kgdb) print vm_page_array
$6 = 0xc051c000
(kgdb) print vm_page_hash_mask
$7 = 262143
(kgdb) print &vm_page_array[55060]
$11 = (struct vm_page *) 0xc08428b0
(kgdb) print &vm_page_array[55061]
$10 = (struct vm_page *) 0xc08428ec

Yowzer. How the hell did that happen! Yes, you're right, the vm_page_array[] pointer has gotten corrupted. If we assume that the vm_page_t is valid (0xc0842acc), then the vm_page_buckets[] pointer should be that.

    vm_page_buckets[58630] -> c08428cc
    panic on vm_page_t m   -> c0842acc

Ok, so the corruption here is that an 'a' turned into an '8'. 1010 turned into 1000... a bit got cleared.

This is very similar to the corruption I found on one of Yahoo's machines. Except on that machine two bits were changed. It's as though some other subsystem is trying to manipulate a flag in a structure using a bad structure pointer.

-Matt
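The pointer arithmetic in this exchange is easy to replay. Here is a minimal Python sketch using only the values printed by kgdb above and the sizeof(struct vm_page) of 60 that Ian quotes; the helper name `locate` is made up for the example.

```python
VM_PAGE_ARRAY = 0xC051C000   # (kgdb) print vm_page_array
SIZEOF_VM_PAGE = 60          # sizeof(struct vm_page), as quoted by Ian


def locate(addr):
    """Return (index, byte offset) of addr relative to vm_page_array[]."""
    return divmod(addr - VM_PAGE_ARRAY, SIZEOF_VM_PAGE)


bad = 0xC08428CC    # value found in vm_page_buckets[58630]
good = 0xC0842ACC   # the vm_page_t in hand at the panic

print("bad :", locate(bad))     # lands inside an array element
print("good:", locate(good))    # lands on an element boundary
diff = bad ^ good
print(f"differing bits: {diff:#x} ({bin(diff).count('1')} bit)")
```

The good pointer divides evenly by the structure size while the bad one does not, and the two values differ in exactly one bit, which matches the "a bit got cleared" reading.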
Re: VM Corruption - stumped, anyone have any ideas?
Remember that we hit almost this problem with the KSE stuff during debugging?

The pointers in the last few entries of the vm_page_buckets array got corrupted when an argument to a function that manipulated whatever was next in ram was 0, and it turned out that it was 0 because of some PTE flushing thing (you are the one that found it... remember?)
(there was a line of asm code missing)

On Mon, 24 Sep 2001, Matt Dillon wrote:
> :
> :In message [EMAIL PROTECTED], Matt Dillon writes:
> :>
> :>$8 = 58630
> :>(kgdb) print vm_page_buckets[$8]
> :
> :What is vm_page_hash_mask? The chunk of memory you printed out below
> :looks alright; it is consistent with vm_page_array == 0xc051c000. Is
> :it just the vm_page_buckets[] pointer that is corrupt?
> :
> :The address 0xc08428cc is (char *)&vm_page_array[55060] + 28, and
> :sizeof(struct vm_page) is 60, so 0xc08428cc is in the middle of
> :a vm_page within vm_page_array[].
> :
> :Ian
>
> (kgdb) print vm_page_buckets[58630]
> $5 = (struct vm_page *) 0xc08428cc
> (kgdb) print vm_page_array
> $6 = 0xc051c000
> (kgdb) print vm_page_hash_mask
> $7 = 262143
> (kgdb) print &vm_page_array[55060]
> $11 = (struct vm_page *) 0xc08428b0
> (kgdb) print &vm_page_array[55061]
> $10 = (struct vm_page *) 0xc08428ec
>
> Yowzer. How the hell did that happen! Yes, you're right, the vm_page_array[] pointer has gotten corrupted. If we assume that the vm_page_t is valid (0xc0842acc), then the vm_page_buckets[] pointer should be that.
>
>     vm_page_buckets[58630] -> c08428cc
>     panic on vm_page_t m   -> c0842acc
>
> Ok, so the corruption here is that an 'a' turned into an '8'. 1010 turned into 1000... a bit got cleared.
>
> This is very similar to the corruption I found on one of Yahoo's machines. Except on that machine two bits were changed. It's as though some other subsystem is trying to manipulate a flag in a structure using a bad structure pointer.
>
> -Matt
Re: Boot proccess
> Tell me if I am wrong but from the floppy, the files kern.flp mfsroot.flp are compressed and then uncompressed into memory. If so, that means that the FreeBSD box is running this programs from the RAM and not from the floppy, right ?

Correct. They're running with the root device set to a memory filesystem (which has been initialized with the contents of mfsroot.flp).

> If so, is it possible to do the same but from a hard disk instead of the floppy ?

That's generally how most FreeBSD systems boot, yes. :-)

- Jordan
Re: VM Corruption - stumped, anyone have any ideas?
> The pointers in the last few entries of the vm_page_buckets array got corrupted when an argument to a function that manipulated whatever was next in ram was 0, and it turned out that it was 0 because of some PTE flushing thing (you are the one that found it... remember?)

I think I've also seen a few reports of programs exiting with "Profiling timer expired" messages with 4.4. These can be caused by stack overflows, since the p_timer[] array in struct pstats is one of the things that I think lives below the per-process kernel stack. I wonder if they are related? Stack overflows could result in corruption of local variables, after which anything could happen.

That said, hardware problems are still a possibility.

Ian
Re: VM Corruption - stumped, anyone have any ideas?
:> The pointers in the last few entries of the vm_page_buckets array got
:> corrupted when an argument to a function that manipulated whatever was next
:> in ram was 0, and it turned out that it was 0 because
:> of some PTE flushing thing (you are the one that found it... remember?)
:
:I think I've also seen a few reports of programs exiting with
:"Profiling timer expired" messages with 4.4. These can be caused
:by stack overflows, since the p_timer[] array in struct pstats is
:one of the things that I think lives below the per-process kernel
:stack. I wonder if they are related? Stack overflows could result
:in corruption of local variables, after which anything could happen.
:
:That said, hardware problems are still a possibility.
:
:Ian

Hmm. Do we have a guard page at the base of the per process kernel stack?

-Matt
Re: VM Corruption - stumped, anyone have any ideas?
:
:remember that we hit almost this problem with the KSE stuff during
:debugging?
:
:The pointers in the last few entries of the vm_page_buckets array got
:corrupted when an argument to a function that manipulated whatever was next
:in ram was 0, and it turned out that it was 0 because
: of some PTE flushing thing (you are the one that found it... remember?)
:(there was a line of asm code missing)

I've kept that in mind, but I think this may be a different issue. The memory involved is 100% statically mapped in the kernel page table array, and the errors are more like bit errors than anything else. Either the memory is bad or something in our kernel is setting or clearing flags through a bad pointer.

-Matt
Re: VM Corruption - stumped, anyone have any ideas?
Not, I believe, in 4.x. We do in 5.x.

On Mon, 24 Sep 2001, Matt Dillon wrote:
> :> The pointers in the last few entries of the vm_page_buckets array got
> :> corrupted when an argument to a function that manipulated whatever was next
> :> in ram was 0, and it turned out that it was 0 because
> :> of some PTE flushing thing (you are the one that found it... remember?)
> :
> :I think I've also seen a few reports of programs exiting with
> :"Profiling timer expired" messages with 4.4. These can be caused
> :by stack overflows, since the p_timer[] array in struct pstats is
> :one of the things that I think lives below the per-process kernel
> :stack. I wonder if they are related? Stack overflows could result
> :in corruption of local variables, after which anything could happen.
> :
> :That said, hardware problems are still a possibility.
> :
> :Ian
>
> Hmm. Do we have a guard page at the base of the per process kernel stack?
>
> -Matt
Re: VM Corruption - stumped, anyone have any ideas?
In message [EMAIL PROTECTED], Matt Dillon writes:
> Hmm. Do we have a guard page at the base of the per process kernel stack?

As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386) pages of per-process kernel state at p->p_addr. The stack grows down from the top, and struct user (sys/user.h) sits at the bottom. According to the comment in the definition of struct user, only the first three items in struct user are valid in normal running conditions:

    8192  ???
    8176  Top of stack
          stack space (4672 bytes)
    3504
          struct timeval p_start
          struct uprof p_prof
          struct itimerval p_timer[ITIMER_PROF]      (for SIGPROF)
          struct itimerval p_timer[ITIMER_VIRTUAL]
          struct itimerval p_timer[ITIMER_REAL]
          struct rusage p_cru;
          struct rusage p_ru;
          u_stats
    3280
          u_sigacts
     608
          u_pcb
       0  p->p_addr

So if the stack does overflow, p_timer[ITIMER_PROF] is about the first noticeable thing that gets clobbered, causing a SIGPROF signal delivery to the process some time later.

Ian
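The offsets in the layout above can be cross-checked with a few subtractions. This is a small Python sketch that takes the numbers exactly as given in the message and assumes the i386 page size of 4096 bytes; the dictionary keys are labels chosen for the example.

```python
PAGE_SIZE = 4096   # i386 page size
UPAGES = 2         # "UPAGES (== 2 on i386)"

# Byte offsets from p->p_addr, as listed in the message.
offsets = {
    "u_pcb": 0,
    "u_sigacts": 608,
    "u_stats": 3280,
    "stack_bottom": 3504,
    "stack_top": 8176,
}

area = UPAGES * PAGE_SIZE
stack_space = offsets["stack_top"] - offsets["stack_bottom"]
stats_size = offsets["stack_bottom"] - offsets["u_stats"]

print(f"per-process area:  {area} bytes")
print(f"kernel stack:      {stack_space} bytes")
print(f"u_stats (pstats):  {stats_size} bytes, directly below the stack")
```

The stack figure agrees with the 4672 bytes stated in the message, and nothing separates the lowest stack address from the top of u_stats, which is why an overflow reaches the timers first.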
Re: VM Corruption - stumped, anyone have any ideas?
:In message [EMAIL PROTECTED], Matt Dillon writes:
:>
:>Hmm. Do we have a guard page at the base of the per process kernel
:>stack?
:
:As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
:pages of per-process kernel state at p->p_addr. The stack grows
:down from the top, and struct user (sys/user.h) sits at the bottom.
:According to the comment in the definition of struct user, only
:the first three items in struct user are valid in normal running
:conditions:

Ok. I'm going to add a magic number to the end of the process structure and check it in mi_switch() in -stable.

-Matt
Re: VM Corruption - stumped, anyone have any ideas?
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:>
:>Hmm. Do we have a guard page at the base of the per process kernel
:>stack?
:
:As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
:pages of per-process kernel state at p->p_addr. The stack grows
:down from the top, and struct user (sys/user.h) sits at the bottom.
:According to the comment in the definition of struct user, only
:the first three items in struct user are valid in normal running
:conditions:

Er, I mean I'll add a magic number to struct pstats.

-Matt
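The idea of a magic number that a periodic check can verify is sketched below as a toy Python model. Everything here is an assumption made for illustration: the class `UArea`, the magic value, and placing the magic in the last four bytes below the stack are hypothetical, and the actual kernel change is not shown in this thread.

```python
MAGIC = 0xDEADBEEF   # hypothetical value, chosen only for this sketch


class UArea:
    """Toy model of the 8192-byte per-process area described earlier.

    The stack grows down from 8176; the statistics block ends at 3504,
    and the magic number occupies its last four bytes.
    """

    def __init__(self, size=8192, stats_end=3504):
        self.mem = bytearray(size)
        self.stats_end = stats_end
        self.mem[stats_end - 4:stats_end] = MAGIC.to_bytes(4, "little")

    def push(self, sp, data):
        """Write data just below sp (stack grows down); return the new sp."""
        new_sp = sp - len(data)
        self.mem[new_sp:sp] = data
        return new_sp

    def magic_intact(self):
        """The kind of check a context-switch hook could perform."""
        word = self.mem[self.stats_end - 4:self.stats_end]
        return int.from_bytes(word, "little") == MAGIC


u = UArea()
sp = u.push(8176, b"\xff" * 4672)    # fills the stack exactly
print("full stack, magic intact:", u.magic_intact())
sp = u.push(sp, b"\xff" * 8)         # overflows by 8 bytes
print("after overflow, magic intact:", u.magic_intact())
```

The check only catches overflows that actually write over the magic word, which is the sparseness concern raised in the follow-up reply.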
ecc on i386
What happens on an ECC equipped PC when you have a multi-bit memory error that hardware scrubbing can't fix? Will there be some sort of NMI or something that will panic the box?

I'm used to alphas (where you'll get a fatal machine check panic) and I am just wondering if PCs are as safe.

Thanks,

Drew
Re: VM Corruption - stumped, anyone have any ideas?
stack can be somewhat sparse depending on execution path, but it's not a bad idea..

On Mon, 24 Sep 2001, Matt Dillon wrote:
> :In message [EMAIL PROTECTED], Matt Dillon writes:
> :>
> :>Hmm. Do we have a guard page at the base of the per process kernel
> :>stack?
> :
> :As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
> :pages of per-process kernel state at p->p_addr. The stack grows
> :down from the top, and struct user (sys/user.h) sits at the bottom.
> :According to the comment in the definition of struct user, only
> :the first three items in struct user are valid in normal running
> :conditions:
>
> Ok. I'm going to add a magic number to the end of the process structure and check it in mi_switch() in -stable.
>
> -Matt
Re: ecc on i386
:What happens on an ECC equipped PC when you have a multi-bit memory
:error that hardware scrubbing can't fix? Will there be some sort of
:NMI or something that will panic the box?
:
:I'm used to alphas (where you'll get a fatal machine check panic) and
:I am just wondering if PCs are as safe.
:
:Thanks,
:
:Drew

ECC can typically detect and correct single bit errors and detect double bit errors. Anything beyond that is problematic... it may or may not detect the problem or may mis-correct a multi-bit error. An NMI is generated if an uncorrectable error is detected.

On PC's, ECC is optional. Desktops typically do not ship with ECC memory. Branded servers typically do. A year or two ago I would have been happy to use non-ECC rams (finding bad RAM through trial and error), but now with capacities as they are and memory prices down ECC is definitely the way to go.

Bit errors can come from many sources, memory being only one. Bit errors can occur inside the cpu chip, in the L1 and L2 caches, in memory, in controller chips... all over the place. Many modern processors implement parity on their caches to try to cover the problem areas. I'm not sure how Pentium III's and IV's are setup.

-Matt
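The "correct one bit, detect two bits" behaviour can be shown with a miniature code. The Python sketch below is an extended Hamming (8,4) code, chosen only because it is small enough to read; real ECC memory protects 64 data bits with 8 check bits, and the function names here are invented for the example.

```python
DATA_POS = (3, 5, 6, 7)   # codeword positions that carry data bits


def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POS):
        code[pos] = (nibble >> i) & 1
    # Parity bits at positions 1, 2 and 4 each cover the positions
    # whose index has the corresponding bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Position 0 is overall parity, so valid codewords have even weight.
    code[0] = sum(code[1:]) & 1
    return code


def decode(code):
    """Return (nibble or None, status) for a possibly damaged codeword."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) & 1
    fixed = list(code)
    if overall:
        # Odd number of flips: treat it as one flipped bit and repair it
        # (a syndrome of 0 points at the overall parity bit itself).
        fixed[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        # Even number of flips with a nonzero syndrome: detect only.
        return None, "uncorrectable"
    else:
        status = "ok"
    nibble = sum(fixed[pos] << i for i, pos in enumerate(DATA_POS))
    return nibble, status


word = encode(0b1011)
one = list(word); one[5] ^= 1
two = list(one); two[6] ^= 1
print(decode(word), decode(one), decode(two))
```

Because the decoder treats any odd number of flipped bits as a single-bit error, three flips are silently "corrected" into the wrong data, which is the mis-correction case mentioned in the message.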
Re: ecc on i386
Matt Dillon writes:
> :What happens on an ECC equipped PC when you have a multi-bit memory
> :error that hardware scrubbing can't fix? Will there be some sort of
> :NMI or something that will panic the box?
> :
> :I'm used to alphas (where you'll get a fatal machine check panic) and
> :I am just wondering if PCs are as safe.
> :
> :Thanks,
> :
> :Drew
>
> ECC can typically detect and correct single bit errors and detect double bit errors. Anything beyond that is problematic... it may or may not detect the problem or may mis-correct a multi-bit error. An NMI is generated if an uncorrectable error is detected.
>
> On PC's, ECC is optional. Desktops typically do not ship with ECC memory. Branded servers typically do. A year or two ago I would have been happy to use non-ECC rams (finding bad RAM through trial and error), but now with capacities as they are and memory prices down ECC is definitely the way to go.

My sentiments exactly.

> Bit errors can come from many sources, memory being only one. Bit errors can occur inside the cpu chip, in the L1 and L2 caches, in memory, in controller chips... all over the place. Many modern processors implement parity on their caches to try to cover the problem areas. I'm not sure how Pentium III's and IV's are setup.
>
> -Matt

Hmm.. Well, it turns out that the box I'm interested in (Thunder K7) can be set to send an SERR on multiple bit errors. I wonder what happens when a pc gets an SERR? (that's another machine check on alpha)

Drew
Re: ecc on i386
Andrew Gallatin wrote:
> What happens on an ECC equipped PC when you have a multi-bit memory error that hardware scrubbing can't fix? Will there be some sort of NMI or something that will panic the box?
>
> I'm used to alphas (where you'll get a fatal machine check panic) and I am just wondering if PCs are as safe.

Basically it depends on how the bios has programmed the chipsets and how the motherboard is wired. The usual way goes something like this:

There are two PCI signals, #PERR (pci error) and #SERR (system error). Various devices can be programmed to assert these under various conditions. Things like bus master fifo underflows etc will be programmed to assert #PERR and are generally not fatal. The memory controller is usually programmed to assert #SERR on a multiple bit error and either #SERR or some other signal (a GPIO or something like #SALERT on a serverworks chip) for a single bit (corrected) error.

The south bridge listens to #SERR and #PERR and can convert those into NMI events. Usually #SERR shows up as parity error and #PERR shows up as IOCHK (if it is enabled).

The bad news is that many bios manufacturers **TURN OFF** ECC functionality in order to speed things up. The reason for this is that with ECC off, the cpu can read/write down to byte granularity. With ECC on, memory is rigidly enforced as 64 bit quantities (ecc-encoded out to 72 bits). If the cpu reads a byte, the memory controller actually fetches all 64 (72) bits. If the cpu writes a byte, the memory controller has to do a read-merge-write cycle where it reads the 64 bit value, merges in the 1 byte write and writes out the entire 64 bit value again. This (naturally) shows up in poor benchmarks so they like to turn it off by default in order to get a speed edge.

Tyan is a notable example here (eg: the Thunder K7, the dual-athlon DDR-SDRAM board, has ECC turned off by default(!!)). I am sure that others do it too. The Tyan Thunder 2510 BIOS even disables ECC -> NMI routing, so you have to go to quite a bit of trouble to reprogram the serverworks chipset to actually generate NMI's so that you can find out if something got trashed.

Our NMI / ECC handling really really sucks in FreeBSD. Consider:

- i686_pagezero - reads before writing in order to minimize cache snooping traffic in SMP systems. However, if it gets an NMI while trying to check if the cache line is already zero, it will take the entire machine down instead of just zeroing the line.
- NFS / VM / bio: when they get an NMI while trying to copy data that is clean and backed by storage, they take the machine down instead of trying to recover and re-read the page.
- userland.. If userland gets an NMI, the machine dies instead of killing the process (or rereading a text page etc if possible)
- our NMI handlers are a festering pile of excrement. They don't have the code to 'ack' the NMI so it isn't possible to return after recovery.
- and so on.

Cheers,
-Peter
-- 
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: ecc on i386
Andrew Gallatin wrote:

Matt Dillon writes:

:What happens on an ECC equipped PC when you have a multi-bit memory
:error that hardware scrubbing can't fix? Will there be some sort of
:NMI or something that will panic the box?
:
:I'm used to alphas (where you'll get a fatal machine check panic) and
:I am just wondering if PCs are as safe.
:
:Thanks,
:
:Drew

ECC can typically detect and correct single bit errors and detect double bit errors. Anything beyond that is problematic... it may or may not detect the problem or may mis-correct a multi-bit error. An NMI is generated if an uncorrectable error is detected.

On PC's, ECC is optional. Desktops typically do not ship with ECC memory. Branded servers typically do.

A year or two ago I would have been happy to use non-ECC rams (finding bad RAM through trial and error), but now with capacities as they are and memory prices down ECC is definitely the way to go.

My sentiments exactly. I wrote a poller for picking up correction events on various serverworks motherboards (compaq, tyan) and it was *scary* how often single-bit errors were being corrected.

Bit errors can come from many sources, memory being only one. Bit errors can occur inside the cpu chip, in the L1 and L2 caches, in memory, in controller chips... all over the place. Many modern processors implement parity on their caches to try to cover the problem areas. I'm not sure how Pentium III's and IV's are set up.

-Matt

Hmm.. Well, it turns out that the box I'm interested in (Thunder K7) can be set to send an SERR on multiple bit errors. I wonder what happens when a PC gets an SERR? (that's another machine check on alpha)

On the Thunder K7, #SERR is routed to NMI. Trust me, you want this. And set it to ECC-SCRUB instead of off like the default now is. See my other email about how #SERR is converted to NMI via the ISA part of the south bridge.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: VM Corruption - stumped, anyone have any ideas?
Matt Dillon wrote:

:The pointers in the last few entries of the vm_page_buckets array got
:corrupted when an argument to a function that manipulated whatever was next
:in ram was 0, and it turned out that it was 0 because
: of some PTE flushing thing (you are the one that found it... remember?)
:
:I think I've also seen a few reports of programs exiting with
:Profiling timer expired messages with 4.4. These can be caused
:by stack overflows, since the p_timer[] array in struct pstats is
:one of the things that I think lives below the per-process kernel
:stack. I wonder if they are related? Stack overflows could result
:in corruption of local variables, after which anything could happen.
:
:That said, hardware problems are still a possibility.
:
:Ian

Hmm. Do we have a guard page at the base of the per process kernel stack?

-Matt

I did it as part of the KSE work in 5.x. It would be quite easy to do it for 4.x as well, but it makes a.out coredumps problematic.

Also, options UPAGES=4 is a pretty good defensive measure.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: VM Corruption - stumped, anyone have any ideas?
:
:I did it as part of the KSE work in 5.x. It would be quite easy to do it
:for 4.x as well, but it makes a.out coredumps problematic.
:
:Also, options UPAGES=4 is a pretty good defensive measure.
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

Well, in 4.x:

(kgdb) print p->p_addr
$6 = (struct user *) 0xcb7b9000
(kgdb) print &p->p_addr->u_sigacts
$7 = (struct sigacts *) 0xcb7b9260
(kgdb) print &p->p_addr->u_stats
$8 = (struct pstats *) 0xcb7b9cd0
(kgdb) print &p->p_addr->u_kproc
$9 = (struct kinfo_proc *) 0xcb7b9db0
(kgdb) print &p->p_addr->u_md
$10 = (struct md_coredump *) 0xcb7ba1d0
(kgdb) print &p->p_addr->u_guard    (my new field)
$11 = (u_int32_t *) 0xcb7ba1d0
(kgdb)

cb7b9000    start of kstack
cb7ba1d4    end of struct user
cb7bb000    top of kstack

Leaving us 3628 bytes for the kernel stack.

Something really weird is going on... I added u_guard to the end of the struct user structure and there are two or three processes hitting the guard immediately. All the rest are ok. I'm going to investigate further but this is very odd. Am I missing something about the UAREA?

-Matt
Re: VM Corruption - stumped, anyone have any ideas?
On Mon, 24 Sep 2001, Matt Dillon wrote:

Yowzer. How the hell did that happen! Yes, you're right, the vm_page_array[] pointer has gotten corrupted. If we assume that the vm_page_t is valid (0xc0842acc), then the vm_page_buckets[] pointer should be that. ...

This is very similar to the corruption I found on one of Yahoo's machines. Except on that machine two bits were changed. It's as though some other subsystem is trying to manipulate a flag in a structure using a bad structure pointer.

-Matt

Ok, time to take a good stab at sticking my foot in my mouth here.

Would it be possible to have a kernel mode where the read-only bit was turned on for malloc pools which shouldn't currently be accessed? This could be gated through the spl() calls (or specific mutexes on -current), ensuring that something like getpid couldn't stomp on the vm structures w/o first doing a splvm(). Obviously this wouldn't help find bugs in interrupt handlers or other high level calls, but it could help locate some memory corruption problems.

Actually, since memory regions roughly follow locks, this could be an even more powerful tool on -current once it develops more.

Is this even feasible in ring 0?

Mike "Silby" Silbersack
Re: VM Corruption - stumped, anyone have any ideas?
Matt Dillon wrote:

:
:I did it as part of the KSE work in 5.x. It would be quite easy to do it
:for 4.x as well, but it makes a.out coredumps problematic.
:
:Also, options UPAGES=4 is a pretty good defensive measure.
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

Well, in 4.x:

(kgdb) print p->p_addr
$6 = (struct user *) 0xcb7b9000
(kgdb) print &p->p_addr->u_sigacts
$7 = (struct sigacts *) 0xcb7b9260
(kgdb) print &p->p_addr->u_stats
$8 = (struct pstats *) 0xcb7b9cd0
(kgdb) print &p->p_addr->u_kproc
$9 = (struct kinfo_proc *) 0xcb7b9db0
(kgdb) print &p->p_addr->u_md
$10 = (struct md_coredump *) 0xcb7ba1d0
(kgdb) print &p->p_addr->u_guard    (my new field)
$11 = (u_int32_t *) 0xcb7ba1d0
(kgdb)

cb7b9000    start of kstack
cb7ba1d4    end of struct user
cb7bb000    top of kstack

Leaving us 3628 bytes for the kernel stack.

Something really weird is going on... I added u_guard to the end of the struct user structure and there are two or three processes hitting the guard immediately. All the rest are ok. I'm going to investigate further but this is very odd. Am I missing something about the UAREA?

Yes. u_md etc isn't used while the process is running. If you're going to have u_guard, it should come directly after u_stats, and *before* u_kproc, u_md etc.

I had been contemplating making a fake 'struct user' in userland only in order to keep the a.out coredump reader code happy. The a.out coredump code (see cpu_coredump() in */*/vm_machdep.c) can generate this fake structure in order to keep gdb happy. But then I realized that a.out coredump debugging was almost totally irrelevant these days.

Actually I tell a lie. In 4.x, u_kproc *can* be used on a live process.. see the **NASTY** PT_READ_U and PT_WRITE_U code in sys_process.c. It does a fill_eproc() in order to be able to read/write values from there. Nothing uses this stuff. I removed it from -current quite a while ago, and it should be MFC'ed too.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: VM Corruption - stumped, anyone have any ideas?
Matt Dillon wrote:

:
:I did it as part of the KSE work in 5.x. It would be quite easy to do it
:for 4.x as well, but it makes a.out coredumps problematic.
:
:Also, options UPAGES=4 is a pretty good defensive measure.
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

Well, in 4.x:

(kgdb) print p->p_addr
$6 = (struct user *) 0xcb7b9000
(kgdb) print &p->p_addr->u_sigacts
$7 = (struct sigacts *) 0xcb7b9260
(kgdb) print &p->p_addr->u_stats
$8 = (struct pstats *) 0xcb7b9cd0
(kgdb) print &p->p_addr->u_kproc
$9 = (struct kinfo_proc *) 0xcb7b9db0
(kgdb) print &p->p_addr->u_md
$10 = (struct md_coredump *) 0xcb7ba1d0
(kgdb) print &p->p_addr->u_guard    (my new field)
$11 = (u_int32_t *) 0xcb7ba1d0
(kgdb)

cb7b9000    start of kstack
cb7ba1d4    end of struct user
cb7bb000    top of kstack

Leaving us 3628 bytes for the kernel stack.

Something really weird is going on... I added u_guard to the end of the struct user structure and there are two or three processes hitting the guard immediately. All the rest are ok. I'm going to investigate further but this is very odd. Am I missing something about the UAREA?

Oh, one other thing... When we had PCIBIOS active for pci config space read/write support, we had stack overflows on many systems when the SSE stuff got MFC'ed. The simple act of trimming about 300 bytes from the pcb_save structure was enough to make the difference between it working or not. We are *way* too close to the wire. I asked about raising UPAGES from 2 to 3 before RELENG_4_4 but it never happened.

Julian cleaned up a couple of places where we were allocating 2K of local data *twice* on local stack frames. There are some gcc patches floating around that enable you to generate a warning if your local stack frame exceeds a certain amount or the arguments are bigger than a specified amount.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: VM Corruption - stumped, anyone have any ideas?
:Oh, one other thing... When we had PCIBIOS active for pci config space
:read/write support, we had stack overflows on many systems when the SSE
:stuff got MFC'ed. The simple act of trimming about 300 bytes from the
:pcb_save structure was enough to make the difference between it working or
:not. We are *way* too close to the wire. I asked about raising UPAGES
:from 2 to 3 before RELENG_4_4 but it never happened.
:
:Julian cleaned up a couple of places where we were allocating 2K of
:local data *twice* on local stack frames. There are some gcc patches
:floating around that enable you to generate a warning if your local stack
:frame exceeds a certain amount or the arguments are bigger than a
:specified amount.
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

I'm getting stack underflows with UPAGES set to 2. I've set UPAGES to 4 and preinitialized the UAREA to 0x11 and then scan it in exit1() to determine how much stack was actually used.

If these numbers are correct, we are screwed with UPAGES set to 2. This is just 4 seconds worth of a buildworld. Note the '3664's showing up. That's too close. Note the 3984 that came up after playing with the system for a few seconds!

I'll post the patch set to use to test this stuff in a moment.

-Matt

process 323 exit kstackuse 2272
...
process 333 exit kstackuse 2272
process 225 exit kstackuse 3664
process 233 exit kstackuse 2272
...
process 237 exit kstackuse 2272
process 322 exit kstackuse 2676
process 334 exit kstackuse 2272
...
process 319 exit kstackuse 2272

test1# dmesg | fgrep process | sort -n +4 | tail -10
process 6 exit kstackuse 3640
process 89 exit kstackuse 3640
process 176 exit kstackuse 3664
process 186 exit kstackuse 3664
process 225 exit kstackuse 3664
process 290 exit kstackuse 3664
process 299 exit kstackuse 3664
process 300 exit kstackuse 3664
process 303 exit kstackuse 3664
process 138 exit kstackuse 3984
Patch to test kstack usage.
This isn't perfect but it should be a good start in regards to testing kstack use. This patch is against -stable. It reports kernel stack use on process exit and will generate a 'Kernel stack underflow' message if it detects an underflow. It doesn't panic, so for a fun time you can leave UPAGES at 2 and watch in horror.

note: make sure you make depend before making a new kernel, or use buildkernel.

-Matt

Index: sys/user.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/user.h,v
retrieving revision 1.24
diff -u -r1.24 user.h
--- sys/user.h	1999/12/29 04:24:49	1.24
+++ sys/user.h	2001/09/25 03:41:04
@@ -109,9 +109,13 @@
  * Remaining fields only for core dump and/or ptrace--
  * not valid at other times!
  */
+	u_int32_t u_guard2;		/* guard the base of the kstack */
 	struct	kinfo_proc u_kproc;	/* proc + eproc */
 	struct	md_coredump u_md;	/* machine dependent glop */
+	u_int32_t u_guard;		/* guard the base of the kstack */
 };
+
+#define U_GUARD_MAGIC	0x51A2C3D4
 
 /*
  * Redefinitions to make the debuggers happy for now... This subterfuge
Index: kern/init_main.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/init_main.c,v
retrieving revision 1.134.2.6
diff -u -r1.134.2.6 init_main.c
--- kern/init_main.c	2001/06/15 09:37:55	1.134.2.6
+++ kern/init_main.c	2001/09/25 01:39:05
@@ -358,6 +358,7 @@
 	 */
 	p->p_stats = &p->p_addr->u_stats;
 	p->p_sigacts = &p->p_addr->u_sigacts;
+	p->p_addr->u_guard = U_GUARD_MAGIC;	/* bottom of kernel stack */
 
 	/*
 	 * Charge root for one process.
Index: kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.92.2.5
diff -u -r1.92.2.5 kern_exit.c
--- kern/kern_exit.c	2001/07/27 14:06:01	1.92.2.5
+++ kern/kern_exit.c	2001/09/25 04:09:32
@@ -123,6 +123,16 @@
 		    WTERMSIG(rv), WEXITSTATUS(rv));
 		panic("Going nowhere without my init!");
 	}
+	{
+		int *ua;
+		int *addrend = (int *)((char *)p->p_addr + UPAGES * PAGE_SIZE);
+		for (ua = &p->p_addr->u_guard + 1; ua < addrend; ++ua) {
+			if (*ua != 0x11111111)
+				break;
+		}
+		printf("process %d exit kstackuse %d\n",
+		    p->p_pid, (char *)addrend - (char *)ua);
+	}
 
 	aio_proc_rundown(p);
Index: kern/kern_synch.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.87.2.3
diff -u -r1.87.2.3 kern_synch.c
--- kern/kern_synch.c	2000/12/31 22:10:45	1.87.2.3
+++ kern/kern_synch.c	2001/09/25 02:54:46
@@ -44,13 +44,17 @@
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/proc.h>
+#include <sys/lock.h>
 #include <sys/kernel.h>
 #include <sys/signalvar.h>
 #include <sys/resourcevar.h>
 #include <sys/vmmeter.h>
 #include <sys/sysctl.h>
 #include <vm/vm.h>
+#include <vm/pmap.h>
+#include <vm/vm_map.h>
 #include <vm/vm_extern.h>
+#include <sys/user.h>
 #ifdef KTRACE
 #include <sys/uio.h>
 #include <sys/ktrace.h>
@@ -792,6 +796,13 @@
 	register struct proc *p = curproc;	/* XXX */
 	register struct rlimit *rlim;
 	int x;
+
+	/*
+	 * Check to see if the kernel stack underflowed (XXX)
+	 */
+	if (p->p_addr->u_guard != U_GUARD_MAGIC) {
+		printf("Kernel stack underflow! %p %p %08x\n", p, p->p_addr,
+		    p->p_addr->u_guard);
+	}
 
 	/*
 	 * XXX this spl is almost unnecessary. It is partly to allow for
Index: i386/i386/pmap.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/pmap.c,v
retrieving revision 1.250.2.10
diff -u -r1.250.2.10 pmap.c
--- i386/i386/pmap.c	2001/07/30 23:27:59	1.250.2.10
+++ i386/i386/pmap.c	2001/09/25 04:03:52
@@ -891,6 +891,7 @@
 	}
 	if (updateneeded)
 		invltlb();
+	memset(up, 0x11, UPAGES * PAGE_SIZE);
 }
 
 /*
Index: i386/include/param.h
===================================================================
RCS file: /home/ncvs/src/sys/i386/include/param.h,v
retrieving revision 1.54.2.5
diff -u -r1.54.2.5 param.h
--- i386/include/param.h	2001/09/15 00:50:36	1.54.2.5
+++ i386/include/param.h	2001/09/25 03:41:11
@@ -110,7 +110,7 @@
 #define MAXDUMPPGS	(DFLTPHYS/PAGE_SIZE)
 
 #define IOPAGES	2		/* pages of i/o permission bitmap */
-#define UPAGES	2		/* pages of u-area */
+#define UPAGES	4		/* pages of u-area */
 
 /*
  * Ceiling on amount of swblock kva space.
Index: vm/vm_glue.c
Re: Patch to test kstack usage.
Matt Dillon wrote:

This isn't perfect but it should be a good start in regards to testing kstack use. This patch is against -stable. It reports kernel stack use on process exit and will generate a 'Kernel stack underflow' message if it detects an underflow. It doesn't panic, so for a fun time you can leave UPAGES at 2 and watch in horror.

It is checking against the wrong guard value. It should be u_guard2.

FWIW; the max stack available is 4688 bytes on a standard 4.x system. Yes, that is too freaking close. Also, the maximum usage depends on what sort of cards you have in the system.. If you have a heavy tty user (eg: a 32+ port serial card) then you have lots of tty interrupts nesting as well. Having the ppp/sl/plip drivers in the system partly negates the effect of this though since it wires the net/tty interrupt masks together.

peter@thunder[10:13pm]~-111> ./tu
stack base = 3504
stack size = 4688
peter@thunder[10:13pm]~-112> cat tu.c
#include <sys/param.h>
#include <sys/user.h>
#include <stdio.h>
#include <stddef.h>

int
main(int ac, char **av)
{
	/* The kernel stack may grow down as far as u_kproc. */
	int stack_base = offsetof(struct user, u_kproc);

	printf("stack base = %d\n", stack_base);
	printf("stack size = %d\n", UPAGES * PAGE_SIZE - stack_base);
	return (0);
}

--- sys/user.h	1999/12/29 04:24:49	1.24
+++ sys/user.h	2001/09/25 03:41:04
@@ -109,9 +109,13 @@
  * Remaining fields only for core dump and/or ptrace--
  * not valid at other times!
  */
+	u_int32_t u_guard2;		/* guard the base of the kstack */
 	struct	kinfo_proc u_kproc;	/* proc + eproc */
 	struct	md_coredump u_md;	/* machine dependent glop */
+	u_int32_t u_guard;		/* guard the base of the kstack */
 };

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: Patch to test kstack usage.
:
:Matt Dillon wrote:
: This isn't perfect but it should be a good start in regards to
: testing kstack use. This patch is against -stable. It reports
: kernel stack use on process exit and will generate a 'Kernel stack
: underflow' message if it detects an underflow. It doesn't panic,
: so for a fun time you can leave UPAGES at 2 and watch in horror.
:
:It is checking against the wrong guard value. It should be u_guard2.
:
:FWIW; the max stack available is 4688 bytes on a standard 4.x system. Yes,
:that is too freaking close. Also, the maximum usage depends on what sort
:of cards you have in the system.. If you have a heavy tty user (eg: a 32+

I looked at it fairly carefully. It has got to be u_guard... at the end of struct user, at least until you do that MFC. The ptrace code appears to mess around with u_kproc quite a bit. And when you rip out u_kproc it still needs to be at the end, after the coredump structure (though for i386 the coredump structure is empty)... because interrupts can occur during a core dump.

:port serial card) then you have lots of tty interrupts nesting as well.
:Having the ppp/sl/plip drivers in the system partly negates the effect of
:this though since it wires the net/tty interrupt masks together.
:...
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
:All of this is for nothing if we don't go to the stars - JMS/B5
:

Yah... the test I ran was just a couple of seconds worth of playing around over ssh. I expect the worst case to be a whole lot worse. We're going to have to bump up UPAGES to 3 in 4.x, there's no question about it. I'm going to do it tonight.

-Matt
Re: Patch to test kstack usage.
:stack size = 4688

Sep 24 22:47:22 test1 /kernel: process 29144 exit kstackuse 4496

closer... :-)

-Matt
Re: Patch to test kstack usage.
Matt Dillon wrote:

Yah... the test I ran was just a couple of seconds worth of playing around over ssh. I expect the worst case to be a whole lot worse. We're going to have to bump up UPAGES to 3 in 4.x, there's no question about it. I'm going to do it tonight.

Heh. I already asked to do it a few weeks ago, in order to get it into the release. I guess I wasn't persistent enough.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5