Re: Version of XFree86 in FreeBSD Release 4.4

2001-09-24 Thread Cyrille Lefevre

David O'Brien wrote:
 On Sun, Sep 23, 2001 at 04:05:27PM +0200, Cyrille Lefevre wrote:
  David O'Brien wrote:
   On Mon, Sep 17, 2001 at 05:42:23PM -0700, Jordan Hubbard wrote:
We're still waiting for 4.0's support footprint to widen
a bit more before subjecting people to it by default.  Hopefully
by 4.5.
   
   Are you really considering using XFree86 4.x in FreeBSD-4.5?
   When I asked you about this in the past, you had said you wanted to keep
   the same X in RELENG_4 (presumably to not rock the boat mid-branch).
  
  isn't it possible to provide both versions and let the user
  choose between them, w/ an information box describing the problems
  found in one or the other ?
 
 There are issues for the pre-compiled packages due to differences between
 the two versions of XFree86.

what kind of issues ? I'm using both XFree86-4 and ports in package form
(pre-compiled stuff) w/o any problems.

Cyrille.
-- 
Cyrille Lefevre mailto:[EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: Version of XFree86 in FreeBSD Release 4.4

2001-09-24 Thread David O'Brien

On Mon, Sep 24, 2001 at 08:56:08AM +0200, Cyrille Lefevre wrote:
 what kind of issues ? I'm using both XFree86-4 and ports in package form
 (pre-compiled stuff) w/o any problems.

Please RTF /usr/ports/Mk/bsd.port.mk and look at what XFREE86_VERSION
does.

-- 
-- David  ([EMAIL PROTECTED])




Second set of stable buildworld results w/ vmiodirenable nameileafonly combos

2001-09-24 Thread Matt Dillon

Ok, here is the second set of results.  I didn't run all the tests
because nothing I did appeared to really have much of an effect.  In
this set of tests I set MAXMEM to 128M.  As you can see the buildworld
took longer versus 512M (no surprise), and vmiodirenable still helped
versus otherwise.  If one takes into consideration the standard
deviation, the directory vnode reclamation parameters made absolutely
no difference in the tests.

The primary differentiator in all the tests is 'block input ops'.  With
vmiodirenable turned on it sits at around 51000.  With it off it sits
at around 56000.  In the 512M tests the pass-1 numbers were 26000 with
vmiodirenable turned on and 33000 with it off.  Pass-2 numbers were
9000 with it on and 18000 with it off.  The directory leaf reuse 
parameters had almost no effect on either the 128M or 512M numbers.
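
As a rough check on the figures above, the relative saving in block input ops with vmiodirenable on can be worked out directly. This is a minimal sketch, not part of the original post: it uses only the approximate numbers quoted in the text, and the dictionary keys and helper name are made up for illustration.

```python
# Approximate 'block input ops' figures quoted in the message above.
# The keys and the helper name are illustrative, not from any FreeBSD tool.
results = {
    "128M":        {"on": 51000, "off": 56000},
    "512M pass-1": {"on": 26000, "off": 33000},
    "512M pass-2": {"on": 9000,  "off": 18000},
}

def reduction_pct(on, off):
    """Percentage fewer block input ops with vmiodirenable turned on."""
    return 100.0 * (off - on) / off

for name, r in results.items():
    print(f"{name}: {reduction_pct(r['on'], r['off']):.1f}% fewer block input ops")
```

On these rounded numbers the saving is largest in the 512M pass-2 case, where the count is halved.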

I'm not sure why test2 wound up doing a better job than test1 in the
128M tests with vmiodirenable disabled.  Both machines are configured
identically with only some extra junk on test1's /usr from prior tests.
In any case, the differences point to a rather significant error spread
with regard to possible outcomes, at least with vmiodirenable=0.

My conclusion from all of this is:

* vmiodirenable should be turned on by default.

* We should rip out the cache_purgeleafdirs() code entirely and use my
  simpler version to fix the vnode-growth problem.

* We can probably also rip out my cache_leaf_test() .. we do not need 
  to add any sophistication to reuse only directory vnodes without 
  subdirectories in the cache.  If it had been a problem we would have
  seen it.

I can leave the sysctls in place on the commit to allow further testing,
and I can leave it conditional on vmiodirenable.  I'll set the default
vmiodirenable to 1 (which will also enable directory vnode reuse) and
the default nameileafonly to 0 (i.e. to use the less sophisticated check).
In a few weeks I will rip out nameileafonly and cache_leaf_test().

-Matt


WIDE TERMINAL WINDOW REQUIRED! 
---

TEST SUITE 2 (128M ram)

buildworld of -stable.  DELL2550 (Dual PIII-1.2GHz / 128M ram (via MAXMEM) / SCSI)
23 September 2001       SMP kernel, softupdates-enabled, dirpref'd local /usr/src (no nfs),
make -j 12 buildworld   UFS_DIRHASH.  2 identical machines tested in parallel (test1, test2)
/usr/bin/time -l timings    note: atime updates left enabled in all tests

REUSE LEAF DIR VNODES:  directory vnodes with no subdirectories in the namei cache can be reused
REUSE ALL DIR VNODES:   directory vnodes can be reused (namei cache ignored)
DO NOT REUSE DIR...:    (Poul's original 1995 algo) directory vnode can only be reused if no
                        subdirectories or files in the namei cache

I stopped bothering with pass-2 after it became evident that the numbers
were not changing significantly.

VMIODIRENABLE ENABLED   [ A ]                       [ B ]                       [ C ]
                        [BEST CASE  ]               [BEST CASE  ]               [BEST CASE  ]
machine                 test1 test2 test1 test2     test1 test2 test1 test2     test1 test2 test1 test2
pass (2)              R 1     1     2     2       R 1     1     2     2       R 1     1     2     2
vfs.vmiodirenable     E 1     1     1     1       E 1     1     1     1       E 1     1     1     1
vfs.nameileafonly     B 1     1     1     1       B 0     0     0     0       B -1    -1    -1    -1
                      O                           O                           O
                      O REUSE LEAF DIR VNODES     O REUSE ALL DIR VNODES      O DO NOT REUSE DIR VNODES W/ACTIVE NAMEI
                      T                           T                           T

                         26:49    26:30    26:41    26:24
real                     1609     1590     1601     1584
user                     1361     1354     1361     1356
sys                      617      615      617      614
max resident             16264    16256    16260    16264
avg shared mem           1030     1030     1030     1030
avg unshared data        1004     1005     1006     1004
avg unshared stack       129      129      129      129
page reclaims            11.16M   11.16M   11.15M   11.15M
page faults              3321     3674     2940     2801
swaps                    0        0        0        0
block input ops          51748    51881    50777    50690
block output ops         5532     6497     5680     6089
messages sent            35847    35848    35789    35715
messages received        35848    35852    35792

Re: Disk based file system cache

2001-09-24 Thread Peter Wullinger

On Mon, Sep 24, 2001 at 01:07:00PM +0200, Attila Nagy wrote:
 Hello,
 
 I'm just curious: is it possible to set up an NFS server and a client
 where the client has very big (28 GB maximum for FreeBSD?) swap area on
 multiple disks and caches the NFS exported data on it?
 This could save a lot of bandwidth on the NFS server and also reduce load
 on it.
 

I'm not familiar with nfsiod (was it?),
but I think that NFS runs in kernel mode and
uses kernel malloc(9) memory for caching. And kernel
memory is quite different from user space memory ... Correct me
if I'm wrong.

Even if it worked, you would probably get REAL
synchronisation problems. If your client machines are Linux, Solaris
or (;-)) FreeBSD, you can set up Coda from the ports collection; it's
much more suitable for this.

Peter




dump files too large, nodump related??

2001-09-24 Thread Mark Hannon

Hi,

I have noticed some strange behaviour with 4.3-RELEASE and dump.  I have
been dumping my filesystems through gzip into a compressed dumpfile.
Some of the resulting dumps have been MUCH larger than I would expect.

As an example, I have just dumped my /home partition.  Note that lots
of directories on this partition are marked nodump, e.g. /home/ftp, which
is one of the biggest users of disk space.

Building 8 level dump of /home and writing it to /var/dumps//home8.gz
(gzipped)
  DUMP: Date of this level 8 dump: Mon Sep 24 21:13:55 2001
  DUMP: Date of last level 1 dump: Tue Sep 18 20:15:43 2001
  DUMP: Dumping /dev/ad0s1h (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 360780 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 30.76% done, finished in 0:11
  DUMP: 60.89% done, finished in 0:06
  DUMP: DUMP: 360664 tape blocks
  DUMP: finished in 849 seconds, throughput 424 KBytes/sec
  DUMP: level 8 dump on Mon Sep 24 21:13:55 2001
  DUMP: DUMP IS DONE

The GZIPPED dumpfile is 289 MB!!!   

I wrote a little perl script to check the table of contents and estimate
how big the dump should be (see attached), and this gives an interesting
result.

doorway:~ proj/dumpsize/dumpsize.pl /home /var/dumps/home8.gz 
Level 8 dump of /home on doorway.home.lan:/dev/ad0s1h
Label: none
The level 0 dump of /home partition written to /var/dumps/home8.gz
contains 689 files totalling 146450 KB, cf size of dumpfile = 282063 ( 360660 ) KB

The following files are larger than 1024 KB in size:
163264 ./mark/.netscape/xover-cache/host-news/athome.aus.service.snm
1343488 ./mark/.netscape/xover-cache/host-news/athome.aus.support.snm
2097152 ./mark/.netscape/xover-cache/host-news/athome.aus.users.linux.snm
1754819 ./mark/.netscape/xover-cache/host-news/hostinfo.dat
1122336 ./samba/profile.9x/mark/USER.DAT
1441792 ./samba/profile.9x/tuija/History/History.IE5/index.dat
92440996 ./tuija/Mail/Archive/Sent Items 2001
2985510 ./tuija/My Documents/gas1.JPG
2528914 ./tuija/My Documents/gas2.JPG

The interesting thing here is that the sum of all the file sizes in the
dump is only 147MB, cf. the 361MB uncompressed dump size!!!  This is a
discrepancy of 210MB.  (This would line up with the 180MB ISO image plus
other dribs and drabs that I have stored in a nodump flagged directory
since my last dump)
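
The discrepancy can be recomputed from the numbers the script printed. This is only a back-of-the-envelope sketch using the three KB figures from the output above; the variable names are made up, and it says nothing about why the extra data is in the dump.

```python
# Figures taken from the dumpsize.pl output above (all in KB).
files_kb = 146450   # sum of file sizes listed in the restore table of contents
dump_kb = 360660    # uncompressed dump size
gzip_kb = 282063    # size of the gzipped dump file

unaccounted_kb = dump_kb - files_kb
print(f"unaccounted for: {unaccounted_kb} KB")   # 214210 KB
print(f"share of the dump not explained by listed files: {unaccounted_kb / dump_kb:.0%}")
print(f"gzip output / input: {gzip_kb / dump_kb:.0%}")
```

The unaccounted 214210 KB is in the same range as the "210MB" quoted in the text.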

Any ideas of what is wrong?  Are the nodumped files stored on the dump
for some reason (even though they don't appear in the restore table of
contents)?

Regards/Mark
 dumpsize.pl


Re: Disk based file system cache

2001-09-24 Thread David Malone

On Mon, Sep 24, 2001 at 01:07:00PM +0200, Attila Nagy wrote:
 I'm just curious: is it possible to set up an NFS server and a client
 where the client has very big (28 GB maximum for FreeBSD?) swap area on
 multiple disks and caches the NFS exported data on it?
 This could save a lot of bandwidth on the NFS server and also reduce load
 on it.

This would really be more than NFS is supposed to do. There are other
filesystems which can do this sort of thing - I think Coda might
be one of them.

David.




Re: Boot proccess

2001-09-24 Thread Jean-Christophe Varaillon

 Hello,
 
 | In short, which program gives enough knowledge to the microprocessor (?)
 | and allows it to use kern.flp and mfsroot.flp in order to boot and get the
 | operating system running.
 
 your BIOS reads the first sector from your floppy, which consists
 of a boot loader, which usually loads the 2nd-stage boot loader,
 and this one loads the kernel.

Tell me if I am wrong, but from the floppy, the files kern.flp and
mfsroot.flp are compressed and then uncompressed into memory.

If so, that means that the FreeBSD box is running these programs from
RAM and not from the floppy, right ?

If so, is it possible to do the same but from a hard disk instead of the
floppy ?





RE: Disk based file system cache

2001-09-24 Thread Charles Randall

As a side note, Irix and Solaris provide cachefs for this purpose and use
NFS filesystems as examples (other examples may include CD-ROM, etc.).

Charles

-Original Message-
From: David Malone [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 24, 2001 8:26 AM
To: Attila Nagy
Cc: [EMAIL PROTECTED]
Subject: Re: Disk based file system cache


On Mon, Sep 24, 2001 at 01:07:00PM +0200, Attila Nagy wrote:
 I'm just curious: is it possible to set up an NFS server and a client
 where the client has very big (28 GB maximum for FreeBSD?) swap area on
 multiple disks and caches the NFS exported data on it?
 This could save a lot of bandwidth on the NFS server and also reduce load
 on it.

This would really be more than NFS is supposed to do. There are other
filesystems which can do this sort of thing - I think Coda might
be one of them.

David.




Re: ipfw and dummynet

2001-09-24 Thread rick norman

Thanks for the responses; as expected, it was an operator head space problem:
my lack of understanding of how the default queues and bw would make ping
look.  Apparently, enough delay is introduced merely by adding a pipe that
the ping client times out waiting for the response.  The response was actually
returning, which became visible when I upped the timeout.  I also didn't
realize that the counters reflected the input to the pipe and not the output,
which was why I didn't see any change when I added a bw clamp.
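
One way to see why a bandwidth clamp can push a ping reply past the client's timeout is to work out how long a single packet takes to drain through a rate-limited pipe. This is a simplified sketch under stated assumptions, not a model of dummynet itself: it ignores queueing and scheduler granularity, and the packet size and timeout are assumed values chosen for illustration. Only the 10 bytes/s figure comes from the quoted question.

```python
def serialization_delay(packet_bytes, bw_bytes_per_s):
    """Seconds needed to push one packet through a link limited to bw_bytes_per_s."""
    return packet_bytes / bw_bytes_per_s

PACKET_BYTES = 64    # assumed packet size, for illustration only
PIPE_BW = 10         # bytes/s, as in 'ipfw pipe 1 config bw 10Bytes/s'
PING_TIMEOUT = 1.0   # assumed client timeout in seconds

delay = serialization_delay(PACKET_BYTES, PIPE_BW)
print(f"{delay:.1f} s per packet")
if delay > PING_TIMEOUT:
    print("reply arrives after the timeout")
else:
    print("reply arrives in time")
```

Under these assumptions the reply does come back, only several seconds late, which matches the observation that it became visible once the timeout was raised.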

Luigi Rizzo wrote:

 hi,

 can you show me the output of
 ipfw show
 and
 ipfw pipe show

 Reading your questions, i have the feeling you are doing
 something wrong in the commands.
 For the last one, the client will keep generating its stream
 of data, it is just after going through the pipe that you will
 see the limitation in effect.

 cheers
 luigi

 --+-
  Luigi RIZZO, [EMAIL PROTECTED]  . ACIRI/ICSI (on leave from Univ. di Pisa)
  http://www.iet.unipi.it/~luigi/  . 1947 Center St, Berkeley CA 94704
  Phone (510) 666 2927 .
 --+-

  I tried questions for this but no answer.
 
  I am attempting to use ipfw and dummynet to instrument some network
  traffic tests.  I am running freebsd 4.3 release and have built the
  kernel with ipfirewall, dummynet, and default to enabled.  For a simple
  test, I added a pipe: ipfw add pipe 1 icmp from any to any.  When I ping
  this machine, I can do ipfw pipe 1 show and watch the counters increment,
  but the machine doing the pinging does not see a response to the ping.
  That's my first question.  Next, when I try to delete the pipe, ipfw pipe 1
  delete, it won't delete.  The only way I can get rid of it is to do a flush.
  That's the second question.  Third question, if I type
  ipfw pipe 1 config bw 10Bytes/s, I would expect the bw to be limited and
  the counters to reflect this limit.  The counters indicate no change in
  the 64 byte/s generated by my windows client.
 
  I have read the man pages for ipfw, dummynet, and ipfirewall.  If these
  are obvious questions, I would appreciate a pointer to a good reference.
 
  Thanks
 
 
 
  --
 
  Logically speaking, logic is not the answer.
 
  Rick Norman
  [EMAIL PROTECTED]
  408 742 1619
 
 
 
 

--

Logically speaking, logic is not the answer.

Rick Norman
[EMAIL PROTECTED]
408 742 1619






CVSup4.Freebsd.org

2001-09-24 Thread Ulf Zimmermann

Seems to still have the S1G bug:

Connected to cvsup4.freebsd.org
Server cvsup4.freebsd.org has the S1G bug

-- 
Regards, Ulf.

-
Ulf Zimmermann, 1525 Pacific Ave., Alameda, CA-94501, #: 510-865-0204




Re: CVSup4.Freebsd.org

2001-09-24 Thread Pete Fritchman

++ 24/09/01 11:30 -0700 - Ulf Zimmermann:
| Seems to still have the S1G bug:
| 
| Connected to cvsup4.freebsd.org
| Server cvsup4.freebsd.org has the S1G bug

This should go to the maintainer of cvsup4.freebsd.org, available at:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/cvsup.html#CVSUP-MIRRORS

And also CC:'d.

-pete

--
Pete Fritchman [petef@(databits.net|freebsd.org|csh.rit.edu)]
finger [EMAIL PROTECTED] for PGP key




RE: panic on mount

2001-09-24 Thread John Baldwin


On 23-Sep-01 Evan Sarmiento wrote:
 Hello,
 
 After compiling a new kernel, installing it, when my laptop
 tries to mount its drive, it panics with this message:
 
 panic: lock (sleep mutex) vnode interlock not locked  @
 ../../../kern/vfs_default.c:460
 
 which is:
 
   if (ap->a_flags & LK_INTERLOCK)
       mtx_unlock(&ap->a_vp->v_interlock);
 
 within the function vop_nolock.

Can you get a stack trace to see where vop_nolock is being called from?

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




termcap sources

2001-09-24 Thread Giorgos Keramidas

I saw a duplicate in one of the capabilities that were submitted to -bugs earlier.
This had me thinking.  What happens when a duplicate capability exists in termcap?
Are there any other duplicates in termcap.src?  If yes, which?

The first attachment is a perl script that strips all cruft from termcap.src
(fed through stdin) and makes every terminal entry occupy a single line of
text.  This is necessary for the second attachment to work correctly.

The second attachment is a perl script that splits capability names of each
input line in an array, and performs the (boring for a human to do by reading
through the termcap sources) duplicate check in all the elements of the array.

The third attachment is the output of the command (termcap.src,v 1.109):

% cat termcap.src | ./tstrip.pl | ./tdupcheck.pl

As you can see there are quite a few terminals that have capabilities defined
more than once!  I don't have THAT many terminals to check, but I'm open to
suggestions.  Should we do something about this?  If yes, what?
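
The duplicate check described above can be sketched roughly as follows. This is not the attached tdupcheck.pl, only an illustration of the same idea in Python: it assumes each entry has already been joined onto a single line (the job of the first script), that fields are separated by ':' and that a capability name is whatever precedes '=', '#' or '@'. The function names and the sample entries are made up.

```python
import re

def duplicate_caps(entry):
    """Return (terminal name, list of capability names defined more than once)
    for one single-line termcap entry."""
    fields = [f.strip() for f in entry.strip().split(":")]
    name = fields[0].split("|")[0]      # first alias of the terminal
    seen, dups = set(), []
    for field in fields[1:]:
        if not field:
            continue
        # Capability name: everything before '=', '#' or '@'.
        cap = re.split(r"[=#@]", field, maxsplit=1)[0]
        if cap in seen and cap not in dups:
            dups.append(cap)
        seen.add(cap)
    return name, dups

def report(lines):
    """Format results as 'name - cap:cap', one line per terminal with duplicates."""
    out = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        name, dups = duplicate_caps(line)
        if dups:
            out.append(f"{name} - {':'.join(dups)}")
    return out

# Made-up sample entries, not taken from termcap.src.
sample = [
    "demo1|first demo terminal:am:cl=\\E[H:co#80:cl=^L:",
    "demo2|second demo terminal:am:co#80:li#24:",
]
print("\n".join(report(sample)))
```

With the two sample entries, only demo1 is reported, because it defines cl twice.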

-giorgos

 tstrip.pl
 tdupcheck.pl

q101 - cl
5410 - k4
AT386 - IC
h1510 - do
h1520 - do
ibm3163 - ds:es:fs:hs
abm80 - do
d132 - ic
tec400 - do
f200 - ds:ts
sol - ho
it2 - do
mdl110 - cd
wsiris - cl:ho
intext - le
c100 - us:ue
dtterm - op
h29 - do
z29a - mb:mr
ztx - sr
adm5 - do
mime - do
sexidy - le
ttyWilliams - do
tvi950 - do
tvi955 - do
abm85 - kd
fos - bs



Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon


:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:
:$8 = 58630
:(kgdb) print vm_page_buckets[$8]
:
:What is vm_page_hash_mask? The chunk of memory you printed out below
:looks alright; it is consistent with vm_page_array == 0xc051c000. Is
:it just the vm_page_buckets[] pointer that is corrupt?
:
:The address 0xc08428cc is (char *)&vm_page_array[55060] + 28, and
:sizeof(struct vm_page) is 60, so 0xc08428cc is in the middle of
:a vm_page within vm_page_array[].
:
:Ian

(kgdb) print vm_page_buckets[58630]
$5 = (struct vm_page *) 0xc08428cc
(kgdb) print vm_page_array
$6 = 0xc051c000
(kgdb) print vm_page_hash_mask
$7 = 262143
(kgdb) print &vm_page_array[55060]
$11 = (struct vm_page *) 0xc08428b0
(kgdb) print &vm_page_array[55061]
$10 = (struct vm_page *) 0xc08428ec
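
Ian's arithmetic can be checked against the kgdb values above. A small sketch in Python, using only numbers that appear in the thread (the array base, the 60-byte structure size and the bucket pointer); the variable names are made up for illustration.

```python
VM_PAGE_ARRAY = 0xc051c000    # kgdb: print vm_page_array
SIZEOF_VM_PAGE = 60           # sizeof(struct vm_page), per Ian's message
bucket_ptr = 0xc08428cc       # kgdb: print vm_page_buckets[58630]

# Which array element does the pointer fall in, and how far into it?
index, offset = divmod(bucket_ptr - VM_PAGE_ARRAY, SIZEOF_VM_PAGE)
print(index, offset)          # 55060 28

# Start addresses of that element and the next one, for comparison with kgdb.
print(hex(VM_PAGE_ARRAY + index * SIZEOF_VM_PAGE))         # 0xc08428b0
print(hex(VM_PAGE_ARRAY + (index + 1) * SIZEOF_VM_PAGE))   # 0xc08428ec
```

A non-zero offset means the pointer does not address the start of any vm_page, which is what makes it look corrupt.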

Yowzer.  How the hell did that happen!  Yes, you're right, the
vm_page_array[] pointer has gotten corrupted.  If we assume that
the vm_page_t is valid (0xc0842acc), then the vm_page_buckets[]
pointer should be that.

vm_page_buckets[58630]  - c08428cc
panic on vm_page_t m- c0842acc

Ok, so the corruption here is that an 'a' turned into an '8'. 1010 turned
into 1000... a bit got cleared.
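
The single-bit difference is easy to confirm by XORing the two pointers. A minimal sketch using the two addresses from the message; the variable names are made up.

```python
corrupt = 0xc08428cc     # value found in vm_page_buckets[58630]
expected = 0xc0842acc    # vm_page_t from the panic

diff = corrupt ^ expected
print(hex(diff))                                   # 0x200
flipped_bits = [i for i in range(32) if (diff >> i) & 1]
print(flipped_bits)                                # [9]

# The bit is set in the expected value and clear in the corrupt one.
bit_was_cleared = (expected & ~diff) == corrupt
print(bit_was_cleared)                             # True
```

Exactly one bit (bit 9) differs, and it went from 1 to 0.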

This is very similar to the corruption I found on one of Yahoo's 
machines.  Except on that machine two bits were changed.  It's as though
some other subsystem is trying to manipulate a flag in a structure using
a bad structure pointer.

-Matt






Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Julian Elischer

remember that we hit almost this problem with the KSE stuff during
debugging?

The pointers in the last few entries of the vm_page_buckets array got
corrupted when an argument to a function that manipulated whatever was next
in ram was 0, and it turned out that it was 0 because
 of some PTE flushing thing (you are the one that found it... remember?)
(there was a line of asm code missing)

On Mon, 24 Sep 2001, Matt Dillon wrote:

 
 :
 :In message [EMAIL PROTECTED], Matt Dillon writes:
 :
 :$8 = 58630
 :(kgdb) print vm_page_buckets[$8]
 :
 :What is vm_page_hash_mask? The chunk of memory you printed out below
 :looks alright; it is consistent with vm_page_array == 0xc051c000. Is
 :it just the vm_page_buckets[] pointer that is corrupt?
 :
 :The address 0xc08428cc is (char *)&vm_page_array[55060] + 28, and
 :sizeof(struct vm_page) is 60, so 0xc08428cc is in the middle of
 :a vm_page within vm_page_array[].
 :
 :Ian
 
 (kgdb) print vm_page_buckets[58630]
 $5 = (struct vm_page *) 0xc08428cc
 (kgdb) print vm_page_array
 $6 = 0xc051c000
 (kgdb) print vm_page_hash_mask
 $7 = 262143
 (kgdb) print &vm_page_array[55060]
 $11 = (struct vm_page *) 0xc08428b0
 (kgdb) print &vm_page_array[55061]
 $10 = (struct vm_page *) 0xc08428ec
 
 Yowzer.  How the hell did that happen!  Yes, you're right, the
 vm_page_array[] pointer has gotten corrupted.  If we assume that
 the vm_page_t is valid (0xc0842acc), then the vm_page_buckets[]
 pointer should be that.
 
 vm_page_buckets[58630]  - c08428cc
 panic on vm_page_t m- c0842acc
 
 Ok, so the corruption here is that an 'a' turned into an '8'. 1010 turned
 into 1000... a bit got cleared.
 
 This is very similar to the corruption I found on one of Yahoo's 
 machines.  Except on that machine two bits were changed.  It's as though
 some other subsystem is trying to manipulate a flag in a structure using
 a bad structure pointer.
 
   -Matt
 
 
 



Re: Boot proccess

2001-09-24 Thread Jordan Hubbard

 Tell me if I am wrong, but from the floppy, the files kern.flp and
 mfsroot.flp are compressed and then uncompressed into memory.
 
 If so, that means that the FreeBSD box is running these programs from
 RAM and not from the floppy, right ?

Correct.  They're running with the root device set to a memory
filesystem (which has been initialized with the contents of
mfsroot.flp).

 If so, is it possible to do the same but from a hard disk instead of the
 floppy ?

That's generally how most FreeBSD systems boot, yes. :-)

- Jordan




Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Ian Dowse


The pointers in the last few entries of the vm_page_buckets array got
corrupted when an argument to a function that manipulated whatever was next
in ram was 0, and it turned out that it was 0 because
 of some PTE flushing thing (you are the one that found it... remember?)

I think I've also seen a few reports of programs exiting with
Profiling timer expired messages with 4.4. These can be caused
by stack overflows, since the p_timer[] array in struct pstats is
one of the things that I think lives below the per-process kernel
stack. I wonder if they are related? Stack overflows could result
in corruption of local variables, after which anything could happen.

That said, hardware problems are still a possibility.

Ian




Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon


:The pointers in the last few entries of the vm_page_buckets array got
:corrupted when an argument to a function that manipulated whatever was next
:in ram was 0, and it turned out that it was 0 because
: of some PTE flushing thing (you are the one that found it... remember?)
:
:I think I've also seen a few reports of programs exiting with
:Profiling timer expired messages with 4.4. These can be caused
:by stack overflows, since the p_timer[] array in struct pstats is
:one of the things that I think lives below the per-process kernel
:stack. I wonder if they are related? Stack overflows could result
:in corruption of local variables, after which anything could happen.
:
:That said, hardware problems are still a possibility.
:
:Ian

Hmm.  Do we have a guard page at the base of the per process kernel
stack?

-Matt





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon


:
:remember that we hit almost this problem with the KSE stuff during
:debugging?
:
:The pointers in the last few entries of the vm_page_buckets array got
:corrupted when an argument to a function that manipulated whatever was next
:in ram was 0, and it turned out that it was 0 because
: of some PTE flushing thing (you are the one that found it... remember?)
:(there was a line of asm code missing)

I've kept that in mind, but I think this may be a different issue.
The memory involved is 100% statically mapped in the kernel page table
array, and the errors are more like bit errors than anything else.  Either
the memory is bad or something in our kernel is setting or clearing flags
through a bad pointer.

-Matt





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Julian Elischer

Not in 4.x, I believe.
We do in 5.x.


On Mon, 24 Sep 2001, Matt Dillon wrote:

 
 :The pointers in the last few entries of the vm_page_buckets array got
 :corrupted when an argument to a function that manipulated whatever was next
 :in ram was 0, and it turned out that it was 0 because
 : of some PTE flushing thing (you are the one that found it... remember?)
 :
 :I think I've also seen a few reports of programs exiting with
 :Profiling timer expired messages with 4.4. These can be caused
 :by stack overflows, since the p_timer[] array in struct pstats is
 :one of the things that I think lives below the per-process kernel
 :stack. I wonder if they are related? Stack overflows could result
 :in corruption of local variables, after which anything could happen.
 :
 :That said, hardware problems are still a possibility.
 :
 :Ian
 
 Hmm.  Do we have a guard page at the base of the per process kernel
 stack?
 
   -Matt
 
 





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Ian Dowse

In message [EMAIL PROTECTED], Matt Dillon writes:

Hmm.  Do we have a guard page at the base of the per process kernel
stack?

As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
pages of per-process kernel state at p->p_addr. The stack grows
down from the top, and struct user (sys/user.h) sits at the bottom.
According to the comment in the definition of struct user, only
the first three items in struct user are valid in normal running
conditions:

8192
        ???
8176    Top of stack

        stack space (4672 bytes)

3504
        struct timeval p_start
        struct uprof p_prof
        struct itimerval p_timer[ITIMER_PROF] (for SIGPROF)
        struct itimerval p_timer[ITIMER_VIRTUAL]
        struct itimerval p_timer[ITIMER_REAL]
        struct rusage p_cru;
        struct rusage p_ru;
        u_stats
3280
        u_sigacts
608
        u_pcb
0       p->p_addr

So if the stack does overflow, p_timer[ITIMER_PROF] is about the
first noticeable thing that gets clobbered, causing a SIGPROF
signal delivery to the process some time later.
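
The layout above can be restated as a small table to check the arithmetic. This is a sketch built only from the offsets in the diagram, plus the 4096-byte i386 page size; the region table and the helper name are made up for illustration.

```python
PAGE_SIZE = 4096           # i386 page size
UPAGES = 2                 # per the message above
AREA = UPAGES * PAGE_SIZE  # 8192 bytes starting at p->p_addr

# (name, start offset, end offset) relative to p->p_addr, read off the diagram.
REGIONS = [
    ("u_pcb",       0,    608),
    ("u_sigacts",   608,  3280),
    ("u_stats",     3280, 3504),
    ("stack space", 3504, 8176),
    ("???",         8176, 8192),
]

def region_at(offset):
    """Name of the region that contains a byte offset within the area."""
    for name, start, end in REGIONS:
        if start <= offset < end:
            return name
    raise ValueError("offset outside the UPAGES area")

stack_bytes = 8176 - 3504
print(AREA, stack_bytes)     # 8192 4672
print(region_at(3504))       # stack space
print(region_at(3503))       # u_stats
```

The first byte below the stack already belongs to u_stats, so nothing in this layout stops a deep call chain from writing into it.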

Ian




Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon

:In message [EMAIL PROTECTED], Matt Dillon writes:
:
:Hmm.  Do we have a guard page at the base of the per process kernel
:stack?
:
:As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
:pages of per-process kernel state at p->p_addr. The stack grows
:down from the top, and struct user (sys/user.h) sits at the bottom.
:According to the comment in the definition of struct user, only
:the first three items in struct user are valid in normal running
:conditions:

Ok.  I'm going to add a magic number to the end of the process
structure and check it in mi_switch() in -stable.

-Matt





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon


:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:
:Hmm.  Do we have a guard page at the base of the per process kernel
:stack?
:
:As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
:pages of per-process kernel state at p->p_addr. The stack grows
:down from the top, and struct user (sys/user.h) sits at the bottom.
:According to the comment in the definition of struct user, only
:the first three items in struct user are valid in normal running
:conditions:

Er, I mean I'll add a magic number to struct pstats.
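
The magic-number idea can be illustrated with a toy model. This is Python, not kernel code, and it is not the change that was committed: the byte array stands in for the 8192-byte per-process area discussed earlier in the thread, and the magic value, its offset and all function names are invented for illustration.

```python
import struct

MAGIC = 0xDEADC0DE          # hypothetical canary value
AREA_SIZE = 8192            # UPAGES * 4096 on i386
STACK_TOP = 8176            # top of stack in the layout from the thread
STACK_BASE = 3504           # lowest stack address in that layout
MAGIC_OFF = STACK_BASE - 4  # hypothetical: canary is the last word below the stack

def new_area():
    """Fresh per-process area with the canary written just below the stack."""
    area = bytearray(AREA_SIZE)
    struct.pack_into("<I", area, MAGIC_OFF, MAGIC)
    return area

def canary_intact(area):
    """The kind of check a context-switch hook could make."""
    return struct.unpack_from("<I", area, MAGIC_OFF)[0] == MAGIC

def push(area, sp, nbytes):
    """Simulate the stack growing down by nbytes from sp, with no bounds check."""
    new_sp = sp - nbytes
    area[new_sp:sp] = b"\xff" * nbytes
    return new_sp

area = new_area()
sp = push(area, STACK_TOP, 4000)   # still inside the stack space
print(canary_intact(area))         # True
sp = push(area, sp, 700)           # runs past STACK_BASE and over the canary
print(canary_intact(area))         # False
```

The check only notices an overflow if the overflowing writes actually touch the canary word.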

-Matt




ecc on i386

2001-09-24 Thread Andrew Gallatin


What happens on an ECC equipped PC when you have a multi-bit memory
error that hardware scrubbing can't fix?  Will there be some sort of
NMI or something that will panic the box?

I'm used to alphas (where you'll get a fatal machine check panic) and
I am just wondering if PCs are as safe.

Thanks,

Drew






Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Julian Elischer

stack can be somewhat sparse depending on execution path, but it's not a
bad idea..


On Mon, 24 Sep 2001, Matt Dillon wrote:

 :In message [EMAIL PROTECTED], Matt Dillon writes:
 :
 :Hmm.  Do we have a guard page at the base of the per process kernel
 :stack?
 :
 :As I understand it, no. In RELENG_4 there are UPAGES (== 2 on i386)
 :pages of per-process kernel state at p->p_addr. The stack grows
 :down from the top, and struct user (sys/user.h) sits at the bottom.
 :According to the comment in the definition of struct user, only
 :the first three items in struct user are valid in normal running
 :conditions:
 
 Ok.  I'm going to add a magic number to the end of the process
 structure and check it in mi_switch() in -stable.
   
   -Matt
 
 



Re: ecc on i386

2001-09-24 Thread Matt Dillon


:What happens on an ECC equipped PC when you have a multi-bit memory
:error that hardware scrubbing can't fix?  Will there be some sort of
:NMI or something that will panic the box?
:
:I'm used to alphas (where you'll get a fatal machine check panic) and
:I am just wondering if PCs are as safe.
:
:Thanks,
:
:Drew

ECC can typically detect and correct single bit errors and detect
double bit errors.  Anything beyond that is problematic... it may or
may not detect the problem or may mis-correct a multi-bit error. 
An NMI is generated if an uncorrectable error is detected.
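
The behaviour described above (correct one bit, detect two, possibly mis-correct more) can be demonstrated with a toy code. This sketch uses a Hamming(7,4) code plus one overall parity bit, protecting 4 data bits; it is only an illustration, and real memory controllers use wider codes (commonly 64 data bits with 8 check bits). The function names are made up.

```python
def encode(data4):
    """4 data bits -> 8-bit codeword [p0, p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data4
    p1 = d1 ^ d2 ^ d4            # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4            # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4            # covers positions 4, 5, 6, 7
    word = [0, p1, p2, d1, p4, d2, d3, d4]
    word[0] = sum(word) % 2      # overall parity over the other seven bits
    return word

def decode(word):
    """Return (status, data4); status is 'ok', 'corrected' or 'uncorrectable'."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos      # XOR of the positions of all set bits
    overall = sum(w) % 2         # 0 while the total parity is still even
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume exactly one and repair it.
        w[syndrome] ^= 1         # syndrome 0 means p0 itself was hit
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        return "uncorrectable", None
    return status, [w[3], w[5], w[6], w[7]]

def flip(word, *positions):
    """Copy of word with the given bit positions inverted."""
    w = list(word)
    for p in positions:
        w[p] ^= 1
    return w

data = [1, 0, 1, 1]
cw = encode(data)
print(decode(cw))                  # ('ok', [1, 0, 1, 1])
print(decode(flip(cw, 5)))         # ('corrected', [1, 0, 1, 1])
print(decode(flip(cw, 5, 6)))      # ('uncorrectable', None)
print(decode(flip(cw, 1, 2, 4)))   # ('corrected', [1, 0, 1, 0])  <- mis-corrected
```

The last case shows the mis-correction: three flipped bits look like a single-bit error, so the decoder reports success and returns wrong data.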

On PC's, ECC is optional.  Desktops typically do not ship with ECC
memory.  Branded servers typically do.  A year or two ago I would
have been happy to use non-ECC rams (finding bad RAM through trial
and error), but now with capacities as they are and memory prices down
ECC is definitely the way to go.

Bit errors can come from many sources, memory being only one.  Bit errors
can occur inside the cpu chip, in the L1 and L2 caches, in memory, in
controller chips... all over the place.  Many modern processors implement
parity on their caches to try to cover the problem areas.  I'm not sure
how Pentium III's and IV's are set up.

-Matt





Re: ecc on i386

2001-09-24 Thread Andrew Gallatin


Matt Dillon writes:
  
  :What happens on an ECC equipped PC when you have a multi-bit memory
  :error that hardware scrubbing can't fix?  Will there be some sort of
  :NMI or something that will panic the box?
  :
  :I'm used to alphas (where you'll get a fatal machine check panic) and
  :I am just wondering if PCs are as safe.
  :
  :Thanks,
  :
  :Drew
  
  ECC can typically detect and correct single bit errors and detect
  double bit errors.  Anything beyond that is problematic... it may or
  may not detect the problem or may mis-correct a multi-bit error. 
  An NMI is generated if an uncorrectable error is detected.
  
  On PC's, ECC is optional.  Desktops typically do not ship with ECC
  memory.  Branded servers typically do.  A year or two ago I would
  have been happy to use non-ECC rams (finding bad RAM through trial
  and error), but now with capacities as they are and memory prices down
  ECC is definitely the way to go.

My sentiments exactly.

  Bit errors can come from many sources, memory being only one.  Bit errors
  can occur inside the cpu chip, in the L1 and L2 caches, in memory, in
  controller chips... all over the place.  Many modern processors implement
  parity on their caches to try to cover the problem areas.  I'm not sure
  how Pentium III's and IV's are setup.
  
   -Matt

Hmm.. Well, it turns out that the box I'm interested in (Thunder K7)
can be set to send an SERR on multiple bit errors.  I wonder what
happens when a pc gets an SERR? (that's another machine check
on alpha)

Drew




Re: ecc on i386

2001-09-24 Thread Peter Wemm

Andrew Gallatin wrote:
 
 What happens on an ECC equipped PC when you have a multi-bit memory
 error that hardware scrubbing can't fix?  Will there be some sort of
 NMI or something that will panic the box?
 
 I'm used to alphas (where you'll get a fatal machine check panic) and
 I am just wondering if PCs are as safe.

Basically it depends on how the bios has programmed the chipsets and how
the motherboard is wired.

The usual way goes something like this:

There are two PCI signals, #PERR (pci error), #SERR (system error).

Various devices can be programmed to assert these under various conditions.

Things like bus master fifo underflows etc will be programmed to assert #PERR
and are generally not fatal.

The memory controller is usually programmed to assert #SERR on a multiple
bit error and either #SERR or some other signal (a GPIO or something like
#SALERT on a serverworks chip) for a single bit (corrected) error.

The south bridge listens to #SERR and #PERR and can convert those into NMI
events.  Usually #SERR shows up as parity error and #PERR shows up as
IOCHK (if it is enabled).

The bad news is that many bios manufacturers **TURN OFF** ECC functionality
in order to speed things up.  The reason for this is that with ECC off, the
cpu can read/write down to byte granularity.  With ECC on, memory is
rigidly enforced as 64 bit quantities (ecc-encoded out to 72 bits).  If the
cpu reads a byte, the memory controller actually fetches all 64 (72) bits.
If the cpu writes a byte, the memory controller has to do a
read-merge-write cycle where it reads the 64 bit value, merges in the 1
byte write and writes out the entire 64 bit value again.  This (naturally)
shows up in poor benchmarks so they like to turn it off by default in order
to get a speed edge.  Tyan is a notable example here (eg: the Thunder K7,
the dual-athlon DDR-SDRAM board has ECC turned off by default(!!)).  I am
sure that others do it too.
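
A minimal sketch of the read-merge-write cycle described above, in C.  It
models memory as an array of 64-bit words and treats byte 0 as the least
significant byte of each word; both choices, and the function names, are
assumptions made for this example rather than a description of any real
chipset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace one byte inside a 64-bit word, leaving the other seven alone. */
static uint64_t merge_byte(uint64_t word, unsigned byte_index, uint8_t value)
{
    unsigned shift;
    uint64_t mask;

    assert(byte_index < 8);
    shift = byte_index * 8;
    mask = (uint64_t)0xff << shift;
    return (word & ~mask) | ((uint64_t)value << shift);
}

/*
 * A one-byte store with ECC enabled: the controller cannot update the
 * check bits from a single byte, so it handles the whole 64-bit word.
 */
static void ecc_write_byte(uint64_t *mem, size_t addr, uint8_t value)
{
    uint64_t word = mem[addr / 8];                       /* read  */
    word = merge_byte(word, (unsigned)(addr % 8), value); /* merge */
    mem[addr / 8] = word;                                /* write */
}
```

The extra read in front of every sub-word write is the cost that shows up
in the benchmarks mentioned above.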

The Tyan Thunder 2510 BIOS even disables ECC -> NMI routing so you have to
go to quite a bit of trouble to reprogram the serverworks chipset to
actually generate NMI's so that you can find out if something got trashed.

Our NMI / ECC handling really really sucks in FreeBSD. Consider:
- i686_pagezero - reads before writing in order to minimize cache snooping
traffic in SMP systems.  However, if it gets an NMI while trying to check
if the cache line is already zero, it will take the entire machine down
instead of just zeroing the line.
- NFS / VM / bio:  when they get an NMI while trying to copy data that is
clean and backed by storage, they take the machine down instead of trying
to recover and re-read the page.
- userland.. If userland gets an NMI, the machine dies instead of killing
the process (or rereading a text page etc if possible)
- our NMI handlers are a festering pile of excrement.  They don't have
the code to 'ack' the NMI so it isn't possible to return after recovery.
- and so on.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: ecc on i386

2001-09-24 Thread Peter Wemm

Andrew Gallatin wrote:
 
 Matt Dillon writes:
   
   :What happens on an ECC equipped PC when you have a multi-bit memory
   :error that hardware scrubbing can't fix?  Will there be some sort of
   :NMI or something that will panic the box?
   :
   :I'm used to alphas (where you'll get a fatal machine check panic) and
   :I am just wondering if PCs are as safe.
   :
   :Thanks,
   :
   :Drew
   
   ECC can typically detect and correct single bit errors and detect
   double bit errors.  Anything beyond that is problematic... it may or
   may not detect the problem or may mis-correct a multi-bit error. 
   An NMI is generated if an uncorrectable error is detected.
   
   On PC's, ECC is optional.  Desktops typically do not ship with ECC
   memory.  Branded servers typically do.  A year or two ago I would
   have been happy to use non-ECC rams (finding bad RAM through trial
   and error), but now with capacities as they are and memory prices down
   ECC is definitely the way to go.
 
 My sentiments exactly.

I wrote a poller for picking up correction events on various serverworks
motherboards (compaq, tyan) and it was *scary* how often single-bit errors
were being corrected.

   Bit errors can come from many sources, memory being only one.  Bit errors
   can occur inside the cpu chip, in the L1 and L2 caches, in memory, in
   controller chips... all over the place.  Many modern processors implement
   parity on their caches to try to cover the problem areas.  I'm not sure
   how Pentium III's and IV's are setup.
   
  -Matt
 
 Hmm.. Well, it turns out that the box I'm interested in (Thunder K7)
 can be set to send an SERR on multiple bit errors.  I wonder what
 happens when a pc gets an SERR? (that's another machine check
 on alpha)

On the Thunder K7, #SERR is routed to NMI.  Trust me, you want this.
And set it to ECC-SCRUB instead of off like the default now is.

See my other email about how #SERR is converted to NMI via the ISA part of
the south bridge.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Peter Wemm

Matt Dillon wrote:
 
 :The pointers in the last few entries of the vm_page_buckets array got
 :corrupted when an argument to a function that manipulated whatever was next
 :in ram was 0, and it turned out that it was 0 because
 : of some PTE flushing thing (you are the one that found it... remember?)
 :
 :I think I've also seen a few reports of programs exiting with
 :Profiling timer expired messages with 4.4. These can be caused
 :by stack overflows, since the p_timer[] array in struct pstats is
 :one of the things that I think lives below the per-process kernel
 :stack. I wonder if they are related? Stack overflows could result
 :in corruption of local variables, after which anything could happen.
 :
 :That said, hardware problems are still a possibility.
 :
 :Ian
 
 Hmm.  Do we have a guard page at the base of the per process kernel
 stack?
 
   -Matt

I did it as part of the KSE work in 5.x.  It would be quite easy to do it
for 4.x as well, but it makes a.out coredumps problematic.

Also, options UPAGES=4 is a pretty good defensive measure.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon

:
:I did it as part of the KSE work in 5.x.  It would be quite easy to do it
:for 4.x as well, but it makes a.out coredumps problematic.
:
:Also, options UPAGES=4 is a pretty good defensive measure.
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

Well, in 4.x:

(kgdb) print p->p_addr
$6 = (struct user *) 0xcb7b9000
(kgdb) print &p->p_addr->u_sigacts
$7 = (struct sigacts *) 0xcb7b9260
(kgdb) print &p->p_addr->u_stats
$8 = (struct pstats *) 0xcb7b9cd0
(kgdb) print &p->p_addr->u_kproc
$9 = (struct kinfo_proc *) 0xcb7b9db0
(kgdb) print &p->p_addr->u_md
$10 = (struct md_coredump *) 0xcb7ba1d0
(kgdb) print &p->p_addr->u_guard    (my new field)
$11 = (u_int32_t *) 0xcb7ba1d0
(kgdb) 

cb7b9000    start of kstack
cb7ba1d4    end of struct user
cb7bb000    top of kstack

Leaving us 3628 bytes for the kernel stack.
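
The 3628 figure follows directly from the addresses in the kgdb session.
A small C sketch of that arithmetic, assuming 4096-byte pages and UPAGES
set to 2 (the helper name kstack_room() is invented for this example):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes left for the kernel stack: the u-area spans upages pages starting
 * at uarea_base, and struct user occupies everything below user_end.
 */
static uint32_t kstack_room(uint32_t uarea_base, uint32_t user_end,
    unsigned upages, uint32_t page_size)
{
    uint32_t top = uarea_base + upages * page_size;

    assert(user_end >= uarea_base && user_end <= top);
    return top - user_end;
}
```

With the values above, kstack_room(0xcb7b9000, 0xcb7ba1d4, 2, 4096) gives
0xcb7bb000 - 0xcb7ba1d4 = 0xe2c = 3628 bytes.  Each additional page of
u-area adds 4096 bytes to that, provided struct user itself does not grow.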

Something really weird is going on... I added u_guard to the end
of the struct user structure and there are two or three processes
hitting the guard immediately.  All the rest are ok.  I'm going
to investigate further but this is very odd.  Am I missing something
about the UAREA?

-Matt





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Mike Silbersack


On Mon, 24 Sep 2001, Matt Dillon wrote:

 Yowzer.  How the hell did that happen!  Yes, you're right, the
 vm_page_array[] pointer has gotten corrupted.  If we assume that
 the vm_page_t is valid (0xc0842acc), then the vm_page_buckets[]
 pointer should be that.

...

 This is very similar to the corruption I found on one of Yahoo's
 machines.  Except on that machine two bits were changed.  It's as though
 some other subsystem is trying to manipulate a flag in a structure using
 a bad structure pointer.

   -Matt

Ok, time to take a good stab at sticking my foot in my mouth here.

Would it be possible to have a kernel mode where the read-only bit was
turned on for malloc pools which shouldn't currently be accessed?  This
could be gated through the spl() calls (or specific mutexes on -current),
ensuring that something like getpid couldn't stomp on the vm structures
w/o first doing a splvm().

Obviously this wouldn't help find bugs in interrupt handlers or other high
level calls, but it could help locate some memory corruption problems.
Actually, since memory regions roughly follow locks, this could be an even
more powerful tool on -current once it develops more.

Is this even feasible in ring 0?

Mike "Silby" Silbersack





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Peter Wemm

Matt Dillon wrote:
 :
 :I did it as part of the KSE work in 5.x.  It would be quite easy to do it
 :for 4.x as well, but it makes a.out coredumps problematic.
 :
 :Also, options UPAGES=4 is a pretty good defensive measure.
 :
 :Cheers,
 :-Peter
 :--
 :Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
 
 Well, in 4.x:
 
 (kgdb) print p->p_addr
 $6 = (struct user *) 0xcb7b9000
 (kgdb) print &p->p_addr->u_sigacts
 $7 = (struct sigacts *) 0xcb7b9260
 (kgdb) print &p->p_addr->u_stats
 $8 = (struct pstats *) 0xcb7b9cd0
 (kgdb) print &p->p_addr->u_kproc
 $9 = (struct kinfo_proc *) 0xcb7b9db0
 (kgdb) print &p->p_addr->u_md
 $10 = (struct md_coredump *) 0xcb7ba1d0
 (kgdb) print &p->p_addr->u_guard    (my new field)
 $11 = (u_int32_t *) 0xcb7ba1d0
 (kgdb) 
 
 cb7b9000  start of kstack
 cb7ba1d4  end of struct user
 cb7bb000  top of kstack
 
 Leaving us 3628 bytes for the kernel stack.
 
 Something really weird is going on... I added u_guard to the end
 of the struct user structure and there are two or three processes
 hitting the guard immediately.  All the rest are ok.  I'm going
 to investigate further but this is very odd.  Am I missing something
 about the UAREA?

Yes. u_md etc isn't used while the process is running.  If you're going to
have u_guard, it should come directly after u_stats, and *before* u_kproc,
u_md etc.

I had been contemplating making a fake 'struct user' in userland only in
order to keep the a.out coredump reader code happy.  The a.out coredump
code (see cpu_coredump() in */*/vm_machdep.c) can generate this fake
structure in order to keep gdb happy.  But then I realized that a.out
coredump debugging was almost totally irrelevant these days.

Actually I tell a lie. In 4.x, u_kproc *can* be used on a live process..
see the **NASTY** PT_READ_U and PT_WRITE_U code in sys_process.c. It does a
fill_eproc() in order to be able to read/write values from there. Nothing
uses this stuff. I removed it from -current quite a while ago, and it
should be MFC'ed too.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Peter Wemm

Matt Dillon wrote:
 :
 :I did it as part of the KSE work in 5.x.  It would be quite easy to do it
 :for 4.x as well, but it makes a.out coredumps problematic.
 :
 :Also, options UPAGES=4 is a pretty good defensive measure.
 :
 :Cheers,
 :-Peter
 :--
 :Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
 
 Well, in 4.x:
 
 (kgdb) print p->p_addr
 $6 = (struct user *) 0xcb7b9000
 (kgdb) print &p->p_addr->u_sigacts
 $7 = (struct sigacts *) 0xcb7b9260
 (kgdb) print &p->p_addr->u_stats
 $8 = (struct pstats *) 0xcb7b9cd0
 (kgdb) print &p->p_addr->u_kproc
 $9 = (struct kinfo_proc *) 0xcb7b9db0
 (kgdb) print &p->p_addr->u_md
 $10 = (struct md_coredump *) 0xcb7ba1d0
 (kgdb) print &p->p_addr->u_guard    (my new field)
 $11 = (u_int32_t *) 0xcb7ba1d0
 (kgdb) 
 
 cb7b9000  start of kstack
 cb7ba1d4  end of struct user
 cb7bb000  top of kstack
 
 Leaving us 3628 bytes for the kernel stack.
 
 Something really weird is going on... I added u_guard to the end
 of the struct user structure and there are two or three processes
 hitting the guard immediately.  All the rest are ok.  I'm going
 to investigate further but this is very odd.  Am I missing something
 about the UAREA?

Oh, one other thing...  When we had PCIBIOS active for pci config space
read/write support, we had stack overflows on many systems when the SSE
stuff got MFC'ed.  The simple act of trimming about 300 bytes from the
pcb_save structure was enough to make the difference between it working or
not.  We are *way* too close to the wire.  I asked about raising UPAGES
from 2 to 3 before RELENG_4_4 but it never happened.

Julian cleaned up a couple of places where we were allocating 2K of
local data *twice* on local stack frames.  There are some gcc patches
floating around that enable you to generate a warning if your local stack
frame exceeds a certain amount or the arguments are bigger than a
specified amount.
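
To illustrate the kind of code such a warning is meant to catch, here is a
small sketch in C.  The functions are invented for this example and are not
taken from the kernel.  The gcc patches mentioned above are not identified
here; as a separate point of reference, later GCC releases ship a
-Wframe-larger-than=BYTES option, and compiling this file with
-Wframe-larger-than=1024 should flag the first function but not the second.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAMEBUF 2048

/* Puts a 2K buffer in the local stack frame on every call. */
static size_t name_len_big_frame(const char *src)
{
    char buf[NAMEBUF];

    strncpy(buf, src, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return strlen(buf);
}

/* Same work, but the caller supplies the scratch space. */
static size_t name_len_small_frame(const char *src, char *scratch, size_t len)
{
    assert(len > 0);
    strncpy(scratch, src, len - 1);
    scratch[len - 1] = '\0';
    return strlen(scratch);
}
```

Moving a large buffer out of the frame (to the caller, to a static, or to
an allocation) is one way to get the stack use back; which one is
appropriate depends on reentrancy and locking in the real code.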

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: VM Corruption - stumped, anyone have any ideas?

2001-09-24 Thread Matt Dillon

:Oh, one other thing...  When we had PCIBIOS active for pci config space
:read/write support, we had stack overflows on many systems when the SSE
:stuff got MFC'ed.  The simple act of trimming about 300 bytes from the
:pcb_save structure was enough to make the difference between it working or
:not.  We are *way* too close to the wire.  I asked about raising UPAGES
:from 2 to 3 before RELENG_4_4 but it never happened.
:
:Julian cleaned up a couple of places where we were allocating 2K of
:local data *twice* on local stack frames.  There are some gcc patches
:floating around that enable you to generate a warning if your local stack
:frame exceeds a certain amount or the arguments are bigger than a
:specified amount.
:
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]

I'm getting stack underflows with UPAGES set to 2.  I've set UPAGES to 4
and preinitialized the UAREA to 0x11 and then scan it in exit1() to
determine how much stack was actually used.  If these numbers are
correct, we are screwed with UPAGES set to 2.  This is just 4 seconds
worth of a buildworld.  Note the '3664's showing up.  That's too close.
Note the 3984 that came up after playing with the system for a few
seconds!
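
The measurement technique described above (fill the area with a known byte,
then scan for the first byte that no longer matches) can be sketched in
userland C.  This is not the kernel patch: the buffer stands in for the
u-area, "reserved" stands in for the space struct user takes at the bottom,
and the function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FILL_BYTE 0x11

/* Paint the whole area before the stack is ever used. */
static void stack_paint(unsigned char *area, size_t size)
{
    memset(area, FILL_BYTE, size);
}

/*
 * The stack grows down from area + size.  Scan upward from the end of the
 * reserved region; the first byte that is not FILL_BYTE marks the deepest
 * point the stack reached.  Returns the number of bytes used.
 */
static size_t stack_high_water(const unsigned char *area, size_t size,
    size_t reserved)
{
    size_t i = reserved;

    assert(reserved <= size);
    while (i < size && area[i] == FILL_BYTE)
        i++;
    return size - i;
}
```

One caveat with this approach: if the deepest bytes written to the stack
happen to equal the fill byte, the scan skips over them and under-reports
the usage, so the result is a lower bound.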

I'll post the patch set to use to test this stuff in a moment.

-Matt

process 323 exit kstackuse 2272
...
process 333 exit kstackuse 2272
process 225 exit kstackuse 3664
process 233 exit kstackuse 2272
...
process 237 exit kstackuse 2272
process 322 exit kstackuse 2676
process 334 exit kstackuse 2272
...
process 319 exit kstackuse 2272

test1# dmesg | fgrep process | sort -n +4 | tail -10
process 6 exit kstackuse 3640
process 89 exit kstackuse 3640
process 176 exit kstackuse 3664
process 186 exit kstackuse 3664
process 225 exit kstackuse 3664
process 290 exit kstackuse 3664
process 299 exit kstackuse 3664
process 300 exit kstackuse 3664
process 303 exit kstackuse 3664
process 138 exit kstackuse 3984





Patch to test kstack usage.

2001-09-24 Thread Matt Dillon

This isn't perfect but it should be a good start in regards to 
testing kstack use.  This patch is against -stable.  It reports
kernel stack use on process exit and will generate a 'Kernel stack
underflow' message if it detects an underflow.  It doesn't panic,
so for a fun time you can leave UPAGES at 2 and watch in horror.
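
The guard-word check itself is a simple idea, and a userland sketch in C
may make it easier to follow.  Only U_GUARD_MAGIC comes from the patch;
the flat buffer, the GUARD_OFF offset and the function names are
hypothetical stand-ins for the real u-area layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define U_GUARD_MAGIC 0x51A2C3D4u   /* value used in the patch */
#define AREA_SIZE     8192u         /* two 4K pages */
#define GUARD_OFF     3500u         /* hypothetical offset of the guard word */

/* Store the magic value just below the region the stack may use. */
static void guard_arm(unsigned char *area)
{
    uint32_t magic = U_GUARD_MAGIC;

    memcpy(area + GUARD_OFF, &magic, sizeof magic);
}

/* Nonzero while the guard word still holds the magic value. */
static int guard_intact(const unsigned char *area)
{
    uint32_t magic;

    memcpy(&magic, area + GUARD_OFF, sizeof magic);
    return magic == U_GUARD_MAGIC;
}

/* Pretend the stack grew depth bytes down from the top of the area. */
static void simulate_stack_use(unsigned char *area, size_t depth)
{
    assert(depth <= AREA_SIZE - GUARD_OFF);
    memset(area + AREA_SIZE - depth, 0xA5, depth);
}
```

A check like guard_intact() only reports damage after the fact, and only
if the overrun actually changes the guard word; a stack frame that skips
over it without writing goes unnoticed.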

note: make sure you make depend before making a new kernel, or use
buildkernel.

-Matt


Index: sys/user.h
===
RCS file: /home/ncvs/src/sys/sys/user.h,v
retrieving revision 1.24
diff -u -r1.24 user.h
--- sys/user.h  1999/12/29 04:24:49 1.24
+++ sys/user.h  2001/09/25 03:41:04
@@ -109,9 +109,13 @@
 * Remaining fields only for core dump and/or ptrace--
 * not valid at other times!
 */
+   u_int32_t u_guard2; /* guard the base of the kstack */
struct  kinfo_proc u_kproc; /* proc + eproc */
struct  md_coredump u_md;   /* machine dependent glop */
+   u_int32_t u_guard;  /* guard the base of the kstack */
 };
+
+#define U_GUARD_MAGIC   0x51A2C3D4
 
 /*
  * Redefinitions to make the debuggers happy for now...  This subterfuge
Index: kern/init_main.c
===
RCS file: /home/ncvs/src/sys/kern/init_main.c,v
retrieving revision 1.134.2.6
diff -u -r1.134.2.6 init_main.c
--- kern/init_main.c2001/06/15 09:37:55 1.134.2.6
+++ kern/init_main.c2001/09/25 01:39:05
@@ -358,6 +358,7 @@
 */
    p->p_stats = &p->p_addr->u_stats;
    p->p_sigacts = &p->p_addr->u_sigacts;
+   p->p_addr->u_guard = U_GUARD_MAGIC; /* bottom of kernel stack */
 
/*
 * Charge root for one process.
Index: kern/kern_exit.c
===
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.92.2.5
diff -u -r1.92.2.5 kern_exit.c
--- kern/kern_exit.c2001/07/27 14:06:01 1.92.2.5
+++ kern/kern_exit.c2001/09/25 04:09:32
@@ -123,6 +123,16 @@
            WTERMSIG(rv), WEXITSTATUS(rv));
        panic("Going nowhere without my init!");
    }
+   {
+       int *ua;
+       int *addrend = (int *)((char *)p->p_addr + UPAGES * PAGE_SIZE);
+       for (ua = &p->p_addr->u_guard + 1; ua < addrend; ++ua) {
+           if (*ua != 0x11111111)
+               break;
+       }
+       printf("process %d exit kstackuse %d\n",
+           p->p_pid, (char *)addrend - (char *)ua);
+   }
 
aio_proc_rundown(p);
 
Index: kern/kern_synch.c
===
RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.87.2.3
diff -u -r1.87.2.3 kern_synch.c
--- kern/kern_synch.c   2000/12/31 22:10:45 1.87.2.3
+++ kern/kern_synch.c   2001/09/25 02:54:46
@@ -44,13 +44,17 @@
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/proc.h>
+#include <sys/lock.h>
 #include <sys/kernel.h>
 #include <sys/signalvar.h>
 #include <sys/resourcevar.h>
 #include <sys/vmmeter.h>
 #include <sys/sysctl.h>
 #include <vm/vm.h>
+#include <vm/pmap.h>
+#include <vm/vm_map.h>
 #include <vm/vm_extern.h>
+#include <sys/user.h>
 #ifdef KTRACE
 #include <sys/uio.h>
 #include <sys/ktrace.h>
@@ -792,6 +796,13 @@
register struct proc *p = curproc;  /* XXX */
register struct rlimit *rlim;
int x;
+
+   /*
+    * Check to see if the kernel stack underflowed (XXX)
+    */
+   if (p->p_addr->u_guard != U_GUARD_MAGIC) {
+       printf("Kernel stack underflow! %p %p %08x\n", p, p->p_addr, p->p_addr->u_guard);
+   }
 
/*
 * XXX this spl is almost unnecessary.  It is partly to allow for
Index: i386/i386/pmap.c
===
RCS file: /home/ncvs/src/sys/i386/i386/pmap.c,v
retrieving revision 1.250.2.10
diff -u -r1.250.2.10 pmap.c
--- i386/i386/pmap.c2001/07/30 23:27:59 1.250.2.10
+++ i386/i386/pmap.c2001/09/25 04:03:52
@@ -891,6 +891,7 @@
}
if (updateneeded)
invltlb();
+   memset(up, 0x11, UPAGES * PAGE_SIZE);
 }
 
 /*
Index: i386/include/param.h
===
RCS file: /home/ncvs/src/sys/i386/include/param.h,v
retrieving revision 1.54.2.5
diff -u -r1.54.2.5 param.h
--- i386/include/param.h2001/09/15 00:50:36 1.54.2.5
+++ i386/include/param.h2001/09/25 03:41:11
@@ -110,7 +110,7 @@
 #define MAXDUMPPGS (DFLTPHYS/PAGE_SIZE)
 
 #define IOPAGES 2  /* pages of i/o permission bitmap */
-#define UPAGES 2   /* pages of u-area */
+#define UPAGES 4   /* pages of u-area */
 
 /*
  * Ceiling on amount of swblock kva space.
Index: vm/vm_glue.c

Re: Patch to test kstack usage.

2001-09-24 Thread Peter Wemm

Matt Dillon wrote:
 This isn't perfect but it should be a good start in regards to 
 testing kstack use.  This patch is against -stable.  It reports
 kernel stack use on process exit and will generate a 'Kernel stack
 underflow' message if it detects an underflow.  It doesn't panic,
 so for a fun time you can leave UPAGES at 2 and watch in horror.

It is checking against the wrong guard value. It should be u_guard2.

FWIW; the max stack available is 4688 bytes on a standard 4.x system. Yes,
that is too freaking close.  Also, the maximum usage depends on what sort
of cards you have in the system.. If you have a heavy tty user (eg: a 32+
port serial card) then you have lots of tty interrupts nesting as well.
Having the ppp/sl/plip drivers in the system partly negates the effect of
this though since it wires the net/tty interrupt masks together.

peter@thunder[10:13pm]~-111 ./tu
stack base = 3504
stack size = 4688
peter@thunder[10:13pm]~-112 cat tu.c
#include <sys/param.h>
#include <sys/user.h>
#include <stdio.h>
#include <stddef.h>

int
main(int ac, char **av)
{
    int stack_base = offsetof(struct user, u_kproc);

    printf("stack base = %d\n", stack_base);
    printf("stack size = %d\n", UPAGES * PAGE_SIZE - stack_base);
}

 --- sys/user.h1999/12/29 04:24:49 1.24
 +++ sys/user.h2001/09/25 03:41:04
 @@ -109,9 +109,13 @@
* Remaining fields only for core dump and/or ptrace--
* not valid at other times!
*/
 + u_int32_t u_guard2; /* guard the base of the kstack */
   struct  kinfo_proc u_kproc; /* proc + eproc */
   struct  md_coredump u_md;   /* machine dependent glop */
 + u_int32_t u_guard;  /* guard the base of the kstack */
  };


Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: Patch to test kstack usage.

2001-09-24 Thread Matt Dillon

:
:Matt Dillon wrote:
: This isn't perfect but it should be a good start in regards to 
: testing kstack use.  This patch is against -stable.  It reports
: kernel stack use on process exit and will generate a 'Kernel stack
: underflow' message if it detects an underflow.  It doesn't panic,
: so for a fun time you can leave UPAGES at 2 and watch in horror.
:
:It is checking against the wrong guard value. It should be u_guard2.
:
:FWIW; the max stack available is 4688 bytes on a standard 4.x system. Yes,
:that is too freaking close.  Also, the maximum usage depends on what sort
:of cards you have in the system.. If you have a heavy tty user (eg: a 32+

I looked at it fairly carefully.  It has got to be u_guard... at the
end of struct user, at least until you do that MFC.  The ptrace code
appears to mess around with u_kproc quite a bit.  And when you rip out
u_kproc it still needs to be at the end, after the coredump structure
(though for i386 the coredump structure is empty)... because interrupts
can occur during a core dump.

:port serial card) then you have lots of tty interrupts nesting as well.
:Having the ppp/sl/plip drivers in the system partly negates the effect of
:this though since it wires the net/tty interrupt masks together.
:...
:Cheers,
:-Peter
:--
:Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
:All of this is for nothing if we don't go to the stars - JMS/B5
:

Yah... the test I ran was just a couple of seconds worth of playing
around over ssh.  I expect the worst case to be a whole lot worse.

We're going to have to bump up UPAGES to 3 in 4.x, there's no question
about it.  I'm going to do it tonight.

-Matt





Re: Patch to test kstack usage.

2001-09-24 Thread Matt Dillon

:stack size = 4688

Sep 24 22:47:22 test1 /kernel: process 29144 exit kstackuse 4496

closer... :-)

-Matt




Re: Patch to test kstack usage.

2001-09-24 Thread Peter Wemm

Matt Dillon wrote:

 Yah... the test I ran was just a couple of seconds worth of playing
 around over ssh.  I expect the worst case to be a whole lot worse.
 
 We're going to have to bump up UPAGES to 3 in 4.x, there's no question
 about it.  I'm going to do it tonight.

Heh. I already asked to do it a few weeks ago, in order to get it into the
release.  I guess I wasn't persistent enough.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message