Re: size of /usr/src

2002-01-16 Thread Joel M. Baldwin


My current /usr/src is 524M.  That includes 83M of kernel
object files in /usr/src/sys/i386/compile from a couple
of different kernel builds.  /usr/obj which holds the
object files from a buildworld is 460M.  If you're going
to do a full cvs repository then /home/ncvs on my system
is 1391M.


--On Tuesday, January 15, 2002 11:38 PM -0800 Arvind Srivaths 
[EMAIL PROTECTED] wrote:


 Hi,

 I created a separate partition for /usr/src (around 420MB) and cvsup
 ran out of space.  Can someone give me a rough idea of how big it is?
 Also, I should be able to use growfs (after booting off of a floppy)
 to increase the size of the partition (if the slice has space),
 right? How about moving partitions - is there an easier way than
 creating a partition at the end of the slice and copying partitions
 down?

 Thanks,

 Arvind

 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with unsubscribe freebsd-current in the body of the message







Re: ghostscript-gnu build broken

2002-01-16 Thread Andreas Klemm


HP released a new version, 1.0.1.
They have fixed a bug in the sources.
Could someone try the new version with optimization (-O / -O2) turned on?
I can only do it 10 hours from now.



Andreas ///

-- 
Andreas Klemm
Apsfilter Homepage   http://www.apsfilter.org
Support over mailing-lists (only!)   http://www.apsfilter.org/support
Mailing-list archive http://www.apsfilter.org/Lists-Archives
Songs from our band  64Bits  http://www.64bits.de
Inofficial band pages with add-on stuff  http://www.apsfilter.org/64bits.html





Re: size of /usr/src

2002-01-16 Thread Crist J . Clark

On Wed, Jan 16, 2002 at 01:02:31AM -0800, Joel M. Baldwin wrote:
 
 My current /usr/src is 524M.  That includes 83M of kernel
 object files in /usr/src/sys/i386/compile from a couple
 of different kernel builds.  /usr/obj which holds the
 object files from a buildworld is 460M.  If you're going
 to do a full cvs repository then /home/ncvs on my system
 is 1391M.

Hmmm...

  $ du -kd 0 /export/stable/src
  315854  /export/stable/src
  $ du -kd 0 /export/stable/src
  383116  /export/stable/src
  $ du -kd 0 /export/ncvs
  1248390 /export/ncvs

-- 
It's always funny until someone gets hurt. Then it's hilarious.

Crist J. Clark | [EMAIL PROTECTED]
   | [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/| [EMAIL PROTECTED]




[no subject]

2002-01-16 Thread Sarnovsky, Jesse

help




socket shutdown delay?

2002-01-16 Thread Chad David

Has anyone noticed (or fixed) a bug in -current where socket connections
on the local machine do not shutdown properly?  During stress testing
I'm seeing thousands (2316 right now) of these:

tcp4   0  0  192.168.1.2.8080   192.168.1.2.2215   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2215   192.168.1.2.8080   LAST_ACK
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2214   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2214   192.168.1.2.8080   LAST_ACK
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2213   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2213   192.168.1.2.8080   LAST_ACK
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2212   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2212   192.168.1.2.8080   LAST_ACK
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2211   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2211   192.168.1.2.8080   LAST_ACK
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2210   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2210   192.168.1.2.8080   LAST_ACK

Both the client and the server are dead, but the connections stay in this
state.
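With thousands of entries, a per-state tally is easier to read than the raw listing. A minimal sketch in Python, assuming the column layout shown above (protocol first, state last); the function name and the sample text are illustrative only:

```python
import collections

def count_tcp_states(netstat_output):
    """Tally TCP connections by state from `netstat -an` style text."""
    counts = collections.Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local, foreign, state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts

# A few lines copied from the listing above.
sample = """\
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2215   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2215   192.168.1.2.8080   LAST_ACK
tcp4   0  0  192.168.1.2.8080   192.168.1.2.2214   FIN_WAIT_2
tcp4   0  0  192.168.1.2.2214   192.168.1.2.8080   LAST_ACK
"""

for state, n in sorted(count_tcp_states(sample).items()):
    print(state, n)
```

Feeding it the full netstat output before and after a test run shows which states are accumulating.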

I tested with the server on -current and the client on another box, and
all of the server sockets end up in TIME_WAIT.  Is there something delaying
the last ack on local connections?

Thanks.



FreeBSD colnta 5.0-CURRENT FreeBSD 5.0-CURRENT #17: Sun Jan 13 03:51:32 MST 2002

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #17: Sun Jan 13 03:51:32 MST 2002
davidc@colnta:/mnt1/obj/usr/src/sys/COLNTA
Preloaded elf kernel /boot/kernel/kernel at 0xc0521000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc05210a8.
Timecounter i8254  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (1004.52-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x68a  Stepping = 10
  
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 1073725440 (1048560K bytes)
avail memory = 1039327232 (1014968K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): apic id:  3, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  0, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178011, at 0xfec0
Pentium Pro MTRR support enabled
Using $PIR table, 7 entries at 0xc00f12d0
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: ASUS   CUV4X-D  on motherboard
acpi0: power button is handled as a fixed feature programming model.
Timecounter ACPI  frequency 3579545 Hz
acpi_timer0: 24-bit timer at 3.579545MHz port 0xe408-0xe40b on acpi0
acpi_cpu0: CPU on acpi0
acpi_cpu1: CPU on acpi0
acpi_button0: Power Button on acpi0
acpi_pcib0: Host-PCI bridge port 0xcf8-0xcff on acpi0
IOAPIC #0 intpin 18 - irq 2
IOAPIC #0 intpin 16 - irq 5
IOAPIC #0 intpin 19 - irq 10
pci0: PCI bus on acpi_pcib0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
isab0: PCI-ISA bridge at device 4.0 on pci0
isa0: ISA bus on isab0
atapci0: VIA 82C686 ATA100 controller port 0xb800-0xb80f at device 4.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: VIA 83C572 USB controller port 0xb400-0xb41f irq 2 at device 4.2 on pci0
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: VIA 83C572 USB controller port 0xb000-0xb01f irq 2 at device 4.3 on pci0
usb1: VIA 83C572 USB controller on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: display, VGA at device 12.0 (no driver attached)
dc0: ADMtek AN985 10/100BaseTX port 0xa800-0xa8ff mem 0xeb80-0xeb8003ff irq 10 at device 13.0 on pci0
dc0: Ethernet address: 00:04:5a:61:f5:6a
miibus0: MII bus on dc0
ukphy0: Generic IEEE 802.3u media interface on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: enhanced floppy controller (i82077, NE72065 or clone) port 0x3f7,0x3f2-0x3f5 irq 6 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: IEEE1284 device found /NIBBLE/ECP
Probing for PnP devices on ppbus0:
ppbus0: Hewlett-Packard HP LaserJet 1200 PRINTER PJL,MLC,PCL,PCLXL,POSTSCRIPT
plip0: PLIP network interface on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
ppi0: Parallel I/O on ppbus0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2f8-0x2ff irq 3 on 

[no subject]

2002-01-16 Thread XoX






Re: size of /usr/src

2002-01-16 Thread Bakul Shah

Your questions belong to freebsd-questions!

 I created a separate partition for /usr/src (around 420MB) and cvsup ran
 out of space.  Can someone give me a rough idea of how big it is?  Also,
 I should be able to use growfs (after booting off of a floppy) to increase
 the size of the partition (if the slice has space), right? How about moving
 partitions - is there an easier way than creating a partition at the end
 of the slice and copying partitions down?

Are you creating a 5.0-CURRENT or a 4-STABLE /usr/src?

On a -STABLE:
$ du -s /usr/src
355799  /usr/src

On a -CURRENT:
$ du -s /usr/src
389637  /usr/src

FFS likes to have about 10% free space, plus a few percent more
(maybe 4%) for inode space.  So you need a partition of at
least 450MB.  Leave another 20% to 50% free for future
source fat (second law of computer thermodynamics).  A
partition of 1GB wouldn't hurt!
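The arithmetic behind those figures can be sketched as follows. This is one reading of the rule of thumb, not a measurement: it assumes the du figure is in 1 KB blocks, and the 10% reserve and 4% inode overhead are taken from the message itself.

```python
# Rough partition sizing from the figures quoted above (all values in KB).
# Assumptions: du reported 1 KB blocks; the 10% FFS reserve and ~4% inode
# overhead are rule-of-thumb values, not measured ones.

def min_partition_kb(src_kb, reserve=0.10, inode_overhead=0.04):
    """Smallest partition whose usable space still holds src_kb."""
    usable_fraction = 1.0 - reserve - inode_overhead
    return src_kb / usable_fraction

current_src_kb = 389637  # du -s /usr/src on -CURRENT, from the message
base_kb = min_partition_kb(current_src_kb)
print(f"bare minimum: {base_kb / 1024:.0f} MB")

# Leave 20% to 50% extra for future growth of the tree.
for growth in (0.20, 0.50):
    grown_mb = base_kb * (1 + growth) / 1024
    print(f"with {growth:.0%} growth: {grown_mb:.0f} MB")
```

The bare minimum lands just under the 450MB quoted, and the growth allowance is what pushes a comfortable size toward 1GB.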

You need another 40MB or more for each kernel on whichever
partition you build them.  More if you turn debugging on.  Instead
of building kernels in /usr/src/sys/compile, you can do

cd /usr/src
make buildkernel KERNCONF=foo

to build them in /usr/obj/usr/src/sys/.

You don't need to boot from a floppy -- just unmount the
partition.  In case of the root partition you can growfs if
you boot in single user.  I believe initially the root
partition is mounted read-only so growfs changes are safe.  I
would reboot immediately afterwards though.

For moving partitions I would use dump/restore to/from a
networked machine rather than copying them around.  For that
you may need to boot from a floppy.

Or you can just install released kernels and do something
worthwhile (like build some furniture) in the time you will
save :-)




Re: size of /usr/src

2002-01-16 Thread Joerg Wunsch

Bakul Shah [EMAIL PROTECTED] wrote:

 On a -CURRENT:
 $ du -s /usr/src
 389637  /usr/src
 
 FFS likes to have about 10% free space + add a few more (may
 be 4%) for the inodes space.  So you need a partition of at
 least 450MB.

j@uriah 57% df -k /usr/src
Filesystem      1K-blocks     Used    Avail Capacity  Mounted on
/dev/vinum/src     595455   434778   113041    79%    /usr/src

That is -current as of around christmas.  There's a stale
/sys/i386/compile directory in it, but that doesn't contribute
much to the above since there are no .o files in it.
The file system is a 1 KB fsize/8 KB bsize one; if you use
larger block sizes, waste of space might be more.

-- 
cheers, Jorg   .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)




Kernel panic... (was Re: Netatalk broken in current? Lock order reversal?)

2002-01-16 Thread Emiel Kollof

* Emiel Kollof ([EMAIL PROTECTED]) wrote:
   exclusive (sleep mutex) Giant (0xc0462c00) locked @
   /usr/src/sys/i386/i386/trap.c:1102
   panic: system call pwrite returning with mutex(s) held
  
  Hmm, erm, go kick Alfred really hard. :)  This function locks Giant and then
  doesn't ever unlock it.  This looks to be breakage from his fget() changes
  perhaps.

Alfred? Are you listening? Are you tending to this already? It's not
only Samba that makes my machine panic. Also icecast does (when you
start to stream to it). 

Oh, has anyone else seen these panics as well? Just wondering...

Cheers,
Emiel
-- 
Never let your schooling interfere with your education.




Re: Kernel panic... (was Re: Netatalk broken in current? Lock order reversal?)

2002-01-16 Thread Alfred Perlstein

* Emiel Kollof [EMAIL PROTECTED] [020116 13:29] wrote:
 * Emiel Kollof ([EMAIL PROTECTED]) wrote:
exclusive (sleep mutex) Giant (0xc0462c00) locked @
/usr/src/sys/i386/i386/trap.c:1102
panic: system call pwrite returning with mutex(s) held
   
   Hmm, erm, go kick Alfred really hard. :)  This function locks Giant and then
   doesn't ever unlock it.  This looks to be breakage from his fget() changes
   perhaps.
 
 Alfred? Are you listening? Are you tending to this already? It's not
 only Samba that makes my machine panic. Also icecast does (when you
 start to stream to it). 
 
 Oh, has anyone else seen these panics as well? Just wondering...

It would help if someone cc'd me on these. :P

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology,
 start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/




Re: Kernel panic... (was Re: Netatalk broken in current? Lock order reversal?)

2002-01-16 Thread Alfred Perlstein

* Alfred Perlstein [EMAIL PROTECTED] [020116 13:30] wrote:
 * Emiel Kollof [EMAIL PROTECTED] [020116 13:29] wrote:
  * Emiel Kollof ([EMAIL PROTECTED]) wrote:
 exclusive (sleep mutex) Giant (0xc0462c00) locked @
 /usr/src/sys/i386/i386/trap.c:1102
 panic: system call pwrite returning with mutex(s) held

Hmm, erm, go kick Alfred really hard. :)  This function locks Giant and then
doesn't ever unlock it.  This looks to be breakage from his fget() changes
perhaps.
  
  Alfred? Are you listening? Are you tending to this already? It's not
  only Samba that makes my machine panic. Also icecast does (when you
  start to stream to it). 
  
  Oh, has anyone else seen these panics as well? Just wondering...
 
 It would help if someone cc'd me on these. :P

Fix should be in now.

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology,
 start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/




Re: Kernel panic... (was Re: Netatalk broken in current? Lock order reversal?)

2002-01-16 Thread Emiel Kollof

* Alfred Perlstein ([EMAIL PROTECTED]) wrote:
  It would help if someone cc'd me on these. :P
 
 Fix should be in now.

Great! Thanks! Remind me to buy you a beer if I ever get to meet you in
real life :-)

Right.. cvsup it is...

Cheers,
Emiel
-- 
If you can survive death, you can probably survive anything.




Re: socket shutdown delay?

2002-01-16 Thread Terry Lambert

Chad David wrote:
 Has anyone noticed (or fixed) a bug in -current where socket connections
 on the local machine do not shutdown properly?  During stress testing
 I'm seeing thousands (2316 right now) of these:
 
 tcp4   0  0  192.168.1.2.8080   192.168.1.2.2215   FIN_WAIT_2
 tcp4   0  0  192.168.1.2.2215   192.168.1.2.8080   LAST_ACK
 
 Both the client and the server are dead, but the connections stay in this
 state.
 
 I tested with the server on -current and the client on another box, and
 all of the server sockets end up in TIME_WAIT.  Is there something delaying
 the last ack on local connections?

A connection goes into FIN_WAIT_2 when it has received the ACK
of the FIN, but not received a FIN (or sent an ACK) itself, thus
permitting it to enter TIME_WAIT state for 2MSL before proceeding
to the CLOSED state, as a result of a server initiated close.

A connection goes into LAST_ACK when it has sent a FIN and not
received the ACK of the FIN before proceeding to the CLOSED
state, as a result of a client initiated close.
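For reference, the close-related transitions can be written out as a small table. This is a simplified sketch after the RFC 793 state diagram, not kernel code, and the event labels are informal names made up for the illustration. In RFC terms the side that sends the first FIN passes through FIN_WAIT_1 and FIN_WAIT_2, while the side that receives the first FIN passes through CLOSE_WAIT and LAST_ACK.

```python
# Close-related TCP state transitions, following the RFC 793 state
# diagram.  The event labels are informal names invented for this sketch.
TRANSITIONS = {
    ("ESTABLISHED", "close"):        "FIN_WAIT_1",  # we send the first FIN
    ("FIN_WAIT_1",  "recv_ack"):     "FIN_WAIT_2",  # our FIN was ACKed
    ("FIN_WAIT_1",  "recv_fin"):     "CLOSING",     # simultaneous close
    ("FIN_WAIT_2",  "recv_fin"):     "TIME_WAIT",   # peer's FIN arrives, we ACK
    ("CLOSING",     "recv_ack"):     "TIME_WAIT",
    ("TIME_WAIT",   "2msl_timeout"): "CLOSED",
    ("ESTABLISHED", "recv_fin"):     "CLOSE_WAIT",  # peer closed first, we ACK
    ("CLOSE_WAIT",  "close"):        "LAST_ACK",    # we send our own FIN
    ("LAST_ACK",    "recv_ack"):     "CLOSED",      # the final ACK arrives
}

def walk(state, events):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path

active = walk("ESTABLISHED", ["close", "recv_ack", "recv_fin", "2msl_timeout"])
passive = walk("ESTABLISHED", ["recv_fin", "close", "recv_ack"])
print(" -> ".join(active))
print(" -> ".join(passive))
```

Read against the netstat listing, a socket parked in FIN_WAIT_2 is still waiting for the peer's FIN, and one parked in LAST_ACK is still waiting for the ACK of its own FIN.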

Since it's showing IP addresses, you appear to be using real
network connections, rather than loopback connections.

There are basically several ways to cause this:

1)  You have something on your network, like a dummynet,
that is deterministically dropping the ACK to
the client when the server goes from FIN_WAIT_1,
so that the server goes to CLOSING instead of going
to FIN_WAIT_2 (client closes first), or the FIN in
the other direction so that the server doesn't go
to TIME_WAIT from FIN_WAIT_2 (server closes first).

2)  You have intentionally disabled KEEPALIVE, so that
a close results in an RST instead of a normal
shutdown of the TCP connection (I can't tell if
you are doing a real call to shutdown(2), or if
you are just relying on the OS resource tracking
behaviour that is implicit to close(2) (but only
if you don't set KEEPALIVE, and have disabled the
sysctl default of always doing KEEPALIVE on every
connection).  In this case, it's possible that the
RST was lost on the wire, and since RSTs are not
retransmitted, you have shot yourself in the foot.

Note:   You often see this type of foolish foot
shooting when running MAST, WAST, or
webbench, which try to factor out response
speed and measure connection speed, so that
they benchmark the server, not the FS or
other OS latencies in the document delivery
path (which is why these tools suck as real
world benchmarks go).  You could also cause
this (unlikely) with a bad firewall rule.

3)  You've exhausted your mbufs before you've exhausted
the number of simultaneous connections you are
permitted, because you have incorrectly tuned your
kernel, and therefore all your connections are sitting
in a starvation deadlock, waiting for packets that can
never be sent because there are no mbufs available.

4)  You've got local hacks that you aren't telling us
about (shame on you!).

5)  You have found an introduced bug in -current.

Note:   I personally think this one is unlikely.

6)  Maybe something I haven't thought of...

Note:   I personally think this one is unlikely,
too... ;^)

See RFC 793 (or Stevens) for details on the state machine for
both ends of the connection, and you will see how your machine
got into this mess in the first place.

-- Terry




Re: -CURRENT as of 14 Jan seems slow

2002-01-16 Thread Jason Evans

On Wed, Jan 16, 2002 at 05:04:31PM +1100, Bruce Evans wrote:
 On Tue, 15 Jan 2002, David Wolfskill wrote:
 
  Date: Tue, 15 Jan 2002 16:46:17 -0800 (PST)
  From: John Baldwin [EMAIL PROTECTED]
  Two questions:
 
  1) Do you have WITNESS on in your kernel config?
 
  Yes, in both the build machine & the laptop -- since before I made a
  local hierarchy within my CVS repository (September 9, 2001).
 
  2) If yes, have you tried building with a kernel without witness?
 
  No, not since I put it in to re-sync with GENERIC.  I could try that, I
  suppose -- but as noted, I've had WITNESS in there for a while; something
  seems to have changed during that one 24-hr. period that affected things
  rather radically.  And I thought it notable.  :-}
 
  I gather no one else has noticed this?
 
 File locking seems to cause only the usual few percent of slowdown for
 each round of major locking changes.  I haven't completed benchmarking
 the file locking pessimizations.  I don't use WITNESS or INVARIANTS
 for benchmarking of course.  Maybe the file locking changes cause much
 larger pessimizations when WITNESS is turned on than most locking
 changes.  I can see how they might: WITNESS seemed to slow down creation
 and destruction of mutexes more than most mutex operations last time I
 checked, and there is a descriptor for each file and each file descriptor.

Note that additional locking with witness turned on can drastically affect
performance.  Chances are that Alfred's changes in combination with witness
are what caused the slowdown.  During certain stages of the lockmgr
conversion to mutexes, I saw similar performance degradations (a factor of
~5-10, IIRC).

Jason




Re: size of /usr/src

2002-01-16 Thread Joerg Wunsch

[EMAIL PROTECTED] (Joerg Wunsch) wrote:

 j@uriah 57% df -k /usr/src
 Filesystem      1K-blocks     Used    Avail Capacity  Mounted on
 /dev/vinum/src     595455   434778   113041    79%    /usr/src
 
 That is -current as of around christmas.

Bakul got back to me and questioned that number -- he was
right. :)  The above filesystem doesn't only contain a checked
out copy of the src/ stuff but also of doc/, which adds another
60+ MB to it.

-- 
cheers, Jorg   .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)




Re: socket shutdown delay?

2002-01-16 Thread Chad David

On Wed, Jan 16, 2002 at 01:39:54PM -0800, Terry Lambert wrote:
 Chad David wrote:
  Has anyone noticed (or fixed) a bug in -current where socket connections
  on the local machine do not shutdown properly?  During stress testing
  I'm seeing thousands (2316 right now) of these:
  
  tcp4   0  0  192.168.1.2.8080   192.168.1.2.2215   FIN_WAIT_2
  tcp4   0  0  192.168.1.2.2215   192.168.1.2.8080   LAST_ACK
  
  Both the client and the server are dead, but the connections stay in this
  state.
  
  I tested with the server on -current and the client on another box, and
  all of the server sockets end up in TIME_WAIT.  Is there something delaying
  the last ack on local connections?
 
 A connection goes into FIN_WAIT_2 when it has received the ACK
 of the FIN, but not received a FIN (or sent an ACK) itself, thus
 permitting it to enter TIME_WAIT state for 2MSL before proceeding
 to the CLOSED state, as a result of a server initiated close.
 
 A connection goes into LAST_ACK when it has sent a FIN and not
 received the ACK of the FIN before proceeding to the CLOSED
 state, as a result of a client initiated close.

I've got TCP/IP Illustrated V1 right beside me, so I basically
knew what was happening.  Just not why.

Like I said in the original email, connections from another machine
end up in TIME_WAIT right away; it is only local connections.

 
 Since it's showing IP addresses, you appear to be using real
 network connections, rather than loopback connections.

In this case yes.  Connections to 127.0.0.1 result in the same thing.

 
 There are basically several ways to cause this:
 
 1)You have something on your network, like a dummynet,
   that is deterministically dropping the ACK to
   the client when the server goes from FIN_WAIT_1,
   so that the server goes to CLOSING instead of going
   to FIN_WAIT_2 (client closes first), or the FIN in
   the other direction so that the server doesn't go
   to TIME_WAIT from FIN_WAIT_2 (server closes first).

Nothing like that on the box.

 
 2)You have intentionally disabled KEEPALIVE, so that
   a close results in an RST instead of a normal
   shutdown of the TCP connection (I can't tell if
   you are doing a real call to shutdown(2), or if
   you are just relying on the OS resource tracking
   behaviour that is implicit to close(2) (but only
   if you don't set KEEPALIVE, and have disabled the
   sysctl default of always doing KEEPALIVE on every
   connection).  In this case, it's possible that the
   RST was lost on the wire, and since RSTs are not
   retransmitted, you have shot yourself in the foot.
 
   Note:   You often see this type of foolish foot
   shooting when running MAST, WAST, or
   webbench, which try to factor out response
   speed and measure connection speed, so that
   they benchmark the server, not the FS or
   other OS latencies in the document delivery
   path (which is why these tools suck as real
   world benchmarks go).  You could also cause
   this (unlikely) with a bad firewall rule.

I haven't changed any sysctls, and other than SO_REUSEADDR,
the default sockopts are being used.  I also do not call
shutdown() on either end, and both the client and server
processes have exited and the connections still do not clear
up (in time they do, around 10 minutes).

 
 3)You've exhausted your mbufs before you've exhausted
   the number of simultaneous connections you are
   permitted, because you have incorrectly tuned your
   kernel, and therefore all your connections are sitting
   in a starvation deadlock, waiting for packets that can
   never be sent because there are no mbufs available.

The client eventually fails with EADDRNOTAVAIL.
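EADDRNOTAVAIL on connect usually means no free local port was left. A back-of-the-envelope sketch of how quickly that happens when old connections linger. The port range here is an assumed example (check net.inet.ip.portrange.first and net.inet.ip.portrange.last on the box), the 60 second figure assumes a 30 second MSL, and the 10 minute figure is the clean-up time mentioned above.

```python
def sustainable_rate(first_port, last_port, linger_seconds):
    """Connections per second before the local port range is exhausted,
    if each closed connection ties up its port for linger_seconds."""
    ports = last_port - first_port + 1
    return ports / linger_seconds

# Hypothetical range for illustration; read the real one with sysctl.
FIRST, LAST = 1024, 5000

for label, linger in (("2MSL of 60 s", 60), ("stuck for about 10 min", 600)):
    rate = sustainable_rate(FIRST, LAST, linger)
    print(f"{label}: about {rate:.1f} connections/s")
```

A client looping as fast as it can will exceed either rate almost immediately, so running out of ports is expected once sockets stop draining.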

Here are the mbuf stats before and after.

Before test:


colnta-netstat -m
mbuf usage:
GEN list:       0/0 (in use/in pool)
CPU #0 list:    51/144 (in use/in pool)
CPU #1 list:    51/144 (in use/in pool)
Total:          102/288 (in use/in pool)
Maximum number allowed on each CPU list: 512
Maximum possible: 67584
Allocated mbuf types:
  102 mbufs allocated to data
0% of mbuf map consumed
mbuf cluster usage:
GEN list:       0/0 (in use/in pool)
CPU #0 list:    50/86 (in use/in pool)
CPU #1 list:    51/88 (in use/in pool)
Total:          101/174 (in use/in pool)
Maximum number allowed on each CPU list: 128
Maximum possible: 33792
0% of cluster map consumed
420 KBytes of wired memory reserved (54% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

After test:


Re: socket shutdown delay?

2002-01-16 Thread Terry Lambert

Chad David wrote:
  A connection goes into FIN_WAIT_2 when it has received the ACK
  of the FIN, but not received a FIN (or sent an ACK) itself, thus
  permitting it to enter TIME_WAIT state for 2MSL before proceeding
  to the CLOSED state, as a result of a server initiated close.
 
  A connection goes into LAST_ACK when it has sent a FIN and not
  received the ACK of the FIN before proceeding to the CLOSED
  state, as a result of a client initiated close.
 
 I've got TCP/IP Illustrated V1 right beside me, so I basically
 knew what was happening.  Just not why.
 
 Like I said in the original email, connections from another machine
 end up in TIME_WAIT right away; it is only local connections.

Maybe there is a bug in the interrupt thread code, or in the
scheduler for NETISR processing.  Like I said before, I think
this is unlikely.

The other possibility is a bug in simultaneous client and
server closes, but without information about your client
and server program's operation (e.g. if it's an HTTP session,
and the client closes without waiting for a response, or the
server responds and closes), that's as close as I can give
you.  I *really* doubt that, since I think it would have
shown before.

The other possibility might be the sequence numbers on a
re-used connection going backwards.  If that were to happen,
you might see the state machine pushed back into LAST_ACK when
it shouldn't.

Be sure that you use the sysctl to set the sequence number
algorithm to the one specified in the RFC, instead of the
broken OpenBSD version that supposedly prevents predictive
session hijack (which should be an application level thing
about verification of the peer, anyway).

Also make sure that the keepalive sysctl is set on (1).


  Since it's showing IP addresses, you appear to be using real
  network connections, rather than loopback connections.
 
 In this case yes.  Connections to 127.0.0.1 result in the same thing.

OK, so it's not lost packets because of the use of the network
driver.  This makes me lean toward the sequence number or RST
with no mbufs available problem.

[ ... test net intentionally lossy ... ]
 Nothing like that on the box.

OK.  It was low hanging fruit, but unlikely, but had to be
asked.


  2)You have intentionally disabled KEEPALIVE, so that
a close results in an RST instead of a normal
shutdown of the TCP connection (I can't tell if
you are doing a real call to shutdown(2), or if
you are just relying on the OS resource tracking
behaviour that is implicit to close(2) (but only
if you don't set KEEPALIVE, and have disabled the
sysctl default of always doing KEEPALIVE on every
connection).  In this case, it's possible that the
RST was lost on the wire, and since RSTs are not
retransmitted, you have shot yourself in the foot.
 
Note:   You often see this type of foolish foot
shooting when running MAST, WAST, or
webbench, which try to factor out response
speed and measure connection speed, so that
they benchmark the server, not the FS or
other OS latencies in the document delivery
path (which is why these tools suck as real
world benchmarks go).  You could also cause
this (unlikely) with a bad firewall rule.
 
 I haven't changed any sysctls, and other than SO_REUSEADDR,
 the default sockopts are being used.

This doesn't tell me the setting of the keepalive sysctl.  By
default, it won't be on unless the sysctl forces it on, which
it does by default, unless it's been changed, or the default
has been changed in -current (don't know).  So check this one.

 I also do not call
 shutdown() on either end, and both the client and server
 processes have exited and the connections still do not clear
 up (in time they do, around 10 minutes).

You should probably call shutdown(2), if you want your code
to be mostly correct.

You also didn't say that they in fact drain after that
period of time.

I suspect that you are just doing a large number of connections.

I frequently ended up with 50,000+ connections in TIME_WAIT
state (I rarely use the same machine for both the client and
the server, since that is not representative of real world
use), and, of course, it takes 2MSL for TIME_WAIT to drain
connections out.

My guess is that you have run out of mbufs (your usage stats
tell me nothing about the available number of real mbufs;
even the 0 requests for memory denied is not really as
useful as it would appear in the stats), or you just have
an incredibly large number of files open.

The FreeBSD file allocation table entry allocation for a
large number of simultaneously open files is bad.

Similarly, the FreeBSD allocation of the port space is
a linear lookup that has exponential time increase as the
number of connections go up.  The same is true of the
lookup of the INPCB and TCPCB on incoming 

Downgrading

2002-01-16 Thread Timothy Aslat

Hi All,

Quick question.   Where would I find information on downgrading a
-CURRENT to a -STABLE or -RELEASE?

I'm just trying to avoid doing a reinstall and re-setup from scratch.

Regards

Tim

-- 
| The most exciting phrase to   | Tim Aslat [EMAIL PROTECTED]   |
| hear in science, the one that | http://www.spyderweb.com.au|
| heralds new discoveries, is   | Spyderweb Consulting   |
| not Eureka! (I found it!)   | P: 82270800M: 0401088479   |
| but That's funny ...| Webmaster for  |
|  -- Isaac Asimov  | http://www.goodiesruleok.com   |
| No, Eureka is Greek for | The Ultimate Goody Fansite |
| This bath is too hot!| [EMAIL PROTECTED]|
|   -- Dr Who   ||




Re: socket shutdown delay?

2002-01-16 Thread Chad David

On Wed, Jan 16, 2002 at 03:50:47PM -0800, Terry Lambert wrote:
 Chad David wrote:
   A connection goes into FIN_WAIT_2 when it has received the ACK
   of the FIN, but not received a FIN (or sent an ACK) itself, thus
   permitting it to enter TIME_WAIT state for 2MSL before proceeding
   to the CLOSED state, as a result of a server initiated close.
  
   A connection goes into LAST_ACK when it has sent a FIN and not
   received the ACK of the FIN before proceeding to the CLOSED
   state, as a result of a client initiated close.
  
 

The direct cause is a bug in my client.  I call close(2) outside of the
main loop (one line off :( ), so none of the client side sockets were
getting closed.  When I fixed this all of the connections went to
TIME_WAIT right away.
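The shape of the fix can be sketched like this. It is not the actual client (which is not shown in the thread), only an illustration in Python of closing each socket inside the loop; run_client, the connect callable and the request bytes are hypothetical names for the sketch.

```python
import contextlib

def run_client(connect, iterations, request=b"ping\n"):
    """Open, use, and close one connection per iteration.

    connect is any callable returning a connected socket-like object.
    Returns the total number of bytes received.
    """
    total = 0
    for _ in range(iterations):
        sock = connect()
        # closing() calls sock.close() on every pass through the loop,
        # even if sendall() or recv() raises.  The bug described above is
        # the equivalent of a single close() placed after the loop.
        with contextlib.closing(sock):
            sock.sendall(request)
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # empty read: the server closed its side
                    break
                total += len(chunk)
    return total
```

Against a real server this could be called as run_client(lambda: socket.create_connection(("192.168.1.2", 8080)), 1000), after importing socket, with the address taken from the netstat listing earlier in the thread.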

I'm still not convinced that all is well though, as on Solaris 5.9 and
4.4-STABLE I do not see the problem with the bad client.

I'll address your points below, but if you don't feel like chasing this
anymore that is fine with me... I'll add it to my list of things to
try and understand on my next vacation :).

 Also make sure that the keepalive sysctl is set on (1).

colnta-sysctl -a | grep keepalive
net.inet.tcp.always_keepalive: 1

 
 This doesn't tell me the setting of the keepalive sysctl.  By
 default, it won't be on unless the sysctl forces it on, which
 it does by default, unless it's been changed, or the default
 has been changed in -current (don't know).  So check this one.

It must be on by default.

 
  I also do not call
  shutdown() on either end, and both the client and server
  processes have exited and the connections still do not clear
  up (in time they do, around 10 minutes).
 
 You should probably call shutdown(2), if you want your code
 to be mostly correct.

Call shutdown(2) instead of close(2)?

 
 I suspect that you are just doing a large number of connections.

One connection at a time, as fast as the client can loop, with
a small (1k) amount of data being returned by the server.

 
 I frequently ended up with 50,000+ connections in TIME_WAIT
 state (I rarely use the same machine for both the client and
 the server, since that is not representative of real world
 use), and, of course, it takes 2MSL for TIME_WAIT to drain
 connections out.

Agreed, I'm still testing functionality.  I just got hit with
this while trying to check for simple memory leaks and broken
code (not load testing).

 
 My guess is that you have run out of mbufs (your usage stats
 tell me nothing about the available number of real mbufs;
 even the 0 requests for memory denied is not really as
 useful as it would appear in the stats), or you just have
 an incredibly large number of files open.

colnta-sysctl -a | grep mbuf
kern.ipc.nmbufs: 67584
kern.ipc.mbuf_wait: 64
kern.ipc.mbuf_limit: 512

   3)You've exhausted your mbufs before you've exhausted
 the number of simultaneous connections you are
 permitted, because you have incorrectly tuned your
 kernel, and therefore all your connections are sitting
 in a starvation deadlock, waiting for packets that can
 never be sent because there are no mbufs available.
  
  The client eventually fails with EADDRNOTAVAIL.
 
 Yes, this is the outbound connection limitation because of the
 ports.  There's three bugs there, in FreeBSD, as well, but they
 generally limit the outbound connections, rather than causing
 problems.
 
 One tuning variable you probably want on the machine making the
 connections is to up the TCP port range to 65535; you will have
 to do two sysctls in order to do this.  This will delay your
 client failure by about a factor of 8-10 times as many
 connections (outbound connections count against the total, but
 inbound connections do not, since they do not use up socket/port
 pairs by source).

With the fixed client it never fails.  I moved a few GB through it
without any problem.

  
  and a few minutes later:
  colnta-netstat -an | grep FIN_WAIT_2 | wc
      1434    8604  111852
 
 This indicates a 2MSL draining.  The resource track close could
 also be slow.  You could probably get an incredible speedup by
 doing explicit closes in the client program, starting with the
 highest used fd, and working down, instead of going the other
 way (it's probably a good idea to modify the FreeBSD resource
 track close to do the same thing).

If I had been doing any explicit closes :(.

 
 There are some other inefficiencies in the fd code that can be
 addressed... nominally, the allocation is a linear search at
 the last valid one going higher.  For most servers, this could
 be significantly improved by linking free fd's in a sparse
 list onto a freelist, and maintaining a pointer to that,
 instead of the index to the first free one, but that should only
 impact you on allocation (like the inpcb hash, which fails
 pretty badly, even when you tune up the hash size to some
 unreasonable amount, and the port allocation for outbound
 connections, which is, frankly, 

panic: bioqdisksort()

2002-01-16 Thread Jun Kuriyama


I got a panic with kernel around Jan 16 09:02:54 JST.


Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 
fault virtual address   = 0xcaeef040
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc01cc833
stack pointer   = 0x10:0xf4f33a84
frame pointer   = 0x10:0xf4f33a90
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 19128 (make)
kernel: type 12 trap, code=0
Stopped at  bioqdisksort+0x2b:  movl    0xc0(%ebx),%eax
db> t
bioqdisksort(c8ea80a8,caeeef80,c8ea7000,f4f33adc,c014e3de) at bioqdisksort+0x2b
adstrategy(caeeef80) at adstrategy+0x39
arstrategy(d54f271c,d54f271c,ec363c20,f4f33b08,c019f8d3) at arstrategy+0x2de
diskstrategy(d54f271c,c8efd100,d54f271c,0,f4f33b14) at diskstrategy+0xcd
spec_strategy(f4f33b2c,f4f33b38,c0290821,f4f33b2c,f4f33bfc) at spec_strategy+0x19b
spec_vnoperate(f4f33b2c,f4f33bfc,d54f271c,0,c0334380) at spec_vnoperate+0x15
ufs_strategy(f4f33b68,f4f33b74,c01ec78e,f4f33b68,f2c42f00) at ufs_strategy+0xa9
ufs_vnoperate(f4f33b68) at ufs_vnoperate+0x15
breadn(f2c42f00,0,800,0,0) at breadn+0xc2
bread(f2c42f00,0,800,0,f4f33bfc) at bread+0x1d
ffs_read(f4f33c20,2000,f4f33d20,2000,0) at ffs_read+0x2c3
vn_read(caeed100,f4f33c90,ca9f2200,0,f4ea6404) at vn_read+0x130
dofileread(f4ea6404,caeed100,3,8088000,2000) at dofileread+0xae
read(f4ea6404,f4f33d20,807e0e0,8095b00,8095b00) at read+0x51
syscall(2f,2f,2f,8095b00,8095b00) at syscall+0x25f
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (3, FreeBSD ELF, read), eip = 0x8061cd7, esp = 0xbfbfe748, ebp = 0xbfbfe764 ---


(kgdb) where
#0  dumpsys () at ../../../kern/kern_shutdown.c:492
#1  0xc01bdc6b in boot (howto=260) at ../../../kern/kern_shutdown.c:335
#2  0xc01be10d in panic (fmt=0xc02eac6a "from debugger")
at ../../../kern/kern_shutdown.c:634
#3  0xc013fb4d in db_panic (addr=-1071855565, have_addr=0, count=-1, 
modif=0xf4f338f0 ) at ../../../ddb/db_command.c:452
#4  0xc013faeb in db_command (last_cmdp=0xc0339d64, cmd_table=0xc0339b84, 
aux_cmd_tablep=0xc03306f8, aux_cmd_tablep_end=0xc03306fc)
at ../../../ddb/db_command.c:348
#5  0xc013fbb7 in db_command_loop () at ../../../ddb/db_command.c:474
#6  0xc0141f33 in db_trap (type=12, code=0) at ../../../ddb/db_trap.c:72
#7  0xc02bfd4a in kdb_trap (type=12, code=0, regs=0xf4f33a44)
at ../../../i386/i386/db_interface.c:167
#8  0xc02d0da0 in trap_fatal (frame=0xf4f33a44, eva=3404656704)
at ../../../i386/i386/trap.c:837
#9  0xc02d0ae9 in trap_pfault (frame=0xf4f33a44, usermode=0, eva=3404656704)
at ../../../i386/i386/trap.c:756
#10 0xc02d0653 in trap (frame={tf_fs = -1071972328, tf_es = -1070006256, 
  tf_ds = 16, tf_edi = -185965564, tf_esi = -924155736, 
  tf_ebp = -185386352, tf_isp = -185386384, tf_ebx = -890310784, 
  tf_edx = -890311808, tf_ecx = -92416, tf_eax = -185965776, 
  tf_trapno = 12, tf_err = 0, tf_eip = -1071855565, tf_cs = 8, 
  tf_eflags = 66178, tf_esp = -924155904, tf_ss = 13507})
at ../../../i386/i386/trap.c:426
#11 0xc01cc833 in bioqdisksort (bioq=0xc8ea80a8, bp=0xcaeeef80)
at ../../../kern/subr_disklabel.c:91
#12 0xc014c951 in adstrategy (bp=0xcaeeef80) at ../../../dev/ata/ata-disk.c:293
#13 0xc014e3de in arstrategy (bp=0xd54f271c) at ../../../dev/ata/ata-raid.c:243
#14 0xc01cc6d5 in diskstrategy (bp=0xd54f271c) at ../../../kern/subr_disk.c:390
#15 0xc019f8d3 in spec_strategy (ap=0xf4f33b2c)
at ../../../fs/specfs/spec_vnops.c:494
#16 0xc019f1ed in spec_vnoperate (ap=0xf4f33b2c)
at ../../../fs/specfs/spec_vnops.c:119
#17 0xc0290821 in ufs_strategy (ap=0xf4f33b68) at vnode_if.h:762
#18 0xc0290fe5 in ufs_vnoperate (ap=0xf4f33b68)
at ../../../ufs/ufs/ufs_vnops.c:2657
#19 0xc01ec78e in breadn (vp=0xf2c42f00, blkno=0, size=2048, rablkno=0x0, 
rabsize=0x0, cnt=0, cred=0x0, bpp=0xf4f33bfc) at vnode_if.h:762
#20 0xc01ec6c9 in bread (vp=0xf2c42f00, blkno=0, size=2048, cred=0x0, 
bpp=0xf4f33bfc) at ../../../kern/vfs_bio.c:585
#21 0xc02886fb in ffs_read (ap=0xf4f33c20)
at ../../../ufs/ufs/ufs_readwrite.c:278
#22 0xc01ff044 in vn_read (fp=0xcaeed100, uio=0xf4f33c90, cred=0xca9f2200, 
flags=0, td=0xf4ea6404) at vnode_if.h:279
#23 0xc01d5306 in dofileread (td=0xf4ea6404, fp=0xcaeed100, fd=3, 
buf=0x8088000, nbyte=8192, offset=-1, flags=0) at ../../../sys/file.h:179
#24 0xc01d5189 in read (td=0xf4ea6404, uap=0xf4f33d20)
at ../../../kern/sys_generic.c:133
#25 0xc02d118b in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
  tf_edi = 134830848, tf_esi = 134830848, tf_ebp = -1077942428, 
  tf_isp = -185385612, tf_ebx = 134734048, tf_edx = 134774784, 
(kgdb) up 11
#11 0xc01cc833 in bioqdisksort (bioq=0xc8ea80a8, bp=0xcaeeef80)
at ../../../kern/subr_disklabel.c:91
91  TAILQ_FOREACH(bn, &bioq->queue, bio_queue)

Re: socket shutdown delay?

2002-01-16 Thread Terry Lambert

Chad David wrote:
 The direct cause is a bug in my client.  I call close(2) out side of the
 main loop (one line off :( ), so none of the client side sockets were
 getting closed.  When I fixed this all of the connections went to
 TIME_WAIT right away.
 
 I'm still not convinced that all is well though, as on Solaris 5.9 and
 4.4-STABLE I do not see the problem with the bad client.

So it's the resource track close of the sockets.

If the client and the server were the same program, you could
be seeing this as a timing thing on order of operation.  I'm
guessing they aren't, though...


 I'll address your points below, but if you don't feel like chasing this
 anymore that is fine with me... I'll add it to my list of things to
 try and understand on my next vacation :).

Unless there's something that jumps out at you, this is probably
a good plan.  8-).


  Also make sure that the keepalive sysctl is set on (1).

[ ... it's on, so it's not the RST instead of FIN/FIN-ACK/2MSL
  losing the RST that isn't retransmitted ... ]


  You should probably call shutdown(2), if you want your code
  to be mostly correct.
 
 Call shutdown(2) instead of close(2)?

Nope.  Before close.  Depending on the argument, perhaps not
before the last read or write, then the close.


  I suspect that you are just doing a large number of connections.
 
 One connection at a time, as fast as the client can loop, with
 a small (1k) amount of data being returned by the server.

So you would shutdown() after the request, but before reading
the response, to indicate that you have no more request data
to send.
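
A minimal sketch of that ordering, assuming a simple one-request,
one-response exchange on a stream socket.  request_response() is a
made-up name, not anything from Chad's client; the caller keeps
ownership of the descriptor and still has to close(2) it afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write the whole request, half-close the write side, then read the
 * reply until the peer closes its side or buf is full.
 * Returns the number of reply bytes read, or -1 on error.
 */
static ssize_t
request_response(int fd, const void *req, size_t reqlen,
    char *buf, size_t buflen)
{
	const char *p = req;
	size_t total = 0;
	ssize_t n;

	assert(req != NULL && buf != NULL);

	while (reqlen > 0) {
		n = write(fd, p, reqlen);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		p += n;
		reqlen -= (size_t)n;
	}

	/* No more request data.  On a TCP socket this sends our FIN;
	 * the read side of the descriptor stays usable. */
	if (shutdown(fd, SHUT_WR) == -1)
		return (-1);

	while (total < buflen) {
		n = read(fd, buf + total, buflen - total);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			break;		/* peer closed its side */
		total += (size_t)n;
	}
	return ((ssize_t)total);
}
```

The close(2) then follows the last read, inside the per-connection loop.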


  My guess is that you have run out of mbufs (your usage stats
  tell me nothing about the available number of real mbufs;
  even the 0 requests for memory denied is not really as
  useful as it would appear in the stats), or you just have
  an incredibly large number of files open.
 
 colnta-sysctl -a | grep mbuf
 kern.ipc.nmbufs: 67584
 kern.ipc.mbuf_wait: 64
 kern.ipc.mbuf_limit: 512

This number is how many mbufs possible.  It represents the map
size for the page table entries, and doesn't really indicate
that there are physical pages of RAM available to back them.

However, since this is only ~33M of memory, this is nothing.

With the 1K size of the data you are sending from the server,
this puts you at a connection max of 16,000, assuming all
data is sent but not yet ACK'ed... or 8,000, given that both
client and server are on the same machine.  The absolute worst
case sits down around 6,000 (ACK packets, driver mbufs, socket
option mbufs, etc., dragging it down a little).
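
The rough arithmetic can be written down as a sketch.  The assumptions
here are not stated in the thread: each mbuf is treated as holding a
flat 256 bytes of payload with no header overhead, and each connection
pins exactly one response worth of unacknowledged data.  max_conns() is
a made-up helper name.

```c
#include <assert.h>

/*
 * Upper bound on connections if every connection pins bytes_per_conn
 * bytes of unacknowledged data and one mbuf holds mbuf_bytes bytes.
 */
static long
max_conns(long nmbufs, long bytes_per_conn, long mbuf_bytes)
{
	long mbufs_per_conn;

	assert(nmbufs >= 0 && bytes_per_conn > 0 && mbuf_bytes > 0);
	/* Round up: a partly filled mbuf is still a whole mbuf. */
	mbufs_per_conn = (bytes_per_conn + mbuf_bytes - 1) / mbuf_bytes;
	return (nmbufs / mbufs_per_conn);
}
```

Under those assumptions max_conns(67584, 1024, 256) gives 16896, and
halving it for client and server sharing one machine gives 8448, which
is in line with the rounded 16,000 and 8,000 figures.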

So it's unrelated to that, but we already knew that because
of the program change.  8-).


   The client eventually fails with EADDRNOTAVAIL.
[ ... ]
 With the fixed client it never fails.  I moved a few GB through it
 without any problem.

You will want to up the user ports on the clients when you start
stress testing it from multiple client machines, anyway.

  This indicates a 2MSL draining.  The resource track close could
  also be slow.  You could probably get an incredible speedup by
  doing explicit closes in the client program, starting with the
  highest used fd, and working down, instead of going the other
  way (it's probably a good idea to modify the FreeBSD resource
  track close to do the same thing).
 
 If I had been doing any explicit closes :(.

Yes, but your ordering is reverse optimal, actually, so you are
going to be rate limited at the client.
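
For reference, an explicit highest-first close might look like the
sketch below.  close_descending() is a hypothetical helper, and whether
this ordering is actually faster is the claim made above, not something
the sketch demonstrates; it only shows the ordering.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Close every descriptor in [lowfd, highfd], highest first.
 * Slots that are not open fail with EBADF and are skipped.
 * Returns the number of descriptors actually closed.
 */
static int
close_descending(int lowfd, int highfd)
{
	int fd, closed = 0;

	assert(lowfd >= 0 && lowfd <= highfd);
	for (fd = highfd; fd >= lowfd; fd--)
		if (close(fd) == 0)
			closed++;
	return (closed);
}
```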

Did the client actually exit?  If it didn't, that would explain
everything.

  There are some other inefficiencies in the fd code that can be
  addressed... nominally, the allocation is a linear search at
  the last valid one going higher.  For most servers, this could
  be significantly improved by linking free fd's in a sparse
  list onto a freelist, and maintaining a pointer to that,
  instead of the index to the first free one, but that should only
  impact you on allocation (like the inpcb hash, which fails
  pretty badly, even when you tune up the hash size to some
  unreasonable amount, and the port allocation for outbound
  connections, which is, frankly, broken.  Both could benefit from
  a nice btree overhaul).
 
 I actually implemented something for this type of problem over Christmas
 with one of the Solaris engineers.  It was inspired by Jeff Bonwick's
 vmem stuff (Usenix 2001), but was bit mask based, so the actual storage
 overhead was a lot less, with what appeared to be very good allocate and
 free times (O(n) as the worst case with O(1) typically).

This would be nice for FreeBSD, assuming we could pry it out
of you.  8-).
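
To make the bit mask idea concrete, here is a generic sketch of a
bitmap slot allocator with a low-water hint.  It is not the
implementation described above, which is not shown in this thread; the
names (slotmap, slot_alloc, slot_free) and the fixed NSLOTS size are
assumptions for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define NSLOTS	256
#define WBITS	(sizeof(unsigned long) * CHAR_BIT)
#define NWORDS	((NSLOTS + WBITS - 1) / WBITS)

struct slotmap {
	unsigned long used[NWORDS];	/* bit set = slot in use */
	size_t hint;			/* every word below this is full */
};

static void
slotmap_init(struct slotmap *m)
{
	assert(m != NULL);
	memset(m, 0, sizeof(*m));
}

/* Return the lowest free slot and mark it used, or -1 if none is left. */
static int
slot_alloc(struct slotmap *m)
{
	size_t w, b;

	assert(m != NULL);
	for (w = m->hint; w < NWORDS; w++) {
		if (m->used[w] == ULONG_MAX)
			continue;		/* word is full, skip it */
		for (b = 0; b < WBITS; b++) {
			if ((m->used[w] & (1UL << b)) == 0) {
				m->used[w] |= 1UL << b;
				m->hint = w;
				return ((int)(w * WBITS + b));
			}
		}
	}
	m->hint = NWORDS;			/* everything is in use */
	return (-1);
}

/* Release a slot and pull the hint back if it now points too high. */
static void
slot_free(struct slotmap *m, int slot)
{
	size_t w, b;

	assert(m != NULL && slot >= 0 && slot < NSLOTS);
	w = (size_t)slot / WBITS;
	b = (size_t)slot % WBITS;
	m->used[w] &= ~(1UL << b);
	if (w < m->hint)
		m->hint = w;
}
```

Storage is one bit per slot.  Allocation normally touches only the word
at the hint, and degrades to a scan over all words when the low words
are full.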


[ ... timer code, Rice U. Opportunistic Timers ... ]

 I think I have that paper around here somewhere... is it older,
 like from around 1990?

No, you are probably thinking of the WRL paper by Jeff Mogul.
The paper I'm referring to is late mid-90's.


   Nope.  Stock -current, none of my patches applied.
 
  Heh... not useful information without a date 

Re: Downgrading

2002-01-16 Thread Terry Lambert

Timothy Aslat wrote:
 
 Hi All,
 
 Quick question.   Where would I find information on downgrading a
 -CURRENT to a -STABLE or -RELEASE?
 
 I'm just trying to avoid doing a reinstall and re-setup from scratch.

This belongs on -questions.

In general, you can boot from a CDROM of the version you
want to downgrade to, choose upgrade from the sysinstall
menu, and then proceed to upgrade.

It will not install your sources for you (you will have to
do that manually).

You may also have a number of issues with configuration file
data, though it should leave libraries and other things intact.

The only other things that should be able to go wrong are any
libraries in development that have not had their version numbers
bumped for interface changes, and the boot blocks, which you
can deal with by manually reinstalling via the holographic
shell via a manual run of disklabel -B using the installed
files by specifying the path to them, prior to the reboot.

FWIW, I have, in practice, upgraded a large number of -current
machines from an October 2000 snapshot to a 4.3-RELEASE CDROM
version, with no problem, if locally booted, and with some
effort when doing the upgrade from an NFS mounted CDROM over
the network (mostly, SSH problems with the pam.conf files
when the SSH changed to need explicit pam.conf entries, and not
using the generic entries if the SSH ones were missing, as the
PAM design documents with which SSH does not comply indicate
you should do...).  You also have to run the sysinstall from
the CDROM, which is not on the CDROM itself, and is hidden in
the boot images -- and must be named sysinstall, because it's
a crunched binary.  The only other issue is that you must
manually copy over /dev/MAKEDEV and /dev/MAKEDEV.local, and run
sh MAKEDEV all to get the /dev/random set up correctly, but
all this can be done prior to the reboot.

-- Terry




boot floppy problems...

2002-01-16 Thread Mike Brancato

Just letting you guys know that the Jan 15th and Jan 16th boot floppies
aren't working.  The Jan 13th snaps are, though.

mike





Re: panic during fdisk'ing a md(4) device

2002-01-16 Thread Bruce Evans

On Mon, 14 Jan 2002, Michael Reifenberger wrote:

 On Tue, 15 Jan 2002, Bruce Evans wrote:
 ...
  Try this version.  Only disklabel.h has many changes.  The code for
  avoiding creation of bogus 'c' partitions didn't work at all.

 This works during startup but the following commands cause a panic
 while executing newfs (BT is attached):

 mdconfig -a -t swap -s 128M -o reserve -u 10
 disklabel -r -w md10 auto
   (When looking into /dev now I see two! md10c devices!)
 newfs -f 4096 /dev/md10c
 tunefs -n enable /dev/md10c
 mount /dev/md10c /tmp

Oops.  There should be no alias for md10c.  Try this version.  It fixes
the "may want an alias" case in dkmodminor() and moves all the dk inlines
to subr_diskslice.c.

%%%
Index: kern/subr_disk.c
===
RCS file: /home/ncvs/src/sys/kern/subr_disk.c,v
retrieving revision 1.50
diff -u -2 -r1.50 subr_disk.c
--- kern/subr_disk.c 4 Nov 2001 11:56:22 -   1.50
+++ kern/subr_disk.c 14 Jan 2002 11:42:38 -
@@ -301,5 +301,5 @@

error = 0;
-   pdev = dkmodpart(dkmodslice(dev, WHOLE_DISK_SLICE), RAW_PART);
+   pdev = dkmodslice(dkmodpart(dev, -RAW_PART), WHOLE_DISK_SLICE);

dp = pdev->si_disk;
@@ -349,5 +349,5 @@

error = 0;
-   pdev = dkmodpart(dkmodslice(dev, WHOLE_DISK_SLICE), RAW_PART);
+   pdev = dkmodslice(dkmodpart(dev, -RAW_PART), WHOLE_DISK_SLICE);
dp = pdev->si_disk;
if (!dp)
@@ -365,5 +365,5 @@
struct disk *dp;

-   pdev = dkmodpart(dkmodslice(bp->bio_dev, WHOLE_DISK_SLICE), RAW_PART);
+   pdev = dkmodslice(dkmodpart(bp->bio_dev, -RAW_PART), WHOLE_DISK_SLICE);
dp = pdev->si_disk;
bp->bio_resid = bp->bio_bcount;
@@ -400,5 +400,5 @@
dev_t pdev;

-   pdev = dkmodpart(dkmodslice(dev, WHOLE_DISK_SLICE), RAW_PART);
+   pdev = dkmodslice(dkmodpart(dev, -RAW_PART), WHOLE_DISK_SLICE);
dp = pdev->si_disk;
if (!dp)
@@ -416,5 +416,5 @@
dev_t pdev;

-   pdev = dkmodpart(dkmodslice(dev, WHOLE_DISK_SLICE), RAW_PART);
+   pdev = dkmodslice(dkmodpart(dev, -RAW_PART), WHOLE_DISK_SLICE);
dp = pdev->si_disk;
if (!dp)
Index: kern/subr_diskmbr.c
===
RCS file: /home/ncvs/src/sys/kern/subr_diskmbr.c,v
retrieving revision 1.54
diff -u -2 -r1.54 subr_diskmbr.c
--- kern/subr_diskmbr.c 11 Dec 2001 05:35:43 -  1.54
+++ kern/subr_diskmbr.c 9 Jan 2002 10:34:30 -
@@ -209,5 +209,5 @@
/* Read master boot record. */
bp = geteblk((int)lp->d_secsize);
-   bp->b_dev = dkmodpart(dkmodslice(dev, WHOLE_DISK_SLICE), RAW_PART);
+   bp->b_dev = dkmodslice(dkmodpart(dev, -RAW_PART), WHOLE_DISK_SLICE);
bp->b_blkno = mbr_offset;
bp->b_bcount = lp->d_secsize;
Index: kern/subr_diskslice.c
===
RCS file: /home/ncvs/src/sys/kern/subr_diskslice.c,v
retrieving revision 1.96
diff -u -2 -r1.96 subr_diskslice.c
--- kern/subr_diskslice.c   12 Sep 2001 08:37:45 -  1.96
+++ kern/subr_diskslice.c   17 Jan 2002 04:19:10 -
@@ -68,4 +68,5 @@

 static struct disklabel *clone_label __P((struct disklabel *lp));
+static dev_t dkmodminor __P((dev_t dev, int mynor, int slicehint));
 static void dsiodone __P((struct bio *bp));
 static char *fixlabel __P((char *sname, struct diskslice *sp,
@@ -77,4 +78,5 @@
  struct disklabel *lp));
 static void set_ds_labeldevs __P((dev_t dev, struct diskslices *ssp));
+static void set_ds_labeldevs_unaliased __P((dev_t dev, struct diskslices *ssp));
 static void set_ds_wlabel __P((struct diskslices *ssp, int slice,
   int wlabel));
@@ -123,4 +125,106 @@

 /*
+ * XXX should be able to share more code between disk_dev_synth(),
+ * disk_clone() and here.
+ * XXX using dsname() only slightly insulates us from complications.
+ */
+static dev_t
+dkmodminor(dev_t dev, int mynor, int slicehint)
+{
+   dev_t newdev, newdev_alias;
+   const char *sname;
+   char partname[2];
+
+   newdev = makedev(major(dev), mynor);
+   if ((dev->si_flags & SI_NAMED) == 0)
+   return (newdev);    /* XXX should panic. */
+   if (newdev->si_flags & SI_NAMED) {
+   /* We have found a device, but may want an alias. */
+   if (dkslice(newdev) == WHOLE_DISK_SLICE ||
+   dkslice(newdev) == COMPATIBILITY_SLICE ||
+   dkpart(newdev) != RAW_PART || slicehint)
+   return (newdev);
+
+   /* We do want an alias.  There can be only one.  XXX. */
+   newdev_alias = LIST_FIRST(&newdev->si_children);
+   if (newdev_alias != NULL)
+   return (newdev_alias);
+   sname = dsname(dev, dkunit(newdev), dkslice(newdev),
+   dkpart(newdev), partname);
+   return (make_dev_alias(newdev,