Re: double fault with vinum and 4.5 RC3

2002-01-27 Thread Greg Lehey

On Sunday, 27 January 2002 at 19:12:59 -0600, David W. Chapman Jr. wrote:
>> I still don't know why you're switching jumpers.  That's not needed.
>> But it doesn't change anything.
>
> wouldn't he have to change the scsi id's if id 1 was set to boot in
> the controller and drive 1 no longer exists.  I suppose he could
> change the boot id in the scsi adapters bios as well.

Yes, this appears to be the case.  

Greg
--
See complete headers for address and phone numbers

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-stable" in the body of the message



Re: double fault with vinum and 4.5 RC3

2002-01-27 Thread Greg Lehey

On Sunday, 27 January 2002 at 19:46:28 +0100, Martin Blapp wrote:
>
> Uhh, what I'm doing wrong ?
>
> cat configfile
> 

The config file looks fine.

> # vinum create configfile
>2: drive vinumdrive2 device /dev/da1s1e
> ** 2 : Invalid argument
>
> D vinumdrive1   State: up   Device /dev/da0s1e  Avail: 0/29880 MB (0%)
> D vinumdrive2   State: up   Device  Avail: 0/0 MB
>
> Uhh ? Vinum does not find the second drive ?

So it seems.  I've heard of this in a couple of cases.  It would be
interesting to see the log output, but if I understand you correctly
you have since found a way to work around the problem.  I suspect that
there was something in the disk label which confused the issue, but it
would be nice to find what it is.  Possibly /dev/da1e would have
worked.

Greg



Re: double fault with vinum and 4.5 RC3

2002-01-27 Thread Greg Lehey

On Sunday, 27 January 2002 at  0:37:09 +0100, Thomas Moestl wrote:
> On Fri, 2002/01/25 at 11:38:31 +0100, Martin Blapp wrote:
>> vinum create /root/configfile twice ! makes STABLE 4.5RC3 crashing with
>> a double fault. I guess we should fix that. I'm making now a debug kernel.
>
> I've had a quick look at Martin's crash dump;

Well, for a quick look you've done some really good work.  This bug
has probably been chasing me for years.

> the double fault is caused by a longjmp() that restores an invalid
> %esp. By accessing the stack that is to be restored by longjmp
> before actually setting %esp, I was able to get a usable trace
> (showing the vinum-relevant parts only):
>
>   longjmp(c034a120,0,0,c1c31af4,3) at longjmp+0x7
>   throw_rude_remark(0,c02dbfc0,c1c41406,c1c31b84,c1c31af4) at
>   
>
> Apparently, this process slept in disk I/O initiated by
> vinum_read_label(); meanwhile, the vinum process forked in
> start_daemon() was scheduled and performed an ioctl() to find out
> whether the daemon was running. In this case, it was, so it exited.
> After some time, the other process is woken up because the I/O
> finished. vinum_read_label() returned DL_WRONG_DRIVE, so eventually
> throw_rude_remark() was called, which tried to longjmp() back to
> vinumioctl(). However, because vinumioctl() was called by another
> process in between, command_fail had been overwritten (it is a global
> variable). The stack referenced in that jmpbuf was unmapped when
> that process exited, so the result was a double fault.

Exactly.  The culprit is an extraneous call to setjmp in vinumioctl.
The setjmp logic assumes that the process is holding the config lock,
so it can use command_fail.  Unfortunately, this call was outside the
protected code, and that's what caused the problem.

> Greg, Martin should have the crash dump I'm talking about available
> if you would like to take a look (the kernel in question has longjmp
> debugging code and a few printf()s added).

I tried looking, but the net connection was terrible.  Anyway, I've
been able to reproduce it here, and I have a patch which should make
it go away:

RCS file: /home/ncvs/src/sys/dev/vinum/vinumconfig.c,v
retrieving revision 1.32.2.5
diff -w -u -r1.32.2.5 vinumconfig.c
--- vinumconfig.c   28 May 2001 05:56:27 -  1.32.2.5
+++ vinumconfig.c   27 Jan 2002 03:31:33 -
@@ -99,6 +99,8 @@
 static int finishing;  /* don't recurse */
 int was_finishing;
 
+if ((vinum_conf.flags & VF_LOCKED) == 0)   /* bug catcher */
+   panic ("throw_rude_remark: called without config lock");
 va_start(ap, msg);
 if ((ioctl_reply != NULL)  /* we're called from the user */
 &&(!(vinum_conf.flags & VF_READING_CONFIG))) { /* and not reading from disk: return msg */
Index: vinumioctl.c
===
RCS file: /home/ncvs/src/sys/dev/vinum/vinumioctl.c,v
retrieving revision 1.25.2.3
diff -w -u -r1.25.2.3 vinumioctl.c
--- vinumioctl.c   13 Mar 2001 02:59:43 -  1.25.2.3
+++ vinumioctl.c   27 Jan 2002 03:31:06 -
@@ -82,9 +82,6 @@
     switch (DEVTYPE(dev)) {
     case VINUM_SUPERDEV_TYPE:  /* ordinary super device */
        ioctl_reply = (struct _ioctl_reply *) data; /* save the address to reply to */
-       error = setjmp(command_fail);   /* come back here on error */
-       if (error)  /* bombed out */
-           return 0;   /* the reply will contain meaningful info */
        switch (cmd) {
 #ifdef VINUMDEBUG
        case VINUM_DEBUG:

The panic in throw_rude_remark is "just in case", since it's a lot
easier to debug like that than after a double fault.  Please try that
and let me know if you still have problems.  In view of the impending
release of 4.5, it would be nice to get this fix in.

This doesn't change the fact that Martin's approach to recovery is
incorrect.  I'll address that in a separate message.  

Greg



Re: double fault with vinum and 4.5 RC3

2002-01-27 Thread Greg Lehey

On Sunday, 27 January 2002 at 19:53:52 +0100, Martin Blapp wrote:
>
> I just got another panic while executing;
>
> vinum resetconfig

Why are you using resetconfig?  It's almost never needed.

> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x28
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc135abcb
> stack pointer           = 0x10:0xcdad8d20
> frame pointer           = 0x10:0xcdad8d34
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 271 (vinum)
> interrupt mask          = none
> kernel: type 12 trap, code=0
> Stopped at  driveio+0x47:   movl    0x28(%eax),%eax
>
> driveio(c12520fc,c11aa200,200,1000,0) at driveio+0x47
> remove_drive(1,4649,c134c380,cdad8d8c,c135c131) at remove_drive+0x79
> free_vinum(1,cdad8de4,c134c380,cbfae700,cdad8ea8) at free_vinum+0x26
> vinumioctl(c134c380,4649,cdad8ea8,3,cbfae700) at vinumioctl+0x47d
> spec_ioctl(cdad8de4,cdad8dcc,c02634e1,cdad8de4,cdad8e74) at spec_ioctl+0x26
> spec_vnoperate(cdad8de4,cdad8e74,c01d41ab,cdad8de4,c1312b80) at spec_vnoperate+0x15
> ufs_vnoperatespec(cdad8de4,c1312b80,3,0,c02f80a0) at ufs_vnoperatespec+0x15
> vn_ioctl(c1312b80,4649,cdad8ea8,cbfae700,cbfae700) at vn_ioctl+0x10f
> ioctl(cbfae700,cdad8f80,bfbffb88,bfbffb93,8095d39) at ioctl+0x20a
> syscall2(2f,2f,2f,8095d39,bfbffb93) at syscall2+0x1f5
> Xint0x80_syscall() at Xint0x80_syscall+0x25

I'd guess that this is some race condition I've never seen before.  
I've asked you several times now to read
http://www.vinumvm.org/vinum/how-to-debug.html and supply me the
information I ask for there.  In particular, I need log output.

Greg



Re: double fault with vinum and 4.5 RC3

2002-01-27 Thread Greg Lehey

On Thursday, 24 January 2002 at 19:14:10 +0100, Martin Blapp wrote:
>
> ... unfortunately I'm not able to recover from a simulated
> disk-crash. Here is what I made:
>
> Two disks: da0, da1, both 35GB
>
> 1) Installed FreeBSD on da0s1a
>
> 2) mount so single user mode
> dd bs=4096k if=/dev/da0 of=/dev/da1
>
> (to have the root on disk 2 too, I did this cold mirror)
>
> I plan to do a cold mirror procedure each evening, since
> the root will be mostly static anyway.
>
> mount /dev/da1a /mnt
> cd /mnt && dump -rf 0 /dev/da0a -f - | restore -xf -
> rm /mnt/restoresymtable
> umount /mnt
>
> 3) I edited the disklabels and set da0e, da0f, da0g
> to type vinum
>
> # vinum mirror -n var /dev/da0e /dev/da1e
> # vinum mirror -n docsis /dev/da0f /dev/da1f
> # vinum mirror -n docsisvar /dev/da0g /dev/da1g

As I've mentioned elsewhere, this is seriously suboptimal.  The
"mirror" command is a toy for people getting used to Vinum.  You want
a proper config file.  Then create one drive per spindle, and choose
your subdisk sizes to match what you want.  Specifically, your config
file should look like this:

  drive 0 device /dev/da0e
  drive 1 device /dev/da1e
  volume var setupstate
    plex org concat
      sd drive 0 len 2096000s
    plex org concat
      sd drive 1 len 2096000s
  volume docsis setupstate
    plex org concat
      sd drive 0 len 1257000s
    plex org concat
      sd drive 1 len 1257000s
  volume docsisvar setupstate
    plex org concat
      sd drive 0 len 4651000s
    plex org concat
      sd drive 1 len 4651000s

This will require repartitioning the disks, of course, and in the
process you'll regain a little additional space.  Theoretically you
could repartition and use the existing data, but that is rather risky.

> But now I tried what happens if the disk 0 fails, I mean da0.  I
> took da0 out and replaced it with a new disk. I switched jumpers, so
> da0 got da1, and da1 got da0 and rebooted.

I still don't know why you're switching jumpers.  That's not needed.
But it doesn't change anything.

> I could boot fine, vinum failed to start and I was in single user
> mode. Then I saw:
>
> # vinum start
> Warning: defective objects
>
> D vinumdrive0   State: referenced   Device  Avail: 0/0 MB
> D vinumdrive2   State: referenced   Device  Avail: 0/0 MB
> D vinumdrive4   State: referenced   Device  Avail: 0/0 MB
> P var.p0        C State: faulty   Subdisks: 1 Size:   1023
> P docsis.p0     C State: faulty   Subdisks: 1 Size:   6143
> P docsisvar.p0  C State: faulty   Subdisks: 1 Size: 22
> S var.p0.s0         State: stale   PO:0  B Size:   1023
> S docsis.p0.s0      State: stale   PO:0  B Size:   6143
> S docsisvar.p0.s0   State: stale   PO:0  B Size: 22
>
> where is p1 ???

Not defective.   I've repeated this configuration almost exactly (I
was using much smaller disks, so the size is different).  What I got
was:

  vinum -> start
  Warning: defective objects
  
  D vinumdrive0   State: referenced   A: 0/0 MB
  D vinumdrive2   State: referenced   A: 0/0 MB
  D vinumdrive4   State: referenced   A: 0/0 MB
  P var.p0        C State: faulty   Subdisks: 1 Size:   1023 MB
  P docsis.p0     C State: faulty   Subdisks: 1 Size:    613 MB
  P docsisvar.p0  C State: faulty   Subdisks: 1 Size:   2270 MB
  S var.p0.s0         State: stale   D: vinumdrive0  Size:   1023 MB
  S docsis.p0.s0      State: stale   D: vinumdrive2  Size:    613 MB
  S docsisvar.p0.s0   State: stale   D: vinumdrive4  Size:   2270 MB

This is simply a list of the defective objects, as the warning message
states.  To see them all, use the list (or l) command:

  vinum -> l
  3 drives:
  D vinumdrive1   State: up   /dev/da0e   A: 0/1023 MB (0%)
  D vinumdrive3   State: up   /dev/da0f   A: 0/614 MB (0%)
  D vinumdrive5   State: up   /dev/da0g   A: 0/2271 MB (0%)
  D vinumdrive0   State: referenced   A: 0/0 MB
  D vinumdrive2   State: referenced   A: 0/0 MB
  D vinumdrive4   State: referenced   A: 0/0 MB
  
  3 volumes:
  V var           State: up   Plexes:   2 Size:   1023 MB
  V docsis        State: up   Plexes:   2 Size:    613 MB
  V docsisvar     State: up   Plexes:   2 Size:   2270 MB
  
  6 plexes:
  P var.p0        C State: faulty   Subdisks: 1 Size:   1023 MB
  P var.p1        C State: up       Subdisks: 1 Size:   1023 MB
  P docsis.p0     C State: faulty   Subdisks: 1 Size:    613 MB
  P docsis.p1     C State: up       Subdisks: 1 Size:    613 MB
  P docsisvar.p0  C State: faulty   Subdisks: 1 Size:   2270 MB
  P docsisvar.p1  C State: up   Subd

Re: double fault with vinum and 4.5 RC3

2002-01-25 Thread Greg Lehey

On Friday, 25 January 2002 at 11:38:31 +0100, Martin Blapp wrote:
>
> Hi,
>
> Here is the config, before I exchanged da0 with a raw disk.
>
> # Vinum configuration of , saved at Fri Jan 25 09:44:32 2002
> drive vinumdrive0 device /dev/da0e
> drive vinumdrive1 device /dev/da1e
> drive vinumdrive2 device /dev/da0f
> drive vinumdrive3 device /dev/da1f
> drive vinumdrive4 device /dev/da0g
> drive vinumdrive5 device /dev/da1g

You still haven't explained why you're doing this.

> I guess I know what makes the trouble. To be able to boot !

!?

> from da0, which also has a root partition on it, I changed SCSI
> ID's, so my old da1 got da0. And vinum cannot deal with this.

Yes, it can.

> I really should be able to tell vinum that vinumdrive1 is indeed
> vinumdrive2, since the device ordering has changed.

It does this automatically.

> But I really do not want to restore my root partition from backup,
> if I still have a working cold mirror. That is nonsense.

It would be nonsense, but it's incorrect.

>> vinum list?
>
> eblcom# vinum list
> 3 drives:
> D vinumdrive1   State: up   Device /dev/da0s1e  Avail: 0/1024 MB
> D vinumdrive3   State: up   Device /dev/da0s1f  Avail: 0/6144 MB
> D vinumdrive5   State: up   Device /dev/da0s1g  Avail: 0/22712 MB
> D vinumdrive0   State: referenced   Device  Avail: 0/0 MB
> D vinumdrive2   State: referenced   Device  Avail: 0/0 MB
> D vinumdrive4   State: referenced   Device  Avail: 0/0 MB

OK, da1 is gone away.

> It seems to me that reviving the mirrors does not work as supposed.

You need a disk first.

> By the way. I found the command which made such problems:
>
> have a configfile like such:
>
> # cat /root/configfile
> drive vinumdrive0 device /dev/da1e
> drive vinumdrive2 device /dev/da1f
> drive vinumdrive4 device /dev/da1g
>
> and execute
>
> vinum create /root/configfile twice ! makes STABLE 4.5RC3 crashing
> with a double fault. I guess we should fix that. I'm making now a
> debug kernel.

OK, this bears investigation.

> If I try to take the configfile as described above, from the
> original creation, I get vinum hanging and get unkillable, or I get
> a segfault from vinum. (Both happened)

OK, let's see the dumps.

On Friday, 25 January 2002 at 13:59:32 +0100, Martin Blapp wrote:
>
> I've found now a way to work around this vinum limitation.
> It seems that a vinum mirror is not able to handle this case:
>
> Crash of the "Master disk", "Slave Disk" becomes Master instead,
> and we boot the Slave Disk as Master.

There's no such thing as a master or a slave disk in Vinum.

> I had to do the following steps:
>
> 1) Reboot
> 2) Change SCSI ID's. ID1 > ID2, ID2 > ID1

This is not necessary.

> 3) Boot up the previous SLAVE disk into SUM
> 4) Partition da1 the same as da0
> 5) Disklabel da1 the same way as da0
> 6) dd bs=4096k if=/dev/da0a of=/dev/da1a
> 7) reboot
> 8) Change SCSI ID's. ID2 > ID1, ID1 > ID2
> 9) Boot up the previous SLAVE disk into SUM
> 10) echo "drive vinumdrive0 device /dev/da0e" > configfile
> echo "drive vinumdrive2 device /dev/da0f" >> configfile
> echo "drive vinumdrive4 device /dev/da0g" >> configfile
> 11) vinum start
> 12) vinum start vinumdrive0
> 13) vinum start vinumdrive2
> 14) vinum start vinumdrive4
> 15) vinum stop
> 16) vinum start var.p0.s0
> 17) vinum start docsis.p0.s0
> 18) vinum start docsisvar.p0.s0
>
> And then vinum gets happy and rebuilds.

Most of what you have done there is irrelevant.

> The only difference here is that we have to reboot and change SCSI
> ID's twice, instead of only one time.

You don't need to change them at all.

On Friday, 25 January 2002 at 15:14:37 +0100, Martin Blapp wrote:
>
> Hi Greg,
>
> Does this look reasonable now ? (I Included my workaround for the
> vinum double fault panic problem.)

No.  It's all based on a misunderstanding of how Vinum works.

I'll wait for the dumps of the situations you show above.  We should
also check the hangs.

Greg



Re: double fault with vinum and 4.5 RC3

2002-01-24 Thread Greg Lehey

On Thursday, 24 January 2002 at 19:14:10 +0100, Martin Blapp wrote:
>
> Hi Greg,
>
> I've solved my issues differently. But unfortunately I'm not able to
> recover from a simulated disk-crash. Here is what I made:

Hmm.  That's about half the information I need.  Take a look at
http://www.vinumvm.org/vinum/how-to-debug.html .  You should be in a
position to debug the crash better yourself.

> # vinum start
> Warning: defective objects
>
> D vinumdrive0   State: referenced   Device  Avail: 0/0 MB
> D vinumdrive2   State: referenced   Device  Avail: 0/0 MB
> D vinumdrive4   State: referenced   Device  Avail: 0/0 MB
> P var.p0        C State: faulty   Subdisks: 1 Size:   1023
> P docsis.p0     C State: faulty   Subdisks: 1 Size:   6143
> P docsisvar.p0  C State: faulty   Subdisks: 1 Size: 22
> S var.p0.s0         State: stale   PO:0  B Size:   1023
> S docsis.p0.s0      State: stale   PO:0  B Size:   6143
> S docsisvar.p0.s0   State: stale   PO:0  B Size: 22
>
> where is p1 ???

Looks like it's up.  These are only the defective objects.  But
obviously it hasn't found three of your spindles.

> Then I tried to do:
>
> # echo "drive evar device /dev/da1e" >> configfile
> # echo "drive edocsis device /dev/da1f" >> configfile
> # echo "drive edocsisvar device /dev/da1g" >> configfile

Why do you have three drives on the same spindle?

> # vinum create configfile
>
> And boom I got a double fault.
>
> Strange thing is that I can vinum start, see the error messages,
> and then can mount all this "mirrors" fine, but indeed they are
> not mirrors. It's just this disk which I can see.

vinum list?

Greg



arplookup 0.0.0.0 failed?

2002-01-05 Thread Greg Lehey

I've recently upgraded a machine to 4.5-PRERELEASE and am now getting
messages such as

Jan  5 12:33:39 echunga /kernel: arplookup 0.0.0.0 failed: host is not on local network

Any idea what could be causing this?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key



Re: [HARD CRASH] gdb output - what is it saying?

2001-10-22 Thread Greg Lehey

On Monday, 22 October 2001 at 22:04:46 +0200, Bjarne Wichmann Petersen wrote:
> I've included the latest gdb-out. I have no clue to what it all means, so if
> someone with a clue would help me locate what is causing my 4.4-STABLE to
> crash I'd be very happy.

If I could read it, it would help.  You shouldn't wrap computer
output.  If you follow up, please send the output as it comes, and
make the output in hex.

> (kgdb) symbol-file kernel.debug
> Reading symbols from kernel.debug...done.
> (kgdb) exec-file /var/crash/kernel.4
> (kgdb) core-file /var/crash/vmcore.4
>
> (kgdb) where

You've had two traps in a row, separated by an interrupt.  Both traps
are in timer code.

(rearranging)

> at ../../i386/i386/trap.c:849
> #16 0xc0357ca7 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi =
> -880967808,
>   tf_esi = -880967900, tf_ebp = -880967916, tf_isp = -880967952, tf_ebx =
> -1058816640,
>   tf_edx = -1069680192, tf_ecx = -1069680192, tf_eax = 1381192787,
> tf_trapno = 12,
>   tf_err = 0, tf_eip = 1381192787, tf_cs = 8, tf_eflags = 66178, tf_esp =
> -1071952601,
>   tf_ss = -1058816640}) at ../../i386/i386/trap.c:448

Here's the first one.  You've had a trap 12 (page fault in kernel
mode).  The IP register (instruction pointer) was pointing to
1381192787.  It's a lot easier to read this if you set your
output-radix to 16, where the address will show as 0x52535453.  This
is not only not a valid kernel address, it represents the text "STSR",
which suggests to me that something has been overwriting the stack.
It's not worth looking at this frame any more.

> #17 0x52535453 in ?? ()

I'm not sure what this is, but clearly the stack has been trashed
(recognize that address?).

> #18 0xc01c6472 in gettimeofday (p=0xcb684ea0, uap=0xcb7d7f80) at
> ../../kern/kern_time.c:307

This should be a call to microtime().  Somehow it didn't get there.

> #19 0xc03586c1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi
> = 137846784,
>   tf_esi = 46815, tf_ebp = -1077938116, tf_isp = -880967724, tf_ebx =
> 842094169,
>   tf_edx = -1077938308, tf_ecx = 678132992, tf_eax = 116, tf_trapno = 0,
> tf_err = 2,
>   tf_eip = 677671868, tf_cs = 31, tf_eflags = 643, tf_esp = -1077938160,
> tf_ss = 47})
> at ../../i386/i386/trap.c:1155
> #20 0xc0349ce5 in Xint0x80_syscall ()
> #21 0x8068b2c in ?? ()
> #22 0x8064540 in ?? ()
> #23 0x8061f7d in ?? ()

Clock interrupt.  So far, so good.

> #9  0xc035cbb7 in clkintr (frame={cf_vec = 0, cf_ppl = 0, cf_fs = 16, cf_es =
> 16,
>   cf_ds = 16, cf_edi = -880968264, cf_esi = -880968260, cf_ebp =
> -880968248,
>   -880968308, cf_ebx = -881756544, cf_edx = -880968264, cf_ecx =
> -881756544,
>   cf_eax = -881982624, 0, 0, cf_eip = -1071714030, cf_cs = 8, cf_eflags =
> 582,
>   cf_esp = -1055106048, cf_ss = 0}) at ../../i386/isa/clock.c:216
> #10 0xc01ef112 in vfs_msync (mp=0xc11c5c00, flags=2) at
> ../../kern/vfs_subr.c:2536
> #11 0xc01f00e0 in sync (p=0xc043d760, uap=0x0) at
> ../../kern/vfs_syscalls.c:544
> #12 0xc01c0bd2 in boot (howto=256) at ../../kern/kern_shutdown.c:234
> #13 0xc01c11c0 in poweroff_wait (junk=0xc03cb42c, howto=-1069764785)
> at ../../kern/kern_shutdown.c:581
> #14 0xc0358416 in trap_fatal (frame=0xcb7d7ec4, eva=1381192787)
> at ../../i386/i386/trap.c:956
> #15 0xc03580e9 in trap_pfault (frame=0xcb7d7ec4, usermode=0, eva=1381192787)


> #0  dumpsys () at ../../kern/kern_shutdown.c:473
> #1  0xc01c0df3 in boot (howto=260) at ../../kern/kern_shutdown.c:313
> #2  0xc01c11c0 in poweroff_wait (junk=0xc03cb42c, howto=-1069764785)
> at ../../kern/kern_shutdown.c:581
> #3  0xc0358416 in trap_fatal (frame=0xcb7d7cc4, eva=1381192787)
> at ../../i386/i386/trap.c:956
> #4  0xc03580e9 in trap_pfault (frame=0xcb7d7cc4, usermode=0, eva=1381192787)
> at ../../i386/i386/trap.c:849

Second trap.  Same address as the first.

> #5  0xc0357ca7 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi =
> -1058817060,
>   tf_esi = -1058816548, tf_ebp = -880968424, tf_isp = -880968464, tf_ebx
> = -1058817152,
>   tf_edx = -1058817024, tf_ecx = 0, tf_eax = 1381192787, tf_trapno = 12,
> tf_err = 0,
>   tf_eip = 1381192787, tf_cs = 8, tf_eflags = 66050, tf_esp = -1071951411,
>   tf_ss = -1058817152}) at ../../i386/i386/trap.c:448
> #6  0x52535453 in ?? ()

Recognize this bogus address again?

> #7  0xc01b5224 in tco_forward (force=0) at ../../kern/kern_clock.c:761

This should be a call to sync_other_counter().

> #8  0xc01b49b4 in hardclock (frame=0xcb7d7d58) at ../../kern/kern_clock.c:236

This is a puzzling dump.  Have you any specialized timer hardware or
software on your machine?  Is the dump repeatable?

Greg



Re: [stable] Re: RAID5

2001-09-04 Thread Greg Lehey

On Tuesday,  4 September 2001 at 13:52:02 +0100, Lawrence Farr wrote:
> Just to add another benchmark, I got:
>
> Pass 23 - 1048576 kb written in 115 seconds, at 9118 kb/Sec
> Pass 23 - 1048576 kb read in 15 seconds, at 69905 kb/Sec

This shows there's a big difference.  Which is which?  What is it
really doing here?

Greg



Vinum vs. hardware RAID (was: RAID5)

2001-09-04 Thread Greg Lehey

On Tuesday,  4 September 2001 at 10:49:31 -0700, Darryl Okahata wrote:
> Chris BeHanna <[EMAIL PROTECTED]> wrote:
>
>> AFAIK, there are no IDE RAID cards that support RAID-5.  You are
>
>  The 3ware cards support RAID5, although the performance is
> supposedly not very good.
>
>> Or you could use vinum(8) and use whatever IDE controller you
>> have, and whatever RAID configuration that you want.  Note, however
>
>  I used to use vinum, but I'm now using a 3ware controller.  It's
> *SOOO* nice to have something that's easy to set up, yet "just works".
>
>  My issues with vinum are:
>
> [ Note: it's been a year or two since I've used vinum, and so the
>   following may be out-of-date.  Corrections appreciated.  ]
>
> * Once set up, vinum works well, but it takes a lot of hard reading to
>   set up.  The actual work itself is not hard, mind you, but it's
>   difficult to figure out what you need to put into the config file.

I've added sample configs to the man page.  People still have trouble,
but none of them seem to have read that section of the man page.

>   It's *SO* much easier to configure the 3ware controller 

Probably also more restricted, isn't it?

> * Vinum is not part of the GENERIC kernel.

This is not really relevant.

>   I had /usr on a vinum-controlled partition, and upgrading FreeBSD
>   via CDs was a royal pain.  I basically had to do the upgrade to
>   the root drive (moving /usr out of the way, rebuild the kernel
>   with vinum, reboot, and then copy/move the contents of /usr into
>   the vinum-controlled /usr -- bleah).
>
>   [ Today, I should be able to kldload vinum, but I don't think I can do
> this at install time from the CDROM, and so I'd still have to play
> games with /usr.  ]

Yes, currently installing on Vinum is a pain.  But vinum(8) loads the
kld automatically, and I suspect that you should be able to load it
during the installation process.  The problem is that sysinstall
doesn't know about Vinum.

> * (This may have changed.)  There is no documentation on how to do
>   disaster recovery with vinum.  The procedure is (roughly) documented
>   with 3ware (although I should, but have not, admittedly, tested
>   it).

It's (roughly) documented with Vinum now as well.

Greg



Re: XFS (was: ReiserFS (was: JFS (was: The FreeBSD core team needs your help)))

2001-07-19 Thread Greg Lehey

On Sunday,  8 July 2001 at 20:47:53 -0500, Dave Uhring wrote:
> On Sunday 08 July 2001 19:32, Matthew Emmerton wrote:
>
>>
>> This "virus-like" aspect of the GPL is also very loudly explained in
>> a 12-page presentation distributed internally to all developers at
>> IBM.  (The virus-like aspect is a big deal.  If someone accidentally
>> *statically* linked a piece of GPL'd object code, such as GNU
>> getopt(), into a major product such as DB2 EEE for Linux, then they
>> would be forced to open the code to DB2.  That would not be a very
>> profitable move for IBM.)
>>
>> The problem for me (as a developer who would love to port XFS to
>> FreeBSD, but by being employed by IBM, cannot), is that IBM would
>> have IP rights over any changes that I would have to submit back to
>> SGI in order to make XFS work on FreeBSD.  For IBM to release that
>> code under the GPL, SGI would have to work with IBM and come to an
>> agreement (which would involve all of the IP lawyers from the two
>> firms battling it out).  Since this process would take months and a
>> pile of cash, FreeBSD would never see XFS.
>>
>> Sad, but true.  Maybe I need to switch companies :)
>
> No need to switch companies over something like this.  Fix up JFS for
> FreeBSD :-)  and stay with IBM.

As I've said on the correct group (FreeBSD-fs), if anybody wants to
port JFS to FreeBSD, please contact me.

Greg



ReiserFS (was: JFS (was: The FreeBSD core team needs your help))

2001-07-07 Thread Greg Lehey

On Saturday,  7 July 2001 at 14:50:21 +0200, A. L. Meyers wrote:
> On Sat, 7 Jul 2001, Greg Lehey wrote:
>
>> On Friday,  6 July 2001 at 11:31:49 +0100, Antony T Curtis wrote:
>>> Greg Lehey wrote:
>>>>
>>>> On Wednesday,  4 July 2001 at 11:38:08 +0100, Antony T Curtis wrote:
>>>>> Greg Lehey wrote:
>>>>>>
>>>>>> On Tuesday, 12 June 2001 at 19:22:45 +0200, Steve O'Hara-Smith wrote:
>>>>>>> On Tue, 12 Jun 2001 12:09:58 +0100
>>>>>>> Josef Karthauser <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> On Fri, Jun 08, 2001 at 08:32:23AM -0700, Eric Parusel wrote:
>>>>>>>>>>> A journalling FS for those people who just hate waiting for a
>>>>>>>>> couple
>>>>>>>>>>> of
>>>>>>>>>>> TB of slow disks to fsck?
>>>>>>>>>>
>>>>>>>>>> Does ReiserFS work with FreeBSD? ? ? ?
>
> (big snip)
>
> Hey, guys and gals, did you forget this part of the original
> post?

No, they didn't notice it, because the Subject: line was pointing
elsewhere.  Piggybacking new questions onto old topics doesn't work
well unless you change the Subject: line.

> Just installed SuSE Linux 7.2 with Reiser FS throughout on an Intel
> SMP box. The FS purrs, even on /, which doesn't mean everything is
> better or worse than FBSD.

I don't know enough about ReiserFS to be able to give a useful
opinion.  The Linux people I know are by no means in agreement about
its merits, but I've heard that it's best as a "special purpose" FS
for small files.  I don't know how valid that statement is.

Greg



Re: JFS (was: The FreeBSD core team needs your help)

2001-07-05 Thread Greg Lehey

On Wednesday,  4 July 2001 at 11:38:08 +0100, Antony T Curtis wrote:
> Greg Lehey wrote:
>>
>> On Tuesday, 12 June 2001 at 19:22:45 +0200, Steve O'Hara-Smith wrote:
>>> On Tue, 12 Jun 2001 12:09:58 +0100
>>> Josef Karthauser <[EMAIL PROTECTED]> wrote:
>>>
>>>> On Fri, Jun 08, 2001 at 08:32:23AM -0700, Eric Parusel wrote:
>>>>>>> A journalling FS for those people who just hate waiting for a
>>>>> couple
>>>>>>> of
>>>>>>> TB of slow disks to fsck?
>>>>>>
>>>>>> Does ReiserFS work with FreeBSD?
>>>>>
>>>>>> From what I've read, XFS is quite good as well  (Whether or not it
>>>>> could ever work with *BSD, I don't know)
>>>>
>>>> Apparently XFS would run better on FreeBSD than on Linux, from what
>>>
>>>   Whatever happened to the open source release of JFS, or is JFS really
>>> bad ?
>>
>> The open source version of JFS was based on OS/2, not AIX.  It's not
>> an overly good fit to UNIX.
>
> That was only because AIX's JFS implementation was so closely bound into
> their kernel that there was no easy way to "port" it out of it. Also,
> AFAIK, it was written in a mesh of different languages too, including
> POWER architecture assembly.
>
> The OS/2 version was the first clean implementation to plug into OS/2's
> IFS driver model - and being written in C, it is much more 'portable'.
> (AFAIK, it was supposed to be able to be recompiled for OS/2 for CHRP
> PowerPC)

Since writing that (quite some time ago, IIRC) I have joined IBM and
am now working with the people who did the JFS port.  They
substantially confirm your viewpoint, with the added information that
the "old" JFS, now called JFS 1, is being phased out under AIX, and
the "new" AIX JFS, JFS 2, is based on the same code base as the OS/2
port.  With that background, IBM's approach makes a lot more sense.
It's a pity that this issue wasn't clarified earlier.

> All said, I would be interested in a JFS port for FreeBSD 

I'm going to be doing a lot of work on JFS in the next few months.  I
don't think I'll port it to FreeBSD, but I'll be available for
questions, and I'll have a better understanding.


>> unix soit qui mal y pense

You're aware that the original word of this phrase, "hon(n)i", means
"ashamed"?

Greg
--
See complete headers for address and phone numbers




Re: Hard lockup

2001-07-03 Thread Greg Lehey

On Monday,  2 July 2001 at 15:45:06 -0600, John R. Shannon wrote:
> I'm also experiencing the lockups; 4 today since CVSUPing stable this AM.
>
> /var/run/dmesg.boot:

*sigh*  We really need to update our documentation.  dmesg was once
important.  In this case it's irrelevant.  Can you find any
correlation between what you were doing and the hangs?

> Mounting root from ufs:/dev/da0s1a
> @%9`%9

Or are you saying that it froze at this point?

Greg
--
See complete headers for address and phone numbers




Re: Vinum safe to use for raid 0?

2001-01-03 Thread Greg Lehey

On Wednesday,  3 January 2001 at 10:29:44 +, Josef Karthauser wrote:
> On Tue, Jan 02, 2001 at 11:01:07PM +0100, Thomas Seck wrote:
>> Hi all,
>>
>> sorry if this is OT for -stable, but I followed the discussion about
>> vinum in here and got a bit worried.
>>
>> I am currently deploying a proxy server for our company. It shall use
>> squid on 4.2-STABLE. I would like to put the cache data on a vinum RAID
>> 0, made of three U160 disks. As I understood the discussion so far,
>> there are some unresolved problems with the raid 5 code. Could someone
>> tell me whether I can safely use vinum for building a raid 0 system
>> (despite the fact that the HW may be a point of failure of course)?
>
> As far as I'm aware there are no problems using vinum for raid 0.
>
> As far as I understand from Greg he's not aware of many people who
> are having problems with raid 5.

Probably.  If you know of somebody I don't know of, please let me
know.

> As one of the small minority who was having problems I'd advise you
> to soaktest any vinum raid 5 installation before committing
> important data to it.

That's what vinum(4) says.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: RAID costs (was: Vinum safe to use for raid 0?)

2001-01-03 Thread Greg Lehey

On Tuesday,  2 January 2001 at 23:58:26 -0800, Matt Dillon wrote:
> Like everything, topologies have their strengths and weaknesses.
> RAID-5 is excellent for read-centric operations (which large data stores
> tend to be, I will note) and, as Poul reminded me a few days ago...
> stripe-sized block-write operations can be made optimal.

Well, they can be optimized, which isn't quite the same thing.  That's
a wish list item for Vinum.
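
To make the stripe-sized write case concrete, here is a minimal sketch
of RAID-5 parity arithmetic.  It is not Vinum code; the function names
are made up for illustration, and it assumes plain XOR parity over
equal-sized blocks.  A small write has to read the old data and old
parity before it can write; a full-stripe write computes parity from
the new data alone.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data, new_data, old_parity):
    """Read-modify-write: two reads (old data, old parity) must happen
    before the two writes (new data, new parity)."""
    return xor_blocks(old_parity, old_data, new_data)


def full_stripe_parity(new_blocks):
    """Full-stripe write: parity depends only on the new data blocks,
    so nothing has to be read from disk first."""
    return xor_blocks(*new_blocks)
```

Both paths give the same parity block; the difference is only in how
many disk reads are needed to get there.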

> Of course, it has to be reliable to be useable, which is really
> the crux of the current thread.  Someone needs to buy Greg some
> faster machines to play with :-), as the current vinum issues
> appear to be related to timing.

I'm not sure about that, though it's possible.  The real issue with my
test setup is that the disks I have are all ancient.  I'm getting some
more modern ones Real Soon Now, but of course the optimum way to solve
this particular problem would be if somebody sent me exactly the
machine that was having trouble.

> There are other big differences between software and black-box
> RAID solutions.  For example, what happens when the machine
> crashes right smack in the middle of a write?  Hardware RAID
> (e.g. RAID-5) solutions have NVRAM to hold the log.  Software
> RAID either has to be extremely careful in the sequencing of the
> data, play serial number tricks (which is why you sometimes see
> disks with weird physical sector sizes), or write a separate log
> and delay the actual disk updates until the log write has been
> confirmed.

Indeed.  Vinum cheats a little here, but even then it seems to be too
finicky for many people.  Theoretically, after a crash you need to
synchronize the volumes.  I'm thinking of a volume manager logging
facility which will keep track of the last n operations.  This would
enable recovery code to confirm that they had been performed.
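
A rough sketch of what such a log of the last n operations might look
like follows.  This is only an illustration of the idea, not a design
for Vinum: the class, the callbacks and the (plex, offset, length)
tuple are all assumptions, and a real log would have to live on stable
storage rather than in memory.

```python
from collections import deque


class OperationLog:
    """Remember only the most recent n write operations."""

    def __init__(self, n):
        # A bounded deque drops the oldest entry once n is exceeded.
        self._entries = deque(maxlen=n)

    def record(self, plex, offset, length):
        self._entries.append((plex, offset, length))

    def regions_to_check(self):
        return list(self._entries)


def recover(log, is_consistent, resync):
    """After a crash, check only the logged regions and resynchronize
    the ones that did not complete.  Returns the repaired regions."""
    repaired = []
    for region in log.regions_to_check():
        if not is_consistent(*region):
            resync(*region)
            repaired.append(region)
    return repaired
```

The point of the bound is that recovery only has to confirm n regions
instead of synchronizing entire volumes.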

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Vinum safe to use for raid 0?

2001-01-02 Thread Greg Lehey

On Wednesday,  3 January 2001 at 13:30:50 +1100, Zero Sum wrote:
> On Wednesday 03 January 2001 10:21, Alfred Perlstein wrote:
>> * Thomas Seck <[EMAIL PROTECTED]> [010102 14:00] wrote:
>>> Hi all,
>>>
>>> sorry if this is OT for -stable, but I followed the discussion about
>>> vinum in here and got a bit worried.
>>>
>>> I am currently deploying a proxy server for our company. It shall use
>>> squid on 4.2-STABLE. I would like to put the cache data on a vinum RAID
>>> 0, made of three U160 disks. As I understood the discussion so far,
>>> there are some unresolved problems with the raid 5 code. Could someone
>>> tell me whether I can safely use vinum for building a raid 0 system
>>> (despite the fact that the HW may be a point of failure of course)?
>>>
>>> Thanks in advance and best regards from Germany
>>
>> We've been using RAID-0 and RAID-1 with vinum here for a long time,
>> the only problem we had was during a 3.x->4.x upgrade, we were able
>> to recover from it after freaking out for a bit though.
>
> I know it is a bit off topic and if it has been discussed to death before,
> I apologise.  But for the life of me I can't see why anyone would use RAID 5
> as other than an academic exercise.
>
> If this seems like a troll, I'm sorry, but I have had this argument so many
> times in RL.  In the past I have always managed to get better performance by
> throwing RAID 5 out.

There are many reasons for using RAID.  If you're looking for good
read/write performance, you won't use RAID-5.  But if you have a web
server, for example, where 99% of all accesses are reads, then RAID-5
is quite a good choice.  I do tend to agree that a lot of people use
RAID-5 where RAID-1 would be a better choice.
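
The read/write trade-off can be put into rough numbers.  The sketch
below assumes the textbook costs of one disk I/O per read and four per
small RAID-5 write (read old data, read old parity, write both), and
two per RAID-1 write; it ignores caching and full-stripe writes, so
treat the figures as illustrative only.

```python
def average_disk_ios(read_fraction, ios_per_read=1, ios_per_write=4):
    """Average number of disk I/Os per request for a given read/write mix.

    The defaults model a small RAID-5 write (2 reads + 2 writes).
    Pass ios_per_write=2 to model a two-way RAID-1 mirror instead.
    """
    write_fraction = 1.0 - read_fraction
    return read_fraction * ios_per_read + write_fraction * ios_per_write
```

With 99% reads the RAID-5 write penalty almost disappears in the
average; with an even mix it dominates, which is where RAID-1 looks
better.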

>> So yes, it is stable.  I still wouldn't trust the RAID-5, but if
>> you want to get RAID-5 working you could take a shot on getting
>> some reproducible corruption/panics and let Greg know.
>>
> The lack of data may be because of its lack of use as a general
> practice.

No, I don't think so.  I'm surprised to hear how many people use it.
I'm reasonably sure that the problems people have reported are due to
a bug in Vinum, but I suspect it needs something else in combination
in order to make it appear.  For a while there was a theory that you
need an fxp0 Ethernet card in the system, for example.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: vinum malfunction!

2001-01-02 Thread Greg Lehey

[Format recovered--see http://www.lemis.com/email/email-format.html]

On Tuesday,  2 January 2001 at 10:46:25 +0200, Roman Shterenzon wrote:
> Quoting Greg Lehey <[EMAIL PROTECTED]>:
>
>>> I've submitted kern/22021 which was left untouched, it has all the
>>> needed details.
>>
>> 1.  You insisted on including the config file and dmesg, which
>> just bloat the PR and make it more difficult to read.
>
> yes I did. sed(1) is your friend.

OK, take sed(1) and a few other tools, and write a usable problem
report.  It's just plain impolite to expect people who are working for
free to endure this kind of crap.  In fact, even if you were paying
them, it's still impolite.

>> 2.  The backtrace still does not include Vinum symbols, making it
>> pretty useless.
>
> At the very least, it's not true.
> Please check again http://www.FreeBSD.org/cgi/query-pr.cgi?pr=22103

There's one there which is missing symbols.  In a total of 30 printed
pages of mainly irrelevant text, I might have missed something.

>> 3.  It includes (without comment) a mail exchange with Andy Newman,
>> which has been twice mutilated, once before I received the
>> message, once after.
>
> This is completely irrelevant.

Precisely.  Please don't include irrelevant information.

>> Dealing with this kind of PR is an absolute pain.  It's bloated,
>> illegible, full of irrelevant information and lacking the information
>> I need.  It is, however, one of the few cases I've seen of a
>
> ..or any other reason that prevents you from working on it.

Sorry, what are you trying to say?

> vinum RAID-5 is dangerous, shouldn't be used in sensitive
> environments, and it cost me much health.

This kind of blanket statement detracts from anything useful you might
have to say.

Greg
--
When replying to this message, please take care not to mutilate the
original text.  
For more information, see http://www.lemis.com/email.html
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Stable 'make release' fails

2000-12-17 Thread Greg Lehey

I'm trying to do a 'make release' with today's -stable.  It dies in perl:

===> gnu/usr.bin/perl/miniperl
cc -O -pipe -I/usr/src/gnu/usr.bin/perl/miniperl/../../../../contrib/perl5 
-I/usr/obj/usr/src/gnu/usr.bin/perl/miniperl   -I/usr/obj/usr/src/i386/usr/include -c 
/usr/src/gnu/usr.bin/perl/miniperl/../../../../contrib/perl5/miniperlmain.c
cc -O -pipe -I/usr/src/gnu/usr.bin/perl/miniperl/../../../../contrib/perl5 
-I/usr/obj/usr/src/gnu/usr.bin/perl/miniperl   -I/usr/obj/usr/src/i386/usr/include  
-static -o miniperl miniperlmain.o  
-L/usr/obj/usr/src/gnu/usr.bin/perl/miniperl/../libperl -lperl -lm -lcrypt
===> gnu/usr.bin/perl/perl
miniperl /usr/src/gnu/usr.bin/perl/perl/../../../../contrib/perl5/configpm  Config.pm 
Porting/Glossary myconfig config.sh
sh cflags.sh
cc -O -pipe -I/usr/src/gnu/usr.bin/perl/perl/../../../../contrib/perl5 
-I/usr/obj/usr/src/gnu/usr.bin/perl/perl   -I/usr/obj/usr/src/i386/usr/include -c 
perlmain.c
miniperl: not found
*** Error code 127

It's not quite clear where it's expecting to find miniperl.  The only
one in /home/release is the directory
/home/release/usr/src/gnu/usr.bin/perl/miniperl, the only one in
/usr/obj is the one just built above, but that's not in any reasonable
path.  There's one in /usr/bin as well, but I'm sure that shouldn't be
used.  Is it possible that the dependencies are messed up, and that
this is the result of using -j4 for the make release?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Dedicated disks (was: Dangerously Dedicated)

2000-12-14 Thread Greg Lehey

On Friday, 15 December 2000 at  2:20:40 -0500, Mike Nowlin wrote:
>
>> Does that mean that such BIOS's are proprietary in the sense that they
>> don't recognize the dedicated format?
>
> There are times when the politically-correct of the world use the term
> "proprietary" when they actually mean "dumb" or "really badly
> designed".  But yes, that's what it means...  :)

To be fair, the dedicated fake partition table format is a hack.  It's
too difficult to figure out what the real geometry is, so it invents
one which should "do the job".  Some BIOSes check the table and find
it wanting.  It's a grey area. 

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Dedicated disks (was: Dangerously Dedicated)

2000-11-19 Thread Greg Lehey

On Sunday, 19 November 2000 at 17:50:48 -0700, Warner Losh wrote:
> In message <[EMAIL PROTECTED]> "Daniel O'Connor" writes:
>> At least remove the option from sysinstall so new users don't get
>> stuck with it.
>
> I strongly support this.  It has burned me on several machines.
>
> I don't think that anyone will remove it from the kernel...

OK, the more this thread continues, the more it's looking as if we're
talking about different things.  I don't have (much) of an objection
to removing it from sysinstall.  If that's all we're talking about, I
don't have any further objections.  But I still want to have the
facility in the system.

I wonder how long the current Microsoft partition table has to live,
anyway?  Sooner or later people are going to have to move to LBA
addressing, or disks will get so big that the partition table can't
address them.  Then, hopefully, we'll be able to use a more sane
layout.
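
The limits in question are simple arithmetic.  The sketch below assumes
512-byte sectors and the usual field widths of a PC partition table
entry (10-bit cylinder, 8-bit head, 6-bit sector numbered from 1, and
32-bit LBA start and length fields); the helper function name is made
up for illustration.

```python
SECTOR_SIZE = 512  # bytes; assumed throughout

# CHS addressing: 1024 cylinders x 256 heads x 63 sectors per track.
CHS_MAX_SECTORS = 1024 * 256 * 63
CHS_LIMIT_BYTES = CHS_MAX_SECTORS * SECTOR_SIZE

# LBA addressing: 32-bit sector numbers.
LBA_MAX_SECTORS = 2 ** 32
LBA_LIMIT_BYTES = LBA_MAX_SECTORS * SECTOR_SIZE


def addressable(disk_bytes, limit_bytes):
    """True if every byte of the disk can be described within the limit."""
    return disk_bytes <= limit_bytes
```

So CHS runs out at roughly 8.4 GB, and even the LBA fields in the same
table stop at 2 TiB.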

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Dedicated disks (was: Dangerously Dedicated)

2000-11-19 Thread Greg Lehey

On Sunday, 19 November 2000 at 17:48:14 -0700, Warner Losh wrote:
> In message <[EMAIL PROTECTED]> Greg Lehey writes:
>> They waste space.  In most cases, they're not needed.  Isn't that
>> enough?
>
> No.  Writing in 'C' isn't necessary and wastes space.  That, in and of
> itself, isn't a reason to not use it.

No.  Unlike the Microsoft partition tables in dedicated machines, it
has advantages that make up for it.  But take away my ability to write
in assembler and I'll complain too.

> But like mike said, it was the ability to create these for the boot
> disk that is going away from sysinstall.  

Not for other disks?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Dedicated disks (was: Dangerously Dedicated)

2000-11-19 Thread Greg Lehey

On Sunday, 19 November 2000 at 18:50:40 -0600, Jim King wrote:
> Greg Lehey wrote:
>
>>> Why is DD ever _needed_?
>>
>> Because Microsoft partition tables waste space.
>
> That's a really weak argument, given the price and size of drives
> nowadays.

It's a matter of principle.  Why waste?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Server with vinum, adaptec, softupdates crashes

2000-10-19 Thread Greg Lehey

On Wednesday, 18 October 2000 at 12:57:01 +0200, Roman Shterenzon wrote:
> Hi,
> I'm really sorry, I don't know what mangles it. If it's still mangled I
> can post it somewhere on the web.

No, the text is now correct.

> If it was fixed after 4.1-RELEASE, please let me know, I'll have to drive
> 50km to upgrade it, but it's worth it.

I don't know what the problem was.

> Content-Description: kernel config
> Content-Description: boot messages

As I said in the reply to the PR (now closed), this is not the
information I need.  Please read
http://www.vinumvm.org/how-to-debug.html and enter a new PR with the
necessary information.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Makeworld is dying...

2000-09-16 Thread Greg Lehey

On Sunday, 17 September 2000 at  9:24:47 +0300, Roman Shterenzon wrote:
> On Sat, 16 Sep 2000, Kent Stewart wrote:
>> "Paul A. Howes" wrote:
>>>
>>> All-
>>>
>>> When I attempt a buildworld on a brand new FreeBSD system (IBM/Cyrix-166+,
>>> 64MB memory, 20GB Maxtor drive), it dies while building the ncurses library.
>>> This happens whether it's the version of the source code on the 4.1 CD-Rom
>>> disc, or the latest and greatest 4-STABLE code from a cvsup.  Any help would
>>> be appreciated.  The tail of the trace log is included below.
>>
>> Signal 11's during a buildworld are usually caused by memory and a few
>> things such as cpu cooling. I finished a buildworld at 1:30 PDT (about
>> 2 hours before your message arrived) and I didn't have any problems.
>> Do you have another 64MB of memory that you could switch as a test of
>> your memory.
>
> Perhaps we should add an entry to the FAQ (if it's not already there)
> describing the problem.  A good indication of bad RAM or an undercooled
> CPU would be to try buildworld a couple of times (without -DNOCLEAN) and
> watch where it fails.
> If it fails in different places, then it's almost surely a hardware problem;
> if it fails in the same place, it can still be a hardware problem, for
> example some C++ file which demands more memory than others to compile.
> Can someone add it to the FAQ, or shall I file a PR? :)

No.

What you *should* do before sending out a reply like this is to check
whether it's really in the FAQ or not.  It is
(http://www.freebsd.org/FAQ/troubleshoot.html#AEN1570):

Q: My programs occasionally die with Signal 11 errors.

A: This can be caused by bad hardware (memory, motherboard, etc.). Try
   running a memory-testing program on your PC. Note that, even though
   every memory testing program you try will report your memory as
   being fine, it's possible for slightly marginal memory to pass all
   memory tests, yet fail under operating conditions (such as during
   bus mastering DMA from a SCSI controller like the Adaptec 1542,
   when you're beating on memory by compiling a kernel, or just when
   the system's running particularly hot).

   The SIG11 FAQ (listed below) points up slow memory as being the
   most common problem. Increase the number of wait states in your
   BIOS setup, or get faster memory.

   For me the guilty party has been bad cache RAM or a bad on-board
   cache controller. Try disabling the on-board (secondary) cache in
   the BIOS setup and see if that solves the problem.

   There's an extensive FAQ on this at the (link) SIG11 problem FAQ

Now you *could* consider better wording of the text.  This entry
belies its age by the hardware it refers to.
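
The rule of thumb quoted earlier in this message (repeat the build and
compare where it dies) can be written down as a small sketch.  The
function name and the wording of the verdicts are made up for
illustration; the logic is only the heuristic from the thread, not a
diagnostic tool.

```python
def classify_build_failures(failure_points):
    """Interpret where repeated builds died.

    failure_points: one entry per failed run, naming the file or
    directory in which that run stopped.
    """
    if len(failure_points) < 2:
        return "inconclusive: repeat the build"
    if len(set(failure_points)) > 1:
        return "different places: almost certainly hardware"
    return "same place: could be software, could still be hardware"
```

Note that the last case stays ambiguous: a file that needs unusually
much memory to compile can trip over the same marginal hardware every
time.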

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: not in bitmap

2000-08-19 Thread Greg Lehey

On Saturday, 19 August 2000 at 20:41:54 -0700, Gary Kline wrote:
>   Curiouser and curiouser.   dmesg reports:
>
>   sio1: configured irq 3 not in bitmap of probed irqs 0
>
>   Any thoughts on this?  like maybe the mouse and modem
>   just maybe are switched?

Well, it would be nice to know the usual background.  Hardware, OS
release (are you running FreeBSD?  The only thing that says so is that
fairly specific message saying that the probe isn't getting any
interrupts).
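
For anyone puzzled by the wording of that message, a minimal sketch of
the check behind it follows.  It assumes the bitmap has one bit per
IRQ line, with bit n set when IRQ n was seen during the probe; the
function name is made up for illustration and this is not the driver's
code.

```python
def irq_in_bitmap(irq, bitmap):
    """True if the configured IRQ was among those seen during the probe."""
    return bool(bitmap & (1 << irq))
```

With a bitmap of 0, as in the message above, no IRQ at all was seen,
so the test fails for irq 3 and for every other line as well.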

> (XF86Setup fails 100% too.)

Well, it would do.

--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: HEADS UP! Always use the 'make buildkernel' target to make yer kernels

2000-07-10 Thread Greg Lehey

On Monday, 10 July 2000 at 10:04:53 -0400, Vivek Khera wrote:
>> "KK" == Kris Kennaway <[EMAIL PROTECTED]> writes:
>
> KK> Subject basically says it all. "make buildkernel KERNEL=" and
> KK> "make installkernel KERNEL=" (or set KERNEL in /etc/make.conf or
> KK> the environment, where KERNEL is the name of the kernel to build (GENERIC,
> KK> etc)) are what you should always be using to build your kernels, unless
> KK> you know what you're doing.
>
> So you're saying that even after upgrading from 3.4 to 4.0 you should
> use make buildkernel?  That seems counter to what has been discussed
> before, and is way non-BSD-ish.

Agreed.  I tried it out and found a number of things I didn't like
about it.  Basically, it's a completely different build process:

1.  Before building, it removes the existing kernel build tree.
    There's no good reason for this.

2.  It builds in a different tree (/usr/obj instead of
    /usr/src/sys/compile).  These two points mean that if you later
    want to go back and tune your kernel (change a driver parameter,
    say), you can't just do a config; cd ../../compile/FOO; make, you
    have to go the whole nine yards.

3.  It gives the kernel a different name.

4.  It's just plain clumsy.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: AMD K6-2 / 550

2000-07-01 Thread Greg Lehey

On Sunday,  2 July 2000 at 11:52:54 +0930, Greg Work wrote:
>>> Just a quick question - *exactly* what was the symptoms of your problem?
>>>
>>> ive just turned a K6-2 300 into a BSD box, and make world fails with
>>> SIG's 10, 11 and 12 randomly.  I know its not the memory - i have
>>> been thrashing that memory for 12 mths now - never skipped a beat.
>>
>> You mean it never discovered a problem?  What software were you
>> running?
>
> Was running win98 - and used to dual boot 3.0 --> 3.2-STABLE with 98

> It got retired to a straight win98 box (ducks objects thrown at him) when i
> got a second machine

OK, this seems to contradict your previous statements ("just turned
into a BSD box").

>>> The CPU / Motherboard / Memory used to run BSD in the 2.2.8 - 3.2
>>> days no problems.  Now - no luck :( - i get random panics, SIG 10's
>>> to 12's and core dumps - unfortunately - i have no idea how to debug
>>> them :(
>>
>> If they're hardware related, there's probably not much to see except
>> that things will be different each time.
>>
>> I suspect that a large number of crashes on Microsoft machines are in
>> fact due to flaky hardware.  Despite my low opinion of Microsoft, I
>> have not had an unexpected crash of a Microsoft system in years (it
>> helps, of course, if you don't use them much :-).
>

> the funny thing was that it never skipped a beat under M$ after a
> bios flash (it didnt like my vid card - but thats another issue :) )

After a flash, or until after a flash?  Check the BIOS settings; maybe
they're overstressing the memory.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: AMD K6-2 / 550

2000-07-01 Thread Greg Lehey

On Saturday,  1 July 2000 at 21:21:56 -0500, Larry Rosenman wrote:
> (mutilated description of hardware-related problems with AMD K6-2 deleted)
> hmm
> I can't (the AMD K6-2 is back at the computer store), as
> I said, that computer now has an Intel P-III 600E in it :-).
>
> The techs at IMS did say that AMD did admit to "some problem" with
> the K6-2's, so I'm not sure that an earlier FreeBSD will help...

There have been "some problems" with all processors.  Without any
more accurate description, this statement doesn't help.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: AMD K6-2 / 550

2000-07-01 Thread Greg Lehey

[Format recovered--see http://www.lemis.com/email/email-format.html]

On Sunday,  2 July 2000 at 11:47:51 +0930, Greg Work wrote:
> (mutilated description of hardware-related problems with AMD K6-2 deleted)
> hmm
>
> how easy is it to install an earlier 3.x ?  (3.0-RELEASE -->
> 3.2-RELEASE) it would be interesting to see if that fixes the
> problem

I don't think that would be very interesting.  All versions of FreeBSD
run with the K6-2.

> unfortunately - i dont have any spare HD's to stick it on - or i
> would try it
>
> i tell you what - this was *REALLY* getting me bugged - i thought i had
> fried the CPU / MB somehow...

It's not that serious.  I'd guess memory problems, or probably BIOS
settings.

Greg
--
When replying to this message, please take care not to mutilate the
original text.  
For more information, see http://www.lemis.com/email.html
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





HEADS UP: Another Vinum RAID-5 data integrity problem (*sigh*)

2000-06-05 Thread Greg Lehey

I've only just fixed one problem in RAID-5 revive, and another one has
surfaced.  For the moment: if you have a RAID-5 plex with a dead
subdisk, leave it that way.  It's safer than restarting it.  I think I
should have it fixed relatively quickly.  Watch this space.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Debugging Kernel/System Crashes, can anyone help??

2000-05-03 Thread Greg Lehey

On Wednesday,  3 May 2000 at  3:48:42 -0400, Howard Leadmon wrote:
>
>Hello,
>
>  I know I posted a few messages here in the past, but maybe someone who is
> good at tracking kernel problems can step up and lend a hand.
>
>  I have a machine running FBSD 4.0-STABLE, and have been experiencing almost
> daily kernel panics or reboots on the machine.  I have replaced ALL of the
> hardware, and reloaded the OS, but still having troubles.  I am at a bit of
> a loss as to what is going on.  From one panic, I thought well maybe this
> is an SMP issue, but removed one of the CPU's and still the box crashes. As
> I have basically replaced everything, I am at a loss as to where to go from
> here, so looking for some type of pointers or help with this..

Indeed.  We need to address this issue in some detail.  We need both
documentation and tools.

>  The other day I was there, and got the following from one of the
> crashes, as many times I am gone and luckily in some ways the box
> will just panicboot and go on its way.  Here is what I was able to
> copy down:
>
>
> Fatal trap 12: page fault while in kernel mode
> mp_lock=0102; cpuid=1; lapic.id=0100
> fault virtual address   = 0x30
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xC01CAF71
> stack pointer           = 0x10:0xFF80DE48
> frame pointer           = 0x10:0xFF80DE4C
> code segment            = base 0x0, limit 0xF, type 0x1B
>                         = DPL 0, pres 1, def 32, gran 1
> processor eflags        = interrupt enabled, resume, IOPL=0
> current process         = idle
> interrupt mask          = bio <- SMP: XXX
> trap number             = 12
> panic: page fault
>
> The formatting of it may not be perfect, but the information should be
> accurate, as I tried to be precise on what I wrote down.  Also here are
> a few previous messages I had posted a while back when I thought this
> might be network related, but after trying several different NIC's I still
> have the same issues.  I will include the info below, as maybe it will
> have some value in trying to debunk this problem.

The sad thing is that most of this information is almost useless.  I'm
thinking of printing out a stack trace instead (comments, anybody?).
Without tedious comparison with
your kernel namelist, all we can say here is that you died somewhere
in the kernel, that you have an SMP machine, and that the block I/O
subsystem is probably involved.  If this is happening daily, you
should build a kernel with debugging symbols enabled and take a dump
of the next crash.  We can then use gdb to analyse the dump.
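
The "tedious comparison" with the namelist amounts to finding the last
symbol at or below the instruction pointer.  The sketch below shows the
lookup, assuming the namelist is available as text in the three-column
"address type name" form that nm -n prints.  The symbol names used to
try it out must come from your own kernel; nothing here says which
function 0xC01CAF71 actually falls in.

```python
import bisect


def parse_namelist(nm_output):
    """Turn lines like 'c01caf40 T some_function' into a sorted list
    of (address, name) pairs.  Lines in any other shape are skipped."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return (name, offset) of the nearest symbol at or below address,
    or None if the address lies below every symbol."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return name, address - base
```

This only names the function in which the trap occurred; a dump and
gdb are still needed to see how the kernel got there.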

>   Hello, I am running a 4.0-STABLE machine which is being used to host an
> Undernet IRC server, and the machine keeps dying at times, or should I say
> the networking side of it is at least dying.  At first I thought it might
> have been related to the dc (DEC Chip) based drivers, so I replaced it with
> a EEpro board using the fxp driver, but the same results.
>
> 

If all your dumps have the interrupt mask set to bio, I don't think
it's a networking problem.  With one possible exception...

> Mar 27 12:39:00 u2 /kernel: fxp0: device timeout

Søren and I are trying to find out what is causing some weird Vinum
problems.  He stated that the problem happened more frequently when
an fxp board was in the system.  I don't believe him, and I've found
at least one bug in Vinum that has nothing to do with networking (but
does have to do with the bio mask); possibly, however, there's some
other problem with the fxp driver.

It's possible that the other information will be of use, but I think
we first need to look at a dump.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: 3.4-4.0 and vinum

2000-03-20 Thread Greg Lehey

On Sunday, 19 March 2000 at  2:18:05 -0600, Jim C. Nasby wrote:
> For those of you who are running vinum, make damn sure you keep an old
> kernel around while upgrading... it seems that you can't load vinum
> after booting with the new kernel... I'm hoping that a make install in
> the vinum directory will remedy this situation... I'll post an update
> once I know what's going on, but in the meantime, this is a heads-up for
> everyone running vinum.

The problem appears to be that the upgrade doesn't install the klds
for the new kernel.  If you install them, you should be OK.

Greg
--
When replying to this message, please copy the original recipients.
For more information, see http://www.lemis.com/questions.html
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Does vinum actually currently build under 3.2-STABLE?

1999-07-07 Thread Greg Lehey

On Wednesday,  7 July 1999 at 18:20:32 +0200, Brad Knowles wrote:
> Folks,
>
>   I know a little while back that Poul-Henning Kamp had suggested
> some changes to vinum that would get it to compile (in 3.2-STABLE?),

That was a while back.  I fixed those problems about 3 weeks ago.

> and I've seen some recent traffic on -current to indicate that there
> were some other minor problems keeping vinum from building with
> FreeBSD 4.0-CURRENT.

Vinum builds under 3.2-STABLE.  I don't know why you're trying to use
the wrong version.  Yes, I once said that, at that moment, the
-CURRENT version would build under -STABLE.  That was quite some time
ago, and it no longer applies.

>   The problems with FreeBSD 4.0-CURRENT appear to have been
> cleared up by Greg committing one additional file to CVS.

I did?  Which one?

> However, I am not exactly clear on whether vinum is currently
> supposed to build out-of-the-box with 3.2-STABLE, either with the
> provided code (apparently dated 5 May) or with the tarball at
>  (dated 11
> May).

It's supposed to build out of the box.  Vinum is included in -STABLE.
The last modification was 11 May (sys/dev/vinum/vinumparser.c).

Greg
--
See complete headers for address, home page and phone numbers
finger [EMAIL PROTECTED] for PGP public key

