Re: fsck Segmentation fault on 4.1

2007-08-03 Thread Marcos Laufer
So the patch works, and this problem seems serious and easy to encounter. I
vote for moving it to -stable.

- Original Message - 
From: "Tobias Ulmer" <[EMAIL PROTECTED]>
To: "Otto Moerbeek" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, August 03, 2007 3:37 PM
Subject: Re: fsck Segmentation fault on 4.1


On Thu, Jul 19, 2007 at 08:09:58PM +0200, Otto Moerbeek wrote:
> [...]
>
> I misdiagnosed the problem. In the meantime I got another report with
> a dd of the partition which enabled me to diagnose the problem and
> make a fix for 4.1. Please test and report back. I'll be on vacation
> from Saturday, so it would be nice if you can answer before that.
>
> Anybody else seeing INCONSISTENT CGSIZE messages should try this as well.
>
> NOTE: this diff only applies to 4.1. Current does not have the
> problem, due to a corrected CGSIZE macro.
>
> -Otto
>
> Index: setup.c
> ===
> RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 setup.c
> --- setup.c 16 Feb 2007 08:34:29 - 1.29
> +++ setup.c 19 Jul 2007 18:02:36 -
> @@ -336,6 +336,7 @@ setup(char *dev)
>  sbdirty();
>  dirty(&asblk);
>  }
> +#if 0
>  if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
>  pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
>  sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
> @@ -346,6 +347,7 @@ setup(char *dev)
>  dirty(&asblk);
>  }
>  }
> +#endif
>  if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
>  pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
>  sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);
>
>

I had a power failure here (power company was doing maintenance and
repeatedly switched power off and on...)

Both my 4.1 boxen ran into this. The patch fixed the BLK 64 issues, but
I have a partition made with a larger block size (twice the default) that
it couldn't fix (BLK 128). bsd.rd from snapshots did the trick. Just FYI.

Tobias



Re: fsck Segmentation fault on 4.1

2007-08-03 Thread Tobias Ulmer
On Thu, Jul 19, 2007 at 08:09:58PM +0200, Otto Moerbeek wrote:
> [...]
>
> I misdiagnosed the problem. In the meantime I got another report with
> a dd of the partition which enabled me to diagnose the problem and
> make a fix for 4.1. Please test and report back. I'll be on vacation
> from Saturday, so it would be nice if you can answer before that. 
> 
> Anybody else seeing INCONSISTENT CGSIZE messages should try this as well.
> 
> NOTE: this diff only applies to 4.1. Current does not have the
> problem, due to a corrected CGSIZE macro.
> 
>   -Otto
> 
> Index: setup.c
> ===
> RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 setup.c
> --- setup.c   16 Feb 2007 08:34:29 -  1.29
> +++ setup.c   19 Jul 2007 18:02:36 -
> @@ -336,6 +336,7 @@ setup(char *dev)
>   sbdirty();
>   dirty(&asblk);
>   }
> +#if 0
>   if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
>   pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
>   sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
> @@ -346,6 +347,7 @@ setup(char *dev)
>   dirty(&asblk);
>   }
>   }
> +#endif
>   if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
>   pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
>   sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);
> 
> 

I had a power failure here (power company was doing maintenance and
repeatedly switched power off and on...)

Both my 4.1 boxen ran into this. The patch fixed the BLK 64 issues, but
I have a partition made with a larger block size (twice the default) that
it couldn't fix (BLK 128). bsd.rd from snapshots did the trick. Just FYI.

Tobias



Re: fsck Segmentation fault on 4.1

2007-07-23 Thread Marcos Laufer
Otto, I couldn't apply the patch; I get some errors:

patch -p0 < patch1.txt
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|Index: setup.c
|===
|RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
|retrieving revision 1.29
|diff -u -p -r1.29 setup.c
|--- setup.c 16 Feb 2007 08:34:29 - 1.29
|+++ setup.c 19 Jul 2007 18:02:36 -
--
Patching file setup.c using Plan A...
Hunk #1 failed at 336.
Hunk #2 failed at 347.
2 out of 2 hunks failed--saving rejects to setup.c.rej
Hmm...  Ignoring the trailing garbage.
done

---

But I added the two lines (#if 0 and #endif) manually to setup.c, so the code
now looks like this:

        sbdirty();
        dirty(&asblk);
    }
#if 0
    if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
        pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
        sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
        if (preen)
            printf(" (FIXED)\n");
        if (preen || reply("FIX") == 1) {
            sbdirty();
            dirty(&asblk);
        }
    }
#endif
    if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
        pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
        sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);

--

Will this work?


Regards,
Marcos

- Original Message - 
From: "Otto Moerbeek" <[EMAIL PROTECTED]>
To: "Marcos Laufer" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, July 20, 2007 4:02 PM
Subject: Re: fsck Segmentation fault on 4.1


On Fri, 20 Jul 2007, Marcos Laufer wrote:

> Will this be moved to -stable, or is it an uncommon thing?

It's not very common, but the impact is pretty high. So once some more
test reports come in, we'll consider it.

-Otto

>
> - Original Message - 
> From: "Otto Moerbeek" <[EMAIL PROTECTED]>
> To: "Marcos Laufer" <[EMAIL PROTECTED]>
> Cc: 
> Sent: Thursday, July 19, 2007 3:09 PM
> Subject: Re: fsck Segmentation fault on 4.1
>
>
> On Fri, 13 Jul 2007, Otto Moerbeek wrote:
>
> > On Fri, 13 Jul 2007, Marcos Laufer wrote:
> >
> > > Otto ,
> > >
> > > This is the error i get:



Re: fsck Segmentation fault on 4.1

2007-07-21 Thread Marcos Laufer
Well, it seems that more people are having this same error. I found this guy
in Brazil who hasn't reported it but mentioned it in a forum (read at the end):
http://www.bsdforums.org/forums/showthread.php?p=265260
I think it would be a good idea to put the patch in the -stable branch.

Regards,
Marcos

- Original Message - 
From: "Otto Moerbeek" <[EMAIL PROTECTED]>
To: "Marcos Laufer" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, July 20, 2007 4:02 PM
Subject: Re: fsck Segmentation fault on 4.1


On Fri, 20 Jul 2007, Marcos Laufer wrote:

> Will this be moved to -stable, or is it an uncommon thing?

It's not very common, but the impact is pretty high. So once some more
test reports come in, we'll consider it.

-Otto

>
> - Original Message - 
> From: "Otto Moerbeek" <[EMAIL PROTECTED]>
> To: "Marcos Laufer" <[EMAIL PROTECTED]>
> Cc: 
> Sent: Thursday, July 19, 2007 3:09 PM
> Subject: Re: fsck Segmentation fault on 4.1
>
>
> On Fri, 13 Jul 2007, Otto Moerbeek wrote:
>
> > On Fri, 13 Jul 2007, Marcos Laufer wrote:
> >
> > > Otto ,
> > >
> > > This is the error i get:
> > > > It starts booting , and it starts fsck , it fails with /dev/rwd0e and rwd0h,
> > >
> > > (i could see once that when it finished it says:)
> > > fsck_ffs in free():  error: free_page: pointer to wrong page
> > > fsck: /dev/rwd0h: Abort trap
> > >
> > > I reboot it again many times and that did not show again
> > >
> > >
> > > i try to fsck manually like this as you say and i get:
> > >
> > > # ulimit -d unlimited
> > > # fsck -y /dev/rwd0e
> > >
> > > INCONSISTENT CGSIZE=16384
> > >
> > > FIX? yes
> > >
> > > * * Last mounted on /usr
> > > * * Phase 1- Check Blocks and Sizes
> > > * * Phase 2 - Check pathnames
> > > * * Phase 3 - Check Conectivity
> > > * * Phase 4 - Check Reference Counts
> > > * * Phase 5 - Check Cyl Groups
> > >
> > > CANNOT READ: BLK 64
> > >
> > > CONTINUE? yes
> > >
> > > fsck: /dev/rwd0e: Segmentation Fault
> >
> > This is not an out of memory situation.
> >
> > It looks like fsck_ffs has problems getting data from your disk,
> > probably because of hardware failure or bad cabling.  Sometimes it
> > detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
> > is possible it gets corrupted data in other cases.
> >
> > Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
> > > memory and corrupt its internal data. During the last year I've fixed
> > > some stuff in this area, but there still remain cases that can go
> > wrong.
>
> I misdiagnosed the problem. In the meantime I got another report with
> a dd of the partition which enabled me to diagnose the problem and
> make a fix for 4.1. Please test and report back. I'll be on vacation
> from Saturday, so it would be nice if you can answer before that.
>
> Anybody else seeing INCONSISTENT CGSIZE messages should try this as well.
>
> NOTE: this diff only applies to 4.1. Current does not have the
> problem, due to a corrected CGSIZE macro.
>
> -Otto
>
> Index: setup.c
> ===
> RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 setup.c
> --- setup.c 16 Feb 2007 08:34:29 - 1.29
> +++ setup.c 19 Jul 2007 18:02:36 -
> @@ -336,6 +336,7 @@ setup(char *dev)
>   sbdirty();
>   dirty(&asblk);
>   }
> +#if 0
>   if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
>   pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
>   sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
> @@ -346,6 +347,7 @@ setup(char *dev)
>   dirty(&asblk);
>   }
>   }
> +#endif
>   if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
>   pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
>   sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);



Re: fsck Segmentation fault on 4.1

2007-07-20 Thread Otto Moerbeek
On Fri, 20 Jul 2007, Marcos Laufer wrote:

> Will this be moved to -stable, or is it an uncommon thing?

It's not very common, but the impact is pretty high. So once some more
test reports come in, we'll consider it.

-Otto

> 
> - Original Message - 
> From: "Otto Moerbeek" <[EMAIL PROTECTED]>
> To: "Marcos Laufer" <[EMAIL PROTECTED]>
> Cc: 
> Sent: Thursday, July 19, 2007 3:09 PM
> Subject: Re: fsck Segmentation fault on 4.1
> 
> 
> On Fri, 13 Jul 2007, Otto Moerbeek wrote:
> 
> > On Fri, 13 Jul 2007, Marcos Laufer wrote:
> > 
> > > Otto ,
> > > 
> > > This is the error i get:
> > > It starts booting , and it starts fsck , it fails with /dev/rwd0e and 
> > > rwd0h,
> > > 
> > > (i could see once that when it finished it says:)
> > > fsck_ffs in free():  error: free_page: pointer to wrong page
> > > fsck: /dev/rwd0h: Abort trap
> > > 
> > > I reboot it again many times and that did not show again
> > > 
> > > 
> > > i try to fsck manually like this as you say and i get:
> > > 
> > > # ulimit -d unlimited
> > > # fsck -y /dev/rwd0e
> > > 
> > > INCONSISTENT CGSIZE=16384
> > > 
> > > FIX? yes
> > > 
> > > * * Last mounted on /usr
> > > * * Phase 1- Check Blocks and Sizes
> > > * * Phase 2 - Check pathnames
> > > * * Phase 3 - Check Conectivity
> > > * * Phase 4 - Check Reference Counts
> > > * * Phase 5 - Check Cyl Groups
> > > 
> > > CANNOT READ: BLK 64
> > > 
> > > CONTINUE? yes
> > > 
> > > fsck: /dev/rwd0e: Segmentation Fault
> > 
> > This is not an out of memory situation.
> > 
> > It looks like fsck_ffs has problems getting data from your disk,
> > probably because of hardware failure or bad cabling.  Sometimes it
> > detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
> > is possible it gets corrupted data in other cases. 
> > 
> > Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
> > > memory and corrupt its internal data. During the last year I've fixed
> > > some stuff in this area, but there still remain cases that can go
> > wrong.
> 
> I misdiagnosed the problem. In the meantime I got another report with
> a dd of the partition which enabled me to diagnose the problem and
> make a fix for 4.1. Please test and report back. I'll be on vacation
> from Saturday, so it would be nice if you can answer before that. 
> 
> Anybody else seeing INCONSISTENT CGSIZE messages should try this as well.
> 
> NOTE: this diff only applies to 4.1. Current does not have the
> problem, due to a corrected CGSIZE macro.
> 
> -Otto
> 
> Index: setup.c
> ===
> RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 setup.c
> --- setup.c 16 Feb 2007 08:34:29 - 1.29
> +++ setup.c 19 Jul 2007 18:02:36 -
> @@ -336,6 +336,7 @@ setup(char *dev)
>   sbdirty();
>   dirty(&asblk);
>   }
> +#if 0
>   if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
>   pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
>   sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
> @@ -346,6 +347,7 @@ setup(char *dev)
>   dirty(&asblk);
>   }
>   }
> +#endif
>   if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
>   pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
>   sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);



Re: fsck Segmentation fault on 4.1

2007-07-20 Thread Marcos Laufer
Will this be moved to -stable, or is it an uncommon thing?

- Original Message - 
From: "Otto Moerbeek" <[EMAIL PROTECTED]>
To: "Marcos Laufer" <[EMAIL PROTECTED]>
Cc: 
Sent: Thursday, July 19, 2007 3:09 PM
Subject: Re: fsck Segmentation fault on 4.1


On Fri, 13 Jul 2007, Otto Moerbeek wrote:

> On Fri, 13 Jul 2007, Marcos Laufer wrote:
> 
> > Otto ,
> > 
> > This is the error i get:
> > It starts booting , and it starts fsck , it fails with /dev/rwd0e and rwd0h,
> > 
> > (i could see once that when it finished it says:)
> > fsck_ffs in free():  error: free_page: pointer to wrong page
> > fsck: /dev/rwd0h: Abort trap
> > 
> > I reboot it again many times and that did not show again
> > 
> > 
> > i try to fsck manually like this as you say and i get:
> > 
> > # ulimit -d unlimited
> > # fsck -y /dev/rwd0e
> > 
> > INCONSISTENT CGSIZE=16384
> > 
> > FIX? yes
> > 
> > * * Last mounted on /usr
> > * * Phase 1- Check Blocks and Sizes
> > * * Phase 2 - Check pathnames
> > * * Phase 3 - Check Conectivity
> > * * Phase 4 - Check Reference Counts
> > * * Phase 5 - Check Cyl Groups
> > 
> > CANNOT READ: BLK 64
> > 
> > CONTINUE? yes
> > 
> > fsck: /dev/rwd0e: Segmentation Fault
> 
> This is not an out of memory situation.
> 
> It looks like fsck_ffs has problems getting data from your disk,
> probably because of hardware failure or bad cabling.  Sometimes it
> detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
> is possible it gets corrupted data in other cases. 
> 
> Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
> > memory and corrupt its internal data. During the last year I've fixed
> > some stuff in this area, but there still remain cases that can go
> wrong.

I misdiagnosed the problem. In the meantime I got another report with
a dd of the partition which enabled me to diagnose the problem and
make a fix for 4.1. Please test and report back. I'll be on vacation
from Saturday, so it would be nice if you can answer before that. 

Anybody else seeing INCONSISTENT CGSIZE messages should try this as well.

NOTE: this diff only applies to 4.1. Current does not have the
problem, due to a corrected CGSIZE macro.

-Otto

Index: setup.c
===
RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
retrieving revision 1.29
diff -u -p -r1.29 setup.c
--- setup.c 16 Feb 2007 08:34:29 - 1.29
+++ setup.c 19 Jul 2007 18:02:36 -
@@ -336,6 +336,7 @@ setup(char *dev)
  sbdirty();
  dirty(&asblk);
  }
+#if 0
  if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
  pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
  sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
@@ -346,6 +347,7 @@ setup(char *dev)
  dirty(&asblk);
  }
  }
+#endif
  if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
  pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
  sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);



Re: fsck Segmentation fault on 4.1

2007-07-19 Thread Otto Moerbeek
On Fri, 13 Jul 2007, Otto Moerbeek wrote:

> On Fri, 13 Jul 2007, Marcos Laufer wrote:
> 
> > Otto ,
> > 
> > This is the error i get:
> > It starts booting , and it starts fsck , it fails with /dev/rwd0e and rwd0h,
> > 
> > (i could see once that when it finished it says:)
> > fsck_ffs in free():  error: free_page: pointer to wrong page
> > fsck: /dev/rwd0h: Abort trap
> > 
> > I reboot it again many times and that did not show again
> > 
> > 
> > i try to fsck manually like this as you say and i get:
> > 
> > # ulimit -d unlimited
> > # fsck -y /dev/rwd0e
> > 
> > INCONSISTENT CGSIZE=16384
> > 
> > FIX? yes
> > 
> > * * Last mounted on /usr
> > * * Phase 1- Check Blocks and Sizes
> > * * Phase 2 - Check pathnames
> > * * Phase 3 - Check Conectivity
> > * * Phase 4 - Check Reference Counts
> > * * Phase 5 - Check Cyl Groups
> > 
> > CANNOT READ: BLK 64
> > 
> > CONTINUE? yes
> > 
> > fsck: /dev/rwd0e: Segmentation Fault
> 
> This is not an out of memory situation.
> 
> It looks like fsck_ffs has problems getting data from your disk,
> probably because of hardware failure or bad cabling.  Sometimes it
> detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
> is possible it gets corrupted data in other cases. 
> 
> Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
> > memory and corrupt its internal data. During the last year I've fixed
> > some stuff in this area, but there still remain cases that can go
> wrong.

I misdiagnosed the problem. In the meantime I got another report with
a dd of the partition which enabled me to diagnose the problem and
make a fix for 4.1. Please test and report back. I'll be on vacation
from Saturday, so it would be nice if you can answer before that. 

Anybody else seeing INCONSISTENT CGSIZE messages should try this as well.

NOTE: this diff only applies to 4.1. Current does not have the
problem, due to a corrected CGSIZE macro.

-Otto

Index: setup.c
===
RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
retrieving revision 1.29
diff -u -p -r1.29 setup.c
--- setup.c 16 Feb 2007 08:34:29 -  1.29
+++ setup.c 19 Jul 2007 18:02:36 -
@@ -336,6 +336,7 @@ setup(char *dev)
sbdirty();
dirty(&asblk);
}
+#if 0
if (sblock.fs_cgsize != fragroundup(&sblock, CGSIZE(&sblock))) {
pwarn("INCONSISTENT CGSIZE=%d\n", sblock.fs_cgsize);
sblock.fs_cgsize = fragroundup(&sblock, CGSIZE(&sblock));
@@ -346,6 +347,7 @@ setup(char *dev)
dirty(&asblk);
}
}
+#endif
if (INOPB(&sblock) != sblock.fs_bsize / sizeof(struct ufs1_dinode)) {
pwarn("INCONSISTENT INOPB=%d\n", INOPB(&sblock));
sblock.fs_inopb = sblock.fs_bsize / sizeof(struct ufs1_dinode);
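
For readers who want to see what the block disabled by the diff compares,
here is a minimal sketch, assuming a power-of-two fragment size. The struct
mini_sb and the functions frag_roundup() and cgsize_inconsistent() are
hypothetical stand-ins and not the fsck_ffs code. The computed size is passed
in as a plain number because the real CGSIZE macro is not reproduced here.
The sketch only illustrates how a wrongly computed size can make a valid
recorded fs_cgsize look inconsistent, which appears to be the situation the
note about the corrected CGSIZE macro refers to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the two superblock fields involved. */
struct mini_sb {
    int32_t fs_fsize;   /* fragment size in bytes, assumed power of two */
    int32_t fs_cgsize;  /* cylinder group size as recorded on disk */
};

/* Round size up to the next multiple of the fragment size. */
int32_t
frag_roundup(const struct mini_sb *sb, int32_t size)
{
    assert(sb->fs_fsize > 0 && (sb->fs_fsize & (sb->fs_fsize - 1)) == 0);
    return (size + sb->fs_fsize - 1) & ~(sb->fs_fsize - 1);
}

/*
 * Return 1 when the recorded size differs from the recomputed one.
 * computed_cgsize stands in for the result of the CGSIZE macro.
 */
int
cgsize_inconsistent(const struct mini_sb *sb, int32_t computed_cgsize)
{
    return sb->fs_cgsize != frag_roundup(sb, computed_cgsize);
}
```

With a 2048-byte fragment size and a recorded fs_cgsize of 16384, a computed
size of 16000 rounds up to 16384 and passes, while a computed size of 16385
rounds up to 18432 and is flagged.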



Re: fsck Segmentation fault on 4.1

2007-07-15 Thread Otto Moerbeek
On Sun, 15 Jul 2007, Niko Itajarvi wrote:

> Otto Moerbeek  drijf.net> writes:
> 
> > 
> > On Fri, 13 Jul 2007, Marcos Laufer wrote:
> > 
> > > Otto,
> > > 
> > > > I know the cables are all right; I'm using them with another hard drive.
> > > > The hard drive is new, but I will format it and check whether any
> > > > errors show up.
> > > > I hope it is hardware related; I would be kind of scared otherwise.
> > > > Do you need me to try anything else with this filesystem?
> > 
> > If possible (if it's not too large and the drive cooperates), I would
> > like a dd of the partition. I'm always very interested in having an
> > > image of a filesystem on which fsck_ffs chokes.
> > 
> > -Otto
> > 
> 
> 
> I have been getting the exact same error (CANNOT READ: BLK 64) on two different
> servers since I updated them to 4.1. I can happily provide a dd of the partition
> if it would help to resolve this.

Yes, please. Something like:

dd if=/dev/rsd0a of=image bs=512

gzip the image and make it available to me.

Although I will be on vacation soon, so I don't know if I'll have time to
check things before that.

-Otto
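
As a rough illustration of what the dd invocation above does, the copy loop
can be approximated in C as below. This is only a sketch under simplifying
assumptions: copy_blocks() is a hypothetical helper, it uses buffered stdio
rather than raw device I/O, and it leaves out everything else dd offers
(conversion options, statistics, seek/skip).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Copy everything from in to out using reads of at most bs bytes.
 * Returns the number of bytes copied, or -1 on an I/O or memory error.
 */
long
copy_blocks(FILE *in, FILE *out, size_t bs)
{
    char *buf;
    long total = 0;
    size_t n;

    assert(bs > 0);
    buf = malloc(bs);
    if (buf == NULL)
        return -1;

    while ((n = fread(buf, 1, bs, in)) > 0) {
        /* A short write means the image is incomplete; give up. */
        if (fwrite(buf, 1, n, out) != n) {
            free(buf);
            return -1;
        }
        total += (long)n;
    }
    if (ferror(in))
        total = -1;

    free(buf);
    return total;
}
```

Opening the raw partition and the image file with fopen() in binary mode and
calling copy_blocks(in, out, 512) would mirror the bs=512 setting. For a real
image, dd itself remains the right tool.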



Re: fsck Segmentation fault on 4.1

2007-07-15 Thread Niko Itajarvi
Otto Moerbeek  drijf.net> writes:

> 
> On Fri, 13 Jul 2007, Marcos Laufer wrote:
> 
> > Otto,
> > 
> > I know the cables are all right; I'm using them with another hard drive.
> > The hard drive is new, but I will format it and check whether any
> > errors show up.
> > I hope it is hardware related; I would be kind of scared otherwise.
> > Do you need me to try anything else with this filesystem?
> 
> If possible (if it's not too large and the drive cooperates), I would
> like a dd of the partition. I'm always very interested in having an
> image of a filesystem on which fsck_ffs chokes.
> 
>   -Otto
> 


I have been getting the exact same error (CANNOT READ: BLK 64) on two different
servers since I updated them to 4.1. I can happily provide a dd of the partition
if it would help to resolve this.


-Niko
[EMAIL PROTECTED]



Re: fsck Segmentation fault on 4.1

2007-07-13 Thread Otto Moerbeek
On Fri, 13 Jul 2007, Marcos Laufer wrote:

> Otto,
> 
> I know the cables are all right; I'm using them with another hard drive.
> The hard drive is new, but I will format it and check whether any
> errors show up.
> I hope it is hardware related; I would be kind of scared otherwise.
> Do you need me to try anything else with this filesystem?

If possible (if it's not too large and the drive cooperates), I would
like a dd of the partition. I'm always very interested in having an
image of a filesystem on which fsck_ffs chokes.

-Otto

> 
> Regards,
> Marcos
> 
> - Original Message - 
> From: "Otto Moerbeek" <[EMAIL PROTECTED]>
> To: "Marcos Laufer" <[EMAIL PROTECTED]>
> Cc: 
> Sent: Friday, July 13, 2007 4:46 PM
> Subject: Re: fsck Segmentation fault on 4.1
> 
> 
> On Fri, 13 Jul 2007, Marcos Laufer wrote:
> 
> > Otto ,
> >
> > This is the error i get:
> > It starts booting , and it starts fsck , it fails with /dev/rwd0e and rwd0h,
> >
> > (i could see once that when it finished it says:)
> > fsck_ffs in free():  error: free_page: pointer to wrong page
> > fsck: /dev/rwd0h: Abort trap
> >
> > I reboot it again many times and that did not show again
> >
> >
> > i try to fsck manually like this as you say and i get:
> >
> > # ulimit -d unlimited
> > # fsck -y /dev/rwd0e
> >
> > INCONSISTENT CGSIZE=16384
> >
> > FIX? yes
> >
> > * * Last mounted on /usr
> > * * Phase 1- Check Blocks and Sizes
> > * * Phase 2 - Check pathnames
> > * * Phase 3 - Check Conectivity
> > * * Phase 4 - Check Reference Counts
> > * * Phase 5 - Check Cyl Groups
> >
> > CANNOT READ: BLK 64
> >
> > CONTINUE? yes
> >
> > fsck: /dev/rwd0e: Segmentation Fault
> 
> This is not an out of memory situation.
> 
> It looks like fsck_ffs has problems getting data from your disk,
> probably because of hardware failure or bad cabling.  Sometimes it
> detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
> is possible it gets corrupted data in other cases.
> 
> Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
> memory and corrupt its internal data. During the last year I've fixed
> some stuff in this area, but there still remain cases that can go
> wrong.
> 
> -Otto
> 
> 
> > # _
> >
> >
> > The dmesg is:
> >
> > OpenBSD 4.1-stable (GENERIC) #0: Mon May 14 14:02:47 ART 2007
> > [EMAIL PROTECTED]:/u/system/src/sys/arch/i386/compile/GENERIC
> > cpu0: Intel(R) Pentium(R) 4 CPU 2.80GHz ("GenuineIntel" 686-class) 2.81 GHz
> > cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
> > real mem  = 1064857600 (1039900K)
> > avail mem = 964222976 (941624K)
> > using 4278 buffers containing 53366784 bytes (52116K) of memory
> > mainbus0 (root)
> > bios0 at mainbus0: AT/286+ BIOS, date 09/15/03, BIOS32 rev. 0 @ 0xfbbd0, SMBIOS rev. 2.2 @ 0xf0800 (39 entries)
> > bios0: MICRO-STAR INTL, CO.,LTD. MS-6743
> > apm0 at bios0: Power Management spec V1.2
> > apm0: AC on, battery charge unknown
> > apm0: flags 70102 dobusy 1 doidle 1
> > pcibios0 at bios0: rev 2.1 @ 0xf/0xdf84
> > pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdeb0/176 (9 entries)
> > pcibios0: PCI Exclusive IRQs: 3 4 5 7 10 11
> > pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371SB ISA" rev 0x00)
> > pcibios0: PCI bus #1 is the last bus
> > bios0: ROM list: 0xc/0xa600 0xcc000/0x1800
> > acpi at mainbus0 not configured
> > cpu0 at mainbus0
> > pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
> > pchb0 at pci0 dev 0 function 0 "Intel 82865G/PE/P CPU-I/0-1" rev 0x02
> > vga1 at pci0 dev 2 function 0 "Intel 82865G Video" rev 0x02: aperture at 0xf000, size 0x800
> > wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> > wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> > ppb0 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
> > pci1 at ppb0 bus 1
> > fxp0 at pci1 dev 8 function 0 "Intel PRO/100 VE" rev 0x02, i82562: irq 10, address 00:0c:76:b5:8a:85
> > inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
> > ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
> > pciide0 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA, 
> >

Re: fsck Segmentation fault on 4.1

2007-07-13 Thread Marcos Laufer
Otto,

I know the cables are all right; I'm using them with another hard drive.
The hard drive is new, but I will format it and check whether any
errors show up.
I hope it is hardware related; I would be kind of scared otherwise.
Do you need me to try anything else with this filesystem?

Regards,
Marcos

- Original Message - 
From: "Otto Moerbeek" <[EMAIL PROTECTED]>
To: "Marcos Laufer" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, July 13, 2007 4:46 PM
Subject: Re: fsck Segmentation fault on 4.1


On Fri, 13 Jul 2007, Marcos Laufer wrote:

> Otto ,
>
> This is the error i get:
> It starts booting , and it starts fsck , it fails with /dev/rwd0e and rwd0h,
>
> (i could see once that when it finished it says:)
> fsck_ffs in free():  error: free_page: pointer to wrong page
> fsck: /dev/rwd0h: Abort trap
>
> I reboot it again many times and that did not show again
>
>
> i try to fsck manually like this as you say and i get:
>
> # ulimit -d unlimited
> # fsck -y /dev/rwd0e
>
> INCONSISTENT CGSIZE=16384
>
> FIX? yes
>
> * * Last mounted on /usr
> * * Phase 1- Check Blocks and Sizes
> * * Phase 2 - Check pathnames
> * * Phase 3 - Check Conectivity
> * * Phase 4 - Check Reference Counts
> * * Phase 5 - Check Cyl Groups
>
> CANNOT READ: BLK 64
>
> CONTINUE? yes
>
> fsck: /dev/rwd0e: Segmentation Fault

This is not an out of memory situation.

It looks like fsck_ffs has problems getting data from your disk,
probably because of hardware failure or bad cabling.  Sometimes it
detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
is possible it gets corrupted data in other cases.

Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
memory and corrupt its internal data. During the last year I've fixed
some stuff in this area, but there still remain cases that can go
wrong.

-Otto


> # _
>
>
> The dmesg is:
>
> OpenBSD 4.1-stable (GENERIC) #0: Mon May 14 14:02:47 ART 2007
> [EMAIL PROTECTED]:/u/system/src/sys/arch/i386/compile/GENERIC
> cpu0: Intel(R) Pentium(R) 4 CPU 2.80GHz ("GenuineIntel" 686-class) 2.81 GHz
> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
> real mem  = 1064857600 (1039900K)
> avail mem = 964222976 (941624K)
> using 4278 buffers containing 53366784 bytes (52116K) of memory
> mainbus0 (root)
> bios0 at mainbus0: AT/286+ BIOS, date 09/15/03, BIOS32 rev. 0 @ 0xfbbd0, SMBIOS rev. 2.2 @ 0xf0800 (39 entries)
> bios0: MICRO-STAR INTL, CO.,LTD. MS-6743
> apm0 at bios0: Power Management spec V1.2
> apm0: AC on, battery charge unknown
> apm0: flags 70102 dobusy 1 doidle 1
> pcibios0 at bios0: rev 2.1 @ 0xf/0xdf84
> pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdeb0/176 (9 entries)
> pcibios0: PCI Exclusive IRQs: 3 4 5 7 10 11
> pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371SB ISA" rev 0x00)
> pcibios0: PCI bus #1 is the last bus
> bios0: ROM list: 0xc/0xa600 0xcc000/0x1800
> acpi at mainbus0 not configured
> cpu0 at mainbus0
> pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
> pchb0 at pci0 dev 0 function 0 "Intel 82865G/PE/P CPU-I/0-1" rev 0x02
> vga1 at pci0 dev 2 function 0 "Intel 82865G Video" rev 0x02: aperture at 0xf000, size 0x800
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> ppb0 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
> pci1 at ppb0 bus 1
> fxp0 at pci1 dev 8 function 0 "Intel PRO/100 VE" rev 0x02, i82562: irq 10, address 00:0c:76:b5:8a:85
> inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
> ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
> pciide0 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
> wd0 at pciide0 channel 0 drive 1: 
> wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
> wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 5
> atapiscsi0 at pciide0 channel 1 drive 1
> scsibus0 at atapiscsi0: 2 targets
> cd0 at scsibus0 targ 0 lun 0:  SCSI0 5/cdrom removable
> cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2
> ichiic0 at pci0 dev 31 function 3 "Intel 82801EB/ER SMBus" rev 0x02: irq 4
> iic0 at ichiic0
> iic0: addr 0x2f 04=00 06=0a 07=00 0c=00 0d=07 0e=85 0f=00 10=c0 11=11 12=00 13=60 14=14 15=62 16=01 17=06
> isa0 at ichpcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5
> pckbd0 at pckbc0 (kbd slot)
> pckbc0: using irq 1 for kbd slot
> wskbd0 at pckbd0: consol

Re: fsck Segmentation fault on 4.1

2007-07-13 Thread Otto Moerbeek
On Fri, 13 Jul 2007, Marcos Laufer wrote:

> Otto ,
> 
> This is the error i get:
> It starts booting , and it starts fsck , it fails with /dev/rwd0e and rwd0h,
> 
> (i could see once that when it finished it says:)
> fsck_ffs in free():  error: free_page: pointer to wrong page
> fsck: /dev/rwd0h: Abort trap
> 
> I reboot it again many times and that did not show again
> 
> 
> i try to fsck manually like this as you say and i get:
> 
> # ulimit -d unlimited
> # fsck -y /dev/rwd0e
> 
> INCONSISTENT CGSIZE=16384
> 
> FIX? yes
> 
> * * Last mounted on /usr
> * * Phase 1- Check Blocks and Sizes
> * * Phase 2 - Check pathnames
> * * Phase 3 - Check Conectivity
> * * Phase 4 - Check Reference Counts
> * * Phase 5 - Check Cyl Groups
> 
> CANNOT READ: BLK 64
> 
> CONTINUE? yes
> 
> fsck: /dev/rwd0e: Segmentation Fault

This is not an out of memory situation.

It looks like fsck_ffs has problems getting data from your disk,
probably because of hardware failure or bad cabling.  Sometimes it
detects it cannot read the data (the CANNOT READ: BLK 64 case), but it
is possible it gets corrupted data in other cases. 

Sadly, this can cause fsck_ffs to do the wrong thing and access wrong
memory and corrupt its internal data. During the last year I've fixed
some stuff in this area, but there still remain cases that can go
wrong.

-Otto


> # _
> 
> 
> The dmesg is:
> 
> OpenBSD 4.1-stable (GENERIC) #0: Mon May 14 14:02:47 ART 2007
> [EMAIL PROTECTED]:/u/system/src/sys/arch/i386/compile/GENERIC
> cpu0: Intel(R) Pentium(R) 4 CPU 2.80GHz ("GenuineIntel" 686-class) 2.81 GHz
> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
> real mem  = 1064857600 (1039900K)
> avail mem = 964222976 (941624K)
> using 4278 buffers containing 53366784 bytes (52116K) of memory
> mainbus0 (root)
> bios0 at mainbus0: AT/286+ BIOS, date 09/15/03, BIOS32 rev. 0 @ 0xfbbd0, SMBIOS rev. 2.2 @ 0xf0800 (39 entries)
> bios0: MICRO-STAR INTL, CO.,LTD. MS-6743
> apm0 at bios0: Power Management spec V1.2
> apm0: AC on, battery charge unknown
> apm0: flags 70102 dobusy 1 doidle 1
> pcibios0 at bios0: rev 2.1 @ 0xf/0xdf84
> pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdeb0/176 (9 entries)
> pcibios0: PCI Exclusive IRQs: 3 4 5 7 10 11
> pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371SB ISA" rev 0x00)
> pcibios0: PCI bus #1 is the last bus
> bios0: ROM list: 0xc/0xa600 0xcc000/0x1800
> acpi at mainbus0 not configured
> cpu0 at mainbus0
> pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
> pchb0 at pci0 dev 0 function 0 "Intel 82865G/PE/P CPU-I/0-1" rev 0x02
> vga1 at pci0 dev 2 function 0 "Intel 82865G Video" rev 0x02: aperture at 0xf000, size 0x800
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> ppb0 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
> pci1 at ppb0 bus 1
> fxp0 at pci1 dev 8 function 0 "Intel PRO/100 VE" rev 0x02, i82562: irq 10, address 00:0c:76:b5:8a:85
> inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
> ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
> pciide0 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
> wd0 at pciide0 channel 0 drive 1: 
> wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
> wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 5
> atapiscsi0 at pciide0 channel 1 drive 1
> scsibus0 at atapiscsi0: 2 targets
> cd0 at scsibus0 targ 0 lun 0:  SCSI0 5/cdrom removable
> cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2
> ichiic0 at pci0 dev 31 function 3 "Intel 82801EB/ER SMBus" rev 0x02: irq 4
> iic0 at ichiic0
> iic0: addr 0x2f 04=00 06=0a 07=00 0c=00 0d=07 0e=85 0f=00 10=c0 11=11 12=00 13=60 14=14 15=62 16=01 17=06
> isa0 at ichpcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5
> pckbd0 at pckbc0 (kbd slot)
> pckbc0: using irq 1 for kbd slot
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pmsi0 at pckbc0 (aux slot)
> pckbc0: using irq 12 for aux slot
> wsmouse0 at pmsi0 mux 0
> pcppi0 at isa0 port 0x61
> midi0 at pcppi0: 
> spkr0 at pcppi0
> lm0 at isa0 port 0x290/8: W83627THF
> npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
> biomask ebfd netmask effd ttymask 
> pctr: user-level cycle counter enabled
> dkcsum: wd0 matches BIOS drive 0x80
> root on wd0a
> rootdev=0x0 rrootdev=0x300 rawdev=0x302
> 
> 

Re: fsck Segmentation fault on 4.1

2007-07-13 Thread Bob Beck
> I want to report a problem i experienced while testing OpenBSD 4.1 .
> I've installed it, increased VM_PHYSSEG_MAX to 16
> in /usr/src/sys/arch/i386/include/vmparam.h to make
> it work with this particular motherboard and made a
> stable release.

Fluffy!!!

There be dragons..

-Bob



Re: fsck Segmentation fault on 4.1

2007-07-13 Thread Marcos Laufer
Otto,

This is the error I get:
It starts booting and runs fsck, which fails on /dev/rwd0e and rwd0h.

(Once, when it finished, I saw it print:)
fsck_ffs in free():  error: free_page: pointer to wrong page
fsck: /dev/rwd0h: Abort trap

I rebooted many times and that message did not show up again.


When I run fsck manually as you suggested, I get:

# ulimit -d unlimited
# fsck -y /dev/rwd0e

INCONSISTENT CGSIZE=16384

FIX? yes

* * Last mounted on /usr
* * Phase 1- Check Blocks and Sizes
* * Phase 2 - Check pathnames
* * Phase 3 - Check Conectivity
* * Phase 4 - Check Reference Counts
* * Phase 5 - Check Cyl Groups

CANNOT READ: BLK 64

CONTINUE? yes

fsck: /dev/rwd0e: Segmentation Fault
# _


The dmesg is:

OpenBSD 4.1-stable (GENERIC) #0: Mon May 14 14:02:47 ART 2007
[EMAIL PROTECTED]:/u/system/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Pentium(R) 4 CPU 2.80GHz ("GenuineIntel" 686-class) 2.81 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
real mem  = 1064857600 (1039900K)
avail mem = 964222976 (941624K)
using 4278 buffers containing 53366784 bytes (52116K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+ BIOS, date 09/15/03, BIOS32 rev. 0 @ 0xfbbd0, SMBIOS rev. 2.2 @ 0xf0800 (39 entries)
bios0: MICRO-STAR INTL, CO.,LTD. MS-6743
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 70102 dobusy 1 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf/0xdf84
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdeb0/176 (9 entries)
pcibios0: PCI Exclusive IRQs: 3 4 5 7 10 11
pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371SB ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc/0xa600 0xcc000/0x1800
acpi at mainbus0 not configured
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82865G/PE/P CPU-I/0-1" rev 0x02
vga1 at pci0 dev 2 function 0 "Intel 82865G Video" rev 0x02: aperture at 0xf000, size 0x800
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ppb0 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
pci1 at ppb0 bus 1
fxp0 at pci1 dev 8 function 0 "Intel PRO/100 VE" rev 0x02, i82562: irq 10, address 00:0c:76:b5:8a:85
inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
pciide0 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 1: 
wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0:  SCSI0 5/cdrom removable
cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2
ichiic0 at pci0 dev 31 function 3 "Intel 82801EB/ER SMBus" rev 0x02: irq 4
iic0 at ichiic0
iic0: addr 0x2f 04=00 06=0a 07=00 0c=00 0d=07 0e=85 0f=00 10=c0 11=11 12=00 13=60 14=14 15=62 16=01 17=06
isa0 at ichpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: 
spkr0 at pcppi0
lm0 at isa0 port 0x290/8: W83627THF
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
biomask ebfd netmask effd ttymask 
pctr: user-level cycle counter enabled
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302


- Original Message - 
From: "Otto Moerbeek" <[EMAIL PROTECTED]>
To: "Marcos Laufer" <[EMAIL PROTECTED]>
Cc: 
Sent: Friday, July 13, 2007 3:38 PM
Subject: Re: fsck Segmentation fault on 4.1


On Fri, 13 Jul 2007, Marcos Laufer wrote:

> Hello,
>
> I want to report a problem i experienced while testing OpenBSD 4.1 .
> I've installed it, increased VM_PHYSSEG_MAX to 16
> in /usr/src/sys/arch/i386/include/vmparam.h to make
> it work with this particular motherboard and made a
> stable release.
> Installed a server with it and it's working fine as an MX for
> a few months until now.
> The machine was crashed, no error on the screen and the keyboard
> did not respond. I rebooted , it started to fsck , and
> the fsck failed on /usr. So i run fsck manually : fsck -y, but
> it crashes with segmentation fault, so i can't mount or
> start the server.
> I read on the archives that it was a problem because of running out
> of swap, but i had made a 2gb swap partition, despite of that
> i added a 64mb file as swap and tried fsck again, but no luck.

Re: fsck Segmentation fault on 4.1

2007-07-13 Thread Otto Moerbeek
On Fri, 13 Jul 2007, Marcos Laufer wrote:

> Hello,
> 
> I want to report a problem i experienced while testing OpenBSD 4.1 .
> I've installed it, increased VM_PHYSSEG_MAX to 16
> in /usr/src/sys/arch/i386/include/vmparam.h to make
> it work with this particular motherboard and made a
> stable release.
> Installed a server with it and it's working fine as an MX for
> a few months until now.
> The machine was crashed, no error on the screen and the keyboard
> did not respond. I rebooted , it started to fsck , and
> the fsck failed on /usr. So i run fsck manually : fsck -y, but
> it crashes with segmentation fault, so i can't mount or
> start the server.
> I read on the archives that it was a problem because of running out
> of swap, but i had made a 2gb swap partition, despite of that
> i added a 64mb file as swap and tried fsck again, but no luck.
> This time it was easy for me to reinstall everything in a new hard disk, but
> i still keep the old one because i would like to learn how to fix
> this , if anyone wants me to make some tests or has
> any ideas on what is going on , let me know.

Start by showing the error message. A segmentation fault is something
different from running out of memory.

If fsck segfaults, I need a proper error report.
See http://www.openbsd.org/report.html

If fsck runs out of memory, increasing ulimit -d might help, like:

# ulimit -d unlimited
# fsck ...

That reminds me to cook a diff to do this automatically. With
filesystems getting larger and larger, more people will run into
out-of-mem situations.
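
For reference, a program can raise its own data-size limit with the standard getrlimit(2)/setrlimit(2) calls instead of relying on the user to run ulimit first. This is only a rough sketch of that idea, not the actual diff mentioned above; the function name raise_data_limit is made up for illustration, and it only lifts the soft limit up to the existing hard limit, which is what an unprivileged process is allowed to do.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft data-size limit (RLIMIT_DATA)
 * to the current hard limit. Similar in spirit to running
 * "ulimit -d" with the maximum allowed value before starting fsck.
 *
 * Returns 0 on success, -1 on failure (errno is set by the
 * failing getrlimit/setrlimit call).
 */
int
raise_data_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return -1;
	/* The soft limit may be raised up to the hard limit without privilege. */
	rl.rlim_cur = rl.rlim_max;
	if (setrlimit(RLIMIT_DATA, &rl) == -1)
		return -1;
	return 0;
}
```

Calling this once at startup, before any large allocation, would have the effect of the manual ulimit step; a failure can simply be ignored, since the program then runs with whatever limit it already had.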

-Otto



fsck Segmentation fault on 4.1

2007-07-13 Thread Marcos Laufer
Hello,

I want to report a problem I experienced while testing OpenBSD 4.1.
I installed it, increased VM_PHYSSEG_MAX to 16
in /usr/src/sys/arch/i386/include/vmparam.h to make
it work with this particular motherboard, and built a
stable release.
I installed a server with it, and it had been working fine as an MX for
a few months until now.
The machine crashed with no error on the screen, and the keyboard
did not respond. I rebooted, it started to fsck, and
the fsck failed on /usr. So I ran fsck manually (fsck -y), but
it crashes with a segmentation fault, so I can't mount the filesystem or
start the server.
I read in the archives that this could be caused by running out
of swap, but I had made a 2 GB swap partition. Despite that,
I added a 64 MB file as swap and tried fsck again, but no luck.
This time it was easy for me to reinstall everything on a new hard disk, but
I am keeping the old one because I would like to learn how to fix
this. If anyone wants me to run some tests or has
any ideas about what is going on, let me know.

Regards,
Marcos