Re: Mount root error / New device numbering?

2010-05-14 Thread Fred Souza
On Fri, May 14, 2010 at 00:32, Fred Souza  wrote:
> Good to know, I never really paid much attention to those details (I
> will from now on). Thank you a lot for the help, Jeremy. I will try
> your suggestions in the morning and post back to tell what did I find
> out.

Like I said, here are my findings:

Jeremy's pointers were correct: the difference in numbering seems
to be just an ata(4) change. Manually changing the entries in /etc/fstab
does fix it, and the kernel panic I was getting turned out to be
a simple detail I had overlooked:

The 3rd-party nvidia driver had been compiled on -RELEASE and was
causing the kernel panics on -STABLE. Simply disabling its loading at
the boot loader prompt, then booting with /etc/fstab properly updated
and then reinstalling the nvidia-driver port (`portmaster
nvidia-driver`) fixed it. Just to be on the safe side, I also
reinstalled the other 3rd-party kernel module I use (fusefs-ntfs3g),
even though it wasn't giving me any errors.

I did try the -STABLE snapshot image as Jeremy suggested, which is how I
figured he was right about the numbering difference being an ata(4)
change. I preferred to just manually change the previous install's
/etc/fstab, though (but maybe there was a better way of doing this
with the -STABLE snapshot DVD).
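
For anyone hitting the same renumbering, the manual /etc/fstab change
described above could be sketched roughly as below. The mapping (ad8 to
ad10, ad14 to ad16) is the one reported in this thread; the function name
remap_fstab is made up for illustration, and the sketch writes the result
to standard output instead of editing the file in place, so it can be
reviewed before anything is overwritten.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): rewrite the ata(4) device
# names at the start of each fstab line. The trailing [^0-9] keeps a
# device such as ad80 from being mistaken for ad8.
remap_fstab() {
    sed -e 's|^/dev/ad8\([^0-9]\)|/dev/ad10\1|' \
        -e 's|^/dev/ad14\([^0-9]\)|/dev/ad16\1|' "$1"
}

# Possible usage from the single-user shell, assuming / is writable:
#   remap_fstab /etc/fstab > /etc/fstab.new
#   (inspect /etc/fstab.new, then move it into place and reboot)
```

The numbers would differ on other controllers, so the two sed
expressions are only an instance of the idea, not a general fix.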

The interrupt storm on irq21 is still happening, and I'm going to work
on that next. Mounting any non-audio CD/DVD stops it, so I'll keep
doing that until I actually find a fix for the issue.

So thank you very much, Jeremy. Your pointers were very helpful in
fixing my problem.


Best regards,
Fred
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Mount root error / New device numbering?

2010-05-13 Thread Fred Souza
On Fri, May 14, 2010 at 00:25, Jeremy Chadwick  wrote:
> Absolutely.  I've done it myself many times over the years, including
> remotely over serial console.  However, you said you did that then typed
> "exit" rather than "reboot", and the end result was a kernel panic.

Yes, I should have given it another clean try after changing
/etc/fstab. I guess my rustiness with the Good Stuff(tm) plus the
unexpected behavior made me panic myself.

> Honestly, I'm not surprised; the system was probably still confused
> about the root device.  I'm guessing some kernel innards (or maybe
> something picked up from boot2/loader) still referenced the "unknown
> root device" and caused the panic.

Could be. The system had just too many possible points of failure at
that point (its original kernel refused to boot for a few hours due to
the geometry mess, then all of a sudden started working normally, for
instance), so I take whatever I learned from that install as
experience. I'm glad I at least had the right idea; it may
have failed because of any number of things along the way.

> Even on other operating systems, if I'm dropped (unintentionally or
> intentionally/by choice) into single-user mode, I reboot the system
> rather than exit out of single-user and hope that multi-user works from
> that point forward.  I've seen "exit" on Solaris fail and cause all
> sorts of mayhem (all sorts of system startup services (not rc/init!)
> failing, machine ending up in some sort of catatonic state).

Good to know; I never really paid much attention to those details (I
will from now on). Thank you a lot for the help, Jeremy. I will try
your suggestions in the morning and post back with what I find
out.


Best regards,
Fred


Re: Mount root error / New device numbering?

2010-05-13 Thread Jeremy Chadwick
On Fri, May 14, 2010 at 12:16:47AM -0300, Fred Souza wrote:
> > I'd recommend booting/trying an actual 8.0-STABLE snapshot image from
> > here:
> >
> > ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/201004/
> >
> > This will allow you to boot and install 8.0-STABLE on your system.  You
> > should see devices ad10 and ad16 there as well.  It would at least save
> > you the pain of installing the kernel, rebooting, and finding you have
> > to manually deal with /etc/fstab changes and so on.  Give this a shot
> > first.
> >
> > It also might help in debugging the "stray IRQ" problem you see (it
> > would be useful to know what's sitting on IRQ 21; it may be an unused
> > device in your BIOS which you can disable there, or try to find a
> > FreeBSD driver for the device which can attach to the IRQ).
> 
> I will try that, thank you very much. But as future reference.. Should
> it work if I just get to that shell prompt, change /etc/fstab to match
> those number changes and reboot? I'm asking because that sounded like
> the way to go when I first encountered this problem, but I ended
> making my system unusable. It is possible that I left anything out
> when I tried that, or that I changed something incorrectly.. But the
> idea should work, right?

Absolutely.  I've done it myself many times over the years, including
remotely over serial console.  However, you said you did that then typed
"exit" rather than "reboot", and the end result was a kernel panic.

Honestly, I'm not surprised; the system was probably still confused
about the root device.  I'm guessing some kernel innards (or maybe
something picked up from boot2/loader) still referenced the "unknown
root device" and caused the panic.

> I do an exit on that shell just to get to a kernel panic message and a
> quick reboot. I tried to unload the -STABLE kernel and boot from the
> -RELEASE one, but now the system hangs right after it tries to find my
> disks.

Even on other operating systems, if I'm dropped (unintentionally or
intentionally/by choice) into single-user mode, I reboot the system
rather than exit out of single-user and hope that multi-user works from
that point forward.  I've seen "exit" on Solaris fail and cause all
sorts of mayhem (all sorts of system startup services (not rc/init!)
failing, machine ending up in some sort of catatonic state).

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Mount root error / New device numbering?

2010-05-13 Thread Fred Souza
On Fri, May 14, 2010 at 00:06, Jeremy Chadwick  wrote:
> There is probably an ata(4) device layer change which either fixes (yes
> really), breaks (possibly), or enhances (likely) support for your ATA or
> SATA controller.  This is pretty much how the ata(4) layer has behaved
> for years upon years -- that's just how it goes.  If this is your first
> time encountering it, congratulations.  :-)  The device names *should
> not* change on you once you stick with that kernel; it just indicates
> something changed between -RELEASE and -STABLE.

Hmmm.. Ok, then that may not be me messing up. Good news for me!.. I guess..

> I'd recommend booting/trying an actual 8.0-STABLE snapshot image from
> here:
>
> ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/201004/
>
> This will allow you to boot and install 8.0-STABLE on your system.  You
> should see devices ad10 and ad16 there as well.  It would at least save
> you the pain of installing the kernel, rebooting, and finding you have
> to manually deal with /etc/fstab changes and so on.  Give this a shot
> first.
>
> It also might help in debugging the "stray IRQ" problem you see (it
> would be useful to know what's sitting on IRQ 21; it may be an unused
> device in your BIOS which you can disable there, or try to find a
> FreeBSD driver for the device which can attach to the IRQ).

I will try that, thank you very much. But for future reference: should
it work if I just get to that shell prompt, change /etc/fstab to match
those number changes and reboot? I'm asking because that sounded like
the way to go when I first encountered this problem, but I ended up
making my system unusable. It is possible that I left something out
when I tried that, or that I changed something incorrectly. But the
idea should work, right?


Thank you,
Fred


Re: Mount root error / New device numbering?

2010-05-13 Thread Fred Souza
On Thu, May 13, 2010 at 23:51, Jeremy Chadwick  wrote:
> 1) We use csup now, not cvsup.  csup comes with the base system, so
>   there's no need to install cvsup.
>
> 2) I'm not sure why you're downloading ports.tar.gz and extracting it.
>   This means that /var/db/sup/ports-all won't match what's in
>   /usr/ports.  You should just use csup to populate /usr/ports.
>   You can do this by doing:
>
>   csup -h  -L 2 /usr/share/examples/cvsup/ports-supfile
>
>   You can also populate /usr/src (and thus /var/db/sup/src-all) by
>   doing:
>
>   csup -h  -L 2 /usr/share/examples/cvsup/stable-supfile
>
>   There are also /etc/make.conf variables you can set to make this
>   process easier once you've populated /usr/ports and /usr/src; you
>   can do something like "cd /usr/ports ; make update".

Thank you, that is a change I had missed. I will use csup from now
on.

> Well, if what you're doing is an "in-place" 7.x upgrade to 8.x, I don't
> know how to do this or if it works.  Others can help.

No, I did a fresh 8.0-RELEASE install and then tried updating it to -STABLE.

> Otherwise, the steps you're describing for building a system are not
> what's in src/Makefile (not src/UPDATING).  These are the steps:
>
> #  1.  `cd /usr/src'       (or to the directory containing your source tree).
> #  2.  `make buildworld'
> #  3.  `make buildkernel KERNCONF=YOUR_KERNEL_HERE'     (default is GENERIC).
> #  4.  `make installkernel KERNCONF=YOUR_KERNEL_HERE'   (default is GENERIC).
> #       [steps 3. & 4. can be combined by using the "kernel" target]
> #  5.  `reboot'        (in single user mode: boot -s from the loader prompt).
> #  6.  `mergemaster -p'
> #  7.  `make installworld'
> #  8.  `make delete-old'
> #  9.  `mergemaster'                         (you may wish to use -U or -ai).
> # 10.  `reboot'
> # 11.  `make delete-old-libs' (in case no 3rd party program uses them anymore)

Yeah, that is very close to what I did:

# cd /usr/src
# make buildworld
# make kernel KERNCONF=LIGHTNING
# reboot

That was for the first install, which got completely borked after
rebooting and my attempt to change the contents of /etc/fstab. On this
current install, I did this:

# cd /usr/src
# make buildworld
# make kernel KERNCONF=LIGHTNING
# mergemaster -p
# make installworld
# make delete-old
# mergemaster -i
# make delete-old-libs
# reboot

The reason I tried all that before rebooting, as I said in the
first e-mail, was that I thought the changing drive numbers could be
related to the -STABLE kernel running on top of a -RELEASE userland. All
those steps ran just fine, though. But when I reboot, I still see the
kernel assigning ad10 to my first drive (it's ad8 with the -RELEASE
kernel) and ad16 to the second (ad14 with -RELEASE). I have no idea
what is causing this change in numbering.


Thanks,
Fred


Re: Mount root error / New device numbering?

2010-05-13 Thread Jeremy Chadwick
On Thu, May 13, 2010 at 11:00:38PM -0300, Fred Souza wrote:
> I give up and reinstall (that first install had given me quite a
> headache with incorrect drive geometry [that I had to fix with a lot
> of research to get to TestDisk and GAG], so I thought it was best to
> just start fresh). I do the same procedure this time, but paying extra
> attention to any details I could have overlooked before. One of them
> was to make a kernel (-STABLE) out of a renamed copy of GENERIC (no
> options added or removed). I also decided on doing the remaining steps
> listed on /usr/src/UPDATING before rebooting; I thought the drive
> numbering difference could be related to something in userland that
> was missing when booting the -STABLE kernel with -RELEASE userland.
> ...
> And I got the same mount root error message, and again it shows the
> drives as ad10 and ad16 instead of ad8 and ad14. The difference is
> that this time I did not try to update /etc/fstab before resorting to
> this list (I had been browsing it for the past 3 days trying to find
> any hints on this, as well as reading /usr/src/UPDATING in full
> again). I can get the system to boot normally if I unload the -STABLE
> kernel and load the -RELEASE one. But I can't figure out for the life
> of me why does -STABLE shift my drive numbers around.

There is probably an ata(4) device layer change which either fixes (yes
really), breaks (possibly), or enhances (likely) support for your ATA or
SATA controller.  This is pretty much how the ata(4) layer has behaved
for years upon years -- that's just how it goes.  If this is your first
time encountering it, congratulations.  :-)  The device names *should
not* change on you once you stick with that kernel; it just indicates
something changed between -RELEASE and -STABLE.

I'd recommend booting/trying an actual 8.0-STABLE snapshot image from
here:

ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/201004/

This will allow you to boot and install 8.0-STABLE on your system.  You
should see devices ad10 and ad16 there as well.  It would at least save
you the pain of installing the kernel, rebooting, and finding you have
to manually deal with /etc/fstab changes and so on.  Give this a shot
first.

It also might help in debugging the "stray IRQ" problem you see (it
would be useful to know what's sitting on IRQ 21; it may be an unused
device in your BIOS which you can disable there, or try to find a
FreeBSD driver for the device which can attach to the IRQ).
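
As a rough illustration of finding out what is sitting on IRQ 21:
on FreeBSD, "vmstat -i" lists interrupt sources together with the
device(s) attached to each. The helper below only filters that listing
for one IRQ number. Its name, irq_lines, is hypothetical, and it assumes
each line starts with "irqN:" followed by the device name; a stray
interrupt with no driver attached may not show a device name at all.

```shell
#!/bin/sh
# Hypothetical helper: print the "vmstat -i"-style lines for one IRQ.
# Reads the listing on stdin; the exact match on "irqN:" keeps irq21
# from also matching irq210 and similar.
irq_lines() {
    awk -v tag="irq$1:" '$1 == tag'
}

# On a FreeBSD system this might be used as:
#   vmstat -i | irq_lines 21
```

The device name reported there (if any) can then be looked up in the
dmesg output to see which driver claimed it.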

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Mount root error / New device numbering?

2010-05-13 Thread Jeremy Chadwick
On Thu, May 13, 2010 at 11:00:38PM -0300, Fred Souza wrote:
> I did a similar procedure as I used to on the old system, grabbed a
> fresh ports.tar.gz, uncompressed it under /usr, installed cvsup and
> proceeded to updating /usr/src to -STABLE (using the RELENG_8 tag). So
> far, so good.

1) We use csup now, not cvsup.  csup comes with the base system, so
   there's no need to install cvsup.

2) I'm not sure why you're downloading ports.tar.gz and extracting it.
   This means that /var/db/sup/ports-all won't match what's in
   /usr/ports.  You should just use csup to populate /usr/ports.
   You can do this by doing:

   csup -h  -L 2 /usr/share/examples/cvsup/ports-supfile

   You can also populate /usr/src (and thus /var/db/sup/src-all) by
   doing:

   csup -h  -L 2 /usr/share/examples/cvsup/stable-supfile

   There are also /etc/make.conf variables you can set to make this
   process easier once you've populated /usr/ports and /usr/src; you
   can do something like "cd /usr/ports ; make update".

> I made a custom kernel config file based off of GENERIC, added only a
> few options (like sound and console customization options), and
> followed the steps listed on /usr/src/UPDATING:
> 
> # cd /usr/src
> # make buildworld
> # make kernel KERNCONF=MYKERNELNAME
> 
>
> ... 
> 
> Could anyone please enlighten me?

Well, if what you're doing is an "in-place" 7.x upgrade to 8.x, I don't
know how to do this or if it works.  Others can help.

Otherwise, the steps you're describing for building a system are not
what's in src/Makefile (not src/UPDATING).  These are the steps:

#  1.  `cd /usr/src'       (or to the directory containing your source tree).
#  2.  `make buildworld'
#  3.  `make buildkernel KERNCONF=YOUR_KERNEL_HERE'     (default is GENERIC).
#  4.  `make installkernel KERNCONF=YOUR_KERNEL_HERE'   (default is GENERIC).
#       [steps 3. & 4. can be combined by using the "kernel" target]
#  5.  `reboot'        (in single user mode: boot -s from the loader prompt).
#  6.  `mergemaster -p'
#  7.  `make installworld'
#  8.  `make delete-old'
#  9.  `mergemaster'                         (you may wish to use -U or -ai).
# 10.  `reboot'
# 11.  `make delete-old-libs' (in case no 3rd party program uses them anymore)
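
The sequence above can be sketched as two shell functions split at the
reboot in step 5, which is the point the list turns on: the new kernel
is installed and booted before the new world is installed. This is only
a sketch. The names run, stage_before_reboot and stage_after_reboot are
invented here, DRY_RUN defaults to 1 so the commands are printed rather
than executed, and mounting filesystems in single-user mode is left out.

```shell
#!/bin/sh
# Sketch of the src/Makefile steps quoted above (names are hypothetical).
: "${DRY_RUN:=1}"          # 1 = only print the commands
: "${KERNCONF:=GENERIC}"   # kernel config name, as in steps 3 and 4

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1-4: build world, build and install the kernel. Afterwards
# reboot into single-user mode (boot -s from the loader prompt).
stage_before_reboot() {
    run cd /usr/src &&
    run make buildworld &&
    run make buildkernel KERNCONF="$KERNCONF" &&
    run make installkernel KERNCONF="$KERNCONF"
}

# Steps 6-9: run on the new kernel, then reboot again (step 10).
# Step 11, make delete-old-libs, is deliberately left to be run by hand.
stage_after_reboot() {
    run cd /usr/src &&
    run mergemaster -p &&
    run make installworld &&
    run make delete-old &&
    run mergemaster
}
```

With KERNCONF=LIGHTNING set, stage_before_reboot in dry-run mode prints
the four commands of steps 1 to 4 in order, which makes it easy to check
the sequence before running anything for real.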

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |
