Stage 1 weak points

Last week I sent in a few patches that correct some minor issues and clean up
some corner cases. Because of their limited scope, I felt comfortable simply
writing the patches without further discussion. The following is of another
kind. The points still outstanding touch the architecture of stage 1, and here
I certainly want to discuss matters with the community.

A. Goals
Perhaps I should start with a short description of what stage 1 does when it
is loaded and started:
Its primary goal is to load the first 512 bytes of the next stage into RAM
and start it. The location of the next stage is given as an LBA offset. Second,
stage 1 supports further loads by providing information such as disk geometry,
so subsequent stages do not have to probe the drives themselves. Finally,
should something go wrong, stage 1 displays some diagnostic information and
dies.

B. Strategies to fulfill the goals
As you can imagine, the first goal is by far the most difficult one.
Loading a sector means that you first probe the drive to see how it is best
accessed, and then instruct the BIOS to do the loading.
Currently, stage 1 supports two loading strategies and as many as three
probing strategies. This is quite a lot for a small program confined to as
little as 378 bytes of machine code.
The three probing strategies are:
1. Checking whether the BIOS supports LBA access for the boot drive.
2. Retrieving the cylinder/head/sector geometry of a drive using a special
BIOS call.
3. Determining the geometry of a floppy by successively assuming smaller and
smaller track sizes until reading the last sector of a track succeeds.
From the track size one can guess the format of the disk more or less safely.
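
To make the third strategy concrete, here is a minimal sketch of the probing
loop. It is only a model in Python, not the stage 1 assembly: the candidate
list, the function names and the fake reader are assumptions made for
illustration, and a real implementation would issue a BIOS read instead.

```python
# Illustrative model only; real stage 1 is 16-bit assembly calling INT 0x13.
# The candidate list and all names here are assumptions for this sketch.
CANDIDATE_TRACK_SIZES = (36, 18, 15, 9)  # sectors per track, largest first


def probe_track_size(read_sector, candidates=CANDIDATE_TRACK_SIZES):
    """Return the first track size whose last sector is readable, else None.

    read_sector(cylinder, head, sector) -> bool stands in for a BIOS read.
    Sector numbers are 1-based, as in CHS addressing.
    """
    for sectors_per_track in candidates:
        # If the last sector of the assumed track exists, accept this size.
        if read_sector(0, 0, sectors_per_track):
            return sectors_per_track
    return None


def make_fake_floppy(real_sectors_per_track):
    """Build a fake reader that succeeds only for sectors that exist."""
    def read_sector(cylinder, head, sector):
        return 1 <= sector <= real_sectors_per_track
    return read_sector
```

Calling probe_track_size(make_fake_floppy(18)) tries 36 first, fails, and then
settles on 18. The order matters: starting with the largest size guarantees
that a smaller format is never mistaken for a larger one.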
The two loading strategies are:
1. Building a data structure called a disk address packet and submitting this
structure for LBA access.
2. Using the geometry of a floppy or hard disk to transform the LBA address
of the following stage into a CHS address and submitting this value
for CHS access.
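
Both loading strategies can be sketched in a few lines. The following is a
Python model, not the actual stage 1 code; the function names are mine, and
the packet layout and register packing follow the usual INT 0x13 conventions
as I understand them (16-byte packet for the extended read, CH/CL/DH for the
classic read).

```python
import struct


def build_disk_address_packet(lba, sector_count, segment, offset):
    """Pack a 16-byte disk address packet for an INT 0x13 extended read."""
    return struct.pack(
        "<BBHHHQ",
        0x10,          # packet size in bytes
        0,             # reserved, must be zero
        sector_count,  # number of sectors to transfer
        offset,        # offset of the transfer buffer
        segment,       # segment of the transfer buffer
        lba,           # 64-bit starting sector number
    )


def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a 0-based LBA into (cylinder, head, sector); sector is 1-based."""
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector0 = divmod(rest, sectors_per_track)
    return cylinder, head, sector0 + 1


def chs_to_registers(cylinder, head, sector):
    """Pack a CHS address into the (CH, CL, DH) values of a classic read."""
    ch = cylinder & 0xFF                             # low 8 cylinder bits
    cl = (sector & 0x3F) | ((cylinder >> 2) & 0xC0)  # sector + high cyl bits
    dh = head
    return ch, cl, dh
```

For a 1.44 MB floppy (2 heads, 18 sectors/track), lba_to_chs(18, 2, 18) gives
(0, 1, 1): the first sector on the second side of cylinder 0. Note that the
conversion is only as good as the geometry fed into it.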
As a bonus, stage 1 finally moves the loaded beginning of the next stage
to whatever RAM location the GRUB installer specified.

C. Bugs
If all of the above worked well, I would say 'wow', but sadly that is not
always the case. Especially with old hardware, there are some issues that I
think need attention. The weak point is the third probing strategy, the floppy
probing. IMHO, the current implementation renders this strategy useless (there
might be some relation to Bug #482). So we should either remove it completely
or activate it in a correct fashion.
What goes wrong here is a misunderstanding of BIOS call INT 0x13, function
8. This call retrieves the geometry of a drive: the drive, not the inserted
medium. So if the capabilities of the drive and the medium differ (a
1.44 MB floppy in a 2.88 MB drive), GRUB currently uses the wrong geometry to
access the floppy disk.
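
A small worked example may show the effect. The sketch below is a Python
model with names of my own choosing; it decodes the registers returned by
function 8 in the usual way (CL bits 0-5 hold the sectors per track, DH the
highest head number) and then converts the same LBA once with the drive's
geometry and once with the geometry of the inserted medium. The register
values for the 2.88 MB drive are assumed for illustration.

```python
def decode_function_8(ch, cl, dh):
    """Decode INT 0x13 function 8 output into (cylinders, heads, sectors)."""
    max_cylinder = ch | ((cl & 0xC0) << 2)  # 10-bit highest cylinder number
    sectors_per_track = cl & 0x3F           # highest sector number (1-based)
    heads = dh + 1                          # DH is the highest head number
    return max_cylinder + 1, heads, sectors_per_track


def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a 0-based LBA into (cylinder, head, sector); sector is 1-based."""
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector0 = divmod(rest, sectors_per_track)
    return cylinder, head, sector0 + 1


# Assumed values a 2.88 MB drive might report: 80 cylinders, 2 heads, 36 spt.
cylinders, heads, drive_spt = decode_function_8(ch=79, cl=36, dh=1)

MEDIA_SPT = 18  # the inserted 1.44 MB floppy really has 18 sectors per track
LBA = 18        # first sector beyond the first track of that floppy

with_drive_geometry = lba_to_chs(LBA, heads, drive_spt)  # (0, 0, 19)
with_media_geometry = lba_to_chs(LBA, heads, MEDIA_SPT)  # (0, 1, 1)
```

With the drive's geometry the request goes to sector 19 of head 0, which does
not exist on a medium with 18 sectors per track. The correct target would be
sector 1 of head 1.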
Furthermore, GRUB invokes the floppy probing only if the above BIOS call
returns an error. I admit I do not understand why it is programmed that way.
The call might fail for two reasons: the drive is not accessible by the BIOS,
or the battery of the CMOS RAM is exhausted, so the parameters of the drive
could not be read. But in either case, how could reading from the floppy
succeed at all, and how could floppy probing help?
Fortunately, modern computers are not that susceptible to this bug. First,
the number of computers without a floppy drive is growing. Second, modern
floppy drives or modern BIOSes, whichever is responsible for the following
effect, seem to abstract from the real format of a floppy. I was really
surprised to see both my drives access a 720 KB floppy more than 10 years old
using the geometry of a 1.44 MB disk and assuming a track size of 18 sectors!
And they succeeded! I knew for sure it was formatted according to the standard
of the time with 9 sectors/track, and I verified that this value was written
on the disk label as well. With such a drive the above bug does not show up,
of course.
Third, a normal GRUB installation of stage 1 and stage 1.5 fits completely
into 18 sectors, so with today's diskettes and drives you do not reach into
the critical area beyond the first track.
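
The third point can be checked mechanically. The following Python sketch
(function names are mine, and the 2880-sector limit is simply the size of a
1.44 MB floppy) searches for the first LBA at which two assumed geometries
produce different CHS addresses. Everything below that LBA is read correctly
no matter which of the two geometries is used.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a 0-based LBA into (cylinder, head, sector); sector is 1-based."""
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector0 = divmod(rest, sectors_per_track)
    return cylinder, head, sector0 + 1


def first_divergent_lba(heads, spt_a, spt_b, limit=2880):
    """Return the lowest LBA where the two geometries disagree, or None."""
    for lba in range(limit):
        if lba_to_chs(lba, heads, spt_a) != lba_to_chs(lba, heads, spt_b):
            return lba
    return None


def fits_in_safe_area(total_sectors, heads, spt_a, spt_b):
    """True if sectors 0..total_sectors-1 map identically in both geometries."""
    divergent = first_divergent_lba(heads, spt_a, spt_b)
    return divergent is None or total_sectors <= divergent
```

For a 2.88 MB drive geometry (36 sectors/track) and a 1.44 MB medium (18
sectors/track), the first divergent LBA is 18, so an installation of up to 18
sectors is unaffected. With a 9 sectors/track medium the safe area shrinks to
9 sectors.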
However, it's still a bug and should be dealt with. I would like to hear
some opinions on this subject.

D. Software Design
Some of the complexity of stage 1 can be put down to the fact that it is
designed as an all-rounder. Whether you use it on a hard disk or on a floppy
disk does not matter at all. This was convenient at a time when only two
media types were bootable. You could simply copy the boot track from a floppy
disk to a hard disk, and everything worked.
But this concept has been broken for quite some time. First, there are
plenty of other bootable media such as CD-ROMs. Second, because of BIOS bugs,
GRUB already patches stage 1 when it writes it to a hard drive, so simple
copying no longer works.
In addition, the tight memory restrictions leave no room for more BIOS bug
workarounds or for enhanced drive support.
So why not step back from this concept and create dedicated stage 1s? The
installer would pick from a pool the one that suits the destination medium.
Since the code could be streamlined, more functionality and richer error
diagnostics could be provided. For instance, a floppy stage 1 could deal with
various special formats. A hard disk stage 1 would no longer die when an
ECC-corrected load occurs. And so on.
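
On the installer side, the selection could be as simple as a table lookup.
This is only a sketch of the idea in Python: the media type names and image
names are hypothetical, and no such images exist in GRUB today.

```python
# All names below are hypothetical and exist only for this illustration.
STAGE1_POOL = {
    "floppy": "stage1_floppy",
    "harddisk": "stage1_hd",
    "cdrom": "stage1_cdrom",
}


def select_stage1(media_type, pool=STAGE1_POOL):
    """Return the dedicated stage 1 image name for the destination medium."""
    try:
        return pool[media_type]
    except KeyError:
        # Refuse to install rather than fall back to an unsuitable image.
        raise ValueError(
            f"no dedicated stage 1 for media type {media_type!r}"
        ) from None
```

Failing loudly for an unknown medium is a deliberate choice in this sketch:
an installer that silently falls back to a generic image would bring back the
all-rounder problem.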
Using such a 'pluggable' stage 1 concept requires a redesign of the current
stage1/stage1.5 interface. IMHO, the current protocol between the two stages
looks a bit strange. Stage 1.5 knows too much about the internals of stage 1,
and, since stage 1 has a loader built in, it should provide a loading service
to other stages.
I'd like to hear some opinions on this too.

Wolf Lammen

_______________________________________________
Bug-grub mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-grub
