Stage 1 weak points

Last week I sent in a few patches that correct some minor issues and clear up some corner cases. Because of their limited scope, I felt comfortable simply writing the patches without further discussion. The following is of another kind. The points still outstanding touch the architecture of stage 1, and here I certainly want to discuss matters with the community.

A. Goals

Perhaps I should start with a short description of what stage 1 does when it is loaded and started:

Its primary goal is to load the first 512 bytes of the next stage into RAM and start it. The location of the next stage is given as an LBA offset.

Second, stage 1 supports further loads by providing information such as the disk geometry, so subsequent stages do not have to probe the drives themselves.

Last, should something go wrong, stage 1 displays some diagnostic information and dies.

B. Strategies to fulfill the goals

As you can imagine, the first point is by far the most difficult one. Loading a sector means that you first probe the drive to see how it is best accessed and then order the BIOS to do the loading. Currently, stage 1 supports two loading strategies and even three probing strategies. This is quite a lot for a small program confined to as little as 378 bytes of machine code.

The three probing strategies are:
1. Checking whether the BIOS supports LBA access for the boot drive.
2. Evaluating the cylinder/head/sector geometry of a drive using a special BIOS call.
3. Evaluating the geometry of a floppy by successively assuming smaller and smaller track sizes until reading the last sector of a track succeeds. From the track size one can guess the format of the disk more or less safely.

The two loading strategies are:
1. Building a data structure called a disk address packet and submitting this structure for LBA access.
2. Using the geometry of a floppy or hard disk to transform the LBA address of the following stage into a CHS formatted address and submitting this value for CHS access.

As a bonus, stage 1 finally moves the loaded beginning of the next stage to whatever RAM location the grub-installer told it to use.

C. Bugs

If all of the above worked well, I would say 'wow', but, sadly, that is sometimes not the case. Especially with old hardware, there are some issues that I think need refining.

The weak point is the third probing strategy, the floppy probing.
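
To make the floppy probing and the CHS loading strategy from section B concrete, here is a minimal sketch in C. It is not the stage 1 code, which is assembly; all names (read_sector_fn, probe_sectors_per_track, lba_to_chs, fake_read, fake_media_spt) are hypothetical, the callback merely stands in for a BIOS CHS read, and the candidate list of track sizes is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a BIOS CHS read: returns 0 on success. */
typedef int (*read_sector_fn)(unsigned cyl, unsigned head, unsigned sector);

/* Assumed candidate track sizes, largest first
   (2.88 MB, 1.44 MB, 1.2 MB, 720/360 KB formats). */
const unsigned track_sizes[] = { 36, 18, 15, 9 };

/* Probing strategy 3: assume smaller and smaller track sizes until
   reading the last sector of track 0 succeeds. Returns the sectors
   per track, or 0 if no candidate could be read. */
unsigned probe_sectors_per_track(read_sector_fn read_sector)
{
    size_t i;
    for (i = 0; i < sizeof track_sizes / sizeof track_sizes[0]; i++) {
        if (read_sector(0, 0, track_sizes[i]) == 0)
            return track_sizes[i];
    }
    return 0;
}

struct chs {
    unsigned cylinder;
    unsigned head;
    unsigned sector;   /* 1-based, as the BIOS expects */
};

/* Loading strategy 2: transform an LBA address into a CHS address
   for a given number of heads and sectors per track (spt). */
struct chs lba_to_chs(uint32_t lba, unsigned heads, unsigned spt)
{
    struct chs r;
    assert(heads > 0 && spt > 0);
    r.sector   = lba % spt + 1;
    r.head     = (lba / spt) % heads;
    r.cylinder = lba / (spt * heads);
    return r;
}

/* Simulated media for experiments: only sectors 1..fake_media_spt
   exist on each track. */
unsigned fake_media_spt = 18;

int fake_read(unsigned cyl, unsigned head, unsigned sector)
{
    (void)cyl;
    (void)head;
    return (sector >= 1 && sector <= fake_media_spt) ? 0 : 1;
}
```

With fake_media_spt set to 18, probe_sectors_per_track(fake_read) yields 18. Under that geometry lba_to_chs(18, 2, 18) gives cylinder 0, head 1, sector 1, whereas a track size of 36 would map the same LBA to head 0, sector 19, which does not exist on an 18-sector track. That is why the track size has to describe the media and not merely the drive.
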

IMHO, the current implementation renders this strategy useless (there might be some relation to Bug #482). So we should either remove it completely or activate it in a correct fashion.

What goes wrong here is a misunderstanding of BIOS call INT 0x13, function 8. This call is used to retrieve the geometry of a drive: the drive, not the inserted media. So if the capabilities of drive and media differ (a 1.44 MB floppy in a 2.88 MB drive), GRUB currently uses the wrong geometry to access the floppy disk.

Furthermore, GRUB invokes the floppy probing only if the above BIOS call returns an error. I admit I do not understand why it is programmed that way. The call might fail for two reasons: the drive is not accessible by the BIOS, or the battery of the CMOS RAM is exhausted, so the parameters of the drive could not be read. But in either case, how could reading from a floppy succeed at all, and how could floppy probing help?

Fortunately, modern computers are not that susceptible to this bug.

First, the number of computers without a floppy drive is growing.

Second, modern floppy drives or modern BIOSes, whichever is responsible for the following effect, seem to abstract from the real format of a floppy. I was really surprised to see both my drives accessing a more than 10 years old 720 KB floppy using the geometry of a 1.44 MB disk and assuming a track size of 18 sectors! And they succeeded! I knew for sure it was formatted according to the standard of the time with 9 sectors/track, and I made sure this value was written in the disk label as well. With such a drive the above bug does not show up, of course.

Third, a normal GRUB installation of stage 1 and stage 1.5 fits completely into 18 sectors, so with today's diskettes and drives you do not reach into the critical area beyond the first track.

However, it is still a bug and should be dealt with. I would like to hear some opinions on this subject.

D. Software Design

Some of the complexity of stage 1 can be put down to the fact that it is designed as an all-rounder. Whether you use it on a hard disk or on a floppy disk does not matter at all. This was convenient at a time when there were only two bootable media types. You simply copied the boot track from a floppy disk to a hard disk, and everything worked fine.

But this concept has been broken for quite some time already. First, you now see plenty of other bootable media, such as CD-ROMs. Second, because of BIOS bugs, GRUB already patches stage 1 when it installs it onto a hard drive, so simple copying no longer works anyway. In addition, the tight memory restrictions leave no room for more BIOS bug workarounds or drive-related enhancements.

So why not step back from this concept and create dedicated stage 1s? The installer would pick from a pool the one that suits the destination media. Since the code could be streamlined, more functionality and richer error diagnostics could be provided. For instance, a floppy stage 1 could deal with various special formats. A hard disk stage 1 would no longer die when an ECC-corrected load occurs. And so on.

Using such a 'pluggable' stage 1 concept requires a redesign of the current stage 1/stage 1.5 interface. IMHO, the current protocol between the two stages looks a bit strange. Stage 1.5 knows too much about the interior of stage 1, and, since stage 1 has a loader built in, a loading service should be provided to other stages.

I'd like to hear some opinions on this too.

Wolf Lammen