I am posting the following information with permission from HP support, in the hope that it may be useful for future GRUB developer reference. Please note that I do not subscribe to the GRUB mailing list, so cc: me directly if any reply is required.

Summary:

When using GRUB to chain-load from one device to another, the HP BIOS used in current DL120/DL360 G7 servers reports "Illegal Opcode" and shows a red crash-dump screen. This failure did not occur on the previous G6 generation of the same models, which used an AMI/Phoenix BIOS.

References:

HP support case 4635415916, opened for additional clarification in reference to HP customer advisory number c02695572
http://bizsupport1.austin.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02695572&lang=en&cc=us&taskId=101&prodSeriesId=4091408&prodTypeId=15351

Root cause analysis:

HP level 3 engineering identified the root cause as follows:

_start_quoted_text_

HP Level-3 engineering have found that the HP BIOS on the DL120 G7 is not causing the red screen.

GRUB loads its own INT13 handler in the interrupt vector table, so it will now intercept all INT13 calls. Some time after it does that, GRUB does some type of memory copy operation which overwrites the data at the address where GRUB stores the INT13 handler code. As a result, on the next INT13 call in GRUB, the interrupt handler is no longer there, so the processor just starts to execute whatever data overwrote the INT13 handler code.

Here is how the red screen happens: when the processor executes an illegal instruction (as when it tries to execute whatever is in the overwritten INT13 handler), the processor raises an interrupt, which the BIOS then handles by printing the red screen with the register dump and the message. So our BIOS just prints out the red screen, but the cause of the red screen is GRUB.

The specific scenario which leads to this is identified as follows:

1) GRUB installs its own INT13 handler.
2) Near the end of the chain-loading process, GRUB loads an image of the Linux kernel into memory, which wipes out its INT13 handler.
3) Right before GRUB transfers control to the kernel to boot, GRUB calls a function to turn off the floppy drive.
4) The floppy code then makes an INT13 call to the handler, which has been overwritten by the kernel, and this results in the red screen.

The problem seems to be that GRUB made assumptions about the memory layout in our system which are not accurate. HP systems that use HP-developed BIOSes instead of outsourced (AMI) BIOSes use more of a memory area called the EBDA than a typical system does. As a result, GRUB assumes there is memory that it can safely use instead of properly calculating an area of safe memory to use. That is probably why GRUB worked on the other systems and fails on G7.

_end_quoted_text_

Regards,
Iain Barker - Platform Engineering, Acme Packet.
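
Illustrative sketch (not GRUB's actual code and not part of HP's analysis): the quoted text says a loader should calculate a safe memory area rather than assume one. On a conventional PC BIOS, the BIOS Data Area at physical address 0x400 holds the EBDA segment as a word at offset 0x0E and the conventional memory size in KiB as a word at offset 0x13. The C below shows how a loader might take the lower of the two as its ceiling before copying an image. The function names are hypothetical, and the BDA is passed in as a byte buffer so the logic can be exercised outside real mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets within the BIOS Data Area (BDA), located at physical 0x400. */
#define BDA_EBDA_SEGMENT_OFF 0x0E     /* word: real-mode segment of the EBDA */
#define BDA_BASE_MEM_KB_OFF  0x13     /* word: conventional memory size, KiB */
#define CONVENTIONAL_TOP     0xA0000u /* end of the 640 KiB region */

/* Read a little-endian 16-bit word from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: return the first address a loader should treat as
 * off limits, taking the lowest of the EBDA base, the reported base memory
 * size, and the 640 KiB boundary. A zero field is treated as "not reported".
 */
uint32_t usable_low_memory_top(const uint8_t *bda)
{
    assert(bda != NULL);

    uint32_t top  = CONVENTIONAL_TOP;
    uint32_t ebda = (uint32_t)read_le16(bda + BDA_EBDA_SEGMENT_OFF) << 4;
    uint32_t base = (uint32_t)read_le16(bda + BDA_BASE_MEM_KB_OFF) * 1024u;

    if (ebda != 0 && ebda < top)
        top = ebda;
    if (base != 0 && base < top)
        top = base;
    return top;
}

/* Hypothetical helper: 1 if [start, start + len) lies below the ceiling. */
int region_is_safe(const uint8_t *bda, uint32_t start, uint32_t len)
{
    uint32_t top = usable_low_memory_top(bda);
    return start <= top && len <= top - start;
}
```

A loader using a check like this would refuse, or relocate, any copy whose destination range fails region_is_safe(). A resident handler placed at the top of conventional memory would also need the base memory word lowered to cover it, so that later copies stay below it.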