[This Week]
- Investigate MOL device tree further
  - <Mark> There is also an interesting dump of what seems to be a MOL tree at http://josejx.net/mol/mol-stable/mollib/oftrees/oftree.nw.old (which has a commented-out entry for "AAPL,debug" set to -1). The interesting parts I see here are the "interrupt-controller" node in /chosen and various additional names in the "compatible" property for various devices.
- Progress past quiesce() client interface call
  - At the moment, this function simply closes all USB devices (`usb_exit()`) and sets the Instruction and Data Address Translation bits in the MSR (@agraf: does this cause any side effects?). The stack diagram shows no arguments or return values, and none of the other `ciface` functions make any modifications to the stack other than their arguments or return values.
  - Looks like `quiesce` either hangs on `mtmsr` or makes a jump out of the function.
  - MSR.IR and MSR.DR enable paging for instructions and data respectively. IIRC we disable paging when quiesce gets called because it's the last call that Linux runs before it executes non-OF-aware code.
  - Try to run qemu with -d in_asm,cpu,int -D log and check where the code execution hangs :). There's a good chance we're already in the Mac OS 9 kernel.
  - It would appear that execution has branched off into invalid memory:
      invalid/unsupported opcode: 00 - 00 - 00 (00000000) 00f03000 0
      IN:
      0x00f03000: .long 0x0
  - Awesome. What code gets executed before that? Maybe we shouldn't turn off IR/DR?
  - Actually, looking at the code, the MSR IR/DR part is surrounded by a #if 0 ... #endif pair... maybe something in usb_exit() is breaking things? Possibly try removing it temporarily?
- Patch: Copyright string
- Patch: RTAS node
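For reference on the MSR.IR/MSR.DR discussion in the quiesce() item above, here is a minimal sketch of what turning address translation off and on means at the bit level. The mask values are the standard 32-bit PowerPC MSR positions (IR = 0x20, DR = 0x10); the function names are hypothetical, and the code only computes the new register value. It does not execute `mfmsr`/`mtmsr` and is not the code in quiesce().

```c
#include <stdint.h>

/* Standard PowerPC MSR translation bits (32-bit MSR layout). */
#define MSR_IR 0x20u /* instruction address translation */
#define MSR_DR 0x10u /* data address translation */

/* Hypothetical helper: MSR value with both translation bits cleared
 * (what "disabling paging" would write back via mtmsr). */
uint32_t msr_disable_translation(uint32_t msr)
{
    return msr & ~(MSR_IR | MSR_DR);
}

/* Hypothetical helper: MSR value with both translation bits set. */
uint32_t msr_enable_translation(uint32_t msr)
{
    return msr | MSR_IR | MSR_DR;
}

/* Hypothetical helper: nonzero only if both IR and DR are set. */
int msr_translation_enabled(uint32_t msr)
{
    return (msr & (MSR_IR | MSR_DR)) == (MSR_IR | MSR_DR);
}
```

Once translation is off, the next instruction fetch uses real addresses, so a branch target that was only valid as a virtual address would land in unmapped or zeroed memory. That is one possible reading of the `.long 0x0` trace, not a confirmed diagnosis.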
[Next Week]
- Remove extraneous "interrupts" property from /pci/mac-io
  - I'm having trouble tracking down where the property is actually being set. The mac-io devices are defined in drivers/pci_database.c, but the pci_dev_t struct (drivers/pci_database.h) doesn't appear to have an interrupts field. Much of the mac-io related functionality is defined in drivers/macio.c, but there's no mention of interrupts there, so I'm kind of at a loss.
- Test and debug boot script loader
[Long Term]
- Continue updating project log
- Create and send patches
  - [ ] Patch: Boot script loader
  - [ ] ROM node
- Enable debugging in MOL
  - The installation is now broken after a failed make. Tried reinstalling from Zypper but it couldn't find the config files -- do they have to be added manually?
  - The config files are at a different location when you compile manually, but I don't remember where everything was :). Just run mol with strace; it will tell you where it's looking for things. The current ones should be somewhere in /etc.
[Done]
- Try to set AAPL,Debug in /
  - <mark> Did this have any effect in the end? Note that the casing above is incorrect and the property name should be "AAPL,debug".
  - This does work; it dumps most system calls to the output. Still looking for an actual cause for the error.
  - @cormacobrien2 That sounds great! Errors can sometimes be subtle, e.g. returning a wrong value rather than a missing value, so it would be good to see the output. Does it run in -nographic mode so the output can be pasted into a GitHub gist (similar to what you did for the OS 9 boot script)?
  - @markcaveayland Here's a gist with the output log. Most of the relevant info is at the bottom; it's quite verbose. https://gist.github.com/cormac-obrien/f7b1f0f600dd578fb7a5
  - @cormacobrien2 Hmmm that's interesting - it's complaining about the "interrupts" property on the macio device.
    (goes and digs) According to the copies of the g3beige and g4 device trees that I have here (and indeed the tree at http://josejx.net/mol/mol-stable/mollib/oftrees/oftree.nw.old), the mac-io node itself shouldn't have *any* interrupts properties. I wonder if that is what the error message is trying to say?
  - @markcaveayland That's the baffling part. Here's the output of .properties after the boot fails: https://gist.github.com/cormac-obrien/4fea2f03e69e2a8c94b8
  - @cormacobrien2 That part is easy - for a PCI device such as macio, the values for the properties are extracted directly from the PCI configuration (as per the OF spec). First thing to check is whether removing the "interrupts" property will get things further - can you boot direct to the Forth prompt, remove the property manually with "delete-property" and then "boot" manually to see if it gets further or not?
  - @markcaveayland That appears to work -- got much further in the boot process than we have previously. Seems like I misinterpreted the error message, thinking the lack of interrupts was a problem, but it was actually that it had too many. Thanks! Now on to the next error. I've updated the gist with the new log, which gives "Stopping at end of FCODE, due to fatal error (see above)." The only thing I can "see above" at the moment is on line 6516, with a "possible argument error". Obviously there's a lot more to look through, so I'll keep searching.
  - @cormacobrien2 Excellent! I wonder if the "Fatal Error" message is a final error check before execing the kernel, i.e. if the error flag has been set before jumping into the kernel, then fail with the error first rather than carrying on? Given that the remaining fatal error for -M mac99 is about being unable to find the /rtas node, I would suggest trying to fake one up and see if that helps.
    It is particularly telling that most of the /rtas words appear to be "made up" if you look at http://josejx.net/mol/mol-stable/mollib/oftrees/oftree.nw.old for inspiration.
  - @cormacobrien2 Also I noticed a subtle issue at line 474 of your gist which may be relevant here - if not yet then very soon :)
  - @markcaveayland It looks like adding the device statically doesn't agree with instantiate-rtas (implemented in C as arch/ppc/qemu/methods.c:rtas_instantiate). rtas_instantiate doesn't appear to actually affect the device tree, but it looks to be reserving a page and copying to it from of_rtas_start. Is there any way to force a device's position in memory?
  - @cormac_obrien Hmmm rtas is part of CHRP rather than Open Firmware, so this sounds like a question for @agraf. Alex, any ideas?
  - RTAS is the "Runtime Services" (Run-Time Abstraction Services): some code allocated in memory that an OS can jump to in order to do random firmware operations. It's very similar to BIOS or EFI Runtime Services. The device tree part just tells the OS which address to call when it wants to do an RTAS call.
  - Setting the CONFIG_RTAS flag would probably have been a good idea *facepalm*, although it did still require the fake properties. @markcaveayland How do I go about setting that flag for the build? I'm defining it in the source file and it works, but I assume there's an XSL file I should be modifying somewhere; I just don't know where. And the issue you mentioned is indeed relevant, but seems trivial enough. (famous last words...)
  - @cormac_obrien That's easy - just enable it in config/examples/ppc_config.xml. But note that the config XML files are only read by the switch-arch script, so you'll need to remove all existing obj-* target directories, re-run config/scripts/switch-arch to regenerate them from the updated XML, and finally run make once again to generate the new binaries.
  - Got it. A couple of cool developments, one being the attached AAPL,debug printout (see the last bit) and the other being this: "Off to MacOS.
    The next (and last) call into OpenFirmware is quiesce()." Of course, it hangs here, but it's still pretty cool to see. I've updated the log at https://gist.github.com/cormac-obrien/f7b1f0f600dd578fb7a5 again with the new execution.
  - @cormac_obrien Take a look at ob_pci_add_properties() in drivers/pci.c to see how the interrupts are set up. As for why the boot hangs, I can think of 2 reasons: 1) Are there any missing stack parameters to be returned by "quiesce"? The last line reads ">> of_client_interface return:" with no return value, so maybe something is amiss here? 2) Check out the very last line of your screenshot above ;)
  - @markcaveayland 2) is definitely not it; I cleared that bit as soon as I saw it. Not sure if you've looked at the comments in quiesce(), but they're less than reassuring :D I'll take a look and see if there could be something missing.
  - @cormac_obrien Cool. From memory, I believe that the CIF call wrapper in client.fs "consumes" an extra stack parameter from the underlying word in order to indicate success or failure, before passing the remaining stack args back to the client. Check out other CIF words and their stack diagrams whilst stepping through the debugger to see what needs to be done here.
  - Since I've successfully set the "AAPL,debug" property, I'm going to move this card to "Done" and open a new one regarding the quiesce() call.
- Patch: Adler-32 functionality
- Send out intro email
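As a reference point for the Adler-32 patch listed above, here is a minimal sketch of the checksum as specified in RFC 1950. This is not the code from the patch itself; the function name `adler32_sketch` is hypothetical, and the straightforward per-byte modulo is chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER32_MOD 65521u /* largest prime below 2^16 */

/* Hypothetical helper: Adler-32 checksum of len bytes at data. */
uint32_t adler32_sketch(const uint8_t *data, size_t len)
{
    uint32_t a = 1; /* running sum of bytes, seeded with 1 */
    uint32_t b = 0; /* running sum of the successive values of a */

    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        a = (a + data[i]) % ADLER32_MOD;
        b = (b + a) % ADLER32_MOD;
    }
    /* b occupies the high 16 bits, a the low 16 bits. */
    return (b << 16) | a;
}
```

For example, `adler32_sketch((const uint8_t *)"Wikipedia", 9)` yields 0x11E60398, and an empty input yields 1.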