When I upgraded my system (a 733 MHz PowerPC G4 QuickSilver with an MPC-7450) to a 1.4 GHz MPC-7457 on a 133 MHz bus, I started getting DSI exceptions and system freezes (panics) running MacOS 10.2, then MacOS 10.3, and finally MacOS 10.4. All the memory (1.5 GB) was tested and replaced where necessary, but the exceptions and freezes kept occurring. I finally replaced the MPC-7457 with a Dual 800 MHz PowerPC G4, after which the DSI exceptions and freezes dropped dramatically (they still occur rarely, along with some hangs on shutdown).

I began looking at the code where the exceptions occurred and noticed register values that did not match what the instruction stream would generate, which led me to believe there may be an interrupt servicing problem. As part of the research, I looked at the entry point of various modules and found an instruction sequence that raises flags for anyone who works with ISRs: the stack is used before the stack pointer is updated, leaving a potential register corruption problem should an interrupt occur between the two points. The instruction stream of concern is as follows:

    mflr  r0
    lis   r2,2
    stmw  r25,-28(r1)
    lis   r9,2
    mr    r28,r4
    mr    r25,r3
    mr    r27,r5
    stw   r0,8(r1)
    stwu  r1,-128(r1)

The program containing this sequence is written in C++. a2p and gcc (4.1.1, configured with ../configure --disable-multilib) have the same instruction sequence, except that r2 and r9 are set to 1.

According to the MPC-7450 RISC Processor Family Reference Manual, instructions are fetched 8 at a time, with the critical quad word (the first 4 instructions) going directly to the IQ and the non-critical quad word going to L1I. The 733 MHz part has a 5.5:1 ratio with the bus, which means 11 CPU cycles per fetch of four instructions (not including the propagation delay to make the bus request following an L1 and L2 miss). The first four instructions would take 7 cycles to dispatch, because the stmw expands into seven word stores:

    cycle 0 - mflr r0   lis r2,2   stw r25,-28(r1)
    cycle 1 - stw r26,-24(r1)   lis r9,2
    cycle 2 - stw r27,-20(r1)
    cycle 3 - stw r28,-16(r1)
    cycle 4 - stw r29,-12(r1)
    cycle 5 - stw r30,-8(r1)
    cycle 6 - stw r31,-4(r1)

The processor then waits 4 cycles for the next 4 instructions.

The Load/Store Unit (LSU) is a 3 stage pipeline (not 2 as indicated in 7450.md), which implies 4 additional cycles for r25 to reach L1D (stage 3 is the L1D request to store). The store therefore lands at about the same time as the non-critical quad word is received, provided the request is honored by the next cycle. It is that next cycle that would initiate the fetch of the stwu instruction, which has a propagation delay of 13 cycles before a bus request is made (2 cycles for the L1I request and miss, 11 cycles for the L2 request and miss). That leaves a 17 cycle window for an interrupt to occur, with the critical period falling after the non-critical quad word is in the IQ and during the L1/L2 miss sequence. On the dual 800 MHz (6:1), the 17 cycle window is expanded to 19 cycles (once a main bus request is made, it does not matter whether a bus is busy), and on the 7457 (10.5:1), the window is expanded to 27 cycles.
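
The fetch arithmetic above can be reproduced with a short Python sketch. It only restates the figures given in this report; it is not a model of the 7450 pipeline. The function names are made up for illustration, and the "two bus clocks per four-instruction fetch" constant is an assumption inferred from the 5.5:1 ratio yielding 11 CPU cycles. The idle-cycle figures for the 800 MHz and 1.4 GHz parts further assume that dispatch still takes 7 cycles there.

```python
# Core-to-bus clock ratios quoted in the report (133 MHz bus).
RATIOS = {
    "733 MHz MPC-7450": 5.5,
    "Dual 800 MHz G4": 6.0,
    "1.4 GHz MPC-7457": 10.5,
}

# Assumption: one four-instruction fetch occupies two bus clocks,
# inferred from "5.5:1 ... 11 CPU cycles per fetch" in the report.
BUS_CLOCKS_PER_FETCH = 2

# From the report: the first four instructions take 7 cycles to dispatch,
# because stmw r25 expands into seven word stores.
DISPATCH_CYCLES = 7


def cpu_cycles_per_fetch(ratio):
    """CPU cycles spent waiting on one four-instruction fetch from the bus."""
    return ratio * BUS_CLOCKS_PER_FETCH


def idle_cycles_after_dispatch(ratio):
    """Cycles with nothing left to dispatch before the next fetch arrives."""
    return max(0.0, cpu_cycles_per_fetch(ratio) - DISPATCH_CYCLES)


if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        fetch = cpu_cycles_per_fetch(ratio)
        idle = idle_cycles_after_dispatch(ratio)
        print(f"{name}: {fetch:g} cycles per fetch, {idle:g} idle after dispatch")
```

For the 733 MHz part this gives the 11 cycles per fetch and the 4 cycle wait described above. The higher ratios lengthen the wait, which is consistent with the window growing as the core clock rises relative to the bus.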

The interrupt window is expected to grow to 32 cycles at 1.7 GHz. The wider the window, the greater the possibility of receiving an interrupt between the memory stores, the register setup, and the stack pointer update. A far safer instruction stream would have the stwu as the first instruction rather than the ninth. This would not change the timing very much, because r1's rename becomes available in stage 2 of the LSU.

-- 
           Summary: PPC object code generation
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: preprocessor
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: trog24 at comcast dot net
 GCC build triplet: 4.1.1
  GCC host triplet: Dual 800 MHz PowerPC G4 (QuickSilver)
GCC target triplet: powerpc-apple-darwin8.8.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31544