[Bug preprocessor/31544] New: PPC object code generation

trog24 at comcast dot net Wed, 11 Apr 2007 19:20:58 -0700

When I upgraded my system (733 MHz PowerPC G4 QuickSilver) from an MPC-7450 to
a 1.4 GHz MPC-7457 on a 133 MHz bus, I started getting DSI exceptions and
system freezes (panics) running MacOS 10.2, then MacOS 10.3, and finally MacOS
10.4.  All the memory (1.5 GB) was tested and replaced where necessary, but the
exceptions and freezes kept occurring.  I finally replaced the MPC-7457 with a
Dual 800 MHz PowerPC G4, after which the DSI exceptions and freezes dropped
dramatically (they still occur rarely, along with some hangs on shutdown).  I
began looking at the code where the exceptions occurred and noticed register
values that did not match what the instruction stream would generate, which led
me to believe there may be an interrupt servicing problem.  As part of the
research, I looked at the entry point of various modules and found an
instruction sequence that raises red flags for anyone who works with ISRs:
namely, the stack is used before the stack pointer is updated, leaving a
potential register corruption problem should an interrupt occur between the two
points.  The instruction stream of concern is as follows:


        mflr    r0
        lis     r2,2
        stmw    r25,-28(r1)
        lis     r9,2
        mr      r28,r4
        mr      r25,r3
        mr      r27,r5
        stw     r0,8(r1)
        stwu    r1,-128(r1)
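
To make the concern concrete, the sketch below lists the words that the stmw in
this sequence writes below r1 before the stwu moves the stack pointer.  It
relies only on the architectural definition of stmw (rS through r31 are stored
to consecutive words starting at the effective address).  The helper name
expand_stmw is hypothetical and used here only for illustration.

```python
def expand_stmw(first_reg, disp, base="r1"):
    """Expand `stmw rS,disp(base)` into equivalent per-register stores.

    stmw stores rS..r31 to consecutive 4-byte words starting at disp(base).
    Returns a list of (register, displacement, base) tuples.
    """
    stores = []
    for i, reg in enumerate(range(first_reg, 32)):
        stores.append((f"r{reg}", disp + 4 * i, base))
    return stores


# The store from the prologue in the report: stmw r25,-28(r1)
stores = expand_stmw(25, -28)
for reg, off, base in stores:
    print(f"stw {reg},{off}({base})")
```

All seven words lie at negative displacements from r1, that is, below the stack
pointer as it stands at function entry.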

The program containing the aforementioned sequence is C++.  Both a2p and gcc
(4.1.1 configured with ../configure -disable-multilib) have the same
instruction sequence, except that r2 and r9 are set to 1.  According to the
MPC-7450 RISC Processor Family Reference Manual, instructions are fetched 8 at
a time with the critical quad word (1st 4 instructions) going directly to the
IQ and the non-critical quad word going to L1I.  The 733 MHz part has a 5.5:1
ratio with the bus, which means 11 CPU cycles per fetch of four instructions
(not including the propagation delay to make the bus request following an L1
and L2 miss).  The first four instructions would use 7 cycles to dispatch as
follows:

cycle 0 - mflr          r0
          lis           r2,2
          stw           r25,-28(r1)
cycle 1 - stw           r26,-24(r1)
          lis           r9,2
cycle 2 - stw           r27,-20(r1)
cycle 3 - stw           r28,-16(r1)
cycle 4 - stw           r29,-12(r1)
cycle 5 - stw           r30,-8(r1)
cycle 6 - stw           r31,-4(r1)

then wait 4 cycles for the next 4 instructions.  The Load/Store Unit (LSU) is a
3-stage pipeline (not 2 as indicated in 7450.md), which implies 4 additional
cycles (stage 3 is the L1D request to store) for r25 to reach L1D, so the store
happens at the same time as the receipt of the non-critical quad word if the
request is honored by the next cycle.  It is the next cycle that would initiate
the fetch of the stwu instruction which would have a propagation delay of 13
cycles before a bus request is made (2 cycles for L1I request and miss, 11
cycles for L2 request and miss).  That leaves a 17 cycle window for an
interrupt to occur with the critical time after the non-critical quad word in
IQ and during the L1/L2 miss sequence.  On the dual 800 MHz (6:1), the 17 cycle
window is expanded to 19 cycles (once a main bus request is made, it does not
matter whether a bus is busy) and on the 7457 (10.5:1), the window is expanded
to 27 cycles.  This window is expected to be expanded to 32 cycles at 1.7 GHz.
The wider the window, the greater the possibility of receiving an interrupt
between memory storage, register setting, and stack pointer update.
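
The clock ratios quoted above can be reproduced with a few lines of arithmetic.
This is a minimal sketch, assuming the nominal 133 MHz bus clock from the
report and that processor multipliers come in half steps.  The figure of two
bus clocks per quad-word fetch is an assumption inferred from the stated 11
cycles at 5.5:1; the report does not say so directly.  The function names are
hypothetical.  The window sizes themselves (17, 19, 27, 32) are the report's
own figures and are not derived here.

```python
BUS_MHZ = 133.0  # nominal bus clock quoted in the report


def bus_ratio(cpu_mhz, bus_mhz=BUS_MHZ):
    """Core-to-bus clock ratio, rounded to the nearest half step."""
    return round(2 * cpu_mhz / bus_mhz) / 2


def quad_word_fetch_cycles(ratio, bus_clocks=2):
    """CPU cycles per four-instruction fetch.

    bus_clocks=2 is an assumption chosen to match the report's
    "11 CPU cycles" at a 5.5:1 ratio.
    """
    return ratio * bus_clocks


systems = [
    ("733 MHz MPC-7450", 733),
    ("dual 800 MHz G4", 800),
    ("1.4 GHz MPC-7457", 1400),
]
for name, mhz in systems:
    r = bus_ratio(mhz)
    cycles = quad_word_fetch_cycles(r)
    print(f"{name}: {r}:1 ratio, {cycles:g} CPU cycles per quad-word fetch")
```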

A far safer instruction stream is to have the stwu as the first instruction
rather than the ninth, which will not change the timing very much because r1's
rename will become available in stage 2 of the LSU.
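
As an illustration of that reordering, the sketch below rebases the two
stack-relative displacements from the original prologue so that they address
the same words once the stwu has already moved r1 down by 128 bytes.  It is a
hand-written sketch of the proposed ordering, not compiler output.  The helper
names rebase and reordered_prologue are hypothetical, and the remaining
instructions are kept in their original relative order as an assumption.

```python
FRAME_SIZE = 128  # bytes allocated by `stwu r1,-128(r1)` in the report


def rebase(disp, frame_size=FRAME_SIZE):
    """Displacement from the new r1 that reaches the word `disp` reached
    from the old r1 (new r1 = old r1 - frame_size)."""
    return disp + frame_size


def reordered_prologue(frame_size=FRAME_SIZE):
    """Prologue from the report with stwu moved to the first slot."""
    return [
        f"stwu    r1,-{frame_size}(r1)",
        "mflr    r0",
        "lis     r2,2",
        f"stmw    r25,{rebase(-28, frame_size)}(r1)",
        "lis     r9,2",
        "mr      r28,r4",
        "mr      r25,r3",
        "mr      r27,r5",
        f"stw     r0,{rebase(8, frame_size)}(r1)",
    ]


for line in reordered_prologue():
    print("        " + line)
```

With this ordering every store targets a non-negative displacement from the
updated r1, so nothing is written below the stack pointer.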


-- 
           Summary: PPC object code generation
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: preprocessor
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: trog24 at comcast dot net
 GCC build triplet: 4.1.1
  GCC host triplet: Dual 800 MHz PowerPC G4 (QuickSilver)
GCC target triplet: powerpc-apple-darwin8.8.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31544
