On Tue, Nov 10, 2009 at 11:36:44AM -0600, Peter Tyser wrote:
> > Ok, here are my results, this is on a 8349EMDS-derived board. My
> > 8349EMDS eval board doesn't have ECC memory.
> >
> > 1) It might be nice to have something to print the current injection
> > registers. It is not a big deal, anyone using this should be an expert
> > anyway.
>
> Thanks for the feedback. I can add a printing of the current injection
> values when "ecc inject" is run if others would like.
>
> > 2) ecc inject off didn't seem to work, see the following capture:
> >
> > => ecc info
> > No ECC errors have occurred
> > => ecc inject low 0x1
> > => ecc info
> >
> > WARNING: ECC error in DDR Controller 0
> > Addr: 0x0_0ff7ae40
> > Data: 0x0fffdf9c_0ff7aed1 ECC: 0x81
> > Expect: 0x0fffdf9c_0ff7aed0 ECC: 0x81
> > Net: DATA0
> > Syndrome: 0x3b
> > Single-Bit errors: 0x1e
> > Attrib: 0x01002001
> > Detect: 0x80000004 (MME, SBE)
> >
> > => ecc inject off
> >
> > # Ok, now error injection is off, I still expect some errors to be
> > # present in the error registers
> >
> > => ecc info
> >
> > WARNING: ECC error in DDR Controller 0
> > Addr: 0x0_0ff7ae1c
> > Data: 0x0fffdf9c_0ff7d2a1 ECC: 0xe4
> > Expect: 0x0fffdf9c_0ff7d2a0 ECC: 0xe4
> > Net: DATA0
> > Syndrome: 0x3b
> > Single-Bit errors: 0xd1
> > Attrib: 0x01003001
> > Detect: 0x80000004 (MME, SBE)
> >
> > # And there was the error. Now, I don't expect any more errors to
> > # be present, after all, injection is disabled.
> > #
> > # But there is one! Why?
>
> I believe what's happening is:
> 1. You turn error injection on
> 2. Every time you perform a DRAM write, the value written has an ECC
>    error
> 3. You write to DRAM lots of times, in lots of locations
> 4. You turn error injection off
> 5. There are still lots of ECC errors residing in DRAM that you discover
>    later when you read from "corrupted" memory locations
>
> So in theory, unless you scrub your memory, you might uncover lots more
> ECC errors later.
>
> As an easily reproducible example try:
> > ecc inject low 1; mw.l 0x100000 0xbeefba11 0x800000; ecc inject off
> > ecc info
> > ecc info
> > md 0x100000
> > ecc info
> > ecc info
> > md 0x200000
> ...
>
> The majority of the above ecc errors could be cleared by running the
> following command with ecc injection off:
> mw.l 0x100000 0xbeefba11 0x800000
>
> > => ecc info
> >
> > WARNING: ECC error in DDR Controller 0
> > Addr: 0x0_0fff8a0c
> > Data: 0x0fff8a00_0fff8a01 ECC: 0xff
> > Expect: 0x0fff8a00_0fff8a00 ECC: 0xff
> > Net: DATA0
> > Syndrome: 0x3b
> > Single-Bit errors: 0x04
> > Attrib: 0x01003001
> > Detect: 0x00000000
> > =>
> >
> > # Note that I keep seeing ecc errors until I run the command:
> > # ecc inject low 0
>
> Hmm... "ecc inject off" should have the same effect as "ecc inject low
> 0". Is there a chance some of the ECC errors still remaining in DRAM
> are the culprit?
>
> > # Why did it take two runs of ecc info to clear all of the errors?
>
> This is probably the same issue as above - lots of errors are injected and
> there's no saying when exactly they'll turn up.
>
> > Other than the above strangeness, everything is working great on my 83xx
> > board. I think the new output is pretty nice. It serves my purposes
> > equally well to the old code.
>
> Thanks for trying the changes out,
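
For illustration only, the five steps quoted above (inject, write, turn
injection off, read later, rewrite to scrub) can be modelled with a small
toy simulation. This is a sketch under loud assumptions: it is not U-Boot
code, it does not model the real DDR controller registers, and every name
in it (ToyEccMemory, inject_mask, ecc_info, fill) is hypothetical. It only
shows why an error written while injection was on can surface on a later
read, and why rewriting the region with injection off removes it.

```python
# Toy model of latent ECC errors. NOT U-Boot code and NOT a model of the
# real DDR controller; all names here are hypothetical.

class ToyEccMemory:
    def __init__(self):
        self.cells = {}        # addr -> (raw stored value, value the ECC was computed for)
        self.inject_mask = 0   # non-zero: every write is stored with these bits flipped
        self.captured = None   # first unreported error, loosely like a capture register

    def write(self, addr, value):
        # With injection on, the stored data no longer matches its ECC,
        # so the cell carries a latent error until it is rewritten.
        self.cells[addr] = (value ^ self.inject_mask, value)

    def read(self, addr):
        raw, expected = self.cells[addr]
        if raw != expected and self.captured is None:
            self.captured = (addr, raw, expected)
        return expected        # a single-bit error is corrected on read

    def ecc_info(self):
        # Report and clear the captured error, loosely like "ecc info".
        report, self.captured = self.captured, None
        return report


mem = ToyEccMemory()
BASE, COUNT, PATTERN = 0x100000, 16, 0xBEEFBA11

def fill():
    # Stand-in for "mw.l": write PATTERN to COUNT consecutive 32-bit words.
    for i in range(COUNT):
        mem.write(BASE + 4 * i, PATTERN)

mem.inject_mask = 1            # like "ecc inject low 1"
fill()                         # writes made while injection is on
mem.inject_mask = 0            # like "ecc inject off"

first = mem.ecc_info()         # nothing has been read yet in this model
value = mem.read(BASE)         # a later read touches a corrupted cell
after_read = mem.ecc_info()    # the latent error surfaces only now

fill()                         # rewrite with injection off (scrub)
mem.read(BASE)
after_scrub = mem.ecc_info()

print(first, after_read, after_scrub)
```

In this model nothing reads memory until the explicit read, so the first
report is empty. On a real board the boot loader itself is presumably
reading DRAM all the time, which would explain errors turning up at
unrelated addresses right after injection is turned off.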

Ok, this makes perfect sense. I didn't think about the possibility of
latent memory errors. :)

Here is a run using your instructions above. Keeping the possibility of
latent memory errors in mind, the behavior seems correct to me. You're
free to add my Tested-by if you'd like.

=> ecc inject low 1
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc inject off
=> ecc info

WARNING: ECC error in DDR Controller 0
Addr: 0x0_0ff7ae40
Data: 0x0fffdf9c_0ff7aed1 ECC: 0x81
Expect: 0x0fffdf9c_0ff7aed0 ECC: 0x81
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x56
Attrib: 0x01002001
Detect: 0x80000004 (MME, SBE)

=> ecc info

WARNING: ECC error in DDR Controller 0
Addr: 0x0_0ff7ad08
Data: 0x0ffd594c_0000087f ECC: 0x91
Expect: 0x0ffd594c_0000087e ECC: 0x91
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x01
Attrib: 0x01003001
Detect: 0x00000000

=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred
=> md 0x100000 10
00100000: beefba11 beefba11 beefba11 beefba11    ................
00100010: beefba11 beefba11 beefba11 beefba11    ................
00100020: beefba11 beefba11 beefba11 beefba11    ................
00100030: beefba11 beefba11 beefba11 beefba11    ................
=> ecc info

WARNING: ECC error in DDR Controller 0
Addr: 0x0_0010003c
Data: 0xbeefba11_beefba10 ECC: 0x7b
Expect: 0xbeefba11_beefba11 ECC: 0x7b
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x13
Attrib: 0x01002001
Detect: 0x00000000

=> ecc info
No ECC errors have occurred
=> md 0x200000 10
00200000: beefba11 beefba11 beefba11 beefba11    ................
00200010: beefba11 beefba11 beefba11 beefba11    ................
00200020: beefba11 beefba11 beefba11 beefba11    ................
00200030: beefba11 beefba11 beefba11 beefba11    ................
=> ecc info

WARNING: ECC error in DDR Controller 0
Addr: 0x0_0020003c
Data: 0xbeefba11_beefba10 ECC: 0x7b
Expect: 0xbeefba11_beefba11 ECC: 0x7b
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x10
Attrib: 0x01002001
Detect: 0x00000000

=> ecc info
No ECC errors have occurred
=> mw.l 0x100000 0xbeefba11 0x800000
=> ecc info

WARNING: ECC error in DDR Controller 0
Addr: 0x0_001007c8
Data: 0xbeefba11_beefba10 ECC: 0x7b
Expect: 0xbeefba11_beefba11 ECC: 0x7b
Net: DATA0
Syndrome: 0x3b
Single-Bit errors: 0x06
Attrib: 0x01003001
Detect: 0x80000004 (MME, SBE)

=> ecc info
No ECC errors have occurred
=> ecc info
No ECC errors have occurred

Ira