RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-26 Thread Paul Walmsley
Hi,

On Tue, 23 Jun 2009, Woodruff, Richard wrote:

  From: Paul Walmsley [mailto:p...@pwsan.com]
  Sent: Tuesday, June 23, 2009 3:05 PM
 
  Looks like the significant difference is the use of CLKCTRL = 0x2 (+
  AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that
  causes this problem?  Perhaps the problem is partially related to PWDENA =
  1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER
  register value and see if the problem reproduces.
 
 It is my guess not for PWDENA as it was reported on a system with that 
 not set.  Yes that should not enabled per errata.  Anyway it only does 
 something in OFF type scenarios where you don't use self-refresh. 
 Typical use cases don't seem to be using this.
 
 There are a few unlock events which are best for any focused test. Most 
 are supposed to auto-relock but ... I've heard of some validation tests 
 in a few areas maybe they will yield something.

A quick note: in stress testing here, was able to confirm that SDRC_POWER 
= 0x000200AD causes problems under memory load and intensive CORE DVFS.  
SDRC_POWER = 0x0002009D seems to work fine.  The difference is CLKCTRL = 1 
(works) vs. CLKCTRL = 2 (fails).  According to the TRM, CLKCTRL = 2 
enables self-refresh when AUTOCOUNT reaches 0.  There is probably some 
race between AUTOCOUNT and the CORE DVFS SRAM code, or it is tickling some 
SDRC bug.  Will look into this further later.

By the way, there appears to be a TRM bug in the Dynamic Power Saving 
Configurations table in the SDRC chapter.  The External SDRC CLK column 
should have after AUTOCOUNT expiration for CLKCTRL = 1 rows.


- Paul
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-26 Thread Woodruff, Richard
Hi Paul,

 On Tue, 23 Jun 2009, Woodruff, Richard wrote:

   From: Paul Walmsley [mailto:p...@pwsan.com]
   Sent: Tuesday, June 23, 2009 3:05 PM
 
   Looks like the significant difference is the use of CLKCTRL = 0x2 (+
   AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that
   causes this problem?  Perhaps the problem is partially related to PWDENA =
   1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER
   register value and see if the problem reproduces.
 
  It is my guess not for PWDENA as it was reported on a system with that
  not set.  Yes that should not enabled per errata.  Anyway it only does
  something in OFF type scenarios where you don't use self-refresh.
  Typical use cases don't seem to be using this.
 
  There are a few unlock events which are best for any focused test. Most
  are supposed to auto-relock but ... I've heard of some validation tests
  in a few areas maybe they will yield something.

 A quick note: in stress testing here, was able to confirm that SDRC_POWER
 = 0x000200AD causes problems under memory load and intensive CORE DVFS.
 SDRC_POWER = 0x0002009D seems to work fine.  The difference is CLKCTRL = 1
 (works) vs. CLKCTRL = 2 (fails).  According to the TRM, CLKCTRL = 2
 enables self-refresh when AUTOCOUNT reaches 0.  There is probably some
 race between AUTOCOUNT and the CORE DVFS SRAM code, or it is tickling some
 SDRC bug.  Will look into this further later.

 By the way, there appears to be a TRM bug in the Dynamic Power Saving
 Configurations table in the SDRC chapter.  The External SDRC CLK column
 should have after AUTOCOUNT expiration for CLKCTRL = 1 rows.

That is an interesting result.

My suspicion also has been with some kind of self-refresh race and DLL 
lock/unlock conditions.  DVFS would stress the DLL lock/unlock path.

In your test did you have a crash or just find the DLL did not relock.  The 
signature I'm looking for is not locking.  A crash in this particular test 
might point more to some other issue in sequence.

On another thread it was noted you had optimized out a needed delay after M2 
divider change for DVFS.  Fixing this might change behavior of this test.  
Stress testing in older code bases along with simulation result showed the 
delay was necessary.

Regards,
Richard W.

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-23 Thread Paul Walmsley
Hello Richard, 

On Fri, 19 Jun 2009, Woodruff, Richard wrote:

  From: Paul Walmsley [mailto:p...@pwsan.com]
  Sent: Thursday, June 18, 2009 4:07 PM
 
  On Wed, 17 Jun 2009, Woodruff, Richard wrote:
 
   I've only seen the condition along side of very aggressive SDRC_POWER
   settings.
 
  Could you send over the SDRC_POWER settings that cause this problem?
 
 This is one pulled for a custom board:
 
 # ./readmem 0x6d70
 0x000200AD

Thanks.

According to the 34xx TRM Table 11-178 this sets the following: (asterisks 
mark differences from the current linux-omap code base.)

  WAKEUPPROC = 0
* AUTOCOUNT = 0x200 
* SRFRONRESET = 1
  SRFRONIDLEREQ = 0
* CLKCTRL = 2
  EXTCLKDIS = 1
  PWDENA = 1
  PAGEPOLICY = 1

Looks like the significant difference is the use of CLKCTRL = 0x2 (+ 
AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that 
causes this problem?  Perhaps the problem is partially related to PWDENA = 
1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER 
register value and see if the problem reproduces.


- Paul
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-19 Thread Woodruff, Richard

 From: Paul Walmsley [mailto:p...@pwsan.com]
 Sent: Thursday, June 18, 2009 4:07 PM

Hi Paul,

 On Wed, 17 Jun 2009, Woodruff, Richard wrote:

  I've only seen the condition along side of very aggressive SDRC_POWER
  settings.

 Could you send over the SDRC_POWER settings that cause this problem?

This is one pulled for a custom board:

# ./readmem 0x6d70
0x000200AD

Regards,
Richard W.

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Kevin Hilman
Woodruff, Richard r-woodru...@ti.com writes:

 Hi Paul,

 Hm, seems my mailing is out to lunch today...  I'll be worse than normal in 
 protocol and top post.

 I've seen this issue on a couple custom boards.  In these instances
 I've not had access to boot loaders.  I would expect this issue
 would show up in the wild in some devices.  It might be ok to move
 to at least init.  Having it at deadlock spot is a bit more
 conservative.

Richard, since Paul and I cannot reproduce this currently, would you be able to 
test
if a fix at init fixes the problems that you've seen?

Kevin

 I've only seen the condition along side of very aggressive SDRC_POWER 
 settings. Its my guess beagle doesn't enable all those features yet.  Unless 
 you have a bunch of beagle accessories attached, I'm not sure how good a 
 reference beagle is for system DVFS validation.

 I've not taken stat's to see how much it happens. The alternate to the 
 latency spike is a dead lock which is clealry not wanted.  Net-Net I see this 
 as more a plus than a minus in current form.

 If you have some time you might expirment with beagle and see if you can 
 trigger the condition.

 Regards,
 Richard W.

 -Original Message-
 From: Paul Walmsley [mailto:p...@pwsan.com]
 Sent: Wednesday, June 17, 2009 3:04 AM

 So, what is different about your setup?  The usual suite of questions:

 - What chip revisions/boards does this affect?

 - Is this specific to certain bootloaders?

 - Has anyone else out there seen this problem?

 Rather than working around an occasional DLL failure to lock, is it possible 
 to reset the DLL earlier in the boot process, as Richard's commit message 
 suggests?  That would be preferable to this approach.  A back-of-the-envelope 
 assessment suggests that this patch could cause unpredictable, additional 1.5 
 millisecond latency spikes whenever the workaround triggers (since OCM RAM is 
 marked uncacheable).
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Woodruff, Richard
Mike,

From: Mike Chan [mailto:m...@android.com]
Sent: Thursday, June 18, 2009 1:42 PM


+/* Kick DLL state machine if lock not started */
+kick_dll:
+   ldr r4, sdrc_dlla_ctrl  /* get dlla addr */
+   ldr r5, [r4]/* grab value */
+   mov r6, r5  /* save value */

Richard, could this be done in one instruction, eliminating the need for the 
r5 temporary register? Or was this done intentionally?

Its on purpose I suppose, perhaps there is an optimization.

Above I want to make sure I restore the value I came in with for dlla_ctrl.  I 
save off value in r5 for this.

Below I clear and set dllidle bit.  I don't take the time to figure out if it 
was set or not coming into function.  I saved off value in r5 to allow for 
this.  It is legal for it to be set or cleared in normal operation.

The below set/clear needs to happen as 2 writes.  The 3rd write is a restore.  
You could be mindful of input value and perhaps not do a 3rd write but my guess 
is that will take more code then it saves.

+   orr r6, r6, #0x10   /* dllidle on */
+   str r6, [r4]
+   dsb
+   bic r6, #0x10   /* dllidle off */
+   str r6, [r4]
+   dsb
+   str r6, [r4]/* restore old value */
+   b wait_dll_lock_timed
+

Regards,
Richard W.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Woodruff, Richard

 From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
 Sent: Thursday, June 18, 2009 10:03 AM

  I've seen this issue on a couple custom boards.  In these instances
  I've not had access to boot loaders.  I would expect this issue
  would show up in the wild in some devices.  It might be ok to move
  to at least init.  Having it at deadlock spot is a bit more
  conservative.

 Richard, since Paul and I cannot reproduce this currently, would you be able
 to test
 if a fix at init fixes the problems that you've seen?

I did up the fix at this spot for a couple reasons.
-1- This is where it was locked up...
-2- This kind of operation needs to happen in uncached sram

Doesn't doing at 1st init require pushing of another sram?  Also as its not 
conclusive what causes it to enter this state other than SDRC_POWER association 
its safer to do it here.

I think the boot loader should have tried to do this correctly.  But I really 
don't know what it did.

Regards,
Richard W.


--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Wang Sawsd-A24013
 

 -Original Message-
 From: linux-omap-ow...@vger.kernel.org 
 [mailto:linux-omap-ow...@vger.kernel.org] On Behalf Of 
 Woodruff, Richard
 Sent: 2009年6月19日 2:55
 To: Mike Chan; Kevin Hilman
 Cc: linux-omap@vger.kernel.org; Mike Chan
 Subject: RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state 
 machine and kick
 
 Mike,
 
 From: Mike Chan [mailto:m...@android.com]
 Sent: Thursday, June 18, 2009 1:42 PM
 
 
 +/* Kick DLL state machine if lock not started */
 +kick_dll:
 +   ldr r4, sdrc_dlla_ctrl  /* get dlla addr */
 +   ldr r5, [r4]/* grab value */
 +   mov r6, r5  /* save value */
 
 Richard, could this be done in one instruction, eliminating 
 the need for the r5 temporary register? Or was this done 
 intentionally?
 
 Its on purpose I suppose, perhaps there is an optimization.
 
 Above I want to make sure I restore the value I came in with 
 for dlla_ctrl.  I save off value in r5 for this.
 
 Below I clear and set dllidle bit.  I don't take the time to 
 figure out if it was set or not coming into function.  I 
 saved off value in r5 to allow for this.  It is legal for it 
 to be set or cleared in normal operation.
 
 The below set/clear needs to happen as 2 writes.  The 3rd 
 write is a restore.  You could be mindful of input value and 
 perhaps not do a 3rd write but my guess is that will take 
 more code then it saves.
 
 +   orr r6, r6, #0x10   /* dllidle on */
 +   str r6, [r4]
 +   dsb
 +   bic r6, #0x10   /* dllidle off */
 +   str r6, [r4]
 +   dsb
 +   str r6, [r4]/* restore old value */

Richard, I think here it should be: str r5, [r4], not r6.
I modifed this when I was verifying your patch.

 +   b wait_dll_lock_timed
 +
 
 Regards,
 Richard W.
 --
 To unsubscribe from this list: send the line unsubscribe 
 linux-omap in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Woodruff, Richard
Hi,

 From: Wang Sawsd-A24013 [mailto:cqw...@motorola.com]
 Sent: Thursday, June 18, 2009 2:27 PM
  +   str r6, [r4]/* restore old value */

 Richard, I think here it should be: str r5, [r4], not r6.
 I modifed this when I was verifying your patch.

You are correct that was a bug.  Thanks.

Regards,
Richard W.


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Paul Walmsley
Hi Richard,

On Wed, 17 Jun 2009, Woodruff, Richard wrote:

 I've only seen the condition along side of very aggressive SDRC_POWER 
 settings. 

Could you send over the SDRC_POWER settings that cause this problem?


thanks,

- Paul
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-17 Thread Paul Walmsley
Hi,

a few questions for Mike and Richard:

On Mon, 15 Jun 2009, Kevin Hilman wrote:

 From: Richard Woodruff r-woodru...@ti.com
 
 I'm thinking DLL state is an exceptional condition which happens based
 on some wrong early programming when initially setting DLL up.  Some
 kind of race between idle features and lock happens early on.  This
 patch recognizes the issue and moves state machine into locked state.
 
 Its my guess the kick case won't be executed that often.  As long as
 DLL is not on you can mess with idle state.  When its on to mess with
 DDR control you need to be in self-refresh and not making
 accesses... like in dvfs code.
 
 Tested-by: Mike Chan mikec...@google.com

The last time I torture-tested CORE DVFS, which admittedly was some time 
ago, the boards tested were stable across the entire overnight run, doing 
nothing but CORE DVFS switching.  This was on a Beagle and a 3430SDP.

So, what is different about your setup?  The usual suite of questions:  

- What chip revisions/boards does this affect?

- Is this specific to certain bootloaders?  

- Has anyone else out there seen this problem?

Rather than working around an occasional DLL failure to lock, is it 
possible to reset the DLL earlier in the boot process, as Richard's commit 
message suggests?  That would be preferable to this approach.  A 
back-of-the-envelope assessment suggests that this patch could cause 
unpredictable, additional 1.5 millisecond latency spikes whenever the 
workaround triggers (since OCM RAM is marked uncacheable).


- Paul


 ---
  arch/arm/mach-omap2/sleep34xx.S |   22 --
  1 files changed, 20 insertions(+), 2 deletions(-)
 
 diff --git a/arch/arm/mach-omap2/sleep34xx.S b/arch/arm/mach-omap2/sleep34xx.S
 index aedcf94..c469bbe 100644
 --- a/arch/arm/mach-omap2/sleep34xx.S
 +++ b/arch/arm/mach-omap2/sleep34xx.S
 @@ -595,20 +595,38 @@ wait_sdrc_ok:
  ldr r5, [r4]
  bic r5, r5, #0x40
  str r5, [r4]
 -wait_dll_lock:
 -/* Is dll in lock mode? */
 +is_dll_in_lock_mode:
  ldr r4, sdrc_dlla_ctrl
  ldr r5, [r4]
  tst r5, #0x4
  bxnelr
  /* wait till dll locks */
 +wait_dll_lock_timed:
  ldr r4, sdrc_dlla_status
 +mov r6, #0x2000
 +wait_dll_lock:
 +subsr6, r6, #0x1
 +beq kick_dll
  ldr r5, [r4]
  and r5, r5, #0x4
  cmp r5, #0x4
  bne wait_dll_lock
  bx  lr
  
 +/* Kick DLL state machine if lock not started */
 +kick_dll:
 + ldr r4, sdrc_dlla_ctrl  /* get dlla addr */
 + ldr r5, [r4]/* grab value */
 + mov r6, r5  /* save value */
 + orr r6, r6, #0x10   /* dllidle on */
 + str r6, [r4]
 + dsb
 + bic r6, #0x10   /* dllidle off */
 + str r6, [r4]
 + dsb
 + str r6, [r4]/* restore old value */
 + b wait_dll_lock_timed
 +
  cm_idlest1_core:
   .word   CM_IDLEST1_CORE_V
  sdrc_dlla_status:
 -- 
 1.6.3.2
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-omap in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 


- Paul
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html