RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-26 Thread Woodruff, Richard
Hi Paul,

> On Tue, 23 Jun 2009, Woodruff, Richard wrote:
>
> > > From: Paul Walmsley [mailto:p...@pwsan.com]
> > > Sent: Tuesday, June 23, 2009 3:05 PM
> >
> > > Looks like the significant difference is the use of CLKCTRL = 0x2 (+
> > > AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that
> > > causes this problem?  Perhaps the problem is partially related to PWDENA =
> > > 1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER
> > > register value and see if the problem reproduces.
> >
> > It is my guess not for PWDENA as it was reported on a system with that
> > not set.  Yes that should not enabled per errata.  Anyway it only does
> > something in OFF type scenarios where you don't use self-refresh.
> > "Typical" use cases don't seem to be using this.
> >
> > There are a few unlock events which are best for any focused test. Most
> > are supposed to auto-relock but ... I've heard of some validation tests
> > in a few areas maybe they will yield something.
>
> A quick note: in stress testing here, was able to confirm that SDRC_POWER
> = 0x000200AD causes problems under memory load and intensive CORE DVFS.
> SDRC_POWER = 0x0002009D seems to work fine.  The difference is CLKCTRL = 1
> (works) vs. CLKCTRL = 2 (fails).  According to the TRM, CLKCTRL = 2
> enables self-refresh when AUTOCOUNT reaches 0.  There is probably some
> race between AUTOCOUNT and the CORE DVFS SRAM code, or it is tickling some
> SDRC bug.  Will look into this further later.
>
> By the way, there appears to be a TRM bug in the "Dynamic Power Saving
> Configurations" table in the SDRC chapter.  The "External SDRC CLK" column
> should have "after AUTOCOUNT expiration" for CLKCTRL = 1 rows.

That is an interesting result.

My suspicion also has been with some kind of self-refresh race and DLL 
lock/unlock conditions.  DVFS would stress the DLL lock/unlock path.

In your test did you have a crash or just find the DLL did not relock.  The 
signature I'm looking for is not locking.  A crash in this particular test 
might point more to some other issue in sequence.

On another thread it was noted you had optimized out a needed delay after M2 
divider change for DVFS.  Fixing this might change behavior of this test.  
Stress testing in older code bases along with simulation result showed the 
delay was necessary.

Regards,
Richard W.

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-26 Thread Paul Walmsley
Hi,

On Tue, 23 Jun 2009, Woodruff, Richard wrote:

> > From: Paul Walmsley [mailto:p...@pwsan.com]
> > Sent: Tuesday, June 23, 2009 3:05 PM
> 
> > Looks like the significant difference is the use of CLKCTRL = 0x2 (+
> > AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that
> > causes this problem?  Perhaps the problem is partially related to PWDENA =
> > 1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER
> > register value and see if the problem reproduces.
> 
> It is my guess not for PWDENA as it was reported on a system with that 
> not set.  Yes that should not enabled per errata.  Anyway it only does 
> something in OFF type scenarios where you don't use self-refresh. 
> "Typical" use cases don't seem to be using this.
> 
> There are a few unlock events which are best for any focused test. Most 
> are supposed to auto-relock but ... I've heard of some validation tests 
> in a few areas maybe they will yield something.

A quick note: in stress testing here, was able to confirm that SDRC_POWER 
= 0x000200AD causes problems under memory load and intensive CORE DVFS.  
SDRC_POWER = 0x0002009D seems to work fine.  The difference is CLKCTRL = 1 
(works) vs. CLKCTRL = 2 (fails).  According to the TRM, CLKCTRL = 2 
enables self-refresh when AUTOCOUNT reaches 0.  There is probably some 
race between AUTOCOUNT and the CORE DVFS SRAM code, or it is tickling some 
SDRC bug.  Will look into this further later.

By the way, there appears to be a TRM bug in the "Dynamic Power Saving 
Configurations" table in the SDRC chapter.  The "External SDRC CLK" column 
should have "after AUTOCOUNT expiration" for CLKCTRL = 1 rows.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-23 Thread Woodruff, Richard
Hi,

> From: Paul Walmsley [mailto:p...@pwsan.com]
> Sent: Tuesday, June 23, 2009 3:05 PM

> Looks like the significant difference is the use of CLKCTRL = 0x2 (+
> AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that
> causes this problem?  Perhaps the problem is partially related to PWDENA =
> 1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER
> register value and see if the problem reproduces.

It is my guess not for PWDENA as it was reported on a system with that not set. 
 Yes that should not enabled per errata.  Anyway it only does something in OFF 
type scenarios where you don't use self-refresh. "Typical" use cases don't seem 
to be using this.

There are a few unlock events which are best for any focused test. Most are 
supposed to auto-relock but ... I've heard of some validation tests in a few 
areas maybe they will yield something.

Regards,
Richard W.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-23 Thread Paul Walmsley
Hello Richard, 

On Fri, 19 Jun 2009, Woodruff, Richard wrote:

> > From: Paul Walmsley [mailto:p...@pwsan.com]
> > Sent: Thursday, June 18, 2009 4:07 PM
> 
> > On Wed, 17 Jun 2009, Woodruff, Richard wrote:
> >
> > > I've only seen the condition along side of very aggressive SDRC_POWER
> > > settings.
> >
> > Could you send over the SDRC_POWER settings that cause this problem?
> 
> This is one pulled for a custom board:
> 
> # ./readmem 0x6d70
> 0x000200AD

Thanks.

According to the 34xx TRM Table 11-178 this sets the following: (asterisks 
mark differences from the current linux-omap code base.)

  WAKEUPPROC = 0
* AUTOCOUNT = 0x200 
* SRFRONRESET = 1
  SRFRONIDLEREQ = 0
* CLKCTRL = 2
  EXTCLKDIS = 1
  PWDENA = 1
  PAGEPOLICY = 1

Looks like the significant difference is the use of CLKCTRL = 0x2 (+ 
AUTOCOUNT).  Maybe there is some SDRC bug related to CLKCTRL = 0x2 that 
causes this problem?  Perhaps the problem is partially related to PWDENA = 
1 and erratum 1.150?  Anyway, will do a test run here with this SDRC_POWER 
register value and see if the problem reproduces.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-19 Thread Woodruff, Richard

> From: Paul Walmsley [mailto:p...@pwsan.com]
> Sent: Thursday, June 18, 2009 4:07 PM

Hi Paul,

> On Wed, 17 Jun 2009, Woodruff, Richard wrote:
>
> > I've only seen the condition along side of very aggressive SDRC_POWER
> > settings.
>
> Could you send over the SDRC_POWER settings that cause this problem?

This is one pulled for a custom board:

# ./readmem 0x6d70
0x000200AD

Regards,
Richard W.

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Paul Walmsley
Hi Richard,

On Wed, 17 Jun 2009, Woodruff, Richard wrote:

> I've only seen the condition along side of very aggressive SDRC_POWER 
> settings. 

Could you send over the SDRC_POWER settings that cause this problem?


thanks,

- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Woodruff, Richard
Hi,

> From: Wang Sawsd-A24013 [mailto:cqw...@motorola.com]
> Sent: Thursday, June 18, 2009 2:27 PM
> > >+   str r6, [r4]/* restore old value */

> Richard, I think here it should be: str r5, [r4], not r6.
> I modifed this when I was verifying your patch.

You are correct that was a bug.  Thanks.

Regards,
Richard W.


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Wang Sawsd-A24013
 

> -Original Message-
> From: linux-omap-ow...@vger.kernel.org 
> [mailto:linux-omap-ow...@vger.kernel.org] On Behalf Of 
> Woodruff, Richard
> Sent: 2009年6月19日 2:55
> To: Mike Chan; Kevin Hilman
> Cc: linux-omap@vger.kernel.org; Mike Chan
> Subject: RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state 
> machine and kick
> 
> Mike,
> 
> >From: Mike Chan [mailto:m...@android.com]
> >Sent: Thursday, June 18, 2009 1:42 PM
> >
> >
> >+/* Kick DLL state machine if lock not started */
> >+kick_dll:
> >+   ldr r4, sdrc_dlla_ctrl  /* get dlla addr */
> >+   ldr r5, [r4]/* grab value */
> >+   mov r6, r5  /* save value */
> >
> >Richard, could this be done in one instruction, eliminating 
> the need for the r5 >temporary register? Or was this done 
> intentionally?
> 
> Its on purpose I suppose, perhaps there is an optimization.
> 
> Above I want to make sure I restore the value I came in with 
> for dlla_ctrl.  I save off value in r5 for this.
> 
> Below I clear and set dllidle bit.  I don't take the time to 
> figure out if it was set or not coming into function.  I 
> saved off value in r5 to allow for this.  It is legal for it 
> to be set or cleared in normal operation.
> 
> The below set/clear needs to happen as 2 writes.  The 3rd 
> write is a restore.  You could be mindful of input value and 
> perhaps not do a 3rd write but my guess is that will take 
> more code then it saves.
> 
> >+   orr r6, r6, #0x10   /* dllidle on */
> >+   str r6, [r4]
> >+   dsb
> >+   bic r6, #0x10   /* dllidle off */
> >+   str r6, [r4]
> >+   dsb
> >+   str r6, [r4]/* restore old value */

Richard, I think here it should be: str r5, [r4], not r6.
I modifed this when I was verifying your patch.

> >+   b wait_dll_lock_timed
> >+
> 
> Regards,
> Richard W.
> --
> To unsubscribe from this list: send the line "unsubscribe 
> linux-omap" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Woodruff, Richard

> From: Kevin Hilman [mailto:khil...@deeprootsystems.com]
> Sent: Thursday, June 18, 2009 10:03 AM

> > I've seen this issue on a couple custom boards.  In these instances
> > I've not had access to boot loaders.  I would expect this issue
> > would show up in the wild in some devices.  It might be ok to move
> > to at least init.  Having it at deadlock spot is a bit more
> > conservative.
>
> Richard, since Paul and I cannot reproduce this currently, would you be able
> to test
> if a fix at init fixes the problems that you've seen?

I did up the fix at this spot for a couple reasons.
-1- This is where it was locked up...
-2- This kind of operation needs to happen in uncached sram

Doesn't doing at 1st init require pushing of another sram?  Also as its not 
conclusive what causes it to enter this state other than SDRC_POWER association 
its safer to do it here.

I think the boot loader should have tried to do this correctly.  But I really 
don't know what it did.

Regards,
Richard W.


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Woodruff, Richard
Mike,

>From: Mike Chan [mailto:m...@android.com]
>Sent: Thursday, June 18, 2009 1:42 PM
>
>
>+/* Kick DLL state machine if lock not started */
>+kick_dll:
>+   ldr r4, sdrc_dlla_ctrl  /* get dlla addr */
>+   ldr r5, [r4]/* grab value */
>+   mov r6, r5  /* save value */
>
>Richard, could this be done in one instruction, eliminating the need for the 
>r5 >temporary register? Or was this done intentionally?

Its on purpose I suppose, perhaps there is an optimization.

Above I want to make sure I restore the value I came in with for dlla_ctrl.  I 
save off value in r5 for this.

Below I clear and set dllidle bit.  I don't take the time to figure out if it 
was set or not coming into function.  I saved off value in r5 to allow for 
this.  It is legal for it to be set or cleared in normal operation.

The below set/clear needs to happen as 2 writes.  The 3rd write is a restore.  
You could be mindful of input value and perhaps not do a 3rd write but my guess 
is that will take more code then it saves.

>+   orr r6, r6, #0x10   /* dllidle on */
>+   str r6, [r4]
>+   dsb
>+   bic r6, #0x10   /* dllidle off */
>+   str r6, [r4]
>+   dsb
>+   str r6, [r4]/* restore old value */
>+   b wait_dll_lock_timed
>+

Regards,
Richard W.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-18 Thread Kevin Hilman
"Woodruff, Richard"  writes:

> Hi Paul,
>
> Hm, seems my mailing is out to lunch today...  I'll be worse than normal in 
> protocol and top post.
>
> I've seen this issue on a couple custom boards.  In these instances
> I've not had access to boot loaders.  I would expect this issue
> would show up in the wild in some devices.  It might be ok to move
> to at least init.  Having it at deadlock spot is a bit more
> conservative.

Richard, since Paul and I cannot reproduce this currently, would you be able to 
test
if a fix at init fixes the problems that you've seen?

Kevin

> I've only seen the condition along side of very aggressive SDRC_POWER 
> settings. Its my guess beagle doesn't enable all those features yet.  Unless 
> you have a bunch of beagle accessories attached, I'm not sure how good a 
> reference beagle is for system DVFS validation.
>
> I've not taken stat's to see how much it happens. The alternate to the 
> latency spike is a dead lock which is clealry not wanted.  Net-Net I see this 
> as more a plus than a minus in current form.
>
> If you have some time you might expirment with beagle and see if you can 
> trigger the condition.
>
> Regards,
> Richard W.
>
> -Original Message-
> From: Paul Walmsley [mailto:p...@pwsan.com]
> Sent: Wednesday, June 17, 2009 3:04 AM
>
> So, what is different about your setup?  The usual suite of questions:
>
> - What chip revisions/boards does this affect?
>
> - Is this specific to certain bootloaders?
>
> - Has anyone else out there seen this problem?
>
> Rather than working around an occasional DLL failure to lock, is it possible 
> to reset the DLL earlier in the boot process, as Richard's commit message 
> suggests?  That would be preferable to this approach.  A back-of-the-envelope 
> assessment suggests that this patch could cause unpredictable, additional 1.5 
> millisecond latency spikes whenever the workaround triggers (since OCM RAM is 
> marked uncacheable).
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-17 Thread Woodruff, Richard
Hi Paul,

Hm, seems my mailing is out to lunch today...  I'll be worse than normal in 
protocol and top post.

I've seen this issue on a couple custom boards.  In these instances I've not 
had access to boot loaders.  I would expect this issue would show up in the 
wild in some devices.  It might be ok to move to at least init.  Having it at 
deadlock spot is a bit more conservative.

I've only seen the condition along side of very aggressive SDRC_POWER settings. 
Its my guess beagle doesn't enable all those features yet.  Unless you have a 
bunch of beagle accessories attached, I'm not sure how good a reference beagle 
is for system DVFS validation.

I've not taken stat's to see how much it happens. The alternate to the latency 
spike is a dead lock which is clealry not wanted.  Net-Net I see this as more a 
plus than a minus in current form.

If you have some time you might expirment with beagle and see if you can 
trigger the condition.

Regards,
Richard W.

-Original Message-
From: Paul Walmsley [mailto:p...@pwsan.com]
Sent: Wednesday, June 17, 2009 3:04 AM

So, what is different about your setup?  The usual suite of questions:

- What chip revisions/boards does this affect?

- Is this specific to certain bootloaders?

- Has anyone else out there seen this problem?

Rather than working around an occasional DLL failure to lock, is it possible to 
reset the DLL earlier in the boot process, as Richard's commit message 
suggests?  That would be preferable to this approach.  A back-of-the-envelope 
assessment suggests that this patch could cause unpredictable, additional 1.5 
millisecond latency spikes whenever the workaround triggers (since OCM RAM is 
marked uncacheable).




--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -pm 1/2] SDRC: check for stuck DLL state machine and kick

2009-06-17 Thread Paul Walmsley
Hi,

a few questions for Mike and Richard:

On Mon, 15 Jun 2009, Kevin Hilman wrote:

> From: Richard Woodruff 
> 
> I'm thinking DLL state is an exceptional condition which happens based
> on some wrong early programming when initially setting DLL up.  Some
> kind of race between idle features and lock happens early on.  This
> patch recognizes the issue and moves state machine into locked state.
> 
> Its my guess the kick case won't be executed that often.  As long as
> DLL is not on you can mess with idle state.  When its on to mess with
> DDR control you need to be in self-refresh and not making
> accesses... like in dvfs code.
> 
> Tested-by: Mike Chan 

The last time I torture-tested CORE DVFS, which admittedly was some time 
ago, the boards tested were stable across the entire overnight run, doing 
nothing but CORE DVFS switching.  This was on a Beagle and a 3430SDP.

So, what is different about your setup?  The usual suite of questions:  

- What chip revisions/boards does this affect?

- Is this specific to certain bootloaders?  

- Has anyone else out there seen this problem?

Rather than working around an occasional DLL failure to lock, is it 
possible to reset the DLL earlier in the boot process, as Richard's commit 
message suggests?  That would be preferable to this approach.  A 
back-of-the-envelope assessment suggests that this patch could cause 
unpredictable, additional 1.5 millisecond latency spikes whenever the 
workaround triggers (since OCM RAM is marked uncacheable).


- Paul


> ---
>  arch/arm/mach-omap2/sleep34xx.S |   22 --
>  1 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/sleep34xx.S b/arch/arm/mach-omap2/sleep34xx.S
> index aedcf94..c469bbe 100644
> --- a/arch/arm/mach-omap2/sleep34xx.S
> +++ b/arch/arm/mach-omap2/sleep34xx.S
> @@ -595,20 +595,38 @@ wait_sdrc_ok:
>  ldr r5, [r4]
>  bic r5, r5, #0x40
>  str r5, [r4]
> -wait_dll_lock:
> -/* Is dll in lock mode? */
> +is_dll_in_lock_mode:
>  ldr r4, sdrc_dlla_ctrl
>  ldr r5, [r4]
>  tst r5, #0x4
>  bxnelr
>  /* wait till dll locks */
> +wait_dll_lock_timed:
>  ldr r4, sdrc_dlla_status
> +mov r6, #0x2000
> +wait_dll_lock:
> +subsr6, r6, #0x1
> +beq kick_dll
>  ldr r5, [r4]
>  and r5, r5, #0x4
>  cmp r5, #0x4
>  bne wait_dll_lock
>  bx  lr
>  
> +/* Kick DLL state machine if lock not started */
> +kick_dll:
> + ldr r4, sdrc_dlla_ctrl  /* get dlla addr */
> + ldr r5, [r4]/* grab value */
> + mov r6, r5  /* save value */
> + orr r6, r6, #0x10   /* dllidle on */
> + str r6, [r4]
> + dsb
> + bic r6, #0x10   /* dllidle off */
> + str r6, [r4]
> + dsb
> + str r6, [r4]/* restore old value */
> + b wait_dll_lock_timed
> +
>  cm_idlest1_core:
>   .word   CM_IDLEST1_CORE_V
>  sdrc_dlla_status:
> -- 
> 1.6.3.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html