Re: [U-Boot] [PATCH 08/24] armv8: add simple sdelay implementation

2016-11-23 Thread Siarhei Siamashka
On Mon, 21 Nov 2016 16:52:47 +0100
Alexander Graf  wrote:

> On 20/11/2016 15:57, Andre Przywara wrote:
> > The sunxi DRAM setup code needs an sdelay() implementation, which
> > wasn't defined for armv8 so far.
> > Shamelessly copy the armv7 version and adjust it to work in AArch64.
> >
> > Signed-off-by: Andre Przywara   
> 
> I don't think it hurts to write this in C - and I also doubt that 
> inlining has any negative effect.
> 
> Something along the lines of
> 
> static inline void sdelay(...) {
>for (; loops; loops--)
>  asm volatile("");
> }
> 
> inside a header should do the trick as well and is much more readable.

Unfortunately the performance of the generated C code is very
unpredictable. Depending on the optimization settings, it may
place the counter variable in a register, or keep it on stack.

It would be much nicer to have more predictable timings for
these delays. So I like the assembly version a lot better.
Naturally, when it is implemented correctly.

> 
> 
> Alex
> 
> > ---
> >  arch/arm/cpu/armv8/cpu.c | 13 +
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
> > index e06c3cc..e82e9cf 100644
> > --- a/arch/arm/cpu/armv8/cpu.c
> > +++ b/arch/arm/cpu/armv8/cpu.c
> > @@ -16,6 +16,19 @@
> >  #include 
> >  #include 
> >
> > +/
> > + * sdelay() - simple spin loop.  Will be constant time as
> > + *  its generally used in bypass conditions only.  This
> > + *  is necessary until timers are accessible.
> > + *
> > + *  not inline to increase chances its in cache when called
> > + */
> > +void sdelay(unsigned long loops)
> > +{
> > +   __asm__ volatile ("1:\n" "subs %0, %1, #1\n"
> > + "b.ne 1b":"=r" (loops):"0"(loops));
> > +}
> > +
> >  int cleanup_before_linux(void)
> >  {
> > /*
> >  



-- 
Best regards,
Siarhei Siamashka
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 08/24] armv8: add simple sdelay implementation

2016-11-23 Thread André Przywara
On 24/11/16 01:25, Siarhei Siamashka wrote:

Hi Siarhei,

> On Sun, 20 Nov 2016 14:57:02 +
> Andre Przywara  wrote:
> 
>> The sunxi DRAM setup code needs an sdelay() implementation, which
>> wasn't defined for armv8 so far.
>> Shamelessly copy the armv7 version and adjust it to work in AArch64.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  arch/arm/cpu/armv8/cpu.c | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
>> index e06c3cc..e82e9cf 100644
>> --- a/arch/arm/cpu/armv8/cpu.c
>> +++ b/arch/arm/cpu/armv8/cpu.c
>> @@ -16,6 +16,19 @@
>>  #include 
>>  #include 
>>  
>> +/
>> + * sdelay() - simple spin loop.  Will be constant time as
>> + *  its generally used in bypass conditions only.  This
>> + *  is necessary until timers are accessible.
>> + *
>> + *  not inline to increase chances its in cache when called
>> + */
>> +void sdelay(unsigned long loops)
>> +{
>> +__asm__ volatile ("1:\n" "subs %0, %1, #1\n"
>> +  "b.ne 1b":"=r" (loops):"0"(loops));
> 
> This inline assembly needs "cc" in the clobber list. Also don't we
> want to just use a single register for the counter ("subs %0, %0, #1")
> rather than trying to construct something excessively complicated
> and possibly fragile?

Please don't shoot the messenger, this is the version copied from ARMv7.
I noticed the redundant register as well, but didn't dare to touch it
(assuming some higher wisdom behind it).
And good catch for the cc clobber!

Cheers,
Andre.

> The https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html page provides
> some information.
> 
>> +}
>> +
>>  int cleanup_before_linux(void)
>>  {
>>  /*
> 
> 

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 08/24] armv8: add simple sdelay implementation

2016-11-23 Thread Siarhei Siamashka
On Sun, 20 Nov 2016 14:57:02 +
Andre Przywara  wrote:

> The sunxi DRAM setup code needs an sdelay() implementation, which
> wasn't defined for armv8 so far.
> Shamelessly copy the armv7 version and adjust it to work in AArch64.
> 
> Signed-off-by: Andre Przywara 
> ---
>  arch/arm/cpu/armv8/cpu.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
> index e06c3cc..e82e9cf 100644
> --- a/arch/arm/cpu/armv8/cpu.c
> +++ b/arch/arm/cpu/armv8/cpu.c
> @@ -16,6 +16,19 @@
>  #include 
>  #include 
>  
> +/
> + * sdelay() - simple spin loop.  Will be constant time as
> + *  its generally used in bypass conditions only.  This
> + *  is necessary until timers are accessible.
> + *
> + *  not inline to increase chances its in cache when called
> + */
> +void sdelay(unsigned long loops)
> +{
> + __asm__ volatile ("1:\n" "subs %0, %1, #1\n"
> +   "b.ne 1b":"=r" (loops):"0"(loops));

This inline assembly needs "cc" in the clobber list. Also don't we
want to just use a single register for the counter ("subs %0, %0, #1")
rather than trying to construct something excessively complicated
and possibly fragile?

The https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html page provides
some information.

> +}
> +
>  int cleanup_before_linux(void)
>  {
>   /*


-- 
Best regards,
Siarhei Siamashka
___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


Re: [U-Boot] [PATCH 08/24] armv8: add simple sdelay implementation

2016-11-21 Thread Alexander Graf



On 20/11/2016 15:57, Andre Przywara wrote:

The sunxi DRAM setup code needs an sdelay() implementation, which
wasn't defined for armv8 so far.
Shamelessly copy the armv7 version and adjust it to work in AArch64.

Signed-off-by: Andre Przywara 


I don't think it hurts to write this in C - and I also doubt that 
inlining has any negative effect.


Something along the lines of

static inline void sdelay(...) {
  for (; loops; loops--)
asm volatile("");
}

inside a header should do the trick as well and is much more readable.


Alex


---
 arch/arm/cpu/armv8/cpu.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
index e06c3cc..e82e9cf 100644
--- a/arch/arm/cpu/armv8/cpu.c
+++ b/arch/arm/cpu/armv8/cpu.c
@@ -16,6 +16,19 @@
 #include 
 #include 

+/
+ * sdelay() - simple spin loop.  Will be constant time as
+ *  its generally used in bypass conditions only.  This
+ *  is necessary until timers are accessible.
+ *
+ *  not inline to increase chances its in cache when called
+ */
+void sdelay(unsigned long loops)
+{
+   __asm__ volatile ("1:\n" "subs %0, %1, #1\n"
+ "b.ne 1b":"=r" (loops):"0"(loops));
+}
+
 int cleanup_before_linux(void)
 {
/*


___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot


[U-Boot] [PATCH 08/24] armv8: add simple sdelay implementation

2016-11-20 Thread Andre Przywara
The sunxi DRAM setup code needs an sdelay() implementation, which
wasn't defined for armv8 so far.
Shamelessly copy the armv7 version and adjust it to work in AArch64.

Signed-off-by: Andre Przywara 
---
 arch/arm/cpu/armv8/cpu.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
index e06c3cc..e82e9cf 100644
--- a/arch/arm/cpu/armv8/cpu.c
+++ b/arch/arm/cpu/armv8/cpu.c
@@ -16,6 +16,19 @@
 #include 
 #include 
 
+/
+ * sdelay() - simple spin loop.  Will be constant time as
+ *  its generally used in bypass conditions only.  This
+ *  is necessary until timers are accessible.
+ *
+ *  not inline to increase chances its in cache when called
+ */
+void sdelay(unsigned long loops)
+{
+   __asm__ volatile ("1:\n" "subs %0, %1, #1\n"
+ "b.ne 1b":"=r" (loops):"0"(loops));
+}
+
 int cleanup_before_linux(void)
 {
/*
-- 
2.8.2

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot