Re: [V1 PATCH 1/2] rockchip: sdram: Support getting banks from TPL for rk3568 and rk3588

2024-04-05 Thread Quentin Schulz

Hi Chris,

On 4/4/24 23:33, Chris Morgan wrote:

On Tue, Apr 02, 2024 at 06:38:59PM +0200, Quentin Schulz wrote:

Hi Chris,

On 4/1/24 20:14, Chris Morgan wrote:

From: Chris Morgan 

Allow RK3568 and RK3588 based boards to get the RAM bank configuration
from the ROCKCHIP_TPL stage instead of the current logic. This fixes
both an issue where 256MB of RAM is blocked for devices with >= 4GB
of RAM and where memory holes need to be defined for devices with

= 16GB of RAM. In the event that neither SOC is used and the

ROCKCHIP_TPL stage is not used, fall back to existing logic.

Signed-off-by: Chris Morgan 
---
   arch/arm/mach-rockchip/sdram.c | 100 +
   1 file changed, 100 insertions(+)

diff --git a/arch/arm/mach-rockchip/sdram.c b/arch/arm/mach-rockchip/sdram.c
index 0d9a0aef6f..e02fb03c5f 100644
--- a/arch/arm/mach-rockchip/sdram.c
+++ b/arch/arm/mach-rockchip/sdram.c
@@ -12,6 +12,7 @@
   #include 
   #include 
   #include 
+#include 
   DECLARE_GLOBAL_DATA_PTR;
@@ -35,11 +36,110 @@ struct tos_parameter_t {
s64 reserve[8];
   };
+/* Tag magic */
+#define ATAGS_CORE_MAGIC   0x54410001
+#define ATAGS_DDR_MEM_MAGIC0x54410052
+
+/* Tag size and offset */
+#define ATAGS_SIZE SZ_8K
+#define ATAGS_OFFSET   (SZ_2M - ATAGS_SIZE)
+#define ATAGS_PHYS_BASE(CFG_SYS_SDRAM_BASE + ATAGS_OFFSET)
+
+/* ATAGS memory structure. */
+struct tag_ddr_mem {
+   u32 count;
+   u32 version;
+   u64 bank[20];
+   u32 flags;
+   u32 data[2];
+   u32 hash;
+} __packed;
+
+/**
+ * rockchip_dram_init_banksize() - Get RAM banks from Rockchip TPL
+ *
+ * Iterate through the defined ATAGS memory location to first find a
+ * valid core header, then find a valid ddr_info header. Sanity check
+ * the number of banks found. Then, iterate through the data to add
+ * each individual memory bank. Perform fixups on memory banks that
+ * overlap with a reserved space. If an error condition is received,
+ * it is expected that memory bank setup will fall back on existing
+ * logic. If ROCKCHIP_EXTERNAL_TPL is false then immediately return,
+ * and if neither ROCKCHIP_RK3588 or ROCKCHIP_RK3568 is enabled
+ * immediately return.
+ *
+ * Return number of banks found on success or negative on error.
+ */
+__weak int rockchip_dram_init_banksize(void)
+{
+   struct tag_ddr_mem *ddr_info;
+   size_t val;
+   size_t addr = ATAGS_PHYS_BASE;


I think this should be phys_addr_t instead of size_t?

size_t is an unsigned long on aarch64 and phys_addr_t is an unsigned long
long so 4B vs 8B.

This however would likely prevent us from reusing this code on aarch32
machines, but maybe it's a problem for the people who'll look into
supporting this :) (also, aarch32 and >= 3.75GiB may be a bit optimistic :)
).


Could I just specify a size and not worry about it? A u32 should be more
than enough to hold the maximum RAM address of an RK3588 board (32GB).
That would allow this to work on both 32 and 64 right? Otherwise I could
further restrict this code to the AARCH64 ifdef.



There's even a bigger issue here as phys_t is an 8B structure (on 
Aarch64) while the atags are actually 4B aligned, so that would 
unnecessarily increase the complexity for arithmetic on those addresses. 
So u32 would probably be fine then.





+   int i;
+


u8 is plenty enough here :)


I use it as a return value where there are negative numbers (though
obviously this should never be negative since it's an increment
counter). Does that matter?



You could return i still and let the compiler do the conversion.

No, it's not a blocker, but that may unnecessarily increase the size of 
the TPL.


[...]


+   break;
+   addr += 4;


This is an incorrect step size, addr is 4B, so this will result in 16B
increments, which may be too much. Additionally, we shouldn't read every 4B
as the tag is only ever guaranteed to be 4B aligned, not that we would have
a tag every 4B. This also means that it's possible somehow the content of a
tag at a 4B-aligned offset has the CORE_MAGIC for some reason, but we
shouldn't match on it.



I'm not quite sure I follow. Are you saying I need to increment every
4 * the value of size in the tag_header? The value I show is 0x5 in
my header meaning increment every 0x14?



What we have in memory is (each 4B) (correct me if I misread Rockchip's 
code)


tag1.size = 6
tag1.magic
data1[0]
data1[1]
data1[2]
data1[3]
tag2.size = 4
tag2.magic
data2[0]
data2[1]

...

You start with addr = &tag1.size
when you do addr +4, you now have addr= &data1[2]

By casting data1[2] into a tag struct, you would have

tagX.size = data1[2]
tagX.magic = data1[3]

If somehow data1[3] matches the magic, you'll detect tag data as a tag 
header, and that's no good. Also, you may be missing a tag by checking 
every 16B, which isn't guaranteed by Rockchip's ATAGS (only guaranteed 
to be 4B aligned)


tag1.size is the size of the tag1 in multiples

Re: [V1 PATCH 1/2] rockchip: sdram: Support getting banks from TPL for rk3568 and rk3588

2024-04-04 Thread Chris Morgan
On Tue, Apr 02, 2024 at 06:38:59PM +0200, Quentin Schulz wrote:
> Hi Chris,
> 
> On 4/1/24 20:14, Chris Morgan wrote:
> > From: Chris Morgan 
> > 
> > Allow RK3568 and RK3588 based boards to get the RAM bank configuration
> > from the ROCKCHIP_TPL stage instead of the current logic. This fixes
> > both an issue where 256MB of RAM is blocked for devices with >= 4GB
> > of RAM and where memory holes need to be defined for devices with
> > > = 16GB of RAM. In the event that neither SOC is used and the
> > ROCKCHIP_TPL stage is not used, fall back to existing logic.
> > 
> > Signed-off-by: Chris Morgan 
> > ---
> >   arch/arm/mach-rockchip/sdram.c | 100 +
> >   1 file changed, 100 insertions(+)
> > 
> > diff --git a/arch/arm/mach-rockchip/sdram.c b/arch/arm/mach-rockchip/sdram.c
> > index 0d9a0aef6f..e02fb03c5f 100644
> > --- a/arch/arm/mach-rockchip/sdram.c
> > +++ b/arch/arm/mach-rockchip/sdram.c
> > @@ -12,6 +12,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   DECLARE_GLOBAL_DATA_PTR;
> > @@ -35,11 +36,110 @@ struct tos_parameter_t {
> > s64 reserve[8];
> >   };
> > +/* Tag magic */
> > +#define ATAGS_CORE_MAGIC   0x54410001
> > +#define ATAGS_DDR_MEM_MAGIC0x54410052
> > +
> > +/* Tag size and offset */
> > +#define ATAGS_SIZE SZ_8K
> > +#define ATAGS_OFFSET   (SZ_2M - ATAGS_SIZE)
> > +#define ATAGS_PHYS_BASE(CFG_SYS_SDRAM_BASE + ATAGS_OFFSET)
> > +
> > +/* ATAGS memory structure. */
> > +struct tag_ddr_mem {
> > +   u32 count;
> > +   u32 version;
> > +   u64 bank[20];
> > +   u32 flags;
> > +   u32 data[2];
> > +   u32 hash;
> > +} __packed;
> > +
> > +/**
> > + * rockchip_dram_init_banksize() - Get RAM banks from Rockchip TPL
> > + *
> > + * Iterate through the defined ATAGS memory location to first find a
> > + * valid core header, then find a valid ddr_info header. Sanity check
> > + * the number of banks found. Then, iterate through the data to add
> > + * each individual memory bank. Perform fixups on memory banks that
> > + * overlap with a reserved space. If an error condition is received,
> > + * it is expected that memory bank setup will fall back on existing
> > + * logic. If ROCKCHIP_EXTERNAL_TPL is false then immediately return,
> > + * and if neither ROCKCHIP_RK3588 or ROCKCHIP_RK3568 is enabled
> > + * immediately return.
> > + *
> > + * Return number of banks found on success or negative on error.
> > + */
> > +__weak int rockchip_dram_init_banksize(void)
> > +{
> > +   struct tag_ddr_mem *ddr_info;
> > +   size_t val;
> > +   size_t addr = ATAGS_PHYS_BASE;
> 
> I think this should be phys_addr_t instead of size_t?
> 
> size_t is an unsigned long on aarch64 and phys_addr_t is an unsigned long
> long so 4B vs 8B.
> 
> This however would likely prevent us from reusing this code on aarch32
> machines, but maybe it's a problem for the people who'll look into
> supporting this :) (also, aarch32 and >= 3.75GiB may be a bit optimistic :)
> ).

Could I just specify a size and not worry about it? A u32 should be more
than enough to hold the maximum RAM address of an RK3588 board (32GB).
That would allow this to work on both 32 and 64 right? Otherwise I could
further restrict this code to the AARCH64 ifdef.

> 
> > +   int i;
> > +
> 
> u8 is plenty enough here :)

I use it as a return value where there are negative numbers (though
obviously this should never be negative since it's an increment
counter). Does that matter?

> 
> > +   if (!IS_ENABLED(CONFIG_ROCKCHIP_EXTERNAL_TPL))
> > +   return 0;
> > +   if (!IS_ENABLED(CONFIG_ROCKCHIP_RK3588) &&
> > +   !IS_ENABLED(CONFIG_ROCKCHIP_RK3568))
> > +   return 0;
> > +
> > +   if (!IS_ENABLED(CONFIG_ROCKCHIP_EXTERNAL_TPL))
> > +   return -EPERM;
> > +
> 
> I think testing once is enough :)
> 
> Also, you probably want to use -ENOTSUPP instead?

Acknowledged.

> 
> > +   while (addr < (ATAGS_PHYS_BASE + ATAGS_SIZE)) {
> > +   val = readl(addr);
> > +   if (val == ATAGS_CORE_MAGIC)
> 
> Save a variable by not saving the result of readl and just check against
> ATAGS_CORE_MAGIC.

Acknowledged.

> 
> > +   break;
> > +   addr += 4;
> 
> This is an incorrect step size, addr is 4B, so this will result in 16B
> increments, which may be too much. Additionally, we shouldn't read every 4B
> as the tag is only ever guaranteed to be 4B aligned, not that we would have
> a tag every 4B. This also means that it's possible somehow the content of a
> tag at a 4B-aligned offset has the CORE_MAGIC for some reason, but we
> shouldn't match on it.
> 

I'm not quite sure I follow. Are you saying I need to increment every
4 * the value of size in the tag_header? The value I show is 0x5 in
my header meaning increment every 0x14?

> I recommend to follow what Rockchip does downstream:
> 
> """
> struct tag_header {
> u32 size; /* size in units of 4B */
> u32 magic;
> };
> """
> 
> if magic != CORE_MAGIC, 

Re: [V1 PATCH 1/2] rockchip: sdram: Support getting banks from TPL for rk3568 and rk3588

2024-04-02 Thread Quentin Schulz

Hi Chris,

On 4/1/24 20:14, Chris Morgan wrote:

From: Chris Morgan 

Allow RK3568 and RK3588 based boards to get the RAM bank configuration
from the ROCKCHIP_TPL stage instead of the current logic. This fixes
both an issue where 256MB of RAM is blocked for devices with >= 4GB
of RAM and where memory holes need to be defined for devices with

= 16GB of RAM. In the event that neither SOC is used and the

ROCKCHIP_TPL stage is not used, fall back to existing logic.

Signed-off-by: Chris Morgan 
---
  arch/arm/mach-rockchip/sdram.c | 100 +
  1 file changed, 100 insertions(+)

diff --git a/arch/arm/mach-rockchip/sdram.c b/arch/arm/mach-rockchip/sdram.c
index 0d9a0aef6f..e02fb03c5f 100644
--- a/arch/arm/mach-rockchip/sdram.c
+++ b/arch/arm/mach-rockchip/sdram.c
@@ -12,6 +12,7 @@
  #include 
  #include 
  #include 
+#include 
  
  DECLARE_GLOBAL_DATA_PTR;
  
@@ -35,11 +36,110 @@ struct tos_parameter_t {

s64 reserve[8];
  };
  
+/* Tag magic */

+#define ATAGS_CORE_MAGIC   0x54410001
+#define ATAGS_DDR_MEM_MAGIC0x54410052
+
+/* Tag size and offset */
+#define ATAGS_SIZE SZ_8K
+#define ATAGS_OFFSET   (SZ_2M - ATAGS_SIZE)
+#define ATAGS_PHYS_BASE(CFG_SYS_SDRAM_BASE + ATAGS_OFFSET)
+
+/* ATAGS memory structure. */
+struct tag_ddr_mem {
+   u32 count;
+   u32 version;
+   u64 bank[20];
+   u32 flags;
+   u32 data[2];
+   u32 hash;
+} __packed;
+
+/**
+ * rockchip_dram_init_banksize() - Get RAM banks from Rockchip TPL
+ *
+ * Iterate through the defined ATAGS memory location to first find a
+ * valid core header, then find a valid ddr_info header. Sanity check
+ * the number of banks found. Then, iterate through the data to add
+ * each individual memory bank. Perform fixups on memory banks that
+ * overlap with a reserved space. If an error condition is received,
+ * it is expected that memory bank setup will fall back on existing
+ * logic. If ROCKCHIP_EXTERNAL_TPL is false then immediately return,
+ * and if neither ROCKCHIP_RK3588 or ROCKCHIP_RK3568 is enabled
+ * immediately return.
+ *
+ * Return number of banks found on success or negative on error.
+ */
+__weak int rockchip_dram_init_banksize(void)
+{
+   struct tag_ddr_mem *ddr_info;
+   size_t val;
+   size_t addr = ATAGS_PHYS_BASE;


I think this should be phys_addr_t instead of size_t?

size_t is an unsigned long on aarch64 and phys_addr_t is an unsigned 
long long so 4B vs 8B.


This however would likely prevent us from reusing this code on aarch32 
machines, but maybe it's a problem for the people who'll look into 
supporting this :) (also, aarch32 and >= 3.75GiB may be a bit optimistic 
:) ).



+   int i;
+


u8 is plenty enough here :)


+   if (!IS_ENABLED(CONFIG_ROCKCHIP_EXTERNAL_TPL))
+   return 0;
+   if (!IS_ENABLED(CONFIG_ROCKCHIP_RK3588) &&
+   !IS_ENABLED(CONFIG_ROCKCHIP_RK3568))
+   return 0;
+
+   if (!IS_ENABLED(CONFIG_ROCKCHIP_EXTERNAL_TPL))
+   return -EPERM;
+


I think testing once is enough :)

Also, you probably want to use -ENOTSUPP instead?


+   while (addr < (ATAGS_PHYS_BASE + ATAGS_SIZE)) {
+   val = readl(addr);
+   if (val == ATAGS_CORE_MAGIC)


Save a variable by not saving the result of readl and just check against 
ATAGS_CORE_MAGIC.



+   break;
+   addr += 4;


This is an incorrect step size, addr is 4B, so this will result in 16B 
increments, which may be too much. Additionally, we shouldn't read every 
4B as the tag is only ever guaranteed to be 4B aligned, not that we 
would have a tag every 4B. This also means that it's possible somehow 
the content of a tag at a 4B-aligned offset has the CORE_MAGIC for some 
reason, but we shouldn't match on it.


I recommend to follow what Rockchip does downstream:

"""
struct tag_header {
u32 size; /* size in units of 4B */
u32 magic;
};
"""

if magic != CORE_MAGIC, then we should increment addr by size * 4B and 
check again.



+   }
+   if (addr >= (ATAGS_PHYS_BASE + ATAGS_SIZE))
+   return -ENODATA;


I think it'd be nice to have debug messages here and there :)


+
+   while (addr < (ATAGS_PHYS_BASE + ATAGS_SIZE)) {
+   val = readl(addr);
+   if (val == ATAGS_DDR_MEM_MAGIC)
+   break;
+   addr += 4;


Same remarks here.


+   }
+   if (addr >= (ATAGS_PHYS_BASE + ATAGS_SIZE))
+   return -ENODATA;
+
+   ddr_info = (void *)addr + 4;


It seems that arithmetic operations on void pointers is illegal in the C 
standard. This also quite obfuscate what you want to do here.


Considering that in this patch you're iterating for each 4B until you 
get the MAGIC, the next 4B are data for that header of that magic.


If you go for the tag_header I suggested above, I would recommend to do 
the following instead:


"""
ddr_info = (u8*)addr + size