Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2023-02-13 Thread Leo Liang
Hi Xiang,

On Sat, Feb 11, 2023 at 10:11:31PM +0800, Xiang W wrote:
> On Fri, 2023-02-10 at 07:25 +, Leo Liang wrote:
> > Hi Xiang,
> > 
> > On Fri, Feb 03, 2023 at 03:24:37PM +0100, David Abdurachmanov wrote:
> > > On Mon, Jan 3, 2022 at 1:13 PM Leo Liang  wrote:
> > > > 
> > > > On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> > > > > On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > > > > > Hi Xiang,
> > > > > > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > > > > > Various RISC-V specifications allow the number of harts to be
> > > > > > > greater than 32. The limit of 32 is determined by
> > > > > > > gd->arch.available_harts. We can eliminate this limitation by using a
> > > > > > > bitmap. Currently, the number of harts is limited to 4095, which
> > > > > > > is the limit of the RISC-V Advanced Core Local Interruptor
> > > > > > > Specification.
> > > > > > >
> > > > > > > Tested on SiFive Unmatched.
> > > > > > >
> > > > > > > Signed-off-by: Xiang W
> > > > > > > ---
> > > > > > > Changes since v1:
> > > > > > >
> > > > > > > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
> > > > > > >   overflow the immediate range of ld/lw. This patch fixes this
> > > > > > >   problem.
> > > > > > >
> > > > > > >  arch/riscv/Kconfig                   |  4 ++--
> > > > > > >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> > > > > > >  arch/riscv/include/asm/global_data.h |  4 +++-
> > > > > > >  arch/riscv/lib/smp.c                 |  2 +-
> > > > > > >  4 files changed, 22 insertions(+), 9 deletions(-)
> > > > > > > 
> > > 
> > > I noticed that this has never landed in U-Boot. Was this forgotten or
> > > dropped for some reason (couldn't find anything)?
> > > 
> > > The current limit on the Linux kernel side is 512. The default on
> > > 64-bit (riscv64) is 64.
> > > 
> > > david
> > 
> > The patch seems to cause some CI error (timeout on QEMU).
> > (https://source.denx.de/u-boot/custodians/u-boot-riscv/-/pipelines/15076)
> > Could you take a look at it if you have time?
> > 
> > Best regards,
> > Leo
> 
> Sorry! I missed a bug. There is an error in calculating the starting address
> of available_harts. The patch for start.S needs to be updated.
> 
> diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
> index 76850ec9be..92f3b78f29 100644
> --- a/arch/riscv/cpu/start.S
> +++ b/arch/riscv/cpu/start.S
> @@ -166,11 +166,22 @@ wait_for_gd_init:
>   mv  gp, s0
>  
>   /* register available harts in the available_harts mask */
> - li  t1, 1
> - sll t1, t1, tp
> - LREG t2, GD_AVAILABLE_HARTS(gp)
> - or  t2, t2, t1
> - SREG t2, GD_AVAILABLE_HARTS(gp)
> + li  t1, GD_AVAILABLE_HARTS
> + add t1, t1, gp
> +#if defined(CONFIG_ARCH_RV64I)
> + srli t2, tp, 6
> + slli t2, t2, 3
> +#elif defined(CONFIG_ARCH_RV32I)
> + srli t2, tp, 5
> + slli t2, t2, 2
> +#endif
> + add t1, t1, t2
> + LREG t2, 0(t1)
> + li  t3, 1
> + sll t3, t3, tp
> + or  t2, t2, t3
> + SREG t2, 0(t1)
>  
>   amoswap.w.rl zero, zero, 0(t0)
> 
> The mailing list is not receiving my mail, so please help update the patch.
> 

I have updated the patch.
(https://patchwork.ozlabs.org/project/uboot/patch/20230213084313.10419-1-ycli...@andestech.com/)
Could you take a look to see if there is any issue?

Best regards,
Leo





Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2023-02-11 Thread Xiang W
On Fri, 2023-02-10 at 07:25 +, Leo Liang wrote:
> Hi Xiang,
> 
> On Fri, Feb 03, 2023 at 03:24:37PM +0100, David Abdurachmanov wrote:
> > On Mon, Jan 3, 2022 at 1:13 PM Leo Liang  wrote:
> > > 
> > > On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> > > > On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > > > > Hi Xiang,
> > > > > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > > > > Various RISC-V specifications allow the number of harts to be
> > > > > > greater than 32. The limit of 32 is determined by
> > > > > > gd->arch.available_harts. We can eliminate this limitation by using a
> > > > > > bitmap. Currently, the number of harts is limited to 4095, which
> > > > > > is the limit of the RISC-V Advanced Core Local Interruptor
> > > > > > Specification.
> > > > > >
> > > > > > Tested on SiFive Unmatched.
> > > > > >
> > > > > > Signed-off-by: Xiang W
> > > > > > ---
> > > > > > Changes since v1:
> > > > > >
> > > > > > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
> > > > > >   overflow the immediate range of ld/lw. This patch fixes this
> > > > > >   problem.
> > > > > >
> > > > > >  arch/riscv/Kconfig                   |  4 ++--
> > > > > >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> > > > > >  arch/riscv/include/asm/global_data.h |  4 +++-
> > > > > >  arch/riscv/lib/smp.c                 |  2 +-
> > > > > >  4 files changed, 22 insertions(+), 9 deletions(-)
> > > > > > 
> > 
> > I noticed that this has never landed in U-Boot. Was this forgotten or
> > dropped for some reason (couldn't find anything)?
> > 
> > The current limit on the Linux kernel side is 512. The default on
> > 64-bit (riscv64) is 64.
> > 
> > david
> 
> The patch seems to cause some CI error (timeout on QEMU).
> (https://source.denx.de/u-boot/custodians/u-boot-riscv/-/pipelines/15076)
> Could you take a look at it if you have time?
> 
> Best regards,
> Leo

Sorry! I missed a bug. There is an error in calculating the starting address
of available_harts. The patch for start.S needs to be updated.

diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
index 76850ec9be..92f3b78f29 100644
--- a/arch/riscv/cpu/start.S
+++ b/arch/riscv/cpu/start.S
@@ -166,11 +166,22 @@ wait_for_gd_init:
    mv  gp, s0
 
    /* register available harts in the available_harts mask */
-   li  t1, 1
-   sll t1, t1, tp
-   LREG t2, GD_AVAILABLE_HARTS(gp)
-   or  t2, t2, t1
-   SREG t2, GD_AVAILABLE_HARTS(gp)
+   li  t1, GD_AVAILABLE_HARTS
+   add t1, t1, gp
+#if defined(CONFIG_ARCH_RV64I)
+   srli t2, tp, 6
+   slli t2, t2, 3
+#elif defined(CONFIG_ARCH_RV32I)
+   srli t2, tp, 5
+   slli t2, t2, 2
+#endif
+   add t1, t1, t2
+   LREG t2, 0(t1)
+   li  t3, 1
+   sll t3, t3, tp
+   or  t2, t2, t3
+   SREG t2, 0(t1)
 
    amoswap.w.rl zero, zero, 0(t0)
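
For reference, the address arithmetic in this hunk can be modelled in C. This is only a sketch of what the assembly computes, not code from the patch: WORD_BITS, EXAMPLE_NR_CPUS, register_hart() and hart_is_available() are hypothetical names, and the bitmap argument stands in for gd->arch.available_harts. Compared with v2, the hunk no longer has the "LREG t1, 0(t1)" that followed "add t1, t1, gp", so t1 keeps the address of the bitmap instead of the contents of its first word.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits per bitmap word: 64 where unsigned long is 64-bit, 32 where it is 32-bit. */
#define WORD_BITS       (sizeof(unsigned long) * CHAR_BIT)
/* Hypothetical hart limit, taken from the 4095 mentioned in the commit message. */
#define EXAMPLE_NR_CPUS 4095
#define BITMAP_WORDS    ((EXAMPLE_NR_CPUS + WORD_BITS - 1) / WORD_BITS)

/* Set the bit for hartid, mirroring the srli/slli/add/LREG/or/SREG sequence. */
static void register_hart(unsigned long *bitmap, unsigned long hartid)
{
	unsigned long *word;

	assert(hartid < EXAMPLE_NR_CPUS);
	/* srli + slli + add: select the word that holds this hart's bit */
	word = bitmap + hartid / WORD_BITS;
	/* li + sll + or: set the bit inside that word */
	*word |= 1UL << (hartid % WORD_BITS);
}

/* Read side of the same bitmap: non-zero if hartid has been registered. */
static int hart_is_available(const unsigned long *bitmap, unsigned long hartid)
{
	assert(hartid < EXAMPLE_NR_CPUS);
	return (int)((bitmap[hartid / WORD_BITS] >> (hartid % WORD_BITS)) & 1UL);
}
```

With 64-bit words, register_hart(map, 70) touches word 1 (byte offset 8) and sets bit 6, which is what "srli t2, tp, 6; slli t2, t2, 3" followed by "sll t3, t3, tp" produce for tp = 70.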

The mailing list is not receiving my mail, so please help update the patch.



Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2023-02-09 Thread Leo Liang
Hi Xiang,

On Fri, Feb 03, 2023 at 03:24:37PM +0100, David Abdurachmanov wrote:
> On Mon, Jan 3, 2022 at 1:13 PM Leo Liang  wrote:
> >
> > On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> > > On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > > > Hi Xiang,
> > > > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > > > Various RISC-V specifications allow the number of harts to be
> > > > > greater than 32. The limit of 32 is determined by
> > > > > gd->arch.available_harts. We can eliminate this limitation by using a
> > > > > bitmap. Currently, the number of harts is limited to 4095, which
> > > > > is the limit of the RISC-V Advanced Core Local Interruptor
> > > > > Specification.
> > > > >
> > > > > Tested on SiFive Unmatched.
> > > > >
> > > > > Signed-off-by: Xiang W
> > > > > ---
> > > > > Changes since v1:
> > > > >
> > > > > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
> > > > >   overflow the immediate range of ld/lw. This patch fixes this
> > > > >   problem.
> > > > >
> > > > >  arch/riscv/Kconfig                   |  4 ++--
> > > > >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> > > > >  arch/riscv/include/asm/global_data.h |  4 +++-
> > > > >  arch/riscv/lib/smp.c                 |  2 +-
> > > > >  4 files changed, 22 insertions(+), 9 deletions(-)
> > > > >
> 
> I noticed that this has never landed in U-Boot. Was this forgotten or
> dropped for some reason (couldn't find anything)?
> 
> The current limit on the Linux kernel side is 512. The default on
> 64-bit (riscv64) is 64.
> 
> david

The patch seems to cause a CI error (a timeout on QEMU).
(https://source.denx.de/u-boot/custodians/u-boot-riscv/-/pipelines/15076)
Could you take a look at it if you have time?

Best regards,
Leo


Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2023-02-06 Thread Leo Liang
Hi David, 
On Fri, Feb 03, 2023 at 03:24:37PM +0100, David Abdurachmanov wrote:
> On Mon, Jan 3, 2022 at 1:13 PM Leo Liang  wrote:
> >
> > On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> > > On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > > > Hi Xiang,
> > > > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > > > Various RISC-V specifications allow the number of harts to be
> > > > > greater than 32. The limit of 32 is determined by
> > > > > gd->arch.available_harts. We can eliminate this limitation by using a
> > > > > bitmap. Currently, the number of harts is limited to 4095, which
> > > > > is the limit of the RISC-V Advanced Core Local Interruptor
> > > > > Specification.
> > > > >
> > > > > Tested on SiFive Unmatched.
> > > > >
> > > > > Signed-off-by: Xiang W 
> 
> I noticed that this has never landed in U-Boot. Was this forgotten or
> dropped for some reason (couldn't find anything)?
> 

Sorry, this patch was forgotten.
I will make sure it gets applied as soon as possible
if there are no other errors or concerns.

Thanks for the reminder!

Best regards,
Leo

> The current limit on the Linux kernel side is 512. The default on
> 64-bit (riscv64) is 64.
> 
> david


Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2023-02-03 Thread David Abdurachmanov
On Mon, Jan 3, 2022 at 1:13 PM Leo Liang  wrote:
>
> On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> > On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > > Hi Xiang,
> > > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > > Various RISC-V specifications allow the number of harts to be
> > > > greater than 32. The limit of 32 is determined by
> > > > gd->arch.available_harts. We can eliminate this limitation by using a
> > > > bitmap. Currently, the number of harts is limited to 4095, which
> > > > is the limit of the RISC-V Advanced Core Local Interruptor
> > > > Specification.
> > > >
> > > > Tested on SiFive Unmatched.
> > > >
> > > > Signed-off-by: Xiang W
> > > > ---
> > > > Changes since v1:
> > > >
> > > > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
> > > >   overflow the immediate range of ld/lw. This patch fixes this
> > > >   problem.
> > > >
> > > >  arch/riscv/Kconfig                   |  4 ++--
> > > >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> > > >  arch/riscv/include/asm/global_data.h |  4 +++-
> > > >  arch/riscv/lib/smp.c                 |  2 +-
> > > >  4 files changed, 22 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
> > > > index 76850ec9be..92f3b78f29 100644
> > > > --- a/arch/riscv/cpu/start.S
> > > > +++ b/arch/riscv/cpu/start.S
> > > > @@ -166,11 +166,22 @@ wait_for_gd_init:
> > > > mv  gp, s0
> > > >
> > > > /* register available harts in the available_harts mask */
> > > > -   li  t1, 1
> > > > -   sll t1, t1, tp
> > > > -   LREG t2, GD_AVAILABLE_HARTS(gp)
> > > > -   or  t2, t2, t1
> > > > -   SREG t2, GD_AVAILABLE_HARTS(gp)
> > > > +   li  t1, GD_AVAILABLE_HARTS
> > > > +   add t1, t1, gp
> > > > +   LREG t1, 0(t1)
> > > > +#if defined(CONFIG_ARCH_RV64I)
> > > > +   srli t2, tp, 6
> > > > +   slli t2, t2, 3
> > > > +#elif defined(CONFIG_ARCH_RV32I)
> > > > +   srli t2, tp, 5
> > > > +   slli t2, t2, 2
> > > > +#endif
> > > > +   add t1, t1, t2
> > > > +   LREG t2, 0(t1)
> > > > +   li  t3, 1
> > > > +   sll t3, t3, tp
> > > This seems incorrect.
> > > Shouldn't we have "$tp % sizeof(ulong)" instead of "$tp /
> > > sizeof(ulong)" ?
> >
> > Do you mean "$tp % sizeof(ulong)" instead of "$tp"?
> >
> > There is such a description in the riscv specification:
> >
> > SLL, SRL, and SRA perform logical left, logical right, and arithmetic
> > right shifts on the value in register rs1 by the shift amount held in
> > the lower 5 bits of register rs2.
> >
> > SLL, SRL, and SRA perform logical left, logical right, and arithmetic
> > right shifts on the value in register rs1 by the shift amount held in
> > register rs2. In RV64I, only the low 6 bits of rs2 are considered for
> > the shift amount.
> >
> > So we don’t need to perform the remainder operation.
>
> Got it! Thanks for the explanation.
>
> LGTM,
> Reviewed-by: Leo Yu-Chi Liang 

I noticed that this has never landed in U-Boot. Was this forgotten or
dropped for some reason (couldn't find anything)?

The current limit on the Linux kernel side is 512. The default on
64-bit (riscv64) is 64.

david


>
> Best regards,
> Leo
> >
> > regards,
> > Xiang W
> > > > +   or  t2, t2, t3
> > > > +   SREG t2, 0(t1)
> > > >
> > > > amoswap.w.rl zero, zero, 0(t0)
> > > Best regards,
> > > Leo
> >
> >


Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2022-01-03 Thread Leo Liang
On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
> On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> > Hi Xiang,
> > On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > > Various RISC-V specifications allow the number of harts to be
> > > greater than 32. The limit of 32 is determined by
> > > gd->arch.available_harts. We can eliminate this limitation by using a
> > > bitmap. Currently, the number of harts is limited to 4095, which
> > > is the limit of the RISC-V Advanced Core Local Interruptor
> > > Specification.
> > >
> > > Tested on SiFive Unmatched.
> > >
> > > Signed-off-by: Xiang W
> > > ---
> > > Changes since v1:
> > >
> > > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
> > >   overflow the immediate range of ld/lw. This patch fixes this
> > >   problem.
> > >
> > >  arch/riscv/Kconfig                   |  4 ++--
> > >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> > >  arch/riscv/include/asm/global_data.h |  4 +++-
> > >  arch/riscv/lib/smp.c                 |  2 +-
> > >  4 files changed, 22 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
> > > index 76850ec9be..92f3b78f29 100644
> > > --- a/arch/riscv/cpu/start.S
> > > +++ b/arch/riscv/cpu/start.S
> > > @@ -166,11 +166,22 @@ wait_for_gd_init:
> > > mv  gp, s0
> > >  
> > > /* register available harts in the available_harts mask */
> > > -   li  t1, 1
> > > -   sll t1, t1, tp
> > > -   LREG t2, GD_AVAILABLE_HARTS(gp)
> > > -   or  t2, t2, t1
> > > -   SREG t2, GD_AVAILABLE_HARTS(gp)
> > > +   li  t1, GD_AVAILABLE_HARTS
> > > +   add t1, t1, gp
> > > +   LREG t1, 0(t1)
> > > +#if defined(CONFIG_ARCH_RV64I)
> > > +   srli t2, tp, 6
> > > +   slli t2, t2, 3
> > > +#elif defined(CONFIG_ARCH_RV32I)
> > > +   srli t2, tp, 5
> > > +   slli t2, t2, 2
> > > +#endif
> > > +   add t1, t1, t2
> > > +   LREG t2, 0(t1)
> > > +   li  t3, 1
> > > +   sll t3, t3, tp
> > This seems incorrect.
> > Shouldn't we have "$tp % sizeof(ulong)" instead of "$tp /
> > sizeof(ulong)" ?
> 
> Do you mean "$tp % sizeof(ulong)" instead of "$tp"?
> 
> There is such a description in the riscv specification:
> 
> SLL, SRL, and SRA perform logical left, logical right, and arithmetic
> right shifts on the value in register rs1 by the shift amount held in
> the lower 5 bits of register rs2.
> 
> SLL, SRL, and SRA perform logical left, logical right, and arithmetic
> right shifts on the value in register rs1 by the shift amount held in
> register rs2. In RV64I, only the low 6 bits of rs2 are considered for
> the shift amount.
> 
> So we don’t need to perform the remainder operation.

Got it! Thanks for the explanation.

LGTM,
Reviewed-by: Leo Yu-Chi Liang 

Best regards,
Leo
> 
> regards,
> Xiang W
> > > +   or  t2, t2, t3
> > > +   SREG t2, 0(t1)
> > >  
> > > amoswap.w.rl zero, zero, 0(t0)
> > Best regards,
> > Leo
> 
> 


Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2021-12-29 Thread Xiang W
On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
> Hi Xiang,
> On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> > Various RISC-V specifications allow the number of harts to be
> > greater than 32. The limit of 32 is determined by
> > gd->arch.available_harts. We can eliminate this limitation by using a
> > bitmap. Currently, the number of harts is limited to 4095, which
> > is the limit of the RISC-V Advanced Core Local Interruptor
> > Specification.
> >
> > Tested on SiFive Unmatched.
> >
> > Signed-off-by: Xiang W
> > ---
> > Changes since v1:
> >
> > * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
> >   overflow the immediate range of ld/lw. This patch fixes this
> >   problem.
> >
> >  arch/riscv/Kconfig                   |  4 ++--
> >  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
> >  arch/riscv/include/asm/global_data.h |  4 +++-
> >  arch/riscv/lib/smp.c                 |  2 +-
> >  4 files changed, 22 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
> > index 76850ec9be..92f3b78f29 100644
> > --- a/arch/riscv/cpu/start.S
> > +++ b/arch/riscv/cpu/start.S
> > @@ -166,11 +166,22 @@ wait_for_gd_init:
> > mv  gp, s0
> >  
> > /* register available harts in the available_harts mask */
> > -   li  t1, 1
> > -   sll t1, t1, tp
> > -   LREG t2, GD_AVAILABLE_HARTS(gp)
> > -   or  t2, t2, t1
> > -   SREG t2, GD_AVAILABLE_HARTS(gp)
> > +   li  t1, GD_AVAILABLE_HARTS
> > +   add t1, t1, gp
> > +   LREG t1, 0(t1)
> > +#if defined(CONFIG_ARCH_RV64I)
> > +   srli t2, tp, 6
> > +   slli t2, t2, 3
> > +#elif defined(CONFIG_ARCH_RV32I)
> > +   srli t2, tp, 5
> > +   slli t2, t2, 2
> > +#endif
> > +   add t1, t1, t2
> > +   LREG t2, 0(t1)
> > +   li  t3, 1
> > +   sll t3, t3, tp
> This seems incorrect.
> Shouldn't we have "$tp % sizeof(ulong)" instead of "$tp /
> sizeof(ulong)" ?

Do you mean "$tp % sizeof(ulong)" instead of "$tp"?

There is such a description in the riscv specification:

SLL, SRL, and SRA perform logical left, logical right, and arithmetic
right shifts on the value in register rs1 by the shift amount held in
the lower 5 bits of register rs2.

SLL, SRL, and SRA perform logical left, logical right, and arithmetic
right shifts on the value in register rs1 by the shift amount held in
register rs2. In RV64I, only the low 6 bits of rs2 are considered for
the shift amount.

So we don’t need to perform the remainder operation.
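
In C the same masking has to be written out explicitly, because shifting by the operand width or more is undefined behaviour there. Below is a minimal sketch of the point, assuming the host's unsigned long has the same width as XLEN on the target (64 bits models RV64I, 32 bits models RV32I). riscv_sll() and hart_bit_mask() are hypothetical helpers for illustration, not U-Boot functions.

```c
#include <assert.h>
#include <limits.h>

/* Register width in bits; assumed equal to the width of unsigned long. */
#define XLEN (sizeof(unsigned long) * CHAR_BIT)

/*
 * Model of the register-register sll: only the low log2(XLEN) bits of the
 * shift amount are used (5 bits on RV32I, 6 bits on RV64I).
 */
static unsigned long riscv_sll(unsigned long value, unsigned long shamt)
{
	/* Masking equals "shamt % XLEN" only when XLEN is a power of two. */
	assert((XLEN & (XLEN - 1)) == 0);
	return value << (shamt & (XLEN - 1));
}

/* Bit mask for a hart inside its bitmap word, as "li t3, 1; sll t3, t3, tp". */
static unsigned long hart_bit_mask(unsigned long hartid)
{
	return riscv_sll(1UL, hartid);
}
```

Here hart_bit_mask(XLEN + 3) gives the same mask as hart_bit_mask(3), which is the implicit remainder that sll provides for free in the assembly.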

regards,
Xiang W
> > +   or  t2, t2, t3
> > +   SREG t2, 0(t1)
> >  
> > amoswap.w.rl zero, zero, 0(t0)
> Best regards,
> Leo




Re: [PATCH v2] riscv: cancel the limitation that NR_CPUS is less than or equal to 32

2021-12-29 Thread Leo Liang
Hi Xiang,
On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
> Various RISC-V specifications allow the number of harts to be
> greater than 32. The limit of 32 is determined by
> gd->arch.available_harts. We can eliminate this limitation by using a
> bitmap. Currently, the number of harts is limited to 4095, which
> is the limit of the RISC-V Advanced Core Local Interruptor
> Specification.
>
> Tested on SiFive Unmatched.
>
> Signed-off-by: Xiang W
> ---
> Changes since v1:
>
> * When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
>   overflow the immediate range of ld/lw. This patch fixes this
>   problem.
>
>  arch/riscv/Kconfig                   |  4 ++--
>  arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
>  arch/riscv/include/asm/global_data.h |  4 +++-
>  arch/riscv/lib/smp.c                 |  2 +-
>  4 files changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
> index 76850ec9be..92f3b78f29 100644
> --- a/arch/riscv/cpu/start.S
> +++ b/arch/riscv/cpu/start.S
> @@ -166,11 +166,22 @@ wait_for_gd_init:
>   mv  gp, s0
>  
>   /* register available harts in the available_harts mask */
> - li  t1, 1
> - sll t1, t1, tp
> - LREG t2, GD_AVAILABLE_HARTS(gp)
> - or  t2, t2, t1
> - SREG t2, GD_AVAILABLE_HARTS(gp)
> + li  t1, GD_AVAILABLE_HARTS
> + add t1, t1, gp
> + LREG t1, 0(t1)
> +#if defined(CONFIG_ARCH_RV64I)
> + srli t2, tp, 6
> + slli t2, t2, 3
> +#elif defined(CONFIG_ARCH_RV32I)
> + srli t2, tp, 5
> + slli t2, t2, 2
> +#endif
> + add t1, t1, t2
> + LREG t2, 0(t1)
> + li  t3, 1
> + sll t3, t3, tp
This seems incorrect.
Shouldn't we have "$tp % sizeof(ulong)" instead of "$tp / sizeof(ulong)" ?
> + or  t2, t2, t3
> + SREG t2, 0(t1)
>  
>   amoswap.w.rl zero, zero, 0(t0)
Best regards,
Leo