Re: [PATCH 1/2] linux-user/riscv: vdso: fix call frame info in __vdso_rt_sigreturn

2024-01-16 Thread Vineet Gupta



On 1/15/24 15:18, Richard Henderson wrote:
> On 1/16/24 10:15, Vineet Gupta wrote:
>> When testing gcc testsuite against QEMU v8.2 we found some additional
>> failures vs. v8.1.2.
>>
>> | FAIL: gcc.dg/cleanup-10.c execution test
>> | FAIL: gcc.dg/cleanup-11.c execution test
>> | FAIL: gcc.dg/cleanup-8.c execution test
>> | FAIL: gcc.dg/cleanup-9.c execution test
>>
>> All of these tests involve unwinding off signal stack and v8.2 did
>> introduce a vdso with sigreturn trampoline and associated unwinding
>> info. It seems that info is not correct and making it similar to
>> to one in the linux kernel fixes the above failures.
> So.. you didn't actually determine what might be off in the unwind info?

Not yet. I just tried what kernel had and that worked.

>
>> +.cfi_startproc
>> +.cfi_signal_frame
>> +   raw_syscall __NR_rt_sigreturn
>> +   .cfi_endproc
> No, this is wrong.  It indicates that the unwind info is present and trivial.

Ok it seems the issue is really subtle.

With 8.2 trunk, the NOP needed before signal trampoline seems to be be
factored into the unwind info for sigrestorer.

003c 0098  CIE
  Version:   3
  Augmentation:  "zRS"
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 64
  Augmentation data: 1b
  DW_CFA_def_cfa: r2 (sp) ofs 832
  DW_CFA_offset_extended: r64 at cfa-528
  DW_CFA_offset: r1 (ra) at cfa-520
  DW_CFA_offset: r2 (sp) at cfa-512
...
  DW_CFA_offset: r63 (ft11) at cfa-24
  DW_CFA_nop
  DW_CFA_nop

00d8 0010 00a0 FDE cie=003c
pc=066c..0678

 
^^^    <--- NOP included
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop

0664 <__vdso_flush_icache>:
 664:    0513      li    a0,0
 668:    8067      ret
 66c:    0013      nop <--- this NOP

0670 <__vdso_rt_sigreturn>:
 670:    08b00893      li    a7,139
 674:    0073      ecall


This is due to the .cfi_startproc bracketing. If we move the nop out of
the .cfi_{start,end}proc, things start to work as well.

diff --git a/linux-user/riscv/vdso.S b/linux-user/riscv/vdso.S
index 4b4e34aeea51..8c9f1038cb8c 100644
--- a/linux-user/riscv/vdso.S
+++ b/linux-user/riscv/vdso.S
@@ -92,6 +92,8 @@ endf __vdso_flush_icache
 
    .cfi_endproc
 
+   nop
+
 /*
  * Start the unwind info at least one instruction before the signal
  * trampoline, because the unwinder will assume we are returning
@@ -178,8 +180,6 @@ endf __vdso_flush_icache
    .cfi_offset 62, B_FR + 30 * sizeof_freg
    .cfi_offset 63, B_FR + 31 * sizeof_freg /* f31 */
 
-   nop
-
 __vdso_rt_sigreturn:
    raw_syscall __NR_rt_sigreturn
 endf __vdso_rt_sigreturn


This changes the cfi info slightly as follows:

00d8 0010 00a0 FDE cie=003c
pc=0670..0678  <-- excludes nop
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop


0664 <__vdso_flush_icache>:
 664:    0513      li    a0,0
 668:    8067      ret
 66c:    0013      nop

0670 <__vdso_rt_sigreturn>:
 670:    08b00893      li    a7,139
 674:    0073      ecall

I concur this is still not 100% explanation of why things are going off,
but I have exact same nop quirk for glibc ARC sigrestorer.
Would an updated patch along those lines be more palatable.

Thx,
-Vineet



[PATCH 2/2] linux-user/riscv: rebuild vdso binaries after prev fix

2024-01-15 Thread Vineet Gupta
Signed-off-by: Vineet Gupta 
---
Splitting this from prev patch in case maintainers want to regenerate
the vdso at their end. Or if they choose to, this can be squashed with
prev change too.
---

Signed-off-by: Vineet Gupta 
---
 linux-user/riscv/vdso-32.so | Bin 2900 -> 2836 bytes
 linux-user/riscv/vdso-64.so | Bin 3856 -> 3792 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/linux-user/riscv/vdso-32.so b/linux-user/riscv/vdso-32.so
index 
1ad1e5c8b1fe36b0fe4bcb6c06fab8219ecd..ee158b8374d14973492a4d05f705bf12c6cf42b2
 100755
GIT binary patch
delta 563
zcmca2HbrcL0%OiZMRVRQEDUhqGci@34J5FIWnx#Ah{>aRivX+7%f)pfCby{;th~y{
zaAdLs;~pUuppponE+GaU1}z}n!7}+7qnyzipfCqiSQJRt0Qo#nz7LRIfFy4R}vipe{f!vIX7G3eSw)$}$z=?Tj+0$kO*!MMa>_3=0e<
zH!_M`_+QDw3tn?8^%WRtG<1Q`Hy1_&@ReEc5-qz?d{4+0>54v?O~4CVp}E{1v_
z>A(cxgXE{eq=2*_kiP&(BYS@fkdGXSCxHAZh~TnkxCRtJ4$oIWeg{GU!#^M
zla0z;OWoY{@)=YrCVR3PB7C^ni@l3+^CgaDjEotR1G$7J_i*vBf^1irJeNzI6%^q<
llO4H*C*R=`U}Tv5mP?)O02>2?Fymw?ZY8G6?33-d#Q-6rVafmi

delta 697
zcmbOtc13K00^^#Aisrl)tPF7AJTX9{TMW>6xcV8$Z
z{cy4b;~t?EEDQ`0KwUx%JPcYunum4rGe$Wh1E5Y0sIVx=P#~WN%J%`%DoFBnK)wX)
zW;3QdMo|Gqpp8I&29OQ_(o=x+pUFF!`uL_VDcG+Q+e<4eZ!C
zK*vo0VxGyeEHikVfUKg-;^Z<0MvKV@SxhH4GKx(WVco#EV)9W|{d!QaLI5b#5P*f@
z<9~i2R|4p35MTuIC4jU8Fg!s3#5V%cGnm0#Ai>4p3M3tvAbgN~G)xLe3j+CRKpHtd
zsu?(52!!_DFQ)0L5|=JKt2zW{1G4@Il>u|lLLw?eG;?v@)=YrHgm9dF>ap1
zv4oNF&169?;mJN+Jggw=R!olNQfCFFIp@iY+`^ORa0xKhOkN8VvtegoxCs_}!zI8Z
K$}#yLmlyyeGL7y4

diff --git a/linux-user/riscv/vdso-64.so b/linux-user/riscv/vdso-64.so
index 
83992bebe6d0182f24edfffc531015fd2f4e1cfb..f2e250fc6ca1bfb79bd7f350ae914fa7b366a1f5
 100755
GIT binary patch
delta 694
zcmbOrcR_Z72IGW@n$m7Mjr8r8x*)curh$b2iA$Vj)^$8=3hz_
z5MXOLa6sbt;}gq7+qw5{-olu}D76Nvjt8QgK?_I=G4L=PVV!KntmpLtD$N0v7KKXh
zK-1;}6~BX~-VQ2$fpzmH<|;<99w6Tah-UzC01%%5;tGbzPOM6f9zZ6{6#_uE4iGy)
zX&}V_5(5F|s-R#7#u`ROXN?vW(iLg;Sllk1nOgC`1pSZBandvf`Dw0I4nG2
zQaNA-7lQ)S!!Yss$qPBe>!$*_fT*L@WRcxCD
zxEC-nrcB<*tIq{<;TqP-uXyb^VZq`v*^tki(P45XpFP(D4zSe>!iqm&6$9S=k~gBFk$V$VVi8ltmjn#mF9p-i$bLx(6sqL
z#Z%DK+d;)6*fwusu3{7uVPs=>Q<@1Ed)kCp)n!)n5RzV6G4VvUPy?1e69+
z3^PCk5HME-1v4<#I50}{FdblHb!cI1Y+!7vX|8A~X)S2WY0v0L=}hQ~>5k|L=?&=f
z>Gzo6GSOj@&18!yCQ}Wj=}gy{p)ylpmdtF4IU;if=JCwuSm3hIX_151V!I_aORbh!
zEH_(WveIak!D_uVI%~DoX{=YnI%m$LfsI00EjnZ?Ov42%|&|FW6_qep3T3>yz4lL_18EOs|RSjYp_
zLBQmR?BbsMKsF=8$NxJ(0fPi2fNYR>12lkPQbu3~7lQ^PFCa)=Zpq&1cAC3
zCNuI1*Ms5}J%Xx$DnQ~e*T7<+4=Nsk7SVH{;^+xu1614xP5lw5I55FT1DS{@W=Kws
z2rzdwa@NadP^s9g!MTWW^98O2OpI?P7xL

[PATCH 1/2] linux-user/riscv: vdso: fix call frame info in __vdso_rt_sigreturn

2024-01-15 Thread Vineet Gupta
When testing gcc testsuite against QEMU v8.2 we found some additional
failures vs. v8.1.2.

| FAIL: gcc.dg/cleanup-10.c execution test
| FAIL: gcc.dg/cleanup-11.c execution test
| FAIL: gcc.dg/cleanup-8.c execution test
| FAIL: gcc.dg/cleanup-9.c execution test

All of these tests involve unwinding off signal stack and v8.2 did
introduce a vdso with sigreturn trampoline and associated unwinding
info. It seems that info is not correct and making it similar to
to one in the linux kernel fixes the above failures.

Fixes: 468c1bb5cac9 ("linux-user/riscv: Add vdso")
Reported-by: Edwin Lu 
Signed-off-by: Vineet Gupta 
---
 linux-user/riscv/vdso.S | 87 ++---
 1 file changed, 4 insertions(+), 83 deletions(-)

diff --git a/linux-user/riscv/vdso.S b/linux-user/riscv/vdso.S
index a86d8fc488e0..20119010c11b 100644
--- a/linux-user/riscv/vdso.S
+++ b/linux-user/riscv/vdso.S
@@ -97,91 +97,12 @@ endf __vdso_flush_icache
  * trampoline, because the unwinder will assume we are returning
  * after a call site.
  */
-
-   .cfi_startproc simple
-   .cfi_signal_frame
-
-#define sizeof_reg (__riscv_xlen / 4)
-#define sizeof_freg8
-#define B_GR   (offsetof_uc_mcontext - sizeof_rt_sigframe)
-#define B_FR   (offsetof_uc_mcontext - sizeof_rt_sigframe + offsetof_freg0)
-
-   .cfi_def_cfa2, sizeof_rt_sigframe
-
-   /* Return address */
-   .cfi_return_column 64
-   .cfi_offset 64, B_GR + 0/* pc */
-
-   /* Integer registers */
-   .cfi_offset 1, B_GR + 1 * sizeof_reg/* r1 (ra) */
-   .cfi_offset 2, B_GR + 2 * sizeof_reg/* r2 (sp) */
-   .cfi_offset 3, B_GR + 3 * sizeof_reg
-   .cfi_offset 4, B_GR + 4 * sizeof_reg
-   .cfi_offset 5, B_GR + 5 * sizeof_reg
-   .cfi_offset 6, B_GR + 6 * sizeof_reg
-   .cfi_offset 7, B_GR + 7 * sizeof_reg
-   .cfi_offset 8, B_GR + 8 * sizeof_reg
-   .cfi_offset 9, B_GR + 9 * sizeof_reg
-   .cfi_offset 10, B_GR + 10 * sizeof_reg
-   .cfi_offset 11, B_GR + 11 * sizeof_reg
-   .cfi_offset 12, B_GR + 12 * sizeof_reg
-   .cfi_offset 13, B_GR + 13 * sizeof_reg
-   .cfi_offset 14, B_GR + 14 * sizeof_reg
-   .cfi_offset 15, B_GR + 15 * sizeof_reg
-   .cfi_offset 16, B_GR + 16 * sizeof_reg
-   .cfi_offset 17, B_GR + 17 * sizeof_reg
-   .cfi_offset 18, B_GR + 18 * sizeof_reg
-   .cfi_offset 19, B_GR + 19 * sizeof_reg
-   .cfi_offset 20, B_GR + 20 * sizeof_reg
-   .cfi_offset 21, B_GR + 21 * sizeof_reg
-   .cfi_offset 22, B_GR + 22 * sizeof_reg
-   .cfi_offset 23, B_GR + 23 * sizeof_reg
-   .cfi_offset 24, B_GR + 24 * sizeof_reg
-   .cfi_offset 25, B_GR + 25 * sizeof_reg
-   .cfi_offset 26, B_GR + 26 * sizeof_reg
-   .cfi_offset 27, B_GR + 27 * sizeof_reg
-   .cfi_offset 28, B_GR + 28 * sizeof_reg
-   .cfi_offset 29, B_GR + 29 * sizeof_reg
-   .cfi_offset 30, B_GR + 30 * sizeof_reg
-   .cfi_offset 31, B_GR + 31 * sizeof_reg  /* r31 */
-
-   .cfi_offset 32, B_FR + 0/* f0 */
-   .cfi_offset 33, B_FR + 1 * sizeof_freg  /* f1 */
-   .cfi_offset 34, B_FR + 2 * sizeof_freg
-   .cfi_offset 35, B_FR + 3 * sizeof_freg
-   .cfi_offset 36, B_FR + 4 * sizeof_freg
-   .cfi_offset 37, B_FR + 5 * sizeof_freg
-   .cfi_offset 38, B_FR + 6 * sizeof_freg
-   .cfi_offset 39, B_FR + 7 * sizeof_freg
-   .cfi_offset 40, B_FR + 8 * sizeof_freg
-   .cfi_offset 41, B_FR + 9 * sizeof_freg
-   .cfi_offset 42, B_FR + 10 * sizeof_freg
-   .cfi_offset 43, B_FR + 11 * sizeof_freg
-   .cfi_offset 44, B_FR + 12 * sizeof_freg
-   .cfi_offset 45, B_FR + 13 * sizeof_freg
-   .cfi_offset 46, B_FR + 14 * sizeof_freg
-   .cfi_offset 47, B_FR + 15 * sizeof_freg
-   .cfi_offset 48, B_FR + 16 * sizeof_freg
-   .cfi_offset 49, B_FR + 17 * sizeof_freg
-   .cfi_offset 50, B_FR + 18 * sizeof_freg
-   .cfi_offset 51, B_FR + 19 * sizeof_freg
-   .cfi_offset 52, B_FR + 20 * sizeof_freg
-   .cfi_offset 53, B_FR + 21 * sizeof_freg
-   .cfi_offset 54, B_FR + 22 * sizeof_freg
-   .cfi_offset 55, B_FR + 23 * sizeof_freg
-   .cfi_offset 56, B_FR + 24 * sizeof_freg
-   .cfi_offset 57, B_FR + 25 * sizeof_freg
-   .cfi_offset 58, B_FR + 26 * sizeof_freg
-   .cfi_offset 59, B_FR + 27 * sizeof_freg
-   .cfi_offset 60, B_FR + 28 * sizeof_freg
-   .cfi_offset 61, B_FR + 29 * sizeof_freg
-   .cfi_offset 62, B_FR + 30 * sizeof_freg
-   .cfi_offset 63, B_FR + 31 * sizeof_freg /* f31 */
-
nop
 
 __vdso_rt_sigreturn:
-   raw_syscall __NR_rt_sigreturn
+   .cfi_startproc
+   .cfi_signal_frame
+   r

Re: [PULL 15/21] linux-user/riscv: Add vdso

2024-01-12 Thread Vineet Gupta
On 1/12/24 16:05, Richard Henderson wrote:
 So by default qemu ships the vdso binary. How can one rebuild it ?

   From skimming the build files it seems following ought to do it
       make update-linux-vdso

 with a prior configure cmd like below with PATH pointing to the cross
 compiler.
 ../configure  --target-list=riscv64-linux-user
 --cross-cc-riscv64=riscv64-unknown-linux-gnu-gcc
>>> Yes, that should do it.
>>>
 But it doesn't, I'm sure we are missing something basis here.
>>> Do you get an error message?
>>> Did $(BUILD_DIR)/tests/tcg/riscv64-linux-user/config-target.mak get created 
>>> properly?
>> It was indeed, but invoking make wasn't doing anything.
> Odd.  It Just Works here...
>
> What do you have in the "Cross compilers" section of the configure output?
>
> I would expect '--cross-prefix-riscv64=...' to be a better option that just 
> gcc.  And if 
> you have it installed as "riscv64-linux-gnu-", I would not expect you to need 
> to provide 
> any configure option at all -- it should be auto-detected.

This is really embarrassing but indeed build is working now. I swear
Edwin and I spent over several hours fighting the build and getting nowhere.
I also see how the vdso binary is subsumed into elfload.c so all of that
is good. Sorry for the noise

But not everything was a snafu. The gcc testsuite cleanup* failure issue
remains.

I applied the reg_size fix and that doesn't cure the cleanup failures.
Reverting the riscv vdso change, does fix them.

    2021-07-06 468c1bb5cac9 linux-user/riscv: Add vdso 

Now we just need to debug what in the vdso call frame information is
wrong and any pointers to debug that would be appreciated, even if you
are able to fix it right away ;-)

Thx,
-Vineet




Re: [PULL 15/21] linux-user/riscv: Add vdso

2024-01-12 Thread Vineet Gupta
On 1/12/24 15:37, Vineet Gupta wrote:
> Now if only I could rebuild vdso/qemu with the revert of following and
> the reg size change.
>  2021-07-06 468c1bb5cac9 linux-user/riscv: Add vdso  

And is there  way to debug qemu internals in this regard, like a
developer toggle on steroids or something.
-d in_asm,cpu,exec -dfilter etc show that app is taking different code
path, but not why.



Re: [PULL 15/21] linux-user/riscv: Add vdso

2024-01-12 Thread Vineet Gupta



On 1/12/24 13:35, Richard Henderson wrote:
> On 1/12/24 08:49, Vineet Gupta wrote:
>> Hi Richard, Alistair
>>
>> On 10/30/23 14:17, Richard Henderson wrote:
>>> diff --git a/linux-user/riscv/Makefile.vdso b/linux-user/riscv/Makefile.vdso
>>> new file mode 100644
>>> index 00..2c257dbfda
>>> --- /dev/null
>>> +++ b/linux-user/riscv/Makefile.vdso
>>> @@ -0,0 +1,15 @@
>>> +include $(BUILD_DIR)/tests/tcg/riscv64-linux-user/config-target.mak
>>> +
>>> +SUBDIR = $(SRC_PATH)/linux-user/riscv
>>> +VPATH += $(SUBDIR)
>>> +
>>> +all: $(SUBDIR)/vdso-32.so $(SUBDIR)/vdso-64.so
>>> +
>>> +LDFLAGS = -nostdlib -shared -fpic -Wl,-h,linux-vdso.so.1 
>>> -Wl,--build-id=sha1 \
>>> + -Wl,--hash-style=both -Wl,-T,$(SUBDIR)/vdso.ld
>>> +
>>> +$(SUBDIR)/vdso-32.so: vdso.S vdso.ld vdso-asmoffset.h
>>> +   $(CC) -o $@ $(LDFLAGS) -mabi=ilp32d -march=rv32g $<
>>> +
>>> +$(SUBDIR)/vdso-64.so: vdso.S vdso.ld vdso-asmoffset.h
>>> +   $(CC) -o $@ $(LDFLAGS) -mabi=lp64d -march=rv64g $<
>> So by default qemu ships the vdso binary. How can one rebuild it ?
>>
>>  From skimming the build files it seems following ought to do it
>>      make update-linux-vdso
>>
>> with a prior configure cmd like below with PATH pointing to the cross
>> compiler.
>> ../configure  --target-list=riscv64-linux-user
>> --cross-cc-riscv64=riscv64-unknown-linux-gnu-gcc
> Yes, that should do it.
>
>> But it doesn't, I'm sure we are missing something basis here.
> Do you get an error message?
> Did $(BUILD_DIR)/tests/tcg/riscv64-linux-user/config-target.mak get created 
> properly?

It was indeed, but invoking make wasn't doing anything.

Weirdly enought, after (yet another) clean rebuild the vdso does seem to
build.
So the vdso is subsumed into the qemu binary ? I mean do we need to
rebuild qemu after the vdso rebuild. Something in the build dependency
is off for sure.
Sorry but do u have a cmdline for dummies to rebuild vdso and then qemu.

Thing is something is definitely off with 8.2.

With 2 setups of riscv-gnu-toolchain and everything being same except
for qemu (v8.2 and v8.1.2) we see 4 additional failures in gcc testsuite.
These all pertain to unwinding off signal stack

 = Summary of gcc testsuite =
| # of unexpected case/ # of unique unexpected case
|  gcc | g++ | gfortran |
rv64imafdc_zba_zbb_zbs_zicond/  lp64d/ medlow |  40 /19 |   23 / 8 |   
72 /12 |   # v8.2
rv64imafdc_zba_zbb_zbs_zicond/  lp64d/ medlow |  36 /15 |7 / 4 |
  - | # v8.1.2

FAIL: gcc.dg/cleanup-10.c execution test
FAIL: gcc.dg/cleanup-11.c execution test
FAIL: gcc.dg/cleanup-8.c execution test
FAIL: gcc.dg/cleanup-9.c execution test

Now if only I could rebuild vdso/qemu with the revert of following and
the reg size change.
 2021-07-06 468c1bb5cac9 linux-user/riscv: Add vdso  


>
>> For starters we saw something that seems like a thinko in
>>
>> diff --git a/linux-user/riscv/vdso.S b/linux-user/riscv/vdso.S
>> -#define sizeof_reg (__riscv_xlen / 4)
>> +#define sizeof_reg (__riscv_xlen / 8)
> Oops.

Thx for taking care of that.


>> As as aside, we also see that rt_sigreturn in kernel vdso elides the
>> explicit the call frame information. Again we naively don't know if that
>> is required in qemu.
>>
>>      .text
>> ENTRY(__vdso_rt_sigreturn)
>>      .cfi_startproc
>>      .cfi_signal_frame
>>      li a7, __NR_rt_sigreturn
>>      ecall
>>      .cfi_endproc
>> ENDPROC(__vdso_rt_sigreturn)
> Perhaps it's not required, no.  But I'd consider the lack of info from the 
> kernel to be a 
> bug.  Lack of it means places like gcc have to have special cases.



Re: [PULL 15/21] linux-user/riscv: Add vdso

2024-01-11 Thread Vineet Gupta
Hi Richard, Alistair

On 10/30/23 14:17, Richard Henderson wrote:
> diff --git a/linux-user/riscv/Makefile.vdso b/linux-user/riscv/Makefile.vdso
> new file mode 100644
> index 00..2c257dbfda
> --- /dev/null
> +++ b/linux-user/riscv/Makefile.vdso
> @@ -0,0 +1,15 @@
> +include $(BUILD_DIR)/tests/tcg/riscv64-linux-user/config-target.mak
> +
> +SUBDIR = $(SRC_PATH)/linux-user/riscv
> +VPATH += $(SUBDIR)
> +
> +all: $(SUBDIR)/vdso-32.so $(SUBDIR)/vdso-64.so
> +
> +LDFLAGS = -nostdlib -shared -fpic -Wl,-h,linux-vdso.so.1 -Wl,--build-id=sha1 
> \
> +   -Wl,--hash-style=both -Wl,-T,$(SUBDIR)/vdso.ld
> +
> +$(SUBDIR)/vdso-32.so: vdso.S vdso.ld vdso-asmoffset.h
> + $(CC) -o $@ $(LDFLAGS) -mabi=ilp32d -march=rv32g $<
> +
> +$(SUBDIR)/vdso-64.so: vdso.S vdso.ld vdso-asmoffset.h
> + $(CC) -o $@ $(LDFLAGS) -mabi=lp64d -march=rv64g $<

So by default qemu ships the vdso binary. How can one rebuild it ?

>From skimming the build files it seems following ought to do it
    make update-linux-vdso

with a prior configure cmd like below with PATH pointing to the cross
compiler.
../configure  --target-list=riscv64-linux-user
--cross-cc-riscv64=riscv64-unknown-linux-gnu-gcc

But it doesn't, I'm sure we are missing something basis here.

The reason for trying to rebuild is, Edwin ran into an issue running gcc
testsuite for RISC-V.

TEst such as gcc/testsuite/gcc.dg/cleanup-10.c segv with new QEMU (with
exact same test binary)

With QEMU 8.1.2 it runs fine (exit code 0)
but with 8.2 it segv (exit code 134).

This is because it can't unwind out of signal stack and force_unwind
abort()s.

Edwin bisected this to following QEMU commit,

    2021-07-06 468c1bb5cac9 linux-user/riscv: Add vdso  

which kind of makes sense since this changes the rt_sigreturn trampoline
and associated unwinding stuff.

For starters we saw something that seems like a thinko in

diff --git a/linux-user/riscv/vdso.S b/linux-user/riscv/vdso.S
-#define sizeof_reg (__riscv_xlen / 4)
+#define sizeof_reg (__riscv_xlen / 8)

And wanted to see if that fixes it but can't really coax the build
system to rebuild the vdso as described above.

As as aside, we also see that rt_sigreturn in kernel vdso elides the
explicit the call frame information. Again we naively don't know if that
is required in qemu.

    .text
ENTRY(__vdso_rt_sigreturn)
    .cfi_startproc
    .cfi_signal_frame
    li a7, __NR_rt_sigreturn
    ecall
    .cfi_endproc
ENDPROC(__vdso_rt_sigreturn)


Would appreicate any answers and pointers.

Thx,


> diff --git a/linux-user/riscv/meson.build b/linux-user/riscv/meson.build
> new file mode 100644
> index 00..beb989a7ca
> --- /dev/null
> +++ b/linux-user/riscv/meson.build
> @@ -0,0 +1,7 @@
> +vdso_32_inc = gen_vdso.process('vdso-32.so',
> +   extra_args: ['-r', '__vdso_rt_sigreturn'])
> +vdso_64_inc = gen_vdso.process('vdso-64.so',
> +   extra_args: ['-r', '__vdso_rt_sigreturn'])
> +
> +linux_user_ss.add(when: 'TARGET_RISCV32', if_true: vdso_32_inc)
> +linux_user_ss.add(when: 'TARGET_RISCV64', if_true: vdso_64_inc)
> diff --git a/linux-user/riscv/vdso-32.so b/linux-user/riscv/vdso-32.so
> new file mode 100755
> index 
> ..1ad1e5c8b1fe36b0fe4bcb6c06fab8219ecd
> GIT binary patch
> literal 2900
> zcmb_eTWl0n7(TPZVyUHWDVSm#Q)tBoakE`3X|QQ_x7~7aX*Y!`UgC5+yX~&9yUp%w
> zLmR@yBHoaTR0Kq>L6L=zLABtH4zgAYa%jnRm{h}7?!nX^o_4?g&}-~Q*n
> z%s=PM`Oi1AEgb1m6h%l;#cx7dEpVPL6Ji#0i>Mc~MU$u!9%$NEaRFn3d4wv|^w
> zTRLpb7;DS=wp%clxP}go5H6@1BuN~CP00Gu?~M2%+(e=gF+#?vFF7z%d_LkRAy#(x
> ziH@zeoN%Z87eg1Tzxkm)cKE@*AHS_c%m7%c6BGLV%2aj>G?#S_n$Mt!IhI531+E1!
> zb!!dKcx z_D*OiRpQsFei%N)IbX)BH+hNx@;@l%HgX^T`9`e~Z1?i^n9( z1g}x5<}?R|$3vTWW zq|%A(U_3S|efp-XzEMX0<;*4W(ua%n;exaVHx=?pEv3iwWT|}4(DJFFT*4^iGGbFI
> zQ`mefZ|lW>vxuzj?%SZXc$;s>gdwhJf9!SZZ}R-lbJ=rs0Q-uj8Vu zlv?$Udf0qM9wfFxC!wRz6VR9G2Po=TuTZuqO6^>iV4WJ^OMpEfqLuY=G&6I30BbFy
> zA2MMnUcU|n0!nR#tLSzI;yVT&>K}h#=lu`gyKBQEcRjlPvEIj{Pjv5^7is=ZzN
> zI`@a33LI#kXnT6a!PaM%A8I-5KjM4Vdvw`zOP_CgVac)1$ zT3+$L>U+(5dfDqs-)MSsNwM**#c$W2x$#F=KbqGn16QOhn|lr}{z=*S
> z+2_AN!< zm+o=9s@>5)^*R-~LR437Z$I z`hy|- zWcyUJJ1h>QOEpi4#W|L(nwbF}76{Kig zuIGuiIM(}DX+aWbC8HPmWpr)bfhYp=kvXMt zJZSr2lk+Vp7hw$K`We4ZhM>u9gV0Xy8iVcR#yzl|T*jVra$ym+lRH0y?c_$G>(TsZ
> zB(~9;6^)yP)_G5NaP8_2p+$RpI>McOy zt2%pywldNa3P!Y^uC8_A^)k^xD>li^#7r|0-*i5k#$9%$5;Zzlu;e3=3gOL zw2`xPQjRiNBO$!;(M&9z(#-*Kp<_l!x3tZ!(roT7DyEc}5bsd@7rnW@vHYO(eC!
> zTr8c?l5u28OL ztdn8AgZR$kJCxnddVJ>-uHggF2
> z{; zrr`NvZC>O27K3SuT)P2F=8->$pX?kenYY!> vc~2SIg1ifGl4tmx!+P8|`R#(CDQ#Hj*V2HN^{Oz_
> literal 0
> HcmV?d1
>
> diff --git a/linux-user/riscv/vdso-64.so b/linux-user/riscv/vdso-64.so
> new file mode 100755
> index 
> ..83992bebe6d0182f24edfffc531015fd2f4e1cfb
> GIT binary patch
> literal 3856
> zcmc&%>u*#=6hC(_g{2g1p%}2$6tfXRaI z(h#Xy-@LSlh@eFjU#RsBzS<8a68*wQ{R51AFh(W%#Ya77?>Woe=_Y*egQvasH*+2{
> zbI;8DW_Eul-0jdb5YgcZT&)54*l?-dDl9 

Re: [PATCH 1/2] riscv: zicond: make non-experimental

2023-08-24 Thread Vineet Gupta




On 8/10/23 10:14, Alistair Francis wrote:

On Tue, Aug 8, 2023 at 2:18 PM Vineet Gupta  wrote:

zicond is now codegen supported in both llvm and gcc.

This change allows seamless enabling/testing of zicond in downstream
projects. e.g. currently riscv-gnu-toolchain parses elf attributes
to create a cmdline for qemu but fails short of enabling it because of
the "x-" prefix.

Signed-off-by: Vineet Gupta 

Reviewed-by: Alistair Francis 

Alistair


Gentle ping to remind that this lands in some -next tree and not forgotten !

Thx,
-Vineet




---
  target/riscv/cpu.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 6b93b04453c8..022bd9d01223 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1816,6 +1816,7 @@ static Property riscv_cpu_extensions[] = {
  DEFINE_PROP_BOOL("zcf", RISCVCPU, cfg.ext_zcf, false),
  DEFINE_PROP_BOOL("zcmp", RISCVCPU, cfg.ext_zcmp, false),
  DEFINE_PROP_BOOL("zcmt", RISCVCPU, cfg.ext_zcmt, false),
+DEFINE_PROP_BOOL("zicond", RISCVCPU, cfg.ext_zicond, false),

  /* Vendor-specific custom extensions */
  DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
@@ -1832,7 +1833,6 @@ static Property riscv_cpu_extensions[] = {
  DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, 
false),

  /* These are experimental so mark with 'x-' */
-DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false),

  /* ePMP 0.9.3 */
  DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),
--
2.34.1







Re: [PATCH 2/2] riscv: zicond: make default

2023-08-08 Thread Vineet Gupta




On 8/8/23 14:06, Daniel Henrique Barboza wrote:

(CCing Alistair and other reviewers)

On 8/8/23 15:17, Vineet Gupta wrote:

Again this helps with better testing and something qemu has been doing
with newer features anyways.

Signed-off-by: Vineet Gupta 
---


Even if we can reach a consensus about removing the experimental (x- 
prefix) status
from an extension that is Frozen instead of ratified, enabling stuff 
in the default
CPUs because it's easier to test is something we would like to avoid. 
The rv64
CPU has a random set of extensions enabled for the most different and 
undocumented
reasons, and users don't know what they'll get because we keep beefing 
up the

generic CPUs arbitrarily.


I understand this position given the arbitrary nature of gazillion 
extensions. However pragmatically things like bitmanip and zicond are so 
fundamental it would be strange for designs to not have them, in a few 
years. Besides these don't compete or conflict with other extensions.
But on face value it is indeed possible for vendors to drop them for 
various reasons or no-reasons.


But having the x- dropped is good enough for our needs as there's 
already mechanisms to enable the toggles from elf attributes.




Starting on QEMU 8.2 we'll have a 'max' CPU type that will enable all 
non-experimental
and non-vendor extensions by default, making it easier for tooling to 
test new
features/extensions. All tooling should consider changing their 
scripts to use the

'max' CPU when it's available.


That would be great.



For now, I fear that gcc and friends will still need to enable 
'zicond' in the command

line via 'zicond=true'.  Thanks,


Thx,
-Vineet



Re: [PATCH 1/2] riscv: zicond: make non-experimental

2023-08-08 Thread Vineet Gupta




On 8/8/23 11:29, Richard Henderson wrote:

On 8/8/23 11:17, Vineet Gupta wrote:

zicond is now codegen supported in both llvm and gcc.


It is still not in

https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions


Right, its been frozen since April though and with support trickling in 
rest of tooling it becomes harder to test.

I don't know what exactly QEMU's policy is on this ?

-Vineet




[PATCH 2/2] riscv: zicond: make default

2023-08-08 Thread Vineet Gupta
Again this helps with better testing and something qemu has been doing
with newer features anyways.

Signed-off-by: Vineet Gupta 
---
 target/riscv/cpu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 022bd9d01223..e6e28414b223 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -438,6 +438,7 @@ static void rv64_thead_c906_cpu_init(Object *obj)
 cpu->cfg.ext_xtheadbs = true;
 cpu->cfg.ext_xtheadcmo = true;
 cpu->cfg.ext_xtheadcondmov = true;
+cpu->cfg.ext_zicond = false;
 cpu->cfg.ext_xtheadfmemidx = true;
 cpu->cfg.ext_xtheadmac = true;
 cpu->cfg.ext_xtheadmemidx = true;
@@ -483,6 +484,7 @@ static void rv64_veyron_v1_cpu_init(Object *obj)
 cpu->cfg.ext_zbc = true;
 cpu->cfg.ext_zbs = true;
 cpu->cfg.ext_XVentanaCondOps = true;
+cpu->cfg.ext_zicond = false;
 
 cpu->cfg.mvendorid = VEYRON_V1_MVENDORID;
 cpu->cfg.marchid = VEYRON_V1_MARCHID;
@@ -1816,7 +1818,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("zcf", RISCVCPU, cfg.ext_zcf, false),
 DEFINE_PROP_BOOL("zcmp", RISCVCPU, cfg.ext_zcmp, false),
 DEFINE_PROP_BOOL("zcmt", RISCVCPU, cfg.ext_zcmt, false),
-DEFINE_PROP_BOOL("zicond", RISCVCPU, cfg.ext_zicond, false),
+DEFINE_PROP_BOOL("zicond", RISCVCPU, cfg.ext_zicond, true),
 
 /* Vendor-specific custom extensions */
 DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
-- 
2.34.1




[PATCH 1/2] riscv: zicond: make non-experimental

2023-08-08 Thread Vineet Gupta
zicond is now codegen supported in both llvm and gcc.

This change allows seamless enabling/testing of zicond in downstream
projects. e.g. currently riscv-gnu-toolchain parses elf attributes
to create a cmdline for qemu but fails short of enabling it because of
the "x-" prefix.

Signed-off-by: Vineet Gupta 
---
 target/riscv/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 6b93b04453c8..022bd9d01223 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1816,6 +1816,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("zcf", RISCVCPU, cfg.ext_zcf, false),
 DEFINE_PROP_BOOL("zcmp", RISCVCPU, cfg.ext_zcmp, false),
 DEFINE_PROP_BOOL("zcmt", RISCVCPU, cfg.ext_zcmt, false),
+DEFINE_PROP_BOOL("zicond", RISCVCPU, cfg.ext_zicond, false),
 
 /* Vendor-specific custom extensions */
 DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
@@ -1832,7 +1833,6 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, 
false),
 
 /* These are experimental so mark with 'x-' */
-DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false),
 
 /* ePMP 0.9.3 */
 DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),
-- 
2.34.1




Re: [PATCH] target/riscv: fix inverted checks for ext_zb[abcs]

2022-02-28 Thread Vineet Gupta

Hi Alistair,

On 2/3/22 16:59, Alistair Francis wrote:

On Fri, Feb 4, 2022 at 1:42 AM Philipp Tomsich  wrote:


While changing to the use of cfg_ptr, the conditions for REQUIRE_ZB[ABCS]
inadvertently became inverted and slipped through the initial testing (which
used RV64GC_XVentanaCondOps as a target).
This fixes the regression.

Tested against SPEC2017 w/ GCC 12 (prerelease) for RV64GC_zba_zbb_zbc_zbs.

Fixes: 718143c126 ("target/riscv: add a MAINTAINERS entry for XVentanaCondOps")

Signed-off-by: Philipp Tomsich 


Reviewed-by: Alistair Francis 



---
We may want to squash this onto the affected commit, if it hasn't made
it beyond the next-tree, yet.


Yeah, agreed. I'll squash it in

Alistair


Has this already been committed upstream. I was running into weird issue 
related to bitmanip and seems this was missing in my local tree.


Also the "Fixes: " entry in changelog doesn't seem OK; the issue seems 
to have been introduced in f2a32bec8f0da99 ("target/riscv: access cfg 
structure through DisasContext")


Thx,
-Vineet





  target/riscv/insn_trans/trans_rvb.c.inc | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc 
b/target/riscv/insn_trans/trans_rvb.c.inc
index f9bd3b7ec4..e3c6b459d6 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -19,25 +19,25 @@
   */

  #define REQUIRE_ZBA(ctx) do {\
-if (ctx->cfg_ptr->ext_zba) { \
+if (!ctx->cfg_ptr->ext_zba) { \
  return false;\
  }\
  } while (0)

  #define REQUIRE_ZBB(ctx) do {\
-if (ctx->cfg_ptr->ext_zbb) { \
+if (!ctx->cfg_ptr->ext_zbb) { \
  return false;\
  }\
  } while (0)

  #define REQUIRE_ZBC(ctx) do {\
-if (ctx->cfg_ptr->ext_zbc) { \
+if (!ctx->cfg_ptr->ext_zbc) { \
  return false;\
  }\
  } while (0)

  #define REQUIRE_ZBS(ctx) do {\
-if (ctx->cfg_ptr->ext_zbs) { \
+if (!ctx->cfg_ptr->ext_zbs) { \
  return false;\
  }\
  } while (0)
--
2.34.1











[PATCH] build: fix build failure with gcc 11.2 by disabling -fcf-protection

2022-02-08 Thread Vineet Gupta
When doing RV qemu builds with host gcc 11.2, ran into following build failure

| cc -MMD -MP -MT linuxboot_dma.o -MF ./linuxboot_dma.d -O2 -g -march=i486 
-Wall \
|   -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings 
-Wmissing-prototypes \
|   -Wold-style-declaration -Wold-style-definition -Wtype-limits 
-Wformat-security \
|   -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs 
\
|   -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 
-Wno-missing-include-dirs \
|   -Wno-shift-negative-value -Wno-psabi -fno-pie -ffreestanding -IQEMU/include 
\
|   -fno-stack-protector   -m16   -Wa,-32 \
|   -c QEMU/pc-bios/optionrom/linuxboot_dma.c -o linuxboot_dma.o
|cc1: error: ‘-fcf-protection’ is not compatible with this target

Signed-off-by: Vineet Gupta 
---
This might be a crude fix to the problem
---
 pc-bios/optionrom/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
index 5d55d25acca2..8f843ee803c1 100644
--- a/pc-bios/optionrom/Makefile
+++ b/pc-bios/optionrom/Makefile
@@ -22,6 +22,9 @@ override CFLAGS += $(CFLAGS_NOPIE) -ffreestanding 
-I$(TOPSRC_DIR)/include
 override CFLAGS += $(call cc-option, -fno-stack-protector)
 override CFLAGS += $(call cc-option, -m16)
 
+# issue with gcc 11.2
+override CFLAGS += $(call cc-option, -fcf-protection=none)
+
 ifeq ($(filter -m16, $(CFLAGS)),)
 # Attempt to work around compilers that lack -m16 (GCC <= 4.8, clang <= ??)
 # On GCC we add -fno-toplevel-reorder to keep the order of asm blocks with
-- 
2.32.0




[PATCH] target/riscv: make H-extension non-experimental

2021-12-21 Thread Vineet Gupta
H-ext v1.0 was ratified recently as part of Privileged Spec 1.12.
So move it out of experimental.

[1] https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions

Signed-off-by: Vineet Gupta 
---
 target/riscv/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 6ef3314bced8..a582179b1773 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -640,12 +640,12 @@ static Property riscv_cpu_properties[] = {
 DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
 DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
 
-/* These are experimental so mark with 'x-' */
 DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
 DEFINE_PROP_BOOL("zbb", RISCVCPU, cfg.ext_zbb, true),
 DEFINE_PROP_BOOL("zbc", RISCVCPU, cfg.ext_zbc, true),
 DEFINE_PROP_BOOL("zbs", RISCVCPU, cfg.ext_zbs, true),
-DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false),
+DEFINE_PROP_BOOL("h", RISCVCPU, cfg.ext_h, false),
+/* These are experimental so mark with 'x-' */
 DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false),
 /* ePMP 0.9.3 */
 DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),
-- 
2.30.2




[PATCH] target/riscv: Enable bitmanip Zb[abcs] instructions

2021-12-15 Thread Vineet Gupta
The bitmanip extension has now been ratified [1] and upstream tooling
(gcc/binutils) support it too, so move them out of experimental and also
enable by default (for better test exposure/coverage)

[1] https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions

Signed-off-by: Vineet Gupta 
---
 target/riscv/cpu.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f81299812350..c00d59cd04b5 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -635,10 +635,10 @@ static Property riscv_cpu_properties[] = {
 DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
 
 /* These are experimental so mark with 'x-' */
-DEFINE_PROP_BOOL("x-zba", RISCVCPU, cfg.ext_zba, false),
-DEFINE_PROP_BOOL("x-zbb", RISCVCPU, cfg.ext_zbb, false),
-DEFINE_PROP_BOOL("x-zbc", RISCVCPU, cfg.ext_zbc, false),
-DEFINE_PROP_BOOL("x-zbs", RISCVCPU, cfg.ext_zbs, false),
+DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
+DEFINE_PROP_BOOL("zbb", RISCVCPU, cfg.ext_zbb, true),
+DEFINE_PROP_BOOL("zbc", RISCVCPU, cfg.ext_zbc, true),
+DEFINE_PROP_BOOL("zbs", RISCVCPU, cfg.ext_zbs, true),
 DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false),
 DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false),
 DEFINE_PROP_BOOL("x-v", RISCVCPU, cfg.ext_v, false),
-- 
2.30.2




Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci

2021-10-13 Thread Vineet Gupta

On 10/13/21 6:49 AM, Philipp Tomsich wrote:

On Wed, 13 Oct 2021 at 15:44, Vincent Palatin  wrote:


On Wed, Oct 13, 2021 at 3:13 PM Philipp Tomsich
 wrote:


I had a much simpler version initially (using 3 x mask/shift/or, for
12 instructions after setup of constants), but took up the suggestion
to optimize based on haszero(v)...
Indeed this appears to not do what we expect, when there's only 0x01
set in a byte.

The less optimized form, with a single constant, that would still do
what we want is:
/* set high-bit for non-zero bytes */
constant = dup_const_tl(MO_8, 0x7f);
tmp = v & constant;   // AND
tmp += constant;   // ADD
tmp |= v;// OR
/* extract high-bit to low-bit, for each word */
tmp &= ~constant; // ANDC
tmp >>= 7; // SHR
/* multiply with 0xff to populate entire byte where the low-bit is set */
tmp *= 0xff;// MUL

I'll submit a patch with this one later today, once I had a chance to
pass this through a full test.



Thanks for the insight.

I have tried it, implemented as:
```
static void gen_orc_b(TCGv ret, TCGv source1)
{
 TCGv  tmp = tcg_temp_new();
 TCGv  constant = tcg_constant_tl(dup_const_tl(MO_8, 0x7f));

 /* set high-bit for non-zero bytes */
 tcg_gen_and_tl(tmp, source1, constant);
 tcg_gen_add_tl(tmp, tmp, constant);
 tcg_gen_or_tl(tmp, tmp, source1);
 /* extract high-bit to low-bit, for each word */
 tcg_gen_andc_tl(tmp, tmp, constant);
 tcg_gen_shri_tl(tmp, tmp, 7);

 /* Replicate the lsb of each byte across the byte. */
 tcg_gen_muli_tl(ret, tmp, 0xff);

 tcg_temp_free(tmp);
}
```

It does pass my own test sequences.


I am running it against SPEC at the moment, using optimized
strlen/strcpy/strcmp functions using orc.b.
The verdict on that should be available later today...


off topic but relates, for Zb (and similar things in the future) whats 
the strategy for change management/discovery. I understand you can 
hardcode things for quick test, but for a proper glibc implementation 
this would be an IFUNC but there seems to be no architectural way per 
spec (for software/kernel) to discover this.


Same issue is with building linux kernel with Zb - how do we make sure 
that hardware/sim supports Zb when running corresponding software.


It seems some generic discovery/enumeration scheme is in works but what 
to do in the interim.


Thx,
-Vineet



Re: [PATCH v11 00/16] target/riscv: Update QEmu for Zb[abcs] 1.0.0

2021-09-27 Thread Vineet Gupta




On 9/27/21 1:23 PM, Jim Wilson wrote:
On Mon, Sep 27, 2021 at 1:01 PM Vineet Gupta <mailto:vine...@rivosinc.com>> wrote:


So I obviously forgot to get the equivalent binutils branch, but the
only rvb branch on sifive fork feels dated


https://github.com/riscv-collab/riscv-binutils-gdb/tree/riscv-binutils-2.35-rvb

<https://github.com/riscv-collab/riscv-binutils-gdb/tree/riscv-binutils-2.35-rvb>


That is the right branch to use with the gcc that you are using.  This 
stuff hasn't been actively maintained so we have old gcc and binutils 
release versions.


We are in the process of putting stuff upstream now.



Thx Jim. Guess we'd have to wait for dust to settle, as this instance of 
binutils can't seem to grok sh1add.uw spit out by rvb-shNadd-03.c


-Vineet



Re: [PATCH v11 00/16] target/riscv: Update QEmu for Zb[abcs] 1.0.0

2021-09-27 Thread Vineet Gupta

Hi,

On 9/11/21 7:00 AM, Philipp Tomsich wrote:


The Zb[abcs] extensions have complete public review and are nearing
ratifications. These individual extensions are one part of what was
previously though of as the "BitManip" (B) extension, leaving the
final details of future Zb* extensions open as they will undergo
further public discourse.

This series updates the earlier support for the B extension by
  - removing those instructions that are not included in Zb[abcs]
  - splitting this into 4 separate extensions that can be independently
enabled: Zba (addressing), Zbb (basic bit-manip), Zbc (carryless
multiplication), Zbs (single-bit operations)
  - update the to the 1.0.0 version (e.g. w-forms of rev8 and Zbs
instructions are not included in Zb[abcs])

For the latest version of the public review speicifcaiton
(incorporating some editorial fixes and corrections from the review
period), refer to:
   
https://github.com/riscv/riscv-bitmanip/releases/download/1.0.0/bitmanip-1.0.0-31-g2af7256.pdf


I was curious to try these out. Challenge was not qemu but stuff built 
to run on this qemu.


At LPC last week Jim/Kito suggested I use the gcc branch @
https://github.com/riscv-collab/riscv-gcc/tree/riscv-gcc-10.2.0-rvb

With that I get

$ riscv64-unknown-elf-gcc 
~/gnu/gcc/gcc/testsuite/gcc.target/riscv/rvb-zbs-bclr.c -c --save-temps 
-march=rv64gc_zbb_zbs -O2

Assembler messages:
Error: -march=rv64imafdc_zbb_zbs: unknown prefixed ISA extension `zbs'

So I obviously forgot to get the equivalent binutils branch, but the 
only rvb branch on sifive fork feels dated


https://github.com/riscv-collab/riscv-binutils-gdb/tree/riscv-binutils-2.35-rvb

Can someone point me to the right binutils repo/branch to pair with gcc 
changes above.


Thx,
-Vineet




Changes in v11:
- Swaps out the EXT_ZERO to EXT_NONE, as no extension is to be performed.
- Fix typos in commit message.

Changes in v10:
- New patch
- New patch, fixing regressions discovered with x264_r.
- New patch, fixing correctnes for clzw called on a register with undefined
   (as in: not properly sign-extended) upper bits.
- Retested with CF3 and SPEC2017 (size=test, size=ref); addressing new
   regressions (due to bugs in gen_clzw) from testing with SPEC2017 using
   different optimization levels
- Split off gen_add_uw() fix into a separate patch, as requested.

Changes in v9:
- Retested with CF3 and SPEC2017 (size=test only).
- Rebased to 8880cc4362.
- Update gen_add_uw() to use a temporary instead of messing with
   arg1 (fixes a regression after rebase on CF3 and SPEC2017).
- Rebased to 8880cc4362.
- Picked up Alistair's Reviewed-by, after patman had failed to catch
   it for v8.
- Rebased to 8880cc4362.
- Fixes a whitespace-at-the-end-of-line warning for the rev8 comment
   in insn32.decode
- Rebased to 8880cc4362.

Changes in v8:
- Optimize orc.b further by reordering the shift/and, updating the
   comment to reflect that we put the truth-value into the LSB, and
   putting the (now only) constant in a temporary
- Fold the final bitwise-not into the second and, using and andc.

Changes in v7:
- Free TCG temporary in gen_orc_b().

Changes in v6:
- Move gen_clmulh to trans_rvb.c.inc, as per Richard H's request.
- Fixed orc.b (now passes SPEC w/ optimized string functions) by
   adding the missing final negation.

Changes in v5:
- Introduce gen_clmulh (as suggested by Richard H) and use to simplify
   trans_clmulh().

Changes in v4:
- Drop rewrite of slli.uw (to match formal specification), as it would
   remove an optimization.
- Change orc.b to implementation suggested by Richard Henderson
- reorder trans_rev8* functions to be sequential
- rename rev8 to rev8_32 in decoder
- Renamed RV32 variant to zext_h_32.
- Reordered trans_zext_h_{32,64} to be next to each other.

Changes in v3:
- Split off removal of 'x-b' property and 'ext_b' field into a separate
   patch to ensure bisectability.
- The changes to the Zba instructions (i.e. the REQUIRE_ZBA macro
   and its use for qualifying the Zba instructions) are moved into
   a separate commit.
- Remove the W-form instructions from Zbs in a separate commit.
- Remove shift-one instructions in a separate commit.
- The changes to the Zbs instructions (i.e. the REQUIRE_ZBS macro) and
   its use for qualifying the Zba instructions) are moved into a
   separate commit.
- This adds the Zbc instructions as a spearate commit.
- Uses a helper for clmul/clmulr instead of inlining the calculation of
   the result (addressing a comment from Richard Henderson).
- The changes to the Zbb instructions (i.e. use the REQUIRE_ZBB macro)
   are now in a separate commit.
- Moved orc.b and gorc/gorci changes into separate commit.
- Using the simpler orc.b implementation suggested by Richard Henderson
- Moved the REQUIRE_32BIT macro into a separate commit.
- rev8-addition & grevi*-removal moved to a separate commit
- Moved zext.h-addition & pack*-removal to a separate commit.
- Removing RVB moved into a separate commit at