Re: [PATCH] hw/scsi/lsi53c895a: bump instruction limit in scripts processing to fix regression

2024-07-16 Thread Sven Schnelle
Fiona Ebner  writes:

> Commit 9876359990 ("hw/scsi/lsi53c895a: add timer to scripts
> processing") reduced the maximum allowed instruction count by
> a factor of 100 all the way down to 100.
>
> This causes the "Check Point R81.20 Gaia" appliance [0] to fail to
> boot after fully finishing the installation via the appliance's web
> interface (there is already one reboot before that).
>
> With a limit of 150, the appliance still fails to boot, while with a
> limit of 200, it works. Bump to 500 to fix the regression and be on
> the safe side.
>
> Originally reported in the Proxmox community forum[1].
>
> [0]: https://support.checkpoint.com/results/download/124397
> [1]: https://forum.proxmox.com/threads/149772/post-683459
>
> Cc: qemu-sta...@nongnu.org
> Fixes: 9876359990 ("hw/scsi/lsi53c895a: add timer to scripts processing")
> Signed-off-by: Fiona Ebner 
> ---
>  hw/scsi/lsi53c895a.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
> index eb9828dd5e..f1935e5328 100644
> --- a/hw/scsi/lsi53c895a.c
> +++ b/hw/scsi/lsi53c895a.c
> @@ -188,7 +188,7 @@ static const char *names[] = {
>  #define LSI_TAG_VALID (1 << 16)
>  
>  /* Maximum instructions to process. */
> -#define LSI_MAX_INSN100
> +#define LSI_MAX_INSN500
>  
>  typedef struct lsi_request {
>  SCSIRequest *req;

Fine with me - i just picked a random number assuming it
works. Obviously i was wrong :-)

Acked-by: Sven Schnelle 



Re: [PATCH 00/45] target/hppa: Misc improvements

2024-05-12 Thread Sven Schnelle
Philippe Mathieu-Daudé  writes:

> Cc'ing Helge & Sven as I'm going to skip this series.
>
> Suggestion:
>
> -- >8 --
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1b79767d61..be7535b55e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -254,6 +254,8 @@ F: target/hexagon/gen_idef_parser_funcs.py
>
>  HPPA (PA-RISC) TCG CPUs
>  M: Richard Henderson 
> +R: Helge Deller 
> +R: Sven Schnelle 
>  S: Maintained
>  F: target/hppa/
>  F: disas/hppa.c
> @@ -1214,6 +1216,7 @@ HP-PARISC Machines
>  HP B160L, HP C3700
>  M: Richard Henderson 
>  R: Helge Deller 
> +R: Sven Schnelle 
>  S: Odd Fixes
>  F: configs/devices/hppa-softmmu/default.mak
>  F: hw/display/artist.c

Please don't add me as reviewer - i'm only looking in irregular
intervals at hppa tcg in qemu.



Re: [PATCH 2/3] target/hppa: mask offset bits in gva

2024-04-01 Thread Sven Schnelle
Richard Henderson  writes:

> On 4/1/24 20:01, Sven Schnelle wrote:
>> Implement dr2 and the mfdiag/mtdiag instructions. dr2 contains a bit
>> which enables/disables space id hashing. Seabios would then set
>> this bit when booting. Linux would disable it again during boot (this
>> would be the same like on real hardware), while HP-UX would leave it
>> enabled.
>
> Pointer to documentation?

There's no documentation about that in the public. There's this code since the
beginning of linux on hppa in the linux kernel (arch/parisc/kernel/pacache.S):

/* Disable Space Register Hashing for PCXL */

.word 0x141c0600  /* mfdiag %dr0, %r28 */
depwi   0,28,2, %r28  /* Clear DHASH_EN & IHASH_EN */
.word 0x141c0240  /* mtdiag %r28, %dr0 */
b,n   srdis_done

srdis_pa20:

/* Disable Space Register Hashing for PCXU,PCXU+,PCXW,PCXW+,PCXW2 */

.word 0x144008bc/* mfdiag %dr2, %r28 */
depdi 0, 54,1, %r28 /* clear DIAG_SPHASH_ENAB (bit 54) */
.word 0x145c1840/* mtdiag %r28, %dr2 */

So PCXL (32 bit) uses dr0, while 64 bit uses dr2. This still is the same
on my C8000 - i see firmware still contains code reading dr2 to figure
out whether space id hashing is enabled. The mfdiag/mtdiag instructions
are described in the PCXL/PCXL2 ERS.

https://parisc.wiki.kernel.org/index.php/File:PCXL_ers.pdf
https://parisc.wiki.kernel.org/index.php/File:Pcxl2_ers.pdf

There was a discussion mentioning disabling Space ID hashing in Linux:

https://yhbt.net/lore/linux-parisc/199912161642.iaa11...@lucy.cup.hp.com/




Re: [PATCH 2/3] target/hppa: mask offset bits in gva

2024-04-01 Thread Sven Schnelle
Richard,

Sven Schnelle  writes:

> Richard Henderson  writes:
>
>> On 3/23/24 22:09, Sven Schnelle wrote:
>>> The CPU seems to mask a few bits in the offset when running
>>> under HP-UX. ISR/IOR register contents for an address in
>>> the processor HPA (0xfffa) on my C8000 and J6750:
>>> running on Linux: 3fff c000fffa0500
>>> running on HP-UX: 301f c000fffa0500
>>> I haven't found how this is switched (guess some diag in the
>>> firmware), but linux + seabios seems to handle that as well,
>>> so lets mask out the additional bits.
>>> Signed-off-by: Sven Schnelle 
>>> [..]
>> [..]
>> Though my argument would suggest the mask should be 0xff for the
>> 40-bit physical address, which is not what you see at all, so perhaps
>> the thing is moot.  I am at a loss to explain why or how HP-UX gets a
>> 7-bit hole in the ISR result.
>>
>> On the other hand, there are some not-well-documented shenanigans (aka
>> implementation defined behaviour) between Figure H-8 and Figure H-11,
>> where the 62-bit absolute address is expanded to a 64-bit logical
>> physical address and then compacted to a 40-bit implementation
>> physical address.
>>
>> We've already got hacks in place for this in hppa_abs_to_phys_pa2_w1,
>> which just truncates everything down to 40 bits.  But that's probably
>> not what the processor is really doing.
>
> I looked into this again, and it's caused by Space-ID hashing. HP-UX asks
> PDC/Firmware how many bits are used for the hashing. seabios returns
> zero, in which case HP-UX uses a default mask of 0xf01f.
> By modifying seabios, i can make HP-UX use the appropriate mask, but
> switching of SpaceID hashing entirely is impossible. The reason why
> the CPU doesn't strip the bits when running linux is that Linux switches
> of Space-ID hashing early in the startup code (before mm gets
> initialized).
>
> My J6750 Firmware only returns two values: 0 when Space-ID hashing is
> off, 0xfe0 when it is enabled. This is hardcoded in the firmware - the
> only thing PDC checks is a bit in Debug Register 2, which enables
> Space-ID hashing. 0xfe0 matches the 0xf01f... mask used by HP-UX
> pretty well.
>
> So if qemu wants to run 64 Bit HP-UX the proper way, i guess it needs
> to implement Space-ID hashing.

I wonder wether it would be acceptable to implement this masking in a
switchable way?

This would mean:

Implement dr2 and the mfdiag/mtdiag instructions. dr2 contains a bit
which enables/disables space id hashing. Seabios would then set
this bit when booting. Linux would disable it again during boot (this
would be the same like on real hardware), while HP-UX would leave it
enabled.



Re: [PATCH] target/hppa: Fix IIAOQ, IIASQ for pa2.0

2024-04-01 Thread Sven Schnelle
Richard Henderson  writes:

> The contents of IIAOQ depend on PSW_W.
> Follow the text in "Interruption Instruction Address Queues",
> pages 2-13 through 2-15.
>
> Reported-by: Sven Schnelle 
> Fixes: b10700d826c ("target/hppa: Update IIAOQ, IIASQ for pa2.0")
> Signed-off-by: Richard Henderson 
> ---
>
> Sven, I looked again through IIAOQ documentation and it does seem
> like some of the bits are wrong, both on interrupt delivery and RFI.

Yes, this fixes my issues, thanks!

Tested-by: Sven Schnelle 



Re: [PATCH] target/hppa: mask upper iaoq bits when returning to narrow mode

2024-04-01 Thread Sven Schnelle
Richard Henderson  writes:

> On 4/1/24 10:39, Sven Schnelle wrote:
>> Richard Henderson  writes:
>>>> For unknown reasons, Java 1.5 on 64-bit HP-UX 11.11 does signed
>>>> computation of the new IAOQ value in the signal handler. In the
>>>> current code these bits are not masked when returning to narrow
>>>> mode, causing java to crash.
>>>> Signed-off-by: Sven Schnelle 
>> INT   3530: instruction tlb miss fault @ :c007 
>> for :4000c004
>> INT   3531: external interrupt @ :c007 for 
>> :4000c004
>> INT   3532: instruction tlb miss fault @ :c007 
>> for :4000c004
>> INT   3533: external interrupt @ :c007 for 
>> :4000c004
>> So the PSW indicates narrow mode, but IAOQ seems to contain all the
>> ... bits.
>
> I believe that the IAOQ *should* contain all of the bits.  The bits
> should only be discarded when we form the GVA -- exactly like "ldb
> 0(r2)", where r2 contains all of the offset bits.  In particular, I
> believe that "b,l .+8,r2" should copy all of those bits to r2 from
> IAOQ_Back+4 and the fact that mainline crops those bits is a bug.
>
>
>> Also interesting is that the second TLB miss (INT 3530)
>> misses the Space ID.
>
> That is a bit curious, yes.
>
>> Any thoughts? Otherwise i need to investigate and make a wrong patch
>> again :-)
>> The only patch i have on top which touches target/hppa is the space
>> id
>> hashing mask patch:
>
> Ok.  I do have an hppa 11.11 iso -- for clarity, what is your command-line?

I'm using:

./build/qemu-system-hppa -M C3700 -m 1024 -cdrom 
/home/svens/parisc/hpux/11.11/HP-UX 11.11 (2004-12) - TCOE - Core OS, Install 
and Recovery - DVD.iso -bios 
/home/svens/seabios-hppa/out-64/hppa-firmware64.img -nographic -hda 
/home/svens/parisc/hpux.img -boot d


The qemu i'm using is: https://github.com/svenschnelle/qemu/tree/devel

You also need a special seabios-hppa version, because a special console
driver is needed:

https://github.com/hdeller/seabios-hppa/tree/devel



Re: [PATCH] target/hppa: mask upper iaoq bits when returning to narrow mode

2024-04-01 Thread Sven Schnelle
Sven Schnelle  writes:

> Sven Schnelle  writes:
>
>> Richard Henderson  writes:
>>
>>> On 4/1/24 04:52, Sven Schnelle wrote:
>>>> For unknown reasons, Java 1.5 on 64-bit HP-UX 11.11 does signed
>>>> computation of the new IAOQ value in the signal handler. In the
>>>> current code these bits are not masked when returning to narrow
>>>> mode, causing java to crash.
>>>> Signed-off-by: Sven Schnelle 
>>>> ---
>>>>   target/hppa/sys_helper.c | 4 
>>>>   1 file changed, 4 insertions(+)
>>>> diff --git a/target/hppa/sys_helper.c b/target/hppa/sys_helper.c
>>>> index 208e51c086..3bbc2da71b 100644
>>>> --- a/target/hppa/sys_helper.c
>>>> +++ b/target/hppa/sys_helper.c
>>>> @@ -83,6 +83,10 @@ void HELPER(rfi)(CPUHPPAState *env)
>>>>   env->iaoq_f = env->cr[CR_IIAOQ];
>>>>   env->iaoq_b = env->cr_back[1];
>>>>   +if (!(env->cr[CR_IPSW] & PSW_W)) {
>>>> +env->iaoq_f &= 0x;
>>>> +env->iaoq_b &= 0x;
>>>> +}
>>>
>>> This shouldn't be needed, because we are already masking these bits
>>> later, in cpu_get_tb_cpu_state.  But I do have some cleanups in this
>>> area, and perhaps one of them matters.
>> Any thoughts? Otherwise i need to investigate and make a wrong patch
>> again :-)
>
> This seems to be caused by IIAOQ's containing the upper bits. With the
> patch below i'm able to boot. Not sure whether it's correct though.
>
> diff --git a/target/hppa/int_helper.c b/target/hppa/int_helper.c
> index 58c13d3e61..f7c4cca8f1 100644
> --- a/target/hppa/int_helper.c
> +++ b/target/hppa/int_helper.c
> @@ -123,8 +123,14 @@ void hppa_cpu_do_interrupt(CPUState *cs)
>  env->cr[CR_IIASQ] = 0;
>  env->cr_back[0] = 0;
>  }
> -env->cr[CR_IIAOQ] = env->iaoq_f;
> -env->cr_back[1] = env->iaoq_b;
> +if (old_psw & PSW_W) {
> +env->cr[CR_IIAOQ] = env->iaoq_f;
> +env->cr_back[1] = env->iaoq_b;
> +} else {
> +env->cr[CR_IIAOQ] = (env->iaoq_f & 0x);
> +env->cr_back[1] = env->iaoq_b & 0x;
> +}
> +

I guess the interesting question where should these bits get masked out
- i would assume that this place is to late, and it should happen
earlier in trans_be/when the iaoq value is copied. On the other hand
you had one commit that removed the masking in copy_iaoq_entry()...



Re: [PATCH] target/hppa: mask upper iaoq bits when returning to narrow mode

2024-04-01 Thread Sven Schnelle
Sven Schnelle  writes:

> Richard Henderson  writes:
>
>> On 4/1/24 04:52, Sven Schnelle wrote:
>>> For unknown reasons, Java 1.5 on 64-bit HP-UX 11.11 does signed
>>> computation of the new IAOQ value in the signal handler. In the
>>> current code these bits are not masked when returning to narrow
>>> mode, causing java to crash.
>>> Signed-off-by: Sven Schnelle 
>>> ---
>>>   target/hppa/sys_helper.c | 4 
>>>   1 file changed, 4 insertions(+)
>>> diff --git a/target/hppa/sys_helper.c b/target/hppa/sys_helper.c
>>> index 208e51c086..3bbc2da71b 100644
>>> --- a/target/hppa/sys_helper.c
>>> +++ b/target/hppa/sys_helper.c
>>> @@ -83,6 +83,10 @@ void HELPER(rfi)(CPUHPPAState *env)
>>>   env->iaoq_f = env->cr[CR_IIAOQ];
>>>   env->iaoq_b = env->cr_back[1];
>>>   +if (!(env->cr[CR_IPSW] & PSW_W)) {
>>> +env->iaoq_f &= 0x;
>>> +env->iaoq_b &= 0x;
>>> +}
>>
>> This shouldn't be needed, because we are already masking these bits
>> later, in cpu_get_tb_cpu_state.  But I do have some cleanups in this
>> area, and perhaps one of them matters.
>
> Ouch. Appologies, i removed that masking to make 64 bit work, but forgot
> about that. Sorry. I tried your branch, but i'm not able to boot 64 bit HP-UX
> (it wasn't working before without additional changes, so your branch
> doesn't break it). However, i would like to get your oppinion before
> debugging. The code is:
>
> IN: 
> 0xef468002c18:   20 20 08 01   ldil L%-4000,r1
> 0xef468002c1c:   e4 20 e0 08   be,l 4(sr7,r1),sr0,r31
> 0xef468002c20:   34 16 00 72   ldi 39,r22
>
> IA_F 0ef46800:2c23 (0ef468002c23)
> IA_B 01e27c00:c007 (01e27c00c007)
> PSW  0004000f CB    -CQPDI
> GR00  GR01 c000 GR02 2ac3 GR03 
> 7a00
> GR04 0002 GR05 7a34 GR06 7a40 GR07 
> 7a50
> GR08 7ae0 GR09 800040001000 GR10 0003 GR11 
> 0006
> GR12 001ecee8 GR13  GR14 0801 GR15 
> 0006
> GR16 8000400020c0 GR17 0001e720 GR18 0008 GR19 
> 
> GR20 0ef46800 GR21 7a60 GR22  GR23 
> 7a50
> GR24  GR25  GR26 7a0020f0 GR27 
> 40008410
> GR28 0002 GR29  GR30 7a002160 GR31 
> 2c27
> RC   7fff CR1   CR2   CR3  
> 
> CR4   CR5   CR6  0002 CR7  
> fffb
> PID1 256f PID2 7306 CCR  00c0 SAR  
> 0004
> PID3  PID4  IVA  0002a000 EIEM 
> 
> ITMR 0005d637f804 ISQF 0ef46800 IOQF 2aab IIR  
> 6bc23fd9
> ISR  04d0d000 IOR  40007a0020cc IPSW 0004000f EIRR 
> 
> TR0  00abfe68 TR1  0465d1d6 TR2  0002 TR3  
> 
> TR4   TR5  0001757973bb TR6  21eb TR7  
> 7a0020e0
> ISQB 0ef46800 IOQB 2aaf
> SR00 0ef46800 SR01  SR02  SR03 
> SR04 0ef46800 SR05 04d0d000 SR06 01e27c00 SR07 01e27c00
>
> INT   3529: instruction tlb miss fault @ 01e27c00:c007 
> for 01e27c00:4000c004
> INT   3530: instruction tlb miss fault @ :c007 
> for :4000c004
> INT   3531: external interrupt @ :c007 for 
> :4000c004
> INT   3532: instruction tlb miss fault @ :c007 
> for :4000c004
> INT   3533: external interrupt @ :c007 for 
> :4000c004
>
> So the PSW indicates narrow mode, but IAOQ seems to contain all the
> ... bits. Also interesting is that the second TLB miss (INT 3530)
> misses the Space ID.
>
> Any thoughts? Otherwise i need to investigate and make a wrong patch
> again :-)

This seems to be caused by IIAOQ's containing the upper bits. With the
patch below i'm able to boot. Not sure whether it's correct though.

diff --git a/target/hppa/int_helper.c b/target/hppa/int_helper.c
index 58c13d3e61..f7c4cca8f1 100644
--- a/target/hppa/int_helper.c
+++ b/target/hppa/int_helper.c
@@ -123,8 +123,14 @@ void hppa_cpu_do_interrupt(CPUState *cs)
 env->cr[CR_IIASQ] = 0;
 env->cr_back[0] = 0;
 }
-env->cr[CR_IIAOQ] = env->iaoq_f;
-env->cr_back[1] = env->iaoq_b;
+if (old_psw & PSW_W) {
+env->cr[CR_IIAOQ] = env->iaoq_f;
+env->cr_back[1] = env->iaoq_b;
+} else {
+env->cr[CR_IIAOQ] = (env->iaoq_f & 0x);
+env->cr_back[1] = env->iaoq_b & 0x;
+}
+



Re: [PATCH] target/hppa: mask upper iaoq bits when returning to narrow mode

2024-04-01 Thread Sven Schnelle
Richard Henderson  writes:

> On 4/1/24 04:52, Sven Schnelle wrote:
>> For unknown reasons, Java 1.5 on 64-bit HP-UX 11.11 does signed
>> computation of the new IAOQ value in the signal handler. In the
>> current code these bits are not masked when returning to narrow
>> mode, causing java to crash.
>> Signed-off-by: Sven Schnelle 
>> ---
>>   target/hppa/sys_helper.c | 4 
>>   1 file changed, 4 insertions(+)
>> diff --git a/target/hppa/sys_helper.c b/target/hppa/sys_helper.c
>> index 208e51c086..3bbc2da71b 100644
>> --- a/target/hppa/sys_helper.c
>> +++ b/target/hppa/sys_helper.c
>> @@ -83,6 +83,10 @@ void HELPER(rfi)(CPUHPPAState *env)
>>   env->iaoq_f = env->cr[CR_IIAOQ];
>>   env->iaoq_b = env->cr_back[1];
>>   +if (!(env->cr[CR_IPSW] & PSW_W)) {
>> +env->iaoq_f &= 0x;
>> +env->iaoq_b &= 0x;
>> +}
>
> This shouldn't be needed, because we are already masking these bits
> later, in cpu_get_tb_cpu_state.  But I do have some cleanups in this
> area, and perhaps one of them matters.

Ouch. Appologies, i removed that masking to make 64 bit work, but forgot
about that. Sorry. I tried your branch, but i'm not able to boot 64 bit HP-UX
(it wasn't working before without additional changes, so your branch
doesn't break it). However, i would like to get your oppinion before
debugging. The code is:

IN: 
0xef468002c18:   20 20 08 01   ldil L%-4000,r1
0xef468002c1c:   e4 20 e0 08   be,l 4(sr7,r1),sr0,r31
0xef468002c20:   34 16 00 72   ldi 39,r22

IA_F 0ef46800:2c23 (0ef468002c23)
IA_B 01e27c00:c007 (01e27c00c007)
PSW  0004000f CB    -CQPDI
GR00  GR01 c000 GR02 2ac3 GR03 
7a00
GR04 0002 GR05 7a34 GR06 7a40 GR07 
7a50
GR08 7ae0 GR09 800040001000 GR10 0003 GR11 
0006
GR12 001ecee8 GR13  GR14 0801 GR15 
0006
GR16 8000400020c0 GR17 0001e720 GR18 0008 GR19 

GR20 0ef46800 GR21 7a60 GR22  GR23 
7a50
GR24  GR25  GR26 7a0020f0 GR27 
40008410
GR28 0002 GR29  GR30 7a002160 GR31 
2c27
RC   7fff CR1   CR2   CR3  

CR4   CR5   CR6  0002 CR7  
fffb
PID1 256f PID2 7306 CCR  00c0 SAR  
0004
PID3  PID4  IVA  0002a000 EIEM 

ITMR 0005d637f804 ISQF 0ef46800 IOQF 2aab IIR  
6bc23fd9
ISR  04d0d000 IOR  40007a0020cc IPSW 0004000f EIRR 

TR0  00abfe68 TR1  0465d1d6 TR2  0002 TR3  

TR4   TR5  0001757973bb TR6  21eb TR7  
7a0020e0
ISQB 0ef46800 IOQB 2aaf
SR00 0ef46800 SR01  SR02  SR03 
SR04 0ef46800 SR05 04d0d000 SR06 01e27c00 SR07 01e27c00

INT   3529: instruction tlb miss fault @ 01e27c00:c007 for 
01e27c00:4000c004
INT   3530: instruction tlb miss fault @ :c007 for 
:4000c004
INT   3531: external interrupt @ :c007 for 
:4000c004
INT   3532: instruction tlb miss fault @ :c007 for 
:4000c004
INT   3533: external interrupt @ :c007 for 
:4000c004

So the PSW indicates narrow mode, but IAOQ seems to contain all the
... bits. Also interesting is that the second TLB miss (INT 3530)
misses the Space ID.

Any thoughts? Otherwise i need to investigate and make a wrong patch
again :-)

The only patch i have on top which touches target/hppa is the space id
hashing mask patch:

diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index 66cae795bd..b35c7fa688 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -311,12 +311,13 @@ static inline int HPPA_BTLB_ENTRIES(CPUHPPAState *env)

 void hppa_translate_init(void);

+#define HPPA_GVA_OFFSET_MASK64 0x301f
 #define CPU_RESOLVING_TYPE TYPE_HPPA_CPU

 static inline uint64_t gva_offset_mask(target_ulong psw)
 {
 return (psw & PSW_W
-? MAKE_64BIT_MASK(0, 62)
+? HPPA_GVA_OFFSET_MASK64
 : MAKE_64BIT_MASK(0, 32));
 }

But that hoppefully isn't causing the issue here.



[PATCH] target/hppa: mask upper iaoq bits when returning to narrow mode

2024-04-01 Thread Sven Schnelle
For unknown reasons, Java 1.5 on 64-bit HP-UX 11.11 does signed
computation of the new IAOQ value in the signal handler. In the
current code these bits are not masked when returning to narrow
mode, causing java to crash.

Signed-off-by: Sven Schnelle 
---
 target/hppa/sys_helper.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/hppa/sys_helper.c b/target/hppa/sys_helper.c
index 208e51c086..3bbc2da71b 100644
--- a/target/hppa/sys_helper.c
+++ b/target/hppa/sys_helper.c
@@ -83,6 +83,10 @@ void HELPER(rfi)(CPUHPPAState *env)
 env->iaoq_f = env->cr[CR_IIAOQ];
 env->iaoq_b = env->cr_back[1];
 
+if (!(env->cr[CR_IPSW] & PSW_W)) {
+env->iaoq_f &= 0x;
+env->iaoq_b &= 0x;
+}
 /*
  * For pa2.0, IIASQ is the top bits of the virtual address.
  * To recreate the space identifier, remove the offset bits.
-- 
2.43.2




Re: [PATCH 2/3] target/hppa: mask offset bits in gva

2024-03-28 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/23/24 22:09, Sven Schnelle wrote:
>> The CPU seems to mask a few bits in the offset when running
>> under HP-UX. ISR/IOR register contents for an address in
>> the processor HPA (0xfffa) on my C8000 and J6750:
>> running on Linux: 3fff c000fffa0500
>> running on HP-UX: 301f c000fffa0500
>> I haven't found how this is switched (guess some diag in the
>> firmware), but linux + seabios seems to handle that as well,
>> so lets mask out the additional bits.
>> Signed-off-by: Sven Schnelle 
>> [..]
> [..]
> Though my argument would suggest the mask should be 0xff for the
> 40-bit physical address, which is not what you see at all, so perhaps
> the thing is moot.  I am at a loss to explain why or how HP-UX gets a
> 7-bit hole in the ISR result.
>
> On the other hand, there are some not-well-documented shenanigans (aka
> implementation defined behaviour) between Figure H-8 and Figure H-11,
> where the 62-bit absolute address is expanded to a 64-bit logical
> physical address and then compacted to a 40-bit implementation
> physical address.
>
> We've already got hacks in place for this in hppa_abs_to_phys_pa2_w1,
> which just truncates everything down to 40 bits.  But that's probably
> not what the processor is really doing.

I looked into this again, and it's caused by Space-ID hashing. HP-UX asks
PDC/Firmware how many bits are used for the hashing. seabios returns
zero, in which case HP-UX uses a default mask of 0xf01f.
By modifying seabios, i can make HP-UX use the appropriate mask, but
switching of SpaceID hashing entirely is impossible. The reason why
the CPU doesn't strip the bits when running linux is that Linux switches
of Space-ID hashing early in the startup code (before mm gets
initialized).

My J6750 Firmware only returns two values: 0 when Space-ID hashing is
off, 0xfe0 when it is enabled. This is hardcoded in the firmware - the
only thing PDC checks is a bit in Debug Register 2, which enables
Space-ID hashing. 0xfe0 matches the 0xf01f... mask used by HP-UX
pretty well.

So if qemu wants to run 64 Bit HP-UX the proper way, i guess it needs
to implement Space-ID hashing.



Re: [PATCH for-9.0] target/hppa: Clear psw_n for BE on use_nullify_skip path

2024-03-26 Thread Sven Schnelle
Richard Henderson  writes:

> Along this path we have already skipped the insn to be
> nullified, so the subsequent insn should be executed.
>
> Cc: qemu-sta...@nongnu.org
> Reported-by: Sven Schnelle 
> Signed-off-by: Richard Henderson 
> ---
>  target/hppa/translate.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
> index 143818c2d9..8a1a8bc3aa 100644
> --- a/target/hppa/translate.c
> +++ b/target/hppa/translate.c
> @@ -3948,6 +3948,7 @@ static bool trans_be(DisasContext *ctx, arg_be *a)
>  copy_iaoq_entry(ctx, cpu_iaoq_b, -1, tmp);
>  tcg_gen_mov_i64(cpu_iasq_f, new_spc);
>  tcg_gen_mov_i64(cpu_iasq_b, cpu_iasq_f);
> +nullify_set(ctx, 0);
>  } else {
>  copy_iaoq_entry(ctx, cpu_iaoq_f, ctx->iaoq_b, cpu_iaoq_b);
>  if (ctx->iaoq_b == -1) {

Tested-by: Sven Schnelle 



Re: target/hppa: be,n nullifying first insn at branch target?

2024-03-26 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/25/24 09:33, Sven Schnelle wrote:
>> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
>> index 7546a5f5a2..56c68a7cbe 100644
>> --- a/target/hppa/translate.c
>> +++ b/target/hppa/translate.c
>> @@ -3847,7 +3849,7 @@ static bool trans_be(DisasContext *ctx, arg_be *a)
>>   copy_iaoq_entry(ctx, cpu_gr[31], ctx->iaoq_n, ctx->iaoq_n_var);
>>   tcg_gen_mov_i64(cpu_sr[0], cpu_iasq_b);
>>   }
>> -if (a->n && use_nullify_skip(ctx)) {
>> +if (0 && a->n && use_nullify_skip(ctx)) {
>>   copy_iaoq_entry(ctx, cpu_iaoq_f, -1, tmp);
>>   tcg_gen_addi_i64(tmp, tmp, 4);
>>   copy_iaoq_entry(ctx, cpu_iaoq_b, -1, tmp);
>> So i think the problem is caused by this optimization. Does this
>> ring a
>> bell? Debugging this is rather hard, alone the logfile above is 6GB in
>> size...
>
> The problem is a missing
>
> nullify_set(ctx, 0)
>
> within this block.
>
> I have patches queued to reorg the IAQ handling, which I hope will fix
> the problem Sven saw with spaces.  It would fix this as a consequence
> of other unification.  But I think it's a bit too big for 9.0.

Thanks Richard. Let me know if you want me to test patches.



target/hppa: be,n nullifying first insn at branch target?

2024-03-25 Thread Sven Schnelle
Hi Richard,

one of the last issue i'm seeing with 64bit HP-UX 11.11 is a crash
of ld64 while building (linking) the kernel. The program is killed
due to a unaligned access. I've wrote a logfile, and it looks like
be,n is nullifying the first instruction at the branch target:

IN: 
0x9d4a416d4fc:  cmpiclr,<> c,r31,r0
0x9d4a416d500:  addi,tr 13,r0,ret1
0x9d4a416d504:  ldi 0,ret1
0x9d4a416d508:  ldi 0,ret0
0x9d4a416d50c:  ldsid (rp),r31
0x9d4a416d510:  mtsp r31,sr0
0x9d4a416d514:  be,n 0(sr0,rp)

Trace 0: 0x7efd7f9d75c0 [9d4a404/09d4a416d4fc/00040306/ff00] 
IA_F 09d4a416d4ff IA_B 09d4a416d503 IIR 4afc3ff9
PSW  00ff0004ff1f CB    -C---RQPDI
GR00  GR01 09d4a400 GR02 00107573 GR03 
003c79b8
GR04 00419f50 GR05 00410a30 GR06 7a0005c8 GR07 
00419f50
GR08 004122f8 GR09 000b GR10 001db1b8 GR11 
001c81f8
GR12 001c81f8 GR13 001c81f8 GR14  GR15 
001dbf18
GR16  GR17 0001 GR18 001d5278 GR19 
001c5000
GR20 00416a40 GR21 006a GR22 0016d4e8 GR23 
004a
GR24 029c GR25  GR26 00419f50 GR27 
001e65f8
GR28 0001 GR29 001db1b8 GR30 7a029510 GR31 
000c
SR00 09d4a400 SR01  SR02  SR03 
SR04 09d4a400 SR05 09d4a400 SR06 01eea400 SR07 01eea400


IN: 
0x9d4a4107570:  ldw 0(r4),r19
0x9d4a4107574:  ldw 58(r19),r22
0x9d4a4107578:  ldo 0(ret1),r7
0x9d4a410757c:  ldo 0(r4),r26
0x9d4a4107580:  b,l 0x9d4a41074d8,r31
0x9d4a4107584:  ldo 0(r31),rp

Trace 0: 0x7efd7f9d77c0 [9d4a404/09d4a4107570/00240306/ff00] 
IA_F 09d4a4107573 IA_B 09d4a4107577 IIR 4afc3ff9
PSW  0024001f CB    --N--C---RQPDI
GR00  GR01 09d4a400 GR02 00107573 GR03 
003c79b8
GR04 00419f50 GR05 00410a30 GR06 7a0005c8 GR07 
00419f50
GR08 004122f8 GR09 000b GR10 001db1b8 GR11 
001c81f8
GR12 001c81f8 GR13 001c81f8 GR14  GR15 
001dbf18
GR16  GR17 0001 GR18 001d5278 GR19 
001c5000
GR20 00416a40 GR21 006a GR22 0016d4e8 GR23 
004a
GR24 029c GR25  GR26 00419f50 GR27 
001e65f8
GR28  GR29 0013 GR30 7a029510 GR31 
09d4a400
SR00 09d4a400 SR01  SR02  SR03 
SR04 09d4a400 SR05 09d4a400 SR06 01eea400 SR07 01eea400

Trace 0: 0x7efd7f9adb80 [9d4a404/09d4a41074d8/00040306/ff00] 
IA_F 09d4a41074db IA_B 09d4a41074df IIR 4afc3ff9
PSW  0004001f CB    -C---RQPDI
GR00  GR01 09d4a400 GR02 0010758b GR03 
003c79b8
GR04 00419f50 GR05 00410a30 GR06 7a0005c8 GR07 
0013
GR08 004122f8 GR09 000b GR10 001db1b8 GR11 
001c81f8
GR12 001c81f8 GR13 001c81f8 GR14  GR15 
001dbf18
GR16  GR17 0001 GR18 001d5278 GR19 
001c5000
GR20 00416a40 GR21 006a GR22 2400 GR23 
004a
GR24 029c GR25  GR26 00419f50 GR27 
001e65f8
GR28  GR29 0013 GR30 7a029510 GR31 
0010758b
SR00 09d4a400 SR01  SR02  SR03 
SR04 09d4a400 SR05 09d4a400 SR06 01eea400 SR07 01eea400

The problem is the 0x1c5000 value in r19, which is the value of an old
instruction. At 0x9d4a416d514 is a be,n and at 09d4a4107570 the
N bit is set. First i thought it might be just a display issue with the
log, but the following change to confirm the issue makes the kernel
linking succeed:


diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 7546a5f5a2..56c68a7cbe 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3847,7 +3849,7 @@ static bool trans_be(DisasContext *ctx, arg_be *a)
 copy_iaoq_entry(ctx, cpu_gr[31], ctx->iaoq_n, ctx->iaoq_n_var);
 tcg_gen_mov_i64(cpu_sr[0], cpu_iasq_b);
 }
-if (a->n && use_nullify_skip(ctx)) {
+if (0 && a->n && use_nullify_skip(ctx)) {
 copy_iaoq_entry(ctx, cpu_iaoq_f, -1, tmp);
 tcg_gen_addi_i64(tmp, tmp, 4);
 copy_iaoq_entry(ctx, cpu_iaoq_b, -1, tmp);


So i think the problem is caused by this optimization. Does this ring a
bell? Debugging this is rather hard, alone the logfile above is 6GB in
size...

Thanks,
Sven



Re: [PATCH 2/3] target/hppa: mask offset bits in gva

2024-03-24 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/24/24 08:41, Sven Schnelle wrote:
>> 7f09e0: val=000fffb0301f r2=110e0f01 r1=01fff600 
>> phys=fffb 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
>> 'val' is the value constructed from IOR/ISR,
>
> Is this byte swapped in some weird way?  I do not see how 'val'
> corresponds to any of the addresses we're talking about.  From here,
> the string "301f" appears to an out-of-context grep hit.

It's just both values combined together, where the 301f is basically
the ISR content. It's not a out of context grep - the real machines i have
are constructing the same value, and the same offset into the pagetable.
I verified that by patching the DTLB miss handler in HPUX to write the
ISR/IOR and calulated pagetable offset/value to PAGE0 and looked with
the kernel debugger at the values.



Re: [PATCH 2/3] target/hppa: mask offset bits in gva

2024-03-24 Thread Sven Schnelle
Hi Richard,

Richard Henderson  writes:

> In particular Figure 2-14 for "data translation disabled" may be
> instructive.  Suppose the cpu does not implement all of the physical
> address lines (true for all extant pa-risc cpus; qemu implements 40
> bits to match pa-8500 iirc).  Suppose when reporting a trap with
> translation disabled, it is a truncated physical address that is used
> as input to Figure 2-14.
>
> If that is so, then the fix might be in hppa_set_ior_and_isr.  Perhaps
>
> -env->cr[CR_ISR] &= 0x3fff;
> +env->cr[CR_ISR] &= 0x301f;
>
> Though my argument would suggest the mask should be 0xff for the
> 40-bit physical address, which is not what you see at all, so perhaps
> the thing is moot.  I am at a loss to explain why or how HP-UX gets a
> 7-bit hole in the ISR result.
>
> On the other hand, there are some not-well-documented shenanigans (aka
> implementation defined behaviour) between Figure H-8 and Figure H-11,
> where the 62-bit absolute address is expanded to a 64-bit logical
> physical address and then compacted to a 40-bit implementation
> physical address.
>
> We've already got hacks in place for this in hppa_abs_to_phys_pa2_w1,
> which just truncates everything down to 40 bits.  But that's probably
> not what the processor is really doing.
>
> Anyhow, will you please try the hppa_set_ior_and_isr change and see if
> that fixes your HP-UX problems?

The problem occurs with data address translation - it's working without,
which is not suprising because no exception can happen there. But as
soon as the kernel enables address translation it will hit a data tlb
miss exception because it can't find 0xfffb0500 in the page
tables. Trying to truncate the ISR in hppa_set_ior_and_isr() for the
data translation enabled case leads to this loop:

hppa_tlb_fill_excp env=0x55bf06e976e0 addr=0x3ffb0500 size=4 type=0 
mmu_idx=9
hppa_tlb_find_entry env=0x55bf06e976e0 ent=0x55bf06e97b30 valid=1 va_b=0x20 
va_e=0x2f pa=0x20
hppa_tlb_get_physical_address env=0x55bf06e976e0 ret=-1 prot=5 addr=0x26170c 
phys=0x26170c
hppa_tlb_flush_ent env=0x55bf06e976e0 ent=0x55bf06e97bf0 
va_b=0x301b va_e=0x301b0fff pa=0xfffb
hppa_tlb_itlba env=0x55bf06e976e0 ent=0x55bf06e97bf0 va_b=0x301b 
va_e=0x301b0fff pa=0xfffb
hppa_tlb_itlbp env=0x55bf06e976e0 ent=0x55bf06e97bf0 access_id=0 u=1 pl2=0 
pl1=0 type=1 b=0 d=0 t=0

So qemu is looking up 0x3ffb0500 in the TLB, can't find it,
raises an exception, HP-UX says: "ah nice, i have a translation for
you", but that doesn't match because we're only stripping the bits
in the ISR.

As i was a bit puzzled in the beginning what's going on, i dumped the
pagetables and wrote a small dump program:

68: val=000f47ff301f r2=110e0f01 r1=01e8ffe0 
phys=f47ff000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
680020: val=000f47fe301f r2=110e0f01 r1=01e8ffc0 
phys=f47fe000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
680060: val=000f47fc301f r2=110e0f01 r1=01e8ff80 
phys=f47fc000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5860: val=000fed3c301f r2=010e0001 r1=01fda780 
phys=fed3c000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d58e0: val=000fed38301f r2=010e0001 r1=01fda700 
phys=fed38000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d59a0: val=000fed32301f r2=010e0001 r1=01fda640 
phys=fed32000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d59e0: val=000fed30301f r2=110e0f01 r1=01fda600 
phys=fed3 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a00: val=000fed2f301f r2=010e0001 r1=01fda5e0 
phys=fed2f000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a20: val=000fed2e301f r2=010e0001 r1=01fda5c0 
phys=fed2e000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a40: val=000fed2d301f r2=010e0001 r1=01fda5a0 
phys=fed2d000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a60: val=000fed2c301f r2=010e0001 r1=01fda580 
phys=fed2c000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a80: val=000fed2b301f r2=010e0001 r1=01fda560 
phys=fed2b000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5aa0: val=000fed2a301f r2=010e0001 r1=01fda540 
phys=fed2a000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ac0: val=000fed29301f r2=010e0001 r1=01fda520 
phys=fed29000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ae0: val=000fed28301f r2=010e0001 r1=01fda500 
phys=fed28000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b00: val=000fed27301f r2=010e0001 r1=01fda4e0 
phys=fed27000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b20: val=000fed26301f r2=010e0001 r1=01fda4c0 
phys=fed26000 4K aid=1 pl1=0, pl2=0 type=1 (

[PATCH 3/3] target/hppa: fix building gva for wide mode

2024-03-24 Thread Sven Schnelle
64 Bit hppa no longer has a fixed 32/32 bit split between space and
offset. Instead it uses 42 bits for the offset. The lower 10 bits of
the space are always zero, leaving 22 bits actually used. Simply or
the values together to build the gva.

Signed-off-by: Sven Schnelle 
---
 target/hppa/mem_helper.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/target/hppa/mem_helper.c b/target/hppa/mem_helper.c
index 84785b5a5c..6f895fced7 100644
--- a/target/hppa/mem_helper.c
+++ b/target/hppa/mem_helper.c
@@ -523,13 +523,16 @@ void HELPER(itlbp_pa11)(CPUHPPAState *env, target_ulong 
addr, target_ulong reg)
 }
 
 static void itlbt_pa20(CPUHPPAState *env, target_ulong r1,
-   target_ulong r2, vaddr va_b)
+   target_ulong r2, uint64_t spc, uint64_t off)
 {
 HPPATLBEntry *ent;
-vaddr va_e;
+vaddr va_b, va_e;
 uint64_t va_size;
 int mask_shift;
 
+va_b = off & gva_offset_mask(env->psw);
+va_b |= spc << 32;
+
 mask_shift = 2 * (r1 & 0xf);
 va_size = (uint64_t)TARGET_PAGE_SIZE << mask_shift;
 va_b &= -va_size;
@@ -569,14 +572,12 @@ static void itlbt_pa20(CPUHPPAState *env, target_ulong r1,
 
 void HELPER(idtlbt_pa20)(CPUHPPAState *env, target_ulong r1, target_ulong r2)
 {
-vaddr va_b = deposit64(env->cr[CR_IOR], 32, 32, env->cr[CR_ISR]);
-itlbt_pa20(env, r1, r2, va_b);
+itlbt_pa20(env, r1, r2, env->cr[CR_ISR], env->cr[CR_IOR]);
 }
 
 void HELPER(iitlbt_pa20)(CPUHPPAState *env, target_ulong r1, target_ulong r2)
 {
-vaddr va_b = deposit64(env->cr[CR_IIAOQ], 32, 32, env->cr[CR_IIASQ]);
-itlbt_pa20(env, r1, r2, va_b);
+itlbt_pa20(env, r1, r2, env->cr[CR_IIASQ], env->cr[CR_IIAOQ]);
 }
 
 /* Purge (Insn/Data) TLB. */
-- 
2.43.2




[PATCH 0/3] few hppa fixes for 64bit mode

2024-03-24 Thread Sven Schnelle
Hi,

in preparation of getting 64bit HP-UX running in qemu, here are a few fixes
to make HP-UX progress a bit further.

Sven Schnelle (3):
  target/hppa: use gva_offset_mask() everywhere
  target/hppa: mask offset bits in gva
  target/hppa: fix building gva for wide mode

 target/hppa/cpu.h| 11 +--
 target/hppa/mem_helper.c | 13 +++--
 target/hppa/translate.c  | 12 +++-
 3 files changed, 19 insertions(+), 17 deletions(-)

-- 
2.43.2




[PATCH 2/3] target/hppa: mask offset bits in gva

2024-03-24 Thread Sven Schnelle
The CPU seems to mask a few bits in the offset when running
under HP-UX. ISR/IOR register contents for an address in
the processor HPA (0xfffa) on my C8000 and J6750:

running on Linux: 3fff c000fffa0500
running on HP-UX: 301f c000fffa0500

I haven't found how this is switched (guess some diag in the
firmware), but linux + seabios seems to handle that as well,
so lets mask out the additional bits.

Signed-off-by: Sven Schnelle 
---
 target/hppa/cpu.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index a072d0bb63..9bc4d208fa 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -283,12 +283,13 @@ static inline int HPPA_BTLB_ENTRIES(CPUHPPAState *env)
 
 void hppa_translate_init(void);
 
+#define HPPA_GVA_OFFSET_MASK64 0x301f
 #define CPU_RESOLVING_TYPE TYPE_HPPA_CPU
 
 static inline uint64_t gva_offset_mask(target_ulong psw)
 {
 return (psw & PSW_W
-? MAKE_64BIT_MASK(0, 62)
+? HPPA_GVA_OFFSET_MASK64
 : MAKE_64BIT_MASK(0, 32));
 }
 
-- 
2.43.2




[PATCH 1/3] target/hppa: use gva_offset_mask() everywhere

2024-03-24 Thread Sven Schnelle
move it to cpu.h, so it can also be used in hppa_form_gva_psw()

Signed-off-by: Sven Schnelle 
---
 target/hppa/cpu.h   | 10 --
 target/hppa/translate.c | 12 +++-
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index a92dc352cb..a072d0bb63 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -285,14 +285,20 @@ void hppa_translate_init(void);
 
 #define CPU_RESOLVING_TYPE TYPE_HPPA_CPU
 
+static inline uint64_t gva_offset_mask(target_ulong psw)
+{
+return (psw & PSW_W
+? MAKE_64BIT_MASK(0, 62)
+: MAKE_64BIT_MASK(0, 32));
+}
+
 static inline target_ulong hppa_form_gva_psw(target_ulong psw, uint64_t spc,
  target_ulong off)
 {
 #ifdef CONFIG_USER_ONLY
 return off;
 #else
-off &= psw & PSW_W ? MAKE_64BIT_MASK(0, 62) : MAKE_64BIT_MASK(0, 32);
-return spc | off;
+return spc | (off & gva_offset_mask(psw));
 #endif
 }
 
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 19594f917e..0af125ed74 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -586,17 +586,10 @@ static bool nullify_end(DisasContext *ctx)
 return true;
 }
 
-static uint64_t gva_offset_mask(DisasContext *ctx)
-{
-return (ctx->tb_flags & PSW_W
-? MAKE_64BIT_MASK(0, 62)
-: MAKE_64BIT_MASK(0, 32));
-}
-
 static void copy_iaoq_entry(DisasContext *ctx, TCGv_i64 dest,
 uint64_t ival, TCGv_i64 vval)
 {
-uint64_t mask = gva_offset_mask(ctx);
+uint64_t mask = gva_offset_mask(ctx->tb_flags);
 
 if (ival != -1) {
 tcg_gen_movi_i64(dest, ival & mask);
@@ -1403,7 +1396,8 @@ static void form_gva(DisasContext *ctx, TCGv_i64 *pgva, 
TCGv_i64 *pofs,
 
 *pofs = ofs;
 *pgva = addr = tcg_temp_new_i64();
-tcg_gen_andi_i64(addr, modify <= 0 ? ofs : base, gva_offset_mask(ctx));
+tcg_gen_andi_i64(addr, modify <= 0 ? ofs : base,
+ gva_offset_mask(ctx->tb_flags));
 #ifndef CONFIG_USER_ONLY
 if (!is_phys) {
 tcg_gen_or_i64(addr, addr, space_select(ctx, sp, base));
-- 
2.43.2




Re: [PATCH 1/3] target/hppa: add unit conditions for wide mode

2024-03-21 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/21/24 08:42, Sven Schnelle wrote:
>> Wide mode provides two more conditions, add them.
>> Signed-off-by: Sven Schnelle 
>> ---
>>   target/hppa/translate.c | 25 +++--
>>   1 file changed, 23 insertions(+), 2 deletions(-)
>> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
>> index 8a87996fc1..f493e207e1 100644
>> --- a/target/hppa/translate.c
>> +++ b/target/hppa/translate.c
>> @@ -963,11 +963,22 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
>> TCGv_i64 res,
>> switch (cf >> 1) {
>>   case 0: /* never / TR */
>> -case 1: /* undefined */
>> -case 5: /* undefined */
>>   cond = cond_make_f();
>>   break;
>>   +case 1:
>
> Wants a comment for /* SWZ / NWZ */
>
>> +case 5:
>
> /* SWC / NWC */

Are you going to fix that up, or should i send a v2?



Re: [PATCH 2/3] target/hppa: sub: fix trap on overflow for narrow mode

2024-03-21 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/21/24 08:42, Sven Schnelle wrote:
>> Signed-off-by: Sven Schnelle 
>> ---
>>   target/hppa/translate.c | 3 +++
>>   1 file changed, 3 insertions(+)
>> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
>> index f493e207e1..4d2b96f876 100644
>> --- a/target/hppa/translate.c
>> +++ b/target/hppa/translate.c
>> @@ -1213,6 +1213,9 @@ static void do_sub(DisasContext *ctx, unsigned rt, 
>> TCGv_i64 in1,
>>   if (is_tsv || cond_need_sv(c)) {
>>   sv = do_sub_sv(ctx, dest, in1, in2);
>>   if (is_tsv) {
>> +if (!d) {
>> +tcg_gen_ext32s_i64(sv, sv);
>> +}
>>   gen_helper_tsv(tcg_env, sv);
>>   }
>>   }
>
> Difficult to pinpoint exactly which patch should have added this.  :-)

Yes, after missing the Fixes: tags on all of my patches in the last
patchset, i tried add one but wasn't sure either. :-)

> Reviewed-by: Richard Henderson 

Thanks!



[PATCH 3/3] target/hppa: add: fix trap on overflow for narrow mode

2024-03-21 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 4d2b96f876..74a9ea0cd8 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1122,6 +1122,9 @@ static void do_add(DisasContext *ctx, unsigned rt, 
TCGv_i64 in1,
 if (is_tsv || cond_need_sv(c)) {
 sv = do_add_sv(ctx, dest, in1, in2);
 if (is_tsv) {
+if (!d) {
+tcg_gen_ext32s_i64(sv, sv);
+}
 /* ??? Need to include overflow from shift.  */
 gen_helper_tsv(tcg_env, sv);
 }
-- 
2.43.2




[PATCH 1/3] target/hppa: add unit conditions for wide mode

2024-03-21 Thread Sven Schnelle
Wide mode provides two more conditions, add them.

Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 8a87996fc1..f493e207e1 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -963,11 +963,22 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
TCGv_i64 res,
 
 switch (cf >> 1) {
 case 0: /* never / TR */
-case 1: /* undefined */
-case 5: /* undefined */
 cond = cond_make_f();
 break;
 
+case 1:
+if (d) {
+tmp = tcg_temp_new_i64();
+tcg_gen_subi_i64(tmp, res, d_repl * 0x0001u);
+tcg_gen_andc_i64(tmp, tmp, res);
+tcg_gen_andi_i64(tmp, tmp, d_repl * 0x8000u);
+cond = cond_make_0(TCG_COND_NE, tmp);
+} else {
+/* undefined */
+cond = cond_make_f();
+}
+break;
+
 case 2: /* SBZ / NBZ */
 /* See hasless(v,1) from
  * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
@@ -992,6 +1003,16 @@ static DisasCond do_unit_cond(unsigned cf, bool d, 
TCGv_i64 res,
 cond = cond_make_0(TCG_COND_NE, cb);
 break;
 
+case 5:
+if (d) {
+tcg_gen_andi_i64(cb, cb, d_repl * 0x8000u);
+cond = cond_make_0(TCG_COND_NE, cb);
+} else {
+/* undefined */
+cond = cond_make_f();
+}
+break;
+
 case 6: /* SBC / NBC */
 tcg_gen_andi_i64(cb, cb, d_repl * 0x80808080u);
 cond = cond_make_0(TCG_COND_NE, cb);
-- 
2.43.2




[PATCH 2/3] target/hppa: sub: fix trap on overflow for narrow mode

2024-03-21 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index f493e207e1..4d2b96f876 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1213,6 +1213,9 @@ static void do_sub(DisasContext *ctx, unsigned rt, 
TCGv_i64 in1,
 if (is_tsv || cond_need_sv(c)) {
 sv = do_sub_sv(ctx, dest, in1, in2);
 if (is_tsv) {
+if (!d) {
+tcg_gen_ext32s_i64(sv, sv);
+}
 gen_helper_tsv(tcg_env, sv);
 }
 }
-- 
2.43.2




Re: [PULL 0/9] target/hppa fixes for 9.0

2024-03-21 Thread Sven Schnelle
Michael Tokarev  writes:

> 20.03.2024 03:32, Richard Henderson :
>
>> Richard Henderson (3):
>>target/hppa: Fix assemble_16 insns for wide mode
>>target/hppa: Fix assemble_11a insns for wide mode
>>target/hppa: Fix assemble_12a insns for wide mode
>> Sven Schnelle (6):
>>target/hppa: ldcw,s uses static shift of 3
>>target/hppa: fix shrp for wide mode
>>target/hppa: fix access_id check
>>target/hppa: exit tb on flush cache instructions
>>target/hppa: mask privilege bits in mfia
>>target/hppa: fix do_stdby_e()
>
> Is it all -stable material (when appropriate)?

I'd say yes.



Re: [RFC PATCH] target/hppa: Introduce and use IAQEntry

2024-03-19 Thread Sven Schnelle
Sven Schnelle  writes:

> Hi Richard,
>
> Richard Henderson  writes:
>
>> Wrap offset and space into a single structure, with
>> offset represented either as a constant or as variable.
>> This enhances copy_iaoq_entry to copy the space, and at
>> the same time simplifying the code.
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>
>> Sven, I think this might solve your branch-with-space-change issue.
>> There are a couple of places where it does appear as if we might be
>> losing a space change, with a branch within a branch delay slot.
>> I'm trying to solve this by keeping both queue entries together.
>
> Thanks! I've put that on top of the other changes i have for 64bit
> HP-UX, but it introduces a new problem:
>
>* Checking configuration for consistency...
> WARNING: The disk at: 10/0/0/0.0.0 (QEMU_QEMU_HARDDISK) appears to contain a 
>  file system and boot area.  Continuing the installation will destroy 
>  any existing data on this disk.
>
>  Do you wish to cancel the non-interactive installation in order to
>  respond to the warnings above? ([y]/n): n
>   
>* Loading configuration utility...
>* Beginning installation from source: /dev/dsk/c0t2d0
> ===  03/19/24 13:59:53 EST  Starting system configuration...
>* Configure_Disks:  Begin
>* Will install B.11.11 onto this system.
>* Creating LVM physical volume "/dev/rdsk/c0t0d0" (10/0/0/0.0.0).
>* Creating volume group "vg00".
> vgcreate: Illegal path "".
>* Creating logical volume "vg00/lvol1" (/stand).
> lvcreate: Illegal path "".
>* Creating logical volume "vg00/lvol2" (swap_dump).
> lvcreate: Illegal path "".
>* Creating logical volume "vg00/lvol3" (/).
> lvcreate: Illegal path "".
>* Creating logical volume "vg00/lvol4" (/tmp).
> lvcreate: Illegal path "".
>* Creating logical volume "vg00/lvol5" (/home).
> lvcreate: Illegal path "".
>* Creating logical volume "vg00/lvol6" (/opt).
> lvcreate: Illegal path "".
>
> The 'illegal path' errors where not present before. I haven't done any
> debugging, as debugging hppa userspace in qemu is a really time
> consuming thing.

I'm seeing the same error with 32bit HP-UX 10.20 - so at least it's not
caused by my 64bit hacks. Linux seems to run fine, but Linux is also
utilizing address spaces not to the same extent as HP-UX.



Re: [RFC PATCH] target/hppa: Introduce and use IAQEntry

2024-03-19 Thread Sven Schnelle
Hi Richard,

Richard Henderson  writes:

> Wrap offset and space into a single structure, with
> offset represented either as a constant or as variable.
> This enhances copy_iaoq_entry to copy the space, and at
> the same time simplifying the code.
>
> Signed-off-by: Richard Henderson 
> ---
>
> Sven, I think this might solve your branch-with-space-change issue.
> There are a couple of places where it does appear as if we might be
> losing a space change, with a branch within a branch delay slot.
> I'm trying to solve this by keeping both queue entries together.

Thanks! I've put that on top of the other changes i have for 64bit
HP-UX, but it introduces a new problem:

   * Checking configuration for consistency...
WARNING: The disk at: 10/0/0/0.0.0 (QEMU_QEMU_HARDDISK) appears to contain a 
 file system and boot area.  Continuing the installation will destroy 
 any existing data on this disk.

 Do you wish to cancel the non-interactive installation in order to
 respond to the warnings above? ([y]/n): n
  
   * Loading configuration utility...
   * Beginning installation from source: /dev/dsk/c0t2d0
===  03/19/24 13:59:53 EST  Starting system configuration...
   * Configure_Disks:  Begin
   * Will install B.11.11 onto this system.
   * Creating LVM physical volume "/dev/rdsk/c0t0d0" (10/0/0/0.0.0).
   * Creating volume group "vg00".
vgcreate: Illegal path "".
   * Creating logical volume "vg00/lvol1" (/stand).
lvcreate: Illegal path "".
   * Creating logical volume "vg00/lvol2" (swap_dump).
lvcreate: Illegal path "".
   * Creating logical volume "vg00/lvol3" (/).
lvcreate: Illegal path "".
   * Creating logical volume "vg00/lvol4" (/tmp).
lvcreate: Illegal path "".
   * Creating logical volume "vg00/lvol5" (/home).
lvcreate: Illegal path "".
   * Creating logical volume "vg00/lvol6" (/opt).
lvcreate: Illegal path "".

The 'illegal path' errors where not present before. I haven't done any
debugging, as debugging hppa userspace in qemu is a really time
consuming thing.

But when you're add ioaq/iasq handling - The branch i was referring to is
https://github.com/svenschnelle/qemu/commits/hppa_fixes64/

With the main difference in commits:

https://github.com/svenschnelle/qemu/commit/a67a19ac082ab3435009c1ab3b57bfe6d9588827
https://github.com/svenschnelle/qemu/commit/ca288e26f92b61cc9148fa5cfe7552aba6241466

The second commit is really hacky, because it tries to hack around the
problem the we have 2 bits Space select, 22 bits Space and 42 bits
offset in PA2.0. And than there's also the problem that the
iaoq_f/iaoq_b difference might get encoded into cs_base, but that works
only if it's less than 32bits. I don't have a solution for that.

But at least it gets a 64 bit HP-UX image from my C8000 booted in qemu.
So if you know of a nice way to handle the space selector bits + 64 Bit
GVA, im happy to test patches :-)

> r~
>
> ---
>  target/hppa/translate.c | 498 +---
>  1 file changed, 263 insertions(+), 235 deletions(-)
>
> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
> index be0b0494d0..e555d542c8 100644
> --- a/target/hppa/translate.c
> +++ b/target/hppa/translate.c
> @@ -41,15 +41,31 @@ typedef struct DisasCond {
>  TCGv_i64 a0, a1;
>  } DisasCond;
>  
> +/* Describe a complete Instruction Address Queue entry. */
> +typedef struct IAQEntry {
> +/*
> + * Non-constant value of ia space.
> + * When null, IASQ_Front, IASQ_Back, and IASQ_Next are all equal.
> + */
> +TCGv_i64 space;
> +/*
> + * Non-constant value of ia offset, or env destination.
> + * If non-null, will always be set on copy.
> + */
> +TCGv_i64 ofsv;
> +/*
> + * Known constant value of ia offset with no change to ia space.
> + * Unused if either spc or ofsv are non-null.
> + */
> +uint64_t ofsc;
> +} IAQEntry;
> +
>  typedef struct DisasContext {
>  DisasContextBase base;
>  CPUState *cs;
>  TCGOp *insn_start;
>  
> -uint64_t iaoq_f;
> -uint64_t iaoq_b;
> -uint64_t iaoq_n;
> -TCGv_i64 iaoq_n_var;
> +IAQEntry iaq_f, iaq_b, iaq_n;
>  
>  DisasCond null_cond;
>  TCGLabel *null_lab;
> @@ -221,15 +237,13 @@ static int cmpbid_c(DisasContext *ctx, int val)
>  static TCGv_i64 cpu_gr[32];
>  static TCGv_i64 cpu_sr[4];
>  static TCGv_i64 cpu_srH;
> -static TCGv_i64 cpu_iaoq_f;
> -static TCGv_i64 cpu_iaoq_b;
> -static TCGv_i64 cpu_iasq_f;
> -static TCGv_i64 cpu_iasq_b;
>  static TCGv_i64 cpu_sar;
>  static TCGv_i64 cpu_

[PATCH v2 3/6] target/hppa: fix access_id check

2024-03-19 Thread Sven Schnelle
PA2.0 provides 8 instead of 4 PID registers.

Signed-off-by: Sven Schnelle 
---
 target/hppa/mem_helper.c | 59 +---
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/target/hppa/mem_helper.c b/target/hppa/mem_helper.c
index 80f51e753f..f89ba91b20 100644
--- a/target/hppa/mem_helper.c
+++ b/target/hppa/mem_helper.c
@@ -152,6 +152,49 @@ static HPPATLBEntry *hppa_alloc_tlb_ent(CPUHPPAState *env)
 return ent;
 }
 
+#define ACCESS_ID_MASK 0x
+
+/* Return the set of protections allowed by a PID match. */
+static int match_prot_id_1(uint32_t access_id, uint32_t prot_id)
+{
+if (((access_id ^ (prot_id >> 1)) & ACCESS_ID_MASK) == 0) {
+return (prot_id & 1
+? PROT_EXEC | PROT_READ
+: PROT_EXEC | PROT_READ | PROT_WRITE);
+}
+return 0;
+}
+
+static int match_prot_id32(CPUHPPAState *env, uint32_t access_id)
+{
+int r, i;
+
+for (i = CR_PID1; i <= CR_PID4; ++i) {
+r = match_prot_id_1(access_id, env->cr[i]);
+if (r) {
+return r;
+}
+}
+return 0;
+}
+
+static int match_prot_id64(CPUHPPAState *env, uint32_t access_id)
+{
+int r, i;
+
+for (i = CR_PID1; i <= CR_PID4; ++i) {
+r = match_prot_id_1(access_id, env->cr[i]);
+if (r) {
+return r;
+}
+r = match_prot_id_1(access_id, env->cr[i] >> 32);
+if (r) {
+return r;
+}
+}
+return 0;
+}
+
 int hppa_get_physical_address(CPUHPPAState *env, vaddr addr, int mmu_idx,
   int type, hwaddr *pphys, int *pprot,
   HPPATLBEntry **tlb_entry)
@@ -226,16 +269,12 @@ int hppa_get_physical_address(CPUHPPAState *env, vaddr 
addr, int mmu_idx,
 
 /* access_id == 0 means public page and no check is performed */
 if (ent->access_id && MMU_IDX_TO_P(mmu_idx)) {
-/* If bits [31:1] match, and bit 0 is set, suppress write.  */
-int match = ent->access_id * 2 + 1;
-
-if (match == env->cr[CR_PID1] || match == env->cr[CR_PID2] ||
-match == env->cr[CR_PID3] || match == env->cr[CR_PID4]) {
-prot &= PAGE_READ | PAGE_EXEC;
-if (type == PAGE_WRITE) {
-ret = EXCP_DMPI;
-goto egress;
-}
+int access_prot = (hppa_is_pa20(env)
+   ? match_prot_id64(env, ent->access_id)
+   : match_prot_id32(env, ent->access_id));
+if (prot & ~access_prot) {
+ret = type & PAGE_EXEC ? EXCP_IMP : EXCP_DMPI;
+goto egress;
 }
 }
 
-- 
2.43.2




[PATCH v2 4/6] target/hppa: exit tb on flush cache instructions

2024-03-19 Thread Sven Schnelle
When the guest modifies the tb it is currently executing from,
it executes a fic instruction. Exit the tb on such instruction,
otherwise we might execute stale code.

Signed-off-by: Sven Schnelle 
---
 target/hppa/insns.decode |  6 +++---
 target/hppa/translate.c  | 11 +++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/target/hppa/insns.decode b/target/hppa/insns.decode
index f5a3f02fd1..409e3ea9c9 100644
--- a/target/hppa/insns.decode
+++ b/target/hppa/insns.decode
@@ -143,9 +143,9 @@ getshadowregs      1101 1110 1010 1101 0010
 nop 01 - - -- 11001010 0 - # fdc, disp
 nop_addrx   01 . . -- 01001010 . -  @addrx # fdc, index
 nop_addrx   01 . . -- 01001011 . -  @addrx # fdce
-nop_addrx   01 . . --- 0001010 . -  @addrx # fic 0x0a
-nop_addrx   01 . . -- 0100 . 0  @addrx # fic 0x4f
-nop_addrx   01 . . --- 0001011 . -  @addrx # fice
+fic 01 . . --- 0001010 . -  @addrx # fic 0x0a
+fic 01 . . -- 0100 . 0  @addrx # fic 0x4f
+fic 01 . . --- 0001011 . -  @addrx # fice
 nop_addrx   01 . . -- 01001110 . 0  @addrx # pdc
 
 probe   01 b:5 ri:5 sp:2 imm:1 100011 write:1 0 t:5
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 8ba31567e8..46da546eb9 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -2293,6 +2293,17 @@ static bool trans_nop_addrx(DisasContext *ctx, arg_ldst 
*a)
 return true;
 }
 
+static bool trans_fic(DisasContext *ctx, arg_fic *a)
+{
+arg_ldst b;
+
+b.b = a->b;
+b.x = a->x;
+b.m = a->m;
+ctx->base.is_jmp = DISAS_IAQ_N_STALE;
+return trans_nop_addrx(ctx, &b);
+}
+
 static bool trans_probe(DisasContext *ctx, arg_probe *a)
 {
 TCGv_i64 dest, ofs;
-- 
2.43.2




[PATCH v2 0/6] few fixes for hppa target

2024-03-19 Thread Sven Schnelle
Hi,

here are a few fixes for the hppa target i made while debugging
some wide mode issues.

Changes in v2:

- use Richards version for access id matching
- add trans_fic()

Sven Schnelle (6):
  target/hppa: ldcw,s uses static shift of 3
  target/hppa: fix shrp for wide mode
  target/hppa: fix access_id check
  target/hppa: exit tb on flush cache instructions
  target/hppa: mask privilege bits in mfia
  target/hppa: fix do_stdby_e()

 target/hppa/insns.decode |  6 ++--
 target/hppa/mem_helper.c | 59 +---
 target/hppa/op_helper.c  | 10 +++
 target/hppa/translate.c  | 19 ++---
 4 files changed, 72 insertions(+), 22 deletions(-)

-- 
2.43.2




[PATCH v2 1/6] target/hppa: ldcw,s uses static shift of 3

2024-03-19 Thread Sven Schnelle
Fixes: 96d6407f363 ("target-hppa: Implement loads and stores")
Signed-off-by: Sven Schnelle 
Reviewed-by: Richard Henderson 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index eb2046c5ad..6a513d7d5c 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3085,7 +3085,7 @@ static bool trans_ldc(DisasContext *ctx, arg_ldst *a)
 dest = dest_gpr(ctx, a->t);
 }
 
-form_gva(ctx, &addr, &ofs, a->b, a->x, a->scale ? a->size : 0,
+form_gva(ctx, &addr, &ofs, a->b, a->x, a->scale ? 3 : 0,
  a->disp, a->sp, a->m, MMU_DISABLED(ctx));
 
 /*
-- 
2.43.2




[PATCH v2 6/6] target/hppa: fix do_stdby_e()

2024-03-19 Thread Sven Schnelle
stdby,e,m was writing data from the wrong half of the register
into memory for cases 0-3.

Fixes: 25460fc5a71 ("target/hppa: Implement STDBY")
Signed-off-by: Sven Schnelle 
Reviewed-by: Richard Henderson 
---
 target/hppa/op_helper.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/hppa/op_helper.c b/target/hppa/op_helper.c
index 480fe80844..6cf49f33b7 100644
--- a/target/hppa/op_helper.c
+++ b/target/hppa/op_helper.c
@@ -281,17 +281,17 @@ static void do_stdby_e(CPUHPPAState *env, target_ulong 
addr, uint64_t val,
 case 3:
 /* The 3 byte store must appear atomic.  */
 if (parallel) {
-atomic_store_mask32(env, addr - 3, val, 0xff00u, ra);
+atomic_store_mask32(env, addr - 3, val >> 32, 0xff00u, ra);
 } else {
-cpu_stw_data_ra(env, addr - 3, val >> 16, ra);
-cpu_stb_data_ra(env, addr - 1, val >> 8, ra);
+cpu_stw_data_ra(env, addr - 3, val >> 48, ra);
+cpu_stb_data_ra(env, addr - 1, val >> 40, ra);
 }
 break;
 case 2:
-cpu_stw_data_ra(env, addr - 2, val >> 16, ra);
+cpu_stw_data_ra(env, addr - 2, val >> 48, ra);
 break;
 case 1:
-cpu_stb_data_ra(env, addr - 1, val >> 24, ra);
+cpu_stb_data_ra(env, addr - 1, val >> 56, ra);
 break;
 default:
 /* Nothing is stored, but protection is checked and the
-- 
2.43.2




[PATCH v2 2/6] target/hppa: fix shrp for wide mode

2024-03-19 Thread Sven Schnelle
Fixes: f7b775a9c075 ("target/hppa: Implement SHRPD")
Signed-off-by: Sven Schnelle 
Reviewed-by: Richard Henderson 
Reviewed-by: Helge Deller 
---
 target/hppa/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 6a513d7d5c..8ba31567e8 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3462,7 +3462,7 @@ static bool trans_shrp_sar(DisasContext *ctx, 
arg_shrp_sar *a)
 /* Install the new nullification.  */
 cond_free(&ctx->null_cond);
 if (a->c) {
-ctx->null_cond = do_sed_cond(ctx, a->c, false, dest);
+ctx->null_cond = do_sed_cond(ctx, a->c, a->d, dest);
 }
 return nullify_end(ctx);
 }
@@ -3505,7 +3505,7 @@ static bool trans_shrp_imm(DisasContext *ctx, 
arg_shrp_imm *a)
 /* Install the new nullification.  */
 cond_free(&ctx->null_cond);
 if (a->c) {
-ctx->null_cond = do_sed_cond(ctx, a->c, false, dest);
+ctx->null_cond = do_sed_cond(ctx, a->c, a->d, dest);
 }
 return nullify_end(ctx);
 }
-- 
2.43.2




[PATCH v2 5/6] target/hppa: mask privilege bits in mfia

2024-03-19 Thread Sven Schnelle
mfia should return only the iaoq bits without privilege
bits.

Fixes: 98a9cb792c8 ("target-hppa: Implement system and memory-management insns")
Signed-off-by: Sven Schnelle 
Reviewed-by: Richard Henderson 
Reviewed-by: Helge Deller 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 46da546eb9..51bf1b06c9 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1961,7 +1961,7 @@ static bool trans_mfia(DisasContext *ctx, arg_mfia *a)
 {
 unsigned rt = a->t;
 TCGv_i64 tmp = dest_gpr(ctx, rt);
-tcg_gen_movi_i64(tmp, ctx->iaoq_f);
+tcg_gen_movi_i64(tmp, ctx->iaoq_f & ~3ULL);
 save_gpr(ctx, rt, tmp);
 
 cond_free(&ctx->null_cond);
-- 
2.43.2




Re: [PATCH 3/7] target/hppa: fix access_id check

2024-03-19 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/17/24 12:14, Sven Schnelle wrote:
>>  /* If bits [31:1] match, and bit 0 is set, suppress write.  */
>> -int match = ent->access_id * 2 + 1;
>> -
>> -if (match == env->cr[CR_PID1] || match == env->cr[CR_PID2] ||
>> -match == env->cr[CR_PID3] || match == env->cr[CR_PID4]) {
>> -prot &= PAGE_READ | PAGE_EXEC;
>> -if (type == PAGE_WRITE) {
>> -ret = EXCP_DMPI;
>> -goto egress;
>> +uint32_t pid;
>> +if (match_prot_id(env, ent->access_id, &pid)) {
>> +if ((pid & 1) && (prot & PROT_WRITE)) {
>> +prot &= ~PROT_WRITE;
>>  }
>> +} else {
>> +prot = 0;
>>  }
>
> You're losing the data memory protection id trap.

Oops, indeed.

> Therefore I suggest
> [..]
> At this point there are now a couple of hppa_is_pa20() calls within
> hppa_get_physical_address, which could be unified to a single local
> bool.

Thanks, i'll take your version and update the patch.



Re: [PATCH 5/7] target/hppa: copy new_spc to iasq_f on be,n instruction

2024-03-19 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/17/24 12:14, Sven Schnelle wrote:
>> Otherwise the first instruction at the new location gets executed from
>> the old space.
>> Signed-off-by: Sven Schnelle 
>> ---
>>   target/hppa/translate.c | 3 +++
>>   1 file changed, 3 insertions(+)
>> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
>> index 58d7ec1ade..a09112e4ae 100644
>> --- a/target/hppa/translate.c
>> +++ b/target/hppa/translate.c
>> @@ -3777,6 +3777,9 @@ static bool trans_be(DisasContext *ctx, arg_be *a)
>>   }
>>   copy_iaoq_entry(ctx, cpu_iaoq_b, -1, tmp);
>>   tcg_gen_mov_i64(cpu_iasq_b, new_spc);
>> +if (a->n) {
>> +tcg_gen_mov_i64(cpu_iasq_f, new_spc);
>> +}
>>   nullify_set(ctx, a->n);
>>   }
>>   tcg_gen_lookup_and_goto_ptr();
>
>
> Without use_nullify_skip(), we're going to execute the next
> instruction even if we know it is nullified (a->n).  This is usually
> because there's a page crossing or breakpoint, and we need to take the
> exception that might be raised there.
>
> So, we advance the queue:
>
> copy_iaoq_entry(ctx, cpu_iaoq_f, ctx->iaoq_b, cpu_iaoq_b);
> if (ctx->iaoq_b == -1) {
> tcg_gen_mov_i64(cpu_iasq_f, cpu_iasq_b);
> }
>
> then put the branch destination at the back of the queue:
>
> copy_iaoq_entry(ctx, cpu_iaoq_b, -1, tmp);
> tcg_gen_mov_i64(cpu_iasq_b, new_spc);
>
> Note that iaoq_b is always -1 on a space change.
>
> So your change does not look correct.
> What is the issue that you saw?

I was running the CPU instruction tests from HP Offline Diagnostics, and
some be instruction was ending up in the wrong space. I don't have the
details anymore. I'd say we drop the patch for now and I test it again
later and provide more details.



Re: [PATCH 0/3] 64 Bit support for hppa gdbstub

2024-03-18 Thread Sven Schnelle
Richard Henderson  writes:

> On 3/17/24 20:32, Sven Schnelle wrote:
>> Hi Richard,
>> Sven Schnelle  writes:
>> 
>>> Hi List,
>>>
>>> this patchset allows to debug the hppa target when running in wide (64 bit)
>>> mode. gdb needs a small patch to switch to 64 bit mode. I pushed the
>>> patch to 
>>> https://github.com/bminor/binutils-gdb/commit/fd8662ec282d688d1f8100b4365823e57516857b
>>> With this patch gdb will switch to the appropriate mode when connecting
>>> to qemu/gdbstub.
>>>
>>> Sven Schnelle (3):
>>>Revert "target/hppa: Drop attempted gdbstub support for hppa64"
>>>target/hppa: add 64 bit support to gdbstub
>>>target/hppa: mask CR_SAR register writes to 5/6 bit in gdbstub
>>>
>>>   target/hppa/gdbstub.c | 66 +--
>>>   1 file changed, 45 insertions(+), 21 deletions(-)
>> gentle ping - if i followed correctly only one patch was reviewed so
>> far.
>
> We can't really proceed until you get your gdb patch reviewed upstream.
> All that will happen in the meantime is that qemu make-check will fail.

I see. Thanks!



Re: [PATCH 0/3] 64 Bit support for hppa gdbstub

2024-03-17 Thread Sven Schnelle
Hi Richard,

Sven Schnelle  writes:

> Hi List,
>
> this patchset allows to debug the hppa target when running in wide (64 bit)
> mode. gdb needs a small patch to switch to 64 bit mode. I pushed the
> patch to 
> https://github.com/bminor/binutils-gdb/commit/fd8662ec282d688d1f8100b4365823e57516857b
> With this patch gdb will switch to the appropriate mode when connecting
> to qemu/gdbstub.
>
> Sven Schnelle (3):
>   Revert "target/hppa: Drop attempted gdbstub support for hppa64"
>   target/hppa: add 64 bit support to gdbstub
>   target/hppa: mask CR_SAR register writes to 5/6 bit in gdbstub
>
>  target/hppa/gdbstub.c | 66 +--
>  1 file changed, 45 insertions(+), 21 deletions(-)

gentle ping - if i followed correctly only one patch was reviewed so far.



[PATCH 6/7] target/hppa: mask privilege bits in mfia

2024-03-17 Thread Sven Schnelle
mfia should return only the iaoq bits without privilege
bits.

Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index a09112e4ae..e47f8f9f47 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1962,7 +1962,7 @@ static bool trans_mfia(DisasContext *ctx, arg_mfia *a)
 {
 unsigned rt = a->t;
 TCGv_i64 tmp = dest_gpr(ctx, rt);
-tcg_gen_movi_i64(tmp, ctx->iaoq_f);
+tcg_gen_movi_i64(tmp, ctx->iaoq_f & ~3ULL);
 save_gpr(ctx, rt, tmp);
 
 cond_free(&ctx->null_cond);
-- 
2.43.2




[PATCH 7/7] target/hppa: fix do_stdby_e()

2024-03-17 Thread Sven Schnelle
stdby,e,m was writing data from the wrong half of the register
into memory for cases 0-3.

Signed-off-by: Sven Schnelle 
---
 target/hppa/op_helper.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/hppa/op_helper.c b/target/hppa/op_helper.c
index 480fe80844..6cf49f33b7 100644
--- a/target/hppa/op_helper.c
+++ b/target/hppa/op_helper.c
@@ -281,17 +281,17 @@ static void do_stdby_e(CPUHPPAState *env, target_ulong 
addr, uint64_t val,
 case 3:
 /* The 3 byte store must appear atomic.  */
 if (parallel) {
-atomic_store_mask32(env, addr - 3, val, 0xff00u, ra);
+atomic_store_mask32(env, addr - 3, val >> 32, 0xff00u, ra);
 } else {
-cpu_stw_data_ra(env, addr - 3, val >> 16, ra);
-cpu_stb_data_ra(env, addr - 1, val >> 8, ra);
+cpu_stw_data_ra(env, addr - 3, val >> 48, ra);
+cpu_stb_data_ra(env, addr - 1, val >> 40, ra);
 }
 break;
 case 2:
-cpu_stw_data_ra(env, addr - 2, val >> 16, ra);
+cpu_stw_data_ra(env, addr - 2, val >> 48, ra);
 break;
 case 1:
-cpu_stb_data_ra(env, addr - 1, val >> 24, ra);
+cpu_stb_data_ra(env, addr - 1, val >> 56, ra);
 break;
 default:
 /* Nothing is stored, but protection is checked and the
-- 
2.43.2




[PATCH 5/7] target/hppa: copy new_spc to iasq_f on be,n instruction

2024-03-17 Thread Sven Schnelle
Otherwise the first instruction at the new location gets executed from
the old space.

Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 58d7ec1ade..a09112e4ae 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3777,6 +3777,9 @@ static bool trans_be(DisasContext *ctx, arg_be *a)
 }
 copy_iaoq_entry(ctx, cpu_iaoq_b, -1, tmp);
 tcg_gen_mov_i64(cpu_iasq_b, new_spc);
+if (a->n) {
+tcg_gen_mov_i64(cpu_iasq_f, new_spc);
+}
 nullify_set(ctx, a->n);
 }
 tcg_gen_lookup_and_goto_ptr();
-- 
2.43.2




[PATCH 3/7] target/hppa: fix access_id check

2024-03-17 Thread Sven Schnelle
PA2.0 provides 8 instead of 4 PID registers.

Signed-off-by: Sven Schnelle 
---
 roms/SLOF|  2 +-
 target/hppa/mem_helper.c | 67 +++-
 2 files changed, 60 insertions(+), 9 deletions(-)

diff --git a/roms/SLOF b/roms/SLOF
index 3a259df244..6b6c16b4b4 16
--- a/roms/SLOF
+++ b/roms/SLOF
@@ -1 +1 @@
-Subproject commit 3a259df2449fc4a4e43ab5f33f0b2c66484b4bc3
+Subproject commit 6b6c16b4b40763507cf1f518096f3c3883c5cf2d
diff --git a/target/hppa/mem_helper.c b/target/hppa/mem_helper.c
index 80f51e753f..e4e3f6cdbe 100644
--- a/target/hppa/mem_helper.c
+++ b/target/hppa/mem_helper.c
@@ -152,6 +152,59 @@ static HPPATLBEntry *hppa_alloc_tlb_ent(CPUHPPAState *env)
 return ent;
 }
 
+static uint32_t get_pid(CPUHPPAState *env, int num)
+{
+const struct pid_map {
+int reg;
+bool shift;
+} *pid;
+
+const struct pid_map pids64[] = {
+{ .reg = 8,  .shift = true  },
+{ .reg = 8,  .shift = false },
+{ .reg = 9,  .shift = true  },
+{ .reg = 9,  .shift = false },
+{ .reg = 12, .shift = true  },
+{ .reg = 12, .shift = false },
+{ .reg = 13, .shift = true  },
+{ .reg = 13, .shift = false }
+};
+
+const struct pid_map pids32[] = {
+{ .reg = 8,  .shift = false  },
+{ .reg = 9,  .shift = false  },
+{ .reg = 12, .shift = false  },
+{ .reg = 13, .shift = false  },
+};
+
+if (hppa_is_pa20(env)) {
+pid = pids64 + num;
+} else {
+pid = pids32 + num;
+}
+uint64_t cr = env->cr[pid->reg];
+if (pid->shift) {
+cr >>= 32;
+} else {
+cr &= 0x;
+}
+return cr;
+}
+
+#define ACCESS_ID_MASK 0x
+
+static bool match_prot_id(CPUHPPAState *env, uint32_t access_id, uint32_t 
*_pid)
+{
+for (int i = 0; i < 8; i++) {
+uint32_t pid = get_pid(env, i);
+if ((access_id & ACCESS_ID_MASK) == ((pid >> 1) & ACCESS_ID_MASK)) {
+*_pid = pid;
+return true;
+}
+}
+return false;
+}
+
 int hppa_get_physical_address(CPUHPPAState *env, vaddr addr, int mmu_idx,
   int type, hwaddr *pphys, int *pprot,
   HPPATLBEntry **tlb_entry)
@@ -227,15 +280,13 @@ int hppa_get_physical_address(CPUHPPAState *env, vaddr 
addr, int mmu_idx,
 /* access_id == 0 means public page and no check is performed */
 if (ent->access_id && MMU_IDX_TO_P(mmu_idx)) {
 /* If bits [31:1] match, and bit 0 is set, suppress write.  */
-int match = ent->access_id * 2 + 1;
-
-if (match == env->cr[CR_PID1] || match == env->cr[CR_PID2] ||
-match == env->cr[CR_PID3] || match == env->cr[CR_PID4]) {
-prot &= PAGE_READ | PAGE_EXEC;
-if (type == PAGE_WRITE) {
-ret = EXCP_DMPI;
-goto egress;
+uint32_t pid;
+if (match_prot_id(env, ent->access_id, &pid)) {
+if ((pid & 1) && (prot & PROT_WRITE)) {
+prot &= ~PROT_WRITE;
 }
+} else {
+prot = 0;
 }
 }
 
-- 
2.43.2




[PATCH 0/7] few fixes for hppa target

2024-03-17 Thread Sven Schnelle
Hi,

here are a few fixes for the hppa target i made while debugging
some wide mode issues.

Sven Schnelle (7):
  target/hppa: ldcw,s uses static shift of 3
  target/hppa: fix shrp for wide mode
  target/hppa: fix access_id check
  target/hppa: exit tb on flush cache instructions
  target/hppa: copy new_spc to iasq_f on be,n instruction
  target/hppa: mask privilege bits in mfia
  target/hppa: fix do_stdby_e()

 roms/SLOF|  2 +-
 target/hppa/mem_helper.c | 67 +++-
 target/hppa/op_helper.c  | 10 +++---
 target/hppa/translate.c  | 13 +---
 4 files changed, 74 insertions(+), 18 deletions(-)

-- 
2.43.2




[PATCH 4/7] target/hppa: exit tb on flush cache instructions

2024-03-17 Thread Sven Schnelle
When the guest modifies the tb it is currently executing from,
it executes a fic instruction. Exit the tb on such instruction,
otherwise we might execute stale code.

Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 8ba31567e8..58d7ec1ade 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -1940,6 +1940,7 @@ static void do_page_zero(DisasContext *ctx)
 static bool trans_nop(DisasContext *ctx, arg_nop *a)
 {
 cond_free(&ctx->null_cond);
+ctx->base.is_jmp = DISAS_IAQ_N_STALE;
 return true;
 }
 
@@ -2290,6 +2291,7 @@ static bool trans_nop_addrx(DisasContext *ctx, arg_ldst 
*a)
 save_gpr(ctx, a->b, dest);
 }
 cond_free(&ctx->null_cond);
+ctx->base.is_jmp = DISAS_IAQ_N_STALE;
 return true;
 }
 
-- 
2.43.2




[PATCH 2/7] target/hppa: fix shrp for wide mode

2024-03-17 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 6a513d7d5c..8ba31567e8 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3462,7 +3462,7 @@ static bool trans_shrp_sar(DisasContext *ctx, 
arg_shrp_sar *a)
 /* Install the new nullification.  */
 cond_free(&ctx->null_cond);
 if (a->c) {
-ctx->null_cond = do_sed_cond(ctx, a->c, false, dest);
+ctx->null_cond = do_sed_cond(ctx, a->c, a->d, dest);
 }
 return nullify_end(ctx);
 }
@@ -3505,7 +3505,7 @@ static bool trans_shrp_imm(DisasContext *ctx, 
arg_shrp_imm *a)
 /* Install the new nullification.  */
 cond_free(&ctx->null_cond);
 if (a->c) {
-ctx->null_cond = do_sed_cond(ctx, a->c, false, dest);
+ctx->null_cond = do_sed_cond(ctx, a->c, a->d, dest);
 }
 return nullify_end(ctx);
 }
-- 
2.43.2




[PATCH 1/7] target/hppa: ldcw,s uses static shift of 3

2024-03-17 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index eb2046c5ad..6a513d7d5c 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -3085,7 +3085,7 @@ static bool trans_ldc(DisasContext *ctx, arg_ldst *a)
 dest = dest_gpr(ctx, a->t);
 }
 
-form_gva(ctx, &addr, &ofs, a->b, a->x, a->scale ? a->size : 0,
+form_gva(ctx, &addr, &ofs, a->b, a->x, a->scale ? 3 : 0,
  a->disp, a->sp, a->m, MMU_DISABLED(ctx));
 
 /*
-- 
2.43.2




Re: [PATCH v3 01/12] util/log: convert debug_regions to GList

2024-03-04 Thread Sven Schnelle
Alex Bennée  writes:

>> So i wonder
>> whether we should convert all of those functions to GArray? (if
>> possible, i haven't checked)
>
> I think that would depend on access patterns and flexibility. For the
> dfilter there is no particular need for sorting and the principle
> operation is a sequence of lookups. For the other use cases keeping a
> sorted list and quick insertion might be more useful.
>
> While its tempting to go on a wider re-factoring I would caution against
> it so close to softfreeze.
>
>> What do you think?
>
> Lets stick to the dfilter case and worry about wider clean-ups later. As
> Richard points out it might be the interval tree makes more sense for
> some of these things.

I think i go with the GArray variant for now. I'd guess that -dfilter is
usually only used with one or a few arguments, so using a Interval Tree
is probably not neccessary.



Re: [PATCH v3 01/12] util/log: convert debug_regions to GList

2024-03-04 Thread Sven Schnelle
Alex Bennée  writes:

> Sven Schnelle  writes:
>
>> In preparation of making the parsing part of qemu_set_dfilter_ranges()
>> available to other users, convert it to return a GList, so the result
>> can be used with other functions taking a GList of struct Range.
>
> Why do we need to convert it to a Glist? When I originally wrote the
> dfilter code I deliberately chose GArray over GList to avoid bouncing
> across cache lines during the tight loop that is checking ranges. It's
> not like we can't pass GArray's to plugins as well?

Looking at the code again, i remember that the reason for choosing GList
was that the other range matching function all take GList's - having
some functions take GArray's, and some not would be odd. So i wonder
whether we should convert all of those functions to GArray? (if
possible, i haven't checked)

What do you think?



[PATCH v3] hw/scsi/lsi53c895a: add timer to scripts processing

2024-03-04 Thread Sven Schnelle
HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, start a timer and exit the loop. Until the
timer fires, the CPU can process instructions which might change the
memory location.

The limit of instructions is also reduced because scripts running on
the SCSI processor are usually very short. This keeps the time until
the loop is exited short.

Suggested-by: Peter Maydell 
Signed-off-by: Sven Schnelle 
Reviewed-by: Peter Maydell 
---
 hw/scsi/lsi53c895a.c | 43 +--
 hw/scsi/trace-events |  2 ++
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 949b2bbd10..3b5e76bbd9 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -188,7 +188,7 @@ static const char *names[] = {
 #define LSI_TAG_VALID (1 << 16)
 
 /* Maximum instructions to process. */
-#define LSI_MAX_INSN1
+#define LSI_MAX_INSN100
 
 typedef struct lsi_request {
 SCSIRequest *req;
@@ -205,6 +205,7 @@ enum {
 LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
 LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
 LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
+LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit */
 };
 
 enum {
@@ -224,6 +225,7 @@ struct LSIState {
 MemoryRegion ram_io;
 MemoryRegion io_io;
 AddressSpace pci_io_as;
+QEMUTimer *scripts_timer;
 
 int carry; /* ??? Should this be an a visible register somewhere?  */
 int status;
@@ -415,6 +417,7 @@ static void lsi_soft_reset(LSIState *s)
 s->sbr = 0;
 assert(QTAILQ_EMPTY(&s->queue));
 assert(!s->current);
+timer_del(s->scripts_timer);
 }
 
 static int lsi_dma_40bit(LSIState *s)
@@ -1135,6 +1138,12 @@ static void lsi_wait_reselect(LSIState *s)
 }
 }
 
+static void lsi_scripts_timer_start(LSIState *s)
+{
+trace_lsi_scripts_timer_start();
+timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500);
+}
+
 static void lsi_execute_script(LSIState *s)
 {
 PCIDevice *pci_dev = PCI_DEVICE(s);
@@ -1144,6 +1153,11 @@ static void lsi_execute_script(LSIState *s)
 int insn_processed = 0;
 static int reentrancy_level;
 
+if (s->waiting == LSI_WAIT_SCRIPTS) {
+timer_del(s->scripts_timer);
+s->waiting = LSI_NOWAIT;
+}
+
 reentrancy_level++;
 
 s->istat1 |= LSI_ISTAT1_SRUN;
@@ -1151,8 +1165,8 @@ again:
 /*
  * Some windows drivers make the device spin waiting for a memory location
  * to change. If we have executed more than LSI_MAX_INSN instructions then
- * assume this is the case and force an unexpected device disconnect. This
- * is apparently sufficient to beat the drivers into submission.
+ * assume this is the case and start a timer. Until the timer fires, the
+ * guest CPU has a chance to run and change the memory location.
  *
  * Another issue (CVE-2023-0330) can occur if the script is programmed to
  * trigger itself again and again. Avoid this problem by stopping after
@@ -1160,13 +1174,8 @@ again:
  * which should be enough for all valid use cases).
  */
 if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
-if (!(s->sien0 & LSI_SIST0_UDC)) {
-qemu_log_mask(LOG_GUEST_ERROR,
-  "lsi_scsi: inf. loop with UDC masked");
-}
-lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-lsi_disconnect(s);
-trace_lsi_execute_script_stop();
+s->waiting = LSI_WAIT_SCRIPTS;
+lsi_scripts_timer_start(s);
 reentrancy_level--;
 return;
 }
@@ -2205,6 +2214,9 @@ static int lsi_post_load(void *opaque, int version_id)
 return -EINVAL;
 }
 
+if (s->waiting == LSI_WAIT_SCRIPTS) {
+lsi_scripts_timer_start(s);
+}
 return 0;
 }
 
@@ -2302,6 +2314,15 @@ static const struct SCSIBusInfo lsi_scsi_info = {
 .cancel = lsi_request_cancelled
 };
 
+static void scripts_timer_cb(void *opaque)
+{
+LSIState *s = opaque;
+
+trace_lsi_scripts_timer_triggered();
+s->waiting = LSI_NOWAIT;
+lsi_execute_script(s);
+}
+
 static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
 {
 LSIState *s = LSI53C895A(dev);
@@ -2321,6 +2342,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
   "lsi-ram", 0x2000);
 memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
   "lsi-io", 256);
+s->scripts_timer = timer_new_us(QEMU_CLOCK_VIRT

Re: [PATCH v2] hw/scsi/lsi53c895a: add timer to scripts processing

2024-03-04 Thread Sven Schnelle
Peter Maydell  writes:

> On Thu, 29 Feb 2024 at 20:44, Sven Schnelle  wrote:
>>
>> HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
>> under certain circumstances. As the SCSI controller and CPU are not
>> running at the same time this loop will never finish. After some
>> time, the check loop interrupts with a unexpected device disconnect.
>> This works, but is slow because the kernel resets the scsi controller.
>> Instead of signaling UDC, start a timer and exit the loop. Until the
>> timer fires, the CPU can process instructions which might changes the
>> memory location.
>>
>> The limit of instructions is also reduced because scripts running on
>> the SCSI processor are usually very short. This keeps the time until
>> the loop is exit short.
>
> "exited"
>
>>
>> Suggested-by: Peter Maydell 
>> Signed-off-by: Sven Schnelle 
>> ---
>> Changes in v2:
>> - update comment in lsi_execute_script()
>> - reset waiting state and del timer in lsi_execute_script() to
>>   handle the case where script processing is triggered via
>>   register write, and not from the pending timer
>> - delete timer in lsi_scsi_exit()
>
> Other than the s/host/guest/ comment fix,
> Reviewed-by: Peter Maydell 
>
> I don't suppose anybody has a setup with the Windows drivers
> to test this on? (commit ee4d919f30f13 suggests that at least
> Windows XP and 2003 had this problem.)

I have a Windows XP VM with lsi53c895a. I just fired it up and added
a qemu_log() in the timer path, and seen it trigger once while copying
a few files. It looks like Windows XP (or better the SCSI driver) also
works with this patch.



Re: [PATCH v3 01/12] util/log: convert debug_regions to GList

2024-03-04 Thread Sven Schnelle
Alex Bennée  writes:

> Sven Schnelle  writes:
>
>> In preparation of making the parsing part of qemu_set_dfilter_ranges()
>> available to other users, convert it to return a GList, so the result
>> can be used with other functions taking a GList of struct Range.
>
> Why do we need to convert it to a Glist? When I originally wrote the
> dfilter code I deliberately chose GArray over GList to avoid bouncing
> across cache lines during the tight loop that is checking ranges. It's
> not like we can't pass GArray's to plugins as well?

Good point. I'll change it back to a GArray in the next iteration.



Re: lsi53c895a assert with AmigaOS

2024-03-03 Thread Sven Schnelle
BALATON Zoltan  writes:

> Hello,
>
> AmigaOS4 also has a driver for this card so I've tried to test it but
> it trips an assert. Does anybody have an idea why and how it could be
> fixed? Sven's recent patches don't seem to have an effect on this, it
> still happens shortly after it tries to access the SCSI device with
> those patches applied. (Unfortunately AmigaOS is not freely available
> so it's a bit hard to reproduce but I can do tests if needed.) I got
> the following traces:
>
> lsi_reg_write Write reg SIEN0 0x40 = 0x84
> lsi_reg_write Write reg SIEN1 0x41 = 0x04
> lsi_reg_write Write reg DIEN 0x39 = 0xff
> lsi_reg_write Write reg DSP0 0x2c = 0x00
> lsi_reg_write Write reg DSP1 0x2d = 0x80
> lsi_reg_write Write reg DSP2 0x2e = 0x19
> lsi_reg_write Write reg DSP3 0x2f = 0x00
> lsi_execute_script SCRIPTS dsp=0x198000 opcode 0x7c07fe00 arg 0x0
> lsi_execute_script_io_opcode Read-Modify-Write reg 0x7 AND data8=0xfe 
> sfbr=0x01
> lsi_reg_read Read reg GPREG 0x7 = 0x7f
> lsi_reg_write Write reg GPREG 0x7 = 0x7e
> lsi_execute_script SCRIPTS dsp=0x198008 opcode 0x6200 arg 0x0
> lsi_execute_script_io_clear Clear TM
> lsi_execute_script SCRIPTS dsp=0x198010 opcode 0x4000 arg 0x198208
> lsi_execute_script_io_alreadyreselected Already reselected, jumping to
> alternative address
here ---^
> lsi_do_msgout_select Select LUN 0
> lsi_execute_script SCRIPTS dsp=0x198070 opcode 0x820b arg 0x1981f8
> lsi_execute_script_tc_compp Compare phase CMD == CMD
> lsi_execute_script_tc_jump Jump to 0x1981f8
> lsi_execute_script SCRIPTS dsp=0x1981f8 opcode 0xa06 arg 0x199000
> lsi_do_command Send command len=6
> qemu-system-ppc: ../hw/scsi/lsi53c895a.c:863: lsi_do_command: Assertion 
> `s->current == NULL' failed.
>
> Any idea what could it be and what could be done about it?

Wild guess is that this is because of the 'Already reselected' line
above. lsi_reselect() sets s->current, and later when lsi_do_command()
is called it gets confused because s->current is already set. But i
would need the whole logfile to see why this is going wrong, or even
better AmigaOS (which is not free as you already mentioned)

Sven



Re: lsi53c895a assert with AmigaOS

2024-03-02 Thread Sven Schnelle
BALATON Zoltan  writes:

> AmigaOS4 also has a driver for this card so I've tried to test it but
> it trips an assert. Does anybody have an idea why and how it could be
> fixed? Sven's recent patches don't seem to have an effect on this, it
> still happens shortly after it tries to access the SCSI device with
> those patches applied. (Unfortunately AmigaOS is not freely available
> so it's a bit hard to reproduce but I can do tests if needed.) I got
> the following traces:
> [..]
> lsi_do_command Send command len=6
> qemu-system-ppc: ../hw/scsi/lsi53c895a.c:863: lsi_do_command: Assertion 
> `s->current == NULL' failed.
>
> Any idea what could it be and what could be done about it?

I think the Host is resetting the SCSI controller while it still has
some request pending. I made a hack to work around that bug, but so
far i haven't spent the time to verify whether it's correct or whether
there are additional changes required. Here it is:

>From 6a807653679fde5e3e09a7f27576c673f335fef6 Mon Sep 17 00:00:00 2001
From: Sven Schnelle 
Date: Sat, 3 Feb 2024 19:46:07 +0100
Subject: [PATCH] lsi53c895a: free pending requests on reset

Signed-off-by: Sven Schnelle 
---
 hw/scsi/lsi53c895a.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index d607a5f9fb..c6bd801a7e 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -346,6 +346,8 @@ static lsi_request *get_pending_req(LSIState *s)
 
 static void lsi_soft_reset(LSIState *s)
 {
+lsi_request *p, *p_next;
+
 trace_lsi_reset();
 s->carry = 0;
 
@@ -413,8 +415,14 @@ static void lsi_soft_reset(LSIState *s)
 s->sbc = 0;
 s->csbc = 0;
 s->sbr = 0;
-assert(QTAILQ_EMPTY(&s->queue));
-assert(!s->current);
+
+QTAILQ_FOREACH_SAFE(p, &s->queue, next, p_next) {
+scsi_req_cancel(p->req);
+}
+
+if (s->current)
+scsi_req_cancel(s->current->req);
+s->current = NULL;
 }
 
 static int lsi_dma_40bit(LSIState *s)
@@ -860,7 +868,9 @@ static void lsi_do_command(LSIState *s)
 return;
 }
 
-assert(s->current == NULL);
+if (s->current)
+scsi_req_cancel(s->current->req);
+
 s->current = g_new0(lsi_request, 1);
 s->current->tag = s->select_tag;
 s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun, buf,
-- 
2.43.2




[PATCH] hw/scsi/lsi53c895a: stop script on phase mismatch

2024-03-02 Thread Sven Schnelle
Netbsd isn't happy with qemu lsi53c895a emulation:

cd0(esiop0:0:2:0): command with tag id 0 reset
esiop0: autoconfiguration error: phase mismatch without command
esiop0: autoconfiguration error: unhandled scsi interrupt, sist=0x80 sstat1=0x0 
DSA=0x23a64b1 DSP=0x50

This is because lsi_bad_phase() triggers a phase mismatch, which
stops SCRIPT processing. However, after returning to
lsi_command_complete(), SCRIPT is restarted with lsi_resume_script().
Fix this by adding a return value to lsi_bad_phase(), and only resume
script processing when lsi_bad_phase() didn't trigger a host interrupt.

Signed-off-by: Sven Schnelle 
---
 hw/scsi/lsi53c895a.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 4ff9470381..59b88aff3f 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -573,8 +573,9 @@ static inline void lsi_set_phase(LSIState *s, int phase)
 s->sstat1 = (s->sstat1 & ~PHASE_MASK) | phase;
 }
 
-static void lsi_bad_phase(LSIState *s, int out, int new_phase)
+static int lsi_bad_phase(LSIState *s, int out, int new_phase)
 {
+int ret = 0;
 /* Trigger a phase mismatch.  */
 if (s->ccntl0 & LSI_CCNTL0_ENPMJ) {
 if ((s->ccntl0 & LSI_CCNTL0_PMJCTL)) {
@@ -587,8 +588,10 @@ static void lsi_bad_phase(LSIState *s, int out, int 
new_phase)
 trace_lsi_bad_phase_interrupt();
 lsi_script_scsi_interrupt(s, LSI_SIST0_MA, 0);
 lsi_stop_script(s);
+ret = 1;
 }
 lsi_set_phase(s, new_phase);
+return ret;
 }
 
 
@@ -792,7 +795,7 @@ static int lsi_queue_req(LSIState *s, SCSIRequest *req, 
uint32_t len)
 static void lsi_command_complete(SCSIRequest *req, size_t resid)
 {
 LSIState *s = LSI53C895A(req->bus->qbus.parent);
-int out;
+int out, stop = 0;
 
 out = (s->sstat1 & PHASE_MASK) == PHASE_DO;
 trace_lsi_command_complete(req->status);
@@ -800,7 +803,10 @@ static void lsi_command_complete(SCSIRequest *req, size_t 
resid)
 s->command_complete = 2;
 if (s->waiting && s->dbc != 0) {
 /* Raise phase mismatch for short transfers.  */
-lsi_bad_phase(s, out, PHASE_ST);
+stop = lsi_bad_phase(s, out, PHASE_ST);
+if (stop) {
+s->waiting = 0;
+}
 } else {
 lsi_set_phase(s, PHASE_ST);
 }
@@ -810,7 +816,9 @@ static void lsi_command_complete(SCSIRequest *req, size_t 
resid)
 lsi_request_free(s, s->current);
 scsi_req_unref(req);
 }
-lsi_resume_script(s);
+if (!stop) {
+lsi_resume_script(s);
+}
 }
 
  /* Callback to indicate that the SCSI layer has completed a transfer.  */
-- 
2.43.2




[PATCH] target/hppa: add assemble_16()

2024-03-02 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/insns.decode | 99 
 target/hppa/translate.c  | 22 +
 2 files changed, 72 insertions(+), 49 deletions(-)

diff --git a/target/hppa/insns.decode b/target/hppa/insns.decode
index f5a3f02fd1..8f17e18cd0 100644
--- a/target/hppa/insns.decode
+++ b/target/hppa/insns.decode
@@ -62,7 +62,7 @@
 
 
 # All insns that need to form a virtual address should use this set.
-&ldst   t b x disp sp m scale size
+&ldst   t b x disp sp m scale size w16
 
 &rr_cf_dt r cf d
 &rrrt r1 r2
@@ -138,7 +138,7 @@ getshadowregs      1101 1110 1010 1101 0010
 
 
 @addrx  .. b:5 x:5 ..  m:1 .\
-&ldst disp=0 scale=0 t=0 sp=0 size=0
+&ldst disp=0 scale=0 t=0 sp=0 size=0 w16=0
 
 nop 01 - - -- 11001010 0 - # fdc, disp
 nop_addrx   01 . . -- 01001010 . -  @addrx # fdc, index
@@ -163,24 +163,24 @@ ixtlbt  01 r2:5 r1:5 000 data:1 10 0 
0# idtlbt
 
 # pdtlb, pitlb
 pxtlb   01 b:5 x:5 sp:2 01001000 m:1 - \
-&ldst disp=0 scale=0 size=0 t=0
+&ldst disp=0 scale=0 size=0 t=0 w16=0
 pxtlb   01 b:5 x:5 ...   0001000 m:1 - \
-&ldst disp=0 scale=0 size=0 t=0 sp=%assemble_sr3x
+&ldst disp=0 scale=0 size=0 t=0 sp=%assemble_sr3x w16=0
 
 # ... pa20 local
 pxtlb_l 01 b:5 x:5 sp:2 01011000 m:1 - \
-&ldst disp=0 scale=0 size=0 t=0
+&ldst disp=0 scale=0 size=0 t=0 w16=0
 pxtlb_l 01 b:5 x:5 ...   0011000 m:1 - \
-&ldst disp=0 scale=0 size=0 t=0 sp=%assemble_sr3x
+&ldst disp=0 scale=0 size=0 t=0 sp=%assemble_sr3x w16=0
 
 # pdtlbe, pitlbe
 pxtlbe  01 b:5 x:5 sp:2 01001001 m:1 - \
-&ldst disp=0 scale=0 size=0 t=0
+&ldst disp=0 scale=0 size=0 t=0 w16=0
 pxtlbe  01 b:5 x:5 ...   0001001 m:1 - \
-&ldst disp=0 scale=0 size=0 t=0 sp=%assemble_sr3x
+&ldst disp=0 scale=0 size=0 t=0 sp=%assemble_sr3x w16=0
 
 lpa 01 b:5 x:5 sp:2 01001101 m:1 t:5\
-&ldst disp=0 scale=0 size=0
+&ldst disp=0 scale=0 size=0 w16=0
 
 lci 01 - - -- 01001100 0 t:5
 
@@ -221,7 +221,7 @@ sub_b_tsv   10 . .  110100 . .  
@rrr_cf_d
 
 ldil001000 t:5 .i=%assemble_21
 addil   001010 r:5 .i=%assemble_21
-ldo 001101 b:5 t:5 -- ..i=%lowsign_14
+ldo 001101 b:5 t:5 s:2 ..i=%lowsign_14 w16=1
 
 addi101101 . .  0 ...   @rri_cf
 addi_tsv101101 . .  1 ...   @rri_cf
@@ -264,19 +264,19 @@ permh   10 r1:5  r2:5  0 c0:2 0 c1:2 c2:2 
c3:2 0 t:5
 @stim5  .. b:5 t:5 sp:2 . . \
 &ldst disp=%im5_0 x=0 scale=0 m=%ma_to_m
 
-ld  11 . . .. . 1 -- 00 size:2 ..   @ldim5
-ld  11 . . .. . 0 -- 00 size:2 ..   @ldstx
-st  11 . . .. . 1 -- 10 size:2 ..   @stim5
-ldc 11 . . .. . 1 -- 0111  ..   @ldim5 size=2
-ldc 11 . . .. . 0 -- 0111  ..   @ldstx size=2
-ldc 11 . . .. . 1 -- 0101  ..   @ldim5 size=3
-ldc 11 . . .. . 0 -- 0101  ..   @ldstx size=3
-lda 11 . . .. . 1 -- 0110  ..   @ldim5 size=2
-lda 11 . . .. . 0 -- 0110  ..   @ldstx size=2
-lda 11 . . .. . 1 -- 0100  ..   @ldim5 size=3
-lda 11 . . .. . 0 -- 0100  ..   @ldstx size=3
-sta 11 . . .. . 1 -- 1110  ..   @stim5 size=2
-sta 11 . . .. . 1 --   ..   @stim5 size=3
+ld  11 . . .. . 1 -- 00 size:2 ..   @ldim5 w16=0
+ld  11 . . .. . 0 -- 00 size:2 ..   @ldstx w16=0
+st  11 . . .. . 1 -- 10 size:2 ..   @stim5 w16=0
+ldc 11 . . .. . 1 -- 0111  ..   @ldim5 size=2 
w16=0
+ldc 11 . . .. . 0 -- 0111  ..   @ldstx size=2 
w16=0
+ldc 11 . . .. . 1 -- 0101  ..   @ldim5 size=3 
w16=0
+ldc 11 . . .. . 0 -- 0101  ..   @ldstx size=3 
w16=0
+lda 11 . . .. . 1 -- 0110  ..   @ldim5 size=2 
w16=0
+lda 11 . . .. . 0 -- 0110  

[PATCH RFC] hppa: assemble_16() in wide mode

2024-03-02 Thread Sven Schnelle
Hi Richard,

while looking into a HPPA tcg issue i noticed that the current
tcg code doesn't do the special wide mode handling described in the
Parisc 2.0 specification, Chapter E -> assemble_16(). In wide mode,
assemble_16() adds two more bits to the immediate value/displacement
of certain instruction like ldo(ldi), st[bhwd] and ld[bhwd] and some
others.

I wonder what the easiest way to implement this is - it has to be xor'd
and is dependend on the W bit, so i don't think it will be possible to
implement this with changing only insn.decode. I came up with the
attached patch, do you think there's a better way?

Thanks!
Sven




[PATCH v3 03/12] util/range: move range_list_from_string() to range.c

2024-03-01 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 include/qemu/range.h |  7 
 util/log.c   | 74 --
 util/range.c | 76 
 3 files changed, 83 insertions(+), 74 deletions(-)

diff --git a/include/qemu/range.h b/include/qemu/range.h
index 205e1da76d..530b0c7db1 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -233,4 +233,11 @@ void range_inverse_array(GList *in_ranges,
  GList **out_ranges,
  uint64_t low, uint64_t high);
 
+/*
+ * Parse a comma separated list of address ranges into a @GArray
+ * of @Range entries. On error @errp is set.
+ */
+void range_list_from_string(GList **out_ranges, const char *filter_spec,
+Error **errp);
+
 #endif
diff --git a/util/log.c b/util/log.c
index f183ea4e4d..90897ef0f9 100644
--- a/util/log.c
+++ b/util/log.c
@@ -382,80 +382,6 @@ bool qemu_log_in_addr_range(uint64_t addr)
 return false;
 }
 
-static void range_list_from_string(GList **out_ranges, const char *filter_spec,
-   Error **errp)
-{
-gchar **ranges = g_strsplit(filter_spec, ",", 0);
-struct Range *range = NULL;
-int i;
-
-if (*out_ranges) {
-g_list_free_full(*out_ranges, g_free);
-*out_ranges = NULL;
-}
-
-for (i = 0; ranges[i]; i++) {
-const char *r = ranges[i];
-const char *range_op, *r2, *e;
-uint64_t r1val, r2val, lob, upb;
-range = g_new0(struct Range, 1);
-
-range_op = strstr(r, "-");
-r2 = range_op ? range_op + 1 : NULL;
-if (!range_op) {
-range_op = strstr(r, "+");
-r2 = range_op ? range_op + 1 : NULL;
-}
-if (!range_op) {
-range_op = strstr(r, "..");
-r2 = range_op ? range_op + 2 : NULL;
-}
-if (!range_op) {
-error_setg(errp, "Bad range specifier");
-goto out;
-}
-
-if (qemu_strtou64(r, &e, 0, &r1val)
-|| e != range_op) {
-error_setg(errp, "Invalid number to the left of %.*s",
-   (int)(r2 - range_op), range_op);
-goto out;
-}
-if (qemu_strtou64(r2, NULL, 0, &r2val)) {
-error_setg(errp, "Invalid number to the right of %.*s",
-   (int)(r2 - range_op), range_op);
-goto out;
-}
-
-switch (*range_op) {
-case '+':
-lob = r1val;
-upb = r1val + r2val - 1;
-break;
-case '-':
-upb = r1val;
-lob = r1val - (r2val - 1);
-break;
-case '.':
-lob = r1val;
-upb = r2val;
-break;
-default:
-g_assert_not_reached();
-}
-if (lob > upb) {
-error_setg(errp, "Invalid range");
-goto out;
-}
-range_set_bounds(range, lob, upb);
-   *out_ranges = g_list_append(*out_ranges, range);
-range = NULL;
-}
-out:
-g_free(range);
-g_strfreev(ranges);
-}
-
 void qemu_set_dfilter_ranges(const char *filter_spec, Error **errp)
 {
 range_list_from_string(&debug_regions, filter_spec, errp);
diff --git a/util/range.c b/util/range.c
index f3f40098d5..bd2d0961bd 100644
--- a/util/range.c
+++ b/util/range.c
@@ -19,6 +19,8 @@
 
 #include "qemu/osdep.h"
 #include "qemu/range.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
 
 int range_compare(Range *a, Range *b)
 {
@@ -121,3 +123,77 @@ void range_inverse_array(GList *in, GList **rev,
 exit:
 *rev = out;
 }
+
+void range_list_from_string(GList **out_ranges, const char *filter_spec,
+Error **errp)
+{
+gchar **ranges = g_strsplit(filter_spec, ",", 0);
+struct Range *range = NULL;
+int i;
+
+if (*out_ranges) {
+g_list_free_full(*out_ranges, g_free);
+*out_ranges = NULL;
+}
+
+for (i = 0; ranges[i]; i++) {
+const char *r = ranges[i];
+const char *range_op, *r2, *e;
+uint64_t r1val, r2val, lob, upb;
+range = g_new0(struct Range, 1);
+
+range_op = strstr(r, "-");
+r2 = range_op ? range_op + 1 : NULL;
+if (!range_op) {
+range_op = strstr(r, "+");
+r2 = range_op ? range_op + 1 : NULL;
+}
+if (!range_op) {
+range_op = strstr(r, "..");
+r2 = range_op ? range_op + 2 : NULL;
+}
+if (!range_op) {
+error_setg(errp, "Bad range specifier");
+goto out;
+}
+
+if (qemu_strtou64(r, &e, 0, &r1val)
+|| e != range_op) {
+error_setg(errp, "

[PATCH v3 11/12] plugins/execlog: use range list api

2024-03-01 Thread Sven Schnelle
Instead of doing its own implementation, use the new range
list API to parse and match address lists.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index a1dfd59ab7..c518797893 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -35,7 +35,7 @@ static GArray *cpus;
 static GRWLock expand_array_lock;
 
 static GPtrArray *imatches;
-static GArray *amatches;
+static GList *amatches;
 static GPtrArray *rmatches;
 static bool disas_assist;
 static GMutex add_reg_name_lock;
@@ -215,12 +215,8 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 }
 
 if (skip && amatches) {
-int j;
-for (j = 0; j < amatches->len && skip; j++) {
-uint64_t v = g_array_index(amatches, uint64_t, j);
-if (v == insn_vaddr) {
-skip = false;
-}
+if (qemu_plugin_range_list_contains(amatches, insn_vaddr)) {
+skip = false;
 }
 }
 
@@ -398,16 +394,6 @@ static void parse_insn_match(char *match)
 g_ptr_array_add(imatches, g_strdup(match));
 }
 
-static void parse_vaddr_match(char *match)
-{
-uint64_t v = g_ascii_strtoull(match, NULL, 16);
-
-if (!amatches) {
-amatches = g_array_new(false, true, sizeof(uint64_t));
-}
-g_array_append_val(amatches, v);
-}
-
 /*
  * We have to wait until vCPUs are started before we can check the
  * patterns find anything.
@@ -440,7 +426,12 @@ QEMU_PLUGIN_EXPORT int 
qemu_plugin_install(qemu_plugin_id_t id,
 if (g_strcmp0(tokens[0], "ifilter") == 0) {
 parse_insn_match(tokens[1]);
 } else if (g_strcmp0(tokens[0], "afilter") == 0) {
-parse_vaddr_match(tokens[1]);
+Error *err = NULL;
+qemu_plugin_range_list_from_string(&amatches, tokens[1], &err);
+if (err) {
+qemu_plugin_error_print(err);
+return -1;
+}
 } else if (g_strcmp0(tokens[0], "reg") == 0) {
 add_regpat(tokens[1]);
 } else if (g_strcmp0(tokens[0], "rdisas") == 0) {
-- 
2.43.2




[PATCH v3 02/12] util/log: make qemu_set_dfilter_ranges() take a GList

2024-03-01 Thread Sven Schnelle
In preparation of making qemu_set_dfilter_ranges() available
to other users, move the code to a new function range_list_from_string()
which takes an additional GList argument.

Signed-off-by: Sven Schnelle 
---
 util/log.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/util/log.c b/util/log.c
index 779c0689a9..f183ea4e4d 100644
--- a/util/log.c
+++ b/util/log.c
@@ -382,15 +382,16 @@ bool qemu_log_in_addr_range(uint64_t addr)
 return false;
 }
 
-void qemu_set_dfilter_ranges(const char *filter_spec, Error **errp)
+static void range_list_from_string(GList **out_ranges, const char *filter_spec,
+   Error **errp)
 {
 gchar **ranges = g_strsplit(filter_spec, ",", 0);
 struct Range *range = NULL;
 int i;
 
-if (debug_regions) {
-g_list_free_full(debug_regions, g_free);
-debug_regions = NULL;
+if (*out_ranges) {
+g_list_free_full(*out_ranges, g_free);
+*out_ranges = NULL;
 }
 
 for (i = 0; ranges[i]; i++) {
@@ -447,7 +448,7 @@ void qemu_set_dfilter_ranges(const char *filter_spec, Error 
**errp)
 goto out;
 }
 range_set_bounds(range, lob, upb);
-debug_regions = g_list_append(debug_regions, range);
+   *out_ranges = g_list_append(*out_ranges, range);
 range = NULL;
 }
 out:
@@ -455,6 +456,11 @@ out:
 g_strfreev(ranges);
 }
 
+void qemu_set_dfilter_ranges(const char *filter_spec, Error **errp)
+{
+range_list_from_string(&debug_regions, filter_spec, errp);
+}
+
 const QEMULogItem qemu_log_items[] = {
 { CPU_LOG_TB_OUT_ASM, "out_asm",
   "show generated host assembly code for each compiled TB" },
-- 
2.43.2




[PATCH v3 06/12] util/range: split up range_list_from_string()

2024-03-01 Thread Sven Schnelle
Makes the code a bit easier to read and maintain.

Signed-off-by: Sven Schnelle 
---
 util/range.c | 119 ++-
 1 file changed, 70 insertions(+), 49 deletions(-)

diff --git a/util/range.c b/util/range.c
index db535de9a7..8c463995e7 100644
--- a/util/range.c
+++ b/util/range.c
@@ -124,6 +124,74 @@ exit:
 *rev = out;
 }
 
+static const char *split_single_range(const char *r, const char **r2)
+{
+char *range_op;
+
+range_op = strstr(r, "-");
+if (range_op) {
+*r2 = range_op + 1;
+return range_op;
+}
+range_op = strstr(r, "+");
+if (range_op) {
+*r2 = range_op + 1;
+return range_op;
+}
+range_op = strstr(r, "..");
+if (range_op) {
+*r2 = range_op + 2;
+return range_op;
+}
+return NULL;
+}
+
+static int parse_single_range(const char *r, Error **errp,
+  uint64_t *lob, uint64_t *upb)
+{
+const char *range_op, *r2, *e;
+uint64_t r1val, r2val;
+
+range_op = split_single_range(r, &r2);
+if (!range_op) {
+error_setg(errp, "Bad range specifier");
+return 1;
+}
+if (qemu_strtou64(r, &e, 0, &r1val)
+|| e != range_op) {
+error_setg(errp, "Invalid number to the left of %.*s",
+   (int)(r2 - range_op), range_op);
+return 1;
+}
+if (qemu_strtou64(r2, NULL, 0, &r2val)) {
+error_setg(errp, "Invalid number to the right of %.*s",
+   (int)(r2 - range_op), range_op);
+return 1;
+}
+
+switch (*range_op) {
+case '+':
+*lob = r1val;
+*upb = r1val + r2val - 1;
+break;
+case '-':
+*upb = r1val;
+*lob = r1val - (r2val - 1);
+break;
+case '.':
+*lob = r1val;
+*upb = r2val;
+break;
+default:
+g_assert_not_reached();
+}
+if (*lob > *upb) {
+error_setg(errp, "Invalid range");
+return 1;
+}
+return 0;
+}
+
 void range_list_from_string(GList **out_ranges, const char *filter_spec,
 Error **errp)
 {
@@ -136,60 +204,13 @@ void range_list_from_string(GList **out_ranges, const 
char *filter_spec,
 }
 
 for (i = 0; ranges[i]; i++) {
-const char *r = ranges[i];
-const char *range_op, *r2, *e;
-uint64_t r1val, r2val, lob, upb;
-
-range_op = strstr(r, "-");
-r2 = range_op ? range_op + 1 : NULL;
-if (!range_op) {
-range_op = strstr(r, "+");
-r2 = range_op ? range_op + 1 : NULL;
-}
-if (!range_op) {
-range_op = strstr(r, "..");
-r2 = range_op ? range_op + 2 : NULL;
-}
-if (!range_op) {
-error_setg(errp, "Bad range specifier");
-goto out;
-}
-
-if (qemu_strtou64(r, &e, 0, &r1val)
-|| e != range_op) {
-error_setg(errp, "Invalid number to the left of %.*s",
-   (int)(r2 - range_op), range_op);
-goto out;
-}
-if (qemu_strtou64(r2, NULL, 0, &r2val)) {
-error_setg(errp, "Invalid number to the right of %.*s",
-   (int)(r2 - range_op), range_op);
-goto out;
-}
+uint64_t lob, upb;
 
-switch (*range_op) {
-case '+':
-lob = r1val;
-upb = r1val + r2val - 1;
+if (parse_single_range(ranges[i], errp, &lob, &upb)) {
 break;
-case '-':
-upb = r1val;
-lob = r1val - (r2val - 1);
-break;
-case '.':
-lob = r1val;
-upb = r2val;
-break;
-default:
-g_assert_not_reached();
-}
-if (lob > upb) {
-error_setg(errp, "Invalid range");
-goto out;
 }
 *out_ranges = append_new_range(*out_ranges, lob, upb);
 }
-out:
 g_strfreev(ranges);
 }
 
-- 
2.43.2




[PATCH v3 10/12] plugins: add range list API

2024-03-01 Thread Sven Schnelle
Export range_list_from_string(), range_contains() and range_list_free()
to allow plugins to parse filter ranges and match them to avoid
reimplementing this functionality.

Signed-off-by: Sven Schnelle 
---
 include/qemu/qemu-plugin.h   | 41 
 plugins/api.c| 18 
 plugins/qemu-plugins.symbols |  3 +++
 3 files changed, 62 insertions(+)

diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 5839feea4d..4910a63d70 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -765,4 +765,45 @@ typedef struct Error Error;
 QEMU_PLUGIN_API
 void qemu_plugin_error_print(Error *err);
 
+typedef GList qemu_plugin_range_list;
+
+/**
+ * qemu_plugin_ranges_from_string() - parse a filter range string
+ *
+ * @out_ranges: a pointer to a @qemu_plugin_range_list pointer
+ * @filter_spec: input string
+ * @errp: @Error string on parse failure
+ *
+ * This function parses a filter specification string and stores the
+ * parsed adress ranges found in @out. On parse failure a @Error pointer
+ * is stored in @errp. The function accepts a comma-separated list
+ * of start and end addresses or single addresses.
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_range_list_from_string(qemu_plugin_range_list **out_range,
+const char *filter_spec,
+Error **errp);
+
+/**
+ * qemu_plugin_range_list_contains() - match a value against a list
+ * of ranges
+ *
+ * @ranges: a pointer to a @qemu_plugin_range_list
+ * @val: the value to match
+ *
+ * This function matches @val against the adress range list in @ranges.
+ * On success, true is returned, otherwise false.
+ */
+QEMU_PLUGIN_API
+bool qemu_plugin_range_list_contains(qemu_plugin_range_list *ranges,
+ uint64_t val);
+
+/**
+ * qemu_plugin_range_list_free() - free a list of ranges
+ *
+ * @ranges: a pointer to the list to be freed
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_range_list_free(qemu_plugin_range_list *ranges);
+
 #endif /* QEMU_QEMU_PLUGIN_H */
diff --git a/plugins/api.c b/plugins/api.c
index 8fd3a8964a..8dbd14c648 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -472,3 +472,21 @@ void qemu_plugin_error_print(Error *err)
 error_report_err(err);
 }
 
+void qemu_plugin_range_list_from_string(qemu_plugin_range_list **out,
+const char *filter_spec,
+Error **errp)
+{
+return range_list_from_string(out, filter_spec, errp);
+}
+
+bool qemu_plugin_range_list_contains(qemu_plugin_range_list *ranges,
+ uint64_t val)
+{
+return range_list_contains(ranges, val);
+}
+
+void qemu_plugin_range_list_free(qemu_plugin_range_list *ranges)
+{
+return range_list_free(ranges);
+}
+
diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
index b142d11e58..472b29fc5f 100644
--- a/plugins/qemu-plugins.symbols
+++ b/plugins/qemu-plugins.symbols
@@ -21,6 +21,9 @@
   qemu_plugin_num_vcpus;
   qemu_plugin_outs;
   qemu_plugin_path_to_binary;
+  qemu_plugin_range_list_contains;
+  qemu_plugin_range_list_free;
+  qemu_plugin_range_list_from_string;
   qemu_plugin_read_register;
   qemu_plugin_register_atexit_cb;
   qemu_plugin_register_flush_cb;
-- 
2.43.2




[PATCH v3 09/12] plugins: add API to print errors

2024-03-01 Thread Sven Schnelle
add qemu_plugin_error_print() which is a wrapper around
error_report_err(). This will be used by
qemu_plugin_parse_filter_ranges() to report parse failures.

Signed-off-by: Sven Schnelle 
---
 include/qemu/qemu-plugin.h   | 12 
 plugins/api.c|  7 +++
 plugins/qemu-plugins.symbols |  1 +
 3 files changed, 20 insertions(+)

diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 45e2ebc8f8..5839feea4d 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -752,5 +752,17 @@ QEMU_PLUGIN_API
 int qemu_plugin_read_register(struct qemu_plugin_register *handle,
   GByteArray *buf);
 
+typedef struct Error Error;
+
+/**
+ * qemu_plugin_error_print() - print and free error
+ *
+ * @err: a @Error handle
+ *
+ * This function shows and and frees the supplied error.
+ */
+
+QEMU_PLUGIN_API
+void qemu_plugin_error_print(Error *err);
 
 #endif /* QEMU_QEMU_PLUGIN_H */
diff --git a/plugins/api.c b/plugins/api.c
index 81f43c9ce8..8fd3a8964a 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -45,6 +45,7 @@
 #include "exec/ram_addr.h"
 #include "disas/disas.h"
 #include "plugin.h"
+#include "qapi/error.h"
 #ifndef CONFIG_USER_ONLY
 #include "qemu/plugin-memory.h"
 #include "hw/boards.h"
@@ -465,3 +466,9 @@ int qemu_plugin_read_register(struct qemu_plugin_register 
*reg, GByteArray *buf)
 
 return gdb_read_register(current_cpu, buf, GPOINTER_TO_INT(reg));
 }
+
+void qemu_plugin_error_print(Error *err)
+{
+error_report_err(err);
+}
+
diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
index 27fe97239b..b142d11e58 100644
--- a/plugins/qemu-plugins.symbols
+++ b/plugins/qemu-plugins.symbols
@@ -2,6 +2,7 @@
   qemu_plugin_bool_parse;
   qemu_plugin_end_code;
   qemu_plugin_entry_code;
+  qemu_plugin_error_print;
   qemu_plugin_get_hwaddr;
   qemu_plugin_get_registers;
   qemu_plugin_hwaddr_device_name;
-- 
2.43.2




[PATCH v3 04/12] util/range: add range_list_free()

2024-03-01 Thread Sven Schnelle
Introduce range_list_free(), which takes a GList of ranges
and frees the list and each range.

Signed-off-by: Sven Schnelle 
---
 include/qemu/range.h | 5 +
 util/range.c | 5 +
 2 files changed, 10 insertions(+)

diff --git a/include/qemu/range.h b/include/qemu/range.h
index 530b0c7db1..4ff9799d89 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -240,4 +240,9 @@ void range_inverse_array(GList *in_ranges,
 void range_list_from_string(GList **out_ranges, const char *filter_spec,
 Error **errp);
 
+/*
+ * Free a list of ranges.
+ */
+void range_list_free(GList *ranges);
+
 #endif
diff --git a/util/range.c b/util/range.c
index bd2d0961bd..7234ab7a53 100644
--- a/util/range.c
+++ b/util/range.c
@@ -197,3 +197,8 @@ out:
 g_free(range);
 g_strfreev(ranges);
 }
+
+void range_list_free(GList *ranges)
+{
+g_list_free_full(ranges, g_free);
+}
-- 
2.43.2




[PATCH v3 08/12] qemu/range: add range_list_contains() function

2024-03-01 Thread Sven Schnelle
Add a small inline function to match an address against
a list of ranges.

Signed-off-by: Sven Schnelle 
---
 include/qemu/range.h | 12 
 util/log.c   | 10 +-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/include/qemu/range.h b/include/qemu/range.h
index 4ff9799d89..7ef9b3d5fb 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -57,6 +57,18 @@ static inline bool range_contains(const Range *range, 
uint64_t val)
 return val >= range->lob && val <= range->upb;
 }
 
+static inline bool range_list_contains(GList *ranges, uint64_t val)
+{
+GList *p;
+
+for (p = g_list_first(ranges); p; p = g_list_next(p)) {
+if (range_contains(p->data, val)) {
+return true;
+}
+}
+return false;
+}
+
 /* Initialize @range to the empty range */
 static inline void range_make_empty(Range *range)
 {
diff --git a/util/log.c b/util/log.c
index 90897ef0f9..032110d700 100644
--- a/util/log.c
+++ b/util/log.c
@@ -368,18 +368,10 @@ bool qemu_set_log_filename_flags(const char *name, int 
flags, Error **errp)
  */
 bool qemu_log_in_addr_range(uint64_t addr)
 {
-GList *p;
-
 if (!debug_regions) {
 return true;
 }
-
-for (p = g_list_first(debug_regions); p; p = g_list_next(p)) {
-if (range_contains(p->data, addr)) {
-return true;
-}
-}
-return false;
+return range_list_contains(debug_regions, addr);
 }
 
 void qemu_set_dfilter_ranges(const char *filter_spec, Error **errp)
-- 
2.43.2




[PATCH v3 12/12] plugins/execlog: add data address match

2024-03-01 Thread Sven Schnelle
Add a match similar to the afilter address match, but for data
addresses. When an address is specified with '-dfilter=0x12345'
only load/stores to/from address 0x12345 are printed. All other
instructions are hidden.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index c518797893..c44fd0e593 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -27,6 +27,8 @@ typedef struct CPU {
 GString *last_exec;
 /* Ptr array of Register */
 GPtrArray *registers;
+/* whether this instruction should be logged */
+bool log;
 } CPU;
 
 QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
@@ -36,6 +38,7 @@ static GRWLock expand_array_lock;
 
 static GPtrArray *imatches;
 static GList *amatches;
+static GList *dmatches;
 static GPtrArray *rmatches;
 static bool disas_assist;
 static GMutex add_reg_name_lock;
@@ -62,6 +65,11 @@ static void vcpu_mem(unsigned int cpu_index, 
qemu_plugin_meminfo_t info,
 
 /* Find vCPU in array */
 
+if (dmatches && !qemu_plugin_range_list_contains(dmatches, vaddr)) {
+return;
+}
+c->log = true;
+
 /* Indicate type of memory access */
 if (qemu_plugin_mem_is_store(info)) {
 g_string_append(s, ", store");
@@ -121,15 +129,17 @@ static void vcpu_insn_exec_with_regs(unsigned int 
cpu_index, void *udata)
 if (cpu->registers) {
 insn_check_regs(cpu);
 }
-
-qemu_plugin_outs(cpu->last_exec->str);
-qemu_plugin_outs("\n");
+if (cpu->log) {
+qemu_plugin_outs(cpu->last_exec->str);
+qemu_plugin_outs("\n");
+}
 }
 
 /* Store new instruction in cache */
 /* vcpu_mem will add memory access information to last_exec */
 g_string_printf(cpu->last_exec, "%u, ", cpu_index);
 g_string_append(cpu->last_exec, (char *)udata);
+cpu->log = dmatches ? false : true;
 }
 
 /* Log last instruction while checking registers, ignore next */
@@ -157,7 +167,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 CPU *cpu = get_cpu(cpu_index);
 
 /* Print previous instruction in cache */
-if (cpu->last_exec->len) {
+if (cpu->log && cpu->last_exec->len) {
 qemu_plugin_outs(cpu->last_exec->str);
 qemu_plugin_outs("\n");
 }
@@ -166,6 +176,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 /* vcpu_mem will add memory access information to last_exec */
 g_string_printf(cpu->last_exec, "%u, ", cpu_index);
 g_string_append(cpu->last_exec, (char *)udata);
+cpu->log = dmatches ? false : true;
 }
 
 /**
@@ -377,7 +388,7 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
 g_rw_lock_reader_lock(&expand_array_lock);
 for (i = 0; i < cpus->len; i++) {
 CPU *c = get_cpu(i);
-if (c->last_exec && c->last_exec->str) {
+if (c->log && c->last_exec && c->last_exec->str) {
 qemu_plugin_outs(c->last_exec->str);
 qemu_plugin_outs("\n");
 }
@@ -432,6 +443,13 @@ QEMU_PLUGIN_EXPORT int 
qemu_plugin_install(qemu_plugin_id_t id,
 qemu_plugin_error_print(err);
 return -1;
 }
+} else if (g_strcmp0(tokens[0], "dfilter") == 0) {
+Error *err = NULL;
+qemu_plugin_range_list_from_string(&dmatches, tokens[1], &err);
+if (err) {
+qemu_plugin_error_print(err);
+return -1;
+}
 } else if (g_strcmp0(tokens[0], "reg") == 0) {
 add_regpat(tokens[1]);
 } else if (g_strcmp0(tokens[0], "rdisas") == 0) {
-- 
2.43.2




[PATCH v3 05/12] util/range: use append_new_range() in range_list_from_string()

2024-03-01 Thread Sven Schnelle
Use append_new_ranges() instead of manually allocating and
filling the new range member.

Signed-off-by: Sven Schnelle 
---
 util/range.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/util/range.c b/util/range.c
index 7234ab7a53..db535de9a7 100644
--- a/util/range.c
+++ b/util/range.c
@@ -128,7 +128,6 @@ void range_list_from_string(GList **out_ranges, const char 
*filter_spec,
 Error **errp)
 {
 gchar **ranges = g_strsplit(filter_spec, ",", 0);
-struct Range *range = NULL;
 int i;
 
 if (*out_ranges) {
@@ -140,7 +139,6 @@ void range_list_from_string(GList **out_ranges, const char 
*filter_spec,
 const char *r = ranges[i];
 const char *range_op, *r2, *e;
 uint64_t r1val, r2val, lob, upb;
-range = g_new0(struct Range, 1);
 
 range_op = strstr(r, "-");
 r2 = range_op ? range_op + 1 : NULL;
@@ -189,12 +187,9 @@ void range_list_from_string(GList **out_ranges, const char 
*filter_spec,
 error_setg(errp, "Invalid range");
 goto out;
 }
-range_set_bounds(range, lob, upb);
-*out_ranges = g_list_append(*out_ranges, range);
-range = NULL;
+*out_ranges = append_new_range(*out_ranges, lob, upb);
 }
 out:
-g_free(range);
 g_strfreev(ranges);
 }
 
-- 
2.43.2




[PATCH v3 00/12] plugins/execlog: add data address match and address range support

2024-03-01 Thread Sven Schnelle
Hi List,

this patchset adds a new -dfilter option and address range matching. With this
execlog can match only a certain range of address for both instruction and
data adresses.

Example usage:

qemu-system-xxx  -d plugin -plugin 
libexeclog.so,afilter=0x1000-0x2000,dfilter=0x388

This would only log instruction in the address range 0x1000 to 0x2000
and accessing data at address 0x388.

Changes in v3:
- extend plugin api to re-use the existing range parsing infrastructure
- export error report api

Changes in v2:
- rebased on top of latest master

Sven Schnelle (12):
  util/log: convert debug_regions to GList
  util/log: make qemu_set_dfilter_ranges() take a GList
  util/range: move range_list_from_string() to range.c
  util/range: add range_list_free()
  util/range: use append_new_range() in range_list_from_string()
  util/range: split up range_list_from_string()
  util/range: make range_list_from_string() accept a single number
  qemu/range: add range_list_contains() function
  plugins: add API to print errors
  plugins: add range list API
  plugins/execlog: use range list api
  plugins/execlog: add data address match

 contrib/plugins/execlog.c|  55 +++
 include/qemu/qemu-plugin.h   |  53 ++
 include/qemu/range.h |  24 +
 plugins/api.c|  25 +
 plugins/qemu-plugins.symbols |   4 ++
 util/log.c   |  84 ++---
 util/range.c | 102 +++
 7 files changed, 244 insertions(+), 103 deletions(-)

--
2.43.2



[PATCH v3 01/12] util/log: convert debug_regions to GList

2024-03-01 Thread Sven Schnelle
In preparation of making the parsing part of qemu_set_dfilter_ranges()
available to other users, convert it to return a GList, so the result
can be used with other functions taking a GList of struct Range.

Signed-off-by: Sven Schnelle 
---
 util/log.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/util/log.c b/util/log.c
index d36c98da0b..779c0689a9 100644
--- a/util/log.c
+++ b/util/log.c
@@ -46,7 +46,7 @@ static __thread Notifier qemu_log_thread_cleanup_notifier;
 
 int qemu_loglevel;
 static bool log_per_thread;
-static GArray *debug_regions;
+static GList *debug_regions;
 
 /* Returns true if qemu_log() will really write somewhere. */
 bool qemu_log_enabled(void)
@@ -368,38 +368,36 @@ bool qemu_set_log_filename_flags(const char *name, int 
flags, Error **errp)
  */
 bool qemu_log_in_addr_range(uint64_t addr)
 {
-if (debug_regions) {
-int i = 0;
-for (i = 0; i < debug_regions->len; i++) {
-Range *range = &g_array_index(debug_regions, Range, i);
-if (range_contains(range, addr)) {
-return true;
-}
-}
-return false;
-} else {
+GList *p;
+
+if (!debug_regions) {
 return true;
 }
-}
 
+for (p = g_list_first(debug_regions); p; p = g_list_next(p)) {
+if (range_contains(p->data, addr)) {
+return true;
+}
+}
+return false;
+}
 
 void qemu_set_dfilter_ranges(const char *filter_spec, Error **errp)
 {
 gchar **ranges = g_strsplit(filter_spec, ",", 0);
+struct Range *range = NULL;
 int i;
 
 if (debug_regions) {
-g_array_unref(debug_regions);
+g_list_free_full(debug_regions, g_free);
 debug_regions = NULL;
 }
 
-debug_regions = g_array_sized_new(FALSE, FALSE,
-  sizeof(Range), g_strv_length(ranges));
 for (i = 0; ranges[i]; i++) {
 const char *r = ranges[i];
 const char *range_op, *r2, *e;
 uint64_t r1val, r2val, lob, upb;
-struct Range range;
+range = g_new0(struct Range, 1);
 
 range_op = strstr(r, "-");
 r2 = range_op ? range_op + 1 : NULL;
@@ -448,10 +446,12 @@ void qemu_set_dfilter_ranges(const char *filter_spec, 
Error **errp)
 error_setg(errp, "Invalid range");
 goto out;
 }
-range_set_bounds(&range, lob, upb);
-g_array_append_val(debug_regions, range);
+range_set_bounds(range, lob, upb);
+debug_regions = g_list_append(debug_regions, range);
+range = NULL;
 }
 out:
+g_free(range);
 g_strfreev(ranges);
 }
 
-- 
2.43.2




[PATCH v3 07/12] util/range: make range_list_from_string() accept a single number

2024-03-01 Thread Sven Schnelle
To use range_list_from_string() as a replacement in the execlog
plugin, make it accept single numbers instead of a range. This
might also be useful for the already present debug_ranges filtering.

Signed-off-by: Sven Schnelle 
---
 util/range.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/util/range.c b/util/range.c
index 8c463995e7..7784c21b12 100644
--- a/util/range.c
+++ b/util/range.c
@@ -154,6 +154,11 @@ static int parse_single_range(const char *r, Error **errp,
 
 range_op = split_single_range(r, &r2);
 if (!range_op) {
+if (!qemu_strtou64(r, &e, 0, &r1val) && *e == '\0') {
+*lob = r1val;
+*upb = r1val;
+return 0;
+}
 error_setg(errp, "Bad range specifier");
 return 1;
 }
-- 
2.43.2




Re: [PATCH] hw/scsi/lsi53c895a: add timer to scripts processing

2024-02-29 Thread Sven Schnelle
BALATON Zoltan  writes:

> On Thu, 29 Feb 2024, Sven Schnelle wrote:
>> BALATON Zoltan  writes:
>> Yes, i changed it to describe the new method:
>>
>>/*
>> * Some windows drivers make the device spin waiting for a memory location
>> * to change. If we have executed more than LSI_MAX_INSN instructions then
>> * assume this is the case and start a timer. Until the timer fires, the
>> * host CPU has a chance to run and change the memory location.
>
> Is that the host or guest CPU? I thought the guest vcpu needs to get a
> chance to do something about this but maybe I did not get the problem
> at all.

Of course it's the guest CPU. I'll wait for other review comments and
fix it in the next version.

Thanks!
Sven



[PATCH v2] hw/scsi/lsi53c895a: add timer to scripts processing

2024-02-29 Thread Sven Schnelle
HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, start a timer and exit the loop. Until the
timer fires, the CPU can process instructions which might changes the
memory location.

The limit of instructions is also reduced because scripts running on
the SCSI processor are usually very short. This keeps the time until
the loop is exit short.

Suggested-by: Peter Maydell 
Signed-off-by: Sven Schnelle 
---
Changes in v2:
- update comment in lsi_execute_script()
- reset waiting state and del timer in lsi_execute_script() to
  handle the case where script processing is triggered via
  register write, and not from the pending timer
- delete timer in lsi_scsi_exit()

 hw/scsi/lsi53c895a.c | 43 +--
 hw/scsi/trace-events |  2 ++
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index d607a5f9fb..4ff9470381 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -188,7 +188,7 @@ static const char *names[] = {
 #define LSI_TAG_VALID (1 << 16)
 
 /* Maximum instructions to process. */
-#define LSI_MAX_INSN1
+#define LSI_MAX_INSN100
 
 typedef struct lsi_request {
 SCSIRequest *req;
@@ -205,6 +205,7 @@ enum {
 LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
 LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
 LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
+LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit */
 };
 
 enum {
@@ -224,6 +225,7 @@ struct LSIState {
 MemoryRegion ram_io;
 MemoryRegion io_io;
 AddressSpace pci_io_as;
+QEMUTimer *scripts_timer;
 
 int carry; /* ??? Should this be an a visible register somewhere?  */
 int status;
@@ -415,6 +417,7 @@ static void lsi_soft_reset(LSIState *s)
 s->sbr = 0;
 assert(QTAILQ_EMPTY(&s->queue));
 assert(!s->current);
+timer_del(s->scripts_timer);
 }
 
 static int lsi_dma_40bit(LSIState *s)
@@ -1127,6 +1130,12 @@ static void lsi_wait_reselect(LSIState *s)
 }
 }
 
+static void lsi_scripts_timer_start(LSIState *s)
+{
+trace_lsi_scripts_timer_start();
+timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500);
+}
+
 static void lsi_execute_script(LSIState *s)
 {
 PCIDevice *pci_dev = PCI_DEVICE(s);
@@ -1136,6 +1145,11 @@ static void lsi_execute_script(LSIState *s)
 int insn_processed = 0;
 static int reentrancy_level;
 
+if (s->waiting == LSI_WAIT_SCRIPTS) {
+timer_del(s->scripts_timer);
+s->waiting = LSI_NOWAIT;
+}
+
 reentrancy_level++;
 
 s->istat1 |= LSI_ISTAT1_SRUN;
@@ -1143,8 +1157,8 @@ again:
 /*
  * Some windows drivers make the device spin waiting for a memory location
  * to change. If we have executed more than LSI_MAX_INSN instructions then
- * assume this is the case and force an unexpected device disconnect. This
- * is apparently sufficient to beat the drivers into submission.
+ * assume this is the case and start a timer. Until the timer fires, the
+ * host CPU has a chance to run and change the memory location.
  *
  * Another issue (CVE-2023-0330) can occur if the script is programmed to
  * trigger itself again and again. Avoid this problem by stopping after
@@ -1152,13 +1166,8 @@ again:
  * which should be enough for all valid use cases).
  */
 if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
-if (!(s->sien0 & LSI_SIST0_UDC)) {
-qemu_log_mask(LOG_GUEST_ERROR,
-  "lsi_scsi: inf. loop with UDC masked");
-}
-lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-lsi_disconnect(s);
-trace_lsi_execute_script_stop();
+s->waiting = LSI_WAIT_SCRIPTS;
+lsi_scripts_timer_start(s);
 reentrancy_level--;
 return;
 }
@@ -2197,6 +2206,9 @@ static int lsi_post_load(void *opaque, int version_id)
 return -EINVAL;
 }
 
+if (s->waiting == LSI_WAIT_SCRIPTS) {
+lsi_scripts_timer_start(s);
+}
 return 0;
 }
 
@@ -2294,6 +2306,15 @@ static const struct SCSIBusInfo lsi_scsi_info = {
 .cancel = lsi_request_cancelled
 };
 
+static void scripts_timer_cb(void *opaque)
+{
+LSIState *s = opaque;
+
+trace_lsi_scripts_timer_triggered();
+s->waiting = LSI_NOWAIT;
+lsi_execute_script(s);
+}
+
 static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
 {
 LSIState *s = LSI53C895A(dev);
@@ -2313,6 +2334,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
   

Re: [PATCH] hw/scsi/lsi53c895a: add timer to scripts processing

2024-02-29 Thread Sven Schnelle
BALATON Zoltan  writes:

> On Thu, 29 Feb 2024, Sven Schnelle wrote:
>> Some OS's like HP-UX 10.20 are spinn
>
> I guess the above line is left here by accident.

Yes.

>> HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
>> under certain circumstances. As the SCSI controller and CPU are not
>> running at the same time this loop will never finish. After some
>> time, the check loop interrupts with a unexpected device disconnect.
>> This works, but is slow because the kernel resets the scsi controller.
>> Instead of signaling UDC, start a timer and exit the loop. Until the
>> timer fires, the CPU can process instructions until the timer fires.
>> The limit of instructions is also reduced because scripts running
>> on the SCSI processor are usually very short. This keeps the time
>> until the loop-exit short.
>>
>> Suggested-by: Peter Maydell 
>> Signed-off-by: Sven Schnelle 
>> ---
>> hw/scsi/lsi53c895a.c | 33 +
>> 1 file changed, 25 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
>> index d607a5f9fb..0b6f1dc72f 100644
>> --- a/hw/scsi/lsi53c895a.c
>> +++ b/hw/scsi/lsi53c895a.c
>> @@ -188,7 +188,7 @@ static const char *names[] = {
>> #define LSI_TAG_VALID (1 << 16)
>>
>> /* Maximum instructions to process. */
>> -#define LSI_MAX_INSN1
>> +#define LSI_MAX_INSN100
>>
>> typedef struct lsi_request {
>> SCSIRequest *req;
>> @@ -205,6 +205,7 @@ enum {
>> LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
>> LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
>> LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
>> +LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit 
>> */
>> };
>>
>> enum {
>> @@ -224,6 +225,7 @@ struct LSIState {
>> MemoryRegion ram_io;
>> MemoryRegion io_io;
>> AddressSpace pci_io_as;
>> +QEMUTimer *scripts_timer;
>>
>> int carry; /* ??? Should this be an a visible register somewhere?  */
>> int status;
>> @@ -415,6 +417,7 @@ static void lsi_soft_reset(LSIState *s)
>> s->sbr = 0;
>> assert(QTAILQ_EMPTY(&s->queue));
>> assert(!s->current);
>> +timer_del(s->scripts_timer);
>
> Maybe the rimer needs to be deleted in lsi_scsi_exit() too but I'm not
> sure.

I added it, thanks.

>> }
>>
>> static int lsi_dma_40bit(LSIState *s)
>> @@ -1127,6 +1130,12 @@ static void lsi_wait_reselect(LSIState *s)
>> }
>> }
>>
>> +static void lsi_scripts_timer_start(LSIState *s)
>> +{
>> +trace_lsi_scripts_timer_start();
>> +timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 
>> 500);
>> +}
>> +
>> static void lsi_execute_script(LSIState *s)
>> {
>> PCIDevice *pci_dev = PCI_DEVICE(s);
>> @@ -1152,13 +1161,8 @@ again:
>>  * which should be enough for all valid use cases).
>>  */
>
> Does tha above comment need updating to say what the code does now?

Yes, i changed it to describe the new method:

/*
 * Some windows drivers make the device spin waiting for a memory location
 * to change. If we have executed more than LSI_MAX_INSN instructions then
 * assume this is the case and start a timer. Until the timer fires, the
 * host CPU has a chance to run and change the memory location.
 *
 * Another issue (CVE-2023-0330) can occur if the script is programmed to
 * trigger itself again and again. Avoid this problem by stopping after
 * being called multiple times in a reentrant way (8 is an arbitrary value
 * which should be enough for all valid use cases).
 */

> Regards,
> BALATON Zoltan
>
>> if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
>> -if (!(s->sien0 & LSI_SIST0_UDC)) {
>> -qemu_log_mask(LOG_GUEST_ERROR,
>> -  "lsi_scsi: inf. loop with UDC masked");
>> -}
>> -lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
>> -lsi_disconnect(s);
>> -trace_lsi_execute_script_stop();
>> +s->waiting = LSI_WAIT_SCRIPTS;
>> +lsi_scripts_timer_start(s);
>> reentrancy_level--;
>> return;
>> }
>> @@ -2197,6 +2201,9 @@ static int lsi_post_load(void *opaque, int version_id)
>> return -EINVAL;
>> }
>>
>> +if (s->waiting == LSI_WAIT_SCRIPTS) {
>

Re: [PATCH 0/3] plugins/execlog: add data address match and address range support

2024-02-29 Thread Sven Schnelle
Hi Alex,

Alex Bennée  writes:

> Sven Schnelle  writes:
> I think we've lost a patch in the re-posting. patchew hasn't seen it
> either:
>
>   https://patchew.org/QEMU/20240229150729.1620410-1-sv...@stackframe.org/
>
>>   plugins/execlog: add data address match
>>   plugins/execlog: add address range matching
>>
>>  contrib/plugins/execlog.c | 95 ---
>>  1 file changed, 79 insertions(+), 16 deletions(-)

Yes, i got a 550 mail. But i'll look into the qemu_set_dfilter_ranges()
before resending.



[PATCH] hw/scsi/lsi53c895a: add timer to scripts processing

2024-02-29 Thread Sven Schnelle
Some OS's like HP-UX 10.20 are spinn

HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, start a timer and exit the loop. Until the
timer fires, the CPU can process instructions until the timer fires.
The limit of instructions is also reduced because scripts running
on the SCSI processor are usually very short. This keeps the time
until the loop-exit short.

Suggested-by: Peter Maydell 
Signed-off-by: Sven Schnelle 
---
 hw/scsi/lsi53c895a.c | 33 +
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index d607a5f9fb..0b6f1dc72f 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -188,7 +188,7 @@ static const char *names[] = {
 #define LSI_TAG_VALID (1 << 16)
 
 /* Maximum instructions to process. */
-#define LSI_MAX_INSN1
+#define LSI_MAX_INSN100
 
 typedef struct lsi_request {
 SCSIRequest *req;
@@ -205,6 +205,7 @@ enum {
 LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
 LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
 LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
+LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit */
 };
 
 enum {
@@ -224,6 +225,7 @@ struct LSIState {
 MemoryRegion ram_io;
 MemoryRegion io_io;
 AddressSpace pci_io_as;
+QEMUTimer *scripts_timer;
 
 int carry; /* ??? Should this be an a visible register somewhere?  */
 int status;
@@ -415,6 +417,7 @@ static void lsi_soft_reset(LSIState *s)
 s->sbr = 0;
 assert(QTAILQ_EMPTY(&s->queue));
 assert(!s->current);
+timer_del(s->scripts_timer);
 }
 
 static int lsi_dma_40bit(LSIState *s)
@@ -1127,6 +1130,12 @@ static void lsi_wait_reselect(LSIState *s)
 }
 }
 
+static void lsi_scripts_timer_start(LSIState *s)
+{
+trace_lsi_scripts_timer_start();
+timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500);
+}
+
 static void lsi_execute_script(LSIState *s)
 {
 PCIDevice *pci_dev = PCI_DEVICE(s);
@@ -1152,13 +1161,8 @@ again:
  * which should be enough for all valid use cases).
  */
 if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
-if (!(s->sien0 & LSI_SIST0_UDC)) {
-qemu_log_mask(LOG_GUEST_ERROR,
-  "lsi_scsi: inf. loop with UDC masked");
-}
-lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-lsi_disconnect(s);
-trace_lsi_execute_script_stop();
+s->waiting = LSI_WAIT_SCRIPTS;
+lsi_scripts_timer_start(s);
 reentrancy_level--;
 return;
 }
@@ -2197,6 +2201,9 @@ static int lsi_post_load(void *opaque, int version_id)
 return -EINVAL;
 }
 
+if (s->waiting == LSI_WAIT_SCRIPTS) {
+lsi_scripts_timer_start(s);
+}
 return 0;
 }
 
@@ -2294,6 +2301,15 @@ static const struct SCSIBusInfo lsi_scsi_info = {
 .cancel = lsi_request_cancelled
 };
 
+static void scripts_timer_cb(void *opaque)
+{
+LSIState *s = opaque;
+
+trace_lsi_scripts_timer_triggered();
+s->waiting = LSI_NOWAIT;
+lsi_execute_script(s);
+}
+
 static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
 {
 LSIState *s = LSI53C895A(dev);
@@ -2313,6 +2329,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
   "lsi-ram", 0x2000);
 memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
   "lsi-io", 256);
+s->scripts_timer = timer_new_us(QEMU_CLOCK_VIRTUAL, scripts_timer_cb, s);
 
 /*
  * Since we use the address-space API to interact with ram_io, disable the
-- 
2.43.2




Re: [PATCH 3/3] plugins/execlog: add address range matching

2024-02-29 Thread Sven Schnelle
Hi Alex,

Alex Bennée  writes:

> Sven Schnelle  writes:
>> +static void parse_vaddr_match(GArray **matches, char *token)
>>  {
>> -uint64_t v = g_ascii_strtoull(match, NULL, 16);
>> +uint64_t low, high;
>> +gchar *endp;
>>  
>> -if (!matches) {
>> -*matches = g_array_new(false, true, sizeof(uint64_t));
>> +low = g_ascii_strtoull(token, &endp, 16);
>> +if (endp == token) {
>> +fprintf(stderr, "Invalid address(range) specified: %s\n", token);
>> +return;
>> +}
>> +
>> +if (*endp != '-') {
>> +high = low;
>> +} else {
>> +high = g_ascii_strtoull(endp + 1, &endp, 16);
>> +if (endp == token) {
>> +fprintf(stderr, "Invalid address(range) specified: %s\n", 
>> token);
>> +return;
>> +}
>> +}
>> +
>> +if (!*matches) {
>> +*matches = g_array_new(false, true, sizeof(struct address_match));
>>  }
>> -g_array_append_val(*matches, v);
>> +struct address_match *match = g_new(struct address_match, 1);
>> +match->low = low;
>> +match->high = high;
>> +g_array_append_val(*matches, match);
>
> This is almost but not quite qemu_set_dfilter_ranges(). I wonder if it
> would be worth a light re-factoring and then exposing the parser as a
> helper function?

Thanks, I'll take a look. I wasn't aware of qemu_set_dfilter_ranges().



Re: [PATCH] hw/scsi/lsi53c895a: add hack to prevent scsi timeouts in HP-UX 10.20

2024-02-29 Thread Sven Schnelle
Peter Maydell  writes:

> On Thu, 29 Feb 2024 at 16:54, Sven Schnelle  wrote:
>>
>> Peter Maydell  writes:
>>
>> > On Wed, 28 Feb 2024 at 21:12, Sven Schnelle  wrote:
>> >>
>> >> HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
>> >> under certain circumstances. As the SCSI controller and CPU are not
>> >> running at the same time this loop will never finish. After some
>> >> time, the check loop interrupts with a unexpected device disconnect.
>> >> This works, but is slow because the kernel resets the scsi controller.
>> >> Instead of signaling UDC, add an option 'hpux-spin-workaround' which
>> >> emulates a INTERRUPT 2 script instruction. This instruction tells the
>> >> kernel that the request was fulfilled. With this change, SCSI speeds
>> >> improves significantly.
>> >> [..]
>> > I see we already have a hacky workaround for other OSes
>> > that do something similar. The ideal fix for both of these
>> > I think would be for lsi_execute_script() to, instead of stopping,
>> > arrange to defer executing more script instructions until
>> > after the guest has had a chance to run a bit more.
>> > I think setting a timer that calls lsi_resume_script() after
>> > a while would have that effect.
>>
>> Thanks, good idea. So something like this?
>
> Yeah, something like that I think. (You probably want to delete
> the timer on reset, and you need to handle migration -- you can
> either put the timer state in a new vmstate section, or else
> in post-load if the state is 'LSI_WAIT_SCRIPTS' arm the timer.)

> Does it work? :-)

Yes it does. I only did a quick hack without caring about things like
reset, freeing the timer and migration. I successfully booted HP-UX 10.20
without any scsi errors in the dmesg. So i think this is much better
than the commandline option. I'll prepare a proper patch in the next
days and submit it.

Thanks!
Sven



Re: [PATCH] hw/scsi/lsi53c895a: add hack to prevent scsi timeouts in HP-UX 10.20

2024-02-29 Thread Sven Schnelle
Peter Maydell  writes:

> On Wed, 28 Feb 2024 at 21:12, Sven Schnelle  wrote:
>>
>> HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
>> under certain circumstances. As the SCSI controller and CPU are not
>> running at the same time this loop will never finish. After some
>> time, the check loop interrupts with a unexpected device disconnect.
>> This works, but is slow because the kernel resets the scsi controller.
>> Instead of signaling UDC, add an option 'hpux-spin-workaround' which
>> emulates a INTERRUPT 2 script instruction. This instruction tells the
>> kernel that the request was fulfilled. With this change, SCSI speeds
>> improves significantly.
>> [..]
> I see we already have a hacky workaround for other OSes
> that do something similar. The ideal fix for both of these
> I think would be for lsi_execute_script() to, instead of stopping,
> arrange to defer executing more script instructions until
> after the guest has had a chance to run a bit more.
> I think setting a timer that calls lsi_resume_script() after
> a while would have that effect.

Thanks, good idea. So something like this?

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index d607a5f9fb..9931799d44 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -188,7 +188,7 @@ static const char *names[] = {
 #define LSI_TAG_VALID (1 << 16)
 
 /* Maximum instructions to process. */
-#define LSI_MAX_INSN1
+#define LSI_MAX_INSN1000
 
 typedef struct lsi_request {
 SCSIRequest *req;
@@ -205,6 +205,7 @@ enum {
 LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
 LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
 LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
+LSI_WAIT_SCRIPTS, /* SCRIPTS where stopped because of instruction count 
limit */
 };
 
 enum {
@@ -224,6 +225,7 @@ struct LSIState {
 MemoryRegion ram_io;
 MemoryRegion io_io;
 AddressSpace pci_io_as;
+QEMUTimer *scripts_timer;
 
 int carry; /* ??? Should this be an a visible register somewhere?  */
 int status;
@@ -1152,13 +1154,9 @@ again:
  * which should be enough for all valid use cases).
  */
 if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
-if (!(s->sien0 & LSI_SIST0_UDC)) {
-qemu_log_mask(LOG_GUEST_ERROR,
-  "lsi_scsi: inf. loop with UDC masked");
-}
-lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-lsi_disconnect(s);
-trace_lsi_execute_script_stop();
+s->waiting = LSI_WAIT_SCRIPTS;
+timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 
500);
+trace_lsi_scripts_timer_arm();
 reentrancy_level--;
 return;
 }
@@ -2294,6 +2292,17 @@ static const struct SCSIBusInfo lsi_scsi_info = {
 .cancel = lsi_request_cancelled
 };
 
+static void scripts_timer_cb(void *opaque)
+{
+LSIState *s = opaque;
+
+trace_lsi_scripts_timer_triggered();
+if (s->waiting == LSI_WAIT_SCRIPTS) {
+s->waiting = 0;
+lsi_execute_script(s);
+}
+}
+
 static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
 {
 LSIState *s = LSI53C895A(dev);
@@ -2313,6 +2322,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
   "lsi-ram", 0x2000);
 memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
   "lsi-io", 256);
+s->scripts_timer = timer_new_us(QEMU_CLOCK_VIRTUAL, scripts_timer_cb, s);
 
 /*
  * Since we use the address-space API to interact with ram_io, disable the
diff --git a/hw/scsi/trace-events b/hw/scsi/trace-events
index d72f741ed8..4456a08ab0 100644
--- a/hw/scsi/trace-events
+++ b/hw/scsi/trace-events
@@ -302,6 +302,8 @@ lsi_execute_script_stop(void) "SCRIPTS execution stopped"
 lsi_awoken(void) "Woken by SIGP"
 lsi_reg_read(const char *name, int offset, uint8_t ret) "Read reg %s 0x%x = 
0x%02x"
 lsi_reg_write(const char *name, int offset, uint8_t val) "Write reg %s 0x%x = 
0x%02x"
+lsi_scripts_timer_triggered(void) "SCRIPTS timer triggered"
+lsi_scripts_timer_arm(void) "SCRIPTS timer armed"
 
 # virtio-scsi.c
 virtio_scsi_cmd_req(int lun, uint32_t tag, uint8_t cmd) "virtio_scsi_cmd_req 
lun=%u tag=0x%x cmd=0x%x"



Re: [PATCH 0/4] plugins/execlog: add data address match and address range support

2024-02-29 Thread Sven Schnelle
Pierrick Bouvier  writes:

> Hi Sven, thanks for your series.
>
> Yesterday, series for new API to access registers from plugins was
> merged. As part of it, execlog plugin was extended to support this
> [1].
> This conflict with the changes presented here.
>
> Could you please rebase this series on top of master?

I sent out a rebased version. Should have done a git pull before...

Thanks,
Sven



[PATCH 3/3] plugins/execlog: add address range matching

2024-02-29 Thread Sven Schnelle
Allow to match memory ranges with the address matches. This
allows to give a range of adresses like '-dfilter=0-0x400'
which would only log memory accesses between 0 and 400.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 73 ++-
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index c89ebc08b6..b1b2a7baf1 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -44,6 +44,11 @@ static bool disas_assist;
 static GMutex add_reg_name_lock;
 static GPtrArray *all_reg_names;
 
+struct address_match {
+uint64_t low;
+uint64_t high;
+};
+
 static CPU *get_cpu(int vcpu_index)
 {
 CPU *c;
@@ -54,11 +59,12 @@ static CPU *get_cpu(int vcpu_index)
 return c;
 }
 
-static bool match_vaddr(uint64_t vaddr)
+static bool match_address_range(GArray *match, uint64_t vaddr)
 {
-for (int i = 0; i < dmatches->len; i++) {
-uint64_t v = g_array_index(dmatches, uint64_t, i);
-if (v == vaddr) {
+for (int i = 0; i < match->len; i++) {
+struct address_match *m =
+g_array_index(match, struct address_match *, i);
+if (vaddr >= m->low && vaddr <= m->high) {
 return true;
 }
 }
@@ -74,9 +80,7 @@ static void vcpu_mem(unsigned int cpu_index, 
qemu_plugin_meminfo_t info,
 CPU *c = get_cpu(cpu_index);
 GString *s = c->last_exec;
 
-/* Find vCPU in array */
-
-if (dmatches && !match_vaddr(vaddr)) {
+if (dmatches && !match_address_range(dmatches, vaddr)) {
 return;
 }
 c->log = true;
@@ -164,8 +168,10 @@ static void vcpu_insn_exec_only_regs(unsigned int 
cpu_index, void *udata)
 insn_check_regs(cpu);
 }
 
-qemu_plugin_outs(cpu->last_exec->str);
-qemu_plugin_outs("\n");
+if (cpu->log) {
+qemu_plugin_outs(cpu->last_exec->str);
+qemu_plugin_outs("\n");
+}
 }
 
 /* reset */
@@ -178,7 +184,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 CPU *cpu = get_cpu(cpu_index);
 
 /* Print previous instruction in cache */
-if (cpu->last_exec->len) {
+if (cpu->log && cpu->last_exec->len) {
 qemu_plugin_outs(cpu->last_exec->str);
 qemu_plugin_outs("\n");
 }
@@ -239,8 +245,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 if (skip && amatches) {
 int j;
 for (j = 0; j < amatches->len && skip; j++) {
-uint64_t v = g_array_index(amatches, uint64_t, j);
-if (v == insn_vaddr) {
+if (match_address_range(amatches, insn_vaddr)) {
 skip = false;
 }
 }
@@ -394,6 +399,17 @@ static void vcpu_init(qemu_plugin_id_t id, unsigned int 
vcpu_index)
 c->registers = registers_init(vcpu_index);
 }
 
+static void free_matches(GArray *matches)
+{
+if (!matches) {
+return;
+}
+
+for (int i = 0; i < matches->len; i++) {
+g_free(g_array_index(matches, struct address_match *, i));
+}
+}
+
 /**
  * On plugin exit, print last instruction in cache
  */
@@ -409,6 +425,9 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
 }
 }
 g_rw_lock_reader_unlock(&expand_array_lock);
+
+free_matches(amatches);
+free_matches(dmatches);
 }
 
 /* Add a match to the array of matches */
@@ -420,14 +439,34 @@ static void parse_insn_match(char *match)
 g_ptr_array_add(imatches, g_strdup(match));
 }
 
-static void parse_vaddr_match(GArray **matches, char *match)
+static void parse_vaddr_match(GArray **matches, char *token)
 {
-uint64_t v = g_ascii_strtoull(match, NULL, 16);
+uint64_t low, high;
+gchar *endp;
 
-if (!matches) {
-*matches = g_array_new(false, true, sizeof(uint64_t));
+low = g_ascii_strtoull(token, &endp, 16);
+if (endp == token) {
+fprintf(stderr, "Invalid address(range) specified: %s\n", token);
+return;
+}
+
+if (*endp != '-') {
+high = low;
+} else {
+high = g_ascii_strtoull(endp + 1, &endp, 16);
+if (endp == token) {
+fprintf(stderr, "Invalid address(range) specified: %s\n", token);
+return;
+}
+}
+
+if (!*matches) {
+*matches = g_array_new(false, true, sizeof(struct address_match));
 }
-g_array_append_val(*matches, v);
+struct address_match *match = g_new(struct address_match, 1);
+match->low = low;
+match->high = high;
+g_array_append_val(*matches, match);
 }
 
 /*
-- 
2.43.2




[PATCH 2/3] plugins/execlog: add data address match

2024-02-29 Thread Sven Schnelle
Add a match similar to the afilter address match, but for data
addresses. When an address is specified with '-dfilter=0x12345'
only load/stores to/from address 0x12345 are printed. All other
instructions are hidden.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index 934553e83d..c89ebc08b6 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -27,6 +27,8 @@ typedef struct CPU {
 GString *last_exec;
 /* Ptr array of Register */
 GPtrArray *registers;
+/* whether this instruction should be logged */
+bool log;
 } CPU;
 
 QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
@@ -36,6 +38,7 @@ static GRWLock expand_array_lock;
 
 static GPtrArray *imatches;
 static GArray *amatches;
+static GArray *dmatches;
 static GPtrArray *rmatches;
 static bool disas_assist;
 static GMutex add_reg_name_lock;
@@ -51,6 +54,17 @@ static CPU *get_cpu(int vcpu_index)
 return c;
 }
 
+static bool match_vaddr(uint64_t vaddr)
+{
+for (int i = 0; i < dmatches->len; i++) {
+uint64_t v = g_array_index(dmatches, uint64_t, i);
+if (v == vaddr) {
+return true;
+}
+}
+return false;
+}
+
 /**
  * Add memory read or write information to current instruction log
  */
@@ -62,6 +76,11 @@ static void vcpu_mem(unsigned int cpu_index, 
qemu_plugin_meminfo_t info,
 
 /* Find vCPU in array */
 
+if (dmatches && !match_vaddr(vaddr)) {
+return;
+}
+c->log = true;
+
 /* Indicate type of memory access */
 if (qemu_plugin_mem_is_store(info)) {
 g_string_append(s, ", store");
@@ -121,15 +140,17 @@ static void vcpu_insn_exec_with_regs(unsigned int 
cpu_index, void *udata)
 if (cpu->registers) {
 insn_check_regs(cpu);
 }
-
-qemu_plugin_outs(cpu->last_exec->str);
-qemu_plugin_outs("\n");
+if (cpu->log) {
+qemu_plugin_outs(cpu->last_exec->str);
+qemu_plugin_outs("\n");
+}
 }
 
 /* Store new instruction in cache */
 /* vcpu_mem will add memory access information to last_exec */
 g_string_printf(cpu->last_exec, "%u, ", cpu_index);
 g_string_append(cpu->last_exec, (char *)udata);
+cpu->log = dmatches ? false : true;
 }
 
 /* Log last instruction while checking registers, ignore next */
@@ -166,6 +187,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 /* vcpu_mem will add memory access information to last_exec */
 g_string_printf(cpu->last_exec, "%u, ", cpu_index);
 g_string_append(cpu->last_exec, (char *)udata);
+cpu->log = dmatches ? false : true;
 }
 
 /**
@@ -381,7 +403,7 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
 g_rw_lock_reader_lock(&expand_array_lock);
 for (i = 0; i < cpus->len; i++) {
 CPU *c = get_cpu(i);
-if (c->last_exec && c->last_exec->str) {
+if (c->log && c->last_exec && c->last_exec->str) {
 qemu_plugin_outs(c->last_exec->str);
 qemu_plugin_outs("\n");
 }
@@ -441,6 +463,8 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t 
id,
 parse_insn_match(tokens[1]);
 } else if (g_strcmp0(tokens[0], "afilter") == 0) {
 parse_vaddr_match(&amatches, tokens[1]);
+} else if (g_strcmp0(tokens[0], "dfilter") == 0) {
+parse_vaddr_match(&dmatches, tokens[1]);
 } else if (g_strcmp0(tokens[0], "reg") == 0) {
 add_regpat(tokens[1]);
 } else if (g_strcmp0(tokens[0], "rdisas") == 0) {
-- 
2.43.2




[PATCH 0/3] plugins/execlog: add data address match and address range support

2024-02-29 Thread Sven Schnelle
Hi List,

this patchset adds a new -dfilter option and address range matching. With this
execlog can match only a certain range of address for both instruction and
data adresses.

Example usage:

qemu-system-xxx  -d plugin -plugin 
libexeclog.so,afilter=0x1000-0x2000,dfilter=0x388

This would only log instruction in the address range 0x1000 to 0x2000
and accessing data at address 0x388.

Changes in v2:
- rebased on top of latest master

Sven Schnelle (3):
  plugins/execlog: pass matches array to parse_vaddr_match
  plugins/execlog: add data address match
  plugins/execlog: add address range matching

 contrib/plugins/execlog.c | 95 ---
 1 file changed, 79 insertions(+), 16 deletions(-)

-- 
2.43.2




[PATCH] hw/scsi/lsi53c895a: add hack to prevent scsi timeouts in HP-UX 10.20

2024-02-28 Thread Sven Schnelle
HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, add an option 'hpux-spin-workaround' which
emulates a INTERRUPT 2 script instruction. This instruction tells the
kernel that the request was fulfilled. With this change, SCSI speeds
improves significantly.

The option can be enabled by adding

-global lsi53c895a.hpux-spin-workaround=on

to the qemu commandline.

Signed-off-by: Sven Schnelle 
---
 hw/scsi/lsi53c895a.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index d607a5f9fb..20c353f594 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -304,6 +304,7 @@ struct LSIState {
 uint32_t adder;
 
 uint8_t script_ram[2048 * sizeof(uint32_t)];
+bool hpux_spin_workaround;
 };
 
 #define TYPE_LSI53C810  "lsi53c810"
@@ -1156,8 +1157,17 @@ again:
 qemu_log_mask(LOG_GUEST_ERROR,
   "lsi_scsi: inf. loop with UDC masked");
 }
-lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
-lsi_disconnect(s);
+if (s->hpux_spin_workaround) {
+/*
+ * Workaround for HP-UX 10.20: Instead of disconnecting, which
+ * causes a long delay, emulate a INTERRUPT 2 instruction.
+ */
+s->dsps = 2;
+lsi_script_dma_interrupt(s, LSI_DSTAT_SIR);
+} else {
+lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
+lsi_disconnect(s);
+}
 trace_lsi_execute_script_stop();
 reentrancy_level--;
 return;
@@ -2339,6 +2349,11 @@ static void lsi_scsi_exit(PCIDevice *dev)
 address_space_destroy(&s->pci_io_as);
 }
 
+static Property lsi_props[] = {
+DEFINE_PROP_BOOL("hpux-spin-workaround", LSIState, hpux_spin_workaround,
+ false),
+};
+
 static void lsi_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -2353,6 +2368,7 @@ static void lsi_class_init(ObjectClass *klass, void *data)
 dc->reset = lsi_scsi_reset;
 dc->vmsd = &vmstate_lsi_scsi;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+device_class_set_props(dc, lsi_props);
 }
 
 static const TypeInfo lsi_info = {
-- 
2.43.2




[PATCH 2/3] target/hppa: add 64 bit support to gdbstub

2024-02-28 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/gdbstub.c | 48 +--
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/target/hppa/gdbstub.c b/target/hppa/gdbstub.c
index 48a514384f..a5b2c80c07 100644
--- a/target/hppa/gdbstub.c
+++ b/target/hppa/gdbstub.c
@@ -21,11 +21,25 @@
 #include "cpu.h"
 #include "gdbstub/helpers.h"
 
+static int hppa_num_regs(CPUHPPAState *env)
+{
+return hppa_is_pa20(env) ? 96 : 128;
+}
+
+static int hppa_reg_size(CPUHPPAState *env)
+{
+return hppa_is_pa20(env) ? 8 : 4;
+}
+
 int hppa_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
 {
 HPPACPU *cpu = HPPA_CPU(cs);
 CPUHPPAState *env = &cpu->env;
-target_ureg val;
+target_ulong val;
+
+if (n >= hppa_num_regs(env)) {
+return 0;
+}
 
 switch (n) {
 case 0:
@@ -128,18 +142,18 @@ int hppa_cpu_gdb_read_register(CPUState *cs, GByteArray 
*mem_buf, int n)
 val = env->cr[30];
 break;
 case 64 ... 127:
-val = extract64(env->fr[(n - 64) / 2], (n & 1 ? 0 : 32), 32);
-break;
-default:
-if (n < 128) {
-val = 0;
+if (hppa_is_pa20(env)) {
+val = env->fr[n - 64];
 } else {
-return 0;
+val = extract64(env->fr[(n - 64) / 2], (n & 1 ? 0 : 32), 32);
 }
 break;
+default:
+val = 0;
+break;
 }
 
-if (TARGET_REGISTER_BITS == 64) {
+if (hppa_is_pa20(env)) {
 return gdb_get_reg64(mem_buf, val);
 } else {
 return gdb_get_reg32(mem_buf, val);
@@ -150,9 +164,13 @@ int hppa_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 {
 HPPACPU *cpu = HPPA_CPU(cs);
 CPUHPPAState *env = &cpu->env;
-target_ureg val;
+target_ulong val;
+
+if (n >= hppa_num_regs(env)) {
+return 0;
+}
 
-if (TARGET_REGISTER_BITS == 64) {
+if (hppa_is_pa20(env)) {
 val = ldq_p(mem_buf);
 } else {
 val = ldl_p(mem_buf);
@@ -267,16 +285,16 @@ int hppa_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 cpu_hppa_loaded_fr0(env);
 break;
 case 65 ... 127:
-{
+if (hppa_is_pa20(env)) {
+env->fr[n - 64] = val;
+} else {
 uint64_t *fr = &env->fr[(n - 64) / 2];
 *fr = deposit64(*fr, (n & 1 ? 0 : 32), 32, val);
 }
 break;
 default:
-if (n >= 128) {
-return 0;
-}
 break;
 }
-return sizeof(target_ureg);
+
+return hppa_reg_size(env);
 }
-- 
2.43.2




[PATCH 3/3] target/hppa: mask CR_SAR register writes to 5/6 bit in gdbstub

2024-02-28 Thread Sven Schnelle
Signed-off-by: Sven Schnelle 
---
 target/hppa/gdbstub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/gdbstub.c b/target/hppa/gdbstub.c
index a5b2c80c07..049b2d6381 100644
--- a/target/hppa/gdbstub.c
+++ b/target/hppa/gdbstub.c
@@ -184,7 +184,7 @@ int hppa_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 env->gr[n] = val;
 break;
 case 32:
-env->cr[CR_SAR] = val;
+env->cr[CR_SAR] = val & (hppa_is_pa20(env) ? 63 : 31);
 break;
 case 33:
 env->iaoq_f = val;
-- 
2.43.2




[PATCH 0/3] 64 Bit support for hppa gdbstub

2024-02-28 Thread Sven Schnelle
Hi List,

this patchset allows to debug the hppa target when running in wide (64 bit)
mode. gdb needs a small patch to switch to 64 bit mode. I pushed the
patch to 
https://github.com/bminor/binutils-gdb/commit/fd8662ec282d688d1f8100b4365823e57516857b
With this patch gdb will switch to the appropriate mode when connecting
to qemu/gdbstub.

Sven Schnelle (3):
  Revert "target/hppa: Drop attempted gdbstub support for hppa64"
  target/hppa: add 64 bit support to gdbstub
  target/hppa: mask CR_SAR register writes to 5/6 bit in gdbstub

 target/hppa/gdbstub.c | 66 +--
 1 file changed, 45 insertions(+), 21 deletions(-)

-- 
2.43.2




[PATCH 1/3] Revert "target/hppa: Drop attempted gdbstub support for hppa64"

2024-02-28 Thread Sven Schnelle
Despite commit e207b4aa718e ("target/hppa: Drop attempted gdbstub
support for hppa64") saying that hppa-linux-gdb doesn't support 64 bit
mode via remote protocol, it is actually working with a small add-on
patch which enables gdb to guess the size from the g protocol:

$ hppa64-linux-gnu-gdb ~/seabios-hppa/out-64/hppa-firmware64.img
[..]
Reading symbols from /home/svens/seabios-hppa/out-64/hppa-firmware64.img...
(gdb) target remote :1234
Remote debugging using :1234
warning: remote target does not support file transfer, attempting to access 
files from local filesystem.
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
startup () at src/parisc/head.S:144
144 rsm PSW_I, %r0  /* disable local irqs */
(gdb)

Signed-off-by: Sven Schnelle 
---
 target/hppa/gdbstub.c | 32 +++-
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/target/hppa/gdbstub.c b/target/hppa/gdbstub.c
index 4a965b38d7..48a514384f 100644
--- a/target/hppa/gdbstub.c
+++ b/target/hppa/gdbstub.c
@@ -21,16 +21,11 @@
 #include "cpu.h"
 #include "gdbstub/helpers.h"
 
-/*
- * GDB 15 only supports PA1.0 via the remote protocol, and ignores
- * any provided xml.  Which means that any attempt to provide more
- * data results in "Remote 'g' packet reply is too long".
- */
-
 int hppa_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
 {
-CPUHPPAState *env = cpu_env(cs);
-uint32_t val;
+HPPACPU *cpu = HPPA_CPU(cs);
+CPUHPPAState *env = &cpu->env;
+target_ureg val;
 
 switch (n) {
 case 0:
@@ -144,13 +139,24 @@ int hppa_cpu_gdb_read_register(CPUState *cs, GByteArray 
*mem_buf, int n)
 break;
 }
 
-return gdb_get_reg32(mem_buf, val);
+if (TARGET_REGISTER_BITS == 64) {
+return gdb_get_reg64(mem_buf, val);
+} else {
+return gdb_get_reg32(mem_buf, val);
+}
 }
 
 int hppa_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
 {
-CPUHPPAState *env = cpu_env(cs);
-uint32_t val = ldl_p(mem_buf);
+HPPACPU *cpu = HPPA_CPU(cs);
+CPUHPPAState *env = &cpu->env;
+target_ureg val;
+
+if (TARGET_REGISTER_BITS == 64) {
+val = ldq_p(mem_buf);
+} else {
+val = ldl_p(mem_buf);
+}
 
 switch (n) {
 case 0:
@@ -160,7 +166,7 @@ int hppa_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 env->gr[n] = val;
 break;
 case 32:
-env->cr[CR_SAR] = val & (hppa_is_pa20(env) ? 63 : 31);
+env->cr[CR_SAR] = val;
 break;
 case 33:
 env->iaoq_f = val;
@@ -272,5 +278,5 @@ int hppa_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 }
 break;
 }
-return 4;
+return sizeof(target_ureg);
 }
-- 
2.43.2




[PATCH 0/4] plugins/execlog: add data address match and address range support

2024-02-28 Thread Sven Schnelle
Hi List,

this patchset adds a new -dfilter option and address range matching. With this
execlog can match only a certain range of address for both instruction and
data adresses.

Example usage:

qemu-system-xxx  -d plugin -plugin 
libexeclog.so,afilter=0x1000-0x2000,dfilter=0x388

This would only log instruction in the address range 0x1000 to 0x2000
and accessing data at address 0x388.

Sven Schnelle (4):
  plugins/execlog: add struct execlog_ctx
  plugins/execlog: pass matches array to parse_vaddr_match
  plugins/execlog: add data address match
  plugins/execlog: add address range matching

 contrib/plugins/execlog.c | 102 ++
 1 file changed, 82 insertions(+), 20 deletions(-)

-- 
2.43.2




[PATCH 4/4] plugins/execlog: add address range matching

2024-02-28 Thread Sven Schnelle
Allow to match memory ranges with the address matches. This
allows to give a range of adresses like '-dfilter=0-0x400'
which would only log memory accesses between 0 and 400.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 65 +++
 1 file changed, 52 insertions(+), 13 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index 33fef9bfc6..a505f98be8 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -30,6 +30,11 @@ struct execlog_ctx {
 bool log;
 };
 
+struct address_match {
+uint64_t low;
+uint64_t high;
+};
+
 /*
  * Expand last_exec array.
  *
@@ -47,17 +52,18 @@ static void expand_last_exec(int cpu_index)
 g_rw_lock_writer_unlock(&expand_array_lock);
 }
 
-static bool match_vaddr(struct execlog_ctx *ctx, uint64_t vaddr)
+static bool match_address_range(GArray *match, uint64_t vaddr)
 {
-for (int i = 0; i < dmatches->len; i++) {
-uint64_t v = g_array_index(dmatches, uint64_t, i);
-if (v == vaddr) {
-ctx->log = true;
+for (int i = 0; i < match->len; i++) {
+struct address_match *m =
+g_array_index(match, struct address_match *, i);
+if (vaddr >= m->low && vaddr <= m->high) {
 return true;
 }
 }
 return false;
 }
+
 /**
  * Add memory read or write information to current instruction log
  */
@@ -70,9 +76,10 @@ static void vcpu_mem(unsigned int cpu_index, 
qemu_plugin_meminfo_t info,
 struct execlog_ctx *ctx = g_ptr_array_index(last_exec, cpu_index);
 g_rw_lock_reader_unlock(&expand_array_lock);
 
-if (dmatches && !match_vaddr(ctx, vaddr)) {
+if (dmatches && !match_address_range(dmatches, vaddr)) {
 return;
 }
+ctx->log = true;
 GString *s = ctx->s;
 /* Indicate type of memory access */
 if (qemu_plugin_mem_is_store(info)) {
@@ -166,8 +173,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 if (skip && amatches) {
 int j;
 for (j = 0; j < amatches->len && skip; j++) {
-uint64_t v = g_array_index(amatches, uint64_t, j);
-if (v == insn_vaddr) {
+if (match_address_range(amatches, insn_vaddr)) {
 skip = false;
 }
 }
@@ -197,6 +203,16 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 }
 }
 
+static void free_matches(GArray *matches)
+{
+if (!matches) {
+return;
+}
+
+for (int i = 0; i < matches->len; i++) {
+g_free(g_array_index(matches, struct address_match *, i));
+}
+}
 /**
  * On plugin exit, print last instruction in cache
  */
@@ -212,6 +228,9 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
 qemu_plugin_outs("\n");
 }
 }
+
+free_matches(amatches);
+free_matches(dmatches);
 }
 
 /* Add a match to the array of matches */
@@ -223,14 +242,34 @@ static void parse_insn_match(char *match)
 g_ptr_array_add(imatches, match);
 }
 
-static void parse_vaddr_match(GArray **matches, char *match)
+static void parse_vaddr_match(GArray **matches, char *token)
 {
-uint64_t v = g_ascii_strtoull(match, NULL, 16);
+uint64_t low, high;
+gchar *endp;
 
-if (!matches) {
-*matches = g_array_new(false, true, sizeof(uint64_t));
+low = g_ascii_strtoull(token, &endp, 16);
+if (endp == token) {
+fprintf(stderr, "Invalid address(range) specified: %s\n", token);
+return;
+}
+
+if (*endp != '-') {
+high = low;
+} else {
+high = g_ascii_strtoull(endp + 1, &endp, 16);
+if (endp == token) {
+fprintf(stderr, "Invalid address(range) specified: %s\n", token);
+return;
+}
+}
+
+if (!*matches) {
+*matches = g_array_new(false, true, sizeof(struct address_match));
 }
-g_array_append_val(*matches, v);
+struct address_match *match = g_new(struct address_match, 1);
+match->low = low;
+match->high = high;
+g_array_append_val(*matches, match);
 }
 
 /**
-- 
2.43.2




[PATCH 3/4] plugins/execlog: add data address match

2024-02-28 Thread Sven Schnelle
Add a match similar to the afilter address match, but for data
addresses. When an address is specified with '-dfilter=0x12345'
only load/stores to/from address 0x12345 are printed. All other
instructions are hidden.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index b4b5ba113c..33fef9bfc6 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -23,9 +23,11 @@ static GRWLock expand_array_lock;
 
 static GPtrArray *imatches;
 static GArray *amatches;
+static GArray *dmatches;
 
 struct execlog_ctx {
 GString *s;
+bool log;
 };
 
 /*
@@ -45,6 +47,17 @@ static void expand_last_exec(int cpu_index)
 g_rw_lock_writer_unlock(&expand_array_lock);
 }
 
+static bool match_vaddr(struct execlog_ctx *ctx, uint64_t vaddr)
+{
+for (int i = 0; i < dmatches->len; i++) {
+uint64_t v = g_array_index(dmatches, uint64_t, i);
+if (v == vaddr) {
+ctx->log = true;
+return true;
+}
+}
+return false;
+}
 /**
  * Add memory read or write information to current instruction log
  */
@@ -57,6 +70,9 @@ static void vcpu_mem(unsigned int cpu_index, 
qemu_plugin_meminfo_t info,
 struct execlog_ctx *ctx = g_ptr_array_index(last_exec, cpu_index);
 g_rw_lock_reader_unlock(&expand_array_lock);
 
+if (dmatches && !match_vaddr(ctx, vaddr)) {
+return;
+}
 GString *s = ctx->s;
 /* Indicate type of memory access */
 if (qemu_plugin_mem_is_store(info)) {
@@ -93,7 +109,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 GString *s = ctx->s;
 
 /* Print previous instruction in cache */
-if (s->len) {
+if (ctx->log && s->len) {
 qemu_plugin_outs(s->str);
 qemu_plugin_outs("\n");
 }
@@ -102,6 +118,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 /* vcpu_mem will add memory access information to last_exec */
 g_string_printf(s, "%u, ", cpu_index);
 g_string_append(s, (char *)udata);
+ctx->log = dmatches ? false : true;
 }
 
 /**
@@ -190,7 +207,7 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
 for (i = 0; i < last_exec->len; i++) {
 struct execlog_ctx *ctx = g_ptr_array_index(last_exec, i);
 GString *s = ctx->s;
-if (s->str) {
+if (ctx->log && s->str) {
 qemu_plugin_outs(s->str);
 qemu_plugin_outs("\n");
 }
@@ -240,6 +257,8 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t 
id,
 parse_insn_match(tokens[1]);
 } else if (g_strcmp0(tokens[0], "afilter") == 0) {
 parse_vaddr_match(&amatches, tokens[1]);
+} else if (g_strcmp0(tokens[0], "dfilter") == 0) {
+parse_vaddr_match(&dmatches, tokens[1]);
 } else {
 fprintf(stderr, "option parsing failed: %s\n", opt);
 return -1;
-- 
2.43.2




[PATCH 1/4] plugins/execlog: add struct execlog_ctx

2024-02-28 Thread Sven Schnelle
Add a context structure for future enhancements. No functional
change intended.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index 82dc2f584e..90da1911b2 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -24,6 +24,10 @@ static GRWLock expand_array_lock;
 static GPtrArray *imatches;
 static GArray *amatches;
 
+struct execlog_ctx {
+GString *s;
+};
+
 /*
  * Expand last_exec array.
  *
@@ -34,8 +38,9 @@ static void expand_last_exec(int cpu_index)
 {
 g_rw_lock_writer_lock(&expand_array_lock);
 while (cpu_index >= last_exec->len) {
-GString *s = g_string_new(NULL);
-g_ptr_array_add(last_exec, s);
+struct execlog_ctx *ctx = g_new(struct execlog_ctx, 1);
+ctx->s = g_string_new(NULL);
+g_ptr_array_add(last_exec, ctx);
 }
 g_rw_lock_writer_unlock(&expand_array_lock);
 }
@@ -46,14 +51,13 @@ static void expand_last_exec(int cpu_index)
 static void vcpu_mem(unsigned int cpu_index, qemu_plugin_meminfo_t info,
  uint64_t vaddr, void *udata)
 {
-GString *s;
-
 /* Find vCPU in array */
 g_rw_lock_reader_lock(&expand_array_lock);
 g_assert(cpu_index < last_exec->len);
-s = g_ptr_array_index(last_exec, cpu_index);
+struct execlog_ctx *ctx = g_ptr_array_index(last_exec, cpu_index);
 g_rw_lock_reader_unlock(&expand_array_lock);
 
+GString *s = ctx->s;
 /* Indicate type of memory access */
 if (qemu_plugin_mem_is_store(info)) {
 g_string_append(s, ", store");
@@ -77,8 +81,6 @@ static void vcpu_mem(unsigned int cpu_index, 
qemu_plugin_meminfo_t info,
  */
 static void vcpu_insn_exec(unsigned int cpu_index, void *udata)
 {
-GString *s;
-
 /* Find or create vCPU in array */
 g_rw_lock_reader_lock(&expand_array_lock);
 if (cpu_index >= last_exec->len) {
@@ -86,8 +88,9 @@ static void vcpu_insn_exec(unsigned int cpu_index, void 
*udata)
 expand_last_exec(cpu_index);
 g_rw_lock_reader_lock(&expand_array_lock);
 }
-s = g_ptr_array_index(last_exec, cpu_index);
+struct execlog_ctx *ctx = g_ptr_array_index(last_exec, cpu_index);
 g_rw_lock_reader_unlock(&expand_array_lock);
+GString *s = ctx->s;
 
 /* Print previous instruction in cache */
 if (s->len) {
@@ -183,9 +186,10 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 static void plugin_exit(qemu_plugin_id_t id, void *p)
 {
 guint i;
-GString *s;
+
 for (i = 0; i < last_exec->len; i++) {
-s = g_ptr_array_index(last_exec, i);
+struct execlog_ctx *ctx = g_ptr_array_index(last_exec, i);
+GString *s = ctx->s;
 if (s->str) {
 qemu_plugin_outs(s->str);
 qemu_plugin_outs("\n");
-- 
2.43.2




[PATCH 2/4] plugins/execlog: pass matches array to parse_vaddr_match

2024-02-28 Thread Sven Schnelle
Pass the matches array to parse_vaddr_match(), so future address
matches can reuse that function.

Signed-off-by: Sven Schnelle 
---
 contrib/plugins/execlog.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
index 90da1911b2..b4b5ba113c 100644
--- a/contrib/plugins/execlog.c
+++ b/contrib/plugins/execlog.c
@@ -206,14 +206,14 @@ static void parse_insn_match(char *match)
 g_ptr_array_add(imatches, match);
 }
 
-static void parse_vaddr_match(char *match)
+static void parse_vaddr_match(GArray **matches, char *match)
 {
 uint64_t v = g_ascii_strtoull(match, NULL, 16);
 
-if (!amatches) {
-amatches = g_array_new(false, true, sizeof(uint64_t));
+if (!matches) {
+*matches = g_array_new(false, true, sizeof(uint64_t));
 }
-g_array_append_val(amatches, v);
+g_array_append_val(*matches, v);
 }
 
 /**
@@ -239,7 +239,7 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t 
id,
 if (g_strcmp0(tokens[0], "ifilter") == 0) {
 parse_insn_match(tokens[1]);
 } else if (g_strcmp0(tokens[0], "afilter") == 0) {
-parse_vaddr_match(tokens[1]);
+parse_vaddr_match(&amatches, tokens[1]);
 } else {
 fprintf(stderr, "option parsing failed: %s\n", opt);
 return -1;
-- 
2.43.2




[PATCH] hw/net/tulip: add chip status register values

2024-02-05 Thread Sven Schnelle
Netbsd isn't able to detect a link on the emulated tulip card. That's
because netbsd reads the Chip Status Register of the Phy (address
0x14). The default phy data in the qemu tulip driver is all zero,
which means no link is established and autonegotation isn't complete.

Therefore set the register to 0x3b40, which means:

Link is up, Autonegotation complete, Full Duplex, 100MBit/s Link
speed.

Also clear the mask because this register is read only.

Signed-off-by: Sven Schnelle 
---
 hw/net/tulip.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/tulip.c b/hw/net/tulip.c
index 6d4fb06dad..1f2ef20977 100644
--- a/hw/net/tulip.c
+++ b/hw/net/tulip.c
@@ -421,7 +421,7 @@ static uint16_t tulip_mdi_default[] = {
 /* MDI Registers 8 - 15 */
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 /* MDI Registers 16 - 31 */
-0x0003, 0x, 0x0001, 0x, 0x, 0x, 0x, 0x,
+0x0003, 0x, 0x0001, 0x, 0x3b40, 0x, 0x, 0x,
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 };
 
@@ -429,7 +429,7 @@ static uint16_t tulip_mdi_default[] = {
 static const uint16_t tulip_mdi_mask[] = {
 0x, 0x, 0x, 0x, 0xc01f, 0x, 0x, 0x,
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
-0x0fff, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
+0x0fff, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x,
 };
 
-- 
2.43.0




Re: [PATCH] hw/scsi/lsi53c895a: add missing decrement of reentrancy counter

2024-01-28 Thread Sven Schnelle
Thomas Huth  writes:

> On 28/01/2024 21.22, Sven Schnelle wrote:
>> When the maximum count of SCRIPTS instructions is reached, the code
>> stops execution and returns, but fails to decrement the reentrancy
>> counter. This effectively renders the SCSI controller unusable
>> because on next entry the reentrancy counter is still above the limit.
>> This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
>> loops.
>
> Out of curiosity: What happened there before we introduced the
> reentrancy_level fix? Did it end up in an endless loop, or was it
> finishing at one point? In the latter case, we might need to adjust
> the "reentrancy_level > 8" to allow deeper nesting.

Without the reentrancy counter it was triggering the insn_processed
limit. The HP-UX scsi driver seems to spin on some memory value during
some SCSI writes (CDB with command 0x2a). So it is spinning in an
endless loop until the insn_processed counter will trigger the exit.
In HP-UX you will see a SCSI command timeout error in the kernel log
- at least i'm assuming that's related, but can't say for sure as
there's no kernel source available.



[PATCH] hw/scsi/lsi53c895a: add missing decrement of reentrancy counter

2024-01-28 Thread Sven Schnelle
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.

This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.

Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI 
controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle 
---
 hw/scsi/lsi53c895a.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 34e3b89287..d607a5f9fb 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1159,6 +1159,7 @@ again:
 lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
 lsi_disconnect(s);
 trace_lsi_execute_script_stop();
+reentrancy_level--;
 return;
 }
 insn = read_dword(s, s->dsp);
-- 
2.43.0




Re: [PATCH v2 0/4] target/s390x: Exit tb after executing ex_value

2022-07-02 Thread Sven Schnelle
Hi Richard,

Richard Henderson  writes:

> Ok, version 1 didn't work, so once more with feeling.
> Can you give it a try, Sven?

This seems to work. Thanks a lot!

Regards
Sven



Re: [PATCH] target/s390x: Exit tb after executing ex_value

2022-06-30 Thread Sven Schnelle
Hi Richard,

Richard Henderson  writes:

> When EXECUTE sets ex_value to interrupt the constructed instruction,
> we implicitly disable interrupts so that the value is not corrupted.
> Exit to the main loop after execution, so that we re-evaluate any
> pending interrupts.
>
> Reported-by: Sven Schnelle 
> Signed-off-by: Richard Henderson 
> ---
>
> Hi Sven.  Will you test this vs your testcase?  Thanks,

Of course, i'm happy if someone fixes this so i don't have to. :-)

Unfortunately it doesn't fix the issue:

exec_tb_exit tb:(nil) flags=0x0
exec_tb tb:0x3ff35c66f00 pc=0x400
exec_tb tb:0x3ff3410f300 pc=0x1edf7f8
tcg_handle_interrupt: 2
exec_tb_exit tb:0x3ff340d2d00 flags=0x3
ignoring irq during EX
ignoring irq during EX
exec_tb tb:0x3ff340d2d00 pc=0x1edf810

writing dc->base.is_jmp to the qemu log shows:

s390x_tr_translate_insn: is_jmp: 3
s390x_tr_translate_insn: is_jmp: 3
s390x_tr_translate_insn: is_jmp: 3
s390x_tr_translate_insn: is_jmp: 3
s390x_tr_translate_insn: is_jmp: 3
s390x_tr_translate_insn: is_jmp: 3
[..]

So is_jump is always 3, which is DISAS_TARGET_0. I think the
if (dc->base.is_jmp == DISAS_NEXT) condition therefore never matches.




Re: qemu-system-s390x hang in tcg

2022-06-29 Thread Sven Schnelle
Alex Bennée  writes:

> Sven Schnelle  writes:
>
>> Hi,
>>
>> David Hildenbrand  writes:
>>
>>> On 04.05.22 09:37, Janosch Frank wrote:
>>>> I had a short look yesterday and the boot usually hangs in the raid6 
>>>> code. Disabling vector instructions didn't make a difference but a few 
>>>> interruptions via GDB solve the problem for some reason.
>>>> 
>>>> CCing David and Thomas for TCG
>>>> 
>>>
>>> I somehow recall that KASAN was always disabled under TCG, I might be
>>> wrong (I thought we'd get a message early during boot that the HW
>>> doesn't support KASAN).
>>>
>>> I recall that raid code is a heavy user of vector instructions.
>>>
>>> How can I reproduce? Compile upstream (or -next?) with kasan support and
>>> run it under TCG?
>>
>> I spent some time looking into this. It's usually hanging in
>> s390vx8_gen_syndrome(). My first thought was that it is a problem with
>> the VX instructions, but turned out that it hangs even if i remove all
>> the code from s390vx8_gen_syndrome().
>>
>> Tracing the execution of TB's, i see that the generated code is always
>> jumping between a few TB's, but never exiting the TB's to check for
>> interrupts (i.e. return to cpu_tb_exec(). I only see calls to
>> helper_lookup_tb_ptr to lookup the tb pointer for the next TB.
>>
>> The raid6 code is waiting for some time to expire by reading jiffies,
>> but interrupts are never processed and therefore jiffies doesn't change.
>> So the raid6 code hangs forever.
>>
>> As a test, i made a quick change to test:
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index c997c2e8e0..35819fd5a7 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -319,7 +319,8 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>>  cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>
>>  cflags = curr_cflags(cpu);
>> -if (check_for_breakpoints(cpu, pc, &cflags)) {
>> +if (check_for_breakpoints(cpu, pc, &cflags) ||
>> +unlikely(qatomic_read(&cpu->interrupt_request))) {
>>  cpu_loop_exit(cpu);
>>  }
>>
>> And that makes the problem go away. But i'm not familiar with the TCG
>> internals, so i can't say whether the generated code is incorrect or
>> something else is wrong. I have tcg log files of a failing + working run
>> if someone wants to take a look. They are rather large so i would have to
>> upload them somewhere.
>
> Whatever is setting cpu->interrupt_request should be calling
> cpu_exit(cpu) which sets the exit flag which is checked at the start of
> every TB execution (see gen_tb_start).

Thanks, that was very helpful. I added debugging and it turned out
that the TB is left because of a pending irq. The code then calls
s390_cpu_exec_interrupt:

bool s390_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
{
if (interrupt_request & CPU_INTERRUPT_HARD) {
S390CPU *cpu = S390_CPU(cs);
CPUS390XState *env = &cpu->env;

if (env->ex_value) {
/* Execution of the target insn is indivisible from
   the parent EXECUTE insn.  */
return false;
}
if (s390_cpu_has_int(cpu)) {
s390_cpu_do_interrupt(cs);
return true;
}
if (env->psw.mask & PSW_MASK_WAIT) {
/* Woken up because of a floating interrupt but it has already
 * been delivered. Go back to sleep. */
cpu_interrupt(CPU(cpu), CPU_INTERRUPT_HALT);
}
}
return false;
}

Note the 'if (env->ex_value) { }' check. It looks like this function
just returns false in case tcg is executing an EX instruction. After
that the information that the TB should be exited because of an
interrupt is gone. So the TB's are never exited again, although the
interrupt wasn't handled. At least that's my assumption now, if i'm
wrong please tell me.

So the raid6 code is spinning waiting that the jiffies value reaches a
timeout, but as the timer interrupt was lost it will never change.

So i wonder now how this could be fixed.



qemu-system-s390x hang in tcg (was: Re: [PATCH v8 23/70] mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap())

2022-06-29 Thread Sven Schnelle
Hi,

David Hildenbrand  writes:

> On 04.05.22 09:37, Janosch Frank wrote:
>> I had a short look yesterday and the boot usually hangs in the raid6 
>> code. Disabling vector instructions didn't make a difference but a few 
>> interruptions via GDB solve the problem for some reason.
>> 
>> CCing David and Thomas for TCG
>> 
>
> I somehow recall that KASAN was always disabled under TCG, I might be
> wrong (I thought we'd get a message early during boot that the HW
> doesn't support KASAN).
>
> I recall that raid code is a heavy user of vector instructions.
>
> How can I reproduce? Compile upstream (or -next?) with kasan support and
> run it under TCG?

I spent some time looking into this. It's usually hanging in
s390vx8_gen_syndrome(). My first thought was that it is a problem with
the VX instructions, but turned out that it hangs even if i remove all
the code from s390vx8_gen_syndrome().

Tracing the execution of TB's, i see that the generated code is always
jumping between a few TB's, but never exiting the TB's to check for
interrupts (i.e. return to cpu_tb_exec(). I only see calls to
helper_lookup_tb_ptr to lookup the tb pointer for the next TB.

The raid6 code is waiting for some time to expire by reading jiffies,
but interrupts are never processed and therefore jiffies doesn't change.
So the raid6 code hangs forever.

As a test, i made a quick change to test:

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index c997c2e8e0..35819fd5a7 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -319,7 +319,8 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
 cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);

 cflags = curr_cflags(cpu);
-if (check_for_breakpoints(cpu, pc, &cflags)) {
+if (check_for_breakpoints(cpu, pc, &cflags) ||
+unlikely(qatomic_read(&cpu->interrupt_request))) {
 cpu_loop_exit(cpu);
 }

And that makes the problem go away. But i'm not familiar with the TCG
internals, so i can't say whether the generated code is incorrect or
something else is wrong. I have tcg log files of a failing + working run
if someone wants to take a look. They are rather large so i would have to
upload them somewhere.



Re: qemu-system-s390x hang in tcg

2022-06-29 Thread Sven Schnelle
Sven Schnelle  writes:

> Alex Bennée  writes:
>
>> Sven Schnelle  writes:
>>
>>> Hi,
>>>
>>> David Hildenbrand  writes:
>>>
>>>> On 04.05.22 09:37, Janosch Frank wrote:
>>>>> I had a short look yesterday and the boot usually hangs in the raid6 
>>>>> code. Disabling vector instructions didn't make a difference but a few 
>>>>> interruptions via GDB solve the problem for some reason.
>>>>> 
>>>>> CCing David and Thomas for TCG
>>>>> 
>>>>
>>>> I somehow recall that KASAN was always disabled under TCG, I might be
>>>> wrong (I thought we'd get a message early during boot that the HW
>>>> doesn't support KASAN).
>>>>
>>>> I recall that raid code is a heavy user of vector instructions.
>>>>
>>>> How can I reproduce? Compile upstream (or -next?) with kasan support and
>>>> run it under TCG?
>>>
>>> I spent some time looking into this. It's usually hanging in
>>> s390vx8_gen_syndrome(). My first thought was that it is a problem with
>>> the VX instructions, but turned out that it hangs even if i remove all
>>> the code from s390vx8_gen_syndrome().
>>>
>>> Tracing the execution of TB's, i see that the generated code is always
>>> jumping between a few TB's, but never exiting the TB's to check for
>>> interrupts (i.e. return to cpu_tb_exec(). I only see calls to
>>> helper_lookup_tb_ptr to lookup the tb pointer for the next TB.
>>>
>>> The raid6 code is waiting for some time to expire by reading jiffies,
>>> but interrupts are never processed and therefore jiffies doesn't change.
>>> So the raid6 code hangs forever.
>>>
>>> As a test, i made a quick change to test:
>>>
>>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>>> index c997c2e8e0..35819fd5a7 100644
>>> --- a/accel/tcg/cpu-exec.c
>>> +++ b/accel/tcg/cpu-exec.c
>>> @@ -319,7 +319,8 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>>>  cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>>
>>>  cflags = curr_cflags(cpu);
>>> -if (check_for_breakpoints(cpu, pc, &cflags)) {
>>> +if (check_for_breakpoints(cpu, pc, &cflags) ||
>>> +unlikely(qatomic_read(&cpu->interrupt_request))) {
>>>  cpu_loop_exit(cpu);
>>>  }
>>>
>>> And that makes the problem go away. But i'm not familiar with the TCG
>>> internals, so i can't say whether the generated code is incorrect or
>>> something else is wrong. I have tcg log files of a failing + working run
>>> if someone wants to take a look. They are rather large so i would have to
>>> upload them somewhere.
>>
>> Whatever is setting cpu->interrupt_request should be calling
>> cpu_exit(cpu) which sets the exit flag which is checked at the start of
>> every TB execution (see gen_tb_start).
>
> Thanks, that was very helpful. I added debugging and it turned out
> that the TB is left because of a pending irq. The code then calls
> s390_cpu_exec_interrupt:
>
> bool s390_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
> {
> if (interrupt_request & CPU_INTERRUPT_HARD) {
> S390CPU *cpu = S390_CPU(cs);
> CPUS390XState *env = &cpu->env;
>
> if (env->ex_value) {
> /* Execution of the target insn is indivisible from
>the parent EXECUTE insn.  */
> return false;
> }
> if (s390_cpu_has_int(cpu)) {
> s390_cpu_do_interrupt(cs);
> return true;
> }
> if (env->psw.mask & PSW_MASK_WAIT) {
> /* Woken up because of a floating interrupt but it has already
>  * been delivered. Go back to sleep. */
> cpu_interrupt(CPU(cpu), CPU_INTERRUPT_HALT);
> }
> }
> return false;
> }
>
> Note the 'if (env->ex_value) { }' check. It looks like this function
> just returns false in case tcg is executing an EX instruction. After
> that the information that the TB should be exited because of an
> interrupt is gone. So the TB's are never exited again, although the
> interrupt wasn't handled. At least that's my assumption now, if i'm
> wrong please tell me.

Looking at the code i see CF_NOIRQ to prevent TB's from getting
interrupted. But i only see that used in the core tcg code. Would
that be a possibility, or is there something else/better?

Sorry for the dumb questions, i'm not often working on qemu ;-)



  1   2   3   >