[Qemu-commits] [qemu/qemu] fe43cc: tests/libqos: Add loongarch virt machine node

2024-06-06 Thread Richard Henderson via Qemu-commits
by default.

Signed-off-by: Song Gao 
Reviewed-by: Bibo Mao 
Message-Id: <20240528083855.1912757-4-gaos...@loongson.cn>


  Commit: 78f932ea1f7b3b9b0ac628dc2a91281318fe51fa
  
https://github.com/qemu/qemu/commit/78f932ea1f7b3b9b0ac628dc2a91281318fe51fa
  Author: lanyanzhi 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M target/loongarch/cpu.c

  Log Message:
  ---
  target/loongarch: fix a wrong print in cpu dump

description:
loongarch_cpu_dump_state() want to dump all loongarch cpu
state registers, but there is a tiny typographical error when
printing "PRCFG2".

Cc: qemu-sta...@nongnu.org
Signed-off-by: lanyanzhi 
Reviewed-by: Richard Henderson 
Reviewed-by: Song Gao 
Message-Id: <20240604073831.90-1-lanyanzhi...@ict.ac.cn>
Signed-off-by: Song Gao 


  Commit: dec9742cbc59415a8b83e382e7ae36395394e4bd
  
https://github.com/qemu/qemu/commit/dec9742cbc59415a8b83e382e7ae36395394e4bd
  Author: Richard Henderson 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M hw/intc/loongarch_extioi.c
M hw/loongarch/virt.c
M include/hw/intc/loongarch_extioi.h
M include/hw/loongarch/virt.h
M target/loongarch/cpu.c
M target/loongarch/cpu.h
A tests/qtest/libqos/loongarch-virt-machine.c
M tests/qtest/libqos/meson.build
M tests/qtest/meson.build
M tests/qtest/numa-test.c

  Log Message:
  ---
  Merge tag 'pull-loongarch-20240606' of https://gitlab.com/gaosong/qemu into 
staging

pull-loongarch-20240606

# -BEGIN PGP SIGNATURE-
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZmE0HwAKCRBAov/yOSY+
# 396sA/90m/zr91pLQlkhFuYLHg958Ow3L5ysblcuAAmcTXGi8iE9IeTTeZru6WEO
# H/CL/njUkIgP+/Tio0n0Lx6rWkxOzGxWCpvzqrabsPGvs4GUtFEjI/2pvEWP6C9/
# S6Jon3py0oZeoVx8D6Tr/CJrhD0IBptbEn1aiQNDRuSzeuCo1Q==
# =xpjH
# -END PGP SIGNATURE-
# gpg: Signature made Wed 05 Jun 2024 08:59:27 PM PDT
# gpg:using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao " [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:  There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C  6C2C 40A2 FFF2 3926 3EDF

* tag 'pull-loongarch-20240606' of https://gitlab.com/gaosong/qemu:
  target/loongarch: fix a wrong print in cpu dump
  hw/loongarch/virt: Enable extioi virt extension
  hw/loongarch/virt: Use MemTxAttrs interface for misc ops
  hw/intc/loongarch_extioi: Add extioi virt extension definition
  tests/qtest: Add numa test for loongarch system
  tests/libqos: Add loongarch virt machine node

Signed-off-by: Richard Henderson 


Compare: https://github.com/qemu/qemu/compare/064f26ee396a...dec9742cbc59

To unsubscribe from these emails, change your notification settings at 
https://github.com/qemu/qemu/settings/notifications



Re: [PULL 00/12] testing cleanups (ci, vm, lcitool, ansible)

2024-06-06 Thread Richard Henderson

On 6/6/24 04:50, Alex Bennée wrote:

The following changes since commit db2feb2df8d19592c9859efb3f682404e0052957:

   Merge tag 'pull-misc-20240605' ofhttps://gitlab.com/rth7680/qemu  into 
staging (2024-06-05 14:17:01 -0700)

are available in the Git repository at:

   https://gitlab.com/stsquad/qemu.git  tags/pull-maintainer-june24-060624-1

for you to fetch changes up to c99064d03fc574254ab098562798c937a4761161:

   scripts/ci: drive ubuntu/build-environment.yml from lcitool (2024-06-06 
10:26:22 +0100)


testing cleanups (ci, vm, lcitool, ansible):

   - clean up left over Centos 8 references
   - use -fno-sanitize=function to avoid non-useful errors
   - bump lcitool and update images (alpine, fedora)
   - make sure we have mingw-w64-tools for windows builds
   - drive ansible scripts with lcitool package lists


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/9.1 as 
appropriate.


r~




[Qemu-commits] [qemu/qemu] 421a22: ci: remove centos-steam-8 customer runner

2024-06-06 Thread Richard Henderson via Qemu-commits
  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 421a22ef8ec19bc6c1859a905142faeeaa7c35b2
  
https://github.com/qemu/qemu/commit/421a22ef8ec19bc6c1859a905142faeeaa7c35b2
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M .gitlab-ci.d/custom-runners.yml
R .gitlab-ci.d/custom-runners/centos-stream-8-x86_64.yml
M docs/devel/ci-jobs.rst.inc
R scripts/ci/org.centos/stream/8/build-environment.yml
R scripts/ci/org.centos/stream/8/x86_64/configure
R scripts/ci/org.centos/stream/8/x86_64/test-avocado
R scripts/ci/org.centos/stream/README

  Log Message:
  ---
  ci: remove centos-steam-8 customer runner

This broke since eef0bae3a7 (migration: Remove block migration) but
even after that was addressed it still fails to complete. As it will
shortly be EOL lets to remove the runner definition and the related
ansible setup bits.

We still have centos9 docker images build and test.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-2-alex.ben...@linaro.org>


  Commit: cc1d2e04d516da0e1c2e4e99aedf86c5688bd845
  
https://github.com/qemu/qemu/commit/cc1d2e04d516da0e1c2e4e99aedf86c5688bd845
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M docs/devel/testing.rst

  Log Message:
  ---
  docs/devel: update references to centos to non-versioned container

>From the website:

"After May 31, 2024, CentOS Stream 8 will be archived and no further
updates will be provided."

We have updated a few bits but there are still references that need
fixing. Rather than bump I've replaced them with references to the
Debian image so we don't have to bump at the next update.

Reviewed-by: Richard Henderson 
Reviewed-by: Thomas Huth 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-3-alex.ben...@linaro.org>


  Commit: 5ed4e5a15ccf13c633c8b664097bc6f2d61d1109
  
https://github.com/qemu/qemu/commit/5ed4e5a15ccf13c633c8b664097bc6f2d61d1109
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M tests/vm/centos.aarch64

  Log Message:
  ---
  tests/vm: update centos.aarch64 image to 9

As Centos Stream 8 goes out of support we need to update. To do this
powertools is replaced by crb and we don't over specify the python3 we
want.

Reviewed-by: Richard Henderson 
Reviewed-by: Thomas Huth 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-4-alex.ben...@linaro.org>


  Commit: 0f73539676719605e618a0d23326fdc85230963f
  
https://github.com/qemu/qemu/commit/0f73539676719605e618a0d23326fdc85230963f
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M tests/vm/Makefile.include
R tests/vm/centos

  Log Message:
  ---
  tests/vm: remove plain centos image

This isn't really used and we have lighter weight docker containers
for testing this stuff directly.

Reviewed-by: Thomas Huth 
Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-5-alex.ben...@linaro.org>


  Commit: 053d5042ad8df5c4670bee61e175f0dc8046ee6d
  
https://github.com/qemu/qemu/commit/053d5042ad8df5c4670bee61e175f0dc8046ee6d
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M scripts/ci/setup/build-environment.yml

  Log Message:
  ---
  scripts/ci: remove CentOS bits from common build-environment

Although I've just removed the CentOS specific build-environment its
probably a bit too confusing to have multiple distros mixed up in one
place. Prior to moving clean-up what will be just for ubuntu.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-6-alex.ben...@linaro.org>


  Commit: 0eb7fadcfdaca701105480f2215bd3e38e40b3da
  
https://github.com/qemu/qemu/commit/0eb7fadcfdaca701105480f2215bd3e38e40b3da
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
M .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
M .gitlab-ci.d/custom-runners/ubuntu-22.04-s390x.yml
M docs/devel/ci-runners.rst.inc
R scripts/ci/setup/build-environment.yml
A scripts/ci/setup/ubuntu/build-environment.yml

  Log Message:
  ---
  docs/ci: clean-up references for consistency

Document we have split up build-environment by distro and update the
references that exist in the code base to be correct.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-7-alex.ben...@linaro.org>


  Commit: 8e3034914a51444a4e5db9b82a8cc711cc1f76ed
  
https://github.com/qemu/qemu/commit/8e3034914a51444a4e5db9b82a8cc711cc1f76ed
  Author: Thomas Huth 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
R tests/lcitool/targets/centos-stream-8.yml

  Log Message:
  ---
  t

Re: [PATCH] target/sparc: use signed denominator in sdiv helper

2024-06-06 Thread Richard Henderson

On 6/6/24 07:43, Clément Chigot wrote:

The result has to be done with the signed denominator (b32) instead of
the unsigned value passed in argument (b).

Fixes: 1326010322d6 ("target/sparc: Remove CC_OP_DIV")
Signed-off-by: Clément Chigot 
---
  target/sparc/helper.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/sparc/helper.c b/target/sparc/helper.c
index 2247e243b5..7846ddd6f6 100644
--- a/target/sparc/helper.c
+++ b/target/sparc/helper.c
@@ -121,7 +121,7 @@ uint64_t helper_sdiv(CPUSPARCState *env, target_ulong a, 
target_ulong b)
  return (uint32_t)(b32 < 0 ? INT32_MAX : INT32_MIN) | (-1ull << 32);
  }
  
-a64 /= b;

+a64 /= b32;
  r = a64;
  if (unlikely(r != a64)) {
  return (uint32_t)(a64 < 0 ? INT32_MIN : INT32_MAX) | (-1ull << 32);


Oops.

Reviewed-by: Richard Henderson 

r~



Re: Unbinding Alt-1 Does Not Work

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 11:21, Jean-Marc Lasgouttes wrote:

Le 06/06/2024 à 16:50, Richard Kimberly Heck a écrit :
I assume there is also ~C? Does that ever get used? It's not in any 
of our files, only ~S. Allowing for the possibility of something like 
C-~S-~A-8 is not going to be trivial, so I'm wondering if we could 
change that.


I guess that everything exists in general, but only ~S matters. Isn't 
it just a matter of defining an equality function that takes this into 
account?


Where is the relevant code?


It's hard to say. Some of it is definitely in KeyMap.* and 
KeySequence.*, but the unbind shortcut handling is more in GuiPrefs.*. 
It's possible that it is enough to change KeySequence::print, which 
seems to ignore ~S and the like, but I haven't experimented with it.


Riki


--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


Re: [LyX/master] Fixup de5f63eeb: the code did not do what it was supposed to

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 02:55, Jean-Marc Lasgouttes wrote:

Le 06/06/2024 à 08:48, Jean-Marc Lasgouttes a écrit :

commit 92ef555abde86466b7ca3c3401ab8132258fc497
Author: Jean-Marc Lasgouttes 
Date:   Wed Jun 5 23:05:22 2024 +0200

 Fixup de5f63eeb: the code did not do what it was supposed to


Riki,

I think this should go to 2.4.x too ;) It took me time to see the issue.


I was not aware myself that it was so sensitive to ordering.

OK for 2.4.x.

Riki


--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


[Qemu-commits] [qemu/qemu] 421a22: ci: remove centos-steam-8 customer runner

2024-06-06 Thread Richard Henderson via Qemu-commits
  Branch: refs/heads/staging
  Home:   https://github.com/qemu/qemu
  Commit: 421a22ef8ec19bc6c1859a905142faeeaa7c35b2
  
https://github.com/qemu/qemu/commit/421a22ef8ec19bc6c1859a905142faeeaa7c35b2
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M .gitlab-ci.d/custom-runners.yml
R .gitlab-ci.d/custom-runners/centos-stream-8-x86_64.yml
M docs/devel/ci-jobs.rst.inc
R scripts/ci/org.centos/stream/8/build-environment.yml
R scripts/ci/org.centos/stream/8/x86_64/configure
R scripts/ci/org.centos/stream/8/x86_64/test-avocado
R scripts/ci/org.centos/stream/README

  Log Message:
  ---
  ci: remove centos-steam-8 customer runner

This broke since eef0bae3a7 (migration: Remove block migration) but
even after that was addressed it still fails to complete. As it will
shortly be EOL lets to remove the runner definition and the related
ansible setup bits.

We still have centos9 docker images build and test.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-2-alex.ben...@linaro.org>


  Commit: cc1d2e04d516da0e1c2e4e99aedf86c5688bd845
  
https://github.com/qemu/qemu/commit/cc1d2e04d516da0e1c2e4e99aedf86c5688bd845
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M docs/devel/testing.rst

  Log Message:
  ---
  docs/devel: update references to centos to non-versioned container

>From the website:

"After May 31, 2024, CentOS Stream 8 will be archived and no further
updates will be provided."

We have updated a few bits but there are still references that need
fixing. Rather than bump I've replaced them with references to the
Debian image so we don't have to bump at the next update.

Reviewed-by: Richard Henderson 
Reviewed-by: Thomas Huth 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-3-alex.ben...@linaro.org>


  Commit: 5ed4e5a15ccf13c633c8b664097bc6f2d61d1109
  
https://github.com/qemu/qemu/commit/5ed4e5a15ccf13c633c8b664097bc6f2d61d1109
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M tests/vm/centos.aarch64

  Log Message:
  ---
  tests/vm: update centos.aarch64 image to 9

As Centos Stream 8 goes out of support we need to update. To do this
powertools is replaced by crb and we don't over specify the python3 we
want.

Reviewed-by: Richard Henderson 
Reviewed-by: Thomas Huth 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-4-alex.ben...@linaro.org>


  Commit: 0f73539676719605e618a0d23326fdc85230963f
  
https://github.com/qemu/qemu/commit/0f73539676719605e618a0d23326fdc85230963f
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M tests/vm/Makefile.include
R tests/vm/centos

  Log Message:
  ---
  tests/vm: remove plain centos image

This isn't really used and we have lighter weight docker containers
for testing this stuff directly.

Reviewed-by: Thomas Huth 
Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-5-alex.ben...@linaro.org>


  Commit: 053d5042ad8df5c4670bee61e175f0dc8046ee6d
  
https://github.com/qemu/qemu/commit/053d5042ad8df5c4670bee61e175f0dc8046ee6d
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M scripts/ci/setup/build-environment.yml

  Log Message:
  ---
  scripts/ci: remove CentOS bits from common build-environment

Although I've just removed the CentOS specific build-environment its
probably a bit too confusing to have multiple distros mixed up in one
place. Prior to moving clean-up what will be just for ubuntu.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-6-alex.ben...@linaro.org>


  Commit: 0eb7fadcfdaca701105480f2215bd3e38e40b3da
  
https://github.com/qemu/qemu/commit/0eb7fadcfdaca701105480f2215bd3e38e40b3da
  Author: Alex Bennée 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
M .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
M .gitlab-ci.d/custom-runners/ubuntu-22.04-s390x.yml
M docs/devel/ci-runners.rst.inc
R scripts/ci/setup/build-environment.yml
A scripts/ci/setup/ubuntu/build-environment.yml

  Log Message:
  ---
  docs/ci: clean-up references for consistency

Document we have split up build-environment by distro and update the
references that exist in the code base to be correct.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Message-Id: <20240603175328.3823123-7-alex.ben...@linaro.org>


  Commit: 8e3034914a51444a4e5db9b82a8cc711cc1f76ed
  
https://github.com/qemu/qemu/commit/8e3034914a51444a4e5db9b82a8cc711cc1f76ed
  Author: Thomas Huth 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
R tests/lcitool/targets/centos-stream-8.yml

  Log Message:
  ---

Re: FW: LyX 2.4.0 -- annoying bug (?) in Windows with multiple documents open

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 11:04, Dr Paul Verschueren wrote:


On 6/6/24 10:27, Bernt Lie via lyx-users wrote:

LyX 2.4.0 on Windows 11, latest version.

When I have more than one document open in LyX and click on the
document tab

[any one of them], the following warning pops up:

This message shows up irrespective of whether I have made any
changes, or not.

Has anyone else seen this?

Are you using a Dropbox folder, or something of the sort?

Riki

I’m getting the same using OneDrive (also on Win11).

It's possible, then, that the timestamp is being changed when it's 
sync'ed to the cloud.


Riki

-- 
lyx-users mailing list
lyx-users@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-users


[gcc r14-10285] aarch64: Add missing ACLE macro for NEON-SVE Bridge

2024-06-06 Thread Richard Ball via Gcc-cvs
https://gcc.gnu.org/g:35ed54f136fe63bd04d48ada6efb305457bbd824

commit r14-10285-g35ed54f136fe63bd04d48ada6efb305457bbd824
Author: Richard Ball 
Date:   Thu Jun 6 16:28:00 2024 +0100

aarch64: Add missing ACLE macro for NEON-SVE Bridge

__ARM_NEON_SVE_BRIDGE was missed in the original patch and is
added by this patch.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
Add missing __ARM_NEON_SVE_BRIDGE.

(cherry picked from commit 43530bc40b1d0465911e493e56a6631202ce85b1)

Diff:
---
 gcc/config/aarch64/aarch64-c.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index fe1a20e4e54..d042e5fbd8c 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -75,6 +75,7 @@ aarch64_define_unconditional_macros (cpp_reader *pfile)
 
   builtin_define ("__ARM_STATE_ZA");
   builtin_define ("__ARM_STATE_ZT0");
+  builtin_define ("__ARM_NEON_SVE_BRIDGE");
 
   /* Define keyword attributes like __arm_streaming as macros that expand
  to the associated [[...]] attribute.  Use __extension__ in the attribute


[gcc r15-1075] aarch64: Add missing ACLE macro for NEON-SVE Bridge

2024-06-06 Thread Richard Ball via Gcc-cvs
https://gcc.gnu.org/g:43530bc40b1d0465911e493e56a6631202ce85b1

commit r15-1075-g43530bc40b1d0465911e493e56a6631202ce85b1
Author: Richard Ball 
Date:   Thu Jun 6 16:28:00 2024 +0100

aarch64: Add missing ACLE macro for NEON-SVE Bridge

__ARM_NEON_SVE_BRIDGE was missed in the original patch and is
added by this patch.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
Add missing __ARM_NEON_SVE_BRIDGE.

Diff:
---
 gcc/config/aarch64/aarch64-c.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index fe1a20e4e54..d042e5fbd8c 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -75,6 +75,7 @@ aarch64_define_unconditional_macros (cpp_reader *pfile)
 
   builtin_define ("__ARM_STATE_ZA");
   builtin_define ("__ARM_STATE_ZT0");
+  builtin_define ("__ARM_NEON_SVE_BRIDGE");
 
   /* Define keyword attributes like __arm_streaming as macros that expand
  to the associated [[...]] attribute.  Use __extension__ in the attribute


Re: [PATCH v2 5/9] target/i386: Split out gdb-internal.h

2024-06-06 Thread Richard Henderson

On 6/5/24 23:51, Philippe Mathieu-Daudé wrote:

Shouldn't we remove the definitions from the source to
complete the "split"?


Gah, I thought I had done that.


r~



Re: [PATCH V2] aarch64: Add missing ACLE macro for NEON-SVE Bridge

2024-06-06 Thread Richard Sandiford
Richard Ball  writes:
> v2: Change macro definition following internal discussion.
>
> __ARM_NEON_SVE_BRIDGE was missed in the original patch and is
> added by this patch.
>
> Ok for trunk and a backport into gcc-14?

Yes, thanks.

Richard

> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
>   Add missing __ARM_NEON_SVE_BRIDGE.
>
> On 6/6/24 13:20, Richard Sandiford wrote:
>> Richard Ball  writes:
>>> __ARM_NEON_SVE_BRIDGE was missed in the original patch and is
>>> added by this patch.
>>>
>>> Ok for trunk and a backport into gcc-14?
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
>>> Add missing __ARM_NEON_SVE_BRIDGE.
>> 
>> After this patch was posted, there was some internal discussion
>> involving LLVM & GNU devs about what this kind of macro means, now that
>> we have FMV.  The feeling was that __ARM_NEON_SVE_BRIDGE should just
>> indicate whether the compiler provides the file, not whether AdvSIMD
>> & SVE are enabled.  I think we should therefore add this to
>> aarch64_define_unconditional_macros instead.
>> 
>> Sorry for the slow review.  I was waiting for the outcome of that
>> discussion before replying.
>> 
>> Thanks,
>> Richard
>> 
>>> diff --git a/gcc/config/aarch64/aarch64-c.cc 
>>> b/gcc/config/aarch64/aarch64-c.cc
>>> index 
>>> fe1a20e4e546a68e5f7eddff3bbb0d3e831fbd9b..1121be118cf8d05e3736ad4ee75568ff7cb92bfd
>>>  100644
>>> --- a/gcc/config/aarch64/aarch64-c.cc
>>> +++ b/gcc/config/aarch64/aarch64-c.cc
>>> @@ -260,6 +260,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
>>>aarch64_def_or_undef (TARGET_SME_I16I64, "__ARM_FEATURE_SME_I16I64", 
>>> pfile);
>>>aarch64_def_or_undef (TARGET_SME_F64F64, "__ARM_FEATURE_SME_F64F64", 
>>> pfile);
>>>aarch64_def_or_undef (TARGET_SME2, "__ARM_FEATURE_SME2", pfile);
>>> +  aarch64_def_or_undef (TARGET_SVE, "__ARM_NEON_SVE_BRIDGE", pfile);
>>>  
>>>/* Not for ACLE, but required to keep "float.h" correct if we switch
>>>   target between implementations that do or do not support ARMv8.2-A
>
> diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
> index 
> fe1a20e4e546a68e5f7eddff3bbb0d3e831fbd9b..d042e5fbd8c562df2e4538b51b960c194d2ca2c9
>  100644
> --- a/gcc/config/aarch64/aarch64-c.cc
> +++ b/gcc/config/aarch64/aarch64-c.cc
> @@ -75,6 +75,7 @@ aarch64_define_unconditional_macros (cpp_reader *pfile)
>  
>builtin_define ("__ARM_STATE_ZA");
>builtin_define ("__ARM_STATE_ZT0");
> +  builtin_define ("__ARM_NEON_SVE_BRIDGE");
>  
>/* Define keyword attributes like __arm_streaming as macros that expand
>   to the associated [[...]] attribute.  Use __extension__ in the attribute


Re: [PATCH V2] aarch64: Add missing ACLE macro for NEON-SVE Bridge

2024-06-06 Thread Richard Ball
v2: Change macro definition following internal discussion.

__ARM_NEON_SVE_BRIDGE was missed in the original patch and is
added by this patch.

Ok for trunk and a backport into gcc-14?

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
Add missing __ARM_NEON_SVE_BRIDGE.

On 6/6/24 13:20, Richard Sandiford wrote:
> Richard Ball  writes:
>> __ARM_NEON_SVE_BRIDGE was missed in the original patch and is
>> added by this patch.
>>
>> Ok for trunk and a backport into gcc-14?
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
>>  Add missing __ARM_NEON_SVE_BRIDGE.
> 
> After this patch was posted, there was some internal discussion
> involving LLVM & GNU devs about what this kind of macro means, now that
> we have FMV.  The feeling was that __ARM_NEON_SVE_BRIDGE should just
> indicate whether the compiler provides the file, not whether AdvSIMD
> & SVE are enabled.  I think we should therefore add this to
> aarch64_define_unconditional_macros instead.
> 
> Sorry for the slow review.  I was waiting for the outcome of that
> discussion before replying.
> 
> Thanks,
> Richard
> 
>> diff --git a/gcc/config/aarch64/aarch64-c.cc 
>> b/gcc/config/aarch64/aarch64-c.cc
>> index 
>> fe1a20e4e546a68e5f7eddff3bbb0d3e831fbd9b..1121be118cf8d05e3736ad4ee75568ff7cb92bfd
>>  100644
>> --- a/gcc/config/aarch64/aarch64-c.cc
>> +++ b/gcc/config/aarch64/aarch64-c.cc
>> @@ -260,6 +260,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
>>aarch64_def_or_undef (TARGET_SME_I16I64, "__ARM_FEATURE_SME_I16I64", 
>> pfile);
>>aarch64_def_or_undef (TARGET_SME_F64F64, "__ARM_FEATURE_SME_F64F64", 
>> pfile);
>>aarch64_def_or_undef (TARGET_SME2, "__ARM_FEATURE_SME2", pfile);
>> +  aarch64_def_or_undef (TARGET_SVE, "__ARM_NEON_SVE_BRIDGE", pfile);
>>  
>>/* Not for ACLE, but required to keep "float.h" correct if we switch
>>   target between implementations that do or do not support ARMv8.2-Adiff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index fe1a20e4e546a68e5f7eddff3bbb0d3e831fbd9b..d042e5fbd8c562df2e4538b51b960c194d2ca2c9 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -75,6 +75,7 @@ aarch64_define_unconditional_macros (cpp_reader *pfile)
 
   builtin_define ("__ARM_STATE_ZA");
   builtin_define ("__ARM_STATE_ZT0");
+  builtin_define ("__ARM_NEON_SVE_BRIDGE");
 
   /* Define keyword attributes like __arm_streaming as macros that expand
  to the associated [[...]] attribute.  Use __extension__ in the attribute


[gcc r15-1074] arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

2024-06-06 Thread Richard Ball via Gcc-cvs
https://gcc.gnu.org/g:2963c76e8e24d4ebaf2b1b4ac4d7ca44eb0a9025

commit r15-1074-g2963c76e8e24d4ebaf2b1b4ac4d7ca44eb0a9025
Author: Richard Ball 
Date:   Thu Jun 6 16:10:14 2024 +0100

arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

The CASE_VECTOR_SHORTEN_MODE query is missing some equals signs
which causes suboptimal codegen due to missed optimisation
opportunities. This patch also adds a test for thumb2
switch statements as none exist currently.

gcc/ChangeLog:
PR target/115353
* config/arm/arm.h (enum arm_auto_incmodes):
Correct CASE_VECTOR_SHORTEN_MODE query.

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb2-switchstatement.c: New test.

Diff:
---
 gcc/config/arm/arm.h   |   4 +-
 .../gcc.target/arm/thumb2-switchstatement.c| 144 +
 2 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 449e6935b32..0cd5d733952 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2111,8 +2111,8 @@ enum arm_auto_incmodes
   ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, HImode)   \
   : SImode)
\
: (TARGET_THUMB2\
-  ? ((min > 0 && max < 0x200) ? QImode \
-  : (min > 0 && max <= 0x2) ? HImode   \
+  ? ((min >= 0 && max < 0x200) ? QImode\
+  : (min >= 0 && max < 0x2) ? HImode   \
   : SImode)
\
: ((min >= 0 && max < 1024) \
   ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, QImode)   \
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c 
b/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c
new file mode 100644
index 000..8badf318e62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c
@@ -0,0 +1,144 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb --param case-values-threshold=1 -fno-reorder-blocks 
-fno-tree-dce -O2" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#define NOP "nop;"
+#define NOP2 NOP NOP
+#define NOP4 NOP2 NOP2
+#define NOP8 NOP4 NOP4
+#define NOP16 NOP8 NOP8
+#define NOP32 NOP16 NOP16
+#define NOP64 NOP32 NOP32
+#define NOP128 NOP64 NOP64
+#define NOP256 NOP128 NOP128
+#define NOP512 NOP256 NOP256
+#define NOP1024 NOP512 NOP512
+#define NOP2048 NOP1024 NOP1024
+#define NOP4096 NOP2048 NOP2048
+#define NOP8192 NOP4096 NOP4096
+#define NOP16384 NOP8192 NOP8192
+#define NOP32768 NOP16384 NOP16384
+#define NOP65536 NOP32768 NOP32768
+#define NOP131072 NOP65536 NOP65536
+
+enum z
+{
+  a = 1,
+  b,
+  c,
+  d,
+  e,
+  f = 7,
+};
+
+inline void QIFunction (const char* flag)
+{
+  asm volatile (NOP32);
+  return;
+}
+
+inline void HIFunction (const char* flag)
+{
+  asm volatile (NOP512);
+  return;
+}
+
+inline void SIFunction (const char* flag)
+{
+  asm volatile (NOP131072);
+  return;
+}
+
+/*
+**QImode_test:
+** ...
+** tbb \[pc, r[0-9]+\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* 
QImode_test(enum z x)
+{
+  switch (x)
+{
+  case d:
+QIFunction("QItest");
+return "InlineASM";
+  case f:
+return "TEST";
+  default:
+return "Default";
+}
+}
+
+/* { dg-final { scan-assembler ".byte" } } */
+
+/*
+**HImode_test:
+** ...
+** tbh \[pc, r[0-9]+, lsl #1\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* 
HImode_test(enum z x)
+{
+  switch (x)
+  {
+case d:
+  HIFunction("HItest");
+  return "InlineASM";
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
+
+/* { dg-final { scan-assembler ".2byte" } } */
+
+/*
+**SImode_test:
+** ...
+** adr (r[0-9]+), .L[0-9]+
+** ldr pc, \[\1, r[0-9]+, lsl #2\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* 
SImode_test(enum z x)
+{
+  switch (x)
+  {
+case d:
+  SIFunction("SItest");
+  return "InlineASM";
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
+
+/* { dg-final { scan-assembler ".word" } } */
+
+/*
+**backwards_branch_test:
+** ...
+** adr (r[0-9]+), .L[0-9]+
+** ldr pc, \[\1, r[0-9]+, lsl #2\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const cha

[jira] [Commented] (MNG-7868) "Could not acquire lock(s)" error in concurrent maven builds

2024-06-06 Thread Richard Eckart de Castilho (Jira)


[ 
https://issues.apache.org/jira/browse/MNG-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852829#comment-17852829
 ] 

Richard Eckart de Castilho commented on MNG-7868:
-

It is a {{clean install}} call and the plugin goal being executed when the 
exception was thrown was `biz.aQute.bnd:bnd-testing-maven-plugin:7.0.0:testing` 
(see above).

The parallelization was triggered using {{-T 2}}

The {{sentencepiece}} and {{bar-api}} artifacts are downloaded from a local 
mirror repository. {{module-foo}}, {{module-bar}} are from the reactor.

> "Could not acquire lock(s)" error in concurrent maven builds
> 
>
> Key: MNG-7868
> URL: https://issues.apache.org/jira/browse/MNG-7868
> Project: Maven
>  Issue Type: Bug
> Environment: windows, maven 3.9.4
>Reporter: Jörg Hohwiller
>Priority: Major
> Attachments: image-2024-04-10-15-44-37-013.png, screenshot-1.png
>
>
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:3.1.1:install (default-install) 
> on project foo.bar: Execution default-install of goal 
> org.apache.maven.plugins:maven-install-plugin:3.1.1:install failed: Could not 
> acquire lock(s) -> [Help 1]
> {code}
> I am using maven 3.9.4 on windows:
> {code}
> $ mvn -v
> Apache Maven 3.9.4 (dfbb324ad4a7c8fb0bf182e6d91b0ae20e3d2dd9)
> Maven home: D:\projects\test\software\mvn
> Java version: 17.0.5, vendor: Eclipse Adoptium, runtime: 
> D:\projects\test\software\java
> Default locale: en_US, platform encoding: UTF-8
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> {code}
> I searched for this bug and found issues like MRESOLVER-332 that first look 
> identical or similar but do not really seem to be related so I decided to 
> create this issue.
> For this bug I made the following observations:
> * it only happens with concurrent builds: {{mvn -T ...}}
> * is seems to be windows related (at least mainly happens on windows)
> * it is in-deterministic and is not so easy to create an isolated and simple 
> project and a reproducible scenario that always results in this error. 
> However, I get this very often in my current project with many modules (500+).
> * it is not specific to the maven-install-plugin and also happens from other 
> spots in maven:
> I also got this stacktrace:
> {code}
> Suppressed: java.lang.IllegalStateException: Attempt 1: Could not acquire 
> write lock for 
> 'C:\Users\hohwille\.m2\repository\.locks\artifact~com.caucho~com.springsource.com.caucho~3.2.1.lock'
>  in 30 SECONDS
> at 
> org.eclipse.aether.internal.impl.synccontext.named.NamedLockFactoryAdapter$AdaptedLockSyncContext.acquire
>  (NamedLockFactoryAdapter.java:202)
> at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve 
> (DefaultArtifactResolver.java:271)
> at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts 
> (DefaultArtifactResolver.java:259)
> at 
> org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
> (DefaultRepositorySystem.java:352)
> {code}
> See also this related discussion:
> https://github.com/apache/maven-mvnd/issues/836#issuecomment-1702488377



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PATCH] arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

2024-06-06 Thread Richard Earnshaw (lists)
On 06/06/2024 15:40, Richard Ball wrote:
> The CASE_VECTOR_SHORTEN_MODE query is missing some equals signs
> which causes suboptimal codegen due to missed optimisation
> opportunities. This patch also adds a test for thumb2
> switch statements as none exist currently.
> 
> gcc/ChangeLog:
>   PR target/115353
>   * config/arm/arm.h (enum arm_auto_incmodes):
>   Correct CASE_VECTOR_SHORTEN_MODE query.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/thumb2-switchstatement.c: New test.

OK.

R.


Re: Unbinding Alt-1 Does Not Work

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 09:07, Jean-Marc Lasgouttes wrote:

Le 04/06/2024 à 17:31, Richard Kimberly Heck a écrit :
There are other cases like C-~S-underscore. On most English 
keyboards, underscore is shifted, but it might not be even on all 
English keyboards.


I guess the only solution then is to figure out how to make unbinding 
work for sequences with ~S and the like.


I guess a sequence with ~S should match both the sequence with S and 
the sequence without S.


I assume there is also ~C? Does that ever get used? It's not in any of 
our files, only ~S. Allowing for the possibility of something like 
C-~S-~A-8 is not going to be trivial, so I'm wondering if we could 
change that.


Riki


--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


Re: LyX 2.4.0 -- annoying bug (?) in Windows with multiple documents open

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 10:27, Bernt Lie via lyx-users wrote:


LyX 2.4.0 on Windows 11, latest version.

When I have more than one document open in LyX and click on the 
document tab


[any one of them], the following warning pops up:

This message shows up irrespective of whether I have made any changes, 
or not.


Has anyone else seen this?


Are you using a Dropbox folder, or something of the sort?

Riki

-- 
lyx-users mailing list
lyx-users@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-users


Re: LyX 2.4.0 -- annoying bug (?) in Windows with multiple documents open

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 10:27, Bernt Lie via lyx-users wrote:


LyX 2.4.0 on Windows 11, latest version.

When I have more than one document open in LyX and click on the 
document tab


[any one of them], the following warning pops up:

This message shows up irrespective of whether I have made any changes, 
or not.


Has anyone else seen this?


Are you using a Dropbox folder, or something of the sort?

Riki

-- 
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


[PATCH] arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

2024-06-06 Thread Richard Ball
The CASE_VECTOR_SHORTEN_MODE query is missing some equals signs
which causes suboptimal codegen due to missed optimisation
opportunities. This patch also adds a test for thumb2
switch statements as none exist currently.

gcc/ChangeLog:
PR target/115353
* config/arm/arm.h (enum arm_auto_incmodes):
Correct CASE_VECTOR_SHORTEN_MODE query.

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb2-switchstatement.c: New test.diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 449e6935b32f8f272df709ba43aa2ba7de37e6b3..0cd5d733952d7620f452d9d90cec9103b3fb5300 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2111,8 +2111,8 @@ enum arm_auto_incmodes
   ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, HImode)	\
   : SImode)\
: (TARGET_THUMB2			\
-  ? ((min > 0 && max < 0x200) ? QImode\
-  : (min > 0 && max <= 0x2) ? HImode\
+  ? ((min >= 0 && max < 0x200) ? QImode\
+  : (min >= 0 && max < 0x2) ? HImode\
   : SImode)\
: ((min >= 0 && max < 1024)		\
   ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, QImode)	\
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c b/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c
new file mode 100644
index ..8badf318e626de1911e297bff8e93ac72160224f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c
@@ -0,0 +1,144 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb --param case-values-threshold=1 -fno-reorder-blocks -fno-tree-dce -O2" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#define NOP "nop;"
+#define NOP2 NOP NOP
+#define NOP4 NOP2 NOP2
+#define NOP8 NOP4 NOP4
+#define NOP16 NOP8 NOP8
+#define NOP32 NOP16 NOP16
+#define NOP64 NOP32 NOP32
+#define NOP128 NOP64 NOP64
+#define NOP256 NOP128 NOP128
+#define NOP512 NOP256 NOP256
+#define NOP1024 NOP512 NOP512
+#define NOP2048 NOP1024 NOP1024
+#define NOP4096 NOP2048 NOP2048
+#define NOP8192 NOP4096 NOP4096
+#define NOP16384 NOP8192 NOP8192
+#define NOP32768 NOP16384 NOP16384
+#define NOP65536 NOP32768 NOP32768
+#define NOP131072 NOP65536 NOP65536
+
+enum z
+{
+  a = 1,
+  b,
+  c,
+  d,
+  e,
+  f = 7,
+};
+
+inline void QIFunction (const char* flag)
+{
+  asm volatile (NOP32);
+  return;
+}
+
+inline void HIFunction (const char* flag)
+{
+  asm volatile (NOP512);
+  return;
+}
+
+inline void SIFunction (const char* flag)
+{
+  asm volatile (NOP131072);
+  return;
+}
+
+/*
+**QImode_test:
+**	...
+**	tbb	\[pc, r[0-9]+\]
+**	...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* QImode_test(enum z x)
+{
+  switch (x)
+{
+  case d:
+QIFunction("QItest");
+return "InlineASM";
+  case f:
+return "TEST";
+  default:
+return "Default";
+}
+}
+
+/* { dg-final { scan-assembler ".byte" } } */
+
+/*
+**HImode_test:
+**	...
+**	tbh	\[pc, r[0-9]+, lsl #1\]
+**	...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* HImode_test(enum z x)
+{
+  switch (x)
+  {
+case d:
+  HIFunction("HItest");
+  return "InlineASM";
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
+
+/* { dg-final { scan-assembler ".2byte" } } */
+
+/*
+**SImode_test:
+**	...
+**	adr	(r[0-9]+), .L[0-9]+
+**	ldr	pc, \[\1, r[0-9]+, lsl #2\]
+**	...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* SImode_test(enum z x)
+{
+  switch (x)
+  {
+case d:
+  SIFunction("SItest");
+  return "InlineASM";
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
+
+/* { dg-final { scan-assembler ".word" } } */
+
+/*
+**backwards_branch_test:
+**	...
+**	adr	(r[0-9]+), .L[0-9]+
+**	ldr	pc, \[\1, r[0-9]+, lsl #2\]
+**	...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* backwards_branch_test(enum z x, int flag)
+{
+  if (flag == 5)
+  {
+backwards:
+  asm volatile (NOP512);
+  return "ASM";
+  }
+  switch (x)
+  {
+case d:
+  goto backwards;
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
\ No newline at end of file


Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-06 Thread Richard Sandiford
Ajit Agarwal  writes:
> On 06/06/24 2:28 pm, Richard Sandiford wrote:
>> Hi,
>> 
>> Just some comments on the fuseable_load_p part, since that's what
>> we were discussing last time.
>> 
>> It looks like this now relies on:
>> 
>> Ajit Agarwal  writes:
>>> +  /* We use DF data flow because we change location rtx
>>> +which is easier to find and modify.
>>> +We use mix of rtl-ssa def-use and DF data flow
>>> +where it is easier.  */
>>> +  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
>>> +  df_analyze ();
>>> +  df_set_flags (DF_DEFER_INSN_RESCAN);
>> 
>> But please don't do this!  For one thing, building DU/UD chains
>> as well as rtl-ssa is really expensive in terms of compile time.
>> But more importantly, modifications need to happen via rtl-ssa
>> to ensure that the IL is kept up-to-date.  If we don't do that,
>> later fuse attempts will be based on stale data and so could
>> generate incorrect code.
>> 
>
> Sure I have made changes to use only rtl-ssa and not to use
> UD/DU chains. I will send the changes in separate subsequent
> patch.

Thanks.  Before you send the patch though:

>>> +// Check whether load can be fusable or not.
>>> +// Return true if fuseable otherwise false.
>>> +bool
>>> +rs6000_pair_fusion::fuseable_load_p (insn_info *info)
>>> +{
>>> +  for (auto def : info->defs())
>>> +{
>>> +  auto set = dyn_cast (def);
>>> +  for (auto use1 : set->nondebug_insn_uses ())
>>> +   use1->set_is_live_out_use (true);
>>> +}
>> 
>> What was the reason for adding this loop?
>>
>
> The purpose of adding is to avoid assert failure in gcc/rtl-ssa/changes.cc:252

That assert is making sure that we don't delete a definition of a
register (or memory) while a real insn still uses it.  If the assert
is firing then something has gone wrong.

Live-out uses are a particular kind of use that occur at the end of
basic blocks.  It's incorrect to mark normal insn uses as live-out.

When an assert fails, it's important to understand why the failure
occurs, rather than brute-force the assert condition to true.

>>> [...]
>>> +
>>> +  rtx addr = XEXP (SET_SRC (body), 0);
>>> +
>>> +  if (GET_CODE (addr) == PLUS
>>> +  && XEXP (addr, 1) && CONST_INT_P (XEXP (addr, 1)))
>>> +{
>>> +  if (INTVAL (XEXP (addr, 1)) == -16)
>>> +   return false;
>>> +  }
>> 
>> What's special about -16?
>> 
>
> The tests like libgomp/for-8 fails with fused load with offset -16 and 0.
> Thats why I have added this check.

But why does it fail though?  It sounds like the testcase is pointing
out a problem in the pass (or perhaps elsewhere).  It's important that
we try to understand and fix the underlying problem.

>>> +
>>> +  df_ref use;
>>> +  df_insn_info *insn_info = DF_INSN_INFO_GET (info->rtl ());
>>> +  FOR_EACH_INSN_INFO_DEF (use, insn_info)
>>> +{
>>> +  struct df_link *def_link = DF_REF_CHAIN (use);
>>> +
>>> +  if (!def_link || !def_link->ref
>>> + || DF_REF_IS_ARTIFICIAL (def_link->ref))
>>> +   continue;
>>> +
>>> +  while (def_link && def_link->ref)
>>> +   {
>>> + rtx_insn *insn = DF_REF_INSN (def_link->ref);
>>> + if (GET_CODE (PATTERN (insn)) == PARALLEL)
>>> +   return false;
>> 
>> Why do you need to skip PARALLELs?
>>
>
> vec_select with parallel give failures final.cc "can't split-up with subreg 
> 128 (reg OO"
> Thats why I have added this.

But in (vec_select ... (parallel ...)), the parallel won't be the 
PATTERN (insn).  It'll instead be a suboperand of the vec_select.

Here too it's important to understand why the final.cc failure occurs
and what the correct fix is.

>>> +
>>> + rtx set = single_set (insn);
>>> + if (set == NULL_RTX)
>>> +   return false;
>>> +
>>> + rtx op0 = SET_SRC (set);
>>> + rtx_code code = GET_CODE (op0);
>>> +
>>> + // This check is added as register pairs are not generated
>>> + // by RA for neg:V2DF (fma: V2DF (reg1)
>>> + //  (reg2)
>>> + //  (neg:V2DF (reg3)))
>>> + if (GET_RTX_CLASS (code) == RTX_UNARY)
>>> +   return false;
>> 
>> What's special about (neg (fma ...))?
>>
>
> I am not sure why register allocator fails allocating register p

Re: linux-user emulation hangs during fork

2024-06-06 Thread Richard Henderson

On 6/6/24 01:27, Andreas Schwab wrote:

Which ruby?

$ ruby --version
ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux-gnu]



ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]

That might have been handy to have with your original report.


r~



[openssl/openssl] 76cb23: Drop the old PGP key fingerprint

2024-06-06 Thread Richard Levitte
  Branch: refs/heads/openssl-3.2
  Home:   https://github.com/openssl/openssl
  Commit: 76cb2357be60cebf0fe360bd4c39862c8474104a
  
https://github.com/openssl/openssl/commit/76cb2357be60cebf0fe360bd4c39862c8474104a
  Author: Richard Levitte 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M doc/fingerprints.txt

  Log Message:
  ---
  Drop the old PGP key fingerprint

All public releases have the information of the new PGP key in
doc/fingerprints.txt, so it is finally time to drop the old.

Reviewed-by: Kurt Roeckx 
Reviewed-by: Tomas Mraz 
(Merged from https://github.com/openssl/openssl/pull/24563)

(cherry picked from commit a9fa07f47cea6a43d5ac4a3aa336ab34756c2e9b)



To unsubscribe from these emails, change your notification settings at 
https://github.com/openssl/openssl/settings/notifications


[openssl/openssl] bb4095: Drop the old PGP key fingerprint

2024-06-06 Thread Richard Levitte
  Branch: refs/heads/openssl-3.3
  Home:   https://github.com/openssl/openssl
  Commit: bb40954ec723e02a538735d35edeeb2dd880e4e1
  
https://github.com/openssl/openssl/commit/bb40954ec723e02a538735d35edeeb2dd880e4e1
  Author: Richard Levitte 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M doc/fingerprints.txt

  Log Message:
  ---
  Drop the old PGP key fingerprint

All public releases have the information of the new PGP key in
doc/fingerprints.txt, so it is finally time to drop the old.

Reviewed-by: Kurt Roeckx 
Reviewed-by: Tomas Mraz 
(Merged from https://github.com/openssl/openssl/pull/24563)

(cherry picked from commit a9fa07f47cea6a43d5ac4a3aa336ab34756c2e9b)



To unsubscribe from these emails, change your notification settings at 
https://github.com/openssl/openssl/settings/notifications


[openssl/openssl] 793298: Drop the old PGP key fingerprint

2024-06-06 Thread Richard Levitte
  Branch: refs/heads/openssl-3.0
  Home:   https://github.com/openssl/openssl
  Commit: 793298a6d8d58ec6a4f412fa876371694252a19c
  
https://github.com/openssl/openssl/commit/793298a6d8d58ec6a4f412fa876371694252a19c
  Author: Richard Levitte 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M doc/fingerprints.txt

  Log Message:
  ---
  Drop the old PGP key fingerprint

All public releases have the information of the new PGP key in
doc/fingerprints.txt, so it is finally time to drop the old.

Reviewed-by: Kurt Roeckx 
Reviewed-by: Tomas Mraz 
(Merged from https://github.com/openssl/openssl/pull/24563)

(cherry picked from commit a9fa07f47cea6a43d5ac4a3aa336ab34756c2e9b)



To unsubscribe from these emails, change your notification settings at 
https://github.com/openssl/openssl/settings/notifications


[openssl/openssl] a9fa07: Drop the old PGP key fingerprint

2024-06-06 Thread Richard Levitte
  Branch: refs/heads/master
  Home:   https://github.com/openssl/openssl
  Commit: a9fa07f47cea6a43d5ac4a3aa336ab34756c2e9b
  
https://github.com/openssl/openssl/commit/a9fa07f47cea6a43d5ac4a3aa336ab34756c2e9b
  Author: Richard Levitte 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M doc/fingerprints.txt

  Log Message:
  ---
  Drop the old PGP key fingerprint

All public releases have the information of the new PGP key in
doc/fingerprints.txt, so it is finally time to drop the old.

Reviewed-by: Kurt Roeckx 
Reviewed-by: Tomas Mraz 
(Merged from https://github.com/openssl/openssl/pull/24563)



To unsubscribe from these emails, change your notification settings at 
https://github.com/openssl/openssl/settings/notifications


[openssl/openssl] 42d56b: Drop the old PGP key fingerprint

2024-06-06 Thread Richard Levitte
  Branch: refs/heads/openssl-3.1
  Home:   https://github.com/openssl/openssl
  Commit: 42d56b8a839a6f924ffe3c69605901b4d57db8a7
  
https://github.com/openssl/openssl/commit/42d56b8a839a6f924ffe3c69605901b4d57db8a7
  Author: Richard Levitte 
  Date:   2024-06-06 (Thu, 06 Jun 2024)

  Changed paths:
M doc/fingerprints.txt

  Log Message:
  ---
  Drop the old PGP key fingerprint

All public releases have the information of the new PGP key in
doc/fingerprints.txt, so it is finally time to drop the old.

Reviewed-by: Kurt Roeckx 
Reviewed-by: Tomas Mraz 
(Merged from https://github.com/openssl/openssl/pull/24563)

(cherry picked from commit a9fa07f47cea6a43d5ac4a3aa336ab34756c2e9b)



To unsubscribe from these emails, change your notification settings at 
https://github.com/openssl/openssl/settings/notifications


Re: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD

2024-06-06 Thread Richard Biener
On Thu, Jun 6, 2024 at 3:37 PM  wrote:
>
> From: Pan Li 
>
> After we support one gassign form of the unsigned .SAT_ADD,  we
> would like to support more forms including both the branch and
> branchless.  There are 5 other forms of .SAT_ADD,  list as below:
>
> Form 1:
>   #define SAT_ADD_U_1(T) \
>   T sat_add_u_1_##T(T x, T y) \
>   { \
> return (T)(x + y) >= x ? (x + y) : -1; \
>   }
>
> Form 2:
>   #define SAT_ADD_U_2(T) \
>   T sat_add_u_2_##T(T x, T y) \
>   { \
> T ret; \
> T overflow = __builtin_add_overflow (x, y, ); \
> return (T)(-overflow) | ret; \
>   }
>
> Form 3:
>   #define SAT_ADD_U_3(T) \
>   T sat_add_u_3_##T (T x, T y) \
>   { \
> T ret; \
> return __builtin_add_overflow (x, y, ) ? -1 : ret; \
>   }
>
> Form 4:
>   #define SAT_ADD_U_4(T) \
>   T sat_add_u_4_##T (T x, T y) \
>   { \
> T ret; \
> return __builtin_add_overflow (x, y, ) == 0 ? ret : -1; \
>   }
>
> Form 5:
>   #define SAT_ADD_U_5(T) \
>   T sat_add_u_5_##T(T x, T y) \
>   { \
> return (T)(x + y) < x ? -1 : (x + y); \
>   }
>
> Take the forms 3 of above as example:
>
> uint64_t
> sat_add (uint64_t x, uint64_t y)
> {
>   uint64_t ret;
>   return __builtin_add_overflow (x, y, ) ? -1 : ret;
> }
>
> Before this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   long unsigned int _2;
>   uint64_t _3;
>   __complex__ long unsigned int _6;
>
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _6 = .ADD_OVERFLOW (x_4(D), y_5(D));
>   _2 = IMAGPART_EXPR <_6>;
>   if (_2 != 0)
> goto ; [35.00%]
>   else
> goto ; [65.00%]
> ;;succ:   4
> ;;3
>
> ;;   basic block 3, loop depth 0
> ;;pred:   2
>   _1 = REALPART_EXPR <_6>;
> ;;succ:   4
>
> ;;   basic block 4, loop depth 0
> ;;pred:   3
> ;;2
>   # _3 = PHI <_1(3), 18446744073709551615(2)>
>   return _3;
> ;;succ:   EXIT
> }
>
> After this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _12;
>
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _12 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
>   return _12;
> ;;succ:   EXIT
> }
>
> The flag '^' acts on cond_expr will generate matching code similar as below:
>
> else if (gphi *_a1 = dyn_cast  (_d1))
>   {
> basic_block _b1 = gimple_bb (_a1);
> if (gimple_phi_num_args (_a1) == 2)
>   {
> basic_block _pb_0_1 = EDGE_PRED (_b1, 0)->src;
> basic_block _pb_1_1 = EDGE_PRED (_b1, 1)->src;
> basic_block _db_1 = safe_dyn_cast  (*gsi_last_bb (_pb_0_1))
> ? _pb_0_1 : _pb_1_1;
> basic_block _other_db_1 = safe_dyn_cast  (*gsi_last_bb 
> (_pb_0_1))
>   ? _pb_1_1 : _pb_0_1;
> gcond *_ct_1 = safe_dyn_cast  (*gsi_last_bb (_db_1));
> if (_ct_1 && EDGE_COUNT (_other_db_1->preds) == 1
>   && EDGE_COUNT (_other_db_1->succs) == 1
>   && EDGE_PRED (_other_db_1, 0)->src == _db_1)
>   {
> tree _cond_lhs_1 = gimple_cond_lhs (_ct_1);
> tree _cond_rhs_1 = gimple_cond_rhs (_ct_1);
> tree _p0 = build2 (gimple_cond_code (_ct_1), boolean_type_node,
>_cond_lhs_1, _cond_rhs_1);
>         bool _arg_0_is_true_1 = gimple_phi_arg_edge (_a1, 0)->flags & 
> EDGE_TRUE_VALUE;
> tree _p1 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 0 : 1);
> tree _p2 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 1 : 0);
> 
>
> The below test suites are passed for this patch.
> * The x86 bootstrap test.
> * The x86 fully regression test.
> * The riscv fully regression test.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> * doc/match-and-simplify.texi: Add doc for the matching flag '^'.
> * genmatch.cc (cmp_operand): Add match_phi comparation.
> (dt_node::gen_kids_1): Add cond_expr bool flag for phi match.
> (dt_operand::gen_phi_on_cond): Add new func to gen phi matching
> on cond_expr.
> (parser::parse_expr): Add handling for the expr flag '^'.
> * match.pd: Add more form for unsigned .SAT_ADD.
> * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
> new func impl to build call for phi gimple.
> (match_unsigned_saturation_add): Add new func impl to match the
> .SAT_ADD for phi gimple.
> (math_opts_dom_walker::after_dom_children): Add phi 

[PATCH] Add SLP_TREE_MEMORY_ACCESS_TYPE

2024-06-06 Thread Richard Biener
It turns out target costing code looks at STMT_VINFO_MEMORY_ACCESS_TYPE
to identify operations from (emulated) gathers for example.  This
doesn't work for SLP loads since we do not set STMT_VINFO_MEMORY_ACCESS_TYPE
there as the vectorization strathegy might differ between different
stmt uses.  It seems we got away with setting it for stores though.
The following adds a memory_access_type field to slp_tree and sets it
from load and store vectorization code.  All the costing doesn't record
the SLP node (that was only done selectively for some corner case).  The
costing is really in need of a big overhaul, the following just massages
the two relevant ops to fix gcc.dg/target/pr88531-2[bc].c FAILs when
switching on SLP for non-grouped stores.  In particular currently
we either have a SLP node or a stmt_info in the cost hook but not both.

So the following is a hack(?).  Other targets look possibly affected as
well.  I do want to postpone rewriting all of the costing to after
all-SLP.

Any comments?

* tree-vectorizer.h (_slp_tree::memory_access_type): Add.
(SLP_TREE_MEMORY_ACCESS_TYPE): New.
(record_stmt_cost): Add another overload.
* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
memory_access_type.
* tree-vect-stmts.cc (vectorizable_store): Set
SLP_TREE_MEMORY_ACCESS_TYPE.
(vectorizable_load): Likewise.  Also record the SLP node
when costing emulated gather offset decompose and vector
composition.
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Also
recognize SLP emulated gather/scatter.
---
 gcc/config/i386/i386.cc |  22 ++---
 gcc/tree-vect-slp.cc|   1 +
 gcc/tree-vect-stmts.cc  |  16 +--
 gcc/tree-vectorizer.h   | 102 
 4 files changed, 91 insertions(+), 50 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 4126ab24a79..32ecf31d8d1 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25150,13 +25150,21 @@ ix86_vector_costs::add_stmt_cost (int count, 
vect_cost_for_stmt kind,
  (AGU and load ports).  Try to account for this by scaling the
  construction cost by the number of elements involved.  */
   if ((kind == vec_construct || kind == vec_to_scalar)
-  && stmt_info
-  && (STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
- || STMT_VINFO_TYPE (stmt_info) == store_vec_info_type)
-  && ((STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
-  && (TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info)))
-  != INTEGER_CST))
- || STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER))
+  && ((stmt_info
+  && (STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
+  || STMT_VINFO_TYPE (stmt_info) == store_vec_info_type)
+  && ((STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
+   && (TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info)))
+   != INTEGER_CST))
+  || (STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
+  == VMAT_GATHER_SCATTER)))
+ || (node
+ && ((SLP_TREE_MEMORY_ACCESS_TYPE (node) == VMAT_ELEMENTWISE
+ && (TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF
+   (SLP_TREE_REPRESENTATIVE (node
+ != INTEGER_CST))
+ || (SLP_TREE_MEMORY_ACCESS_TYPE (node)
+ == VMAT_GATHER_SCATTER)
 {
   stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
   stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index e1e47b786c2..c359e8a0bbc 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -122,6 +122,7 @@ _slp_tree::_slp_tree ()
   SLP_TREE_CODE (this) = ERROR_MARK;
   SLP_TREE_VECTYPE (this) = NULL_TREE;
   SLP_TREE_REPRESENTATIVE (this) = NULL;
+  SLP_TREE_MEMORY_ACCESS_TYPE (this) = VMAT_INVARIANT;
   SLP_TREE_REF_COUNT (this) = 1;
   this->failed = NULL;
   this->max_nunits = 1;
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index bd7dd149d11..8049c458136 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8316,6 +8316,8 @@ vectorizable_store (vec_info *vinfo,
   if (costing_p) /* transformation not required.  */
 {
   STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) = memory_access_type;
+  if (slp_node)
+   SLP_TREE_MEMORY_ACCESS_TYPE (slp_node) = memory_access_type;
 
   if (loop_vinfo
  && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
@@ -8356,7 +8358,10 @@ vectorizable_store (vec_info *vinfo,
  && first_stmt_info != stmt_info)
return true;
 }
-  gcc_assert (memory_access_type == STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info));
+  if (slp_node)
+gcc_assert (memory_access_type == SLP_TREE_MEMORY_ACCESS_TYPE (stmt_info));
+  else
+gcc_assert 

Re: Appendix border lines painting cross document border

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 08:44, Jean-Marc Lasgouttes wrote:

Le 04/06/2024 à 10:02, Pavel Sanda a écrit :

Hi,

just a small painting issue:

1. start new file
2. document -> start appendix here
3. type anything
4. the border lines stretch below the end of document
5. it get's fixed byt enter
6. type again -> border is wrong again


This is fixed in master at 7acfbe0fccc7, although the bug does not 
manifest itself there. The code has been obviously wrong forever, but 
this was not visible because the grey area below the document was 
always repainted.


This "always repaint" behavior has changed at some point before 2.4.0, 
but is restored at 1a11abe4394272.


So the bug will disappear when branch 2.4.1-devel is merged, but still 
I'd like to back port this commit.


Riki, OK to back port to 2.4.x?


Yes.

Riki


--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


Re: quotes around argument of PackageOptions

2024-06-06 Thread Richard Kimberly Heck

On 6/6/24 01:01, Jürgen Spitzmüller wrote:

Am Donnerstag, dem 06.06.2024 um 00:19 +0300 schrieb Udicoudco:

I attached a file that compiles with 2.3.x but not with 2.4.0.
If I remove the quotes around the second argument of PackageOptions
then LyX 2.3.7 complains that the layout is invalid, and with 2.4.0
the file compiles.

That's due to a77c84a0b4d5.

Fixed in master (1449fbf9ae3ec), candidate for 2.4.1.


OK!

Riki


--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


[PATCH] RISC-V: Handle non-grouped stores as single-lane SLP

2024-06-06 Thread Richard Biener
The following enables single-lane loop SLP discovery for non-grouped stores
and adjusts vectorizable_store to properly handle those.

For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop,
not running into the "not falling back to strided accesses" bail-out.
I have not investigated in detail.  Similar for gcc.dg/vect/slp-19c.c.

The gcc.dg/vect/O3-pr39675-2.c and gcc.dg/vect/slp-19[abc].c SLPs
depend on the load permute lowering as the single-lane store we
now want to handle is fed from a single lane from groups of size four.
I've updated the expected number of SLPs but they FAIL.

For gfortran.dg/vect/fast-math-mgrid-resid.f predictive commoning
now unrolls the loop, the vectorization factor is the same.  I think
association during SLP build might be the reason for the difference.

There is a set of i386 target assembler test FAILs,
gcc.target/i386/pr88531-2[bc].c in particular fail because the
target cannot identify SLP emulated gathers, see another mail from me.
Others need adjustment, I've adjusted one with this patch only.

I'm probably delaying this a bit until the load permute lowering
is good enough for pushing.

* tree-vect-slp.cc (vect_analyze_slp): Perform single-lane
loop SLP discovery for non-grouped stores.
* tree-vect-stmts.cc (vectorizable_store): Always set
vec_num for SLP.

* gcc.dg/vect/O3-pr39675-2.c: Adjust expected number of SLP.
* gcc.dg/vect/fast-math-vect-call-1.c: Likewise.
* gcc.dg/vect/no-scevccp-slp-31.c: Likewise.
* gcc.dg/vect/slp-12b.c: Likewise.
* gcc.dg/vect/slp-12c.c: Likewise.
* gcc.dg/vect/slp-19a.c: Likewise.
* gcc.dg/vect/slp-19b.c: Likewise.
* gcc.dg/vect/slp-19c.c: Likewise.
* gcc.dg/vect/slp-4-big-array.c: Likewise.
* gcc.dg/vect/slp-4.c: Likewise.
* gcc.dg/vect/slp-5.c: Likewise.
* gcc.dg/vect/slp-7.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-37.c: Likewise.
* gcc.dg/vect/vect-outer-slp-3.c: Disable vectorization of
initialization loop.
* gcc.dg/vect/slp-reduc-5.c: Likewise.
* gcc.dg/vect/no-scevccp-outer-12.c: Un-XFAIL.  SLP can handle
inner loop inductions with multiple vector stmt copies.
* gfortran.dg/vect/vect-8.f90: Adjust expected number of
vectorized loops.
* gfortran.dg/vect/fast-math-mgrid-resid.f: Expect predictive
commoning with unrolling.
* gcc.target/i386/vectorize1.c: Adjust what we scan for.
---
 gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c  |  2 +-
 .../gcc.dg/vect/fast-math-vect-call-1.c   |  2 +-
 .../gcc.dg/vect/no-scevccp-outer-12.c |  3 +--
 gcc/testsuite/gcc.dg/vect/no-scevccp-slp-31.c |  5 ++--
 gcc/testsuite/gcc.dg/vect/slp-12b.c   |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-12c.c   |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-19a.c   |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-19b.c   |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-19c.c   |  4 ++--
 gcc/testsuite/gcc.dg/vect/slp-37.c|  2 +-
 gcc/testsuite/gcc.dg/vect/slp-4-big-array.c   |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-4.c |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-5.c |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-7.c |  4 ++--
 gcc/testsuite/gcc.dg/vect/slp-perm-7.c|  4 ++--
 gcc/testsuite/gcc.dg/vect/slp-reduc-5.c   |  3 ++-
 gcc/testsuite/gcc.dg/vect/vect-outer-slp-3.c  |  1 +
 gcc/testsuite/gcc.target/i386/vectorize1.c|  4 ++--
 .../gfortran.dg/vect/fast-math-mgrid-resid.f  |  2 +-
 gcc/testsuite/gfortran.dg/vect/vect-8.f90 |  2 +-
 gcc/tree-vect-slp.cc  | 23 +++
 gcc/tree-vect-stmts.cc| 11 +
 22 files changed, 57 insertions(+), 29 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c 
b/gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c
index c3f0f6dc1be..ddaac56cc0b 100644
--- a/gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c
+++ b/gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c
@@ -27,5 +27,5 @@ foo ()
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
vect_strided4 } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
target vect_strided4 } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target vect_strided4 } } } */
   
diff --git a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c 
b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
index ad22f6e82b3..6c9b7c37b6e 100644
--- a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
+++ b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
@@ -101,4 +101,4 @@ main ()
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" { target { 
vect_call_copysignf && vect_call_sqrtf } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { 
target { { vect_call_copysignf && vect_call_sqrtf 

Re: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]

2024-06-06 Thread Richard Biener
On Thu, 6 Jun 2024, Richard Sandiford wrote:

> Pengxuan Zheng  writes:
> > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is 
> > implemented
> > using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
> >
> > PR target/113882
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
> 
> Could we handle this by extending the target-independent code instead?
> Richard mentioned in comment 1 that the current set of intermediate
> conversions is hard-coded, but it didn't sound like he was implying that
> the set shouldn't change.

Yes, much like non-SLP uses supportable_narrowing_operation with any
number of intermediate conversions the SLP case should do something
similar.

Richard.

> Thanks,
> Richard
> 
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/fix_trunc2.c: New test.
> >
> > Signed-off-by: Pengxuan Zheng 
> > ---
> >  gcc/config/aarch64/aarch64-simd.md| 13 +
> >  gcc/testsuite/gcc.target/aarch64/fix_trunc2.c | 14 ++
> >  2 files changed, 27 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
> >
> > diff --git a/gcc/config/aarch64/aarch64-simd.md 
> > b/gcc/config/aarch64/aarch64-simd.md
> > index 868f4486218..096f7b56a27 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -3032,6 +3032,19 @@ (define_expand 
> > "2"
> >"TARGET_SIMD"
> >{})
> >  
> > +
> > +(define_expand "fix_truncv4sfv4hi2"
> > +  [(match_operand:V4HI 0 "register_operand")
> > +   (match_operand:V4SF 1 "register_operand")]
> > +  "TARGET_SIMD"
> > +  {
> > +rtx tmp = gen_reg_rtx (V4SImode);
> > +emit_insn (gen_fix_truncv4sfv4si2 (tmp, operands[1]));
> > +emit_insn (gen_truncv4siv4hi2 (operands[0], tmp));
> > +DONE;
> > +  }
> > +)
> > +
> >  (define_expand "ftrunc2"
> >[(set (match_operand:VHSDF 0 "register_operand")
> > (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")]
> > diff --git a/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c 
> > b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
> > new file mode 100644
> > index 000..57cc00913a3
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
> > @@ -0,0 +1,14 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> > +
> > +void
> > +f (short *__restrict a, float *__restrict b)
> > +{
> > +  a[0] = b[0];
> > +  a[1] = b[1];
> > +  a[2] = b[2];
> > +  a[3] = b[3];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {fcvtzs\tv[0-9]+.4s, v[0-9]+.4s} 1 } 
> > } */
> > +/* { dg-final { scan-assembler-times {xtn\tv[0-9]+.4h, v[0-9]+.4s} 1 } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[jira] [Commented] (MNG-7868) "Could not acquire lock(s)" error in concurrent maven builds

2024-06-06 Thread Richard Eckart de Castilho (Jira)


[ 
https://issues.apache.org/jira/browse/MNG-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852783#comment-17852783
 ] 

Richard Eckart de Castilho commented on MNG-7868:
-

The bnd plugin version used here is 7.0.0.

> "Could not acquire lock(s)" error in concurrent maven builds
> 
>
> Key: MNG-7868
> URL: https://issues.apache.org/jira/browse/MNG-7868
> Project: Maven
>  Issue Type: Bug
> Environment: windows, maven 3.9.4
>Reporter: Jörg Hohwiller
>Priority: Major
> Attachments: image-2024-04-10-15-44-37-013.png, screenshot-1.png
>
>
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:3.1.1:install (default-install) 
> on project foo.bar: Execution default-install of goal 
> org.apache.maven.plugins:maven-install-plugin:3.1.1:install failed: Could not 
> acquire lock(s) -> [Help 1]
> {code}
> I am using maven 3.9.4 on windows:
> {code}
> $ mvn -v
> Apache Maven 3.9.4 (dfbb324ad4a7c8fb0bf182e6d91b0ae20e3d2dd9)
> Maven home: D:\projects\test\software\mvn
> Java version: 17.0.5, vendor: Eclipse Adoptium, runtime: 
> D:\projects\test\software\java
> Default locale: en_US, platform encoding: UTF-8
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> {code}
> I searched for this bug and found issues like MRESOLVER-332 that first look 
> identical or similar but do not really seem to be related so I decided to 
> create this issue.
> For this bug I made the following observations:
> * it only happens with concurrent builds: {{mvn -T ...}}
> * is seems to be windows related (at least mainly happens on windows)
> * it is in-deterministic and is not so easy to create an isolated and simple 
> project and a reproducible scenario that always results in this error. 
> However, I get this very often in my current project with many modules (500+).
> * it is not specific to the maven-install-plugin and also happens from other 
> spots in maven:
> I also got this stacktrace:
> {code}
> Suppressed: java.lang.IllegalStateException: Attempt 1: Could not acquire 
> write lock for 
> 'C:\Users\hohwille\.m2\repository\.locks\artifact~com.caucho~com.springsource.com.caucho~3.2.1.lock'
>  in 30 SECONDS
> at 
> org.eclipse.aether.internal.impl.synccontext.named.NamedLockFactoryAdapter$AdaptedLockSyncContext.acquire
>  (NamedLockFactoryAdapter.java:202)
> at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve 
> (DefaultArtifactResolver.java:271)
> at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts 
> (DefaultArtifactResolver.java:259)
> at 
> org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
> (DefaultRepositorySystem.java:352)
> {code}
> See also this related discussion:
> https://github.com/apache/maven-mvnd/issues/836#issuecomment-1702488377



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (MNG-7868) "Could not acquire lock(s)" error in concurrent maven builds

2024-06-06 Thread Richard Eckart de Castilho (Jira)


[ 
https://issues.apache.org/jira/browse/MNG-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852776#comment-17852776
 ] 

Richard Eckart de Castilho commented on MNG-7868:
-

{noformat}
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
biz.aQute.bnd:bnd-testing-maven-plugin:7.0.0:testing (osgi-test-execution) on 
project module-foo: Could not acquire lock(s)
at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 
(MojoExecutor.java:333)
at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute 
(MojoExecutor.java:316)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:212)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:174)
at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 
(MojoExecutor.java:75)
at org.apache.maven.lifecycle.internal.MojoExecutor$1.run 
(MojoExecutor.java:162)
at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute 
(DefaultMojosExecutionStrategy.java:39)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:159)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:105)
at 
org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call
 (MultiThreadedBuilder.java:193)
at 
org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call
 (MultiThreadedBuilder.java:180)
at java.util.concurrent.FutureTask.run (FutureTask.java:264)
at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539)
at java.util.concurrent.FutureTask.run (FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker 
(ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run 
(ThreadPoolExecutor.java:635)
at java.lang.Thread.run (Thread.java:833)
Caused by: org.apache.maven.plugin.MojoExecutionException: Could not acquire 
lock(s)
at aQute.bnd.maven.testing.plugin.TestingMojo.execute (TestingMojo.java:170)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
(DefaultBuildPluginManager.java:126)
at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 
(MojoExecutor.java:328)
at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute 
(MojoExecutor.java:316)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:212)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:174)
at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 
(MojoExecutor.java:75)
at org.apache.maven.lifecycle.internal.MojoExecutor$1.run 
(MojoExecutor.java:162)
at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute 
(DefaultMojosExecutionStrategy.java:39)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:159)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:105)
at 
org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call
 (MultiThreadedBuilder.java:193)
at 
org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call
 (MultiThreadedBuilder.java:180)
at java.util.concurrent.FutureTask.run (FutureTask.java:264)
at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539)
at java.util.concurrent.FutureTask.run (FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker 
(ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run 
(ThreadPoolExecutor.java:635)
at java.lang.Thread.run (Thread.java:833)
Caused by: java.lang.IllegalStateException: Could not acquire lock(s)
at 
org.eclipse.aether.internal.impl.synccontext.named.NamedLockFactoryAdapter$AdaptedLockSyncContext.acquire
 (NamedLockFactoryAdapter.java:219)
at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve 
(DefaultArtifactResolver.java:276)
at 
org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts 
(DefaultArtifactResolver.java:261)
at 
org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
(DefaultRepositorySystem.java:353)
at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve 
(DefaultProjectDependenciesResolver.java:182)
at aQute.bnd.maven.lib.resolve.DependencyResolver.resolve 
(DependencyResolver.java:226)
at aQute.bnd.maven.lib.resolve.DependencyResolver.getFileSetRepository 
(DependencyResolver.java:268)
at aQute.bnd.maven.lib.resolve.BndrunContainer.getFileSetRepository 
(BndrunContainer.java:260)
at aQute.bnd.maven.lib.resolve.BndrunContainer.getFileSetRepository 
(BndrunContainer.java:244)
at aQute.bnd.maven.lib.resolve.BndrunContainer.injectImplicitRepository

Re: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]

2024-06-06 Thread Richard Sandiford
Pengxuan Zheng  writes:
> This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is 
> implemented
> using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
>
>   PR target/113882
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.

Could we handle this by extending the target-independent code instead?
Richard mentioned in comment 1 that the current set of intermediate
conversions is hard-coded, but it didn't sound like he was implying that
the set shouldn't change.

Thanks,
Richard

> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/fix_trunc2.c: New test.
>
> Signed-off-by: Pengxuan Zheng 
> ---
>  gcc/config/aarch64/aarch64-simd.md| 13 +
>  gcc/testsuite/gcc.target/aarch64/fix_trunc2.c | 14 ++
>  2 files changed, 27 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 868f4486218..096f7b56a27 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3032,6 +3032,19 @@ (define_expand 
> "2"
>"TARGET_SIMD"
>{})
>  
> +
> +(define_expand "fix_truncv4sfv4hi2"
> +  [(match_operand:V4HI 0 "register_operand")
> +   (match_operand:V4SF 1 "register_operand")]
> +  "TARGET_SIMD"
> +  {
> +rtx tmp = gen_reg_rtx (V4SImode);
> +emit_insn (gen_fix_truncv4sfv4si2 (tmp, operands[1]));
> +emit_insn (gen_truncv4siv4hi2 (operands[0], tmp));
> +DONE;
> +  }
> +)
> +
>  (define_expand "ftrunc2"
>[(set (match_operand:VHSDF 0 "register_operand")
>   (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")]
> diff --git a/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c 
> b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
> new file mode 100644
> index 000..57cc00913a3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +void
> +f (short *__restrict a, float *__restrict b)
> +{
> +  a[0] = b[0];
> +  a[1] = b[1];
> +  a[2] = b[2];
> +  a[3] = b[3];
> +}
> +
> +/* { dg-final { scan-assembler-times {fcvtzs\tv[0-9]+.4s, v[0-9]+.4s} 1 } } 
> */
> +/* { dg-final { scan-assembler-times {xtn\tv[0-9]+.4h, v[0-9]+.4s} 1 } } */


Re: [PATCH] aarch64: Add missing ACLE macro for NEON-SVE Bridge

2024-06-06 Thread Richard Sandiford
Richard Ball  writes:
> __ARM_NEON_SVE_BRIDGE was missed in the original patch and is
> added by this patch.
>
> Ok for trunk and a backport into gcc-14?
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
>   Add missing __ARM_NEON_SVE_BRIDGE.

After this patch was posted, there was some internal discussion
involving LLVM & GNU devs about what this kind of macro means, now that
we have FMV.  The feeling was that __ARM_NEON_SVE_BRIDGE should just
indicate whether the compiler provides the file, not whether AdvSIMD
& SVE are enabled.  I think we should therefore add this to
aarch64_define_unconditional_macros instead.

Sorry for the slow review.  I was waiting for the outcome of that
discussion before replying.

Thanks,
Richard

> diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
> index 
> fe1a20e4e546a68e5f7eddff3bbb0d3e831fbd9b..1121be118cf8d05e3736ad4ee75568ff7cb92bfd
>  100644
> --- a/gcc/config/aarch64/aarch64-c.cc
> +++ b/gcc/config/aarch64/aarch64-c.cc
> @@ -260,6 +260,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
>aarch64_def_or_undef (TARGET_SME_I16I64, "__ARM_FEATURE_SME_I16I64", 
> pfile);
>aarch64_def_or_undef (TARGET_SME_F64F64, "__ARM_FEATURE_SME_F64F64", 
> pfile);
>aarch64_def_or_undef (TARGET_SME2, "__ARM_FEATURE_SME2", pfile);
> +  aarch64_def_or_undef (TARGET_SVE, "__ARM_NEON_SVE_BRIDGE", pfile);
>  
>/* Not for ACLE, but required to keep "float.h" correct if we switch
>   target between implementations that do or do not support ARMv8.2-A


Re: Heading Level Access In Safari Browser

2024-06-06 Thread Richard Turner
I thought you were using a bluetooth keyboard. Those are keyboard commands. Sorry. Richard, USA“Grandma always told us, “Be careful when you pray for patience. God stores it on the other side of Hell and you will have to go through Hell to get it.”-- Cedrick Bridgeforth My web site: https://www.turner42.com/ On Jun 5, 2024, at 11:22 PM, Ron Canazzi  wrote:
  

  
  
Hi Richard,

This is on the iPhone. I don't understand this arrows and shift key
business.


On 5/31/2024 12:57 PM, Richard Turner
  wrote:

When
  on a web site or in your html file, turn on quicknav if it isn't
  on using left+right arrows together.
  Then, press VO+q to turn on single letter quicknav.  I have
the VO command as control+Options so control+Options+q.
  Then, you can use the singel numbers or h for the next
heading, shift plus h for previous or even shift+1 for previous
heading level 1, etc.
  HTH,

  
  
  
Richard, USA
“Grandma always told us, “Be
careful when you pray for patience. God stores it on the
other side of Hell and you will have to go through Hell
to get it.”
-- Cedrick Bridgeforth
 
My web site: https://www.turner42.com/
 
  
  
  


  On May 31, 2024, at 9:18 AM, Mario
Eiland  wrote:

  


  Use the rotor while in the Safari app
  and look for headings. Once you hear headings then flick
  down with one finger and that should take you from heading
  to heading. To go up flick up.
If you can't find the heading option in the rotor then
  you must add it in the VoiceOver rotor settings.

Good luck!

-Original Message-
From: viphone@googlegroups.com
   On Behalf Of Ron Canazzi
Sent: Friday, May 31, 2024 8:42 AM
To: ViPhone List 
Subject: Heading Level Access In Safari Browser

Hi Group,

I finally was able to change some settings in Safari
  Browser on the iPhone to get it to display HTML files that
  are stored locally on the iPhone.  I  created my modified
  Dice Football Game Play Sheet by using headings to more
  quickly navigate from play list to play list.  I have the
  lists separated into running plays, kicking plays, passing
  plays and conversions at level one and the various list of
  items such as short pass, long pass and screen pass for
  the passing plays and the running plays such as left end
  run, right tackle play and reverse at heading level two.

Is there any way to navigate by heading levels using a
  quick number scheme such as is done on Windows desktops
  with quick key number navigation such as number one for
  heading level one and number two for heading level two on
  the iPhone?

Thanks for any help.

--
Signature:
For a nation to admit it has done grievous wrongs and
  will strive to correct them for the betterment of all is
  no vice; For a nation to claim it has always been great,
  needs no improvement  and to cling to its past
  achievements is no virtue!

--
The following information is important for all members
  of the V iPhone list.

If you have any questions or concerns about the
  running of this list, or if you feel that a member's post
  is inappropriate, please contact the owners or moderators
  directly rather than posting on the list itself.

Your V iPhone list moderator is Mark Taylor.  Mark can
  be reached at:  mk...@ucla.edu.  Your list owner is Cara
  Quinn - you can reach Cara at caraqu...@caraquinn.com

The archives for this list can be searched at:
http://www.mail-archive.com/viphone@googlegroups.com/
---
You received this message because you are subscribed
  to the Google Groups "VIPhone" group.
To unsubscribe from this group and stop receiving
  emails from it, send an email to
  viphone+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/viphone/7a3d2c9c-6deb-8621-6d2a-105199764add%40roadrunner.com.

-- 
The following information is importa

Re: [PATCH]AArch64: correct constraint on Upl early clobber alternatives

2024-06-06 Thread Richard Sandiford
ine_insn "@aarch64_pred_cmp_wide"
> UNSPEC_PRED_Z))
> (clobber (reg:CC_NZC CC_REGNUM))]
>"TARGET_SVE"
> -  {@ [ cons: =0, 1, 2, 3, 4; attrs: pred_clobber ]
> - [ ,  Upl ,  , w, w; yes ] 
> cmp\t%0., %1/z, %3., %4.d
> - [ ?Upa,  0Upl,  , w, w; yes ] ^
> - [ Upa ,  Upl ,  , w, w; no  ] ^
> +  {@ [ cons: =0, 1   , 2, 3, 4; attrs: pred_clobber ]
> + [ ,  Upl,  , w, w; yes ] 
> cmp\t%0., %1/z, %3., %4.d
> + [ ?Upl,  0  ,  , w, w; yes ] ^
> + [ Upa ,  Upl,  , w, w; no  ] ^
>}
>  )
>  
> @@ -8298,10 +8298,10 @@ (define_insn "*aarch64_pred_cmp_wide_cc"
> UNSPEC_PRED_Z))]
>"TARGET_SVE
> && aarch64_sve_same_pred_for_ptest_p ([4], [6])"
> -  {@ [ cons: =0, 1, 2, 3, 6  ; attrs: pred_clobber ]
> - [ ,  Upl , w, w, Upl; yes ] 
> cmp\t%0., %1/z, %2., %3.d
> - [ ?Upa,  0Upl, w, w, Upl; yes ] ^
> - [ Upa ,  Upl , w, w, Upl; no  ] ^
> +  {@ [ cons: =0, 1   , 2, 3, 6  ; attrs: pred_clobber ]
> + [ ,  Upl, w, w, Upl; yes ] 
> cmp\t%0., %1/z, %2., %3.d
> + [ ?Upl,  0  , w, w, Upl; yes ] ^
> + [ Upa ,  Upl, w, w, Upl; no  ] ^
>}
>  )

...these patterns are a bit of an unusual case, and were already
playing in the margins of correctness before the pred_clobber changes,
in that operand 6 is "close enough" to operand 1 that we can ignore it.
Using "Upl" for operand 6 (as in the patch) feels a bit safer than using
"0" for both operands.

Richard

>  
> @@ -8325,10 +8325,10 @@ (define_insn 
> "*aarch64_pred_cmp_wide_ptest"
> (clobber (match_scratch: 0))]
>"TARGET_SVE
> && aarch64_sve_same_pred_for_ptest_p ([4], [6])"
> -  {@ [ cons:  =0, 1, 2, 3, 6  ; attrs: pred_clobber ]
> - [  ,  Upl , w, w, Upl; yes ] 
> cmp\t%0., %1/z, %2., %3.d
> - [ ?Upa ,  0Upl, w, w, Upl; yes ] ^
> - [ Upa  ,  Upl , w, w, Upl; no  ] ^
> +  {@ [ cons:  =0, 1   , 2, 3, 6  ; attrs: pred_clobber ]
> + [  ,  Upl, w, w, Upl; yes ] 
> cmp\t%0., %1/z, %2., %3.d
> + [ ?Upl ,  0  , w, w, Upl; yes ] ^
> + [ Upa  ,  Upl, w, w, Upl; no  ] ^
>}
>  )
>  
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index 
> eaba9d8f25fac704c9c66e444c6249470bef3ccd..972b03a4fef0b0bd4d50edf392bcfcb9acde551e
>  100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -3351,7 +3351,7 @@ (define_insn "@aarch64_pred_"
>"TARGET_SVE2 && TARGET_NON_STREAMING"
>{@ [ cons: =0, 1  , 3, 4; attrs: pred_clobber ]
>   [ , Upl, w, w; yes ] \t%0., 
> %1/z, %3., %4.
> - [ ?Upa, 0  , w, w; yes ] ^
> + [ ?Upl, 0  , w, w; yes ] ^
>   [ Upa , Upl, w, w; no  ] ^
>}
>  )


Re: arm: Add .type and .size to __gnu_cmse_nonsecure_call [PR115360]

2024-06-06 Thread Richard Earnshaw (lists)
On 05/06/2024 17:07, Andre Vieira (lists) wrote:
> Hi,
> 
> This patch adds missing assembly directives to the CMSE library wrapper to 
> call functions with attribute cmse_nonsecure_call.  Without the .type 
> directive the linker will fail to produce the correct veneer if a call to 
> this wrapper function is to far from the wrapper itself.  The .size was added 
> for completeness, though we don't necessarily have a usecase for it.
> 
> I did not add a testcase as I couldn't get dejagnu to disassemble the linked 
> binary to check we used an appropriate branch instruction, I did however test 
> it locally and with this change the GNU linker now generates an appropriate 
> veneer and call to that veneer when __gnu_cmse_nonsecure_call is too far.
> 
> OK for trunk and backport to any release branches still in support (after 
> waiting a week or so)?
> 
> libgcc/ChangeLog:
> 
> PR target/115360
> * config/arm/cmse_nonsecure_call.S: Add .type and .size directives.

OK.

R.


[jira] [Commented] (MNG-7868) "Could not acquire lock(s)" error in concurrent maven builds

2024-06-06 Thread Richard Eckart de Castilho (Jira)


[ 
https://issues.apache.org/jira/browse/MNG-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852754#comment-17852754
 ] 

Richard Eckart de Castilho commented on MNG-7868:
-

[~cstamas] can you read anything useful from the information I have provided?

> "Could not acquire lock(s)" error in concurrent maven builds
> 
>
> Key: MNG-7868
> URL: https://issues.apache.org/jira/browse/MNG-7868
> Project: Maven
>  Issue Type: Bug
> Environment: windows, maven 3.9.4
>Reporter: Jörg Hohwiller
>Priority: Major
> Attachments: image-2024-04-10-15-44-37-013.png, screenshot-1.png
>
>
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:3.1.1:install (default-install) 
> on project foo.bar: Execution default-install of goal 
> org.apache.maven.plugins:maven-install-plugin:3.1.1:install failed: Could not 
> acquire lock(s) -> [Help 1]
> {code}
> I am using maven 3.9.4 on windows:
> {code}
> $ mvn -v
> Apache Maven 3.9.4 (dfbb324ad4a7c8fb0bf182e6d91b0ae20e3d2dd9)
> Maven home: D:\projects\test\software\mvn
> Java version: 17.0.5, vendor: Eclipse Adoptium, runtime: 
> D:\projects\test\software\java
> Default locale: en_US, platform encoding: UTF-8
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> {code}
> I searched for this bug and found issues like MRESOLVER-332 that first look 
> identical or similar but do not really seem to be related so I decided to 
> create this issue.
> For this bug I made the following observations:
> * it only happens with concurrent builds: {{mvn -T ...}}
> * is seems to be windows related (at least mainly happens on windows)
> * it is in-deterministic and is not so easy to create an isolated and simple 
> project and a reproducible scenario that always results in this error. 
> However, I get this very often in my current project with many modules (500+).
> * it is not specific to the maven-install-plugin and also happens from other 
> spots in maven:
> I also got this stacktrace:
> {code}
> Suppressed: java.lang.IllegalStateException: Attempt 1: Could not acquire 
> write lock for 
> 'C:\Users\hohwille\.m2\repository\.locks\artifact~com.caucho~com.springsource.com.caucho~3.2.1.lock'
>  in 30 SECONDS
> at 
> org.eclipse.aether.internal.impl.synccontext.named.NamedLockFactoryAdapter$AdaptedLockSyncContext.acquire
>  (NamedLockFactoryAdapter.java:202)
> at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve 
> (DefaultArtifactResolver.java:271)
> at 
> org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts 
> (DefaultArtifactResolver.java:259)
> at 
> org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies 
> (DefaultRepositorySystem.java:352)
> {code}
> See also this related discussion:
> https://github.com/apache/maven-mvnd/issues/836#issuecomment-1702488377



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PATCH v2] Vect: Support IFN SAT_SUB for unsigned vector int

2024-06-06 Thread Richard Biener
On Thu, Jun 6, 2024 at 8:26 AM  wrote:
>
> From: Pan Li 
>
> This patch would like to support the .SAT_SUB for the unsigned
> vector int.  Given we have below example code:
>
> void
> vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
> {
>   for (unsigned i = 0; i < n; i++)
> out[i] = (x[i] - y[i]) & (-(uint64_t)(x[i] >= y[i]));
> }
>
> Before this patch:
> void
> vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
> {
>   ...
>   _77 = .SELECT_VL (ivtmp_75, POLY_INT_CST [2, 2]);
>   ivtmp_56 = _77 * 8;
>   vect__4.7_59 = .MASK_LEN_LOAD (vectp_x.5_57, 64B, { -1, ... }, _77, 0);
>   vect__6.10_63 = .MASK_LEN_LOAD (vectp_y.8_61, 64B, { -1, ... }, _77, 0);
>
>   mask__7.11_64 = vect__4.7_59 >= vect__6.10_63;
>   _66 = .COND_SUB (mask__7.11_64, vect__4.7_59, vect__6.10_63, { 0, ... });
>
>   .MASK_LEN_STORE (vectp_out.15_71, 64B, { -1, ... }, _77, 0, _66);
>   vectp_x.5_58 = vectp_x.5_57 + ivtmp_56;
>   vectp_y.8_62 = vectp_y.8_61 + ivtmp_56;
>   vectp_out.15_72 = vectp_out.15_71 + ivtmp_56;
>   ivtmp_76 = ivtmp_75 - _77;
>   ...
> }
>
> After this patch:
> void
> vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
> {
>   ...
>   _76 = .SELECT_VL (ivtmp_74, POLY_INT_CST [2, 2]);
>   ivtmp_60 = _76 * 8;
>   vect__4.7_63 = .MASK_LEN_LOAD (vectp_x.5_61, 64B, { -1, ... }, _76, 0);
>   vect__6.10_67 = .MASK_LEN_LOAD (vectp_y.8_65, 64B, { -1, ... }, _76, 0);
>
>   vect_patt_37.11_68 = .SAT_SUB (vect__4.7_63, vect__6.10_67);
>
>   .MASK_LEN_STORE (vectp_out.12_70, 64B, { -1, ... }, _76, 0, 
> vect_patt_37.11_68);
>   vectp_x.5_62 = vectp_x.5_61 + ivtmp_60;
>   vectp_y.8_66 = vectp_y.8_65 + ivtmp_60;
>   vectp_out.12_71 = vectp_out.12_70 + ivtmp_60;
>   ivtmp_75 = ivtmp_74 - _76;
>   ...
> }
>
> The below test suites are passed for this patch
> * The x86 bootstrap test.
> * The x86 fully regression test.
> * The riscv fully regression tests.

OK.

Richard.

> gcc/ChangeLog:
>
> * match.pd: Add new form for vector mode recog.
> * tree-vect-patterns.cc (gimple_unsigned_integer_sat_sub): Add
> new match func decl;
> (vect_recog_build_binary_gimple_call): Extract helper func to
> build gcall with given internal_fn.
> (vect_recog_sat_sub_pattern): Add new func impl to recog .SAT_SUB.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd  | 14 +++
>  gcc/tree-vect-patterns.cc | 85 ---
>  2 files changed, 84 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7c1ad428a3c..ebc60eba8dc 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3110,6 +3110,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>&& types_match (type, @0, @1
>
> +/* Unsigned saturation sub, case 3 (branchless with gt):
> +   SAT_U_SUB = (X - Y) * (X > Y).  */
> +(match (unsigned_integer_sat_sub @0 @1)
> + (mult:c (minus @0 @1) (convert (gt @0 @1)))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +  && types_match (type, @0, @1
> +
> +/* Unsigned saturation sub, case 4 (branchless with ge):
> +   SAT_U_SUB = (X - Y) * (X >= Y).  */
> +(match (unsigned_integer_sat_sub @0 @1)
> + (mult:c (minus @0 @1) (convert (ge @0 @1)))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +  && types_match (type, @0, @1
> +
>  /* x >  y  &&  x != XXX_MIN  -->  x > y
> x >  y  &&  x == XXX_MIN  -->  false . */
>  (for eqne (eq ne)
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 81e8fdc9122..cef901808eb 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -4488,6 +4488,32 @@ vect_recog_mult_pattern (vec_info *vinfo,
>  }
>
>  extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
> +extern bool gimple_unsigned_integer_sat_sub (tree, tree*, tree (*)(tree));
> +
> +static gcall *
> +vect_recog_build_binary_gimple_call (vec_info *vinfo, gimple *stmt,
> +internal_fn fn, tree *type_out,
> +tree op_0, tree op_1)
> +{
> +  tree itype = TREE_TYPE (op_0);
> +  tree vtype = get_vectype_for_scalar_type (vinfo, itype);
> +
> +  if (vtype != NULL_TREE
> +&& direct_internal_fn_supported_p (fn, vtype, OPTIMIZE_FOR_BOTH))
> +{
> +  gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> +
> +  gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL));
> 

Re: [PATCH v4] Match: Support more form for scalar unsigned SAT_ADD

2024-06-06 Thread Richard Biener
On Thu, Jun 6, 2024 at 3:19 AM Li, Pan2  wrote:
>
> Hi Richard,
>
> After revisited all the comments of the mail thread, I would like to confirm 
> if my understanding is correct according to the generated match code.
> For now the generated code looks like below:
>
> else if (gphi *_a1 = dyn_cast  (_d1))
>   {
> basic_block _b1 = gimple_bb (_a1);
> if (gimple_phi_num_args (_a1) == 2)
>   {
> basic_block _pb_0_1 = EDGE_PRED (_b1, 0)->src;
> basic_block _pb_1_1 = EDGE_PRED (_b1, 1)->src;
> basic_block _db_1 = safe_dyn_cast  (*gsi_last_bb (_pb_0_1)) 
> ? _pb_0_1 : _pb_1_1;
> basic_block _other_db_1 = safe_dyn_cast  (*gsi_last_bb 
> (_pb_0_1)) ? _pb_1_1 : _pb_0_1;
> gcond *_ct_1 = safe_dyn_cast  (*gsi_last_bb (_db_1));
> if (_ct_1 && EDGE_COUNT (_other_db_1->preds) == 1
>   && EDGE_COUNT (_other_db_1->succs) == 1
>   && EDGE_PRED (_other_db_1, 0)->src == _db_1)
>   {
> tree _cond_lhs_1 = gimple_cond_lhs (_ct_1);
> tree _cond_rhs_1 = gimple_cond_rhs (_ct_1);
> tree _p0 = build2 (gimple_cond_code (_ct_1), boolean_type_node, 
> _cond_lhs_1, _cond_rhs_1);
> bool _arg_0_is_true_1 = gimple_phi_arg_edge (_a1, 0)->flags & 
> EDGE_TRUE_VALUE;
> tree _p1 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 0 : 1);
> tree _p2 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 1 : 0);
> 
>
> The flow may look like below, or can only handling flow like below.
>
> +--+
> | cond |---+
> +--+   v
>|+---+
>|| other |
>|+---+
>v   |
> +-+|
> | PHI | <--+
> +-+
>
> Thus, I think it cannot handle the below 2 PHI flows (or even more 
> complicated shapes)
>
> +--+
> | cond |---+
> +--+   |
>|   |
>v   |
> +--+   |
> | mid  |   v
> +--++---+
>|| other |
>|+---+
>v   |
> +-+|
> | PHI | <--+
> +-+
>
> +--+
> | cond |---+
> +--+   |
>|   v
>|+---+
>|| mid-0 |+
>|+---+|
>|   | v
>|   |   +---+
>|   |   | mid-1 |
>|   v   +---+
>|+---+|
>|| other |<---+
>|+---+
>v   |
> +-+|
> | PHI | <--+
> +-+

Correct.

> So I am not very sure if we need (or reasonable) to take care of all the PHI 
> gimple flows (may impossible ?) Or keep the simplest one for now and add more 
> case by case.
> Thanks a lot.

I'd only keep the simplest one for now.  More complex cases can be
handled easily
with using dominators but those might not always be available or up-to-date when
doing match queries.  So let's revisit when we run into a case where
the simple form
isn't enough.

Richard.

>
> Pan
>
> -Original Message-
> From: Li, Pan2
> Sent: Wednesday, June 5, 2024 9:44 PM
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> tamar.christ...@arm.com
> Subject: RE: [PATCH v4] Match: Support more form for scalar unsigned SAT_ADD
>
> Thanks Richard for comments, will address the comments in v7, and looks like 
> I also need to resolve conflict up to a point.
>
> Pan
>
> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, June 5, 2024 4:50 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> tamar.christ...@arm.com
> Subject: Re: [PATCH v4] Match: Support more form for scalar unsigned SAT_ADD
>
> On Thu, May 30, 2024 at 3:37 PM  wrote:
> >
> > From: Pan Li 
> >
> > After we support one gassign form of the unsigned .SAT_ADD,  we
> > would like to support more forms including both the branch and
> > branchless.  There are 5 other forms of .SAT_ADD,  list as below:
> >
> > Form 1:
> >   #define SAT_ADD_U_1(T) \
> >   T sat_add_u_1_##T(T x, T y) \
> >   { \
> > return (T)(x + y) >= x ? (x + y) : -1; \
> >   }
> >
> > Form 2:
> >   #define SAT_ADD_U_2(T) \
> >   T sat_add_u_2_##T(T x, T y) \
> >   { \
> > T ret; \
> > T overflow = __builtin_add_overflow (x, y, ); \
> > return (T)(-overflow) | ret; \
> >   }
> >
&g

Re: [PATCH v2] aarch64: Add vector floating point extend pattern [PR113880, PR113869]

2024-06-06 Thread Richard Sandiford
Pengxuan Zheng  writes:
> This patch adds vector floating point extend pattern for V2SF->V2DF and
> V4HF->V4SF conversions by renaming the existing 
> aarch64_float_extend_lo_
> pattern to the standard optab one, i.e., extend2. This allows the
> vectorizer to vectorize certain floating point widening operations for the
> aarch64 target.
>
>   PR target/113880
>   PR target/113869
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-builtins.cc (VAR1): Remap float_extend_lo_
>   builtin codes to standard optab ones.
>   * config/aarch64/aarch64-simd.md (aarch64_float_extend_lo_): 
> Rename
>   to...
>   (extend2): ... This.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/extend-vec.c: New test.

OK, thanks, and sorry for the slow review.

Richard

> Signed-off-by: Pengxuan Zheng 
> ---
>  gcc/config/aarch64/aarch64-builtins.cc|  9 
>  gcc/config/aarch64/aarch64-simd.md|  2 +-
>  gcc/testsuite/gcc.target/aarch64/extend-vec.c | 21 +++
>  3 files changed, 31 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/extend-vec.c
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> b/gcc/config/aarch64/aarch64-builtins.cc
> index f8eeccb554d..25189888d17 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -534,6 +534,15 @@ BUILTIN_VDQ_BHSI (urhadd, uavg, _ceil, 0)
>  BUILTIN_VDQ_BHSI (shadd, avg, _floor, 0)
>  BUILTIN_VDQ_BHSI (uhadd, uavg, _floor, 0)
>  
> +/* The builtins below should be expanded through the standard optabs
> +   CODE_FOR_extend2. */
> +#undef VAR1
> +#define VAR1(F,T,N,M) \
> +  constexpr insn_code CODE_FOR_aarch64_##F##M = CODE_FOR_##T##N##M##2;
> +
> +VAR1 (float_extend_lo_, extend, v2sf, v2df)
> +VAR1 (float_extend_lo_, extend, v4hf, v4sf)
> +
>  #undef VAR1
>  #define VAR1(T, N, MAP, FLAG, A) \
>{#N #A, UP (A), CF##MAP (N, A), 0, TYPES_##T, FLAG_##FLAG},
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 868f4486218..c5e2c9f00d0 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3132,7 +3132,7 @@
>  DONE;
>}
>  )
> -(define_insn "aarch64_float_extend_lo_"
> +(define_insn "extend2"
>[(set (match_operand: 0 "register_operand" "=w")
>   (float_extend:
> (match_operand:VDF 1 "register_operand" "w")))]
> diff --git a/gcc/testsuite/gcc.target/aarch64/extend-vec.c 
> b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
> new file mode 100644
> index 000..f6241d5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.2d, v[0-9]+.2s} 1 } } */
> +void
> +f (float *__restrict a, double *__restrict b)
> +{
> +  b[0] = a[0];
> +  b[1] = a[1];
> +}
> +
> +/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.4s, v[0-9]+.4h} 1 } } */
> +void
> +f1 (_Float16 *__restrict a, float *__restrict b)
> +{
> +
> +  b[0] = a[0];
> +  b[1] = a[1];
> +  b[2] = a[2];
> +  b[3] = a[3];
> +}


Re: [PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-06-06 Thread Richard Sandiford
YunQiang Su  writes:
> YunQiang Su  于2024年5月29日周三 10:02写道:
>>
>> Richard Sandiford  于2024年5月29日周三 05:28写道:
>> >
>> > YunQiang Su  writes:
>> > > If `find_a_program` cannot find `as/ld/objcopy` and we are a cross 
>> > > toolchain,
>> > > the final fallback is `as/ld` of system.  In fact, we can have a try with
>> > > -as/ld/objcopy before fallback to native as/ld/objcopy.
>> > >
>> > > This patch is derivatived from Debian's patch:
>> > >   gcc-search-prefixed-as-ld.diff
>> >
>> > I'm probably making you repeat a previous discussion, sorry, but could
>> > you describe the use case in more detail?  The current approach to
>> > handling cross toolchains has been used for many years.  Presumably
>> > this patch is supporting a different way of organising things,
>> > but I wasn't sure from the description what it was.
>> >
>> > AIUI, we currently assume that cross as, ld and objcopy will be
>> > installed under those names in $prefix/$target_alias/bin (aka 
>> > $tooldir/bin).
>> > E.g.:
>> >
>> >bin/aarch64-elf-as = aarch64-elf/bin/as
>> >
>> > GCC should then find as in aarch64-elf/bin.
>> >
>> > Is that not true in your case?
>> >
>>
>> Yes. This patch is only about the final fallback. I mean aarch64-elf/bin/as
>> still has higher priority than bin/aarch64-elf-as.
>>
>> In the current code, we find gas with:
>> /prefix/aarch64-elf/bin/as > $PATH/as
>>
>> And this patch a new one between them:
>> /prefix/aarch64-elf/bin/as > $PATH/aarch64-elf-as > $PATH/as
>>
>> > To be clear, I'm not saying the patch is wrong.  I'm just trying to
>> > understand why the patch is needed.
>> >
>>
>> Yes. If gcc is configured correctly, it is not so useful.
>> In some case for some lazy user, it may be useful,
>> for example, the binutils installed into different prefix with libc etc.
>>
>> For example, binutils is installed into /usr/aarch64-elf/bin, while
>> libc is installed into /usr/local/aarch64-elf/.
>>
>
> Any idea about it? Is it a use case making sense?

Yeah, I think it makes sense.  GCC and binutils are separate packages.
Users could cherry-pick a GCC installation and a separate binutils
installation rather than bundling them together into a single
toolchain.  And not everyone will have permission to change $tooldir.

So I agree we should support searching the user's path for an
as/ld/etc. based on the tool prefix.  Unfortunately, I don't think
I understand the code & constraints well enough to do a review.

In particular, it seems unfortunate that we need to do a trial
subcommand invocation before committing to the prefixed name.
And, if we continue to search for "as" in the user's path as a fallback,
it's not 100% obvious that "${triple}-as" later in the path should trump
"as" earlier in the path.

In some ways, it seems more consistent to do the replacement without
first doing a trial invocation.  But I don't know whether that would
break existing use cases.  (To be clear, I wouldn't feel comfortable
approving a patch to do that without buy-in from other maintainers.)

Thanks,
Richard


Re: [OE-core] [PATCH] gcc: Fix wrong order of gcc include paths on musl systems

2024-06-06 Thread Richard Purdie
On Thu, 2024-06-06 at 00:10 -0700, Khem Raj via lists.openembedded.org wrote:
> musl does not use gcc private system headers, however, the path gets
> prepended since gcc driver passes -iprefix option to cc1 based on its
> installation location. This starts to prefer these headers instead of
> musl provided equivalent system headers which is not as per musl's
> design. This patch switches prepend to append for musl systems.
> 
> Signed-off-by: Khem Raj 
> ---
>  meta/recipes-devtools/gcc/gcc-14.1.inc    |  1 +
>  ...te-include-paths-on-musl-instead-of-.patch | 35 +++
>  2 files changed, 36 insertions(+)
>  create mode 100644 
> meta/recipes-devtools/gcc/gcc/0026-Append-GCC-private-include-paths-on-musl-instead-of-.patch
> 
> diff --git a/meta/recipes-devtools/gcc/gcc-14.1.inc 
> b/meta/recipes-devtools/gcc/gcc-14.1.inc
> index b057e570f3b..c4bc4c72664 100644
> --- a/meta/recipes-devtools/gcc/gcc-14.1.inc
> +++ b/meta/recipes-devtools/gcc/gcc-14.1.inc
> @@ -68,6 +68,7 @@ SRC_URI = "${BASEURI} \
>     file://0023-Fix-install-path-of-linux64.h.patch \
>     file://0024-Avoid-hardcoded-build-paths-into-ppc-libgcc.patch \
>     file://0025-gcc-testsuite-tweaks-for-mips-OE.patch \
> +   
> file://0026-Append-GCC-private-include-paths-on-musl-instead-of-.patch \
>  "
>  
>  S = "${TMPDIR}/work-shared/gcc-${PV}-${PR}/${SOURCEDIR}"
> diff --git 
> a/meta/recipes-devtools/gcc/gcc/0026-Append-GCC-private-include-paths-on-musl-instead-of-.patch
>  
> b/meta/recipes-devtools/gcc/gcc/0026-Append-GCC-private-include-paths-on-musl-instead-of-.patch
> new file mode 100644
> index 000..1bcff39aa7c
> --- /dev/null
> +++ 
> b/meta/recipes-devtools/gcc/gcc/0026-Append-GCC-private-include-paths-on-musl-instead-of-.patch
> @@ -0,0 +1,35 @@
> +From 30f1229a8b663ee4dc35d389acf60241a4536fb8 Mon Sep 17 00:00:00 2001
> +From: Khem Raj 
> +Date: Wed, 5 Jun 2024 22:56:12 -0700
> +Subject: [PATCH] Append GCC private include paths on musl instead of
> + prepending
> +
> +Musl does not need gcc private compiler headers, therefore use them
> +after standard system header search paths.
> +
> +This fixes packages like python builds to detect the musl systems
> +correclty, as it looks for musl specific stuff in stdarg.h system
> +header, which is wrongly picked from gcc private headers in OE
> +
> +Upstream-Status: Submitted 
> [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115368]
> +Signed-off-by: Khem Raj 
> +---
> + gcc/gcc.cc | 4 
> + 1 file changed, 4 insertions(+)
> +
> +diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> +index 01968001c44..d0d5c35cf83 100644
> +--- a/gcc/gcc.cc
>  b/gcc/gcc.cc
> +@@ -6589,7 +6589,11 @@ do_spec_1 (const char *spec, int inswitch, const char 
> *soft_matched_part)
> + 
> +   if (gcc_exec_prefix)
> + {
> ++#if DEFAULT_LIBC == LIBC_MUSL
> ++  do_spec_1 ("-idirafter", 1, NULL);
> ++#else
> +   do_spec_1 ("-iprefix", 1, NULL);
> ++#endif
> +   /* Make this a separate argument.  */
> +   do_spec_1 (" ", 0, NULL);
> +   do_spec_1 (gcc_exec_prefix, 1, NULL);
> 

The trouble is we build one cross compiler for both glibc and musl so
this will fix musl and break glibc. It would need to change this
depending upon the target at runtime. I suspect upstream gcc will say
the same thing :/

Cheers,

Richard


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#200387): 
https://lists.openembedded.org/g/openembedded-core/message/200387
Mute This Topic: https://lists.openembedded.org/mt/106518958/21656
Group Owner: openembedded-core+ow...@lists.openembedded.org
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-06 Thread Richard Sandiford
Hi,

Just some comments on the fuseable_load_p part, since that's what
we were discussing last time.

It looks like this now relies on:

Ajit Agarwal  writes:
> +  /* We use DF data flow because we change location rtx
> +  which is easier to find and modify.
> +  We use mix of rtl-ssa def-use and DF data flow
> +  where it is easier.  */
> +  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> +  df_analyze ();
> +  df_set_flags (DF_DEFER_INSN_RESCAN);

But please don't do this!  For one thing, building DU/UD chains
as well as rtl-ssa is really expensive in terms of compile time.
But more importantly, modifications need to happen via rtl-ssa
to ensure that the IL is kept up-to-date.  If we don't do that,
later fuse attempts will be based on stale data and so could
generate incorrect code.

> +// Check whether load can be fusable or not.
> +// Return true if fuseable otherwise false.
> +bool
> +rs6000_pair_fusion::fuseable_load_p (insn_info *info)
> +{
> +  for (auto def : info->defs())
> +{
> +  auto set = dyn_cast (def);
> +  for (auto use1 : set->nondebug_insn_uses ())
> + use1->set_is_live_out_use (true);
> +}

What was the reason for adding this loop?

> +
> +  rtx_insn *rtl_insn = info ->rtl ();
> +  rtx body = PATTERN (rtl_insn);
> +  rtx dest_exp = SET_DEST (body);
> +
> +  if (REG_P (dest_exp) &&
> +  (DF_REG_DEF_COUNT (REGNO (dest_exp)) > 1

The rtl-ssa way of checking this is:

  crtl->ssa->is_single_dominating_def (...)

> +   || DF_REG_EQ_USE_COUNT (REGNO (dest_exp)) > 0))
> +return  false;

Why are uses in notes a problem?  In the worst case, we should just be
able to remove the note instead.

> +
> +  rtx addr = XEXP (SET_SRC (body), 0);
> +
> +  if (GET_CODE (addr) == PLUS
> +  && XEXP (addr, 1) && CONST_INT_P (XEXP (addr, 1)))
> +{
> +  if (INTVAL (XEXP (addr, 1)) == -16)
> + return false;
> +  }

What's special about -16?

> +
> +  df_ref use;
> +  df_insn_info *insn_info = DF_INSN_INFO_GET (info->rtl ());
> +  FOR_EACH_INSN_INFO_DEF (use, insn_info)
> +{
> +  struct df_link *def_link = DF_REF_CHAIN (use);
> +
> +  if (!def_link || !def_link->ref
> +   || DF_REF_IS_ARTIFICIAL (def_link->ref))
> + continue;
> +
> +  while (def_link && def_link->ref)
> + {
> +   rtx_insn *insn = DF_REF_INSN (def_link->ref);
> +   if (GET_CODE (PATTERN (insn)) == PARALLEL)
> + return false;

Why do you need to skip PARALLELs?

> +
> +   rtx set = single_set (insn);
> +   if (set == NULL_RTX)
> + return false;
> +
> +   rtx op0 = SET_SRC (set);
> +   rtx_code code = GET_CODE (op0);
> +
> +   // This check is added as register pairs are not generated
> +   // by RA for neg:V2DF (fma: V2DF (reg1)
> +   //  (reg2)
> +   //  (neg:V2DF (reg3)))
> +   if (GET_RTX_CLASS (code) == RTX_UNARY)
> + return false;

What's special about (neg (fma ...))?

> +
> +   def_link = def_link->next;
> + }
> + }
> +  return true;
> +}

Thanks,
Richard


[jira] [Resolved] (OPENNLP-1565) Deploy Model Snapshots via GitHub Actions

2024-06-06 Thread Richard Zowalla (Jira)


 [ 
https://issues.apache.org/jira/browse/OPENNLP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Zowalla resolved OPENNLP-1565.
--
Resolution: Fixed

> Deploy Model Snapshots via GitHub Actions
> -
>
> Key: OPENNLP-1565
> URL: https://issues.apache.org/jira/browse/OPENNLP-1565
> Project: OpenNLP
>  Issue Type: Sub-task
>    Reporter: Richard Zowalla
>    Assignee: Richard Zowalla
>Priority: Major
> Fix For: 2.3.4, 2.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (OPENNLP-1565) Deploy Model Snapshots via GitHub Actions

2024-06-06 Thread Richard Zowalla (Jira)
Richard Zowalla created OPENNLP-1565:


 Summary: Deploy Model Snapshots via GitHub Actions
 Key: OPENNLP-1565
 URL: https://issues.apache.org/jira/browse/OPENNLP-1565
 Project: OpenNLP
  Issue Type: Sub-task
Reporter: Richard Zowalla
Assignee: Richard Zowalla
 Fix For: 2.3.4, 2.4.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[gcc r15-1056] Allow single-lane SLP in-order reductions

2024-06-06 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:4653b682ef161c3c2fc7bf8462b8f9206a1349e6

commit r15-1056-g4653b682ef161c3c2fc7bf8462b8f9206a1349e6
Author: Richard Biener 
Date:   Tue Mar 5 15:46:24 2024 +0100

Allow single-lane SLP in-order reductions

The single-lane case isn't different from non-SLP, no re-association
implied.  But the transform stage cannot handle a conditional reduction
op which isn't checked during analysis - this makes it work, exercised
with a single-lane non-reduction-chain by gcc.target/i386/pr112464.c

* tree-vect-loop.cc (vectorizable_reduction): Allow
single-lane SLP in-order reductions.
(vectorize_fold_left_reduction): Handle SLP reduction with
conditional reduction op.

Diff:
---
 gcc/tree-vect-loop.cc | 48 +++-
 1 file changed, 19 insertions(+), 29 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index b9e8e9b5559..ceb92156b58 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7139,56 +7139,46 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
   gcc_assert (TREE_CODE_LENGTH (tree_code (code)) == binary_op);
 
   if (slp_node)
-{
-  if (is_cond_op)
-   {
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"fold-left reduction on SLP not supported.\n");
- return false;
-   }
-
-  gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (vectype_out),
-   TYPE_VECTOR_SUBPARTS (vectype_in)));
-}
+gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (vectype_out),
+ TYPE_VECTOR_SUBPARTS (vectype_in)));
 
   /* The operands either come from a binary operation or an IFN_COND operation.
  The former is a gimple assign with binary rhs and the latter is a
  gimple call with four arguments.  */
   gcc_assert (num_ops == 2 || num_ops == 4);
-  tree op0, opmask;
-  if (!is_cond_op)
-op0 = ops[1 - reduc_index];
-  else
-{
-  op0 = ops[2 + (1 - reduc_index)];
-  opmask = ops[0];
-  gcc_assert (!slp_node);
-}
 
   int group_size = 1;
   stmt_vec_info scalar_dest_def_info;
   auto_vec vec_oprnds0, vec_opmask;
   if (slp_node)
 {
-  auto_vec > vec_defs (2);
-  vect_get_slp_defs (loop_vinfo, slp_node, _defs);
-  vec_oprnds0.safe_splice (vec_defs[1 - reduc_index]);
-  vec_defs[0].release ();
-  vec_defs[1].release ();
+  vect_get_slp_defs (SLP_TREE_CHILDREN (slp_node)[(is_cond_op ? 2 : 0)
+ + (1 - reduc_index)],
+ _oprnds0);
   group_size = SLP_TREE_SCALAR_STMTS (slp_node).length ();
   scalar_dest_def_info = SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1];
+  /* For an IFN_COND_OP we also need the vector mask operand.  */
+  if (is_cond_op)
+   vect_get_slp_defs (SLP_TREE_CHILDREN (slp_node)[0], _opmask);
 }
   else
 {
+  tree op0, opmask;
+  if (!is_cond_op)
+   op0 = ops[1 - reduc_index];
+  else
+   {
+ op0 = ops[2 + (1 - reduc_index)];
+ opmask = ops[0];
+   }
   vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1,
 op0, _oprnds0);
   scalar_dest_def_info = stmt_info;
 
   /* For an IFN_COND_OP we also need the vector mask operand.  */
   if (is_cond_op)
- vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1,
-opmask, _opmask);
+   vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1,
+  opmask, _opmask);
 }
 
   gimple *sdef = vect_orig_stmt (scalar_dest_def_info)->stmt;
@@ -8210,7 +8200,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 }
 
   if (reduction_type == FOLD_LEFT_REDUCTION
-  && slp_node
+  && (slp_node && SLP_TREE_LANES (slp_node) > 1)
   && !REDUC_GROUP_FIRST_ELEMENT (stmt_info))
 {
   /* We cannot use in-order reductions in this case because there is


[gcc r15-1054] Allow single-lane COND_REDUCTION vectorization

2024-06-06 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:202a9c8fe7db9dd94e5a77f42e54ef3d966f88e8

commit r15-1054-g202a9c8fe7db9dd94e5a77f42e54ef3d966f88e8
Author: Richard Biener 
Date:   Fri Mar 1 14:39:08 2024 +0100

Allow single-lane COND_REDUCTION vectorization

The following enables single-lane COND_REDUCTION vectorization.

* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Adjust for single-lane COND_REDUCTION SLP vectorization.
(vectorizable_reduction): Likewise.
(vect_transform_cycle_phi): Likewise.

Diff:
---
 gcc/tree-vect-loop.cc | 97 ++-
 1 file changed, 81 insertions(+), 16 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 06292ed8bbe..ccd6acef5c5 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6030,7 +6030,13 @@ vect_create_epilog_for_reduction (loop_vec_info 
loop_vinfo,
   tree induc_val = NULL_TREE;
   tree adjustment_def = NULL;
   if (slp_node)
-;
+{
+  /* Optimize: for induction condition reduction, if we can't use zero
+for induc_val, use initial_def.  */
+  if (STMT_VINFO_REDUC_TYPE (reduc_info) == INTEGER_INDUC_COND_REDUCTION)
+   induc_val = STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL (reduc_info);
+  /* ???  Coverage for double_reduc and 'else' isn't clear.  */
+}
   else
 {
   /* Optimize: for induction condition reduction, if we can't use zero
@@ -6075,23 +6081,46 @@ vect_create_epilog_for_reduction (loop_vec_info 
loop_vinfo,
   if (STMT_VINFO_REDUC_TYPE (reduc_info) == COND_REDUCTION)
 {
   auto_vec, 2> ccompares;
-  stmt_vec_info cond_info = STMT_VINFO_REDUC_DEF (reduc_info);
-  cond_info = vect_stmt_to_vectorize (cond_info);
-  while (cond_info != reduc_info)
+  if (slp_node)
{
- if (gimple_assign_rhs_code (cond_info->stmt) == COND_EXPR)
+ slp_tree cond_node = slp_node_instance->root;
+ while (cond_node != slp_node_instance->reduc_phis)
{
- gimple *vec_stmt = STMT_VINFO_VEC_STMTS (cond_info)[0];
- gcc_assert (gimple_assign_rhs_code (vec_stmt) == VEC_COND_EXPR);
- ccompares.safe_push
-   (std::make_pair (unshare_expr (gimple_assign_rhs1 (vec_stmt)),
-STMT_VINFO_REDUC_IDX (cond_info) == 2));
+ stmt_vec_info cond_info = SLP_TREE_REPRESENTATIVE (cond_node);
+ if (gimple_assign_rhs_code (cond_info->stmt) == COND_EXPR)
+   {
+ gimple *vec_stmt
+   = SSA_NAME_DEF_STMT (SLP_TREE_VEC_DEFS (cond_node)[0]);
+ gcc_assert (gimple_assign_rhs_code (vec_stmt) == 
VEC_COND_EXPR);
+ ccompares.safe_push
+   (std::make_pair (gimple_assign_rhs1 (vec_stmt),
+STMT_VINFO_REDUC_IDX (cond_info) == 2));
+   }
+ /* ???  We probably want to have REDUC_IDX on the SLP node?  */
+ cond_node = SLP_TREE_CHILDREN
+   (cond_node)[STMT_VINFO_REDUC_IDX (cond_info)];
}
- cond_info
-   = loop_vinfo->lookup_def (gimple_op (cond_info->stmt,
-1 + STMT_VINFO_REDUC_IDX
-   (cond_info)));
+   }
+  else
+   {
+ stmt_vec_info cond_info = STMT_VINFO_REDUC_DEF (reduc_info);
  cond_info = vect_stmt_to_vectorize (cond_info);
+ while (cond_info != reduc_info)
+   {
+ if (gimple_assign_rhs_code (cond_info->stmt) == COND_EXPR)
+   {
+ gimple *vec_stmt = STMT_VINFO_VEC_STMTS (cond_info)[0];
+ gcc_assert (gimple_assign_rhs_code (vec_stmt) == 
VEC_COND_EXPR);
+ ccompares.safe_push
+   (std::make_pair (gimple_assign_rhs1 (vec_stmt),
+STMT_VINFO_REDUC_IDX (cond_info) == 2));
+   }
+ cond_info
+   = loop_vinfo->lookup_def (gimple_op (cond_info->stmt,
+1 + STMT_VINFO_REDUC_IDX
+(cond_info)));
+ cond_info = vect_stmt_to_vectorize (cond_info);
+   }
}
   gcc_assert (ccompares.length () != 0);
 
@@ -7844,7 +7873,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   /* If we have a condition reduction, see if we can simplify it further.  */
   if (v_reduc_type == COND_REDUCTION)
 {
-  if (slp_node)
+  if (slp_node && SLP_TREE_LANES (slp_node) != 1)
return false;
 
   /* When the condition uses the reduction value in the condition, fail.  
*/
@@ -8050,6 +8079,18 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
}
 }
 
+  if ((reduction_type == COND_REDUCTION
+   || reductio

[gcc r15-1055] Add double reduction support for SLP vectorization

2024-06-06 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:2ee41ef76a99ef5a8b62b351e2c01dad93f51b18

commit r15-1055-g2ee41ef76a99ef5a8b62b351e2c01dad93f51b18
Author: Richard Biener 
Date:   Tue Mar 5 15:28:58 2024 +0100

Add double reduction support for SLP vectorization

The following makes double reduction vectorization work when
using (single-lane) SLP vectorization.

* tree-vect-loop.cc (vect_analyze_scalar_cycles_1): Queue
double reductions in LOOP_VINFO_REDUCTIONS.
(vect_create_epilog_for_reduction): Remove asserts disabling
SLP for double reductions.
(vectorizable_reduction): Analyze SLP double reductions
only once and start off the correct places.
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Allow
vect_double_reduction_def.
(vect_build_slp_tree_2): Fix condition for the ignored
reduction initial values.
* tree-vect-stmts.cc (vect_analyze_stmt): Allow
vect_double_reduction_def.

Diff:
---
 gcc/tree-vect-loop.cc  | 35 +--
 gcc/tree-vect-slp.cc   |  3 ++-
 gcc/tree-vect-stmts.cc |  4 
 3 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index ccd6acef5c5..b9e8e9b5559 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -685,6 +685,8 @@ vect_analyze_scalar_cycles_1 (loop_vec_info loop_vinfo, 
class loop *loop,
 
   STMT_VINFO_DEF_TYPE (stmt_vinfo) = vect_double_reduction_def;
  STMT_VINFO_DEF_TYPE (reduc_stmt_info) = vect_double_reduction_def;
+ /* Make it accessible for SLP vectorization.  */
+ LOOP_VINFO_REDUCTIONS (loop_vinfo).safe_push (reduc_stmt_info);
 }
   else
 {
@@ -5975,7 +5977,6 @@ vect_create_epilog_for_reduction (loop_vec_info 
loop_vinfo,
   stmt_vec_info rdef_info = stmt_info;
   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
 {
-  gcc_assert (!slp_node);
   double_reduc = true;
   stmt_info = loop_vinfo->lookup_def (gimple_phi_arg_def
(stmt_info->stmt, 0));
@@ -6020,7 +6021,7 @@ vect_create_epilog_for_reduction (loop_vec_info 
loop_vinfo,
 {
   outer_loop = loop;
   loop = loop->inner;
-  gcc_assert (!slp_node && double_reduc);
+  gcc_assert (double_reduc);
 }
 
   vectype = STMT_VINFO_REDUC_VECTYPE (reduc_info);
@@ -6035,7 +6036,7 @@ vect_create_epilog_for_reduction (loop_vec_info 
loop_vinfo,
 for induc_val, use initial_def.  */
   if (STMT_VINFO_REDUC_TYPE (reduc_info) == INTEGER_INDUC_COND_REDUCTION)
induc_val = STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL (reduc_info);
-  /* ???  Coverage for double_reduc and 'else' isn't clear.  */
+  /* ???  Coverage for 'else' isn't clear.  */
 }
   else
 {
@@ -7605,15 +7606,16 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type;
   return true;
 }
-  if (slp_node)
-{
-  slp_node_instance->reduc_phis = slp_node;
-  /* ???  We're leaving slp_node to point to the PHIs, we only
-need it to get at the number of vector stmts which wasn't
-yet initialized for the instance root.  */
-}
   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
 {
+  if (gimple_bb (stmt_info->stmt) != loop->header)
+   {
+ /* For SLP we arrive here for both the inner loop LC PHI and
+the outer loop PHI.  The latter is what we want to analyze
+the reduction with.  */
+ gcc_assert (slp_node);
+ return true;
+   }
   use_operand_p use_p;
   gimple *use_stmt;
   bool res = single_imm_use (gimple_phi_result (stmt_info->stmt),
@@ -7622,6 +7624,14 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   phi_info = loop_vinfo->lookup_stmt (use_stmt);
 }
 
+  if (slp_node)
+{
+  slp_node_instance->reduc_phis = slp_node;
+  /* ???  We're leaving slp_node to point to the PHIs, we only
+need it to get at the number of vector stmts which wasn't
+yet initialized for the instance root.  */
+}
+
   /* PHIs should not participate in patterns.  */
   gcc_assert (!STMT_VINFO_RELATED_STMT (phi_info));
   gphi *reduc_def_phi = as_a  (phi_info->stmt);
@@ -7637,6 +7647,11 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   bool only_slp_reduc_chain = true;
   stmt_info = NULL;
   slp_tree slp_for_stmt_info = slp_node ? slp_node_instance->root : NULL;
+  /* For double-reductions we start SLP analysis at the inner loop LC PHI
+ which is the def of the outer loop live stmt.  */
+  if (STMT_VINFO_DEF_TYPE (reduc_info) == vect_double_reduction_def
+  && slp_node)
+slp_for_stmt_info = SLP_TREE_CHILDREN (slp_for_stmt_info)[0];
   while (reduc_def != PHI_RESULT

[gcc r15-1053] Relax COND_EXPR reduction vectorization SLP restriction

2024-06-06 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:28edeb1409a7b839407ec06031899b933390bff3

commit r15-1053-g28edeb1409a7b839407ec06031899b933390bff3
Author: Richard Biener 
Date:   Fri Feb 23 16:16:38 2024 +0100

Relax COND_EXPR reduction vectorization SLP restriction

Allow one-lane SLP but for the case where we need to swap the arms.

* tree-vect-stmts.cc (vectorizable_condition): Allow
single-lane SLP, but not when we need to swap then and
else clause.

Diff:
---
 gcc/tree-vect-stmts.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b26cc74f417..c82381e799e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12116,7 +12116,7 @@ vectorizable_condition (vec_info *vinfo,
 = STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info)) != NULL;
   if (for_reduction)
 {
-  if (slp_node)
+  if (slp_node && SLP_TREE_LANES (slp_node) > 1)
return false;
   reduc_info = info_for_reduction (vinfo, stmt_info);
   reduction_type = STMT_VINFO_REDUC_TYPE (reduc_info);
@@ -12205,6 +12205,10 @@ vectorizable_condition (vec_info *vinfo,
  cond_expr = NULL_TREE;
}
}
+  /* ???  The vectorized operand query below doesn't allow swapping
+this way for SLP.  */
+  if (slp_node)
+   return false;
   std::swap (then_clause, else_clause);
 }


[jira] [Resolved] (OPENNLP-1562) Create a Markdown List

2024-06-06 Thread Richard Zowalla (Jira)


 [ 
https://issues.apache.org/jira/browse/OPENNLP-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Zowalla resolved OPENNLP-1562.
--
Resolution: Fixed

> Create a Markdown List
> --
>
> Key: OPENNLP-1562
> URL: https://issues.apache.org/jira/browse/OPENNLP-1562
> Project: OpenNLP
>  Issue Type: Sub-task
>  Components: Models
>    Reporter: Richard Zowalla
>Priority: Major
> Fix For: 2.3.4, 2.4.0
>
>
> README of [apache/opennlp-models: Apache OpenNLP Models 
> (github.com)|https://github.com/apache/opennlp-models]
> should contain a table similar to [Models Download - Apache 
> OpenNLP|https://opennlp.apache.org/models.html]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (TOMEE-4350) mp-jwt: Add qualifier for produced Jsonb

2024-06-06 Thread Richard Zowalla (Jira)


 [ 
https://issues.apache.org/jira/browse/TOMEE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Zowalla resolved TOMEE-4350.

Fix Version/s: 10.0.0-M2
   9.1.4
   Resolution: Fixed

> mp-jwt: Add qualifier for produced Jsonb
> 
>
> Key: TOMEE-4350
> URL: https://issues.apache.org/jira/browse/TOMEE-4350
> Project: TomEE
>  Issue Type: Improvement
>  Components: TomEE Core Server
>Affects Versions: 10.0.0-M1, 9.1.3
>Reporter: Markus Jung
>Assignee: Markus Jung
>Priority: Minor
> Fix For: 10.0.0-M2, 9.1.4
>
> Attachments: reproducer.zip
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> JsonbProducer currently produces an unqalified Jsonb, making it impossible 
> for the application to define its own producer: 
> [https://github.com/apache/tomee/blob/tomee-project-10.0.0-M1/mp-jwt/src/main/java/org/apache/tomee/microprofile/jwt/cdi/JsonbProducer.java]
>  
> This results in the following deployment error when OWB starts up (reproducer 
> attached):
> {code:java}
> 05-Jun-2024 15:48:31.010 SEVERE [main] 
> org.apache.openejb.cdi.OpenEJBLifecycle.startApplication CDI Beans module 
> deployment failed
>     org.apache.webbeans.exception.WebBeansDeploymentException: 
> jakarta.enterprise.inject.AmbiguousResolutionException: There is more than 
> one Bean with type jakarta.json.bind.Jsonb Qualifiers: 
> [@jakarta.enterprise.inject.Default()]
> for injection into Field Injection Point, field name :  jsonb, Bean Owner : 
> [ExampleBean, WebBeansType:MANAGED, Name:null, API 
> Types:[java.lang.Object,org.example.ExampleBean], 
> Qualifiers:[jakarta.enterprise.inject.Default,jakarta.enter
> prise.inject.Any]]
> found beans:  
> Jsonb, WebBeansType:PRODUCERMETHOD, Name:null, API 
> Types:[java.lang.Object,jakarta.json.bind.Jsonb,java.lang.AutoCloseable], 
> Qualifiers:[jakarta.enterprise.inject.Default,jakarta.enterprise.inject.Any], 
> Producer Method: public jakarta.j
> son.bind.Jsonb org.example.JsonbProducer.createJsonb() from 
> file:/home/markus/tmp/tomee-jsonb-unqalified/target/apache-tomee/webapps/tomee-embedded-mp-1.0-SNAPSHOT/WEB-INF/classes/org/example/JsonbProducer.class
> Jsonb, WebBeansType:PRODUCERMETHOD, Name:null, API 
> Types:[java.lang.Object,jakarta.json.bind.Jsonb,java.lang.AutoCloseable], 
> Qualifiers:[jakarta.enterprise.inject.Default,jakarta.enterprise.inject.Any], 
> Producer Method: public jakarta.j
> son.bind.Jsonb org.apache.tomee.microprofile.jwt.cdi.JsonbProducer.create() 
> from 
> jar:file:/home/markus/tmp/tomee-jsonb-unqalified/target/apache-tomee/lib/mp-jwt-10.0.0-M1.jar!/org/apache/tomee/microprofile/jwt/cdi/JsonbProducer.class
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PATCH] tree-optimization/115254 - don't account single-lane SLP against discovery limit

2024-06-06 Thread Richard Biener
On Thu, 6 Jun 2024, YunQiang Su wrote:

> Richard Biener  于2024年5月28日周二 17:47写道:
> >
> > The following avoids accounting single-lane SLP to the discovery
> > limit.  As the two testcases show this makes discovery fail,
> > unfortunately even not the same across targets.  The following
> > should fix two FAILs for GCN as a side-effect.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
> >
> > PR tree-optimization/115254
> > * tree-vect-slp.cc (vect_build_slp_tree): Only account
> > multi-lane SLP to limit.
> >
> > * gcc.dg/vect/slp-cond-2-big-array.c: Expect 4 times SLP.
> > * gcc.dg/vect/slp-cond-2.c: Likewise.
> 
> With this patch, MIPS/MSA still has only 3 times SLP.
> I am digging the problem

I bet it's an issue with missed permutes.  f3() requires interleaving
of two VnQImode vectors.

> 
> > ---
> >  .../gcc.dg/vect/slp-cond-2-big-array.c|  2 +-
> >  gcc/testsuite/gcc.dg/vect/slp-cond-2.c|  2 +-
> >  gcc/tree-vect-slp.cc  | 31 +++
> >  3 files changed, 20 insertions(+), 15 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c 
> > b/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c
> > index cb7eb94b3a3..9a9f63c0b8d 100644
> > --- a/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c
> > +++ b/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c
> > @@ -128,4 +128,4 @@ main ()
> >return 0;
> >  }
> >
> > -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 
> > "vect" } } */
> > +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
> > "vect" } } */
> > diff --git a/gcc/testsuite/gcc.dg/vect/slp-cond-2.c 
> > b/gcc/testsuite/gcc.dg/vect/slp-cond-2.c
> > index 1dcee46cd95..08bbb3dbec6 100644
> > --- a/gcc/testsuite/gcc.dg/vect/slp-cond-2.c
> > +++ b/gcc/testsuite/gcc.dg/vect/slp-cond-2.c
> > @@ -128,4 +128,4 @@ main ()
> >return 0;
> >  }
> >
> > -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 
> > "vect" } } */
> > +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
> > "vect" } } */
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index 0dd9a4daf6a..bbfde8849c1 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -1725,21 +1725,26 @@ vect_build_slp_tree (vec_info *vinfo,
> >SLP_TREE_SCALAR_STMTS (res) = stmts;
> >bst_map->put (stmts.copy (), res);
> >
> > -  if (*limit == 0)
> > +  /* Single-lane SLP doesn't have the chance of run-away, do not account
> > + it to the limit.  */
> > +  if (stmts.length () > 1)
> >  {
> > -  if (dump_enabled_p ())
> > -   dump_printf_loc (MSG_NOTE, vect_location,
> > -"SLP discovery limit exceeded\n");
> > -  /* Mark the node invalid so we can detect those when still in use
> > -as backedge destinations.  */
> > -  SLP_TREE_SCALAR_STMTS (res) = vNULL;
> > -  SLP_TREE_DEF_TYPE (res) = vect_uninitialized_def;
> > -  res->failed = XNEWVEC (bool, group_size);
> > -  memset (res->failed, 0, sizeof (bool) * group_size);
> > -  memset (matches, 0, sizeof (bool) * group_size);
> > -  return NULL;
> > +  if (*limit == 0)
> > +   {
> > + if (dump_enabled_p ())
> > +   dump_printf_loc (MSG_NOTE, vect_location,
> > +"SLP discovery limit exceeded\n");
> > + /* Mark the node invalid so we can detect those when still in use
> > +as backedge destinations.  */
> > + SLP_TREE_SCALAR_STMTS (res) = vNULL;
> > + SLP_TREE_DEF_TYPE (res) = vect_uninitialized_def;
> > + res->failed = XNEWVEC (bool, group_size);
> > + memset (res->failed, 0, sizeof (bool) * group_size);
> > + memset (matches, 0, sizeof (bool) * group_size);
> > + return NULL;
> > +   }
> > +  --*limit;
> >  }
> > -  --*limit;
> >
> >if (dump_enabled_p ())
> >  dump_printf_loc (MSG_NOTE, vect_location,
> > --
> > 2.35.3
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[jira] [Created] (TOMEE-4351) Jakarta Security 3.0

2024-06-06 Thread Richard Zowalla (Jira)
Richard Zowalla created TOMEE-4351:
--

 Summary: Jakarta Security 3.0
 Key: TOMEE-4351
 URL: https://issues.apache.org/jira/browse/TOMEE-4351
 Project: TomEE
  Issue Type: New Feature
Reporter: Richard Zowalla
Assignee: Markus Jung
 Fix For: 10.0.0-M2


as the title says. Mainly OIDC



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (TOMEE-4350) mp-jwt: Add qualifier for produced Jsonb

2024-06-06 Thread Richard Zowalla (Jira)


 [ 
https://issues.apache.org/jira/browse/TOMEE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Zowalla reassigned TOMEE-4350:
--

Assignee: Markus Jung

> mp-jwt: Add qualifier for produced Jsonb
> 
>
> Key: TOMEE-4350
> URL: https://issues.apache.org/jira/browse/TOMEE-4350
> Project: TomEE
>  Issue Type: Improvement
>  Components: TomEE Core Server
>Affects Versions: 10.0.0-M1, 9.1.3
>Reporter: Markus Jung
>Assignee: Markus Jung
>Priority: Minor
> Attachments: reproducer.zip
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> JsonbProducer currently produces an unqalified Jsonb, making it impossible 
> for the application to define its own producer: 
> [https://github.com/apache/tomee/blob/tomee-project-10.0.0-M1/mp-jwt/src/main/java/org/apache/tomee/microprofile/jwt/cdi/JsonbProducer.java]
>  
> This results in the following deployment error when OWB starts up (reproducer 
> attached):
> {code:java}
> 05-Jun-2024 15:48:31.010 SEVERE [main] 
> org.apache.openejb.cdi.OpenEJBLifecycle.startApplication CDI Beans module 
> deployment failed
>     org.apache.webbeans.exception.WebBeansDeploymentException: 
> jakarta.enterprise.inject.AmbiguousResolutionException: There is more than 
> one Bean with type jakarta.json.bind.Jsonb Qualifiers: 
> [@jakarta.enterprise.inject.Default()]
> for injection into Field Injection Point, field name :  jsonb, Bean Owner : 
> [ExampleBean, WebBeansType:MANAGED, Name:null, API 
> Types:[java.lang.Object,org.example.ExampleBean], 
> Qualifiers:[jakarta.enterprise.inject.Default,jakarta.enter
> prise.inject.Any]]
> found beans:  
> Jsonb, WebBeansType:PRODUCERMETHOD, Name:null, API 
> Types:[java.lang.Object,jakarta.json.bind.Jsonb,java.lang.AutoCloseable], 
> Qualifiers:[jakarta.enterprise.inject.Default,jakarta.enterprise.inject.Any], 
> Producer Method: public jakarta.j
> son.bind.Jsonb org.example.JsonbProducer.createJsonb() from 
> file:/home/markus/tmp/tomee-jsonb-unqalified/target/apache-tomee/webapps/tomee-embedded-mp-1.0-SNAPSHOT/WEB-INF/classes/org/example/JsonbProducer.class
> Jsonb, WebBeansType:PRODUCERMETHOD, Name:null, API 
> Types:[java.lang.Object,jakarta.json.bind.Jsonb,java.lang.AutoCloseable], 
> Qualifiers:[jakarta.enterprise.inject.Default,jakarta.enterprise.inject.Any], 
> Producer Method: public jakarta.j
> son.bind.Jsonb org.apache.tomee.microprofile.jwt.cdi.JsonbProducer.create() 
> from 
> jar:file:/home/markus/tmp/tomee-jsonb-unqalified/target/apache-tomee/lib/mp-jwt-10.0.0-M1.jar!/org/apache/tomee/microprofile/jwt/cdi/JsonbProducer.class
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PATCH] util/bufferiszero: Split out host include files

2024-06-05 Thread Richard Henderson
Split out host/bufferiszero.h.inc for x86, aarch64 and generic
in order to avoid an overlong ifdef ladder.

Signed-off-by: Richard Henderson 
---
 host/include/aarch64/host/bufferiszero.h.inc |  76 
 host/include/generic/host/bufferiszero.h.inc |  10 +
 host/include/i386/host/bufferiszero.h.inc| 124 
 host/include/x86_64/host/bufferiszero.h.inc  |   1 +
 util/bufferiszero.c  | 191 +--
 5 files changed, 212 insertions(+), 190 deletions(-)
 create mode 100644 host/include/aarch64/host/bufferiszero.h.inc
 create mode 100644 host/include/generic/host/bufferiszero.h.inc
 create mode 100644 host/include/i386/host/bufferiszero.h.inc
 create mode 100644 host/include/x86_64/host/bufferiszero.h.inc

diff --git a/host/include/aarch64/host/bufferiszero.h.inc 
b/host/include/aarch64/host/bufferiszero.h.inc
new file mode 100644
index 00..0f0e478831
--- /dev/null
+++ b/host/include/aarch64/host/bufferiszero.h.inc
@@ -0,0 +1,76 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * buffer_is_zero acceleration, aarch64 version.
+ */
+
+#ifdef __ARM_NEON
+#include 
+
+/*
+ * Helper for preventing the compiler from reassociating
+ * chains of binary vector operations.
+ */
+#define REASSOC_BARRIER(vec0, vec1) asm("" : "+w"(vec0), "+w"(vec1))
+
+static bool buffer_is_zero_simd(const void *buf, size_t len)
+{
+uint32x4_t t0, t1, t2, t3;
+
+/* Align head/tail to 16-byte boundaries.  */
+const uint32x4_t *p = QEMU_ALIGN_PTR_DOWN(buf + 16, 16);
+const uint32x4_t *e = QEMU_ALIGN_PTR_DOWN(buf + len - 1, 16);
+
+/* Unaligned loads at head/tail.  */
+t0 = vld1q_u32(buf) | vld1q_u32(buf + len - 16);
+
+/* Collect a partial block at tail end.  */
+t1 = e[-7] | e[-6];
+t2 = e[-5] | e[-4];
+t3 = e[-3] | e[-2];
+t0 |= e[-1];
+REASSOC_BARRIER(t0, t1);
+REASSOC_BARRIER(t2, t3);
+t0 |= t1;
+t2 |= t3;
+REASSOC_BARRIER(t0, t2);
+t0 |= t2;
+
+/*
+ * Loop over complete 128-byte blocks.
+ * With the head and tail removed, e - p >= 14, so the loop
+ * must iterate at least once.
+ */
+do {
+/*
+ * Reduce via UMAXV.  Whatever the actual result,
+ * it will only be zero if all input bytes are zero.
+ */
+if (unlikely(vmaxvq_u32(t0) != 0)) {
+return false;
+}
+
+t0 = p[0] | p[1];
+t1 = p[2] | p[3];
+t2 = p[4] | p[5];
+t3 = p[6] | p[7];
+REASSOC_BARRIER(t0, t1);
+REASSOC_BARRIER(t2, t3);
+t0 |= t1;
+t2 |= t3;
+REASSOC_BARRIER(t0, t2);
+t0 |= t2;
+p += 8;
+} while (p < e - 7);
+
+return vmaxvq_u32(t0) == 0;
+}
+
+static biz_accel_fn const accel_table[] = {
+buffer_is_zero_int_ge256,
+buffer_is_zero_simd,
+};
+
+#define best_accel() 1
+#else
+# include "host/include/generic/host/bufferiszero.h.inc"
+#endif
diff --git a/host/include/generic/host/bufferiszero.h.inc 
b/host/include/generic/host/bufferiszero.h.inc
new file mode 100644
index 00..ea0875c24a
--- /dev/null
+++ b/host/include/generic/host/bufferiszero.h.inc
@@ -0,0 +1,10 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * buffer_is_zero acceleration, generic version.
+ */
+
+static biz_accel_fn const accel_table[1] = {
+buffer_is_zero_int_ge256
+};
+
+#define best_accel() 0
diff --git a/host/include/i386/host/bufferiszero.h.inc 
b/host/include/i386/host/bufferiszero.h.inc
new file mode 100644
index 00..ac9bcd07ee
--- /dev/null
+++ b/host/include/i386/host/bufferiszero.h.inc
@@ -0,0 +1,124 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * buffer_is_zero acceleration, x86 version.
+ */
+
+#if defined(CONFIG_AVX2_OPT) || defined(__SSE2__)
+#include 
+
+/* Helper for preventing the compiler from reassociating
+   chains of binary vector operations.  */
+#define SSE_REASSOC_BARRIER(vec0, vec1) asm("" : "+x"(vec0), "+x"(vec1))
+
+/* Note that these vectorized functions may assume len >= 256.  */
+
+static bool __attribute__((target("sse2")))
+buffer_zero_sse2(const void *buf, size_t len)
+{
+/* Unaligned loads at head/tail.  */
+__m128i v = *(__m128i_u *)(buf);
+__m128i w = *(__m128i_u *)(buf + len - 16);
+/* Align head/tail to 16-byte boundaries.  */
+const __m128i *p = QEMU_ALIGN_PTR_DOWN(buf + 16, 16);
+const __m128i *e = QEMU_ALIGN_PTR_DOWN(buf + len - 1, 16);
+__m128i zero = { 0 };
+
+/* Collect a partial block at tail end.  */
+v |= e[-1]; w |= e[-2];
+SSE_REASSOC_BARRIER(v, w);
+v |= e[-3]; w |= e[-4];
+SSE_REASSOC_BARRIER(v, w);
+v |= e[-5]; w |= e[-6];
+SSE_REASSOC_BARRIER(v, w);
+v |= e[-7]; v |= w;
+
+/*
+ * Loop over complete 128-byte blocks.
+ * With the head and tail removed, e - p >= 14, so the loop
+ * must iterate at least once.
+ */
+do 

Re: [PATCH] target/i386: SEV: do not assume machine->cgs is SEV

2024-06-05 Thread Richard Henderson

On 6/5/24 20:45, Zhao Liu wrote:

@@ -1710,7 +1710,9 @@ void sev_es_set_reset_vector(CPUState *cpu)
  {
  X86CPU *x86;
  CPUX86State *env;
-SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
+ConfidentialGuestSupport *cgs = MACHINE(qdev_get_machine())->cgs;
+SevCommonState *sev_common = SEV_COMMON(
+object_dynamic_cast(OBJECT(cgs), TYPE_SEV_COMMON));


SEV_COMMON(object_dynamic_cast()) looks to be twice cast, we can just
force to do conversion with pointer type:

(SevCommonState *) object_dynamic_cast(OBJECT(cgs), TYPE_SEV_COMMON)


You don't need the explicit cast either, since C auto-converts from void*.

  sev_common = object_dynamic_cast(OBJECT(cgs), TYPE_SEV_COMMON);


r~



Re: [PATCH 2/2] util/bufferiszero: Add simd acceleration for loongarch64

2024-06-05 Thread Richard Henderson

On 6/5/24 20:36, maobibo wrote:

static biz_accel_fn const accel_table[] = {
 buffer_is_zero_int_ge256,
#ifdef __loongarch_sx
 buffer_is_zero_lsx,
#endif
#ifdef __loongarch_asx
 buffer_is_zero_lasx,
#endif
};

static unsigned best_accel(void)
{
#ifdef __loongarch_asx
 /* lasx may be index 1 or 2, but always last */
 return ARRAY_SIZE(accel_table) - 1;
#else
 /* lsx is always index 1 */
 return 1;
#endif
}
size of accel_table is decided at compile-time, will it be better if runtime checking is 
added also?  something like this:


  unsigned info = cpuinfo_init();

  #ifdef __loongarch_asx
  if (info & CPUINFO_LASX) {
   /* lasx may be index 1 or 2, but always last */
   return ARRAY_SIZE(accel_table) - 1;
  }
  #endif


No, because the ifdef checks that the *compiler* is prepared to use LASX/LSX instructions 
itself without further checks.  There's no point in qemu checking further.



r~



[PATCH v2 5/9] target/i386: Split out gdb-internal.h

2024-06-05 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 target/i386/gdb-internal.h | 65 ++
 target/i386/gdbstub.c  |  1 +
 2 files changed, 66 insertions(+)
 create mode 100644 target/i386/gdb-internal.h

diff --git a/target/i386/gdb-internal.h b/target/i386/gdb-internal.h
new file mode 100644
index 00..7cf4c1a656
--- /dev/null
+++ b/target/i386/gdb-internal.h
@@ -0,0 +1,65 @@
+/*
+ * x86 gdb server stub
+ *
+ * Copyright (c) 2003-2005 Fabrice Bellard
+ * Copyright (c) 2013 SUSE LINUX Products GmbH
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef I386_GDB_INTERNAL_H
+#define I386_GDB_INTERNAL_H
+
+/*
+ * Keep these in sync with assignment to
+ * gdb_num_core_regs in target/i386/cpu.c
+ * and with the machine description
+ */
+
+/*
+ * SEG: 6 segments, plus fs_base, gs_base, kernel_gs_base
+ */
+
+/*
+ * general regs ->  8 or 16
+ */
+#define IDX_NB_IP   1
+#define IDX_NB_FLAGS1
+#define IDX_NB_SEG  (6 + 3)
+#define IDX_NB_CTL  6
+#define IDX_NB_FP   16
+/*
+ * fpu regs --> 8 or 16
+ */
+#define IDX_NB_MXCSR1
+/*
+ *  total > 8+1+1+9+6+16+8+1=50 or 16+1+1+9+6+16+16+1=66
+ */
+
+#define IDX_IP_REG  CPU_NB_REGS
+#define IDX_FLAGS_REG   (IDX_IP_REG + IDX_NB_IP)
+#define IDX_SEG_REGS(IDX_FLAGS_REG + IDX_NB_FLAGS)
+#define IDX_CTL_REGS(IDX_SEG_REGS + IDX_NB_SEG)
+#define IDX_FP_REGS (IDX_CTL_REGS + IDX_NB_CTL)
+#define IDX_XMM_REGS(IDX_FP_REGS + IDX_NB_FP)
+#define IDX_MXCSR_REG   (IDX_XMM_REGS + CPU_NB_REGS)
+
+#define IDX_CTL_CR0_REG (IDX_CTL_REGS + 0)
+#define IDX_CTL_CR2_REG (IDX_CTL_REGS + 1)
+#define IDX_CTL_CR3_REG (IDX_CTL_REGS + 2)
+#define IDX_CTL_CR4_REG (IDX_CTL_REGS + 3)
+#define IDX_CTL_CR8_REG (IDX_CTL_REGS + 4)
+#define IDX_CTL_EFER_REG(IDX_CTL_REGS + 5)
+
+#endif
diff --git a/target/i386/gdbstub.c b/target/i386/gdbstub.c
index 4acf485879..96b4382a5d 100644
--- a/target/i386/gdbstub.c
+++ b/target/i386/gdbstub.c
@@ -20,6 +20,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "gdbstub/helpers.h"
+#include "gdb-internal.h"
 
 #ifdef TARGET_X86_64
 static const int gpr_map[16] = {
-- 
2.34.1




[PATCH v2 6/9] target/i386: Introduce cpu_compute_eflags_ccop

2024-06-05 Thread Richard Henderson
This is a generalization of cpu_compute_eflags, with a dynamic
value of cc_op, and is thus tcg specific.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 target/i386/cpu.h   |  2 ++
 target/i386/tcg/cc_helper.c | 10 ++
 2 files changed, 12 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c64ef0c1a2..48ad6f495b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2431,6 +2431,8 @@ void cpu_x86_inject_mce(Monitor *mon, X86CPU *cpu, int 
bank,
 
 uint32_t cpu_cc_compute_all(CPUX86State *env1);
 
+uint32_t cpu_compute_eflags_ccop(CPUX86State *env, CCOp op);
+
 static inline uint32_t cpu_compute_eflags(CPUX86State *env)
 {
 uint32_t eflags = env->eflags;
diff --git a/target/i386/tcg/cc_helper.c b/target/i386/tcg/cc_helper.c
index f76e9cb8cf..8203682ca8 100644
--- a/target/i386/tcg/cc_helper.c
+++ b/target/i386/tcg/cc_helper.c
@@ -225,6 +225,16 @@ uint32_t cpu_cc_compute_all(CPUX86State *env)
 return helper_cc_compute_all(CC_DST, CC_SRC, CC_SRC2, CC_OP);
 }
 
+uint32_t cpu_compute_eflags_ccop(CPUX86State *env, CCOp op)
+{
+uint32_t eflags;
+
+eflags = helper_cc_compute_all(CC_DST, CC_SRC, CC_SRC2, op);
+eflags |= env->df & DF_MASK;
+eflags |= env->eflags & ~(VM_MASK | RF_MASK);
+return eflags;
+}
+
 target_ulong helper_cc_compute_c(target_ulong dst, target_ulong src1,
  target_ulong src2, int op)
 {
-- 
2.34.1




[PATCH v2 2/9] accel/tcg: Set CPUState.plugin_ra before all plugin callbacks

2024-06-05 Thread Richard Henderson
Store a host code address to use with the tcg unwinder when called
from a plugin.  Generate one such store per guest insn that uses
a plugin callback.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h  |  4 +---
 accel/tcg/plugin-gen.c | 49 +-
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index a2c8536943..19b7fcc9f3 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -354,9 +354,7 @@ typedef union IcountDecr {
 typedef struct CPUNegativeOffsetState {
 CPUTLB tlb;
 #ifdef CONFIG_PLUGIN
-/*
- * The callback pointer are accessed via TCG (see gen_empty_mem_helper).
- */
+uintptr_t plugin_ra;
 GArray *plugin_mem_cbs;
 #endif
 IcountDecr icount_decr;
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index cc1634e7a6..650e3810e6 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -37,6 +37,12 @@ enum plugin_gen_from {
 PLUGIN_GEN_AFTER_TB,
 };
 
+enum plugin_gen_ra {
+GEN_RA_DONE,
+GEN_RA_FROM_TB,
+GEN_RA_FROM_INSN,
+};
+
 /* called before finishing a TB with exit_tb, goto_tb or goto_ptr */
 void plugin_gen_disable_mem_helpers(void)
 {
@@ -213,11 +219,37 @@ static void gen_mem_cb(struct qemu_plugin_regular_cb *cb,
 tcg_temp_free_i32(cpu_index);
 }
 
-static void inject_cb(struct qemu_plugin_dyn_cb *cb)
+static void inject_ra(enum plugin_gen_ra *gen_ra)
+{
+TCGv_ptr ra;
 
+switch (*gen_ra) {
+case GEN_RA_DONE:
+return;
+case GEN_RA_FROM_TB:
+ra = tcg_constant_ptr(NULL);
+break;
+case GEN_RA_FROM_INSN:
+ra = tcg_temp_ebb_new_ptr();
+tcg_gen_plugin_pc(ra);
+break;
+default:
+g_assert_not_reached();
+}
+
+tcg_gen_st_ptr(ra, tcg_env,
+   offsetof(CPUState, neg.plugin_ra) -
+   offsetof(ArchCPU, env));
+tcg_temp_free_ptr(ra);
+*gen_ra = GEN_RA_DONE;
+}
+
+static void inject_cb(struct qemu_plugin_dyn_cb *cb,
+  enum plugin_gen_ra *gen_ra)
 {
 switch (cb->type) {
 case PLUGIN_CB_REGULAR:
+inject_ra(gen_ra);
 gen_udata_cb(>regular);
 break;
 case PLUGIN_CB_COND:
@@ -235,19 +267,21 @@ static void inject_cb(struct qemu_plugin_dyn_cb *cb)
 }
 
 static void inject_mem_cb(struct qemu_plugin_dyn_cb *cb,
+  enum plugin_gen_ra *gen_ra,
   enum qemu_plugin_mem_rw rw,
   qemu_plugin_meminfo_t meminfo, TCGv_i64 addr)
 {
 switch (cb->type) {
 case PLUGIN_CB_MEM_REGULAR:
 if (rw && cb->regular.rw) {
+inject_ra(gen_ra);
 gen_mem_cb(>regular, meminfo, addr);
 }
 break;
 case PLUGIN_CB_INLINE_ADD_U64:
 case PLUGIN_CB_INLINE_STORE_U64:
 if (rw && cb->inline_insn.rw) {
-inject_cb(cb);
+inject_cb(cb, gen_ra);
 }
 break;
 default:
@@ -260,6 +294,7 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
*plugin_tb)
 {
 TCGOp *op, *next;
 int insn_idx = -1;
+enum plugin_gen_ra gen_ra;
 
 if (unlikely(qemu_loglevel_mask(LOG_TB_OP_PLUGIN)
  && qemu_log_in_addr_range(tcg_ctx->plugin_db->pc_first))) {
@@ -279,10 +314,12 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
*plugin_tb)
  */
 memset(tcg_ctx->free_temps, 0, sizeof(tcg_ctx->free_temps));
 
+gen_ra = GEN_RA_FROM_TB;
 QTAILQ_FOREACH_SAFE(op, _ctx->ops, link, next) {
 switch (op->opc) {
 case INDEX_op_insn_start:
 insn_idx++;
+gen_ra = GEN_RA_FROM_INSN;
 break;
 
 case INDEX_op_plugin_cb:
@@ -318,7 +355,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
*plugin_tb)
 cbs = plugin_tb->cbs;
 for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
 inject_cb(
-_array_index(cbs, struct qemu_plugin_dyn_cb, i));
+_array_index(cbs, struct qemu_plugin_dyn_cb, i),
+_ra);
 }
 break;
 
@@ -330,7 +368,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
*plugin_tb)
 cbs = insn->insn_cbs;
 for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
 inject_cb(
-_array_index(cbs, struct qemu_plugin_dyn_cb, i));
+_array_index(cbs, struct qemu_plugin_dyn_cb, i),
+_ra);
 }
 break;
 
@@ -362,7 +401,7 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
*plugin_tb)
 cbs = insn->mem_cbs;
 for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
 inject_mem_cb(_array_inde

[PATCH v2 7/9] target/i386: Implement TCGCPUOps for plugin register reads

2024-06-05 Thread Richard Henderson
Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 target/i386/tcg/tcg-cpu.c | 72 ++-
 1 file changed, 56 insertions(+), 16 deletions(-)

diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
index cca19cd40e..2370053df2 100644
--- a/target/i386/tcg/tcg-cpu.c
+++ b/target/i386/tcg/tcg-cpu.c
@@ -22,9 +22,11 @@
 #include "helper-tcg.h"
 #include "qemu/accel.h"
 #include "hw/core/accel-cpu.h"
-
+#include "gdbstub/helpers.h"
+#include "gdb-internal.h"
 #include "tcg-cpu.h"
 
+
 /* Frob eflags into and out of the CPU temporary format.  */
 
 static void x86_cpu_exec_enter(CPUState *cs)
@@ -61,38 +63,74 @@ static void x86_cpu_synchronize_from_tb(CPUState *cs,
 }
 }
 
-static void x86_restore_state_to_opc(CPUState *cs,
- const TranslationBlock *tb,
- const uint64_t *data)
+static uint64_t eip_from_unwind(CPUX86State *env, const TranslationBlock *tb,
+uint64_t data0)
 {
-X86CPU *cpu = X86_CPU(cs);
-CPUX86State *env = >env;
-int cc_op = data[1];
 uint64_t new_pc;
 
 if (tb_cflags(tb) & CF_PCREL) {
 /*
- * data[0] in PC-relative TBs is also a linear address, i.e. an 
address with
- * the CS base added, because it is not guaranteed that EIP bits 12 
and higher
- * stay the same across the translation block.  Add the CS base back 
before
- * replacing the low bits, and subtract it below just like for 
!CF_PCREL.
+ * data[0] in PC-relative TBs is also a linear address,
+ * i.e. an address with the CS base added, because it is
+ * not guaranteed that EIP bits 12 and higher stay the
+ * same across the translation block.  Add the CS base
+ * back before replacing the low bits, and subtract it
+ * below just like for !CF_PCREL.
  */
 uint64_t pc = env->eip + tb->cs_base;
-new_pc = (pc & TARGET_PAGE_MASK) | data[0];
+new_pc = (pc & TARGET_PAGE_MASK) | data0;
 } else {
-new_pc = data[0];
+new_pc = data0;
 }
 if (tb->flags & HF_CS64_MASK) {
-env->eip = new_pc;
-} else {
-env->eip = (uint32_t)(new_pc - tb->cs_base);
+return new_pc;
 }
+return (uint32_t)(new_pc - tb->cs_base);
+}
 
+static void x86_restore_state_to_opc(CPUState *cs,
+ const TranslationBlock *tb,
+ const uint64_t *data)
+{
+CPUX86State *env = cpu_env(cs);
+CCOp cc_op;
+
+env->eip = eip_from_unwind(env, tb, data[0]);
+
+cc_op = data[1];
 if (cc_op != CC_OP_DYNAMIC) {
 env->cc_op = cc_op;
 }
 }
 
+static bool x86_plugin_need_unwind_for_reg(CPUState *cs, int reg)
+{
+return reg == IDX_IP_REG || reg == IDX_FLAGS_REG;
+}
+
+static int x86_plugin_unwind_read_reg(CPUState *cs, GByteArray *buf, int reg,
+  const TranslationBlock *tb,
+  const uint64_t *data)
+{
+CPUX86State *env = cpu_env(cs);
+CCOp cc_op;
+
+switch (reg) {
+case IDX_IP_REG:
+return gdb_get_regl(buf, eip_from_unwind(env, tb, data[0]));
+
+case IDX_FLAGS_REG:
+cc_op = data[1];
+if (cc_op == CC_OP_DYNAMIC) {
+cc_op = env->cc_op;
+}
+return gdb_get_reg32(buf, cpu_compute_eflags_ccop(env, cc_op));
+
+default:
+g_assert_not_reached();
+}
+}
+
 #ifndef CONFIG_USER_ONLY
 static bool x86_debug_check_breakpoint(CPUState *cs)
 {
@@ -110,6 +148,8 @@ static const TCGCPUOps x86_tcg_ops = {
 .initialize = tcg_x86_init,
 .synchronize_from_tb = x86_cpu_synchronize_from_tb,
 .restore_state_to_opc = x86_restore_state_to_opc,
+.plugin_need_unwind_for_reg = x86_plugin_need_unwind_for_reg,
+.plugin_unwind_read_reg = x86_plugin_unwind_read_reg,
 .cpu_exec_enter = x86_cpu_exec_enter,
 .cpu_exec_exit = x86_cpu_exec_exit,
 #ifdef CONFIG_USER_ONLY
-- 
2.34.1




[PATCH v2 0/9] plugins: Use unwind info for special gdb registers

2024-06-05 Thread Richard Henderson


This is an attempt to fix
https://gitlab.com/qemu-project/qemu/-/issues/2208
("PC is not updated for each instruction in TCG plugins")

I have only updated target/{i386,arm} so far, but basically all
targets need updating for the new callbacks.  Extra points to
anyone who sees how to avoid the extra code duplication.  :-)


r~


Richard Henderson (9):
  tcg: Introduce INDEX_op_plugin_pc
  accel/tcg: Set CPUState.plugin_ra before all plugin callbacks
  accel/tcg: Return the TranslationBlock from cpu_unwind_state_data
  plugins: Introduce TCGCPUOps callbacks for mid-tb register reads
  target/i386: Split out gdb-internal.h
  target/i386: Introduce cpu_compute_eflags_ccop
  target/i386: Implement TCGCPUOps for plugin register reads
  target/arm: Add aarch64_tcg_ops
  target/arm: Implement TCGCPUOps for plugin register reads

 include/exec/cpu-common.h |  9 +++--
 include/hw/core/cpu.h |  4 +-
 include/hw/core/tcg-cpu-ops.h | 14 +++
 include/tcg/tcg-op-common.h   |  1 +
 include/tcg/tcg-opc.h |  1 +
 target/arm/internals.h|  8 +++-
 target/i386/cpu.h |  2 +
 target/i386/gdb-internal.h| 65 +++
 accel/tcg/plugin-gen.c| 49 +---
 accel/tcg/translate-all.c |  9 +++--
 plugins/api.c | 36 +-
 target/arm/cpu.c  | 40 ++-
 target/arm/cpu64.c| 55 ++
 target/arm/tcg/cpu-v7m.c  |  2 +
 target/i386/gdbstub.c |  1 +
 target/i386/helper.c  |  6 ++-
 target/i386/tcg/cc_helper.c   | 10 +
 target/i386/tcg/tcg-cpu.c | 72 +++
 tcg/tcg-op.c  |  5 +++
 tcg/tcg.c | 10 +
 20 files changed, 360 insertions(+), 39 deletions(-)
 create mode 100644 target/i386/gdb-internal.h

-- 
2.34.1




[PATCH v2 3/9] accel/tcg: Return the TranslationBlock from cpu_unwind_state_data

2024-06-05 Thread Richard Henderson
Adjust the i386 get_memio_eip function to use tb->cflags instead
of tcg_cflags_has, which is technically more correct.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-common.h | 9 +
 accel/tcg/translate-all.c | 9 +
 target/i386/helper.c  | 6 --
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 815342d043..c1887462e6 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -189,12 +189,13 @@ uint32_t curr_cflags(CPUState *cpu);
  * @host_pc: the host pc within the translation
  * @data: output data
  *
- * Attempt to load the the unwind state for a host pc occurring in
- * translated code.  If @host_pc is not in translated code, the
- * function returns false; otherwise @data is loaded.
+ * Attempt to load the the unwind state for a host pc occurring in translated
+ * code.  If @host_pc is not in translated code, the function returns NULL;
+ * otherwise @data is loaded and the TranslationBlock is returned.
  * This is the same unwind info as given to restore_state_to_opc.
  */
-bool cpu_unwind_state_data(CPUState *cpu, uintptr_t host_pc, uint64_t *data);
+const TranslationBlock *cpu_unwind_state_data(CPUState *cpu, uintptr_t host_pc,
+  uint64_t *data);
 
 /**
  * cpu_restore_state:
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index fdf6d8ac19..45a1cf57bc 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -243,15 +243,16 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc)
 return false;
 }
 
-bool cpu_unwind_state_data(CPUState *cpu, uintptr_t host_pc, uint64_t *data)
+const TranslationBlock *
+cpu_unwind_state_data(CPUState *cpu, uintptr_t host_pc, uint64_t *data)
 {
 if (in_code_gen_buffer((const void *)(host_pc - tcg_splitwx_diff))) {
 TranslationBlock *tb = tcg_tb_lookup(host_pc);
-if (tb) {
-return cpu_unwind_data_from_tb(tb, host_pc, data) >= 0;
+if (tb && cpu_unwind_data_from_tb(tb, host_pc, data) >= 0) {
+return tb;
 }
 }
-return false;
+return NULL;
 }
 
 void page_init(void)
diff --git a/target/i386/helper.c b/target/i386/helper.c
index f9d1381f90..565e01a3a9 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -521,13 +521,15 @@ static inline target_ulong get_memio_eip(CPUX86State *env)
 #ifdef CONFIG_TCG
 uint64_t data[TARGET_INSN_START_WORDS];
 CPUState *cs = env_cpu(env);
+const TranslationBlock *tb;
 
-if (!cpu_unwind_state_data(cs, cs->mem_io_pc, data)) {
+tb = cpu_unwind_state_data(cs, cs->mem_io_pc, data);
+if (!tb) {
 return env->eip;
 }
 
 /* Per x86_restore_state_to_opc. */
-if (tcg_cflags_has(cs, CF_PCREL)) {
+if (tb->cflags & CF_PCREL) {
 return (env->eip & TARGET_PAGE_MASK) | data[0];
 } else {
 return data[0] - env->segs[R_CS].base;
-- 
2.34.1




[PATCH v2 8/9] target/arm: Add aarch64_tcg_ops

2024-06-05 Thread Richard Henderson
For the moment, this is an exact copy of arm_tcg_ops.
Export arm_cpu_exec_interrupt for the cross-file reference.

Signed-off-by: Richard Henderson 
---
 target/arm/internals.h |  1 +
 target/arm/cpu.c   |  2 +-
 target/arm/cpu64.c | 30 ++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 11b5da2562..dc53d86249 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -364,6 +364,7 @@ void arm_restore_state_to_opc(CPUState *cs,
 
 #ifdef CONFIG_TCG
 void arm_cpu_synchronize_from_tb(CPUState *cs, const TranslationBlock *tb);
+bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request);
 #endif /* CONFIG_TCG */
 
 typedef enum ARMFPRounding {
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 35fa281f1b..3cd4711064 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -824,7 +824,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned 
int excp_idx,
 return unmasked || pstate_unmasked;
 }
 
-static bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
+bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 {
 CPUClass *cc = CPU_GET_CLASS(cs);
 CPUARMState *env = cpu_env(cs);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 262a1d6c0b..7ba80099af 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -31,6 +31,9 @@
 #include "hvf_arm.h"
 #include "qapi/visitor.h"
 #include "hw/qdev-properties.h"
+#ifdef CONFIG_TCG
+#include "hw/core/tcg-cpu-ops.h"
+#endif
 #include "internals.h"
 #include "cpu-features.h"
 #include "cpregs.h"
@@ -793,6 +796,29 @@ static const gchar *aarch64_gdb_arch_name(CPUState *cs)
 return "aarch64";
 }
 
+#ifdef CONFIG_TCG
+static const TCGCPUOps aarch64_tcg_ops = {
+.initialize = arm_translate_init,
+.synchronize_from_tb = arm_cpu_synchronize_from_tb,
+.debug_excp_handler = arm_debug_excp_handler,
+.restore_state_to_opc = arm_restore_state_to_opc,
+
+#ifdef CONFIG_USER_ONLY
+.record_sigsegv = arm_cpu_record_sigsegv,
+.record_sigbus = arm_cpu_record_sigbus,
+#else
+.tlb_fill = arm_cpu_tlb_fill,
+.cpu_exec_interrupt = arm_cpu_exec_interrupt,
+.do_interrupt = arm_cpu_do_interrupt,
+.do_transaction_failed = arm_cpu_do_transaction_failed,
+.do_unaligned_access = arm_cpu_do_unaligned_access,
+.adjust_watchpoint_address = arm_adjust_watchpoint_address,
+.debug_check_watchpoint = arm_debug_check_watchpoint,
+.debug_check_breakpoint = arm_debug_check_breakpoint,
+#endif /* !CONFIG_USER_ONLY */
+};
+#endif /* CONFIG_TCG */
+
 static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 {
 CPUClass *cc = CPU_CLASS(oc);
@@ -802,6 +828,10 @@ static void aarch64_cpu_class_init(ObjectClass *oc, void 
*data)
 cc->gdb_core_xml_file = "aarch64-core.xml";
 cc->gdb_arch_name = aarch64_gdb_arch_name;
 
+#ifdef CONFIG_TCG
+cc->tcg_ops = _tcg_ops;
+#endif
+
 object_class_property_add_bool(oc, "aarch64", aarch64_cpu_get_aarch64,
aarch64_cpu_set_aarch64);
 object_class_property_set_description(oc, "aarch64",
-- 
2.34.1




[PATCH v2 9/9] target/arm: Implement TCGCPUOps for plugin register reads

2024-06-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/internals.h   |  7 +--
 target/arm/cpu.c | 38 ++
 target/arm/cpu64.c   | 25 +
 target/arm/tcg/cpu-v7m.c |  2 ++
 4 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index dc53d86249..fe28937515 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -358,11 +358,14 @@ void init_cpreg_list(ARMCPU *cpu);
 void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu);
 void arm_translate_init(void);
 
+#ifdef CONFIG_TCG
 void arm_restore_state_to_opc(CPUState *cs,
   const TranslationBlock *tb,
   const uint64_t *data);
-
-#ifdef CONFIG_TCG
+bool arm_plugin_need_unwind_for_reg(CPUState *cs, int reg);
+int arm_plugin_unwind_read_reg(CPUState *cs, GByteArray *buf, int reg,
+   const TranslationBlock *tb,
+   const uint64_t *data);
 void arm_cpu_synchronize_from_tb(CPUState *cs, const TranslationBlock *tb);
 bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request);
 #endif /* CONFIG_TCG */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 3cd4711064..e8ac3da351 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -29,6 +29,7 @@
 #include "cpu.h"
 #ifdef CONFIG_TCG
 #include "hw/core/tcg-cpu-ops.h"
+#include "gdbstub/helpers.h"
 #endif /* CONFIG_TCG */
 #include "internals.h"
 #include "cpu-features.h"
@@ -120,6 +121,41 @@ void arm_restore_state_to_opc(CPUState *cs,
 env->exception.syndrome = data[2] << ARM_INSN_START_WORD2_SHIFT;
 }
 }
+
+bool arm_plugin_need_unwind_for_reg(CPUState *cs, int reg)
+{
+return reg == 15 || reg == 25; /* pc (r15) or cpsr */
+}
+
+int arm_plugin_unwind_read_reg(CPUState *cs, GByteArray *buf, int reg,
+   const TranslationBlock *tb,
+   const uint64_t *data)
+{
+CPUARMState *env = cpu_env(cs);
+uint32_t val, condexec;
+
+switch (reg) {
+case 15: /* PC */
+val = data[0];
+if (tb_cflags(tb) & CF_PCREL) {
+val |= env->regs[15] & TARGET_PAGE_MASK;
+}
+break;
+case 25: /* CPSR, or XPSR for M-profile */
+if (arm_feature(env, ARM_FEATURE_M)) {
+val = xpsr_read(env);
+} else {
+val = cpsr_read(env);
+}
+condexec = data[1] & 0xff;
+val = (val & ~(3 << 25)) | ((condexec & 3) << 25);
+val = (val & ~(0xfc << 8)) | ((condexec & 0xfc) << 8);
+break;
+default:
+g_assert_not_reached();
+}
+return gdb_get_reg32(buf, val);
+}
 #endif /* CONFIG_TCG */
 
 /*
@@ -2657,6 +2693,8 @@ static const TCGCPUOps arm_tcg_ops = {
 .synchronize_from_tb = arm_cpu_synchronize_from_tb,
 .debug_excp_handler = arm_debug_excp_handler,
 .restore_state_to_opc = arm_restore_state_to_opc,
+.plugin_need_unwind_for_reg = arm_plugin_need_unwind_for_reg,
+.plugin_unwind_read_reg = arm_plugin_unwind_read_reg,
 
 #ifdef CONFIG_USER_ONLY
 .record_sigsegv = arm_cpu_record_sigsegv,
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 7ba80099af..1595be5d8f 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -33,6 +33,8 @@
 #include "hw/qdev-properties.h"
 #ifdef CONFIG_TCG
 #include "hw/core/tcg-cpu-ops.h"
+#include "exec/translation-block.h"
+#include "gdbstub/helpers.h"
 #endif
 #include "internals.h"
 #include "cpu-features.h"
@@ -797,11 +799,34 @@ static const gchar *aarch64_gdb_arch_name(CPUState *cs)
 }
 
 #ifdef CONFIG_TCG
+static bool aarch64_plugin_need_unwind_for_reg(CPUState *cs, int reg)
+{
+return reg == 32; /* pc */
+}
+
+static int aarch64_plugin_unwind_read_reg(CPUState *cs, GByteArray *buf,
+  int reg, const TranslationBlock *tb,
+  const uint64_t *data)
+{
+CPUARMState *env = cpu_env(cs);
+uint64_t val;
+
+assert(reg == 32);
+
+val = data[0];
+if (tb_cflags(tb) & CF_PCREL) {
+val |= env->pc & TARGET_PAGE_MASK;
+}
+return gdb_get_reg64(buf, val);
+}
+
 static const TCGCPUOps aarch64_tcg_ops = {
 .initialize = arm_translate_init,
 .synchronize_from_tb = arm_cpu_synchronize_from_tb,
 .debug_excp_handler = arm_debug_excp_handler,
 .restore_state_to_opc = arm_restore_state_to_opc,
+.plugin_need_unwind_for_reg = aarch64_plugin_need_unwind_for_reg,
+.plugin_unwind_read_reg = aarch64_plugin_unwind_read_reg,
 
 #ifdef CONFIG_USER_ONLY
 .record_sigsegv = arm_cpu_record_sigsegv,
diff --git a/target/arm/tcg/cpu-v7m.c b/target/arm/tcg/cpu-v7m.c
index c059c681e9..47e44f70c7 100644
--- a/target/arm/tcg/

[PATCH v2 4/9] plugins: Introduce TCGCPUOps callbacks for mid-tb register reads

2024-06-05 Thread Richard Henderson
Certain target registers are not updated continuously within
the translation block.  For normal exception handling we use
unwind info to re-generate the correct value when required.
Leverage that same info for reading those registers for plugins.

All targets will need updating for these new callbacks.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/hw/core/tcg-cpu-ops.h | 14 ++
 plugins/api.c | 36 +--
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index 099de3375e..b34f999e78 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -53,6 +53,20 @@ struct TCGCPUOps {
 /** @debug_excp_handler: Callback for handling debug exceptions */
 void (*debug_excp_handler)(CPUState *cpu);
 
+/**
+ * @plugin_need_unwind_for_reg:
+ * True if unwind info needed for reading reg.
+ */
+bool (*plugin_need_unwind_for_reg)(CPUState *cpu, int reg);
+/**
+ * @plugin_unwind_read_reg:
+ * Like CPUClass.gdb_read_register, but for registers that require
+ * regeneration using unwind info, like in @restore_state_to_opc.
+ */
+int (*plugin_unwind_read_reg)(CPUState *cpu, GByteArray *buf, int reg,
+  const TranslationBlock *tb,
+  const uint64_t *data);
+
 #ifdef CONFIG_USER_ONLY
 /**
  * @fake_user_interrupt: Callback for 'fake exception' handling.
diff --git a/plugins/api.c b/plugins/api.c
index 5a0a7f8c71..53127ed9ee 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -40,10 +40,12 @@
 #include "qemu/plugin.h"
 #include "qemu/log.h"
 #include "tcg/tcg.h"
+#include "tcg/insn-start-words.h"
 #include "exec/exec-all.h"
 #include "exec/gdbstub.h"
 #include "exec/translator.h"
 #include "disas/disas.h"
+#include "hw/core/tcg-cpu-ops.h"
 #include "plugin.h"
 #ifndef CONFIG_USER_ONLY
 #include "exec/ram_addr.h"
@@ -526,9 +528,39 @@ GArray *qemu_plugin_get_registers(void)
 
 int qemu_plugin_read_register(struct qemu_plugin_register *reg, GByteArray 
*buf)
 {
-g_assert(current_cpu);
+CPUState *cs;
+uintptr_t ra;
+int regno;
 
-return gdb_read_register(current_cpu, buf, GPOINTER_TO_INT(reg));
+assert(current_cpu);
+cs = current_cpu;
+ra = cs->neg.plugin_ra;
+regno = GPOINTER_TO_INT(reg);
+
+/*
+ * When plugin_ra is 0, we have no unwind info.  This will be true for
+ * TB callbacks that happen before any insns of the TB have started.
+ */
+if (ra) {
+const TCGCPUOps *tcg_ops = cs->cc->tcg_ops;
+
+/*
+ * For plugins in the middle of the TB, we may need to locate
+ * and use unwind data to reconstruct a register value.
+ * Usually this required for the PC, but there may be others.
+ */
+if (tcg_ops->plugin_need_unwind_for_reg &&
+tcg_ops->plugin_need_unwind_for_reg(cs, regno)) {
+uint64_t data[TARGET_INSN_START_WORDS];
+const TranslationBlock *tb;
+
+tb = cpu_unwind_state_data(cs, ra, data);
+assert(tb);
+return tcg_ops->plugin_unwind_read_reg(cs, buf, regno, tb, data);
+}
+}
+
+return gdb_read_register(cs, buf, regno);
 }
 
 struct qemu_plugin_scoreboard *qemu_plugin_scoreboard_new(size_t element_size)
-- 
2.34.1




[PATCH v2 1/9] tcg: Introduce INDEX_op_plugin_pc

2024-06-05 Thread Richard Henderson
Add an opcode to find a code address within the current insn,
for later use with unwinding.  Generate the code generically
using tcg_reg_alloc_do_movi.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op-common.h |  1 +
 include/tcg/tcg-opc.h   |  1 +
 tcg/tcg-op.c|  5 +
 tcg/tcg.c   | 10 ++
 4 files changed, 17 insertions(+)

diff --git a/include/tcg/tcg-op-common.h b/include/tcg/tcg-op-common.h
index 009e2778c5..a32c88a182 100644
--- a/include/tcg/tcg-op-common.h
+++ b/include/tcg/tcg-op-common.h
@@ -76,6 +76,7 @@ void tcg_gen_lookup_and_goto_ptr(void);
 
 void tcg_gen_plugin_cb(unsigned from);
 void tcg_gen_plugin_mem_cb(TCGv_i64 addr, unsigned meminfo);
+void tcg_gen_plugin_pc(TCGv_ptr);
 
 /* 32 bit ops */
 
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index 546eb49c11..087d1b82da 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -199,6 +199,7 @@ DEF(goto_ptr, 0, 1, 0, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 
 DEF(plugin_cb, 0, 0, 1, TCG_OPF_NOT_PRESENT)
 DEF(plugin_mem_cb, 0, 1, 1, TCG_OPF_NOT_PRESENT)
+DEF(plugin_pc, 1, 0, 0, TCG_OPF_NOT_PRESENT)
 
 /* Replicate ld/st ops for 32 and 64-bit guest addresses. */
 DEF(qemu_ld_a32_i32, 1, 1, 1,
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index eff3728622..b8ca78cbe4 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -322,6 +322,11 @@ void tcg_gen_plugin_mem_cb(TCGv_i64 addr, unsigned meminfo)
 tcg_gen_op2(INDEX_op_plugin_mem_cb, tcgv_i64_arg(addr), meminfo);
 }
 
+void tcg_gen_plugin_pc(TCGv_ptr arg)
+{
+tcg_gen_op1(INDEX_op_plugin_pc, tcgv_ptr_arg(arg));
+}
+
 /* 32 bit ops */
 
 void tcg_gen_discard_i32(TCGv_i32 arg)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 34e3056380..b7c28d92a6 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -4689,6 +4689,13 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp 
*op)
 }
 }
 
+static void tcg_reg_alloc_plugin_pc(TCGContext *s, const TCGOp *op)
+{
+tcg_reg_alloc_do_movi(s, arg_temp(op->args[0]),
+  (uintptr_t)tcg_splitwx_to_rx(s->code_ptr),
+  op->life, output_pref(op, 0));
+}
+
 /*
  * Specialized code generation for INDEX_op_dup_vec.
  */
@@ -6196,6 +6203,9 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb, 
uint64_t pc_start)
 case INDEX_op_mov_vec:
 tcg_reg_alloc_mov(s, op);
 break;
+case INDEX_op_plugin_pc:
+tcg_reg_alloc_plugin_pc(s, op);
+break;
 case INDEX_op_dup_vec:
 tcg_reg_alloc_dup(s, op);
 break;
-- 
2.34.1




Re: [PATCH 2/2] util/bufferiszero: Add simd acceleration for loongarch64

2024-06-05 Thread Richard Henderson

On 6/5/24 20:18, Richard Henderson wrote:

On 6/5/24 19:30, maobibo wrote:



On 2024/6/6 上午7:51, Richard Henderson wrote:

On 6/5/24 02:32, Bibo Mao wrote:

Different gcc versions have different features, macro CONFIG_LSX_OPT
and CONFIG_LASX_OPT is added here to detect whether gcc supports
built-in lsx/lasx macro.

Function buffer_zero_lsx() is added for 128bit simd fpu optimization,
and function buffer_zero_lasx() is for 256bit simd fpu optimization.

Loongarch gcc built-in lsx/lasx macro can be used only when compiler
option -mlsx/-mlasx is added, and there is no separate compiler option
for function only. So it is only in effect when qemu is compiled with
parameter --extra-cflags="-mlasx"

Signed-off-by: Bibo Mao 
---
  meson.build |  11 +
  util/bufferiszero.c | 103 
  2 files changed, 114 insertions(+)

diff --git a/meson.build b/meson.build
index 6386607144..29bc362d7a 100644
--- a/meson.build
+++ b/meson.build
@@ -2855,6 +2855,17 @@ config_host_data.set('CONFIG_ARM_AES_BUILTIN', 
cc.compiles('''
  void foo(uint8x16_t *p) { *p = vaesmcq_u8(*p); }
    '''))
+# For Loongarch64, detect if LSX/LASX are available.
+ config_host_data.set('CONFIG_LSX_OPT', cc.compiles('''
+    #include "lsxintrin.h"
+    int foo(__m128i v) { return __lsx_bz_v(v); }
+  '''))
+
+config_host_data.set('CONFIG_LASX_OPT', cc.compiles('''
+    #include "lasxintrin.h"
+    int foo(__m256i v) { return __lasx_xbz_v(v); }
+  '''))


Both of these are introduced by gcc 14 and llvm 18, so I'm not certain of the utility 
of separate tests.  We might simplify this with


   config_host_data.set('CONFIG_LSX_LASX_INTRIN_H',
 cc.has_header('lsxintrin.h') && cc.has_header('lasxintrin.h'))


As you say, these headers require vector instructions to be enabled at compile-time 
rather than detecting them at runtime.  This is a point where the compilers could be 
improved to support __attribute__((target("xyz"))) and the builtins with that.  The 
i386 port does this, for instance.


In the meantime, it means that you don't need a runtime test.  Similar to aarch64 and 
the use of __ARM_NEON as a compile-time test for simd support.  Perhaps


#elif defined(CONFIG_LSX_LASX_INTRIN_H) && \
   (defined(__loongarch_sx) || defined(__loongarch_asx))
# ifdef __loongarch_sx
   ...
# endif
# ifdef __loongarch_asx
   ...
# endif

Sure, will do in this way.
And also there is runtime check coming from hwcap, such this:

unsigned info = cpuinfo_init();
   if (info & CPUINFO_LASX)


static biz_accel_fn const accel_table[] = {
     buffer_is_zero_int_ge256,
#ifdef __loongarch_sx
     buffer_is_zero_lsx,
#endif
#ifdef __loongarch_asx
     buffer_is_zero_lasx,
#endif
};

static unsigned best_accel(void)
{
#ifdef __loongarch_asx
     /* lasx may be index 1 or 2, but always last */
     return ARRAY_SIZE(accel_table) - 1;
#else
     /* lsx is always index 1 */
     return 1;
#endif
}


It occurs to me that by accumulating host specific sections to this file, we should split 
it like the atomics.  Put each portion in host/include/*/host/bufferiszero.h.inc.


I'll send a patch set handling the existing two hosts.


r~




Re: [PATCH 2/2] util/bufferiszero: Add simd acceleration for loongarch64

2024-06-05 Thread Richard Henderson

On 6/5/24 19:30, maobibo wrote:



On 2024/6/6 上午7:51, Richard Henderson wrote:

On 6/5/24 02:32, Bibo Mao wrote:

Different gcc versions have different features, macro CONFIG_LSX_OPT
and CONFIG_LASX_OPT is added here to detect whether gcc supports
built-in lsx/lasx macro.

Function buffer_zero_lsx() is added for 128bit simd fpu optimization,
and function buffer_zero_lasx() is for 256bit simd fpu optimization.

Loongarch gcc built-in lsx/lasx macro can be used only when compiler
option -mlsx/-mlasx is added, and there is no separate compiler option
for function only. So it is only in effect when qemu is compiled with
parameter --extra-cflags="-mlasx"

Signed-off-by: Bibo Mao 
---
  meson.build |  11 +
  util/bufferiszero.c | 103 
  2 files changed, 114 insertions(+)

diff --git a/meson.build b/meson.build
index 6386607144..29bc362d7a 100644
--- a/meson.build
+++ b/meson.build
@@ -2855,6 +2855,17 @@ config_host_data.set('CONFIG_ARM_AES_BUILTIN', 
cc.compiles('''
  void foo(uint8x16_t *p) { *p = vaesmcq_u8(*p); }
    '''))
+# For Loongarch64, detect if LSX/LASX are available.
+ config_host_data.set('CONFIG_LSX_OPT', cc.compiles('''
+    #include "lsxintrin.h"
+    int foo(__m128i v) { return __lsx_bz_v(v); }
+  '''))
+
+config_host_data.set('CONFIG_LASX_OPT', cc.compiles('''
+    #include "lasxintrin.h"
+    int foo(__m256i v) { return __lasx_xbz_v(v); }
+  '''))


Both of these are introduced by gcc 14 and llvm 18, so I'm not certain of the utility of 
separate tests.  We might simplify this with


   config_host_data.set('CONFIG_LSX_LASX_INTRIN_H',
 cc.has_header('lsxintrin.h') && cc.has_header('lasxintrin.h'))


As you say, these headers require vector instructions to be enabled at compile-time 
rather than detecting them at runtime.  This is a point where the compilers could be 
improved to support __attribute__((target("xyz"))) and the builtins with that.  The i386 
port does this, for instance.


In the meantime, it means that you don't need a runtime test.  Similar to aarch64 and 
the use of __ARM_NEON as a compile-time test for simd support.  Perhaps


#elif defined(CONFIG_LSX_LASX_INTRIN_H) && \
   (defined(__loongarch_sx) || defined(__loongarch_asx))
# ifdef __loongarch_sx
   ...
# endif
# ifdef __loongarch_asx
   ...
# endif

Sure, will do in this way.
And also there is runtime check coming from hwcap, such this:

unsigned info = cpuinfo_init();
   if (info & CPUINFO_LASX)


static biz_accel_fn const accel_table[] = {
buffer_is_zero_int_ge256,
#ifdef __loongarch_sx
buffer_is_zero_lsx,
#endif
#ifdef __loongarch_asx
buffer_is_zero_lasx,
#endif
};

static unsigned best_accel(void)
{
#ifdef __loongarch_asx
/* lasx may be index 1 or 2, but always last */
return ARRAY_SIZE(accel_table) - 1;
#else
/* lsx is always index 1 */
return 1;
#endif
}


r~



[TLS]Re: Is NIST actually prohibiting X25519?

2024-06-05 Thread Richard Barnes
As with the earlier thread, this message is off-topic for this list.

Regardless of what NIST does, the TLS protocol does and will support a
variety of curves.


On Wed, Jun 5, 2024 at 20:14 D. J. Bernstein  wrote:

> Andrei Popov writes:
> > This is a complicated compliance question. I'm not qualified to
> > comment on this option.
>
> I think it's worth investigating, considering the following NIST quote:
>
>Their associated key agreement schemes, X25519 and X448, will be
>considered for inclusion in a subsequent revision to SP 800-56A.  The
>CMVP does not intend to enforce compliance with SP 800-56A until
>these revisions are complete.
>
>
> https://web.archive.org/web/20200810165057/https://csrc.nist.gov/projects/cryptographic-module-validation-program/notices
>
> Does anyone have any documents showing that NIST has reneged on the
> above announcement? Possibilities:
>
>* Yes: then I'd appreciate a pointer so that concerned members of the
>  community can tell NIST what they think about this and, hopefully,
>  get NIST to change course.
>
>* No: then the announcement and consistent handling of this by NIST
>  would be another reason for IETF to not be dragged down by the
>  current limitations of NIST SP 800-56A.
>
> If nobody has ever tried asking NIST to approve an X25519 solution as
> per the above announcement, surely that would be a useful experiment,
> creating a path towards simplifying subsequent TLS WG discussions.
>
> ---D. J. Bernstein
>
> ___
> TLS mailing list -- tls@ietf.org
> To unsubscribe send an email to tls-le...@ietf.org
>
___
TLS mailing list -- tls@ietf.org
To unsubscribe send an email to tls-le...@ietf.org


Re: [PULL v3 00/41] virtio: features,fixes

2024-06-05 Thread Richard Henderson

On 6/5/24 16:34, Michael S. Tsirkin wrote:

Dropped acpi patches that had endian-ness issues.

The following changes since commit 60b54b67c63d8f076152e0f7dccf39854dfc6a77:

   Merge tag 'pull-lu-20240526' of https://gitlab.com/rth7680/qemu into staging 
(2024-05-26 17:51:00 -0700)

are available in the Git repository at:

   https://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to d23bc95d390a1800198c92a0177240d9e1a1eb66:

   hw/cxl: Fix read from bogus memory (2024-06-05 19:33:01 -0400)


virtio: features,fixes

A bunch of improvements:
- vhost dirty log is now only scanned once, not once per device
- virtio and vhost now support VIRTIO_F_NOTIFICATION_DATA
- cxl gained DCD emulation support
- pvpanic gained shutdown support
- beginning of patchset for Generic Port Affinity Structure
- new tests
- bugfixes

Signed-off-by: Michael S. Tsirkin 


Sorry to have to require a v4, but

merging...
Auto-merging hw/misc/pvpanic-isa.c
CONFLICT (content): Merge conflict in hw/misc/pvpanic-isa.c
Auto-merging hw/misc/pvpanic-pci.c
CONFLICT (content): Merge conflict in hw/misc/pvpanic-pci.c
Auto-merging hw/misc/pvpanic.c
CONFLICT (content): Merge conflict in hw/misc/pvpanic.c

Looks like Paolo's pull induced the conflict.


r~



Re: [PULL 00/16] sprintf fixes

2024-06-05 Thread Richard Henderson

On 6/5/24 14:15, Richard Henderson wrote:

The following changes since commit f1572ab94738bd5787b7badcd4bd93a3657f0680:

   Merge tag 'for-upstream' ofhttps://gitlab.com/bonzini/qemu  into staging 
(2024-06-05 07:45:23 -0700)

are available in the Git repository at:

   https://gitlab.com/rth7680/qemu.git  tags/pull-misc-20240605

for you to fetch changes up to b89fb575fd467ed5dfde4608d51c47c2aa427f30:

   disas/riscv: Use GString in format_inst (2024-06-05 12:29:54 -0700)


util/hexdump: Use a GString for qemu_hexdump_line.
system/qtest: Replace sprintf by qemu_hexdump_line
hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf
disas/microblaze: Reorg to avoid intermediate sprintf
disas/riscv: Use GString in format_inst


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/9.1 as 
appropriate.


r~




[Qemu-commits] [qemu/qemu] 53ee5f: util/hexdump: Use a GString for qemu_hexdump_line

2024-06-05 Thread Richard Henderson via Qemu-commits
  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 53ee5f551e5743516c90a662425276cae4cf0aeb
  
https://github.com/qemu/qemu/commit/53ee5f551e5743516c90a662425276cae4cf0aeb
  Author: Richard Henderson 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/virtio/vhost-vdpa.c
M include/qemu/cutils.h
M util/hexdump.c

  Log Message:
  ---
  util/hexdump: Use a GString for qemu_hexdump_line

Allocate a new, or append to an existing GString instead of
using a fixed sized buffer.  Require the caller to determine
the length of the line -- do not bound len here.

Signed-off-by: Richard Henderson 
Message-Id: <20240412073346.458116-4-richard.hender...@linaro.org>


  Commit: c49d1c37d89a2ea994861600859b7dcd3ffa4ede
  
https://github.com/qemu/qemu/commit/c49d1c37d89a2ea994861600859b7dcd3ffa4ede
  Author: Richard Henderson 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/virtio/vhost-vdpa.c
M include/qemu/cutils.h
M util/hexdump.c

  Log Message:
  ---
  util/hexdump: Add unit_len and block_len to qemu_hexdump_line

Generalize the current 1 byte unit and 4 byte blocking
within the output.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-5-richard.hender...@linaro.org>


  Commit: 10e4927bc4c5ad673e12c0731e6150050cf327de
  
https://github.com/qemu/qemu/commit/10e4927bc4c5ad673e12c0731e6150050cf327de
  Author: Richard Henderson 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M util/hexdump.c

  Log Message:
  ---
  util/hexdump: Inline g_string_append_printf "%02x"

Trivial arithmetic can be used for emitting the nibbles,
rather than full-blown printf formatting.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-6-richard.hender...@linaro.org>


  Commit: 3a8ff3667187459427eef5c6b1cdb950c563e094
  
https://github.com/qemu/qemu/commit/3a8ff3667187459427eef5c6b1cdb950c563e094
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/mips/malta.c

  Log Message:
  ---
  hw/mips/malta: Add re-usable rng_seed_hex_new() method

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.

Extract common code from reinitialize_rng_seed and load_kernel
to rng_seed_hex_new.  Using qemu_hexdump_line both fixes the
deprecation warning and simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé 
[rth: Use qemu_hexdump_line.]
Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-7-richard.hender...@linaro.org>


  Commit: 4b69210978bdf92d50b84d8662b3c38c78d79803
  
https://github.com/qemu/qemu/commit/4b69210978bdf92d50b84d8662b3c38c78d79803
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M system/qtest.c

  Log Message:
  ---
  system/qtest: Replace sprintf by qemu_hexdump_line

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé `
[rth: Use qemu_hexdump_line]
Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-8-richard.hender...@linaro.org>


  Commit: 00a17d803d0931b00bffdb3b3e8a3e81251de9fa
  
https://github.com/qemu/qemu/commit/00a17d803d0931b00bffdb3b3e8a3e81251de9fa
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/scsi/scsi-disk.c

  Log Message:
  ---
  hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Note that this drops the "0x" prefix to every byte, which should
be of no consequence to tracing.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-9-richard.hender...@linaro.org>


  Commit: 80e945894acf6ca837f03292a22cbf44550d22df
  
https://github.com/qemu/qemu/commit/80e945894acf6ca837f03292a22cbf44550d22df
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/ide/atapi.c

  Log Message:
  ---
  hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-10-richard.hender...@linaro.org>


  Commit: 7210ddb45fd6ee32140ac9d9731b88c0f61c3f0b
  
https://github.com/qemu/qemu/commit/7210ddb45fd6ee32140

Re: [PATCH 2/2] util/bufferiszero: Add simd acceleration for loongarch64

2024-06-05 Thread Richard Henderson

On 6/5/24 02:32, Bibo Mao wrote:

Different gcc versions have different features, macro CONFIG_LSX_OPT
and CONFIG_LASX_OPT is added here to detect whether gcc supports
built-in lsx/lasx macro.

Function buffer_zero_lsx() is added for 128bit simd fpu optimization,
and function buffer_zero_lasx() is for 256bit simd fpu optimization.

Loongarch gcc built-in lsx/lasx macro can be used only when compiler
option -mlsx/-mlasx is added, and there is no separate compiler option
for function only. So it is only in effect when qemu is compiled with
parameter --extra-cflags="-mlasx"

Signed-off-by: Bibo Mao 
---
  meson.build |  11 +
  util/bufferiszero.c | 103 
  2 files changed, 114 insertions(+)

diff --git a/meson.build b/meson.build
index 6386607144..29bc362d7a 100644
--- a/meson.build
+++ b/meson.build
@@ -2855,6 +2855,17 @@ config_host_data.set('CONFIG_ARM_AES_BUILTIN', 
cc.compiles('''
  void foo(uint8x16_t *p) { *p = vaesmcq_u8(*p); }
'''))
  
+# For Loongarch64, detect if LSX/LASX are available.

+ config_host_data.set('CONFIG_LSX_OPT', cc.compiles('''
+#include "lsxintrin.h"
+int foo(__m128i v) { return __lsx_bz_v(v); }
+  '''))
+
+config_host_data.set('CONFIG_LASX_OPT', cc.compiles('''
+#include "lasxintrin.h"
+int foo(__m256i v) { return __lasx_xbz_v(v); }
+  '''))


Both of these are introduced by gcc 14 and llvm 18, so I'm not certain of the utility of 
separate tests.  We might simplify this with


  config_host_data.set('CONFIG_LSX_LASX_INTRIN_H',
cc.has_header('lsxintrin.h') && cc.has_header('lasxintrin.h'))


As you say, these headers require vector instructions to be enabled at compile-time rather 
than detecting them at runtime.  This is a point where the compilers could be improved to 
support __attribute__((target("xyz"))) and the builtins with that.  The i386 port does 
this, for instance.


In the meantime, it means that you don't need a runtime test.  Similar to aarch64 and the 
use of __ARM_NEON as a compile-time test for simd support.  Perhaps


#elif defined(CONFIG_LSX_LASX_INTRIN_H) && \
  (defined(__loongarch_sx) || defined(__loongarch_asx))
# ifdef __loongarch_sx
  ...
# endif
# ifdef __loongarch_asx
  ...
# endif


The actual code is perfectly fine, of course, since it follows the pattern from the 
others.  How much improvement do you see from bufferiszero-bench?



r~



Re: linux-user emulation hangs during fork

2024-06-05 Thread Richard Henderson

On 6/5/24 02:14, Andreas Schwab wrote:

$ qemu-x86_64 --version
qemu-x86_64 version 9.0.50 (v9.0.0-1211-gd16cab541a)
Copyright (c) 2003-2024 Fabrice Bellard and the QEMU Project developers
$ cat fork.rb
begin
   r, w = IO.pipe
   if pid1 = fork
 w.close
 r.read 1
 Process.kill "USR1", pid1
 Process.wait2 pid1
   else
 print "child\n"
 r.close
 if pid2 = fork
   trap("USR1") { print "child: kill\n"; Process.kill "USR2", pid2 }
   w.close
   print "child: wait\n"
   Process.wait2 pid2
 else
   print "grandchild\n"
   w.close
   sleep 0.2
 end
   end
end
$ ruby fork.rb
child
child: wait
grandchild
child: kill
$ qemu-x86_64 /usr/bin/ruby fork.rb
child
child: wait
^Z
[1]+  Stopped qemu-x86_64 /usr/bin/ruby fork.rb
$ grep SigB $(for p in $(pidof qemu-x86_64); do echo /proc/$p/status; done | 
sort)
/proc/3221/status:SigBlk:   
/proc/3224/status:SigBlk:   
/proc/3228/status:SigBlk:   fff27ffbfa9f



Works for me:

rth@stoup:~/zz$ ~/qemu/bld/qemu-x86_64 `which ruby` fork.rb
child
grandchild
child: wait
child: kill
rth@stoup:~/zz$ ~/qemu/bld/qemu-x86_64 `which ruby` fork.rb
child
grandchild
child: wait
child: kill
rth@stoup:~/zz$ ~/qemu/bld/qemu-x86_64 `which ruby` fork.rb
child
grandchild
child: wait
child: kill
rth@stoup:~/zz$ ~/qemu/bld/qemu-x86_64 `which ruby` fork.rb
child
grandchild
child: wait
child: kill


r~



[clang] Pass LangOpts from CompilerInstance to DependencyScanningWorker (PR #93753)

2024-06-05 Thread Richard Smith via cfe-commits

zygoloid wrote:

> > I guess the general question is - is it acceptable to have the Scanner 
> > operating in a language standard different than the passed in language mode 
> > and different than the compiler language standard?
> 
> I think that is acceptable. It is kinda hacky, but the lexer and preprocessor 
> are largely independent of the language and the standard. When they do depend 
> on those settings, taking the union of the features and letting the compiler 
> trim it down is still a perfectly sound thing to do.

You can certainly construct cases where the different lexing rules in different 
language modes allow you to detect which language you're in from within the 
preprocessor ([1](https://eel.is/c++draft/diff.cpp11.lex) 
[2](https://eel.is/c++draft/diff.cpp14.lex#2) 
[3](https://eel.is/c++draft/diff.cpp03.lex#1)) or where enabling more language 
mode flags may reject valid code. It may be good enough for what the scanner is 
trying to do, but I think it's a stretch to say that it's sound.

https://github.com/llvm/llvm-project/pull/93753
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] accel/tcg/plugin: Fix inject_mem_cb rw masking

2024-06-05 Thread Richard Henderson
These are not booleans, but masks.

Fixes: f86fd4d8721 ("plugins: distinct types for callbacks")
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index cc1634e7a6..b6bae32b99 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -240,13 +240,13 @@ static void inject_mem_cb(struct qemu_plugin_dyn_cb *cb,
 {
 switch (cb->type) {
 case PLUGIN_CB_MEM_REGULAR:
-if (rw && cb->regular.rw) {
+if (rw & cb->regular.rw) {
 gen_mem_cb(>regular, meminfo, addr);
 }
 break;
 case PLUGIN_CB_INLINE_ADD_U64:
 case PLUGIN_CB_INLINE_STORE_U64:
-if (rw && cb->inline_insn.rw) {
+if (rw & cb->inline_insn.rw) {
 inject_cb(cb);
 }
 break;
-- 
2.34.1




[PATCH v2 00/10] target/s390x: pc-relative translation

2024-06-05 Thread Richard Henderson
v1: 20220906101747.344559-1-richard.hender...@linaro.org

A lot has changed in the 20 months since, including generic
cleanups and splitting out the PER fixes.


r~


Richard Henderson (10):
  target/s390x: Change help_goto_direct to work on displacements
  target/s390x: Introduce gen_psw_addr_disp
  target/s390x: Remove pc argument to pc_to_link_into
  target/s390x: Use gen_psw_addr_disp in pc_to_link_info
  target/s390x: Use gen_psw_addr_disp in save_link_info
  target/s390x: Use deposit in save_link_info
  target/s390x: Use gen_psw_addr_disp in op_sam
  target/s390x: Use ilen instead in branches
  target/s390x: Assert masking of psw.addr in cpu_get_tb_cpu_state
  target/s390x: Enable CF_PCREL

 target/s390x/cpu.c   |  23 +
 target/s390x/tcg/translate.c | 190 +--
 2 files changed, 138 insertions(+), 75 deletions(-)

-- 
2.34.1




[PATCH v2 04/10] target/s390x: Use gen_psw_addr_disp in pc_to_link_info

2024-06-05 Thread Richard Henderson
This is slightly more complicated than a straight displacement
for 31 and 24-bit modes.  Dont bother with a cant-happen assert.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 14162769a9..2d611da8af 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -174,17 +174,19 @@ static void gen_psw_addr_disp(DisasContext *s, TCGv_i64 
dest, int64_t disp)
 
 static void pc_to_link_info(TCGv_i64 out, DisasContext *s)
 {
-uint64_t pc = s->pc_tmp;
+TCGv_i64 tmp;
 
-if (s->base.tb->flags & FLAG_MASK_32) {
-if (s->base.tb->flags & FLAG_MASK_64) {
-tcg_gen_movi_i64(out, pc);
-return;
-}
-pc |= 0x8000;
+if (s->base.tb->flags & FLAG_MASK_64) {
+gen_psw_addr_disp(s, out, s->ilen);
+return;
 }
-assert(!(s->base.tb->flags & FLAG_MASK_64));
-tcg_gen_deposit_i64(out, out, tcg_constant_i64(pc), 0, 32);
+
+tmp = tcg_temp_new_i64();
+gen_psw_addr_disp(s, tmp, s->ilen);
+if (s->base.tb->flags & FLAG_MASK_32) {
+tcg_gen_ori_i64(tmp, tmp, 0x8000);
+}
+tcg_gen_deposit_i64(out, out, tmp, 0, 32);
 }
 
 static TCGv_i64 psw_addr;
-- 
2.34.1




[PATCH v2 06/10] target/s390x: Use deposit in save_link_info

2024-06-05 Thread Richard Henderson
Replace manual masking and oring with deposits.

Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 2654c85a8e..0f0688424f 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1418,24 +1418,32 @@ static DisasJumpType op_bas(DisasContext *s, DisasOps 
*o)
 
 static void save_link_info(DisasContext *s, DisasOps *o)
 {
-TCGv_i64 t;
+TCGv_i64 t1, t2;
 
 if (s->base.tb->flags & (FLAG_MASK_32 | FLAG_MASK_64)) {
 pc_to_link_info(o->out, s);
 return;
 }
+
 gen_op_calc_cc(s);
-t = tcg_temp_new_i64();
-tcg_gen_andi_i64(o->out, o->out, 0xull);
-gen_psw_addr_disp(s, t, s->ilen);
-tcg_gen_or_i64(o->out, o->out, t);
-tcg_gen_ori_i64(o->out, o->out, (s->ilen / 2) << 30);
-tcg_gen_shri_i64(t, psw_mask, 16);
-tcg_gen_andi_i64(t, t, 0x0f00);
-tcg_gen_or_i64(o->out, o->out, t);
-tcg_gen_extu_i32_i64(t, cc_op);
-tcg_gen_shli_i64(t, t, 28);
-tcg_gen_or_i64(o->out, o->out, t);
+t1 = tcg_temp_new_i64();
+t2 = tcg_temp_new_i64();
+
+/* Shift program mask into place, garbage outside of [27:24]. */
+tcg_gen_shri_i64(t1, psw_mask, 16);
+/* Deposit pc to replace garbage bits below program mask. */
+gen_psw_addr_disp(s, t2, s->ilen);
+tcg_gen_deposit_i64(t1, t1, t2, 0, 24);
+/*
+ * Deposit cc to replace garbage bits above program mask.
+ * Note that cc is in [0-3], thus [63:30] are set to zero.
+ */
+tcg_gen_extu_i32_i64(t2, cc_op);
+tcg_gen_deposit_i64(t1, t1, t2, 28, 64 - 28);
+/* Install ilen. */
+tcg_gen_ori_i64(t1, t1, (s->ilen / 2) << 30);
+
+tcg_gen_deposit_i64(o->out, o->out, t1, 0, 32);
 }
 
 static DisasJumpType op_bal(DisasContext *s, DisasOps *o)
-- 
2.34.1




[PATCH v2 03/10] target/s390x: Remove pc argument to pc_to_link_into

2024-06-05 Thread Richard Henderson
All callers pass s->pc_tmp.

Reviewed-by: Ilya Leoshkevich 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index bd4ad33802..14162769a9 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -172,8 +172,10 @@ static void gen_psw_addr_disp(DisasContext *s, TCGv_i64 
dest, int64_t disp)
 tcg_gen_movi_i64(dest, s->base.pc_next + disp);
 }
 
-static void pc_to_link_info(TCGv_i64 out, DisasContext *s, uint64_t pc)
+static void pc_to_link_info(TCGv_i64 out, DisasContext *s)
 {
+uint64_t pc = s->pc_tmp;
+
 if (s->base.tb->flags & FLAG_MASK_32) {
 if (s->base.tb->flags & FLAG_MASK_64) {
 tcg_gen_movi_i64(out, pc);
@@ -1404,7 +1406,7 @@ static DisasJumpType op_ni(DisasContext *s, DisasOps *o)
 
 static DisasJumpType op_bas(DisasContext *s, DisasOps *o)
 {
-pc_to_link_info(o->out, s, s->pc_tmp);
+pc_to_link_info(o->out, s);
 if (o->in2) {
 return help_goto_indirect(s, o->in2);
 } else {
@@ -1417,7 +1419,7 @@ static void save_link_info(DisasContext *s, DisasOps *o)
 TCGv_i64 t;
 
 if (s->base.tb->flags & (FLAG_MASK_32 | FLAG_MASK_64)) {
-pc_to_link_info(o->out, s, s->pc_tmp);
+pc_to_link_info(o->out, s);
 return;
 }
 gen_op_calc_cc(s);
@@ -1474,7 +1476,7 @@ static DisasJumpType op_basi(DisasContext *s, DisasOps *o)
 bool is_imm;
 int imm;
 
-pc_to_link_info(o->out, s, s->pc_tmp);
+pc_to_link_info(o->out, s);
 
 disas_jdest(s, i2, is_imm, imm, o->in2);
 disas_jcc(s, , 0xf);
-- 
2.34.1




[PATCH v2 02/10] target/s390x: Introduce gen_psw_addr_disp

2024-06-05 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index f25ae02a4e..bd4ad33802 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -167,6 +167,11 @@ static uint64_t inline_branch_hit[CC_OP_MAX];
 static uint64_t inline_branch_miss[CC_OP_MAX];
 #endif
 
+static void gen_psw_addr_disp(DisasContext *s, TCGv_i64 dest, int64_t disp)
+{
+tcg_gen_movi_i64(dest, s->base.pc_next + disp);
+}
+
 static void pc_to_link_info(TCGv_i64 out, DisasContext *s, uint64_t pc)
 {
 if (s->base.tb->flags & FLAG_MASK_32) {
@@ -337,8 +342,7 @@ static void store_freg32_i64(int reg, TCGv_i64 v)
 
 static void update_psw_addr(DisasContext *s)
 {
-/* psw.addr */
-tcg_gen_movi_i64(psw_addr, s->base.pc_next);
+gen_psw_addr_disp(s, psw_addr, 0);
 }
 
 static void per_branch(DisasContext *s, TCGv_i64 dest)
@@ -352,7 +356,7 @@ static void per_branch(DisasContext *s, TCGv_i64 dest)
 
 static void per_breaking_event(DisasContext *s)
 {
-tcg_gen_movi_i64(gbea, s->base.pc_next);
+gen_psw_addr_disp(s, gbea, 0);
 }
 
 static void update_cc_op(DisasContext *s)
@@ -1086,11 +1090,11 @@ static DisasJumpType help_goto_direct(DisasContext *s, 
int64_t disp)
 }
 if (use_goto_tb(s, dest)) {
 tcg_gen_goto_tb(0);
-tcg_gen_movi_i64(psw_addr, dest);
+gen_psw_addr_disp(s, psw_addr, disp);
 tcg_gen_exit_tb(s->base.tb, 0);
 return DISAS_NORETURN;
 } else {
-tcg_gen_movi_i64(psw_addr, dest);
+gen_psw_addr_disp(s, psw_addr, disp);
 return DISAS_PC_CC_UPDATED;
 }
 }
@@ -1121,7 +1125,7 @@ static DisasJumpType help_branch(DisasContext *s, 
DisasCompare *c,
  * still need a conditional call to helper_per_branch.
  */
 if (c->cond == TCG_COND_ALWAYS
-|| (dest == s->pc_tmp &&
+|| (disp == s->ilen &&
 !(s->base.tb->flags & FLAG_MASK_PER_BRANCH))) {
 return help_goto_direct(s, disp);
 }
@@ -1154,7 +1158,7 @@ static DisasJumpType help_branch(DisasContext *s, 
DisasCompare *c,
 /* Branch taken.  */
 per_breaking_event(s);
 if (is_imm) {
-tcg_gen_movi_i64(psw_addr, dest);
+gen_psw_addr_disp(s, psw_addr, disp);
 } else {
 tcg_gen_mov_i64(psw_addr, cdest);
 }
@@ -1170,7 +1174,7 @@ static DisasJumpType help_branch(DisasContext *s, 
DisasCompare *c,
 gen_set_label(lab);
 
 /* Branch not taken.  */
-tcg_gen_movi_i64(psw_addr, s->pc_tmp);
+gen_psw_addr_disp(s, psw_addr, s->ilen);
 if (use_goto_tb(s, s->pc_tmp)) {
 tcg_gen_goto_tb(1);
 tcg_gen_exit_tb(s->base.tb, 1);
@@ -5758,7 +5762,8 @@ static TCGv gen_ri2(DisasContext *s)
 
 disas_jdest(s, i2, is_imm, imm, ri2);
 if (is_imm) {
-ri2 = tcg_constant_i64(s->base.pc_next + (int64_t)imm * 2);
+ri2 = tcg_temp_new_i64();
+gen_psw_addr_disp(s, ri2, (int64_t)imm * 2);
 }
 
 return ri2;
@@ -6367,7 +6372,7 @@ static DisasJumpType translate_one(CPUS390XState *env, 
DisasContext *s)
 s->base.is_jmp = DISAS_PC_CC_UPDATED;
 /* fall through */
 case DISAS_NEXT:
-tcg_gen_movi_i64(psw_addr, s->pc_tmp);
+gen_psw_addr_disp(s, psw_addr, s->ilen);
 break;
 default:
 break;
-- 
2.34.1




[PATCH v2 09/10] target/s390x: Assert masking of psw.addr in cpu_get_tb_cpu_state

2024-06-05 Thread Richard Henderson
When changing modes via SAM, we raise a specification exception if the
new PC is out of range.  The masking in s390x_tr_init_disas_context
was too late to be correct, but may be removed.  Add a debugging
assert in cpu_get_tb_cpu_state.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 target/s390x/cpu.c   | 6 ++
 target/s390x/tcg/translate.c | 6 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 2bbeaca36e..c786767bd1 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -358,6 +358,12 @@ void cpu_get_tb_cpu_state(CPUS390XState *env, vaddr *pc,
 flags |= FLAG_MASK_VECTOR;
 }
 *pflags = flags;
+
+if (!(flags & FLAG_MASK_32)) {
+tcg_debug_assert(*pc <= 0x00ff);
+} else if (!(flags & FLAG_MASK_64)) {
+tcg_debug_assert(*pc <= 0x7fff);
+}
 }
 
 static const TCGCPUOps s390_tcg_ops = {
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 3014cbea4f..0ee14484d0 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -6409,11 +6409,7 @@ static void s390x_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-/* 31-bit mode */
-if (!(dc->base.tb->flags & FLAG_MASK_64)) {
-dc->base.pc_first &= 0x7fff;
-dc->base.pc_next = dc->base.pc_first;
-}
+/* Note cpu_get_tb_cpu_state asserts PC is masked for the mode. */
 
 dc->cc_op = CC_OP_DYNAMIC;
 dc->ex_value = dc->base.tb->cs_base;
-- 
2.34.1




[PATCH v2 08/10] target/s390x: Use ilen instead in branches

2024-06-05 Thread Richard Henderson
Remove the remaining uses of pc_tmp, and remove the variable.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 28 
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index bce9a0aeb0..3014cbea4f 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -141,12 +141,6 @@ struct DisasContext {
 const DisasInsn *insn;
 DisasFields fields;
 uint64_t ex_value;
-/*
- * During translate_one(), pc_tmp is used to determine the instruction
- * to be executed after base.pc_next - e.g. next sequential instruction
- * or a branch target.
- */
-uint64_t pc_tmp;
 uint32_t ilen;
 enum cc_op cc_op;
 bool exit_to_mainloop;
@@ -344,11 +338,6 @@ static void store_freg32_i64(int reg, TCGv_i64 v)
 tcg_gen_st32_i64(v, tcg_env, freg32_offset(reg));
 }
 
-static void update_psw_addr(DisasContext *s)
-{
-gen_psw_addr_disp(s, psw_addr, 0);
-}
-
 static void per_branch(DisasContext *s, TCGv_i64 dest)
 {
 #ifndef CONFIG_USER_ONLY
@@ -420,7 +409,7 @@ static void gen_program_exception(DisasContext *s, int code)
offsetof(CPUS390XState, int_pgm_ilen));
 
 /* update the psw */
-update_psw_addr(s);
+gen_psw_addr_disp(s, psw_addr, 0);
 
 /* Save off cc.  */
 update_cc_op(s);
@@ -1179,7 +1168,7 @@ static DisasJumpType help_branch(DisasContext *s, 
DisasCompare *c,
 
 /* Branch not taken.  */
 gen_psw_addr_disp(s, psw_addr, s->ilen);
-if (use_goto_tb(s, s->pc_tmp)) {
+if (use_goto_tb(s, s->base.pc_next + s->ilen)) {
 tcg_gen_goto_tb(1);
 tcg_gen_exit_tb(s->base.tb, 1);
 return DISAS_NORETURN;
@@ -2361,7 +2350,7 @@ static DisasJumpType op_ex(DisasContext *s, DisasOps *o)
 return DISAS_NORETURN;
 }
 
-update_psw_addr(s);
+gen_psw_addr_disp(s, psw_addr, 0);
 update_cc_op(s);
 
 if (r1 == 0) {
@@ -3085,7 +3074,7 @@ static DisasJumpType op_lpd(DisasContext *s, DisasOps *o)
 
 /* In a parallel context, stop the world and single step.  */
 if (tb_cflags(s->base.tb) & CF_PARALLEL) {
-update_psw_addr(s);
+gen_psw_addr_disp(s, psw_addr, 0);
 update_cc_op(s);
 gen_exception(EXCP_ATOMIC);
 return DISAS_NORETURN;
@@ -4379,7 +4368,7 @@ static DisasJumpType op_stura(DisasContext *s, DisasOps 
*o)
 
 if (s->base.tb->flags & FLAG_MASK_PER_STORE_REAL) {
 update_cc_op(s);
-update_psw_addr(s);
+gen_psw_addr_disp(s, psw_addr, 0);
 gen_helper_per_store_real(tcg_env, tcg_constant_i32(s->ilen));
 return DISAS_NORETURN;
 }
@@ -4611,7 +4600,7 @@ static DisasJumpType op_svc(DisasContext *s, DisasOps *o)
 {
 TCGv_i32 t;
 
-update_psw_addr(s);
+gen_psw_addr_disp(s, psw_addr, 0);
 update_cc_op(s);
 
 t = tcg_constant_i32(get_field(s, i1) & 0xff);
@@ -6193,7 +6182,6 @@ static const DisasInsn *extract_insn(CPUS390XState *env, 
DisasContext *s)
 g_assert_not_reached();
 }
 }
-s->pc_tmp = s->base.pc_next + ilen;
 s->ilen = ilen;
 
 /* We can't actually determine the insn format until we've looked up
@@ -6413,7 +6401,7 @@ static DisasJumpType translate_one(CPUS390XState *env, 
DisasContext *s)
 
 out:
 /* Advance to the next instruction.  */
-s->base.pc_next = s->pc_tmp;
+s->base.pc_next += s->ilen;
 return ret;
 }
 
@@ -6475,7 +6463,7 @@ static void s390x_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cs)
 case DISAS_NORETURN:
 break;
 case DISAS_TOO_MANY:
-update_psw_addr(dc);
+gen_psw_addr_disp(dc, psw_addr, 0);
 /* FALLTHRU */
 case DISAS_PC_UPDATED:
 /* Next TB starts off with CC_OP_DYNAMIC, so make sure the
-- 
2.34.1




[PATCH v2 10/10] target/s390x: Enable CF_PCREL

2024-06-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/s390x/cpu.c   | 17 +
 target/s390x/tcg/translate.c | 71 +++-
 2 files changed, 62 insertions(+), 26 deletions(-)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index c786767bd1..9f03190c35 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -39,6 +39,7 @@
 #include "sysemu/reset.h"
 #endif
 #include "hw/s390x/cpu-topology.h"
+#include "exec/translation-block.h"
 
 #define CR0_RESET   0xE0UL
 #define CR14_RESET  0xC200UL;
@@ -111,6 +112,16 @@ uint64_t s390_cpu_get_psw_mask(CPUS390XState *env)
 return r;
 }
 
+static void s390_cpu_synchronize_from_tb(CPUState *cs,
+ const TranslationBlock *tb)
+{
+/* The program counter is always up to date with CF_PCREL. */
+if (!(tb_cflags(tb) & CF_PCREL)) {
+CPUS390XState *env = cpu_env(cs);
+env->psw.addr = tb->pc;
+}
+}
+
 static void s390_cpu_set_pc(CPUState *cs, vaddr value)
 {
 S390CPU *cpu = S390_CPU(cs);
@@ -246,6 +257,11 @@ static void s390_cpu_realizefn(DeviceState *dev, Error 
**errp)
 S390CPUClass *scc = S390_CPU_GET_CLASS(dev);
 Error *err = NULL;
 
+#if defined(CONFIG_TCG) && !defined(CONFIG_USER_ONLY)
+/* Use pc-relative instructions in system-mode */
+cs->tcg_cflags |= CF_PCREL;
+#endif
+
 /* the model has to be realized before qemu_init_vcpu() due to kvm */
 s390_realize_cpu_model(cs, );
 if (err) {
@@ -368,6 +384,7 @@ void cpu_get_tb_cpu_state(CPUS390XState *env, vaddr *pc,
 
 static const TCGCPUOps s390_tcg_ops = {
 .initialize = s390x_translate_init,
+.synchronize_from_tb = s390_cpu_synchronize_from_tb,
 .restore_state_to_opc = s390x_restore_state_to_opc,
 
 #ifdef CONFIG_USER_ONLY
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 0ee14484d0..6961ad7c67 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -139,6 +139,7 @@ struct DisasFields {
 struct DisasContext {
 DisasContextBase base;
 const DisasInsn *insn;
+target_ulong pc_save;
 DisasFields fields;
 uint64_t ex_value;
 uint32_t ilen;
@@ -161,28 +162,6 @@ static uint64_t inline_branch_hit[CC_OP_MAX];
 static uint64_t inline_branch_miss[CC_OP_MAX];
 #endif
 
-static void gen_psw_addr_disp(DisasContext *s, TCGv_i64 dest, int64_t disp)
-{
-tcg_gen_movi_i64(dest, s->base.pc_next + disp);
-}
-
-static void pc_to_link_info(TCGv_i64 out, DisasContext *s)
-{
-TCGv_i64 tmp;
-
-if (s->base.tb->flags & FLAG_MASK_64) {
-gen_psw_addr_disp(s, out, s->ilen);
-return;
-}
-
-tmp = tcg_temp_new_i64();
-gen_psw_addr_disp(s, tmp, s->ilen);
-if (s->base.tb->flags & FLAG_MASK_32) {
-tcg_gen_ori_i64(tmp, tmp, 0x8000);
-}
-tcg_gen_deposit_i64(out, out, tmp, 0, 32);
-}
-
 static TCGv_i64 psw_addr;
 static TCGv_i64 psw_mask;
 static TCGv_i64 gbea;
@@ -338,6 +317,34 @@ static void store_freg32_i64(int reg, TCGv_i64 v)
 tcg_gen_st32_i64(v, tcg_env, freg32_offset(reg));
 }
 
+static void gen_psw_addr_disp(DisasContext *s, TCGv_i64 dest, int64_t disp)
+{
+assert(s->pc_save != -1);
+if (tb_cflags(s->base.tb) & CF_PCREL) {
+disp += s->base.pc_next - s->pc_save;
+tcg_gen_addi_i64(dest, psw_addr, disp);
+} else {
+tcg_gen_movi_i64(dest, s->base.pc_next + disp);
+}
+}
+
+static void pc_to_link_info(TCGv_i64 out, DisasContext *s)
+{
+TCGv_i64 tmp;
+
+if (s->base.tb->flags & FLAG_MASK_64) {
+gen_psw_addr_disp(s, out, s->ilen);
+return;
+}
+
+tmp = tcg_temp_new_i64();
+gen_psw_addr_disp(s, tmp, s->ilen);
+if (s->base.tb->flags & FLAG_MASK_32) {
+tcg_gen_ori_i64(tmp, tmp, 0x8000);
+}
+tcg_gen_deposit_i64(out, out, tmp, 0, 32);
+}
+
 static void per_branch(DisasContext *s, TCGv_i64 dest)
 {
 #ifndef CONFIG_USER_ONLY
@@ -1081,13 +1088,13 @@ static DisasJumpType help_goto_direct(DisasContext *s, 
int64_t disp)
 if (disp == s->ilen) {
 return DISAS_NEXT;
 }
+gen_psw_addr_disp(s, psw_addr, disp);
 if (use_goto_tb(s, dest)) {
 tcg_gen_goto_tb(0);
-gen_psw_addr_disp(s, psw_addr, disp);
 tcg_gen_exit_tb(s->base.tb, 0);
 return DISAS_NORETURN;
 } else {
-gen_psw_addr_disp(s, psw_addr, disp);
+s->pc_save = dest;
 return DISAS_PC_CC_UPDATED;
 }
 }
@@ -1097,6 +1104,7 @@ static DisasJumpType help_goto_indirect(DisasContext *s, 
TCGv_i64 dest)
 update_cc_op(s);
 per_breaking_event(s);
 tcg_gen_mov_i64(psw_addr, dest);
+s->pc_save = -1;
 per_branch(s, psw_addr);
 return DISAS_PC_CC_UPDATED;
 }
@@ -1173,6 +1181,7 @@ static DisasJumpType help_branch(DisasContext *s, 
DisasCompare *c,
 tcg_gen_exit_tb(s->base.tb, 1);

[PATCH v2 07/10] target/s390x: Use gen_psw_addr_disp in op_sam

2024-06-05 Thread Richard Henderson
Complicated because we may now require a runtime jump.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 39 +---
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 0f0688424f..bce9a0aeb0 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -3805,7 +3805,7 @@ static DisasJumpType op_sacf(DisasContext *s, DisasOps *o)
 static DisasJumpType op_sam(DisasContext *s, DisasOps *o)
 {
 int sam = s->insn->data;
-TCGv_i64 tsam;
+TCGLabel *fault = NULL;
 uint64_t mask;
 
 switch (sam) {
@@ -3820,20 +3820,35 @@ static DisasJumpType op_sam(DisasContext *s, DisasOps 
*o)
 break;
 }
 
-/* Bizarre but true, we check the address of the current insn for the
-   specification exception, not the next to be executed.  Thus the PoO
-   documents that Bad Things Happen two bytes before the end.  */
-if (s->base.pc_next & ~mask) {
-gen_program_exception(s, PGM_SPECIFICATION);
-return DISAS_NORETURN;
-}
-s->pc_tmp &= mask;
+/*
+ * Bizarre but true, we check the address of the current insn for the
+ * specification exception, not the next to be executed.  Thus the PoO
+ * documents that Bad Things Happen two bytes before the end.
+ */
+if (mask != -1) {
+TCGv_i64 t = tcg_temp_new_i64();
+fault = gen_new_label();
 
-tsam = tcg_constant_i64(sam);
-tcg_gen_deposit_i64(psw_mask, psw_mask, tsam, 31, 2);
+gen_psw_addr_disp(s, t, 0);
+tcg_gen_andi_i64(t, t, ~mask);
+tcg_gen_brcondi_i64(TCG_COND_NE, t, 0, fault);
+}
+
+update_cc_op(s);
+
+tcg_gen_deposit_i64(psw_mask, psw_mask, tcg_constant_i64(sam), 31, 2);
+
+gen_psw_addr_disp(s, psw_addr, s->ilen);
+tcg_gen_andi_i64(psw_addr, psw_addr, mask);
 
 /* Always exit the TB, since we (may have) changed execution mode.  */
-return DISAS_TOO_MANY;
+tcg_gen_lookup_and_goto_ptr();
+
+if (mask != -1) {
+gen_set_label(fault);
+gen_program_exception(s, PGM_SPECIFICATION);
+}
+return DISAS_NORETURN;
 }
 
 static DisasJumpType op_sar(DisasContext *s, DisasOps *o)
-- 
2.34.1




[PATCH v2 01/10] target/s390x: Change help_goto_direct to work on displacements

2024-06-05 Thread Richard Henderson
In preparation for CF_PCREL, reduce reliance on absolute values.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index c81e035dea..f25ae02a4e 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1073,13 +1073,15 @@ struct DisasInsn {
 /* == */
 /* Miscellaneous helpers, used by several operations.  */
 
-static DisasJumpType help_goto_direct(DisasContext *s, uint64_t dest)
+static DisasJumpType help_goto_direct(DisasContext *s, int64_t disp)
 {
+uint64_t dest = s->base.pc_next + disp;
+
 update_cc_op(s);
 per_breaking_event(s);
 per_branch(s, tcg_constant_i64(dest));
 
-if (dest == s->pc_tmp) {
+if (disp == s->ilen) {
 return DISAS_NEXT;
 }
 if (use_goto_tb(s, dest)) {
@@ -1105,7 +1107,8 @@ static DisasJumpType help_goto_indirect(DisasContext *s, 
TCGv_i64 dest)
 static DisasJumpType help_branch(DisasContext *s, DisasCompare *c,
  bool is_imm, int imm, TCGv_i64 cdest)
 {
-uint64_t dest = s->base.pc_next + (int64_t)imm * 2;
+int64_t disp = (int64_t)imm * 2;
+uint64_t dest = s->base.pc_next + disp;
 TCGLabel *lab;
 
 /* Take care of the special cases first.  */
@@ -1120,7 +1123,7 @@ static DisasJumpType help_branch(DisasContext *s, 
DisasCompare *c,
 if (c->cond == TCG_COND_ALWAYS
 || (dest == s->pc_tmp &&
 !(s->base.tb->flags & FLAG_MASK_PER_BRANCH))) {
-return help_goto_direct(s, dest);
+return help_goto_direct(s, disp);
 }
 } else {
 if (!cdest) {
-- 
2.34.1




[PATCH v2 05/10] target/s390x: Use gen_psw_addr_disp in save_link_info

2024-06-05 Thread Richard Henderson
Trivial but non-mechanical conversion away from pc_tmp.

Reviewed-by: Ilya Leoshkevich 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 2d611da8af..2654c85a8e 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1425,9 +1425,11 @@ static void save_link_info(DisasContext *s, DisasOps *o)
 return;
 }
 gen_op_calc_cc(s);
-tcg_gen_andi_i64(o->out, o->out, 0xull);
-tcg_gen_ori_i64(o->out, o->out, ((s->ilen / 2) << 30) | s->pc_tmp);
 t = tcg_temp_new_i64();
+tcg_gen_andi_i64(o->out, o->out, 0xull);
+gen_psw_addr_disp(s, t, s->ilen);
+tcg_gen_or_i64(o->out, o->out, t);
+tcg_gen_ori_i64(o->out, o->out, (s->ilen / 2) << 30);
 tcg_gen_shri_i64(t, psw_mask, 16);
 tcg_gen_andi_i64(t, t, 0x0f00);
 tcg_gen_or_i64(o->out, o->out, t);
-- 
2.34.1




[PULL 14/16] disas/microblaze: Print registers directly with PRIrfsl

2024-06-05 Thread Richard Henderson
Use a printf format instead of sprintf into a buffer.

Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-20-richard.hender...@linaro.org>
---
 disas/microblaze.c | 22 +-
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/disas/microblaze.c b/disas/microblaze.c
index 390f98c0a3..24febfdea9 100644
--- a/disas/microblaze.c
+++ b/disas/microblaze.c
@@ -564,7 +564,6 @@ static const struct op_code_struct {
 
 /* prefix for register names */
 #define register_prefix "r"
-static const char fsl_register_prefix[] = "rfsl";
 static const char pvr_register_prefix[] = "rpvr";
 
 
@@ -580,11 +579,13 @@ static const char pvr_register_prefix[] = "rpvr";
 #include "disas/dis-asm.h"
 
 #define PRIregregister_prefix "%ld"
+#define PRIrfsl   register_prefix "fsl%ld"
 #define PRIimm"%d"
 
 #define get_field_rd(instr)  ((instr & RD_MASK) >> RD_LOW)
 #define get_field_r1(instr)  ((instr & RA_MASK) >> RA_LOW)
 #define get_field_r2(instr)  ((instr & RB_MASK) >> RB_LOW)
+#define get_field_rfsl(instr)(instr & RFSL_MASK)
 #define get_field_imm(instr) ((int16_t)instr)
 #define get_field_imm5(instr)((int)instr & IMM5_MASK)
 #define get_field_imm15(instr)   ((int)instr & IMM15_MASK)
@@ -592,19 +593,6 @@ static const char pvr_register_prefix[] = "rpvr";
 #define get_int_field_imm(instr) ((instr & IMM_MASK) >> IMM_LOW)
 #define get_int_field_r1(instr) ((instr & RA_MASK) >> RA_LOW)
 
-/* Local function prototypes. */
-
-static char * get_field_rfsl (long instr);
-
-static char *
-get_field_rfsl (long instr)
-{
-  char tmpstr[25];
-  snprintf(tmpstr, sizeof(tmpstr), "%s%d", fsl_register_prefix,
-   (short)((instr & RFSL_MASK) >> IMM_LOW));
-  return(strdup(tmpstr));
-}
-
 /*
   char *
   get_field_special (instr) 
@@ -803,11 +791,11 @@ print_insn_microblaze(bfd_vma memaddr, struct 
disassemble_info *info)
  get_field_imm5(inst));
 break;
 case INST_TYPE_RD_RFSL:
-fprintf_func(stream, "%s\t" PRIreg ", %s",
+fprintf_func(stream, "%s\t" PRIreg ", " PRIrfsl,
  op->name, get_field_rd(inst), get_field_rfsl(inst));
 break;
 case INST_TYPE_R1_RFSL:
-fprintf_func(stream, "%s\t" PRIreg ", %s",
+fprintf_func(stream, "%s\t" PRIreg ", " PRIrfsl,
  op->name, get_field_r1(inst), get_field_rfsl(inst));
 break;
 case INST_TYPE_RD_SPECIAL:
@@ -879,7 +867,7 @@ print_insn_microblaze(bfd_vma memaddr, struct 
disassemble_info *info)
  op->name, get_field_rd(inst));
 break;
 case INST_TYPE_RFSL:
-fprintf_func(stream, "%s\t%s",
+fprintf_func(stream, "%s\t" PRIrfsl,
  op->name, get_field_rfsl(inst));
 break;
 default:
-- 
2.34.1




[PULL 02/16] util/hexdump: Add unit_len and block_len to qemu_hexdump_line

2024-06-05 Thread Richard Henderson
Generalize the current 1 byte unit and 4 byte blocking
within the output.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-5-richard.hender...@linaro.org>
---
 include/qemu/cutils.h  |  6 +-
 hw/virtio/vhost-vdpa.c |  2 +-
 util/hexdump.c | 30 +-
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
index 14a3285343..da15547bfb 100644
--- a/include/qemu/cutils.h
+++ b/include/qemu/cutils.h
@@ -287,12 +287,16 @@ int parse_debug_env(const char *name, int max, int 
initial);
  * @str: GString into which to append
  * @buf: buffer to dump
  * @len: number of bytes to dump
+ * @unit_len: add a space between every @unit_len bytes
+ * @block_len: add an extra space between every @block_len bytes
  *
  * Append @len bytes of @buf as hexadecimal into @str.
+ * Add spaces between every @unit_len and @block_len bytes.
  * If @str is NULL, allocate a new string and return it;
  * otherwise return @str.
  */
-GString *qemu_hexdump_line(GString *str, const void *buf, size_t len);
+GString *qemu_hexdump_line(GString *str, const void *buf, size_t len,
+   size_t unit_len, size_t block_len);
 
 /*
  * Hexdump a buffer to a file. An optional string prefix is added to every line
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 419463c154..3cdaa12ed5 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -951,7 +951,7 @@ static void vhost_vdpa_dump_config(struct vhost_dev *dev, 
const uint8_t *config,
 len = MIN(config_len - b, 16);
 
 g_string_truncate(str, 0);
-qemu_hexdump_line(str, config + b, len);
+qemu_hexdump_line(str, config + b, len, 1, 4);
 trace_vhost_vdpa_dump_config(dev, b, str->str);
 }
 }
diff --git a/util/hexdump.c b/util/hexdump.c
index 521e346bc6..b29326b7f2 100644
--- a/util/hexdump.c
+++ b/util/hexdump.c
@@ -1,5 +1,5 @@
 /*
- * Helper to hexdump a buffer
+* Helper to hexdump a buffer
  *
  * Copyright (c) 2013 Red Hat, Inc.
  * Copyright (c) 2013 Gerd Hoffmann 
@@ -16,22 +16,34 @@
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
 
-GString *qemu_hexdump_line(GString *str, const void *vbuf, size_t len)
+GString *qemu_hexdump_line(GString *str, const void *vbuf, size_t len,
+   size_t unit_len, size_t block_len)
 {
 const uint8_t *buf = vbuf;
-size_t i;
+size_t u, b;
 
 if (str == NULL) {
 /* Estimate the length of the output to avoid reallocs. */
-i = len * 3 + len / 4;
-str = g_string_sized_new(i + 1);
+size_t est = len * 2;
+if (unit_len) {
+est += len / unit_len;
+}
+if (block_len) {
+est += len / block_len;
+}
+str = g_string_sized_new(est + 1);
 }
 
-for (i = 0; i < len; i++) {
-if (i != 0 && (i % 4) == 0) {
+for (u = 0, b = 0; len; u++, b++, len--, buf++) {
+if (unit_len && u == unit_len) {
 g_string_append_c(str, ' ');
+u = 0;
 }
-g_string_append_printf(str, " %02x", buf[i]);
+if (block_len && b == block_len) {
+g_string_append_c(str, ' ');
+b = 0;
+}
+g_string_append_printf(str, "%02x", *buf);
 }
 
 return str;
@@ -67,7 +79,7 @@ void qemu_hexdump(FILE *fp, const char *prefix,
 len = MIN(size - b, QEMU_HEXDUMP_LINE_BYTES);
 
 g_string_truncate(str, 0);
-qemu_hexdump_line(str, bufptr + b, len);
+qemu_hexdump_line(str, bufptr + b, len, 1, 4);
 asciidump_line(ascii, bufptr + b, len);
 
 fprintf(fp, "%s: %04zx: %-*s %s\n",
-- 
2.34.1




[PULL 01/16] util/hexdump: Use a GString for qemu_hexdump_line

2024-06-05 Thread Richard Henderson
Allocate a new, or append to an existing GString instead of
using a fixed sized buffer.  Require the caller to determine
the length of the line -- do not bound len here.

Signed-off-by: Richard Henderson 
Message-Id: <20240412073346.458116-4-richard.hender...@linaro.org>
---
 include/qemu/cutils.h  | 15 ++-
 hw/virtio/vhost-vdpa.c | 14 --
 util/hexdump.c | 27 ---
 3 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
index c5dea63742..14a3285343 100644
--- a/include/qemu/cutils.h
+++ b/include/qemu/cutils.h
@@ -282,12 +282,17 @@ static inline const char *yes_no(bool b)
  */
 int parse_debug_env(const char *name, int max, int initial);
 
-/*
- * Hexdump a line of a byte buffer into a hexadecimal/ASCII buffer
+/**
+ * qemu_hexdump_line:
+ * @str: GString into which to append
+ * @buf: buffer to dump
+ * @len: number of bytes to dump
+ *
+ * Append @len bytes of @buf as hexadecimal into @str.
+ * If @str is NULL, allocate a new string and return it;
+ * otherwise return @str.
  */
-#define QEMU_HEXDUMP_LINE_BYTES 16 /* Number of bytes to dump */
-#define QEMU_HEXDUMP_LINE_LEN 75   /* Number of characters in line */
-void qemu_hexdump_line(char *line, const void *bufptr, size_t len);
+GString *qemu_hexdump_line(GString *str, const void *buf, size_t len);
 
 /*
  * Hexdump a buffer to a file. An optional string prefix is added to every line
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7368b71902..419463c154 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -944,13 +944,15 @@ static int vhost_vdpa_set_config_call(struct vhost_dev 
*dev,
 static void vhost_vdpa_dump_config(struct vhost_dev *dev, const uint8_t 
*config,
uint32_t config_len)
 {
-int b, len;
-char line[QEMU_HEXDUMP_LINE_LEN];
+g_autoptr(GString) str = g_string_sized_new(4 * 16);
+size_t b, len;
 
-for (b = 0; b < config_len; b += 16) {
-len = config_len - b;
-qemu_hexdump_line(line, config + b, len);
-trace_vhost_vdpa_dump_config(dev, b, line);
+for (b = 0; b < config_len; b += len) {
+len = MIN(config_len - b, 16);
+
+g_string_truncate(str, 0);
+qemu_hexdump_line(str, config + b, len);
+trace_vhost_vdpa_dump_config(dev, b, str->str);
 }
 }
 
diff --git a/util/hexdump.c b/util/hexdump.c
index 0f943e31e5..521e346bc6 100644
--- a/util/hexdump.c
+++ b/util/hexdump.c
@@ -16,22 +16,25 @@
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
 
-void qemu_hexdump_line(char *line, const void *bufptr, size_t len)
+GString *qemu_hexdump_line(GString *str, const void *vbuf, size_t len)
 {
-const char *buf = bufptr;
-int i;
+const uint8_t *buf = vbuf;
+size_t i;
 
-if (len > QEMU_HEXDUMP_LINE_BYTES) {
-len = QEMU_HEXDUMP_LINE_BYTES;
+if (str == NULL) {
+/* Estimate the length of the output to avoid reallocs. */
+i = len * 3 + len / 4;
+str = g_string_sized_new(i + 1);
 }
 
 for (i = 0; i < len; i++) {
 if (i != 0 && (i % 4) == 0) {
-*line++ = ' ';
+g_string_append_c(str, ' ');
 }
-line += sprintf(line, " %02x", (unsigned char)buf[i]);
+g_string_append_printf(str, " %02x", buf[i]);
 }
-*line = '\0';
+
+return str;
 }
 
 static void asciidump_line(char *line, const void *bufptr, size_t len)
@@ -49,24 +52,26 @@ static void asciidump_line(char *line, const void *bufptr, 
size_t len)
 *line = '\0';
 }
 
+#define QEMU_HEXDUMP_LINE_BYTES 16
 #define QEMU_HEXDUMP_LINE_WIDTH \
 (QEMU_HEXDUMP_LINE_BYTES * 2 + QEMU_HEXDUMP_LINE_BYTES / 4)
 
 void qemu_hexdump(FILE *fp, const char *prefix,
   const void *bufptr, size_t size)
 {
-char line[QEMU_HEXDUMP_LINE_LEN];
+g_autoptr(GString) str = g_string_sized_new(QEMU_HEXDUMP_LINE_WIDTH + 1);
 char ascii[QEMU_HEXDUMP_LINE_BYTES + 1];
 size_t b, len;
 
 for (b = 0; b < size; b += len) {
 len = MIN(size - b, QEMU_HEXDUMP_LINE_BYTES);
 
-qemu_hexdump_line(line, bufptr + b, len);
+g_string_truncate(str, 0);
+qemu_hexdump_line(str, bufptr + b, len);
 asciidump_line(ascii, bufptr + b, len);
 
 fprintf(fp, "%s: %04zx: %-*s %s\n",
-prefix, b, QEMU_HEXDUMP_LINE_WIDTH, line, ascii);
+prefix, b, QEMU_HEXDUMP_LINE_WIDTH, str->str, ascii);
 }
 
 }
-- 
2.34.1




[PULL 09/16] disas/microblaze: Split out print_immval_addr

2024-06-05 Thread Richard Henderson
Unify the code blocks that try to print a symbolic address.

Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-15-richard.hender...@linaro.org>
---
 disas/microblaze.c | 89 +++---
 1 file changed, 29 insertions(+), 60 deletions(-)

diff --git a/disas/microblaze.c b/disas/microblaze.c
index 49a4c0fd40..3473c94164 100644
--- a/disas/microblaze.c
+++ b/disas/microblaze.c
@@ -767,6 +767,24 @@ read_insn_microblaze (bfd_vma memaddr,
   return inst;
 }
 
+static void print_immval_addr(struct disassemble_info *info, bool immfound,
+  int immval, unsigned inst, int addend)
+{
+if (info->print_address_func && info->symbol_at_address_func) {
+if (immfound) {
+immval |= get_int_field_imm(inst) & 0x;
+} else {
+immval = (int16_t)get_int_field_imm(inst);
+}
+immval += addend;
+if (immval != 0 && info->symbol_at_address_func(immval, info)) {
+info->fprintf_func(info->stream, "\t// ");
+info->print_address_func (immval, info);
+} else if (addend) {
+info->fprintf_func(info->stream, "\t// %x", immval);
+}
+}
+}
 
 int 
 print_insn_microblaze (bfd_vma memaddr, struct disassemble_info * info)
@@ -821,18 +839,8 @@ print_insn_microblaze (bfd_vma memaddr, struct 
disassemble_info * info)
  break;
 case INST_TYPE_RD_R1_IMM:
  fprintf_func(stream, "\t%s, %s, %s", get_field_rd(inst), 
get_field_r1(inst), get_field_imm(inst));
- if (info->print_address_func && get_int_field_r1(inst) == 0 && 
info->symbol_at_address_func) {
-   if (immfound)
- immval |= (get_int_field_imm(inst) & 0x);
-   else {
- immval = get_int_field_imm(inst);
- if (immval & 0x8000)
-   immval |= 0x;
-   }
-   if (immval > 0 && info->symbol_at_address_func(immval, info)) {
- fprintf_func (stream, "\t// ");
- info->print_address_func (immval, info);
-   }
+ if (get_int_field_r1(inst) == 0) {
+  print_immval_addr(info, immfound, immval, inst, 0);
  }
  break;
case INST_TYPE_RD_R1_IMM5:
@@ -860,61 +868,22 @@ print_insn_microblaze (bfd_vma memaddr, struct 
disassemble_info * info)
  fprintf_func(stream, "\t%s, %s", get_field_r1(inst), 
get_field_imm(inst));
  /* The non-pc relative instructions are returns, which shouldn't 
 have a label printed */
- if (info->print_address_func && op->inst_offset_type == 
INST_PC_OFFSET && info->symbol_at_address_func) {
-   if (immfound)
- immval |= (get_int_field_imm(inst) & 0x);
-   else {
- immval = get_int_field_imm(inst);
- if (immval & 0x8000)
-   immval |= 0x;
-   }
-   immval += memaddr;
-   if (immval > 0 && info->symbol_at_address_func(immval, info)) {
- fprintf_func (stream, "\t// ");
- info->print_address_func (immval, info);
-   } else {
- fprintf_func (stream, "\t\t// ");
- fprintf_func (stream, "%x", immval);
-   }
+ if (op->inst_offset_type == INST_PC_OFFSET) {
+  print_immval_addr(info, immfound, immval, inst, memaddr);
  }
  break;
 case INST_TYPE_RD_IMM:
  fprintf_func(stream, "\t%s, %s", get_field_rd(inst), 
get_field_imm(inst));
- if (info->print_address_func && info->symbol_at_address_func) {
-   if (immfound)
- immval |= (get_int_field_imm(inst) & 0x);
-   else {
- immval = get_int_field_imm(inst);
- if (immval & 0x8000)
-   immval |= 0x;
-   }
-   if (op->inst_offset_type == INST_PC_OFFSET)
- immval += (int) memaddr;
-   if (info->symbol_at_address_func(immval, info)) {
- fprintf_func (stream, "\t// ");
- info->print_address_func (immval, info);
-   } 
- }
+  print_immval_addr(info, immfound, immval, inst,
+op->inst_offset_type == INST_PC_OFFSET
+? memaddr : 0);
  break;
 case INST_TYPE_IMM:
  fprintf_func(stream, "\t%s", get_field_imm(inst));
- if (info->print_address_func && info->symbol_at_address_func && 
op->instr != imm) {
-   if (immfound)
- immval |= (get_int_field_imm(inst) & 0x

[PULL 10/16] disas/microblaze: Re-indent print_insn_microblaze

2024-06-05 Thread Richard Henderson
Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-16-richard.hender...@linaro.org>
---
 disas/microblaze.c | 263 -
 1 file changed, 141 insertions(+), 122 deletions(-)

diff --git a/disas/microblaze.c b/disas/microblaze.c
index 3473c94164..c729c76585 100644
--- a/disas/microblaze.c
+++ b/disas/microblaze.c
@@ -787,134 +787,153 @@ static void print_immval_addr(struct disassemble_info 
*info, bool immfound,
 }
 
 int 
-print_insn_microblaze (bfd_vma memaddr, struct disassemble_info * info)
+print_insn_microblaze(bfd_vma memaddr, struct disassemble_info *info)
 {
-  fprintf_functionfprintf_func = info->fprintf_func;
-  void *  stream = info->stream;
-  unsigned long   inst, prev_inst;
-  const struct op_code_struct *op, *pop;
-  int immval = 0;
-  bfd_boolean immfound = FALSE;
-  static bfd_vma prev_insn_addr = -1; /*init the prev insn addr */
-  static int prev_insn_vma = -1;  /*init the prev insn vma */
-  intcurr_insn_vma = info->buffer_vma;
+fprintf_function fprintf_func = info->fprintf_func;
+void *stream = info->stream;
+unsigned long inst, prev_inst;
+const struct op_code_struct *op, *pop;
+int immval = 0;
+bool immfound = false;
+static bfd_vma prev_insn_addr = -1; /*init the prev insn addr */
+static int prev_insn_vma = -1;  /*init the prev insn vma */
+int curr_insn_vma = info->buffer_vma;
 
-  info->bytes_per_chunk = 4;
+info->bytes_per_chunk = 4;
 
-  inst = read_insn_microblaze (memaddr, info, );
-  if (inst == 0) {
-return -1;
-  }
+inst = read_insn_microblaze (memaddr, info, );
+if (inst == 0) {
+return -1;
+}
   
-  if (prev_insn_vma == curr_insn_vma) {
-  if (memaddr-(info->bytes_per_chunk) == prev_insn_addr) {
-prev_inst = read_insn_microblaze (prev_insn_addr, info, );
-if (prev_inst == 0)
-  return -1;
-if (pop->instr == imm) {
-  immval = (get_int_field_imm(prev_inst) << 16) & 0x;
-  immfound = TRUE;
+if (prev_insn_vma == curr_insn_vma) {
+if (memaddr - info->bytes_per_chunk == prev_insn_addr) {
+prev_inst = read_insn_microblaze (prev_insn_addr, info, );
+if (prev_inst == 0)
+return -1;
+if (pop->instr == imm) {
+immval = (get_int_field_imm(prev_inst) << 16) & 0x;
+immfound = TRUE;
+}
+else {
+immval = 0;
+immfound = FALSE;
+}
+}
 }
-else {
-  immval = 0;
-  immfound = FALSE;
-}
-  }
-  }
-  /* make curr insn as prev insn */
-  prev_insn_addr = memaddr;
-  prev_insn_vma = curr_insn_vma;
+/* make curr insn as prev insn */
+prev_insn_addr = memaddr;
+prev_insn_vma = curr_insn_vma;
 
-  if (op->name == 0) {
-fprintf_func (stream, ".short 0x%04lx", inst);
-  }
-  else
-{
-  fprintf_func (stream, "%s", op->name);
+if (op->name == 0) {
+fprintf_func (stream, ".short 0x%04lx", inst);
+return 4;
+}
+
+fprintf_func (stream, "%s", op->name);
   
-  switch (op->inst_type)
-   {
-  case INST_TYPE_RD_R1_R2:
- fprintf_func(stream, "\t%s, %s, %s", get_field_rd(inst), 
get_field_r1(inst), get_field_r2(inst));
- break;
-case INST_TYPE_RD_R1_IMM:
- fprintf_func(stream, "\t%s, %s, %s", get_field_rd(inst), 
get_field_r1(inst), get_field_imm(inst));
- if (get_int_field_r1(inst) == 0) {
-  print_immval_addr(info, immfound, immval, inst, 0);
- }
- break;
-   case INST_TYPE_RD_R1_IMM5:
- fprintf_func(stream, "\t%s, %s, %s", get_field_rd(inst), 
get_field_r1(inst), get_field_imm5(inst));
- break;
-   case INST_TYPE_RD_RFSL:
- fprintf_func(stream, "\t%s, %s", get_field_rd(inst), 
get_field_rfsl(inst));
- break;
-   case INST_TYPE_R1_RFSL:
- fprintf_func(stream, "\t%s, %s", get_field_r1(inst), 
get_field_rfsl(inst));
- break;
-   case INST_TYPE_RD_SPECIAL:
- fprintf_func(stream, "\t%s, %s", get_field_rd(inst), 
get_field_special(inst, op));
- break;
-   case INST_TYPE_SPECIAL_R1:
- fprintf_func(stream, "\t%s, %s", get_field_special(inst, op), 
get_field_r1(inst));
- break;
-   case INST_TYPE_RD_R1:
- fprintf_func(stream, "\t%s, %s", get_field_rd(inst), 
get_field_r1(inst));
- break;
-   case INST_TYPE_R1_R2:
- fprintf_func(stream, "\t%s, %s", get_field_r1(inst), 
get_field_r2(inst));
- break;
-   case INST_TYPE_R1_IMM:
- fprintf_func(stream, "\t%s, %s", get_field_r1(inst), 
get_field_i

[Qemu-commits] [qemu/qemu] 53ee5f: util/hexdump: Use a GString for qemu_hexdump_line

2024-06-05 Thread Richard Henderson via Qemu-commits
  Branch: refs/heads/staging
  Home:   https://github.com/qemu/qemu
  Commit: 53ee5f551e5743516c90a662425276cae4cf0aeb
  
https://github.com/qemu/qemu/commit/53ee5f551e5743516c90a662425276cae4cf0aeb
  Author: Richard Henderson 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/virtio/vhost-vdpa.c
M include/qemu/cutils.h
M util/hexdump.c

  Log Message:
  ---
  util/hexdump: Use a GString for qemu_hexdump_line

Allocate a new, or append to an existing GString instead of
using a fixed sized buffer.  Require the caller to determine
the length of the line -- do not bound len here.

Signed-off-by: Richard Henderson 
Message-Id: <20240412073346.458116-4-richard.hender...@linaro.org>


  Commit: c49d1c37d89a2ea994861600859b7dcd3ffa4ede
  
https://github.com/qemu/qemu/commit/c49d1c37d89a2ea994861600859b7dcd3ffa4ede
  Author: Richard Henderson 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/virtio/vhost-vdpa.c
M include/qemu/cutils.h
M util/hexdump.c

  Log Message:
  ---
  util/hexdump: Add unit_len and block_len to qemu_hexdump_line

Generalize the current 1 byte unit and 4 byte blocking
within the output.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-5-richard.hender...@linaro.org>


  Commit: 10e4927bc4c5ad673e12c0731e6150050cf327de
  
https://github.com/qemu/qemu/commit/10e4927bc4c5ad673e12c0731e6150050cf327de
  Author: Richard Henderson 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M util/hexdump.c

  Log Message:
  ---
  util/hexdump: Inline g_string_append_printf "%02x"

Trivial arithmetic can be used for emitting the nibbles,
rather than full-blown printf formatting.

Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-6-richard.hender...@linaro.org>


  Commit: 3a8ff3667187459427eef5c6b1cdb950c563e094
  
https://github.com/qemu/qemu/commit/3a8ff3667187459427eef5c6b1cdb950c563e094
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/mips/malta.c

  Log Message:
  ---
  hw/mips/malta: Add re-usable rng_seed_hex_new() method

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.

Extract common code from reinitialize_rng_seed and load_kernel
to rng_seed_hex_new.  Using qemu_hexdump_line both fixes the
deprecation warning and simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé 
[rth: Use qemu_hexdump_line.]
Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-7-richard.hender...@linaro.org>


  Commit: 4b69210978bdf92d50b84d8662b3c38c78d79803
  
https://github.com/qemu/qemu/commit/4b69210978bdf92d50b84d8662b3c38c78d79803
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M system/qtest.c

  Log Message:
  ---
  system/qtest: Replace sprintf by qemu_hexdump_line

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé `
[rth: Use qemu_hexdump_line]
Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-8-richard.hender...@linaro.org>


  Commit: 00a17d803d0931b00bffdb3b3e8a3e81251de9fa
  
https://github.com/qemu/qemu/commit/00a17d803d0931b00bffdb3b3e8a3e81251de9fa
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/scsi/scsi-disk.c

  Log Message:
  ---
  hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Note that this drops the "0x" prefix to every byte, which should
be of no consequence to tracing.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240412073346.458116-9-richard.hender...@linaro.org>


  Commit: 80e945894acf6ca837f03292a22cbf44550d22df
  
https://github.com/qemu/qemu/commit/80e945894acf6ca837f03292a22cbf44550d22df
  Author: Philippe Mathieu-Daudé 
  Date:   2024-06-05 (Wed, 05 Jun 2024)

  Changed paths:
M hw/ide/atapi.c

  Log Message:
  ---
  hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-10-richard.hender...@linaro.org>


  Commit: 7210ddb45fd6ee32140ac9d9731b88c0f61c3f0b
  
https://github.com/qemu/qemu/commit/7210ddb45fd6ee32140

[PULL 08/16] hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf

2024-06-05 Thread Richard Henderson
From: Philippe Mathieu-Daudé 

sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line both fixes the deprecation warning and
simplifies the code base.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Reviewed-by: Pierrick Bouvier 
Message-Id: <20240412073346.458116-11-richard.hender...@linaro.org>
---
 hw/dma/pl330.c | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
index 70a502d245..5f89295af3 100644
--- a/hw/dma/pl330.c
+++ b/hw/dma/pl330.c
@@ -15,6 +15,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/cutils.h"
 #include "hw/irq.h"
 #include "hw/qdev-properties.h"
 #include "hw/sysbus.h"
@@ -317,22 +318,14 @@ typedef struct PL330InsnDesc {
 
 static void pl330_hexdump(uint8_t *buf, size_t size)
 {
-unsigned int b, i, len;
-char tmpbuf[80];
+g_autoptr(GString) str = g_string_sized_new(64);
+size_t b, len;
 
-for (b = 0; b < size; b += 16) {
-len = size - b;
-if (len > 16) {
-len = 16;
-}
-tmpbuf[0] = '\0';
-for (i = 0; i < len; i++) {
-if ((i % 4) == 0) {
-strcat(tmpbuf, " ");
-}
-sprintf(tmpbuf + strlen(tmpbuf), " %02x", buf[b + i]);
-}
-trace_pl330_hexdump(b, tmpbuf);
+for (b = 0; b < size; b += len) {
+len = MIN(16, size - b);
+g_string_truncate(str, 0);
+qemu_hexdump_line(str, buf + b, len, 1, 4);
+trace_pl330_hexdump(b, str->str);
 }
 }
 
-- 
2.34.1




[PULL 00/16] sprintf fixes

2024-06-05 Thread Richard Henderson
The following changes since commit f1572ab94738bd5787b7badcd4bd93a3657f0680:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging 
(2024-06-05 07:45:23 -0700)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-misc-20240605

for you to fetch changes up to b89fb575fd467ed5dfde4608d51c47c2aa427f30:

  disas/riscv: Use GString in format_inst (2024-06-05 12:29:54 -0700)


util/hexdump: Use a GString for qemu_hexdump_line.
system/qtest: Replace sprintf by qemu_hexdump_line
hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf
disas/microblaze: Reorg to avoid intermediate sprintf
disas/riscv: Use GString in format_inst


Philippe Mathieu-Daudé (5):
  hw/mips/malta: Add re-usable rng_seed_hex_new() method
  system/qtest: Replace sprintf by qemu_hexdump_line
  hw/scsi/scsi-disk: Use qemu_hexdump_line to avoid sprintf
  hw/ide/atapi: Use qemu_hexdump_line to avoid sprintf
  hw/dma/pl330: Use qemu_hexdump_line to avoid sprintf

Richard Henderson (11):
  util/hexdump: Use a GString for qemu_hexdump_line
  util/hexdump: Add unit_len and block_len to qemu_hexdump_line
  util/hexdump: Inline g_string_append_printf "%02x"
  disas/microblaze: Split out print_immval_addr
  disas/microblaze: Re-indent print_insn_microblaze
  disas/microblaze: Merge op->name output into each fprintf
  disas/microblaze: Print registers directly with PRIreg
  disas/microblaze: Print immediates directly with PRIimm
  disas/microblaze: Print registers directly with PRIrfsl
  disas/microblaze: Split get_field_special
  disas/riscv: Use GString in format_inst

 include/qemu/cutils.h  |  19 +-
 disas/microblaze.c | 551 +
 disas/riscv.c  | 209 +--
 hw/dma/pl330.c |  23 +--
 hw/ide/atapi.c |  12 +-
 hw/mips/malta.c|  25 +--
 hw/scsi/scsi-disk.c|  13 +-
 hw/virtio/vhost-vdpa.c |  14 +-
 system/qtest.c |  12 +-
 util/hexdump.c |  57 +++--
 10 files changed, 418 insertions(+), 517 deletions(-)



  1   2   3   4   5   6   7   8   9   10   >