[valgrind] [Bug 203380] Add a client request to register debug info for code ranges
https://bugs.kde.org/show_bug.cgi?id=203380

Julian Seward changed:
  Attachment #158504 is obsolete: 0 -> 1

--- Comment #7 from Julian Seward ---
Created attachment 158505
  --> https://bugs.kde.org/attachment.cgi?id=158505&action=edit
Majorly revised and improved patch, take 2 (with debug printing removed)

-- 
You are receiving this mail because:
You are watching all bug changes.
[valgrind] [Bug 203380] Add a client request to register debug info for code ranges
https://bugs.kde.org/show_bug.cgi?id=203380

Julian Seward changed:
  Attachment #36062 is obsolete: 0 -> 1
  Attachment #57659 is obsolete: 0 -> 1

--- Comment #6 from Julian Seward ---
Created attachment 158504
  --> https://bugs.kde.org/attachment.cgi?id=158504&action=edit
Majorly revised and improved patch

Here's a revised and improved patch that I have used recently for profiling
webassembly applications running on SpiderMonkey. It's still not particularly
sophisticated, but works well enough to be useful.

It retains the basic design of earlier versions of the patch, in that it
stands outside the per-object DebugInfo mechanism used for AOT code. Hence it
is localised complexity that does not impact the existing mechanisms.
[valgrind] [Bug 380942] Experimental: add MESI protocol simulation to Callgrind
https://bugs.kde.org/show_bug.cgi?id=380942 Julian Seward changed: What|Removed |Added Attachment #105965|0 |1 is obsolete|| --- Comment #3 from Julian Seward --- Created attachment 156636 --> https://bugs.kde.org/attachment.cgi?id=156636=edit valgrind-bug380942-Callgrind-MESI.diff-2023Feb23-rebased Here's the same patch rebased to V sources of 23 Feb 2023. --fair-sched=yes has been hardwired (see comment 1) and the scheduling quantum has been reduced from 1200 to 800, for safety. TL;DR to build is now: git clone https://sourceware.org/git/valgrind.git mesi2023 cd mesi2023 patch -p1 < valgrind-bug380942-Callgrind-MESI.diff-2023Feb23-rebased ./autogen.sh ./configure --prefix=`pwd`/Inst --enable-only64bit make -j 8 all make -j 8 install To run: /path/to/mesi2023/Inst/bin/valgrind \ --tool=callgrind --simulate-mesi=yes kcachegrind callgrind.out. See comment 1 for more details on the RFO events. These are the "requests for ownership" (cache line transfers) which contain the info of interest here. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 383010] Add support for AVX-512 instructions
https://bugs.kde.org/show_bug.cgi?id=383010

--- Comment #70 from Julian Seward ---
Created attachment 145020
  --> https://bugs.kde.org/attachment.cgi?id=145020&action=edit
Fix handling of no-mask reg-reg versions of VEXPAND* and VCOMPRESS*

Here's a bug fix for the VEXPAND and VCOMPRESS instructions, specifically
for the register-to-register, mask-free versions. By "mask-free" I mean they
do not specify any of `{k1}` to `{k7}`. (I think that makes the instructions
into trivial reg-to-reg copies, but that's irrelevant). The bug is that the
generated IR acts as if `{k0}` had been specified, and so the result depends
on whatever value is in `k0` at the time.

I worry that there are potentially other places where the IR is generated
using `getKReg(mask)` when really it should be `mask ? getKReg(mask) :
mkU64(0)`, and that testing isn't catching these. Not sure though.
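The suspected pattern can be illustrated with a small standalone model. This is a sketch of the decode-time choice described in the comment, not valgrind's actual code; the names (`k_regs`, `get_mask_*`) are hypothetical:

```c
#include <stdint.h>

/* Simulated k0..k7 mask registers. k0's contents are arbitrary at the
   point the instruction is decoded. */
static uint64_t k_regs[8];

/* Buggy shape: always reads the encoded k register, even when no mask
   specifier was present (mask field == 0), so the result depends on
   whatever junk is in k0. */
uint64_t get_mask_buggy(unsigned mask_field) {
    return k_regs[mask_field];
}

/* Fixed shape, following the comment's suggested
   `mask ? getKReg(mask) : mkU64(0)`: an absent mask specifier yields a
   fixed value instead of reading k0. */
uint64_t get_mask_fixed(unsigned mask_field) {
    return mask_field ? k_regs[mask_field] : 0;
}
```

With junk in `k0`, the two disagree exactly when no mask is encoded, which matches the symptom: results that depend on the prior contents of `k0`.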
[valgrind] [Bug 383010] Add support for AVX-512 instructions
https://bugs.kde.org/show_bug.cgi?id=383010

--- Comment #68 from Julian Seward ---
Created attachment 144930
  --> https://bugs.kde.org/attachment.cgi?id=144930&action=edit
valgrind-avx512-rollup-fixes-2021Dec29.diff

Rollup fixes to be applied on top of (after) the patches in comments 58, 59
and 60:

* fixes the problem described in comment 67. The patch set extends
  AMD64Instr::CStore and AMD64Instr::CLoad to also handle 8- and 16-bit
  conditional stores and loads. However, the emit_AMD64Instr clauses for
  these cases were not correct and were still generating 64-bit
  transactions. This fixes them. That removes a bunch of incorrect results
  in regression tests, and crashing when running real programs. The test
  case avx512-skx is still failing, though.

* [trivial fixes] fixes segfaults caused by insufficient alignment in test
  cases avx512-l1.c and avx512-l2.c

* [temporary] disables a few test cases in avx512-l1.c since they don't run
  on my hardware (Core i5-1135G7), even natively.
[valgrind] [Bug 383010] Add support for AVX-512 instructions
https://bugs.kde.org/show_bug.cgi?id=383010

--- Comment #67 from Julian Seward ---
Created attachment 144910
  --> https://bugs.kde.org/attachment.cgi?id=144910&action=edit
Demonstrates misbehaving `vmovdqu8 %ymm7, (%r9){%k5}`

I've been testing the patches from comments 58, 59 and 60 against the trunk,
using Fedora 35 running on a Core i5-1135G7. It passes the tests in the
comment 60 patch, but causes regressions in various other tests.

I tracked one problem down to an incorrect implementation of 256-bit stores
that use a guard register (k1 .. k7). This causes glibc's memset() to
misbehave, hence causing --tool=none runs to fail. Testcase is attached.
I imagine it's caused by an incorrect translation into IR, but I haven't
figured out how that translation is done.
[valgrind] [Bug 446103] Memcheck: `--track-origins=yes` causes extreme slowdowns for large mmap/munmap
https://bugs.kde.org/show_bug.cgi?id=446103

Julian Seward changed:
  Status: REPORTED -> RESOLVED
  Resolution: --- -> FIXED

--- Comment #2 from Julian Seward ---
Fixed, 8ee1656165902125c414d598cff788c7bb0b1556.
[valgrind] [Bug 446103] Memcheck: `--track-origins=yes` causes extreme slowdowns for large mmap/munmap
https://bugs.kde.org/show_bug.cgi?id=446103

--- Comment #1 from Julian Seward ---
Created attachment 144170
  --> https://bugs.kde.org/attachment.cgi?id=144170&action=edit
Proposed patch (needs commit message)

Attached is a proposed fix. It makes gigabyte-sized SARPs about 3 x faster
when --track-origins=yes is in use.

Note, from comment 0:
> a huge AVL tree is created

This was speculation and is totally wrong. The slowdowns were for other
reasons.
[valgrind] [Bug 446236] Issues with Valgrind 3.8.1 version, after compiler upgrade
https://bugs.kde.org/show_bug.cgi?id=446236

--- Comment #2 from Julian Seward ---
Also, if it really is 3.8.1 that you are using .. that is ancient and
unsupported. Try again with the most recent version, 3.18.1.
[valgrind] [Bug 446103] New: Memcheck: `--track-origins=yes` causes extreme slowdowns for large mmap/munmap
https://bugs.kde.org/show_bug.cgi?id=446103

            Bug ID: 446103
           Summary: Memcheck: `--track-origins=yes` causes extreme
                    slowdowns for large mmap/munmap
           Product: valgrind
           Version: unspecified
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: memcheck
          Assignee: jsew...@acm.org
          Reporter: jsew...@acm.org
  Target Milestone: ---

Created attachment 143957
  --> https://bugs.kde.org/attachment.cgi?id=143957&action=edit
C test case

The attached test program repeatedly mmaps/munmaps a 2GB chunk of memory.
Without `--track-origins=yes`, it runs in about 0.7 seconds. But with origin
tracking enabled, it takes 95.5 seconds; that is, it runs 137 times slower.

The problem occurs because the 2GB mmaps/munmaps cause Memcheck to perform a
set-address-range-perms (SARP) operation on the area, changing it to
"defined" and then back to "noaccess". Large-range SARPs are (highly)
optimised for the V/A-bit (normal) Memcheck shadow memory, but they are very
inefficient for the origin-tracking shadow area.

Origin-tracking shadows are kept in a 64MB 2-way set associative cache,
which is pretty fast. But lines ejected from that are kept in an OSet
(AVL-tree) based backing store. Because these SARP lengths greatly exceed
the size of the cache, a huge AVL tree is created, which is very time- and
space-inefficient.

This was initially noticed in
https://bugzilla.mozilla.org/show_bug.cgi?id=1742851.
[valgrind] [Bug 445415] New: arm64 front end: alignment checks missing for atomic instructions
https://bugs.kde.org/show_bug.cgi?id=445415

            Bug ID: 445415
           Summary: arm64 front end: alignment checks missing for atomic
                    instructions
           Product: valgrind
           Version: unspecified
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: vex
          Assignee: jsew...@acm.org
          Reporter: jsew...@acm.org
  Target Milestone: ---

For the arm64 front end, none of the atomic instructions have address
alignment checks included in their IR. They all should.

The effect of missing alignment checks in the IR is that, since this IR will
in most cases be translated back to atomic instructions in the back end, we
will get alignment traps (SIGBUS) on the host side and not on the guest
side, which is (very) incorrect behaviour of the simulation.
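The guard the front end needs to emit is just a natural-alignment test on the effective address, with a misaligned address routed to a guest-side SIGBUS rather than left to trap on the host. A minimal model of that predicate (illustrative only, not the IR helper itself):

```c
#include <stdint.h>

/* An atomic access of `size` bytes (1, 2, 4, 8, or 16 for the arm64
   atomics) is architecturally required to be size-aligned. If this
   predicate is false, the simulated guest should receive SIGBUS; the
   access must never reach a real host atomic instruction. */
int atomic_addr_is_aligned(uint64_t addr, unsigned size) {
    return (addr & (uint64_t)(size - 1)) == 0;
}
```

In IR terms this corresponds to a conditional side-exit on the address before the atomic operation, in the same way other guest-visible traps are modelled.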
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89 (LD{,A}XP and ST{,L}XP)
https://bugs.kde.org/show_bug.cgi?id=444399

--- Comment #9 from Julian Seward ---
Landed, 530df882b8f60ecacaf2b9b8a719f7ea1c1d1650. I think it's OK, and the
patch contains test cases, but .. please test.
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89 (LD{,A}XP and ST{,L}XP)
https://bugs.kde.org/show_bug.cgi?id=444399

Julian Seward changed:
  Summary: disInstr(arm64): unhandled instruction 0xC87F2D89
       ->  disInstr(arm64): unhandled instruction 0xC87F2D89
           (LD{,A}XP and ST{,L}XP)
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

Bug 444399 depends on bug 445354, which changed state.

Bug 445354 Summary: arm64 backend: incorrect code emitted for doubleword CAS
https://bugs.kde.org/show_bug.cgi?id=445354

  Status: REPORTED -> RESOLVED
  Resolution: --- -> FIXED
[valgrind] [Bug 445354] arm64 backend: incorrect code emitted for doubleword CAS
https://bugs.kde.org/show_bug.cgi?id=445354

Julian Seward changed:
  Status: REPORTED -> RESOLVED
  Resolution: --- -> FIXED

--- Comment #1 from Julian Seward ---
Fixed, 7dbe2fed72886874f2eaf57dc07929542ae55b58.
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

Julian Seward changed:
  Attachment #143328 is obsolete: 0 -> 1

--- Comment #8 from Julian Seward ---
Created attachment 143474
  --> https://bugs.kde.org/attachment.cgi?id=143474&action=edit
Final proposed patch

Final proposed patch. This includes the fix for blocking bug 445354, which
is small and which I will land separately and first.
[valgrind] [Bug 445354] arm64 backend: incorrect code emitted for doubleword CAS
https://bugs.kde.org/show_bug.cgi?id=445354

Julian Seward changed:
  Blocks: 444399

Referenced Bugs:
https://bugs.kde.org/show_bug.cgi?id=444399
[Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

Julian Seward changed:
  Depends on: 445354

Referenced Bugs:
https://bugs.kde.org/show_bug.cgi?id=445354
[Bug 445354] arm64 backend: incorrect code emitted for doubleword CAS
[valgrind] [Bug 445354] New: arm64 backend: incorrect code emitted for doubleword CAS
https://bugs.kde.org/show_bug.cgi?id=445354

            Bug ID: 445354
           Summary: arm64 backend: incorrect code emitted for doubleword
                    CAS
           Product: valgrind
           Version: unspecified
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: vex
          Assignee: jsew...@acm.org
          Reporter: jsew...@acm.org
                CC: m...@klomp.org
  Target Milestone: ---

The sequence of instructions emitted by the arm64 backend for doubleword
compare-and-swap is incorrect. This could lead to incorrect simulation of
the AArch8.1 atomic instructions (CASP, at least), and causes failures in
the upcoming fix for v8.0 support for LD{,A}XP/ST{,L}XP in bug 444399. In
the worst case it can cause segfaulting in the generated code, because it
could jump backwards unexpectedly far.

The problem is the sequence emitted for ARM64in_CASP:

* the jump offsets are incorrect, both for `bne out` (x 2) and
  `cbnz w1, loop`.

* using w1 to hold the success indication of the stxp instruction trashes
  the previous value in x1. But the value in x1 is an output of
  ARM64in_CASP, hence one of the two output registers is corrupted. That
  confuses any code downstream that wants to inspect those values to find
  out whether the transaction succeeded or not.

The fixes are to:

* fix the branch offsets

* use a different register to hold the stxp success indication; w3 is a
  convenient choice.
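The register-corruption half of the bug can be shown with a plain C model of the emitted sequence (this is an illustration of the ldxp/stxp data flow, not valgrind code; the `use_w3_for_status` switch stands in for the proposed fix):

```c
#include <stdint.h>

/* x0/x1 are ARM64in_CASP's outputs: they must still hold the values
   loaded by ldxp when the sequence finishes, success or not. */
typedef struct { uint64_t x0, x1; } CASPOut;

CASPOut casp_model(uint64_t mem[2],
                   uint64_t exp0, uint64_t exp1,   /* expected pair */
                   uint64_t new0, uint64_t new1,   /* replacement pair */
                   int use_w3_for_status) {
    CASPOut out;
    out.x0 = mem[0];                  /* ldxp x0, x1, [mem] */
    out.x1 = mem[1];
    if (out.x0 == exp0 && out.x1 == exp1) {
        mem[0] = new0;                /* stxp succeeds; status = 0 */
        mem[1] = new1;
        if (!use_w3_for_status)
            out.x1 = 0;               /* bug: status written into w1,
                                         clobbering the x1 output */
        /* fixed version: status goes to a scratch register (w3),
           so x1 keeps the loaded value */
    }
    return out;
}
```

On a successful swap, the buggy path returns `x1 == 0` (the stxp status) instead of the old memory value, which is exactly the corruption described in the second bullet.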
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

--- Comment #7 from Julian Seward ---
Created attachment 143328
  --> https://bugs.kde.org/attachment.cgi?id=143328&action=edit
WIP patch that will possibly get you back on the road. DO NOT LAND.

Fixing this is a whole trip because the various IR and arm64 frameworks were
not really designed to accommodate it. Anyway, here is a WIP patch. It seems
to work for simple tests (in the patch) but is not fully tested. It will not
work if you run with `--sim-hints=fallback-llsc` or if the fallback LL/SC
implementation is auto-selected, based on your processor, at startup.

It applies against the head and also against a vanilla 3.18.1 tarball,
although I haven't tested it in the latter case.

If anyone wants to test it and let me know whether it works, that would be
appreciated. I will try to finish it up properly this coming week.
[valgrind] [Bug 434283] disInstr(arm64): unhandled instruction 0xC87FAB5F (LDAXP and STLXP)
https://bugs.kde.org/show_bug.cgi?id=434283

Julian Seward changed:
  CC: jsew...@acm.org added
  Status: REPORTED -> RESOLVED
  Resolution: --- -> DUPLICATE

--- Comment #1 from Julian Seward ---
I'm going to close this as a dup of bug 444399, where there is, now, a
preliminary fix available.

*** This bug has been marked as a duplicate of bug 444399 ***
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

Julian Seward changed:
  CC: a...@ao2.it added

--- Comment #6 from Julian Seward ---
*** Bug 434283 has been marked as a duplicate of this bug. ***
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

--- Comment #5 from Julian Seward ---
I looked into this a bit. It does indeed appear that LD{A}XP and ST{L}XP
exist in AArch64 8.0 but are not implemented in V. I am somewhat surprised
by this since I distinctly remember carefully making a list of all
instructions that needed to be implemented, when doing the initial AArch64
port, so I'm not sure how these got forgotten.

I will fix it, but it may not be an immediate fix. VEX's intermediate
representation has a way to represent doubleword CAS, but can only represent
single-word load-exclusive / store-check, so it will need to be extended
accordingly, and that may have some minor knock-on effect on other
architectures.

I would guess that the immediate cause of the failure is that LLVM 12 has
started generating these instructions. That would explain why rustc shows
the problem in comment 0 -- presumably that is rustc nightly -- and also why
clang++ 12 shows the problem in comment 3.
[valgrind] [Bug 444545] memchr behaves differently under valgrind --tool=massif
https://bugs.kde.org/show_bug.cgi?id=444545

--- Comment #10 from Julian Seward ---
That is all very strange. Q: for the failing case, does adding
`--alignment=32` make any difference? That changes the alignment produced by
`malloc`.
[valgrind] [Bug 444545] memchr behaves differently under valgrind --tool=massif
https://bugs.kde.org/show_bug.cgi?id=444545

--- Comment #6 from Julian Seward ---
> Model name: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz

Hmm, nothing particularly exotic there.

> Sorry this is proving more troublesome than I expected.

It's OK; however, we can't fix it if we can't reproduce it. Can you maybe
re-check your testing setup?

TBH I seriously doubt that Massif or the V framework as a whole mishandles
memchr, since I've run huge apps (eg, Firefox) on Massif and I see no such
breakage. Plus there are multiple auto-test runs every night, on various F35
targets, and I'm not aware of such breakage in them.

That said, it may be that you've fallen across some obscure corner case that
is broken. But I can't imagine what that could be.
[valgrind] [Bug 444545] memchr behaves differently under valgrind --tool=massif
https://bugs.kde.org/show_bug.cgi?id=444545

Julian Seward changed:
  CC: jsew...@acm.org added

--- Comment #4 from Julian Seward ---
I can't reproduce this, running on Fedora 35, x86_64, using V from git and
the system gcc, 11.2.1. This is on an Intel i7-3615QM, in short an older
machine that supports up to avx but not avx2. Can you post the results of
/usr/bin/lscpu on your F35 install?
[valgrind] [Bug 444399] disInstr(arm64): unhandled instruction 0xC87F2D89
https://bugs.kde.org/show_bug.cgi?id=444399

--- Comment #2 from Julian Seward ---
I can't reproduce this, testing on Fedora 33 on Parallels Workstation
running on an M1 Mac Mini, with either rustc-1.55 or rustc-1.56.

I suspect this is some kind of hardware capabilities problem, in that
rustc-generated code is using instructions that V doesn't claim to support.
From irc I see that the hardware you used here was a "Graviton 2", which has
Neoverse N1 cores, and they support AArch64-v8.4. The M1 is AArch64-v8.6 I
think, although which of those extensions are available within Parallels I
don't know.

That said, I'm still surprised it fails for you, given that it doesn't fail
here, *and* given that V doesn't even fully support v8.2 of the instruction
set.

The output of /usr/bin/lscpu for F33-on-Parallels-on-M1 is below. Can you
show the output on the failing target?

Architecture:        aarch64
CPU op-mode(s):      64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Vendor ID:           ARM
Model:               0
Stepping:            r0p0
BogoMIPS:            48.00
NUMA node0 CPU(s):   0-7
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass
                                 disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid
       asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit
       uscat ilrcpc flagm ssbs sb paca pacg dcpodp flagm2 frint
[valgrind] [Bug 433510] FreeBSD support, part 12
https://bugs.kde.org/show_bug.cgi?id=433510

--- Comment #6 from Julian Seward ---
(In reply to Mark Wielaard from comment #5)
> (In reply to Paul Floyd from comment #4)
> > > - TRACE_SYMTAB("rx_map: avma %#lx size %lu foff %ld\n",
> > > + TRACE_SYMTAB("rx_map: avma %#lx size %lu foff %lld\n",
> >
> > ^^^ generates warnings
> >
> > I think that both versions probably generate warnings on either 32bit
> > or 64bit systems, and that we should use a macro to choose l or ll.
>
> I don't think so. I believe the actual pointers and offsets are either
> 32/64bit already and match the lx/lu/ld directives.

As a general comment, the way I've handled these problems in the past is
simply to cast the relevant values to the widest possible type -- a 64-bit
signed or unsigned int (Long or ULong) -- and then used %lld, %llx, etc.
That makes the problem go away without need of macros or other complexity,
although you do have to be a bit careful about whether the cast requires
signed or unsigned widening.
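The casting approach described in the comment can be sketched as follows (a hypothetical helper, not the actual TRACE_SYMTAB call site; `Long`/`ULong` are valgrind's 64-bit typedefs, re-declared here to keep the sketch self-contained):

```c
#include <stdint.h>
#include <stdio.h>

typedef long long          Long;    /* always 64-bit signed */
typedef unsigned long long ULong;   /* always 64-bit unsigned */

/* Widen each value to 64 bits at the call site and use %llx/%llu/%lld,
   so the format string is correct on both 32- and 64-bit hosts without
   any macro selecting between l and ll. Note the pointer goes through
   uintptr_t (unsigned widening) while the offset is widened signed. */
int format_map_line(char* buf, size_t n,
                    void* avma, size_t size, long foff) {
    return snprintf(buf, n, "rx_map: avma %#llx size %llu foff %lld",
                    (ULong)(uintptr_t)avma, (ULong)size, (Long)foff);
}
```

The signed/unsigned caveat from the comment is visible here: casting a negative `foff` through `ULong` instead of `Long` would print a huge positive number on 32-bit hosts.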
[valgrind] [Bug 442168] Floating-point erroneous behavior with memcheck with regards to class (isfinite(), etc.)
https://bugs.kde.org/show_bug.cgi?id=442168

--- Comment #8 from Julian Seward ---
(In reply to Xavier Roche from comment #7)

I checked V's {V,}MOVSD implementation more, and still find no problem. I
think this is unrelated to those insns. If I had to guess, I'd say it
*might* be related to the use of

+ vcmpneq_oqsd .LCPI1_1(%rip), %xmm1, %xmm1

since the implementation is a bit shaky:

   // 0xC: this isn't really right because it returns all-1s when
   // either operand is a NaN, and it should return all-0s.
   case 0x4: XXX(False, False, Iop_CmpEQ32Fx4, True); break; // NEQ_UQ
   case 0xC: XXX(False, False, Iop_CmpEQ32Fx4, True); break; // NEQ_OQ

(from `findSSECmpOp`). But that's just a guess; without a way to reproduce
the failure, I think the chances of resolving this are pretty low.
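The two predicates conflated in the quoted snippet have different NaN behaviour, which is what the in-source comment is pointing at. Reference semantics per the Intel predicate definitions, written out as scalar C (a model for discussion, not valgrind's SIMD implementation):

```c
#include <math.h>
#include <stdbool.h>

/* NEQ_UQ (predicate 0x4): true if the operands are unordered (either
   is a NaN) OR they differ. A NaN input yields all-1s. */
bool cmp_neq_uq(double a, double b) {
    return isnan(a) || isnan(b) || a != b;
}

/* NEQ_OQ (predicate 0xC): true only if both operands are ordered AND
   they differ. A NaN input must yield all-0s -- this is the case the
   quoted code gets wrong by treating 0xC like 0x4. */
bool cmp_neq_oq(double a, double b) {
    return !isnan(a) && !isnan(b) && a != b;
}
```

So a program using `vcmpneq_oqsd` on a NaN would see true under the quoted implementation where the hardware gives false.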
[valgrind] [Bug 442168] Floating-point erroneous behavior with memcheck with regards to class (isfinite(), etc.)
https://bugs.kde.org/show_bug.cgi?id=442168

--- Comment #5 from Julian Seward ---
Looking at the Intel docs for MOVSD and VMOVSD, and comparing against what
guest_amd64_toIR.c implements, I don't see any error in the implementation.
Plus, both of those must be at least moderately commonly used, and so if
there was really an emulation bug with them we would probably have heard
about it long ago. So I find it a bit difficult to believe that this
MOVSD/VMOVSD difference is really the root cause of whatever it is that you
are seeing.
[valgrind] [Bug 442168] Floating-point erroneous behavior with memcheck with regards to class (isfinite(), etc.)
https://bugs.kde.org/show_bug.cgi?id=442168

--- Comment #4 from Julian Seward ---
(In reply to Xavier Roche from comment #2)
> The difference between the correctly executed code under valgrind and the
> faulty one:
> - movsd %xmm0, (%rsp)          # 8-byte Spill
> + vmovsd %xmm0, (%rsp)         # 8-byte Spill
> - movsd (%rsp), %xmm0          # 8-byte Reload
> + vmovsd (%rsp), %xmm0         # 8-byte Reload
> - movsd .LCPI6_0(%rip), %xmm0  # xmm0 = mem[0],zero
> + vmovsd .LCPI6_0(%rip), %xmm0 # xmm0 = mem[0],zero

Can you give some more information about why you think the change from movsd
to vmovsd causes the error? Also, which one gives correct execution and
which doesn't?
[valgrind] [Bug 432387] s390x: z15 instructions support
https://bugs.kde.org/show_bug.cgi?id=432387

--- Comment #5 from Julian Seward ---
(In reply to Andreas Arnez from comment #4)
> > + s390_insn_assert("vcdlg", m3 == 2 || m3 == 3);
> > (multiple times)
> > This will fail only on internal logic errors, correct? .. it can't fail
> > only because the insn can't be decoded, right?
>
> Actually, the latter. The s390_insn_assert macro passes a special kind of
> decoding failure (a "specification exception") up through the call chain
> if the asserted condition is not met.

So .. I am a bit unclear about this, but it doesn't matter. It's enough to
say that it should be impossible to cause the front end to assert by feeding
it invalid instructions. Instead, all invalid insns should cause a SIGILL to
be handed to the guest (or whatever is appropriate on s390). So long as the
above holds, then I'm happy.
[valgrind] [Bug 440893] False positive "uninitialzied value" error on memeory allocated with new[]
https://bugs.kde.org/show_bug.cgi?id=440893

--- Comment #3 from Julian Seward ---
> It's strange that in my case the data block is always filled with 0's
> (abort() is never called), I'm using gcc 8.5.0 and c++17.

The Linux kernel is obliged to zero out all memory pages it gives to a
process, since not doing so would be a massive security hole. That's quite
possibly what you're observing.

> Just for my information is there a reason why valgrind only complained
> about the first element of the array being uninitialized, and not the
> rest of the array?

Because all of those accesses will have the same stack trace, and Memcheck
will only show the first one. Use -v if you want per-stack-trace error
counts.
[valgrind] [Bug 432387] s390x: z15 instructions support
https://bugs.kde.org/show_bug.cgi?id=432387

Julian Seward changed:
  CC: jsew...@acm.org added

--- Comment #3 from Julian Seward ---
Nice work! The following look OK to land as-is:

Subject: [PATCH 01/13] s390x: Misc-insn-3, bitwise logical 3-way instructions
Subject: [PATCH 02/13] s390x: Misc-insn-3, "select" instructions
Subject: [PATCH 03/13] s390x: Misc-insn-3, new POPCNT variant
Subject: [PATCH 04/13] s390x: Misc-insn-3, MVCRL
Subject: [PATCH 06/13] s390x: Vec-enh-2, extend VSL, VSRA, and VSRL
Subject: [PATCH 09/13] s390x: Vec-enh-2, VSLD and VSRD
Subject: [PATCH 10/13] s390x: Vec-enh-2, VSTRS
Subject: [PATCH 11/13] s390x: Mark arch13 features as supported
Subject: [PATCH 12/13] s390x: Vec-enh-2, test cases
Subject: [PATCH 13/13] s390x: Wrap up misc-insn-3 and vec-enh-2 support

For patches 05/13 and 07/13, I have a couple of queries/comments, but
nothing serious. Provided those assertions won't fail and that there are no
build-time dependency problems for the test cases, it's OK to land these two
too.

Subject: [PATCH 05/13] s390x: Misc-insn-3, test case

none/tests/s390x/Makefile.am
@@ -19,7 +19,8 @@
 INSN_TESTS = clc clcle cvb cvd icm lpr tcxb lam_stam xc mvst add sub mul \
	spechelper-ltr spechelper-or \
	spechelper-icm-1 spechelper-icm-2 spechelper-tmll \
	spechelper-tm laa vector lsc2 ppno vector_string vector_integer \
-	vector_float add-z14 sub-z14 mul-z14 bic
+	vector_float add-z14 sub-z14 mul-z14 bic \
+	misc3

Is `misc3` safe to always-build? Or would it require gating on assembler
features, in the style of "if BUILD_DFP_TESTS .." just below? .. ah .. later
.. I see misc3.c doesn't require any assembler support, since it just does
`insn rrf,0x" #opcode ",%[out],%[a],%[b],0\n\t"`.

diff --git a/none/tests/s390x/misc3.vgtest b/none/tests/s390x/misc3.vgtest
new file mode 100644
index 0..d051a06bd
--- /dev/null
+++ b/none/tests/s390x/misc3.vgtest
@@ -0,0 +1 @@
+prog: misc3

Similarly, does this require gating on host hwcaps? Looking at your _toIR.c
implementation, maybe not, since that appears to be down-translating (I
mean, translating from the z15 dialect into IR which will be re-emitted as
instructions for some earlier zSeries CPU). I suppose this means that on
older platforms, it will be possible to compile this test, and run it on V,
but not to run it natively.

Subject: [PATCH 07/13] s390x: Vec-enh-2, extend VCDG, VCDLG, VCGD, and VCLGD

+ s390_insn_assert("vcdlg", m3 == 2 || m3 == 3);
(multiple times)

This will fail only on internal logic errors, correct? .. it can't fail only
because the insn can't be decoded, right?

Subject: [PATCH 08/13] s390x: Vec-enh-2, VLBR and friends

+++ b/VEX/priv/host_s390_isel.c
+ case Iop_Perm8x16:

Seeing use of Iop_Perm8x16 made me wonder if it had a definition that is
independent of the guest/host endianness .. and, no it doesn't.
libvex_ir.h says:

   for i in 0 .. 15 . result[i] = argL[ argR[i] ]

so the meaning of "[n]" is itself ambiguous. Anyway, no need to change
anything in your code.
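The libvex_ir.h definition quoted in the review can be written out directly as a loop; the endianness concern is about which end of the 128-bit value element "[0]" denotes, not about the loop itself. A small reference model (masking the index to 0..15 is an added safety assumption; the quoted definition does not say what happens for out-of-range indices):

```c
#include <stdint.h>

/* Iop_Perm8x16 per libvex_ir.h:
   for i in 0 .. 15 . result[i] = argL[ argR[i] ] */
void perm8x16(uint8_t result[16],
              const uint8_t argL[16], const uint8_t argR[16]) {
    for (int i = 0; i < 16; i++)
        result[i] = argL[argR[i] & 15];  /* index masked mod 16 */
}
```

For example, a control vector of 15,14,...,0 reverses the bytes of argL, and an identity control vector copies it unchanged.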
[valgrind] [Bug 432801] Valgrind 3.16.1 reports a jump based on uninitialized memory somehow related to clang and signals
https://bugs.kde.org/show_bug.cgi?id=432801

--- Comment #44 from Julian Seward ---
I meant, unsigned < and <= scalar comparisons on undefined data. Duh.
[valgrind] [Bug 432801] Valgrind 3.16.1 reports a jump based on uninitialized memory somehow related to clang and signals
https://bugs.kde.org/show_bug.cgi?id=432801

--- Comment #43 from Julian Seward ---
As a general note, I have been investigating unsigned < and <= comparisons
on scalar values for arm64, and expect to continue doing so as a background
task for the next few weeks. So, this is not forgotten. Ideally we can
arrive at a cheap and effective solution that works for all targets.
[valgrind] [Bug 440095] No support for "dc cvac" and "dc cvap" on Arm64
https://bugs.kde.org/show_bug.cgi?id=440095

--- Comment #1 from Julian Seward ---
Try copying the scheme used for 'dc cvau' in guest_arm64_toIR.c. That might
just work for the cases you care about too.
[valgrind] [Bug 438630] Adding zero variants of arm64 v8.2 FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT instructions
https://bugs.kde.org/show_bug.cgi?id=438630

Julian Seward changed:
  CC: jsew...@acm.org added

--- Comment #2 from Julian Seward ---
(In reply to ahashmi from comment #1)
> Created attachment 139381 [details]
> Adds zero variants of arm64 v8.2 FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT
> instructions

This seems fine; r+ to land. Has no assertions on the decode-fail paths,
which is as desired.
[valgrind] [Bug 438038] Addition of arm64 v8.2 FCCMP, FCCMPE, FCMEQ, FCMGE and FCMGT instructions
https://bugs.kde.org/show_bug.cgi?id=438038

Julian Seward changed:
  CC: jsew...@acm.org added

--- Comment #3 from Julian Seward ---
(In reply to ahashmi from comment #1)
> Created attachment 139231 [details]
> Adds arm64 v8.2 FCCMP, FCCMPE, FCMEQ, FCMGE and FCMGT instructions

Looks mostly fine. Comments below. r+ provided the SIGILL-vs-assert thing is
fixed.

In dis_AdvSIMD_fp_conditional_compare:

+ else if (ty == 3) {
+    if ((archinfo->hwcaps & VEX_HWCAPS_ARM64_FP16) == 0)
+       return False;
+    ity  = Ity_F16;
+    irop = Iop_CmpF16;
+ }
+ else
+    vassert(0); /* ty = 2 => illegal encoding */

No .. ensure SIGILL is raised (decode failure) for any undecodable insns.
Never assert.

+ setFlags_COPY(nzcv_28x0);
+ DIP("fccmp%s %s, %s, #%u, %s\n", isCMPE ? "e" : "",
+     nameQRegLO(nn, ity), nameQRegLO(mm, ity), nzcv, nameCC(cond));
+ return True;
  return False;

This is a strange sequence. The second return is unreachable. (Maybe I
misread?)

+ if (e->Iex.Binop.op == Iop_CmpF64 || e->Iex.Binop.op == Iop_CmpF32 ||
+     e->Iex.Binop.op == Iop_CmpF16) {
+    HReg (*iselExpr)(ISelEnv*, IRExpr*);
+    ARM64Instr* (*VCmp)(HReg, HReg);

As a nit, in such situations I'd prefer if you set these both to NULL so as
to protect against future snafus .. I know it's redundant as the code
stands. Viz:

   HReg (*iselExpr)(ISelEnv*, IRExpr*)  = NULL;
   ARM64Instr* (*VCmp)(HReg, HReg)      = NULL;
[valgrind] [Bug 436873] Addition of arm64 v8.2 vector FABD, FACGE, FACGT and FADD instructions
https://bugs.kde.org/show_bug.cgi?id=436873 --- Comment #4 from Julian Seward --- (In reply to ahashmi from comment #3) > The review fixes for the scalar variants patch > https://bugs.kde.org/show_bug.cgi?id=436411 (hwcaps-gating and > iselIntExpr_RIL_wrk issues) means this latest patch works. Tested with > regtest and combinations of --memcheck:leak-check=yes --tool=memcheck > --track-origins=yes. Ah, my favourite kind of patch to review -- mostly test cases :-) I didn't actually see any gating stuff in the three new frontend (guest_arm64_toIR.c) cases, but maybe it's done elsewhere? Anyways I'll take your word for it. So long as the relevant gating is done as per the scalar patch, then +1 to land. J -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 436411] Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions
https://bugs.kde.org/show_bug.cgi?id=436411 --- Comment #5 from Julian Seward --- (In reply to ahashmi from comment #4) > Created attachment 138947 [details] > Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions That all looks fine; please land. Just before you do -- if you haven't already -- please though run the test cases once by hand on Memcheck, with --track-origins=yes, and check nothing breaks. This has been known to happen in the past, at least to me :-) -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 435665] PPC ISA 3.0 copy, paste, cpabort instructions are not supported
https://bugs.kde.org/show_bug.cgi?id=435665 --- Comment #14 from Julian Seward --- (In reply to Tulio Magno Quites Machado Filho from comment #13) > (In reply to Carl Love from comment #12) > Correct. The address is always mmap'ed. Ok. Then I think you're good to go as-is, apart from the ifdef thing. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 435665] PPC ISA 3.0 copy, paste, cpabort instructions are not supported
https://bugs.kde.org/show_bug.cgi?id=435665 --- Comment #11 from Julian Seward --- (In reply to Carl Love from comment #10) > Created attachment 138846 [details] > PPC add copy, paste, cpabort support > > Updated patch to add the Memcheck trickery to get Memcheck to detect > undefined bits in the copy paste buffer. I *think* this is OK now, with two caveats: +UInt copy_paste_abort_dirty_helper(UInt addr, UInt op) { This will not build on anything except a Power target. It needs to be ifdef'd in the usual way. Ok to land if you fix the above plus sanity check the following issue. Here's another clarifying comment; I think you should add something like this. It's about the "paste" effects annotations. If I understand the code correctly, for "paste" you will now have annotations Ifx_Write at (wherever the target address is) for 128 bytes and also the function takes two args `EA_base` and `operation` and it returns the uint32_t that you mention. How Memcheck will instrument that is: if there is any undefinedness in the inputs, then all of the outputs will be undefined. Hence, if EA_base or operation contain any undefined bits, then the return value is undefined and the specified 128-byte memory area is undefined after the call; else the return value is defined and the specified 128-byte memory area is defined after the call. The point is .. if I understand "paste" correctly, the destination address is not part of the "normal" program address space. So the Ifx_Write annotation in this case is irrelevant. Indeed, if the destination address is part of *some other* address space, then we actually need to omit that Ifx_Write annotation, because otherwise Memcheck will update its status bits for that address range in this program's address space, which would be wrong. So .. I hope that makes sense. -- You are receiving this mail because: You are watching all bug changes.
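The instrumentation rule described above -- any undefinedness in a dirty call's inputs makes every output undefined -- can be modelled in a few lines of C. This is a simplified sketch, not Memcheck's actual code; the function names are mine, though the all-or-nothing widening it models is what the Memcheck sources call a "pessimising cast".

```c
#include <assert.h>
#include <stdint.h>

/* V-bits convention: a 1 bit means "undefined".  If any input bit is
   undefined, every output bit is treated as undefined. */
static uint32_t pcast32(uint32_t vbits)
{
   return vbits != 0 ? 0xFFFFFFFFu : 0x00000000u;
}

/* V-bits of a dirty call's result, given the V-bits of its two args
   (here standing in for EA_base and operation). */
static uint32_t call_result_vbits(uint32_t v_ea_base, uint32_t v_operation)
{
   return pcast32(v_ea_base | v_operation);
}
```

So one undefined bit in either argument poisons the whole result, which is exactly why the 128-byte area's definedness tracks the inputs' definedness in the description above.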
[valgrind] [Bug 435665] PPC ISA 3.0 copy, paste, cpabort instructions are not supported
https://bugs.kde.org/show_bug.cgi?id=435665 --- Comment #9 from Julian Seward --- So .. I had an idea about how to get the annotations right. Problem is, it's a nasty hack (although simple), and I had hoped that the passage of a bit of time would cause me to think of something cleaner. But nothing came to mind, so here it is. I *think* it'll work, but it needs implementing and testing, to check that if you do pass undefined values to the accelerator, you do actually get a Memcheck error. The basic idea is for your helper function -- the one that copies 128 bytes [or whatever] to the accelerator -- to return a fake return value. At present it doesn't return anything. Specifically, you change the helper to return a UWord and that UWord should be (for example) zero. Then, add a fake use of the fake value, in a way that will cause Memcheck to test it, as follows. Furthermore, this fake use must not be something that IR optimisation can later remove. On the IR side (in guest_ppc_toIR.c), you allocate an IRTemp of type I64 (assuming 64-bit only operation) to receive the fake value, and set the IRDirty::tmp field to be that temp. Also for the IRDirty structure, you must set mFx/mAddr/mSize so as to declare the memory area it is reading. Immediately after emitting the IRDirty (in guest_ppc_toIR.c), place an IRStmt_Exit that uses the fake return value. Something very similar to this: stmt( IRStmt_Exit( binop(Iop_CmpNE64, fake_return_val_tmp, mkU64(0)), Ijk_SigTRAP, mode64 ? IRConst_U64(cia) : IRConst_U32((UInt)cia), OFFB_CIA )); where cia is the current instruction address (it is often passed around; see e.g. do_trap in guest_ppc_toIR.c). The effects of this trickery are as follows: (1) the above stmt() asks the IR to exit, asking Valgrind to hand the program a SIGTRAP at this point, if the fake return value is nonzero, however .. (2) .. that never happens, because your helper always returns zero.
(3) Memcheck will believe that any undefinedness in the area read by the helper will be propagated through to the fake return value, and will generate instrumentation to cause that to happen. (4) Memcheck will instrument the IRStmt_Exit to check the definedness computed by (3) and emit an error if the fake return value contains any undefined bits. Hence you get the warning we want. (5) We note that the IR optimisation passes do not know what value the helper call will return. Hence we are guaranteed that they can't optimise away the IRStmt_Exit and its associated check. Bad, huh?! I wish there were a cleaner way. I suggest you test it at least as follows: (a) check that, if the memory area is fully defined, no error is printed. (b) check that, if the memory area contains even one undefined bit, an error is printed. (c) check that, if the helper function does return nonzero, then you really do get SIGTRAP synthesised at the instruction. Not that this is important, but just as a kind of sanity check. Good luck! I am happy to look at patches, etc, as ever. -- You are receiving this mail because: You are watching all bug changes.
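The fake-return-value trick above can be sketched in plain C. This is only a model of the control flow the generated IR would have (the names are hypothetical; the real helper and IRStmt_Exit live in the PPC front end): the branch is never taken at run time, but because it tests the helper's result, Memcheck's instrumentation checks the result's definedness, and undefinedness in the 128 source bytes surfaces there.

```c
#include <stdint.h>
#include <string.h>

/* Model of the dirty helper: copies 128 bytes to the "accelerator"
   and returns a fake value that is always zero. */
static uint64_t copy_to_accel_helper(uint8_t *accel, const uint8_t *src)
{
   memcpy(accel, src, 128);
   return 0;   /* fake return value -- always zero in practice */
}

/* Model of the IR that follows the call: exit with SIGTRAP if the
   fake value is nonzero.  The exit is dead code dynamically, but the
   IR optimiser cannot prove that, so the definedness check survives. */
static int run_copy(uint8_t *accel, const uint8_t *src)
{
   uint64_t fake = copy_to_accel_helper(accel, src);
   if (fake != 0)
      return -1;  /* would be Ijk_SigTRAP in the real IR */
   return 0;
}
```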
[valgrind] [Bug 436873] Addition of arm64 v8.2 vector FABD, FACGE, FACGT and FADD instructions
https://bugs.kde.org/show_bug.cgi?id=436873 Julian Seward changed: What|Removed |Added CC||jsew...@acm.org --- Comment #2 from Julian Seward --- This is structurally very similar to the scalar-version patch at https://bugs.kde.org/show_bug.cgi?id=436411#c2 (which is good) and so the same comments for that patch (in comment 3) apply here, meaning, the concern about hwcaps-gating. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 436411] Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions
https://bugs.kde.org/show_bug.cgi?id=436411 Julian Seward changed: What|Removed |Added CC||jsew...@acm.org --- Comment #3 from Julian Seward --- This is for the most part good; apart from the following I have no concerns. I see two issues, both pertaining to CPU feature-gating. The first is in the front end: + if (bitU == 1 && size == X11 && opcode == BITS5(0,0,0,1,0)) { + /* 1,11,00010 FABD h_h_h */ .. and the same for the other added instructions: these need to be feature-gated, so that they are not accepted on non-8.2 capable hosts. Consider what would happen if, with these changes, we tried to run a binary containing these insns on an 8.0 host. Then the front end will accept the insns, generate (eg Iop_Add16F), which the back end will translate back into an 8.2 insn, and we will take a SIGILL in the generated code. But that's not the correct behaviour: what is required is that the front end declines to decode the insn, and that in turn will cause the existing front end machinery to eventually deliver a SIGILL to the guest. The second area of concern is similar but in the back end, regarding the Iex_Const case for iselIntExpr_RIL_wrk. To be honest I'm not sure that this should have been changed at all -- unless the RIL field, with its division into N/R/S fields, has acquired a new interpretation, but not a new layout, in 8.2 instructions. Can you clarify: does the existing immediate-for-bitfield instruction-fragment (here called RIL, and in the ARM ARM described using the N/R/S triple) get a new interpretation in 8.2 instructions? If it doesn't, and is instead a new kind of immediate field for these new insns, then it seems to me that we'd need a new struct altogether, in similar style to, but different from, ARM64RI6/ARM64RIL/ARM64RIA/ARM64AMode.
Independent of all of that, there's still the issue that iselIntExpr_RIL_wrk can't be allowed to have different behaviour than it does now, without first testing that we're on an 8.2 host rather than an 8.0 host. In the backend, the top level iselSB_ARM64 passes around a `hwcaps_host`, and that should contain whatever gating info is needed. In the front end, disInstr_ARM64 similarly passes around an `archinfo` and so the `archinfo->hwcaps` field will contain whatever's needed for gating in the front end. + /* This is a temporary alert while work is done on adding v8.2 + * support. It will be removed and fixed when Ity_I16 immediate + * encodings are understood better as v8.2 work progresses. + */ + vpanic("iselIntExpr_RIL_wrk: TODO: unhandled Ity_I16 for Iex_Const"); Regardless of all of the above, that can't land .. it would potentially destabilise the back end. If you don't want to handle the case specially, then jump to the "default case" bit at the end of the function, which can handle anything. (The same scheme applies to many other isel-functions too). -- You are receiving this mail because: You are watching all bug changes.
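The gating pattern being asked for in both comments above can be sketched as follows. This is an illustrative standalone model, not the actual guest_arm64_toIR.c code, and the hwcap bit value is made up: the decoder declines (returns False) when the host lacks the feature, and the front-end machinery then delivers SIGILL to the guest, rather than accepting the insn and crashing later in generated code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bit -- not VEX's actual VEX_HWCAPS_* value. */
#define HWCAP_V8_2_FP16  (1u << 0)

/* Decode an (imaginary) 8.2 half-precision insn.  Returning false
   means "decode failure", which the caller turns into SIGILL for the
   guest -- the correct behaviour on an 8.0-only host. */
static bool decode_fp16_insn(uint32_t hwcaps)
{
   if ((hwcaps & HWCAP_V8_2_FP16) == 0)
      return false;   /* not gated in: decline, never assert */
   /* ... emit IR for the 8.2 instruction here ... */
   return true;
}
```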
[valgrind] [Bug 435665] PPC ISA 3.0 copy, paste, cpabort instructions are not supported
https://bugs.kde.org/show_bug.cgi?id=435665 --- Comment #7 from Julian Seward --- I looked at the PowerISA 3.0b documentation you linked to. Given that this is a copy from "normal" memory to an accelerator, I think you don't have the option to implement it precisely. I'd say the least worst is to implement it so that, for the copy, Memcheck will flag an error if any of the 128 bytes are undefined, and for the paste we make it look as if the 128 bytes are completely defined. That way, at least you'll know if you're sending undefined values to the accelerator. Exactly how to achieve this I am not sure. It will take some fiddling with the annotations on the dirty helper calls. I'll contemplate it more. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 435665] PPC ISA 3.0 copy, paste, cpabort instructions are not supported
https://bugs.kde.org/show_bug.cgi?id=435665 --- Comment #5 from Julian Seward --- I'm somewhat concerned, at a fundamental level .. if I understand correctly, these instructions allow one to copy memory from one place to another (so why bother? Why not just do normal loads/stores?) but -- at least as implemented -- there's no way to tell Memcheck or any other tool, that the values in the destination area are derived from values in the source area. So you'll wind up with definedness false positives or negatives as a result of using them. Can you explain the background to the insns, what they are used for, etc, so we can see if there's an implementation that fits better in the instrumentation framework? -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434296] s390x: False-positive memcheck diagnostics from vector string instructions
https://bugs.kde.org/show_bug.cgi?id=434296 --- Comment #8 from Julian Seward --- (In reply to Andreas Arnez from comment #6) > OK, done. Unless there's anything else, I'm going to push with this change > then. Yes, pls land. Looks fine to me. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434296] s390x: False-positive memcheck diagnostics from vector string instructions
https://bugs.kde.org/show_bug.cgi?id=434296 --- Comment #5 from Julian Seward --- (In reply to Andreas Arnez from comment #4) > > the front end, set the returned DisResult::hint field to Dis_HintVerbose. > Ah, good point, will do. Is there a recommended threshold when to set > this flag? No .. it's an ad-hoc hack :-) but I would suggest you do it for all of the insns covered by this bug, since they look like they could each generate 20 or more IRStmts per guest insn. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 431157] PPC_FEATURE2_SCV needs to be masked in AT_HWCAP2
https://bugs.kde.org/show_bug.cgi?id=431157 --- Comment #29 from Julian Seward --- (In reply to Carl Love from comment #27) > Created attachment 137063 [details] > patch to add support for the scv instruction This looks OK to me. > One issue with the patch is the SC_FLAG and SCV_FLAG defines. These > #defines were created to set a flag indicating which system call instruction > is to be used. Unfortunately I could not find an include file that is > visible in VEX as well as coregrind/m_syswrap and in > coregrind/pub_core_syscall.h. As a result, I had to add the defines for > these values in three different places with a comment pointing at the other > locations. Rather ugly. Wasn't sure if we wanted to create an include file > for these that would then be globally available?? Suggestions on how to > better was to make these defines more globally visible would be nice. Nothing better immediately springs to mind. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434840] PPC64 darn instruction not supported
https://bugs.kde.org/show_bug.cgi?id=434840 --- Comment #7 from Julian Seward --- (In reply to Carl Love from comment #6) > Created attachment 137204 [details] > patch to add darn instruction support +static Int dis_darn ( UInt prefix, UInt theInstr, + + IRExpr** args = mkIRExprVec_1( mkU32( L ) ); + + d = unsafeIRDirty_1_N ( + rD, + 0/*regparms*/, + "darn_dirty_helper", + fnptr_to_fnentry( vbi, _dirty_helper ), + args ); This wants a comment saying that there are no effects on memory or guest state, since it is unusual for a dirty helper to have no such effects. Otherwise looks fine to me. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 435665] PPC ISA 3.0 copy, paste, cpabort instructions are not supported
https://bugs.kde.org/show_bug.cgi?id=435665 --- Comment #2 from Julian Seward --- (In reply to Carl Love from comment #1) > Created attachment 137535 [details] > PPC add copy, paste, cpabort support + d = unsafeIRDirty_1_N ( + cr0, + 0/*regparms*/, + "copy_paste_abort_dirty_helper", + fnptr_to_fnentry( vbi, _paste_abort_dirty_helper ), + args ); + stmt( IRStmt_Dirty(d) ); Doesn't this need some statement of effects? (mem/regs read/written) Without that Memcheck won't know what effect the instruction has. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434296] s390x: False-positive memcheck diagnostics from vector string instructions
https://bugs.kde.org/show_bug.cgi?id=434296 Julian Seward changed: What|Removed |Added CC||jsew...@acm.org --- Comment #3 from Julian Seward --- From an implementation view this all looks fine. I have only one minor query: - /* Check for specification exception */ - vassert(m3 < 3); - vassert((m5 & 0b1110) == 0); + s390_insn_assert("vistr", m3 < 3 && m5 == (m5 & 1)); Is it possible for any incoming instruction to cause this assertion to fail? If so that should be reported as SIGILL, and not cause an assertion failure. From the code-expansion point of view, I am somewhat concerned that this may cause failures in VEX due to generating too many instructions in the back end, etc. But I can't think of any obvious improvement and it's clear you've spent time thinking about this too. The one thing I would suggest is that, for the instructions in question, in the front end, set the returned DisResult::hint field to Dis_HintVerbose. This tells the top level block-disassembly logic that the instruction just translated to IR is "verbose" and so will cause it to stop pulling new insns into the block sooner rather than later. For examples look for dres->hint = Dis_HintVerbose; in guest_amd64_toIR.c. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 433863] s390x: memcheck/tests/s390x/{cds,cs,csg} failures
https://bugs.kde.org/show_bug.cgi?id=433863 Julian Seward changed: What|Removed |Added CC||jsew...@acm.org --- Comment #4 from Julian Seward --- (In reply to Andreas Arnez from comment #3) > (In reply to Andreas Arnez from comment #2) > > Created attachment 137982 [details] > > Remove memcheck test cases for cs, cds, and csg > Any objections/comments? Otherwise I'm planning to apply the patch early > next week. This seems reasonable to me. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 433801] PPC ISA 3.1 support is missing, part 10
https://bugs.kde.org/show_bug.cgi?id=433801 --- Comment #8 from Julian Seward --- (In reply to Carl Love from comment #4) > Created attachment 136294 [details] > functional tests for Reduced-Precision: Missing Integer-based Outer > Product Operations OK to land. Again, please ensure any new files end up in the tarball. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 433801] PPC ISA 3.1 support is missing, part 10
https://bugs.kde.org/show_bug.cgi?id=433801 --- Comment #7 from Julian Seward --- (In reply to Carl Love from comment #2) > Created attachment 136292 [details] > functional tests for Reduced-Precision - bfloat16 Outer Product & Format > Conversion Operations OK to land. Please make sure though that all the new files get included in the tarball. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 433801] PPC ISA 3.1 support is missing, part 10
https://bugs.kde.org/show_bug.cgi?id=433801 --- Comment #6 from Julian Seward --- (In reply to Carl Love from comment #3) > Created attachment 136293 [details] > PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations -static UInt exts8( UInt src) +static ULong exts8( UInt src) -static UInt extz8( UInt src) +static ULong extz8( UInt src) Mark these as 'static'. Otherwise, OK to land. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 433801] PPC ISA 3.1 support is missing, part 10
https://bugs.kde.org/show_bug.cgi?id=433801 --- Comment #5 from Julian Seward --- (In reply to Carl Love from comment #1) > Created attachment 136291 [details] > Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations +static Float conv_bf16_to_float( UInt input ) +{ .. + output is 64-bit float. + bias +127, exponent 8-bits, fraction 22-bits Is this comment correct? 1 sign bit + 8 exponent bits + 22 mantissa bits looks much more like a 32-bit float than a 64-bit float. -- Is there an inconsistency in naming these functions? It appears that in some places, a 32-bit float is called `_float` in the name, but in others it is called `_f32`. Eg. +static Float conv_bf16_to_float( UInt input ) vs +static UInt conv_f32_to_bf16( UInt input ) Can you either fix the inconsistencies (if they exist) and/or also add a comment at the top explaining the naming? --- +ULong convert_from_f32tobf16_helper( ULong src ) { In this file, either mark functions as 'static' or add a comment saying they are called from generated code. -- You are receiving this mail because: You are watching all bug changes.
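For reference on the format question raised above: bfloat16 is exactly the top 16 bits of an IEEE binary32 -- 1 sign bit, 8 exponent bits (bias 127), 7 fraction bits -- which is why "64-bit float" with a "22-bit fraction" in the quoted comment looks wrong. A minimal sketch of the two conversions (my own function names, not the patch's; the narrowing here truncates, whereas real hardware typically rounds to nearest-even):

```c
#include <stdint.h>
#include <string.h>

/* Widen bf16 -> f32: place the 16 bits in the top half of a binary32. */
static float bf16_to_f32(uint16_t bf)
{
   uint32_t bits = (uint32_t)bf << 16;
   float f;
   memcpy(&f, &bits, sizeof f);   /* bit-level reinterpretation */
   return f;
}

/* Narrow f32 -> bf16 by truncation (drops the low 16 fraction bits). */
static uint16_t f32_to_bf16_trunc(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return (uint16_t)(bits >> 16);
}
```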
[valgrind] [Bug 429375] PPC ISA 3.1 support is missing, part 9
https://bugs.kde.org/show_bug.cgi?id=429375 --- Comment #5 from Julian Seward --- Looks OK to land now. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434840] PPC64 darn instruction not supported
https://bugs.kde.org/show_bug.cgi?id=434840 --- Comment #5 from Julian Seward --- (In reply to Carl Love from comment #4) > Created attachment 136999 [details] > patch to add darn instruction support +ULong darn_dirty_helper ( UInt L ) +{ + ... +# else + val = 0xULL; /* error */ +# endif Given that you initialise `val` to the error value outside of any ifdef, this #else clause is redundant. You could remove it. +static Int dis_darn ( UInt prefix, UInt theInstr, + const VexAbiInfo* vbi ) .. + if (L == 3) + /* Hardware reports illegal instruction if L = 3. */ + vpanic("ERROR: darn instruction L=3 is not valid\n"); Generate SIGILL for an unrecognised instruction, don't assert. Please always do this -- I've asked multiple times before. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434840] PPC64 darn instruction not supported
https://bugs.kde.org/show_bug.cgi?id=434840 --- Comment #2 from Julian Seward --- (In reply to Carl Love from comment #1) Handing it off to the hardware via a helper function is the right thing to do. But you can't use a clean helper for this; that breaks the rules for clean helpers. A clean helper * must produce a value which depends only on its arguments * given the same argument, always produces the same result. and the IR optimiser relies (or may rely) on the above being true. You need to use a dirty helper. Also I'm not happy about the replacement C implementations; I'm not sure when they would ever get used. I'd prefer if they were removed. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 401416] Compile failure with openmpi 4.0
https://bugs.kde.org/show_bug.cgi?id=401416 Julian Seward changed: What|Removed |Added Resolution|--- |FIXED Status|CONFIRMED |RESOLVED --- Comment #8 from Julian Seward --- Fixed, 3415e1e1acc5095607663071db299f961bd65bde. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 401416] Compile failure with openmpi 4.0
https://bugs.kde.org/show_bug.cgi?id=401416 --- Comment #5 from Julian Seward --- (In reply to Mark Wielaard from comment #4) > Is there a reason to remove them completely instead of keep using the if > defined (...) constructs? I had wondered that too. I'm trying the if-defined scheme right now. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434193] GCC 9+ inlined strcmp causes "Conditional jump or move depends on uninitialised value" report
https://bugs.kde.org/show_bug.cgi?id=434193 Julian Seward changed: What|Removed |Added Status|REPORTED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Julian Seward --- Committed, 21571252964c4d7860210cbe9f60a49eb6881824. Thanks for the patch (and analysis). -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432801] Valgrind 3.16.1 reports a jump based on uninitialized memory somehow related to clang and signals
https://bugs.kde.org/show_bug.cgi?id=432801 --- Comment #22 from Julian Seward --- (In reply to Eyal from comment #21) > Will anyone consider the patch? This (problem) is definitely something I want to deal with, and your patch seems like a good starting place. However, 3.17 is imminent and I would prefer to defer till after 3.17 is done. In part because I'd like to do more general qualifying runs against gcc-11 and clang-11 compiled code. The patch might need to be generalised, so as to make it handle different vector widths, lane widths and even maybe the scalar cases. Regarding the frequency of false positives, this is the first time I have heard of this problem. By comparison, we have many bug reports that relate to (eg) clang/gcc doing equality comparisons on partially defined data. I'd accept also though, that if this is a new optimisation in clang, then we might get more such reports in the future. But right now it doesn't seem critical enough to put into 3.17. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434193] GCC 9+ inlined strcmp causes "Conditional jump or move depends on uninitialised value" report
https://bugs.kde.org/show_bug.cgi?id=434193 --- Comment #6 from Julian Seward --- Here's my proposed fix. It works for your test case on x86_64 but I didn't test it on arm64. Does it work for you? diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c index 516988bdd..759f00b3a 100644 --- a/memcheck/mc_translate.c +++ b/memcheck/mc_translate.c @@ -8590,12 +8590,14 @@ IRSB* MC_(instrument) ( VgCallbackClosure* closure, # elif defined(VGA_amd64) mce.dlbo.dl_Add32 = DLexpensive; mce.dlbo.dl_Add64 = DLauto; + mce.dlbo.dl_CmpEQ16_CmpNE16 = DLexpensive; mce.dlbo.dl_CmpEQ32_CmpNE32 = DLexpensive; mce.dlbo.dl_CmpEQ64_CmpNE64 = DLexpensive; # elif defined(VGA_ppc64le) // Needed by (at least) set_AV_CR6() in the front end. mce.dlbo.dl_CmpEQ64_CmpNE64 = DLexpensive; # elif defined(VGA_arm64) + mce.dlbo.dl_CmpEQ16_CmpNE16 = DLexpensive; mce.dlbo.dl_CmpEQ64_CmpNE64 = DLexpensive; # endif -- You are receiving this mail because: You are watching all bug changes.
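The cheap-vs-expensive distinction the patch toggles can be modelled in a few lines of C. This is a simplified sketch of the idea, not Memcheck's actual instrumentation; the key point is that for an equality comparison, a single *defined* bit that differs between the operands already determines the answer ("not equal"), so the result is defined even if other bits are undefined.

```c
#include <stdint.h>

/* V-bits convention: a 1 bit means "undefined". */

/* Cheap scheme: any undefined input bit makes the result undefined. */
static int naive_eq_defined(uint16_t va, uint16_t vb)
{
   return (va | vb) == 0;
}

/* Expensive (DLexpensive) scheme for CmpEQ16/CmpNE16: if the operands
   differ in a bit that is defined in both, the result is defined. */
static int expensive_eq_defined(uint16_t a, uint16_t va,
                                uint16_t b, uint16_t vb)
{
   uint16_t both_defined = (uint16_t)~(va | vb);
   uint16_t differ       = (uint16_t)(a ^ b);
   if (differ & both_defined)
      return 1;                 /* certainly unequal: result defined */
   return (va | vb) == 0;       /* otherwise need fully defined inputs */
}
```

This is why the scheme suppresses the inlined-strcmp false positive: the comparison that terminates the loop differs on defined bytes, so Memcheck no longer complains about the undefined tail.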
[valgrind] [Bug 434193] GCC 9+ inlined strcmp causes "Conditional jump or move depends on uninitialised value" report
https://bugs.kde.org/show_bug.cgi?id=434193 --- Comment #5 from Julian Seward --- (In reply to Mike Crowe from comment #4) > It ain't half slow on anything more than toy test cases though. :) That was a general test to establish whether lack of precise instrumentation was indeed the problem. Now that we've established that, I can enable precise instrumentation just for the case CmpEQ16/NE16 on those two targets, which shouldn't be much of a per hit, unless your program spends all its time doing 16-bit equality comparisons. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 434193] GCC 9+ inlined strcmp causes "Conditional jump or move depends on uninitialised value" report
https://bugs.kde.org/show_bug.cgi?id=434193 --- Comment #3 from Julian Seward --- Does running with --expensive-definedness-checks=yes make any difference? -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 430429] valgrind.h doesn't compile on s390x with clang
https://bugs.kde.org/show_bug.cgi?id=430429 --- Comment #6 from Julian Seward --- (In reply to Paul Floyd from comment #5) > Quick confirmation for amd64 Fedora. No problems with clang 11 or clang 13 Thanks. (general comment, not suggesting any action right now): Looking at valgrind.h now, I am thinking that it is a mistake for us to use `unsigned long int` as the basic machine-word type for 64-bit targets. Perhaps we should change those en masse to `unsigned long long int`, so as to avoid any ambiguity in the type-sizes. TBH I am not sure why it uses `unsigned long int` at all. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 431157] PPC_FEATURE2_SCV needs to be masked in AT_HWCAP2
https://bugs.kde.org/show_bug.cgi?id=431157 --- Comment #25 from Julian Seward --- (In reply to Tom Hughes from comment #24) > Indeed if that is -2 that would be errno 2 aka ENOENT which would make sense. Agreed. So then it would seem to me that the problem is that the patch doesn't update getSyscallStatusFromGuestState. How this works is: the assembly code in syscall-ppc64{le,be}-linux.S hands the syscall to the host, and when it returns, it copies the result back into the guest state: /* put the result back in the threadstate */ 3: std 3,OFFSET_ppc64_GPR3(30) /* gst->GPR3 = sc result */ /* copy cr0.so back to simulated state */ mfcr 5 /* r5 = CR */ rlwinm 5,5,4,31,31 /* r5 = (CR >> 28) & 1 */ stb 5,OFFSET_ppc64_CR0_0(30) /* gst->CR0.SO = cr0.so */ So far so good. No problem there. The problem occurs slightly later, when the main driver logic in syswrap-main.c calls getSyscallStatusFromGuestState to find out whether the syscall failed or not, in a target-independent way. And I am guessing this needs to be updated too. That currently computes the result as canonical->sres = VG_(mk_SysRes_ppc64_linux)( gst->guest_GPR3, cr0so ); meaning, it is using the legacy (R3 + CR0.SO) scheme to figure out whether the call failed, regardless of which scheme (sc or scv) the guest originally requested. Hence also, the syscall appears to fail or succeed depending on what junk the kernel left for us in CR0.SO, hence the apparently random failures that have been observed. Fixing it would be pretty simple if getSyscallStatusFromGuestState can find out for sure which syscall scheme was requested. That is probably safest done by passing it sci->orig_args.
Another complication (sigh) is that this only covers the cases where we run the syscall async: if (sci->flags & SfMayBlock) { /* Syscall may block, so run it asynchronously */ vki_sigset_t mask; But the sync case will also need fixing: } else { /* run the syscall directly */ and hence it looks like convert_SysRes_to_SyscallStatus will need fixing too. -- You are receiving this mail because: You are watching all bug changes.
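The target-independent decision being described could be sketched like this. This is a hypothetical helper, not the actual getSyscallStatusFromGuestState code: `sc` signals failure via CR0.SO with the positive errno in R3, while `scv` ignores CR0.SO and returns a negated errno, with failures occupying the range -4095..-1 as on other Linux ABIs (comment 23 above recalls the range as -1 to -4096; the kernel's MAX_ERRNO convention is 4095).

```c
#include <stdbool.h>
#include <stdint.h>

enum SyscallScheme { SCHEME_SC, SCHEME_SCV };

/* Decide failure according to the scheme the guest actually used,
   which is the fix being proposed: pass the scheme down rather than
   always applying the legacy R3 + CR0.SO interpretation. */
static bool syscall_failed(enum SyscallScheme scheme,
                           uint64_t r3, bool cr0_so)
{
   if (scheme == SCHEME_SC)
      return cr0_so;                    /* legacy: flag in CR0.SO */
   /* scv: failure iff the result lies in [-4095, -1] */
   return (int64_t)r3 < 0 && (int64_t)r3 >= -4095;
}
```

Under the legacy-only interpretation, an `scv` result of -2 (ENOENT) paired with a stale CR0.SO of zero looks like a "successful" open returning a negative fd, which matches the observed symptoms.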
[valgrind] [Bug 431157] PPC_FEATURE2_SCV needs to be masked in AT_HWCAP2
https://bugs.kde.org/show_bug.cgi?id=431157 --- Comment #23 from Julian Seward --- (In reply to Tom Hughes from comment #22) > So if thinks the open was a success with a negative file descriptor returned? Ah, maybe it's trying to tell us something :-) I vaguely remember reading that this new syscall convention denotes failure by returning a value between -1 and -4096 [bizarrely, like x86-32 in the earliest days of Linux]. If that's true, it would imply that the call was passed to the kernel using the new convention, it failed, but we mistakenly believe, in the post-syscall handling, that it succeeded. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 430429] valgrind.h doesn't compile on s390x with clang
https://bugs.kde.org/show_bug.cgi?id=430429 --- Comment #4 from Julian Seward --- (In reply to Andreas Arnez from comment #2) @Andreas: if the patch works on s390, I think you should land it. > @Julian: Isn't the patch necessary for other 64-bit platforms as well then? > Such as ppc64le or amd64? Well, we have no reports of such problems on either ppc64le or amd64. Though looking at the code, I fundamentally agree with you .. I don't see why s390 should have a problem here but not the others. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 431157] PPC_FEATURE2_SCV needs to be masked in AT_HWCAP2
https://bugs.kde.org/show_bug.cgi?id=431157 --- Comment #20 from Julian Seward --- Looking at the end of the log there: SYSCALL[279836,1](291) sys_newfstatat ( 24, 0x4a73980(), 0x4ad2578 )[sync] --> Failure(0x9) SYSCALL[279836,1](90) sys_mmap ( 0x0, 0, 1, 2, 24, 0 ) --> [pre-fail] Failure(0x16) ==279836== ==279836== Process terminating with default action of signal 11 (SIGSEGV): dumping core The mmap call is marked "[pre-fail]", which means that V never passed it to the kernel. Instead it decided to "fail" it itself, for the reason 0x16, whatever that is. Might be worth investigating. Also I wonder if the failure of the immediately preceding sys_newfstatat syscall is related. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432552] [AArch64] invalid error emitted for pre-decremented byte and half-word addresses
https://bugs.kde.org/show_bug.cgi?id=432552 Julian Seward changed: What|Removed |Added Resolution|--- |FIXED Status|REPORTED|RESOLVED --- Comment #2 from Julian Seward --- Fixed, 2ab8ab38a372ab1f10094588dc5ad8f299c13ca6. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432552] [AArch64] invalid error emitted for pre-decremented byte and half-word addresses
https://bugs.kde.org/show_bug.cgi?id=432552 Julian Seward changed: What|Removed |Added Summary|[AArch64] invalid error |[AArch64] invalid error |emitted for pre-incremented |emitted for pre-decremented |byte and half-word |byte and half-word |addresses |addresses -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429354] PPC ISA 3.1 support is missing, part 8
https://bugs.kde.org/show_bug.cgi?id=429354 --- Comment #6 from Julian Seward --- Thanks for the fixes. r+; please land. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432552] [AArch64] invalid error emitted for pre-incremented byte and half-word addresses
https://bugs.kde.org/show_bug.cgi?id=432552 --- Comment #1 from Julian Seward --- I suspect this can be fixed by finding the following at or near line 4967 of guest_arm64_toIR.c Bool earlyWBack = wBack && simm9 < 0 && (szB == 8 || szB == 4) && how == BITS2(1,1) && nn == 31 && !isLoad; and changing it to also accept szB == 2 and szB == 1. Given the existing logic, it's not surprising that you observe .. This error is emitted for byte and half-word stores. The error does not happen for word and double-word stores. I haven't tested this proposed fix yet though. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432801] Valgrind 3.16.1 reports a jump based on uninitialized memory somehow related to clang and signals
https://bugs.kde.org/show_bug.cgi?id=432801 --- Comment #17 from Julian Seward --- Interesting analysis, and a plausible patch; thank you for that. This seems like a new trick from LLVM. I'm still struggling to understand what's going on, though. I can see that for (size_t i = 0; i < plen; ++i) hp += pattern[i]; could be vectorised as you say, so that it loads 4 bytes at a time, and uses punpcklbw twice to interleave them as described in comment 12. But: * where's the addition instruction that merges the lanes together? I don't see that. * what is the purpose of the pcmpgtd instruction? The original sources contain a scalar comparison against zero if (hp==j) { j++; } Is that related? If so, how does a scalar 32-bit equality test against zero get translated into a vector 32x4 signed-greater-than operation? --- In the patch, there's mention of biasing: + // From here on out, we're dealing with biased integers instead of 2's + // complement. What does that mean, in this context? Regarding the test: * you put it in memcheck/tests/x86; "x86" here means 32-bit only. Is that what you intended? I would have expected it to go in the "amd64" directory. * because the test is written in C, whether or not it tests what you expect it to test depends entirely on the compiler used to compile it. And most likely, it won't be vectorised, or won't be vectorised in the same way. This kind of test really needs to be written in assembly (inline assembly) so we know what we're testing. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429354] PPC ISA 3.1 support is missing, part 8
https://bugs.kde.org/show_bug.cgi?id=429354 --- Comment #4 from Julian Seward --- return a & 0xFF, I meant. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429354] PPC ISA 3.1 support is missing, part 8
https://bugs.kde.org/show_bug.cgi?id=429354 --- Comment #3 from Julian Seward --- (In reply to Julian Seward from comment #2) >uint16_t getMSBs_8x16(vec128) >{ > let hiHalf = vec128[127:64] // LE numbering > let loHalf = vec128[ 63:0] > // In each byte lane, copy the MSB to all bit positions > hiHalf = shift_right_signed_8x8(hiHalf, 7); > loHalf = shift_right_signed_8x8(loHalf, 7); > // Now each byte lane is either 0x00 or 0xFF > // Make (eg) lane 7 contain either 0x00 or 0x80, lane 6 contain > // either 0x00 or 0x40, etc > hiHalf &= 0x8040201008040201; > loHalf &= 0x8040201008040201; > hi8msbs = add_across_lanes_8x8(hiHalf) > lo8msbs = add_across_lanes_8x8(loHalf) > return (hi8msbs << 8) | lo8msbs; >} One more thought, regarding add_across_lanes_8x8(). In fact you can do this semi-reasonably using standard 64-bit scalar code, because of the nature of the values involved. Specifically, we are adding together 8 bytes, each of which is either zero or it has a 1 bit in a different location. Hence there will never be any carry-bit propagation at all in the addition, and so it can be implemented -- for this particular use case only -- as uint64_t add_across_lanes_8x8(uint64_t a) { a += (a >> 8); a += (a >> 16); a += (a >> 32); return a; } (I *think*) -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429375] PPC ISA 3.1 support is missing, part 9
https://bugs.kde.org/show_bug.cgi?id=429375 --- Comment #3 from Julian Seward --- == https://bugs.kde.org/attachment.cgi?id=133479 test cases for VSX-PCV generate operations OK to land == https://bugs.kde.org/attachment.cgi?id=134757 VSX- PCV Generate Operation functional support Some concerns here. /*---*/ /*--- Misc integer helpers. ---*/ /*---*/ +void write_VSX_entry (VexGuestPPC64State* gst, UInt reg_offset, + ULong *vsx_entry); Mark this as static or give it a comment that it is called from generated code (one or the other) Also it seems like this function prototype is in a .c file, not a header. Seems not right. +/*--*/ +/* VSX Vector Generate PCV from Mask helpers ---*/ +/*--*/ +void write_VSX_entry (VexGuestPPC64State* gst, UInt reg_offset, + ULong *vsx_entry) comment or 'static' pls +void vector_gen_pvc_byte_mask_dirty_helper( VexGuestPPC64State* gst, +ULong src_hi, ULong src_lo, +UInt reg_offset, UInt imm ) { and here + } else { + vex_printf("ERROR, vector_gen_pvc_byte_mask_dirty_helper, imm value %u not supported.\n", + imm); + } After printing, do vassert(0); so the system stops, rather than continues with incorrect data. + return; +} the return is redundant +void vector_gen_pvc_hword_mask_dirty_helper( VexGuestPPC64State* gst, + ULong src_hi, ULong src_lo, + UInt reg_offset, UInt imm ) { comment or 'static' + } else { + vex_printf("ERROR, vector_gen_pvc_hword_dirty_mask_helper, imm value %u not supported.\n", + imm); + } and assert, and rm redundant return +void vector_gen_pvc_word_mask_dirty_helper( VexGuestPPC64State* gst, +ULong src_hi, ULong src_lo, +UInt reg_offset, UInt imm ) { same comments +void vector_gen_pvc_dword_mask_dirty_helper( VexGuestPPC64State* gst, + ULong src_hi, ULong src_lo, + UInt reg_offset, UInt imm ) { same comments + IREffect VSX_fx = Ifx_Write; Inline this at its single use point, since it is never changed. 
+ IRExpr** args = mkIRExprVec_5( + IRExpr_GSPTR(), + mkexpr( src_hi ), + mkexpr( src_lo ), + mkU32( reg_offset ), + mkU64( IMM ) ); + + + No need for three blank lines here. + d->nFxState = 1; + vex_bzero(&d->fxState, sizeof(d->fxState)); + d->fxState[0].fx = VSX_fx; + d->fxState[0].size = sizeof(U128); + d->fxState[0].offset = reg_offset; You're sure this declaration of effects is correct, right? The called helper really only writes one of the vector registers and doesn't read any? + default: + vex_printf("dis_vector_generate_pvc_from_mask (opc2)\n"); + return False; + } Don't print in failure paths; just return False. == -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429354] PPC ISA 3.1 support is missing, part 8
https://bugs.kde.org/show_bug.cgi?id=429354 --- Comment #2 from Julian Seward --- == https://bugs.kde.org/attachment.cgi?id=133462 Test cases for VSX Mask Manipulation operations OK to land == https://bugs.kde.org/attachment.cgi?id=133461 Functional support for ISA 3.1, VSX Mask manipulation operations I have some concerns about the verboseness/inefficency of the generated IR. +static IRExpr * copy_MSB_bit_fields ( IRExpr *src, UInt size ) +{ + /* Input src is a V128 value. Input size is the number of bits in each +* vector field. The function copies the MSB of each vector field into +* the low order bits of the 64-bit result. +*/ You can do it like this, but it looks very expensive. Note that there is an IROp which (if I understand this correctly) does what you want: /* MISC CONVERSION -- get high bits of each byte lane, a la x86/amd64 pmovmskb */ Iop_GetMSBs8x16, /* V128 -> I16 */ You could use that instead (and I'd encourage you to do so), although it does mean you'd have to handle this in the POWER backend, obviously. Although now I think about it, I am not sure what the deal is with endianness -- Iop_GetMSBs8x16 has been used so far only on little-endian targets. If Power has an instruction that can add all the lanes of a SIMD value together, creating a single number, then there's a faster way to do this, that doesn't involve Iop_GetMSBs8x16. 
Something like this: uint16_t getMSBs_8x16(vec128) { let hiHalf = vec128[127:64] // LE numbering let loHalf = vec128[ 63:0] // In each byte lane, copy the MSB to all bit positions hiHalf = shift_right_signed_8x8(hiHalf, 7); loHalf = shift_right_signed_8x8(loHalf, 7); // Now each byte lane is either 0x00 or 0xFF // Make (eg) lane 7 contain either 0x00 or 0x80, lane 6 contain // either 0x00 or 0x40, etc hiHalf &= 0x8040201008040201; loHalf &= 0x8040201008040201; hi8msbs = add_across_lanes_8x8(hiHalf) lo8msbs = add_across_lanes_8x8(loHalf) return (hi8msbs << 8) | lo8msbs; } There are variants, but you get the general idea from the above. + if (IFIELD(theInstr, 1, 5) == 0xA)//mtvsrbmi + inst_select = 0x; Add a comment to explain that this is a special-case hack for mtvsrbmi. + default: + vex_printf("dis_VSR_byte_mask (opc2)\n"); + return False; + } Don't print on failure paths; only return False. + for(i = 0; i< max; i++) { + ones_lo[i+1] = newTemp( Ity_I64 ); + test_lo[i] = newTemp( Ity_I64 ); + ones_hi[i+1] = newTemp( Ity_I64 ); + test_hi[i] = newTemp( Ity_I64 ); .. This seems really inefficient; is there no way to do this somewhat in parallel using SIMD IROps ? == -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429352] PPC ISA 3.1 support is missing, part 7
https://bugs.kde.org/show_bug.cgi?id=429352 --- Comment #10 from Julian Seward --- == https://bugs.kde.org/attachment.cgi?id=133459 Test cases for new IOps OK to land == https://bugs.kde.org/attachment.cgi?id=135083 Fix pcast for existing code OK to land == https://bugs.kde.org/attachment.cgi?id=135176 Add ACC support into routine get_otrack_shadow_offset_wrk() OK to land == https://bugs.kde.org/attachment.cgi?id=135082 Functional support for ISA 3.1 New Iops OK to land -- I don't see any significant problems with it -- but first please fix the following comments. +UInt generate_DFP_FPRF_value_helper( UInt gfield, If this is intended to be called from VEX-generated code, add a comment to that effect, and add an extern qualifier. If it isn't, add a static qualifier. (== follow the existing conventions) +PPCInstr* PPCInstr_AvTernaryInt128 ( PPCAvOp op, HReg dst, + HReg src1, HReg src2, HReg src3 ) { + PPCInstr* i = LibVEX_Alloc_inline(sizeof(PPCInstr)); + i->tag = Pin_AvTrinaryInt128; + i->Pin.AvTrinaryInt128.op = op; There's still an inconsistency in the naming: Ternary vs Trinary. I'd prefer you use Ternary consistently, everywhere. @@ -2321,6 +2428,21 @@ void ppPPCInstr ( const PPCInstr* i, Bool mode64 ) ppHRegPPC(i->Pin.Dfp128Cmp.dst); vex_printf(",8,28,31"); return; + + case Pin_XFormUnary994: + if (Px_DFPTOIQS) { This isn't right. It needs to be if (i->Pin.XFormUnary994.op == Px_DFPTOIQS) or something like that. + case Pin_XFormUnary994: { + int inst_sel; Regarding "int", please use the project-defined types, always: Int, Char, UChar etc, not "int", "char" Also, here, "inst_sel" is local to each of the two case-clauses, so move it into each. + case Iop_D128toI128S: { + HReg srcHi = newVRegF(env); + HReg srcLo = newVRegF(env); .. + iselDfp128Expr( &srcHi, &srcLo, env, e->Iex.Binop.arg2, IEndianess ); Isn't it the case that iselDfp128Expr always writes its first two arguments (meaning, it selects the output virtual registers itself) ? 
If so, then it's pointless to initialise srcHi/srcLo by calling newVRegF, because those vregs will be ignored. Better to initialise them with HReg_INVALID (or is it INVALID_HREG). Similar comment in some other places further down the file. + // store the two 64-bit pars pairs + HReg rHi, rLo; + HReg dst = newVRegV(env); + + iselInt128Expr(&rHi, &rLo, env, e->Iex.Unop.arg, IEndianess); For example here: now there's no initialisation at all. Initialise rHi/rLo to HREG_INVALID here. @@ -314,8 +316,10 @@ void ppIROp ( IROp op ) case Iop_F128LOtoF64: vex_printf("F128LOtoF64"); return; case Iop_I32StoF128: vex_printf("I32StoF128"); return; case Iop_I64StoF128: vex_printf("I64StoF128"); return; + case Iop_I128StoF128: vex_printf("128StoF128"); return; In the output text, the initial "I" is missing @@ -4753,15 +4769,19 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce, uifu = mkUifU1; difd = mkDifD1; and_or_ty = Ity_I1; improve = mkImproveOR1; goto do_And_Or; - do_And_Or: + do_And_Or: Please restore the original indentation for this label. == -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432510] RFE: ENOMEM fault injection mode
https://bugs.kde.org/show_bug.cgi?id=432510 --- Comment #2 from Julian Seward --- (In reply to Frank Ch. Eigler from comment #0) Before we get into discussing how to implement stuff, I think it's important to have a more precise statement of what functionality is required. Can you give more details on what errors should be triggered, how they should be reported, and when they should be triggered? -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432161] Addition of arm64 v8.2 FABS, FNEG and FSQRT instructions
https://bugs.kde.org/show_bug.cgi?id=432161 --- Comment #6 from Julian Seward --- (In reply to ahashmi from comment #5) Looks good to me. Land! -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 432161] Addition of arm64 v8.2 FABS, FNEG and FSQRT instructions
https://bugs.kde.org/show_bug.cgi?id=432161 --- Comment #3 from Julian Seward --- Nicely done! I did spot one bug though: @@ -3752,6 +3775,8 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce, case Iop_I32StoF32x4: case Iop_F32toI32Sx4: + case Iop_Sqrt16Fx8: + return unary16Fx8_w_rm(mce, vatom1, vatom2); case Iop_Sqrt32Fx4: return unary32Fx4_w_rm(mce, vatom1, vatom2); case Iop_Sqrt64Fx2: This isn't quite right, in that the two added lines change the behaviour for the fallthrough cases immediately above. I mean: consider what happens for Iop_I32StoF32x4 and Iop_F32toI32Sx4 with and without the added lines. Otherwise it's fine. One final question: did you check that the relevant test programs run OK (without any assertion failures) on memcheck, by running them by hand, with --tool=memcheck and also --tool=memcheck --track-origins=yes ? If not, could you please do that and make sure nothing breaks. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429352] PPC ISA 3.1 support is missing, part 7
https://bugs.kde.org/show_bug.cgi?id=429352 --- Comment #5 from Julian Seward --- (In reply to Carl Love from comment #1) > Created attachment 133459 [details] > Test cases for new IOps > > The test cases This seems OK to me. But per comments on https://bugs.kde.org/attachment.cgi?id=133458, you need to ensure that absolutely all of the Power instruction tests run successfully with --tool=memcheck and --tool=memcheck --track-origins=yes. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429352] PPC ISA 3.1 support is missing, part 7
https://bugs.kde.org/show_bug.cgi?id=429352 --- Comment #4 from Julian Seward --- (In reply to Carl Love from comment #0) > Created attachment 133458 [details] > Functional support for ISA 3.1, New Iops > > Power PC missing ISA 3.1 support. Additional patches, part 7 It seems to me that the mc_translate.c sections of the patch aren't right, which could cause Memcheck to assert when running programs containing the new instructions. -- diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c @@ -3410,8 +3412,7 @@ IRAtom* expr2vbits_Triop ( MCEnv* mce, unary64Fx2_w_rm(mce, vatom1, vatom2), unary64Fx2_w_rm(mce, vatom1, vatom3))); - - default: + default: Nit: return the `default:` to its correct indentation -- + // CARLL not sure about this yet. Remove this pls -- - do_And_Or: + /* Int 128-bit Integer two arg */ + // CARLL not sure about this yet. + case Iop_DivU128: + case Iop_DivS128: + case Iop_DivU128E: + case Iop_DivS128E: + case Iop_ModU128: + case Iop_ModS128: + uifu = mkUifUV128; difd = mkDifDV128; + and_or_ty = Ity_V128; improve = mkImproveORV128; goto do_And_Or; + This strikes me as wrong. `do_And_Or` is only intended to instrument AND and OR operations. But here, you're using it to instrument Div/Mod operations, which isn't correct. -- @@ -5001,15 +5024,18 @@ IRExpr* expr2vbits_Unop ( MCEnv* mce, IROp op, IRAtom* atom ) return assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, vatom)); case Iop_F128LOtoF64: /* F128 -> low half of F128 */ case Iop_D128LOtoD64: /* D128 -> low half of D128 */ + case Iop_TruncF128toI128S: /* F128 -> I128S */ + case Iop_TruncF128toI128U: /* F128 -> I128U */ return assignNew('V', mce, Ity_I64, unop(Iop_128to64, vatom)); This also doesn't seem correct to me, in the sense that it simply ignores the definedness of the most significant 64 bits of the input, and, I am guessing, will create general incorrectly-typed IR. Has it been tested? 
These cases TruncF128toI128S and TruncF128toI128U do not belong with F128LOtoF64 and D128LOtoD64. I suggest you add them instead to the group that includes NegF128, AbsF128 and RndF128. -- But even that (present code) doesn't seem correct: case Iop_NegF128: case Iop_AbsF128: case Iop_RndF128: case Iop_TruncF128toI64S: /* F128 -> I64S */ case Iop_TruncF128toI32S: /* F128 -> I32S (result stored in 64-bits) */ case Iop_TruncF128toI64U: /* F128 -> I64U */ case Iop_TruncF128toI32U: /* F128 -> I32U (result stored in 64-bits) */ return mkPCastTo(mce, Ity_I128, vatom); NegF128/AbsF128/RndF128 are correct, because their shadow value will be an I128 type and hence it is correct to do ` mkPCastTo(mce, Ity_I128, vatom)`. But TruncF128toI64S and Iop_TruncF128toI64U need to be PCast-ed to an I64 value -- so they belong somewhere else in this file, not here, and TruncF128toI32S and TruncF128toI32U need to be PCast-ed to an I32 value, again, somewhere else. Similarly these 6 case Iop_ReinterpF64asI64: case Iop_ReinterpI64asF64: case Iop_ReinterpI32asF32: case Iop_ReinterpF32asI32: case Iop_ReinterpI64asD64: case Iop_ReinterpD64asI64: are probably not correct; they need to be PCasted to I64/I32 appropriately. -- Some smaller things, in other places in the patch: diff --git a/VEX/priv/guest_ppc_toIR.c b/VEX/priv/guest_ppc_toIR.c +static void generate_store_DFP_FPRF_value( ULong irType, IRExpr *src, + const VexAbiInfo* vbi ) ... + } else { + vex_printf("ERROR generate_store_DFP_FPRF_value, unknown value for irType 0x%lx\n", (unsigned long ) irType); + } Uhh .. this isn't good. Please remove it, and either assert/panic here if reaching here indicates a bug in this front end, or cause this to generate a SIGILL (insn-not-decoded) in the normal way. -- + } + break; } Indent the `break` correctly, so it's easier to see what scope it's relevant to. 
-- diff --git a/VEX/priv/host_ppc_defs.c b/VEX/priv/host_ppc_defs.c +PPCInstr* PPCInstr_AvTrinaryInt128 ( PPCAvOp op, HReg dst, + HReg src1, HReg src2, HReg src3 ) { I believe the normal terminology for a 3-way thing is "ternary", not "trinary" (I've never heard the latter word before, AFAIR) -- @@ -4902,13 +5075,27 @@ Int emit_PPCInstr ( /*MB_MOD*/Bool* is_profInc, case Pfp_IDUTOQ: // xscvudqp p = mkFormVXR0( p, 63, fr_dst, 2, fr_src, 836, 0, endness_host ); break; + case Pfp_IQSTOQ: // xscvsqqp +// vex_printf("CARLL, issue xscvsqqp instruction. If not on P10, I will crash now on illegal inst.\n"); + p = mkFormVXR0( p,
[valgrind] [Bug 431556] Complete arm64 FADDP v8.2 instruction support started in 413547
https://bugs.kde.org/show_bug.cgi?id=431556 Julian Seward changed: What|Removed |Added Resolution|--- |FIXED Status|REPORTED|RESOLVED --- Comment #6 from Julian Seward --- Landed, 8fa9e36f7c269563feebbf41a7a658bc7879ad39. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 431556] Complete arm64 FADDP v8.2 instruction support started in 413547
https://bugs.kde.org/show_bug.cgi?id=431556 Julian Seward changed: What|Removed |Added CC||jsew...@acm.org --- Comment #3 from Julian Seward --- (In reply to ahashmi from comment #2) > Created attachment 134852 [details] > Patch completes addition of arm64 v8.2 faddp instructions. Looks good to me; a couple of small queries, but basically is landable. --- The cases are distinguished as follows: isD == True, bitQ == 1 => 2d isD == False, bitQ == 1 => 4s isD == False, bitQ == 0 => 2s + isH == True, bitQ == 0 => 4h + isH == False, bitQ == 1 => 8h Is this comment out of date? The function it applies to takes an ARM64VecESize now, not isH / isD. --- + if (1) test_faddp_4h_00_00_00(TyH); The tests where the three register numbers are the same .. are they of any value? In particular, they won't expose mixups where the wrong register number is used in decode. Those cases are covered by the _N_N+1_N+2 variants afaics. Is there some other reason to keep the N_N_N variants? -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 413547] regression test does not check for Arm 64 features
https://bugs.kde.org/show_bug.cgi?id=413547 Julian Seward changed: What|Removed |Added Status|REPORTED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Julian Seward --- Landed, 3b1710d38cf19619242c9113a2dbe291e914a8c2. Thank you for the work on this! -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 414268] Enable AArch64 feature detection and decoding for v8.x instructions (where x>0)
https://bugs.kde.org/show_bug.cgi?id=414268 Julian Seward changed: What|Removed |Added Status|CONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #18 from Julian Seward --- Landed, cb52fee5ddbc2c0e936fd1efe5107a1afcf375cf. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 413547] regression test does not check for Arm 64 features
https://bugs.kde.org/show_bug.cgi?id=413547 --- Comment #7 from Julian Seward --- You can get to a "patch is obsolete" check box by following the twisty maze attached to the "Details" link on the "Attachments" box. Anyways, review comments. This looks fine, on the whole. OK to land providing the apparently-bogus faddp implementation issue is resolved. Most of the code in this patch is, I assume, boilerplate test-support functions taken from other tests in none/tests/arm64. + /* Temporary clause to test hwcaps functionality. */ + if (bitU == 0 && sz <= X00 && opcode == BITS5(0,1,1,0,1)) { + /* 0,00,01101 ADDP h_2h */ + if ((archinfo->hwcaps & VEX_HWCAPS_ARM64_FP16) == 0) + return False; + DIP("faddp h%u, v%u.2h\n", dd, nn); + vex_printf("Unimplemented v8.2 instruction scalar FADDP: 0x%08x\n", insn); + return True; + } Should this be present? It certainly looks like it shouldn't be; if it accepts the instruction then it gives no implementation for it, which is definitely bogus. +static Double randDouble ( void ) +{ + ensure_special_values_initted(); + UChar c = randUChar(); + if (c >= 128) { + // return a normal number most of the time. "most" is somewhat misleading; "half" might be more accurate. Ditto in randFloat. > tests/arm64_features.c Just as a general sanity check .. is the way that the capabilities are established (by looking at auxv HWCAPS) consistent with the outcome of the discussion in bug 414268 ? I'm not looking for any change here. Instead I just want to check that there's an overall top-level coherent story about how capabilities in V are done for AArch64. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 413547] regression test does not check for Arm 64 features
https://bugs.kde.org/show_bug.cgi?id=413547 Julian Seward changed: What|Removed |Added Attachment #133667|0 |1 is obsolete|| -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 413547] regression test does not check for Arm 64 features
https://bugs.kde.org/show_bug.cgi?id=413547 --- Comment #5 from Julian Seward --- Assad, am I right to understand that the comment 3 patch has been made redundant by the comment 4 patch? -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 414268] Enable AArch64 feature detection and decoding for v8.x instructions (where x>0)
https://bugs.kde.org/show_bug.cgi?id=414268 --- Comment #14 from Julian Seward --- Thank you all for the high quality discussion and analysis. This looks fine to me; please land. One tiny nit .. please #undef get_cpu_ftr and get_ftr just before the closing brace of the fn that defines them. I agree with Peter's and Mark's view that this should be feature-based; but I see the difficulties in making that work cleanly because (unlike on x86) there's no "single source of truth" (viz, CPUID) that we can fake up for the guest, at our own convenience. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 429864] s390x: C++ atomic test_and_set yields false-positive memcheck diagnostics
https://bugs.kde.org/show_bug.cgi?id=429864 --- Comment #1 from Julian Seward --- Yes. There's a special Iop_CasCmpEQ{32,64} or some such, designed specifically to deal with this problem. See libvex_ir.h for details. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 428716] VEX/useful/smchash.c:39:16: error: Memory leak: gb [memleak]
https://bugs.kde.org/show_bug.cgi?id=428716 Julian Seward changed: What|Removed |Added CC||jsew...@acm.org --- Comment #6 from Julian Seward --- I don't think this is worth bothering with. It's not part of the build. It's just a program I hacked up to collect stats used to develop the hash functions that you see at the bottom of VEX/priv/guest_generic_bb_to_IR.c. -- You are receiving this mail because: You are watching all bug changes.
[valgrind] [Bug 427404] PPC ISA 3.1 support is missing, part 6
https://bugs.kde.org/show_bug.cgi?id=427404 --- Comment #11 from Julian Seward --- OK to land provided that setup_fxstate_struct Read/Write/Modify annotations are set more precisely, so as to avoid false positives and false negatives from Memcheck. -- You are receiving this mail because: You are watching all bug changes.