https://bugs.kde.org/show_bug.cgi?id=435665

--- Comment #6 from Carl Love <c...@us.ibm.com> ---
Julian:

On Fri, 2021-05-07 at 14:12 +0000, Julian Seward wrote:
> https://bugs.kde.org/show_bug.cgi?id=435665 
> 
> --- Comment #5 from Julian Seward <jsew...@acm.org> ---
> I'm somewhat concerned, at a fundamental level .. if I understand
> correctly, these instructions allow one to copy memory from one
> place to another (so why bother?  Why not just do normal
> loads/stores?)
> but -- at least as implemented -- there's no way to tell Memcheck
> or any other tool, that the values in the destination area are
> derived from values in the source area.  So you'll wind up with
> definedness false positives or negatives as a result of using them.
> 
> Can you explain the background to the insns, what they are used for,
> etc, so we can see if there's an implementation that fits better
> in the instrumentation framework?
> 

The instructions were added to communicate with optional hardware
accelerator units that a system may have.  The copy/paste instructions
can also be used to implement memcopy.

To use a hardware accelerator, the user program makes an OS call to
register the hardware accelerator.  The user program maps a memory
region for the accelerator into its program space.  The user program
can then communicate with the accelerator by reading/writing to the
memory region.  The copy and paste instructions are used to move the
data to/from the mapped memory for the hardware accelerator.  The
hardware only allows the copy and paste instructions to be used to
communicate with the accelerators.  Normal loads and stores cannot be
used.  If you try to use normal loads/stores you get a bus error.

The document:

 https://github.com/libnxz/power-gzip/blob/master/doc/power_nx_gzip_um.pdf

describes this in more detail.  I haven't actually written any code for
an accelerator but I was given some test programs for use in the
Valgrind support development.  One of the tests is a simple memcopy
program without an accelerator.  The other test communicates with an
accelerator.  The specific instructions are described in the ISA:

https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv

I believe the layout for the memory used to communicate could be
different for each accelerator.  The memory will be changed by the
accelerator.  I don't think there is any way we could know what parts
of the memory were changed (initialized or not) by the accelerator
prior to a copy from the mapped memory region back to a data structure
in the user program.  The instructions require the entire 128 bytes to
be copied/pasted.  You cannot copy or paste a subset of the 128-byte
block.

In my first attempt at supporting the instructions, I created a new
128-byte "copy buffer register" in the guest state.  I had the copy and
paste instructions explicitly copy and paste to the guest state copy
buffer register.  I also had a guest state register to track whether a
copy/paste was in progress, so I could detect errors.  This
worked for the memory copy program.  I would still have to have the
host do a copy/paste of the guest copy buffer register.  I would have
to get the mapped address from the instruction to then do the
copy/paste of the guest copy buffer.  It was all rather a mess.  I saw
no benefit in having a guest copy buffer.  
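
For illustration only, the guest-state model described above could be
sketched in plain C roughly as follows.  This is not VEX or Valgrind
code and it does not touch a real accelerator: GuestCopyState,
model_copy and model_paste are made-up names, ordinary memcpy stands in
for the real instructions, and the boolean return value stands in for
whatever status the real paste instruction reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* copy/paste always move one full 128-byte block. */
#define COPY_BUF_SIZE 128

/* Hypothetical stand-in for the extra guest state: the 128-byte
   "copy buffer register" plus a flag tracking a copy in progress. */
typedef struct {
    uint8_t copy_buf[COPY_BUF_SIZE];
    bool    copy_pending;   /* true between a copy and its paste */
} GuestCopyState;

/* Model of "copy": load 128 bytes from src into the guest copy buffer. */
void model_copy(GuestCopyState *st, const void *src)
{
    assert(st != NULL && src != NULL);
    memcpy(st->copy_buf, src, COPY_BUF_SIZE);
    st->copy_pending = true;
}

/* Model of "paste": store the buffered 128 bytes to dst.
   Returns false if no copy preceded the paste (the error case the
   status flag was meant to detect), true otherwise. */
bool model_paste(GuestCopyState *st, void *dst)
{
    assert(st != NULL && dst != NULL);
    if (!st->copy_pending)
        return false;
    memcpy(dst, st->copy_buf, COPY_BUF_SIZE);
    st->copy_pending = false;
    return true;
}
```

A model like this is enough for the plain memory copy test, since both
ends are ordinary memory.  It does not help when the destination is the
mapped accelerator region, because there the host still has to issue a
real copy/paste.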

Since I have to use the copy and paste instructions, I really didn't
see any other way to handle these instructions other than using a dirty
helper to actually do the operations on the host.

I can send you the test programs if that helps.

                        Carl
