On Tue, Jun 16, 2026 at 05:24:15AM +1000, Gavin Shan wrote:
> Hi Peter and Michael,
> 
> On 6/16/26 1:12 AM, Peter Maydell wrote:
> > On Mon, 15 Jun 2026 at 15:56, Michael S. Tsirkin <[email protected]> wrote:
> > > 
> > > On Mon, Jun 15, 2026 at 11:57:09AM +0100, Peter Maydell wrote:
> > > > On Mon, 15 Jun 2026 at 11:03, Gavin Shan <[email protected]> wrote:
> > > > > 
> > > > > All ram device regions were turned to be indirectly accessible by 
> > > > > commit
> > > > > 4a2e242bbb ("memory: Don't use memcpy for ram_device regions"). This 
> > > > > leads
> > > > > to guest hang on attempt to build 'cuda-samples' as reported by 
> > > > > Julia. The
> > > > > guest is started by the following command lines, with GH100 GPU card 
> > > > > passed
> > > > > from the host.
> > > > 
> > > > > diff --git a/include/system/memory.h b/include/system/memory.h
> > > > > index 1417132f6d..5878727d09 100644
> > > > > --- a/include/system/memory.h
> > > > > +++ b/include/system/memory.h
> > > > > @@ -2897,6 +2897,8 @@ void 
> > > > > address_space_register_map_client(AddressSpace *as, QEMUBH *bh);
> > > > >   void address_space_unregister_map_client(AddressSpace *as, QEMUBH 
> > > > > *bh);
> > > > > 
> > > > >   /* Internal functions, part of the implementation of 
> > > > > address_space_read.  */
> > > > > +void qemu_ram_copy(void *dest, const void *src, size_t n);
> > > > > +void qemu_ram_move(void *dest, const void *src, size_t n);
> > > > 
> > > > New function prototypes in include headers need documentation comments.
> > > > 
> > > > In particular for these, it's really important that we clearly say
> > > > what semantics we are attempting to provide with them, so that
> > > > (a) when we're reviewing or later updating the implementation we
> > > > know what we are trying to provide
> > > > (b) when we're looking at the callsites we know what the function
> > > > is guaranteeing to us and what it is not, and thus whether it's
> > > > OK to use it or we need something els.
> > > > 
> > > > > +#if defined(__i386__) || defined(__x86_64__)
> > > > > +#define HOST_UNALIGNED_MMIO_OK 1
> > > > > +#else
> > > > > +#define HOST_UNALIGNED_MMIO_OK 0
> > > > > +#endif
> > > > 
> > > > We need to do something better than this. We can't
> > > > just say "oh, we trust that on x86 this works": it is
> > > > neither actually true that the compiler guarantees it
> > > 
> > > sorry guarantee what?
> > 
> > Well, that's part of my point about "we need a doc comment":
> > what exactly are we trying to guarantee ? But whatever it is,
> > "hardcoded ifdef that privileges x86" is definitely the wrong
> > answer.
> > 
> 
> Could you please check the following comments (documentation context) for the
> added functions look good to you? Please let me know if there are anything
> can be improved.
> 
> +
> +/**
> + * qemu_ram_copy: copy data to a RAMBlock
> + *
> + * @dest: destination where the data is copied to. It's the pointer returned
> + *        by ramblock_ptr() and its variants.
> + * @src: source where the data is copied from. It can be a regular memory 
> block
> + *       or a pointer returned by ramblock_ptr() and its variants.
> + * @n: length of data to be copied
> + *
> + * The destination is always a pointer to data area of a RAMBlock which can
> + * be for a regular memory block or a MMIO region. The source can be a 
> pointer
> + * to a regular memory block or a MMIO region. Atomicity isn't guranteed by
> + * memcpy() to copy data between those MMIO regions as we do in this 
> function.
> + * Besides, unaligned accesses are well handled on all architectures except
> + * i386 and x86_64 where the unaligned accesses are supported by the CPUs.
> + *
> + * The ensured atomicity and alignment consideration, which aren't needed
> + * by data copy between two regular memory blocks. So performance penalty
> + * is introduced by this function in such circumstance.
> + */
> +void qemu_ram_copy(void *dest, const void *src, size_t n);
> +
> +/**
> + * qemu_ram_move: move data to a RAMBlock
> + *
> + * @dest: destination where the data is moved to. It's the pointer returned
> + *        by ramblock_ptr() and its variants.
> + * @src: source where the data is moved from. It can be a regular memory 
> block
> + *       or a pointer returned by ramblock_ptr() and its variants.
> + * @n: length of data to be moved
> + *
> + * The destination is always a pointer to data area of a RAMBlock which can
> + * be for a regular memory block or a MMIO region. The source can be a 
> pointer
> + * to a regular memory block or a MMIO region. Atomicity isn't guranteed by
> + * memmove() to move data from or to those MMIO regions as we do in this
> + * function. Besides, unaligned accesses are well handled on all 
> architectures
> + * except i386 and x86_64 where the unaligned accesses are supported by the
> + * CPUs.
> + *
> + * The ensured atomicity and alignment consideration, which aren't needed
> + * by data movement between two regular memory blocks. So performance penalty
> + * is introduced by this function in such circumstance.
> + */
> +void qemu_ram_move(void *dest, const void *src, size_t n);
> +


I apologize, I don't understand what these comments are saying, at all.



> > -- PMM
> > 
> 
> Thanks,
> Gavin


Reply via email to