On Tue, Jun 16, 2026 at 05:24:15AM +1000, Gavin Shan wrote: > Hi Peter and Michael, > > On 6/16/26 1:12 AM, Peter Maydell wrote: > > On Mon, 15 Jun 2026 at 15:56, Michael S. Tsirkin <[email protected]> wrote: > > > > > > On Mon, Jun 15, 2026 at 11:57:09AM +0100, Peter Maydell wrote: > > > > On Mon, 15 Jun 2026 at 11:03, Gavin Shan <[email protected]> wrote: > > > > > > > > > > All ram device regions were turned to be indirectly accessible by > > > > > commit > > > > > 4a2e242bbb ("memory: Don't use memcpy for ram_device regions"). This > > > > > leads > > > > > to guest hang on attempt to build 'cuda-samples' as reported by > > > > > Julia. The > > > > > guest is started by the following command lines, with GH100 GPU card > > > > > passed > > > > > from the host. > > > > > > > > > diff --git a/include/system/memory.h b/include/system/memory.h > > > > > index 1417132f6d..5878727d09 100644 > > > > > --- a/include/system/memory.h > > > > > +++ b/include/system/memory.h > > > > > @@ -2897,6 +2897,8 @@ void > > > > > address_space_register_map_client(AddressSpace *as, QEMUBH *bh); > > > > > void address_space_unregister_map_client(AddressSpace *as, QEMUBH > > > > > *bh); > > > > > > > > > > /* Internal functions, part of the implementation of > > > > > address_space_read. */ > > > > > +void qemu_ram_copy(void *dest, const void *src, size_t n); > > > > > +void qemu_ram_move(void *dest, const void *src, size_t n); > > > > > > > > New function prototypes in include headers need documentation comments. > > > > > > > > In particular for these, it's really important that we clearly say > > > > what semantics we are attempting to provide with them, so that > > > > (a) when we're reviewing or later updating the implementation we > > > > know what we are trying to provide > > > > (b) when we're looking at the callsites we know what the function > > > > is guaranteeing to us and what it is not, and thus whether it's > > > > OK to use it or we need something els. > > > > > > > > > +#if defined(__i386__) || defined(__x86_64__) > > > > > +#define HOST_UNALIGNED_MMIO_OK 1 > > > > > +#else > > > > > +#define HOST_UNALIGNED_MMIO_OK 0 > > > > > +#endif > > > > > > > > We need to do something better than this. We can't > > > > just say "oh, we trust that on x86 this works": it is > > > > neither actually true that the compiler guarantees it > > > > > > sorry guarantee what? > > > > Well, that's part of my point about "we need a doc comment": > > what exactly are we trying to guarantee ? But whatever it is, > > "hardcoded ifdef that privileges x86" is definitely the wrong > > answer. > > > > Could you please check the following comments (documentation context) for the > added functions look good to you? Please let me know if there are anything > can be improved. > > + > +/** > + * qemu_ram_copy: copy data to a RAMBlock > + * > + * @dest: destination where the data is copied to. It's the pointer returned > + * by ramblock_ptr() and its variants. > + * @src: source where the data is copied from. It can be a regular memory > block > + * or a pointer returned by ramblock_ptr() and its variants. > + * @n: length of data to be copied > + * > + * The destination is always a pointer to data area of a RAMBlock which can > + * be for a regular memory block or a MMIO region. The source can be a > pointer > + * to a regular memory block or a MMIO region. Atomicity isn't guranteed by > + * memcpy() to copy data between those MMIO regions as we do in this > function. > + * Besides, unaligned accesses are well handled on all architectures except > + * i386 and x86_64 where the unaligned accesses are supported by the CPUs. > + * > + * The ensured atomicity and alignment consideration, which aren't needed > + * by data copy between two regular memory blocks. So performance penalty > + * is introduced by this function in such circumstance. > + */ > +void qemu_ram_copy(void *dest, const void *src, size_t n); > + > +/** > + * qemu_ram_move: move data to a RAMBlock > + * > + * @dest: destination where the data is moved to. It's the pointer returned > + * by ramblock_ptr() and its variants. > + * @src: source where the data is moved from. It can be a regular memory > block > + * or a pointer returned by ramblock_ptr() and its variants. > + * @n: length of data to be moved > + * > + * The destination is always a pointer to data area of a RAMBlock which can > + * be for a regular memory block or a MMIO region. The source can be a > pointer > + * to a regular memory block or a MMIO region. Atomicity isn't guranteed by > + * memmove() to move data from or to those MMIO regions as we do in this > + * function. Besides, unaligned accesses are well handled on all > architectures > + * except i386 and x86_64 where the unaligned accesses are supported by the > + * CPUs. > + * > + * The ensured atomicity and alignment consideration, which aren't needed > + * by data movement between two regular memory blocks. So performance penalty > + * is introduced by this function in such circumstance. > + */ > +void qemu_ram_move(void *dest, const void *src, size_t n); > +
I apologize, I don't understand what these comments are saying, at all. > > -- PMM > > > > Thanks, > Gavin
