On 12/03/2015 07:08 AM, Peter Maydell wrote:
> On 3 December 2015 at 14:58, Laurent Desnogues
> <laurent.desnog...@gmail.com> wrote:
>> On Thu, Dec 3, 2015 at 3:36 PM, Peter Maydell <peter.mayd...@linaro.org>
>> wrote:
>>> On 30 November 2015 at 22:23, Andrew Baumann
>>> <andrew.baum...@microsoft.com> wrote:
>>>> Qemu does not generally perform alignment checks. However, the ARM ARM
>>>> requires implementation of alignment exceptions for a number of cases
>>>> including LDREX, and Windows-on-ARM relies on this.
>
>>> TCG supports "this load/store should do an alignment check"
>>> using the MO_ALIGN TCGMemOp flag (which results in a call to
>>> the CPU's do_unaligned_access hook if the guest address is not
>>> aligned). I think we should use this core-code functionality
>>> rather than rolling our own equivalent (it is more efficient).
>>> There are some examples in a few of the other targets (eg MIPS)
>>> of how to do this, but basically you need to arrange that the
>>> initial loads in gen_load_exclusive get the MO_ALIGN flag
>>> ORed in, and then wire up the do_unaligned_access hook and
>>> make it raise a suitable exception.
>>
>> After quickly looking at the code in softmmu_template.h, I wonder if
>> MO_ALIGN would correctly handle the ldrexd pair case which requires an
>> 8-byte alignment but does 2 4-byte loads (even if the code is tweaked
>> to read 8-byte at once, then checking 16-byte alignment of AArch64
>> ldxp 64-bit could not be handled correctly).
>
> You're right, those are not going to be handled correctly.
> But I think it would be better to enhance the MO_ALIGN
> handling somehow to deal with "must be more highly aligned than
> the datasize" cases as well as the "alignment must match datasize"
> ones.

What's the full set of features that you'd like here?

> (As you say we'd need
> to do the ldrexd as a 64-bit access, but we should do that
> anyway because it's supposed to be single-copy-atomic,
> architecturally speaking.)

Something to remember for the future is that we're not doing single-copy
of 64-bit data for 32-bit hosts.  I'm not even sure that's generally
possible without generating awful code.


r~
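
For reference, the "must be more highly aligned than the datasize" case
discussed above comes down to checking the address against a required
alignment that is passed separately from the access size. What follows
is only a rough sketch of that check, not QEMU code: the helper name
access_is_aligned and its parameters are made up for illustration, and
it does not show how MO_ALIGN or do_unaligned_access are wired up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU API.
 * Returns true if addr is a multiple of align_bytes. The required
 * alignment is given independently of the access size, so an access
 * made of two 4-byte loads can still demand 8-byte alignment.
 */
static bool access_is_aligned(uint64_t addr, uint64_t align_bytes)
{
    /* The mask test below is only valid for a non-zero power of two. */
    assert(align_bytes != 0 && (align_bytes & (align_bytes - 1)) == 0);

    /* Aligned means all address bits below the alignment are zero. */
    return (addr & (align_bytes - 1)) == 0;
}
```

With this, an ldrexd-style access at 0x1004 passes a 4-byte check but
fails the 8-byte one, and an ldxp-style pair of 64-bit values at 0x1008
passes an 8-byte check but fails the 16-byte one.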