Hi John, On Tue, Mar 11, 2014 at 02:54:18AM +0000, John Carr wrote: > A comment in arm/sync.md notes "We should consider issuing a inner > shareability zone barrier here instead." Here is my first attempt > at a patch to emit weaker memory barriers. Three instructions seem > to be relevant for user mode code on my Cortex A9 Linux box: > > dmb ishst, dmb ish, dmb sy > > I believe these correspond to a release barrier, a full barrier > with respect to other CPUs, and a full barrier that also orders > relative to I/O.
Not quite; DMB ISHST only orders writes with other writes, so loads can move across it in both directions. That means it's not sufficient for releasing a lock, for example. <shameless plug> I gave a presentation at ELCE about the various ARMv7 barrier options (from a kernel perspective): https://www.youtube.com/watch?v=6ORn6_35kKo </shameless plug> > Consider this a request for comments on whether the approach is correct. > I haven't done any testing yet (beyond eyeballing the assembly output). You'll probably find that a lot of ARM micro-architectures treat them equivalently, which makes this a good source of potentially latent bugs if you get the wrong options. When I added these options to the kernel, I tested with a big/little A7/A15 setup using a CCI400 (the cache-coherent interconnect) to try and tickle things a bit better, but it's by no means definitive. > (define_insn "*memory_barrier" > [(set (match_operand:BLK 0 "" "") > - (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))] > - "TARGET_HAVE_MEMORY_BARRIER" > + (unspec:BLK > + [(match_dup 0) (match_operand:SI 1 "const_int_operand")] > + UNSPEC_MEMORY_BARRIER))] > + "TARGET_HAVE_DMB || TARGET_HAVE_MEMORY_BARRIER" > { > if (TARGET_HAVE_DMB) > { > - /* Note we issue a system level barrier. We should consider issuing > - a inner shareabilty zone barrier here instead, ie. "DMB ISH". */ > - /* ??? Differentiate based on SEQ_CST vs less strict? */ > - return "dmb\tsy"; > + switch (INTVAL(operands[1])) > + { > + case MEMMODEL_RELEASE: > + return "dmb\tishst"; As stated above, this isn't correct. > + case MEMMODEL_CONSUME: You might be able to build this one out of <control dependency> + ISB (in lieu of a true address dependency). However, given the recent monolithic C11 thread that took over this list and others, let's play it safe for now and stick with DMB ISH. > + case MEMMODEL_ACQUIRE: > + case MEMMODEL_ACQ_REL: > + return "dmb\tish"; I think you probably *always* want ISH for GCC. You assume normal, cacheable , coherent memory right? That also mirrors the first comment you delete above. > + case MEMMODEL_SEQ_CST: > + return "dmb\tsy"; Again, ISH here should be fine. SY is for things like non-coherent devices, which I don't think GCC has a concept of (the second comment you delete doesn't make any sense and has probably tricked you). Hope that helps, Will