https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67281
--- Comment #4 from Tulio Magno Quites Machado Filho <tuliom at linux dot vnet.ibm.com> ---
(In reply to Andrew Pinski from comment #3)
> Since there are no stores, the load seems like it can be pulled out of the
> loop too.

I disagree. If I use the value of dest to make a decision inside the
transaction, I need the memory barrier before the access to *src0.

Here's an example:

long
foo (long dest, long *src0, long src1, long tries)
{
  long i;
  for (i = 0; i < tries; i++)
    {
      __builtin_tbegin (0);
      dest = *src0 + src1;
      if (dest == 13)
        __builtin_tabort (0);
      __builtin_tend (0);
    }
  return dest;
}

In other words, if you access *src0 before the memory barrier, its value may
change by the time the memory barrier is created. This is particularly
relevant when dest indicates whether a lock has been acquired by another
thread.

For reference, this was found in the glibc source code:
https://sourceware.org/ml/libc-alpha/2015-07/msg00986.html