Chandler Carruth <[email protected]> writes:
> This isn't an argument against extra complexity, this is the *wrong model*.
>
> Very fundamentally, it is essential that the frontend emit the widest loads
> and stores permitted by the language spec. The memory model very narrowly
> constrains the backends ability to widen loads or stores or to merge adjacent
> loads and stores across control dependencies. By emitting the full load and
> store at each point, LLVM is able to combine *much* more aggressively around
> control dependencies without violating the memory model.
>
> In addition to breaking this theoretical power of the middle-end optimizers,
> it also effectively masks racing memory accesses to different bitfield slots
> from tools like ThreadSanitizer. By using the full width in the initial load/
> store emission, sanitizers are aware of the *potential* domain of any race
> regardless of what gets dropped during lowering.
>
> I'll reply separately to the specific performance problems, but please revert
> this until there is an actual discussion about changing this very fundamental
> design constraint. =/ This is not "obvious" or something that should go in
> without review and careful consideration.
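
For anyone following along, here is a rough sketch of the distinction being drawn, as I understand it. It is not what Clang actually emits. The `Unit` type, its field layout (field `a` in bits 0-7, field `b` in bits 8-31), and all function names are hypothetical, and the storage unit is modelled as explicit bytes so the example does not depend on implementation-defined bitfield layout or endianness.

```c
#include <stdint.h>

/* Hypothetical 32-bit bitfield storage unit, modelled as explicit bytes.
 * Assumed layout: field `a` is bits 0-7, field `b` is bits 8-31. */
typedef struct {
    uint8_t bytes[4];
} Unit;

/* Read the whole storage unit: touches all four bytes. */
uint32_t load_unit(const Unit *u) {
    return (uint32_t)u->bytes[0]
         | ((uint32_t)u->bytes[1] << 8)
         | ((uint32_t)u->bytes[2] << 16)
         | ((uint32_t)u->bytes[3] << 24);
}

/* Write the whole storage unit: touches all four bytes. */
void store_unit(Unit *u, uint32_t v) {
    for (int i = 0; i < 4; i++) {
        u->bytes[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Full-width form: load the unit, replace the bits of `a`, store the unit.
 * Every byte of the unit is read and written, so the access covers `b` too. */
void set_a_wide(Unit *u, uint8_t v) {
    uint32_t w = load_unit(u);
    w = (w & ~UINT32_C(0xFF)) | v;
    store_unit(u, w);
}

/* Narrowed form: only the byte that holds `a` is written.
 * The bytes that hold `b` are never touched by this access. */
void set_a_narrow(Unit *u, uint8_t v) {
    u->bytes[0] = v;
}

/* Read field `b` through a full-width load. */
uint32_t get_b(const Unit *u) {
    return load_unit(u) >> 8;
}
```

Single-threaded, both setters leave the unit in the same state. They differ only in which bytes each access covers. That is the point of the quoted argument: the full-width form exposes the whole storage unit as the potential domain of a race.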

Thanks for the detailed explanation. I appreciate the context on this.
Reverted in r215648, and I'll take a look at teaching the ARM backend to
narrow this kind of load more effectively when I get a chance.

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
