On Sun, Mar 5, 2017 at 11:54 AM, Borislav Petkov <b...@suse.de> wrote:
>>
>> We seem to have broken this *really* long ago, though.
>
> I wonder why nothing blew up or failed strangely by now...

The hardware that cared was pretty broken to begin with, and I think it was mainly some really odd graphics cards.

And from memory, they had issues with 64-bit writes. We actually have a slow 16-bit word at a time copy for exactly these kinds of issues: scr_memcpyw() and friends.

I'd like to say that it was one of those shit server-only cards that nobody sane would ever use (but "server hardware is validated and better quality!"), but that might have been another issue.

>> For example, "rep movsb" really is the right thing to use on normal
>> memory on modern CPU's.
>
> So Logan's box is a SNB and it doesn't have the ERMS optimizations. Are
> you saying, regardless, we should let gcc put REP; MOVSB for smaller
> sizes?

I think gcc makes bad choices, but they are gcc's choices to make. I gave up on gcc's "-Os" because the choices were so bad and not getting fixed.

.. and none of that has _anything_ to do with accesses to IO memory, which is fundamentally different.

> Because gcc does generate a REP; MOVSB there when it puts its own
> memcpy, see mail upthread. (Even though that is wrong to do on iomem.)

No, *THAT* is not wrong to do on iomem. If we tell gcc that "memcpy()" works on iomem, then gcc can damn well do whatever it wants. "rep stosb" isn't wrong for memcpy(). Gcc may do stupid things with it, but that's completely immaterial.

> Oh, and along with the revert we would need a big fat warning explaining
> why we need that special memcpy for IO memory.

Well, quite frankly, just a simple "IO memory is different from cached memory" should be sufficient.

             Linus
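
For readers following the "16-bit word at a time copy" point above, here is a rough sketch of what such a copy can look like in plain C. It is not the kernel's scr_memcpyw(); the name copy_words16() and its byte-count parameter are made up for illustration, and it assumes both pointers are 16-bit aligned. The point is only that the access width is chosen by the code, not left to whatever the compiler would emit for memcpy().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's scr_memcpyw()): copy `count` bytes
 * strictly one 16-bit word at a time.
 *
 * The volatile qualifiers stop the compiler from merging the accesses into
 * wider loads/stores or replacing the loop with a memcpy() call.
 * `count` is rounded down to a whole number of 16-bit words.
 */
static void copy_words16(volatile uint16_t *dst,
                         const volatile uint16_t *src,
                         size_t count)
{
    size_t words = count / 2;

    for (size_t i = 0; i < words; i++)
        dst[i] = src[i];    /* exactly one 16-bit load and one 16-bit store */
}
```

On real IO memory the kernel's own accessors would be the thing to use; this user-space sketch only illustrates why a dedicated copy routine gives a guarantee that a compiler-expanded memcpy() does not.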