On Sun, Mar 05, 2017 at 11:19:42AM -0800, Linus Torvalds wrote:
>> But it is *not* the right thing to use on IO memory, because the CPU
>> only does the magic cacheline access optimizations on cacheable
>> memory!

Yes, and actually this is where I started. I thought my memcpy was using byte accesses on purpose and I needed to create a patch for a different IO memcpy because obviously byte accesses over the PCI bus would be very un-ideal. However, when I found my system wasn't intentionally using that implementation, that was no longer my focus.

So, I have no way to test this, but it sounds like any Ivy Bridge system using the ERMS version of memcpy could have the same slow PCI memcpy performance I've been seeing (unless the microcode fixes it up?). So it sounds like it would be a good idea to revert the change Linus is talking about.

>> So I think we should re-introduce that old "__inline_memcpy()" as that
>> special "safe memcpy" thing. Not just for KMEMCHECK, and not just for
>> 64-bit.

On 05/03/17 12:54 PM, Borislav Petkov wrote:
> Logan, wanna give that a try, see if it takes care of your issue?

Well, honestly, my issue was solved by fixing my kernel config. I have no idea why I had optimize for size in there in the first place.

Thanks,

Logan