On 25/07/19 22:30, Michael S. Tsirkin wrote:
> On Thu, Jul 25, 2019 at 05:35:01PM +0200, Paolo Bonzini wrote:
>> On 25/07/19 16:46, Michael S. Tsirkin wrote:
>>> Actually, I think I have a better idea.
>>> At the moment we just get an exit on these reads and return all-ones.
>>> Yes, in theory there could be a UR bit set in a bunch of
>>> registers but in practice no one cares about these,
>>> and I don't think we implement them.
>>> So how about mapping a single page, read-only, and filling it
>>> with all-ones?
>>
>> Yes, that's nice indeed. :) But it does have some cost, in terms of
>> either number of VMAs or QEMU RSS since the MMCONFIG area is large.
>>
>> What breaks if we return all zeroes? Zero is not a valid vendor ID.
>>
>> Paolo
>
> I think I know what you are thinking of doing:
> map /dev/zero so we get a single VMA but all mapped to
> a single zero pte?

Yes, exactly. You absolutely need to share the page because the guest
could easily touch 32*256 pages just to scan function 0 on every bus and
device, even if the VM has just 4 or 5 devices and all of them on the
root complex. And that causes fragmentation so you have to map bigger
areas.

> - we can implement /dev/ones. in fact, we can implement
>   /dev/byteXX for each possible value, the cost will
>   be only 1M on a 4k page system.
>   it might come in handy for e.g. free page hinting:
>   at the moment if guest memory is poisoned
>   we can not unmap it, with this trick we can
>   map it to /dev/byteXX.

I also thought of /dev/ones, not sure how it would be accepted. :) Also
you cannot map lazily on page fault, otherwise you get a vmexit and it's
slow again. So /dev/ones needs to be written to use a huge page, possibly.

Paolo
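
For reference, the arithmetic behind the two figures in this message (32*256 pages for a function-0 scan, and 1M for a full set of /dev/byteXX pages) can be checked with a short sketch. It assumes the standard PCIe ECAM layout for the MMCONFIG area, with one 4 KiB page per function, and every name in it is illustrative rather than taken from QEMU or the kernel.

```python
# Back-of-the-envelope check of the page counts discussed above.
# Assumption: standard PCIe ECAM (MMCONFIG) layout, where each function
# owns one 4 KiB page at offset (bus << 20) | (dev << 15) | (fn << 12).
# All names here are illustrative; none come from QEMU or the kernel.

PAGE_SIZE = 4096          # "4k page system", as in the thread
BUSES, DEVICES, FUNCTIONS = 256, 32, 8


def ecam_offset(bus: int, dev: int, fn: int) -> int:
    """Byte offset of a function's config space inside the MMCONFIG area."""
    return (bus << 20) | (dev << 15) | (fn << 12)


def pages_touched_by_fn0_scan() -> int:
    """Distinct pages a guest touches when probing function 0 everywhere."""
    pages = {ecam_offset(b, d, 0) // PAGE_SIZE
             for b in range(BUSES) for d in range(DEVICES)}
    return len(pages)


def byte_device_cost() -> int:
    """Memory for one constant-filled page per byte value (/dev/byteXX idea)."""
    return 256 * PAGE_SIZE


if __name__ == "__main__":
    touched = pages_touched_by_fn0_scan()
    print(touched, "pages,", touched * PAGE_SIZE // 2**20, "MiB if unshared")
    print(byte_device_cost() // 2**20, "MiB for all /dev/byteXX pages")
    full = ecam_offset(BUSES - 1, DEVICES - 1, FUNCTIONS - 1) + PAGE_SIZE
    print(full // 2**20, "MiB of MMCONFIG for 256 buses")
```

Under that layout the function-0 pages sit 32 KiB apart, so a scan touches pages scattered across the whole area instead of one contiguous run, which is where the fragmentation concern comes from.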