On 2019/4/9 0:14, Laszlo Ersek wrote: > On 04/08/19 15:43, Xiang Zheng wrote: >> >> On 2019/4/3 23:35, Laszlo Ersek wrote: >>>> I thought about your comments and wrote the following patch (just for test) >>>> which uses a file mapping to replace the anonymous mapping. UEFI seems to >>>> work >>>> fine. So why not use a file mapping to read or write a pflash device? >>> Honestly I can't answer "why not do this". Maybe we should. >>> >>> Regarding "why not do this *exactly as shown below*" -- probably because >>> then we'd have updates to the same underlying regular file via both >>> mmap() and write(), and the interactions between those are really tricky >>> (= best avoided). >>> >>> One of my favorite questions over the years used to be posting a matrix >>> of possible mmap() and file descriptor r/w/sync interactions, with the >>> outcomes as they were "implied" by POSIX, and then asking at the bottom, >>> "is my understanding correct?" I've never ever received an answer to >>> that. :) >> >> In my opinion, it's feasible to r/w/sync the memory devices which use a block >> backend via mmap() and write(). > > Maybe. I think that would be a first in QEMU, and you'd likely have to > describe and follow a semi-formal model between fd actions and mmap() > actions. > > Here's the (unconfirmed) table I referred to earlier. > > +-------------+-----------------------------------------------------+ > | change made | change visible via | > | through +-----------------+-------------+---------------------+ > | | MAP_SHARED | MAP_PRIVATE | read() | > +-------------+-----------------+-------------+---------------------+ > | MAP_SHARED | yes | unspecified | depends on MS_SYNC, | > | | | | MS_ASYNC, or normal | > | | | | system activity | > +-------------+-----------------+-------------+---------------------+ > | MAP_PRIVATE | no | no | no | > +-------------+-----------------+-------------+---------------------+ > | write() | depends on | unspecified | yes | > | | MS_INVALIDATE, | | | > | | or the system's | | | > | | read/write | | | > | | consistency | | | > +-------------+-----------------+-------------+---------------------+ > > Presumably: > > - a write() to some offset range of a regular file should be visible in > a MAP_SHARED mapping of that range after a matching msync(..., > MS_INVALIDATE) call; > > - changes through a MAP_SHARED mapping to a regular file should be > visible via read() after a matching msync(..., MS_SYNC) call returns. > > I sent this table first in 2010 to the Austin Group mailing list, and > received no comments. Then another person (on the same list) asked > basically the same questions in 2013, to which I responded with the > above assumptions / interpretations -- and received no comments from > third parties again. > > But, I'm really out of my depth here -- we're not even discussing > generic read()/write()/mmap()/munmap()/msync() interactions, but how > they would fit into the qemu block layer. And I have no idea. > >> >>> >>> Also... although we don't really use them in practice, some QCOW2 >>> features for pflash block backends are -- or would be -- nice. (Not the >>> internal snapshots or the live RAM dumps, of course.) >> >> Regarding sparse files, can we read actual backend size data for the >> read-only >> pflash memory as the following steps shown? >> >> 1) Create a sparse file -- QEMU_EFI-test.raw: >> dd if=/dev/zero of=QEMU_EFI-test.raw bs=1M seek=64 count=0 >> >> 2) Read from QEMU_EFI.fd and write to QEMU_EFI-test.raw: >> dd of="QEMU_EFI-test.raw" if="QEMU_EFI.fd" conv=notrunc >> >> 3) Use QEMU_EFI-test.raw as the read-only pflash and start qemu with the >> below >> patch applied: >> >> ---8>--- >> >> diff --git a/hw/block/block.c b/hw/block/block.c >> index bf56c76..ad18d5e 100644 >> --- a/hw/block/block.c >> +++ b/hw/block/block.c >> @@ -30,7 +30,7 @@ >> bool blk_check_size_and_read_all(BlockBackend *blk, void *buf, hwaddr size, >> Error **errp) >> { >> - int64_t blk_len; >> + int64_t blk_len, actual_len; >> int ret; >> >> blk_len = blk_getlength(blk); >> @@ -53,7 +53,9 @@ bool blk_check_size_and_read_all(BlockBackend *blk, void >> *buf, hwaddr size, >> * block device and read only on demand. >> */ >> assert(size <= BDRV_REQUEST_MAX_BYTES); >> - ret = blk_pread(blk, 0, buf, size); >> + actual_len = bdrv_get_allocated_file_size(blk_bs(blk)); >> + ret = blk_pread(blk, 0, buf, >> + (!blk_is_read_only(blk) || actual_len < 0) ? size : actual_len); >> if (ret < 0) { >> error_setg_errno(errp, -ret, "can't read block backend"); >> return false; >> > > This hunk looks dubious for a general helper function. It seems to > assume that the holes are punched at the end of the file. > > Sorry, this discussion is making me uncomfortable. I don't want to > ignore your questions, but I have no idea about the right answers. This > really needs the attention of the block people.
I feel deeply sorry for making you uncomfortable. It's generous of you to give me so much of your time. -- Thanks, Xiang