> -----Original Message-----
> From: Peter Xu <pet...@redhat.com>
> Sent: Thursday, March 28, 2024 11:16 PM
> To: Liu, Yuan1 <yuan1....@intel.com>
> Cc: Daniel P. Berrangé <berra...@redhat.com>; faro...@suse.de;
> qemu-de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com;
> Zou, Nanhai <nanhai....@intel.com>
> Subject: Re: [PATCH v5 5/7] migration/multifd: implement initialization
> of qpl compression
>
> On Thu, Mar 28, 2024 at 02:32:37AM +0000, Liu, Yuan1 wrote:
> > > -----Original Message-----
> > > From: Peter Xu <pet...@redhat.com>
> > > Sent: Thursday, March 28, 2024 3:26 AM
> > > To: Liu, Yuan1 <yuan1....@intel.com>
> > > Cc: Daniel P. Berrangé <berra...@redhat.com>; faro...@suse.de;
> > > qemu-de...@nongnu.org; hao.xi...@bytedance.com;
> > > bryan.zh...@bytedance.com; Zou, Nanhai <nanhai....@intel.com>
> > > Subject: Re: [PATCH v5 5/7] migration/multifd: implement
> > > initialization of qpl compression
> > >
> > > On Fri, Mar 22, 2024 at 12:40:32PM -0400, Peter Xu wrote:
> > > > > > void multifd_recv_zero_page_process(MultiFDRecvParams *p)
> > > > > > {
> > > > > >     for (int i = 0; i < p->zero_num; i++) {
> > > > > >         void *page = p->host + p->zero[i];
> > > > > >         if (!buffer_is_zero(page, p->page_size)) {
> > > > > >             memset(page, 0, p->page_size);
> > > > > >         }
> > > > > >     }
> > > > > > }
> > > >
> > > > It may not matter much (where I also see your below comments), but
> > > > just to mention another solution to avoid this read is that we can
> > > > maintain RAMBlock->receivedmap for precopy (especially multifd;
> > > > afaiu multifd doesn't yet update this bitmap, even though normal
> > > > precopy does), then here, instead of scanning every time, maybe we
> > > > can do:
> > > >
> > > >     /*
> > > >      * If it's the 1st time receiving it, no need to clear it as it
> > > >      * must be all zeros now.
> > > >      */
> > > >     if (bitmap_test(rb->receivedmap, page_offset)) {
> > > >         memset(page, 0, ...);
> > > >     } else {
> > > >         bitmap_set(rb->receivedmap, page_offset);
> > > >     }
> > > >
> > > > And we also always set the bit when !zero too.
> > > >
> > > > My rationale is that it's unlikely a zero page if it's sent once or
> > > > more, while OTOH for the 1st time we receive it, it must be a zero
> > > > page, so no need to scan for the 1st round.
> > >
> > > Thinking about this, I'm wondering whether we should have this
> > > regardless.  IIUC now multifd will always require two page faults on
> > > the destination for anonymous guest memories (I suppose shmem/hugetlb
> > > is fine, as there are no zero pages in those worlds).  Even though it
> > > should be faster than DMA faults, it is still unwanted.
> > >
> > > I'll take a note myself as a todo to do some measurements in the
> > > future first.  However, if anyone thinks that makes sense and wants
> > > to have a look, please say so.  It'll be more than welcomed.
> >
> > Yes, I think this is a better improvement to avoid two page faults.  I
> > can test the performance impact of this change on SVM-capable devices
> > and give some data later.  As we saw before, the IOTLB flush occurs via
> > COW; with the change, the impact of the COW should be gone.
> >
> > If you need more testing and analysis on this, please let me know.
>
> Nothing more than that.
> Just a heads up that Xiang once mentioned a test case where Richard
> suggested dropping the zero check:
>
> https://lore.kernel.org/r/CAAYibXib+TWnJpV22E=adncdBmwXJRqgRjJXK7X71J=bDfax...@mail.gmail.com
>
> AFAIU this should be resolved if we have the bitmap maintained, but we
> can double check.  IIUC that's exactly the case for an idle guest, and in
> that case it should be even faster to skip the memcmp when the bit is
> clear.
>
> If you're going to post the patches, feel free to post that as a
> standalone small series first; then it can be considered for merging even
> earlier.
>
> Thanks a lot for doing this.
Sure, I will prepare a separate patch for this, so that we can have a
concrete discussion of the implementation and test results.
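
To make the discussion concrete, below is a rough, untested sketch of what
I have in mind.  It assumes MultiFDRecvParams keeps its RAMBlock pointer
(p->block) and reuses the existing ramblock_recv_bitmap_test() /
ramblock_recv_bitmap_set() helpers from migration/ram.h; the non-zero-page
receive path would also need to set the bit for every page it writes, and
the details may change once I look closer at the code:

    void multifd_recv_zero_page_process(MultiFDRecvParams *p)
    {
        for (int i = 0; i < p->zero_num; i++) {
            void *page = p->host + p->zero[i];

            if (ramblock_recv_bitmap_test(p->block, page)) {
                /*
                 * The page was received before, so it may contain stale
                 * data and must be cleared explicitly.
                 */
                memset(page, 0, p->page_size);
            } else {
                /*
                 * First time this page is received: anonymous memory is
                 * still zero-filled, so only record the reception.  This
                 * skips both the buffer_is_zero() scan and the write
                 * fault (and, with SVM, the COW-triggered IOTLB flush).
                 */
                ramblock_recv_bitmap_set(p->block, page);
            }
        }
    }

For an idle guest this should also address the case from the link above,
since a zero page that was never received is neither scanned nor written.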