Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-15 Thread Chunguang Li



> -----Original Messages-----
> From: "Dr. David Alan Gilbert" 
> Sent Time: 2017-11-15 22:23:52 (Wednesday)
> To: "Chunguang Li" 
> Cc: qemu-devel@nongnu.org, quint...@redhat.com, amit.s...@redhat.com, 
> pbonz...@redhat.com, stefa...@redhat.com
> Subject: Re: [Qemu-devel] Abnormal observation during migration: too many 
> "write-not-dirty" pages
> 
> * Chunguang Li (lichungu...@hust.edu.cn) wrote:
> > [...]

Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-15 Thread Dr. David Alan Gilbert
* Chunguang Li (lichungu...@hust.edu.cn) wrote:
> [...]
> Yes, you remember that!

Yes, I remember that,

Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-15 Thread Chunguang Li



> -----Original Messages-----
> From: "Juan Quintela" 
> Sent Time: 2017-11-15 17:45:44 (Wednesday)
> To: "Chunguang Li" 
> Cc: qemu-devel@nongnu.org, dgilb...@redhat.com, amit.s...@redhat.com, 
> pbonz...@redhat.com, stefa...@redhat.com
> Subject: Re: Abnormal observation during migration: too many 
> "write-not-dirty" pages
> 
> "Chunguang Li"  wrote:
> > Hi all! 
> 
> Hi
> 
> Sorry for the delay, I was on vacation and am still getting up to speed.

Hi, Juan, thanks for your reply.

> 
> > I got a very abnormal observation for the VM migration. [...]
> 
> I think your test is quite good, and I am also ashamed that 80% of
> "false" dirty pages is really a lot.
> 
> > I did the migration experiment like this: [...]
> 
> 
> vhost and friends could make a small difference here, but in general,
> this approach should be ok.
> 
> > I repeated this experiment with 15 workloads [...]
> 
> That is the impressive part, 15 workloads.  Thanks for taking the effort.
> 
> BTW, do you have your qemu changes handy, just to be able to test
> locally and "review" how you measure things?

Sorry, I do not have my changes handy. But don't worry, I will send them to you 
tomorrow morning. It's night here.

> 
> 
> > Startlingly, the proportions of the write-not-dirty pages are quite high. 
> > [...] Their proportions of the write-not-dirty pages within all the dirty 
> > pages are as high as 45%-80%.
> 
> Or the workload does really stupid things like:
> 
> a = 0;   /* baseline content when the bitmap was last synced */
> a = 1;   /* guest write: the page is marked dirty */
> a = 0;   /* content is back to the baseline, yet the page stays dirty */
> 
> This makes no sense at all.
> 
> Just in case, could you try to test this with xbzrle?  It should go well
> with this use case (but you need a buffer big enough to cache
> enough memory).

In fact, I have tested these workloads (the 45%-80% ones) with xbzrle, and 
when the buffer is big enough, they really do go well: while they did not 
converge to stop-copy before, they now finish migration quickly.

> 
> 
> > The proportions of the other workloads are about 5%-20%, which are also 
> > abnormal. [...]
> 
> I agree with that.
> 
> > [...] Then I guessed it might be related with the huge page feature. 
> > However, the result was the same when I turned the huge page feature off 
> > in the OS. 
> 
> Huge pages could have caused that.  Remember that we have transparent
> huge pages.  I have to look at that code.

In fact, the results are the same whether I turn transparent huge pages on or 
off in the OS.
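
(For reference, I toggled them with the usual sysfs knob, e.g. 
"echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable, and 
"echo always" into the same file to re-enable.)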

Later, Chunguang.


Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-15 Thread Chunguang Li



> -----Original Message-----
> From: "Dr. David Alan Gilbert" 
> Sent Time: 2017-11-15 18:11:37 (Wednesday)
> To: "Chunguang Li" 
> Cc: qemu-devel@nongnu.org, quint...@redhat.com, amit.s...@redhat.com, 
> pbonz...@redhat.com, stefa...@redhat.com
> Subject: Re: [Qemu-devel] Abnormal observation during migration: too many 
> "write-not-dirty" pages
> 
> * Chunguang Li (lichungu...@hust.edu.cn) wrote:
> > Hi all!
> > [...]
> > What do you think about this abnormal phenomenon? Any advice or possible 
> > reasons or even guesses? I would appreciate any response, because this has 
> > confused me for a long time. Thank you.
> 
> Wasn't it you who pointed out the other possibility last year? - The
> problem of false positives due to sync'ing the whole of memory and then
> writing the data out, when some of the dirty pages had already been written?
> 
> Dave

Yes, you remember that! It was me. After that, I did more analysis and 
experiments. I found that, in fact, both reasons contribute to the "fake 
dirty" pages (dirty pages that do not need to be resent, because their 
contents are the same as those on the target node). One is what I pointed out 
last year, which you have mentioned. The other is what I am talking about 
now, the "write-not-dirty" phenomenon.

In fact, according to my experiment results, "write-not-dirty" is the main 
reason leading to the "fake dirty" pages, while sync'ing the whole of memory 
contributes less.

Chunguang
> 
> > 
> > --
> > Chunguang Li, Ph.D. Candidate
> > Wuhan National Laboratory for Optoelectronics (WNLO)
> > Huazhong University of Science & Technology (HUST)
> > Wuhan, Hubei Prov., China
> > 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK





Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-15 Thread Dr. David Alan Gilbert
* Chunguang Li (lichungu...@hust.edu.cn) wrote:
> Hi all!
> [...]
> What do you think about this abnormal phenomenon? Any advice or possible 
> reasons or even guesses? I would appreciate any response, because this has 
> confused me for a long time. Thank you.

Wasn't it you who pointed out the other possibility last year? - The
problem of false positives due to sync'ing the whole of memory and then
writing the data out, when some of the dirty pages had already been written?
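
Roughly, within one pre-copy iteration, for a page P that was dirty at the 
last sync (a hypothetical timeline):

  t0  the dirty bitmap is synced from KVM for the whole of RAM
  t1  the guest writes P, so KVM marks P dirty again
  t2  the migration thread reaches P and sends it - already with the content
      written at t1
  t3  the next sync reports P dirty and P is resent, although the copy sent
      at t2 was current

Such pages are false positives of the sync granularity, distinct from the 
write-not-dirty pages being measured here.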

Dave

> 
> --
> Chunguang Li, Ph.D. Candidate
> Wuhan National Laboratory for Optoelectronics (WNLO)
> Huazhong University of Science & Technology (HUST)
> Wuhan, Hubei Prov., China
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-15 Thread Juan Quintela
"Chunguang Li"  wrote:
> Hi all! 

Hi

Sorry for the delay, I was on vacation and am still getting up to speed.

> I got a very abnormal observation for the VM migration. [...]

I think your test is quite good, and I am also ashamed that 80% of
"false" dirty pages is really a lot.

> I did the migration experiment like this: [...]


vhost and friends could make a small difference here, but in general,
this approach should be ok.

> I repeated this experiment with 15 workloads [...]

That is the impressive part, 15 workloads.  Thanks for taking the effort.

BTW, do you have your qemu changes handy, just to be able to test
locally and "review" how you measure things?


> Startlingly, the proportions of the write-not-dirty pages are quite high. 
> Memcached and three CPU2006
> benchmarks(zeusmp, mcf and bzip2) have the most high proportions. Their 
> proportions of the write-not-dirty
> pages within all the dirty pages are as high as 45%-80%.

Or the workload does really stupid things like:

a = 0;   /* baseline content when the bitmap was last synced */
a = 1;   /* guest write: the page is marked dirty */
a = 0;   /* content is back to the baseline, yet the page stays dirty */

This makes no sense at all.

Just in case, could you try to test this with xbzrle?  It should go well
with this use case (but you need a buffer big enough to cache
enough memory).
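
For example, with the HMP monitor, something like this (cache size and 
destination are placeholders):

  (qemu) migrate_set_capability xbzrle on
  (qemu) migrate_set_cache_size 1g
  (qemu) migrate -d tcp:<dest-host>:<port>

"info migrate" on the source then shows the xbzrle counters (transferred 
bytes, pages, cache misses, overflows), which tell you how well the cache is 
sized.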


> The proportions of the other workloads are about 5%-20%, which are also 
> abnormal. [...]

I agree with that.

> [...] Then I guessed it might be related with the huge page feature. 
> However, the result was the same when I turned the huge page feature off in 
> the OS. 

Huge pages could have caused that.  Remember that we have transparent
huge pages.  I have to look at that code.

> Now there are only two possible reasons in my opinion. 
> 
> First, there may be a bug in the KVM kernel dirty-tracking mechanism, which 
> marks some pages that do not receive any write request as dirty.

That is a possibility.

> Second, there may be a bug in the OS running inside the VM, which issues 
> unnecessary write requests. [...]

I would like to reproduce this.

Thanks for bringing this to our attention.

Later, Juan.



Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-14 Thread Chunguang Li
Some more details about this experiment:

The host runs Ubuntu 16.04 with the 4.4.0 Linux kernel and QEMU 2.5.1. The 
guests run Ubuntu 12.04, except for Memcached, which runs Ubuntu 16.04.

 

The exact proportions of write-not-dirty pages for the first two pre-copy
iterations (0.445 means 44.5%):

    Workload     Iter. 1  Iter. 2
    Memcached    0.445    0.478
    zeusmp       0.670    0.727
    mcf          0.808    0.793
    bzip2        0.464    0.447
    milc         0.341    0.037
    cactusADM    0.280    0.248
    lbm          0.090    0.037
    GemsFDTD     0.226    0.172
    bwaves       0.069    0.003
    astar        0.113    0.039
    xalancbmk    0.082    0.041
    wrf          0.141    0.073

Any advice? Looking forward to any response. Thank you.

 

Chunguang



[Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-12 Thread Chunguang Li
Hi all!

I got a very abnormal observation for the VM migration: many pages marked as 
dirty during migration are "not really dirty", that is, their contents are 
the same as the old version.




I did the migration experiment like this:

During the setup phase of migration, I first suspended the VM. Then I copied 
all the pages within the guest physical address space to a memory buffer as 
large as the guest memory size. After that, the dirty tracking began and I 
resumed the VM. Besides, at the end of each iteration, I also suspended the 
VM temporarily. During the suspension, I compared the content of all the 
pages marked as dirty in this iteration byte-by-byte with their former copies 
inside the buffer. If the content of one page was the same as its former 
copy, I recorded it as a "write-not-dirty" page (the page was written, but 
with exactly the same content as the old version). Otherwise, I replaced the 
page's copy in the buffer with the new content, for comparison in later 
iterations. After the reset of the dirty bitmap, I resumed the VM. Thus, I 
obtained the proportion of write-not-dirty pages among all the pages marked 
as dirty in each pre-copy iteration.
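
In pseudo-C, the per-iteration check amounts to something like the sketch 
below (a minimal illustration, not the actual patch; guest_ram, shadow_buf 
and the bitmap helper are placeholders for the real QEMU structures):

  #include <stdint.h>
  #include <string.h>

  #define PAGE_SIZE 4096UL

  static unsigned long dirty_total, write_not_dirty;

  static int test_bit(unsigned long nr, const unsigned long *bitmap)
  {
      return (bitmap[nr / (8 * sizeof(unsigned long))] >>
              (nr % (8 * sizeof(unsigned long)))) & 1;
  }

  /* Runs while the VM is suspended, right after the dirty bitmap is synced. */
  static void account_write_not_dirty(uint8_t *guest_ram, uint8_t *shadow_buf,
                                      const unsigned long *dirty_bitmap,
                                      unsigned long nr_pages)
  {
      unsigned long pfn;

      for (pfn = 0; pfn < nr_pages; pfn++) {
          uint8_t *cur = guest_ram + pfn * PAGE_SIZE;
          uint8_t *old = shadow_buf + pfn * PAGE_SIZE;

          if (!test_bit(pfn, dirty_bitmap)) {
              continue;                    /* not marked dirty this iteration */
          }
          dirty_total++;
          if (memcmp(cur, old, PAGE_SIZE) == 0) {
              write_not_dirty++;           /* marked dirty, content unchanged */
          } else {
              memcpy(old, cur, PAGE_SIZE); /* refresh copy for the next round */
          }
      }
  }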

I repeated this experiment with 15 workloads: 11 CPU2006 benchmarks, a 
Memcached server, kernel compilation, playing a video, and an idle VM. The 
CPU2006 benchmarks and Memcached are write-intensive workloads, so almost all 
of them did not converge to stop-copy.




Startlingly, the proportions of write-not-dirty pages are quite high. 
Memcached and three CPU2006 benchmarks (zeusmp, mcf and bzip2) have the 
highest proportions: 45%-80% of all their dirty pages are write-not-dirty. 
The proportions for the other workloads are about 5%-20%, which is still 
abnormally high. Intuitively, the proportion of write-not-dirty pages should 
be far lower than these numbers; it should be quite a rare case that a page 
is written with exactly the same content as the former data.

Besides, zero pages are not counted in any of the results, because code like 
memset() may write large ranges of pages with zeros even though they were 
already zero pages before.
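
(Filtering them out is a plain byte scan - QEMU itself has an optimized 
buffer_is_zero() helper for this. A sketch, reusing PAGE_SIZE from above:

  static int is_zero_page(const uint8_t *page)
  {
      unsigned long i;

      for (i = 0; i < PAGE_SIZE; i++) {
          if (page[i]) {
              return 0;
          }
      }
      return 1;    /* all bytes zero: not counted in the statistics */
  }

Pages for which this returns 1 are skipped before the memcmp() above.)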




I ruled out unknown effects of the machine hardware, because I repeated the 
experiments on two different sets of machines. Then I guessed it might be 
related to the huge page feature; however, the result was the same when I 
turned the huge page feature off in the OS.




Now there are only two possible reasons in my opinion. 

First, there may be a bug in the KVM kernel dirty-tracking mechanism, which 
marks some pages that do not receive any write request as dirty.

Second, there may be a bug in the OS running inside the VM, which issues 
unnecessary write requests.
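
To check the first hypothesis, one could read the dirty bitmap straight from 
KVM with the KVM_GET_DIRTY_LOG ioctl, bypassing QEMU's own bookkeeping. A 
sketch (assuming vm_fd is the VM file descriptor obtained from /dev/kvm and 
that slot and bitmap_bytes match the memory region; note this ioctl also 
clears KVM's internal bitmap):

  #include <linux/kvm.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>

  static unsigned long *get_kvm_dirty_log(int vm_fd, int slot,
                                          unsigned long bitmap_bytes)
  {
      struct kvm_dirty_log log = { .slot = slot };

      log.dirty_bitmap = malloc(bitmap_bytes);
      if (!log.dirty_bitmap) {
          return NULL;
      }
      if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0) {
          perror("KVM_GET_DIRTY_LOG");
          free(log.dirty_bitmap);
          return NULL;
      }
      return log.dirty_bitmap;    /* one bit per page of the slot */
  }

If the pages reported here still include the write-not-dirty ones, the 
over-reporting happens in the kernel rather than in QEMU's bookkeeping.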




What do you think about this abnormal phenomenon? Any advice, possible 
reasons, or even guesses? I would appreciate any response, because this has 
confused me for a long time. Thank you.


--
Chunguang Li, Ph.D. Candidate
Wuhan National Laboratory for Optoelectronics (WNLO)
Huazhong University of Science & Technology (HUST)
Wuhan, Hubei Prov., China