> -----Original Message-----
> From: Dr. David Alan Gilbert <dgilb...@redhat.com>
> Sent: July 12, 2019 20:34
> To: Lin Ma <l...@suse.com>
> Cc: qemu-devel@nongnu.org
> Subject: Re: Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
>
> * Lin Ma (l...@suse.com) wrote:
> >
> > > -----Original Message-----
> > > From: Dr. David Alan Gilbert <dgilb...@redhat.com>
> > > Sent: July 11, 2019 18:24
> > > To: Lin Ma <l...@suse.com>
> > > Cc: qemu-devel@nongnu.org
> > > Subject: Re: [Qemu-devel] migrate_set_speed has no effect if the guest is using hugepages.
> > >
> > > * Lin Ma (l...@suse.com) wrote:
> > > > Hi all,
> > >
> > > Hi Lin,
> >
> > Hi Dave,
> >
> > > > When I live migrate a qemu/kvm guest, if the guest is using huge
> > > > pages, I found that the migrate_set_speed command had no effect
> > > > during stage 2.
> > >
> > > Can you explain what you mean by 'stage 2'?
> >
> > We know that live migration consists of 3 stages:
> > Stage 1: Mark all of RAM dirty.
> > Stage 2: Keep sending dirty RAM pages since the last iteration.
> > Stage 3: Stop the guest, transfer the remaining dirty RAM and device state.
> > (Please refer to
> > https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines/#live-migration
> > for further details.)
>
> OK, yeh the numbering is pretty arbitrary so it's not something I normally
> think about like that.
>
> > > > It was caused by commit 4c011c3 "postcopy: Send whole huge pages".
> > > >
> > > > I'm wondering whether it is by design or a bug waiting for a fix?
> > >
> > > This is the first report I've seen of it. How did you conclude that
> > > 4c011c3 caused it? While I can see it might have some effect on the
> > > bandwidth management, I'm surprised it has this much effect.
> >
> > While digging into the bandwidth issue, git bisect showed that this commit
> > was the first bad commit.
>
> OK.
>
> > > What size huge pages are you using - 2MB or 1GB?
> >
> > When I hit this issue I was using a 1GB huge page size.
> > I tested this issue with a 2MB page size today on a Gigabit LAN. Although
> > the bandwidth control looks a little better than with 1GB pages, it is not
> > by much. Please refer to the test results below.
>
> OK, I can certainly see why this might happen with 1GB huge pages; I need to
> have a think about a fix.
>
> > > I can imagine we might have a problem that since we only do the
> > > sleep between the hugepages, if we were using 1GB hugepages then
> > > we'd see <big chunk of data>[sleep]<big chunk of data>[sleep] which
> > > isn't as smooth as it used to be.
> > >
> > > Can you give me some more details of your test?
> >
> > Live migration bandwidth management testing with 2MB hugepage size:
> > sles12sp4_i440fx is a qemu/kvm guest with 6GB of memory.
> > Note: the throughput values are approximate.
> >
> > Terminal 1:
> > virsh migrate-setspeed sles12sp4_i440fx $bandwidth && virsh migrate --live sles12sp4_i440fx qemu+tcp://5810f/system
> >
> > Terminal 2:
> > virsh qemu-monitor-command sles12sp4_i440fx --hmp "info migrate"
> >
> > bandwidth=5    throughput: 160 mbps
> > bandwidth=10   throughput: 167 mbps
> > bandwidth=15   throughput: 168 mbps
> > bandwidth=20   throughput: 168 mbps
> > bandwidth=21   throughput: 336 mbps
> > bandwidth=22   throughput: 336 mbps
> > bandwidth=25   throughput: 335.87 mbps
> > bandwidth=30   throughput: 335 mbps
> > bandwidth=35   throughput: 335 mbps
> > bandwidth=40   throughput: 335 mbps
> > bandwidth=45   throughput: 504.00 mbps
> > bandwidth=50   throughput: 500.00 mbps
> > bandwidth=55   throughput: 500.00 mbps
> > bandwidth=60   throughput: 500.00 mbps
> > bandwidth=65   throughput: 650.00 mbps
> > bandwidth=70   throughput: 660.00 mbps
>
> OK, so migrate-setspeed takes a bandwidth in MBytes/sec and I guess your
> throughput is in MBit/sec - so at the higher end it's about right, and at the
> lower end it's way off.
>
> Let me think about a fix for this.
>
> What are you using to measure throughput?
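[As a rough cross-check of the unit mismatch mentioned above, assuming migrate-setspeed is interpreted in MBytes/sec and the 'info migrate' throughput is reported in Mbit/sec: bandwidth=65 would correspond to 65 * 8 = 520 Mbit/s on the wire, reasonably close to the ~650 Mbit/s observed, while bandwidth=5 would correspond to 5 * 8 = 40 Mbit/s, only about a quarter of the ~160 Mbit/s observed. The plateaus in the measured values (~168, ~336, ~500, ~650 Mbit/s) would also be consistent with the coarse per-hugepage sleep granularity described earlier in the thread.]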
I use the 'watch' command to observe the output of the QEMU HMP command 'info migrate' and calculate the average value of the throughput field during stage 2 of the live migration.

Thanks for taking the time to dig into this issue,
Lin
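[For reference, one way to poll the throughput as described above, assuming the same guest name used in the test commands and that it is run on the source host, is:

  watch -n 1 "virsh qemu-monitor-command sles12sp4_i440fx --hmp 'info migrate' | grep throughput"

This just reprints the throughput line of 'info migrate' once per second; the average over stage 2 still has to be taken by hand, as described above.]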