On 11/28/2014 12:47 AM, Pavel Odintsov wrote:
Nice suggestion! Old-fashioned UBC is a real nightmare and was deprecated.
In fact they are not deprecated, but rather optional. The beauty of it
is that you can still limit, say, network buffers or the number of processes
(or have an out-of-memory kill guarantee) -- but you don't
HAVE to, as the default setup (setting only RAM/swap) should be
secure enough.
Kir.
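For example, a minimal sketch of that approach (CT 101 and the numbers are just
placeholders, assuming vzctl 4.x on a vswap-capable kernel):

  # set only RAM and swap, vswap-style
  vzctl set 101 --ram 1G --swap 512M --save
  # optionally cap one specific resource on top, e.g. number of processes
  vzctl set 101 --numproc 300:300 --save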
On Fri, Nov 28, 2014 at 10:35 AM, CoolCold <[email protected]> wrote:
Hello!
I'd recommend setting only RAM/swap limits (via --ram/--swap), leaving the other
settings mostly unlimited (as long as the RAM/swap limits are not exceeded, of
course) - http://wiki.openvz.org/VSwap
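For reference, a hedged sketch of how --ram/--swap map onto the CT config file
(the values are placeholders; a 4 KB page size is assumed):

  # vzctl set 101 --ram 512M --swap 1G --save
  # roughly corresponds to these lines in /etc/vz/conf/101.conf:
  PHYSPAGES="0:131072"     # 131072 * 4 KB pages = 512 MB RAM
  SWAPPAGES="0:262144"     # 262144 * 4 KB pages = 1 GB swap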
On Fri, Nov 28, 2014 at 3:31 AM, Nipun Arora <[email protected]>
wrote:
Never mind, I figured it out by checking the fail counters in
/proc/user_beancounters.
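(For anyone hitting the same thing, a quick sketch of how to spot which limit
is being hit:

  # show only beancounters whose fail counter (last column) is non-zero
  awk 'NF >= 6 && $NF > 0' /proc/user_beancounters

run as root on the hardware node, or inside the container to see only its own
counters.)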
Thanks
Nipun
On Thu, Nov 27, 2014 at 7:14 PM, Nipun Arora <[email protected]>
wrote:
Thanks, the speed improved by an order of magnitude :)
By the way, is there any benchmark you have looked at for testing
how good/practical live migration is for real-world systems?
Additionally, I'm trying to run a Java application (the DaCapo benchmark), but
I keep having trouble getting Java to run:
java -version
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
I've put my VZ conf file below; can anyone suggest what the
problem could be?
Thanks
Nipun
# UBC parameters (in form of barrier:limit)
KMEMSIZE="14372700:14790164"
LOCKEDPAGES="2048:2048"
PRIVVMPAGES="65536:69632"
SHMPAGES="21504:21504"
NUMPROC="240:240"
PHYSPAGES="0:131072"
VMGUARPAGES="33792:unlimited"
OOMGUARPAGES="26112:unlimited"
NUMTCPSOCK="360:360"
NUMFLOCK="188:206"
NUMPTY="16:16"
NUMSIGINFO="256:256"
TCPSNDBUF="1720320:2703360"
TCPRCVBUF="1720320:2703360"
OTHERSOCKBUF="1126080:2097152"
DGRAMRCVBUF="262144:262144"
NUMOTHERSOCK="1200"
DCACHESIZE="3409920:3624960"
NUMFILE="9312:9312"
AVNUMPROC="180:180"
NUMIPTENT="128:128"
# Disk quota parameters (in form of softlimit:hardlimit)
DISKSPACE="3145728:3145728"
DISKINODES="131072:144179"
QUOTATIME="0"
# CPU fair scheduler parameter
CPUUNITS="1000"
NETFILTER="stateless"
VE_ROOT="/vz/root/101"
VE_PRIVATE="/vz/private/101"
OSTEMPLATE="centos-6-x86_64"
ORIGIN_SAMPLE="basic"
HOSTNAME="test"
IP_ADDRESS="192.168.1.101"
NAMESERVER="8.8.8.8 8.8.4.4"
CPULIMIT="25"
SWAPPAGES="0:262144"
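In case it is useful to others hitting the same JVM error: it usually means the
container's memory limits are smaller than the heap the JVM tries to reserve by
default. Two hedged workarounds (CT 101 and the sizes are just placeholders):

  # either cap the JVM heap explicitly...
  java -Xms64m -Xmx128m -version
  # ...or give the container more memory, e.g. raise privvmpages
  # (with vswap it is simpler to just raise --ram/--swap instead)
  vzctl set 101 --privvmpages 262144:294912 --save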
On Mon, Nov 24, 2014 at 12:16 PM, Kir Kolyshkin <[email protected]> wrote:
On 11/23/2014 07:13 PM, Nipun Arora wrote:
Thanks, I will try your suggestions, and get back to you.
By the way, any idea what could be used to share the base image between the two
containers?
Hardlink it in what way? Once both containers start, won't they
have to write to different locations?
ploop is composed of a set of stacked images, with all of them but the
top one being read-only.
I understand that some file systems have a copy-on-write mechanism,
where after a snapshot all future writes go to an additional linked
disk.
Does ploop operate in a similar way?
yes
http://wiki.qemu.org/Features/Snapshots
http://openvz.livejournal.com/44508.html
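A quick way to see the stacked images in practice (a sketch; paths assume the
default ploop layout for CT 101):

  # take a snapshot: the current top delta becomes read-only
  # and a new writable top delta is created
  vzctl snapshot 101
  # the deltas and their order are recorded in DiskDescriptor.xml
  ls /vz/private/101/root.hdd/
  cat /vz/private/101/root.hdd/DiskDescriptor.xml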
Cloning with a modified vzmigrate script helps.
- Nipun
On Sun, Nov 23, 2014 at 5:29 PM, Kir Kolyshkin <[email protected]> wrote:
On 11/23/2014 04:59 AM, Nipun Arora wrote:
Hi Kir,
Thanks for the response, I'll update it, and tell you about the
results.
1. A follow-up question: I found that a write I/O speed of
500 Kbps to 1 Mbps increased the suspend time to several minutes (mostly the
pcopy stage).
This seems extremely high for a relatively low I/O workload, which is
why I was wondering if there are any special things I need to take care of.
(I ran fio, the flexible I/O tester, at a fixed throughput during the live
migration.)
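Something along these lines (a sketch; the exact numbers are placeholders):

  # sustained ~1 MB/s of writes inside the container during migration
  fio --name=ctwrite --rw=write --bs=4k --size=1g \
      --rate=1m --time_based --runtime=600 --filename=/tmp/fio.test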
Please retry with vzctl 4.8 and ploop 1.12.1 (make sure they are on
both sides).
There used to be a 5-second wait for the remote side to finish syncing the
copied ploop data. It helped the case with not much I/O activity in the
container, but
ruined the case you are talking about.
Newer ploop and vzctl implement a feedback channel for ploop copy that
eliminates
that wait time.
http://git.openvz.org/?p=ploop;a=commit;h=20d754c91079165b
http://git.openvz.org/?p=vzctl;a=commit;h=374b759dec45255d4
There are some other major improvements as well, such as async send for
ploop.
http://git.openvz.org/?p=ploop;a=commit;h=a55e26e9606e0b
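A quick hedged check before retrying (package names as in the stock OpenVZ
repositories):

  # run on both the source and the destination node
  rpm -q vzctl ploop vzkernel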
2. For my purposes, I have modified the live migration script to allow
me to do cloning, i.e. I start both containers instead of deleting the
original. I need to do this "cloning" from time to time for the same target
container:
a. Let's say we cloned container C1 to container
C2 and let both execute at time t0; this works with no apparent loss of
service.
b. Now at time t1 I would like to clone C1 to C2 again, and
would like to optimize the rsync process, as most of the ploop files for C1
and C2 should still be the same (i.e. less time to sync). Can anyone suggest
the best way to realize the second point?
You can create a ploop snapshot and use a shared base image for both
containers
(instead of copying the base delta, hardlink it). This is not supported
by the tools
(for example, since the base delta is now shared you can't merge down into
it, but the
tools are not aware of that), so you need to figure it out yourself and be
careful,
but it should work.
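A very rough sketch of the idea (file names are illustrative; the real delta
names come from DiskDescriptor.xml, and both private areas must be on the same
filesystem for a hardlink to work):

  # snapshot C1 so its base delta is no longer written to
  vzctl snapshot 101
  # hardlink the (now read-only) base delta into C2's private area
  # instead of rsync'ing it
  ln /vz/private/101/root.hdd/root.hdd /vz/private/102/root.hdd/root.hdd
  # then copy only the top delta(s) and DiskDescriptor.xml for C2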
Thanks
Nipun
On Sun, Nov 23, 2014 at 12:56 AM, Kir Kolyshkin <[email protected]> wrote:
On 11/22/2014 09:09 AM, Nipun Arora wrote:
Hi All,
I was wondering if anyone can suggest the optimal way to
do the following:
1. Can anyone clarify if ploop is the best layout for minimum suspend
time during live migration?
Yes (due to ploop copy which only copies the modified blocks).
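For example, a hedged invocation (the destination hostname and CTID are
placeholders; -v is verbose, and recent vzmigrate versions also accept
-t/--times to print per-stage timings):

  vzmigrate --online -v -t dst.example.com 101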
2. I tried migrating a ploop device where I had increased --diskspace
to 5G,
and found that the suspend time taken by live migration increased to
57 seconds
(mainly undump and restore increased),
whereas a 2G diskspace was giving a 2-3 second suspend time. Is this
expected?
No. Undump and restore times depend mostly on the amount of RAM used by a
container.
Having said that, live migration stages influence each other, although
less so
in the latest vzctl release (I won't go into details here if you allow
me -- just make sure
you test with vzctl 4.8).
3. I tried running a write-intensive workload and found that beyond
100-150 Kbps
the suspend time during live migration increased rapidly. Is this an
expected trend?
Sure. With increased write speed, the amount of data that needs to
be copied after the CT
is suspended increases.
I am using vzctl 4.7 and ploop 1.11 on CentOS 6.5.
You need to update vzctl and ploop and rerun your tests; there should
be
some improvement (in particular with respect to issue #3).
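A hedged one-liner for that, assuming the standard OpenVZ yum repository is
configured on both nodes:

  yum update vzctl ploop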
Thanks
Nipun
--
Best regards,
[COOLCOLD-RIPN]