subject:"\[Qemu\-devel\] \[POC\]colo\-proxy in qemu"



On 11/10/2015 04:30 PM, zhanghailiang wrote:
> On 2015/11/10 15:35, Jason Wang wrote:
>>
>>
>> On 11/10/2015 01:26 PM, Tkid wrote:
>>> Hi,all
>>>
>>> We are planning to reimplement colo proxy in userspace (Here is in
>>> qemu) to
>>> cache and compare net packets.This module is one of the important
>>> components
>>> of COLO project and now it is still in early stage, so any comments and
>>> feedback are warmly welcomed,thanks in advance.
>>>
>>> ## Background
>>> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
>>> Service)
>>> project is a high availability solution. Both Primary VM (PVM) and
>>> Secondary VM
>>> (SVM) run in parallel. They receive the same request from client, and
>>> generate
>>> responses in parallel too. If the response packets from PVM and SVM are
>>> identical, they are released immediately. Otherwise, a VM checkpoint
>>> (on demand)
>>> is conducted.
>>> Paper:
>>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>> COLO on Xen:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>> COLO on Qemu/KVM:
>>> http://wiki.qemu.org/Features/COLO
>>>
>>> By the needs of capturing response packets from PVM and SVM and
>>> finding out
>>> whether they are identical, we introduce a new module to qemu
>>> networking called
>>> colo-proxy.
>>>
>>> This document describes the design of the colo-proxy module
>>>
>>> ## Glossary
>>>PVM - Primary VM, which provides services to clients.
>>>SVM - Secondary VM, a hot standby and replication of PVM.
>>>PN - Primary Node, the host which PVM runs on
>>>SN - Secondary Node, the host which SVM runs on
>>>
>>> ## Our Idea ##
>>>
>>> COLO-Proxy
>>> COLO-Proxy is a part of COLO,based on qemu net filter and it's a
>>> plugin for
>>> qemu net filter.the function keep SVM connect normal to PVM and compare
>>> PVM's packets to SVM's packets.if difference,notify COLO do checkpoint.
>>>
>>> == Workflow ==
>>>
>>>
>>> +--+  +--+
>>> |PN|  |SN|
>>> +---+ +---+
>>> | +---+ | | +---+ |
>>> | |   | | | |   | |
>>> | |PVM| | | |SVM| |
>>> | |   | | | |   | |
>>> | +--+-^--+ | | +-^++ |
>>> || || |   ||  |
>>> || | ++ | | +---+ ||  |
>>> || | |COLO| |(socket) | |COLO   | ||  |
>>> || | | CheckPoint +-> CheckPoint| ||  |
>>> || | || |  (6)| |   | ||  |
>>> || | +-^--+ | | +---+ ||  |
>>> || |   (5) || |   ||  |
>>> || |   || |   ||  |
>>> | +--v-+--+ | Forward(socket) | +-+v+ |
>>> | |COLO Proxy  |  +---+(1)+->seq adjust(2)| | |
>>> | |  +-+--+ | | +-+ | |
>>> | |  | Compare(4) <---+(3)+-+ COLO Proxy| |
>>> | +---+ | Forward(socket) | +---+ |
>>> ++Qemu+-+ ++Qemu+-+
>>> | ^
>>> | |
>>> | |
>>>+v-++
>>>|   |
>>>|  Client   |
>>>|   |
>>>+---+
>>>
>>>
>>>
>>>
>>> (1)When PN receive client packets,PN COLO-Proxy copy and forward
>>> packets to
>>> SN COLO-Proxy.
>>> (2)SN COLO-Proxy record PVM's packet inital seq & adjust client's
>>> ack,send
>>> adjusted packets to SVM
>>> (3)SN Qemu COLO-Proxy recieve SVM's packets and forward to PN Qemu
>>> COLO-Proxy.
>>> (4)PN Qemu COLO-Proxy enqueue SVM's packets and enqueue PVM's
>>> packets,then
>>> compare PVM's packets data with SVM's packets data. If packets is
>>> different, compare
>>> module notify COLO CheckPoint module to do a checkpoint then send
>>> PVM's packets to
>>> client and drop SVM's packets, otherwise, just send PVM's packets to
>>> client and
>>> drop SVM's packets.
>>> (5)notify COLO-Checkpoint module checkpoint is needed
>>> (6)Do COLO-Checkpoint
>>>
>>> ### QEMU space TCP/IP stack(Based on SLIRP) ###
>>> We need a QEMU space TCP/IP stack to help us to analysis packet. After
>>> looking
>>> into QEMU, we found that SLIRP
>>>
>>> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>>>
>>>
>>> is a good choice for us. SLIRP proivdes a full TCP/IP stack within
>>> QEMU, it can
>>> help use to handle the packet written to/read from backend(tap) device
>>> which is
>>> just like a link layer(L2) packet.
>>>
>>> ### Packet enqueue and compare

Re: [Qemu-devel] [POC]colo-proxy in qemu

2015-11-10 Thread Dr. David Alan Gilbert

* Tkid (zhangchen.f...@cn.fujitsu.com) wrote:
> Hi,all
> 
> We are planning to reimplement colo proxy in userspace (Here is in qemu) to
> cache and compare net packets.This module is one of the important components
> of COLO project and now it is still in early stage, so any comments and
> feedback are warmly welcomed,thanks in advance.
> 
> ## Background
> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
> Service)
> project is a high availability solution. Both Primary VM (PVM) and Secondary
> VM
> (SVM) run in parallel. They receive the same request from client, and
> generate
> responses in parallel too. If the response packets from PVM and SVM are
> identical, they are released immediately. Otherwise, a VM checkpoint (on
> demand)
> is conducted.
> Paper:
> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
> COLO on Xen:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> COLO on Qemu/KVM:
> http://wiki.qemu.org/Features/COLO
> 
> By the needs of capturing response packets from PVM and SVM and finding out
> whether they are identical, we introduce a new module to qemu networking
> called
> colo-proxy.
> 
> This document describes the design of the colo-proxy module
> 
> ## Glossary
>   PVM - Primary VM, which provides services to clients.
>   SVM - Secondary VM, a hot standby and replication of PVM.
>   PN - Primary Node, the host which PVM runs on
>   SN - Secondary Node, the host which SVM runs on
> 
> ## Our Idea ##
> 
> COLO-Proxy
> COLO-Proxy is a part of COLO,based on qemu net filter and it's a plugin for
> qemu net filter.the function keep SVM connect normal to PVM and compare
> PVM's packets to SVM's packets.if difference,notify COLO do checkpoint.
> 
> == Workflow ==
> 
> 
> +--+  +--+
> |PN|  |SN|
> +---+ +---+
> | +---+ | | +---+ |
> | |   | | | |   | |
> | |PVM| | | |SVM| |
> | |   | | | |   | |
> | +--+-^--+ | | +-^++ |
> || || |   ||  |
> || | ++ | | +---+ ||  |
> || | |COLO| |(socket) | |COLO   | ||  |
> || | | CheckPoint +-> CheckPoint| ||  |
> || | || |  (6)| |   | ||  |
> || | +-^--+ | | +---+ ||  |
> || |   (5) || |   ||  |
> || |   || |   ||  |
> | +--v-+--+ | Forward(socket) | +-+v+ |
> | |COLO Proxy  |  +---+(1)+->seq adjust(2)| | |
> | |  +-+--+ | | +-+ | |
> | |  | Compare(4) <---+(3)+-+ COLO Proxy| |
> | +---+ | Forward(socket) | +---+ |
> ++Qemu+-+ ++Qemu+-+
>| ^
>| |
>| |
>   +v-++
>   |   |
>   |  Client   |
>   |   |
>   +---+
> 
> 
> (1)When PN receive client packets,PN COLO-Proxy copy and forward packets to
> SN COLO-Proxy.
> (2)SN COLO-Proxy record PVM's packet inital seq & adjust client's ack,send
> adjusted packets to SVM
> (3)SN Qemu COLO-Proxy recieve SVM's packets and forward to PN Qemu
> COLO-Proxy.

What protocol are you using for the data carried over the Forward(socket)?
I'm just wondering if there's an existing layer2 tunneling protocol that
it would be best to use.

> (4)PN Qemu COLO-Proxy enqueue SVM's packets and enqueue PVM's packets,then
> compare PVM's packets data with SVM's packets data. If packets is different,
> compare
> module notify COLO CheckPoint module to do a checkpoint then send PVM's
> packets to
> client and drop SVM's packets, otherwise, just send PVM's packets to client
> and
> drop SVM's packets.
> (5)notify COLO-Checkpoint module checkpoint is needed
> (6)Do COLO-Checkpoint
> 
> ### QEMU space TCP/IP stack(Based on SLIRP) ###
> We need a QEMU space TCP/IP stack to help us to analysis packet. After
> looking
> into QEMU, we found that SLIRP
> 
> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
> 
> is a good choice for us. SLIRP proivdes a full TCP/IP stack within QEMU, it
> can
> help use to handle the packet written to/read from backend(tap) device which
> is
> just like a link layer(L2) packet.

I still think SLIRP might be painful; but it might be an easy one to start
with.

> ### Packet enqueue and compare ###
> Together with QEMU space TCP/IP stack, we enqueue all packets sent by PVM
> and
>

Re: [Qemu-devel] [POC]colo-proxy in qemu

2015-11-10 Thread Dong, Eddie

> - What's the plan for vhost? Userspace network in qemu is rather slow, most
> user will choose vhost.
[Dong, Eddie] Hi Jason:
How about we take staging approach? In general, COLO opens a door of 
high performance HA solution, but it will take very long time to make 
everything perfect. As for the network virtualization, I think we may start 
from usage with moderate network bandwidth, like 1Gbps. Otherwise, the 
performance of COLO may be not that good (of course, like David mentioned, the 
worst case is same with periodic checkpoint). At the moment, how about we start 
from in Qemu virtio network, and enhance for vhost case in future? 

The good thing is that we get more people working on the patch series, 
and glad to see UMU also joined the effort.  Thanks and welcome...

Thx Eddie

Re: [Qemu-devel] [POC]colo-proxy in qemu



On 11/10/2015 05:35 PM, Tkid wrote:
>
>
> On 11/10/2015 03:35 PM, Jason Wang wrote:
>> On 11/10/2015 01:26 PM, Tkid wrote:
>>> Hi,all
>>>
>>> We are planning to reimplement colo proxy in userspace (Here is in
>>> qemu) to
>>> cache and compare net packets.This module is one of the important
>>> components
>>> of COLO project and now it is still in early stage, so any comments and
>>> feedback are warmly welcomed,thanks in advance.
>>>
>>> ## Background
>>> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
>>> Service)
>>> project is a high availability solution. Both Primary VM (PVM) and
>>> Secondary VM
>>> (SVM) run in parallel. They receive the same request from client, and
>>> generate
>>> responses in parallel too. If the response packets from PVM and SVM are
>>> identical, they are released immediately. Otherwise, a VM checkpoint
>>> (on demand)
>>> is conducted.
>>> Paper:
>>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>> COLO on Xen:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>> COLO on Qemu/KVM:
>>> http://wiki.qemu.org/Features/COLO
>>>
>>> By the needs of capturing response packets from PVM and SVM and
>>> finding out
>>> whether they are identical, we introduce a new module to qemu
>>> networking called
>>> colo-proxy.
>>>
>>> This document describes the design of the colo-proxy module
>>>
>>> ## Glossary
>>>PVM - Primary VM, which provides services to clients.
>>>SVM - Secondary VM, a hot standby and replication of PVM.
>>>PN - Primary Node, the host which PVM runs on
>>>SN - Secondary Node, the host which SVM runs on
>>>
>>> ## Our Idea ##
>>>
>>> COLO-Proxy
>>> COLO-Proxy is a part of COLO,based on qemu net filter and it's a
>>> plugin for
>>> qemu net filter.the function keep SVM connect normal to PVM and compare
>>> PVM's packets to SVM's packets.if difference,notify COLO do checkpoint.
>>>
>>> == Workflow ==
>>>
>>> +--+  +--+
>>> |PN|  |SN|
>>> +---+ +---+
>>> | +---+ | | +---+ |
>>> | |   | | | |   | |
>>> | |PVM| | | |SVM| |
>>> | |   | | | |   | |
>>> | +--+-^--+ | | +-^++ |
>>> || || |   ||  |
>>> || | ++ | | +---+ ||  |
>>> || | |COLO| |(socket) | |COLO   | ||  |
>>> || | | CheckPoint +-> CheckPoint| ||  |
>>> || | || |  (6)| |   | ||  |
>>> || | +-^--+ | | +---+ ||  |
>>> || |   (5) || |   ||  |
>>> || |   || |   ||  |
>>> | +--v-+--+ | Forward(socket) | +-+v+ |
>>> | |COLO Proxy  |  +---+(1)+->seq adjust(2)| | |
>>> | |  +-+--+ | | +-+ | |
>>> | |  | Compare(4) <---+(3)+-+ COLO Proxy| |
>>> | +---+ | Forward(socket) | +---+ |
>>> ++Qemu+-+ ++Qemu+-+
>>> | ^
>>> | |
>>> | |
>>>+v-++
>>>|   |
>>>|  Client   |
>>>|   |
>>>+---+
>>>
>>>
>>> (1)When PN receive client packets,PN COLO-Proxy copy and forward
>>> packets to
>>> SN COLO-Proxy.
>>> (2)SN COLO-Proxy record PVM's packet inital seq & adjust client's
>>> ack,send
>>> adjusted packets to SVM
>>> (3)SN Qemu COLO-Proxy recieve SVM's packets and forward to PN Qemu
>>> COLO-Proxy.
>>> (4)PN Qemu COLO-Proxy enqueue SVM's packets and enqueue PVM's
>>> packets,then
>>> compare PVM's packets data with SVM's packets data. If packets is
>>> different, compare
>>> module notify COLO CheckPoint module to do a checkpoint then send
>>> PVM's packets to
>>> client and drop SVM's packets, otherwise, just send PVM's packets to
>>> client and
>>> drop SVM's packets.
>>> (5)notify COLO-Checkpoint module checkpoint is needed
>>> (6)Do COLO-Checkpoint
>>>
>>> ### QEMU space TCP/IP stack(Based on SLIRP) ###
>>> We need a QEMU space TCP/IP stack to help us to analysis packet. After
>>> looking
>>> into QEMU, we found that SLIRP
>>>
>>> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>>>
>>>
>>> is a good choice for us. SLIRP proivdes a full TCP/IP stack within
>>> QEMU, it can
>>> help use to handle the packet written to/read from backend(tap) device
>>> which is
>>> just like a link layer(L2) packet.
>>>
>>> ### Packet enqueue and compare ###
>>> Together

Re: [Qemu-devel] [POC]colo-proxy in qemu



On 11/10/2015 05:41 PM, Dr. David Alan Gilbert wrote:
> * Jason Wang (jasow...@redhat.com) wrote:
>>
>> On 11/10/2015 01:26 PM, Tkid wrote:
>>> Hi,all
>>>
>>> We are planning to reimplement colo proxy in userspace (Here is in
>>> qemu) to
>>> cache and compare net packets.This module is one of the important
>>> components
>>> of COLO project and now it is still in early stage, so any comments and
>>> feedback are warmly welcomed,thanks in advance.
>>>
>>> ## Background
>>> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
>>> Service)
>>> project is a high availability solution. Both Primary VM (PVM) and
>>> Secondary VM
>>> (SVM) run in parallel. They receive the same request from client, and
>>> generate
>>> responses in parallel too. If the response packets from PVM and SVM are
>>> identical, they are released immediately. Otherwise, a VM checkpoint
>>> (on demand)
>>> is conducted.
>>> Paper:
>>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>> COLO on Xen:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>> COLO on Qemu/KVM:
>>> http://wiki.qemu.org/Features/COLO
>>>
>>> By the needs of capturing response packets from PVM and SVM and
>>> finding out
>>> whether they are identical, we introduce a new module to qemu
>>> networking called
>>> colo-proxy.
>>>
>>> This document describes the design of the colo-proxy module
>>>
>>> ## Glossary
>>>   PVM - Primary VM, which provides services to clients.
>>>   SVM - Secondary VM, a hot standby and replication of PVM.
>>>   PN - Primary Node, the host which PVM runs on
>>>   SN - Secondary Node, the host which SVM runs on
>>>
>>> ## Our Idea ##
>>>
>>> COLO-Proxy
>>> COLO-Proxy is a part of COLO,based on qemu net filter and it's a
>>> plugin for
>>> qemu net filter.the function keep SVM connect normal to PVM and compare
>>> PVM's packets to SVM's packets.if difference,notify COLO do checkpoint.
>>>
>>> == Workflow ==
>>>
>>>
>>> +--+  +--+
>>> |PN|  |SN|
>>> +---+ +---+
>>> | +---+ | | +---+ |
>>> | |   | | | |   | |
>>> | |PVM| | | |SVM| |
>>> | |   | | | |   | |
>>> | +--+-^--+ | | +-^++ |
>>> || || |   ||  |
>>> || | ++ | | +---+ ||  |
>>> || | |COLO| |(socket) | |COLO   | ||  |
>>> || | | CheckPoint +-> CheckPoint| ||  |
>>> || | || |  (6)| |   | ||  |
>>> || | +-^--+ | | +---+ ||  |
>>> || |   (5) || |   ||  |
>>> || |   || |   ||  |
>>> | +--v-+--+ | Forward(socket) | +-+v+ |
>>> | |COLO Proxy  |  +---+(1)+->seq adjust(2)| | |
>>> | |  +-+--+ | | +-+ | |
>>> | |  | Compare(4) <---+(3)+-+ COLO Proxy| |
>>> | +---+ | Forward(socket) | +---+ |
>>> ++Qemu+-+ ++Qemu+-+
>>>| ^
>>>| |
>>>| |
>>>   +v-++
>>>   |   |
>>>   |  Client   |
>>>   |   |
>>>   +---+
>>>
>>>
>>>
>>>
>>> (1)When PN receive client packets,PN COLO-Proxy copy and forward
>>> packets to
>>> SN COLO-Proxy.
>>> (2)SN COLO-Proxy record PVM's packet inital seq & adjust client's ack,send
>>> adjusted packets to SVM
>>> (3)SN Qemu COLO-Proxy recieve SVM's packets and forward to PN Qemu
>>> COLO-Proxy.
>>> (4)PN Qemu COLO-Proxy enqueue SVM's packets and enqueue PVM's packets,then
>>> compare PVM's packets data with SVM's packets data. If packets is
>>> different, compare
>>> module notify COLO CheckPoint module to do a checkpoint then send
>>> PVM's packets to
>>> client and drop SVM's packets, otherwise, just send PVM's packets to
>>> client and
>>> drop SVM's packets.
>>> (5)notify COLO-Checkpoint module checkpoint is needed
>>> (6)Do COLO-Checkpoint
>>>
>>> ### QEMU space TCP/IP stack(Based on SLIRP) ###
>>> We need a QEMU space TCP/IP stack to help us to analysis packet. After
>>> looking
>>> into QEMU, we found that SLIRP
>>>
>>> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>>>
>>> is a good choice for us. SLIRP proivdes a full TCP/IP stack within
>>> QEMU, it can
>>> help use to handle the packet written to/read from backend(tap) device
>>> which is
>>> just like a link layer(L2) packet.
>>>
>>> ### Packet enqueue and compare ###
>>>

Re: [Qemu-devel] [POC]colo-proxy in qemu



On 11/11/2015 09:23 AM, Dong, Eddie wrote:
>> - What's the plan for vhost? Userspace network in qemu is rather slow, most
>> user will choose vhost.
> [Dong, Eddie] Hi Jason:
>   How about we take staging approach? In general, COLO opens a door of 
> high performance HA solution, but it will take very long time to make 
> everything perfect. As for the network virtualization, I think we may start 
> from usage with moderate network bandwidth, like 1Gbps. Otherwise, the 
> performance of COLO may be not that good (of course, like David mentioned, 
> the worst case is same with periodic checkpoint). At the moment, how about we 
> start from in Qemu virtio network, and enhance for vhost case in future? 

Yes, of course and it makes sense.

Mentioning vhost here is to avoid re-inventing or abandoning some exist
infrastructures. For example, thinking how netfiler can work with vhost
from the beginning does not harm ...

>
>   The good thing is that we get more people working on the patch series, 
> and glad to see UMU also joined the effort.  Thanks and welcome...
>
> Thx Eddie

Thanks

Re: [Qemu-devel] [POC]colo-proxy in qemu

2015-11-09 Thread Jason Wang



On 11/10/2015 01:26 PM, Tkid wrote:
> Hi,all
>
> We are planning to reimplement colo proxy in userspace (Here is in
> qemu) to
> cache and compare net packets.This module is one of the important
> components
> of COLO project and now it is still in early stage, so any comments and
> feedback are warmly welcomed,thanks in advance.
>
> ## Background
> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
> Service)
> project is a high availability solution. Both Primary VM (PVM) and
> Secondary VM
> (SVM) run in parallel. They receive the same request from client, and
> generate
> responses in parallel too. If the response packets from PVM and SVM are
> identical, they are released immediately. Otherwise, a VM checkpoint
> (on demand)
> is conducted.
> Paper:
> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
> COLO on Xen:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> COLO on Qemu/KVM:
> http://wiki.qemu.org/Features/COLO
>
> By the needs of capturing response packets from PVM and SVM and
> finding out
> whether they are identical, we introduce a new module to qemu
> networking called
> colo-proxy.
>
> This document describes the design of the colo-proxy module
>
> ## Glossary
>   PVM - Primary VM, which provides services to clients.
>   SVM - Secondary VM, a hot standby and replication of PVM.
>   PN - Primary Node, the host which PVM runs on
>   SN - Secondary Node, the host which SVM runs on
>
> ## Our Idea ##
>
> COLO-Proxy
> COLO-Proxy is a part of COLO,based on qemu net filter and it's a
> plugin for
> qemu net filter.the function keep SVM connect normal to PVM and compare
> PVM's packets to SVM's packets.if difference,notify COLO do checkpoint.
>
> == Workflow ==
>
>
> +--+  +--+
> |PN|  |SN|
> +---+ +---+
> | +---+ | | +---+ |
> | |   | | | |   | |
> | |PVM| | | |SVM| |
> | |   | | | |   | |
> | +--+-^--+ | | +-^++ |
> || || |   ||  |
> || | ++ | | +---+ ||  |
> || | |COLO| |(socket) | |COLO   | ||  |
> || | | CheckPoint +-> CheckPoint| ||  |
> || | || |  (6)| |   | ||  |
> || | +-^--+ | | +---+ ||  |
> || |   (5) || |   ||  |
> || |   || |   ||  |
> | +--v-+--+ | Forward(socket) | +-+v+ |
> | |COLO Proxy  |  +---+(1)+->seq adjust(2)| | |
> | |  +-+--+ | | +-+ | |
> | |  | Compare(4) <---+(3)+-+ COLO Proxy| |
> | +---+ | Forward(socket) | +---+ |
> ++Qemu+-+ ++Qemu+-+
>| ^
>| |
>| |
>   +v-++
>   |   |
>   |  Client   |
>   |   |
>   +---+
>
>
>
>
> (1)When PN receive client packets,PN COLO-Proxy copy and forward
> packets to
> SN COLO-Proxy.
> (2)SN COLO-Proxy record PVM's packet inital seq & adjust client's ack,send
> adjusted packets to SVM
> (3)SN Qemu COLO-Proxy recieve SVM's packets and forward to PN Qemu
> COLO-Proxy.
> (4)PN Qemu COLO-Proxy enqueue SVM's packets and enqueue PVM's packets,then
> compare PVM's packets data with SVM's packets data. If packets is
> different, compare
> module notify COLO CheckPoint module to do a checkpoint then send
> PVM's packets to
> client and drop SVM's packets, otherwise, just send PVM's packets to
> client and
> drop SVM's packets.
> (5)notify COLO-Checkpoint module checkpoint is needed
> (6)Do COLO-Checkpoint
>
> ### QEMU space TCP/IP stack(Based on SLIRP) ###
> We need a QEMU space TCP/IP stack to help us to analysis packet. After
> looking
> into QEMU, we found that SLIRP
>
> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>
> is a good choice for us. SLIRP proivdes a full TCP/IP stack within
> QEMU, it can
> help use to handle the packet written to/read from backend(tap) device
> which is
> just like a link layer(L2) packet.
>
> ### Packet enqueue and compare ###
> Together with QEMU space TCP/IP stack, we enqueue all packets sent by
> PVM and
> SVM on Primary QEMU, and then compare the packet payload for each
> connection.
>

Hi:

Just have the following questions in my mind (some has been raised in
the previous rounds of discussion without a conclusion):

- What's the plan for management layer? The setup

[Qemu-devel] [POC]colo-proxy in qemu

2015-11-09 Thread Tkid


Hi,all

We are planning to reimplement colo proxy in userspace (Here is in qemu) to
cache and compare net packets.This module is one of the important components
of COLO project and now it is still in early stage, so any comments and
feedback are warmly welcomed,thanks in advance.

## Background
COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop 
Service)
project is a high availability solution. Both Primary VM (PVM) and 
Secondary VM
(SVM) run in parallel. They receive the same request from client, and 
generate

responses in parallel too. If the response packets from PVM and SVM are
identical, they are released immediately. Otherwise, a VM checkpoint (on 
demand)

is conducted.
Paper:
http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
COLO on Xen:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
COLO on Qemu/KVM:
http://wiki.qemu.org/Features/COLO

By the needs of capturing response packets from PVM and SVM and finding out
whether they are identical, we introduce a new module to qemu networking 
called

colo-proxy.

This document describes the design of the colo-proxy module

## Glossary
  PVM - Primary VM, which provides services to clients.
  SVM - Secondary VM, a hot standby and replication of PVM.
  PN - Primary Node, the host which PVM runs on
  SN - Secondary Node, the host which SVM runs on

## Our Idea ##

COLO-Proxy
COLO-Proxy is a part of COLO,based on qemu net filter and it's a plugin for
qemu net filter.the function keep SVM connect normal to PVM and compare
PVM's packets to SVM's packets.if difference,notify COLO do checkpoint.

== Workflow ==


+--+  +--+
|PN|  |SN|
+---+ +---+
| +---+ | | +---+ |
| |   | | | |   | |
| |PVM| | | |SVM| |
| |   | | | |   | |
| +--+-^--+ | | +-^++ |
|| || |   ||  |
|| | ++ | | +---+ ||  |
|| | |COLO| |(socket) | |COLO   | ||  |
|| | | CheckPoint +-> CheckPoint| ||  |
|| | || |  (6)| |   | ||  |
|| | +-^--+ | | +---+ ||  |
|| |   (5) || |   ||  |
|| |   || |   ||  |
| +--v-+--+ | Forward(socket) | +-+v+ |
| |COLO Proxy  |  +---+(1)+->seq adjust(2)| | |
| |  +-+--+ | | +-+ | |
| |  | Compare(4) <---+(3)+-+ COLO Proxy| |
| +---+ | Forward(socket) | +---+ |
++Qemu+-+ ++Qemu+-+
   | ^
   | |
   | |
  +v-++
  |   |
  |  Client   |
  |   |
  +---+





(1)When PN receive client packets,PN COLO-Proxy copy and forward packets to
SN COLO-Proxy.
(2)SN COLO-Proxy record PVM's packet inital seq & adjust client's ack,send
adjusted packets to SVM
(3)SN Qemu COLO-Proxy recieve SVM's packets and forward to PN Qemu 
COLO-Proxy.

(4)PN Qemu COLO-Proxy enqueue SVM's packets and enqueue PVM's packets,then
compare PVM's packets data with SVM's packets data. If packets is 
different, compare
module notify COLO CheckPoint module to do a checkpoint then send PVM's 
packets to
client and drop SVM's packets, otherwise, just send PVM's packets to 
client and

drop SVM's packets.
(5)notify COLO-Checkpoint module checkpoint is needed
(6)Do COLO-Checkpoint

### QEMU space TCP/IP stack(Based on SLIRP) ###
We need a QEMU space TCP/IP stack to help us to analysis packet. After 
looking

into QEMU, we found that SLIRP

http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29

is a good choice for us. SLIRP proivdes a full TCP/IP stack within QEMU, 
it can
help use to handle the packet written to/read from backend(tap) device 
which is

just like a link layer(L2) packet.

### Packet enqueue and compare ###
Together with QEMU space TCP/IP stack, we enqueue all packets sent by 
PVM and
SVM on Primary QEMU, and then compare the packet payload for each 
connection.

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Gonglei

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:
 * Jason Wang (jasow...@redhat.com) wrote:


 On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
 * Dong, Eddie (eddie.d...@intel.com) wrote:
 A question here, the packet comparing may be very tricky. For example,
 some protocol use random data to generate unpredictable id or
 something else. One example is ipv6_select_ident() in Linux. So COLO
 needs a mechanism to make sure PVM and SVM can generate same random
 data?
 Good question, the random data connection is a big problem for COLO. At
 present, it will trigger checkpoint processing because of the different 
 random
 data.
 I don't think any mechanisms can assure two different machines generate 
 the
 same random data. If you have any ideas, pls tell us :)

 Frequent checkpoint can handle this scenario, but maybe will cause the
 performance poor. :(

 The assumption is that, after VM checkpoint, SVM and PVM have identical 
 internal state, so the pattern used to generate random data has high 
 possibility to generate identical data at short time, at least...
 They do diverge pretty quickly though; I have simple examples which
 reliably cause a checkpoint because of simple randomness in applications.

 Dave


 And it will become even worse if hwrng is used in guest.
 
 Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
 once established, tends to work well without triggering checkpoints;
 and static web pages also work well.  Examples of things that do cause
 more checkpoints are, displaying guest statistics (e.g. running top
 in that ssh) which is timing dependent, and dynamically generated
 web pages that include a unique ID (bugzilla's password reset link in
 it's front page was a fun one), I think also establishing
 new encrypted connections cause the same randomness.
 
 However, it's worth remembering that COLO is trying to reduce the
 number of checkpoints compared to a simple checkpointing world
 which would be aiming to do a checkpoint ~100 times a second,
 and for compute bound workloads, or ones that don't expose
 the randomness that much, it can get checkpoints of a few seconds
 in length which greatly reduces the overhead.
 

Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.

Regards,
-Gonglei

Re: [Qemu-devel] [POC] colo-proxy in qemu

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
 On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:
 * Gonglei (arei.gong...@huawei.com) wrote:
 On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:
 * Jason Wang (jasow...@redhat.com) wrote:
 
 
 On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
 * Dong, Eddie (eddie.d...@intel.com) wrote:
 A question here, the packet comparing may be very tricky. For example,
 some protocol use random data to generate unpredictable id or
 something else. One example is ipv6_select_ident() in Linux. So COLO
 needs a mechanism to make sure PVM and SVM can generate same random
 data?
 Good question, the random data connection is a big problem for COLO. At
 present, it will trigger checkpoint processing because of the 
 different random
 data.
 I don't think any mechanisms can assure two different machines 
 generate the
 same random data. If you have any ideas, pls tell us :)
 
 Frequent checkpoint can handle this scenario, but maybe will cause the
 performance poor. :(
 
 The assumption is that, after VM checkpoint, SVM and PVM have identical 
 internal state, so the pattern used to generate random data has high 
 possibility to generate identical data at short time, at least...
 They do diverge pretty quickly though; I have simple examples which
 reliably cause a checkpoint because of simple randomness in applications.
 
 Dave
 
 
 And it will become even worse if hwrng is used in guest.
 
 Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
 once established, tends to work well without triggering checkpoints;
 and static web pages also work well.  Examples of things that do cause
 more checkpoints are, displaying guest statistics (e.g. running top
 in that ssh) which is timing dependent, and dynamically generated
 web pages that include a unique ID (bugzilla's password reset link in
 it's front page was a fun one), I think also establishing
 new encrypted connections cause the same randomness.
 
 However, it's worth remembering that COLO is trying to reduce the
 number of checkpoints compared to a simple checkpointing world
 which would be aiming to do a checkpoint ~100 times a second,
 and for compute bound workloads, or ones that don't expose
 the randomness that much, it can get checkpoints of a few seconds
 in length which greatly reduces the overhead.
 
 
 Yes. That's the truth.
 We can set two different modes for different scenarios. Maybe Named
 1) frequent checkpoint mode for multi-connections and randomness scenarios
 and 2) non-frequent checkpoint mode for other scenarios.
 
 But that's the next plan, we are thinking about that.
 
 I have some code that tries to automatically switch between those;
 it measures the checkpoint lengths, and if they're consistently short
 it sends a different message byte to the secondary at the start of the
 checkpoint, so that it doesn't bother running.   Every so often it
 then flips back to a COLO checkpoint to see if the checkpoints
 are still really fast.
 
 
 Do you mean if there are consistent checkpoint requests, not do checkpoint 
 but just send a special message to SVM?
 Resume to common COLO mode until the checkpoint lengths is so not short ?

  We still have to do checkpoints, but we send a special message to the SVM so 
that
the SVM just takes the checkpoint but does not run.

  I'll send the code after I've updated it to your current version; but it's
quite rough/experimental.

It works something like

 ---run PVM run SVM
 COLO long gap
 mode   miscompare
checkpoint
 ---run PVM run SVM
 COLO short gap
 mode   miscompare
checkpoint
 ---run PVM run SVM
 COLO short gap
 mode   miscompare  After a few short runs
checkpoint
 ---run PVM SVM idle   \
   Passivefixed delay|  - repeat 'n' times
 mode   checkpoint /
 ---run PVM run SVM
 COLO short gap   Still a short gap
 mode   miscompare
 ---run PVM SVM idle   \
   Passivefixed delay|  - repeat 'n' times
 mode   checkpoint /
 ---run PVM run SVM
 COLO long gap   long gap now, stay in COLO
 mode   miscompare
checkpoint
 ---run PVM run SVM
 COLO long gap
 mode   miscompare
checkpoint
 
So it saves the CPU time on the SVM, and the comparison traffic, and is
automatic at switching into the passive mode.

It used to be more useful, but your minimum COLO run time that you
added a few versions ago helps a lot in the cases where there are miscompares,
and the delay after the miscompare before you take the checkpoint also helps
in the case where the data is very random.

Dave

 
 Thanks.
 
 Dave
 
 
 Regards,
 -Gonglei
 
 --
 Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
 
 .
 
 
 
--
Dr.

Re: [Qemu-devel] [POC] colo-proxy in qemu

* Gonglei (arei.gong...@huawei.com) wrote:
 On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:
  * Jason Wang (jasow...@redhat.com) wrote:
 
 
  On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
  * Dong, Eddie (eddie.d...@intel.com) wrote:
  A question here, the packet comparing may be very tricky. For example,
  some protocol use random data to generate unpredictable id or
  something else. One example is ipv6_select_ident() in Linux. So COLO
  needs a mechanism to make sure PVM and SVM can generate same random
  data?
  Good question, the random data connection is a big problem for COLO. At
  present, it will trigger checkpoint processing because of the different 
  random
  data.
  I don't think any mechanisms can assure two different machines generate 
  the
  same random data. If you have any ideas, pls tell us :)
 
  Frequent checkpoint can handle this scenario, but maybe will cause the
  performance poor. :(
 
  The assumption is that, after VM checkpoint, SVM and PVM have identical 
  internal state, so the pattern used to generate random data has high 
  possibility to generate identical data at short time, at least...
  They do diverge pretty quickly though; I have simple examples which
  reliably cause a checkpoint because of simple randomness in applications.
 
  Dave
 
 
  And it will become even worse if hwrng is used in guest.
  
  Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
  once established, tends to work well without triggering checkpoints;
  and static web pages also work well.  Examples of things that do cause
  more checkpoints are, displaying guest statistics (e.g. running top
  in that ssh) which is timing dependent, and dynamically generated
  web pages that include a unique ID (bugzilla's password reset link in
  it's front page was a fun one), I think also establishing
  new encrypted connections cause the same randomness.
  
  However, it's worth remembering that COLO is trying to reduce the
  number of checkpoints compared to a simple checkpointing world
  which would be aiming to do a checkpoint ~100 times a second,
  and for compute bound workloads, or ones that don't expose
  the randomness that much, it can get checkpoints of a few seconds
  in length which greatly reduces the overhead.
  
 
 Yes. That's the truth.
 We can set two different modes for different scenarios. Maybe Named
 1) frequent checkpoint mode for multi-connections and randomness scenarios
 and 2) non-frequent checkpoint mode for other scenarios.
 
 But that's the next plan, we are thinking about that.

I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.

Dave

 
 Regards,
 -Gonglei
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread zhanghailiang


On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:

* Gonglei (arei.gong...@huawei.com) wrote:

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:

* Dong, Eddie (eddie.d...@intel.com) wrote:

A question here, the packet comparing may be very tricky. For example,
some protocol use random data to generate unpredictable id or
something else. One example is ipv6_select_ident() in Linux. So COLO
needs a mechanism to make sure PVM and SVM can generate same random

data?
Good question, the random data connection is a big problem for COLO. At
present, it will trigger checkpoint processing because of the different random
data.
I don't think any mechanisms can assure two different machines generate the
same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(


The assumption is that, after VM checkpoint, SVM and PVM have identical 
internal state, so the pattern used to generate random data has high 
possibility to generate identical data at short time, at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in applications.

Dave



And it will become even worse if hwrng is used in guest.


Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.



Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.


I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.



Do you mean if there are consistent checkpoint requests, not do checkpoint but 
just send a special message to SVM?
Resume to common COLO mode until the checkpoint lengths is so not short ?

Thanks.


Dave



Regards,
-Gonglei


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

.

Re: [Qemu-devel] [POC] colo-proxy in qemu

* Jason Wang (jasow...@redhat.com) wrote:
 
 
 On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
  * Dong, Eddie (eddie.d...@intel.com) wrote:
  A question here, the packet comparing may be very tricky. For example,
  some protocol use random data to generate unpredictable id or
  something else. One example is ipv6_select_ident() in Linux. So COLO
  needs a mechanism to make sure PVM and SVM can generate same random
  data?
  Good question, the random data connection is a big problem for COLO. At
  present, it will trigger checkpoint processing because of the different 
  random
  data.
  I don't think any mechanisms can assure two different machines generate 
  the
  same random data. If you have any ideas, pls tell us :)
 
  Frequent checkpoint can handle this scenario, but maybe will cause the
  performance poor. :(
 
  The assumption is that, after VM checkpoint, SVM and PVM have identical 
  internal state, so the pattern used to generate random data has high 
  possibility to generate identical data at short time, at least...
  They do diverge pretty quickly though; I have simple examples which
  reliably cause a checkpoint because of simple randomness in applications.
 
  Dave
 
 
 And it will become even worse if hwrng is used in guest.

Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Yang Hongyang




On 07/30/2015 09:59 PM, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:

* Gonglei (arei.gong...@huawei.com) wrote:

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:

* Dong, Eddie (eddie.d...@intel.com) wrote:

A question here, the packet comparing may be very tricky. For example,
some protocol use random data to generate unpredictable id or
something else. One example is ipv6_select_ident() in Linux. So COLO
needs a mechanism to make sure PVM and SVM can generate same random

data?
Good question, the random data connection is a big problem for COLO. At
present, it will trigger checkpoint processing because of the different random
data.
I don't think any mechanisms can assure two different machines generate the
same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(


The assumption is that, after VM checkpoint, SVM and PVM have identical 
internal state, so the pattern used to generate random data has high 
possibility to generate identical data at short time, at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in applications.

Dave



And it will become even worse if hwrng is used in guest.


Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.



Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.


I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.



Do you mean if there are consistent checkpoint requests, not do checkpoint but 
just send a special message to SVM?
Resume to common COLO mode until the checkpoint lengths is so not short ?


   We still have to do checkpoints, but we send a special message to the SVM so 
that
the SVM just takes the checkpoint but does not run.

   I'll send the code after I've updated it to your current version; but it's
quite rough/experimental.

It works something like

  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare  After a few short runs
 checkpoint
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO short gap   Still a short gap
  mode   miscompare
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO long gap   long gap now, stay in COLO
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint

So it saves the CPU time on the SVM, and the comparison traffic, and is
automatic at switching into the passive mode.

It used to be more useful, but your minimum COLO run time that you
added a few versions ago helps a lot in the cases where there are miscompares,
and the delay after the miscompare before you take the checkpoint also helps
in the case where the data is very random.


This is great! This is exactly what we were thinking about, when random
scenario will fallback to MC/Remus

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Yang Hongyang


On 07/31/2015 09:28 AM, zhanghailiang wrote:

On 2015/7/31 9:08, Yang Hongyang wrote:



On 07/31/2015 01:53 AM, Dr. David Alan Gilbert wrote:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:



On 07/30/2015 09:59 PM, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:

* Gonglei (arei.gong...@huawei.com) wrote:

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:

* Dong, Eddie (eddie.d...@intel.com) wrote:

A question here, the packet comparing may be very tricky. For
example,
some protocol use random data to generate unpredictable id or
something else. One example is ipv6_select_ident() in Linux. So COLO
needs a mechanism to make sure PVM and SVM can generate same random

data?
Good question, the random data connection is a big problem for
COLO. At
present, it will trigger checkpoint processing because of the
different random
data.
I don't think any mechanisms can assure two different machines
generate the
same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(


The assumption is that, after VM checkpoint, SVM and PVM have
identical internal state, so the pattern used to generate random
data has high possibility to generate identical data at short time,
at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in
applications.

Dave



And it will become even worse if hwrng is used in guest.


Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.



Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.


I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.



Do you mean if there are consistent checkpoint requests, not do checkpoint
but just send a special message to SVM?
Resume to common COLO mode until the checkpoint lengths is so not short ?


   We still have to do checkpoints, but we send a special message to the
SVM so that
the SVM just takes the checkpoint but does not run.

   I'll send the code after I've updated it to your current version; but it's
quite rough/experimental.

It works something like

  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare  After a few short runs
 checkpoint
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO short gap   Still a short gap
  mode   miscompare
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO long gap   long gap now, stay in COLO
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint

So it saves the CPU time on the SVM, and the comparison traffic, and is
automatic at switching into the passive mode.

It used to be more useful, but your minimum COLO run time that you
added a few versions ago helps a lot in the cases where there are miscompares,
and the delay after the miscompare before

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Yang Hongyang




On 07/31/2015 01:53 AM, Dr. David Alan Gilbert wrote:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:



On 07/30/2015 09:59 PM, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:

* Gonglei (arei.gong...@huawei.com) wrote:

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:

* Dong, Eddie (eddie.d...@intel.com) wrote:

A question here, the packet comparing may be very tricky. For example,
some protocol use random data to generate unpredictable id or
something else. One example is ipv6_select_ident() in Linux. So COLO
needs a mechanism to make sure PVM and SVM can generate same random

data?
Good question, the random data connection is a big problem for COLO. At
present, it will trigger checkpoint processing because of the different random
data.
I don't think any mechanisms can assure two different machines generate the
same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(


The assumption is that, after VM checkpoint, SVM and PVM have identical 
internal state, so the pattern used to generate random data has high 
possibility to generate identical data at short time, at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in applications.

Dave



And it will become even worse if hwrng is used in guest.


Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.



Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.


I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.



Do you mean if there are consistent checkpoint requests, not do checkpoint but 
just send a special message to SVM?
Resume to common COLO mode until the checkpoint lengths is so not short ?


   We still have to do checkpoints, but we send a special message to the SVM so 
that
the SVM just takes the checkpoint but does not run.

   I'll send the code after I've updated it to your current version; but it's
quite rough/experimental.

It works something like

  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare  After a few short runs
 checkpoint
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO short gap   Still a short gap
  mode   miscompare
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO long gap   long gap now, stay in COLO
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint

So it saves the CPU time on the SVM, and the comparison traffic, and is
automatic at switching into the passive mode.

It used to be more useful, but your minimum COLO run time that you
added a few versions ago helps a lot in the cases where there are miscompares,
and the delay after the miscompare before you take the checkpoint also helps
in the case where the data is very random.


This

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread zhanghailiang


On 2015/7/31 9:08, Yang Hongyang wrote:



On 07/31/2015 01:53 AM, Dr. David Alan Gilbert wrote:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:



On 07/30/2015 09:59 PM, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:

* Gonglei (arei.gong...@huawei.com) wrote:

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:

* Dong, Eddie (eddie.d...@intel.com) wrote:

A question here, the packet comparing may be very tricky. For example,
some protocol use random data to generate unpredictable id or
something else. One example is ipv6_select_ident() in Linux. So COLO
needs a mechanism to make sure PVM and SVM can generate same random

data?
Good question, the random data connection is a big problem for COLO. At
present, it will trigger checkpoint processing because of the different random
data.
I don't think any mechanisms can assure two different machines generate the
same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(


The assumption is that, after VM checkpoint, SVM and PVM have identical 
internal state, so the pattern used to generate random data has high 
possibility to generate identical data at short time, at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in applications.

Dave



And it will become even worse if hwrng is used in guest.


Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.



Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.


I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.



Do you mean if there are consistent checkpoint requests, not do checkpoint but 
just send a special message to SVM?
Resume to common COLO mode until the checkpoint lengths is so not short ?


   We still have to do checkpoints, but we send a special message to the SVM so 
that
the SVM just takes the checkpoint but does not run.

   I'll send the code after I've updated it to your current version; but it's
quite rough/experimental.

It works something like

  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare  After a few short runs
 checkpoint
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO short gap   Still a short gap
  mode   miscompare
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO long gap   long gap now, stay in COLO
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint

So it saves the CPU time on the SVM, and the comparison traffic, and is
automatic at switching into the passive mode.

It used to be more useful, but your minimum COLO run time that you
added a few versions ago helps a lot in the cases where there are miscompares,
and the delay after the miscompare before you take the checkpoint also helps
in the

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread zhanghailiang


On 2015/7/31 1:53, Dr. David Alan Gilbert wrote:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:



On 07/30/2015 09:59 PM, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:

* Gonglei (arei.gong...@huawei.com) wrote:

On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:

* Dong, Eddie (eddie.d...@intel.com) wrote:

A question here, the packet comparing may be very tricky. For example,
some protocol use random data to generate unpredictable id or
something else. One example is ipv6_select_ident() in Linux. So COLO
needs a mechanism to make sure PVM and SVM can generate same random

data?
Good question, the random data connection is a big problem for COLO. At
present, it will trigger checkpoint processing because of the different random
data.
I don't think any mechanisms can assure two different machines generate the
same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(


The assumption is that, after VM checkpoint, SVM and PVM have identical 
internal state, so the pattern used to generate random data has high 
possibility to generate identical data at short time, at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in applications.

Dave



And it will become even worse if hwrng is used in guest.


Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
once established, tends to work well without triggering checkpoints;
and static web pages also work well.  Examples of things that do cause
more checkpoints are, displaying guest statistics (e.g. running top
in that ssh) which is timing dependent, and dynamically generated
web pages that include a unique ID (bugzilla's password reset link in
it's front page was a fun one), I think also establishing
new encrypted connections cause the same randomness.

However, it's worth remembering that COLO is trying to reduce the
number of checkpoints compared to a simple checkpointing world
which would be aiming to do a checkpoint ~100 times a second,
and for compute bound workloads, or ones that don't expose
the randomness that much, it can get checkpoints of a few seconds
in length which greatly reduces the overhead.



Yes. That's the truth.
We can set two different modes for different scenarios. Maybe Named
1) frequent checkpoint mode for multi-connections and randomness scenarios
and 2) non-frequent checkpoint mode for other scenarios.

But that's the next plan, we are thinking about that.


I have some code that tries to automatically switch between those;
it measures the checkpoint lengths, and if they're consistently short
it sends a different message byte to the secondary at the start of the
checkpoint, so that it doesn't bother running.   Every so often it
then flips back to a COLO checkpoint to see if the checkpoints
are still really fast.



Do you mean if there are consistent checkpoint requests, not do checkpoint but 
just send a special message to SVM?
Resume to common COLO mode until the checkpoint lengths is so not short ?


   We still have to do checkpoints, but we send a special message to the SVM so 
that
the SVM just takes the checkpoint but does not run.

   I'll send the code after I've updated it to your current version; but it's
quite rough/experimental.



Yes, please, we can merge them into our branch. ;)


It works something like

  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO short gap
  mode   miscompare  After a few short runs
 checkpoint
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO short gap   Still a short gap
  mode   miscompare
  ---run PVM SVM idle   \
Passivefixed delay|  - repeat 'n' times
  mode   checkpoint /
  ---run PVM run SVM
  COLO long gap   long gap now, stay in COLO
  mode   miscompare
 checkpoint
  ---run PVM run SVM
  COLO long gap
  mode   miscompare
 checkpoint

So it saves the CPU time on the SVM, and the comparison traffic, and is
automatic at switching into the passive mode.



That's a good solution, actually, we have a plan to realize the checkpoint 
strategy,
which can automatically adapt to different situation, including period 
checkpoint (MC/Remus mode),
COLO mode, mix mode (just like your

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Dong, Eddie

 
  A question here, the packet comparing may be very tricky. For example,
  some protocol use random data to generate unpredictable id or
  something else. One example is ipv6_select_ident() in Linux. So COLO
  needs a mechanism to make sure PVM and SVM can generate same random
 data?
 
 Good question, the random data connection is a big problem for COLO. At
 present, it will trigger checkpoint processing because of the different random
 data.
 I don't think any mechanisms can assure two different machines generate the
 same random data. If you have any ideas, pls tell us :)
 
 Frequent checkpoint can handle this scenario, but maybe will cause the
 performance poor. :(
 
The assumption is that, after VM checkpoint, SVM and PVM have identical 
internal state, so the pattern used to generate random data has high 
possibility to generate identical data at short time, at least...

Thx Eddie

Re: [Qemu-devel] [POC] colo-proxy in qemu

* Dong, Eddie (eddie.d...@intel.com) wrote:
  
   A question here, the packet comparing may be very tricky. For example,
   some protocol use random data to generate unpredictable id or
   something else. One example is ipv6_select_ident() in Linux. So COLO
   needs a mechanism to make sure PVM and SVM can generate same random
  data?
  
  Good question, the random data connection is a big problem for COLO. At
  present, it will trigger checkpoint processing because of the different 
  random
  data.
  I don't think any mechanisms can assure two different machines generate the
  same random data. If you have any ideas, pls tell us :)
  
  Frequent checkpoint can handle this scenario, but maybe will cause the
  performance poor. :(
  
 The assumption is that, after VM checkpoint, SVM and PVM have identical 
 internal state, so the pattern used to generate random data has high 
 possibility to generate identical data at short time, at least...

They do diverge pretty quickly though; I have simple examples which
reliably cause a checkpoint because of simple randomness in applications.

Dave

 Thx Eddie
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Jason Wang



On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
 * Dong, Eddie (eddie.d...@intel.com) wrote:
 A question here, the packet comparing may be very tricky. For example,
 some protocol use random data to generate unpredictable id or
 something else. One example is ipv6_select_ident() in Linux. So COLO
 needs a mechanism to make sure PVM and SVM can generate same random
 data?
 Good question, the random data connection is a big problem for COLO. At
 present, it will trigger checkpoint processing because of the different 
 random
 data.
 I don't think any mechanisms can assure two different machines generate the
 same random data. If you have any ideas, pls tell us :)

 Frequent checkpoint can handle this scenario, but maybe will cause the
 performance poor. :(

 The assumption is that, after VM checkpoint, SVM and PVM have identical 
 internal state, so the pattern used to generate random data has high 
 possibility to generate identical data at short time, at least...
 They do diverge pretty quickly though; I have simple examples which
 reliably cause a checkpoint because of simple randomness in applications.

 Dave


And it will become even worse if hwrng is used in guest.

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-30 Thread Gonglei

On 2015/7/30 12:23, Jason Wang wrote:
 
 
 On 07/20/2015 02:42 PM, Li Zhijian wrote:
 Hi, all

 We are planning to implement colo-proxy in qemu to cache and compare
 packets.
 This module is one of the important component of COLO project and now
 it is
 still in early stage, so any comments and feedback are warmly welcomed,
 thanks in advance.

 ## Background
 COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
 Service)
 project is a high availability solution. Both Primary VM (PVM) and
 Secondary VM
 (SVM) run in parallel. They receive the same request from client, and
 generate
 responses in parallel too. If the response packets from PVM and SVM are
 identical, they are released immediately. Otherwise, a VM checkpoint
 (on demand)
 is conducted.
 Paper:
 http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
 COLO on Xen:
 http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
 COLO on Qemu/KVM:
 http://wiki.qemu.org/Features/COLO

 By the needs of capturing response packets from PVM and SVM and
 finding out
 whether they are identical, we introduce a new module to qemu
 networking called
 colo-proxy.
 
 A question here, the packet comparing may be very tricky. For example,
 some protocol use random data to generate unpredictable id or something
 else. One example is ipv6_select_ident() in Linux. So COLO needs a
 mechanism to make sure PVM and SVM can generate same random data?
 
Good question, the random data connection is a big problem for COLO. At
present, it will trigger checkpoint processing because of the different random 
data.
I don't think any mechanisms can assure two different machines generate
the same random data. If you have any ideas, pls tell us :)

Frequent checkpoint can handle this scenario, but maybe will cause the
performance poor. :(

Regards,
-Gonglei

Re: [Qemu-devel] [POC] colo-proxy in qemu

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
 
 
 On 07/30/2015 09:59 PM, Dr. David Alan Gilbert wrote:
 * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
 On 2015/7/30 20:30, Dr. David Alan Gilbert wrote:
 * Gonglei (arei.gong...@huawei.com) wrote:
 On 2015/7/30 19:56, Dr. David Alan Gilbert wrote:
 * Jason Wang (jasow...@redhat.com) wrote:
 
 
 On 07/30/2015 04:03 PM, Dr. David Alan Gilbert wrote:
 * Dong, Eddie (eddie.d...@intel.com) wrote:
 A question here, the packet comparing may be very tricky. For 
 example,
 some protocol use random data to generate unpredictable id or
 something else. One example is ipv6_select_ident() in Linux. So COLO
 needs a mechanism to make sure PVM and SVM can generate same random
 data?
 Good question, the random data connection is a big problem for COLO. 
 At
 present, it will trigger checkpoint processing because of the 
 different random
 data.
 I don't think any mechanisms can assure two different machines 
 generate the
 same random data. If you have any ideas, pls tell us :)
 
 Frequent checkpoint can handle this scenario, but maybe will cause 
 the
 performance poor. :(
 
 The assumption is that, after VM checkpoint, SVM and PVM have 
 identical internal state, so the pattern used to generate random data 
 has high possibility to generate identical data at short time, at 
 least...
 They do diverge pretty quickly though; I have simple examples which
 reliably cause a checkpoint because of simple randomness in 
 applications.
 
 Dave
 
 
 And it will become even worse if hwrng is used in guest.
 
 Yes; it seems quite application dependent;  (on IPv4) an ssh connection,
 once established, tends to work well without triggering checkpoints;
 and static web pages also work well.  Examples of things that do cause
 more checkpoints are, displaying guest statistics (e.g. running top
 in that ssh) which is timing dependent, and dynamically generated
 web pages that include a unique ID (bugzilla's password reset link in
 it's front page was a fun one), I think also establishing
 new encrypted connections cause the same randomness.
 
 However, it's worth remembering that COLO is trying to reduce the
 number of checkpoints compared to a simple checkpointing world
 which would be aiming to do a checkpoint ~100 times a second,
 and for compute bound workloads, or ones that don't expose
 the randomness that much, it can get checkpoints of a few seconds
 in length which greatly reduces the overhead.
 
 
 Yes. That's the truth.
 We can set two different modes for different scenarios. Maybe Named
 1) frequent checkpoint mode for multi-connections and randomness scenarios
 and 2) non-frequent checkpoint mode for other scenarios.
 
 But that's the next plan, we are thinking about that.
 
 I have some code that tries to automatically switch between those;
 it measures the checkpoint lengths, and if they're consistently short
 it sends a different message byte to the secondary at the start of the
 checkpoint, so that it doesn't bother running.   Every so often it
 then flips back to a COLO checkpoint to see if the checkpoints
 are still really fast.
 
 
 Do you mean if there are consistent checkpoint requests, not do checkpoint 
 but just send a special message to SVM?
 Resume to common COLO mode until the checkpoint lengths is so not short ?
 
We still have to do checkpoints, but we send a special message to the SVM 
  so that
 the SVM just takes the checkpoint but does not run.
 
I'll send the code after I've updated it to your current version; but it's
 quite rough/experimental.
 
 It works something like
 
   ---run PVM run SVM
   COLO long gap
   mode   miscompare
  checkpoint
   ---run PVM run SVM
   COLO short gap
   mode   miscompare
  checkpoint
   ---run PVM run SVM
   COLO short gap
   mode   miscompare  After a few short runs
  checkpoint
   ---run PVM SVM idle   \
 Passivefixed delay|  - repeat 'n' times
   mode   checkpoint /
   ---run PVM run SVM
   COLO short gap   Still a short gap
   mode   miscompare
   ---run PVM SVM idle   \
 Passivefixed delay|  - repeat 'n' times
   mode   checkpoint /
   ---run PVM run SVM
   COLO long gap   long gap now, stay in COLO
   mode   miscompare
  checkpoint
   ---run PVM run SVM
   COLO long gap
   mode   miscompare
  checkpoint
 
 So it saves the CPU time on the SVM, and the comparison traffic, and is
 automatic at switching into the passive mode.
 
 It used to be more useful, but your minimum COLO run time that you
 added a few versions ago helps a lot in the cases where there are 
 miscompares,
 and the delay after the miscompare before you take the checkpoint

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-29 Thread Jason Wang



On 07/20/2015 02:42 PM, Li Zhijian wrote:
 Hi, all

 We are planning to implement colo-proxy in qemu to cache and compare
 packets.
 This module is one of the important component of COLO project and now
 it is
 still in early stage, so any comments and feedback are warmly welcomed,
 thanks in advance.

 ## Background
 COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
 Service)
 project is a high availability solution. Both Primary VM (PVM) and
 Secondary VM
 (SVM) run in parallel. They receive the same request from client, and
 generate
 responses in parallel too. If the response packets from PVM and SVM are
 identical, they are released immediately. Otherwise, a VM checkpoint
 (on demand)
 is conducted.
 Paper:
 http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
 COLO on Xen:
 http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
 COLO on Qemu/KVM:
 http://wiki.qemu.org/Features/COLO

 By the needs of capturing response packets from PVM and SVM and
 finding out
 whether they are identical, we introduce a new module to qemu
 networking called
 colo-proxy.

A question here, the packet comparing may be very tricky. For example,
some protocol use random data to generate unpredictable id or something
else. One example is ipv6_select_ident() in Linux. So COLO needs a
mechanism to make sure PVM and SVM can generate same random data?

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-29 Thread Jan Kiszka

On 2015-07-29 00:12, Samuel Thibault wrote:
 Hello,
 
 Jan Kiszka, le Mon 27 Jul 2015 15:33:27 +0200, a écrit :
 Of course, I'm fine with handing this over to someone who'd like to
 pick up. Do we have volunteers?

 Samuel, would you like to do this? As a subsystem maintainer, you are
 already familiar with QEMU processes.
 
 I can help with maintenance, yes.

Help with will easily mean be the one and only. ;) If you prefer,
send a patch which only adds you as a maintainer, but I would also ack
one that drops me from the list as well.

 
 Well, this still wouldn't resolve the independent review need for
 slirp-ipv6.
 
 Well, actually I didn't write slirp-ipv6, Guillaume Subiron did, and I
 reviewed it (and we iterated quite a bit) before we submit the patch
 series to qemu-devel.

Perfekt!

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-28 Thread Samuel Thibault

zhanghailiang, le Tue 21 Jul 2015 09:59:22 +0800, a écrit :
 I didn't find any news since that version, are you still trying to
 push them to qemu upstream ?

I'd still be trying if I had any actual answer other than we need to
find time to deal about it :)

I can rebase the patch series over the current master and submit again
the patches.

Samuel

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-28 Thread Samuel Thibault

Hello,

Jan Kiszka, le Mon 27 Jul 2015 15:33:27 +0200, a écrit :
 Of course, I'm fine with handing this over to someone who'd like to
 pick up. Do we have volunteers?
 
 Samuel, would you like to do this? As a subsystem maintainer, you are
 already familiar with QEMU processes.

I can help with maintenance, yes.

 Well, this still wouldn't resolve the independent review need for
 slirp-ipv6.

Well, actually I didn't write slirp-ipv6, Guillaume Subiron did, and I
reviewed it (and we iterated quite a bit) before we submit the patch
series to qemu-devel.

Samuel

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Samuel Thibault

Hello,

I'm just back from vacancy with no Internet access, so will answer
shortly :)

Samuel

Re: [Qemu-devel] [POC] colo-proxy in qemu

Hi Dave,

Thanks for the comments!

On 07/27/2015 06:40 PM, Dr. David Alan Gilbert wrote:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:

Hi Jason,

On 07/24/2015 10:12 AM, Jason Wang wrote:

On 07/24/2015 10:04 AM, Dong, Eddie wrote:

Hi Stefan:
Thanks for your comments!

On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:

We are planning to implement colo-proxy in qemu to cache and compare

packets.

I thought there is a kernel module to do that?

Yes, that is the previous solution the COLO sub-community choose to go,
but we realized it might be not the best choices, and thus we want to bring
discussion back here :) More comments are welcome.

Hi:

Could you pls describe more details on this decision? What's the reason
that you realize it was not the best choice?

Below is my opinion:

We realized that there're disadvantages do it in kernel spaces:
1. We need to recompile kernel: the colo-proxy kernel module is
implemented as a nf conntrack extension. Adding a extension need to
modify the extension struct in-kernel, so recompile kernel is needed.

That change is tiny though, so I don't think the change to the kernel
is a big issue (but I'm not a netfilter guy).

(For those following, the patch is:
https://github.com/coloft/colo-proxy/blob/master/patch4kernel/0001-colo-patch-for-kernel.patch
)
The comparison modules are bigger though, but still not massive.

2. We need to recompile iptables/nftables to use together with the colo-proxy
kernel module.

Again, the changes to iptables are small; so I don't think this should
influence it too much.

Yes, these changes are small, but even a small change needs to recompile
the component and reinstall it, for user, it is not friendly...

The bigger problem shown by 12 is that these changes are single-use - just for
COLO, which does make it a little harder to justify.

That's true.

3. Need to configure primary host to forward input packets to secondary as
well as configure secondary to forward output packets to primary host, the
network topology and configuration is too complex for a regular user.

Yes, and that bit is HARD - it took me quite a while to get it right; however,
we'll still need to forward packets between primary and secondary,

If we forward in qemu using a socket connection, a separate forward nic will not
be needed, and all tc stuff will not needed, will make configuration easier I
think.

and all that
hard setup should get rolled into something like libvirt, so perhaps it's not
really
that bad for the user in the end.

You can refer to http://wiki.qemu.org/Features/COLO
to see the network topology and the steps to setup an env.

Setup a test env is too complex. The usability is so important to a feature
like COLO which provide VM FT solution, if fewer people can/willing to
setup the env, the feature is useless. So we decide to develop user space
colo-proxy.

The advantage is obvious,
1. we do not need to recompile kernel.
2. No need to recompile iptables/nftables.
3. we do not need to deal with the network configuration, we just using a
socket connection between 2 QEMUs to forward packets.
4. A complete VM FT solution in one go, we have already developed the block
replication in QEMU, so with the network replication in QEMU, all
components we needed are within QEMU, this is very important, it greatly
improves the usability of COLO feature! We hope it will gain more testers,
users and developers.
5. QEMU will gain a complete VM FT solution and the most advantage FT solution
so far!

Overall, usability is the most important factor that impact our choice.

My biggest worry is your reliance on SLIRP for the TCP/IP stack; it
doesn't get much work done on it and I worry about it's reliability for
using it for the level of complexity you need.

Your current kernel implementation gets all the nf_conntrack stuff for free
which is very powerful.

However, I can see some advantages from doing it in user space; it would
be easier to debug, and possibly easier to configure, and might also be easier
to handle continuous FT (i.e. transferring the state of the proxy to a new COLO
connection).

I think at the moment I'd still prefer kernel space (especially since your
kernel
code now works pretty reliably!)

Another thought; if you're main worry is to do with the complexity of kernel
changes, had you considered looking at the bpf-jit - I'm not sure if it can
do what you need, but perhaps it's worth a look?

Will have a look, thank you!

Dave
P.S. I think 'proxy' is still the right word to describe it rather than
'agency'.

Thanks
.

--
Thanks,
Yang.

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
.

--
Thanks,
Yang.

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Jan Kiszka

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 2015-07-27 12:13, Stefan Hajnoczi wrote:
 On Tue, Jul 21, 2015 at 10:49:29AM +0100, Stefan Hajnoczi wrote:
 On Tue, Jul 21, 2015 at 08:13:42AM +0200, Jan Kiszka wrote:
 On 2015-07-20 17:01, Stefan Hajnoczi wrote:
 On Mon, Jul 20, 2015 at 2:12 PM, Vasiliy Tolstov
 v.tols...@selfip.ru wrote:
 2015-07-20 14:55 GMT+03:00 zhanghailiang
 zhang.zhanghaili...@huawei.com:
 Agreed, besides, it is seemed that slirp is not
 supporting ipv6, we also have to supplement it.
 
 
 patch for ipv6 slirp support some times ago sended to qemu
 list, but i don't know why in not accepted.
 
 I think no one reviewed it but there was no objection against
 IPv6 support in principle.
 
 Jan: Can we merge slirp IPv6 support for QEMU 2.5?
 
 Sorry, as I pointed out some time back, I don't have the
 bandwidth to look into slirp. Someone need to do a review, then
 send a pull request.
 
 Do you want to remove yourself from the slirp section of the
 MAINTAINERS file?
 
 Going forward we'll need to find someone familiar with the QEMU 
 development process and with enough time to review slirp
 patches.
 
 Ping?
 
 I hoped this would raise some discussion and that maybe we could
 find a new maintainer or co-maintainer to get slirp moving.
 
 Any thoughts?

Of course, I'm fine with handing this over to someone who'd like to
pick up. Do we have volunteers?

Samuel, would you like to do this? As a subsystem maintainer, you are
already familiar with QEMU processes. Well, this still wouldn't
resolve the independent review need for slirp-ipv6.

Jan

- -- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlW2MycACgkQitSsb3rl5xSCaACePNubKPBkrdxQkcThUGD7w56B
Q6oAoIgCzT9qVRzDf5IhY2eKFXgTZ+Ul
=yk6R
-END PGP SIGNATURE-

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread zhanghailiang


On 2015/7/27 18:13, Stefan Hajnoczi wrote:

On Tue, Jul 21, 2015 at 10:49:29AM +0100, Stefan Hajnoczi wrote:

On Tue, Jul 21, 2015 at 08:13:42AM +0200, Jan Kiszka wrote:

On 2015-07-20 17:01, Stefan Hajnoczi wrote:

On Mon, Jul 20, 2015 at 2:12 PM, Vasiliy Tolstov v.tols...@selfip.ru wrote:

2015-07-20 14:55 GMT+03:00 zhanghailiang zhang.zhanghaili...@huawei.com:

Agreed, besides, it is seemed that slirp is not supporting ipv6, we also
have to supplement it.



patch for ipv6 slirp support some times ago sended to qemu list, but i
don't know why in not accepted.


I think no one reviewed it but there was no objection against IPv6
support in principle.

Jan: Can we merge slirp IPv6 support for QEMU 2.5?


Sorry, as I pointed out some time back, I don't have the bandwidth to
look into slirp. Someone need to do a review, then send a pull request.


Do you want to remove yourself from the slirp section of the MAINTAINERS
file?

Going forward we'll need to find someone familiar with the QEMU
development process and with enough time to review slirp patches.


Ping?

I hoped this would raise some discussion and that maybe we could find a
new maintainer or co-maintainer to get slirp moving.



Yes, please, this is important, we need slirp's maintainer to help reviewing 
the COLO proxy
patches that will be implemented based on slirp. (If we finally come to an 
agreement on realizing it in qemu)

We also need to support ipv6 for slirp, i have emailed Samuel who have sent 
ipv6 patches for slirp before,
but got no response. (I would like to respin and test Samuel's ipv6 slirp patch 
if he don't have time to do this,
but firstly, i need to get his permission ：） ）

Cc: Samuel Thibault samuel.thiba...@ens-lyon.org samuel.thiba...@gnu.org.

Thanks,
zhanghailiang


Any thoughts?

Stefan

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Dr. David Alan Gilbert

* Jason Wang (jasow...@redhat.com) wrote:
 
 
 On 07/27/2015 01:51 PM, Yang Hongyang wrote:
  On 07/27/2015 12:49 PM, Jason Wang wrote:
 
 
  On 07/27/2015 11:54 AM, Yang Hongyang wrote:
 
 
  On 07/27/2015 11:24 AM, Jason Wang wrote:
 
 
  On 07/24/2015 04:04 PM, Yang Hongyang wrote:
  Hi Jason,
 
  On 07/24/2015 10:12 AM, Jason Wang wrote:
 
 
  On 07/24/2015 10:04 AM, Dong, Eddie wrote:
  Hi Stefan:
Thanks for your comments!
 
  On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
  We are planning to implement colo-proxy in qemu to cache and
  compare
  packets.
 
  I thought there is a kernel module to do that?
Yes, that is the previous solution the COLO sub-community
  choose
  to go, but we realized it might be not the best choices, and
  thus we
  want to bring discussion back here :)  More comments are welcome.
 
 
  Hi:
 
  Could you pls describe more details on this decision? What's the
  reason
  that you realize it was not the best choice?
 
  Below is my opinion:
 
  We realized that there're disadvantages do it in kernel spaces:
  1. We need to recompile kernel: the colo-proxy kernel module is
   implemented as a nf conntrack extension. Adding a extension
  need to
   modify the extension struct in-kernel, so recompile kernel is
  needed.
 
  There's no need to do all in kernel, you can use a separate process to
  do the comparing and trigger the state sync through monitor.
 
  I don't get it, colo-proxy kernel module using a kthread do the
  comparing and
  trigger the state sync. We implemented it as a nf conntrack extension
  module,
  so we need to extend the extension struct in-kernel, although it just
  needs
  few lines changes to kernel, but a recompile of kernel is needed.
  Are you
  talking about not implement it as a nf conntrack extension?
 
  Yes, I mean implement the comparing in userspace but not in qemu.
 
  Yes, it is an alternative, that requires other components such as
  netfilter userspace tools, it will add the complexity I think, we
  wanted to implement a simple solution in QEMU.
 
 I didn't get the point that why netfilter is needed? Do you mean the
 packet comparing needs to be stateful?

The current kernel world does a few things that take advantage
of the netfilter code:
   1) It's stateful hanging state off conntrack
   2) It modifies sequence numbers off the secondary to match what the
  primary did when it created the stream.
   3) Comparison is on a per-stream basis so that the order of unrelated
  packets doesn't cause a miscompare.

Dave

  Another reason is
  that using other userspace tools will affect the performance, the
  context switch between kernel and userspace may be an overhead.
 
 We can use 100% time of this process but looks like your RFC of filter
 just did it in iothread?
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Stefan Hajnoczi

On Tue, Jul 21, 2015 at 10:49:29AM +0100, Stefan Hajnoczi wrote:
 On Tue, Jul 21, 2015 at 08:13:42AM +0200, Jan Kiszka wrote:
  On 2015-07-20 17:01, Stefan Hajnoczi wrote:
   On Mon, Jul 20, 2015 at 2:12 PM, Vasiliy Tolstov v.tols...@selfip.ru 
   wrote:
   2015-07-20 14:55 GMT+03:00 zhanghailiang 
   zhang.zhanghaili...@huawei.com:
   Agreed, besides, it is seemed that slirp is not supporting ipv6, we also
   have to supplement it.
  
  
   patch for ipv6 slirp support some times ago sended to qemu list, but i
   don't know why in not accepted.
   
   I think no one reviewed it but there was no objection against IPv6
   support in principle.
   
   Jan: Can we merge slirp IPv6 support for QEMU 2.5?
  
  Sorry, as I pointed out some time back, I don't have the bandwidth to
  look into slirp. Someone need to do a review, then send a pull request.
 
 Do you want to remove yourself from the slirp section of the MAINTAINERS
 file?
 
 Going forward we'll need to find someone familiar with the QEMU
 development process and with enough time to review slirp patches.

Ping?

I hoped this would raise some discussion and that maybe we could find a
new maintainer or co-maintainer to get slirp moving.

Any thoughts?

Stefan


pgpCgURJZCHOu.pgp
Description: PGP signature

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Dr. David Alan Gilbert

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:
Hi Jason,

On 07/24/2015 10:12 AM, Jason Wang wrote:

On 07/24/2015 10:04 AM, Dong, Eddie wrote:
Hi Stefan:
Thanks for your comments!

On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
We are planning to implement colo-proxy in qemu to cache and compare
packets.

I thought there is a kernel module to do that?
Yes, that is the previous solution the COLO sub-community choose to go,
but we realized it might be not the best choices, and thus we want to
bring discussion back here :) More comments are welcome.

Hi:

Could you pls describe more details on this decision? What's the reason
that you realize it was not the best choice?

Below is my opinion:

That change is tiny though, so I don't think the change to the kernel
is a big issue (but I'm not a netfilter guy).

(For those following, the patch is:
https://github.com/coloft/colo-proxy/blob/master/patch4kernel/0001-colo-patch-for-kernel.patch
)
The comparison modules are bigger though, but still not massive.

2. We need to recompile iptables/nftables to use together with the colo-proxy
kernel module.

Again, the changes to iptables are small; so I don't think this should
influence it too much.

The bigger problem shown by 12 is that these changes are single-use - just for
COLO, which does make it a little harder to justify.

Yes, and that bit is HARD - it took me quite a while to get it right; however,
we'll still need to forward packets between primary and secondary, and all that
hard setup should get rolled into something like libvirt, so perhaps it's not
really
that bad for the user in the end.

You can refer to http://wiki.qemu.org/Features/COLO
to see the network topology and the steps to setup an env.

Overall, usability is the most important factor that impact our choice.

My biggest worry is your reliance on SLIRP for the TCP/IP stack; it
doesn't get much work done on it and I worry about it's reliability for
using it for the level of complexity you need.

Your current kernel implementation gets all the nf_conntrack stuff for free
which is very powerful.

I think at the moment I'd still prefer kernel space (especially since your
kernel
code now works pretty reliably!)

Dave
P.S. I think 'proxy' is still the right word to describe it rather than
'agency'.

Thanks
.

--
Thanks,
Yang.
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Jason Wang



On 07/27/2015 01:51 PM, Yang Hongyang wrote:
 On 07/27/2015 12:49 PM, Jason Wang wrote:


 On 07/27/2015 11:54 AM, Yang Hongyang wrote:


 On 07/27/2015 11:24 AM, Jason Wang wrote:


 On 07/24/2015 04:04 PM, Yang Hongyang wrote:
 Hi Jason,

 On 07/24/2015 10:12 AM, Jason Wang wrote:


 On 07/24/2015 10:04 AM, Dong, Eddie wrote:
 Hi Stefan:
   Thanks for your comments!

 On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
 We are planning to implement colo-proxy in qemu to cache and
 compare
 packets.

 I thought there is a kernel module to do that?
   Yes, that is the previous solution the COLO sub-community
 choose
 to go, but we realized it might be not the best choices, and
 thus we
 want to bring discussion back here :)  More comments are welcome.


 Hi:

 Could you pls describe more details on this decision? What's the
 reason
 that you realize it was not the best choice?

 Below is my opinion:

 We realized that there're disadvantages do it in kernel spaces:
 1. We need to recompile kernel: the colo-proxy kernel module is
  implemented as a nf conntrack extension. Adding a extension
 need to
  modify the extension struct in-kernel, so recompile kernel is
 needed.

 There's no need to do all in kernel, you can use a separate process to
 do the comparing and trigger the state sync through monitor.

 I don't get it, colo-proxy kernel module using a kthread do the
 comparing and
 trigger the state sync. We implemented it as a nf conntrack extension
 module,
 so we need to extend the extension struct in-kernel, although it just
 needs
 few lines changes to kernel, but a recompile of kernel is needed.
 Are you
 talking about not implement it as a nf conntrack extension?

 Yes, I mean implement the comparing in userspace but not in qemu.

 Yes, it is an alternative, that requires other components such as
 netfilter userspace tools, it will add the complexity I think, we
 wanted to implement a simple solution in QEMU. Another reason is
 that using other userspace tools will affect the performance, the
 context switch between kernel and userspace may be an overhead.




 2. We need to recompile iptables/nftables to use together with the
 colo-proxy
  kernel module.
 3. Need to configure primary host to forward input packets to
 secondary as
  well as configure secondary to forward output packets to primary
 host, the
  network topology and configuration is too complex for a regular
 user.


 You can use current kernel primitives to mirror the traffic of both
 PVM
 and SVM to another process without any modification of kernel. And
 qemu
 can offload all network configuration to management in this case.  And
 what's more import, this works for vhost. Filtering in qemu won't work
 for vhost.

 We are using tc to mirror/forward packets now. Implement in QEMU do
 have some
 limits, but there're also limits in kernel, if the packet do not pass
 the host kernel TCP/IP stack, such as vhost-user.

 But the limits are much less than userspace, no? For vhost-user, maybe
 we could extend the backed to mirror the traffic also.

 IMO the limits are more or less. Besides, for mirror/forward packets,
 using tc requires a separate physical nic or a vlan, the nic should not
 be used for other purpose. if we implement it in QEMU, using an socket
 connection to forward packets, we no longer need an separate nic, it will
 reduce the network topology complexity.

It depends on how do you design your user space. If you want using
userspace to forward the packet, you can 1) use packet socket to capture
all traffic on the tap that is used by VM 2) mirror the traffic to a new
tap device, the user space can then read all traffic from this new tap
device.

Re: [Qemu-devel] [POC] colo-proxy in qemu


On 07/27/2015 03:37 PM, Jason Wang wrote:



On 07/27/2015 01:51 PM, Yang Hongyang wrote:

On 07/27/2015 12:49 PM, Jason Wang wrote:



On 07/27/2015 11:54 AM, Yang Hongyang wrote:



On 07/27/2015 11:24 AM, Jason Wang wrote:



On 07/24/2015 04:04 PM, Yang Hongyang wrote:

Hi Jason,

On 07/24/2015 10:12 AM, Jason Wang wrote:



On 07/24/2015 10:04 AM, Dong, Eddie wrote:

Hi Stefan:
   Thanks for your comments!


On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:

We are planning to implement colo-proxy in qemu to cache and
compare

packets.

I thought there is a kernel module to do that?

   Yes, that is the previous solution the COLO sub-community
choose
to go, but we realized it might be not the best choices, and
thus we
want to bring discussion back here :)  More comments are welcome.



Hi:

Could you pls describe more details on this decision? What's the
reason
that you realize it was not the best choice?


Below is my opinion:

We realized that there're disadvantages do it in kernel spaces:
1. We need to recompile kernel: the colo-proxy kernel module is
  implemented as a nf conntrack extension. Adding a extension
need to
  modify the extension struct in-kernel, so recompile kernel is
needed.


There's no need to do all in kernel, you can use a separate process to
do the comparing and trigger the state sync through monitor.


I don't get it, colo-proxy kernel module using a kthread do the
comparing and
trigger the state sync. We implemented it as a nf conntrack extension
module,
so we need to extend the extension struct in-kernel, although it just
needs
few lines changes to kernel, but a recompile of kernel is needed.
Are you
talking about not implement it as a nf conntrack extension?


Yes, I mean implement the comparing in userspace but not in qemu.


Yes, it is an alternative, that requires other components such as
netfilter userspace tools, it will add the complexity I think, we
wanted to implement a simple solution in QEMU. Another reason is
that using other userspace tools will affect the performance, the
context switch between kernel and userspace may be an overhead.








2. We need to recompile iptables/nftables to use together with the
colo-proxy
  kernel module.
3. Need to configure primary host to forward input packets to
secondary as
  well as configure secondary to forward output packets to primary
host, the
  network topology and configuration is too complex for a regular
user.



You can use current kernel primitives to mirror the traffic of both
PVM
and SVM to another process without any modification of kernel. And
qemu
can offload all network configuration to management in this case.  And
what's more import, this works for vhost. Filtering in qemu won't work
for vhost.


We are using tc to mirror/forward packets now. Implement in QEMU do
have some
limits, but there're also limits in kernel, if the packet do not pass
the host kernel TCP/IP stack, such as vhost-user.


But the limits are much less than userspace, no? For vhost-user, maybe
we could extend the backed to mirror the traffic also.


IMO the limits are more or less. Besides, for mirror/forward packets,
using tc requires a separate physical nic or a vlan, the nic should not
be used for other purpose. if we implement it in QEMU, using an socket
connection to forward packets, we no longer need an separate nic, it will
reduce the network topology complexity.


It depends on how do you design your user space. If you want using
userspace to forward the packet, you can 1) use packet socket to capture
all traffic on the tap that is used by VM 2) mirror the traffic to a new
tap device, the user space can then read all traffic from this new tap
device.


Yes, but we can also do it in QEMU space, right? This will make life easier
because we do all in one solution within QEMU.



.



--
Thanks,
Yang.

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Jason Wang



On 07/27/2015 01:51 PM, Yang Hongyang wrote:
 On 07/27/2015 12:49 PM, Jason Wang wrote:


 On 07/27/2015 11:54 AM, Yang Hongyang wrote:


 On 07/27/2015 11:24 AM, Jason Wang wrote:


 On 07/24/2015 04:04 PM, Yang Hongyang wrote:
 Hi Jason,

 On 07/24/2015 10:12 AM, Jason Wang wrote:


 On 07/24/2015 10:04 AM, Dong, Eddie wrote:
 Hi Stefan:
   Thanks for your comments!

 On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
 We are planning to implement colo-proxy in qemu to cache and
 compare
 packets.

 I thought there is a kernel module to do that?
   Yes, that is the previous solution the COLO sub-community
 choose
 to go, but we realized it might be not the best choices, and
 thus we
 want to bring discussion back here :)  More comments are welcome.


 Hi:

 Could you pls describe more details on this decision? What's the
 reason
 that you realize it was not the best choice?

 Below is my opinion:

 We realized that there're disadvantages do it in kernel spaces:
 1. We need to recompile kernel: the colo-proxy kernel module is
  implemented as a nf conntrack extension. Adding a extension
 need to
  modify the extension struct in-kernel, so recompile kernel is
 needed.

 There's no need to do all in kernel, you can use a separate process to
 do the comparing and trigger the state sync through monitor.

 I don't get it, colo-proxy kernel module using a kthread do the
 comparing and
 trigger the state sync. We implemented it as a nf conntrack extension
 module,
 so we need to extend the extension struct in-kernel, although it just
 needs
 few lines changes to kernel, but a recompile of kernel is needed.
 Are you
 talking about not implement it as a nf conntrack extension?

 Yes, I mean implement the comparing in userspace but not in qemu.

 Yes, it is an alternative, that requires other components such as
 netfilter userspace tools, it will add the complexity I think, we
 wanted to implement a simple solution in QEMU.

I didn't get the point that why netfilter is needed? Do you mean the
packet comparing needs to be stateful?

 Another reason is
 that using other userspace tools will affect the performance, the
 context switch between kernel and userspace may be an overhead.

We can use 100% time of this process but looks like your RFC of filter
just did it in iothread?

Re: [Qemu-devel] [POC] colo-proxy in qemu

2015-07-27 Thread Jason Wang



On 07/27/2015 03:49 PM, Yang Hongyang wrote:
 On 07/27/2015 03:37 PM, Jason Wang wrote:


 On 07/27/2015 01:51 PM, Yang Hongyang wrote:
 On 07/27/2015 12:49 PM, Jason Wang wrote:


 On 07/27/2015 11:54 AM, Yang Hongyang wrote:


 On 07/27/2015 11:24 AM, Jason Wang wrote:


 On 07/24/2015 04:04 PM, Yang Hongyang wrote:
 Hi Jason,

 On 07/24/2015 10:12 AM, Jason Wang wrote:


 On 07/24/2015 10:04 AM, Dong, Eddie wrote:
 Hi Stefan:
Thanks for your comments!

 On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
 We are planning to implement colo-proxy in qemu to cache and
 compare
 packets.

 I thought there is a kernel module to do that?
Yes, that is the previous solution the COLO sub-community
 choose
 to go, but we realized it might be not the best choices, and
 thus we
 want to bring discussion back here :)  More comments are welcome.


 Hi:

 Could you pls describe more details on this decision? What's the
 reason
 that you realize it was not the best choice?

 Below is my opinion:

 We realized that there're disadvantages do it in kernel spaces:
 1. We need to recompile kernel: the colo-proxy kernel module is
   implemented as a nf conntrack extension. Adding a extension
 need to
   modify the extension struct in-kernel, so recompile kernel is
 needed.

 There's no need to do all in kernel, you can use a separate
 process to
 do the comparing and trigger the state sync through monitor.

 I don't get it, colo-proxy kernel module using a kthread do the
 comparing and
 trigger the state sync. We implemented it as a nf conntrack extension
 module,
 so we need to extend the extension struct in-kernel, although it just
 needs
 few lines changes to kernel, but a recompile of kernel is needed.
 Are you
 talking about not implement it as a nf conntrack extension?

 Yes, I mean implement the comparing in userspace but not in qemu.

 Yes, it is an alternative, that requires other components such as
 netfilter userspace tools, it will add the complexity I think, we
 wanted to implement a simple solution in QEMU. Another reason is
 that using other userspace tools will affect the performance, the
 context switch between kernel and userspace may be an overhead.




 2. We need to recompile iptables/nftables to use together with the
 colo-proxy
   kernel module.
 3. Need to configure primary host to forward input packets to
 secondary as
   well as configure secondary to forward output packets to
 primary
 host, the
   network topology and configuration is too complex for a
 regular
 user.


 You can use current kernel primitives to mirror the traffic of both
 PVM
 and SVM to another process without any modification of kernel. And
 qemu
 can offload all network configuration to management in this
 case.  And
 what's more import, this works for vhost. Filtering in qemu won't
 work
 for vhost.

 We are using tc to mirror/forward packets now. Implement in QEMU do
 have some
 limits, but there're also limits in kernel, if the packet do not pass
 the host kernel TCP/IP stack, such as vhost-user.

 But the limits are much less than userspace, no? For vhost-user, maybe
 we could extend the backed to mirror the traffic also.

 IMO the limits are more or less. Besides, for mirror/forward packets,
 using tc requires a separate physical nic or a vlan, the nic should not
 be used for other purpose. if we implement it in QEMU, using an socket
 connection to forward packets, we no longer need an separate nic, it
 will
 reduce the network topology complexity.

 It depends on how do you design your user space. If you want using
 userspace to forward the packet, you can 1) use packet socket to capture
 all traffic on the tap that is used by VM 2) mirror the traffic to a new
 tap device, the user space can then read all traffic from this new tap
 device.

 Yes, but we can also do it in QEMU space, right? 

Right.

 This will make life easier
 because we do all in one solution within QEMU.

But I'm not sure qemu is the right place to do this as you mention that
it needs userspace protocol stack support.



 .

Re: [Qemu-devel] [POC] colo-proxy in qemu


On 07/27/2015 04:06 PM, Jason Wang wrote:



On 07/27/2015 03:49 PM, Yang Hongyang wrote:

On 07/27/2015 03:37 PM, Jason Wang wrote:



On 07/27/2015 01:51 PM, Yang Hongyang wrote:

On 07/27/2015 12:49 PM, Jason Wang wrote:



On 07/27/2015 11:54 AM, Yang Hongyang wrote:



On 07/27/2015 11:24 AM, Jason Wang wrote:



On 07/24/2015 04:04 PM, Yang Hongyang wrote:

Hi Jason,

On 07/24/2015 10:12 AM, Jason Wang wrote:



On 07/24/2015 10:04 AM, Dong, Eddie wrote:

Hi Stefan:
Thanks for your comments!


On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:

We are planning to implement colo-proxy in qemu to cache and
compare

packets.

I thought there is a kernel module to do that?

Yes, that is the previous solution the COLO sub-community
choose
to go, but we realized it might be not the best choices, and
thus we
want to bring discussion back here :)  More comments are welcome.



Hi:

Could you pls describe more details on this decision? What's the
reason
that you realize it was not the best choice?


Below is my opinion:

We realized that there're disadvantages do it in kernel spaces:
1. We need to recompile kernel: the colo-proxy kernel module is
   implemented as a nf conntrack extension. Adding a extension
need to
   modify the extension struct in-kernel, so recompile kernel is
needed.


There's no need to do all in kernel, you can use a separate
process to
do the comparing and trigger the state sync through monitor.


I don't get it, colo-proxy kernel module using a kthread do the
comparing and
trigger the state sync. We implemented it as a nf conntrack extension
module,
so we need to extend the extension struct in-kernel, although it just
needs
few lines changes to kernel, but a recompile of kernel is needed.
Are you
talking about not implement it as a nf conntrack extension?


Yes, I mean implement the comparing in userspace but not in qemu.


Yes, it is an alternative, that requires other components such as
netfilter userspace tools, it will add the complexity I think, we
wanted to implement a simple solution in QEMU. Another reason is
that using other userspace tools will affect the performance, the
context switch between kernel and userspace may be an overhead.








2. We need to recompile iptables/nftables to use together with the
colo-proxy
   kernel module.
3. Need to configure primary host to forward input packets to
secondary as
   well as configure secondary to forward output packets to
primary
host, the
   network topology and configuration is too complex for a
regular
user.



You can use current kernel primitives to mirror the traffic of both
PVM
and SVM to another process without any modification of kernel. And
qemu
can offload all network configuration to management in this
case.  And
what's more import, this works for vhost. Filtering in qemu won't
work
for vhost.


We are using tc to mirror/forward packets now. Implement in QEMU do
have some
limits, but there're also limits in kernel, if the packet do not pass
the host kernel TCP/IP stack, such as vhost-user.


But the limits are much less than userspace, no? For vhost-user, maybe
we could extend the backed to mirror the traffic also.


IMO the limits are more or less. Besides, for mirror/forward packets,
using tc requires a separate physical nic or a vlan, the nic should not
be used for other purpose. if we implement it in QEMU, using an socket
connection to forward packets, we no longer need an separate nic, it
will
reduce the network topology complexity.


It depends on how do you design your user space. If you want using
userspace to forward the packet, you can 1) use packet socket to capture
all traffic on the tap that is used by VM 2) mirror the traffic to a new
tap device, the user space can then read all traffic from this new tap
device.


Yes, but we can also do it in QEMU space, right?


Right.


This will make life easier
because we do all in one solution within QEMU.


But I'm not sure qemu is the right place to do this as you mention that
it needs userspace protocol stack support.


We only need some simple features like defragment of TCP packets, analyze
TCP headers, since QEMU has a slirp userspace protocol stack, that should
not be a big deal.







.





.



--
Thanks,
Yang.

Re: [Qemu-devel] [POC] colo-proxy in qemu