On 14.05.2019 14:36, Maxime Coquelin wrote:
> On 5/14/19 12:58 PM, Ilya Maximets wrote:
>> On 14.05.2019 10:44, Maxime Coquelin wrote:
>>> On 5/8/19 3:54 PM, Ilya Maximets wrote:
>>>> From: Liliia Butorina <l.butor...@partner.samsung.com>
>>>>
>>>> Post-copy Live Migration for vHost supported since DPDK 18.11 and
>>>> QEMU 2.12. New global config option 'vhost-postcopy-support' added
>>>> to control this feature. Ex.:
>>>>
>>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-postcopy-support=true
>>>>
>>>> Changing this value requires restarting the daemon. It's safe to
>>>> enable this knob even if QEMU doesn't support post-copy LM.
>>>>
>>>> Feature marked as experimental and disabled by default because it may
>>>> cause PMD thread hang on destination host on page fault for the time
>>>> of page downloading from the source.
>>>>
>>>> Feature is not compatible with 'mlockall' and 'dequeue zero-copy'.
>>>> Support added only for vhost-user-client.
>>>>
>>>> Signed-off-by: Liliia Butorina <l.butor...@partner.samsung.com>
>>>> Co-authored-by: Ilya Maximets <i.maxim...@samsung.com>
>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
>>>
>>> Thanks Ilya & Liliia for taking care of that.
>>>
>>>
>>>> ---
>>>>    Documentation/topics/dpdk/vhost-user.rst | 53 +++++++++++++++++++++++-
>>>>    NEWS                                     |  1 +
>>>>    lib/dpdk-stub.c                          |  6 +++
>>>>    lib/dpdk.c                               | 13 ++++++
>>>>    lib/dpdk.h                               |  1 +
>>>>    lib/netdev-dpdk.c                        |  5 +++
>>>>    vswitchd/vswitch.xml                     | 16 +++++++
>>>>    7 files changed, 94 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
>>>> index 993797de5..6bec8c1f7 100644
>>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>>> @@ -111,7 +111,8 @@ the guest. There are two ways to do this: using QEMU directly, or using
>>>>    libvirt.
>>>>      .. note::
>>>> -   IOMMU is not supported with vhost-user ports.
>>>> +
>>>> +   IOMMU and Post-copy Live Migration are not supported with vhost-user ports.
>>>>      Adding vhost-user ports to the guest (QEMU)
>>>>    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> @@ -301,6 +302,52 @@ The default value is false.
>>>>        QEMU). Starting with QEMU v2.9.1, vhost-iommu-support can safely be
>>>>        enabled, even without having an IOMMU device, with no performance penalty.
>>>>    +vhost-user-client Post-copy Live Migration Support (experimental)
>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> +
>>>> +``Post-copy`` migration is the migration mode where the destination CPUs
>>>> +are started before all the memory has been transferred. The main advantage
>>>> +is the predictable migration time. It is mostly used as a second phase
>>>> +after the normal 'pre-copy' migration in case it takes too long to converge.
>>>> +
>>>> +More information can be found in QEMU `docs`_.
>>>> +
>>>> +.. _`docs`: https://git.qemu.org/?p=qemu.git;a=blob;f=docs/devel/migration.rst
>>>> +
>>>> +Post-copy support may be enabled via a global config value
>>>> +``vhost-postcopy-support``. Setting this to ``true`` enables Post-copy support
>>>> +for all vhost-user-client ports::
>>>> +
>>>> +    $ ovs-vsctl set Open_vSwitch . other_config:vhost-postcopy-support=true
>>>> +
>>>> +The default value is ``false``.
>>>> +
>>>> +.. important::
>>>> +
>>>> +    Changing this value requires restarting the daemon.
>>>> +
>>>> +.. important::
>>>> +
>>>> +    DPDK Post-copy migration mode uses the userfaultfd syscall to
>>>> +    communicate with the kernel about page fault handling and uses shared
>>>> +    memory based on huge pages. So the destination host Linux kernel must
>>>> +    support userfaultfd over shared hugetlbfs. This feature was only
>>>> +    introduced in upstream kernel version 4.11.
>>>> +
>>>> +    The Post-copy feature is supported in DPDK since version 18.11.0 and in
>>>> +    QEMU since version 2.12.0. However, QEMU >= 3.0.1 is suggested because
>>>> +    migration recovery was fixed for post-copy in 3.0 and a few additional
>>>> +    bug fixes (like a userfaultfd leak) were released in 3.0.1.
>>>> +
>>>> +    The DPDK Post-copy feature requires that guest memory is not populated
>>>> +    (the application must not call mlock* syscalls). So the mlockall and
>>>> +    dequeue zero-copy features are incompatible with the post-copy feature.
>>>> +
>>>> +    Note that during migration of a vhost-user device, PMD threads hang
>>>> +    while faulted pages are downloaded from the source host. Transferring a
>>>> +    1GB hugepage across a 10Gbps link is possibly unacceptably slow, so the
>>>> +    recommended hugepage size is 2MB.
>>>> +
>>>>    .. _dpdk-testpmd:
>>>>      DPDK in the Guest
>>>> @@ -500,6 +547,10 @@ QEMU versions v2.10 and greater). This value can be set like so::
>>>>      Because of this limitation, this feature is considered 'experimental'.
>>>>    +.. note::
>>>> +
>>>> +   Post-copy Live Migration is not compatible with dequeue zero copy.
>>>> +
>>>>    Further information can be found in the
>>>>    `DPDK documentation
>>>>    <https://doc.dpdk.org/guides-18.11/prog_guide/vhost_lib.html>`__
>>>> diff --git a/NEWS b/NEWS
>>>> index 293531db0..f1f6f074e 100644
>>>> --- a/NEWS
>>>> +++ b/NEWS
>>>> @@ -3,6 +3,7 @@ Post-v2.11.0
>>>>       - DPDK:
>>>>         * New option 'other_config:dpdk-socket-limit' to limit amount of
>>>>           hugepage memory that can be used by DPDK.
>>>> +     * Add support for vHost Post-copy Live Migration (experimental).
>>>>       - OpenFlow:
>>>>         * Removed support for OpenFlow 1.6 (draft), which ONF abandoned.
>>>>         * New action "check_pkt_larger".
>>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
>>>> index 1e0f46101..e55be5750 100644
>>>> --- a/lib/dpdk-stub.c
>>>> +++ b/lib/dpdk-stub.c
>>>> @@ -56,6 +56,12 @@ dpdk_vhost_iommu_enabled(void)
>>>>        return false;
>>>>    }
>>>>    +bool
>>>> +dpdk_vhost_postcopy_enabled(void)
>>>> +{
>>>> +    return false;
>>>> +}
>>>> +
>>>>    bool
>>>>    dpdk_per_port_memory(void)
>>>>    {
>>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>>> index dc6171546..d9ec3cf64 100644
>>>> --- a/lib/dpdk.c
>>>> +++ b/lib/dpdk.c
>>>> @@ -47,6 +47,8 @@ static FILE *log_stream = NULL;       /* Stream for DPDK log redirection */
>>>>      static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
>>>>    static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU support */
>>>> +static bool vhost_postcopy_enabled = false; /* Status of vHost POSTCOPY
>>>> +                                             * support. */
>>>>    static bool dpdk_initialized = false; /* Indicates successful initialization
>>>>                                           * of DPDK. */
>>>>    static bool per_port_memory = false; /* Status of per port memory support */
>>>> @@ -316,6 +318,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>>        VLOG_INFO("Per port memory for DPDK devices %s.",
>>>>                  per_port_memory ? "enabled" : "disabled");
>>>>    +    vhost_postcopy_enabled = smap_get_bool(ovs_other_config,
>>>> +                                           "vhost-postcopy-support", false);
>>>> +    VLOG_INFO("POSTCOPY support for vhost-user-client %s.",
>>>> +              vhost_postcopy_enabled ? "enabled" : "disabled");
>>>> +
>>>>        svec_add(&args, ovs_get_program_name());
>>>>        construct_dpdk_args(ovs_other_config, &args);
>>>>    @@ -492,6 +499,12 @@ dpdk_vhost_iommu_enabled(void)
>>>>        return vhost_iommu_enabled;
>>>>    }
>>>>    +bool
>>>> +dpdk_vhost_postcopy_enabled(void)
>>>> +{
>>>> +    return vhost_postcopy_enabled;
>>>> +}
>>>> +
>>>>    bool
>>>>    dpdk_per_port_memory(void)
>>>>    {
>>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>>>> index bbb89d4e6..7dab83775 100644
>>>> --- a/lib/dpdk.h
>>>> +++ b/lib/dpdk.h
>>>> @@ -39,6 +39,7 @@ void dpdk_init(const struct smap *ovs_other_config);
>>>>    void dpdk_set_lcore_id(unsigned cpu);
>>>>    const char *dpdk_get_vhost_sock_dir(void);
>>>>    bool dpdk_vhost_iommu_enabled(void);
>>>> +bool dpdk_vhost_postcopy_enabled(void);
>>>>    bool dpdk_per_port_memory(void);
>>>>    void print_dpdk_version(void);
>>>>    void dpdk_status(const struct ovsrec_open_vswitch *);
>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>>> index 47153dc60..c06f46931 100644
>>>> --- a/lib/netdev-dpdk.c
>>>> +++ b/lib/netdev-dpdk.c
>>>> @@ -4147,6 +4147,11 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>>>>                vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>>>            }
>>>>    +        /* Enable POSTCOPY support, if explicitly requested. */
>>>> +        if (dpdk_vhost_postcopy_enabled()) {
>>>> +            vhost_flags |= RTE_VHOST_USER_POSTCOPY_SUPPORT;
>>>> +        }
>>>
>>> Couldn't we also enforce disabling postcopy when the --mlockall option
>>> is passed?
>>
>> Sure. This could be done on 'dpdk_init' stage. I'll take care of this.
>>
>> OTOH, this will block the case where source has mlock, but destination 
>> doesn't.
> 
> I would have thought source doesn't need to advertise
> VHOST_USER_PROTOCOL_F_PAGEFAULT to have the postcopy migration to work.

Sure. Thanks. Missed that fact. So, it should work in theory.
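
To make the idea concrete, here is a rough sketch of the kind of check
that could be done at the 'dpdk_init' stage. It is not part of this patch:
the function name postcopy_effective() and the 'mlockall_enabled' argument
are hypothetical, and how 'dpdk_init' would learn about --mlockall is left
open. Real code would presumably log with VLOG_WARN rather than fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not in the patch): returns whether POSTCOPY support
 * should actually be enabled, given the value requested through
 * 'vhost-postcopy-support' and whether the process locks its memory. */
bool
postcopy_effective(bool requested, bool mlockall_enabled)
{
    if (requested && mlockall_enabled) {
        /* Post-copy requires that guest memory is not populated, which
         * conflicts with mlockall, so the request is overridden. */
        fprintf(stderr, "vhost-postcopy-support is not compatible with "
                        "'mlockall'. Disabling POSTCOPY support.\n");
        return false;
    }
    return requested;
}
```

If, as Maxime suggests, the source does not need to advertise
VHOST_USER_PROTOCOL_F_PAGEFAULT, then a check like this on a source running
with --mlockall should not prevent migration to a destination without it.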

> 
> I haven't tested it though, but in theory it shouldn't be needed.
>
>> In theory, it's a valid case. Not sure how practical it is. What do you 
>> think?
> Anyway, I don't think it would be problematic in practice.
> 
> Thanks,
> Maxime
> 
>> Best regards, Ilya Maximets.