Re: [PATCH RFC 07/15] migration: Introduce postcopy channels on dest node

2022-02-08 Thread Peter Xu
On Tue, Feb 08, 2022 at 09:43:49AM +, Dr. David Alan Gilbert wrote:
> > It'll be cleaned up later here:
> > 
> >   loadvm_postcopy_handle_listen
> > postcopy_ram_incoming_setup
> >   postcopy_temp_pages_setup
> > postcopy_ram_incoming_cleanup  <-- if fail above, go here
> >   postcopy_temp_pages_cleanup
> 
> Ah OK, it might still be worth a comment.

Will do.

-- 
Peter Xu




Re: [PATCH RFC 07/15] migration: Introduce postcopy channels on dest node

2022-02-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> On Thu, Feb 03, 2022 at 03:08:39PM +, Dr. David Alan Gilbert wrote:
> > * Peter Xu (pet...@redhat.com) wrote:
> > > Postcopy handles huge pages in a special way that currently we can only 
> > > have
> > > one "channel" to transfer the page.
> > > 
> > > It's because when we install pages using UFFDIO_COPY, we need to have the 
> > > whole
> > > huge page ready, it also means we need to have a temp huge page when 
> > > trying to
> > > receive the whole content of the page.
> > > 
> > > Currently all maintainance around this tmp page is global: firstly we'll
> > > allocate a temp huge page, then we maintain its status mostly within
> > > ram_load_postcopy().
> > > 
> > > To enable multiple channels for postcopy, the first thing we need to do 
> > > is to
> > > prepare N temp huge pages as caching, one for each channel.
> > > 
> > > Meanwhile we need to maintain the tmp huge page status per-channel too.
> > > 
> > > To give some example, some local variables maintained in 
> > > ram_load_postcopy()
> > > are listed; they are responsible for maintaining temp huge page status:
> > > 
> > >   - all_zero: this keeps whether this huge page contains all zeros
> > >   - target_pages: this counts how many target pages have been copied
> > >   - host_page:this keeps the host ptr for the page to install
> > > 
> > > Move all these fields to be together with the temp huge pages to form a 
> > > new
> > > structure called PostcopyTmpPage.  Then for each (future) postcopy 
> > > channel, we
> > > need one structure to keep the state around.
> > > 
> > > For vanilla postcopy, obviously there's only one channel.  It contains 
> > > both
> > > precopy and postcopy pages.
> > > 
> > > This patch teaches the dest migration node to start realize the possible 
> > > number
> > > of postcopy channels by introducing the "postcopy_channels" variable.  Its
> > > value is calculated when setup postcopy on dest node (during 
> > > POSTCOPY_LISTEN
> > > phase).
> > > 
> > > Vanilla postcopy will have channels=1, but when postcopy-preempt 
> > > capability is
> > > enabled (in the future), we will boost it to 2 because even during partial
> > > sending of a precopy huge page we still want to preempt it and start 
> > > sending
> > > the postcopy requested page right away (so we start to keep two temp huge
> > > pages; more if we want to enable multifd).  In this patch there's a TODO 
> > > marked
> > > for that; so far the channels is always set to 1.
> > > 
> > > We need to send one "host huge page" on one channel only and we cannot 
> > > split
> > > them, because otherwise the data upon the same huge page can locate on 
> > > more
> > > than one channel so we need more complicated logic to manage.  One temp 
> > > host
> > > huge page for each channel will be enough for us for now.
> > > 
> > > Postcopy will still always use the index=0 huge page even after this 
> > > patch.
> > > However it prepares for the latter patches where it can start to use 
> > > multiple
> > > channels (which needs src intervention, because only src knows which 
> > > channel we
> > > should use).
> > 
> > Generally OK, some minor nits.
> > 
> > > Signed-off-by: Peter Xu 
> > > ---
> > >  migration/migration.h| 35 +++-
> > >  migration/postcopy-ram.c | 50 +---
> > >  migration/ram.c  | 43 +-
> > >  3 files changed, 91 insertions(+), 37 deletions(-)
> > > 
> > > diff --git a/migration/migration.h b/migration/migration.h
> > > index 8130b703eb..8bb2931312 100644
> > > --- a/migration/migration.h
> > > +++ b/migration/migration.h
> > > @@ -45,6 +45,24 @@ struct PostcopyBlocktimeContext;
> > >   */
> > >  #define CLEAR_BITMAP_SHIFT_MAX31
> > >  
> > > +/* This is an abstraction of a "temp huge page" for postcopy's purpose */
> > > +typedef struct {
> > > +/*
> > > + * This points to a temporary huge page as a buffer for UFFDIO_COPY. 
> > >  It's
> > > + * mmap()ed and needs to be freed when cleanup.
> > > + */
> > > +void *tmp_huge_page;
> > > +/*
> > > + * This points to the host page we're going to install for this temp 
> > > page.
> > > + * It tells us after we've received the whole page, where we should 
> > > put it.
> > > + */
> > > +void *host_addr;
> > > +/* Number of small pages copied (in size of TARGET_PAGE_SIZE) */
> > > +int target_pages;
> > 
> > Can we take the opportunity to convert this to an unsigned?
> 
> Sure.
> 
> > 
> > > +/* Whether this page contains all zeros */
> > > +bool all_zero;
> > > +} PostcopyTmpPage;
> > > +
> > >  /* State for the incoming migration */
> > >  struct MigrationIncomingState {
> > >  QEMUFile *from_src_file;
> > > @@ -81,7 +99,22 @@ struct MigrationIncomingState {
> > >  QemuMutex rp_mutex;/* We send replies from multiple threads */
> > >  /* RAMBlock of last request sent 

Re: [PATCH RFC 07/15] migration: Introduce postcopy channels on dest node

2022-02-07 Thread Peter Xu
On Thu, Feb 03, 2022 at 03:08:39PM +, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > Postcopy handles huge pages in a special way that currently we can only have
> > one "channel" to transfer the page.
> > 
> > It's because when we install pages using UFFDIO_COPY, we need to have the 
> > whole
> > huge page ready, it also means we need to have a temp huge page when trying 
> > to
> > receive the whole content of the page.
> > 
> > Currently all maintainance around this tmp page is global: firstly we'll
> > allocate a temp huge page, then we maintain its status mostly within
> > ram_load_postcopy().
> > 
> > To enable multiple channels for postcopy, the first thing we need to do is 
> > to
> > prepare N temp huge pages as caching, one for each channel.
> > 
> > Meanwhile we need to maintain the tmp huge page status per-channel too.
> > 
> > To give some example, some local variables maintained in ram_load_postcopy()
> > are listed; they are responsible for maintaining temp huge page status:
> > 
> >   - all_zero: this keeps whether this huge page contains all zeros
> >   - target_pages: this counts how many target pages have been copied
> >   - host_page:this keeps the host ptr for the page to install
> > 
> > Move all these fields to be together with the temp huge pages to form a new
> > structure called PostcopyTmpPage.  Then for each (future) postcopy channel, 
> > we
> > need one structure to keep the state around.
> > 
> > For vanilla postcopy, obviously there's only one channel.  It contains both
> > precopy and postcopy pages.
> > 
> > This patch teaches the dest migration node to start realize the possible 
> > number
> > of postcopy channels by introducing the "postcopy_channels" variable.  Its
> > value is calculated when setup postcopy on dest node (during POSTCOPY_LISTEN
> > phase).
> > 
> > Vanilla postcopy will have channels=1, but when postcopy-preempt capability 
> > is
> > enabled (in the future), we will boost it to 2 because even during partial
> > sending of a precopy huge page we still want to preempt it and start sending
> > the postcopy requested page right away (so we start to keep two temp huge
> > pages; more if we want to enable multifd).  In this patch there's a TODO 
> > marked
> > for that; so far the channels is always set to 1.
> > 
> > We need to send one "host huge page" on one channel only and we cannot split
> > them, because otherwise the data upon the same huge page can locate on more
> > than one channel so we need more complicated logic to manage.  One temp host
> > huge page for each channel will be enough for us for now.
> > 
> > Postcopy will still always use the index=0 huge page even after this patch.
> > However it prepares for the latter patches where it can start to use 
> > multiple
> > channels (which needs src intervention, because only src knows which 
> > channel we
> > should use).
> 
> Generally OK, some minor nits.
> 
> > Signed-off-by: Peter Xu 
> > ---
> >  migration/migration.h| 35 +++-
> >  migration/postcopy-ram.c | 50 +---
> >  migration/ram.c  | 43 +-
> >  3 files changed, 91 insertions(+), 37 deletions(-)
> > 
> > diff --git a/migration/migration.h b/migration/migration.h
> > index 8130b703eb..8bb2931312 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -45,6 +45,24 @@ struct PostcopyBlocktimeContext;
> >   */
> >  #define CLEAR_BITMAP_SHIFT_MAX31
> >  
> > +/* This is an abstraction of a "temp huge page" for postcopy's purpose */
> > +typedef struct {
> > +/*
> > + * This points to a temporary huge page as a buffer for UFFDIO_COPY.  
> > It's
> > + * mmap()ed and needs to be freed when cleanup.
> > + */
> > +void *tmp_huge_page;
> > +/*
> > + * This points to the host page we're going to install for this temp 
> > page.
> > + * It tells us after we've received the whole page, where we should 
> > put it.
> > + */
> > +void *host_addr;
> > +/* Number of small pages copied (in size of TARGET_PAGE_SIZE) */
> > +int target_pages;
> 
> Can we take the opportunity to convert this to an unsigned?

Sure.

> 
> > +/* Whether this page contains all zeros */
> > +bool all_zero;
> > +} PostcopyTmpPage;
> > +
> >  /* State for the incoming migration */
> >  struct MigrationIncomingState {
> >  QEMUFile *from_src_file;
> > @@ -81,7 +99,22 @@ struct MigrationIncomingState {
> >  QemuMutex rp_mutex;/* We send replies from multiple threads */
> >  /* RAMBlock of last request sent to source */
> >  RAMBlock *last_rb;
> > -void *postcopy_tmp_page;
> > +/*
> > + * Number of postcopy channels including the default precopy channel, 
> > so
> > + * vanilla postcopy will only contain one channel which contain both
> > + * precopy and postcopy streams.
> > + *
> > + * This is 

Re: [PATCH RFC 07/15] migration: Introduce postcopy channels on dest node

2022-02-03 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> Postcopy handles huge pages in a special way that currently we can only have
> one "channel" to transfer the page.
> 
> It's because when we install pages using UFFDIO_COPY, we need to have the 
> whole
> huge page ready, it also means we need to have a temp huge page when trying to
> receive the whole content of the page.
> 
> Currently all maintainance around this tmp page is global: firstly we'll
> allocate a temp huge page, then we maintain its status mostly within
> ram_load_postcopy().
> 
> To enable multiple channels for postcopy, the first thing we need to do is to
> prepare N temp huge pages as caching, one for each channel.
> 
> Meanwhile we need to maintain the tmp huge page status per-channel too.
> 
> To give some example, some local variables maintained in ram_load_postcopy()
> are listed; they are responsible for maintaining temp huge page status:
> 
>   - all_zero: this keeps whether this huge page contains all zeros
>   - target_pages: this counts how many target pages have been copied
>   - host_page:this keeps the host ptr for the page to install
> 
> Move all these fields to be together with the temp huge pages to form a new
> structure called PostcopyTmpPage.  Then for each (future) postcopy channel, we
> need one structure to keep the state around.
> 
> For vanilla postcopy, obviously there's only one channel.  It contains both
> precopy and postcopy pages.
> 
> This patch teaches the dest migration node to start realize the possible 
> number
> of postcopy channels by introducing the "postcopy_channels" variable.  Its
> value is calculated when setup postcopy on dest node (during POSTCOPY_LISTEN
> phase).
> 
> Vanilla postcopy will have channels=1, but when postcopy-preempt capability is
> enabled (in the future), we will boost it to 2 because even during partial
> sending of a precopy huge page we still want to preempt it and start sending
> the postcopy requested page right away (so we start to keep two temp huge
> pages; more if we want to enable multifd).  In this patch there's a TODO 
> marked
> for that; so far the channels is always set to 1.
> 
> We need to send one "host huge page" on one channel only and we cannot split
> them, because otherwise the data upon the same huge page can locate on more
> than one channel so we need more complicated logic to manage.  One temp host
> huge page for each channel will be enough for us for now.
> 
> Postcopy will still always use the index=0 huge page even after this patch.
> However it prepares for the latter patches where it can start to use multiple
> channels (which needs src intervention, because only src knows which channel 
> we
> should use).

Generally OK, some minor nits.

> Signed-off-by: Peter Xu 
> ---
>  migration/migration.h| 35 +++-
>  migration/postcopy-ram.c | 50 +---
>  migration/ram.c  | 43 +-
>  3 files changed, 91 insertions(+), 37 deletions(-)
> 
> diff --git a/migration/migration.h b/migration/migration.h
> index 8130b703eb..8bb2931312 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -45,6 +45,24 @@ struct PostcopyBlocktimeContext;
>   */
>  #define CLEAR_BITMAP_SHIFT_MAX31
>  
> +/* This is an abstraction of a "temp huge page" for postcopy's purpose */
> +typedef struct {
> +/*
> + * This points to a temporary huge page as a buffer for UFFDIO_COPY.  
> It's
> + * mmap()ed and needs to be freed when cleanup.
> + */
> +void *tmp_huge_page;
> +/*
> + * This points to the host page we're going to install for this temp 
> page.
> + * It tells us after we've received the whole page, where we should put 
> it.
> + */
> +void *host_addr;
> +/* Number of small pages copied (in size of TARGET_PAGE_SIZE) */
> +int target_pages;

Can we take the opportunity to convert this to an unsigned?

> +/* Whether this page contains all zeros */
> +bool all_zero;
> +} PostcopyTmpPage;
> +
>  /* State for the incoming migration */
>  struct MigrationIncomingState {
>  QEMUFile *from_src_file;
> @@ -81,7 +99,22 @@ struct MigrationIncomingState {
>  QemuMutex rp_mutex;/* We send replies from multiple threads */
>  /* RAMBlock of last request sent to source */
>  RAMBlock *last_rb;
> -void *postcopy_tmp_page;
> +/*
> + * Number of postcopy channels including the default precopy channel, so
> + * vanilla postcopy will only contain one channel which contain both
> + * precopy and postcopy streams.
> + *
> + * This is calculated when the src requests to enable postcopy but before
> + * it starts.  Its value can depend on e.g. whether postcopy preemption 
> is
> + * enabled.
> + */
> +int   postcopy_channels;

Also unsigned?

> +/*
> + * An array of temp host huge pages to be used, one for each postcopy
> + * channel.
> 

[PATCH RFC 07/15] migration: Introduce postcopy channels on dest node

2022-01-19 Thread Peter Xu
Postcopy handles huge pages in a special way that currently we can only have
one "channel" to transfer the page.

It's because when we install pages using UFFDIO_COPY, we need to have the whole
huge page ready, it also means we need to have a temp huge page when trying to
receive the whole content of the page.

Currently all maintainance around this tmp page is global: firstly we'll
allocate a temp huge page, then we maintain its status mostly within
ram_load_postcopy().

To enable multiple channels for postcopy, the first thing we need to do is to
prepare N temp huge pages as caching, one for each channel.

Meanwhile we need to maintain the tmp huge page status per-channel too.

To give some example, some local variables maintained in ram_load_postcopy()
are listed; they are responsible for maintaining temp huge page status:

  - all_zero: this keeps whether this huge page contains all zeros
  - target_pages: this counts how many target pages have been copied
  - host_page:this keeps the host ptr for the page to install

Move all these fields to be together with the temp huge pages to form a new
structure called PostcopyTmpPage.  Then for each (future) postcopy channel, we
need one structure to keep the state around.

For vanilla postcopy, obviously there's only one channel.  It contains both
precopy and postcopy pages.

This patch teaches the dest migration node to start realize the possible number
of postcopy channels by introducing the "postcopy_channels" variable.  Its
value is calculated when setup postcopy on dest node (during POSTCOPY_LISTEN
phase).

Vanilla postcopy will have channels=1, but when postcopy-preempt capability is
enabled (in the future), we will boost it to 2 because even during partial
sending of a precopy huge page we still want to preempt it and start sending
the postcopy requested page right away (so we start to keep two temp huge
pages; more if we want to enable multifd).  In this patch there's a TODO marked
for that; so far the channels is always set to 1.

We need to send one "host huge page" on one channel only and we cannot split
them, because otherwise the data upon the same huge page can locate on more
than one channel so we need more complicated logic to manage.  One temp host
huge page for each channel will be enough for us for now.

Postcopy will still always use the index=0 huge page even after this patch.
However it prepares for the latter patches where it can start to use multiple
channels (which needs src intervention, because only src knows which channel we
should use).

Signed-off-by: Peter Xu 
---
 migration/migration.h| 35 +++-
 migration/postcopy-ram.c | 50 +---
 migration/ram.c  | 43 +-
 3 files changed, 91 insertions(+), 37 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 8130b703eb..8bb2931312 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -45,6 +45,24 @@ struct PostcopyBlocktimeContext;
  */
 #define CLEAR_BITMAP_SHIFT_MAX31
 
+/* This is an abstraction of a "temp huge page" for postcopy's purpose */
+typedef struct {
+/*
+ * This points to a temporary huge page as a buffer for UFFDIO_COPY.  It's
+ * mmap()ed and needs to be freed when cleanup.
+ */
+void *tmp_huge_page;
+/*
+ * This points to the host page we're going to install for this temp page.
+ * It tells us after we've received the whole page, where we should put it.
+ */
+void *host_addr;
+/* Number of small pages copied (in size of TARGET_PAGE_SIZE) */
+int target_pages;
+/* Whether this page contains all zeros */
+bool all_zero;
+} PostcopyTmpPage;
+
 /* State for the incoming migration */
 struct MigrationIncomingState {
 QEMUFile *from_src_file;
@@ -81,7 +99,22 @@ struct MigrationIncomingState {
 QemuMutex rp_mutex;/* We send replies from multiple threads */
 /* RAMBlock of last request sent to source */
 RAMBlock *last_rb;
-void *postcopy_tmp_page;
+/*
+ * Number of postcopy channels including the default precopy channel, so
+ * vanilla postcopy will only contain one channel which contain both
+ * precopy and postcopy streams.
+ *
+ * This is calculated when the src requests to enable postcopy but before
+ * it starts.  Its value can depend on e.g. whether postcopy preemption is
+ * enabled.
+ */
+int   postcopy_channels;
+/*
+ * An array of temp host huge pages to be used, one for each postcopy
+ * channel.
+ */
+PostcopyTmpPage *postcopy_tmp_pages;
+/* This is shared for all postcopy channels */
 void *postcopy_tmp_zero_page;
 /* PostCopyFD's for external userfaultfds & handlers of shared memory */
 GArray   *postcopy_remote_fds;
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index e662dd05cc..d78e1b9373 100644
---