On Wed, 2022-02-23 at 19:54 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:05 am:
> > VAS is a hardware engine stays on the chip. So when the partition
> > migrates, all VAS windows on the source system have to be closed
> > and reopen them on the destination after migration.
> >
> > This patch make changes to the current reconfig_open/close_windows
> > functions to support migration:
> > - Set VAS_WIN_MIGRATE_CLOSE to the window status when closes and
> >   reopen windows with the same status during resume.
> > - Continue to close all windows even if deallocate HCALL failed
> >   (should not happen) since no way to stop migration with the
> >   current LPM implementation.
>
> Hmm. pseries_migrate_partition *can* fail?

Yes, it can fail. If pseries_suspend() fails, all VAS windows are
reopened without migration. vas_migration_handler(VAS_RESUME) is
called whether pseries_suspend() returns 0 or not.

>
> > - If the DLPAR CPU event happens while migration is in progress,
> >   set VAS_WIN_NO_CRED_CLOSE to the window status. Close window
> >   happens with the first event (migration or DLPAR) and Reopen
> >   window happens only with the last event (migration or DLPAR).
>
> Can DLPAR happen while migration is in progress? Couldn't
> this cause your source and destination credits to go out of
> whack?

It should not. If a DLPAR event happens while migration is in
progress, the windows are closed in the hypervisor for migration (and
marked inactive in the OS with the migration status bit). For the
DLPAR event, the DLPAR closed status bit (VAS_WIN_NO_CRED_CLOSE) is
set on the necessary windows. After the migration, we reopen in the
hypervisor, and set active in the OS, only the windows that have just
the migration status. The remaining windows are reopened only after a
later DLPAR core add event.

Regarding the target credits on the destination, we get the new
capabilities after migration and use the new value for the reopen.

Example: I used the following test case:
- Configured 2 dedicated cores (40 credits) and executed a test
  which opened 35 windows (35 credits).
- Removed 1 core, leaving 20 credits available. So 15 windows were
  closed and marked with the DLPAR closed status.
- Migration start: closed the remaining 20 windows and set the
  migration status on all 35 windows.
- After migration: reopened the 20 windows that have only the
  migration status, and cleared the migration status on the
  remaining 15 windows.
- Added a core, which gives the system 20 more credits. So the
  remaining 15 windows, which have only the DLPAR closed status,
  were reopened.

>
> Why do you need two close window types, what if you finish
> LPM and just open as many as possible regardless how they
> are closed?

Two different status bits are added to support the DLPAR and LPM
closed status.
As I mentioned above, windows will be active only after both bits
are cleared.

Thanks
Haren

>
> Thanks,
> Nick
>
> > Signed-off-by: Haren Myneni <ha...@linux.ibm.com>
> > ---
> >  arch/powerpc/include/asm/vas.h       |  2 +
> >  arch/powerpc/platforms/pseries/vas.c | 88 ++++++++++++++++++++++
> > ------
> >  2 files changed, 73 insertions(+), 17 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/vas.h
> > b/arch/powerpc/include/asm/vas.h
> > index 6baf7b9ffed4..83afcb6c194b 100644
> > --- a/arch/powerpc/include/asm/vas.h
> > +++ b/arch/powerpc/include/asm/vas.h
> > @@ -36,6 +36,8 @@
> >  /* vas mmap() */
> >  /* Window is closed in the hypervisor due to lost credit */
> >  #define VAS_WIN_NO_CRED_CLOSE	0x00000001
> > +/* Window is closed due to migration */
> > +#define VAS_WIN_MIGRATE_CLOSE	0x00000002
> >
> >  /*
> >   * Get/Set bit fields
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index 3bb219f54806..fbcf311da0ec 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -457,11 +457,12 @@ static int vas_deallocate_window(struct
> > vas_window *vwin)
> >  	mutex_lock(&vas_pseries_mutex);
> >  	/*
> >  	 * VAS window is already closed in the hypervisor when
> > -	 * lost the credit. So just remove the entry from
> > -	 * the list, remove task references and free vas_window
> > +	 * lost the credit or with migration. So just remove the entry
> > +	 * from the list, remove task references and free vas_window
> >  	 * struct.
> >  	 */
> > -	if (win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) {
> > +	if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) &&
> > +		!(win->vas_win.status & VAS_WIN_MIGRATE_CLOSE)) {
> >  		rc = deallocate_free_window(win);
> >  		if (rc) {
> >  			mutex_unlock(&vas_pseries_mutex);
> > @@ -578,12 +579,14 @@ static int __init get_vas_capabilities(u8
> > feat, enum vas_cop_feat_type type,
> >   * by setting the remapping to new paste address if the window is
> >   * active.
> >   */
> > -static int reconfig_open_windows(struct vas_caps *vcaps, int
> > creds)
> > +static int reconfig_open_windows(struct vas_caps *vcaps, int
> > creds,
> > +				 bool migrate)
> >  {
> >  	long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
> >  	struct vas_cop_feat_caps *caps = &vcaps->caps;
> >  	struct pseries_vas_window *win = NULL, *tmp;
> >  	int rc, mv_ents = 0;
> > +	int flag;
> >
> >  	/*
> >  	 * Nothing to do if there are no closed windows.
> > @@ -602,8 +605,10 @@ static int reconfig_open_windows(struct
> > vas_caps *vcaps, int creds)
> >  	 * (dedicated). If 1 core is added, this LPAR can have 20 more
> >  	 * credits. It means the kernel can reopen 20 windows. So move
> >  	 * 20 entries in the VAS windows lost and reopen next 20
> > windows.
> > +	 * For partition migration, reopen all windows that are closed
> > +	 * during resume.
> >  	 */
> > -	if (vcaps->nr_close_wins > creds)
> > +	if ((vcaps->nr_close_wins > creds) && !migrate)
> >  		mv_ents = vcaps->nr_close_wins - creds;
> >
> >  	list_for_each_entry_safe(win, tmp, &vcaps->list, win_list) {
> > @@ -613,12 +618,35 @@ static int reconfig_open_windows(struct
> > vas_caps *vcaps, int creds)
> >  		mv_ents--;
> >  	}
> >
> > +	/*
> > +	 * Open windows if they are closed only with migration or
> > +	 * DLPAR (lost credit) before.
> > +	 */
> > +	if (migrate)
> > +		flag = VAS_WIN_MIGRATE_CLOSE;
> > +	else
> > +		flag = VAS_WIN_NO_CRED_CLOSE;
> > +
> >  	list_for_each_entry_safe_from(win, tmp, &vcaps->list, win_list)
> > {
> > +		/*
> > +		 * This window is closed with DLPAR and migration
> > events.
> > +		 * So reopen the window with the last event.
> > +		 * The user space is not suspended with the current
> > +		 * migration notifier. So the user space can issue
> > DLPAR
> > +		 * CPU hotplug while migration in progress. In this
> > case
> > +		 * this window will be opened with the last event.
> > +		 */
> > +		if ((win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) &&
> > +			(win->vas_win.status & VAS_WIN_MIGRATE_CLOSE))
> > {
> > +			win->vas_win.status &= ~flag;
> > +			continue;
> > +		}
> > +
> >  		/*
> >  		 * Nothing to do on this window if it is not closed
> > -		 * with VAS_WIN_NO_CRED_CLOSE
> > +		 * with this flag
> >  		 */
> > -		if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
> > +		if (!(win->vas_win.status & flag))
> >  			continue;
> >
> >  		rc = allocate_setup_window(win, (u64 *)&domain[0],
> > @@ -634,7 +662,7 @@ static int reconfig_open_windows(struct
> > vas_caps *vcaps, int creds)
> >  		/*
> >  		 * Set window status to active
> >  		 */
> > -		win->vas_win.status &= ~VAS_WIN_NO_CRED_CLOSE;
> > +		win->vas_win.status &= ~flag;
> >  		mutex_unlock(&win->vas_win.task_ref.mmap_mutex);
> >  		win->win_type = caps->win_type;
> >  		if (!--vcaps->nr_close_wins)
> > @@ -661,20 +689,32 @@ static int reconfig_open_windows(struct
> > vas_caps *vcaps, int creds)
> >   * the user space to fall back to SW compression and manage with
> > the
> >   * existing windows.
> >   */
> > -static int reconfig_close_windows(struct vas_caps *vcap, int
> > excess_creds)
> > +static int reconfig_close_windows(struct vas_caps *vcap, int
> > excess_creds,
> > +
> > bool migrate)
> >  {
> >  	struct pseries_vas_window *win, *tmp;
> >  	struct vas_user_win_ref *task_ref;
> >  	struct vm_area_struct *vma;
> > -	int rc = 0;
> > +	int rc = 0, flag;
> > +
> > +	if (migrate)
> > +		flag = VAS_WIN_MIGRATE_CLOSE;
> > +	else
> > +		flag = VAS_WIN_NO_CRED_CLOSE;
> >
> >  	list_for_each_entry_safe(win, tmp, &vcap->list, win_list) {
> >  		/*
> >  		 * This window is already closed due to lost credit
> > -		 * before. Go for next window.
> > +		 * or for migration before. Go for next window.
> > +		 * For migration, nothing to do since this window
> > +		 * closed for DLPAR and will be reopened even on
> > +		 * the destination system with other DLPAR operation.
> >  		 */
> > -		if (win->vas_win.status & VAS_WIN_NO_CRED_CLOSE)
> > +		if ((win->vas_win.status & VAS_WIN_MIGRATE_CLOSE) ||
> > +			(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
> > {
> > +			win->vas_win.status |= flag;
> >  			continue;
> > +		}
> >
> >  		task_ref = &win->vas_win.task_ref;
> >  		mutex_lock(&task_ref->mmap_mutex);
> > @@ -683,7 +723,7 @@ static int reconfig_close_windows(struct
> > vas_caps *vcap, int excess_creds)
> >  		 * Number of available credits are reduced, So select
> >  		 * and close windows.
> >  		 */
> > -		win->vas_win.status |= VAS_WIN_NO_CRED_CLOSE;
> > +		win->vas_win.status |= flag;
> >
> >  		mmap_write_lock(task_ref->mm);
> >  		/*
> > @@ -706,12 +746,24 @@ static int reconfig_close_windows(struct
> > vas_caps *vcap, int excess_creds)
> >  		 * later when the process issued with close(FD).
> >  		 */
> >  		rc = deallocate_free_window(win);
> > -		if (rc)
> > +		/*
> > +		 * This failure is from the hypervisor.
> > +		 * No way to stop migration for these failures.
> > +		 * So ignore error and continue closing other windows.
> > +		 */
> > +		if (rc && !migrate)
> >  			return rc;
> >
> >  		vcap->nr_close_wins++;
> >
> > -		if (!--excess_creds)
> > +		/*
> > +		 * For migration, do not depend on lpar_creds in case
> > if
> > +		 * mismatch with the hypervisor value (should not
> > happen).
> > +		 * So close all active windows in the list and will be
> > +		 * reopened windows based on the new lpar_creds on the
> > +		 * destination system during resume.
> > +		 */
> > +		if (!migrate && !--excess_creds)
> >  			break;
> >  	}
> >
> > @@ -761,7 +813,8 @@ int vas_reconfig_capabilties(u8 type)
> >  		 * target, reopen windows if they are closed due to
> >  		 * the previous DLPAR (core removal).
> >  		 */
> > -		rc = reconfig_open_windows(vcaps, new_nr_creds -
> > old_nr_creds);
> > +		rc = reconfig_open_windows(vcaps, new_nr_creds -
> > old_nr_creds,
> > +					   false);
> >  	} else {
> >  		/*
> >  		 * # active windows is more than new LPAR available
> > @@ -771,7 +824,8 @@ int vas_reconfig_capabilties(u8 type)
> >  		nr_active_wins = vcaps->nr_open_windows - vcaps-
> > >nr_close_wins;
> >  		if (nr_active_wins > new_nr_creds)
> >  			rc = reconfig_close_windows(vcaps,
> > -					nr_active_wins - new_nr_creds);
> > +					nr_active_wins - new_nr_creds,
> > +					false);
> >  	}
> >
> >  out:
> > --
> > 2.27.0
> >
> >
> >
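
For reference, the two-status-bit bookkeeping described earlier in this reply can be replayed in plain user-space C. This is only a rough sketch under stated assumptions: the two VAS_WIN_* values are taken from the patch, but struct model_win, model_close(), model_open() and count_active() are hypothetical names used for illustration, the mv_ents list handling in reconfig_open_windows() is simplified to a credit budget, and no hypervisor calls or locking are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as defined in the quoted vas.h hunk. */
#define VAS_WIN_NO_CRED_CLOSE	0x00000001	/* closed: lost credit (DLPAR) */
#define VAS_WIN_MIGRATE_CLOSE	0x00000002	/* closed: partition migration */

/* Hypothetical stand-in for a VAS window; only the status word is modeled. */
struct model_win {
	unsigned int status;	/* 0 means the window is active */
};

/*
 * Close windows for one event. A window that is already closed only gets
 * this event's bit added. Migration closes every active window; DLPAR
 * closes excess_creds active windows. Returns the number newly closed.
 */
int model_close(struct model_win *wins, size_t n, int excess_creds,
		bool migrate)
{
	unsigned int flag = migrate ? VAS_WIN_MIGRATE_CLOSE :
				      VAS_WIN_NO_CRED_CLOSE;
	int closed = 0;

	assert(migrate || excess_creds > 0);

	for (size_t i = 0; i < n; i++) {
		if (wins[i].status &
		    (VAS_WIN_NO_CRED_CLOSE | VAS_WIN_MIGRATE_CLOSE)) {
			wins[i].status |= flag;
			continue;
		}
		wins[i].status |= flag;
		closed++;
		if (!migrate && !--excess_creds)
			break;
	}
	return closed;
}

/*
 * Reopen windows for one event. A window closed for both reasons only
 * drops this event's bit and stays closed until the other event clears
 * the remaining bit. For DLPAR, at most 'creds' windows are reopened
 * (simplified budget, an assumption of this sketch). Returns the number
 * of windows that became active.
 */
int model_open(struct model_win *wins, size_t n, int creds, bool migrate)
{
	const unsigned int both = VAS_WIN_NO_CRED_CLOSE | VAS_WIN_MIGRATE_CLOSE;
	unsigned int flag = migrate ? VAS_WIN_MIGRATE_CLOSE :
				      VAS_WIN_NO_CRED_CLOSE;
	int opened = 0;

	for (size_t i = 0; i < n; i++) {
		if ((wins[i].status & both) == both) {
			wins[i].status &= ~flag;
			continue;
		}
		if (!(wins[i].status & flag))
			continue;
		if (!migrate && opened == creds)
			break;
		wins[i].status &= ~flag;
		opened++;
	}
	return opened;
}

/* Count windows with no closed bit set. */
int count_active(const struct model_win *wins, size_t n)
{
	int active = 0;

	for (size_t i = 0; i < n; i++)
		if (!wins[i].status)
			active++;
	return active;
}
```

Replaying the test case on a zero-initialized array w of 35 model windows: model_close(w, 35, 15, false) closes 15 windows, model_close(w, 35, 0, true) closes the remaining 20, model_open(w, 35, 0, true) reopens 20, and model_open(w, 35, 20, false) reopens the last 15, so all 35 are active again.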