Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
On Mon, Apr 16, 2018 at 04:26:09AM -0600, Jan Beulich wrote: On 16.04.18 at 08:20,wrote: >> On Fri, Apr 13, 2018 at 09:49:17AM -0600, Jan Beulich wrote: >> On 30.03.18 at 08:59, wrote: +static int do_microcode_update(void *_info) +{ +struct microcode_info *info = _info; +int error, ret = 0; + +error = __wait_for_cpus(>cpu_in, USEC_PER_SEC); >>> >>>Why this long a timeout here? >> >> I just use the same timeout as the patch on linux kernel side. As we >> know if the timeout is too small, updating microcode may be likely to >> failed even if other CPUs disabled interrupt temporally. >> >> If you object to such a long timeout (for Xen may need much smaller >> time to rendezvous all CPUs compared to linux kernel because Xen doesn't >> have device drivers which may malfunction), how about just use the >> default timeout, 30ms, used by live patching? if it is also not >> good enough, then we make it an option which comes from callers. > >Yes, 30ms is likely to be acceptable. > +if ( error ) +{ +ret = -EBUSY; +return ret; +} + +error = microcode_update_cpu(info->buffer, info->buffer_size); +if ( error && !ret ) +ret = error; +/* + * Increase the wait timeout to a safe value here since we're serializing + * the microcode update and that could take a while on a large number of + * CPUs. And that is fine as the *actual* timeout will be determined by + * the last CPU finished updating and thus cut short + */ +error = __wait_for_cpus(>cpu_out, USEC_PER_SEC * num_online_cpus()); >>> >>>And this one's even worse, in particular on huge systems. I'm afraid such a >>>long >>>period of time in stop-machine context is going to confuse most of the >>>running >>>domains (including Dom0). There's nothing inherently wrong with e.g. >>>processing >>>the updates on distinct cores (and even more so on distinct sockets) in >>>parallel. >> >> I cannot say for sure. But the original patch does want updating >> microcode be performed one-by-one. > >And they don't restrict the "when" aspect in any way? I.e. all sorts of >applications (and perhaps even KVM guests) may be running? No any restriction on the timing I think. Ashok, could you confirm this? > >>>Therefore revising the locking in microcode_update_cpu() might be a necessary >>>prereq step. >> >> Do you mean changing it to a per-core or per-socket lock? > >Either changing to such, or introducing a second, more fine grained lock. I will introduce a per-core lock. Then only one thread in a core will try to update microcode as an optimization. For I am not sure whether it is permitted to update distinct cores in parallel, I am inclined to keep the original logic. Thanks Chao > >>>Or alternatively you may need to demand that no other running >>>domains exist besides Dom0 (and hope the best for Dom0 itself). >>> >>>I also don't think there's any point invoking the operation on all HT >>>threads on a >>>core, but I realize stop_machine_run() isn't flexible enough to allow such. >> >> Only one thread in a core will do the actual update. Other threads only >> check the microcode version and find the version is already >> update-to-date, then exit the critical region. Considering the check may >> don't need much time, I wonder whether it can significantly benefit the >> overall time to invoking the operation on all HT threads on a core? > >With better parallelism this would be less of an issue. Plus, as said - the >infrastructure we have at present wouldn't allow anyway what I've >described above, and hand-rolling another "flavor" of the stop-machine >logic is quite certainly out of question. To answer your question though: >Taking as the example a 4-threads-per-core system with sufficiently >many cores, I'm afraid the overall time spent in handling the >"uninteresting" threads may become measurable. But of course, if actual >numbers said otherwise, I'd be fine. > > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
>>> On 16.04.18 at 08:20,wrote: > On Fri, Apr 13, 2018 at 09:49:17AM -0600, Jan Beulich wrote: > On 30.03.18 at 08:59, wrote: >>> +static int do_microcode_update(void *_info) >>> +{ >>> +struct microcode_info *info = _info; >>> +int error, ret = 0; >>> + >>> +error = __wait_for_cpus(>cpu_in, USEC_PER_SEC); >> >>Why this long a timeout here? > > I just use the same timeout as the patch on linux kernel side. As we > know if the timeout is too small, updating microcode may be likely to > failed even if other CPUs disabled interrupt temporally. > > If you object to such a long timeout (for Xen may need much smaller > time to rendezvous all CPUs compared to linux kernel because Xen doesn't > have device drivers which may malfunction), how about just use the > default timeout, 30ms, used by live patching? if it is also not > good enough, then we make it an option which comes from callers. Yes, 30ms is likely to be acceptable. >>> +if ( error ) >>> +{ >>> +ret = -EBUSY; >>> +return ret; >>> +} >>> + >>> +error = microcode_update_cpu(info->buffer, info->buffer_size); >>> +if ( error && !ret ) >>> +ret = error; >>> +/* >>> + * Increase the wait timeout to a safe value here since we're >>> serializing >>> + * the microcode update and that could take a while on a large number >>> of >>> + * CPUs. And that is fine as the *actual* timeout will be determined by >>> + * the last CPU finished updating and thus cut short >>> + */ >>> +error = __wait_for_cpus(>cpu_out, USEC_PER_SEC * >>> num_online_cpus()); >> >>And this one's even worse, in particular on huge systems. I'm afraid such a >>long >>period of time in stop-machine context is going to confuse most of the running >>domains (including Dom0). There's nothing inherently wrong with e.g. >>processing >>the updates on distinct cores (and even more so on distinct sockets) in >>parallel. > > I cannot say for sure. But the original patch does want updating > microcode be performed one-by-one. And they don't restrict the "when" aspect in any way? I.e. all sorts of applications (and perhaps even KVM guests) may be running? >>Therefore revising the locking in microcode_update_cpu() might be a necessary >>prereq step. > > Do you mean changing it to a per-core or per-socket lock? Either changing to such, or introducing a second, more fine grained lock. >>Or alternatively you may need to demand that no other running >>domains exist besides Dom0 (and hope the best for Dom0 itself). >> >>I also don't think there's any point invoking the operation on all HT threads >>on a >>core, but I realize stop_machine_run() isn't flexible enough to allow such. > > Only one thread in a core will do the actual update. Other threads only > check the microcode version and find the version is already > update-to-date, then exit the critical region. Considering the check may > don't need much time, I wonder whether it can significantly benefit the > overall time to invoking the operation on all HT threads on a core? With better parallelism this would be less of an issue. Plus, as said - the infrastructure we have at present wouldn't allow anyway what I've described above, and hand-rolling another "flavor" of the stop-machine logic is quite certainly out of question. To answer your question though: Taking as the example a 4-threads-per-core system with sufficiently many cores, I'm afraid the overall time spent in handling the "uninteresting" threads may become measurable. But of course, if actual numbers said otherwise, I'd be fine. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
On Fri, Apr 13, 2018 at 09:49:17AM -0600, Jan Beulich wrote: On 30.03.18 at 08:59,wrote: >> @@ -281,24 +287,52 @@ static int microcode_update_cpu(const void *buf, >> size_t size) >> return err; >> } >> >> -static long do_microcode_update(void *_info) >> +static int __wait_for_cpus(atomic_t *cnt, int timeout) > >No new double-underscore prefixed functions please. > >> { >> -struct microcode_info *info = _info; >> -int error; >> +int cpus = num_online_cpus(); > >unsigned int > >> -BUG_ON(info->cpu != smp_processor_id()); >> +atomic_inc(cnt); >> >> -error = microcode_update_cpu(info->buffer, info->buffer_size); >> -if ( error ) >> -info->error = error; >> +while (atomic_read(cnt) != cpus) > >There are a number of style issues in the patch, mostly (like here) missing >blanks. Will fix all issues pointed out by comments above. > >> +{ >> +if ( timeout <= 0 ) >> +{ >> +printk("Timeout when waiting for CPUs calling in\n"); >> +return -1; >> +} >> +udelay(1); >> +timeout--; >> +} >> >> -info->cpu = cpumask_next(info->cpu, _online_map); >> -if ( info->cpu < nr_cpu_ids ) >> -return continue_hypercall_on_cpu(info->cpu, do_microcode_update, >> info); >> +return 0; >> +} >> >> -error = info->error; >> -xfree(info); >> -return error; >> +static int do_microcode_update(void *_info) >> +{ >> +struct microcode_info *info = _info; >> +int error, ret = 0; >> + >> +error = __wait_for_cpus(>cpu_in, USEC_PER_SEC); > >Why this long a timeout here? I just use the same timeout as the patch on linux kernel side. As we know if the timeout is too small, updating microcode may be likely to failed even if other CPUs disabled interrupt temporally. If you object to such a long timeout (for Xen may need much smaller time to rendezvous all CPUs compared to linux kernel because Xen doesn't have device drivers which may malfunction), how about just use the default timeout, 30ms, used by live patching? if it is also not good enough, then we make it an option which comes from callers. > >> +if ( error ) >> +{ >> +ret = -EBUSY; >> +return ret; >> +} >> + >> +error = microcode_update_cpu(info->buffer, info->buffer_size); >> +if ( error && !ret ) >> +ret = error; >> +/* >> + * Increase the wait timeout to a safe value here since we're >> serializing >> + * the microcode update and that could take a while on a large number of >> + * CPUs. And that is fine as the *actual* timeout will be determined by >> + * the last CPU finished updating and thus cut short >> + */ >> +error = __wait_for_cpus(>cpu_out, USEC_PER_SEC * >> num_online_cpus()); > >And this one's even worse, in particular on huge systems. I'm afraid such a >long >period of time in stop-machine context is going to confuse most of the running >domains (including Dom0). There's nothing inherently wrong with e.g. processing >the updates on distinct cores (and even more so on distinct sockets) in >parallel. I cannot say for sure. But the original patch does want updating microcode be performed one-by-one. >Therefore revising the locking in microcode_update_cpu() might be a necessary >prereq step. Do you mean changing it to a per-core or per-socket lock? >Or alternatively you may need to demand that no other running >domains exist besides Dom0 (and hope the best for Dom0 itself). > >I also don't think there's any point invoking the operation on all HT threads >on a >core, but I realize stop_machine_run() isn't flexible enough to allow such. Only one thread in a core will do the actual update. Other threads only check the microcode version and find the version is already update-to-date, then exit the critical region. Considering the check may don't need much time, I wonder whether it can significantly benefit the overall time to invoking the operation on all HT threads on a core? Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
>>> On 30.03.18 at 08:59,wrote: > @@ -281,24 +287,52 @@ static int microcode_update_cpu(const void *buf, size_t > size) > return err; > } > > -static long do_microcode_update(void *_info) > +static int __wait_for_cpus(atomic_t *cnt, int timeout) No new double-underscore prefixed functions please. > { > -struct microcode_info *info = _info; > -int error; > +int cpus = num_online_cpus(); unsigned int > -BUG_ON(info->cpu != smp_processor_id()); > +atomic_inc(cnt); > > -error = microcode_update_cpu(info->buffer, info->buffer_size); > -if ( error ) > -info->error = error; > +while (atomic_read(cnt) != cpus) There are a number of style issues in the patch, mostly (like here) missing blanks. > +{ > +if ( timeout <= 0 ) > +{ > +printk("Timeout when waiting for CPUs calling in\n"); > +return -1; > +} > +udelay(1); > +timeout--; > +} > > -info->cpu = cpumask_next(info->cpu, _online_map); > -if ( info->cpu < nr_cpu_ids ) > -return continue_hypercall_on_cpu(info->cpu, do_microcode_update, > info); > +return 0; > +} > > -error = info->error; > -xfree(info); > -return error; > +static int do_microcode_update(void *_info) > +{ > +struct microcode_info *info = _info; > +int error, ret = 0; > + > +error = __wait_for_cpus(>cpu_in, USEC_PER_SEC); Why this long a timeout here? > +if ( error ) > +{ > +ret = -EBUSY; > +return ret; > +} > + > +error = microcode_update_cpu(info->buffer, info->buffer_size); > +if ( error && !ret ) > +ret = error; > +/* > + * Increase the wait timeout to a safe value here since we're serializing > + * the microcode update and that could take a while on a large number of > + * CPUs. And that is fine as the *actual* timeout will be determined by > + * the last CPU finished updating and thus cut short > + */ > +error = __wait_for_cpus(>cpu_out, USEC_PER_SEC * > num_online_cpus()); And this one's even worse, in particular on huge systems. I'm afraid such a long period of time in stop-machine context is going to confuse most of the running domains (including Dom0). There's nothing inherently wrong with e.g. processing the updates on distinct cores (and even more so on distinct sockets) in parallel. Therefore revising the locking in microcode_update_cpu() might be a necessary prereq step. Or alternatively you may need to demand that no other running domains exist besides Dom0 (and hope the best for Dom0 itself). I also don't think there's any point invoking the operation on all HT threads on a core, but I realize stop_machine_run() isn't flexible enough to allow such. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
>>> On 13.04.18 at 07:25,wrote: > On Thu, Apr 12, 2018 at 09:29:34AM -0700, Raj, Ashok wrote: >>On Fri, Mar 30, 2018 at 02:59:00PM +0800, Chao Gao wrote: >>> From: Gao Chao >>> >>> This patch is to backport microcode improvement patches from linux >>> kernel. Below are the original patches description: >>> >>> commit a5321aec6412b20b5ad15db2d6b916c05349dbff >>> Author: Ashok Raj >>> Date: Wed Feb 28 11:28:46 2018 +0100 >>> >>> x86/microcode: Synchronize late microcode loading >>> >>> Original idea by Ashok, completely rewritten by Borislav. >>> >>> Before you read any further: the early loading method is still the >>> preferred one and you should always do that. The following patch is >>> improving the late loading mechanism for long running jobs and cloud use >>> cases. >>> >>> Gather all cores and serialize the microcode update on them by doing it >>> one-by-one to make the late update process as reliable as possible and >>> avoid potential issues caused by the microcode update. >>> >>> [ Borislav: Rewrite completely. ] >>> >>> Co-developed-by: Borislav Petkov >>> Signed-off-by: Ashok Raj >>> Signed-off-by: Borislav Petkov >>> Signed-off-by: Thomas Gleixner >>> Tested-by: Tom Lendacky >>> Tested-by: Ashok Raj >> >>The tested by tags were good for linux upstream. Can you make sure >>you add your name under tested-by for xen? > > Tested-by doesn't mean they have tested this patch. I just put the > original commits description here. Tested-by being as meaningful in Xen as it is in Linux, retaining such tags (other than authorship ones) is generally wrong, as it gives a wrong impression on what testing the _Xen_ patch has seen. > If Xen permits such tested-by from the sender and author, I will add one. I think it is generally implied that the author has done some testing. In the case here - with a ported over Linux commit by other than the original author - I would view a T-b by the original author as meaningful though. It should be made clear though that this is a ported Linux commit, which generally we do by naming the Linux commit. See e.g. the history of mwait-idle.c, where most of the commits are straight ports from Linux. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
On Thu, Apr 12, 2018 at 09:29:34AM -0700, Raj, Ashok wrote: >On Fri, Mar 30, 2018 at 02:59:00PM +0800, Chao Gao wrote: >> From: Gao Chao>> >> This patch is to backport microcode improvement patches from linux >> kernel. Below are the original patches description: >> >> commit a5321aec6412b20b5ad15db2d6b916c05349dbff >> Author: Ashok Raj >> Date: Wed Feb 28 11:28:46 2018 +0100 >> >> x86/microcode: Synchronize late microcode loading >> >> Original idea by Ashok, completely rewritten by Borislav. >> >> Before you read any further: the early loading method is still the >> preferred one and you should always do that. The following patch is >> improving the late loading mechanism for long running jobs and cloud use >> cases. >> >> Gather all cores and serialize the microcode update on them by doing it >> one-by-one to make the late update process as reliable as possible and >> avoid potential issues caused by the microcode update. >> >> [ Borislav: Rewrite completely. ] >> >> Co-developed-by: Borislav Petkov >> Signed-off-by: Ashok Raj >> Signed-off-by: Borislav Petkov >> Signed-off-by: Thomas Gleixner >> Tested-by: Tom Lendacky >> Tested-by: Ashok Raj > >The tested by tags were good for linux upstream. Can you make sure >you add your name under tested-by for xen? Tested-by doesn't mean they have tested this patch. I just put the original commits description here. If Xen permits such tested-by from the sender and author, I will add one. > >> Reviewed-by: Tom Lendacky >> Cc: Arjan Van De Ven >> Link: https://lkml.kernel.org/r/20180228102846.13447-8...@alien8.de >> >> +static int do_microcode_update(void *_info) >> +{ >> +struct microcode_info *info = _info; >> +int error, ret = 0; >> + >> +error = __wait_for_cpus(>cpu_in, USEC_PER_SEC); >> +if ( error ) >> +{ >> +ret = -EBUSY; >> +return ret; >> +} >> + >> +error = microcode_update_cpu(info->buffer, info->buffer_size); >> +if ( error && !ret ) >> +ret = error; > >Isn't this redundant? can ret become non-zero in the check above? Yes, it is redundant. I will also remove 'error' field from struct microcode_info because stop_machine_run already has the ability to return the first error code. Hence, don't need to implement it again here (this is the reason why the above fragment is weird). Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
On Fri, Mar 30, 2018 at 02:59:00PM +0800, Chao Gao wrote: > From: Gao Chao> > This patch is to backport microcode improvement patches from linux > kernel. Below are the original patches description: > > commit a5321aec6412b20b5ad15db2d6b916c05349dbff > Author: Ashok Raj > Date: Wed Feb 28 11:28:46 2018 +0100 > > x86/microcode: Synchronize late microcode loading > > Original idea by Ashok, completely rewritten by Borislav. > > Before you read any further: the early loading method is still the > preferred one and you should always do that. The following patch is > improving the late loading mechanism for long running jobs and cloud use > cases. > > Gather all cores and serialize the microcode update on them by doing it > one-by-one to make the late update process as reliable as possible and > avoid potential issues caused by the microcode update. > > [ Borislav: Rewrite completely. ] > > Co-developed-by: Borislav Petkov > Signed-off-by: Ashok Raj > Signed-off-by: Borislav Petkov > Signed-off-by: Thomas Gleixner > Tested-by: Tom Lendacky > Tested-by: Ashok Raj The tested by tags were good for linux upstream. Can you make sure you add your name under tested-by for xen? > Reviewed-by: Tom Lendacky > Cc: Arjan Van De Ven > Link: https://lkml.kernel.org/r/20180228102846.13447-8...@alien8.de > > +static int do_microcode_update(void *_info) > +{ > +struct microcode_info *info = _info; > +int error, ret = 0; > + > +error = __wait_for_cpus(>cpu_in, USEC_PER_SEC); > +if ( error ) > +{ > +ret = -EBUSY; > +return ret; > +} > + > +error = microcode_update_cpu(info->buffer, info->buffer_size); > +if ( error && !ret ) > +ret = error; Isn't this redundant? can ret become non-zero in the check above? > +/* > + * Increase the wait timeout to a safe value here since we're serializing > + * the microcode update and that could take a while on a large number of > + * CPUs. And that is fine as the *actual* timeout will be determined by > + * the last CPU finished updating and thus cut short > + */ > +error = __wait_for_cpus(>cpu_out, USEC_PER_SEC * > num_online_cpus()); > +if (error) > +panic("Timeout when finishing updating microcode"); > +info->error = ret; > +return ret; > } > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
>>> On 12.04.18 at 08:31,wrote: > Ping... > > Can someone help to review these two patches? Sure, they're on my list of things to look at. Too many other things going on. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
Ping... Can someone help to review these two patches? On Fri, Mar 30, 2018 at 02:59:00PM +0800, Chao Gao wrote: >From: Gao Chao> >This patch is to backport microcode improvement patches from linux >kernel. Below are the original patches description: > >commit a5321aec6412b20b5ad15db2d6b916c05349dbff >Author: Ashok Raj >Date: Wed Feb 28 11:28:46 2018 +0100 > > x86/microcode: Synchronize late microcode loading > > Original idea by Ashok, completely rewritten by Borislav. > > Before you read any further: the early loading method is still the > preferred one and you should always do that. The following patch is > improving the late loading mechanism for long running jobs and cloud use > cases. > > Gather all cores and serialize the microcode update on them by doing it > one-by-one to make the late update process as reliable as possible and > avoid potential issues caused by the microcode update. > > [ Borislav: Rewrite completely. ] > > Co-developed-by: Borislav Petkov > Signed-off-by: Ashok Raj > Signed-off-by: Borislav Petkov > Signed-off-by: Thomas Gleixner > Tested-by: Tom Lendacky > Tested-by: Ashok Raj > Reviewed-by: Tom Lendacky > Cc: Arjan Van De Ven > Link: https://lkml.kernel.org/r/20180228102846.13447-8...@alien8.de > >commit bb8c13d61a629276a162c1d2b1a20a815cbcfbb7 >Author: Borislav Petkov >Date: Wed Mar 14 19:36:15 2018 +0100 > > x86/microcode: Fix CPU synchronization routine > > Emanuel reported an issue with a hang during microcode update because my > dumb idea to use one atomic synchronization variable for both rendezvous > - before and after update - was simply bollocks: > > microcode: microcode_reload_late: late_cpus: 4 > microcode: __reload_late: cpu 2 entered > microcode: __reload_late: cpu 1 entered > microcode: __reload_late: cpu 3 entered > microcode: __reload_late: cpu 0 entered > microcode: __reload_late: cpu 1 left > microcode: Timeout while waiting for CPUs rendezvous, remaining: 1 > > CPU1 above would finish, leave and the others will still spin waiting > for > it to join. > > So do two synchronization atomics instead, which makes the code a lot > more > straightforward. > > Also, since the update is serialized and it also takes quite some time > per > microcode engine, increase the exit timeout by the number of CPUs on the > system. > > That's ok because the moment all CPUs are done, that timeout will be cut > short. > > Furthermore, panic when some of the CPUs timeout when returning from a > microcode update: we can't allow a system with not all cores updated. > > Also, as an optimization, do not do the exit sync if microcode wasn't > updated. > > Reported-by: Emanuel Czirai > Signed-off-by: Borislav Petkov > Signed-off-by: Thomas Gleixner > Tested-by: Emanuel Czirai > Tested-by: Ashok Raj > Tested-by: Tom Lendacky > Link: https://lkml.kernel.org/r/20180314183615.17629-2...@alien8.de > >This patch is also in accord with Andrew's suggestion, >"Rendezvous all online cpus in an IPI to apply the patch, and keep the >processors in until all have completed the patch.", in [1]. > >[1]:https://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update#Run_time_microcode_updates > >Signed-off-by: Chao Gao >Cc: Kevin Tian >Cc: Jun Nakajima >Cc: Ashok Raj >Cc: Borislav Petkov >Cc: Thomas Gleixner >--- > xen/arch/x86/microcode.c | 89 +++- > 1 file changed, 73 insertions(+), 16 deletions(-) > >diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c >index 4163f50..94c1ca2 100644 >--- a/xen/arch/x86/microcode.c >+++ b/xen/arch/x86/microcode.c >@@ -22,6 +22,7 @@ > */ > > #include >+#include > #include > #include > #include >@@ -30,15 +31,20 @@ > #include > #include > #include >+#include > #include > #include > #include >+#include > >+#include > #include > #include > #include > #include > >+#define USEC_PER_SEC 100UL >+ > static module_t __initdata ucode_mod; > static signed int __initdata ucode_mod_idx; > static bool_t __initdata ucode_mod_forced; >@@ -187,7 +193,7 @@ static DEFINE_SPINLOCK(microcode_mutex); > DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info); > > struct
[Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading
From: Gao ChaoThis patch is to backport microcode improvement patches from linux kernel. Below are the original patches description: commit a5321aec6412b20b5ad15db2d6b916c05349dbff Author: Ashok Raj Date: Wed Feb 28 11:28:46 2018 +0100 x86/microcode: Synchronize late microcode loading Original idea by Ashok, completely rewritten by Borislav. Before you read any further: the early loading method is still the preferred one and you should always do that. The following patch is improving the late loading mechanism for long running jobs and cloud use cases. Gather all cores and serialize the microcode update on them by doing it one-by-one to make the late update process as reliable as possible and avoid potential issues caused by the microcode update. [ Borislav: Rewrite completely. ] Co-developed-by: Borislav Petkov Signed-off-by: Ashok Raj Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner Tested-by: Tom Lendacky Tested-by: Ashok Raj Reviewed-by: Tom Lendacky Cc: Arjan Van De Ven Link: https://lkml.kernel.org/r/20180228102846.13447-8...@alien8.de commit bb8c13d61a629276a162c1d2b1a20a815cbcfbb7 Author: Borislav Petkov Date: Wed Mar 14 19:36:15 2018 +0100 x86/microcode: Fix CPU synchronization routine Emanuel reported an issue with a hang during microcode update because my dumb idea to use one atomic synchronization variable for both rendezvous - before and after update - was simply bollocks: microcode: microcode_reload_late: late_cpus: 4 microcode: __reload_late: cpu 2 entered microcode: __reload_late: cpu 1 entered microcode: __reload_late: cpu 3 entered microcode: __reload_late: cpu 0 entered microcode: __reload_late: cpu 1 left microcode: Timeout while waiting for CPUs rendezvous, remaining: 1 CPU1 above would finish, leave and the others will still spin waiting for it to join. So do two synchronization atomics instead, which makes the code a lot more straightforward. Also, since the update is serialized and it also takes quite some time per microcode engine, increase the exit timeout by the number of CPUs on the system. That's ok because the moment all CPUs are done, that timeout will be cut short. Furthermore, panic when some of the CPUs timeout when returning from a microcode update: we can't allow a system with not all cores updated. Also, as an optimization, do not do the exit sync if microcode wasn't updated. Reported-by: Emanuel Czirai Signed-off-by: Borislav Petkov Signed-off-by: Thomas Gleixner Tested-by: Emanuel Czirai Tested-by: Ashok Raj Tested-by: Tom Lendacky Link: https://lkml.kernel.org/r/20180314183615.17629-2...@alien8.de This patch is also in accord with Andrew's suggestion, "Rendezvous all online cpus in an IPI to apply the patch, and keep the processors in until all have completed the patch.", in [1]. [1]:https://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update#Run_time_microcode_updates Signed-off-by: Chao Gao Cc: Kevin Tian Cc: Jun Nakajima Cc: Ashok Raj Cc: Borislav Petkov Cc: Thomas Gleixner --- xen/arch/x86/microcode.c | 89 +++- 1 file changed, 73 insertions(+), 16 deletions(-) diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c index 4163f50..94c1ca2 100644 --- a/xen/arch/x86/microcode.c +++ b/xen/arch/x86/microcode.c @@ -22,6 +22,7 @@ */ #include +#include #include #include #include @@ -30,15 +31,20 @@ #include #include #include +#include #include #include #include +#include +#include #include #include #include #include +#define USEC_PER_SEC 100UL + static module_t __initdata ucode_mod; static signed int __initdata ucode_mod_idx; static bool_t __initdata ucode_mod_forced; @@ -187,7 +193,7 @@ static DEFINE_SPINLOCK(microcode_mutex); DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info); struct microcode_info { -unsigned int cpu; +atomic_t cpu_in, cpu_out; uint32_t buffer_size; int error; char buffer[1]; @@ -281,24 +287,52 @@ static int microcode_update_cpu(const void *buf, size_t