Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 7:53 PM, Brian Gerst wrote: > On Tue, Jul 21, 2015 at 10:12 PM, Andy Lutomirski wrote: >> On Tue, Jul 21, 2015 at 7:01 PM, Brian Gerst wrote: >>> On Tue, Jul 21, 2015 at 3:59 PM, Andy Lutomirski wrote: modify_ldt has questionable locking and does not synchronize threads. Improve it: redesign the locking and synchronize all threads' LDTs using an IPI on all modifications. >>> >>> What does this fix? I can see sending an IPI if the LDT is >>> reallocated, but on every update seems unnecessary. >>> >> >> It prevents nastiness in which you're in user mode with an impossible >> CS or SS, resulting in potentially interesting artifacts in >> interrupts, NMIs, etc. > > By impossible, do you mean a partially updated descriptor when the > interrupt occurs? Would making sure that the descriptor is atomically > updated (using set_64bit()) fix that? > I tried to exploit that once and didn't get very far. If I could cause the LDT to be populated one bit at a time, I think I could materialize a call gate out of thin air. The docs are unclear on what, if anything, the memory ordering rules are. I'm more concerned about the case where a segment register caches a value that is no longer in the LDT. If it's DS, ES, FS, or GS, it results in nondeterministic behavior but is otherwise not a big deal. If it's CS or SS, then an interrupt or exception will write a stack frame with the selectors but IRET can fault. If the interrupt is an NMI and certain other conditions are met and your kernel is older than 4.2-rc3, then you should read the CVE-2015-5157 advisory I'll send tomorrow :) Even on 4.2-rc3, the NMI code still struggles a bit if this happens. With this patch, you can never be in user mode with CS or SS selectors that point at the LDT but don't match the LDT. That makes me a lot more comfortable with modify_ldt. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 10:12 PM, Andy Lutomirski wrote: > On Tue, Jul 21, 2015 at 7:01 PM, Brian Gerst wrote: >> On Tue, Jul 21, 2015 at 3:59 PM, Andy Lutomirski wrote: >>> modify_ldt has questionable locking and does not synchronize >>> threads. Improve it: redesign the locking and synchronize all >>> threads' LDTs using an IPI on all modifications. >> >> What does this fix? I can see sending an IPI if the LDT is >> reallocated, but on every update seems unnecessary. >> > > It prevents nastiness in which you're in user mode with an impossible > CS or SS, resulting in potentially interesting artifacts in > interrupts, NMIs, etc. By impossible, do you mean a partially updated descriptor when the interrupt occurs? Would making sure that the descriptor is atomically updated (using set_64bit()) fix that? -- Brian Gerst -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 7:01 PM, Brian Gerst wrote: > On Tue, Jul 21, 2015 at 3:59 PM, Andy Lutomirski wrote: >> modify_ldt has questionable locking and does not synchronize >> threads. Improve it: redesign the locking and synchronize all >> threads' LDTs using an IPI on all modifications. > > What does this fix? I can see sending an IPI if the LDT is > reallocated, but on every update seems unnecessary. > It prevents nastiness in which you're in user mode with an impossible CS or SS, resulting in potentially interesting artifacts in interrupts, NMIs, etc. > -- > Brian Gerst -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 3:59 PM, Andy Lutomirski wrote: > modify_ldt has questionable locking and does not synchronize > threads. Improve it: redesign the locking and synchronize all > threads' LDTs using an IPI on all modifications. What does this fix? I can see sending an IPI if the LDT is reallocated, but on every update seems unnecessary. -- Brian Gerst -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 5:49 PM, Andrew Cooper wrote: > On 22/07/2015 01:28, Andy Lutomirski wrote: >> On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper >> wrote: >>> On 22/07/2015 01:07, Andy Lutomirski wrote: On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper wrote: > On 21/07/2015 22:53, Boris Ostrovsky wrote: >> On 07/21/2015 03:59 PM, Andy Lutomirski wrote: >>> --- a/arch/x86/include/asm/mmu_context.h >>> +++ b/arch/x86/include/asm/mmu_context.h >>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct >>> *mm) {} >>> #endif >>> /* >>> + * ldt_structs can be allocated, used, and freed, but they are never >>> + * modified while live. >>> + */ >>> +struct ldt_struct { >>> +int size; >>> +int __pad;/* keep the descriptors naturally aligned. */ >>> +struct desc_struct entries[]; >>> +}; >> >> This breaks Xen which expects LDT to be page-aligned. Not sure why. >> >> Jan, Andrew? > PV guests are not permitted to have writeable mappings to the frames > making up the GDT and LDT, so it cannot make unaudited changes to > loadable descriptors. In particular, for a 32bit PV guest, it is only > the segment limit which protects Xen from the ring1 guest kernel. > > A lot of this code hasn't been touched in years, and it certainly > predates me. The alignment requirement appears to come from the virtual > region Xen uses to map the guests GDT and LDT. Strict alignment is > required for the GDT so Xen's descriptors starting at 0xe0xx are > correct, but the LDT alignment seems to be a side effect of similar > codepaths. > > For an LDT smaller than 8192 entries, I can't see any specific reason > for enforcing alignment, other than "that's the way it has always been". > > However, the guest would still have to relinquish write access to all > frames which make up the LDT, which looks to be a bit of an issue given > the snippet above. Does the LDT itself need to be aligned or just the address passed to paravirt_alloc_ldt? >>> The address which Xen receives needs to be aligned. >>> >>> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt >>> it is passed is page aligned, and passes it straight through. >> xen_alloc_ldt is just fiddling with protection though, I think. Isn't >> it xen_set_ldt that's the meat? We could easily pass xen_alloc_ldt a >> pointer to the ldt_struct. > > So it is. It is the linear_addr in xen_set_ldt() which Xen currently > audits to be page aligned. > > This will allow ldt_struct itself to be page aligned, and for the size > field to sit across the base/limit field of what would logically be > selector 0x0008 There would be some issues accessing size. To load > frames as an LDT, a guest must drop all refs to the page so that its > type may be changed from writeable to segdesc. After that, an > update_descriptor hypercall can be used to change size, and I believe > the guest may subsequently recreate read-only mappings to the frames in > question (although frankly it is getting late so you will want to double > check all of this). > > Anyhow, this looks like an issue which should be fixed up with slightly > more PVOps, rather than enforcing a Xen view of the world on native Linux. > I could presumably make the allocation the other way around so the size is at the end. I could even use two separate allocations if needed. >>> I suspect two separate allocations would be the better solution, as it >>> means that the size field doesn't need to be subject to funny page >>> permissions. >> True. OTOH we never write to the size field after allocating the thing. > > Right, but even reading it is going to cause problems if one of the > paravirt ops can't re-establish ro mappings. Does paravirt_alloc_ldt completely deny access or does it just set it RO? --Andy > > ~Andrew -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On 22/07/2015 01:28, Andy Lutomirski wrote: > On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper > wrote: >> On 22/07/2015 01:07, Andy Lutomirski wrote: >>> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper >>> wrote: On 21/07/2015 22:53, Boris Ostrovsky wrote: > On 07/21/2015 03:59 PM, Andy Lutomirski wrote: >> --- a/arch/x86/include/asm/mmu_context.h >> +++ b/arch/x86/include/asm/mmu_context.h >> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct >> *mm) {} >> #endif >> /* >> + * ldt_structs can be allocated, used, and freed, but they are never >> + * modified while live. >> + */ >> +struct ldt_struct { >> +int size; >> +int __pad;/* keep the descriptors naturally aligned. */ >> +struct desc_struct entries[]; >> +}; > > This breaks Xen which expects LDT to be page-aligned. Not sure why. > > Jan, Andrew? PV guests are not permitted to have writeable mappings to the frames making up the GDT and LDT, so it cannot make unaudited changes to loadable descriptors. In particular, for a 32bit PV guest, it is only the segment limit which protects Xen from the ring1 guest kernel. A lot of this code hasn't been touched in years, and it certainly predates me. The alignment requirement appears to come from the virtual region Xen uses to map the guests GDT and LDT. Strict alignment is required for the GDT so Xen's descriptors starting at 0xe0xx are correct, but the LDT alignment seems to be a side effect of similar codepaths. For an LDT smaller than 8192 entries, I can't see any specific reason for enforcing alignment, other than "that's the way it has always been". However, the guest would still have to relinquish write access to all frames which make up the LDT, which looks to be a bit of an issue given the snippet above. >>> Does the LDT itself need to be aligned or just the address passed to >>> paravirt_alloc_ldt? >> The address which Xen receives needs to be aligned. >> >> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt >> it is passed is page aligned, and passes it straight through. > xen_alloc_ldt is just fiddling with protection though, I think. Isn't > it xen_set_ldt that's the meat? We could easily pass xen_alloc_ldt a > pointer to the ldt_struct. So it is. It is the linear_addr in xen_set_ldt() which Xen currently audits to be page aligned. This will allow ldt_struct itself to be page aligned, and for the size field to sit across the base/limit field of what would logically be selector 0x0008 There would be some issues accessing size. To load frames as an LDT, a guest must drop all refs to the page so that its type may be changed from writeable to segdesc. After that, an update_descriptor hypercall can be used to change size, and I believe the guest may subsequently recreate read-only mappings to the frames in question (although frankly it is getting late so you will want to double check all of this). Anyhow, this looks like an issue which should be fixed up with slightly more PVOps, rather than enforcing a Xen view of the world on native Linux. >>> I could presumably make the allocation the other way around so the >>> size is at the end. I could even use two separate allocations if >>> needed. >> I suspect two separate allocations would be the better solution, as it >> means that the size field doesn't need to be subject to funny page >> permissions. > True. OTOH we never write to the size field after allocating the thing. Right, but even reading it is going to cause problems if one of the paravirt ops can't re-establish ro mappings. ~Andrew -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper wrote: > On 22/07/2015 01:07, Andy Lutomirski wrote: >> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper >> wrote: >>> On 21/07/2015 22:53, Boris Ostrovsky wrote: On 07/21/2015 03:59 PM, Andy Lutomirski wrote: > --- a/arch/x86/include/asm/mmu_context.h > +++ b/arch/x86/include/asm/mmu_context.h > @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct > *mm) {} > #endif > /* > + * ldt_structs can be allocated, used, and freed, but they are never > + * modified while live. > + */ > +struct ldt_struct { > +int size; > +int __pad;/* keep the descriptors naturally aligned. */ > +struct desc_struct entries[]; > +}; This breaks Xen which expects LDT to be page-aligned. Not sure why. Jan, Andrew? >>> PV guests are not permitted to have writeable mappings to the frames >>> making up the GDT and LDT, so it cannot make unaudited changes to >>> loadable descriptors. In particular, for a 32bit PV guest, it is only >>> the segment limit which protects Xen from the ring1 guest kernel. >>> >>> A lot of this code hasn't been touched in years, and it certainly >>> predates me. The alignment requirement appears to come from the virtual >>> region Xen uses to map the guests GDT and LDT. Strict alignment is >>> required for the GDT so Xen's descriptors starting at 0xe0xx are >>> correct, but the LDT alignment seems to be a side effect of similar >>> codepaths. >>> >>> For an LDT smaller than 8192 entries, I can't see any specific reason >>> for enforcing alignment, other than "that's the way it has always been". >>> >>> However, the guest would still have to relinquish write access to all >>> frames which make up the LDT, which looks to be a bit of an issue given >>> the snippet above. >> Does the LDT itself need to be aligned or just the address passed to >> paravirt_alloc_ldt? > > The address which Xen receives needs to be aligned. > > It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt > it is passed is page aligned, and passes it straight through. xen_alloc_ldt is just fiddling with protection though, I think. Isn't it xen_set_ldt that's the meat? We could easily pass xen_alloc_ldt a pointer to the ldt_struct. > >> >>> I think I have a solution, but I doubt it is going to be very popular. >>> >>> * Make a new paravirt hook for allocation of ldt_struct, so the paravirt >>> backend can choose an alignment if needed >>> * Make absolutely certain that __pad has the value 0 (so size and __pad >>> combined don't look like a present descriptor) >>> * Never hand selector 0x0008 to unsuspecting users. >> Yuck. > > I actually meant 0x0004, but yes. Very much yuck. > >> >>> This will allow ldt_struct itself to be page aligned, and for the size >>> field to sit across the base/limit field of what would logically be >>> selector 0x0008 There would be some issues accessing size. To load >>> frames as an LDT, a guest must drop all refs to the page so that its >>> type may be changed from writeable to segdesc. After that, an >>> update_descriptor hypercall can be used to change size, and I believe >>> the guest may subsequently recreate read-only mappings to the frames in >>> question (although frankly it is getting late so you will want to double >>> check all of this). >>> >>> Anyhow, this looks like an issue which should be fixed up with slightly >>> more PVOps, rather than enforcing a Xen view of the world on native Linux. >>> >> I could presumably make the allocation the other way around so the >> size is at the end. I could even use two separate allocations if >> needed. > > I suspect two separate allocations would be the better solution, as it > means that the size field doesn't need to be subject to funny page > permissions. True. OTOH we never write to the size field after allocating the thing. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On 22/07/2015 01:07, Andy Lutomirski wrote: > On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper > wrote: >> On 21/07/2015 22:53, Boris Ostrovsky wrote: >>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote: --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct *mm) {} #endif /* + * ldt_structs can be allocated, used, and freed, but they are never + * modified while live. + */ +struct ldt_struct { +int size; +int __pad;/* keep the descriptors naturally aligned. */ +struct desc_struct entries[]; +}; >>> >>> >>> This breaks Xen which expects LDT to be page-aligned. Not sure why. >>> >>> Jan, Andrew? >> PV guests are not permitted to have writeable mappings to the frames >> making up the GDT and LDT, so it cannot make unaudited changes to >> loadable descriptors. In particular, for a 32bit PV guest, it is only >> the segment limit which protects Xen from the ring1 guest kernel. >> >> A lot of this code hasn't been touched in years, and it certainly >> predates me. The alignment requirement appears to come from the virtual >> region Xen uses to map the guests GDT and LDT. Strict alignment is >> required for the GDT so Xen's descriptors starting at 0xe0xx are >> correct, but the LDT alignment seems to be a side effect of similar >> codepaths. >> >> For an LDT smaller than 8192 entries, I can't see any specific reason >> for enforcing alignment, other than "that's the way it has always been". >> >> However, the guest would still have to relinquish write access to all >> frames which make up the LDT, which looks to be a bit of an issue given >> the snippet above. > Does the LDT itself need to be aligned or just the address passed to > paravirt_alloc_ldt? The address which Xen receives needs to be aligned. It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt it is passed is page aligned, and passes it straight through. > >> I think I have a solution, but I doubt it is going to be very popular. >> >> * Make a new paravirt hook for allocation of ldt_struct, so the paravirt >> backend can choose an alignment if needed >> * Make absolutely certain that __pad has the value 0 (so size and __pad >> combined don't look like a present descriptor) >> * Never hand selector 0x0008 to unsuspecting users. > Yuck. I actually meant 0x0004, but yes. Very much yuck. > >> This will allow ldt_struct itself to be page aligned, and for the size >> field to sit across the base/limit field of what would logically be >> selector 0x0008 There would be some issues accessing size. To load >> frames as an LDT, a guest must drop all refs to the page so that its >> type may be changed from writeable to segdesc. After that, an >> update_descriptor hypercall can be used to change size, and I believe >> the guest may subsequently recreate read-only mappings to the frames in >> question (although frankly it is getting late so you will want to double >> check all of this). >> >> Anyhow, this looks like an issue which should be fixed up with slightly >> more PVOps, rather than enforcing a Xen view of the world on native Linux. >> > I could presumably make the allocation the other way around so the > size is at the end. I could even use two separate allocations if > needed. I suspect two separate allocations would be the better solution, as it means that the size field doesn't need to be subject to funny page permissions. ~Andrew -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper wrote: > On 21/07/2015 22:53, Boris Ostrovsky wrote: >> On 07/21/2015 03:59 PM, Andy Lutomirski wrote: >>> --- a/arch/x86/include/asm/mmu_context.h >>> +++ b/arch/x86/include/asm/mmu_context.h >>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct >>> *mm) {} >>> #endif >>> /* >>> + * ldt_structs can be allocated, used, and freed, but they are never >>> + * modified while live. >>> + */ >>> +struct ldt_struct { >>> +int size; >>> +int __pad;/* keep the descriptors naturally aligned. */ >>> +struct desc_struct entries[]; >>> +}; >> >> >> >> This breaks Xen which expects LDT to be page-aligned. Not sure why. >> >> Jan, Andrew? > > PV guests are not permitted to have writeable mappings to the frames > making up the GDT and LDT, so it cannot make unaudited changes to > loadable descriptors. In particular, for a 32bit PV guest, it is only > the segment limit which protects Xen from the ring1 guest kernel. > > A lot of this code hasn't been touched in years, and it certainly > predates me. The alignment requirement appears to come from the virtual > region Xen uses to map the guests GDT and LDT. Strict alignment is > required for the GDT so Xen's descriptors starting at 0xe0xx are > correct, but the LDT alignment seems to be a side effect of similar > codepaths. > > For an LDT smaller than 8192 entries, I can't see any specific reason > for enforcing alignment, other than "that's the way it has always been". > > However, the guest would still have to relinquish write access to all > frames which make up the LDT, which looks to be a bit of an issue given > the snippet above. Does the LDT itself need to be aligned or just the address passed to paravirt_alloc_ldt? > > I think I have a solution, but I doubt it is going to be very popular. > > * Make a new paravirt hook for allocation of ldt_struct, so the paravirt > backend can choose an alignment if needed > * Make absolutely certain that __pad has the value 0 (so size and __pad > combined don't look like a present descriptor) > * Never hand selector 0x0008 to unsuspecting users. Yuck. > > This will allow ldt_struct itself to be page aligned, and for the size > field to sit across the base/limit field of what would logically be > selector 0x0008 There would be some issues accessing size. To load > frames as an LDT, a guest must drop all refs to the page so that its > type may be changed from writeable to segdesc. After that, an > update_descriptor hypercall can be used to change size, and I believe > the guest may subsequently recreate read-only mappings to the frames in > question (although frankly it is getting late so you will want to double > check all of this). > > Anyhow, this looks like an issue which should be fixed up with slightly > more PVOps, rather than enforcing a Xen view of the world on native Linux. > I could presumably make the allocation the other way around so the size is at the end. I could even use two separate allocations if needed. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On 21/07/2015 22:53, Boris Ostrovsky wrote: > On 07/21/2015 03:59 PM, Andy Lutomirski wrote: >> --- a/arch/x86/include/asm/mmu_context.h >> +++ b/arch/x86/include/asm/mmu_context.h >> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct >> *mm) {} >> #endif >> /* >> + * ldt_structs can be allocated, used, and freed, but they are never >> + * modified while live. >> + */ >> +struct ldt_struct { >> +int size; >> +int __pad;/* keep the descriptors naturally aligned. */ >> +struct desc_struct entries[]; >> +}; > > > > This breaks Xen which expects LDT to be page-aligned. Not sure why. > > Jan, Andrew? PV guests are not permitted to have writeable mappings to the frames making up the GDT and LDT, so it cannot make unaudited changes to loadable descriptors. In particular, for a 32bit PV guest, it is only the segment limit which protects Xen from the ring1 guest kernel. A lot of this code hasn't been touched in years, and it certainly predates me. The alignment requirement appears to come from the virtual region Xen uses to map the guests GDT and LDT. Strict alignment is required for the GDT so Xen's descriptors starting at 0xe0xx are correct, but the LDT alignment seems to be a side effect of similar codepaths. For an LDT smaller than 8192 entries, I can't see any specific reason for enforcing alignment, other than "that's the way it has always been". However, the guest would still have to relinquish write access to all frames which make up the LDT, which looks to be a bit of an issue given the snippet above. I think I have a solution, but I doubt it is going to be very popular. * Make a new paravirt hook for allocation of ldt_struct, so the paravirt backend can choose an alignment if needed * Make absolutely certain that __pad has the value 0 (so size and __pad combined don't look like a present descriptor) * Never hand selector 0x0008 to unsuspecting users. This will allow ldt_struct itself to be page aligned, and for the size field to sit across the base/limit field of what would logically be selector 0x0008 There would be some issues accessing size. To load frames as an LDT, a guest must drop all refs to the page so that its type may be changed from writeable to segdesc. After that, an update_descriptor hypercall can be used to change size, and I believe the guest may subsequently recreate read-only mappings to the frames in question (although frankly it is getting late so you will want to double check all of this). Anyhow, this looks like an issue which should be fixed up with slightly more PVOps, rather than enforcing a Xen view of the world on native Linux. ~Andrew -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous
On 07/21/2015 03:59 PM, Andy Lutomirski wrote: modify_ldt has questionable locking and does not synchronize threads. Improve it: redesign the locking and synchronize all threads' LDTs using an IPI on all modifications. This will dramatically slow down modify_ldt in multithreaded programs, but there shouldn't be any multithreaded programs that care about modify_ldt's performance in the first place. Cc: sta...@vger.kernel.org Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/desc.h| 15 --- arch/x86/include/asm/mmu.h | 3 +- arch/x86/include/asm/mmu_context.h | 48 ++- arch/x86/kernel/cpu/common.c | 4 +- arch/x86/kernel/cpu/perf_event.c | 12 +- arch/x86/kernel/ldt.c | 247 +++-- arch/x86/kernel/process_64.c | 4 +- arch/x86/kernel/step.c | 6 +- arch/x86/power/cpu.c | 3 +- 9 files changed, 192 insertions(+), 150 deletions(-) diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index a0bf89fd2647..4e10d73cf018 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -280,21 +280,6 @@ static inline void clear_LDT(void) set_ldt(NULL, 0); } -/* - * load one particular LDT into the current CPU - */ -static inline void load_LDT_nolock(mm_context_t *pc) -{ - set_ldt(pc->ldt, pc->size); -} - -static inline void load_LDT(mm_context_t *pc) -{ - preempt_disable(); - load_LDT_nolock(pc); - preempt_enable(); -} - static inline unsigned long get_desc_base(const struct desc_struct *desc) { return (unsigned)(desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24)); diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 09b9620a73b4..364d27481a52 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -9,8 +9,7 @@ * we put the segment information here. */ typedef struct { - void *ldt; - int size; + struct ldt_struct *ldt; #ifdef CONFIG_X86_64 /* True if mm supports a task running in 32 bit compatibility mode. */ diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 5e8daee7c5c9..1ff121fbf366 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct *mm) {} #endif /* + * ldt_structs can be allocated, used, and freed, but they are never + * modified while live. + */ +struct ldt_struct { + int size; + int __pad; /* keep the descriptors naturally aligned. */ + struct desc_struct entries[]; +}; This breaks Xen which expects LDT to be page-aligned. Not sure why. Jan, Andrew? -boris + +static inline void load_mm_ldt(struct mm_struct *mm) +{ + struct ldt_struct *ldt; + DEBUG_LOCKS_WARN_ON(!irqs_disabled()); + + /* lockless_dereference synchronizes with smp_store_release */ + ldt = lockless_dereference(mm->context.ldt); + + /* +* Any change to mm->context.ldt is followed by an IPI to all +* CPUs with the mm active. The LDT will not be freed until +* after the IPI is handled by all such CPUs. This means that, +* if the ldt_struct changes before we return, the values we see +* will be safe, and the new values will be loaded before we run +* any user code. +* +* NB: don't try to convert this to use RCU without extreme care. +* We would still need IRQs off, because we don't want to change +* the local LDT after an IPI loaded a newer value than the one +* that we can see. +*/ + + if (unlikely(ldt)) + set_ldt(ldt->entries, ldt->size); + else + clear_LDT(); +} + +/* * Used for LDT copy/destruction. */ int init_new_context(struct task_struct *tsk, struct mm_struct *mm); @@ -78,12 +116,12 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, * was called and then modify_ldt changed * prev->context.ldt but suppressed an IPI to this CPU. * In this case, prev->context.ldt != NULL, because we -* never free an LDT while the mm still exists. That -* means that next->context.ldt != prev->context.ldt, -* because mms never share an LDT. +* never set context.ldt to NULL while the mm still +* exists. That means that next->context.ldt != +* prev->context.ldt, because mms never share an LDT. */ if (unlikely(prev->context.ldt != next->context.ldt)) - load_LDT_nolock(&next->context); + load_mm_ldt(next); } #ifdef CONFIG_SMP else { @@ -106,7 +144,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,