Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
Hi Guo, On 19/06/2019 12:51, Guo Ren wrote: On Wed, Jun 19, 2019 at 4:54 PM Julien Grall wrote: On 6/19/19 9:07 AM, Guo Ren wrote: Hi Julien, Hi, You forgot CCing C-SKY folks :P I wasn't aware you could be interested :). Move arm asid allocator code in a generic one is a agood idea, I've made a patchset for C-SKY and test is on processing, See: https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/ If you plan to seperate it into generic one, I could co-work with you. Was the ASID allocator work out of box on C-Sky? Almost done, but one question: arm64 remove the code in switch_mm: cpumask_clear_cpu(cpu, mm_cpumask(prev)); cpumask_set_cpu(cpu, mm_cpumask(next)); Why? Although arm64 cache operations could affect all harts with CTC method of interconnect, I think we should keep these code for primitive integrity in linux. Because cpu_bitmap is in mm_struct instead of mm->context. I will let Will answer to this. [...] If so, I can easily move the code in a generic place (maybe lib/asid.c). I think it's OK. Will emits concern to move the code in lib. So I will stick with what I currently have. Cheers, -- Julien Grall
Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
On Wed, Jun 19, 2019 at 08:18:04PM +0800, Guo Ren wrote: > On Wed, Jun 19, 2019 at 5:12 PM Will Deacon wrote: > > This is one place where I'd actually prefer not to go down the route of > > making the code generic. Context-switching and low-level TLB management > > is deeply architecture-specific and I worry that by trying to make this > > code common, we run the real risk of introducing subtle bugs on some > > architecture every time it is changed. > "Add generic asid code" and "move arm's into generic" are two things. > We could do > first and let architecture's maintainer to choose. If I understand the proposal being discussed, it involves basing that generic ASID allocation code around the arm64 implementation which I don't necessarily think is a good starting point. > > Furthermore, the algorithm we use > > on arm64 is designed to scale to large systems using DVM and may well be > > too complex and/or sub-optimal for architectures with different system > > topologies or TLB invalidation mechanisms. > It's just a asid algorithm not very complex and there is a callback > for architecture to define their > own local hart tlb flush. Seems it has nothing with DVM or tlb > broadcast mechanism. I'm pleased that you think the algorithm is not very complex, but I'm also worried that you might not have fully understood some of its finer details. The reason I mention DVM and TLB broadcasting is because, depending on the mechanisms in your architecture relating to those, it may be strictly required that all concurrently running threads of a process have the same ASID at any given point in time, or it may be that you really don't care. If you don't care, then the arm64 allocator is over-engineered and likely inefficient for your system. If you do care, then it's worth considering whether a lock is sufficient around the allocator if you don't expect high core counts. Another possibility is that you end up using only one ASID and invalidating the local TLB on every context switch. Yet another design would be to manage per-cpu ASID pools. So rather than blindly copying the arm64 code, I suggest sitting down and designing something that fits to your architecture instead. You may end up with something that is both simpler and more efficient. Will
Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
On Wed, Jun 19, 2019 at 5:12 PM Will Deacon wrote: > > On Wed, Jun 19, 2019 at 09:54:21AM +0100, Julien Grall wrote: > > On 6/19/19 9:07 AM, Guo Ren wrote: > > > You forgot CCing C-SKY folks :P > > > > I wasn't aware you could be interested :). > > > > > Move arm asid allocator code in a generic one is a agood idea, I've > > > made a patchset for C-SKY and test is on processing, See: > > > https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/ > > > > > > If you plan to seperate it into generic one, I could co-work with you. > > > > Was the ASID allocator work out of box on C-Sky? If so, I can easily move > > the code in a generic place (maybe lib/asid.c). > > This is one place where I'd actually prefer not to go down the route of > making the code generic. Context-switching and low-level TLB management > is deeply architecture-specific and I worry that by trying to make this > code common, we run the real risk of introducing subtle bugs on some > architecture every time it is changed. "Add generic asid code" and "move arm's into generic" are two things. We could do first and let architecture's maintainer to choose. > Furthermore, the algorithm we use > on arm64 is designed to scale to large systems using DVM and may well be > too complex and/or sub-optimal for architectures with different system > topologies or TLB invalidation mechanisms. It's just a asid algorithm not very complex and there is a callback for architecture to define their own local hart tlb flush. Seems it has nothing with DVM or tlb broadcast mechanism. > > It's not a lot of code, so I don't see that it's a big deal to keep it > under arch/arm64. Yes, I think that's ok for arm64. Hi Arnd, What do you think about adding generic asid code for arch selection? Best Regards Guo Ren ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
On Wed, Jun 19, 2019 at 4:54 PM Julien Grall wrote: > > > > On 6/19/19 9:07 AM, Guo Ren wrote: > > Hi Julien, > > Hi, > > > > > You forgot CCing C-SKY folks :P > > I wasn't aware you could be interested :). > > > > > Move arm asid allocator code in a generic one is a agood idea, I've > > made a patchset for C-SKY and test is on processing, See: > > https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/ > > > > If you plan to seperate it into generic one, I could co-work with you. > > Was the ASID allocator work out of box on C-Sky? Almost done, but one question: arm64 remove the code in switch_mm: cpumask_clear_cpu(cpu, mm_cpumask(prev)); cpumask_set_cpu(cpu, mm_cpumask(next)); Why? Although arm64 cache operations could affect all harts with CTC method of interconnect, I think we should keep these code for primitive integrity in linux. Because cpu_bitmap is in mm_struct instead of mm->context. In current csky's patches I've also removed the codes the same as arm64, but I'll add it back at next version. > If so, I can easily move the code in a generic place (maybe lib/asid.c). I think it's OK. Best Regards Guo Ren
Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler
On 19/06/2019 01:19, Jacob Pan wrote: >>> I see this as a future extension due to limited testing, >> >> I'm wondering how we deal with: >> (1) old userspace that won't fill the new private_data field in >> page_response. A new kernel still has to support it. >> (2) old kernel that won't recognize the new PRIVATE_DATA flag. >> Currently iommu_page_response() rejects page responses with unknown >> flags. >> >> I guess we'll need a two-way negotiation, where userspace queries >> whether the kernel supports the flag (2), and the kernel learns >> whether it should expect the private data to come back (1). >> > I am not sure case (1) exist in that there is no existing user space > supports PRQ w/o private data. Am I missing something? > > For VT-d emulation, private data is always part of the scalable mode > PASID capability. If vIOMMU query host supports PASID and scalable > mode, it will always support private data once PRQ is enabled. Right if VT-d won't ever support page_response without private data then I don't think we have to worry about (1). > So I think we only need to negotiate (2) which should be covered by > VT-d PASID cap. > >>> perhaps for >>> now, can you add paddings similar to page request? Make it 64B as >>> well. >> >> I don't think padding is necessary, because iommu_page_response is >> sent by userspace to the kernel, unlike iommu_fault which is >> allocated by userspace and filled by the kernel. >> >> Page response looks a lot more like existing VFIO mechanisms, so I >> suppose we'll wrap the iommu_page_response structure and include an >> argsz parameter at the top: >> >> struct vfio_iommu_page_response { >> u32 argsz; >> struct iommu_page_response pr; >> }; >> >> struct vfio_iommu_page_response vpr = { >> .argsz = sizeof(vpr), >> .pr = ... >> ... >> }; >> >> ioctl(devfd, VFIO_IOMMU_PAGE_RESPONSE, &vpr); >> >> In that case supporting private data can be done by simply appending a >> field at the end (plus the negotiation above). >> > Do you mean at the end of struct vfio_iommu_page_response{}? or at > the end of that seems struct iommu_page_response{}? > > The consumer of the private data is iommu driver not vfio. So I think > you want to add the new field at the end of struct iommu_page_response, > right? Yes that's what I meant Thanks, Jean ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
On Wed, Jun 19, 2019 at 09:54:21AM +0100, Julien Grall wrote: > On 6/19/19 9:07 AM, Guo Ren wrote: > > You forgot CCing C-SKY folks :P > > I wasn't aware you could be interested :). > > > Move arm asid allocator code in a generic one is a agood idea, I've > > made a patchset for C-SKY and test is on processing, See: > > https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/ > > > > If you plan to seperate it into generic one, I could co-work with you. > > Was the ASID allocator work out of box on C-Sky? If so, I can easily move > the code in a generic place (maybe lib/asid.c). This is one place where I'd actually prefer not to go down the route of making the code generic. Context-switching and low-level TLB management is deeply architecture-specific and I worry that by trying to make this code common, we run the real risk of introducing subtle bugs on some architecture every time it is changed. Furthermore, the algorithm we use on arm64 is designed to scale to large systems using DVM and may well be too complex and/or sub-optimal for architectures with different system topologies or TLB invalidation mechanisms. It's not a lot of code, so I don't see that it's a big deal to keep it under arch/arm64. Will ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
On 6/19/19 9:07 AM, Guo Ren wrote: Hi Julien, Hi, You forgot CCing C-SKY folks :P I wasn't aware you could be interested :). Move arm asid allocator code in a generic one is a agood idea, I've made a patchset for C-SKY and test is on processing, See: https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/ If you plan to seperate it into generic one, I could co-work with you. Was the ASID allocator work out of box on C-Sky? If so, I can easily move the code in a generic place (maybe lib/asid.c). Cheers, -- Julien Grall ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
Hi Julien, You forgot CCing C-SKY folks :P Move arm asid allocator code in a generic one is a agood idea, I've made a patchset for C-SKY and test is on processing, See: https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/ If you plan to seperate it into generic one, I could co-work with you. Or I'll bring asid code into csky subsystem first and you can cleanup them later. Best Regards Guo Ren ML: linux-c...@vger.kernel.org On Thu, Jun 6, 2019 at 12:56 AM Julien Grall wrote: > > Hi, > > I am CCing RISC-V folks to see if there are an interest to share the code. > > @RISC-V: I noticed you are discussing about importing a version of ASID > allocator in RISC-V. At a first look, the code looks quite similar. Would the > library below helps you? > > Cheers, > > On 21/03/2019 16:36, Julien Grall wrote: > > We will want to re-use the ASID allocator in a separate context (e.g > > allocating VMID). So move the code in a new file. > > > > The function asid_check_context has been moved in the header as a static > > inline function because we want to avoid add a branch when checking if the > > ASID is still valid. > > > > Signed-off-by: Julien Grall > > > > --- > > > > This code will be used in the virt code for allocating VMID. I am not > > entirely sure where to place it. Lib could potentially be a good place but I > > am not entirely convinced the algo as it is could be used by other > > architecture. > > > > Looking at x86, it seems that it will not be possible to re-use because > > the number of PCID (aka ASID) could be smaller than the number of CPUs. > > See commit message 10af6235e0d327d42e1bad974385197817923dc1 "x86/mm: > > Implement PCID based optimization: try to preserve old TLB entries using > > PCI". > > --- > > arch/arm64/include/asm/asid.h | 77 ++ > > arch/arm64/lib/Makefile | 2 + > > arch/arm64/lib/asid.c | 185 + > > arch/arm64/mm/context.c | 235 > > +- > > 4 files changed, 267 insertions(+), 232 deletions(-) > > create mode 100644 arch/arm64/include/asm/asid.h > > create mode 100644 arch/arm64/lib/asid.c > > > > diff --git a/arch/arm64/include/asm/asid.h b/arch/arm64/include/asm/asid.h > > new file mode 100644 > > index ..bb62b587f37f > > --- /dev/null > > +++ b/arch/arm64/include/asm/asid.h > > @@ -0,0 +1,77 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +#ifndef __ASM_ASM_ASID_H > > +#define __ASM_ASM_ASID_H > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +struct asid_info > > +{ > > + atomic64_t generation; > > + unsigned long *map; > > + atomic64_t __percpu *active; > > + u64 __percpu*reserved; > > + u32 bits; > > + /* Lock protecting the structure */ > > + raw_spinlock_t lock; > > + /* Which CPU requires context flush on next call */ > > + cpumask_t flush_pending; > > + /* Number of ASID allocated by context (shift value) */ > > + unsigned intctxt_shift; > > + /* Callback to locally flush the context. */ > > + void(*flush_cpu_ctxt_cb)(void); > > +}; > > + > > +#define NUM_ASIDS(info) (1UL << ((info)->bits)) > > +#define NUM_CTXT_ASIDS(info) (NUM_ASIDS(info) >> > > (info)->ctxt_shift) > > + > > +#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu) > > + > > +void asid_new_context(struct asid_info *info, atomic64_t *pasid, > > + unsigned int cpu); > > + > > +/* > > + * Check the ASID is still valid for the context. If not generate a new > > ASID. > > + * > > + * @pasid: Pointer to the current ASID batch > > + * @cpu: current CPU ID. Must have been acquired throught get_cpu() > > + */ > > +static inline void asid_check_context(struct asid_info *info, > > + atomic64_t *pasid, unsigned int cpu) > > +{ > > + u64 asid, old_active_asid; > > + > > + asid = atomic64_read(pasid); > > + > > + /* > > + * The memory ordering here is subtle. > > + * If our active_asid is non-zero and the ASID matches the current > > + * generation, then we update the active_asid entry with a relaxed > > + * cmpxchg. Racing with a concurrent rollover means that either: > > + * > > + * - We get a zero back from the cmpxchg and end up waiting on the > > + * lock. Taking the lock synchronises with the rollover and so > > + * we are forced to see the updated generation. > > + * > > + * - We get a valid ASID back from the cmpxchg, which means the > > + * relaxed xchg in flush_context will treat us as reserved > > + * because atomic RmWs are totally ordered for a given location. > > + */ > > + old_active_asid = atomic64_read(&active_asid(info, cpu)); > > + if (old_active_asid &