RE: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
We have no plans at present to test mthca on Xen. Our current Xen testing is focused on Connectx. -Jack -Original Message- From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com] Sent: Friday, January 07, 2011 9:16 PM To: Roland Dreier Cc: Vivien Bernet-Rollande; Jack Morgenstein; Jeremy Fitzhardinge; xen-de...@lists.xensource.com; linux-rdma@vger.kernel.org; rdr...@cisco.com; sean.he...@intel.com Subject: Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0 On Thu, Jan 06, 2011 at 10:51:57AM -0800, Roland Dreier wrote: > > There is another way, which I hope can fix this issue (it fixes > > the graphics drivers). Basically making the Xen MMU be aware of the E820 > > and consider any access to PFNs that are not in RAM to have VM_IO > implicitly set. > > > > The patches are at > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git > > > > stable/p2m-identity.v4.3 > > > > And posted on LKML )https://lkml.org/lkml/2010/12/30/163) > > > > They _should_ take care of your problem as long as the PFNs you > > are mapping fall within the PCI BAR regions (which I would presume they > > do since you are using ioremap). > > Yes, that should work fine for all the cases I know of in the RDMA > drivers. We are just mapping some PCI BAR space into userspace for > direct access to hardware (just like graphics, I think). Allright, then it will work. I've respun another version that fundamentally does the same thing but accounts for "what PFN is non-RAM" in a much cleaner way. It is @: git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git devel/p2m-identity.v4.5 will post it soon on LKML. Jack, Vivien: i you do merge it in your tree for testing with Xen and it works properly can I attach your 'Tested-by' on the patch-set? > > I definitely prefer a solution that doesn't require driver authors to > add a Xen-specific fix that isn't required on any other architecture. Heh. You and me are on the same wavelength here. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
On Thu, Jan 06, 2011 at 10:51:57AM -0800, Roland Dreier wrote: > > There is another way, which I hope can fix this issue (it fixes > > the graphics drivers). Basically making the Xen MMU be aware of the E820 > > and consider any access to PFNs that are not in RAM to have VM_IO > implicitly set. > > > > The patches are at > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git > > > > stable/p2m-identity.v4.3 > > > > And posted on LKML )https://lkml.org/lkml/2010/12/30/163) > > > > They _should_ take care of your problem as long as the PFNs you > > are mapping fall within the PCI BAR regions (which I would presume they > > do since you are using ioremap). > > Yes, that should work fine for all the cases I know of in the RDMA > drivers. We are just mapping some PCI BAR space into userspace for > direct access to hardware (just like graphics, I think). Allright, then it will work. I've respun another version that fundamentally does the same thing but accounts for "what PFN is non-RAM" in a much cleaner way. It is @: git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git devel/p2m-identity.v4.5 will post it soon on LKML. Jack, Vivien: i you do merge it in your tree for testing with Xen and it works properly can I attach your 'Tested-by' on the patch-set? > > I definitely prefer a solution that doesn't require driver authors to > add a Xen-specific fix that isn't required on any other architecture. Heh. You and me are on the same wavelength here. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
> > Finally as a stylistic thing I would probably prefer to see the > > vm_page_prot manipulation written as > > > > vma->vm_page_prot = > > pgprot_noncached(vm_get_page_prot(vma->vm_flags)); > > > > Makes sense. > While digging, I've also seen things like : > > vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags | > VM_IO)); This isn't the same, since it doesn't set VM_IO in vm_flags. > > wrapped up in a helper function; presumably every instance of > > > > vma->vm_page_prot = pgprot_FOO(vma->vm_page_prot) > > > > under drivers/ would need the same fix for Xen. > > The issue is actually only related to mappings of bus memory to > userland. In kernel space, ioremap() is generaly used, which I > understand sets the _PAGE_IOMAP flag. Right, but setting vm_page_prot means a driver is doing userspace mapping of something, and if it's setting pgprot, that would generally mean it's not mapping RAM. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
> There is another way, which I hope can fix this issue (it fixes > the graphics drivers). Basically making the Xen MMU be aware of the E820 > and consider any access to PFNs that are not in RAM to have VM_IO implicitly > set. > > The patches are at > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git > > stable/p2m-identity.v4.3 > > And posted on LKML )https://lkml.org/lkml/2010/12/30/163) > > They _should_ take care of your problem as long as the PFNs you > are mapping fall within the PCI BAR regions (which I would presume they > do since you are using ioremap). Yes, that should work fine for all the cases I know of in the RDMA drivers. We are just mapping some PCI BAR space into userspace for direct access to hardware (just like graphics, I think). I definitely prefer a solution that doesn't require driver authors to add a Xen-specific fix that isn't required on any other architecture. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
> To sum up : > - Xen needs the _PAGE_IOMAP flag set on the vm_page_prot for the > device to work. > - I understand the proper way to do this is to set the VM_IO flag, > and then call vm_get_page_prot() > > > There are quite a few drivers under drivers/infiniband/hw that call > > io_remap_pfn_range() -- presumably they all need the analogous fix? > > I only have access to Mellanox Infinihost HBAs for testing, but once There is another way, which I hope can fix this issue (it fixes the graphics drivers). Basically making the Xen MMU be aware of the E820 and consider any access to PFNs that are not in RAM to have VM_IO implicitly set. The patches are at git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/p2m-identity.v4.3 And posted on LKML )https://lkml.org/lkml/2010/12/30/163) They _should_ take care of your problem as long as the PFNs you are mapping fall within the PCI BAR regions (which I would presume they do since you are using ioremap). Please tell me if they work (or not). -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
Thanks for your reply. > This is kind of unappealing -- is there no way to make > io_remap_pfn_range do the right thing under Xen? It seems unfortunate > to make drivers both set VM_IO and call io_remap_pfn_range. Or maybe we > should get rid of io_remap_pfn_range() and just have drivers set VM_IO > and call remap_pfn_range() all the time? Aside from IB drivers, the devices that need userland mapping are mainly graphic cards. There has been quite a few issues with that and the topic is discussed here : http://wiki.xensource.com/xenwiki/XenPVOPSDRM AFAIK, on x86 io_remap_pfn_range() is just #defined as remap_pfn_range(). remap_pfn_range() sets the VM_IO flag, but does not refresh the vm_page_prot field (mm/memory.c:1875). I think it could make sense to patch remap_pfn_range(), but I'm afraid it might break a lot of other drivers. > +vma->vm_flags |= VM_IO; > +vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); Under Xen, when the VM_IO flag is set, the vm_get_page_prot() macro will set the _PAGE_IOMAP bit on the the vm_page_prot. Xen needs that bit to be set in order to understand that we actually want the real memory space to be mapped. The use of this macro was pointed out to me by the Xen team (see last post on http://xen.1045712.n5.nabble.com/Kernel-Oops-when-reading-kernel-page-tables-debugfs-file-td3279750.html ), and, I assume, is the proper way of handling things. To sum up : - Xen needs the _PAGE_IOMAP flag set on the vm_page_prot for the device to work. - I understand the proper way to do this is to set the VM_IO flag, and then call vm_get_page_prot() > There are quite a few drivers under drivers/infiniband/hw that call > io_remap_pfn_range() -- presumably they all need the analogous fix? I only have access to Mellanox Infinihost HBAs for testing, but once we figure out how we want this fixed, I can take a look at the other drivers. > Finally as a stylistic thing I would probably prefer to see the > vm_page_prot manipulation written as > > vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags)); > Makes sense. While digging, I've also seen things like : vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags | VM_IO)); But it seems a bit dense. > and if this fix is really the right thing to do, this should probably be > wrapped up in a helper function; presumably every instance of > > vma->vm_page_prot = pgprot_FOO(vma->vm_page_prot) > > under drivers/ would need the same fix for Xen. The issue is actually only related to mappings of bus memory to userland. In kernel space, ioremap() is generaly used, which I understand sets the _PAGE_IOMAP flag. -- Vivien Bernet-Rollande -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0
> --- kernel-2.6.32.fc12.orig/drivers/infiniband/hw/mthca/mthca_provider.c > 2010-12-16 16:39:06.901130993 +0100 > +++ kernel-2.6.32.fc12.new/drivers/infiniband/hw/mthca/mthca_provider.c > 2010-12-16 17:02:07.944127489 +0100 > @@ -391,6 +391,8 @@ static int mthca_mmap_uar(struct ib_ucon > if (vma->vm_end - vma->vm_start != PAGE_SIZE) > return -EINVAL; > > +vma->vm_flags |= VM_IO; > +vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > > if (io_remap_pfn_range(vma, vma->vm_start, This is kind of unappealing -- is there no way to make io_remap_pfn_range do the right thing under Xen? It seems unfortunate to make drivers both set VM_IO and call io_remap_pfn_range. Or maybe we should get rid of io_remap_pfn_range() and just have drivers set VM_IO and call remap_pfn_range() all the time? There are quite a few drivers under drivers/infiniband/hw that call io_remap_pfn_range() -- presumably they all need the analogous fix? Finally as a stylistic thing I would probably prefer to see the vm_page_prot manipulation written as vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags)); and if this fix is really the right thing to do, this should probably be wrapped up in a helper function; presumably every instance of vma->vm_page_prot = pgprot_FOO(vma->vm_page_prot) under drivers/ would need the same fix for Xen. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html