RE: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-09 Thread Jack Morgenstein
We have no plans at present to test mthca on Xen. Our current Xen testing is 
focused on Connectx.

-Jack

-Original Message-
From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
Sent: Friday, January 07, 2011 9:16 PM
To: Roland Dreier
Cc: Vivien Bernet-Rollande; Jack Morgenstein; Jeremy Fitzhardinge; 
xen-de...@lists.xensource.com; linux-rdma@vger.kernel.org; rdr...@cisco.com; 
sean.he...@intel.com
Subject: Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of 
mthca infiniband cards in Xen dom0

On Thu, Jan 06, 2011 at 10:51:57AM -0800, Roland Dreier wrote:
>  > There is another way, which I hope can fix this issue (it fixes
>  > the graphics drivers). Basically making the Xen MMU be aware of the E820
>  > and consider any access to PFNs that are not in RAM to have VM_IO 
> implicitly set.
>  >
>  > The patches are at 
> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
>  >
>  > stable/p2m-identity.v4.3
>  >
>  > And posted on LKML )https://lkml.org/lkml/2010/12/30/163)
>  >
>  > They _should_ take care of your problem as long as the PFNs you
>  > are mapping fall within the PCI BAR regions (which I would presume they
>  > do since you are using ioremap).
>
> Yes, that should work fine for all the cases I know of in the RDMA
> drivers.  We are just mapping some PCI BAR space into userspace for
> direct access to hardware (just like graphics, I think).

Allright, then it will work.

I've respun another version that fundamentally does the same thing
but accounts for  "what PFN is non-RAM" in a much cleaner way.

It is @:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git 
devel/p2m-identity.v4.5

will post it soon on LKML.

Jack, Vivien: i you do merge it in your tree for testing with Xen and it works 
properly can
I attach your 'Tested-by' on the patch-set?

>
> I definitely prefer a solution that doesn't require driver authors to
> add a Xen-specific fix that isn't required on any other architecture.

Heh. You and me are on the same wavelength here.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-07 Thread Konrad Rzeszutek Wilk
On Thu, Jan 06, 2011 at 10:51:57AM -0800, Roland Dreier wrote:
>  > There is another way, which I hope can fix this issue (it fixes
>  > the graphics drivers). Basically making the Xen MMU be aware of the E820
>  > and consider any access to PFNs that are not in RAM to have VM_IO 
> implicitly set.
>  > 
>  > The patches are at 
> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
>  > 
>  > stable/p2m-identity.v4.3
>  > 
>  > And posted on LKML )https://lkml.org/lkml/2010/12/30/163)
>  > 
>  > They _should_ take care of your problem as long as the PFNs you
>  > are mapping fall within the PCI BAR regions (which I would presume they
>  > do since you are using ioremap).
> 
> Yes, that should work fine for all the cases I know of in the RDMA
> drivers.  We are just mapping some PCI BAR space into userspace for
> direct access to hardware (just like graphics, I think).

Allright, then it will work. 

I've respun another version that fundamentally does the same thing
but accounts for  "what PFN is non-RAM" in a much cleaner way.

It is @:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git 
devel/p2m-identity.v4.5

will post it soon on LKML.

Jack, Vivien: i you do merge it in your tree for testing with Xen and it works 
properly can
I attach your 'Tested-by' on the patch-set?

> 
> I definitely prefer a solution that doesn't require driver authors to
> add a Xen-specific fix that isn't required on any other architecture.

Heh. You and me are on the same wavelength here.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-06 Thread Roland Dreier
 > > Finally as a stylistic thing I would probably prefer to see the
 > > vm_page_prot manipulation written as
 > >
 > >        vma->vm_page_prot = 
 > > pgprot_noncached(vm_get_page_prot(vma->vm_flags));
 > >
 > 
 > Makes sense.
 > While digging, I've also seen things like :
 > 
 > vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags | 
 > VM_IO));

This isn't the same, since it doesn't set VM_IO in vm_flags.

 > > wrapped up in a helper function; presumably every instance of
 > >
 > >        vma->vm_page_prot = pgprot_FOO(vma->vm_page_prot)
 > >
 > > under drivers/ would need the same fix for Xen.
 > 
 > The issue is actually only related to mappings of bus memory to
 > userland. In kernel space, ioremap() is generaly used, which I
 > understand sets the _PAGE_IOMAP flag.

Right, but setting vm_page_prot means a driver is doing userspace
mapping of something, and if it's setting pgprot, that would generally
mean it's not mapping RAM.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-06 Thread Roland Dreier
 > There is another way, which I hope can fix this issue (it fixes
 > the graphics drivers). Basically making the Xen MMU be aware of the E820
 > and consider any access to PFNs that are not in RAM to have VM_IO implicitly 
 > set.
 > 
 > The patches are at 
 > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
 > 
 > stable/p2m-identity.v4.3
 > 
 > And posted on LKML )https://lkml.org/lkml/2010/12/30/163)
 > 
 > They _should_ take care of your problem as long as the PFNs you
 > are mapping fall within the PCI BAR regions (which I would presume they
 > do since you are using ioremap).

Yes, that should work fine for all the cases I know of in the RDMA
drivers.  We are just mapping some PCI BAR space into userspace for
direct access to hardware (just like graphics, I think).

I definitely prefer a solution that doesn't require driver authors to
add a Xen-specific fix that isn't required on any other architecture.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Xen-devel] Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-06 Thread Konrad Rzeszutek Wilk
> To sum up :
>  - Xen needs the _PAGE_IOMAP flag set on the vm_page_prot for the
> device to work.
>  - I understand the proper way to do this is to set the VM_IO flag,
> and then call vm_get_page_prot()
> 
> > There are quite a few drivers under drivers/infiniband/hw that call
> > io_remap_pfn_range() -- presumably they all need the analogous fix?
> 
> I only have access to Mellanox Infinihost HBAs for testing, but once

There is another way, which I hope can fix this issue (it fixes
the graphics drivers). Basically making the Xen MMU be aware of the E820
and consider any access to PFNs that are not in RAM to have VM_IO implicitly 
set.

The patches are at git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git

stable/p2m-identity.v4.3

And posted on LKML )https://lkml.org/lkml/2010/12/30/163)

They _should_ take care of your problem as long as the PFNs you
are mapping fall within the PCI BAR regions (which I would presume they
do since you are using ioremap).

Please tell me if they work (or not).
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-06 Thread Vivien Bernet-Rollande
Thanks  for your reply.

> This is kind of unappealing -- is there no way to make
> io_remap_pfn_range do the right thing under Xen?  It seems unfortunate
> to make drivers both set VM_IO and call io_remap_pfn_range.  Or maybe we
> should get rid of io_remap_pfn_range() and just have drivers set VM_IO
> and call remap_pfn_range() all the time?

Aside from IB drivers, the devices that need userland mapping are
mainly graphic cards. There has been quite a few issues with that and
the topic is discussed here :
http://wiki.xensource.com/xenwiki/XenPVOPSDRM

AFAIK, on x86 io_remap_pfn_range() is just #defined as
remap_pfn_range(). remap_pfn_range() sets the VM_IO flag, but does not
refresh the vm_page_prot field (mm/memory.c:1875). I think it could
make sense to patch remap_pfn_range(), but I'm afraid it might break a
lot of other drivers.

 > +vma->vm_flags |= VM_IO;
 > +vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 >  vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

Under Xen, when the VM_IO flag is set, the vm_get_page_prot() macro
will set the _PAGE_IOMAP bit on the the vm_page_prot.
Xen needs that bit to be set in order to understand that we actually
want the real memory space to be mapped. The use of this macro was
pointed out to me by the Xen team (see last post on
http://xen.1045712.n5.nabble.com/Kernel-Oops-when-reading-kernel-page-tables-debugfs-file-td3279750.html
), and, I assume, is the proper way of handling things.

To sum up :
 - Xen needs the _PAGE_IOMAP flag set on the vm_page_prot for the
device to work.
 - I understand the proper way to do this is to set the VM_IO flag,
and then call vm_get_page_prot()

> There are quite a few drivers under drivers/infiniband/hw that call
> io_remap_pfn_range() -- presumably they all need the analogous fix?

I only have access to Mellanox Infinihost HBAs for testing, but once
we figure out how we want this fixed, I can take a look at the other
drivers.

> Finally as a stylistic thing I would probably prefer to see the
> vm_page_prot manipulation written as
>
>        vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags));
>

Makes sense.
While digging, I've also seen things like :

vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags | VM_IO));

But it seems a bit dense.

> and if this fix is really the right thing to do, this should probably be
> wrapped up in a helper function; presumably every instance of
>
>        vma->vm_page_prot = pgprot_FOO(vma->vm_page_prot)
>
> under drivers/ would need the same fix for Xen.

The issue is actually only related to mappings of bus memory to
userland. In kernel space, ioremap() is generaly used, which I
understand sets the _PAGE_IOMAP flag.

-- 
Vivien Bernet-Rollande
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] infiniband/mthca : Fix userland mapping of mthca infiniband cards in Xen dom0

2011-01-05 Thread Roland Dreier
 > --- kernel-2.6.32.fc12.orig/drivers/infiniband/hw/mthca/mthca_provider.c 
 > 2010-12-16 16:39:06.901130993 +0100
 > +++ kernel-2.6.32.fc12.new/drivers/infiniband/hw/mthca/mthca_provider.c  
 > 2010-12-16 17:02:07.944127489 +0100
 > @@ -391,6 +391,8 @@ static int mthca_mmap_uar(struct ib_ucon
 >  if (vma->vm_end - vma->vm_start != PAGE_SIZE)
 >  return -EINVAL;
 >  
 > +vma->vm_flags |= VM_IO;
 > +vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 >  vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 >  
 >  if (io_remap_pfn_range(vma, vma->vm_start,

This is kind of unappealing -- is there no way to make
io_remap_pfn_range do the right thing under Xen?  It seems unfortunate
to make drivers both set VM_IO and call io_remap_pfn_range.  Or maybe we
should get rid of io_remap_pfn_range() and just have drivers set VM_IO
and call remap_pfn_range() all the time?

There are quite a few drivers under drivers/infiniband/hw that call
io_remap_pfn_range() -- presumably they all need the analogous fix?

Finally as a stylistic thing I would probably prefer to see the
vm_page_prot manipulation written as

vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags));

and if this fix is really the right thing to do, this should probably be
wrapped up in a helper function; presumably every instance of

vma->vm_page_prot = pgprot_FOO(vma->vm_page_prot)

under drivers/ would need the same fix for Xen.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html