Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-05-01 Thread Doug Ledford
On 05/01/2014, Shirley Ma wrote: 
 On 04/30/2014 04:58 PM, Doug Ledford wrote:
  On 04/30/2014 Shirley Ma wrote:
  I've created a Xen guest (DomU). The PF on Dom0, which has no mtts
  script enabled, works; however, on DomU I hit this problem just by
  mounting the file system:
  mlx4_core 0000:00:04.0: Failed to allocate mtts for 66 pages(order 7)
  mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)
  mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)
 
  The RDMA microbenchmark perftest works OK. I enabled the mtts scripts
  when booting the Xen guest. cat /proc/mtrr:
  What OS/RDMA stack are you using?  I'm not familiar with any mtts
  scripts, but I know there is an mtrr fixup script I wrote for the
  RDMA stack in Fedora/RHEL (and so I assume it's in Oracle Linux too,
  but I haven't checked).  In fact, I assume that's the script you are
  referring to, based on the fact that the next bit of your email cats
  the /proc/mtrr file.  But whether or not there is an mtrr setting
  mixup, I don't believe it should have any impact on the mtts
  allocations in the driver.  Even if your mtrr registers were set
  incorrectly, the problem then becomes either A) a serious
  performance bottleneck (in the case of Intel hardware, which needs
  write combining in order to get more than about 50MByte/s of
  throughput on these cards) or B) failed operation, because MMIO
  writes to the card are being cached/write combined when they should
  not be.
 
  I suspect this is more likely Xen related than mtts/mtrr related.
 Yes. That's the script I used. I wonder whether it's possible to
 disable mtrr on the DomU guest to debug this. I am new to Xen.

No, it's not possible to disable mtrr and expect any passed-through
PCI devices to work.  The mtrr registers merely indicate what portion
of the memory map should be treated as normal memory (meaning
cacheable) and what should be treated as MMIO memory (meaning
generally non-cacheable).  That's all they do.  The mtts table
allocation failures are something else entirely.  In the VF on DomU,
the allocations are passed to the PF on Dom0 and the command is done
there.  However, the number of mtts available to the slave is limited
(check the code in resource_tracker.c).  In addition, the number of
mtts allocated is proportional to the memory size in the guest and
inversely related to log2_mtts_per_seg (probably on both Dom0 and
DomU, which I suspect need to agree on the log2_mtts_per_seg module
parameter).  I would try reducing the memory in the guest, increasing
log2_mtts_per_seg, or both, and see if you can get it to work.
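
Untested, but something along these lines on BOTH Dom0 and DomU
should make them agree.  Note that upstream mlx4_core spells the
parameter log_mtts_per_seg (check modinfo mlx4_core on your stack),
and the value 7 below is only an illustration:

# Sketch only: pin the parameter to one value on both Dom0 and DomU.
# Typical range is 1-7, default 3.
cat > /etc/modprobe.d/mlx4_core.conf <<'EOF'
options mlx4_core log_mtts_per_seg=7
EOF
# Reload mlx4_core (or reboot) on both sides, then retry the mount.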

-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD
  http://people.redhat.com/dledford



Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-04-30 Thread Or Gerlitz
On Wed, Apr 30, 2014 at 10:16 PM, Shirley Ma shirley...@oracle.com wrote:
[...]
 3. Upstream NFSoRDMA status:


So does it currently work...? I understand that Yan tried it out
today, and at least one side just crashed.

Chuck, I assume there is a configuration that basically works for you
and allows you to develop the upstream patches you send, right? Can
you send us your .config and the exact NFS/rNFS options you use that
work basically OK?

Or.


Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-04-30 Thread Chuck Lever
Hi Or-

On Apr 30, 2014, at 3:39 PM, Or Gerlitz or.gerl...@gmail.com wrote:

 On Wed, Apr 30, 2014 at 10:16 PM, Shirley Ma shirley...@oracle.com wrote:
 [...]
 3. Upstream NFSoRDMA status:
 
 
 So does it currently work...? I understand that Yan tried it out
 today, and at least one side just crashed.
 
 Chuck, I assume there is a configuration that basically works for you
 and allows you to develop the upstream patches you send, right? Can
 you send us your .config and the exact NFS/rNFS options you use that
 work basically OK?

If I understood Yan, he is trying to use NFS/RDMA in guests (kvm?).  We
are pretty sure that is not working at the moment, but that is a priority
to get fixed. Shirley has a lab set up and has been looking into it.

So, first step would be to set up v3.15-rc3 on bare metal.

I think the critical stability patches for both client and server are
already upstream, unless you are exporting tmpfs.

The patches I just posted fix some other issues; feel free to apply them.
But the basics should be working now in v3.15-rc3.
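
If it helps, here is a generic bare-metal smoke test along the lines
of the kernel's nfs-rdma documentation.  This is a sketch rather than
my exact configuration; the server name and export path are
placeholders:

# Server side (assumes nfsd is already running and /export is exported):
modprobe svcrdma
echo rdma 20049 > /proc/fs/nfsd/portlist

# Client side (20049 is the conventional NFS/RDMA port):
modprobe xprtrdma
mount -t nfs -o rdma,port=20049 server:/export /mnt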

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-04-30 Thread Or Gerlitz
On Wed, Apr 30, 2014 at 10:47 PM, Chuck Lever chuck.le...@oracle.com wrote:

 If I understood Yan, he is trying to use NFS/RDMA in guests (kvm?).  We
 are pretty sure that is not working at the moment,

Can you provide a short 1-2 liner on why/what is broken there? The
only thing I can think of that is not supported over mlx4 VFs is the
proprietary FMRs, but AFAIK the nfs-rdma code doesn't even have a
mode that uses them, right?

 but that is a priority
 to get fixed. Shirley has a lab set up and has been looking into it.


Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-04-30 Thread Shirley Ma

On 04/30/2014 01:00 PM, Or Gerlitz wrote:

On Wed, Apr 30, 2014 at 10:47 PM, Chuck Lever chuck.le...@oracle.com wrote:


If I understood Yan, he is trying to use NFS/RDMA in guests (kvm?).  We
are pretty sure that is not working at the moment,

Can you provide a short 1-2 liner on why/what is broken there? The
only thing I can think of that is not supported over mlx4 VFs is the
proprietary FMRs, but AFAIK the nfs-rdma code doesn't even have a
mode that uses them, right?
I've created a Xen guest (DomU). The PF on Dom0, which has no mtts
script enabled, works; however, on DomU I hit this problem just by
mounting the file system:

mlx4_core 0000:00:04.0: Failed to allocate mtts for 66 pages(order 7)
mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)
mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)

The RDMA microbenchmark perftest works OK. I enabled the mtts scripts
when booting the Xen guest. cat /proc/mtrr:


[root@ca-nfsdev1vm1 log]# cat /proc/mtrr
reg00: base=0x0f0000000 ( 3840MB), size=  128MB, count=1: uncachable
reg01: base=0x0f8000000 ( 3968MB), size=   64MB, count=1: uncachable

lspci -v
00:04.0 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2 
Virtual Function] (rev b0)

Subsystem: Mellanox Technologies Device 61b0
Physical Slot: 4
Flags: bus master, fast devsel, latency 0
Memory at f0000000 (64-bit, prefetchable) [size=128M]
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [9c] MSI-X: Enable+ Count=4 Masked-
Kernel driver in use: mlx4_core
Kernel modules: mlx4_core
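
For what it's worth, the VF's 128M BAR at f0000000 lines up exactly
with reg00 above (3840MB = 0xf0000000, size 128MB); the two are easy
to compare side by side:

# Compare the VF BAR against the mtrr ranges (device 00:04.0 as above):
lspci -s 00:04.0 -v | grep 'Memory at'
cat /proc/mtrr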

I will need to find another machine to try a KVM guest. Yan might hit
a different problem.

I have a ConnectX-2, FW level 2.11.2012. Yan has a ConnectX-3; he
tried it on a KVM guest.

but that is a priority
to get fixed. Shirley has a lab set up and has been looking into it.

Shirley


Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-04-30 Thread Doug Ledford
On 04/30/2014 Shirley Ma wrote:
 On 04/30/2014 01:00 PM, Or Gerlitz wrote:
  On Wed, Apr 30, 2014 at 10:47 PM, Chuck Lever
  chuck.le...@oracle.com wrote:
 
  If I understood Yan, he is trying to use NFS/RDMA in guests (kvm?).
  We are pretty sure that is not working at the moment,
  Can you provide a short 1-2 liner on why/what is broken there? The
  only thing I can think of that is not supported over mlx4 VFs is
  the proprietary FMRs, but AFAIK the nfs-rdma code doesn't even have
  a mode that uses them, right?
 I've created a Xen guest (DomU). The PF on Dom0, which has no mtts
 script enabled, works; however, on DomU I hit this problem just by
 mounting the file system:
 mlx4_core 0000:00:04.0: Failed to allocate mtts for 66 pages(order 7)
 mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)
 mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)

 The RDMA microbenchmark perftest works OK. I enabled the mtts scripts
 when booting the Xen guest. cat /proc/mtrr:

What OS/RDMA stack are you using?  I'm not familiar with any mtts
scripts, but I know there is an mtrr fixup script I wrote for the
RDMA stack in Fedora/RHEL (and so I assume it's in Oracle Linux too,
but I haven't checked).  In fact, I assume that's the script you are
referring to, based on the fact that the next bit of your email cats
the /proc/mtrr file.  But whether or not there is an mtrr setting
mixup, I don't believe it should have any impact on the mtts
allocations in the driver.  Even if your mtrr registers were set
incorrectly, the problem then becomes either A) a serious performance
bottleneck (in the case of Intel hardware, which needs write
combining in order to get more than about 50MByte/s of throughput on
these cards) or B) failed operation, because MMIO writes to the card
are being cached/write combined when they should not be.

I suspect this is more likely Xen related than mtts/mtrr related.
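
(For reference, a fixup script like that would typically work through
the documented /proc/mtrr write interface, roughly as below; the
register number, base, and size here are illustrative assumptions,
not values from your system.)

# Drop a stale/overlapping uncachable entry (assumed here to be reg 0):
echo "disable=0" > /proc/mtrr
# Re-add the HCA BAR range as write-combining (base/size are examples):
echo "base=0xf0000000 size=0x8000000 type=write-combining" > /proc/mtrr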

 [root@ca-nfsdev1vm1 log]# cat /proc/mtrr
 reg00: base=0x0f0000000 ( 3840MB), size=  128MB, count=1: uncachable
 reg01: base=0x0f8000000 ( 3968MB), size=   64MB, count=1: uncachable
 
 lspci -v
 00:04.0 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2
 Virtual Function] (rev b0)
  Subsystem: Mellanox Technologies Device 61b0
  Physical Slot: 4
  Flags: bus master, fast devsel, latency 0
  Memory at f0000000 (64-bit, prefetchable) [size=128M]
  Capabilities: [60] Express Endpoint, MSI 00
  Capabilities: [9c] MSI-X: Enable+ Count=4 Masked-
  Kernel driver in use: mlx4_core
  Kernel modules: mlx4_core
 
 I will need to find another machine to try a KVM guest. Yan might
 hit a different problem.

 I have a ConnectX-2, FW level 2.11.2012. Yan has a ConnectX-3; he
 tried it on a KVM guest.
  but that is a priority
  to get fixed. Shirley has a lab set up and has been looking into it.
 Shirley
 

-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD
  http://people.redhat.com/dledford



Re: NFSoRDMA developers bi-weekly meeting announcement (4/30)

2014-04-30 Thread Shirley Ma


On 04/30/2014 04:58 PM, Doug Ledford wrote:

On 04/30/2014 Shirley Ma wrote:

On 04/30/2014 01:00 PM, Or Gerlitz wrote:

On Wed, Apr 30, 2014 at 10:47 PM, Chuck Lever
chuck.le...@oracle.com wrote:


If I understood Yan, he is trying to use NFS/RDMA in guests (kvm?).
We are pretty sure that is not working at the moment,

Can you provide a short 1-2 liner on why/what is broken there? The
only thing I can think of that is not supported over mlx4 VFs is the
proprietary FMRs, but AFAIK the nfs-rdma code doesn't even have a
mode that uses them, right?

I've created a Xen guest (DomU). The PF on Dom0, which has no mtts
script enabled, works; however, on DomU I hit this problem just by
mounting the file system:
mlx4_core 0000:00:04.0: Failed to allocate mtts for 66 pages(order 7)
mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)
mlx4_core 0000:00:04.0: Failed to allocate mtts for 4096 pages(order 12)

The RDMA microbenchmark perftest works OK. I enabled the mtts scripts
when booting the Xen guest. cat /proc/mtrr:

What OS/RDMA stack are you using?  I'm not familiar with any mtts
scripts, but I know there is an mtrr fixup script I wrote for the
RDMA stack in Fedora/RHEL (and so I assume it's in Oracle Linux too,
but I haven't checked).  In fact, I assume that's the script you are
referring to, based on the fact that the next bit of your email cats
the /proc/mtrr file.  But whether or not there is an mtrr setting
mixup, I don't believe it should have any impact on the mtts
allocations in the driver.  Even if your mtrr registers were set
incorrectly, the problem then becomes either A) a serious performance
bottleneck (in the case of Intel hardware, which needs write
combining in order to get more than about 50MByte/s of throughput on
these cards) or B) failed operation, because MMIO writes to the card
are being cached/write combined when they should not be.

I suspect this is more likely Xen related than mtts/mtrr related.
Yes. That's the script I used. I wonder whether it's possible to
disable mtrr on the DomU guest to debug this. I am new to Xen.

[root@ca-nfsdev1vm1 log]# cat /proc/mtrr
reg00: base=0x0f0000000 ( 3840MB), size=  128MB, count=1: uncachable
reg01: base=0x0f8000000 ( 3968MB), size=   64MB, count=1: uncachable

lspci -v
00:04.0 InfiniBand: Mellanox Technologies MT25400 Family [ConnectX-2
Virtual Function] (rev b0)
  Subsystem: Mellanox Technologies Device 61b0
  Physical Slot: 4
  Flags: bus master, fast devsel, latency 0
  Memory at f0000000 (64-bit, prefetchable) [size=128M]
  Capabilities: [60] Express Endpoint, MSI 00
  Capabilities: [9c] MSI-X: Enable+ Count=4 Masked-
  Kernel driver in use: mlx4_core
  Kernel modules: mlx4_core

I will need to find another machine to try a KVM guest. Yan might hit
a different problem.

I have a ConnectX-2, FW level 2.11.2012. Yan has a ConnectX-3; he
tried it on a KVM guest.

but that is a priority
to get fixed. Shirley has a lab set up and has been looking into it.

Shirley


