Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
On 09/18/2012 10:03 AM, Andrew Holway wrote:
> Hi Steve,
>
> Do you think these patches will make their way into the redhat kernel
> sometime soon?

The process would start by opening a bz at bugzilla.redhat.com... If you
like, you can send me the pointer to the bz and I'll make sure it gets
noticed...

> What is the state of support for NFS over RDMA at redhat?

In theory it's supported, but in reality that port is currently
unmaintained, which seems to be the case upstream as well...

steved.

> Thanks,
>
> Andrew
>
> On Sep 11, 2012, at 7:03 PM, Steve Dickson wrote:
>
>> On 09/04/2012 05:31 AM, Andrew Holway wrote:
>>> Hello.
>>>
>>> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
>>> also seem relevant to libvirt. #
>>>
>>> I have a CentOS 6.2 server and a CentOS 6.2 client.
>>>
>>> [root@store ~]# cat /etc/exports
>>> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
>>> (I have tried with non-tmpfs targets also.)
>>>
>>> [root@node001 ~]# cat /etc/fstab
>>> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>>>
>>> I wrote a little for-loop one-liner that dd'd the CentOS net install
>>> image to a file called 'hello', then checksummed that file. Each
>>> iteration uses a different block size.
>>>
>>> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
>>> block sizes gets corrupted.
>>>
>>> I want to run my KVM guests on top of NFS over RDMA. My guests cannot
>>> create filesystems.
>>>
>>> Thanks,
>>>
>>> Andrew.
>>>
>>> bug report: https://bugzilla.linux-nfs.org/show_bug.cgi?id=228
>>
>> Well, it appears the RHEL6 kernels are lacking a couple of patches that
>> might help with this:
>>
>> 5c635e09 RPCRDMA: Fix FRMR registration/invalidate handling.
>> 9b78145c xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
>>
>> I can only imagine that CentOS 6.2 might be lacking these too... ;-)
>>
>> steved.
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
Hi Steve,

Do you think these patches will make their way into the redhat kernel
sometime soon?

What is the state of support for NFS over RDMA at redhat?

Thanks,

Andrew

On Sep 11, 2012, at 7:03 PM, Steve Dickson wrote:
> On 09/04/2012 05:31 AM, Andrew Holway wrote:
>> Hello.
>>
>> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
>> also seem relevant to libvirt. #
>>
>> I have a CentOS 6.2 server and a CentOS 6.2 client.
>>
>> [root@store ~]# cat /etc/exports
>> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
>> (I have tried with non-tmpfs targets also.)
>>
>> [root@node001 ~]# cat /etc/fstab
>> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>>
>> I wrote a little for-loop one-liner that dd'd the CentOS net install
>> image to a file called 'hello', then checksummed that file. Each
>> iteration uses a different block size.
>>
>> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
>> block sizes gets corrupted.
>>
>> I want to run my KVM guests on top of NFS over RDMA. My guests cannot
>> create filesystems.
>>
>> Thanks,
>>
>> Andrew.
>>
>> bug report: https://bugzilla.linux-nfs.org/show_bug.cgi?id=228
>
> Well, it appears the RHEL6 kernels are lacking a couple of patches that
> might help with this:
>
> 5c635e09 RPCRDMA: Fix FRMR registration/invalidate handling.
> 9b78145c xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
>
> I can only imagine that CentOS 6.2 might be lacking these too... ;-)
>
> steved.
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
On 09/04/2012 05:31 AM, Andrew Holway wrote:
> Hello.
>
> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
> also seem relevant to libvirt. #
>
> I have a CentOS 6.2 server and a CentOS 6.2 client.
>
> [root@store ~]# cat /etc/exports
> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
> (I have tried with non-tmpfs targets also.)
>
> [root@node001 ~]# cat /etc/fstab
> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>
> I wrote a little for-loop one-liner that dd'd the CentOS net install
> image to a file called 'hello', then checksummed that file. Each
> iteration uses a different block size.
>
> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
> block sizes gets corrupted.
>
> I want to run my KVM guests on top of NFS over RDMA. My guests cannot
> create filesystems.
>
> Thanks,
>
> Andrew.
>
> bug report: https://bugzilla.linux-nfs.org/show_bug.cgi?id=228

Well, it appears the RHEL6 kernels are lacking a couple of patches that
might help with this:

5c635e09 RPCRDMA: Fix FRMR registration/invalidate handling.
9b78145c xprtrdma: Remove assumption that each segment is <= PAGE_SIZE

I can only imagine that CentOS 6.2 might be lacking these too... ;-)

steved.
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
On Thu, 2012-09-06 at 12:14 +0200, Andrew Holway wrote:
> On Sep 5, 2012, at 4:02 PM, Avi Kivity wrote:
>> On 09/04/2012 03:04 PM, Myklebust, Trond wrote:
>>> On Tue, 2012-09-04 at 11:31 +0200, Andrew Holway wrote:
>>>> Hello.
>>>>
>>>> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
>>>> also seem relevant to libvirt. #
>>>>
>>>> I have a CentOS 6.2 server and a CentOS 6.2 client.
>>>>
>>>> [root@store ~]# cat /etc/exports
>>>> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
>>>> (I have tried with non-tmpfs targets also.)
>>>>
>>>> [root@node001 ~]# cat /etc/fstab
>>>> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>>>>
>>>> I wrote a little for-loop one-liner that dd'd the CentOS net install
>>>> image to a file called 'hello', then checksummed that file. Each
>>>> iteration uses a different block size.
>>>>
>>>> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
>>>> block sizes gets corrupted.
>>>
>>> That is expected behaviour. DIRECT_IO over RDMA needs to be page
>>> aligned so that it can use the more efficient RDMA READ and RDMA WRITE
>>> memory semantics (instead of the SEND/RECEIVE channel semantics).
>>
>> Shouldn't subpage requests fail, then? O_DIRECT block requests fail for
>> subsector writes instead of corrupting your data.
>
> But silent data corruption is so much fun!!

A couple of RDMA folks are looking into why this is happening. I'm hoping
they will get back to me soon.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
trond.mykleb...@netapp.com
www.netapp.com
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
On Sep 5, 2012, at 4:02 PM, Avi Kivity wrote:
> On 09/04/2012 03:04 PM, Myklebust, Trond wrote:
>> On Tue, 2012-09-04 at 11:31 +0200, Andrew Holway wrote:
>>> Hello.
>>>
>>> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
>>> also seem relevant to libvirt. #
>>>
>>> I have a CentOS 6.2 server and a CentOS 6.2 client.
>>>
>>> [root@store ~]# cat /etc/exports
>>> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
>>> (I have tried with non-tmpfs targets also.)
>>>
>>> [root@node001 ~]# cat /etc/fstab
>>> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>>>
>>> I wrote a little for-loop one-liner that dd'd the CentOS net install
>>> image to a file called 'hello', then checksummed that file. Each
>>> iteration uses a different block size.
>>>
>>> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
>>> block sizes gets corrupted.
>>
>> That is expected behaviour. DIRECT_IO over RDMA needs to be page
>> aligned so that it can use the more efficient RDMA READ and RDMA WRITE
>> memory semantics (instead of the SEND/RECEIVE channel semantics).
>
> Shouldn't subpage requests fail, then? O_DIRECT block requests fail for
> subsector writes instead of corrupting your data.

But silent data corruption is so much fun!!
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
On 09/04/2012 03:04 PM, Myklebust, Trond wrote:
> On Tue, 2012-09-04 at 11:31 +0200, Andrew Holway wrote:
>> Hello.
>>
>> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
>> also seem relevant to libvirt. #
>>
>> I have a CentOS 6.2 server and a CentOS 6.2 client.
>>
>> [root@store ~]# cat /etc/exports
>> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
>> (I have tried with non-tmpfs targets also.)
>>
>> [root@node001 ~]# cat /etc/fstab
>> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>>
>> I wrote a little for-loop one-liner that dd'd the CentOS net install
>> image to a file called 'hello', then checksummed that file. Each
>> iteration uses a different block size.
>>
>> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
>> block sizes gets corrupted.
>
> That is expected behaviour. DIRECT_IO over RDMA needs to be page aligned
> so that it can use the more efficient RDMA READ and RDMA WRITE memory
> semantics (instead of the SEND/RECEIVE channel semantics).

Shouldn't subpage requests fail, then? O_DIRECT block requests fail for
subsector writes instead of corrupting your data.

Hopefully this is documented somewhere.

--
error compiling committee.c: too many arguments to function
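Trond's alignment rule above can be sketched as a quick shell predicate. This is an illustrative simplification, not kernel logic: it only checks the transfer size (a real request must also have a page-aligned buffer address and file offset), and it falls back to an assumed 4096-byte page when getconf is unavailable. On a 4 KiB-page system it flags exactly the 512-byte, 1K and 2K sizes that corrupted in the reproducer:

```shell
# page_aligned BYTES -- succeeds when BYTES is a multiple of the page size,
# i.e. when a direct-I/O transfer of that size is eligible for RDMA
# READ/WRITE memory semantics rather than SEND/RECEIVE channel semantics.
PAGE=$(getconf PAGE_SIZE 2>/dev/null || echo 4096)

page_aligned() {
    [ $(( $1 % PAGE )) -eq 0 ]
}

for bs in 512 1024 2048 4096 8192; do
    if page_aligned "$bs"; then
        echo "bs=$bs page-aligned"
    else
        echo "bs=$bs NOT page-aligned"
    fi
done
```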
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
> That is expected behaviour. DIRECT_IO over RDMA needs to be page aligned
> so that it can use the more efficient RDMA READ and RDMA WRITE memory
> semantics (instead of the SEND/RECEIVE channel semantics).

Yes, I think I am understanding that now. I need to find a way of getting
around the libvirt issue:
http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg01570.html

Thanks,

Andrew

>> I want to run my KVM guests on top of NFS over RDMA. My guests cannot
>> create filesystems.
>>
>> Thanks,
>>
>> Andrew.
>>
>> bug report: https://bugzilla.linux-nfs.org/show_bug.cgi?id=228
>>
>> [root@node001 mnt]# for f in 512 1024 2048 4096 8192 16384 32768 65536 131072; do dd bs="$f" if=CentOS-6.3-x86_64-netinstall.iso of=hello iflag=direct oflag=direct && md5sum hello && rm -f hello; done
>>
>> 409600+0 records in
>> 409600+0 records out
>> 209715200 bytes (210 MB) copied, 62.3649 s, 3.4 MB/s
>> aadd0ffe3c9dfa35d8354e99ecac9276  hello   -- 512-byte block
>>
>> 204800+0 records in
>> 204800+0 records out
>> 209715200 bytes (210 MB) copied, 41.3876 s, 5.1 MB/s
>> 336f6da78f93dab591edc18da81f002e  hello   -- 1K block
>>
>> 102400+0 records in
>> 102400+0 records out
>> 209715200 bytes (210 MB) copied, 21.1712 s, 9.9 MB/s
>> f4cefe0a05c9b47ba68effdb17dc95d6  hello   -- 2K block
>>
>> 51200+0 records in
>> 51200+0 records out
>> 209715200 bytes (210 MB) copied, 10.9631 s, 19.1 MB/s
>> 690138908de516b6e5d7d180d085c3f3  hello   -- 4K block
>>
>> 25600+0 records in
>> 25600+0 records out
>> 209715200 bytes (210 MB) copied, 5.4136 s, 38.7 MB/s
>> 690138908de516b6e5d7d180d085c3f3  hello
>>
>> 12800+0 records in
>> 12800+0 records out
>> 209715200 bytes (210 MB) copied, 3.1448 s, 66.7 MB/s
>> 690138908de516b6e5d7d180d085c3f3  hello
>>
>> 6400+0 records in
>> 6400+0 records out
>> 209715200 bytes (210 MB) copied, 1.77304 s, 118 MB/s
>> 690138908de516b6e5d7d180d085c3f3  hello
>>
>> 3200+0 records in
>> 3200+0 records out
>> 209715200 bytes (210 MB) copied, 1.4331 s, 146 MB/s
>> 690138908de516b6e5d7d180d085c3f3  hello
>>
>> 1600+0 records in
>> 1600+0 records out
>> 209715200 bytes (210 MB) copied, 0.922167 s, 227 MB/s
>> 690138908de516b6e5d7d180d085c3f3  hello
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> trond.mykleb...@netapp.com
> www.netapp.com
Re: [libvirt] NFS over RDMA small block DIRECT_IO bug
On Tue, 2012-09-04 at 11:31 +0200, Andrew Holway wrote:
> Hello.
>
> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would
> also seem relevant to libvirt. #
>
> I have a CentOS 6.2 server and a CentOS 6.2 client.
>
> [root@store ~]# cat /etc/exports
> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
> (I have tried with non-tmpfs targets also.)
>
> [root@node001 ~]# cat /etc/fstab
> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>
> I wrote a little for-loop one-liner that dd'd the CentOS net install
> image to a file called 'hello', then checksummed that file. Each
> iteration uses a different block size.
>
> Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K
> block sizes gets corrupted.

That is expected behaviour. DIRECT_IO over RDMA needs to be page aligned
so that it can use the more efficient RDMA READ and RDMA WRITE memory
semantics (instead of the SEND/RECEIVE channel semantics).

> I want to run my KVM guests on top of NFS over RDMA. My guests cannot
> create filesystems.
>
> Thanks,
>
> Andrew.
>
> bug report: https://bugzilla.linux-nfs.org/show_bug.cgi?id=228
>
> [root@node001 mnt]# for f in 512 1024 2048 4096 8192 16384 32768 65536 131072; do dd bs="$f" if=CentOS-6.3-x86_64-netinstall.iso of=hello iflag=direct oflag=direct && md5sum hello && rm -f hello; done
>
> 409600+0 records in
> 409600+0 records out
> 209715200 bytes (210 MB) copied, 62.3649 s, 3.4 MB/s
> aadd0ffe3c9dfa35d8354e99ecac9276  hello   -- 512-byte block
>
> 204800+0 records in
> 204800+0 records out
> 209715200 bytes (210 MB) copied, 41.3876 s, 5.1 MB/s
> 336f6da78f93dab591edc18da81f002e  hello   -- 1K block
>
> 102400+0 records in
> 102400+0 records out
> 209715200 bytes (210 MB) copied, 21.1712 s, 9.9 MB/s
> f4cefe0a05c9b47ba68effdb17dc95d6  hello   -- 2K block
>
> 51200+0 records in
> 51200+0 records out
> 209715200 bytes (210 MB) copied, 10.9631 s, 19.1 MB/s
> 690138908de516b6e5d7d180d085c3f3  hello   -- 4K block
>
> 25600+0 records in
> 25600+0 records out
> 209715200 bytes (210 MB) copied, 5.4136 s, 38.7 MB/s
> 690138908de516b6e5d7d180d085c3f3  hello
>
> 12800+0 records in
> 12800+0 records out
> 209715200 bytes (210 MB) copied, 3.1448 s, 66.7 MB/s
> 690138908de516b6e5d7d180d085c3f3  hello
>
> 6400+0 records in
> 6400+0 records out
> 209715200 bytes (210 MB) copied, 1.77304 s, 118 MB/s
> 690138908de516b6e5d7d180d085c3f3  hello
>
> 3200+0 records in
> 3200+0 records out
> 209715200 bytes (210 MB) copied, 1.4331 s, 146 MB/s
> 690138908de516b6e5d7d180d085c3f3  hello
>
> 1600+0 records in
> 1600+0 records out
> 209715200 bytes (210 MB) copied, 0.922167 s, 227 MB/s
> 690138908de516b6e5d7d180d085c3f3  hello

--
Trond Myklebust
Linux NFS client maintainer

NetApp
trond.mykleb...@netapp.com
www.netapp.com
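The one-liner quoted above can be wrapped into a small function that compares each copy's checksum against the source, so a corrupting block size is flagged automatically instead of being eyeballed. A sketch: the image name and the direct-I/O flags are taken from the original reproducer; drop the flags to run buffered I/O as a control.

```shell
# check_bs FILE BS [DD_FLAGS...] -- copy FILE with block size BS, then
# report OK when the copy's md5sum matches the source, CORRUPT otherwise.
check_bs() {
    img=$1; bs=$2; shift 2
    ref=$(md5sum "$img" | awk '{print $1}')
    dd bs="$bs" if="$img" of=hello.tmp "$@" 2>/dev/null
    sum=$(md5sum hello.tmp | awk '{print $1}')
    rm -f hello.tmp
    if [ "$sum" = "$ref" ]; then
        echo "bs=$bs OK"
    else
        echo "bs=$bs CORRUPT"
    fi
}

# On the NFS-over-RDMA mount from the report, this would be run as:
#   for bs in 512 1024 2048 4096; do
#       check_bs CentOS-6.3-x86_64-netinstall.iso "$bs" iflag=direct oflag=direct
#   done
```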
[libvirt] NFS over RDMA small block DIRECT_IO bug
Hello.

# Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would also
seem relevant to libvirt. #

I have a CentOS 6.2 server and a CentOS 6.2 client.

[root@store ~]# cat /etc/exports
/dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
(I have tried with non-tmpfs targets also.)

[root@node001 ~]# cat /etc/fstab
store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0

I wrote a little for-loop one-liner that dd'd the CentOS net install image
to a file called 'hello', then checksummed that file. Each iteration uses
a different block size.

Non-DIRECT_IO seems to work fine. DIRECT_IO with 512-byte, 1K and 2K block
sizes gets corrupted.

I want to run my KVM guests on top of NFS over RDMA. My guests cannot
create filesystems.

Thanks,

Andrew.

bug report: https://bugzilla.linux-nfs.org/show_bug.cgi?id=228

[root@node001 mnt]# for f in 512 1024 2048 4096 8192 16384 32768 65536 131072; do dd bs="$f" if=CentOS-6.3-x86_64-netinstall.iso of=hello iflag=direct oflag=direct && md5sum hello && rm -f hello; done

409600+0 records in
409600+0 records out
209715200 bytes (210 MB) copied, 62.3649 s, 3.4 MB/s
aadd0ffe3c9dfa35d8354e99ecac9276  hello   -- 512-byte block

204800+0 records in
204800+0 records out
209715200 bytes (210 MB) copied, 41.3876 s, 5.1 MB/s
336f6da78f93dab591edc18da81f002e  hello   -- 1K block

102400+0 records in
102400+0 records out
209715200 bytes (210 MB) copied, 21.1712 s, 9.9 MB/s
f4cefe0a05c9b47ba68effdb17dc95d6  hello   -- 2K block

51200+0 records in
51200+0 records out
209715200 bytes (210 MB) copied, 10.9631 s, 19.1 MB/s
690138908de516b6e5d7d180d085c3f3  hello   -- 4K block

25600+0 records in
25600+0 records out
209715200 bytes (210 MB) copied, 5.4136 s, 38.7 MB/s
690138908de516b6e5d7d180d085c3f3  hello

12800+0 records in
12800+0 records out
209715200 bytes (210 MB) copied, 3.1448 s, 66.7 MB/s
690138908de516b6e5d7d180d085c3f3  hello

6400+0 records in
6400+0 records out
209715200 bytes (210 MB) copied, 1.77304 s, 118 MB/s
690138908de516b6e5d7d180d085c3f3  hello

3200+0 records in
3200+0 records out
209715200 bytes (210 MB) copied, 1.4331 s, 146 MB/s
690138908de516b6e5d7d180d085c3f3  hello

1600+0 records in
1600+0 records out
209715200 bytes (210 MB) copied, 0.922167 s, 227 MB/s
690138908de516b6e5d7d180d085c3f3  hello
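For readers trying to reproduce this, the server and client configuration from the report, written out as the two files would actually look. The values are exactly those from the original post; the export path, network, fsid, and port are specific to Andrew's setup.

```
# /etc/exports on the server (store)
/dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)

# /etc/fstab entry on the client (node001)
store.ibnet:/dev/shm  /mnt  nfs  rdma,port=2050,defaults  0 0
```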