Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 23/11/2010 08:14 Alexander Zagrebin said the following:
> It seems that this patch isn't merged into RELENG_8.
> Are there chances that it will be merged before 8.2-RELEASE?

Yes. MFC timer is ticking.

-- 
Andriy Gapon
RE: 8.1-STABLE: zfs and sendfile: problem still exists
> -----Original Message-----
> From: Andriy Gapon [mailto:a...@freebsd.org]
> Sent: Saturday, October 30, 2010 1:53 PM
> To: Artemiev Igor
> Cc: freebsd-stable@freebsd.org; freebsd...@freebsd.org; Alexander Zagrebin
> Subject: Re: 8.1-STABLE: zfs and sendfile: problem still exists
>
> Heh, next try.
>
> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> ===
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
> @@ -67,6 +67,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Programming rules.
> @@ -464,7 +465,7 @@
>  			uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
>  			vm_page_wakeup(m);
> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
>  			 * correctly with ZFS. As pointed out by ups@
> @@ -474,9 +475,23 @@
>  			 */
>  			KASSERT(off == 0,
>  			    ("unexpected offset in mappedread for sendfile"));
> -			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
> +			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>  				goto again;
> -			vm_page_busy(m);
> +			if (m == NULL) {
> +				m = vm_page_alloc(obj, OFF_TO_IDX(start),
> +				    VM_ALLOC_NOBUSY | VM_ALLOC_NORMAL);
> +				if (m == NULL) {
> +					VM_OBJECT_UNLOCK(obj);
> +					VM_WAIT;
> +					VM_OBJECT_LOCK(obj);
> +					goto again;
> +				}
> +			} else {
> +				vm_page_lock_queues();
> +				vm_page_wire(m);
> +				vm_page_unlock_queues();
> +			}
> +			vm_page_io_start(m);
>  			VM_OBJECT_UNLOCK(obj);
>  			if (dirbytes > 0) {
>  				error = dmu_read_uio(os, zp->z_id, uio,
> @@ -494,7 +509,10 @@
>  			VM_OBJECT_LOCK(obj);
>  			if (error == 0)
>  				m->valid = VM_PAGE_BITS_ALL;
> -			vm_page_wakeup(m);
> +			vm_page_io_finish(m);
> +			vm_page_lock_queues();
> +			vm_page_unwire(m, 0);
> +			vm_page_unlock_queues();
>  			if (error == 0) {
>  				uio->uio_resid -= bytes;
>  				uio->uio_offset += bytes;

It seems that this patch isn't merged into RELENG_8.
Are there chances that it will be merged before 8.2-RELEASE?

-- 
Alexander Zagrebin
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On 2010-11-01 8:30, Andriy Gapon wrote:
> First and foremost, the double-caching issue for ZFS+sendfile on FreeBSD is
> still there and no resolution for this issue is on the horizon.
> So, you have to account for the fact that twice as much memory is needed for
> this use-case. Whether you plan your system, or configure it, or tune it.
> Second, with recent head and stable/8 the ARC should not be the primary
> victim of memory pressure; the ARC reclaim thread and the page daemon should
> cooperate in freeing/recycling memory.
> Nothing much to add.

Although this discussion started due to issues with serving files through
web-type services, there are more apps that use sendfile. For one, I noticed
that I had once enabled sendfile in my Samba config. Given this discussion I
see little advantage in keeping it that way... But I'm open to other
suggestions.

--WjW
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 31/10/2010 11:02 Alexander Zagrebin said the following:
> I have a question.
> When we transfer a file via sendfile, the current code allocates memory,
> marked inactive. For example, if the file has a size of 100 MB, then 100 MB
> of memory will be allocated.
> If we have to transfer this file again later, this memory will be used as a
> cache, and no disk I/O will be required.
> The memory will be freed if the file is deleted or the operating system
> needs additional memory.
> Have I understood correctly?
> If so, then I continue...
> Such behaviour is good if we have files of relatively small size.
> Suppose we have to transfer a file of large size (for example, greater than
> the amount of physical memory).
> While transferring, the inactive memory will grow, pressuring the ARC.
> When the size of the ARC falls to its minimum (vfs.zfs.arc_min), the
> inactive memory will be reused.
> So, when the transfer is complete, we have:
> 1. No free memory
> 2. The ARC is at its minimal size (this is bad)
> 3. The inactive memory contains only the _tail_ of the file (this is bad too)
> Now if we have to transfer this file again, then
> 1. there is no (or little) file data in the ARC (the ARC is too small)
> 2. the inactive memory doesn't contain the head part of the file
> So the file's data will be read from disk again and again...
> Also I've noticed that inactive memory is freed relatively slowly, so if
> there is frequent access to large files, the system will run under very
> suboptimal conditions.
> It's IMHO...
> Can you comment on this?

First and foremost, the double-caching issue for ZFS+sendfile on FreeBSD is
still there and no resolution for this issue is on the horizon.
So, you have to account for the fact that twice as much memory is needed for
this use-case. Whether you plan your system, or configure it, or tune it.
Second, with recent head and stable/8 the ARC should not be the primary
victim of memory pressure; the ARC reclaim thread and the page daemon should
cooperate in freeing/recycling memory.
Nothing much to add.

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Sun, 31 Oct 2010 10:02:44 +0100, Alexander Zagrebin wrote:
>>>> I apologize for my haste, it should have been VM_ALLOC_WIRED.
>>>
>>> Ok, applied and tested under some load (~1200 active connections,
>>> outgoing ~80 MB/s). The patch works as expected and I have noted no side
>>> effects. Just one question: should the Active memory counter grow if some
>>> pages are "hot" (during multiple sendfile runs on one file)?
>>
>> Pages used by sendfile are marked as Inactive for faster reclamation on
>> demand.
>
> I have a question.
> When we transfer a file via sendfile, the current code allocates memory,
> marked inactive. For example, if the file has a size of 100 MB, then 100 MB
> of memory will be allocated.
> If we have to transfer this file again later, this memory will be used as a
> cache, and no disk I/O will be required.
> The memory will be freed if the file is deleted or the operating system
> needs additional memory.
> Have I understood correctly?
> If so, then I continue...
> Such behaviour is good if we have files of relatively small size.
> Suppose we have to transfer a file of large size (for example, greater than
> the amount of physical memory).
> While transferring, the inactive memory will grow, pressuring the ARC.
> When the size of the ARC falls to its minimum (vfs.zfs.arc_min), the
> inactive memory will be reused.
> So, when the transfer is complete, we have:
> 1. No free memory
> 2. The ARC is at its minimal size (this is bad)
> 3. The inactive memory contains only the _tail_ of the file (this is bad too)
> Now if we have to transfer this file again, then
> 1. there is no (or little) file data in the ARC (the ARC is too small)
> 2. the inactive memory doesn't contain the head part of the file
> So the file's data will be read from disk again and again...
> Also I've noticed that inactive memory is freed relatively slowly, so if
> there is frequent access to large files, the system will run under very
> suboptimal conditions.
> It's IMHO...
> Can you comment on this?

Add more RAM?
RE: 8.1-STABLE: zfs and sendfile: problem still exists
>>> I apologize for my haste, it should have been VM_ALLOC_WIRED.
>>
>> Ok, applied and tested under some load (~1200 active connections, outgoing
>> ~80 MB/s). The patch works as expected and I have noted no side effects.
>> Just one question: should the Active memory counter grow if some pages are
>> "hot" (during multiple sendfile runs on one file)?
>
> Pages used by sendfile are marked as Inactive for faster reclamation on
> demand.

I have a question.
When we transfer a file via sendfile, the current code allocates memory,
marked inactive. For example, if the file has a size of 100 MB, then 100 MB
of memory will be allocated.
If we have to transfer this file again later, this memory will be used as a
cache, and no disk I/O will be required.
The memory will be freed if the file is deleted or the operating system
needs additional memory.
Have I understood correctly?
If so, then I continue...
Such behaviour is good if we have files of relatively small size.
Suppose we have to transfer a file of large size (for example, greater than
the amount of physical memory).
While transferring, the inactive memory will grow, pressuring the ARC.
When the size of the ARC falls to its minimum (vfs.zfs.arc_min), the
inactive memory will be reused.
So, when the transfer is complete, we have:
1. No free memory
2. The ARC is at its minimal size (this is bad)
3. The inactive memory contains only the _tail_ of the file (this is bad too)
Now if we have to transfer this file again, then
1. there is no (or little) file data in the ARC (the ARC is too small)
2. the inactive memory doesn't contain the head part of the file
So the file's data will be read from disk again and again...
Also I've noticed that inactive memory is freed relatively slowly, so if
there is frequent access to large files, the system will run under very
suboptimal conditions.
It's IMHO...
Can you comment on this?

-- 
Alexander Zagrebin
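To make the scenario above concrete, here is a minimal userland harness that
pushes a file through sendfile(2) on FreeBSD the same way ftpd or nginx would.
It is a sketch for experimentation, not code from this thread; the file name,
destination address and the listener (e.g. "nc -l 12345 > /dev/null") are
arbitrary assumptions.

/* sf_test.c - push a file to a TCP peer with sendfile(2). */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#include <arpa/inet.h>
#include <netinet/in.h>

#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	struct sockaddr_in sin;
	off_t sbytes = 0;
	int fd, s;

	if (argc != 4)
		errx(1, "usage: %s file ip port", argv[0]);
	if ((fd = open(argv[1], O_RDONLY)) == -1)
		err(1, "open");
	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		err(1, "socket");

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = inet_addr(argv[2]);
	sin.sin_port = htons((unsigned short)atoi(argv[3]));
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err(1, "connect");

	/* nbytes == 0 asks the kernel to send from offset 0 to EOF. */
	if (sendfile(fd, s, 0, 0, NULL, &sbytes, 0) == -1)
		err(1, "sendfile");
	printf("sent %jd bytes\n", (intmax_t)sbytes);

	close(s);
	close(fd);
	return (0);
}

Running this against a file larger than RAM while watching top(1) should show
the Inactive queue growing at the ARC's expense, as described above.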
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 30/10/2010 22:01 Artemiev Igor said the following:
> On Sat, Oct 30, 2010 at 05:43:54PM +0300, Andriy Gapon wrote:
>
>> I apologize for my haste, it should have been VM_ALLOC_WIRED.
>
> Ok, applied and tested under some load (~1200 active connections, outgoing
> ~80 MB/s). The patch works as expected and I have noted no side effects.
> Just one question: should the Active memory counter grow if some pages are
> "hot" (during multiple sendfile runs on one file)?

Pages used by sendfile are marked as Inactive for faster reclamation on
demand.

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 31/10/2010 02:37 Kostik Belousov said the following:
> On Sat, Oct 30, 2010 at 05:43:54PM +0300, Andriy Gapon wrote:
>> on 30/10/2010 14:25 Artemiev Igor said the following:
>>> On Sat, Oct 30, 2010 at 01:33:00PM +0300, Andriy Gapon wrote:
>>>> on 30/10/2010 13:12 Artemiev Igor said the following:
>>>>> On Sat, Oct 30, 2010 at 12:52:54PM +0300, Andriy Gapon wrote:
>>>>>
>>>>>> Heh, next try.
>>>>>
>>>>> Got a panic, "vm_page_unwire: invalid wire count: 0"
>>>>
>>>> Oh, thank you for testing - forgot another piece (VM_ALLOC_WIRE for
>>>> vm_page_alloc):
>>>
>>> Yep, it works. But VM_ALLOC_WIRE does not exist in RELENG_8, therefore I
>>> slightly modified your patch:
>>
>> I apologize for my haste, it should have been VM_ALLOC_WIRED.
>> Here is a corrected patch:
>> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
>> ===
>> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
>> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
>> @@ -67,6 +67,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  /*
>>   * Programming rules.
>> @@ -464,7 +465,7 @@
>>  			uiomove_fromphys(&m, off, bytes, uio);
>>  			VM_OBJECT_LOCK(obj);
>>  			vm_page_wakeup(m);
>> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
>> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>>  			/*
>>  			 * The code below is here to make sendfile(2) work
>>  			 * correctly with ZFS. As pointed out by ups@
>> @@ -474,9 +475,23 @@
>>  			 */
>>  			KASSERT(off == 0,
>>  			    ("unexpected offset in mappedread for sendfile"));
>> -			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>> +			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>>  				goto again;
>> -			vm_page_busy(m);
>> +			if (m == NULL) {
>> +				m = vm_page_alloc(obj, OFF_TO_IDX(start),
>> +				    VM_ALLOC_NOBUSY | VM_ALLOC_WIRED | VM_ALLOC_NORMAL);
>> +				if (m == NULL) {
>> +					VM_OBJECT_UNLOCK(obj);
>> +					VM_WAIT;
>> +					VM_OBJECT_LOCK(obj);
>> +					goto again;
>> +				}
>> +			} else {
>> +				vm_page_lock_queues();
>> +				vm_page_wire(m);
>> +				vm_page_unlock_queues();
>> +			}
>> +			vm_page_io_start(m);
> Why wiring the page if it is busied ?

Eh? Because it is not?

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Sat, Oct 30, 2010 at 05:43:54PM +0300, Andriy Gapon wrote:
> on 30/10/2010 14:25 Artemiev Igor said the following:
>> On Sat, Oct 30, 2010 at 01:33:00PM +0300, Andriy Gapon wrote:
>>> on 30/10/2010 13:12 Artemiev Igor said the following:
>>>> On Sat, Oct 30, 2010 at 12:52:54PM +0300, Andriy Gapon wrote:
>>>>
>>>>> Heh, next try.
>>>>
>>>> Got a panic, "vm_page_unwire: invalid wire count: 0"
>>>
>>> Oh, thank you for testing - forgot another piece (VM_ALLOC_WIRE for
>>> vm_page_alloc):
>>
>> Yep, it works. But VM_ALLOC_WIRE does not exist in RELENG_8, therefore I
>> slightly modified your patch:
>
> I apologize for my haste, it should have been VM_ALLOC_WIRED.
> Here is a corrected patch:
> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> ===
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
> @@ -67,6 +67,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Programming rules.
> @@ -464,7 +465,7 @@
>  			uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
>  			vm_page_wakeup(m);
> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
>  			 * correctly with ZFS. As pointed out by ups@
> @@ -474,9 +475,23 @@
>  			 */
>  			KASSERT(off == 0,
>  			    ("unexpected offset in mappedread for sendfile"));
> -			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
> +			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>  				goto again;
> -			vm_page_busy(m);
> +			if (m == NULL) {
> +				m = vm_page_alloc(obj, OFF_TO_IDX(start),
> +				    VM_ALLOC_NOBUSY | VM_ALLOC_WIRED | VM_ALLOC_NORMAL);
> +				if (m == NULL) {
> +					VM_OBJECT_UNLOCK(obj);
> +					VM_WAIT;
> +					VM_OBJECT_LOCK(obj);
> +					goto again;
> +				}
> +			} else {
> +				vm_page_lock_queues();
> +				vm_page_wire(m);
> +				vm_page_unlock_queues();
> +			}
> +			vm_page_io_start(m);

Why wiring the page if it is busied ?
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Sat, Oct 30, 2010 at 05:43:54PM +0300, Andriy Gapon wrote:
> I apologize for my haste, it should have been VM_ALLOC_WIRED.

Ok, applied and tested under some load (~1200 active connections, outgoing
~80 MB/s). The patch works as expected and I have noted no side effects.
Just one question: should the Active memory counter grow if some pages are
"hot" (during multiple sendfile runs on one file)?
RE: 8.1-STABLE: zfs and sendfile: problem still exists
>>> Oh, thank you for testing - forgot another piece (VM_ALLOC_WIRE for
>>> vm_page_alloc):
>>
>> Yep, it works. But VM_ALLOC_WIRE does not exist in RELENG_8, therefore I
>> slightly modified your patch:
>
> I apologize for my haste, it should have been VM_ALLOC_WIRED.
> Here is a corrected patch:
>
> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> ===
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
> @@ -67,6 +67,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Programming rules.
> @@ -464,7 +465,7 @@
>  			uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
>  			vm_page_wakeup(m);
> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
>  			 * correctly with ZFS. As pointed out by ups@
> @@ -474,9 +475,23 @@
>  			 */
>  			KASSERT(off == 0,
>  			    ("unexpected offset in mappedread for sendfile"));
> -			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
> +			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>  				goto again;
> -			vm_page_busy(m);
> +			if (m == NULL) {
> +				m = vm_page_alloc(obj, OFF_TO_IDX(start),
> +				    VM_ALLOC_NOBUSY | VM_ALLOC_WIRED | VM_ALLOC_NORMAL);
> +				if (m == NULL) {
> +					VM_OBJECT_UNLOCK(obj);
> +					VM_WAIT;
> +					VM_OBJECT_LOCK(obj);
> +					goto again;
> +				}
> +			} else {
> +				vm_page_lock_queues();
> +				vm_page_wire(m);
> +				vm_page_unlock_queues();
> +			}
> +			vm_page_io_start(m);
>  			VM_OBJECT_UNLOCK(obj);
>  			if (dirbytes > 0) {
>  				error = dmu_read_uio(os, zp->z_id, uio,
> @@ -494,7 +509,10 @@
>  			VM_OBJECT_LOCK(obj);
>  			if (error == 0)
>  				m->valid = VM_PAGE_BITS_ALL;
> -			vm_page_wakeup(m);
> +			vm_page_io_finish(m);
> +			vm_page_lock_queues();
> +			vm_page_unwire(m, 0);
> +			vm_page_unlock_queues();
>  			if (error == 0) {
>  				uio->uio_resid -= bytes;
>  				uio->uio_offset += bytes;

Big thanks to Andriy, Igor and all who have paid attention to this problem.
I've tried this patch on the test system running under VirtualBox, and it
seems that it solves the problem.
I'll try to test this patch in real conditions today.

-- 
Alexander Zagrebin
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 30/10/2010 14:25 Artemiev Igor said the following:
> On Sat, Oct 30, 2010 at 01:33:00PM +0300, Andriy Gapon wrote:
>> on 30/10/2010 13:12 Artemiev Igor said the following:
>>> On Sat, Oct 30, 2010 at 12:52:54PM +0300, Andriy Gapon wrote:
>>>
>>>> Heh, next try.
>>>
>>> Got a panic, "vm_page_unwire: invalid wire count: 0"
>>
>> Oh, thank you for testing - forgot another piece (VM_ALLOC_WIRE for
>> vm_page_alloc):
>
> Yep, it works. But VM_ALLOC_WIRE does not exist in RELENG_8, therefore I
> slightly modified your patch:

I apologize for my haste, it should have been VM_ALLOC_WIRED.
Here is a corrected patch:

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
@@ -67,6 +67,7 @@
 #include
 #include
 #include
+#include

 /*
  * Programming rules.
@@ -464,7 +465,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -474,9 +475,23 @@
 			 */
 			KASSERT(off == 0,
 			    ("unexpected offset in mappedread for sendfile"));
-			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
+			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
-			vm_page_busy(m);
+			if (m == NULL) {
+				m = vm_page_alloc(obj, OFF_TO_IDX(start),
+				    VM_ALLOC_NOBUSY | VM_ALLOC_WIRED | VM_ALLOC_NORMAL);
+				if (m == NULL) {
+					VM_OBJECT_UNLOCK(obj);
+					VM_WAIT;
+					VM_OBJECT_LOCK(obj);
+					goto again;
+				}
+			} else {
+				vm_page_lock_queues();
+				vm_page_wire(m);
+				vm_page_unlock_queues();
+			}
+			vm_page_io_start(m);
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {
 				error = dmu_read_uio(os, zp->z_id, uio,
@@ -494,7 +509,10 @@
 			VM_OBJECT_LOCK(obj);
 			if (error == 0)
 				m->valid = VM_PAGE_BITS_ALL;
-			vm_page_wakeup(m);
+			vm_page_io_finish(m);
+			vm_page_lock_queues();
+			vm_page_unwire(m, 0);
+			vm_page_unlock_queues();
 			if (error == 0) {
 				uio->uio_resid -= bytes;
 				uio->uio_offset += bytes;

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Sat, Oct 30, 2010 at 01:33:00PM +0300, Andriy Gapon wrote:
> on 30/10/2010 13:12 Artemiev Igor said the following:
>> On Sat, Oct 30, 2010 at 12:52:54PM +0300, Andriy Gapon wrote:
>>
>>> Heh, next try.
>>
>> Got a panic, "vm_page_unwire: invalid wire count: 0"
>
> Oh, thank you for testing - forgot another piece (VM_ALLOC_WIRE for
> vm_page_alloc):

Yep, it works. But VM_ALLOC_WIRE does not exist in RELENG_8, therefore I
slightly modified your patch:

--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c.orig	2010-10-30 11:56:41.621138440 +0200
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	2010-10-30 12:49:32.858692096 +0200
@@ -67,6 +67,7 @@
 #include
 #include
 #include
+#include

 /*
  * Programming rules.
@@ -464,7 +465,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -474,9 +475,23 @@
 			 */
 			KASSERT(off == 0,
 			    ("unexpected offset in mappedread for sendfile"));
-			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
+			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
-			vm_page_busy(m);
+			if (m == NULL) {
+				m = vm_page_alloc(obj, OFF_TO_IDX(start),
+				    VM_ALLOC_NOBUSY | VM_ALLOC_NORMAL);
+				if (m == NULL) {
+					VM_OBJECT_UNLOCK(obj);
+					VM_WAIT;
+					VM_OBJECT_LOCK(obj);
+					goto again;
+				}
+			}
+			vm_page_lock_queues();
+			vm_page_wire(m);
+			vm_page_unlock_queues();
+
+			vm_page_io_start(m);
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {
 				error = dmu_read_uio(os, zp->z_id, uio,
@@ -494,6 +509,10 @@
 			VM_OBJECT_LOCK(obj);
 			if (error == 0)
 				m->valid = VM_PAGE_BITS_ALL;
+			vm_page_io_finish(m);
+			vm_page_lock_queues();
+			vm_page_unwire(m, 0);
+			vm_page_unlock_queues();
 			vm_page_wakeup(m);
 			if (error == 0) {
 				uio->uio_resid -= bytes;
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 30/10/2010 13:12 Artemiev Igor said the following:
> On Sat, Oct 30, 2010 at 12:52:54PM +0300, Andriy Gapon wrote:
>
>> Heh, next try.
>
> Got a panic, "vm_page_unwire: invalid wire count: 0"

Oh, thank you for testing - forgot another piece (VM_ALLOC_WIRE for
vm_page_alloc):

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
@@ -67,6 +67,7 @@
 #include
 #include
 #include
+#include

 /*
  * Programming rules.
@@ -464,7 +465,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -474,9 +475,23 @@
 			 */
 			KASSERT(off == 0,
 			    ("unexpected offset in mappedread for sendfile"));
-			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
+			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
-			vm_page_busy(m);
+			if (m == NULL) {
+				m = vm_page_alloc(obj, OFF_TO_IDX(start),
+				    VM_ALLOC_NOBUSY | VM_ALLOC_WIRE | VM_ALLOC_NORMAL);
+				if (m == NULL) {
+					VM_OBJECT_UNLOCK(obj);
+					VM_WAIT;
+					VM_OBJECT_LOCK(obj);
+					goto again;
+				}
+			} else {
+				vm_page_lock_queues();
+				vm_page_wire(m);
+				vm_page_unlock_queues();
+			}
+			vm_page_io_start(m);
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {
 				error = dmu_read_uio(os, zp->z_id, uio,
@@ -494,7 +509,10 @@
 			VM_OBJECT_LOCK(obj);
 			if (error == 0)
 				m->valid = VM_PAGE_BITS_ALL;
-			vm_page_wakeup(m);
+			vm_page_io_finish(m);
+			vm_page_lock_queues();
+			vm_page_unwire(m, 0);
+			vm_page_unlock_queues();
 			if (error == 0) {
 				uio->uio_resid -= bytes;
 				uio->uio_offset += bytes;

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Sat, Oct 30, 2010 at 12:52:54PM +0300, Andriy Gapon wrote:
> Heh, next try.

Got a panic, "vm_page_unwire: invalid wire count: 0"
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Sat, Oct 30, 2010 at 11:25:05AM +0300, Andriy Gapon wrote:
>
> Note: I have only compile tested the patch.
>
> Missed one NULL check.
>
> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> ===
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
> @@ -67,6 +67,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Programming rules.
> @@ -464,7 +465,7 @@
>  			uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
>  			vm_page_wakeup(m);
> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
>  			 * correctly with ZFS. As pointed out by ups@
> @@ -474,8 +475,18 @@
>  			 */
>  			KASSERT(off == 0,
>  			    ("unexpected offset in mappedread for sendfile"));
> -			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
> +			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>  				goto again;
> +			if (m == NULL) {
> +				m = vm_page_alloc(obj, OFF_TO_IDX(start),
> +				    VM_ALLOC_NOBUSY | VM_ALLOC_SYSTEM);
> +				if (m == NULL) {
> +					VM_OBJECT_UNLOCK(obj);
> +					VM_WAIT;
> +					VM_OBJECT_LOCK(obj);
> +					goto again;
> +				}
> +			}
>  			vm_page_busy(m);
>  			VM_OBJECT_UNLOCK(obj);
>  			if (dirbytes > 0) {

Ok, I tested this patch. It works :) freebsd_zfs_read is now called
(file_size/MAXBSIZE) times. Thanks!
Re: 8.1-STABLE: zfs and sendfile: problem still exists
Heh, next try.

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
@@ -67,6 +67,7 @@
 #include
 #include
 #include
+#include

 /*
  * Programming rules.
@@ -464,7 +465,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -474,9 +475,23 @@
 			 */
 			KASSERT(off == 0,
 			    ("unexpected offset in mappedread for sendfile"));
-			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
+			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
-			vm_page_busy(m);
+			if (m == NULL) {
+				m = vm_page_alloc(obj, OFF_TO_IDX(start),
+				    VM_ALLOC_NOBUSY | VM_ALLOC_NORMAL);
+				if (m == NULL) {
+					VM_OBJECT_UNLOCK(obj);
+					VM_WAIT;
+					VM_OBJECT_LOCK(obj);
+					goto again;
+				}
+			} else {
+				vm_page_lock_queues();
+				vm_page_wire(m);
+				vm_page_unlock_queues();
+			}
+			vm_page_io_start(m);
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {
 				error = dmu_read_uio(os, zp->z_id, uio,
@@ -494,7 +509,10 @@
 			VM_OBJECT_LOCK(obj);
 			if (error == 0)
 				m->valid = VM_PAGE_BITS_ALL;
-			vm_page_wakeup(m);
+			vm_page_io_finish(m);
+			vm_page_lock_queues();
+			vm_page_unwire(m, 0);
+			vm_page_unlock_queues();
 			if (error == 0) {
 				uio->uio_resid -= bytes;
 				uio->uio_offset += bytes;

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 30/10/2010 11:16 Andriy Gapon said the following:
> on 30/10/2010 11:16 Andriy Gapon said the following:
>> Or maybe something like the following?
>> It looks a little bit cleaner to me, but still is not perfect, as I have
>> not handled unnecessary busy-ing of the pages where something more
>> lightweight could have sufficed (e.g. wiring and shared busying).
>
> Note: I have only compile tested the patch.

Missed one NULL check.

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
@@ -67,6 +67,7 @@
 #include
 #include
 #include
+#include

 /*
  * Programming rules.
@@ -464,7 +465,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -474,8 +475,18 @@
 			 */
 			KASSERT(off == 0,
 			    ("unexpected offset in mappedread for sendfile"));
-			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
+			if (m != NULL && vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
+			if (m == NULL) {
+				m = vm_page_alloc(obj, OFF_TO_IDX(start),
+				    VM_ALLOC_NOBUSY | VM_ALLOC_SYSTEM);
+				if (m == NULL) {
+					VM_OBJECT_UNLOCK(obj);
+					VM_WAIT;
+					VM_OBJECT_LOCK(obj);
+					goto again;
+				}
+			}
 			vm_page_busy(m);
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 30/10/2010 11:16 Andriy Gapon said the following:
> Or maybe something like the following?
> It looks a little bit cleaner to me, but still is not perfect, as I have
> not handled unnecessary busy-ing of the pages where something more
> lightweight could have sufficed (e.g. wiring and shared busying).

Note: I have only compile tested the patch.

> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> ===
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
> @@ -67,6 +67,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Programming rules.
> @@ -464,7 +465,7 @@
>  			uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
>  			vm_page_wakeup(m);
> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
>  			 * correctly with ZFS. As pointed out by ups@
> @@ -476,6 +477,16 @@
>  			    ("unexpected offset in mappedread for sendfile"));
>  			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>  				goto again;
> +			if (m == NULL) {
> +				m = vm_page_alloc(obj, OFF_TO_IDX(start),
> +				    VM_ALLOC_NOBUSY | VM_ALLOC_SYSTEM);
> +				if (m == NULL) {
> +					VM_OBJECT_UNLOCK(obj);
> +					VM_WAIT;
> +					VM_OBJECT_LOCK(obj);
> +					goto again;
> +				}
> +			}
>  			vm_page_busy(m);
>  			VM_OBJECT_UNLOCK(obj);
>  			if (dirbytes > 0) {

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 20:51 Artemiev Igor said the following:
> On Fri, Oct 29, 2010 at 07:06:03PM +0300, Andriy Gapon wrote:
>> Probably yes, but have to be careful there.
>> First, do vm_page_grab only for the UIO_NOCOPY case.
>> Second, the first page is already "shared busy" after the
>> vm_page_io_start() call in kern_sendfile; so you might need
>> VM_ALLOC_IGN_SBUSY for that page to avoid a deadlock.
>
> RELENG_8 doesn't have VM_ALLOC_IGN_SBUSY, it appeared only in HEAD.
> Can you review this patch? Have I understood correctly? (didn't test it yet)
>
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c.orig	2010-10-29 18:18:23.921078337 +0200
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	2010-10-29 19:23:48.142513084 +0200
> @@ -449,7 +449,7 @@
>  		int bytes = MIN(PAGESIZE - off, len);
>
>  again:
> -		if ((m = vm_page_lookup(obj, OFF_TO_IDX(start))) != NULL &&
> +		if (uio->uio_segflg != UIO_NOCOPY && (m = vm_page_lookup(obj, OFF_TO_IDX(start))) != NULL &&
>  		    vm_page_is_valid(m, off, bytes)) {
>  			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
>  				goto again;
> @@ -464,7 +464,7 @@
>  			uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
>  			vm_page_wakeup(m);
> -		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
> +		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
>  			 * correctly with ZFS. As pointed out by ups@
> @@ -472,11 +472,9 @@
>  			 * but it pessimize performance of sendfile/UFS, that's
>  			 * why I handle this special case in ZFS code.
>  			 */
> -			KASSERT(off == 0,
> -			    ("unexpected offset in mappedread for sendfile"));
> -			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
> -				goto again;
> -			vm_page_busy(m);
> +			if ((m = vm_page_lookup(obj, OFF_TO_IDX(start))) == NULL || !vm_page_is_valid(m, off, bytes))
> +				m = vm_page_grab(obj, OFF_TO_IDX(start), VM_ALLOC_NORMAL|VM_ALLOC_RETRY);
> +
>  			VM_OBJECT_UNLOCK(obj);
>  			if (dirbytes > 0) {
>  				error = dmu_read_uio(os, zp->z_id, uio,

Or maybe something like the following?
It looks a little bit cleaner to me, but still is not perfect, as I have not
handled unnecessary busy-ing of the pages where something more lightweight
could have sufficed (e.g. wiring and shared busying).

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(revision 214318)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	(working copy)
@@ -67,6 +67,7 @@
 #include
 #include
 #include
+#include

 /*
  * Programming rules.
@@ -464,7 +465,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -476,6 +477,16 @@
 			    ("unexpected offset in mappedread for sendfile"));
 			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
+			if (m == NULL) {
+				m = vm_page_alloc(obj, OFF_TO_IDX(start),
+				    VM_ALLOC_NOBUSY | VM_ALLOC_SYSTEM);
+				if (m == NULL) {
+					VM_OBJECT_UNLOCK(obj);
+					VM_WAIT;
+					VM_OBJECT_LOCK(obj);
+					goto again;
+				}
+			}
 			vm_page_busy(m);
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 17:41 Andriy Gapon said the following:
> on 29/10/2010 15:36 Andriy Gapon said the following:
>> on 29/10/2010 12:04 Artemiev Igor said the following:
>>> Yep, this problem exists. You may work around it by bumping
>>> net.inet.tcp.sendspace up to 128k. zfs sendfile is very ineffective. I
>>> have made a small investigation via DTrace: it reads MAXBSIZE chunks,
>>> but maps only one page (4K) in vm. I.e. if you have a file with size
>>> 512K, sendfile makes 128 calls to freebsd_zfs_read.
>>
>> What svn revision of FreeBSD source tree did you test?
>
> Ah, I think I see what's going on.
> Either sendfile should (have an option to) use VOP_GETPAGES to request
> data, or ZFS mappedread should use vm_page_grab instead of vm_page_lookup
> for the UIO_NOCOPY case.
> Currently ZFS would read a whole FS block into ARC, but populate only one
> page with data, and for the rest it would just wastefully do
> uiomove(UIO_NOCOPY) from ARC data.
> So, e.g. zpool iostat would show that there are only a few actual reads
> from a pool.
> The rest of the time must be spent churning over the data already in ARC
> and doing page-per-VOP_READ copies from it.

Hmm, I investigated the issue some more and now I wouldn't put all the blame
on ZFS.
Indeed, perhaps ZFS is very inefficient here; perhaps it does extra looping
and extra copying. However, those operations should not lead to such a
significant slowdown, but mostly to increased CPU usage.
So, it looks like sendfile spends most of the time in sbwait(). Of course,
the "erratic" behavior of ZFS does contribute to that.
It's this code in kern_sendfile that gets triggered by ZFS:

		if (pg->valid && vm_page_is_valid(pg, pgoff, xfsize))
			VM_OBJECT_UNLOCK(obj);
		else if (m != NULL)
			error = EAGAIN;	/* send what we already got */
		else
			...

Essentially, data is not only read from ZFS page by page, but it is also
mostly sent one page-sized chunk at a time.

P.S. Just stating the obvious, kind of :-)

-- 
Andriy Gapon
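Spelled out with comments, the decision quoted above works like this. This is
an annotated paraphrase of the 8.x kern_sendfile() loop, not verbatim source;
'm' is the mbuf chain collected so far on the current pass.

	if (pg->valid && vm_page_is_valid(pg, pgoff, xfsize)) {
		/* The page is already fully valid: no I/O, just send it. */
		VM_OBJECT_UNLOCK(obj);
	} else if (m != NULL) {
		/*
		 * Some data has already been collected on this pass, so
		 * rather than block on I/O, flush what we have.  Because
		 * ZFS marks only one page valid per VOP_READ, this branch
		 * fires for every page after the first, and the data goes
		 * out one page-sized chunk at a time.
		 */
		error = EAGAIN;		/* send what we already got */
	} else {
		/* Nothing collected yet: do VOP_READ to bring the page in. */
		...
	}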
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Fri, Oct 29, 2010 at 06:22:54PM +0300, Andriy Gapon wrote:
> on 29/10/2010 18:17 Kostik Belousov said the following:
>> On Fri, Oct 29, 2010 at 06:05:26PM +0300, Andriy Gapon wrote:
>>> on 29/10/2010 17:53 Kostik Belousov said the following:
>>>> Could it be the priming of the vm object pages content ?
>>>
>>> Sorry, not familiar with this term.
>>> Do you mean prepopulation of the vm object with valid pages?
>>>
>>>> Due to double-buffering, and (possibly false) optimization to only
>>>
>>> What optimization?
>> On zfs vnode read, the page from the corresponding vm object is only
>> populated with the vnode data if the page already exists in the object.
>
> Do you mean a specific type of read?
> For "normal" reads it's the other way around - if the page already exists
> and is valid, then we read from the page, not from the ARC.

Let me repeat it once more: zfs does not properly cache the vnode data
content in the page cache (where "cache" is used in a weaker sense: not the
FreeBSD 'cached' memory, but a cache in the more common sense). Not doing the
optimization I mentioned would mean always allocating the pages and making
them (partially) valid for each read call.

>> Not doing the optimization would be to allocate the page unconditionally
>> on the read if not already present, and copy the data from ARC to the page.
>>>
>>>> perform double-buffering when vm object already has some data cached,
>>>> reads can prime vm object page list before file is mmapped or
>>>> sendfile-ed.
>>>
>>> No double-buffering is done to optimize anything. Double-buffering is a
>>> consequence of having the page cache and the ARC. The special
>>> "double-buffering code" is there just to handle that fact - e.g. making
>>> sure that VOP_READ reads data from the page cache instead of the ARC if
>>> it's possible that the data in them differs (i.e. the page cache has
>>> more recent data).
>>>
>>> So, if I understood the term 'priming' correctly, no priming should ever
>>> occur.
>> The priming is done on the first call to VOP_READ() with the right offset
>> after the page is allocated.
>
> Again, what is priming?

Filling the cache with appropriate content.
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Fri, Oct 29, 2010 at 07:06:03PM +0300, Andriy Gapon wrote:
> Probably yes, but have to be careful there.
> First, do vm_page_grab only for the UIO_NOCOPY case.
> Second, the first page is already "shared busy" after the
> vm_page_io_start() call in kern_sendfile; so you might need
> VM_ALLOC_IGN_SBUSY for that page to avoid a deadlock.

RELENG_8 doesn't have VM_ALLOC_IGN_SBUSY, it appeared only in HEAD.
Can you review this patch? Have I understood correctly? (didn't test it yet)

--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c.orig	2010-10-29 18:18:23.921078337 +0200
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	2010-10-29 19:23:48.142513084 +0200
@@ -449,7 +449,7 @@
 		int bytes = MIN(PAGESIZE - off, len);

 again:
-		if ((m = vm_page_lookup(obj, OFF_TO_IDX(start))) != NULL &&
+		if (uio->uio_segflg != UIO_NOCOPY && (m = vm_page_lookup(obj, OFF_TO_IDX(start))) != NULL &&
 		    vm_page_is_valid(m, off, bytes)) {
 			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
 				goto again;
@@ -464,7 +464,7 @@
 			uiomove_fromphys(&m, off, bytes, uio);
 			VM_OBJECT_LOCK(obj);
 			vm_page_wakeup(m);
-		} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		} else if (uio->uio_segflg == UIO_NOCOPY) {
 			/*
 			 * The code below is here to make sendfile(2) work
 			 * correctly with ZFS. As pointed out by ups@
@@ -472,11 +472,9 @@
 			 * but it pessimize performance of sendfile/UFS, that's
 			 * why I handle this special case in ZFS code.
 			 */
-			KASSERT(off == 0,
-			    ("unexpected offset in mappedread for sendfile"));
-			if (vm_page_sleep_if_busy(m, FALSE, "zfsmrb"))
-				goto again;
-			vm_page_busy(m);
+			if ((m = vm_page_lookup(obj, OFF_TO_IDX(start))) == NULL || !vm_page_is_valid(m, off, bytes))
+				m = vm_page_grab(obj, OFF_TO_IDX(start), VM_ALLOC_NORMAL|VM_ALLOC_RETRY);
+
 			VM_OBJECT_UNLOCK(obj);
 			if (dirbytes > 0) {
 				error = dmu_read_uio(os, zp->z_id, uio,
RE: 8.1-STABLE: zfs and sendfile: problem still exists
>> Can you reproduce the problem on your system?
>
> I can't reproduce it on mine. Note the resilvering was induced from
> some unrelated disk swaps/tests I was performing, and ftpd is already
> enabled via inetd on this system.
>
> What ZFS tunings have you applied to your system? Can you provide
> output from "sysctl -a kstat.zfs.misc.arcstats" before and after a
> transfer which exhibits the initial slowdown?

It's an amd64 Intel Atom based system with 2 GB of RAM.
/boot/loader.conf contains nothing special:

vm.kmem_size="1536M"
vfs.zfs.prefetch_disable="1"

$ dd if=/dev/random of=test bs=1m count=50; sysctl -a kstat.zfs.misc.arcstats; fetch -o /dev/null http://localhost/test; sysctl -a kstat.zfs.misc.arcstats
50+0 records in
50+0 records out
52428800 bytes transferred in 2.956783 secs (17731705 bytes/sec)
kstat.zfs.misc.arcstats.hits: 10889409
kstat.zfs.misc.arcstats.misses: 2482562
kstat.zfs.misc.arcstats.demand_data_hits: 7920924
kstat.zfs.misc.arcstats.demand_data_misses: 1587278
kstat.zfs.misc.arcstats.demand_metadata_hits: 2968455
kstat.zfs.misc.arcstats.demand_metadata_misses: 895284
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 30
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
kstat.zfs.misc.arcstats.mru_hits: 5596211
kstat.zfs.misc.arcstats.mru_ghost_hits: 199040
kstat.zfs.misc.arcstats.mfu_hits: 5293198
kstat.zfs.misc.arcstats.mfu_ghost_hits: 481006
kstat.zfs.misc.arcstats.allocated: 2985083
kstat.zfs.misc.arcstats.deleted: 1901535
kstat.zfs.misc.arcstats.stolen: 1269643
kstat.zfs.misc.arcstats.recycle_miss: 464100
kstat.zfs.misc.arcstats.mutex_miss: 658
kstat.zfs.misc.arcstats.evict_skip: 148879
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 150609301504
kstat.zfs.misc.arcstats.evict_l2_ineligible: 36864
kstat.zfs.misc.arcstats.hash_elements: 91782
kstat.zfs.misc.arcstats.hash_elements_max: 168546
kstat.zfs.misc.arcstats.hash_collisions: 2058158
kstat.zfs.misc.arcstats.hash_chains: 23888
kstat.zfs.misc.arcstats.hash_chain_max: 18
kstat.zfs.misc.arcstats.p: 807441359
kstat.zfs.misc.arcstats.c: 1006632960
kstat.zfs.misc.arcstats.c_min: 125829120
kstat.zfs.misc.arcstats.c_max: 1006632960
kstat.zfs.misc.arcstats.size: 1006690472
kstat.zfs.misc.arcstats.hdr_size: 20252216
kstat.zfs.misc.arcstats.data_size: 917198336
kstat.zfs.misc.arcstats.other_size: 69239920
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 9
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 30
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
/dev/null                                     100% of   50 MB  119 kBps 00m00s
kstat.zfs.misc.arcstats.hits: 10928358
kstat.zfs.misc.arcstats.misses: 2486504
kstat.zfs.misc.arcstats.demand_data_hits: 7959052
kstat.zfs.misc.arcstats.demand_data_misses: 1590868
kstat.zfs.misc.arcstats.demand_metadata_hits: 2969276
kstat.zfs.misc.arcstats.demand_metadata_misses: 895636
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 30
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
kstat.zfs.misc.arcstats.mru_hits: 5601378
kstat.zfs.misc.arcstats.mru_ghost_hits: 199211
kstat.zfs.misc.arcstats.mfu_hits: 5326980
kstat.zfs.misc.arcstats.mfu_ghost_hits: 482037
kstat.zfs.misc.arcstats.allocated: 2989914
kstat.zfs.misc.arcstats.deleted: 1904492
kstat.zfs.misc.arcstats.stolen: 1272047
kstat.zfs.misc.arcstats.recycle_miss: 464306
kstat.zfs.misc.arcstats.mutex_miss: 658
kstat.zfs.misc.arcstats.evict_skip: 148880
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 150970209280
kstat.zfs.misc.arcstats.evict_l2_ineligible: 36864
kstat.zfs.misc.arcstats.hash_
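The dumps above come from sysctl(8). For watching just the most relevant
counter here, kstat.zfs.misc.arcstats.size, while re-running such a test, it
can also be polled programmatically via sysctlbyname(3); a small sketch, not
part of the original exchange:

/* arcsize.c - poll the ARC size once per second. */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	uint64_t size;
	size_t len;

	for (;;) {
		len = sizeof(size);
		if (sysctlbyname("kstat.zfs.misc.arcstats.size", &size,
		    &len, NULL, 0) == -1)
			err(1, "sysctlbyname");
		printf("arc size: %ju bytes\n", (uintmax_t)size);
		sleep(1);
	}
}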
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 18:26 Artemiev Igor said the following:
> On Fri, Oct 29, 2010 at 05:41:59PM +0300, Andriy Gapon wrote:
>
>> What svn revision of FreeBSD source tree did you test?
>
> r213936. The revision seems a little old.
>
>> Ah, I think I see what's going on.
>> Either sendfile should (have an option to) use VOP_GETPAGES to request
>> data, or ZFS mappedread should use vm_page_grab instead of vm_page_lookup
>> for the UIO_NOCOPY case.
>> Currently ZFS would read a whole FS block into ARC, but populate only one
>> page with data, and for the rest it would just wastefully do
>> uiomove(UIO_NOCOPY) from ARC data.
>> So, e.g. zpool iostat would show that there are only a few actual reads
>> from a pool.
>> The rest of the time must be spent churning over the data already in ARC
>> and doing page-per-VOP_READ copies from it.
>
> I can test it, but what allocflags? VM_ALLOC_RETRY|VM_ALLOC_NORMAL?

Probably yes, but have to be careful there.
First, do vm_page_grab only for the UIO_NOCOPY case.
Second, the first page is already "shared busy" after the vm_page_io_start()
call in kern_sendfile; so you might need VM_ALLOC_IGN_SBUSY for that page to
avoid a deadlock.

I think that it may be good to separate the UIO_NOCOPY/sendfile case from
mappedread into a function of its own.

P.S. Doing VOP_GETPAGES instead of vn_rdwr() in kern_sendfile() might be a
better idea still. But there are some additional details to that, e.g. a
mount/fs flag to tell which mechanism is preferred. Because, as I've been
told, vn_rdwr() has better performance than VOP_GETPAGES, although I don't
understand why it could/should be that way.

-- 
Andriy Gapon
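For illustration only, the "function of its own" idea might be shaped roughly
like this. This is a hypothetical sketch assembled from calls already used in
this thread; the name mappedread_sf, the parameter list and the elided DMU
copy step are all assumptions, not committed or tested code, and Andriy's
VM_ALLOC_IGN_SBUSY caveat is left unaddressed.

static int
mappedread_sf(vm_object_t obj, int64_t start, int off, int bytes)
{
	vm_page_t m;
	int error = 0;

	KASSERT(off == 0, ("unexpected offset in mappedread_sf"));

	VM_OBJECT_LOCK(obj);
	/*
	 * Find or allocate the page and busy it, so that a page gets
	 * populated even on the first pass over the file; the existing
	 * code populates a page only when vm_page_lookup() finds one.
	 */
	m = vm_page_grab(obj, OFF_TO_IDX(start),
	    VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
	VM_OBJECT_UNLOCK(obj);

	/*
	 * ... read 'bytes' at 'start' from the DMU into the page here,
	 * as the existing UIO_NOCOPY code does ...
	 */

	VM_OBJECT_LOCK(obj);
	if (error == 0)
		m->valid = VM_PAGE_BITS_ALL;
	vm_page_wakeup(m);
	VM_OBJECT_UNLOCK(obj);
	return (error);
}

Note that the deadlock caveat still applies: the page that sendfile itself
has share-busied would need VM_ALLOC_IGN_SBUSY (HEAD only) or equivalent
handling before a grab like this is safe.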
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Fri, Oct 29, 2010 at 05:41:59PM +0300, Andriy Gapon wrote:
> What svn revision of FreeBSD source tree did you test?

r213936. The revision seems a little old.

> Ah, I think I see what's going on.
> Either sendfile should (have an option to) use VOP_GETPAGES to request
> data, or ZFS mappedread should use vm_page_grab instead of vm_page_lookup
> for the UIO_NOCOPY case.
> Currently ZFS would read a whole FS block into ARC, but populate only one
> page with data, and for the rest it would just wastefully do
> uiomove(UIO_NOCOPY) from ARC data.
> So, e.g. zpool iostat would show that there are only a few actual reads
> from a pool.
> The rest of the time must be spent churning over the data already in ARC
> and doing page-per-VOP_READ copies from it.

I can test it, but what allocflags? VM_ALLOC_RETRY|VM_ALLOC_NORMAL?
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 18:17 Kostik Belousov said the following:
> On Fri, Oct 29, 2010 at 06:05:26PM +0300, Andriy Gapon wrote:
>> on 29/10/2010 17:53 Kostik Belousov said the following:
>>> Could it be the priming of the vm object pages content ?
>>
>> Sorry, not familiar with this term.
>> Do you mean prepopulation of the vm object with valid pages?
>>
>>> Due to double-buffering, and (possibly false) optimization to only
>>
>> What optimization?
> On zfs vnode read, the page from the corresponding vm object is only
> populated with the vnode data if the page already exists in the object.

Do you mean a specific type of read?
For "normal" reads it's the other way around - if the page already exists
and is valid, then we read from the page, not from the ARC.

> Not doing the optimization would be to allocate the page unconditionally
> on the read if not already present, and copy the data from ARC to the page.
>>
>>> perform double-buffering when vm object already has some data cached,
>>> reads can prime vm object page list before file is mmapped or
>>> sendfile-ed.
>>
>> No double-buffering is done to optimize anything. Double-buffering is a
>> consequence of having the page cache and the ARC. The special
>> "double-buffering code" is there just to handle that fact - e.g. making
>> sure that VOP_READ reads data from the page cache instead of the ARC if
>> it's possible that the data in them differs (i.e. the page cache has
>> more recent data).
>>
>> So, if I understood the term 'priming' correctly, no priming should ever
>> occur.
> The priming is done on the first call to VOP_READ() with the right offset
> after the page is allocated.

Again, what is priming?

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Fri, Oct 29, 2010 at 06:05:26PM +0300, Andriy Gapon wrote:
> on 29/10/2010 17:53 Kostik Belousov said the following:
>> Could it be the priming of the vm object pages content ?
>
> Sorry, not familiar with this term.
> Do you mean prepopulation of the vm object with valid pages?
>
>> Due to double-buffering, and (possibly false) optimization to only
>
> What optimization?

On zfs vnode read, the page from the corresponding vm object is only
populated with the vnode data if the page already exists in the object.
Not doing the optimization would be to allocate the page unconditionally on
the read if not already present, and copy the data from ARC to the page.

>> perform double-buffering when vm object already has some data cached,
>> reads can prime vm object page list before file is mmapped or
>> sendfile-ed.
>
> No double-buffering is done to optimize anything. Double-buffering is a
> consequence of having the page cache and the ARC. The special
> "double-buffering code" is there just to handle that fact - e.g. making
> sure that VOP_READ reads data from the page cache instead of the ARC if
> it's possible that the data in them differs (i.e. the page cache has more
> recent data).
>
> So, if I understood the term 'priming' correctly, no priming should ever
> occur.

The priming is done on the first call to VOP_READ() with the right offset
after the page is allocated.
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 17:53 Kostik Belousov said the following:
> Could it be the priming of the vm object pages content ?

Sorry, not familiar with this term.
Do you mean prepopulation of the vm object with valid pages?

> Due to double-buffering, and (possibly false) optimization to only

What optimization?

> perform double-buffering when vm object already has some data cached,
> reads can prime vm object page list before file is mmapped or
> sendfile-ed.

No double-buffering is done to optimize anything. Double-buffering is a
consequence of having the page cache and the ARC. The special
"double-buffering code" is there just to handle that fact - e.g. making sure
that VOP_READ reads data from the page cache instead of the ARC if it's
possible that the data in them differs (i.e. the page cache has more recent
data).

So, if I understood the term 'priming' correctly, no priming should ever
occur.

-- 
Andriy Gapon
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Fri, Oct 29, 2010 at 06:31:21PM +0400, Alexander Zagrebin wrote:
>>> I've tried the nginx with disabled sendfile (the nginx.conf contains
>>> "sendfile off;"):
>>>
>>> $ dd if=/dev/random of=test bs=1m count=100
>>> 100+0 records in
>>> 100+0 records out
>>> 104857600 bytes transferred in 5.892504 secs (17795083 bytes/sec)
>>> $ fetch -o /dev/null http://localhost/test
>>> /dev/null    100% of  100 MB   41 MBps
>>> $ fetch -o /dev/null http://localhost/test
>>> /dev/null    100% of  100 MB   44 MBps
>>> $ fetch -o /dev/null http://localhost/test
>>> /dev/null    100% of  100 MB   44 MBps
>>
>> I am really surprised with such a bad performance of sendfile.
>> Will you be able to profile the issue further?
>
> Yes.
>
>> I will also try to think of some measurements.
>
> A transfer rate is too low for the _first_ attempt only.
> Further attempts demonstrate a reasonable transfer rate.
> For example, nginx with "sendfile on;":
>
> $ dd if=/dev/random of=test bs=1m count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes transferred in 5.855305 secs (17908136 bytes/sec)
> $ fetch -o /dev/null http://localhost/test
> /dev/null      3% of  100 MB  118 kBps 13m50s^C
> fetch: transfer interrupted
> $ fetch -o /dev/null http://localhost/test
> /dev/null    100% of  100 MB   39 MBps
>
> If there was no access to the file during some time, then everything
> repeats:
> the first attempt - the transfer rate is too low;
> further attempts - no problems.
>
> Can you reproduce the problem on your system?

Could it be the priming of the vm object pages content ?
Due to double-buffering, and (possibly false) optimization to only perform
double-buffering when the vm object already has some data cached, reads can
prime the vm object page list before the file is mmapped or sendfile-ed.
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Fri, Oct 29, 2010 at 06:31:21PM +0400, Alexander Zagrebin wrote:
>>> I've tried the nginx with disabled sendfile (the nginx.conf contains
>>> "sendfile off;"):
>>>
>>> $ dd if=/dev/random of=test bs=1m count=100
>>> 100+0 records in
>>> 100+0 records out
>>> 104857600 bytes transferred in 5.892504 secs (17795083 bytes/sec)
>>> $ fetch -o /dev/null http://localhost/test
>>> /dev/null    100% of  100 MB   41 MBps
>>> $ fetch -o /dev/null http://localhost/test
>>> /dev/null    100% of  100 MB   44 MBps
>>> $ fetch -o /dev/null http://localhost/test
>>> /dev/null    100% of  100 MB   44 MBps
>>
>> I am really surprised with such a bad performance of sendfile.
>> Will you be able to profile the issue further?
>
> Yes.
>
>> I will also try to think of some measurements.
>
> A transfer rate is too low for the _first_ attempt only.
> Further attempts demonstrate a reasonable transfer rate.
> For example, nginx with "sendfile on;":
>
> $ dd if=/dev/random of=test bs=1m count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes transferred in 5.855305 secs (17908136 bytes/sec)
> $ fetch -o /dev/null http://localhost/test
> /dev/null      3% of  100 MB  118 kBps 13m50s^C
> fetch: transfer interrupted
> $ fetch -o /dev/null http://localhost/test
> /dev/null    100% of  100 MB   39 MBps
>
> If there was no access to the file during some time, then everything
> repeats:
> the first attempt - the transfer rate is too low;
> further attempts - no problems.
>
> Can you reproduce the problem on your system?

I can't reproduce it on mine. Note the resilvering was induced from some
unrelated disk swaps/tests I was performing, and ftpd is already enabled via
inetd on this system.

icarus# uname -a
FreeBSD icarus.home.lan 8.1-STABLE FreeBSD 8.1-STABLE #0: Sat Oct 16 07:10:54 PDT 2010 r...@icarus.home.lan:/usr/obj/usr/src/sys/X7SBA_RELENG_8_amd64 amd64

icarus# df -k
Filesystem   1024-blocks      Used     Avail Capacity  Mounted on
/dev/ada0s1a     1012974    451808    480130    48%    /
devfs                  1         1         0   100%    /dev
/dev/ada0s1d    12186190    103986  11107310     1%    /var
/dev/ada0s1e     4058062      5468   3727950     0%    /tmp
/dev/ada0s1f     8395622   1918300   5805674    25%    /usr
data/cvs       686338517       289 686338228     0%    /cvs
data/home      687130693    792465 686338228     0%    /home
data/storage   957080511 270742283 686338228    28%    /storage

icarus# zpool status
  pool: data
 state: ONLINE
 scrub: resilver completed after 0h43m with 0 errors on Sun Oct 17 10:11:19 2010
config:

	NAME        STATE     READ WRITE CKSUM
	data        ONLINE       0     0     0
	  mirror    ONLINE       0     0     0
	    ada1    ONLINE       0     0     0
	    ada2    ONLINE       0     0     0  258G resilvered

errors: No known data errors

icarus# pw useradd ftp -g users -u 2000 -s /bin/csh
icarus# mkdir /home/ftp
icarus# chown ftp:users /home/ftp
icarus# cd /home/ftp
icarus# dd if=/dev/urandom of=test bs=1m count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 1.384421 secs (75741116 bytes/sec)
icarus# chown ftp:users test
icarus# ls -l test
-rw-r--r--  1 ftp  users  104857600 Oct 29 07:41 test
icarus# date ; fetch -o /dev/null ftp://localhost/test
Fri Oct 29 07:45:47 PDT 2010
/dev/null    100% of  100 MB  174 MBps
icarus# date ; fetch -o /dev/null ftp://localhost/test
Fri Oct 29 07:45:48 PDT 2010
/dev/null    100% of  100 MB  156 MBps
icarus# date ; fetch -o /dev/null ftp://localhost/test
Fri Oct 29 07:45:49 PDT 2010
/dev/null    100% of  100 MB  170 MBps
icarus# date ; fetch -o /dev/null ftp://localhost/test
Fri Oct 29 07:45:50 PDT 2010
/dev/null    100% of  100 MB  155 MBps
icarus# date ; fetch -o /dev/null ftp://localhost/test
Fri Oct 29 07:45:52 PDT 2010
/dev/null    100% of  100 MB  151 MBps
icarus# dd if=/dev/urandom of=test2 bs=1m count=500
of=test2 bs=1m count=500 500+0 records in 500+0 records out 524288000 bytes transferred in 6.947780 secs (75461228 bytes/sec) icarus# chown ftp:users test2 icarus# ls -l test2 -rw-r--r-- 1 ftp users 524288000 Oct 29 07:46 test2 icarus# date ; fetch -o /dev/null ftp://localhost/test2 Fri Oct 29 07:47:19 PDT 2010 /dev/null 100% of 500 MB 148 MBps icarus# date ; fetch -o /dev/null ftp://localhost/test2 Fri Oct 29 07:47:24 PDT 2010 /dev/null 100% of 500 MB 175 MBps icarus# date ; fetch -o /dev/null ftp://localhost/test2 Fri Oct 29 07:47:30 PDT 2010 /dev/null 100% of 500 MB 164 MBps What ZFS tunin
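As a baseline for comparing tuning between the two systems (a minimal sketch; it assumes stock 8.1-STABLE sysctl names and that any boot-time tuning lives in /boot/loader.conf):

# sysctl vfs.zfs.arc_max vfs.zfs.prefetch_disable
# sysctl kstat.zfs.misc.arcstats.size
# grep -i zfs /boot/loader.conf

The first two commands show the configured ARC ceiling, the prefetch setting, and the ARC's current size; the grep shows any ZFS loader tunables that differ from defaults.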
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 15:36 Andriy Gapon said the following:
> on 29/10/2010 12:04 Artemiev Igor said the following:
>> Yep, this problem exists. You may work around it by bumping
>> net.inet.tcp.sendspace up to 128k. ZFS sendfile is very inefficient. I have
>> made a small investigation via DTrace: it reads MAXBSIZE chunks, but maps
>> only one page (4K) into VM. I.e. if you have a file of size 512K, sendfile
>> ends up calling zfs_freebsd_read 128 times.
>
> What svn revision of the FreeBSD source tree did you test?

Ah, I think I see what's going on. Either sendfile should (have an option to) use VOP_GETPAGES to request data, or ZFS mappedread should use vm_page_grab instead of vm_page_lookup for the UIO_NOCOPY case. Currently ZFS reads a whole FS block into the ARC, but populates only one page with data; for the rest it just wastefully does uiomove(UIO_NOCOPY) from the ARC data. So, e.g., zpool iostat would show only a few actual reads from the pool. The rest of the time must be spent churning over the data already in the ARC and doing page-per-VOP_READ copies from it.
-- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
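One way to check this diagnosis from userland (a sketch; it assumes the pool is named "data", the DTrace kernel modules are loaded, and the fbt provider can see the zfs.ko symbol zfs_freebsd_read) is to count VOP_READ invocations while a single sendfile-backed transfer runs, and watch pool I/O alongside it:

# dtrace -n 'fbt::zfs_freebsd_read:entry { @reads = count(); }'
(and in another terminal)
# zpool iostat data 1

With the default 128K recordsize, a 100 MB file needs only about 800 block reads into the ARC, yet the probe above would fire roughly 25600 times if every VOP_READ delivers just one 4K page, which is consistent with the churn described above.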
RE: 8.1-STABLE: zfs and sendfile: problem still exists
> > I've tried nginx with sendfile disabled (nginx.conf contains "sendfile off;"):
> >
> > $ dd if=/dev/random of=test bs=1m count=100
> > 100+0 records in
> > 100+0 records out
> > 104857600 bytes transferred in 5.892504 secs (17795083 bytes/sec)
> > $ fetch -o /dev/null http://localhost/test
> > /dev/null 100% of 100 MB 41 MBps
> > $ fetch -o /dev/null http://localhost/test
> > /dev/null 100% of 100 MB 44 MBps
> > $ fetch -o /dev/null http://localhost/test
> > /dev/null 100% of 100 MB 44 MBps
>
> I am really surprised by such bad performance of sendfile.
> Will you be able to profile the issue further?

Yes.

> I will also try to think of some measurements.

The transfer rate is too low for the _first_ attempt only.
Subsequent attempts show a reasonable transfer rate.
For example, nginx with "sendfile on;":

$ dd if=/dev/random of=test bs=1m count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 5.855305 secs (17908136 bytes/sec)
$ fetch -o /dev/null http://localhost/test
/dev/null 3% of 100 MB 118 kBps 13m50s^C
fetch: transfer interrupted
$ fetch -o /dev/null http://localhost/test
/dev/null 100% of 100 MB 39 MBps

If the file has not been accessed for some time, everything repeats:
the first attempt is too slow, while further attempts show no problems.

Can you reproduce the problem on your system?
-- Alexander Zagrebin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
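A quick way to capture the cold-versus-warm contrast in a single run (a sketch using the stock fetch(1) and /usr/bin/time; "test" is the file generated above):

$ for i in 1 2 3; do /usr/bin/time -h fetch -qo /dev/null http://localhost/test; done

If the problem reproduces, the first iteration should crawl at roughly 120 kBps while the remaining ones finish in a few seconds.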
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 16:14 Alexander Zagrebin said the following:
>>> I've noticed that ZFS on 8.1-STABLE still has problems with sendfile.
>>
>> Which svn revision, just in case?
>
> 8.1-STABLE. The source tree was updated 2010-10-27.

OK, good.

>>> When accessing a file for the first time the transfer speed is too low, but
>>> on subsequent attempts the transfer speed is normal.
>>>
>>> How to repeat:
>>>
>>> $ dd if=/dev/random of=/tmp/test bs=1m count=100
>>> 100+0 records in
>>> 100+0 records out
>>> 104857600 bytes transferred in 5.933945 secs (17670807 bytes/sec)
>>> $ sudo env LC_ALL=C /usr/libexec/ftpd -D
>>>
>>> The first attempt to fetch the file:
>>>
>>> $ fetch -o /dev/null ftp://localhost/tmp/test
>>> /dev/null 1% of 100 MB 118 kBps 14m07s^C
>>> fetch: transfer interrupted
>>>
>>> The transfer rate is too low (approx. 120 kBps), but any subsequent attempts
>>> are successful:
>>>
>>> $ fetch -o /dev/null ftp://localhost/tmp/test
>>> /dev/null 100% of 100 MB 42 MBps
>>> $ fetch -o /dev/null ftp://localhost/tmp/test
>>> /dev/null 100% of 100 MB 47 MBps
>>
>> Can you do an experiment with the same structure but sendfile excluded?
>
> IMHO, ftpd doesn't have an option to disable sendfile.

Seems so. The source could be hacked (an unconditional goto oldway in libexec/ftpd/ftpd.c), but anyway.

> I've tried nginx with sendfile disabled (nginx.conf contains "sendfile off;"):
>
> $ dd if=/dev/random of=test bs=1m count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes transferred in 5.892504 secs (17795083 bytes/sec)
> $ fetch -o /dev/null http://localhost/test
> /dev/null 100% of 100 MB 41 MBps
> $ fetch -o /dev/null http://localhost/test
> /dev/null 100% of 100 MB 44 MBps
> $ fetch -o /dev/null http://localhost/test
> /dev/null 100% of 100 MB 44 MBps

I am really surprised by such bad performance of sendfile. Will you be able to profile the issue further? I will also try to think of some measurements.
-- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
RE: 8.1-STABLE: zfs and sendfile: problem still exists
> > I've noticed that ZFS on 8.1-STABLE still has problems with sendfile.
>
> Which svn revision, just in case?

8.1-STABLE. The source tree was updated 2010-10-27.

> > When accessing a file for the first time the transfer speed is too low, but
> > on subsequent attempts the transfer speed is normal.
> >
> > How to repeat:
> >
> > $ dd if=/dev/random of=/tmp/test bs=1m count=100
> > 100+0 records in
> > 100+0 records out
> > 104857600 bytes transferred in 5.933945 secs (17670807 bytes/sec)
> > $ sudo env LC_ALL=C /usr/libexec/ftpd -D
> >
> > The first attempt to fetch the file:
> >
> > $ fetch -o /dev/null ftp://localhost/tmp/test
> > /dev/null 1% of 100 MB 118 kBps 14m07s^C
> > fetch: transfer interrupted
> >
> > The transfer rate is too low (approx. 120 kBps), but any subsequent attempts
> > are successful:
> >
> > $ fetch -o /dev/null ftp://localhost/tmp/test
> > /dev/null 100% of 100 MB 42 MBps
> > $ fetch -o /dev/null ftp://localhost/tmp/test
> > /dev/null 100% of 100 MB 47 MBps
>
> Can you do an experiment with the same structure but sendfile excluded?

IMHO, ftpd doesn't have an option to disable sendfile. I've tried nginx with sendfile disabled (nginx.conf contains "sendfile off;"):

$ dd if=/dev/random of=test bs=1m count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 5.892504 secs (17795083 bytes/sec)
$ fetch -o /dev/null http://localhost/test
/dev/null 100% of 100 MB 41 MBps
$ fetch -o /dev/null http://localhost/test
/dev/null 100% of 100 MB 44 MBps
$ fetch -o /dev/null http://localhost/test
/dev/null 100% of 100 MB 44 MBps
-- Alexander Zagrebin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 29/10/2010 12:04 Artemiev Igor said the following:
> Yep, this problem exists. You may work around it by bumping
> net.inet.tcp.sendspace up to 128k. ZFS sendfile is very inefficient. I have
> made a small investigation via DTrace: it reads MAXBSIZE chunks, but maps
> only one page (4K) into VM. I.e. if you have a file of size 512K, sendfile
> ends up calling zfs_freebsd_read 128 times.

What svn revision of the FreeBSD source tree did you test?
-- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
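For reference, assuming the source tree was checked out with svn, its revision can be read with:

$ svn info /usr/src | grep Revision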
Re: 8.1-STABLE: zfs and sendfile: problem still exists
on 28/10/2010 08:57 Alexander Zagrebin said the following:
> Hi!
>
> I've noticed that ZFS on 8.1-STABLE still has problems with sendfile.

Which svn revision, just in case?

> When accessing a file for the first time the transfer speed is too low, but
> on subsequent attempts the transfer speed is normal.
>
> How to repeat:
>
> $ dd if=/dev/random of=/tmp/test bs=1m count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes transferred in 5.933945 secs (17670807 bytes/sec)
> $ sudo env LC_ALL=C /usr/libexec/ftpd -D
>
> The first attempt to fetch the file:
>
> $ fetch -o /dev/null ftp://localhost/tmp/test
> /dev/null 1% of 100 MB 118 kBps 14m07s^C
> fetch: transfer interrupted
>
> The transfer rate is too low (approx. 120 kBps), but any subsequent attempts
> are successful:
>
> $ fetch -o /dev/null ftp://localhost/tmp/test
> /dev/null 100% of 100 MB 42 MBps
> $ fetch -o /dev/null ftp://localhost/tmp/test
> /dev/null 100% of 100 MB 47 MBps

Can you do an experiment with the same structure but sendfile excluded?
-- Andriy Gapon ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 8.1-STABLE: zfs and sendfile: problem still exists
On Thu, Oct 28, 2010 at 09:57:22AM +0400, Alexander Zagrebin wrote:
> Hi!
>
> I've noticed that ZFS on 8.1-STABLE still has problems with sendfile.
> When accessing a file for the first time the transfer speed is too low, but
> on subsequent attempts the transfer speed is normal.
...
> I've tried ftpd and nginx with "sendfile on". The behavior is the same.
> After disabling sendfile in nginx ("sendfile off") the problem is gone.

Yep, this problem exists. You may work around it by bumping net.inet.tcp.sendspace up to 128k. ZFS sendfile is very inefficient. I have made a small investigation via DTrace: it reads MAXBSIZE chunks, but maps only one page (4K) into VM. I.e. if you have a file of size 512K, sendfile ends up calling zfs_freebsd_read 128 times.
___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
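To apply the suggested workaround now and keep it across reboots (using the stock sysctl.conf mechanism; 131072 bytes = 128k):

# sysctl net.inet.tcp.sendspace=131072
# echo 'net.inet.tcp.sendspace=131072' >> /etc/sysctl.conf

Note that a larger socket send buffer only lets sendfile queue more data per pass; it masks the per-page VOP_READ cost rather than changing the underlying ZFS read path.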