Re: svn commit: r364129 - head/sys/vm
Done in r364134.

On Tue, Aug 11, 2020 at 3:31 PM Mark Johnston wrote:
>
> On Tue, Aug 11, 2020 at 03:21:31PM -0700, Conrad Meyer wrote:
> > Hi Konstantin, Mark (raised the same question),
> >
> > On Tue, Aug 11, 2020 at 2:32 PM Konstantin Belousov wrote:
> > >
> > > On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> > > > Author: cem
> > > > Date: Tue Aug 11 20:37:45 2020
> > > > New Revision: 364129
> > > > URL: https://svnweb.freebsd.org/changeset/base/364129
> > > >
> > > > Log:
> > > >   Add support for multithreading the inactive queue pageout within a
> > > >   domain.
> > > > ...
> > > > @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
> > > >     * main purpose is to replenish the store of free pages.
> > > >     */
> > > >    if (vmd->vmd_severeset || curproc == pageproc ||
> > > > -      !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> > > > +      !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
> > >
> > > Why is this change needed?
> >
> > The change was inherited from Jeff, along with the rest of it.  I
> > don't know why he changed it, but it does seem orthogonal to the rest
> > of the revision.  This part is nonessential as far as I know, and
> > could be backed out.
>
> To me it doesn't make sense.  The difference between VM_ALLOC_NORMAL and
> _SYSTEM is that they use different thresholds for the free page count
> before deciding whether to continue the allocation.  The _NORMAL
> threshold is higher than the _SYSTEM threshold, but both are smaller
> than the "severe" threshold at which we set vmd_severeset.  In other
> words, if the free page count is such that a _NORMAL allocation would
> fail but a _SYSTEM allocation would succeed, then vmd_severeset would be
> set and we'd never get to the _vm_domain_allocate() call to begin with.
>
> So, I think this part of the change should be reverted.  Even if the
> analysis is incorrect, it's logically separate from the rest of the
> diff.  Prior to r355003 it made more sense, but per that commit it can
> be harmful to populate the per-CPU page caches when the system is
> severely short on free pages, so I would disagree with it anyway without
> more testing.

___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r364129 - head/sys/vm
On Tue, Aug 11, 2020 at 03:21:31PM -0700, Conrad Meyer wrote:
> Hi Konstantin, Mark (raised the same question),
>
> On Tue, Aug 11, 2020 at 2:32 PM Konstantin Belousov wrote:
> >
> > On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> > > Author: cem
> > > Date: Tue Aug 11 20:37:45 2020
> > > New Revision: 364129
> > > URL: https://svnweb.freebsd.org/changeset/base/364129
> > >
> > > Log:
> > >   Add support for multithreading the inactive queue pageout within a
> > >   domain.
> > > ...
> > > @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
> > >     * main purpose is to replenish the store of free pages.
> > >     */
> > >    if (vmd->vmd_severeset || curproc == pageproc ||
> > > -      !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> > > +      !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
> >
> > Why is this change needed?
>
> The change was inherited from Jeff, along with the rest of it.  I
> don't know why he changed it, but it does seem orthogonal to the rest
> of the revision.  This part is nonessential as far as I know, and
> could be backed out.

To me it doesn't make sense.  The difference between VM_ALLOC_NORMAL and
_SYSTEM is that they use different thresholds for the free page count
before deciding whether to continue the allocation.  The _NORMAL
threshold is higher than the _SYSTEM threshold, but both are smaller
than the "severe" threshold at which we set vmd_severeset.  In other
words, if the free page count is such that a _NORMAL allocation would
fail but a _SYSTEM allocation would succeed, then vmd_severeset would be
set and we'd never get to the _vm_domain_allocate() call to begin with.

So, I think this part of the change should be reverted.  Even if the
analysis is incorrect, it's logically separate from the rest of the
diff.  Prior to r355003 it made more sense, but per that commit it can
be harmful to populate the per-CPU page caches when the system is
severely short on free pages, so I would disagree with it anyway without
more testing.
Re: svn commit: r364129 - head/sys/vm
Hi Konstantin, Mark (raised the same question),

On Tue, Aug 11, 2020 at 2:32 PM Konstantin Belousov wrote:
>
> On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> > Author: cem
> > Date: Tue Aug 11 20:37:45 2020
> > New Revision: 364129
> > URL: https://svnweb.freebsd.org/changeset/base/364129
> >
> > Log:
> >   Add support for multithreading the inactive queue pageout within a domain.
> > ...
> > @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
> >     * main purpose is to replenish the store of free pages.
> >     */
> >    if (vmd->vmd_severeset || curproc == pageproc ||
> > -      !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> > +      !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
>
> Why is this change needed?

The change was inherited from Jeff, along with the rest of it.  I
don't know why he changed it, but it does seem orthogonal to the rest
of the revision.  This part is nonessential as far as I know, and
could be backed out.

Best,
Conrad
Re: svn commit: r364129 - head/sys/vm
On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> Author: cem
> Date: Tue Aug 11 20:37:45 2020
> New Revision: 364129
> URL: https://svnweb.freebsd.org/changeset/base/364129
>
> Log:
>   Add support for multithreading the inactive queue pageout within a domain.
>
>   In very high throughput workloads, the inactive scan can become overwhelmed
>   as you have many cores producing pages and a single core freeing.  Since
>   Mark's introduction of batched pagequeue operations, we can now run multiple
>   inactive threads working on independent batches.
>
>   To avoid confusing the pid and other control algorithms, I (Jeff) do this in
>   a mpi-like fan out and collect model that is driven from the primary page
>   daemon.  It decides whether the shortfall can be overcome with a single
>   thread and if not dispatches multiple threads and waits for their results.
>
>   The heuristic is based on timing the pageout activity and averaging a
>   pages-per-second variable which is exponentially decayed.  This is visible
>   in sysctl and may be interesting for other purposes.
>
>   I (Jeff) have verified that this does indeed double our paging throughput
>   when used with two threads.  With four we tend to run into other contention
>   problems.  For now I would like to commit this infrastructure with only a
>   single thread enabled.
>
>   The number of worker threads per domain can be controlled with the
>   'vm.pageout_threads_per_domain' tunable.
>
>   Submitted by: jeff (earlier version)
>   Discussed with: markj
>   Tested by: pho
>   Sponsored by: probably Netflix (based on contemporary commits)
>   Differential Revision: https://reviews.freebsd.org/D21629
>
> Modified:
>   head/sys/vm/vm_meter.c
>   head/sys/vm/vm_page.c
>   head/sys/vm/vm_page.h
>   head/sys/vm/vm_pageout.c
>   head/sys/vm/vm_pagequeue.h
>
> Modified: head/sys/vm/vm_page.c
> ==
> --- head/sys/vm/vm_page.c  Tue Aug 11 17:54:10 2020  (r364128)
> +++ head/sys/vm/vm_page.c  Tue Aug 11 20:37:45 2020  (r364129)
> @@ -421,7 +421,7 @@ sysctl_vm_page_blacklist(SYSCTL_HANDLER_ARGS)
>   * In principle, this function only needs to set the flag PG_MARKER.
>   * Nonetheless, it write busies the page as a safety precaution.
>   */
> -static void
> +void
>  vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags)
>  {
>
> @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
>      * main purpose is to replenish the store of free pages.
>      */
>     if (vmd->vmd_severeset || curproc == pageproc ||
> -       !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> +       !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))

Why did this change?  It does not look related to the rest of the change.
Re: svn commit: r364129 - head/sys/vm
On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> Author: cem
> Date: Tue Aug 11 20:37:45 2020
> New Revision: 364129
> URL: https://svnweb.freebsd.org/changeset/base/364129
>
> Log:
>   Add support for multithreading the inactive queue pageout within a domain.
>
>   In very high throughput workloads, the inactive scan can become overwhelmed
>   as you have many cores producing pages and a single core freeing.  Since
>   Mark's introduction of batched pagequeue operations, we can now run multiple
>   inactive threads working on independent batches.
>
>   To avoid confusing the pid and other control algorithms, I (Jeff) do this in
>   a mpi-like fan out and collect model that is driven from the primary page
>   daemon.  It decides whether the shortfall can be overcome with a single
>   thread and if not dispatches multiple threads and waits for their results.
>
>   The heuristic is based on timing the pageout activity and averaging a
>   pages-per-second variable which is exponentially decayed.  This is visible
>   in sysctl and may be interesting for other purposes.
>
>   I (Jeff) have verified that this does indeed double our paging throughput
>   when used with two threads.  With four we tend to run into other contention
>   problems.  For now I would like to commit this infrastructure with only a
>   single thread enabled.
>
>   The number of worker threads per domain can be controlled with the
>   'vm.pageout_threads_per_domain' tunable.
>
>   Submitted by: jeff (earlier version)
>   Discussed with: markj
>   Tested by: pho
>   Sponsored by: probably Netflix (based on contemporary commits)
>   Differential Revision: https://reviews.freebsd.org/D21629
>
> Modified:
>   head/sys/vm/vm_meter.c
>   head/sys/vm/vm_page.c
>   head/sys/vm/vm_page.h
>   head/sys/vm/vm_pageout.c
>   head/sys/vm/vm_pagequeue.h
>
> Modified: head/sys/vm/vm_meter.c
> ==
> --- head/sys/vm/vm_meter.c  Tue Aug 11 17:54:10 2020  (r364128)
> +++ head/sys/vm/vm_meter.c  Tue Aug 11 20:37:45 2020  (r364129)
> @@ -552,6 +552,9 @@ vm_domain_stats_init(struct vm_domain *vmd, struct sys
>   SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
>       "free_severe", CTLFLAG_RD, &vmd->vmd_free_severe, 0,
>       "Severe free pages");
> + SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
> +     "inactive_pps", CTLFLAG_RD, &vmd->vmd_inactive_pps, 0,
> +     "inactive pages freed/second");
>  }
>
> Modified: head/sys/vm/vm_page.c
> ==
> --- head/sys/vm/vm_page.c  Tue Aug 11 17:54:10 2020  (r364128)
> +++ head/sys/vm/vm_page.c  Tue Aug 11 20:37:45 2020  (r364129)
> @@ -421,7 +421,7 @@ sysctl_vm_page_blacklist(SYSCTL_HANDLER_ARGS)
>   * In principle, this function only needs to set the flag PG_MARKER.
>   * Nonetheless, it write busies the page as a safety precaution.
>   */
> -static void
> +void
>  vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags)
>  {
>
> @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
>      * main purpose is to replenish the store of free pages.
>      */
>     if (vmd->vmd_severeset || curproc == pageproc ||
> -       !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> +       !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))

Why is this change needed?

>     return (0);
>     domain = vmd->vmd_domain;
>     vm_domain_free_lock(vmd);
svn commit: r364129 - head/sys/vm
Author: cem
Date: Tue Aug 11 20:37:45 2020
New Revision: 364129
URL: https://svnweb.freebsd.org/changeset/base/364129

Log:
  Add support for multithreading the inactive queue pageout within a domain.

  In very high throughput workloads, the inactive scan can become overwhelmed
  as you have many cores producing pages and a single core freeing.  Since
  Mark's introduction of batched pagequeue operations, we can now run multiple
  inactive threads working on independent batches.

  To avoid confusing the PID and other control algorithms, I (Jeff) do this in
  an MPI-like fan-out and collect model that is driven from the primary page
  daemon.  It decides whether the shortfall can be overcome with a single
  thread and, if not, dispatches multiple threads and waits for their results.

  The heuristic is based on timing the pageout activity and averaging a
  pages-per-second variable which is exponentially decayed.  This is visible
  in sysctl and may be interesting for other purposes.

  I (Jeff) have verified that this does indeed double our paging throughput
  when used with two threads.  With four we tend to run into other contention
  problems.  For now I would like to commit this infrastructure with only a
  single thread enabled.

  The number of worker threads per domain can be controlled with the
  'vm.pageout_threads_per_domain' tunable.
Submitted by: jeff (earlier version)
Discussed with: markj
Tested by: pho
Sponsored by: probably Netflix (based on contemporary commits)
Differential Revision: https://reviews.freebsd.org/D21629

Modified:
  head/sys/vm/vm_meter.c
  head/sys/vm/vm_page.c
  head/sys/vm/vm_page.h
  head/sys/vm/vm_pageout.c
  head/sys/vm/vm_pagequeue.h

Modified: head/sys/vm/vm_meter.c
==
--- head/sys/vm/vm_meter.c  Tue Aug 11 17:54:10 2020  (r364128)
+++ head/sys/vm/vm_meter.c  Tue Aug 11 20:37:45 2020  (r364129)
@@ -552,6 +552,9 @@ vm_domain_stats_init(struct vm_domain *vmd, struct sys
 	SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
 	    "free_severe", CTLFLAG_RD, &vmd->vmd_free_severe, 0,
 	    "Severe free pages");
+	SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
+	    "inactive_pps", CTLFLAG_RD, &vmd->vmd_inactive_pps, 0,
+	    "inactive pages freed/second");
 }

Modified: head/sys/vm/vm_page.c
==
--- head/sys/vm/vm_page.c  Tue Aug 11 17:54:10 2020  (r364128)
+++ head/sys/vm/vm_page.c  Tue Aug 11 20:37:45 2020  (r364129)
@@ -421,7 +421,7 @@ sysctl_vm_page_blacklist(SYSCTL_HANDLER_ARGS)
  * In principle, this function only needs to set the flag PG_MARKER.
  * Nonetheless, it write busies the page as a safety precaution.
  */
-static void
+void
 vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags)
 {

@@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
 	 * main purpose is to replenish the store of free pages.
 	 */
 	if (vmd->vmd_severeset || curproc == pageproc ||
-	    !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
+	    !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
 		return (0);
 	domain = vmd->vmd_domain;
 	vm_domain_free_lock(vmd);

Modified: head/sys/vm/vm_page.h
==
--- head/sys/vm/vm_page.h  Tue Aug 11 17:54:10 2020  (r364128)
+++ head/sys/vm/vm_page.h  Tue Aug 11 20:37:45 2020  (r364129)
@@ -630,6 +630,7 @@ vm_page_t vm_page_find_least(vm_object_t, vm_pindex_t)
 void vm_page_free_invalid(vm_page_t);
 vm_page_t vm_page_getfake(vm_paddr_t paddr, vm_memattr_t memattr);
 void vm_page_initfake(vm_page_t m, vm_paddr_t paddr, vm_memattr_t memattr);
+void vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags);
 int vm_page_insert (vm_page_t, vm_object_t, vm_pindex_t);
 void vm_page_invalid(vm_page_t m);
 void vm_page_launder(vm_page_t m);

Modified: head/sys/vm/vm_pageout.c
==
--- head/sys/vm/vm_pageout.c  Tue Aug 11 17:54:10 2020  (r364128)
+++ head/sys/vm/vm_pageout.c  Tue Aug 11 20:37:45 2020  (r364129)
@@ -82,6 +82,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -163,6 +164,12 @@ SYSCTL_INT(_vm, OID_AUTO, panic_on_oom,
 SYSCTL_INT(_vm, OID_AUTO, pageout_update_period, CTLFLAG_RWTUN,
     &vm_pageout_update_period, 0,
     "Maximum active LRU update period");
+
+/* Access with get_pageout_threads_per_domain(). */
+static int pageout_threads_per_domain = 1;
+SYSCTL_INT(_vm, OID_AUTO, pageout_threads_per_domain, CTLFLAG_RDTUN,
+    &pageout_threads_per_domain, 0,
+    "Number of worker threads comprising each per-domain pagedaemon");
 SYSCTL_INT(_vm,
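[Editor's illustration] Because the sysctl above is declared CTLFLAG_RDTUN, the thread count cannot be changed on a running system; it has to be set as a boot-time tunable.  A hypothetical loader.conf fragment (the value 2 is only an example, matching the two-thread result mentioned in the log):

```conf
# /boot/loader.conf -- request two inactive-queue pageout threads per domain
vm.pageout_threads_per_domain="2"
```

At runtime the effective value can then be inspected read-only via sysctl vm.pageout_threads_per_domain.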