Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
On Tue, Jun 05, 2018 at 12:24:39AM -0700, Srikar Dronamraju wrote: > * Peter Zijlstra [2018-06-04 21:28:21]: > > > > if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) { > > > - spin_lock(>numabalancing_migrate_lock); > > > - pgdat->numabalancing_migrate_nr_pages = 0; > > > - pgdat->numabalancing_migrate_next_window = jiffies + > > > - msecs_to_jiffies(migrate_interval_millisecs); > > > - spin_unlock(>numabalancing_migrate_lock); > > > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > > > + pgdat->numabalancing_migrate_next_window = jiffies + > > > + msecs_to_jiffies(migrate_interval_millisecs); > > > > Note that both are in fact wrong. That wants to be something like: > > > > pgdat->numabalancing_migrate_next_window += interval; > > > > Otherwise you stretch every interval by 'jiffies - > > numabalancing_migrate_next_window'. > > Okay, I get your point. Note that in practise it probably doesn't matter, but it just upsets my OCD ;-) > > Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the > > spinlock/xchg. > unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window); > > if (time_after(jiffies, interval)) { > interval += msecs_to_jiffies(migrate_interval_millisecs)); > if (xchg(>numabalancing_migrate_nr_pages, 0)) > WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval); > } > > Something like this? Almost, you forgot about the case where 'jiffies - numabalancing_migrate_next_window > interval'. That wants to be something like: unsigned long timo = READ_ONCE(stupid_long_name); if (time_after(jiffies, timo) && xchg(_long_name, 0)) { do { timo += msec_to_jiffies(..); } while (unlikely(time_after(jiffies, timo); WRITE_ONCE(stupid_long_name, timo); }
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
On Tue, Jun 05, 2018 at 12:24:39AM -0700, Srikar Dronamraju wrote: > * Peter Zijlstra [2018-06-04 21:28:21]: > > > > if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) { > > > - spin_lock(>numabalancing_migrate_lock); > > > - pgdat->numabalancing_migrate_nr_pages = 0; > > > - pgdat->numabalancing_migrate_next_window = jiffies + > > > - msecs_to_jiffies(migrate_interval_millisecs); > > > - spin_unlock(>numabalancing_migrate_lock); > > > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > > > + pgdat->numabalancing_migrate_next_window = jiffies + > > > + msecs_to_jiffies(migrate_interval_millisecs); > > > > Note that both are in fact wrong. That wants to be something like: > > > > pgdat->numabalancing_migrate_next_window += interval; > > > > Otherwise you stretch every interval by 'jiffies - > > numabalancing_migrate_next_window'. > > Okay, I get your point. Note that in practise it probably doesn't matter, but it just upsets my OCD ;-) > > Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the > > spinlock/xchg. > unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window); > > if (time_after(jiffies, interval)) { > interval += msecs_to_jiffies(migrate_interval_millisecs)); > if (xchg(>numabalancing_migrate_nr_pages, 0)) > WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval); > } > > Something like this? Almost, you forgot about the case where 'jiffies - numabalancing_migrate_next_window > interval'. That wants to be something like: unsigned long timo = READ_ONCE(stupid_long_name); if (time_after(jiffies, timo) && xchg(_long_name, 0)) { do { timo += msec_to_jiffies(..); } while (unlikely(time_after(jiffies, timo); WRITE_ONCE(stupid_long_name, timo); }
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
* Peter Zijlstra [2018-06-04 21:28:21]: > > if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) { > > - spin_lock(>numabalancing_migrate_lock); > > - pgdat->numabalancing_migrate_nr_pages = 0; > > - pgdat->numabalancing_migrate_next_window = jiffies + > > - msecs_to_jiffies(migrate_interval_millisecs); > > - spin_unlock(>numabalancing_migrate_lock); > > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > > + pgdat->numabalancing_migrate_next_window = jiffies + > > + msecs_to_jiffies(migrate_interval_millisecs); > > Note that both are in fact wrong. That wants to be something like: > > pgdat->numabalancing_migrate_next_window += interval; > > Otherwise you stretch every interval by 'jiffies - > numabalancing_migrate_next_window'. Okay, I get your point. > > Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the > spinlock/xchg. > > I suppose the problem here is that PPC has a very nasty test-and-set > spinlock with fwd progress issues while xchg maps to a fairly simple > ll/sc that (hopefully) has some hardware fairness. > > And pgdata being a rather course data structure (per node?) there could > be a lot of CPUs stomping on this here thing. > > So simpler not really, but better for PPC. > unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window); if (time_after(jiffies, interval)) { interval += msecs_to_jiffies(migrate_interval_millisecs)); if (xchg(>numabalancing_migrate_nr_pages, 0)) WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval); } Something like this?
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
* Peter Zijlstra [2018-06-04 21:28:21]: > > if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) { > > - spin_lock(>numabalancing_migrate_lock); > > - pgdat->numabalancing_migrate_nr_pages = 0; > > - pgdat->numabalancing_migrate_next_window = jiffies + > > - msecs_to_jiffies(migrate_interval_millisecs); > > - spin_unlock(>numabalancing_migrate_lock); > > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > > + pgdat->numabalancing_migrate_next_window = jiffies + > > + msecs_to_jiffies(migrate_interval_millisecs); > > Note that both are in fact wrong. That wants to be something like: > > pgdat->numabalancing_migrate_next_window += interval; > > Otherwise you stretch every interval by 'jiffies - > numabalancing_migrate_next_window'. Okay, I get your point. > > Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the > spinlock/xchg. > > I suppose the problem here is that PPC has a very nasty test-and-set > spinlock with fwd progress issues while xchg maps to a fairly simple > ll/sc that (hopefully) has some hardware fairness. > > And pgdata being a rather course data structure (per node?) there could > be a lot of CPUs stomping on this here thing. > > So simpler not really, but better for PPC. > unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window); if (time_after(jiffies, interval)) { interval += msecs_to_jiffies(migrate_interval_millisecs)); if (xchg(>numabalancing_migrate_nr_pages, 0)) WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval); } Something like this?
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
On Mon, Jun 04, 2018 at 03:30:22PM +0530, Srikar Dronamraju wrote: > diff --git a/mm/migrate.c b/mm/migrate.c > index 8c0af0f..1c55956 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1874,11 +1874,9 @@ static bool numamigrate_update_ratelimit(pg_data_t > *pgdat, >* all the time is being spent migrating! >*/ > if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) { > - spin_lock(>numabalancing_migrate_lock); > - pgdat->numabalancing_migrate_nr_pages = 0; > - pgdat->numabalancing_migrate_next_window = jiffies + > - msecs_to_jiffies(migrate_interval_millisecs); > - spin_unlock(>numabalancing_migrate_lock); > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > + pgdat->numabalancing_migrate_next_window = jiffies + > + msecs_to_jiffies(migrate_interval_millisecs); Note that both are in fact wrong. That wants to be something like: pgdat->numabalancing_migrate_next_window += interval; Otherwise you stretch every interval by 'jiffies - numabalancing_migrate_next_window'. Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the spinlock/xchg. I suppose the problem here is that PPC has a very nasty test-and-set spinlock with fwd progress issues while xchg maps to a fairly simple ll/sc that (hopefully) has some hardware fairness. And pgdata being a rather course data structure (per node?) there could be a lot of CPUs stomping on this here thing. So simpler not really, but better for PPC. > } > if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages) { > trace_mm_numa_migrate_ratelimit(current, pgdat->node_id, > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 4526643..464a25c 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -6208,7 +6208,6 @@ static void __paginginit free_area_init_core(struct > pglist_data *pgdat) > > pgdat_resize_init(pgdat); > #ifdef CONFIG_NUMA_BALANCING > - spin_lock_init(>numabalancing_migrate_lock); > pgdat->numabalancing_migrate_nr_pages = 0; > pgdat->active_node_migrate = 0; > pgdat->numabalancing_migrate_next_window = jiffies; > -- > 1.8.3.1 >
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
On Mon, Jun 04, 2018 at 03:30:22PM +0530, Srikar Dronamraju wrote: > diff --git a/mm/migrate.c b/mm/migrate.c > index 8c0af0f..1c55956 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1874,11 +1874,9 @@ static bool numamigrate_update_ratelimit(pg_data_t > *pgdat, >* all the time is being spent migrating! >*/ > if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) { > - spin_lock(>numabalancing_migrate_lock); > - pgdat->numabalancing_migrate_nr_pages = 0; > - pgdat->numabalancing_migrate_next_window = jiffies + > - msecs_to_jiffies(migrate_interval_millisecs); > - spin_unlock(>numabalancing_migrate_lock); > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > + pgdat->numabalancing_migrate_next_window = jiffies + > + msecs_to_jiffies(migrate_interval_millisecs); Note that both are in fact wrong. That wants to be something like: pgdat->numabalancing_migrate_next_window += interval; Otherwise you stretch every interval by 'jiffies - numabalancing_migrate_next_window'. Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the spinlock/xchg. I suppose the problem here is that PPC has a very nasty test-and-set spinlock with fwd progress issues while xchg maps to a fairly simple ll/sc that (hopefully) has some hardware fairness. And pgdata being a rather course data structure (per node?) there could be a lot of CPUs stomping on this here thing. So simpler not really, but better for PPC. > } > if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages) { > trace_mm_numa_migrate_ratelimit(current, pgdat->node_id, > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 4526643..464a25c 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -6208,7 +6208,6 @@ static void __paginginit free_area_init_core(struct > pglist_data *pgdat) > > pgdat_resize_init(pgdat); > #ifdef CONFIG_NUMA_BALANCING > - spin_lock_init(>numabalancing_migrate_lock); > pgdat->numabalancing_migrate_nr_pages = 0; > pgdat->active_node_migrate = 0; > pgdat->numabalancing_migrate_next_window = jiffies; > -- > 1.8.3.1 >
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
On Mon, 2018-06-04 at 15:30 +0530, Srikar Dronamraju wrote: > > +++ b/mm/migrate.c > @@ -1874,11 +1874,9 @@ static bool > numamigrate_update_ratelimit(pg_data_t *pgdat, >* all the time is being spent migrating! >*/ > if (time_after(jiffies, pgdat- > >numabalancing_migrate_next_window)) { > - spin_lock(>numabalancing_migrate_lock); > - pgdat->numabalancing_migrate_nr_pages = 0; > - pgdat->numabalancing_migrate_next_window = jiffies + > - msecs_to_jiffies(migrate_interval_millisecs) > ; > - spin_unlock(>numabalancing_migrate_lock); > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > + pgdat->numabalancing_migrate_next_window = > jiffies + > + msecs_to_jiffies(migrate_interval_mi > llisecs); > } I am not convinced this is simpler, but no real objection either way :) -- All Rights Reversed. signature.asc Description: This is a digitally signed message part
Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock
On Mon, 2018-06-04 at 15:30 +0530, Srikar Dronamraju wrote: > > +++ b/mm/migrate.c > @@ -1874,11 +1874,9 @@ static bool > numamigrate_update_ratelimit(pg_data_t *pgdat, >* all the time is being spent migrating! >*/ > if (time_after(jiffies, pgdat- > >numabalancing_migrate_next_window)) { > - spin_lock(>numabalancing_migrate_lock); > - pgdat->numabalancing_migrate_nr_pages = 0; > - pgdat->numabalancing_migrate_next_window = jiffies + > - msecs_to_jiffies(migrate_interval_millisecs) > ; > - spin_unlock(>numabalancing_migrate_lock); > + if (xchg(>numabalancing_migrate_nr_pages, 0)) > + pgdat->numabalancing_migrate_next_window = > jiffies + > + msecs_to_jiffies(migrate_interval_mi > llisecs); > } I am not convinced this is simpler, but no real objection either way :) -- All Rights Reversed. signature.asc Description: This is a digitally signed message part