Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock

2018-06-05 Thread Peter Zijlstra
On Tue, Jun 05, 2018 at 12:24:39AM -0700, Srikar Dronamraju wrote:
> * Peter Zijlstra  [2018-06-04 21:28:21]:
> 
> > >   if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) {
> > > -         spin_lock(&pgdat->numabalancing_migrate_lock);
> > > -         pgdat->numabalancing_migrate_nr_pages = 0;
> > > -         pgdat->numabalancing_migrate_next_window = jiffies +
> > > -                 msecs_to_jiffies(migrate_interval_millisecs);
> > > -         spin_unlock(&pgdat->numabalancing_migrate_lock);
> > > +         if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
> > > +                 pgdat->numabalancing_migrate_next_window = jiffies +
> > > +                         msecs_to_jiffies(migrate_interval_millisecs);
> >
> > Note that both are in fact wrong. That wants to be something like:
> >
> > pgdat->numabalancing_migrate_next_window += interval;
> >
> > Otherwise you stretch every interval by 'jiffies - 
> > numabalancing_migrate_next_window'.
> 
> Okay, I get your point.

Note that in practice it probably doesn't matter, but it just upsets my
OCD ;-)

> > Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the
> > spinlock/xchg.

> unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window);
> 
> if (time_after(jiffies, interval)) {
>         interval += msecs_to_jiffies(migrate_interval_millisecs);
>         if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
>                 WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval);
> }
> 
> Something like this?

Almost, you forgot about the case where 'jiffies -
numabalancing_migrate_next_window > interval'.

That wants to be something like:

        unsigned long timo = READ_ONCE(stupid_long_name);

        if (time_after(jiffies, timo) && xchg(&stupid_long_name, 0)) {
                do {
                        timo += msecs_to_jiffies(..);
                } while (unlikely(time_after(jiffies, timo)));

                WRITE_ONCE(stupid_long_name, timo);
        }
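
Putting the pieces above together with the real field names, a minimal sketch
of the lock-free variant being discussed (a sketch of the pattern only, using
the usual jiffies/xchg/READ_ONCE semantics, not a tested patch) could look
like:

        unsigned long timo = READ_ONCE(pgdat->numabalancing_migrate_next_window);

        /* Only the CPU whose xchg() sees a non-zero page count does the update. */
        if (time_after(jiffies, timo) &&
            xchg(&pgdat->numabalancing_migrate_nr_pages, 0)) {
                /* Catch up in whole intervals so the window cadence never drifts. */
                do {
                        timo += msecs_to_jiffies(migrate_interval_millisecs);
                } while (unlikely(time_after(jiffies, timo)));

                WRITE_ONCE(pgdat->numabalancing_migrate_next_window, timo);
        }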



Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock

2018-06-05 Thread Srikar Dronamraju
* Peter Zijlstra  [2018-06-04 21:28:21]:

> >     if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) {
> > -           spin_lock(&pgdat->numabalancing_migrate_lock);
> > -           pgdat->numabalancing_migrate_nr_pages = 0;
> > -           pgdat->numabalancing_migrate_next_window = jiffies +
> > -                   msecs_to_jiffies(migrate_interval_millisecs);
> > -           spin_unlock(&pgdat->numabalancing_migrate_lock);
> > +           if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
> > +                   pgdat->numabalancing_migrate_next_window = jiffies +
> > +                           msecs_to_jiffies(migrate_interval_millisecs);
>
> Note that both are in fact wrong. That wants to be something like:
>
>   pgdat->numabalancing_migrate_next_window += interval;
>
> Otherwise you stretch every interval by 'jiffies - 
> numabalancing_migrate_next_window'.

Okay, I get your point.


>
> Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the
> spinlock/xchg.
>
> I suppose the problem here is that PPC has a very nasty test-and-set
> spinlock with fwd progress issues while xchg maps to a fairly simple
> ll/sc that (hopefully) has some hardware fairness.
>
> And pgdat being a rather coarse data structure (per node?) there could
> be a lot of CPUs stomping on this here thing.
>
> So simpler not really, but better for PPC.
>

unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window);

if (time_after(jiffies, interval)) {
        interval += msecs_to_jiffies(migrate_interval_millisecs);
        if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
                WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval);
}

Something like this?



Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock

2018-06-04 Thread Peter Zijlstra
On Mon, Jun 04, 2018 at 03:30:22PM +0530, Srikar Dronamraju wrote:
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8c0af0f..1c55956 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1874,11 +1874,9 @@ static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
>        * all the time is being spent migrating!
>        */
>       if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) {
> -             spin_lock(&pgdat->numabalancing_migrate_lock);
> -             pgdat->numabalancing_migrate_nr_pages = 0;
> -             pgdat->numabalancing_migrate_next_window = jiffies +
> -                     msecs_to_jiffies(migrate_interval_millisecs);
> -             spin_unlock(&pgdat->numabalancing_migrate_lock);
> +             if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
> +                     pgdat->numabalancing_migrate_next_window = jiffies +
> +                             msecs_to_jiffies(migrate_interval_millisecs);

Note that both are in fact wrong. That wants to be something like:

pgdat->numabalancing_migrate_next_window += interval;

Otherwise you stretch every interval by 'jiffies - 
numabalancing_migrate_next_window'.
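
To see the stretch concretely, here is a tiny userspace toy (an illustration
only, not kernel code; the 30-tick lateness is an arbitrary assumption)
comparing "next = jiffies + interval" with "next += interval" when every
expiry is noticed a little late:

#include <stdio.h>

int main(void)
{
        const unsigned long interval = 100;     /* window length, in fake jiffies */
        const unsigned long delay = 30;         /* how late each expiry is noticed */
        unsigned long end_a = interval;         /* rule A: next = jiffies + interval */
        unsigned long end_b = interval;         /* rule B: next += interval */

        for (int i = 1; i <= 5; i++) {
                unsigned long now = end_a + delay;      /* the late check */

                end_a = now + interval;         /* every window becomes interval + delay long */
                end_b += interval;              /* boundaries stay on a fixed grid */
                printf("window %d ends: rule A %lu, rule B %lu\n",
                       i + 1, end_a, end_b);
        }
        return 0;
}

Rule A finishes window 6 at tick 750 instead of 600: each of its windows has
been stretched by the 30-tick lateness, while rule B keeps a fixed cadence.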

Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the
spinlock/xchg.

I suppose the problem here is that PPC has a very nasty test-and-set
spinlock with fwd progress issues while xchg maps to a fairly simple
ll/sc that (hopefully) has some hardware fairness.

And pgdat being a rather coarse data structure (per node?) there could
be a lot of CPUs stomping on this here thing.

So simpler not really, but better for PPC.
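
As an aside on the contention point above: the xchg() acts as a one-shot
claim, so of all the CPUs that notice the expired window exactly one sees the
non-zero page count and performs the reset, and nobody spins waiting for a
lock. A small userspace sketch of that claim pattern using C11 atomics
(illustration only; the thread count and values are arbitrary):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_ulong nr_pages = 512;     /* pages migrated in the expired window */
static atomic_ulong next_window;        /* 0: pretend the window has expired */
static atomic_int winners;

static void *check_window(void *arg)
{
        (void)arg;
        /* The claim: exactly one thread gets the old non-zero value back. */
        if (atomic_exchange(&nr_pages, 0)) {
                atomic_fetch_add(&winners, 1);
                atomic_store(&next_window, 100);        /* advance the window */
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[8];

        for (int i = 0; i < 8; i++)
                pthread_create(&tid[i], NULL, check_window, NULL);
        for (int i = 0; i < 8; i++)
                pthread_join(tid[i], NULL);

        printf("threads that performed the reset: %d\n", atomic_load(&winners));
        printf("new window: %lu\n", atomic_load(&next_window));
        return 0;
}

Built with -pthread, it always reports exactly one winning thread, however
many threads race.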

>       }
>       if (pgdat->numabalancing_migrate_nr_pages > ratelimit_pages) {
>               trace_mm_numa_migrate_ratelimit(current, pgdat->node_id,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4526643..464a25c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6208,7 +6208,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>  
>       pgdat_resize_init(pgdat);
>  #ifdef CONFIG_NUMA_BALANCING
> -     spin_lock_init(&pgdat->numabalancing_migrate_lock);
>       pgdat->numabalancing_migrate_nr_pages = 0;
>       pgdat->active_node_migrate = 0;
>       pgdat->numabalancing_migrate_next_window = jiffies;
> -- 
> 1.8.3.1
> 


Re: [PATCH 13/19] mm/migrate: Use xchg instead of spinlock

2018-06-04 Thread Rik van Riel
On Mon, 2018-06-04 at 15:30 +0530, Srikar Dronamraju wrote:
> 
> +++ b/mm/migrate.c
> @@ -1874,11 +1874,9 @@ static bool numamigrate_update_ratelimit(pg_data_t *pgdat,
>        * all the time is being spent migrating!
>        */
>       if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) {
> -             spin_lock(&pgdat->numabalancing_migrate_lock);
> -             pgdat->numabalancing_migrate_nr_pages = 0;
> -             pgdat->numabalancing_migrate_next_window = jiffies +
> -                     msecs_to_jiffies(migrate_interval_millisecs);
> -             spin_unlock(&pgdat->numabalancing_migrate_lock);
> +             if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
> +                     pgdat->numabalancing_migrate_next_window = jiffies +
> +                             msecs_to_jiffies(migrate_interval_millisecs);
>       }

I am not convinced this is simpler, but no real
objection either way :)

-- 
All Rights Reversed.


