Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-03-27 Thread Huang Ying
On Wed, 2015-03-25 at 10:54 +, Mel Gorman wrote:
> On Mon, Mar 23, 2015 at 04:46:21PM +0800, Huang Ying wrote:
> > > My attention is occupied by the automatic NUMA regression at the moment
> > > but I haven't forgotten this. Even with the high client count, I was not
> > > able to reproduce this so it appears to depend on the number of CPUs
> > > available to stress the allocator enough to bypass the per-cpu allocator
> > > enough to contend heavily on the zone lock. I'm hoping to think of a
> > > better alternative than adding more padding and increasing the cache
> > > footprint of the allocator but so far I haven't thought of a good
> > > alternative. Moving the lock to the end of the freelists would probably
> > > address the problem but still increases the footprint for order-0
> > > allocations by a cache line.
> > 
> > Any update on this?  Do you have some better idea?  I guess this may be
> > fixed via putting some fields that are only read during order-0
> > allocation with the same cache line of lock, if there are any.
> > 
> 
> Sorry for the delay, the automatic NUMA regression took a long time to
> close and it potentially affected anybody with a NUMA machine, not just
> stress tests on large machines.
> 
> Moving it beside other fields shifts the problems. The lock is related
> to the free areas so it really belongs nearby and from my own testing,
> it does not affect mid-sized machines. I'd rather not put the lock in its
> own cache line unless we have to. Can you try the following untested patch
> instead? It is untested but builds and should be safe.
> 
> It'll increase the footprint of the page allocator but so would padding.
> It means it will contend with high-order free page breakups but that
> is not likely to happen during stress tests. It also collides with flags
> but they are relatively rarely updated.
> 
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f279d9c158cd..2782df47101e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -474,16 +474,15 @@ struct zone {
>   unsigned long   wait_table_bits;
>  
>   ZONE_PADDING(_pad1_)
> -
> - /* Write-intensive fields used from the page allocator */
> - spinlock_t  lock;
> -
>   /* free areas of different sizes */
>   struct free_areafree_area[MAX_ORDER];
>  
>   /* zone flags, see below */
>   unsigned long   flags;
>  
> + /* Write-intensive fields used from the page allocator */
> + spinlock_t  lock;
> +
>   ZONE_PADDING(_pad2_)
>  
>   /* Write-intensive fields used by page reclaim */

Stress page allocator tests here show that performance is restored to
its previous level with the patch above.  I applied your patch on the
latest upstream kernel.  The result is as below:

testbox/testcase/testparams: brickland1/aim7/performance-6000-page_test

c875f421097a55d9  dbdc458f1b7d07f32891509c06
----------------  --------------------------
         %stddev      %change        %stddev
             \            |              \
     84568 ±  1%      +94.3%     164280 ±  1%  aim7.jobs-per-min
   2881944 ±  2%      -35.1%    1870386 ±  8%  aim7.time.voluntary_context_switches
       681 ±  1%       -3.4%        658 ±  0%  aim7.time.user_time
   5538139 ±  0%      -12.1%    4867884 ±  0%  aim7.time.involuntary_context_switches
     44174 ±  1%      -46.0%      23848 ±  1%  aim7.time.system_time
       426 ±  1%      -48.4%        219 ±  1%  aim7.time.elapsed_time
       426 ±  1%      -48.4%        219 ±  1%  aim7.time.elapsed_time.max
       468 ±  1%      -43.1%        266 ±  2%  uptime.boot
     13691 ±  0%      -24.2%      10379 ±  1%  softirqs.NET_RX
    931382 ±  2%      +24.9%    1163065 ±  1%  softirqs.RCU
    407717 ±  2%      -36.3%     259521 ±  9%  softirqs.SCHED
  19690372 ±  0%      -34.8%   12836548 ±  1%  softirqs.TIMER
      2442 ±  1%      -28.9%       1737 ±  5%  vmstat.procs.b
      3016 ±  3%      +19.4%       3603 ±  4%  vmstat.procs.r
    104330 ±  1%      +34.6%     140387 ±  0%  vmstat.system.in
     22172 ±  0%      +48.3%      32877 ±  2%  vmstat.system.cs
      1891 ± 12%      -48.2%        978 ± 10%  numa-numastat.node0.other_node
      1785 ± 14%      -47.7%        933 ±  6%  numa-numastat.node1.other_node
      1790 ± 12%      -47.8%        935 ± 10%  numa-numastat.node2.other_node
      1766 ± 14%      -47.0%        935 ± 12%  numa-numastat.node3.other_node
       426 ±  1%      -48.4%        219 ±  1%  time.elapsed_time.max
       426 ±  1%      -48.4%        219 ±  1%  time.elapsed_time
   5538139 ±  0%      -12.1%    4867884 ±  0%  time.involuntary_context_switches
     44174 ±  1%      -46.0%      23848 ±  1%  time.system_time
   2881944 ±  2%      -35.1%    1870386 ±  8%  time.voluntary_context_switches
   7831898 ±  4%      +31.8%   10325919 ±  5%  meminfo.Active
   7742498 ±  4%      +32.2%   10237222 ±  5%  meminfo.Active(anon)
   7231211 ±  4%      +28.7%    9308183 ±  5%  meminfo.AnonPag

Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-03-25 Thread Mel Gorman
On Mon, Mar 23, 2015 at 04:46:21PM +0800, Huang Ying wrote:
> > My attention is occupied by the automatic NUMA regression at the moment
> > but I haven't forgotten this. Even with the high client count, I was not
> > able to reproduce this so it appears to depend on the number of CPUs
> > available to stress the allocator enough to bypass the per-cpu allocator
> > enough to contend heavily on the zone lock. I'm hoping to think of a
> > better alternative than adding more padding and increasing the cache
> > footprint of the allocator but so far I haven't thought of a good
> > alternative. Moving the lock to the end of the freelists would probably
> > address the problem but still increases the footprint for order-0
> > allocations by a cache line.
> 
> Any update on this?  Do you have some better idea?  I guess this may be
> fixed via putting some fields that are only read during order-0
> allocation with the same cache line of lock, if there are any.
> 

Sorry for the delay, the automatic NUMA regression took a long time to
close and it potentially affected anybody with a NUMA machine, not just
stress tests on large machines.

Moving it beside other fields just shifts the problem. The lock is related
to the free areas so it really belongs nearby and, from my own testing,
it does not affect mid-sized machines. I'd rather not put the lock in its
own cache line unless we have to. Can you try the following patch
instead? It is untested, but it builds and should be safe.

It'll increase the footprint of the page allocator but so would padding.
It means it will contend with high-order free page breakups but that
is not likely to happen during stress tests. It also collides with flags
but they are relatively rarely updated.


diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f279d9c158cd..2782df47101e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -474,16 +474,15 @@ struct zone {
unsigned long   wait_table_bits;
 
ZONE_PADDING(_pad1_)
-
-   /* Write-intensive fields used from the page allocator */
-   spinlock_t  lock;
-
/* free areas of different sizes */
struct free_areafree_area[MAX_ORDER];
 
/* zone flags, see below */
unsigned long   flags;
 
+   /* Write-intensive fields used from the page allocator */
+   spinlock_t  lock;
+
ZONE_PADDING(_pad2_)
 
/* Write-intensive fields used by page reclaim */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-03-23 Thread Huang Ying
On Thu, 2015-03-05 at 10:26 +, Mel Gorman wrote:
> On Thu, Mar 05, 2015 at 01:34:59PM +0800, Huang Ying wrote:
> > Hi, Mel,
> > 
> > On Sat, 2015-02-28 at 15:30 +0800, Huang Ying wrote:
> > > On Sat, 2015-02-28 at 01:46 +, Mel Gorman wrote:
> > > > On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > > > > FYI, we noticed the below changes on
> > > > > 
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> > > > > master
> > > > > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone 
> > > > > fields into read-only, page alloc, statistics and page reclaim lines")
> > > > > 
> > > > > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I 
> > > > > suspect there are some cache ping-pong or false sharing.
> > > > > 
> > > > 
> > > > Are you sure about this result? I ran similar tests here and found that
> > > > there was a major regression introduced near there but it was commit
> > > > 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
> > > > cause the problem and it was later reverted.  On local tests on a 4-node
> > > > machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
> > > > of the previous commit and well within the noise.
> > > 
> > > After applying the below debug patch, the performance regression
> > > restored.  So I think we can root cause this regression to be cache line
> > > alignment related issue?
> > > 
> > > If my understanding were correct, after the 3484b2de94, lock and low
> > > address area free_area are in the same cache line, so that the cache
> > > line of the lock and the low address area of free_area will be switched
> > > between MESI "E" and "S" state because it is written in one CPU (page
> > > allocating with free_area) and frequently read (spinning on lock) in
> > > another CPU.
> > 
> > What do you think about this?
> > 
> 
> My attention is occupied by the automatic NUMA regression at the moment
> but I haven't forgotten this. Even with the high client count, I was not
> able to reproduce this so it appears to depend on the number of CPUs
> available to stress the allocator enough to bypass the per-cpu allocator
> enough to contend heavily on the zone lock. I'm hoping to think of a
> better alternative than adding more padding and increasing the cache
> footprint of the allocator but so far I haven't thought of a good
> alternative. Moving the lock to the end of the freelists would probably
> address the problem but still increases the footprint for order-0
> allocations by a cache line.

Any update on this?  Do you have a better idea?  I guess this might be
fixed by putting some fields that are only read during order-0
allocation in the same cache line as the lock, if there are any such fields.

Best Regards,
Huang, Ying




Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-03-05 Thread Mel Gorman
On Thu, Mar 05, 2015 at 01:34:59PM +0800, Huang Ying wrote:
> Hi, Mel,
> 
> On Sat, 2015-02-28 at 15:30 +0800, Huang Ying wrote:
> > On Sat, 2015-02-28 at 01:46 +, Mel Gorman wrote:
> > > On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > > > FYI, we noticed the below changes on
> > > > 
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone 
> > > > fields into read-only, page alloc, statistics and page reclaim lines")
> > > > 
> > > > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I 
> > > > suspect there are some cache ping-pong or false sharing.
> > > > 
> > > 
> > > Are you sure about this result? I ran similar tests here and found that
> > > there was a major regression introduced near there but it was commit
> > > 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
> > > cause the problem and it was later reverted.  On local tests on a 4-node
> > > machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
> > > of the previous commit and well within the noise.
> > 
> > After applying the below debug patch, the performance regression
> > restored.  So I think we can root cause this regression to be cache line
> > alignment related issue?
> > 
> > If my understanding were correct, after the 3484b2de94, lock and low
> > address area free_area are in the same cache line, so that the cache
> > line of the lock and the low address area of free_area will be switched
> > between MESI "E" and "S" state because it is written in one CPU (page
> > allocating with free_area) and frequently read (spinning on lock) in
> > another CPU.
> 
> What do you think about this?
> 

My attention is occupied by the automatic NUMA regression at the moment,
but I haven't forgotten this. Even with the high client count, I was not
able to reproduce this, so it appears to depend on having enough CPUs
available to stress the allocator, bypass the per-cpu allocator, and
contend heavily on the zone lock. I'm hoping to think of a better
alternative than adding more padding and increasing the cache footprint
of the allocator, but so far I haven't thought of a good one. Moving the
lock to the end of the freelists would probably address the problem but
still increases the footprint for order-0 allocations by a cache line.

-- 
Mel Gorman
SUSE Labs


Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-03-04 Thread Huang Ying
Hi, Mel,

On Sat, 2015-02-28 at 15:30 +0800, Huang Ying wrote:
> On Sat, 2015-02-28 at 01:46 +, Mel Gorman wrote:
> > On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > > FYI, we noticed the below changes on
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone 
> > > fields into read-only, page alloc, statistics and page reclaim lines")
> > > 
> > > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> > > there are some cache ping-pong or false sharing.
> > > 
> > 
> > Are you sure about this result? I ran similar tests here and found that
> > there was a major regression introduced near there but it was commit
> > 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
> > cause the problem and it was later reverted.  On local tests on a 4-node
> > machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
> > of the previous commit and well within the noise.
> 
> After applying the below debug patch, the performance regression
> restored.  So I think we can root cause this regression to be cache line
> alignment related issue?
> 
> If my understanding were correct, after the 3484b2de94, lock and low
> address area free_area are in the same cache line, so that the cache
> line of the lock and the low address area of free_area will be switched
> between MESI "E" and "S" state because it is written in one CPU (page
> allocating with free_area) and frequently read (spinning on lock) in
> another CPU.

What do you think about this?

Best Regards,
Huang, Ying

> Best Regards,
> Huang, Ying
> 
> ---
>  include/linux/mmzone.h |2 ++
>  1 file changed, 2 insertions(+)
> 
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -468,6 +468,8 @@ struct zone {
>   /* Write-intensive fields used from the page allocator */
>   spinlock_t  lock;
>  
> + ZONE_PADDING(_pad_xx_)
> +
>   /* free areas of different sizes */
>   struct free_areafree_area[MAX_ORDER];
>  
> 




Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-02-27 Thread Huang Ying
On Fri, 2015-02-27 at 11:53 +, Mel Gorman wrote:
> On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > FYI, we noticed the below changes on
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone fields 
> > into read-only, page alloc, statistics and page reclaim lines")
> > 
> > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> > there are some cache ping-pong or false sharing.
> > 
> 
> Annoying because this is pretty much the opposite of what I found during
> testing. What is the kernel config? Similar to the kernel config, can you
> post "pahole -C zone vmlinux" for the kernel you built? I should get the same
> result if I use the same kernel config but no harm in being sure. Thanks.

The output of "pahole -C zone vmlinux" for the two kernels is as below.

Best Regards,
Huang, Ying

3484b2de94
---
struct zone {
long unsigned int  watermark[3]; /* 024 */
long int   lowmem_reserve[4];/*2432 */
intnode; /*56 4 */
unsigned int   inactive_ratio;   /*60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct pglist_data *   zone_pgdat;   /*64 8 */
struct per_cpu_pageset *   pageset;  /*72 8 */
long unsigned int  dirty_balance_reserve; /*80 8 */
long unsigned int  min_unmapped_pages;   /*88 8 */
long unsigned int  min_slab_pages;   /*96 8 */
long unsigned int  zone_start_pfn;   /*   104 8 */
long unsigned int  managed_pages;/*   112 8 */
long unsigned int  spanned_pages;/*   120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
long unsigned int  present_pages;/*   128 8 */
const char  *  name; /*   136 8 */
intnr_migrate_reserve_block; /*   144 4 */
seqlock_t  span_seqlock; /*   148 8 */

/* XXX 4 bytes hole, try to pack */

wait_queue_head_t *wait_table;   /*   160 8 */
long unsigned int  wait_table_hash_nr_entries; /*   168 8 */
long unsigned int  wait_table_bits;  /*   176 8 */

/* XXX 8 bytes hole, try to pack */

/* --- cacheline 3 boundary (192 bytes) --- */
struct zone_padding_pad1_;   /*   192 0 */
spinlock_t lock; /*   192 4 */

/* XXX 4 bytes hole, try to pack */

struct free_area   free_area[11];/*   200   968 */
/* --- cacheline 18 boundary (1152 bytes) was 16 bytes ago --- */
long unsigned int  flags;/*  1168 8 */

/* XXX 40 bytes hole, try to pack */

/* --- cacheline 19 boundary (1216 bytes) --- */
struct zone_padding_pad2_;   /*  1216 0 */
spinlock_t lru_lock; /*  1216 4 */

/* XXX 4 bytes hole, try to pack */

long unsigned int  pages_scanned;/*  1224 8 */
struct lruvec  lruvec;   /*  1232   120 */
/* --- cacheline 21 boundary (1344 bytes) was 8 bytes ago --- */
atomic_long_t  inactive_age; /*  1352 8 */
long unsigned int  percpu_drift_mark;/*  1360 8 */
long unsigned int  compact_cached_free_pfn; /*  1368 8 */
long unsigned int  compact_cached_migrate_pfn[2]; /*  1376    16 */
unsigned int   compact_considered;   /*  1392 4 */
unsigned int   compact_defer_shift;  /*  1396 4 */
intcompact_order_failed; /*  1400 4 */
bool   compact_blockskip_flush; /*  1404 1 */

/* XXX 3 bytes hole, try to pack */

/* --- cacheline 22 boundary (1408 bytes) --- */
struct zone_padding_pad3_;   /*  1408 0 */
atomic_long_t  vm_stat[38];  /*  1408   304 */
/* --- cacheline 26 boundary (1664 bytes) was 48 bytes ago --- */

/* size: 1728, cachelines: 27, members: 37 */
/* sum members: 1649, holes: 6, sum holes: 63 */
/* padding: 16 */
};

24b7e5819a
---
struct zone {
long unsigned int  watermark[3]; /* 024 */
long unsigned int  percpu_drift_mark;/*24 8 */
long unsigned int

Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-02-27 Thread Huang Ying
On Sat, 2015-02-28 at 01:46 +, Mel Gorman wrote:
> On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > FYI, we noticed the below changes on
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone fields 
> > into read-only, page alloc, statistics and page reclaim lines")
> > 
> > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> > there are some cache ping-pong or false sharing.
> > 
> 
> Are you sure about this result? I ran similar tests here and found that
> there was a major regression introduced near there but it was commit
> 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
> cause the problem and it was later reverted.  On local tests on a 4-node
> machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
> of the previous commit and well within the noise.

After applying the debug patch below, the performance regression was
resolved.  So I think we can root-cause this regression to a cache line
alignment issue?

If my understanding is correct, after 3484b2de94 the lock and the low
address area of free_area are in the same cache line, so that cache line
will be switched between the MESI "E" and "S" states because it is
written by one CPU (page allocation updating free_area) while being
frequently read (spinning on the lock) by another CPU.

Best Regards,
Huang, Ying

---
 include/linux/mmzone.h |2 ++
 1 file changed, 2 insertions(+)

--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -468,6 +468,8 @@ struct zone {
/* Write-intensive fields used from the page allocator */
spinlock_t  lock;
 
+   ZONE_PADDING(_pad_xx_)
+
/* free areas of different sizes */
struct free_areafree_area[MAX_ORDER];
 




Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-02-27 Thread Huang Ying
On Sat, 2015-02-28 at 10:30 +0800, Huang Ying wrote:
> On Sat, 2015-02-28 at 01:46 +, Mel Gorman wrote:
> > On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > > FYI, we noticed the below changes on
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone 
> > > fields into read-only, page alloc, statistics and page reclaim lines")
> > > 
> > > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> > > there are some cache ping-pong or false sharing.
> > > 
> > 
> > Are you sure about this result? I ran similar tests here and found that
> > there was a major regression introduced near there but it was commit
> > 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
> > cause the problem and it was later reverted.  On local tests on a 4-node
> > machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
> > of the previous commit and well within the noise.
> 
> I have double checked the result before sending out.
> 
> Do you do the test with same kernel config and test case/parameters
> (aim7/page_test/load 6000)?

Or you can show your test case and parameters and I can try that too.

Best Regards,
Huang, Ying




Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-02-27 Thread Huang Ying
On Sat, 2015-02-28 at 01:46 +, Mel Gorman wrote:
> On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> > FYI, we noticed the below changes on
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone fields 
> > into read-only, page alloc, statistics and page reclaim lines")
> > 
> > The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> > there are some cache ping-pong or false sharing.
> > 
> 
> Are you sure about this result? I ran similar tests here and found that
> there was a major regression introduced near there but it was commit
> 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
> cause the problem and it was later reverted.  On local tests on a 4-node
> machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
> of the previous commit and well within the noise.

I double-checked the result before sending it out.

Did you run the test with the same kernel config and test
case/parameters (aim7/page_test/load 6000)?

Best Regards,
Huang, Ying




Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-02-27 Thread Mel Gorman
On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> FYI, we noticed the below changes on
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone fields 
> into read-only, page alloc, statistics and page reclaim lines")
> 
> The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> there are some cache ping-pong or false sharing.
> 

Are you sure about this result? I ran similar tests here and found that
there was a major regression introduced near there but it was commit
05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") that
caused the problem, and it was later reverted.  On local tests on a 4-node
machine, commit 3484b2de9499df23c4604a513b36f96326ae81ad was within 1%
of the previous commit and well within the noise.

-- 
Mel Gorman
SUSE Labs


Re: [LKP] [mm] 3484b2de949: -46.2% aim7.jobs-per-min

2015-02-27 Thread Mel Gorman
On Fri, Feb 27, 2015 at 03:21:36PM +0800, Huang Ying wrote:
> FYI, we noticed the below changes on
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit 3484b2de9499df23c4604a513b36f96326ae81ad ("mm: rearrange zone fields 
> into read-only, page alloc, statistics and page reclaim lines")
> 
> The perf cpu-cycles for spinlock (zone->lock) increased a lot.  I suspect 
> there are some cache ping-pong or false sharing.
> 

Annoying, because this is pretty much the opposite of what I found during
testing. What is the kernel config? Along with the kernel config, can you
post "pahole -C zone vmlinux" for the kernel you built? I should get the
same result if I use the same config, but there is no harm in being sure. Thanks.

-- 
Mel Gorman
SUSE Labs