Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-17 Thread Ingo Molnar
* Mel Gorman wrote: > > > [...] Holding PTL across task_numa_fault is bad, but not > > > the bad we're looking for. > > > > No, holding the PTL across task_numa_fault() is fine, > > because this bit got reworked in my tree rather > > significantly, see: > > > > 6030a23a1c66 sched: Move

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-13 Thread Srikar Dronamraju
* Mel Gorman [2012-12-07 10:23:03]: > This is a full release of all the patches so apologies for the flood. V9 was > just a MIPS build fix and did not justify a full release. V10 includes Ingo's > scalability patches because even though they increase system CPU usage, > they also helped in a

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-11 Thread Mel Gorman
On Tue, Dec 11, 2012 at 09:52:38AM +0100, Ingo Molnar wrote: > > * Mel Gorman wrote: > > > On Mon, Dec 10, 2012 at 03:24:05PM +0000, Mel Gorman wrote: > > > For example, I think that point 5 above is the potential source of the > > > corruption because. You're not flushing the TLBs for the PTEs

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-11 Thread Mel Gorman
On Tue, Dec 11, 2012 at 10:18:07AM +0100, Ingo Molnar wrote: > > * Ingo Molnar wrote: > > > > This is prototype only but what I was using as a reference > > > to see could I spot a problem in yours. It has not been even > > > boot tested but avoids remote->remote copies, contending on > > >

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-11 Thread Ingo Molnar
* Ingo Molnar wrote: > > This is prototype only but what I was using as a reference > > to see could I spot a problem in yours. It has not been even > > boot tested but avoids remote->remote copies, contending on > > PTL or holding it longer than necessary (should anyway) > > So ... because

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-11 Thread Ingo Molnar
* Mel Gorman wrote: > On Mon, Dec 10, 2012 at 03:24:05PM +0000, Mel Gorman wrote: > > For example, I think that point 5 above is the potential source of the > > corruption because. You're not flushing the TLBs for the PTEs you are > > updating in batch. Granted, you're relaxing rather than

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Mel Gorman
On Mon, Dec 10, 2012 at 03:24:05PM +0000, Mel Gorman wrote: > For example, I think that point 5 above is the potential source of the > corruption because. You're not flushing the TLBs for the PTEs you are > updating in batch. Granted, you're relaxing rather than restricting access > so it should

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Ingo Molnar
* Srikar Dronamraju wrote: > KernelVersion: 3.7.0-rc8-tip_master+(December 7th Snapshot) > Please do let me know if you have questions/suggestions. Do you still have the exact sha1 by any chance? By the date of the snapshot I'd say that this fix: f0c77b62ba9d sched: Fix

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Srikar Dronamraju
Hi Mel, Ingo, Here are the results of running autonumabenchmark on a 64 core, 8 node machine. Has six 32GB nodes and two 64 GB nodes. KernelVersion: 3.7.0-rc8 Testcase: Min Max Avg numa01: 1475.37 1615.39 1555.24

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Mel Gorman
On Mon, Dec 10, 2012 at 12:39:45PM +0100, Ingo Molnar wrote: > > * Mel Gorman wrote: > > > On Fri, Dec 07, 2012 at 12:01:13PM +0100, Ingo Molnar wrote: > > > > > > * Mel Gorman wrote: > > > > > > > This is a full release of all the patches so apologies for the > > > > flood. [...] > > > >

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Ingo Molnar
hi Srikar, * Srikar Dronamraju wrote: > > > > Either way, last night I applied a patch on top of latest tip/master to > > remove the nr_cpus_allowed check so that numacore would be enabled again > > and tested that. In some places it has indeed much improved. In others > > it is still

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Ingo Molnar
* Ingo Molnar wrote: > > reasons. As it turns out, a printk() bodge showed that > > nr_cpus_allowed == 80 set in sched_init_smp() while > > num_online_cpus() == 48. This effectively disabling > > numacore. If you had responded to the bug report, this would > > likely have been found last

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Ingo Molnar
* Mel Gorman wrote: > On Fri, Dec 07, 2012 at 12:01:13PM +0100, Ingo Molnar wrote: > > > > * Mel Gorman wrote: > > > > > This is a full release of all the patches so apologies for the > > > flood. [...] > > > > I have yet to process all your mails, but assuming I address all > > your

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Mel Gorman
On Mon, Dec 10, 2012 at 10:37:10AM +0530, Srikar Dronamraju wrote: > > > > Either way, last night I applied a patch on top of latest tip/master to > > remove the nr_cpus_allowed check so that numacore would be enabled again > > and tested that. In some places it has indeed much improved. In

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-10 Thread Mel Gorman
On Sun, Dec 09, 2012 at 11:17:09PM +0200, Kirill A. Shutemov wrote: > On Sun, Dec 09, 2012 at 08:36:31PM +0000, Mel Gorman wrote: > > Either way, last night I applied a patch on top of latest tip/master to > > remove the nr_cpus_allowed check so that numacore would be enabled again > > and tested

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-09 Thread Srikar Dronamraju
* Srikar Dronamraju [2012-12-10 10:37:10]: > > > > Either way, last night I applied a patch on top of latest tip/master to > > remove the nr_cpus_allowed check so that numacore would be enabled again > > and tested that. In some places it has indeed much improved. In others > > it is still

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-09 Thread Srikar Dronamraju
> > Either way, last night I applied a patch on top of latest tip/master to > remove the nr_cpus_allowed check so that numacore would be enabled again > and tested that. In some places it has indeed much improved. In others > it is still regressing badly and in two cases, it's corrupting memory --

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-09 Thread Kirill A. Shutemov
On Sun, Dec 09, 2012 at 08:36:31PM +0000, Mel Gorman wrote: > Either way, last night I applied a patch on top of latest tip/master to > remove the nr_cpus_allowed check so that numacore would be enabled again > and tested that. In some places it has indeed much improved. In others > it is still

Re: [PATCH 00/49] Automatic NUMA Balancing v10

2012-12-07 Thread Ingo Molnar
* Mel Gorman wrote: > This is a full release of all the patches so apologies for the > flood. [...] I have yet to process all your mails, but assuming I address all your review feedback and the latest unified tree in tip:master shows no regression in your testing, would you be willing to
