On 02/11/2014 03:05 AM, David Rientjes wrote:
On Mon, 10 Feb 2014, Raghavendra K T wrote:

So I understand that you are suggesting implementations like the ones below.

1) I have no problem with the approach below, and I could post it in
the next version.
(But it does not include the 4k limit that Linus mentioned.)

unsigned long max_sane_readahead(unsigned long nr)
{
         unsigned long local_free_page;
         int nid;

         nid = numa_mem_id();

         /*
          * We sanitize readahead size depending on free memory in
          * the local node.
          */
         local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
                           + node_page_state(nid, NR_FREE_PAGES);
         return min(nr, local_free_page / 2);
}

2) I did not go with the version below because Honza (Jan Kara) had
some concerns about the 4k limit for the normal case, and since I am
not an expert, I was waiting for opinions.

unsigned long max_sane_readahead(unsigned long nr)
{
         unsigned long local_free_page, sane_nr;
         int nid;

         nid = numa_mem_id();
         /* limit the max readahead to 4k pages */
         sane_nr = min(nr, MAX_REMOTE_READAHEAD);

         /*
          * We sanitize readahead size depending on free memory in
          * the local node.
          */
         local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
                           + node_page_state(nid, NR_FREE_PAGES);
         return min(sane_nr, local_free_page / 2);
}
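
To make the arithmetic of the two variants concrete, here is a minimal userspace sketch of variant (2). It is not kernel code: the two counters are passed in as plain arguments instead of being read with node_page_state(), the helper names min_ul() and model_max_sane_readahead() are hypothetical, and the value 4096 for MAX_REMOTE_READAHEAD is an assumption based only on the "4k pages" comment, since the real definition is not shown in this thread.

```c
/*
 * Assumed cap, taken from the "4k pages" comment above. The actual
 * definition of MAX_REMOTE_READAHEAD does not appear in this thread.
 */
#define MAX_REMOTE_READAHEAD 4096UL

/* Hypothetical stand-in for the kernel's min() macro. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/*
 * Userspace model of variant (2). inactive_file and free_pages stand in
 * for node_page_state(nid, NR_INACTIVE_FILE) and
 * node_page_state(nid, NR_FREE_PAGES) on the chosen node.
 */
unsigned long model_max_sane_readahead(unsigned long nr,
                                       unsigned long inactive_file,
                                       unsigned long free_pages)
{
    /* First clamp the request to the fixed cap. */
    unsigned long sane_nr = min_ul(nr, MAX_REMOTE_READAHEAD);

    /* Then clamp to half of the reclaimable-plus-free pages on the node. */
    unsigned long local_free_page = inactive_file + free_pages;

    return min_ul(sane_nr, local_free_page / 2);
}
```

With both counters at zero, as they would be for a node that has no memory, the model returns 0 for any request, so the outcome depends entirely on which node id the counters are read from.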


I have no opinion on the 4KB pages, either of the above is just fine.


I was able to test implementation (1) on the system where the readahead problem occurred. Unfortunately, it did not help.

The reason seems to be numa_mem_id()'s dependency on
CONFIG_HAVE_MEMORYLESS_NODES. The PPC machine on which I am facing the
problem has a topology like this:

numactl -H
---------
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 12 13 14 15 16 17 18 19 20 21 22 23 24 25
...
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 8 9 10 11 32 33 34 35 ...
node 1 size: 8071 MB
node 1 free: 2479 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

So it seems numa_mem_id() does not help for all configs.
Am I missing something?
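
For illustration, the lookup that would be needed on this topology can be sketched in userspace as below. This is only a model under stated assumptions: the node sizes and distances are copied from the numactl -H output above, and nearest_node_with_memory() is a hypothetical helper, not a kernel function. The background assumption, which should be checked against include/linux/topology.h, is that without CONFIG_HAVE_MEMORYLESS_NODES numa_mem_id() simply returns the CPU's own node id, which would be the memoryless node 0 for most CPUs on this machine.

```c
#define NR_NODES 2

/* Node sizes in MB, copied from the numactl -H output above. */
static const unsigned long node_size_mb[NR_NODES] = { 0, 8071 };

/* Node distance table, copied from the numactl -H output above. */
static const int node_distance[NR_NODES][NR_NODES] = {
    { 10, 20 },
    { 20, 10 },
};

/*
 * Hypothetical helper: return the node closest to nid that has memory,
 * or -1 if no node has any memory.
 */
int nearest_node_with_memory(int nid)
{
    int best = -1;

    for (int n = 0; n < NR_NODES; n++) {
        /* Skip memoryless nodes such as node 0 on this machine. */
        if (node_size_mb[n] == 0)
            continue;
        if (best < 0 || node_distance[nid][n] < node_distance[nid][best])
            best = n;
    }
    return best;
}
```

In this model a CPU on node 0 resolves to node 1, the only node with memory, while a CPU on node 1 resolves to itself.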
