Hi Mel,

On 02/05/2013 09:32 PM, Mel Gorman wrote:
> On Tue, Feb 05, 2013 at 11:57:22AM +0000, Mel Gorman wrote:
>>
>>>> +                          migrate_pre_flag = 1;
>>>> +                  }
>>>> +
>>>> +                  if (!isolate_lru_page(pages[i])) {
>>>> +                          inc_zone_page_state(pages[i], NR_ISOLATED_ANON +
>>>> +                                           page_is_file_cache(pages[i]));
>>>> +                          list_add_tail(&pages[i]->lru, &pagelist);
>>>> +                  } else {
>>>> +                          isolate_err = 1;
>>>> +                          goto put_page;
>>>> +                  }
>>
>> isolate_lru_page() takes the LRU lock every time.
> 
> Credit to Michal Hocko for bringing this up but with the number of
> other issues I missed that this is also broken with respect to huge page
> handling. hugetlbfs pages will not be on the LRU so the isolation will mess
> up and the migration has to be handled differently.  Ordinarily hugetlbfs
> pages cannot be allocated from ZONE_MOVABLE but it is possible to configure
> it to be allowed via /proc/sys/vm/hugepages_treat_as_movable. If this
> encounters a hugetlbfs page, it'll just blow up.

I look into the migrate_huge_page() codes find that if we support the hugetlbfs
non movable migration, we have to invent another 
alloc_huge_page_node_nonmovable() 
or such allocate interface, which cost is large(exploding the codes and great 
impact
on current alloc_huge_page_node()) but gains little, I think that pinning 
hugepage
is a corner case. 

So can we skip over hugepage without migration but give some WARN_ON() info, is
it acceptable?

> 
> The other is that this almost certainly broken for transhuge page
> handling. gup returns the head and tail pages and ordinarily this is ok

I can't find codes doing such things :(, could you please point me out?

> because the caller only cares about the physical address. Migration will
> also split a hugepage if it receives it but you are potentially adding
> tail pages to a list here and then migrating them. The split of the first
> page will get very confused. I'm not exactly sure what the result will be
> but it won't be pretty.
> 
> Was THP enabled when this was tested? Was CONFIG_DEBUG_LIST enabled
> during testing?

I checked my config file that both CONFIG options aboved are enabled. However 
it was 
only be tested by two services invoking io_setup(), it works fine..

thanks,
linfeng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to