On Wed, 2007-08-08 at 22:44 +0100, Mel Gorman wrote:
> On (08/08/07 14:30), Lee Schermerhorn didst pronounce:
> > On Wed, 2007-08-08 at 10:36 -0700, Christoph Lameter wrote:
> > > On Wed, 8 Aug 2007, Mel Gorman wrote:
> > >
> > > <snip>
> > > > o Remove bind_zonelist() (Patch in progress, very messy right now)
> > >
> > > Will this also allow us to avoid always hitting the first node of an
> > > MPOL_BIND first?
> >
> > An idea:
> >
> > Apologies if someone already suggested this and I missed it.  Too much
> > traffic...
> >
> > instead of passing a zonelist for BIND policy, how about passing [to
> > __alloc_pages(), I think] a starting node, a nodemask, and gfp flags for
> > zone and modifiers.
>
> Yes, this has come up before although it wasn't my initial suggestion. I
> thought maybe it was yours but I'm not sure anymore. I'm working through
> it at the moment.

I've heard/seen Christoph mention passing a nodemask to alloc_pages a
few times, but hadn't seen any of the details.  Got me thinking...

> With the patch currently, a nodemask is passed in for
> filtering which should be enough as the zonelist being used should be enough
> information to indicate the starting node.

It'll take me a while to absorb the patch, so I'll just ask:

Where does the zonelist for the argument come from?  If the bind policy
zonelist is removed, then does it come from a node?  There'll be only
one per node with your other patches, right?  So you had to have a node
id to look up the zonelist?  Do you need the zonelist elsewhere, outside
of alloc_pages()?  If not, why not just let alloc_pages look it up from
a starting node [which I think can be determined from the policy]?

OK, that's a lot of questions.  No need to answer.  That's just what I'm
thinking re: all this.  I'll wait and see how the patch develops.

>
> The signature of __alloc_pages() becomes
>
> static page * fastcall
> __alloc_pages_nodemask(gfp_t gfp_mask, nodemask_t *nodemask,
>                 unsigned int order, struct zonelist *zonelist)
>
> > For various policies, the arguments would look like this:
> > Policy        start node      nodemask
> >
> > default       local node      cpuset_current_mems_allowed
> >
> > preferred     preferred_node  cpuset_current_mems_allowed
> >
> > interleave    computed node   cpuset_current_mems_allowed
> >
> > bind          local node      policy nodemask [replaces bind
> >                               zonelist in mempolicy]
> >
>
> The last one is the most interesting. Much of the patch in development
> involves deleting the custom node stuff. I've included the patch below if
> you're curious. I wanted to get one-zonelist out first to see if we could
> agree on that before going further with it.

Again, it'll be a while.
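
To make the policy table quoted above a bit more concrete, here is a rough
userspace sketch of how each policy might map to a (start node, nodemask)
pair.  None of this is kernel code or taken from the patch: every sketch_*
name is made up for illustration, the nodemask is just a 64-bit word, and
the node walk is in plain numeric order, whereas a real zonelist is ordered
by distance from the starting node.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are NOT kernel definitions. */
typedef uint64_t sketch_nodemask_t;	/* one bit per node, at most 64 nodes */

enum sketch_policy { SK_DEFAULT, SK_PREFERRED, SK_INTERLEAVE, SK_BIND };

struct sketch_mempolicy {
	enum sketch_policy mode;
	int preferred_node;		/* used by SK_PREFERRED */
	int interleave_node;		/* used by SK_INTERLEAVE, assumed precomputed */
	sketch_nodemask_t bind_nodes;	/* used by SK_BIND */
};

struct sketch_alloc_args {
	int start_node;
	sketch_nodemask_t nodemask;
};

/*
 * Map a policy to the (start node, nodemask) pair from the table:
 * every policy starts from the local node and the cpuset mask unless
 * the table says otherwise.
 */
struct sketch_alloc_args
sketch_policy_args(const struct sketch_mempolicy *pol, int local_node,
		   sketch_nodemask_t cpuset_allowed)
{
	struct sketch_alloc_args args = { local_node, cpuset_allowed };

	switch (pol->mode) {
	case SK_PREFERRED:
		args.start_node = pol->preferred_node;
		break;
	case SK_INTERLEAVE:
		args.start_node = pol->interleave_node;
		break;
	case SK_BIND:
		/* the policy nodemask replaces the bind zonelist */
		args.nodemask = pol->bind_nodes;
		break;
	case SK_DEFAULT:
		break;
	}
	return args;
}

/*
 * Return the first node at or after start_node (wrapping around) that
 * the mask allows, or -1 if the mask allows none of the nr_nodes nodes.
 * Numeric order is a simplification of a distance-ordered zonelist.
 */
int sketch_first_allowed(struct sketch_alloc_args args, int nr_nodes)
{
	assert(nr_nodes > 0 && nr_nodes <= 64);
	assert(args.start_node >= 0);

	for (int i = 0; i < nr_nodes; i++) {
		int node = (args.start_node + i) % nr_nodes;

		if (args.nodemask & ((sketch_nodemask_t)1 << node))
			return node;
	}
	return -1;
}
```

With a bind mask of nodes {1,2} and a local node of 2, the sketch starts at
node 2 rather than always hitting node 1 first, which is the behavior
Christoph was asking about.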

Thanks,
Lee

