On Thu, 26 Apr 2007, Andrew Morton wrote: > It's not exactly hard to lock four pages which are contiguous in pagecache, > contiguous in physical memory and are contiguous in the radix-tree.
If you can find them.... > > The patch is not about forcing to use large pages but about the option to > > use larger pages. Its a new flexibility. > > That's just spin. No its a fact. The patchset really allows one to switch large page support on and off. It opens up new options.. > > They are really larger. One page struct controls it all. > > No it doesn't and please stop spinning. x86 ptes map 4k pages and the core > MM needs changes to continue to work with this hack in place. The page cache is different from pte mapping. One page struct controls them all. Look at the patches. There is no state information in the tail pages apart from pointing to the head page. > If x86 had larger pagesize we wouldn't be seeing any of this. It is a > workaround > for present-generation hardware. Pagecache != mmap. > > The patchset would reduce complexity and making it easy to handle the page > > cache. Gets rid of the hacks to support larger ones right now. Its > > straightforward, no new locking, very much a cleanup patch. > > Were any cleanups made which were not also applicable as standalone things > to mainline? The page cache functions require a mapping parameter. This is available in most place and a natural thing given that allocation etc is also bound to mapping information. > > No it becomes easier. Look at the patchset. It cleans up a huge mess. > > I see no cleanups which are not also applicable to mainline. Not sure what you mean by that. > > What is hacky about it? > > It pretends that pages are large than they actually are, forcing the > pte-management code to also play along with the pretence. > > Pages *aren't* 16k. They're 4k. No they are 16k if the filesystem wants them to be 16k. The filesystem does not need to have the data mapped into an address space. And there is no problem with mapping 4k sections if we want to using the ptes. > Please address my point: if in five years time x86 has larger or varible > pagesize, this code will be a permanent millstone around our necks which we > *should not have merged*. No this code will enable us to switch to this new page size in a very fast way. Because the pagecache already supports it it is easier to add the mmap support for other page sizes. > And if in five years time x86 does not have larger pagesize support then > the manufacturers would have decided that 4k pages are not a performance > problem, so we again should not have merged this code. The manufacturers on x86 are already supporting 2M page sizes and cannot support intermediate sizes since they are married to the page table format for performance reasons. The patch could f.e. lead to straightforward support for 2M page mappings if we wanted it. > > It is the most consistent solution that avoid the proliferation of further > > hacks to address the large blocksize. > > You cannot say this. I'm sitting here *watching* you refuse to seriously > consider alternatives. And I am sitting here in disbelief about the series of weird alternatives running over my screen just to avoid the obvious solution. Then there is this weird idea that this would hinder us from supporting additional page sizes for mmap while the patch does exactly lead to enable support such features in the future. > > And you've conspicuously failed to address my point regarding the > *permanent* additional maintenance cost. Where? The page cache handling in the various layers is significantly simplified which reduces maintenance cost. > Anyway. Let's await those performance numbers. If they're notably good, > and if we judge that this goodness will be realised on more than one > arguably-crippled present-day disk adapter then we can evaluate the > *various* options which we have for stuffing more data into that adapter. One? Spin.... The majority you mean? Dave, where are we with the performance tests? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/