ulavarty; Maxim Levitsky
Subject: Re: [00/17] Large Blocksize Support V3
David Chinner <[EMAIL PROTECTED]> writes:
> Both. Too many things can happen asynchronously to a page that it
> makes it just about impossible to predict all the potential race
> conditions that are i
On Mon, May 07, 2007 at 12:06:38AM -0700, William Lee Irwin III wrote:
> +int alloc_page_array(struct pagearray *, const int, const size_t);
> +void free_page_array(struct pagearray *);
> +void zero_page_array(struct pagearray *);
> +struct page *nopage_page_array(const struct vm_area_struct *, uns
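The hunk above only declares the interface; as a hedged sketch of what the allocation side could look like, assuming struct pagearray simply wraps an array of order-0 pages (the layout and error handling below are assumptions for illustration, not wli's actual patch):

struct pagearray {
	struct page **pages;	/* individually allocated order-0 pages */
	size_t nr_pages;	/* number of entries in pages[]         */
};

/* Sketch only: back one large block with nr discontiguous pages. */
static int alloc_page_array(struct pagearray *pa, const int gfp,
			    const size_t nr)
{
	size_t i;

	pa->pages = kcalloc(nr, sizeof(struct page *), GFP_KERNEL);
	if (!pa->pages)
		return -ENOMEM;
	for (i = 0; i < nr; i++) {
		pa->pages[i] = alloc_page(gfp);
		if (!pa->pages[i])
			goto undo;
	}
	pa->nr_pages = nr;
	return 0;
undo:
	while (i--)
		__free_page(pa->pages[i]);
	kfree(pa->pages);
	pa->pages = NULL;
	return -ENOMEM;
}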
On Mon, 7 May 2007, Eric W. Biederman wrote:
>> Yes, instead of having to redesign the interface between the
>> fs and the page cache for those filesystems that handle large
>> blocks we instead need to redesign significant parts of the VM interface.
>> Shift the redesign work to another group of p
On Mon, 7 May 2007, Eric W. Biederman wrote:
> Yes, instead of having to redesign the interface between the
> fs and the page cache for those filesystems that handle large
> blocks we instead need to redesign significant parts of the VM interface.
> Shift the redesign work to another group of peop
David Chinner <[EMAIL PROTECTED]> writes:
>>> Right - so how do we efficiently manipulate data inside a large
>>> block that spans multiple discontiguous pages if we don't vmap
>>> it?
On Mon, May 07, 2007 at 12:43:19AM -0600, Eric W. Biederman wrote:
>> You don't manipulate data except for copy_f
David Chinner <[EMAIL PROTECTED]> writes:
> Both. Too many things can happen asynchronously to a page that it
> makes it just about impossible to predict all the potential race
> conditions that are involved. complexity arose from trying to fix
> the races that were uncovered without breaking ever
David Chinner <[EMAIL PROTECTED]> writes:
>> Right - so how do we efficiently manipulate data inside a large
>> block that spans multiple discontiguous pages if we don't vmap
>> it?
On Mon, May 07, 2007 at 12:43:19AM -0600, Eric W. Biederman wrote:
> You don't manipulate data except for copy_from_
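For reference, the vmap() approach being debated looks roughly like this (a minimal sketch only, with the surrounding locking and error handling omitted): an array of discontiguous pagecache pages backing one filesystem block is mapped into a virtually contiguous kernel range so it can be addressed as a single buffer.

#include <linux/vmalloc.h>

/* Sketch: give one multi-page block a contiguous kernel-virtual view. */
static void *block_vmap(struct page **pages, unsigned int nr_pages)
{
	return vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
}

static void block_vunmap(void *addr)
{
	vunmap(addr);
}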
David Chinner <[EMAIL PROTECTED]> writes:
> On Sun, May 06, 2007 at 10:48:23PM -0600, Eric W. Biederman wrote:
>> David Chinner <[EMAIL PROTECTED]> writes:
>>
>> > On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
>> >> >
>> >> > So while the jury is out about how many other file
On Sun, May 06, 2007 at 10:48:23PM -0600, Eric W. Biederman wrote:
> David Chinner <[EMAIL PROTECTED]> writes:
>
> > On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
> >> >
> >> > So while the jury is out about how many other filesystems might use
> >> > it, I suspect it's more t
On Fri, May 04, 2007 at 07:31:37AM -0600, Eric W. Biederman wrote:
> David Chinner <[EMAIL PROTECTED]> writes:
>
> > On Fri, Apr 27, 2007 at 12:04:03AM -0700, Andrew Morton wrote:
> > I've got several year-old Irix bugs assigned that are hit every so
> > often where one page in the aggregated set
David Chinner <[EMAIL PROTECTED]> writes:
> On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
>> >
>> > So while the jury is out about how many other filesystems might use
>> > it, I suspect it's more than you might think. At the very least,
>> > there may be some IA64 users who
On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
> >
> > So while the jury is out about how many other filesystems might use
> > it, I suspect it's more than you might think. At the very least,
> > there may be some IA64 users who might be trying to transition their
> > way to x8
On Fri, 4 May 2007, Eric W. Biederman wrote:
> Given that small block sizes give us better storage efficiency,
> which means less disk bandwidth used, which means less time
> to get the data off of a slow disk (especially if you can
> put multiple files you want simultaneously in that same space).
Theodore Tso <[EMAIL PROTECTED]> writes:
> On Fri, Apr 27, 2007 at 01:48:49AM -0700, Andrew Morton wrote:
>> And other filesystems (ie: ext4) _might_ use it. But ext4 is extent-based,
>> so perhaps it's not worth churning the on-disk format to get a bit of a
>> boost in the block allocator.
>
> We
David Chinner <[EMAIL PROTECTED]> writes:
> On Fri, Apr 27, 2007 at 12:04:03AM -0700, Andrew Morton wrote:
>
> I've looked at all this but I'm trying to work out if anyone
> else has looked at the impact of doing this. I have direct experience
> with this form of block aggregation - this is pretty
Andrew Morton <[EMAIL PROTECTED]> writes:
> On Fri, 27 Apr 2007 18:03:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
>> > > > > You basically have to
>> > > > > jump through nasty, nasty hoops, to handle corner cases that are introduced
>> > > > > because the generic code can no longer reli
On Fri, 27 Apr 2007, Andrew Morton wrote:
> By misunderstanding any suggestions, misrepresenting them, making incorrect
> statements about them, by not suggesting any alternatives yourself, all of
> it buttressed by a stolid refusal to recognise that this patch has any
> costs.
That was even ment
On Sat, 28 Apr 2007, Maxim Levitsky wrote:
> 1) Is it possible for a block device to assume that it will always get big
> requests (and aligned by big blocksize) ?
That is one of the key problems. We hope that Mel Gorman's antifrag work
will get us there.
> 2) Does metadata reading/writing occur
On Thu, Apr 26, 2007 at 02:28:46PM +0100, Alan Cox wrote:
> > > Oh we have scores of these hacks around. Look at the dvd/cd layer. The
> > > point is to get rid of those.
> >
> > Perhaps this is just a matter of cleaning them up so they are no
> > longer hacks?
>
> CD and DVD media support vario
On Sat, 2007-04-28 at 01:55 -0700, Andrew Morton wrote:
> On Sat, 28 Apr 2007 10:32:56 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > On Sat, 2007-04-28 at 01:22 -0700, Andrew Morton wrote:
> > > On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]>
> > > wrote:
> > >
> >
On Sat, 28 Apr 2007 12:19:56 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
> I'm skeptical, however, that the contiguity gains will compensate for
> the CPU required to do such with the pcp lists.
It wouldn't surprise me if approximate contiguity is a pretty common case
in the pcp lists
On Sat, 28 Apr 2007 07:09:07 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
>> The gang allocation affair may also want to make the calls into
>> the page allocator batched. For instance, grab enough compound pages to
>> build the gang under the lock, since we're going to blow the pe
On Sat, 28 Apr 2007 07:09:07 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
> On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >> only 4.4 times faster, and more scalable, since we don't bounce the
> >> upper level locks around.
>
> On Sat, Apr 28, 2007 at 0
On Wednesday 25 April 2007 01:21, [EMAIL PROTECTED] wrote:
> Rationales:
>
> 1. We have problems supporting devices with a higher blocksize than
>page size. This is for example important to support CD and DVDs that
>can only read and write 32k or 64k blocks. We currently have a shim
>l
Pierre Ossman <[EMAIL PROTECTED]> writes:
> Eric W. Biederman wrote:
>>
>> I have a hard time believing that device hardware limits don't allow them
>> to have enough space to handle larger requests. If so it was a poor
>> design by the hardware manufacturers.
>>
>
> In the MMC layer, the block s
On Sat, Apr 28, 2007 at 12:29:08PM +0100, Alan Cox wrote:
> Not necessarily. If you use 16K contiguous pages you have to do
> more work to get memory contiguously and you have less cache efficiency
> both of which will do serious damage to performance with poor I/O
> subsystems for all the extra p
On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>> only 4.4 times faster, and more scalable, since we don't bounce the
>> upper level locks around.
On Sat, Apr 28, 2007 at 01:22:51AM -0700, Andrew Morton wrote:
> I'm not sure what we're looking at here. radix-tree cha
> But all (both) the proposals we're (ahem) discussing do involve 4x
> physically contiguous pages going into those four contiguous pagecache
> slots.
>
> So we're improving things for the half-assed controllers, aren't we?
Not necessarily. If you use 16K contiguous pages you have to do
more wor
Eric W. Biederman wrote:
>
> I have a hard time believing that device hardware limits don't allow them
> to have enough space to handle larger requests. If so it was a poor
> design by the hardware manufacturers.
>
In the MMC layer, the block size is a major bottleneck. None of the currently
sup
On Sat, 28 Apr 2007 11:21:17 +0100 Alan Cox <[EMAIL PROTECTED]> wrote:
> > > Also remember that even if you do larger pages by using virtual pairs or
> > > quads of real pages because it helps on some systems you end up needing
> > > the same sized sglist as before so you don't make anything worse
> > Also remember that even if you do larger pages by using virtual pairs or
> > quads of real pages because it helps on some systems you end up needing
> > the same sized sglist as before so you don't make anything worse for
> > half-assed controllers as you get the same I/O size providing they ha
On Sat, 28 Apr 2007 10:43:28 +0100 Alan Cox <[EMAIL PROTECTED]> wrote:
> On Fri, 27 Apr 2007 21:56:34 -0700
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > On Sat, 28 Apr 2007 13:17:40 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> >
> > > > Fix up your lameo HBA for reads.
> > >
> > > Where
On Fri, 27 Apr 2007 21:56:34 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sat, 28 Apr 2007 13:17:40 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > > Fix up your lameo HBA for reads.
> >
> > Where did that come from? You spend 20 lines describing the inefficiencies
> > of the readahea
On Sat, 28 Apr 2007 10:32:56 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> On Sat, 2007-04-28 at 01:22 -0700, Andrew Morton wrote:
> > On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >
> > > >
> > > > The other thing is that we can batch up pagecache page inser
On Sat, 2007-04-28 at 01:22 -0700, Andrew Morton wrote:
> On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > >
> > > The other thing is that we can batch up pagecache page insertions for bulk
> > > writes as well (that is, write(2) with buffer size > page size). I
On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >
> > The other thing is that we can batch up pagecache page insertions for bulk
> > writes as well (that is, write(2) with buffer size > page size). I should
> > have a patch somewhere for that as well if anyone inter
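A hedged sketch of the kind of batching being discussed here (Andrew's actual patch is not shown in the thread): insert a run of pre-allocated, locked pages into the mapping's radix tree under one tree_lock acquisition instead of one lock round-trip per page. Simplified for illustration: radix_tree_preload() only guarantees nodes for a single insertion, -EEXIST handling is omitted, and tree_lock was an rwlock in kernels of this era.

static int add_pages_batched(struct address_space *mapping,
			     struct page **pages, pgoff_t start, int nr)
{
	int i, err;

	err = radix_tree_preload(GFP_KERNEL);
	if (err)
		return err;
	write_lock_irq(&mapping->tree_lock);
	for (i = 0; i < nr; i++) {
		err = radix_tree_insert(&mapping->page_tree, start + i,
					pages[i]);
		if (err)
			break;
		page_cache_get(pages[i]);	/* pagecache reference */
		pages[i]->mapping = mapping;
		pages[i]->index = start + i;
		mapping->nrpages++;
	}
	write_unlock_irq(&mapping->tree_lock);
	radix_tree_preload_end();
	return err;
}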
On Sat, Apr 28, 2007 at 12:27:45PM +1000, Nick Piggin wrote:
> And that wasn't due to the 128 sg limit?
No, that was due to aacraid really liking sg lists as small as possible
where every entry covers areas as big as possible. The driver really
liked physical merging once wli changed the page all
On Sat, 2007-04-28 at 11:43 +1000, Nick Piggin wrote:
> Andrew Morton wrote:
> > For example, see __do_page_cache_readahead(). It does a read_lock() and a
> > page allocation and a radix-tree lookup for each page. We can vastly
> > improve that.
> >
> > Step 1:
> >
> > - do a read-lock
> >
>
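As an illustration of the batching Andrew outlines (not his actual patch, and with reference counting omitted for brevity): instead of one read_lock() and one radix-tree lookup per page, the whole readahead window can be gang-looked-up under a single hold of mapping->tree_lock, with pages allocated only for the holes.

/* Sketch: find which pages in [start, start + nr) are already cached. */
static unsigned int lookup_window(struct address_space *mapping,
				  pgoff_t start, unsigned int nr,
				  struct page **pages)
{
	unsigned int found;

	read_lock_irq(&mapping->tree_lock);
	/* caller must check page->index: gang lookup can return pages
	 * beyond the requested window */
	found = radix_tree_gang_lookup(&mapping->page_tree,
				       (void **)pages, start, nr);
	read_unlock_irq(&mapping->tree_lock);
	return found;
}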
On Fri, 27 Apr 2007 23:24:05 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> > Fact is, this change has *costs*. And you're completely ignoring them,
> > trying to spin them away. It ain't working and it never will. I'm seeing
> > no serious attempt to think about how we can reduce
On Fri, 27 Apr 2007, Andrew Morton wrote:
> Your patch *is* a workaround. It's a workaround for small CPU pagesize.
> It's a workaround for suboptimal VFS and filesystem implementations. It's
> a workaround for a disk adapter which has suboptimal readahead and
> writeback caching implementation
On Fri, 27 Apr 2007 22:08:17 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> On Fri, 27 Apr 2007, Andrew Morton wrote:
>
> > My (repeated) point is that if we populate pagecache with
> > physically-contiguous 4k
> > pages in this manner then bio+block will be able to create much larg
On Fri, 27 Apr 2007, Andrew Morton wrote:
> My (repeated) point is that if we populate pagecache with
> physically-contiguous 4k
> pages in this manner then bio+block will be able to create much larger SG
> lists.
True but the "if" becomes exceedingly rare the longer the system was in
operatio
On Sat, 28 Apr 2007 13:17:40 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > Fix up your lameo HBA for reads.
>
> Where did that come from? You spend 20 lines describing the inefficiencies
> of the readahead in the page cache and it should be fixed but then you
> turn around and say fix the HBA
On Sat, 28 Apr 2007, David Chinner wrote:
> > 1-disk and 2-disk read throughput fell by an improbable amount, which makes
> > me cautious about the other numbers.
>
> For read, yes, and it's because something is going wrong with the
> I/O size - it looks like readahead thrashing of some kind even
On Fri, Apr 27, 2007 at 12:11:08PM -0700, Andrew Morton wrote:
> On Sat, 28 Apr 2007 03:34:32 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > Some more information - stripe unit on the dm raid0 is 512k.
> > I have not attempted to increase I/O sizes at all yet - these tests are
> > just demons
William Lee Irwin III wrote:
>> What sort of strategy do you intend to use to speculatively populate
>> the pagecache with contiguous pages?
On Sat, Apr 28, 2007 at 12:50:26PM +1000, Nick Piggin wrote:
> Andrew outlined it.
I'd like to suggest a few straightforward additions to the proposal:
(1)
William Lee Irwin III wrote:
On Sat, Apr 28, 2007 at 12:27:45PM +1000, Nick Piggin wrote:
I guess 10% isn't a small amount. Though it would be nice to have
before/after numbers for Linux. And, like Andrew was saying, we could
just _attempt_ to put contiguous pages in pagecache rather than
_requ
On Sat, Apr 28, 2007 at 12:27:45PM +1000, Nick Piggin wrote:
> I guess 10% isn't a small amount. Though it would be nice to have
> before/after numbers for Linux. And, like Andrew was saying, we could
> just _attempt_ to put contiguous pages in pagecache rather than
> _require_ it. Which is still r
Christoph Hellwig wrote:
On Fri, Apr 27, 2007 at 10:25:44PM +1000, Nick Piggin wrote:
Linus's favourite jokes about powerpc mmu being crippled forever, aside ;)
Different mmu. The desktop 32bit mmu Linus referred to has almost nothing
in common with the mmu on 64bit systems.
Well I wasn't
On Thu, Apr 26, 2007 at 11:55:42PM -0700, Andrew Morton wrote:
>>> Please address my point: if in five years time x86 has larger or variable
>>> pagesize, this code will be a permanent millstone around our necks which we
>>> *should not have merged*.
>>> And if in five years time x86 does not have l
Andrew Morton wrote:
On Sat, 28 Apr 2007 03:34:32 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
Some more information - stripe unit on the dm raid0 is 512k.
I have not attempted to increase I/O sizes at all yet - these tests are
just demonstrating efficiency improvements in the filesystem.
Th
On Fri, 27 Apr 2007 06:44:51 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
> On Thu, Apr 26, 2007 at 11:55:42PM -0700, Andrew Morton wrote:
> > Please address my point: if in five years time x86 has larger or variable
> > pagesize, this code will be a permanent millstone around our necks
On Sat, 28 Apr 2007 03:34:32 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> Some more information - stripe unit on the dm raid0 is 512k.
> I have not attempted to increase I/O sizes at all yet - these tests are
> just demonstrating efficiency improvements in the filesystem.
>
> These numbers for
On Fri, 2007-04-27 at 12:55 -0400, Theodore Tso wrote:
>> Unfortunately, this isn't a problem with hardware getting better, but
>> a willingness to break backwards compatibility.
>> x86_64 uses a 4k page size to avoid breaking 32-bit applications. And
>> unfortunately, iirc, even 64-bit applicatio
On Sat, Apr 28, 2007 at 02:36:20AM +1000, David Chinner wrote:
> The test was writing a single 50GB file to a fresh filesystem, and
> then reading it back. Run on two different dm stripes - a 4-disk
> RAID0 and an 8-disk RAID0 stripe, with a stripe unit of 512k. Disks
> are 10krpm SAS, external jbod
On Fri, 2007-04-27 at 12:55 -0400, Theodore Tso wrote:
> On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > And hardware gets better. If Intel & AMD come out with a 16k pagesize
> > option in a couple of years we'll look pretty dumb. If the problems which
> > you're presently havi
On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> And hardware gets better. If Intel & AMD come out with a 16k pagesize
> option in a couple of years we'll look pretty dumb. If the problems which
> you're presently having with that controller get sorted out in the next
> generation
On Fri, 27 Apr 2007, Nick Piggin wrote:
> Linus's favourite jokes about powerpc mmu being crippled forever, aside ;)
>
> This seems like just speculation. I would not be against something which,
> without, would "cripple" some relevant hardware, but you are just handwaving
> at this point. And yo
On Fri, Apr 27, 2007 at 01:48:49AM -0700, Andrew Morton wrote:
> And other filesystems (ie: ext4) _might_ use it. But ext4 is extent-based,
> so perhaps it's not worth churning the on-disk format to get a bit of a
> boost in the block allocator.
Well, ext3 could definitely use it; there are people
On Fri, Apr 27, 2007 at 12:26:40AM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 00:19:49 -0700 (PDT) Christoph Lameter <[EMAIL
> PROTECTED]> wrote:
>
> > The page cache handling in the various layers is significantly
> > simplified which reduces maintenance cost.
>
> How on earth can the *
On Fri, Apr 27, 2007 at 08:22:07PM +1000, Nick Piggin wrote:
>>> Just a random aside question... doesn't Oracle db do direct IO from
>>> hugepages?
William Lee Irwin III wrote:
>> If and when configured to use direct IO and hugepages, yes.
On Fri, Apr 27, 2007 at 11:06:36PM +1000, Nick Piggin wro
Nick Piggin <[EMAIL PROTECTED]> writes:
> Eric W. Biederman wrote:
>> Jens Axboe <[EMAIL PROTECTED]> writes:
>
>>>Yes, that is exactly the problem. Once you have that, pktcdvd is pretty
>>>much reduced to setup and init code, the actual data handling can be
>>>done by sr or ide-cd directly. You co
On Thu, Apr 26, 2007 at 11:55:42PM -0700, Andrew Morton wrote:
> Please address my point: if in five years time x86 has larger or variable
> pagesize, this code will be a permanent millstone around our necks which we
> *should not have merged*.
> And if in five years time x86 does not have larger pa
On Fri, Apr 27, 2007 at 10:14:20PM +1000, Paul Mackerras wrote:
> It's not as simple on 64-bit powerpc with the hash table of course,
> because the page size is chosen at the segment (256MB) level,
> restricting where we can put 64k and 16M pages to some degree.
I think Christoph's variable order
On Fri, Apr 27, 2007 at 10:25:44PM +1000, Nick Piggin wrote:
> Linus's favourite jokes about powerpc mmu being crippled forever, aside ;)
Different mmu. The desktop 32bit mmu Linus referred to has almost nothing
in common with the mmu on 64bit systems.
> >Right this could help but it is not addre
On Fri, Apr 27, 2007 at 05:12:05AM -0700, Christoph Lameter wrote:
> Powerpc supports multiple pagesizes. Maybe we could make mmap use those
> page sizes some day if we had a variable order page cache. Your stance on
> the issue means that powerpc will be forever crippled and not be able to
> us
William Lee Irwin III wrote:
William Lee Irwin III wrote:
I readily concede that seeks are most costly. Yet memory contiguity
remains rather influential.
Witness the fact that I'm now being called upon a second time to
adjust the order in which mm/page_alloc.c returns pages for the
sake of impl
On (27/04/07 20:05), Nick Piggin didst pronounce:
> Christoph Hellwig wrote:
> >On Thu, Apr 26, 2007 at 05:48:12PM +1000, Nick Piggin wrote:
> >
> >>>Well maybe you could explain what you want. Preferably without
> >>>redefining the established terms?
> >>
> >>Support for larger buffers than page
William Lee Irwin III wrote:
>> I readily concede that seeks are most costly. Yet memory contiguity
>> remains rather influential.
>> Witness the fact that I'm now being called upon a second time to
>> adjust the order in which mm/page_alloc.c returns pages for the
>> sake of implicitly establishin
Paul Mackerras wrote:
Nick Piggin writes:
For the TLB issue, higher order pagecache doesn't help. If distros
Oh? Assuming your hardware is capable of supporting a variety of page
sizes, and of putting a page at any address that is a multiple of its
size, it should help, potentially a great
Christoph Lameter wrote:
On Fri, 27 Apr 2007, Nick Piggin wrote:
For the TLB issue, higher order pagecache doesn't help. If distros
ship with a 4K page size on powerpc, and use some larger pages in
the pagecache, some people are still going to get angry because
they wanted to use 64K pages...
Nick Piggin writes:
> For the TLB issue, higher order pagecache doesn't help. If distros
Oh? Assuming your hardware is capable of supporting a variety of page
sizes, and of putting a page at any address that is a multiple of its
size, it should help, potentially a great deal, as far as I can see
On Fri, 27 Apr 2007, Nick Piggin wrote:
> For the TLB issue, higher order pagecache doesn't help. If distros
> ship with a 4K page size on powerpc, and use some larger pages in
> the pagecache, some people are still going to get angry because
> they wanted to use 64K pages... But I agree 64K pages
On Fri, 27 Apr 2007, Andrew Morton wrote:
> On Fri, 27 Apr 2007 00:19:49 -0700 (PDT) Christoph Lameter <[EMAIL
> PROTECTED]> wrote:
>
> > The page cache handling in the various layers is significantly
> > simplified which reduces maintenance cost.
>
> How on earth can the *addition* of variabl
On Fri, 27 Apr 2007, Paul Mackerras wrote:
> Option (b) would be a bit of an ugly hack.
>
> Which leaves option (c) - unless you have a further option. So I have
> to say I support Christoph on this, at least as far as the general
> principle is concerned.
We could approximate option (b) by set
Paul Mackerras wrote:
Andrew Morton writes:
If x86 had larger pagesize we wouldn't be seeing any of this. It is a
workaround for present-generation hardware.
Unfortunately, it's not really practical to increase the page size
very much on most systems, because you end up wasting a lot of s
Andrew Morton writes:
> If x86 had larger pagesize we wouldn't be seeing any of this. It is a
> workaround for present-generation hardware.
Unfortunately, it's not really practical to increase the page size
very much on most systems, because you end up wasting a lot of space
in the page cache
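To put a rough number on that waste (a hypothetical illustration, not figures from Paul's mail): a cached file wastes on average half a page at its tail, so a 6k file costs two 4k pages (8k, roughly 25% wasted) but a whole 64k page (roughly 90% wasted) if the base page size is 64k, which is why simply raising PAGE_SIZE hurts workloads dominated by small files.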
Christoph Hellwig wrote:
On Thu, Apr 26, 2007 at 04:50:06PM +1000, Nick Piggin wrote:
Improving the buffer layer would be a good way. Of course, that is
a long and difficult task, so nobody wants to do it.
It's also a stupid idea. We got rid of the buffer layer because it's
a complete pain
Eric W. Biederman wrote:
Jens Axboe <[EMAIL PROTECTED]> writes:
Yes, that is exactly the problem. Once you have that, pktcdvd is pretty
much reduced to setup and init code, the actual data handling can be
done by sr or ide-cd directly. You could merge it into cdrom.c, it would
not be very diff
William Lee Irwin III wrote:
William Lee Irwin III <[EMAIL PROTECTED]> writes:
In memory as on disk, contiguity matters a lot for performance.
On Thu, Apr 26, 2007 at 12:21:24PM -0600, Eric W. Biederman wrote:
Not nearly so much though. In memory you don't have seeks to avoid.
On disks av
Christoph Lameter wrote:
On Thu, 26 Apr 2007, Nick Piggin wrote:
But what do you mean with it? A block is no longer a contiguous section of
memory. So you have redefined the term.
I don't understand what you mean at all. A block has always been a
contiguous area of disk.
You want to chang
Christoph Lameter wrote:
On Thu, 26 Apr 2007, Nick Piggin wrote:
Christoph Lameter wrote:
On Thu, 26 Apr 2007, Nick Piggin wrote:
But I maintain that the end result is better than the fragmentation
based approach. A lot of people don't actually want a bigger page
cache size, because they
Christoph Hellwig wrote:
On Thu, Apr 26, 2007 at 05:48:12PM +1000, Nick Piggin wrote:
Well maybe you could explain what you want. Preferably without redefining
the established terms?
Support for larger buffers than page cache pages.
I don't think you really want this :) The whole non-page
William Lee Irwin III wrote:
On Fri, Apr 27, 2007 at 01:38:30AM +1000, Nick Piggin wrote:
Or good grounds to increase the sg limit and push for io controller
manufacturers to do the same. If we have a hack in the kernel that
mostly works, they won't.
On Fri, Apr 27, 2007 at 01:38:30AM +1000,
On Fri, 27 Apr 2007 18:03:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > > > > You basically have to
> > > > > jump through nasty, nasty hoops, to handle corner cases that are
> > > > > introduced
> > > > > because the generic code can no longer reliably lock out access to a
> > > > > file
On Fri, Apr 27, 2007 at 12:04:03AM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 16:09:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > > On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]>
> > > wrote:
On Fri, 27 Apr 2007 00:35:19 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> On Fri, 27 Apr 2007, Andrew Morton wrote:
>
> > On Fri, 27 Apr 2007 00:22:26 -0700 (PDT) Christoph Lameter <[EMAIL
> > PROTECTED]> wrote:
> >
> > > I will submit pieces to mm depending on the
> > > outcome
On Fri, 27 Apr 2007, Andrew Morton wrote:
> On Fri, 27 Apr 2007 00:22:26 -0700 (PDT) Christoph Lameter <[EMAIL
> PROTECTED]> wrote:
>
> > I will submit pieces to mm depending on the
> > outcome of our discussions.
> There's a ludicrous amount of MM work pending in -mm. It would probably be
>
On Fri, 27 Apr 2007 00:22:26 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> I will submit pieces to mm depending on the
> outcome of our discussions.
Thanks.
There's a ludicrous amount of MM work pending in -mm. It would probably be
less work at your end to see what ends up landin
On Fri, 27 Apr 2007 00:19:49 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> The page cache handling in the various layers is significantly
> simplified which reduces maintenance cost.
How on earth can the *addition* of variable pagecache size simplify the
existing code?
What cleanu
On Thu, 26 Apr 2007, Andrew Morton wrote:
> Were any cleanups made which were not also applicable as standalone things
> to mainline?
Ahh. I think I know what you mean. The current patchset is for performance
testing against mainline. Let's first cover the bases and then see where
we go. It is n
On Thu, 26 Apr 2007, Andrew Morton wrote:
> It's not exactly hard to lock four pages which are contiguous in pagecache,
> contiguous in physical memory and are contiguous in the radix-tree.
If you can find them
> > The patch is not about forcing to use large pages but about the option to
On Fri, 27 Apr 2007 16:09:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> >
> > > >blocksizes via this scheme - instantiate and lock four pages
On Thu, 26 Apr 2007 22:49:53 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> On Thu, 26 Apr 2007, Andrew Morton wrote:
>
> > > Or make sure that truncate
> > > doesn't race on a partial *block* truncate?
> >
> > lock four pages
>
> You would only lock a single higher order block. Tr
On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > >blocksizes via this scheme - instantiate and lock four pages and go for
> > >it.
> >
> > So now how do you get block aligned writeback?
>
>
On Thu, 26 Apr 2007, Andrew Morton wrote:
> > Or make sure that truncate
> > doesn't race on a partial *block* truncate?
>
> lock four pages
You would only lock a single higher order block. Truncate works on that
level.
If you have 4 separate pages then you need to take separate locks and you
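The "four separate locks" case reduces to the usual ordering rule; a minimal sketch (not code from the thread) of locking the order-0 pages that make up one larger block, in ascending index order so that two tasks working on the same block cannot deadlock:

static void lock_block_pages(struct page **pages, int nr)
{
	int i;

	/* pages[] is assumed sorted by ->index */
	for (i = 0; i < nr; i++)
		lock_page(pages[i]);
}

static void unlock_block_pages(struct page **pages, int nr)
{
	int i;

	for (i = nr - 1; i >= 0; i--)
		unlock_page(pages[i]);
}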
On Thu, Apr 26 2007, Mel Gorman wrote:
> On (26/04/07 20:39), Jens Axboe didst pronounce:
> > On Thu, Apr 26 2007, Christoph Lameter wrote:
> > > On Thu, 26 Apr 2007, Jens Axboe wrote:
> > >
> > > > On Thu, Apr 26 2007, Christoph Lameter wrote:
> > > > > On Thu, 26 Apr 2007, Jens Axboe wrote:
> >
On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> >blocksizes via this scheme - instantiate and lock four pages and go for
> >it.
>
> So now how do you get block aligned writeback?
in writeback and pageout:
if (page->index & mapping->block_size_mask)
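Expanding that fragment into a slightly fuller sketch (block_size_mask is the hypothetical address_space field from the pseudocode above, not an existing one): if pageout lands on a page in the middle of a block, writeback is realigned to the block's first page so the whole block goes out together.

/* Sketch: round a page index down to the start of its block. */
static pgoff_t block_start_index(pgoff_t index, pgoff_t block_size_mask)
{
	if (index & block_size_mask)
		index &= ~block_size_mask;
	return index;
}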
On Thu, Apr 26, 2007 at 07:53:57PM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 12:27:31 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > On Thu, Apr 26, 2007 at 07:04:38PM -0700, Andrew Morton wrote:
> > > On Tue, 24 Apr 2007 15:21:05 -0700 [EMAIL PROTECTED] wrote:
> > > Also, afaict your i