ulavarty; Maxim Levitsky
Subject: Re: [00/17] Large Blocksize Support V3
David Chinner <[EMAIL PROTECTED]> writes:
> Both. Too many things can happen asynchronously to a page that it
> makes it just about impossible to predict all the potential race
> conditions that are i
On Mon, May 07, 2007 at 12:06:38AM -0700, William Lee Irwin III wrote:
> +int alloc_page_array(struct pagearray *, const int, const size_t);
> +void free_page_array(struct pagearray *);
> +void zero_page_array(struct pagearray *);
> +struct page *nopage_page_array(const struct vm_area_struct *, uns
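The hunk above only declares the interface; as a hedged sketch of what the allocation side could look like, assuming struct pagearray simply wraps an array of order-0 pages (the layout and error handling below are assumptions for illustration, not wli's actual patch):

struct pagearray {
	struct page **pages;	/* individually allocated order-0 pages */
	size_t nr_pages;	/* number of entries in pages[]         */
};

/* Sketch only: back one large block with nr discontiguous pages. */
static int alloc_page_array(struct pagearray *pa, const int gfp,
			    const size_t nr)
{
	size_t i;

	pa->pages = kcalloc(nr, sizeof(struct page *), GFP_KERNEL);
	if (!pa->pages)
		return -ENOMEM;
	for (i = 0; i < nr; i++) {
		pa->pages[i] = alloc_page(gfp);
		if (!pa->pages[i])
			goto undo;
	}
	pa->nr_pages = nr;
	return 0;
undo:
	while (i--)
		__free_page(pa->pages[i]);
	kfree(pa->pages);
	pa->pages = NULL;
	return -ENOMEM;
}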
On Mon, 7 May 2007, Eric W. Biederman wrote:
>> Yes, instead of having to redesign the interface between the
>> fs and the page cache for those filesystems that handle large
>> blocks we instead need to redesign significant parts of the VM interface.
>> Shift the redesign work to another group of p
On Mon, 7 May 2007, Eric W. Biederman wrote:
> Yes, instead of having to redesign the interface between the
> fs and the page cache for those filesystems that handle large
> blocks we instead need to redesign significant parts of the VM interface.
> Shift the redesign work to another group of peop
David Chinner <[EMAIL PROTECTED]> writes:
>>> Right - so how do we efficiently manipulate data inside a large
>>> block that spans multiple discontiguous pages if we don't vmap
>>> it?
On Mon, May 07, 2007 at 12:43:19AM -0600, Eric W. Biederman wrote:
>> You don't manipulate data except for copy_f
David Chinner <[EMAIL PROTECTED]> writes:
> Both. Too many things can happen asynchronously to a page that it
> makes it just about impossible to predict all the potential race
> conditions that are involved. complexity arose from trying to fix
> the races that were uncovered without breaking ever
David Chinner <[EMAIL PROTECTED]> writes:
>> Right - so how do we efficiently manipulate data inside a large
>> block that spans multiple discontiguous pages if we don't vmap
>> it?
On Mon, May 07, 2007 at 12:43:19AM -0600, Eric W. Biederman wrote:
> You don't manipulate data except for copy_from_
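For reference, the vmap() approach being debated looks roughly like this (a minimal sketch only, with the surrounding locking and error handling omitted): an array of discontiguous pagecache pages backing one filesystem block is mapped into a virtually contiguous kernel range so it can be addressed as a single buffer.

#include <linux/vmalloc.h>

/* Sketch: give one multi-page block a contiguous kernel-virtual view. */
static void *block_vmap(struct page **pages, unsigned int nr_pages)
{
	return vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
}

static void block_vunmap(void *addr)
{
	vunmap(addr);
}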
David Chinner <[EMAIL PROTECTED]> writes:
> On Sun, May 06, 2007 at 10:48:23PM -0600, Eric W. Biederman wrote:
>> David Chinner <[EMAIL PROTECTED]> writes:
>>
>> > On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
>> >> >
>> >> > So while the jury is out about how many other file
On Sun, May 06, 2007 at 10:48:23PM -0600, Eric W. Biederman wrote:
> David Chinner <[EMAIL PROTECTED]> writes:
>
> > On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
> >> >
> >> > So while the jury is out about how many other filesystems might use
> >> > it, I suspect it's more t
On Fri, May 04, 2007 at 07:31:37AM -0600, Eric W. Biederman wrote:
> David Chinner <[EMAIL PROTECTED]> writes:
>
> > On Fri, Apr 27, 2007 at 12:04:03AM -0700, Andrew Morton wrote:
> > I've got several year-old Irix bugs assigned that are hit every so
> > often where one page in the aggregated set
David Chinner <[EMAIL PROTECTED]> writes:
> On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
>> >
>> > So while the jury is out about how many other filesystems might use
>> > it, I suspect it's more than you might think. At the very least,
>> > there may be some IA64 users who
On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
> >
> > So while the jury is out about how many other filesystems might use
> > it, I suspect it's more than you might think. At the very least,
> > there may be some IA64 users who might be trying to transition their
> > way to x8
On Fri, 4 May 2007, Eric W. Biederman wrote:
> Given that small block sizes give us better storage efficiency,
> which means less disk bandwidth used, which means less time
> to get the data off of a slow disk (especially if you can
> put multiple files you want simultaneously in that same space).
Theodore Tso <[EMAIL PROTECTED]> writes:
> On Fri, Apr 27, 2007 at 01:48:49AM -0700, Andrew Morton wrote:
>> And other filesystems (ie: ext4) _might_ use it. But ext4 is extent-based,
>> so perhaps it's not worth churning the on-disk format to get a bit of a
>> boost in the block allocator.
>
> We
David Chinner <[EMAIL PROTECTED]> writes:
> On Fri, Apr 27, 2007 at 12:04:03AM -0700, Andrew Morton wrote:
>
> I've looked at all this but I'm trying to work out if anyone
> else has looked at the impact of doing this. I have direct experience
> with this form of block aggregation - this is pretty
Andrew Morton <[EMAIL PROTECTED]> writes:
> On Fri, 27 Apr 2007 18:03:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
>> > > > > You basically have to
>> > > > > jump through nasty, nasty hoops, to handle corner cases that are introduced
>> > > > > because the generic code can no longer reli
On Fri, 27 Apr 2007, Andrew Morton wrote:
> By misunderstanding any suggestions, misrepresenting them, making incorrect
> statements about them, by not suggesting any alternatives yourself, all of
> it buttressed by a stolid refusal to recognise that this patch has any
> costs.
That was even ment
On Sat, 28 Apr 2007, Maxim Levitsky wrote:
> 1) Is it possible for a block device to assume that it will always get big
> requests (and aligned by big blocksize) ?
That is one of the key problems. We hope that Mel Gorman's antifrag work
will get us there.
> 2) Does metadata reading/writing occur
On Thu, Apr 26, 2007 at 02:28:46PM +0100, Alan Cox wrote:
> > > Oh we have scores of these hacks around. Look at the dvd/cd layer. The
> > > point is to get rid of those.
> >
> > Perhaps this is just a matter of cleaning them up so they are no
> > longer hacks?
>
> CD and DVD media support vario
On Sat, 2007-04-28 at 01:55 -0700, Andrew Morton wrote:
> On Sat, 28 Apr 2007 10:32:56 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > On Sat, 2007-04-28 at 01:22 -0700, Andrew Morton wrote:
> > > On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]>
> > > wrote:
> > >
> >
On Sat, 28 Apr 2007 12:19:56 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
> I'm skeptical, however, that the contiguity gains will compensate for
> the CPU required to do such with the pcp lists.
It wouldn't surprise me if approximate contiguity is a pretty common case
in the pcp lists
On Sat, 28 Apr 2007 07:09:07 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
>> The gang allocation affair may also want to make the calls into
>> the page allocator batched. For instance, grab enough compound pages to
>> build the gang under the lock, since we're going to blow the pe
On Sat, 28 Apr 2007 07:09:07 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
> On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >> only 4.4 times faster, and more scalable, since we don't bounce the
> >> upper level locks around.
>
> On Sat, Apr 28, 2007 at 0
On Wednesday 25 April 2007 01:21, [EMAIL PROTECTED] wrote:
> Rationales:
>
> 1. We have problems supporting devices with a higher blocksize than
>page size. This is for example important to support CD and DVDs that
>can only read and write 32k or 64k blocks. We currently have a shim
>l
Pierre Ossman <[EMAIL PROTECTED]> writes:
> Eric W. Biederman wrote:
>>
>> I have a hard time believing that device hardware limits don't allow them
>> to have enough space to handle larger requests. If so it was a poor
>> design by the hardware manufacturers.
>>
>
> In the MMC layer, the block s
On Sat, Apr 28, 2007 at 12:29:08PM +0100, Alan Cox wrote:
> Not necessarily. If you use 16K contiguous pages you have to do
> more work to get memory contiguously and you have less cache efficiency
> both of which will do serious damage to performance with poor I/O
> subsystems for all the extra p
On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>> only 4.4 times faster, and more scalable, since we don't bounce the
>> upper level locks around.
On Sat, Apr 28, 2007 at 01:22:51AM -0700, Andrew Morton wrote:
> I'm not sure what we're looking at here. radix-tree cha
> But all (both) the proposals we're (ahem) discussing do involve 4x
> physically contiguous pages going into those four contiguous pagecache
> slots.
>
> So we're improving things for the half-assed controllers, aren't we?
Not necessarily. If you use 16K contiguous pages you have to do
more wor
Eric W. Biederman wrote:
>
> I have a hard time believing that device hardware limits don't allow them
> to have enough space to handle larger requests. If so it was a poor
> design by the hardware manufacturers.
>
In the MMC layer, the block size is a major bottleneck. None of the currently
sup
On Sat, 28 Apr 2007 11:21:17 +0100 Alan Cox <[EMAIL PROTECTED]> wrote:
> > > Also remember that even if you do larger pages by using virtual pairs or
> > > quads of real pages because it helps on some systems you end up needing
> > > the same sized sglist as before so you don't make anything worse
> > Also remember that even if you do larger pages by using virtual pairs or
> > quads of real pages because it helps on some systems you end up needing
> > the same sized sglist as before so you don't make anything worse for
> > half-assed controllers as you get the same I/O size providing they ha
On Sat, 28 Apr 2007 10:43:28 +0100 Alan Cox <[EMAIL PROTECTED]> wrote:
> On Fri, 27 Apr 2007 21:56:34 -0700
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > On Sat, 28 Apr 2007 13:17:40 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> >
> > > > Fix up your lameo HBA for reads.
> > >
> > > Where
On Fri, 27 Apr 2007 21:56:34 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sat, 28 Apr 2007 13:17:40 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > > Fix up your lameo HBA for reads.
> >
> > Where did that come from? You spend 20 lines describing the inefficiencies
> > of the readahea
On Sat, 28 Apr 2007 10:32:56 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> On Sat, 2007-04-28 at 01:22 -0700, Andrew Morton wrote:
> > On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >
> > > >
> > > > The other thing is that we can batch up pagecache page inser
On Sat, 2007-04-28 at 01:22 -0700, Andrew Morton wrote:
> On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> > >
> > > The other thing is that we can batch up pagecache page insertions for bulk
> > > writes as well (that is, write(2) with buffer size > page size). I
On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >
> > The other thing is that we can batch up pagecache page insertions for bulk
> > writes as well (that is, write(2) with buffer size > page size). I should
> > have a patch somewhere for that as well if anyone inter
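A hedged sketch of the kind of batching being discussed here (Andrew's actual patch is not shown in the thread): insert a run of pre-allocated, locked pages into the mapping's radix tree under one tree_lock acquisition instead of one lock round-trip per page. Simplified for illustration: radix_tree_preload() only guarantees nodes for a single insertion, -EEXIST handling is omitted, and tree_lock was an rwlock in kernels of this era.

static int add_pages_batched(struct address_space *mapping,
			     struct page **pages, pgoff_t start, int nr)
{
	int i, err;

	err = radix_tree_preload(GFP_KERNEL);
	if (err)
		return err;
	write_lock_irq(&mapping->tree_lock);
	for (i = 0; i < nr; i++) {
		err = radix_tree_insert(&mapping->page_tree, start + i,
					pages[i]);
		if (err)
			break;
		page_cache_get(pages[i]);	/* pagecache reference */
		pages[i]->mapping = mapping;
		pages[i]->index = start + i;
		mapping->nrpages++;
	}
	write_unlock_irq(&mapping->tree_lock);
	radix_tree_preload_end();
	return err;
}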
On Sat, Apr 28, 2007 at 12:27:45PM +1000, Nick Piggin wrote:
> And that wasn't due to the 128 sg limit?
No, that was due to aacraid really liking sg lists as small as possible
where every entry covers areas as big as possible. The driver really
liked physical merging once wli changed the page all
On Sat, 2007-04-28 at 11:43 +1000, Nick Piggin wrote:
> Andrew Morton wrote:
> > For example, see __do_page_cache_readahead(). It does a read_lock() and a
> > page allocation and a radix-tree lookup for each page. We can vastly
> > improve that.
> >
> > Step 1:
> >
> > - do a read-lock
> >
>
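As an illustration of the batching Andrew outlines (not his actual patch, and with reference counting omitted for brevity): instead of one read_lock() and one radix-tree lookup per page, the whole readahead window can be gang-looked-up under a single hold of mapping->tree_lock, with pages allocated only for the holes.

/* Sketch: find which pages in [start, start + nr) are already cached. */
static unsigned int lookup_window(struct address_space *mapping,
				  pgoff_t start, unsigned int nr,
				  struct page **pages)
{
	unsigned int found;

	read_lock_irq(&mapping->tree_lock);
	/* caller must check page->index: gang lookup can return pages
	 * beyond the requested window */
	found = radix_tree_gang_lookup(&mapping->page_tree,
				       (void **)pages, start, nr);
	read_unlock_irq(&mapping->tree_lock);
	return found;
}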
On Fri, 27 Apr 2007 23:24:05 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> > Fact is, this change has *costs*. And you're completely ignoring them,
> > trying to spin them away. It ain't working and it never will. I'm seeing
> > no serious attempt to think about how we can reduce
On Fri, 27 Apr 2007, Andrew Morton wrote:
> Your patch *is* a workaround. It's a workaround for small CPU pagesize.
> It's a workaround for suboptimal VFS and filesystem implementations. It's
> a workaround for a disk adapter which has suboptimal readahead and
> writeback caching implementation
On Fri, 27 Apr 2007 22:08:17 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> On Fri, 27 Apr 2007, Andrew Morton wrote:
>
> > My (repeated) point is that if we populate pagecache with
> > physically-contiguous 4k
> > pages in this manner then bio+block will be able to create much larg
On Fri, 27 Apr 2007, Andrew Morton wrote:
> My (repeated) point is that if we populate pagecache with
> physically-contiguous 4k
> pages in this manner then bio+block will be able to create much larger SG
> lists.
True but the "if" becomes exceedingly rare the longer the system was in
operatio
On Sat, 28 Apr 2007 13:17:40 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > Fix up your lameo HBA for reads.
>
> Where did that come from? You spend 20 lines describing the inefficiencies
> of the readahead in the page cache and it should be fixed but then you
> turn around and say fix the HBA
On Sat, 28 Apr 2007, David Chinner wrote:
> > 1-disk and 2-disk read throughput fell by an improbable amount, which makes
> > me cautious about the other numbers.
>
> For read, yes, and it's because something is going wrong with the
> I/O size - it looks like readahead thrashing of some kind even
On Fri, Apr 27, 2007 at 12:11:08PM -0700, Andrew Morton wrote:
> On Sat, 28 Apr 2007 03:34:32 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > Some more information - stripe unit on the dm raid0 is 512k.
> > I have not attempted to increase I/O sizes at all yet - these tests are
> > just demons
William Lee Irwin III wrote:
>> What sort of strategy do you intend to use to speculatively populate
>> the pagecache with contiguous pages?
On Sat, Apr 28, 2007 at 12:50:26PM +1000, Nick Piggin wrote:
> Andrew outlined it.
I'd like to suggest a few straightforward additions to the proposal:
(1)
William Lee Irwin III wrote:
On Sat, Apr 28, 2007 at 12:27:45PM +1000, Nick Piggin wrote:
I guess 10% isn't a small amount. Though it would be nice to have
before/after numbers for Linux. And, like Andrew was saying, we could
just _attempt_ to put contiguous pages in pagecache rather than
_requ
On Sat, Apr 28, 2007 at 12:27:45PM +1000, Nick Piggin wrote:
> I guess 10% isn't a small amount. Though it would be nice to have
> before/after numbers for Linux. And, like Andrew was saying, we could
> just _attempt_ to put contiguous pages in pagecache rather than
> _require_ it. Which is still r
Christoph Hellwig wrote:
On Fri, Apr 27, 2007 at 10:25:44PM +1000, Nick Piggin wrote:
Linus's favourite jokes about powerpc mmu being crippled forever, aside ;)
Different mmu. The desktop 32bit mmu Linus referred to has almost nothing
in common with the mmu on 64bit systems.
Well I wasn't
On Thu, Apr 26, 2007 at 11:55:42PM -0700, Andrew Morton wrote:
>>> Please address my point: if in five years time x86 has larger or variable
>>> pagesize, this code will be a permanent millstone around our necks which we
>>> *should not have merged*.
>>> And if in five years time x86 does not have l
Andrew Morton wrote:
On Sat, 28 Apr 2007 03:34:32 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
Some more information - stripe unit on the dm raid0 is 512k.
I have not attempted to increase I/O sizes at all yet - these tests are
just demonstrating efficiency improvements in the filesystem.
Th
On Fri, 27 Apr 2007 06:44:51 -0700 William Lee Irwin III <[EMAIL PROTECTED]>
wrote:
> On Thu, Apr 26, 2007 at 11:55:42PM -0700, Andrew Morton wrote:
> > Please address my point: if in five years time x86 has larger or variable
> > pagesize, this code will be a permanent millstone around our necks
On Sat, 28 Apr 2007 03:34:32 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> Some more information - stripe unit on the dm raid0 is 512k.
> I have not attempted to increase I/O sizes at all yet - these tests are
> just demonstrating efficiency improvements in the filesystem.
>
> These numbers for
On Fri, 2007-04-27 at 12:55 -0400, Theodore Tso wrote:
>> Unfortunately, this isn't a problem with hardware getting better, but
>> a willingness to break backwards compatibility.
>> x86_64 uses a 4k page size to avoid breaking 32-bit applications. And
>> unfortunately, iirc, even 64-bit applicatio
On Sat, Apr 28, 2007 at 02:36:20AM +1000, David Chinner wrote:
> The test was writing a single 50GB file to a fresh filesystem, and
> then reading it back. Run on two different dm stripes - a 4-disk
> RAID0 and an 8-disk RAID0 stripe, with a stripe unit of 512k. Disks
> are 10krpm SAS, external jbod
On Fri, 2007-04-27 at 12:55 -0400, Theodore Tso wrote:
> On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > And hardware gets better. If Intel & AMD come out with a 16k pagesize
> > option in a couple of years we'll look pretty dumb. If the problems which
> > you're presently havi
On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> And hardware gets better. If Intel & AMD come out with a 16k pagesize
> option in a couple of years we'll look pretty dumb. If the problems which
> you're presently having with that controller get sorted out in the next
> generation
On Fri, 27 Apr 2007, Nick Piggin wrote:
> Linus's favourite jokes about powerpc mmu being crippled forever, aside ;)
>
> This seems like just speculation. I would not be against something which,
> without, would "cripple" some relevant hardware, but you are just handwaving
> at this point. And yo
On Fri, Apr 27, 2007 at 01:48:49AM -0700, Andrew Morton wrote:
> And other filesystems (ie: ext4) _might_ use it. But ext4 is extent-based,
> so perhaps it's not worth churning the on-disk format to get a bit of a
> boost in the block allocator.
Well, ext3 could definitely use it; there are people
On Fri, Apr 27, 2007 at 12:26:40AM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 00:19:49 -0700 (PDT) Christoph Lameter <[EMAIL
> PROTECTED]> wrote:
>
> > The page cache handling in the various layers is significantly
> > simplified which reduces maintenance cost.
>
> How on earth can the *
On Fri, Apr 27, 2007 at 08:22:07PM +1000, Nick Piggin wrote:
>>> Just a random aside question... doesn't Oracle db do direct IO from
>>> hugepages?
William Lee Irwin III wrote:
>> If and when configured to use direct IO and hugepages, yes.
On Fri, Apr 27, 2007 at 11:06:36PM +1000, Nick Piggin wro
Nick Piggin <[EMAIL PROTECTED]> writes:
> Eric W. Biederman wrote:
>> Jens Axboe <[EMAIL PROTECTED]> writes:
>
>>>Yes, that is exactly the problem. Once you have that, pktcdvd is pretty
>>>much reduced to setup and init code, the actual data handling can be
>>>done by sr or ide-cd directly. You co
On Thu, Apr 26, 2007 at 11:55:42PM -0700, Andrew Morton wrote:
> Please address my point: if in five years time x86 has larger or variable
> pagesize, this code will be a permanent millstone around our necks which we
> *should not have merged*.
> And if in five years time x86 does not have larger pa
On Fri, Apr 27, 2007 at 10:14:20PM +1000, Paul Mackerras wrote:
> It's not as simple on 64-bit powerpc with the hash table of course,
> because the page size is chosen at the segment (256MB) level,
> restricting where we can put 64k and 16M pages to some degree.
I think Christoph's variable order
On Fri, Apr 27, 2007 at 10:25:44PM +1000, Nick Piggin wrote:
> Linus's favourite jokes about powerpc mmu being crippled forever, aside ;)
Different mmu. The desktop 32bit mmu Linus referred to has almost nothing
in common with the mmu on 64bit systems.
> >Right this could help but it is not addre
On Fri, Apr 27, 2007 at 05:12:05AM -0700, Christoph Lameter wrote:
> Powerpc supports multiple pagesizes. Maybe we could make mmap use those
> page sizes some day if we had a variable order page cache. Your stance on
> the issue means that powerpc will be forever crippled and not be able to
> us
William Lee Irwin III wrote:
William Lee Irwin III wrote:
I readily concede that seeks are most costly. Yet memory contiguity
remains rather influential.
Witness the fact that I'm now being called upon a second time to
adjust the order in which mm/page_alloc.c returns pages for the
sake of impl
On (27/04/07 20:05), Nick Piggin didst pronounce:
> Christoph Hellwig wrote:
> >On Thu, Apr 26, 2007 at 05:48:12PM +1000, Nick Piggin wrote:
> >
> >>>Well maybe you could explain what you want. Preferably without
> >>>redefining the established terms?
> >>
> >>Support for larger buffers than page
William Lee Irwin III wrote:
>> I readily concede that seeks are most costly. Yet memory contiguity
>> remains rather influential.
>> Witness the fact that I'm now being called upon a second time to
>> adjust the order in which mm/page_alloc.c returns pages for the
>> sake of implicitly establishin
Paul Mackerras wrote:
Nick Piggin writes:
For the TLB issue, higher order pagecache doesn't help. If distros
Oh? Assuming your hardware is capable of supporting a variety of page
sizes, and of putting a page at any address that is a multiple of its
size, it should help, potentially a great
Christoph Lameter wrote:
On Fri, 27 Apr 2007, Nick Piggin wrote:
For the TLB issue, higher order pagecache doesn't help. If distros
ship with a 4K page size on powerpc, and use some larger pages in
the pagecache, some people are still going to get angry because
they wanted to use 64K pages...
Nick Piggin writes:
> For the TLB issue, higher order pagecache doesn't help. If distros
Oh? Assuming your hardware is capable of supporting a variety of page
sizes, and of putting a page at any address that is a multiple of its
size, it should help, potentially a great deal, as far as I can see
On Fri, 27 Apr 2007, Nick Piggin wrote:
> For the TLB issue, higher order pagecache doesn't help. If distros
> ship with a 4K page size on powerpc, and use some larger pages in
> the pagecache, some people are still going to get angry because
> they wanted to use 64K pages... But I agree 64K pages
On Fri, 27 Apr 2007, Andrew Morton wrote:
> On Fri, 27 Apr 2007 00:19:49 -0700 (PDT) Christoph Lameter <[EMAIL
> PROTECTED]> wrote:
>
> > The page cache handling in the various layers is significantly
> > simplified which reduces maintenance cost.
>
> How on earth can the *addition* of variabl
On Fri, 27 Apr 2007, Paul Mackerras wrote:
> Option (b) would be a bit of an ugly hack.
>
> Which leaves option (c) - unless you have a further option. So I have
> to say I support Christoph on this, at least as far as the general
> principle is concerned.
We could approximate option (b) by set
Paul Mackerras wrote:
Andrew Morton writes:
If x86 had larger pagesize we wouldn't be seeing any of this. It is a
workaround for present-generation hardware.
Unfortunately, it's not really practical to increase the page size
very much on most systems, because you end up wasting a lot of s
Andrew Morton writes:
> If x86 had larger pagesize we wouldn't be seeing any of this. It is a
> workaround for present-generation hardware.
Unfortunately, it's not really practical to increase the page size
very much on most systems, because you end up wasting a lot of space
in the page cache
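To put a rough number on that waste (a hypothetical illustration, not figures from Paul's mail): a cached file wastes on average half a page at its tail, so a 6k file costs two 4k pages (8k, roughly 25% wasted) but a whole 64k page (roughly 90% wasted) if the base page size is 64k, which is why simply raising PAGE_SIZE hurts workloads dominated by small files.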
Christoph Hellwig wrote:
On Thu, Apr 26, 2007 at 04:50:06PM +1000, Nick Piggin wrote:
Improving the buffer layer would be a good way. Of course, that is
a long and difficult task, so nobody wants to do it.
It's also a stupid idea. We got rid of the buffer layer because it's
a complete pain
Eric W. Biederman wrote:
Jens Axboe <[EMAIL PROTECTED]> writes:
Yes, that is exactly the problem. Once you have that, pktcdvd is pretty
much reduced to setup and init code, the actual data handling can be
done by sr or ide-cd directly. You could merge it into cdrom.c, it would
not be very diff
William Lee Irwin III wrote:
William Lee Irwin III <[EMAIL PROTECTED]> writes:
In memory as on disk, contiguity matters a lot for performance.
On Thu, Apr 26, 2007 at 12:21:24PM -0600, Eric W. Biederman wrote:
Not nearly so much though. In memory you don't have seeks to avoid.
On disks av
Christoph Lameter wrote:
On Thu, 26 Apr 2007, Nick Piggin wrote:
But what do you mean with it? A block is no longer a contiguous section of
memory. So you have redefined the term.
I don't understand what you mean at all. A block has always been a
contiguous area of disk.
You want to chang
Christoph Lameter wrote:
On Thu, 26 Apr 2007, Nick Piggin wrote:
Christoph Lameter wrote:
On Thu, 26 Apr 2007, Nick Piggin wrote:
But I maintain that the end result is better than the fragmentation
based approach. A lot of people don't actually want a bigger page
cache size, because they
Christoph Hellwig wrote:
On Thu, Apr 26, 2007 at 05:48:12PM +1000, Nick Piggin wrote:
Well maybe you could explain what you want. Preferably without redefining
the established terms?
Support for larger buffers than page cache pages.
I don't think you really want this :) The whole non-page
William Lee Irwin III wrote:
On Fri, Apr 27, 2007 at 01:38:30AM +1000, Nick Piggin wrote:
Or good grounds to increase the sg limit and push for io controller
manufacturers to do the same. If we have a hack in the kernel that
mostly works, they won't.
On Fri, Apr 27, 2007 at 01:38:30AM +1000,
On Fri, 27 Apr 2007 18:03:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > > > > You basically have to
> > > > > jump through nasty, nasty hoops, to handle corner cases that are
> > > > > introduced
> > > > > because the generic code can no longer reliably lock out access to a
> > > > > file
On Fri, Apr 27, 2007 at 12:04:03AM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 16:09:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > > On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]>
> > > wrote:
On Fri, 27 Apr 2007 00:35:19 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> On Fri, 27 Apr 2007, Andrew Morton wrote:
>
> > On Fri, 27 Apr 2007 00:22:26 -0700 (PDT) Christoph Lameter <[EMAIL
> > PROTECTED]> wrote:
> >
> > > I will submit pieces to mm depending on the
> > > outcome
On Fri, 27 Apr 2007, Andrew Morton wrote:
> On Fri, 27 Apr 2007 00:22:26 -0700 (PDT) Christoph Lameter <[EMAIL
> PROTECTED]> wrote:
>
> > I will submit pieces to mm depending on the
> > outcome of our discussions.
> There's a ludicrous amount of MM work pending in -mm. It would probably be
>
On Fri, 27 Apr 2007 00:22:26 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> I will submit pieces to mm depending on the
> outcome of our discussions.
Thanks.
There's a ludicrous amount of MM work pending in -mm. It would probably be
less work at your end to see what ends up landin
On Fri, 27 Apr 2007 00:19:49 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> The page cache handling in the various layers is significantly
> simplified which reduces maintenance cost.
How on earth can the *addition* of variable pagecache size simplify the
existing code?
What cleanu
On Thu, 26 Apr 2007, Andrew Morton wrote:
> Were any cleanups made which were not also applicable as standalone things
> to mainline?
Ahh. I think I know what you mean. The current patchset is for performance
testing against mainline. Let's first cover the bases and then see where
we go. It is n
On Thu, 26 Apr 2007, Andrew Morton wrote:
> It's not exactly hard to lock four pages which are contiguous in pagecache,
> contiguous in physical memory and are contiguous in the radix-tree.
If you can find them
> > The patch is not about forcing to use large pages but about the option to
On Fri, 27 Apr 2007 16:09:21 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> >
> > > >blocksizes via this scheme - instantiate and lock four pages
On Thu, 26 Apr 2007 22:49:53 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]>
wrote:
> On Thu, 26 Apr 2007, Andrew Morton wrote:
>
> > > Or make sure that truncate
> > > doesn't race on a partial *block* truncate?
> >
> > lock four pages
>
> You would only lock a single higher order block. Tr
On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
>
> > >blocksizes via this scheme - instantiate and lock four pages and go for
> > >it.
> >
> > So now how do you get block aligned writeback?
>
>
On Thu, 26 Apr 2007, Andrew Morton wrote:
> > Or make sure that truncate
> > doesn't race on a partial *block* truncate?
>
> lock four pages
You would only lock a single higher order block. Truncate works on that
level.
If you have 4 separate pages then you need to take separate locks and you
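The "four separate locks" case reduces to the usual ordering rule; a minimal sketch (not code from the thread) of locking the order-0 pages that make up one larger block, in ascending index order so that two tasks working on the same block cannot deadlock:

static void lock_block_pages(struct page **pages, int nr)
{
	int i;

	/* pages[] is assumed sorted by ->index */
	for (i = 0; i < nr; i++)
		lock_page(pages[i]);
}

static void unlock_block_pages(struct page **pages, int nr)
{
	int i;

	for (i = nr - 1; i >= 0; i--)
		unlock_page(pages[i]);
}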
On Thu, Apr 26 2007, Mel Gorman wrote:
> On (26/04/07 20:39), Jens Axboe didst pronounce:
> > On Thu, Apr 26 2007, Christoph Lameter wrote:
> > > On Thu, 26 Apr 2007, Jens Axboe wrote:
> > >
> > > > On Thu, Apr 26 2007, Christoph Lameter wrote:
> > > > > On Thu, 26 Apr 2007, Jens Axboe wrote:
> >
On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> >blocksizes via this scheme - instantiate and lock four pages and go for
> >it.
>
> So now how do you get block aligned writeback?
in writeback and pageout:
if (page->index & mapping->block_size_mask)
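Expanding that fragment into a slightly fuller sketch (block_size_mask is the hypothetical address_space field from the pseudocode above, not an existing one): if pageout lands on a page in the middle of a block, writeback is realigned to the block's first page so the whole block goes out together.

/* Sketch: round a page index down to the start of its block. */
static pgoff_t block_start_index(pgoff_t index, pgoff_t block_size_mask)
{
	if (index & block_size_mask)
		index &= ~block_size_mask;
	return index;
}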
On Thu, Apr 26, 2007 at 07:53:57PM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 12:27:31 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > On Thu, Apr 26, 2007 at 07:04:38PM -0700, Andrew Morton wrote:
> > > On Tue, 24 Apr 2007 15:21:05 -0700 [EMAIL PROTECTED] wrote:
> > > Also, afaict your i