Re: [PATCH] vma limited swapin readahead

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 02:45:04PM -0200, Rik van Riel wrote: > On Thu, 1 Feb 2001, Stephen C. Tweedie wrote: > > But only when the extra pages we're reading in don't > displace useful data from memory, making us fault in > those other pages ... causing us to go t

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 04:09:53PM +0100, Christoph Hellwig wrote: > On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote: > > > > That would require the vfs interfaces themselves (address space > > readpage/writepage ops) to take kiobufs as arguments, instead of struct > > page

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 10:08:45AM -0600, Steve Lord wrote: > Christoph Hellwig wrote: > > On Thu, Feb 01, 2001 at 08:14:58PM +0530, [EMAIL PROTECTED] wrote: > > > > > > That would require the vfs interfaces themselves (address space > > > readpage/writepage ops) to take kiobufs as arguments

Re: [PATCH] vma limited swapin readahead

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 08:53:33AM -0200, Marcelo Tosatti wrote: > > On Thu, 1 Feb 2001, Stephen C. Tweedie wrote: > > If we're under free memory shortage, "unlucky" readaheads will be harmful. I know, it's a balancing act. But given that even one successf

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 01:28:33PM +0530, [EMAIL PROTECTED] wrote: > > Here's a second pass attempt, based on Ben's wait queue extensions: > Does this sound any better ? It's a mechanism, all right, but you haven't described what problems it is trying to solve, and where it is likely to be

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-01 Thread Stephen C. Tweedie
Hi, On Thu, Feb 01, 2001 at 10:25:22AM +0530, [EMAIL PROTECTED] wrote: > > >We _do_ need the ability to stack completion events, but as far as the > >kiobuf work goes, my current thoughts are to do that by stacking > >lightweight "clone" kiobufs. > > Would that work with stackable filesystems ?

Re: [PATCH] vma limited swapin readahead

2001-02-01 Thread Stephen C. Tweedie
Hi, On Wed, Jan 31, 2001 at 04:24:24PM -0800, David Gould wrote: > > I am skeptical of the argument that we can win by replacing "the least > desirable" pages with pages that were even less desirable and that we have > no recent indication of any need for. It seems possible under heavy swap > to dis

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-01-31 Thread Stephen C. Tweedie
Hi, On Wed, Jan 31, 2001 at 07:28:01PM +0530, [EMAIL PROTECTED] wrote: > > Do the following modifications to your wait queue extension sound > reasonable ? > > 1. Change add_wait_queue to add elements to the end of queue (fifo, by > default) and instead have an add_wait_queue_lifo() routine tha

Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait/notify + callback chains

2001-01-31 Thread Stephen C. Tweedie
Hi, On Tue, Jan 30, 2001 at 10:15:02AM +0530, [EMAIL PROTECTED] wrote: > > Comments, suggestions, advise, feedback solicited ! My first comment is that this looks very heavyweight indeed. Isn't it just over-engineered? We _do_ need the ability to stack completion events, but as far as the ki

Re: [Kiobuf-io-devel] Re: RFC: Kernel mechanism: Compound event wait/notify + callback chains

2001-01-31 Thread Stephen C. Tweedie
Hi, On Wed, Jan 31, 2001 at 04:12:11PM +0530, [EMAIL PROTECTED] wrote: > >Thanks for mentioning this. I didn't know about it earlier. I've been > going through the 4/00 kqueue patch on freebsd ... Linus has already denounced them as massively over-engineered... --Stephen - To unsubscribe f

Re: [PATCH] vma limited swapin readahead

2001-01-31 Thread Stephen C. Tweedie
Hi, On Wed, Jan 31, 2001 at 01:05:02AM -0200, Marcelo Tosatti wrote: > > However, the pages which are contiguous on swap are not necessarily > contiguous in the virtual memory area where the fault happened. That means > the swapin readahead code may read pages which are not related to the > proc

Re: Renaming lost+found

2001-01-27 Thread Stephen C. Tweedie
Hi, On Fri, Jan 26, 2001 at 06:05:54PM -0200, Rodrigo Barbosa (aka morcego) wrote: > > I think JFS indeed doesn't have it. And ReiserFS doesn't too. This > should be common place for journaling filesystems. No, it's nothing to do with journaling or not. Even journaling filesystems can suffer

Re: inode->i_dirty_buffers redundant ?

2001-01-26 Thread Stephen C. Tweedie
Hi, On Thu, Jan 25, 2001 at 07:11:01PM -0200, Marcelo Tosatti wrote: > > We probably want another kind of "IO buffer" abstraction for 2.5 which can > support buffer's bigger than PAGE_SIZE. > > Do you have any thoughts on that, Stephen? XFS is already doing this, with pagebufs being used in

Re: inode->i_dirty_buffers redundant ?

2001-01-26 Thread Stephen C. Tweedie
Hi, On Thu, Jan 25, 2001 at 09:05:54PM +0100, Daniel Phillips wrote: > "Stephen C. Tweedie" wrote: > > We also maintain the > > per-page buffer lists as caches of the virtual-to-physical mapping to > > avoid redundant bmap()ping. > > Could you clarify tha

Re: ioremap_nocache problem?

2001-01-26 Thread Stephen C. Tweedie
Hi, On Thu, Jan 25, 2001 at 11:53:01AM -0600, Timur Tabi wrote: > > > As in an MMIO aperture? If its MMIO on the bus you should be able to > > just call ioremap with the bus address. By nature of it being outside > > of real ram, it should automatically be uncached (unless you've set an > >

Re: ioremap_nocache problem?

2001-01-26 Thread Stephen C. Tweedie
Hi, On Thu, Jan 25, 2001 at 10:49:50AM -0600, Timur Tabi wrote: > > > set_bit(PG_reserved, &page->flags); > > ioremap(); > > ... > > iounmap(); > > clear_bit(PG_reserved, &page->flags); > > The problem with this is that between the ioremap and iounmap, the page is > reserved. W

Re: ioremap_nocache problem?

2001-01-26 Thread Stephen C. Tweedie
Hi, On Thu, Jan 25, 2001 at 09:56:32AM -0600, Timur Tabi wrote: > > ioremap*() is only supposed to be used on IO regions or reserved > > pages. If you haven't marked the pages as reserved, then iounmap will > > do the wrong thing, so it's up to you to reserve the pages. > > Au contraire! > > I

Re: limit on number of kmapped pages

2001-01-25 Thread Stephen C. Tweedie
Hi, On Wed, Jan 24, 2001 at 12:35:12AM +0000, David Wragg wrote: > > > And why do the pages need to be kmapped? > > They only need to be kmapped while data is being copied into them. But you only need to kmap one page at a time during the copy. There is absolutely no need to copy the whole c

Re: Largefile support in 2.4

2001-01-25 Thread Stephen C. Tweedie
Hi, On Wed, Jan 24, 2001 at 02:38:00PM -0500, Mike Black wrote: > How do normal users get to create/maintain large files (i.e. >2G) in Linux > 2.4 on i386? > The root user can make filesize unlimited but a non-root user cannot. They > come up with the same limits in both tcsh and bash (i.e. fil

Re: inode->i_dirty_buffers redundant ?

2001-01-25 Thread Stephen C. Tweedie
Hi, On Thu, Jan 25, 2001 at 04:17:30PM +0530, V Ganesh wrote: > so i_dirty_buffers contains buffer_heads of pages coming from write() as > well as metadata buffers from mark_buffer_dirty_inode(). a dirty MAP_SHARED > page which has been write()n to will potentially exist in both lists. > won't d

Re: ioremap_nocache problem?

2001-01-25 Thread Stephen C. Tweedie
Hi, On Tue, Jan 23, 2001 at 10:53:51AM -0600, Timur Tabi wrote: > > My problem is that it's very easy to map memory with ioremap_nocache, but if > you use iounmap() the un-map it, the entire system will crash. No one has been > able to explain that one to me, either. ioremap*() is only suppose

Re: inode->i_dirty_buffers redundant ?

2001-01-24 Thread Stephen C. Tweedie
Hi, On Wed, Jan 24, 2001 at 03:25:16PM +0530, V Ganesh wrote: > now that we have inode->i_mapping->dirty_pages, what do we need > inode->i_dirty_buffers for ? Metadata. Specifically, directory contents and indirection blocks. --Stephen

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-11 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 11:14:54AM -0800, Linus Torvalds wrote: > In article <[EMAIL PROTECTED]>, > > kiobufs are crap. Face it. They do NOT allow proper multi-page scatter > gather, regardless of what the kiobuf PR department has said. It's not surprising, since they were designed to solve

Re: Subtle MM bug

2001-01-11 Thread Stephen C. Tweedie
Hi, On Thu, Jan 11, 2001 at 02:03:48PM -0500, Alexander Viro wrote: > On Thu, 11 Jan 2001, Stephen C. Tweedie wrote: > > > On Thu, Jan 11, 2001 at 02:12:05PM +0100, Trond Myklebust wrote: > > > > > > What's wrong with copy-on-write style semantics? IOW, a

Re: Subtle MM bug

2001-01-11 Thread Stephen C. Tweedie
Hi, On Thu, Jan 11, 2001 at 11:50:21AM -0500, Albert D. Cahalan wrote: > Stephen C. Tweedie writes: > > > > But is it really worth the pain? I'd hate to have to audit the > > entire VFS to make sure that it works if another thread changes our > > credentials i

Re: Subtle MM bug

2001-01-11 Thread Stephen C. Tweedie
Hi, On Thu, Jan 11, 2001 at 02:12:05PM +0100, Trond Myklebust wrote: > > What's wrong with copy-on-write style semantics? IOW, anyone who > wants to change the credentials needs to make a private copy of the > existing structure first. Because COW only solves the problem if each task is only c

Re: Subtle MM bug

2001-01-11 Thread Stephen C. Tweedie
Hi, On Wed, Jan 10, 2001 at 12:11:16PM -0800, Linus Torvalds wrote: > > That said, we can easily support the notion of CLONE_CRED if we absolutely > have to (and sane people just shouldn't use it), so if somebody wants to > work on this for 2.5.x... But is it really worth the pain? I'd hate to

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-10 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 02:25:43PM -0800, Linus Torvalds wrote: > In article <[EMAIL PROTECTED]>, > Stephen C. Tweedie <[EMAIL PROTECTED]> wrote: > > > >Jes has also got hard numbers for the performance advantages of > >jumbograms on some of the net

Re: `rmdir .` doesn't work in 2.4

2001-01-10 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 03:06:35PM +0100, Andrea Arcangeli wrote: > On Tue, Jan 09, 2001 at 07:41:21AM -0600, Jesse Pollard wrote: > > Not exactly valid, since a file could be created in that "pinned" directory > > after the rmdir... > > In 2.2.x no file can be created in the pinned director

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 05:16:40PM +0100, Ingo Molnar wrote: > On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > i'm talking about kiovecs not kiobufs (because those are equivalent to a > fragmented packet - every packet fragment can be anywhere). Initializing a > kiovec

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 12:30:39PM -0500, Benjamin C.R. LaHaise wrote: > On Tue, 9 Jan 2001, Ingo Molnar wrote: > > > this is why i ment that *right now* kiobufs are not suited for networking, > > at least the way we do it. Maybe if kiobufs had the same kind of internal > > structure as sk_f

Re: VM subsystem bug in 2.4.0 ?

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 04:45:10PM +0100, Christoph Rohland wrote: > Hi Stephen, > > AFAIU mlock'ed pages would never get deactivated since the ptes do not > get dropped. D'oh, right --- so can't you lock a segment just by bumping page_count on its pages? --Stephen

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 04:00:34PM +0100, Ingo Molnar wrote: > > On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > we do have SLAB [which essentially caches structures, on a per-CPU basis] > which i did take into account, but still, initializing a 600+ byte kiovec > is

Re: VM subsystem bug in 2.4.0 ?

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 03:53:55PM +0100, Christoph Rohland wrote: > > On Tue, 9 Jan 2001, Stephen C. Tweedie wrote: > > But again, how do you clear the bit? Locking is a per-vma property, > > not per-page. I can mmap a file twice and mlock just one of the > >

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 03:40:56PM +0100, Ingo Molnar wrote: > > i'd love to first see these kinds of applications (under Linux) before > designing for them. Things like Beowulf have been around for a while now, and SGI have been doing that sort of multimedia stuff for ages. I don't think

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 01:04:49PM +0100, Ingo Molnar wrote: > > On Tue, 9 Jan 2001, Christoph Hellwig wrote: > > please study the networking portions of the zerocopy patch and you'll see > why this is not desirable. An alloc_kiovec()/free_kiovec() is exactly the > thing we cannot afford in

Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 11:23:41AM +0100, Ingo Molnar wrote: > > > Having proper kiobuf support would make it possible to, for example, > > do zerocopy network->disk data transfers and lots of other things. > > i used to think that this is useful, but these days it isnt. It's a waste > of P

Re: VM subsystem bug in 2.4.0 ?

2001-01-09 Thread Stephen C. Tweedie
Hi, On Mon, Jan 08, 2001 at 04:30:10PM -0200, Rik van Riel wrote: > On Mon, 8 Jan 2001, Linus Torvalds wrote: > > > > The only solution I see is something like a "active_immobile" > > list, and add entries to that list whenever "writepage()" > > returns 1 - instead of just moving them to the act

Re: Confirmation request about new 2.4.x. kernel limits

2001-01-09 Thread Stephen C. Tweedie
Hi, On Mon, Jan 08, 2001 at 11:11:05PM -0500, Venkatesh Ramamurthy wrote: > > Max. RAM size: 64 GB (any slowness accessing RAM over 4 GB with 32 bit machines?) More than 4GB in RAM is bounce buffered, so there is a performance penalty as the d

Re: Confirmation request about new 2.4.x. kernel limits

2001-01-09 Thread Stephen C. Tweedie
Hi, On Fri, Jan 05, 2001 at 11:46:04PM +0100, Pavel Machek wrote: > > > Max. file size: 1 TB(?) > > > Max. file system size: 2 TB(?) > > Again, maybe on i386 with ext2. Actually, the 2TB limit affects all architectures, as we assume that block indexes fit

Re: `rmdir .` doesn't work in 2.4

2001-01-09 Thread Stephen C. Tweedie
Hi, On Mon, Jan 08, 2001 at 09:28:33PM +0100, Andrea Arcangeli wrote: > On Mon, Jan 08, 2001 at 12:58:20PM -0500, Alexander Viro wrote: > > It's a hell of a pain wrt locking. You need to lock the parent, but it can > > This is a no-brainer and bad implementation, but shows it's obviously right >

Re: `rmdir .` doesn't work in 2.4

2001-01-09 Thread Stephen C. Tweedie
Hi, On Tue, Jan 09, 2001 at 01:01:25AM +0100, Andrea Arcangeli wrote: > On Mon, Jan 08, 2001 at 03:27:21PM -0800, Linus Torvalds wrote: > > However, it is against all UNIX standards, and Linux-2.4 will explicitly > > I may be missing something but apparently SuSv2 allows it, you can check here:

Re: ext3fs 0.0.5d and reiserfs 3.5.2x mutually exclusive

2001-01-08 Thread Stephen C. Tweedie
Hi, On Thu, Jan 04, 2001 at 01:50:44PM +0100, Matthias Andree wrote: > I just tried to patch ext3fs 0.0.5d on top of a 2.2.18 that already had > reiserfs 3.5.28 and failed, there are overlapping patches in fs/buffer.c > that I cannot resolve for lack of knowledge how buffer.c and journalling > ar

Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-08 Thread Stephen C. Tweedie
Hi, On Sat, Jan 06, 2001 at 08:57:26PM +0100, Marc Lehmann wrote: > On Fri, Jan 05, 2001 at 11:58:56AM +0000, David Woodhouse <[EMAIL PROTECTED]> > wrote: > > You mount it read-only, recover as much as possible from it, and bin it. > > > > You _don't_ want the fs code to ignore your explicit ins

Re: MM/VM todo list

2001-01-05 Thread Stephen C. Tweedie
Hi, On Fri, Jan 05, 2001 at 10:13:27PM +0100, Christoph Hellwig wrote: > On Fri, Jan 05, 2001 at 02:56:40PM -0200, Marcelo Tosatti wrote: > > > * VM: experiment with different active lists / aging pages > > > of different ages at different rates + other page replacement > > > improvements > >

Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-05 Thread Stephen C. Tweedie
Hi, On Fri, Jan 05, 2001 at 12:46:19PM +, Alan Cox wrote: > > recovery. Because the ext3 journal is just a series of data blocks to > > be copied into the filesystem (rather than "actions" to be done), it > > doesn't matter how many times it is done. The recovery flags are not > > reset unt

Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-05 Thread Stephen C. Tweedie
Hi, On Fri, Jan 05, 2001 at 02:01:37AM +0100, Stefan Traby wrote: > > Please tell me how to specify "noreplay" for the initial "/" mount > :) You don't have to: the filesystem knows when a root mount is happening, and can do the extra work then to make sure that the mount isn't failed on a read

Re: [Ext2-devel] Re: [RFC] ext2_new_block() behaviour

2001-01-05 Thread Stephen C. Tweedie
Hi, On Fri, Jan 05, 2001 at 12:06:47AM -0700, Andreas Dilger wrote: > Stephen, you write: > > On Thu, Jan 04, 2001 at 05:31:12PM -0500, Alexander Viro wrote: > > > BTW, what inumber do you want for whiteouts? IIRC, we decided to use > > > the same entry type as UFS does (14), but I don't remember

Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-05 Thread Stephen C. Tweedie
Hi, On Fri, Jan 05, 2001 at 01:31:12AM +0100, Daniel Phillips wrote: > "Stephen C. Tweedie" wrote: > > > Yes, and so long as your journal is not on another partition/disk things > will eventually be set right. The combination of a partially updated > filesystem and

Re: [Ext2-devel] Re: [RFC] ext2_new_block() behaviour

2001-01-04 Thread Stephen C. Tweedie
Hi, On Thu, Jan 04, 2001 at 05:31:12PM -0500, Alexander Viro wrote: > > BTW, what inumber do you want for whiteouts? IIRC, we decided to use > the same entry type as UFS does (14), but I don't remember what was > the decision on inumber. UFS uses 1 for them, is it OK with you? 0 is used for pad

Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-04 Thread Stephen C. Tweedie
Hi, On Thu, Jan 04, 2001 at 10:08:21PM +0100, Stefan Traby wrote: > On Thu, Jan 04, 2001 at 07:21:04PM +0000, Stephen C. Tweedie wrote: > > > ext3 does the recovery automatically during mount(8), so user space > > will never see an unrecovered filesystem. (There are filesyste

Re: generic_file_write code segment in 2.2.18

2001-01-04 Thread Stephen C. Tweedie
Hi, On Thu, Jan 04, 2001 at 08:51:37AM +0100, Andi Kleen wrote: > On Wed, Jan 03, 2001 at 09:29:48PM -0800, Asang K Dani wrote: > > The code is buggy as far as I can see. copy_from_user doesn't return the > number of bytes copied, but the number of bytes not copied when an error > occurs (or 0

Re: [Fwd: devices.txt inconsistency]

2001-01-04 Thread Stephen C. Tweedie
Hi, On Wed, Jan 03, 2001 at 11:01:05AM -0500, Douglas Gilbert wrote: > Stephen, > Did you respond to hpa on this matter? Not yet, I'm just catching up on festive-season email. Happy New Year, all! > [From the cc address it seems as though you > work both for transmeta as well as redhat.] Nop

Re: [Ext2-devel] Re: [RFC] ext2_new_block() behaviour

2001-01-04 Thread Stephen C. Tweedie
Hi, On Wed, Jan 03, 2001 at 11:12:48AM -0500, Alexander Viro wrote: > > On Wed, 3 Jan 2001, Stephen C. Tweedie wrote: > > > Having preallocated blocks allocated immediately is deliberate: > > directories grow slowly and remain closed most of the time, so the > > nor

Re: test13-pre6

2001-01-04 Thread Stephen C. Tweedie
Hi, On Fri, Dec 29, 2000 at 04:25:43PM -0800, Linus Torvalds wrote: > > Stephen: mind trying your fsync/etc tests on this one, to verify that the > inode dirty stuff is all done right? Back from the Scottish Hogmanay celebrations now. :) I've run my normal tests on this (based mainly on timing

Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-04 Thread Stephen C. Tweedie
Hi, On Wed, Jan 03, 2001 at 05:27:25PM +0100, Daniel Phillips wrote: > > Tux2 is explicitly designed to legitimize pulling the plug as a valid > way of shutting down. Metadata-only journalling filesystems are not > designed to be used this way, and even with full-data journalling you > should b

Re: ext2's inode i_version gone, what now? (stable branch)

2001-01-04 Thread Stephen C. Tweedie
Hi, On Sat, Dec 30, 2000 at 12:54:17PM +0100, Andreas Schuldei wrote: > > > > Why was it taken away? How is compatibility maintained? What could I use > > instead to fix the problem? > > Now I think i_version was moved from ext2_fs_i.h (struct ext2_inode_info) to > fs.h (struct inode). stegfs

Re: [PATCH] filemap_fdatasync & related changes

2001-01-04 Thread Stephen C. Tweedie
Hi, On Wed, Jan 03, 2001 at 10:28:05AM -0800, Linus Torvalds wrote: > > On Wed, 3 Jan 2001, Chris Mason wrote: > > > > Just noticed the filemap_fdatasync code doesn't check the return value from > > writepage. Linus, would you take a patch that redirtied the page, puts it > > back onto the dir

Re: [Ext2-devel] Re: [RFC] ext2_new_block() behaviour

2001-01-03 Thread Stephen C. Tweedie
Hi, On Tue, Jan 02, 2001 at 10:37:50PM -0500, Alexander Viro wrote: > Umm... OK, the last argument is convincing. Thanks... > > BTW, what was the reason behind doing preallocation for directories on > ext2_bread() level? We both buy ourselves an oddity in directory structure > (preallocated blo

Re: Test12 ll_rw_block error.

2000-12-18 Thread Stephen C. Tweedie
Hi, On Sun, Dec 17, 2000 at 12:38:17AM -0200, Marcelo Tosatti wrote: > On Fri, 15 Dec 2000, Stephen C. Tweedie wrote: > > Stephen, > > The ->flush() operation (which we've been discussing a bit) would be very > useful now (mainly for XFS). > > At page_launder

Re: Test12 ll_rw_block error.

2000-12-18 Thread Stephen C. Tweedie
Hi, On Sat, Dec 16, 2000 at 07:08:02PM -0600, Russell Cattelan wrote: > > There is a very clean way of doing this with address spaces. It's > > something I would like to see done properly for 2.5: eliminate all > > knowledge of buffer_heads from the VM layer. It would be pretty > > simple to re

New patches for 2.2.18 raw IO (fix for fault retry)

2000-12-15 Thread Stephen C. Tweedie
Hi all, OK, this now assembles the full outstanding set of raw IO fixes for the final 2.2.18 kernel, both with and without the 4G bigmem patches. The only changes since the last 2.2.18pre24 release are the addition of a minor bugfix (possible failures when retrying after getting colliding kiobuf

Re: New patches for 2.2.18pre24 raw IO (fix for bounce buffer copy)

2000-12-15 Thread Stephen C. Tweedie
Hi, On Fri, Dec 08, 2000 at 01:06:33PM +0100, Andrea Arcangeli wrote: > On Mon, Dec 04, 2000 at 08:50:04PM +0000, Stephen C. Tweedie wrote: > > I have pushed another set of raw IO patches out, this time to fix a > This fix is missing: > > --- rawio-sct/mm/memory.c.~1~ Fri De

Re: Test12 ll_rw_block error.

2000-12-15 Thread Stephen C. Tweedie
Hi, On Fri, Dec 15, 2000 at 02:00:19AM -0500, Alexander Viro wrote: > On Thu, 14 Dec 2000, Linus Torvalds wrote: > > Just one: any fs that really cares about completion callback is very likely > to be picky about the requests ordering. So sync_buffers() is very unlikely > to be useful anyway. >

Re: 64bit offsets for block devices ?

2000-12-07 Thread Stephen C. Tweedie
Hi, On Wed, Dec 06, 2000 at 06:50:15AM -0800, Reto Baettig wrote: > Imagine we have a virtual disk which provides a 64bit (sparse) address > room. Unfortunately we can not use it as a block device because in a lot > of places (including buffer_head structure), we're using a long or even > an int

Re: Fixing random corruption in raw IO on 2.2.x kernel with bigmem enabled

2000-12-06 Thread Stephen C. Tweedie
Hi, On Wed, Dec 06, 2000 at 12:28:54PM -0500, Peng Dai wrote: > > This patch fixes a subtle corruption when doing raw IO on the 2.2.x > kernel > with bigmem enabled. The problem was first reported by Markus Döhr while That patch is already part of the full bugfixed raw IO patchset I posted out

Re: test12-pre5

2000-12-05 Thread Stephen C. Tweedie
Hi, On Tue, Dec 05, 2000 at 03:17:07PM -0500, Alexander Viro wrote: > > > On Tue, 5 Dec 2000, Linus Torvalds wrote: > > > And this is not just a "it happens to be like this" kind of thing. It > > _has_ to be like this, because every time we call clear_inode() we are > > going to physically fre

Re: test12-pre5

2000-12-05 Thread Stephen C. Tweedie
Hi, On Tue, Dec 05, 2000 at 09:48:51AM -0800, Linus Torvalds wrote: > > On Tue, 5 Dec 2000, Stephen C. Tweedie wrote: > > > > That is still buggy. We MUST NOT invalidate the inode buffers unless > > i_nlink == 0, because otherwise a subsequent open() and fsync() will

Re: test12-pre5

2000-12-05 Thread Stephen C. Tweedie
Hi, On Mon, Dec 04, 2000 at 08:00:03PM -0800, Linus Torvalds wrote: > > On Mon, 4 Dec 2000, Alexander Viro wrote: > > > This _is_ what clear_inode() does in pre5 (and in pre4, for that matter): > > void clear_inode(struct inode *inode) > { > if (!list_empty(&inode->

Re: Using map_user_kiobuf()

2000-12-04 Thread Stephen C. Tweedie
Hi, On Thu, Nov 30, 2000 at 01:07:37PM -, John Meikle wrote: > I have been experimenting with a module that returns data to either a user > space programme or another module. A memory area is passed in, and the data > is written to it. Because the memory may be allocated either by a module

New patches for 2.2.18pre24 raw IO (fix for bounce buffer copy)

2000-12-04 Thread Stephen C. Tweedie
Hi, I have pushed another set of raw IO patches out, this time to fix a bug with bounce buffer copying when running on highmem boxes. It is likely to affect any bounce buffer copies using non-page-aligned accesses if both highmem and normal pages are involved in the kiobuf. The specific new pat

Re: [PATCH] inode dirty blocks Re: test12-pre4

2000-12-04 Thread Stephen C. Tweedie
On Mon, Dec 04, 2000 at 01:01:36AM -0500, Alexander Viro wrote: > > It doesn't solve the problem. If you unlink a file with dirty metadata > you have a nice chance to hit the BUG() in inode.c:83. I hope that patch > below closes all remaining holes. See analysis in previous posting > (basically,

Re: corruption

2000-12-04 Thread Stephen C. Tweedie
Hi, On Sat, Dec 02, 2000 at 10:33:36AM -0500, Alexander Viro wrote: > > On Sun, 3 Dec 2000, Andrew Morton wrote: > > > It appears that this problem is not fixed. > Sure, it isn't. Place where the shit hits the fan: fs/buffer.c::unmap_buffer(). > Add the call of remove_inode_queue(bh) there and

Re: Updated: raw I/O patches (v2.2)

2000-12-01 Thread Stephen C. Tweedie
Hi, On Tue, Nov 21, 2000 at 11:18:15AM -0500, Eric Lowe wrote: > > I have updated raw I/O patches with Andrea's and my fixes against 2.2. > They check for CONFIG_BIGMEM so they can be applied and compiled > without the bigmem patch. I've just posted an assembly of all of the outstanding raw IO

Re: corruption

2000-12-01 Thread Stephen C. Tweedie
Hi, On Fri, Dec 01, 2000 at 08:35:41AM +1100, Andrew Morton wrote: > > I bet this'll catch it: > > static __inline__ void list_del(struct list_head *entry) > { > __list_del(entry->prev, entry->next); > + entry->next = entry->prev = 0; > } No, because the buffer hash list is never

Re: [PATCH] blindingly stupid 2.2 VM bug

2000-11-30 Thread Stephen C. Tweedie
Hi, On Tue, Nov 28, 2000 at 04:35:32PM -0800, John Kennedy wrote: > On Wed, Nov 29, 2000 at 01:04:16AM +0100, Andrea Arcangeli wrote: > > On Tue, Nov 28, 2000 at 03:36:15PM -0800, John Kennedy wrote: > > > No, it is all ext3fs stuff that is touching the same areas your > > > > Ok this now make

Re: e2fs performance as function of block size

2000-11-24 Thread Stephen C. Tweedie
Hi, On Wed, Nov 22, 2000 at 11:28:12PM +0100, Michael Marxmeier wrote: > > If the files get somewhat bigger (eg. > 1G) having a bigger block > size also greatly reduces the ext2 overhead. Especially fsync() > used to be really bad on big files but choosing a bigger block > size changed a lot. 2

Re: [patch] O_SYNC patch 3/3, add inode dirty buffer list support to ext2

2000-11-23 Thread Stephen C. Tweedie
Hi, On Wed, Nov 22, 2000 at 11:54:24AM -0700, Jeff V. Merkey wrote: > > I have not implemented O_SYNC in NWFS, but it looks like I need to add it > before posting the final patches. This patch appears to force write-through > of only dirty inodes, and allow reads to continue from cache. Is t

[testcase] fsync/O_SYNC simple test cases

2000-11-22 Thread Stephen C. Tweedie
Hi, The code below may be useful for doing simple testing of the O_SYNC and f[data]sync code in the kernel. It times various combinations of updates-in-place and appends under various synchronisation mechanisms, making it possible to see clearly whether fdatasync is skipping inode updates for up

[patch] O_SYNC patch 3/3, add inode dirty buffer list support to ext2

2000-11-22 Thread Stephen C. Tweedie
Hi, This final part of the O_SYNC patches adds calls to ext2, and to generic_commit_write, to record dirty buffers against the owning inode. It also removes most of fs/ext2/fsync.c, which now simply calls the generic sync code. --Stephen 2.4.0test11.02.ext2-osync.diff : --- linux-2.4.0-test1

[patch] O_SYNC patch 2/3, add per-inode dirty buffer lists

2000-11-22 Thread Stephen C. Tweedie
Hi, This is the second part of my old O_SYNC diffs patched up for 2.4.0-test11. It adds support for per-inode dirty buffer lists. In 2.4, we are now generating dirty buffers on a per-page basis for every write. For large O_SYNC writes (often databases use around 128K per write), we obviously d

[patch] O_SYNC patch 1/3: Fix fdatasync

2000-11-22 Thread Stephen C. Tweedie
Hi, This is the first patch out of 3 to fix O_SYNC and fdatasync for 2.4.0-test11. The patch below fixes fdatasync (at least for ext2) so that it does not flush the inode to disk for purely timestamp updates. It splits I_DIRTY into two bits, one bit (I_DIRTY_DATASYNC) which is set only for dirt

Re: ext3 vs. JFS file locations...

2000-11-06 Thread Stephen C. Tweedie
Hi, On Sat, Nov 04, 2000 at 09:53:41PM -0500, Albert D. Cahalan wrote: > > The journalling layer for ext3 is not a filesystem by itself. > It is generic journalling code. So, even if IBM did not have > any jfs code, the name would be wrong. Indeed, and the jfs layer will be renamed "jbd" at som

Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6

2000-11-01 Thread Stephen C. Tweedie
Hi, On Mon, Oct 30, 2000 at 01:56:07PM -0500, Jeff Garzik wrote: > > Seen it, re-read my question... > > I keep seeing "audio drivers' mmap" used a specific example of a place > that would benefit from kiobufs. The current via audio mmap looks quite > a bit like mmap_kiobuf and its support cod

Re: Quota fixes and a few questions

2000-10-27 Thread Stephen C. Tweedie
Hi, On Fri, Oct 27, 2000 at 11:31:59AM +0200, Juri Haberland wrote: > > > Hi Stephen, > > unfortunately 0.0.3b has the same problem. I tried it with a stock > 2.2.17 kernel + NFS patches + ext3-0.0.3b and the quota rpm you > included. Extracting two larger tar.gz files hits the deadlock reliabl

Re: Quota mods needed for journaled quota

2000-10-26 Thread Stephen C. Tweedie
Hi, On Thu, Oct 26, 2000 at 12:53:00PM -0400, Nathan Scott wrote: > > The addition of an "init_quota" method to the super_operations struct, > > with quota_on calling this and defaulting to installing the default > > quota_ops if the method is NULL, ought to be sufficient to let ext3 > > get quo

Quota mods needed for journaled quota

2000-10-25 Thread Stephen C. Tweedie
Hi, There are a few problems in the Linux quota code which make it impossible to perform quota updates transactionally when using a journaled filesystem. Basically we have the following problems: * The underlying filesystem does not know which files are the quota files, so cannot tell when

Re: Quota fixes and a few questions

2000-10-24 Thread Stephen C. Tweedie
Hi, On Fri, Oct 20, 2000 at 05:02:28PM +0200, Juri Haberland wrote: > As I wrote in my original mail I used 0.0.2f. > Is there a version called 0.0.3 yet and if so where can I find it? In > ftp.uk.linux.org (which is currently not reachable as well as > vger.kernel.org) I found only 0.0.2f. I mu

Re: Quota fixes and a few questions

2000-10-20 Thread Stephen C. Tweedie
Hi, On Thu, Oct 19, 2000 at 07:03:54PM +0200, Jan Kara wrote: > > > I stumbled into another problem: > > When using ext3 with quotas the kjournald process stops responding and > > stays in DW state when the filesystem gets under heavy load. It is easy > > to reproduce: > > Just extract two or th

Re: Quota fixes and a few questions

2000-10-06 Thread Stephen C. Tweedie
Hi Jan, On Wed, Sep 27, 2000 at 02:56:20PM +0200, Jan Kara wrote: > > So I've been thinking about fixes in quota (and also writing some parts). While we're at it, I've attached a patch which I was sent which simply teaches quota about ext3 as a valid fs type in fstab. It appears to work fine

Re: Soft-Updates for Linux ?

2000-10-03 Thread Stephen C. Tweedie
Hi, On Mon, Oct 02, 2000 at 03:13:07AM +0200, Daniel Phillips wrote: > What I've seen proposed is a mechanism where the VM can say 'flush this > page' to a filesystem and the filesystem can then go ahead and do what > it wants, including flushing the page, flushing some other page, or not > doin
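The proposal Daniel Phillips describes, where the VM offers the filesystem a target page and the filesystem decides what (if anything) to flush, can be sketched as below. All class and method names here are invented; this only illustrates the contract, not any actual kernel interface.

```python
# Sketch of the "flush this page" hint: the VM suggests a page, but the
# filesystem is free to flush that page, flush some other page instead
# (e.g. to keep writeback ordered), or decline entirely.
class Filesystem:
    def __init__(self):
        self.dirty = []          # pages queued for writeback, oldest first

    def vm_flush_hint(self, page):
        if not self.dirty:
            return None          # nothing to flush; VM must look elsewhere
        # Ignore the VM's suggestion and write the oldest dirty page,
        # keeping writeback sequential.
        victim = self.dirty.pop(0)
        return victim

fs = Filesystem()
fs.dirty = ["A", "B", "C"]
assert fs.vm_flush_hint("B") == "A"   # fs chose its own ordering
assert fs.dirty == ["B", "C"]
```

The design point is that memory pressure becomes a hint rather than a command, so the filesystem can preserve its own write ordering constraints.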

Re: Can ext3 or ReiserFS w/ journalling be made on /dev/loop?

2000-10-03 Thread Stephen C. Tweedie
Hi, On Thu, Sep 28, 2000 at 07:59:21PM +, Marc Mutz wrote: > > I was asked a question lately that I was unable to answer: Assume you > want to make a (encrypted, but that's not the issue here) filesystem on > a loopback block device (/dev/loop*). Can this be a journalling one? In > other wor

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Tue, Sep 26, 2000 at 11:02:48AM -0600, Erik Andersen wrote: > Another approach would be to let user space turn off overcommit. No. Overcommit only applies to pageable memory. Beancounter is really needed for non-pageable resources such as page tables and mlock()ed pages. Cheers, St

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Tue, Sep 26, 2000 at 09:17:44AM -0600, [EMAIL PROTECTED] wrote: > Operating systems cannot make more memory appear by magic. > The question is really about the best strategy for dealing with low memory. In my > opinion, the OS should not try to out-think physical limitations. Instead, the

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 03:12:50PM -0600, [EMAIL PROTECTED] wrote: > > > > > > I'm not too sure of what you have in mind, but if it is > > > "process creates vast virtual space to generate many page table > > > entries -- using mmap" > > > the answer is, virtual address space quot

Re: the new VMt

2000-09-26 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 03:07:44PM -0600, [EMAIL PROTECTED] wrote: > On Mon, Sep 25, 2000 at 09:46:35PM +0100, Alan Cox wrote: > > > I'm not too sure of what you have in mind, but if it is > > > "process creates vast virtual space to generate many page table > > > entries -- using

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 02:04:19PM -0600, [EMAIL PROTECTED] wrote: > > Right, but if the alternative is spurious ENOMEM when we can satisfy > > An ENOMEM is not spurious if there is not enough memory. UNIX does not ask the > OS to do impossible tricks. Yes, but the ENOMEM _is_ spurious if

Re: [patch] vmfixes-2.4.0-test9-B2 - fixing deadlocks

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 09:32:42PM +0200, Andrea Arcangeli wrote: > Having shrink_mmap that browse the mapped page cache is useless > as having shrink_mmap browsing kernel memory and anonymous pages > as it does in 2.2.x as far I can tell. It's an algorithm > complexity problem and it will w

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 12:34:56PM -0600, [EMAIL PROTECTED] wrote: > > > Process 1,2 and 3 all start allocating 20 pages > > > now 57 pages are locked up in non-swapable kernel space and the system >deadlocks OOM. > > > > Or go the beancounter route: process 1 asks "can I pin 20 pages"
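The beancounter idea quoted above ("can I pin 20 pages?") is an up-front reservation protocol: a pin either succeeds in full or fails cleanly, instead of the system deadlocking once the pages are already locked down. Here is a toy model; the class name and the 57-page figure from the quoted scenario are illustrative only.

```python
# Toy beancounter: processes reserve pinned (non-pageable) pages before
# locking them down, so the third requester is refused rather than
# contributing to an OOM deadlock.
class BeanCounter:
    def __init__(self, total_pages):
        self.free = total_pages

    def try_pin(self, n):
        # "Can I pin n pages?" - reserve atomically or refuse outright.
        if n > self.free:
            return False
        self.free -= n
        return True

    def unpin(self, n):
        self.free += n

bc = BeanCounter(total_pages=57)
assert bc.try_pin(20)          # process 1 succeeds
assert bc.try_pin(20)          # process 2 succeeds
assert not bc.try_pin(20)      # process 3 is refused, not deadlocked
bc.unpin(20)
assert bc.try_pin(20)          # and can retry once pages are released
```

This is exactly why beancounter targets non-pageable resources: pinned pages cannot be reclaimed later, so admission control has to happen before the pin, not after.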

Re: the new VMt

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 08:09:31PM +0100, Alan Cox wrote: > > > Indeed. But we wont fail the kmalloc with a NULL return > > > > Isn't that the preferred behaviour, though? If we are completely out > > of VM on a no-swap machine, we should be killing one of the existing > > processes rather
