Re: another ext3 question

2000-08-13 Thread Stephen C. Tweedie
Hi, Sorry for the delay, I've been on holiday for a couple of weeks. On Thu, Jul 27, 2000 at 07:36:34PM -0400, Jeremy Hansen wrote: > > ok ... to clarify ... ext3 _guarantees_ consistent file system metadata > or empirically, it tends to be robust about maintaining consistent > file system meta

Re: another ext3 question

2000-07-27 Thread Stephen C. Tweedie
Hi, On Thu, Jul 27, 2000 at 01:41:54PM -0400, Jeremy Hansen wrote: > > We're really itching to use ext3 in a production environment. Can you > give any clues on how things are going? The ext3-0.0.2f appears to be rock solid. Andreas has got prototyped code for e2fsck log replay, and I've got

Re: another ext3 question

2000-07-27 Thread Stephen C. Tweedie
Hi, On Fri, Jul 21, 2000 at 11:54:20PM -0600, Andreas Dilger wrote: > Note that you should not make the journals so large that they are a > major fraction of your RAM, as you will not gain anything by this. > A few megabytes is fine, 1024 disk blocks is the minimum. Yep. The main drawbacks to

Re: Tailmerging for Ext2

2000-07-26 Thread Stephen C. Tweedie
Hi, On Wed, Jul 26, 2000 at 03:19:46PM -0400, Alexander Viro wrote: > Erm? Consider that: huge lseek() + write past the end of file. Woops - got > to unmerge the tail (it's an internal block now) and we've got no > knowledge of IO going on the page. Again, IO may be asynchronous - no > protectio

Re: Tailmerging for Ext2

2000-07-26 Thread Stephen C. Tweedie
Hi, On Wed, Jul 26, 2000 at 02:41:44PM -0400, Alexander Viro wrote: > > For tail writes, I'd imagine we would just end up using the page cache > > as a virtual cache as NFS uses it, and doing plain copy into the > > buffer cache pages. > > Ouch. I _really_ don't like it - we end up with special

Re: Tailmerging for Ext2

2000-07-26 Thread Stephen C. Tweedie
Hi, On Wed, Jul 26, 2000 at 02:56:01PM -0400, Alexander Viro wrote: > > Not. Data normally is in page. Buffer_heads are not included into buffer > cache. They are referred from the struct page and their ->b_data just > points to appropriate pieces of page. You can not get them via bread(). > At a

Re: Tailmerging for Ext2

2000-07-26 Thread Stephen C. Tweedie
Hi, On Wed, Jul 26, 2000 at 02:05:11PM -0400, Alexander Viro wrote: > > Here is one more for you: > Suppose we grow the last fragment/tail/whatever. Do you copy the > data out of that shared block? If so, how do you update buffer_heads in > pages that cover the relocated data? (Same goes f

Re: question about sard and disk profiling patches

2000-07-25 Thread Stephen C. Tweedie
Hi, On Mon, Jul 24, 2000 at 04:34:11PM -0400, Jeremy Hansen wrote: > I build some customized kernel rpm's for our inhouse distro and I've > incorporated the profiling patches to use the sard utility. I'm just > curious if there is any downside to using this patch for any reason and if > there a

Re: PATCH: WIP super lock contention mitigation in 2.4

2000-07-14 Thread Stephen C. Tweedie
e corresponding update to the group + * descriptors. + * Stephen C. Tweedie ([EMAIL PROTECTED]), 1999 + * */ #include @@ -16,7 +23,6 @@ #include #include - /* * balloc.c contains the blocks allocation and deallocation routines */ @@ -70,42 +76,33 @@ } /* - * Re

Re: PATCH: Trying to get back IO performance (WIP)

2000-07-03 Thread Stephen C. Tweedie
MAIL PROTECTED]; Mon, 1 May 2000 18:18:14 +0100 Resent-From: "Stephen C. Tweedie" <[EMAIL PROTECTED]> Resent-Message-ID: <[EMAIL PROTECTED]> Resent-Date: Mon, 1 May 2000 18:18:14 +0100 (BST) Resent-To: [EMAIL PROTECTED] X-Authentication-Warning: worf.scot.redhat.com: sct set s

Re: O_SYNC patches for 2.4.0-test1-ac11

2000-06-09 Thread Stephen C. Tweedie
Hi, On Fri, Jun 09, 2000 at 02:51:18PM -0700, Ulrich Drepper wrote: > > Have you thought about O_RSYNC and whether it is possible/useful to > support it separately? It would be possible and useful, but it's entirely separate from the write path and probably doesn't make sense until we've got O_

Re: O_SYNC patches for 2.4.0-test1-ac11

2000-06-09 Thread Stephen C. Tweedie
Hi, On Fri, Jun 09, 2000 at 02:53:19PM -0700, Ulrich Drepper wrote: > > > If I don't preallocate the file, then even fdatasync is slow, [...] > > This might be a good argument to implement posix_fallocate() in the > kernel. No. If we do posix_fallocate(), then there are only two choices: we e

O_SYNC patches for 2.4.0-test1-ac11

2000-06-09 Thread Stephen C. Tweedie
Hi all, The following patch fully implements O_SYNC, fsync and fdatasync, at least for ext2. The infrastructure it includes should make it trivial for any other filesystem to do likewise. The basic changes are: Include a per-inode list of dirty buffers Pass a "datasync" parame
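
A minimal user-space sketch of the two calls the patch above concerns, added for illustration and not taken from the original message. Under POSIX, fsync() flushes a file's data and all of its metadata, while fdatasync() may skip metadata that is not needed to read the data back (timestamps, for example). The helper name durable_write and its data_only parameter are hypothetical.

```python
import os
import tempfile

# os.fdatasync is missing on some platforms; fall back to fsync there.
fdatasync = getattr(os, "fdatasync", os.fsync)

def durable_write(path, data, data_only=True):
    """Write data to path and return only once it has been flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(data):          # os.write may be partial
            written += os.write(fd, data[written:])
        if data_only:
            fdatasync(fd)                   # data plus essential metadata
        else:
            os.fsync(fd)                    # data plus all metadata
    finally:
        os.close(fd)
    return written
```

Calling durable_write(path, payload, data_only=False) takes the fsync() path instead. Which of the two is faster on a given kernel and filesystem is exactly what the patch is about, so the sketch makes no claim either way.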

Re: [prepatch] Directory Notification

2000-05-22 Thread Stephen C. Tweedie
Hi, On Sun, May 21, 2000 at 04:27:29PM +, Ton Hospel wrote: > > > > It delivers a realtime signal to tasks which have requested it. The task > > can then call fstat to find out what changed. > > > A poll() notification mechanism should be at least as useful for e.g. > GUI's who generally p

Re: ext2 feature request

2000-05-01 Thread Stephen C. Tweedie
Hi, On Fri, Apr 28, 2000 at 08:09:21PM +1000, Andrew Clausen wrote: > > Is it possible to have a gap between the super-block and the > start of group 0's metadata? Yes. It's called the "s_first_data_block" field in the ext2 superblock, and lets you offset the data zone from the start of the fi
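
For readers who want to inspect the field mentioned above, here is a small sketch that reads s_first_data_block out of a raw ext2 image held in memory. It is not from the original message. It assumes the standard on-disk layout: superblock at byte offset 1024, s_first_data_block at offset 20 within it, s_log_block_size at offset 24, and the 0xEF53 magic at offset 56, all little-endian. The function name read_first_data_block is hypothetical.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the ext2 superblock starts 1 KiB into the device
EXT2_MAGIC = 0xEF53

def read_first_data_block(image):
    """Return (s_first_data_block, block size in bytes) from raw image bytes."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 58:
        raise ValueError("image too short to contain an ext2 superblock")
    first_data_block, log_block_size = struct.unpack_from("<II", sb, 20)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("bad magic: not an ext2 superblock")
    return first_data_block, 1024 << log_block_size
```

To try it on a real filesystem image, read the first 2048 bytes of the image file in binary mode and pass them in.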

Re: Jussi's Was: info point on linux hdr

2000-04-26 Thread Stephen C. Tweedie
Hi, On Fri, Apr 21, 2000 at 08:37:37PM +0200, Benno Senoner wrote: > that means more than twice as fast. > > But I still have my doubts that all blocks gets really allocated: > for example I can do > ftruncate(filesize) > lseek(filesize-4,SEEK_SET) > write(value,4); > > and get an empty file of
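
The ftruncate/lseek/write sequence quoted above can be reproduced with the sketch below, which is an illustration and not code from the thread. It compares the file's apparent size with the space actually allocated, using the POSIX convention that st_blocks counts 512-byte units. Whether the file really ends up sparse depends on the filesystem, so the sketch only reports the numbers. The function name make_sparse_file is hypothetical.

```python
import os
import tempfile

def make_sparse_file(path, filesize, value=b"\x00\x00\x00\x01"):
    """Set the file length, then write a few bytes at the very end."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, filesize)                        # sets the length only
        os.lseek(fd, filesize - len(value), os.SEEK_SET)  # seek near the end
        os.write(fd, value)                               # touches the last block
        st = os.fstat(fd)
    finally:
        os.close(fd)
    # Apparent size versus bytes actually allocated on disk.
    return st.st_size, st.st_blocks * 512
```

On a filesystem that supports holes, the second number is typically far smaller than the first, which is the doubt Benno raises about whether every block is really allocated.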

Re: [linux-audio-dev] Re: File writes with O_SYNC slow

2000-04-20 Thread Stephen C. Tweedie
Hi, On Thu, Apr 20, 2000 at 10:57:15AM +0200, Benno Senoner wrote: > > I tried all combinations using my hdtest.c which I posted yesterday. > > I tried O_SYNC and even O_DSYNC on the SGI (Origin 2k), > (D_SYNC syncs only data blocks but not metadata blocks) Not quite. O_DSYNC syncs metadata t

Re: File writes with O_SYNC slow

2000-04-19 Thread Stephen C. Tweedie
Hi, On Wed, Apr 19, 2000 at 11:55:04AM -0400, Karl JH Millar wrote: > > I've noticed that file writes with O_SYNC are very much slower than they should > be. How fast do you think they should be? If you are doing small appends, then O_SYNC is _guaranteed_ to be dead slow. Every write involves
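
A rough way to observe the effect described above is to time many small appends twice: once on a descriptor opened with O_SYNC, so every write() waits for stable storage, and once buffered with a single fsync() at the end. This sketch is an illustration and not from the original message. The function name append_records is hypothetical, and the timings depend entirely on the disk and filesystem, so no expected numbers are given.

```python
import os
import tempfile
import time

# O_SYNC is not defined on every platform; 0 makes the flag a no-op there.
O_SYNC = getattr(os, "O_SYNC", 0)

def append_records(path, records, sync_each_write):
    """Append records to path and return the elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    if sync_each_write:
        flags |= O_SYNC                 # each write() is synchronous
    fd = os.open(path, flags, 0o600)
    start = time.perf_counter()
    try:
        for rec in records:
            os.write(fd, rec)
        if not sync_each_write:
            os.fsync(fd)                # one flush for the whole batch
    finally:
        os.close(fd)
    return time.perf_counter() - start
```

Running it with a few hundred records of a few dozen bytes each, once per mode, is usually enough to see the gap.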

Re: O_DIRECT architecture (was Re: info point on linux hdr)

2000-04-18 Thread Stephen C. Tweedie
Hi, On Tue, Apr 18, 2000 at 01:17:52PM -0500, Steve Lord wrote: > So I guess the question here is how do you plan on keeping track of the > origin of the pages? You don't have to. > Which ones were originally part of the kernel cache > and thus need copying up to user space? If the caller req

Re: O_DIRECT architecture (was Re: info point on linux hdr)

2000-04-18 Thread Stephen C. Tweedie
Hi, On Tue, Apr 18, 2000 at 07:56:04AM -0500, Steve Lord wrote: > > XFS is using the pagebuf code we wrote (or I should say are writing - it > needs a lot of work yet). This uses kiobufs to represent data in a set of > pages. So, we have the infrastructure to take a kiobuf and read or write > it

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-18 Thread Stephen C. Tweedie
Hi, On Tue, Apr 18, 2000 at 10:57:25AM -0400, Paul Barton-Davis wrote: > >> 1) pre-allocation takes a *long* time. Allocating 24 203MB files on a > >>clean ext2 partition of 18GB takes many, many minutes, for example. > >>Presumably, the same overhead is being incurred when block > >>

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-18 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 07:10:43PM +0200, Martin Schenk wrote: > If you are interested in a more efficient fsync (and a real fdatasync), > I have some patches that provide better performance for very large > files (where fsync is mostly busy scanning the page cache for changes), > and a fdat

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-18 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 01:05:12PM -0400, Paul Barton-Davis wrote: > > Acknowledging your much greater wisdom in this area than me, I don't > understand the above given that, in my experience: > > 1) pre-allocation takes a *long* time. Allocating 24 203MB files on a >clean ext2 partition

O_DIRECT architecture (was Re: info point on linux hdr)

2000-04-18 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 05:58:48PM -0500, Steve Lord wrote: > > O_DIRECT on Linux XFS is still a work in progress, we only have > direct reads so far. A very basic implementation was made available > this weekend. Care to elaborate on how you are doing O_DIRECT? It's something I've been th

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-17 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 01:45:15PM -0400, Paul Barton-Davis wrote: > >> 2) Why am I not having any of these problems ? Unlike Benno's code, I > >>Seagate 4.5GB Cheetah U2W 10K rpm IBM 9GB UltraStar U2W 10K rpm > >>Quantum 4.5GB Viking U2W 7.5K rpm 3 x IBM 18G

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-17 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 07:21:31PM +0200, Benno Senoner wrote: > > The only way you can get much better is to do non-writeback IO > > asynchronously. Use O_SYNC for writes, and submit the IOs from multiple > > threads, to let the kernel schedule the multiple IOs. Use large block > > sizes
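
The quoted advice (O_SYNC writes, submitted from several threads, in large blocks) might look like the following from user space. This is a sketch for illustration and not code from the thread. It assumes every block has the same length so that offsets can be computed from the block index, and it uses os.pwrite so the threads never share a file position. The names write_block and parallel_sync_write and the worker count are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# O_SYNC is not defined on every platform; 0 makes the flag a no-op there.
O_SYNC = getattr(os, "O_SYNC", 0)

def write_block(fd, index, payload):
    """Write one block at its own offset; returns the bytes written."""
    return os.pwrite(fd, payload, index * len(payload))

def parallel_sync_write(path, blocks, workers=4):
    """Submit equal-sized blocks from several threads on an O_SYNC fd."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_SYNC, 0o600)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(write_block, fd, i, b)
                       for i, b in enumerate(blocks)]
            return sum(f.result() for f in futures)
    finally:
        os.close(fd)
```

Several synchronous writes are outstanding at once, which leaves the ordering to the kernel. Whether that beats a single writer on particular hardware is the question being argued in the thread.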

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-17 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 01:10:41PM -0400, Paul Barton-Davis wrote: > > I had a question about this. Doug Gilbert told me that he heard using > multiple threads to schedule I/O requests could be a win, and that was > also my intuition. Other people have claimed that it's often not a win, > and

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-17 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 01:05:12PM -0400, Paul Barton-Davis wrote: > > 2) Why am I not having any of these problems ? Unlike Benno's code, I >Seagate 4.5GB Cheetah U2W 10K rpm IBM 9GB UltraStar U2W 10K rpm >Quantum 4.5GB Viking U2W 7.5K rpm 3 x IBM 18GB UltraStar Ahh

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-17 Thread Stephen C. Tweedie
Hi, On Mon, Apr 17, 2000 at 04:50:05PM +0200, Benno Senoner wrote: > > Stephen, I tried all possible combinations , in my hdrbench code. ... > I tried: > -fsync() on all write descriptors at regular intervals ranging from 1sec to > 10sec > - fdatasync() on all write descriptors , same as above

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-17 Thread Stephen C. Tweedie
Hi, On Fri, Apr 14, 2000 at 06:15:09PM +1000, Andrew Clausen wrote: > > Any comments? Yes! > Date: Fri, 14 Apr 2000 08:10:10 -0400 > Message-Id: <[EMAIL PROTECTED]> > From: Paul Barton-Davis <[EMAIL PROTECTED]> > To: [EMAIL PROTECTED] > Subject: [linux-audio-dev] info point on linux hdr > Send

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-15 Thread Stephen C. Tweedie
Hi, On Sat, Apr 15, 2000 at 06:50:48PM +0200, Benno Senoner wrote: > > Anyway does anyone know if implementing O_DIRECT would be a big amount > of work in kernel 2.3.x ? I'll be doing it, and it should be fairly straightforward. There are one or two infrastructure changes required, however, so

Re: [Fwd: [linux-audio-dev] info point on linux hdr]

2000-04-15 Thread Stephen C. Tweedie
Hi, On Sat, Apr 15, 2000 at 12:32:50PM +0200, Benno Senoner wrote: > On Sat, 15 Apr 2000, Paul Barton-Davis wrote: > > >PS: Does anyone know how to make a RAW I/O device on a spare disk partition, > > >and then put an ext2 over it (running the whole partition in RAWIO mode) ? > > >Is that possib

Re: Does your change make find faster by changing where it is stored or where it is returned?

2000-03-07 Thread Stephen C. Tweedie
Hi, On Tue, 07 Mar 2000 00:28:51 +0300, Hans Reiser <[EMAIL PROTECTED]> said: > Ok, you may have been wondering where my questions were going, now I > can explain. > You are proposing changing the VFS interface for an ext2 specific > optimization. But it's not! The proposed dentry change is a

Re: Ext2 / VFS projects

2000-02-11 Thread Stephen C. Tweedie
Hi, On Thu, 10 Feb 2000 10:27:29 -0500 (EST), Alexander Viro <[EMAIL PROTECTED]> said: > Correct, but that's going to make design much more complex - you really > don't want to do it for anything other than sub-page stuff (probably even > sub-sector). Which leads to 3 levels - allocation block/I

Re: Ext2 / VFS projects

2000-02-10 Thread Stephen C. Tweedie
Hi, On Wed, 09 Feb 2000 11:31:03 -0500, Matthew Wilcox <[EMAIL PROTECTED]> said: > fine-grained locking > [remove test_and_set_bit()] The critical one here is the superblock lock. --Stephen

Re: Ext2 / VFS projects

2000-02-10 Thread Stephen C. Tweedie
Hi, On Wed, 9 Feb 2000 14:30:13 -0500 (EST), Alexander Viro <[EMAIL PROTECTED]> said: > On Wed, 9 Feb 2000 [EMAIL PROTECTED] wrote: >> with 2k blocks and 128 byte fragments, we get to really reduce wasted >> space below any other system i've ever experienced. > Erm... I'm afraid that you are m

Re: EXT3 && Linux 2.3.39

2000-01-24 Thread Stephen C. Tweedie
Hi, On Thu, 20 Jan 2000 12:14:14 -0800 (PST), Black Dagger <[EMAIL PROTECTED]> said: > Is there a version of the Ext3 Patch that works with the 2.3.x > series of kernels? Preferably 2.3.39... I was running it on 2.2.13 > without a problem... but need some of the drivers from under > 2.3.39... or

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-16 Thread Stephen C. Tweedie
Hi, Chris Wedgwood writes: > > This may affect data which was not being written at the time of the > > crash. Only raid 5 is affected. > > Long term -- if you journal to something outside the RAID5 array (ie. > to raid-1 protected log disks) then you should be safe against this > type of

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-16 Thread Stephen C. Tweedie
Hi, Benno Senoner writes: > wow, really good idea to journal to a RAID1 array ! > > do you think it is possible to do the following: > > - N disks holding a soft RAID5 array. > - reserve a small partition on at least 2 disks of the array to hold a RAID1 > array. > - keep the journal o

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-13 Thread Stephen C. Tweedie
Hi, On Wed, 12 Jan 2000 22:09:35 +0100, Benno Senoner <[EMAIL PROTECTED]> said: > Sorry for my ignorance I got a little confused by this post: > Ingo said we are 100% journal-safe, you said the contrary, Raid resync is safe in the presence of journaling. Journaling is not safe in the presence

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-12 Thread Stephen C. Tweedie
Hi, On Wed, 12 Jan 2000 07:21:17 -0500 (EST), Ingo Molnar <[EMAIL PROTECTED]> said: > On Wed, 12 Jan 2000, Gadi Oxman wrote: >> As far as I know, we took care not to poke into the buffer cache to >> find clean buffers -- in raid5.c, the only code which does a find_buffer() >> is: > yep, this i

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-12 Thread Stephen C. Tweedie
Hi, On Tue, 11 Jan 2000 16:41:55 -0600, "Mark Ferrell" <[EMAIL PROTECTED]> said: > Perhaps I am confused. How is it that a power outage while attached > to the UPS becomes "unpredictable"? One of the most common ways to get an outage while on a UPS is somebody tripping over, or otherwise r

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power fai

2000-01-12 Thread Stephen C. Tweedie
Hi, On Wed, 12 Jan 2000 11:28:28 MET-1, "Petr Vandrovec" <[EMAIL PROTECTED]> said: > I did not follow this thread (on -fsdevel) too close (and I never > looked into RAID code, so I should shut up), but... can you > confirm that after buffer with data is finally marked dirty, parity > is recomp

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-12 Thread Stephen C. Tweedie
Hi, On Wed, 12 Jan 2000 00:12:55 +0200 (IST), Gadi Oxman <[EMAIL PROTECTED]> said: > Stephen, I'm afraid that there are some misconceptions about the > RAID-5 code. I don't think so --- I've been through this with Ingo --- but I appreciate your feedback since I'm getting inconsistent advice her

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-11 Thread Stephen C. Tweedie
Hi, On Tue, 11 Jan 2000 15:03:03 +0100, mauelsha <[EMAIL PROTECTED]> said: >> THIS IS EXPECTED. RAID-5 isn't proof against multiple failures, and the >> only way you can get bitten by this failure mode is to have a system >> failure and a disk failure at the same time. > To try to avoid this k

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-11 Thread Stephen C. Tweedie
Hi, On Tue, 11 Jan 2000 20:17:22 +0100, Benno Senoner <[EMAIL PROTECTED]> said: > Assume all RAID code - FS interaction problems get fixed, since a > linux soft-RAID5 box has no battery backup, does this mean that we > will lose data ONLY if there is a power failure AND successive disk > failur

[FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-11 Thread Stephen C. Tweedie
Hi, This is a FAQ: I've answered it several times, but in different places, so here's a definitive answer which will be my last one: future questions will be directed to the list archives. :-) On Tue, 11 Jan 2000 16:20:35 +0100, Benno Senoner <[EMAIL PROTECTED]> said: >> then raid can miscalcul

Re: O_EXCL and network FS's

2000-01-07 Thread Stephen C. Tweedie
Hi, On Wed, 5 Jan 2000 23:44:07 -0700 (MST), "Peter J. Braam" <[EMAIL PROTECTED]> said: > So I'm now requesting to add a "d_put" dentry method, that a file system > can optionally declare. It would always be invoked by the dput method, > perhaps indicating also when the dentry is no longer on a

Re: (reiserfs) Re: RFC: Re: journal ports for 2.3?

2000-01-07 Thread Stephen C. Tweedie
Hi, On Fri, 07 Jan 2000 00:32:48 +0300, Hans Reiser <[EMAIL PROTECTED]> said: > Andrea Arcangeli wrote: >> BTW, I thought Hans was talking about places that can't sleep (because of >> some not schedule-aware lock) when he said "place that cannot call >> balance_dirty()". > You were correct. I

Re: (reiserfs) Re: RFC: Re: journal ports for 2.3? (resending because my

2000-01-07 Thread Stephen C. Tweedie
Hi, On Thu, 6 Jan 2000 20:25:38 -0500 (EST), "Albert D. Cahalan" <[EMAIL PROTECTED]> said: > AIX has such an API already. It is good to clone if you can. The AIX API is much more than a simple small-operation atomic transaction API, isn't it? The filesystem transactions have many properties --

Re: (reiserfs) Re: RFC: Re: journal ports for 2.3?

2000-01-06 Thread Stephen C. Tweedie
Hi, On Thu, 23 Dec 1999 06:41:44 +0800, Tan Pong Heng <[EMAIL PROTECTED]> said: > I was thinking that, unless you want to have FS specific buffer/page > cache, there is always a gain for a unified cache for all fs. I think > the one piece of functionality missing from the 2.3 implementation > is

Re: (reiserfs) Re: RFC: Re: journal ports for 2.3?

2000-01-06 Thread Stephen C. Tweedie
Hi, On Thu, 23 Dec 1999 02:37:48 +0300, Hans Reiser <[EMAIL PROTECTED]> said: >> > I completely agree to change mark_buffer_dirty() to call balance_dirty() >> > before returning. > How can we use a mark_buffer_dirty that calls balance_dirty in a > place where we cannot call balance_dirty? It sh

Re: archive

1999-12-23 Thread Stephen C. Tweedie
Hi, On Wed, 22 Dec 1999 11:08:37 -0800, "sadri" <[EMAIL PROTECTED]> said: > Is there an archive of the emails posted in this list(linux-fsdevel)? > thanks Searching for "linux-fsdevel archive" on www.google.com found several. --Stephen

Re: RFC: Re: journal ports for 2.3?

1999-12-22 Thread Stephen C. Tweedie
Hi, On Tue, 21 Dec 1999 20:21:05 -0500 (EST), "Benjamin C.R. LaHaise" <[EMAIL PROTECTED]> said: > The buffer dirty lists are the wrong place to be dealing with this. We > need a lightweight, fast way of monitoring the system's dirty buffer/page > thresholds -- one that can be called for every w

Re: (reiserfs) Re: RFC: Re: journal ports for 2.3?

1999-12-21 Thread Stephen C. Tweedie
Hi, On Tue, 21 Dec 1999 14:57:29 +0100 (CET), Andrea Arcangeli <[EMAIL PROTECTED]> said: > So you are talking about replacing this line: > dirty = size_buffers_type[BUF_DIRTY] >> PAGE_SHIFT; > with: > dirty = (size_buffers_type[BUF_DIRTY]+size_buffers_type[BUF_PINNED]) >> >PAGE_SHIF

Re: (reiserfs) Re: RFC: Re: journal ports for 2.3?

1999-12-21 Thread Stephen C. Tweedie
Hi, On Tue, 21 Dec 1999 11:18:03 +0100 (CET), Andrea Arcangeli <[EMAIL PROTECTED]> said: > On Tue, 21 Dec 1999, Stephen C. Tweedie wrote: >> refile_buffer() checks in buffer.c. Ideally there should be a >> system-wide upper bound on dirty data: if each different fi

RFC: Re: journal ports for 2.3?

1999-12-20 Thread Stephen C. Tweedie
Hi, All comments welcome: this is a first draft outline of what I _think_ Linus is asking for from journaling for mainline kernels. On Wed, 15 Dec 1999 13:45:22 -0500, Chris Mason <[EMAIL PROTECTED]> said: > What is your current plan for porting ext3 into 2.3/2.4? Are you still > going to be b

Re: Oops with ext3 journaling

1999-12-08 Thread Stephen C. Tweedie
Hi, On Wed, 8 Dec 1999 17:28:49 -0500, "Theodore Y. Ts'o" <[EMAIL PROTECTED]> said: > Never fear, there will be an very easy way to switch back and forth > between ext2 and ext3. A single mount command, or at most a single > tune2fs command, should be all that it takes, no matter how the > jour

Re: Oops with ext3 journaling

1999-12-06 Thread Stephen C. Tweedie
Hi, On Sat, 4 Dec 1999 12:11:58 -0700, mike burrell <[EMAIL PROTECTED]> said: > couldn't you just make a new flag for the inode that journal.dat uses? i'm > guessing using S_IMMUTABLE will cause some problems, but something similar > to that? The immutable flag will work fine: journaling bypas

Re: Oops with ext3 journaling

1999-12-06 Thread Stephen C. Tweedie
Hi, On Sat, 4 Dec 1999 08:44:46 -0800 (PST), Brion Vibber <[EMAIL PROTECTED]> said: > Maybe at least stick a nice big warning in the docs along the lines of > "do not write to your journal file while mounted with journaling on, > you big dummy!" :) Not that I'd do so deliberately of course, but

Announce: ext3-0.0.2c

1999-11-08 Thread Stephen C. Tweedie
Hi, You can now find ext3-0.0.2c at ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/ext3-0.0.2c.tar.gz including full patches for 2.2.13 and for the standard Red Hat-6.1 kernels. Changes in this release --- in 0.0.2c: Lots of fixes to the way we set the filesystem's

Re: (reiserfs) Re: Raid resync changes buffer cache semantics --- not good for journaling!

1999-11-03 Thread Stephen C. Tweedie
Hi, On Wed, 3 Nov 1999 17:43:18 +0100 (MET), Ingo Molnar <[EMAIL PROTECTED]> said: > .. which is exactly what the RAID5 code was doing ever since. It _has_ to > do it to get 100% recovery anyway. This is one reason why access to caches > is so important to the RAID code. (we do not snapshot clea

Re: wierdisms w/ ext3.

1999-11-03 Thread Stephen C. Tweedie
Hi, On Tue, 2 Nov 1999 14:00:57 -0600, Timothy Ball <[EMAIL PROTECTED]> said: > I tried something like this: > --snip--snip--snip-- > /dev/hdb2 / ext3 defaults, journal= 1 1 > --snip--snip--snip-- > but that didn't work. No --- as the readme states, you need to use "jou

Re: (reiserfs) Re: Raid resync changes buffer cache semantics --- not good for journaling!

1999-11-03 Thread Stephen C. Tweedie
Hi, On Wed, 3 Nov 1999 10:30:36 +0100 (MET), Ingo Molnar <[EMAIL PROTECTED]> said: >> OK... but raid resync _will_ block forever as it currently stands. > {not forever, but until the transaction is committed. (it's not even > necessary for the RAID resync to wait for locked buffers, it could as

Re: Raid resync changes buffer cache semantics --- not good for journaling!

1999-11-02 Thread Stephen C. Tweedie
Hi, On Tue, 2 Nov 1999 14:12:00 +0100 (MET), Ingo Molnar <[EMAIL PROTECTED]> said: > yes but this means that the block was not cached. OK... but raid resync _will_ block forever as it currently stands. >> > 2.3 removes physical indexing of cached blocks, >> >> 2.2 never guaranteed that IO w

Re: Buffer and page cache

1999-11-02 Thread Stephen C. Tweedie
Hi, On Tue, 02 Nov 1999 08:15:36 -0700, [EMAIL PROTECTED] said: > I'd like these pages to age a little before handing them over to the > "inode disk", because the "write_one_page" function called by > generic_file_write would incur significant latency if the inode disk is > "real", ie. not simul

Re: Raid resync changes buffer cache semantics --- not good for journaling!

1999-11-02 Thread Stephen C. Tweedie
Hi, On Tue, 02 Nov 1999 08:43:01 -0700, [EMAIL PROTECTED] said: >> Fixing this in raid seems far, far preferable to fixing it in the >> filesystems. The filesystem should be allowed to use the buffer cache >> for metadata and should be able to assume that there is a way to prevent >> those buff

Re: Linux Buffer Cache Does Not Support Mirroring

1999-11-02 Thread Stephen C. Tweedie
Hi, On Mon, 01 Nov 1999 15:53:29 -0500, Jeff Garzik <[EMAIL PROTECTED]> said: >> XFS delays allocation of user data blocks when possible to >> make blocks more contiguous; holding them in the buffer cache. >> This allows XFS to make extents large without requiring the user >> to specify extent s

Re: wierdisms w/ ext3.

1999-11-02 Thread Stephen C. Tweedie
Hi, On Tue, 2 Nov 1999 03:10:10 -0600, Timothy Ball <[EMAIL PROTECTED]> said: > Here's the info from /var/log/dmesg. Could it be that my journal file > has a large inode number? And if you have more than one ext3 partition > can you have more than one journal file? How would you specify it... >

Re: wierdisms w/ ext3.

1999-11-02 Thread Stephen C. Tweedie
Hi, On Mon, 1 Nov 1999 15:03:54 -0600, Timothy Ball <[EMAIL PROTECTED]> said: > I did my best to try to follow what the README for ext3 said. I made a > journal file in /var/local/journal/journal.dat. It has an inode # of > 183669. > Then I did /sbin/lilo -R linux rw rootflags=journal=183669.

Re: Raid resync changes buffer cache semantics --- not good for journaling!

1999-11-02 Thread Stephen C. Tweedie
Hi, On Mon, 1 Nov 1999 13:04:23 -0500 (EST), Ingo Molnar <[EMAIL PROTECTED]> said: > On Mon, 1 Nov 1999, Stephen C. Tweedie wrote: >> No, that's completely inappropriate: locking the buffer indefinitely >> will simply cause jobs like dump() to block forever, for exa

Re: Linux Buffer Cache Does Not Support Mirroring

1999-11-01 Thread Stephen C. Tweedie
Hi, On Mon, 01 Nov 1999 15:58:33 -0600, [EMAIL PROTECTED] said: > I agree with this, it feels closer to the linux page cache, the > terminology in the XFS white paper is a little confusing here. > XFS on Irix caches file data in buffers, but not in the regular buffer > cache, they are cached of

Re: Raid resync changes buffer cache semantics --- not good for journaling!

1999-11-01 Thread Stephen C. Tweedie
Hi, On Fri, 29 Oct 1999 14:06:24 -0400 (EDT), Ingo Molnar <[EMAIL PROTECTED]> said: > On Fri, 29 Oct 1999, Stephen C. Tweedie wrote: >> Fixing this in raid seems far, far preferable to fixing it in the >> filesystems. The filesystem should be allowed to use the buffer

Re: [ext3-0.0.2b] no-go with block size 4k

1999-10-29 Thread Stephen C. Tweedie
Hi, On Thu, 28 Oct 1999 21:29:44 +0200, Marc Mutz <[EMAIL PROTECTED]> said: > Hi Stephen! > I just tried your journalling support with my old spare scsi disk > (240M). The things I tried were: > Oct 28 21:08:57 adam kernel: Journal length (768 blocks) too short. Your journal is too short. The
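
As a worked check of the error above, the sketch below compares a journal length against a minimum block count. It is an illustration, not code from ext3. The 1024-block minimum is the figure quoted by Andreas Dilger in a later thread in this archive and is assumed here to be the limit behind the message. The 4096-byte block size comes from the subject line, and the function name check_journal is hypothetical.

```python
MIN_JOURNAL_BLOCKS = 1024   # assumed minimum, as quoted elsewhere in this archive

def check_journal(blocks, block_size=4096):
    """Return (ok, journal size in bytes, minimum size in bytes)."""
    return (blocks >= MIN_JOURNAL_BLOCKS,
            blocks * block_size,
            MIN_JOURNAL_BLOCKS * block_size)

ok, size, minimum = check_journal(768)
print(ok, size, minimum)
```

With 4k blocks, a 768-block journal is 3 MiB, one MiB short of the 4 MiB that 1024 blocks would need.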

Raid resync changes buffer cache semantics --- not good for journaling!

1999-10-29 Thread Stephen C. Tweedie
Hi all, There seems to be a conflict between journaling filesystem requirements (both ext3 and reiserfs), and the current raid code when it comes to write ordering in the buffer cache. The current ext3 code adds debugging checks to ll_rw_block designed to detect any cases where blocks are being

Re: ext3 - filesystem is not clean after recovery

1999-10-26 Thread Stephen C. Tweedie
Hi, On Tue, 26 Oct 1999 14:56:50 +0200, [EMAIL PROTECTED] (Miklos Szeredi) said: > I will try to make more tests with a cleaner configuration... OK, thanks --- the more information you can provide, the better. A reliable reproducer for any problems would be best of all. --Stephen

Re: ext3 - filesystem is not clean after recovery

1999-10-26 Thread Stephen C. Tweedie
Hi, On Tue, 26 Oct 1999 10:19:13 +0200, [EMAIL PROTECTED] (Miklos Szeredi) said: > Hi, > Sorry, I forgot to say, that it was with 0.0.2b. Also I reproduced > this twice, so the second time, it _was_ a clean fs before converting > to ext3. Are you sure you applied _both_ 0.0.2a and 0.0.2b, not j

Re: ext3 - filesystem is not clean after recovery

1999-10-25 Thread Stephen C. Tweedie
Hi, On Mon, 25 Oct 1999 18:41:09 +0200, [EMAIL PROTECTED] (Miklos Szeredi) said: > 5) boot, then mount ext3 filesystem - it says: > JFS DEBUG: (recovery.c, 411): journal_recover: JFS: recovery, exit status 0, >recovered transactions 130 to 133 > 6) unmount the fs, and with debugfs turn off jo

Announce: ext3-0.0.2b available

1999-10-21 Thread Stephen C. Tweedie
Hi all, ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/ext3-0.0.2b.gz contains the latest bug-fix update to ext3-0.0.2. It should be applied incrementally on top of ext3-0.0.2 and the ext3-0.0.2a patch. This is a bug-fix release for one important bug: if a transaction calls ext3_new_inode to allo

Re: ext3-0.0.2a patch released

1999-10-20 Thread Stephen C. Tweedie
Hi, On Tue, 19 Oct 1999 09:50:59 -0400, Daniel Veillard <[EMAIL PROTECTED]> said: > The oops of the day : > Oct 19 05:42:50 fr kernel: Assertion failure in journal_get_write_access() at >transaction.c line 436: "handle->h_buffer_credits > 0" ... > Oct 19 05:42:50 fr kernel: Call Trace: [cpr

Re: [RFC] Per-inode metadata cache.

1999-10-19 Thread Stephen C. Tweedie
Hi, On 19 Oct 1999 00:44:38 -0500, [EMAIL PROTECTED] (Eric W. Biederman) said: > Meanwhile having the metadata in the page cache (where they would > have predictable offsets by file size) Doesn't help --- you still need to look up the physical block numbers in order to clear the allocation bit

Re: [RFC] Per-inode metadata cache.

1999-10-18 Thread Stephen C. Tweedie
Hi, On 18 Oct 1999 08:20:51 -0500, [EMAIL PROTECTED] (Eric W. Biederman) said: >> And I still can't see how you can find the stale buffer in a >> per-object queue as the object can be destroyed as well after the >> lowlevel truncate. > Yes but you can prevent the buffer from becoming a stale b

Re: [RFC] Per-inode metadata cache.

1999-10-18 Thread Stephen C. Tweedie
Hi, On Mon, 18 Oct 1999 13:26:45 -0400 (EDT), Alexander Viro <[EMAIL PROTECTED]> said: >> You can't even know which is the inode Y that is using a block X without >> reading all the inode metadata while the block X still belongs to the >> inode Y (before the truncate). > WTF would we _need_ to

Re: [RFC] Per-inode metadata cache.

1999-10-18 Thread Stephen C. Tweedie
Hi, On Mon, 18 Oct 1999 14:30:10 +0200 (CEST), Andrea Arcangeli <[EMAIL PROTECTED]> said: > I can't see these bigmem issues. The buffer and page-cache memory is not > in bigmem anyway. And you can use bigmem _wherever_ you want as far as you > remember to fix all the involved code to kmap before

Re: [RFC] Per-inode metadata cache.

1999-10-18 Thread Stephen C. Tweedie
Hi, On Sat, 16 Oct 1999 01:59:38 -0400 (EDT), Alexander Viro <[EMAIL PROTECTED]> said: a) to d), fine. > e) we might get out with just a dirty blocks lists, but I think > that we can do better than that: keep per-inode cache for metadata. It > is going to be separate from the data pagecac

ext3-0.0.2a patch released

1999-10-18 Thread Stephen C. Tweedie
Hi, Available at ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/ext3-0.0.2a.diff.gz is a patch against ext3-0.0.2.tar.gz. This patch fixes a couple of problems in truncate(), especially deletion of large files where the delete has to be split over multiple transactions. Truncate of busy

Re: Announce: ext2+journaling, release 0.0.2

1999-10-18 Thread Stephen C. Tweedie
Hi, On Fri, 15 Oct 1999 14:04:48 +, Peter Rival <[EMAIL PROTECTED]> said: > Well, I think I just uncovered the first, umm, detail for this > release ;) Got the following while trying to start an AIM VII fserver > run on an AlphaPC164. Disks are all 2GB narrow SCSI, hanging off of a > si

Re: [patch] [possible race in ext2] Re: how to write get_block?

1999-10-15 Thread Stephen C. Tweedie
Hi, On Thu, 14 Oct 1999 00:17:27 -0400, Raul Miller <[EMAIL PROTECTED]> said: > Stephen C. Tweedie <[EMAIL PROTECTED]> wrote: >> There is one major potential future problem with moving this to the >> page cache. At some point I want to be able to extend the large

Re: (reiserfs) Re: journal requirements for buffer.c

1999-10-14 Thread Stephen C. Tweedie
Hi, On Thu, 14 Oct 1999 14:31:23 +0400, Hans Reiser <[EMAIL PROTECTED]> said: > Ah, I see, the problem is that when you batch the commits they can be > truly huge, and they all have to commit for any of them to commit, and > none of them can be flushed until they all commit, is that it? Exactly

Re: journal requirements for buffer.c

1999-10-14 Thread Stephen C. Tweedie
Hi, On Wed, 13 Oct 1999 02:19:19 +0400, Hans Reiser <[EMAIL PROTECTED]> said: > I merely hypothesize that the maximum value of required > FLUSHTIME_NON_EXPANDING will usually be less than 1% of memory, and > therefore won't have an impact. It is not like keeping 1% of memory > around for use by

RE: (reiserfs) Re: journal requirements for buffer.c

1999-10-14 Thread Stephen C. Tweedie
Hi, On Wed, 13 Oct 1999 09:55:39 -0400, Chris Mason <[EMAIL PROTECTED]> said: > All true. But shouldn't I be able to write function to reuse a buffer_head > for a different block without freeing it? I realize the buffer cache > doesn't have a call to do it now, but it seems like it should be p

Announce: limited user mode tools for ext3-0.0.2

1999-10-14 Thread Stephen C. Tweedie
Hi, To follow up on the kernel announce of the ext3-0.0.2 snapshot, there are a couple of tools available for helping with migrating to/from ext3. In particular, the current e2fsprogs work-in-progress snapshot at: http://web.mit.edu/tytso/www/linux/dist/e2fsprogs-1.16-WIP.tar.gz has support

Announce: ext2+journaling, release 0.0.2

1999-10-14 Thread Stephen C. Tweedie
Hi all, OK, a couple of weeks later than I'd hoped and massive numbers of bug-fixes further on, ext3-0.0.2 is out. This is the first usable release. Apart from the critical failure handling (handling of IO errors or memory allocation failures), this is the first solid version of journaled ext

Re: journal requirements for buffer.c

1999-10-13 Thread Stephen C. Tweedie
Hi, On Tue, 12 Oct 1999 15:03:25 +0400, Hans Reiser <[EMAIL PROTECTED]> said: >> With journaling, however, we have a new problem. We can have large >> amounts of dirty data pinned in memory, but we cannot actually write >> that data to disk without first allocating more memory. > Trivia: I don'

Re: [RFC] truncate() (generic stuff)

1999-10-12 Thread Stephen C. Tweedie
Hi, On Tue, 12 Oct 1999 09:37:28 -0400 (EDT), Alexander Viro <[EMAIL PROTECTED]> said: > Rationale was: > a) get rid of code duplication and get all calls of ->truncate() > into the same place. > b) make it in the same place that sets i_size. > c) on many filesystems exte

Re: [patch] [possible race in ext2] Re: how to write get_block?

1999-10-12 Thread Stephen C. Tweedie
Hi, On Tue, 12 Oct 1999 15:39:35 +0200 (CEST), Andrea Arcangeli <[EMAIL PROTECTED]> said: > On Tue, 12 Oct 1999, Stephen C. Tweedie wrote: >> changes. The ext2 truncate code is really, really careful to provide > I was _not_ talking about ext2 at all. I was talking abo

Re: [RFC] truncate() (generic stuff)

1999-10-12 Thread Stephen C. Tweedie
Hi, On Mon, 11 Oct 1999 11:12:01 -0400 (EDT), Alexander Viro <[EMAIL PROTECTED]> said: > I began screwing around the truncate() stuff and the following is > a status report/request for comments: > a) call of ->truncate() method (and vmtruncate()) had been moved > into the notify_chan

Re: [patch] [possible race in ext2] Re: how to write get_block?

1999-10-12 Thread Stephen C. Tweedie
Hi, On Sat, 9 Oct 1999 23:53:01 +0200 (CEST), Andrea Arcangeli <[EMAIL PROTECTED]> said: > What I said about bforget in my old email is still true. The _only_ reason > for using bforget instead of brelse is to get buffer performances (that in > 2.3.x are not so interesting as in 2.2.x as in 2.3.

Re: (reiserfs) Re: RE: journal requirements for buffer.c (was: Romaprogress report)

1999-10-12 Thread Stephen C. Tweedie
Hi, On Tue, 12 Oct 1999 03:14:03 +0400, Hans Reiser <[EMAIL PROTECTED]> said: >> Hans, you didn't mention a journal call that happens on sync, or >> sync_old_buffers... > I see two issues: how to respond to memory pressure, and how to sync. > I'll let you articulate our sync needs. There are a
