Re: The INN/mmap bug

2000-09-19 Thread Daniel Phillips
Daniel Phillips wrote: > Alexander Viro wrote: > > On Tue, 19 Sep 2000, Daniel Phillips wrote: > > > > > !Mapped, !Uptodate, Dirty: pending map and write > > > > Wrong. It's an instant BUG at line 711 in ll_rw_blk.c - remember these > > reports? Yes, correct, this state should not escape

Re: The INN/mmap bug

2000-09-19 Thread Daniel Phillips
Alexander Viro wrote: > > On Tue, 19 Sep 2000, Daniel Phillips wrote: > > > The more I think about it the less clear and ambiguous I find it. > > When you add the dirty bit into the pot you get: > > > >Mapped, Uptodate, Dirty: not possible > > Sure, it is possible - that's how the

Re: The INN/mmap bug

2000-09-19 Thread Alexander Viro
On Tue, 19 Sep 2000, Daniel Phillips wrote: > The more I think about it the less clear and ambiguous I find it. > When you add the dirty bit into the pot you get: > >Mapped, Uptodate, Dirty: not possible Sure, it is possible - that's how the write happens > !Mapped,

Re: The INN/mmap bug

2000-09-19 Thread Daniel Phillips
Linus Torvalds wrote: > > On Mon, 18 Sep 2000, Alexander Viro wrote: > > * we have several bh state components and the thing is a big, > > fscking mess. If we look at the areas outside of the page lock we have: > > It's not a mess at all. > > I don't see why you call things "first stage"

Re: The INN/mmap bug

2000-09-19 Thread Alexander Viro
On Tue, 19 Sep 2000, Rik van Riel wrote: > IMHO it would be really nice if this problem was solved > on the /PAGE/ level, so it will work for non-buffer > filesystems as well ;) It would be nice if we separated the buffer-cache storage pages from the pagecache buffer-using ones and from the

Re: The INN/mmap bug

2000-09-19 Thread Rik van Riel
On Mon, 18 Sep 2000, Alexander Viro wrote: > On Sun, 17 Sep 2000, Linus Torvalds wrote: > > > And as we've seen, simplifying this area would not necessarily be a bad > > thing ;) > > > > Right now I'll just do the minimal fix, though, I think. > > Fine with me. I'll do the full analysis

Re: The INN/mmap bug

2000-09-19 Thread Rik van Riel
On Mon, 18 Sep 2000, Alexander Viro wrote: On Sun, 17 Sep 2000, Linus Torvalds wrote: And as we've seen, simplifying this area would not necessarily be a bad thing ;) Right now I'll just do the minimal fix, though, I think. Fine with me. I'll do the full analysis tonight, anyway,

Re: The INN/mmap bug

2000-09-19 Thread Daniel Phillips
Linus Torvalds wrote: On Mon, 18 Sep 2000, Alexander Viro wrote: * we have several bh state components and the thing is a big, fscking mess. If we look at the areas outside of the page lock we have: It's not a mess at all. I don't see why you call things "first stage" etc. It's

Re: The INN/mmap bug

2000-09-19 Thread Alexander Viro
On Tue, 19 Sep 2000, Daniel Phillips wrote: The more I think about it the less clear and ambiguous I find it. When you add the dirty bit into the pot you get: Mapped, Uptodate, Dirty: not possible Sure, it is possible - that's how the write happens !Mapped, Uptodate,

Re: The INN/mmap bug

2000-09-19 Thread Daniel Phillips
Alexander Viro wrote: On Tue, 19 Sep 2000, Daniel Phillips wrote: The more I think about it the less clear and ambiguous I find it. When you add the dirty bit into the pot you get: Mapped, Uptodate, Dirty: not possible Sure, it is possible - that's how the write happens

Re: The INN/mmap bug

2000-09-19 Thread Daniel Phillips
Daniel Phillips wrote: Alexander Viro wrote: On Tue, 19 Sep 2000, Daniel Phillips wrote: !Mapped, !Uptodate, Dirty: pending map and write Wrong. It's an instant BUG at line 711 in ll_rw_blk.c - remember these reports? Yes, correct, this state should not escape from

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Andreas Dilger wrote: > > Daniel writes: > > Alexander Viro wrote: > > > On Mon, 18 Sep 2000, Andreas Dilger wrote: > > > > This may actually be a problem in the future... what about shared access > > > > block devices like FCAL or a distributed filesystem? It has to be > > > > possible for

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Linus Torvalds wrote: > > On Mon, 18 Sep 2000, Alexander Viro wrote: > > That's what makes me unhappy about the current situation + obvious > > fixes. It works, but the proof is... well, not pretty. > > Oh, agreed. I think we should clean up the code. I looked at it yesterday, > and it

Re: The INN/mmap bug

2000-09-18 Thread Andreas Dilger
Daniel writes: > Alexander Viro wrote: > > On Mon, 18 Sep 2000, Andreas Dilger wrote: > > > This may actually be a problem in the future... what about shared access > > > block devices like FCAL or a distributed filesystem? It has to be > > > possible for pages to become non-uptodate in a sane

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Alexander Viro wrote: > > On Mon, 18 Sep 2000, Andreas Dilger wrote: > > > Alexander writes: > > > * uptodate pages should never become non-uptodate. > > > uptodate .. pages ... never have data _older_ than on disk > > > > This may actually be a problem in the future... what about

Re: The INN/mmap bug

2000-09-18 Thread Linus Torvalds
On Mon, 18 Sep 2000, Alexander Viro wrote: > > > > but should instead be > > > > static int make_buffer_uptodate(struct page *page, struct buffer_head * bh) > > { > > if (Page_Uptodate(page)) { > ^^^ > >

[PATCH] Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
That's to fs/buffer.c, generic_file_write() part will go separately (it's a separate bug actually - if copy_from_user() in generic_file_write() fails we mark page as not-uptodate; if memory pressure will drop the buffer_heads attempts to read will blow the page contents away).

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Alexander Viro wrote: > Except that the only reason why check for page being uptodate is correct > is that for such pages we are guaranteed that buffer ring will not be > dropped when page contatins data newer than on disk. Ugh. Translation: The only reason why the

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Linus Torvalds wrote: > It's not a mess at all. > > I don't see why you call things "first stage" etc. It's perfectly > straightforward. There are two bits that matter: > - buffer uptodate > - buffer mapped. first stage == page still not uptodate. second stage ==

Re: The INN/mmap bug

2000-09-18 Thread Linus Torvalds
On Mon, 18 Sep 2000, Alexander Viro wrote: > * we have several bh state components and the thing is a big, > fscking mess. If we look at the areas outside of the page lock we have: It's not a mess at all. I don't see why you call things "first stage" etc. It's perfectly straightforward.

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/18/00 13:19:27 -0400 Alexander Viro <[EMAIL PROTECTED]> wrote: > > > On Mon, 18 Sep 2000, Chris Mason wrote: > >> I'm not trying to put it all into a single get_block call, we have >> different get_block funcs for different purposes. What I'm really trying >> to do is squeeze into

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
OK, let's see. I've tried to describe what we have now (marking the bugs) + proposal that would give somewhat saner logics (in the end of posting). Comments are more than welcome. * life of the page is clearly divided in two parts - before it can be mapped to a user context and after

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Chris Mason wrote: > I'm not trying to put it all into a single get_block call, we have > different get_block funcs for different purposes. What I'm really trying > to do is squeeze into block_prepare_write, as a generic setup function for > file modifications. It is

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/18/00 12:23:43 -0400 Alexander Viro <[EMAIL PROTECTED]> wrote: > On Mon, 18 Sep 2000, Chris Mason wrote: > >> When reiserfs_get_block is called on the packed data (with create == 1), >> we unpack it to a full block, so the generic functions can handle >> dirties/writes from then on. If

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/18/00 08:52:12 -0700 Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > On Mon, 18 Sep 2000, Chris Mason wrote: >> >> ReiserFS depends on the buffer head up to date flag being correct when it >> is sent to get_block. When unpacking the tail, we have to know if the >> packed data on

Re: The INN/mmap bug

2000-09-18 Thread Linus Torvalds
On Mon, 18 Sep 2000, Chris Mason wrote: > > ReiserFS depends on the buffer head up to date flag being correct when it > is sent to get_block. When unpacking the tail, we have to know if the > packed data on disk should be copied over the data in the page. > > So, the above should work for

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Alexander Viro wrote: > > On Sun, 17 Sep 2000, Linus Torvalds wrote: > > > On Sun, 17 Sep 2000, Alexander Viro wrote: > > > > > > Looks sane. But I really wonder if we could just do it in > > > create_page_buffers() if page is up-to-date. OTOH it would require attempt > > > to map them all.

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/18/00 09:16:22 -0400 Alexander Viro <[EMAIL PROTECTED]> wrote: > > > On Mon, 18 Sep 2000, Chris Mason wrote: > >> ReiserFS depends on the buffer head up to date flag being correct when it >> is sent to get_block. When unpacking the tail, we have to know if the >> packed data on

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Chris Mason wrote: > ReiserFS depends on the buffer head up to date flag being correct when it > is sent to get_block. When unpacking the tail, we have to know if the > packed data on disk should be copied over the data in the page. ??? Details, please. What the hell

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/17/00 20:30:29 -0700 Linus Torvalds <[EMAIL PROTECTED]> wrote: > > Basically, both "truncate()" and "write()" have this bug where they can > end up re-reading stuff from disk even though the in-memory copy is newer. > > And because write() had this bug, the bug also got into >

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/17/00 20:30:29 -0700 Linus Torvalds [EMAIL PROTECTED] wrote: Basically, both "truncate()" and "write()" have this bug where they can end up re-reading stuff from disk even though the in-memory copy is newer. And because write() had this bug, the bug also got into

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Chris Mason wrote: ReiserFS depends on the buffer head up to date flag being correct when it is sent to get_block. When unpacking the tail, we have to know if the packed data on disk should be copied over the data in the page. ??? Details, please. What the hell are

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Alexander Viro wrote: On Sun, 17 Sep 2000, Linus Torvalds wrote: On Sun, 17 Sep 2000, Alexander Viro wrote: Looks sane. But I really wonder if we could just do it in create_page_buffers() if page is up-to-date. OTOH it would require attempt to map them all. Comments?

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Chris Mason wrote: I'm not trying to put it all into a single get_block call, we have different get_block funcs for different purposes. What I'm really trying to do is squeeze into block_prepare_write, as a generic setup function for file modifications. It is not.

Re: The INN/mmap bug

2000-09-18 Thread Chris Mason
--On 09/18/00 13:19:27 -0400 Alexander Viro [EMAIL PROTECTED] wrote: On Mon, 18 Sep 2000, Chris Mason wrote: I'm not trying to put it all into a single get_block call, we have different get_block funcs for different purposes. What I'm really trying to do is squeeze into

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Alexander Viro wrote: On Mon, 18 Sep 2000, Andreas Dilger wrote: Alexander writes: * uptodate pages should never become non-uptodate. uptodate .. pages ... never have data _older_ than on disk This may actually be a problem in the future... what about shared access

Re: The INN/mmap bug

2000-09-18 Thread Andreas Dilger
Daniel writes: Alexander Viro wrote: On Mon, 18 Sep 2000, Andreas Dilger wrote: This may actually be a problem in the future... what about shared access block devices like FCAL or a distributed filesystem? It has to be possible for pages to become non-uptodate in a sane way. So

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Linus Torvalds wrote: On Mon, 18 Sep 2000, Alexander Viro wrote: That's what makes me unhappy about the current situation + obvious fixes. It works, but the proof is... well, not pretty. Oh, agreed. I think we should clean up the code. I looked at it yesterday, and it didn't look

Re: The INN/mmap bug

2000-09-18 Thread Daniel Phillips
Andreas Dilger wrote: Daniel writes: Alexander Viro wrote: On Mon, 18 Sep 2000, Andreas Dilger wrote: This may actually be a problem in the future... what about shared access block devices like FCAL or a distributed filesystem? It has to be possible for pages to become

[PATCH] Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
That's to fs/buffer.c, generic_file_write() part will go separately (it's a separate bug actually - if copy_from_user() in generic_file_write() fails we mark page as not-uptodate; if memory pressure will drop the buffer_heads attempts to read will blow the page contents away).

Re: The INN/mmap bug

2000-09-18 Thread Linus Torvalds
On Mon, 18 Sep 2000, Alexander Viro wrote: but should instead be static int make_buffer_uptodate(struct page *page, struct buffer_head * bh) { if (Page_Uptodate(page)) { ^^^ set_bit(BH_Uptodate,

Re: The INN/mmap bug

2000-09-18 Thread Linus Torvalds
On Mon, 18 Sep 2000, Alexander Viro wrote: * we have several bh state components and the thing is a big, fscking mess. If we look at the areas outside of the page lock we have: It's not a mess at all. I don't see why you call things "first stage" etc. It's perfectly straightforward.

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Linus Torvalds wrote: It's not a mess at all. I don't see why you call things "first stage" etc. It's perfectly straightforward. There are two bits that matter: - buffer uptodate - buffer mapped. first stage == page still not uptodate. second stage == page is

Re: The INN/mmap bug

2000-09-18 Thread Alexander Viro
On Mon, 18 Sep 2000, Alexander Viro wrote: Except that the only reason why check for page being uptodate is correct is that for such pages we are guaranteed that buffer ring will not be dropped when page contatins data newer than on disk. Ugh. Translation: The only reason why the simple

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: > At that point, block_write_full_page() would become something like > > unsigned long index = inode->i_size >> PAGE_CACHE_SHIFT; > > offset = inode->i_size & (PAGE_CACHE_SIZE-1); > if (index > page->index) > offset =

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: > On Sun, 17 Sep 2000, Alexander Viro wrote: > > > > Looks sane. But I really wonder if we could just do it in > > create_page_buffers() if page is up-to-date. OTOH it would require attempt > > to map them all. Comments? > > That would certainly

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Mon, 18 Sep 2000, Alexander Viro wrote: > > We might as well do it in create_empty_buffers(), FWIW. Do you see any > case when it would mean extra work? I'm thinking about the following: > > if (page is uptodate) > do get_block(...,0) for all buffers and mark them

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Alexander Viro wrote: > > Looks sane. But I really wonder if we could just do it in > create_page_buffers() if page is up-to-date. OTOH it would require attempt > to map them all. Comments? That would certainly simplify a lot. And as we've seen, simplifying this

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: > Yeah. See my other mail - I don't think the patch is needed at all, > although I'm convinced that it would actually hide the problem for the > particular case of INN (zeroing out the partial block is not an issue > there, as truncate will have done

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: > The basic problem is that right now we have code that does > > if (!page->buffers) > create_empty_buffers(page, inode, inode->i_sb->s_blocksize); > ... > if (!buffer_uptodate(bh)) > ll_rw_block(READ,

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Alexander Viro wrote: > > Ugh. OK, first of all, patch is _not_ correct. Doesn't zero out the end of > partial block. > Yeah. See my other mail - I don't think the patch is needed at all, although I'm convinced that it would actually hide the problem for the particular

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Linus Torvalds wrote: > > I bet your patch fixes the corruption, but I want to understand _why_. > Call me dense, but __block_commit_write() seems to do everything we want > done.. Ok, I may be dense, but I see the bug. And no, your patch was not the right thing either.

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: > Hmm.. Ok, I see what you're saying. And I looked at the diff the wrong way > around - we _lost_ two articles rather than getting two new ones. > > However, what's wrong with the block_write_full_page() logic? Your patch > looks fine to me, but so

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: > > > On Sun, 17 Sep 2000, Marco d'Itri wrote: > > > > I added to block_write_full_page() the debugging code suggested by > > Alexander Viro: > > > > if (inode->i_dev == 0x305 && inode->i_ino == 48991) // Md > >

The INN/mmap bug

2000-09-17 Thread Marco d'Itri
I added to block_write_full_page() the debugging code suggested by Alexander Viro: if (inode->i_dev == 0x305 && inode->i_ino == 48991) // Md printk("block_write_full_page: writing page %d, size %Ld\n", page->index, inode->i_size); and I have

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Marco d'Itri wrote: > > I added to block_write_full_page() the debugging code suggested by > Alexander Viro: > > if (inode->i_dev == 0x305 && inode->i_ino == 48991) // Md > printk("block_write_full_page: writing page %d, size %Ld\n", >

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Marco d'Itri wrote: I added to block_write_full_page() the debugging code suggested by Alexander Viro: if (inode-i_dev == 0x305 inode-i_ino == 48991) // Md printk("block_write_full_page: writing page %d, size %Ld\n",

The INN/mmap bug

2000-09-17 Thread Marco d'Itri
I added to block_write_full_page() the debugging code suggested by Alexander Viro: if (inode-i_dev == 0x305 inode-i_ino == 48991) // Md printk("block_write_full_page: writing page %d, size %Ld\n", page-index, inode-i_size); and I have just

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: On Sun, 17 Sep 2000, Marco d'Itri wrote: I added to block_write_full_page() the debugging code suggested by Alexander Viro: if (inode-i_dev == 0x305 inode-i_ino == 48991) // Md

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: Hmm.. Ok, I see what you're saying. And I looked at the diff the wrong way around - we _lost_ two articles rather than getting two new ones. However, what's wrong with the block_write_full_page() logic? Your patch looks fine to me, but so does

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Linus Torvalds wrote: I bet your patch fixes the corruption, but I want to understand _why_. Call me dense, but __block_commit_write() seems to do everything we want done.. Ok, I may be dense, but I see the bug. And no, your patch was not the right thing either. It

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Alexander Viro wrote: Ugh. OK, first of all, patch is _not_ correct. Doesn't zero out the end of partial block. note to myself: no patches before the first cup of coffee Yeah. See my other mail - I don't think the patch is needed at all, although I'm convinced that it

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: The basic problem is that right now we have code that does if (!page-buffers) create_empty_buffers(page, inode, inode-i_sb-s_blocksize); ... if (!buffer_uptodate(bh)) ll_rw_block(READ, 1, bh);

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: Yeah. See my other mail - I don't think the patch is needed at all, although I'm convinced that it would actually hide the problem for the particular case of INN (zeroing out the partial block is not an issue there, as truncate will have done it -

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Sun, 17 Sep 2000, Alexander Viro wrote: Looks sane. But I really wonder if we could just do it in create_page_buffers() if page is up-to-date. OTOH it would require attempt to map them all. Comments? That would certainly simplify a lot. And as we've seen, simplifying this area

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: On Sun, 17 Sep 2000, Alexander Viro wrote: Looks sane. But I really wonder if we could just do it in create_page_buffers() if page is up-to-date. OTOH it would require attempt to map them all. Comments? That would certainly simplify a

Re: The INN/mmap bug

2000-09-17 Thread Linus Torvalds
On Mon, 18 Sep 2000, Alexander Viro wrote: We might as well do it in create_empty_buffers(), FWIW. Do you see any case when it would mean extra work? I'm thinking about the following: if (page is uptodate) do get_block(...,0) for all buffers and mark them uptodate

Re: The INN/mmap bug

2000-09-17 Thread Alexander Viro
On Sun, 17 Sep 2000, Linus Torvalds wrote: At that point, block_write_full_page() would become something like unsigned long index = inode-i_size PAGE_CACHE_SHIFT; offset = inode-i_size (PAGE_CACHE_SIZE-1); if (index page-index) offset =