Re: Oops while rebalancing, now unmountable.

2010-11-16 Thread Shane Shrybman
On Mon, 2010-11-15 at 13:46 -0500, Chris Mason wrote:
> Excerpts from Christoph Hellwig's message of 2010-11-15 13:23:14 -0500:
> > On Sun, Nov 14, 2010 at 11:12:22PM +0100, Andrea Arcangeli wrote:
> > > I just wrote above that it can happen upstream without THP. It's not
> > > THP related at all.

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Christoph Hellwig
On Mon, Nov 15, 2010 at 08:29:14PM +0100, Andrea Arcangeli wrote:
> Scary stuff, so WB_SYNC_NONE wouldn't submit the dirty part of the
> page down for I/O, so that it's all clean after wait_on_page_writeback
> returns? (well of course unless the dirty bit was set again)

It might not if we have loc

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Andrea Arcangeli
On Mon, Nov 15, 2010 at 02:12:04PM -0500, Christoph Hellwig wrote:
> I didn't even notice that, but the WB_SYNC_NONE does indeed seem
> buggy to me. If we set the sync_mode to WB_SYNC_NONE filesystem
> can and frequently do trylock operations and might just skip to
> write it out completely.

Scar

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Chris Mason
Excerpts from Christoph Hellwig's message of 2010-11-15 14:12:04 -0500:
> On Mon, Nov 15, 2010 at 07:46:57PM +0100, Andrea Arcangeli wrote:
> > I've been reading the writeout() in mm/migrate.c and I wonder if maybe
> > that should have been WB_SYNC_ALL or if we miss a
> > wait_on_page_writeback in

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Andrea Arcangeli
On Mon, Nov 15, 2010 at 02:03:55PM -0500, Chris Mason wrote:
> It always returns either -EIO or -EAGAIN, so the caller will try again
> and then end up waiting on PageWriteback?

Returning any error from ->writepage will make writeout return -EIO so aborting the migration for that page. If no error
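
The return-code behaviour described in this exchange can be modelled in a few lines. This is a standalone sketch, not the kernel source: `model_writeout_rc` is a hypothetical name, and it assumes the no-error case is the -EAGAIN one, which the truncated message does not spell out.

```c
#include <errno.h>

/*
 * Standalone model, not kernel code: maps the value returned by a
 * ->writepage call to the value the migration write-out step reports,
 * following the description in the message above.
 * model_writeout_rc is a hypothetical name used only for this sketch.
 */
static int model_writeout_rc(int writepage_rc)
{
    /* Any error from ->writepage is reported as -EIO, aborting
     * migration of that page. */
    if (writepage_rc < 0)
        return -EIO;

    /* Assumption: without an error the caller is told to try again. */
    return -EAGAIN;
}
```

Under this model the function never returns 0, which matches the remark that the caller always sees either -EIO or -EAGAIN.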

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Christoph Hellwig
On Mon, Nov 15, 2010 at 07:46:57PM +0100, Andrea Arcangeli wrote:
> I've been reading the writeout() in mm/migrate.c and I wonder if maybe
> that should have been WB_SYNC_ALL or if we miss a
> wait_on_page_writeback in after ->writepage() returns? Can you have a
> look there?

We check the PG_writeb

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Chris Mason
Excerpts from Andrea Arcangeli's message of 2010-11-15 13:46:57 -0500:
> On Mon, Nov 15, 2010 at 01:23:14PM -0500, Christoph Hellwig wrote:
> > On Sun, Nov 14, 2010 at 11:12:22PM +0100, Andrea Arcangeli wrote:
> > > +static int btree_migratepage(struct address_space *mapping,
> > > +

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Christoph Hellwig
On Mon, Nov 15, 2010 at 01:46:02PM -0500, Chris Mason wrote:
> For the metadata blocks, btrfs gets into a problematic lock inversion
> where it needs to record that a block has been written so that it will
> be properly recowed when someone tries to change it again.
>
> Basically the rule for btre

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Chris Mason
Excerpts from Christoph Hellwig's message of 2010-11-15 13:23:14 -0500:
> On Sun, Nov 14, 2010 at 11:12:22PM +0100, Andrea Arcangeli wrote:
> > I just wrote above that it can happen upstream without THP. It's not
> > THP related at all. THP is the consumer, this is a problem in migrate
> > that wil

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Andrea Arcangeli
On Mon, Nov 15, 2010 at 01:23:14PM -0500, Christoph Hellwig wrote:
> On Sun, Nov 14, 2010 at 11:12:22PM +0100, Andrea Arcangeli wrote:
> > +static int btree_migratepage(struct address_space *mapping,
> > + struct page *newpage, struct page *page)
> > +{
> > + /*
> > +

Re: Oops while rebalancing, now unmountable.

2010-11-15 Thread Christoph Hellwig
On Sun, Nov 14, 2010 at 11:12:22PM +0100, Andrea Arcangeli wrote:
> I just wrote above that it can happen upstream without THP. It's not
> THP related at all. THP is the consumer, this is a problem in migrate
> that will trigger as well with migrate_pages or all other possible
> migration APIs.
>

Re: Oops while rebalancing, now unmountable.

2010-11-14 Thread Andrea Arcangeli
On Sun, Nov 14, 2010 at 05:00:18PM -0500, Christoph Hellwig wrote:
> On Sun, Nov 14, 2010 at 09:42:06PM +0100, Andrea Arcangeli wrote:
> > btrfs misses this:
> >
> > + .migratepage= btree_migratepage,
> >
> > It's a bug that can trigger upstream too (not only with THP) if there
> > are

Re: Oops while rebalancing, now unmountable.

2010-11-14 Thread Christoph Hellwig
On Sun, Nov 14, 2010 at 09:42:06PM +0100, Andrea Arcangeli wrote:
> btrfs misses this:
>
> + .migratepage= btree_migratepage,
>
> It's a bug that can trigger upstream too (not only with THP) if there
> are hugepage allocations (like while increasing nr_hugepages). Chris
> already fixed i

Re: Oops while rebalancing, now unmountable.

2010-11-14 Thread Andrea Arcangeli
Hi Shane,

On Sun, Nov 14, 2010 at 02:55:07PM -0500, Shane Shrybman wrote:
> Hi Andrea!
>
> Long time since our last bug fix :) I still have fond memories of
> 2.4.23-aa kernels, best of all time!

Nice memories of good times :)

> I couldn't find any other mention of this corruption issue with TH

Re: Oops while rebalancing, now unmountable.

2010-11-14 Thread Shane Shrybman
On Tue, 2010-11-09 at 13:21 -0500, Shane Shrybman wrote:
> On Tue, 2010-11-09 at 08:42 -0500, Chris Mason wrote:
> > Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> > > Hi,
> > >
> > > Got an oops last week while rebalancing that seems to have left me with
> > > a corrupted

Re: Oops while rebalancing, now unmountable.

2010-11-09 Thread Shane Shrybman
On Tue, 2010-11-09 at 08:42 -0500, Chris Mason wrote:
> Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> > Hi,
> >
> > Got an oops last week while rebalancing that seems to have left me with
> > a corrupted btrfs. Kernel was ~2.6.36 + Transparent hugetlb patchset +
> > small

Re: Oops while rebalancing, now unmountable.

2010-11-09 Thread Chris Mason
Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> Hi,
>
> Got an oops last week while rebalancing that seems to have left me with
> a corrupted btrfs. Kernel was ~2.6.36 + Transparent hugetlb patchset +
> small misc. patches.

We have a confirmed and reproducible case where the

Re: Oops while rebalancing, now unmountable.

2010-11-08 Thread Shane Shrybman
On Mon, 2010-11-08 at 16:04 -0500, Chris Mason wrote:
> Excerpts from Shane Shrybman's message of 2010-11-08 15:39:25 -0500:
> > On Mon, 2010-11-08 at 12:55 -0500, Chris Mason wrote:
> > > Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> > > > Hi,
> > > >
> > > > Got an oops

Re: Oops while rebalancing, now unmountable.

2010-11-08 Thread Chris Mason
Excerpts from Shane Shrybman's message of 2010-11-08 15:39:25 -0500:
> On Mon, 2010-11-08 at 12:55 -0500, Chris Mason wrote:
> > Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> > > Hi,
> > >
> > > Got an oops last week while rebalancing that seems to have left me with
> > >

Re: Oops while rebalancing, now unmountable.

2010-11-08 Thread Shane Shrybman
On Mon, 2010-11-08 at 12:55 -0500, Chris Mason wrote:
> Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> > Hi,
> >
> > Got an oops last week while rebalancing that seems to have left me with
> > a corrupted btrfs. Kernel was ~2.6.36 + Transparent hugetlb patchset +
> > small

Re: Oops while rebalancing, now unmountable.

2010-11-08 Thread Chris Mason
Excerpts from Shane Shrybman's message of 2010-11-08 12:10:57 -0500:
> Hi,
>
> Got an oops last week while rebalancing that seems to have left me with
> a corrupted btrfs. Kernel was ~2.6.36 + Transparent hugetlb patchset +
> small misc. patches.

Have you tried the 2.6.36 + the btrfs unstable git

Oops while rebalancing, now unmountable.

2010-11-08 Thread Shane Shrybman
Hi,

Got an oops last week while rebalancing that seems to have left me with a corrupted btrfs. Kernel was ~2.6.36 + Transparent hugetlb patchset + small misc. patches. I tried my last reliable 2.6.35 based kernel without the transparent hugetlb patchset and the btrfs was not mountable there eithe