Unable to suspend to memory while scrub running

2013-10-02 Thread Justin Husted
I tried to suspend a computer running linux 3.11.1 while a scrub was going, and I got this: Oct 2 20:15:12 caper kernel: [ 1237.472716] PM: Preparing system for mem sleep Oct 2 20:15:12 caper kernel: [ 1237.472856] Freezing user space processes ... Oct 2 20:15:12 caper kernel: [ 1257.474848] Fr

Btrfs balance bug

2013-10-02 Thread Slava Barinov
Good day. I've got a failure with btrfs balance. In fact I started btrfs balance /btr and got a total filesystem freeze. After I tried applying balance pause or balance cancel the following crashdump appeared and btrfs tool froze. Reboot changed nothing: I've got totally the same crash

btrfs apparently causes hard lockup

2013-10-02 Thread Justin Husted
Hi, I've been having some trouble with one of my computers recently that I'm currently blaming on btrfs. At first, I thought the computer had a hardware fault, since it tended to just mysteriously hard-lockup a few times per day. Usually, it would hard-lockup with the GUI, and I was unable to ge

WARNING at fs/btrfs/extent_io.c:211, kernel BUG at fs/btrfs/extent_io.c:509!

2013-10-02 Thread Tomasz Chmielewski
Got this rather lengthy trace with 3.12-rc3. Full output at http://wpkg.org/btrfs.txt (14 MB): Oct 2 22:34:57 bkp010 kernel: [38059.449633] BTRFS debug (device sdb5): truncated 1 orphans Oct 3 00:15:21 bkp010 kernel: [44080.512456] [ cut here ] Oct 3 00:15:21 bkp010 k

Re: Corrupt btrfs filesystem recovery... What best instructions?

2013-10-02 Thread Chris Murphy
On Oct 2, 2013, at 6:49 PM, Martin wrote: > kernel: btrfs read error corrected: ino 1 off 907183792128 (dev /dev/sdc > sector 1781821200) Can anyone answer if this is what corrupt metadata detection and correction looks like? From the original email this is a single disk, with default mkfs.b

Re: Corrupt btrfs filesystem recovery... What best instructions?

2013-10-02 Thread Martin
So... The fix: ( Summary: Mounting "-o recovery,noatime" worked well and allowed a diff check to complete for all but one directory tree. So very nearly all the data is fine. Deleting the failed directory tree caused a call stack dump and eventually: kernel: parent transid verify failed on 91

Re: Questions regarding logging upon fsync in btrfs

2013-10-02 Thread Josef Bacik
On Wed, Oct 02, 2013 at 10:12:20PM +0200, Aastha Mehta wrote: > On 2 October 2013 13:52, Josef Bacik wrote: > > On Tue, Oct 01, 2013 at 10:13:25PM +0200, Aastha Mehta wrote: > >> On 1 October 2013 21:42, Aastha Mehta wrote: > >> > On 1 October 2013 21:40, Aastha Mehta wrote: > >> >> On 1 October

Re: Questions regarding logging upon fsync in btrfs

2013-10-02 Thread Aastha Mehta
On 2 October 2013 13:52, Josef Bacik wrote: > On Tue, Oct 01, 2013 at 10:13:25PM +0200, Aastha Mehta wrote: >> On 1 October 2013 21:42, Aastha Mehta wrote: >> > On 1 October 2013 21:40, Aastha Mehta wrote: >> >> On 1 October 2013 19:34, Josef Bacik wrote: >> >>> On Mon, Sep 30, 2013 at 11:07:20

Re: Is `btrfsck --repair` supposed to actually repair problems?

2013-10-02 Thread Charles Cazabon
Chris Murphy wrote: > On Oct 2, 2013, at 10:53 AM, Charles Cazabon wrote: > > >> I'd wait until the raid is finished syncing. > > > > Strictly speaking, this shouldn't be necessary. > > I know but it's a 16TB array, do you really want to start over from scratch? > No. And neither do most people

Re: Is `btrfsck --repair` supposed to actually repair problems?

2013-10-02 Thread Chris Murphy
On Oct 2, 2013, at 10:53 AM, Charles Cazabon wrote: >> I'd wait until the raid is finished syncing. > > Strictly speaking, this shouldn't be necessary. mdadm arrays are fully usable > from creation during the initial sync; the system tracks which bits have been > initialized and which haven't

[PATCH] Btrfs: fix a use-after-free bug in btrfs_dev_replace_finishing

2013-10-02 Thread Ilya Dryomov
free_device rcu callback, scheduled from btrfs_rm_dev_replace_srcdev, can be processed before btrfs_scratch_superblock is called, which would result in a use-after-free on btrfs_device contents. Fix this by zeroing the superblock before the rcu callback is registered. Cc: Stefan Behrens Signed-o

Re: Is `btrfsck --repair` supposed to actually repair problems?

2013-10-02 Thread Charles Cazabon
Chris Murphy wrote: > On Oct 1, 2013, at 9:13 PM, Charles Cazabon wrote: > > > > Ah, I'm not looking to repair the files -- I can recopy the files easily > > enough, and rsync will pick up any files whose contents have been corrupted. > > If you run a scrub, dmesg should contain the path for aff

[PATCH] Btrfs: eliminate races in worker stopping code

2013-10-02 Thread Ilya Dryomov
The current implementation of worker threads in Btrfs has races in worker stopping code, which cause all kinds of panics and lockups when running btrfs/011 xfstest in a loop. The problem is that btrfs_stop_workers is unsynchronized with respect to check_idle_worker, check_busy_worker and __btrfs_s

[PATCH] btrfs-progs: Make btrfs_header_chunk_tree_uuid() return unsigned long

2013-10-02 Thread Ross Kirk
Internally, btrfs_header_chunk_tree_uuid() calculates an unsigned long, but casts it to a pointer, while all callers cast it to unsigned long again. From btrfs commit b308bc2f05a86e728bd035e21a4974acd05f4d1e Signed-off-by: Ross Kirk --- btrfs-find-root.c |3 +-- cmds-chunk.c |6 ++

Re: Questions regarding logging upon fsync in btrfs

2013-10-02 Thread Josef Bacik
On Tue, Oct 01, 2013 at 10:13:25PM +0200, Aastha Mehta wrote: > On 1 October 2013 21:42, Aastha Mehta wrote: > > On 1 October 2013 21:40, Aastha Mehta wrote: > >> On 1 October 2013 19:34, Josef Bacik wrote: > >>> On Mon, Sep 30, 2013 at 11:07:20PM +0200, Aastha Mehta wrote: > On 30 Septembe

Re: kernel BUG at mm/page-writeback.c:2317!

2013-10-02 Thread Josef Bacik
On Wed, Oct 02, 2013 at 05:00:14PM +0900, Tomasz Chmielewski wrote: > Just seen this on a server running 3.12.0-rc3 and btrfs. > > Process writing to btrfs filesystem is stuck. > This is probably the bug that Liu Bo just fixed https://patchwork.kernel.org/patch/2970751/ Thanks, Josef

kernel BUG at mm/page-writeback.c:2317!

2013-10-02 Thread Tomasz Chmielewski
Just seen this on a server running 3.12.0-rc3 and btrfs. Process writing to btrfs filesystem is stuck. [39455.402308] [ cut here ] [39455.402328] kernel BUG at mm/page-writeback.c:2317! [39455.402341] invalid opcode: [#1] SMP [39455.402354] Modules linked in: veth i