Re: hitting BUG_ON on troublesome FS
Remco Hosman - Yerf-it.com posted on Mon, 03 Feb 2014 21:51:26 +0100 as
excerpted:

> Anything i can do to resolve / debug the issue?

I see from the trace you're running kernel 3.13.0. (FWIW 3.13.1 is out,
but there weren't any btrfs commits therein, as they weren't upstream in
3.14-pre yet at that time.)

You might try kernel 3.14-rc1 now that it's out. There's a big btrfs git
pull in that, including a number of btrfs send/receive fixes. FWIW, btrfs
send/receive seems to be a big focus right now, and I see a lot of patches
floating by on the list even after that pull, so there's more where those
came from.

I'm not a dev (just another btrfs user and list regular for several kernel
cycles now) and haven't followed the patches closely enough to know if
your particular balance problem is covered in one of them, but as I said,
since 3.14-rc1 is out now, you might as well try it...

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs raid5 unmountable
Tetja Rediske posted on Mon, 03 Feb 2014 17:12:24 +0100 as excerpted:

[...]

> What happened before:
>
> One disk was faulty, I added a new one and removed the old one, followed
> by a balance.
>
> So far so good.
>
> Some days after this I accidently removed a SATA Power Connector from
> another drive, without noticing it at first. Worked about an hour on the
> system, building new Kernel on another Filesystem. Rebooted with my new
> Kernel and the FS was not mountable. I noticed the "missing" disk and
> reattached the power.
>
> So far i tried:
>
> mount -o recovery
> btrfs check
> (after google) btrfs-zero-log
>
> Sadly no luck. Whoever I can get my Files with btrfs restore. The
> Filesystem contains mainly Mediafiles, so it is not so bad, if they were
> lost, but restoring them from backups and sources will need atleast
> about a week. (Most of the Files are mirrored on a private Server, but
> even with 100MBit this takes a lot of time ; )
>
> Some Idea who to recover this FS?

[As a btrfs user and list regular, /not/ a dev...]

That filesystem is very likely toast. =:( Tho there's one thing you
didn't mention trying yet that's worth the try. See below...

You can read the list archives for the details if you like, but basically,
the raid5/6 recovery code simply isn't complete yet and is not recommended
for actual deployment in any way, shape or form. In practice at present
it's a fancy raid0 that calculates and writes a bunch of extra parity, and
can be run-time tested and even in some cases recover from
online-device-loss (as you noted), but throw a shutdown in there along
with the bad device, and like a raid0, you might as well consider the
filesystem lost... at least until the recovery code is complete, at which
point if the filesystem is still around you may well be able to recover
it, since the parity is all there, the code to actually recover from it
just isn't all there yet.

FWIW, single-device btrfs is what I'd call almost-stable now altho you're
still strongly encouraged to keep current and tested backups as there are
still occasional corner-cases, and stay on current kernels and btrfs-tools
as potentially data-risking bugs still are getting fixed. Multi-device
btrfs in single/raid0/1/10 modes is also closing in on stable now, tho not
/quite/ as stable as single device, but quite usable as long as you do
have tested backups -- unless you're unlucky you won't actually have to
use them (I haven't had to use mine), but definitely keep 'em just in
case. But raid5/6, no-go, with the exception of pure testing data that
you really are prepared to throw away, because recovery for it really is
still incomplete and thus known-broken.

The one thing I didn't see you mention that's worth a try if you haven't
already, is the degraded mount option. See
$KERNELSRC/Documentation/filesystems/btrfs.txt. Tho really that should
have been the first thing you tried for mounting once you realized you
were down a device. But with a bit of luck...

Also, if you've run btrfs check with the --repair option (you didn't say;
if you didn't, you should be fine, as without --repair it's only a
read-only diagnostic), you may have made things worse, as that's really
intended to be a last resort.

Of course if you'd been following the list as btrfs testers really should
still be doing at this point, you'd have seen all this covered before.
And of course, if you had done pre-deployment testing before you stuck
valuable data on that btrfs raid5, you'd have noted the problems, even
without reading about it on-list or on the wiki. But of course hindsight
is 20/20, as they say, and at least you DO have backups, even if they'll
take awhile to restore. =:^) That's already vastly better than a lot of
the reports we unfortunately get here. =:^\

--
Duncan - List replies preferred. No HTML msgs.
[PATCH] btrfs-progs: there is devid 0 when replace is running
As of now, while a replace is running, the replacement disk is added to
the device list with devid 0. We fail to obtain the details of devid 0
because we don't query devid 0 at all, as shown below.

---
btrfs rep start /dev/sdb /dev/sdf /btrfs
btrfs fi show
Label: none uuid: f8fb9819-16c8-47b7-b62f-0ff90f8c56cd
 Total devices 3 FS bytes used 1.94GiB
 devid 1 size 1.10GiB used 1.10GiB path /dev/sdb
 devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
 devid 0 size 0.00 used 0.00 path
---

This patch fixes that by querying devid 0 as well.

---
btrfs repl start /dev/sdb /dev/sdf /btrfs
btrfs fi show /btrfs
Label: none uuid: f8fb9819-16c8-47b7-b62f-0ff90f8c56cd
 Total devices 3 FS bytes used 1.94GiB
 devid 0 size 1.10GiB used 1.10GiB path /dev/sdf
 devid 1 size 1.10GiB used 1.10GiB path /dev/sdb
 devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
---

It's fine to query devid 0 when there is no replace activity as well,
because we just skip the error ENODEV.

---
btrfs fi show /btrfs
Label: none uuid: f8fb9819-16c8-47b7-b62f-0ff90f8c56cd
 Total devices 2 FS bytes used 1.94GiB
 devid 1 size 1.10GiB used 1.10GiB path /dev/sdf
 devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
---

Signed-off-by: Anand Jain
---
 utils.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/utils.c b/utils.c
index de513b6..a045ffd 100644
--- a/utils.c
+++ b/utils.c
@@ -1696,7 +1696,7 @@ int get_fs_info(char *path, struct btrfs_ioctl_fs_info_args *fi_args,
 		goto out;
 	}
 
-	for (; i <= fi_args->max_id; ++i) {
+	for (i = 0; i <= fi_args->max_id; ++i) {
 		BUG_ON(ndevs >= fi_args->num_devices);
 		ret = get_device_info(fd, i, &di_args[ndevs]);
 		if (ret == -ENODEV)
--
1.7.1
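
For readers who want to see the effect of the one-line change in isolation, here is a minimal stand-alone sketch. It is not the btrfs-progs code: `struct fake_dev`, `fake_devs`, `fake_get_device_info()` and `collect_devices()` are hypothetical names invented for this illustration, and the table simply mimics the "replace running" listing from the patch description. The only behaviour carried over from the patch is that the loop walks every devid up to max_id and skips ids that return -ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one device record; not the real ioctl struct. */
struct fake_dev {
    unsigned long long devid;
    const char *path;
};

/* Illustrative table: devid 0 is the replace target while a replace runs. */
static const struct fake_dev fake_devs[] = {
    { 0, "/dev/sdf" },
    { 1, "/dev/sdb" },
    { 2, "/dev/sdc" },
};
#define FAKE_NUM_DEVS (sizeof(fake_devs) / sizeof(fake_devs[0]))

/* Mimics a per-device lookup: 0 on success, -ENODEV if the id is unused. */
static int fake_get_device_info(unsigned long long devid, struct fake_dev *out)
{
    for (size_t k = 0; k < FAKE_NUM_DEVS; k++) {
        if (fake_devs[k].devid == devid) {
            *out = fake_devs[k];
            return 0;
        }
    }
    return -ENODEV;
}

/*
 * Collect every device whose id lies in [first_id, max_id], skipping holes.
 * Returns the number of devices stored in out, or a negative errno.
 */
static int collect_devices(unsigned long long first_id,
                           unsigned long long max_id,
                           struct fake_dev *out, size_t cap)
{
    size_t ndevs = 0;

    for (unsigned long long i = first_id; i <= max_id; i++) {
        struct fake_dev tmp;
        int ret = fake_get_device_info(i, &tmp);

        if (ret == -ENODEV)
            continue;            /* unused id: not an error, just skip it */
        if (ret)
            return ret;
        assert(ndevs < cap);     /* plays the role of the BUG_ON in the loop */
        out[ndevs++] = tmp;
    }
    return (int)ndevs;
}
```

In this model, starting the scan at 1 misses the replace target, while starting at 0 finds all three devices; ids above the highest populated one are skipped through the -ENODEV branch.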
Re: [PATCH] Btrfs: disable snapshot aware defrag for now
On 03/02/14 09:27, Josef Bacik wrote:
> It is so totally broken that I don't want it being turned on by anybody
> who can't edit this and change it themselves.

The symptoms I saw are huge amounts of kernel memory consumption, possibly
till exhaustion of swap. Are there other ways in which it is broken (eg
corruption)?

Also is this patch making its way to the various stables?

Roger
Re: Receive on same subvolume
On 03/02/2014 4:34 PM, Chris Murphy wrote:
> On Feb 3, 2014, at 3:53 PM, Matthew Lai wrote:
>> On 03/02/2014 11:26 AM, Chris Murphy wrote:
>>> On Feb 3, 2014, at 11:19 AM, Matthew Lai wrote:
>>>
>>>> Thanks. I should clarify what I'm trying to do.
>>>>
>>>> I'm trying to use btrfs send for backup, without having another btrfs
>>>> volume.
>>>>
>>>> So the initial backup is a complete send, piped to Amazon Glacier (so my
>>>> machine never has the whole file, and doesn't have space for one).
>>> OK so you've use btrfs send piped to Glacier which creates a *file*, I'll
>>> call it "initial", not a navigable directory of files? Right?
>> That is correct.
>>>> It looks like the problem now is the sent file can't be applied to the
>>>> original volume (for restore).
>>> I'm counting two sent files: initial, increment1. I'm not sure which one
>>> you're applying. If you have the exact same read-only snapshot that the
>>> btrfs send file "initial" is based on, then you'd apply the increment1 to
>>> that read-only snapshot which will cause a new read-only snapshot to be
>>> created with the incremental data applied to it. The error you're getting
>>> sounds like the parent read-only snapshot isn't available?
>>>
>> That is also correct. There are 2 sent files. I am trying to apply
>> increment1, on a snapshot of the parent (that was used to create increment1).
>
> I don't understand how you can apply increment1 to the snapshot of
> increment1; and also I don't understand how the parent is also increment1.
>
>> I added -vv. Here is the test script for reproducing this entire setup.
>>
>> ---
>> #!/bin/sh
>>
>> btrfs subvolume create data
>> btrfs subvolume snapshot -r data first_backup
>> touch data/a
>> btrfs subvolume snapshot -r data second_backup
>> btrfs send -p first_backup second_backup > increment
>> btrfs subvolume snapshot first_backup first_backup_rw
>> btrfs receive -vv first_backup_rw < increment
>> ---
>>
>> Output:
>> ---
>> Create subvolume './data'
>> Create a readonly snapshot of 'data' in './first_backup'
>> Create a readonly snapshot of 'data' in './second_backup'
>> At subvol second_backup
>> Create a snapshot of 'first_backup' in './first_backup_rw'
>> At snapshot second_backup
>> receiving snapshot second_backup uuid=e6159a2a-3430-344a-a23d-b9bb83851a63,
>> ctransid=28 parent_uuid=20c4ff66-a9ec-fc44-93c6-2c12637e95e6,
>> parent_ctransid=26
>> ERROR: could not find parent subvolume
>> ---
>>
>> I would think applying the "patch" to first_backup_rw should succeed,
>> because it's exactly the same as first_backup, which is the parent for
>> the send, but it doesn't.
>
> btrfs sub snap -r subvol.1 subvol
> btrfs send subvol.1 -f /subvol.1.btrfs
> #write some more files to subvol
> btrfs sub snap -r subvol.2 subvol
> btrfs send -p subvol.1 subvol.2 -f /subvol.2.btrfs
>
> #To make subvol.1 into subvol.2 by applying subvol.2.btrfs to subvol.1,
> the actual original subvol.1 must be present first or you need to
> "receive" it from subvol.1.btrfs first. And also, I'm pretty sure you
> can't have subvol.2 already present because receive must create it.
>
> Again, I haven't tried > and < so I don't know they work. Have you tried
> -f to point to the files?

According to the manpage, -f is the same as output redirection.

"Output is normally written to stdout. To write to a file, use this
option. An alternative would be to use pipes."

The reason why I can't use something like your sequence of commands
(assuming the order of arguments for snap should be reversed) is that I
want to be able to verify that the diff is correct, since there are still
integrity problems with send/receive.

I was planning to do that by applying the "patch" to a snapshot of the
parent right away, and make sure the patched volume is equal to the
current snapshot (by trying another send, and making sure the output is 0).

Thanks
Matthew
Re: Receive on same subvolume
On Feb 3, 2014, at 3:53 PM, Matthew Lai wrote:

> On 03/02/2014 11:26 AM, Chris Murphy wrote:
>> On Feb 3, 2014, at 11:19 AM, Matthew Lai wrote:
>>
>>> Thanks. I should clarify what I'm trying to do.
>>>
>>> I'm trying to use btrfs send for backup, without having another btrfs
>>> volume.
>>>
>>> So the initial backup is a complete send, piped to Amazon Glacier (so my
>>> machine never has the whole file, and doesn't have space for one).
>> OK so you've use btrfs send piped to Glacier which creates a *file*, I'll
>> call it "initial", not a navigable directory of files? Right?
> That is correct.
>>> It looks like the problem now is the sent file can't be applied to the
>>> original volume (for restore).
>> I'm counting two sent files: initial, increment1. I'm not sure which one
>> you're applying. If you have the exact same read-only snapshot that the
>> btrfs send file "initial" is based on, then you'd apply the increment1 to
>> that read-only snapshot which will cause a new read-only snapshot to be
>> created with the incremental data applied to it. The error you're getting
>> sounds like the parent read-only snapshot isn't available?
>>
> That is also correct. There are 2 sent files. I am trying to apply
> increment1, on a snapshot of the parent (that was used to create increment1).

I don't understand how you can apply increment1 to the snapshot of
increment1; and also I don't understand how the parent is also increment1.

>
> I added -vv. Here is the test script for reproducing this entire setup.
>
> ---
> #!/bin/sh
>
> btrfs subvolume create data
> btrfs subvolume snapshot -r data first_backup
> touch data/a
> btrfs subvolume snapshot -r data second_backup
> btrfs send -p first_backup second_backup > increment
> btrfs subvolume snapshot first_backup first_backup_rw
> btrfs receive -vv first_backup_rw < increment
> ---
>
> Output:
> ---
> Create subvolume './data'
> Create a readonly snapshot of 'data' in './first_backup'
> Create a readonly snapshot of 'data' in './second_backup'
> At subvol second_backup
> Create a snapshot of 'first_backup' in './first_backup_rw'
> At snapshot second_backup
> receiving snapshot second_backup uuid=e6159a2a-3430-344a-a23d-b9bb83851a63,
> ctransid=28 parent_uuid=20c4ff66-a9ec-fc44-93c6-2c12637e95e6,
> parent_ctransid=26
> ERROR: could not find parent subvolume
> ---
>
> I would think applying the "patch" to first_backup_rw should succeed, because
> it's exactly the same as first_backup, which is the parent for the send, but
> it doesn't.

btrfs sub snap -r subvol.1 subvol
btrfs send subvol.1 -f /subvol.1.btrfs
#write some more files to subvol
btrfs sub snap -r subvol.2 subvol
btrfs send -p subvol.1 subvol.2 -f /subvol.2.btrfs

#To make subvol.1 into subvol.2 by applying subvol.2.btrfs to subvol.1,
the actual original subvol.1 must be present first or you need to
"receive" it from subvol.1.btrfs first. And also, I'm pretty sure you
can't have subvol.2 already present because receive must create it.

Again, I haven't tried > and < so I don't know they work. Have you tried
-f to point to the files?

Chris Murphy
Re: [PATCH] Btrfs: add regression test for running snapshot and send concurrently
On Mon, Feb 03, 2014 at 11:22:36PM +0800, Wang Shilong wrote:
> From: Wang Shilong
>
> Btrfs would fail to send if snapshot run concurrently, this test is to make
> sure we have fixed the bug.
>

Couple of comments below.

> +_scratch_mkfs > /dev/null 2>&1
> +_scratch_mount
> +
> +
> +touch $SCRATCH_MNT/foo
> +
> +# get file with fragments by using backwards writes.
> +for i in `seq 10240 -1 1`; do
> +	$XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \
> +	$SCRATCH_MNT/foo > /dev/null | _filter_xfs_io

Indentation.

> +done
> +
> +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT \
> +	$SCRATCH_MNT/snap_1 >> $seqres.full 2>&1
> +
> +$BTRFS_UTIL_PROG send -f $SCRATCH_MNT/send_file \
> +	$SCRATCH_MNT/snap_1 >> $seqres.full 2>&1 &
> +
> +pid=$!
> +
> +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT/snap_1 \
> +	$SCRATCH_MNT/snap_2 >> $seqres.full 2>&1
> +
> +wait $pid || echo "Failed to send, see dmesg"

This seems kind of racy. It assumes that the send command doesn't
complete before the wait $pid call is made. If $pid doesn't exist at
this time because it has completed, wait will return 127 and the test
will fail.

Also, why would a failure to send result in meaningful information in
dmesg? Shouldn't the userspace command output information to tell you
why there was a failure into $seqres.full?

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH] Btrfs: throttle delayed refs better
On Mon, 3 Feb 2014 16:08:08 -0500 Josef Bacik wrote:

>
> On 02/03/2014 01:28 PM, Johannes Hirte wrote:
> > On Thu, 23 Jan 2014 13:07:52 -0500
> > Josef Bacik wrote:
> >
> >> On one of our gluster clusters we noticed some pretty big lag
> >> spikes. This turned out to be because our transaction commit was
> >> taking like 3 minutes to complete. This is because we have like 30
> >> gigs of metadata, so our global reserve would end up being the max
> >> which is like 512 mb. So our throttling code would allow a
> >> ridiculous amount of delayed refs to build up and then they'd all
> >> get run at transaction commit time, and for a cold mounted file
> >> system that could take up to 3 minutes to run. So fix the
> >> throttling to be based on both the size of the global reserve and
> >> how long it takes us to run delayed refs. This patch tracks the
> >> time it takes to run delayed refs and then only allows 1 seconds
> >> worth of outstanding delayed refs at a time. This way it will
> >> auto-tune itself from cold cache up to when everything is in
> >> memory and it no longer has to go to disk. This makes our
> >> transaction commits take much less time to run. Thanks,
> >>
> >> Signed-off-by: Josef Bacik
> > This one breaks my system. Shortly after boot the btrfs-freespace
> > thread goes up to 100% CPU usage and the system is nearly
> > unresponsive. I've seen it first with the full pull request for
> > 3.14-rc1 and was able to track it down to this patch.
> Could you turn on the softlockup timer and see if you can get a
> backtrace of where it is stuck? In the meantime I will go through
> and see if I can pinpoint where it may be happening. Thanks,
>
> Josef

This is what I've got with

CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_TIMER_STATS=y
CONFIG_DEBUG_PREEMPT=y

[ 203.610758] perf samples too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 5
[ 360.625822] INFO: task btrfs-endio-wri:1075 blocked for more than 120 seconds.
[ 360.625826]       Not tainted 3.14.0-rc1 #19
[ 360.625828] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.625829] btrfs-endio-wri D 880137c12d00 0 1075 2 0x
[ 360.625833] 8800b6b10950 0002 00012d00 8800b6b10950
[ 360.625837] 8801325b3fd8 8800a2dcc000 8801325719e8
[ 360.625840] 880132571800 8800b635ba00 81256192
[ 360.625844] Call Trace:
[ 360.625854] [] ? wait_current_trans.isra.19+0xbb/0xdf
[ 360.625858] [] ? finish_wait+0x65/0x65
[ 360.625860] [] ? start_transaction+0x2f1/0x4e3
[ 360.625864] [] ? btrfs_finish_ordered_io+0x44c/0x7b2
[ 360.625869] [] ? try_to_del_timer_sync+0x53/0x5e
[ 360.625871] [] ? del_timer_sync+0x26/0x43
[ 360.625875] [] ? schedule_timeout+0xeb/0x104
[ 360.625877] [] ? rcu_read_unlock_sched_notrace+0x11/0x11
[ 360.625882] [] ? worker_loop+0x162/0x4c3
[ 360.625884] [] ? btrfs_queue_worker+0x275/0x275
[ 360.625888] [] ? kthread+0xa3/0xab
[ 360.625893] [] ? trace_preempt_on+0xd/0x2a
[ 360.625895] [] ? freeze_workqueues_begin+0x8/0x11e
[ 360.625897] [] ? __kthread_parkme+0x5a/0x5a
[ 360.625901] [] ? ret_from_fork+0x7c/0xb0
[ 360.625903] [] ? __kthread_parkme+0x5a/0x5a
[ 360.625906] INFO: task btrfs-transacti:1084 blocked for more than 120 seconds.
[ 360.625908]       Not tainted 3.14.0-rc1 #19
[ 360.625909] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.625910] btrfs-transacti D 880137c52d00 0 1084 2 0x
[ 360.625912] 880132428950 0002 00012d00 880132428950
[ 360.625915] 8800b5a35fd8 8801331a5a70 8801331a5ae8
[ 360.625918] 8800aba981b8 00015000 0001 8126b986
[ 360.625921] Call Trace:
[ 360.625925] [] ? btrfs_start_ordered_extent+0x91/0xdf
[ 360.625928] [] ? finish_wait+0x65/0x65
[ 360.625931] [] ? btrfs_wait_ordered_range+0xab/0x10a
[ 360.625934] [] ? __btrfs_write_out_cache+0x43c/0x67f
[ 360.625939] [] ? kmem_cache_free+0x66/0x10d
[ 360.625942] [] ? btrfs_update_inode_item+0xb9/0xcd
[ 360.625944] [] ? __btrfs_prealloc_file_range+0x276/0x2db
[ 360.625947] [] ? btrfs_write_out_ino_cache+0x3f/0x5e
[ 360.625950] [] ? btrfs_save_ino_cache+0x269/0x2ea
[ 360.625952] [] ? commit_fs_roots.isra.17+0xa6/0x148
[ 360.625954] [] ? trace_preempt_on+0xd/0x2a
[ 360.625958] [] ? preempt_count_sub+0xbd/0xc9
[ 36
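
As a side note on the throttling idea in the quoted patch description (time how long delayed refs take to run, then allow only about one second's worth to stay outstanding), here is a minimal user-space sketch. It is not the kernel patch: `struct ref_throttle` and the three helper functions are hypothetical names, and the 3:1 weighting of old average to new sample is an assumption chosen for illustration. It only shows how such a limit tunes itself as the measured cost per ref drops from cold cache to warm cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical running statistics; not the kernel's data structures. */
struct ref_throttle {
    uint64_t avg_ns_per_ref;   /* smoothed cost of running one delayed ref */
};

/* Fold one measurement (refs run, time taken) into the moving average. */
static void throttle_update(struct ref_throttle *t, uint64_t refs_run,
                            uint64_t elapsed_ns)
{
    assert(t != NULL);
    if (refs_run == 0)
        return;

    uint64_t sample = elapsed_ns / refs_run;

    if (t->avg_ns_per_ref == 0)
        t->avg_ns_per_ref = sample;      /* first measurement */
    else                                 /* assumed 3:1 old/new weighting */
        t->avg_ns_per_ref = (t->avg_ns_per_ref * 3 + sample) / 4;
}

/* Number of refs that can be run in roughly one second at the current cost. */
static uint64_t throttle_max_outstanding(const struct ref_throttle *t)
{
    if (t->avg_ns_per_ref == 0)
        return UINT64_MAX;               /* no data yet: do not throttle */
    return NSEC_PER_SEC / t->avg_ns_per_ref;
}

/* Nonzero when more refs are queued than one second of work allows. */
static int throttle_should_flush(const struct ref_throttle *t,
                                 uint64_t outstanding)
{
    return outstanding > throttle_max_outstanding(t);
}
```

With a cold cache costing 2 ms per ref, this model allows 500 outstanding refs; once a faster sample is folded in, the average drops and the allowance grows, which is the auto-tuning behaviour the description refers to.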
Re: Receive on same subvolume
On 03/02/2014 11:26 AM, Chris Murphy wrote:
> On Feb 3, 2014, at 11:19 AM, Matthew Lai wrote:
>
>> Thanks. I should clarify what I'm trying to do.
>>
>> I'm trying to use btrfs send for backup, without having another btrfs
>> volume.
>>
>> So the initial backup is a complete send, piped to Amazon Glacier (so my
>> machine never has the whole file, and doesn't have space for one).
> OK so you've use btrfs send piped to Glacier which creates a *file*, I'll
> call it "initial", not a navigable directory of files? Right?

That is correct.

>> It looks like the problem now is the sent file can't be applied to the
>> original volume (for restore).
> I'm counting two sent files: initial, increment1. I'm not sure which one
> you're applying. If you have the exact same read-only snapshot that the
> btrfs send file "initial" is based on, then you'd apply the increment1 to
> that read-only snapshot which will cause a new read-only snapshot to be
> created with the incremental data applied to it. The error you're getting
> sounds like the parent read-only snapshot isn't available?

That is also correct. There are 2 sent files. I am trying to apply
increment1, on a snapshot of the parent (that was used to create increment1).

I added -vv. Here is the test script for reproducing this entire setup.

---
#!/bin/sh

btrfs subvolume create data
btrfs subvolume snapshot -r data first_backup
touch data/a
btrfs subvolume snapshot -r data second_backup
btrfs send -p first_backup second_backup > increment
btrfs subvolume snapshot first_backup first_backup_rw
btrfs receive -vv first_backup_rw < increment
---

Output:
---
Create subvolume './data'
Create a readonly snapshot of 'data' in './first_backup'
Create a readonly snapshot of 'data' in './second_backup'
At subvol second_backup
Create a snapshot of 'first_backup' in './first_backup_rw'
At snapshot second_backup
receiving snapshot second_backup uuid=e6159a2a-3430-344a-a23d-b9bb83851a63, ctransid=28 parent_uuid=20c4ff66-a9ec-fc44-93c6-2c12637e95e6, parent_ctransid=26
ERROR: could not find parent subvolume
---

I would think applying the "patch" to first_backup_rw should succeed,
because it's exactly the same as first_backup, which is the parent for
the send, but it doesn't.

Thanks
Matthew
Re: lost with degraded RAID1
State is: I won't use this filesystem again. I have a backup. So I am
interested in giving the necessary information for debugging it, and
afterwards I will format it and create a new one.

I already ran fsck and btrfsck --repair and saved the output to text
files, but they are more than 4 MB in size. So I will post excerpts:

=== file: btrfsck.out ===
Checking filesystem on /dev/mapper/bunkerA
UUID: 7f954a85-7566-4251-832c-44f2d3de2211
42 parent transid verify failed on 1887688011776 wanted 121037 found 88533
parent transid verify failed on 1888518615040 wanted 121481 found 90267
parent transid verify failed on 1681394102272 wanted 110919 found 91024
parent transid verify failed on 1888522838016 wanted 121486 found 90270
parent transid verify failed on 1888398331904 wanted 121062 found 89987
leaf parent key incorrect 1887867330560
bad block 1887867330560
leaf parent key incorrect 188812032
bad block 188812032
leaf parent key incorrect 1888124637184
bad block 1888124637184
leaf parent key incorrect 1888131444736
bad block 1888131444736
[...and so on for 4MB]
bad block 1888513552384
leaf parent key incorrect 1888513642496
bad block 1888513642496
leaf parent key incorrect 1888513654784
bad block 1888513654784
leaf parent key incorrect 1888514023424
bad block 1888514023424
btrfsck: cmds-check.c:2212: check_owner_ref: Assertion `!(rec->is_root)' failed.

=== file: smartctl-before-btrfschk-repair ===
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.12-0.bpo.1-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 118   099   006    Pre-fail Always  -           172055696
  3 Spin_Up_Time            0x0003 093   093   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032 100   100   020    Old_age  Always  -           7
  5 Reallocated_Sector_Ct   0x0033 100   100   010    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000f 069   060   030    Pre-fail Always  -           9085642
  9 Power_On_Hours          0x0032 097   097   000    Old_age  Always  -           2769
 10 Spin_Retry_Count        0x0013 100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   020    Old_age  Always  -           7
184 End-to-End_Error        0x0032 100   100   099    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032 100   100   000    Old_age  Always  -           0
188 Command_Timeout         0x0032 100   100   000    Old_age  Always  -           0
189 High_Fly_Writes         0x003a 083   083   000    Old_age  Always  -           17
190 Airflow_Temperature_Cel 0x0022 077   071   045    Old_age  Always  -           23 (Min/Max 22/23)
191 G-Sense_Error_Rate      0x0032 100   100   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0032 100   100   000    Old_age  Always  -           5
193 Load_Cycle_Count        0x0032 100   100   000    Old_age  Always  -           7
194 Temperature_Celsius     0x0022 023   040   000    Old_age  Always  -           23 (0 20 0 0)
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           0

=== file: btrfsck-repair.out ===
enabling repair mode
Checking filesystem on /dev/mapper/bunkerA
UUID: 7f954a85-7566-4251-832c-44f2d3de2211
ify failed on 1887688011776 wanted 121037 found 88533
parent transid verify failed on 1888518615040 wanted 121481 found 90267
parent transid verify failed on 1681394102272 wanted 110919 found 91024
parent transid verify failed on 1888522838016 wanted 121486 found 90270
parent transid verify failed on 1888398331904 wanted 121062 found 89987
leaf parent key incorrect 1887867330560
bad block 1887867330560
[...and so on for 4MB]
bad block 1888513642496
leaf parent key incorrect 1888513654784
bad block 1888513654784
leaf parent key incorrect 1888514023424
bad block 1888514023424
btrfsck: cmds-check.c:2212: check_owner_ref: Assertion `!(rec->is_root)' failed.

=== file: smartctl-after-btrfschk-repair ===
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.12-0.bpo.1-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 118   099   0
Re: [PATCH] Btrfs: convert to add transaction protection for btrfs send
On 01/31/2014 11:37 AM, Wang Shilong wrote:
> Hello Josef,
>
>> On 2014-1-31, at 12:23 AM, Josef Bacik wrote:
>>
>>> On 01/30/2014 11:20 AM, Wang Shilong wrote:
>>>> Hello Josef,
>>>>> On 01/30/2014 04:42 AM, Wang Shilong wrote:
>>>>>> Hi Josef,
>>>>>>
>>>>>>> On 01/29/2014 10:32 AM, Wang Shilong wrote:
>>>>>>>> From: Wang Shilong
>>>>>>>>
>>>>>>>> I sent a patch to kick off transaction from btrfs send, however it
>>>>>>>> gets a regression that btrfs send try to search extent commit root
>>>>>>>> without transaction protection.
>>>>>>>>
>>>>>>>> To fix this regression, we have two ideas:
>>>>>>>>
>>>>>>>> 1. don't use extent commit root for sending.
>>>>>>>>
>>>>>>>> 2. add transaction protection to use extent commit root safely.
>>>>>>>>
>>>>>>>> Both approaches need transaction actually, however, the first
>>>>>>>> approach will add extent tree lock contention, so we'd better adopt
>>>>>>>> the second approach.
>>>>>>>>
>>>>>>>> Luckily, now we only need transaction protection when iterating
>>>>>>>> extent root, the protection's *range* is smaller than before.
>>>>>>> So what is the problem exactly? How does it show up and what are you
>>>>>>> doing to make it happen? I'd really like to kill the transaction
>>>>>>> taking completely in the send path so I'd like to know what is going
>>>>>>> wrong so we can either take the extent commit semaphore and be
>>>>>>> satisfied that is ok or come up with a different solution. Thanks,
>>>>>> See in find_extent_clone(), we have to walk backrefs while we have to
>>>>>> search extent tree!
>>>>>> i was thinking to kick off transaction for initial full send, however,
>>>>>> we need to consider ref links even
>>>>>> in the initial send.
>>>>>>
>>>>>> It is easy to trigger problems like the following steps:
>>>>>>
>>>>>> # mkfs.btrfs -f /dev/sda8
>>>>>> # mount /dev/sda8 /mnt
>>>>>> # dd if=/dev/zero of=/mnt/data bs=4k count=102400 oflag=direct
>>>>>> # btrfs sub snapshot -r /mnt /mnt/snap
>>>>>> # btrfs send /mnt/snap -f /mnt/send_file &
>>>>>> # btrfs sub snapshot /mnt/snap /mnt/snap_1
>>>>>>
>>>>>> Feel free to correct me if i miss something here^_^(As i sometimes made
>>>>>> some mistakes).
>>>>>>
>>>>> Ok so this is a lot of broken things, but it's not really the extent
>>>>> root, cause like I said before nothings going to change that matters for
>>>>> the snapshots bytes.
>>>>>
>>>>> What _does_ matter is the actual commit root for the actual fs root, and
>>>>> that requires quite a bit of manoeuvring to get right. So I'll send a
>>>>> patch in a few minutes when I'm happy with what I have to fix this. In
>>>>> the meantime would you rig this example up into an xfstest so we can make
>>>>> sure we don't have this problem in the future? Thanks,
>>>> I am a little confused that we don't need protect extent commit root
>>>> anyway, it is really safe to search extent commit root without any
>>>> transaction protection^_^….
>>>>
>>>> And i am ok to send a xfstest case for this..
>>> Sorry I didn't say that quite right. We definitely need to protect the
>>> commit root for the extent root because we could easily swap it out and
>>> then write over blocks as we search down it, which would break things. But
>>> that's not what was screwing up here, we are cow'ing the root for /mnt/snap
>>> and swapping out the commit root out from under us which is screwing us up
>>> because we end up with a different root level than what we are expecting.
>>>
>>> So we need to use extent_commit_sem anywhere we search the commit root for
>>> the extent tree, but we also need to do the same for searching the fs
>>> roots. Thanks,
> By some debugging, i found snapshots will cow src root(this is a little
> strange...), we need do the same thing
> for searching fs roots. Really thanks for looking into issue, and correct me,
> waiting for your fix.^_^ ^_^
>

So I've figured it out. We definitely need to protect the commit roots,
but that's not what is screwing us. Say we have commit root for snap at
block 1 and we search down the extent tree and see that it is at 1. Then
we go to do the search down to level on the root for that block, but in
the meantime we've snapshotted and switched the commit root for that
fs_tree to block 2. We go to search down and don't find our bytenr we
were looking for and we exit out without finding our original subvolume.

So there are a few things we can do here

1) Only switch the commit roots for the fs_root _after_ we switch the
extent root commit root. This works out well because we'd need to hold
the extent_commit_sem for the entirety of this operation so we'd end up
with a consistent view of everything. The drawback of this is that we
have to process the fs_roots twice, once to update the root items and
then again to swap the commit roots.

2) Remove the per-root rwsem for the commit root and just make one big
rwsem that covers all commit root switching. This way everybody who
wants to search with the commit root can just
hitting BUG_ON on troublesome FS
First, a bit of history of the filesystem: it used to be 6 disks, now 5. Partially raid1 / raid10. It has been migrated back and forth a few times.

At some point, a balance would not complete and would end with 164 ENOSPC errors, while there was plenty of unallocated space on each disk. I scanned for extents larger than 1 gig and found a few, so I ran a recursive balance of the entire FS.

I decided to empty the filesystem and format it. I pulled most files off it, some via btrfs send/receive, some via rsync, but 1 subvol wouldn't send. I don't remember the exact error, but it was that an extent could not be found on 1 of the disks.

With only a few 100 gig of data left, I decided to balance some remaining empty space before doing a `btrfs dev del`, so as to have another disk to store more data on. But I'm hitting a snag: I hit a BUG_ON when doing a `btrfs bal start -dusage=2 /mountpoint`:

[ 3327.678329] btrfs: found 198 extents
[ 3328.117274] btrfs: relocating block group 84473084968960 flags 17
[ 3329.278521] btrfs: found 103 extents
[ 3331.907931] btrfs: found 103 extents
[ 3332.386172] btrfs: relocating block group 84466642518016 flags 17
[ .536595] btrfs: found 86 extents
[ 3335.982967] btrfs: found 86 extents
[ 3336.599555] btrfs (4746) used greatest stack depth: 2744 bytes left
[ 3379.073464] btrfs: relocating block group 89878368419840 flags 17
[ 3381.608948] btrfs: found 499 extents
[ 3383.884696] [ cut here ]
[ 3383.884720] kernel BUG at fs/btrfs/relocation.c:3405!
[ 3383.884731] invalid opcode: [#1] SMP
[ 3383.884742] Modules linked in:
[ 3383.884753] CPU: 0 PID: 5663 Comm: btrfs Not tainted 3.13.0 #1
[ 3383.884763] Hardware name: System manufacturer System Product Name/E45M1-I DELUXE, BIOS 0405 08/08/2012
[ 3383.884778] task: 8802360eae80 ti: 88010dcaa000 task.ti: 88010dcaa000
[ 3383.884790] RIP: 0010:[] [] __add_tree_block+0x1c5/0x1e0
[ 3383.884811] RSP: 0018:88010dcaba38 EFLAGS: 00010202
[ 3383.884821] RAX: 0001 RBX: 880039f18000 RCX:
[ 3383.884832] RDX: RSI: RDI:
[ 3383.884843] RBP: 88010dcaba90 R08: 88010dcab9f4 R09: 88010dcab930
[ 3383.884854] R10: R11: 047f R12: 1000
[ 3383.884865] R13: 88023489c630 R14: R15: 528d112e4000
[ 3383.884876] FS: 7f8e27e74880() GS:88023ec0() knlGS:
[ 3383.884888] CS: 0010 DS: ES: CR0: 8005003b
[ 3383.884897] CR2: 7f60d89f35a8 CR3: 0001b5ada000 CR4: 07f0
[ 3383.884907] Stack:
[ 3383.884941] 88010dcabb28 4000812bde34 00a8528d112e 0010
[ 3383.885012] 1000 1000 0f3a 8802348d6990
[ 3383.885082] 88001cbf5a00 880039f18000 00b8 88010dcabb00
[ 3383.885153] Call Trace:
[ 3383.885192] [] add_data_references+0x244/0x2e0
[ 3383.885232] [] relocate_block_group+0x56b/0x640
[ 3383.885272] [] btrfs_relocate_block_group+0x1a2/0x2f0
[ 3383.885313] [] btrfs_relocate_chunk.isra.27+0x6a/0x740
[ 3383.885355] [] ? btrfs_set_path_blocking+0x31/0x70
[ 3383.885432] [] ? btrfs_search_slot+0x386/0x960
[ 3383.885473] [] ? free_extent_buffer+0x47/0xa0
[ 3383.885513] [] btrfs_balance+0x90b/0xea0
[ 3383.885553] [] btrfs_ioctl_balance+0x162/0x520
[ 3383.885592] [] btrfs_ioctl+0xcbd/0x25c0
[ 3383.885632] [] ? __do_page_fault+0x1dc/0x520
[ 3383.885673] [] do_vfs_ioctl+0x2c8/0x490
[ 3383.885712] [] SyS_ioctl+0x81/0xa0
[ 3383.885752] [] tracesys+0xdd/0xe2
[ 3383.885787] Code: ff 48 8b 4d a8 48 8d 75 b6 4c 89 ea 48 89 df e8 42 e7 ff ff 4c 89 ef 89 45 a8 e8 c7 0f f9 ff 8b 45 a8 e9 69 ff ff ff 85 c0 74 d6 <0f> 0b 66 0f 1f 84 00 00 00 00 00 b8 f4 ff ff ff e9 50 ff ff ff
[ 3383.886001] RIP [] __add_tree_block+0x1c5/0x1e0
[ 3383.886042] RSP
[ 3383.886359] ---[ end trace 075209044ce10da3 ]---

Anything I can do to resolve / debug the issue?

Remco
Re: [PATCH] Btrfs: throttle delayed refs better
On 02/03/2014 01:28 PM, Johannes Hirte wrote:
> On Thu, 23 Jan 2014 13:07:52 -0500 Josef Bacik wrote:
>
>> On one of our gluster clusters we noticed some pretty big lag spikes. This
>> turned out to be because our transaction commit was taking like 3 minutes to
>> complete. This is because we have like 30 gigs of metadata, so our global
>> reserve would end up being the max which is like 512 mb. So our throttling
>> code would allow a ridiculous amount of delayed refs to build up and then
>> they'd all get run at transaction commit time, and for a cold mounted file
>> system that could take up to 3 minutes to run. So fix the throttling to be
>> based on both the size of the global reserve and how long it takes us to run
>> delayed refs. This patch tracks the time it takes to run delayed refs and
>> then only allows 1 seconds worth of outstanding delayed refs at a time. This
>> way it will auto-tune itself from cold cache up to when everything is in
>> memory and it no longer has to go to disk. This makes our transaction
>> commits take much less time to run. Thanks,
>>
>> Signed-off-by: Josef Bacik
>
> This one breaks my system. Shortly after boot the btrfs-freespace
> thread goes up to 100% CPU usage and the system is nearly unresponsive.
> I've seen it first with the full pull request for 3.14-rc1 and was able
> to track it down to this patch.

Could you turn on the softlockup timer and see if you can get a backtrace of where it is stuck? In the meantime I will go through and see if I can pinpoint where it may be happening. Thanks,

Josef
Re: lost with degraded RAID1
On Feb 3, 2014, at 1:55 PM, Johan Kröckel wrote: > 2014-01-30 Chris Murphy : >> >> On Jan 30, 2014, at 10:58 AM, Hugo Mills wrote: >> >>> On Thu, Jan 30, 2014 at 10:33:21AM -0700, Chris Murphy wrote: You're doing an online conversion of a degraded raid1 volume into single? Does anyone know if this is expected or intended to work? >>> >>> I don't see why not. One suggested method of recovering RAID from a >>> degraded situation is to rebalance over just the remaining devices >>> (space permitting, of course). >> >> Right but that's not a conversion. That's a regular balance on a degraded >> mount, with multiple remaining devices: e.g. a 4 disk raid1, drive fails, >> mount -o degraded, delete missing, then balance will replicate any missing >> 2nd copies onto three drives. >> >> The bigger problem at the moment is that -o degraded isn't working for >> Johan. The too many missing devices message seems like a bug and with >> limited information it may even be whatever that bug is, that cause the >> conversion to fail. Some 11GB were converted prior to the failure. > Which usefull information can provide. On the weekend I was at the > server and found out, that the vanishing of the drive at reboot was > strange behavior of the bios. So the drive is online again. but the > filesystem is still showing strange behavior, but now I can mount it > rw. I'd like to see btrfs fi df results for the volume. And new btrfs check. And then a backup if needed, and then a scrub to see if that fixes anything broken between them. I'm not sure what happens if a new generation object is broken and the old generation is OK, what scrub will do? Maybe it just reports it, I'm not sure. If you want you could do a btrfs scrub -r which is read only and just reports what the problems are. You also have an incomplete balance, right? So it's possible some things might not be fixable if the conversion to single was successful. 
You'll need to decide if you want to reconvert back to data/metadata raid1/raid from whatever you're at now.

Chris Murphy
Re: lost with degraded RAID1
2014-01-30 Chris Murphy :
>
> On Jan 30, 2014, at 10:58 AM, Hugo Mills wrote:
>
>> On Thu, Jan 30, 2014 at 10:33:21AM -0700, Chris Murphy wrote:
>>> You're doing an online conversion of a degraded raid1 volume into single?
>>> Does anyone know if this is expected or intended to work?
>>
>> I don't see why not. One suggested method of recovering RAID from a
>> degraded situation is to rebalance over just the remaining devices
>> (space permitting, of course).
>
> Right but that's not a conversion. That's a regular balance on a degraded
> mount, with multiple remaining devices: e.g. a 4 disk raid1, drive fails,
> mount -o degraded, delete missing, then balance will replicate any missing
> 2nd copies onto three drives.
>
> The bigger problem at the moment is that -o degraded isn't working for Johan.
> The too many missing devices message seems like a bug and with limited
> information it may even be whatever that bug is, that cause the conversion to
> fail. Some 11GB were converted prior to the failure.

Which useful information can I provide? On the weekend I was at the server and found out that the vanishing of the drive at reboot was strange behavior of the BIOS. So the drive is online again. The filesystem is still showing strange behavior, but now I can mount it rw.

> Chris Murphy
Re: Receive on same subvolume
On Feb 3, 2014, at 11:19 AM, Matthew Lai wrote:

> Thanks. I should clarify what I'm trying to do.
>
> I'm trying to use btrfs send for backup, without having another btrfs volume.
>
> So the initial backup is a complete send, piped to Amazon Glacier (so my
> machine never has the whole file, and doesn't have space for one).

OK so you've used btrfs send piped to Glacier, which creates a *file* (I'll call it "initial"), not a navigable directory of files? Right?

>
> At the same time I'm keeping a snapshot of the current volume.
>
> On the next incremental backup, I would use the first snapshot as the parent,
> and send the differences to Glacier again (without having the entire file on
> the system at any time).

That's fine as long as the stdout from btrfs send ends up as a self contained file on Glacier. I'll call this "increment1".

>
> It looks like the problem now is the sent file can't be applied to the
> original volume (for restore).

I'm counting two sent files: initial, increment1. I'm not sure which one you're applying. If you have the exact same read-only snapshot that the btrfs send file "initial" is based on, then you'd apply increment1 to that read-only snapshot, which will cause a new read-only snapshot to be created with the incremental data applied to it.

The error you're getting sounds like the parent read-only snapshot isn't available? Have you tried the -vv flag to get more verbose error information when using btrfs receive?

Chris Murphy
Re: [PATCH] Btrfs: throttle delayed refs better
On Thu, 23 Jan 2014 13:07:52 -0500 Josef Bacik wrote: > On one of our gluster clusters we noticed some pretty big lag > spikes. This turned out to be because our transaction commit was > taking like 3 minutes to complete. This is because we have like 30 > gigs of metadata, so our global reserve would end up being the max > which is like 512 mb. So our throttling code would allow a > ridiculous amount of delayed refs to build up and then they'd all get > run at transaction commit time, and for a cold mounted file system > that could take up to 3 minutes to run. So fix the throttling to be > based on both the size of the global reserve and how long it takes us > to run delayed refs. This patch tracks the time it takes to run > delayed refs and then only allows 1 seconds worth of outstanding > delayed refs at a time. This way it will auto-tune itself from cold > cache up to when everything is in memory and it no longer has to go > to disk. This makes our transaction commits take much less time to > run. Thanks, > > Signed-off-by: Josef Bacik This one breaks my system. Shortly after boot the btrfs-freespace thread goes up to 100% CPU usage and the system is nearly unresponsive. I've seen it first with the full pull request for 3.14-rc1 and was able to track it down to this patch. regards, Johannes -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
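To make the throttling idea in the changelog concrete, here is a minimal userspace sketch, assuming a running average of the per-ref cost and a one second budget. It is not the code from the patch: the struct, the function names and the 3:1 weighting of the average are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Hypothetical state: average cost of running one delayed ref, in ns. */
struct sketch_ref_throttle {
	uint64_t avg_runtime_ns;
};

/*
 * Fold one measured batch into the average. The 3:1 weighting towards
 * history is an assumption of this sketch, not taken from the patch.
 */
static void sketch_record_batch(struct sketch_ref_throttle *t,
				uint64_t batch_ns, uint64_t refs_run)
{
	uint64_t per_ref;

	assert(t != NULL);
	if (refs_run == 0)
		return;

	per_ref = batch_ns / refs_run;
	if (t->avg_runtime_ns == 0)
		t->avg_runtime_ns = per_ref;
	else
		t->avg_runtime_ns = (t->avg_runtime_ns * 3 + per_ref) / 4;
}

/*
 * Throttle once the pending refs would take about one second or more
 * to run at the current average cost.
 */
static bool sketch_should_throttle(const struct sketch_ref_throttle *t,
				   uint64_t pending_refs)
{
	uint64_t budget;

	assert(t != NULL);
	if (t->avg_runtime_ns == 0)
		return false;	/* nothing measured yet */

	/* Refs that fit into one second, rounded up to avoid overflow math. */
	budget = (SKETCH_NSEC_PER_SEC + t->avg_runtime_ns - 1) /
		 t->avg_runtime_ns;
	return pending_refs >= budget;
}
```

With a cold cache the measured per-ref cost is high, so the budget is small; as metadata ends up in memory the average drops and more refs are allowed to queue, which is the auto-tuning behaviour the changelog describes.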
Re: Receive on same subvolume
Thanks. I should clarify what I'm trying to do.

I'm trying to use btrfs send for backup, without having another btrfs volume.

So the initial backup is a complete send, piped to Amazon Glacier (so my machine never has the whole file, and doesn't have space for one).

At the same time I'm keeping a snapshot of the current volume.

On the next incremental backup, I would use the first snapshot as the parent, and send the differences to Glacier again (without having the entire file on the system at any time).

It looks like the problem now is the sent file can't be applied to the original volume (for restore).

Thanks
Matthew

On 03/02/2014 9:30 AM, Chris Murphy wrote:
> On Jan 29, 2014, at 2:26 PM, Matthew Lai wrote:
>
>> Hello,
>>
>> Is this supposed to work? (/data is the root volume, /data/a is a subvolume)
>>
>> btrfs subvolume snapshot /data/a /data/b
>> # make some changes in b
>> btrfs send -p /data/a /data/b > delta
>> btrfs receive /data/a < delta
>>
>> I'm getting "ERROR: could not find parent subvolume" on receive.
>> What I'm trying to do is to back up using send/receive, but I don't have 50%
>> free space, and (please correct me if I'm wrong) since receive doesn't do
>> deduplication, I want to use snapshot to do the initial bootstrapping,
>> instead of send/receive without a parent.
>
> I think you've oversimplified your commands, because it looks like you're
> using send/receive on the same file system. But if it's a backup, necessarily
> you'd have to be sending the subvolume(s) to another file system on another
> disk (either on the same system or remotely). So that needs some
> clarification.
>
> Also, btrfs send requires subvolumes to be read only. Are they? And btrfs
> incremental receive expects the identical parent already on the destination.
> Is it?
>
> Also, while I'm not certain it matters, man page says to use -f to specify
> files. I haven't tested < and >. But then also the step where you create this
> intermediate snapshot file isn't necessary, just combine the send receive
> commands through pipe.
>
> https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
>
> Chris Murphy
[PATCH 2/6] btrfs: send: remove virtual_mem member from fs_path
We don't need to keep track of that, it's available via is_vmalloc_addr.

Signed-off-by: David Sterba
---
 fs/btrfs/send.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 524086a882f9..ea427624e842 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -55,7 +55,6 @@ struct fs_path {
 			char *buf;
 			int buf_len;
 			unsigned int reversed:1;
-			unsigned int virtual_mem:1;
 			char inline_buf[];
 		};
 		char pad[PAGE_SIZE];
@@ -241,7 +240,6 @@ static struct fs_path *fs_path_alloc(void)
 	if (!p)
 		return NULL;
 	p->reversed = 0;
-	p->virtual_mem = 0;
 	p->buf = p->inline_buf;
 	p->buf_len = FS_PATH_INLINE_SIZE;
 	fs_path_reset(p);
@@ -265,7 +263,7 @@ static void fs_path_free(struct fs_path *p)
 	if (!p)
 		return;
 	if (p->buf != p->inline_buf) {
-		if (p->virtual_mem)
+		if (is_vmalloc_addr(p->buf))
 			vfree(p->buf);
 		else
 			kfree(p->buf);
@@ -299,13 +297,12 @@ static int fs_path_ensure_buf(struct fs_path *p, int len)
 			tmp_buf = vmalloc(len);
 			if (!tmp_buf)
 				return -ENOMEM;
-			p->virtual_mem = 1;
 		}
 		memcpy(tmp_buf, p->buf, p->buf_len);
 		p->buf = tmp_buf;
 		p->buf_len = len;
 	} else {
-		if (p->virtual_mem) {
+		if (is_vmalloc_addr(p->buf)) {
 			tmp_buf = vmalloc(len);
 			if (!tmp_buf)
 				return -ENOMEM;
@@ -319,7 +316,6 @@ static int fs_path_ensure_buf(struct fs_path *p, int len)
 					return -ENOMEM;
 				memcpy(tmp_buf, p->buf, p->buf_len);
 				kfree(p->buf);
-				p->virtual_mem = 1;
 			}
 		}
 		p->buf = tmp_buf;
-- 
1.7.9
[PATCH 4/6] btrfs: send: lower memory requirements in common case
The fs_path structure uses an inline buffer and falls back to a chain of allocations, but vmalloc is not necessary because PATH_MAX fits into PAGE_SIZE. The size of fs_path has been reduced to 256 bytes from PAGE_SIZE, usually 4k. Experimental measurements show that most paths on a single filesystem do not exceed 200 bytes, and these get stored into the inline buffer directly, which is now 230 bytes. Longer paths are kmalloced when needed. Signed-off-by: David Sterba --- fs/btrfs/send.c | 103 ++- 1 files changed, 34 insertions(+), 69 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index cb12c2ec37dc..4e3a3d413417 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -57,7 +57,12 @@ struct fs_path { unsigned short reversed:1; char inline_buf[]; }; - char pad[PAGE_SIZE]; + /* +* Average path length does not exceed 200 bytes, we'll have +* better packing in the slab and higher chance to satisfy +* a allocation later during send. +*/ + char pad[256]; }; }; #define FS_PATH_INLINE_SIZE \ @@ -262,12 +267,8 @@ static void fs_path_free(struct fs_path *p) { if (!p) return; - if (p->buf != p->inline_buf) { - if (is_vmalloc_addr(p->buf)) - vfree(p->buf); - else - kfree(p->buf); - } + if (p->buf != p->inline_buf) + kfree(p->buf); kfree(p); } @@ -287,40 +288,28 @@ static int fs_path_ensure_buf(struct fs_path *p, int len) if (p->buf_len >= len) return 0; - path_len = p->end - p->start; - old_buf_len = p->buf_len; - len = PAGE_ALIGN(len); - + /* +* First time the inline_buf does not suffice +*/ if (p->buf == p->inline_buf) { - tmp_buf = kmalloc(len, GFP_NOFS | __GFP_NOWARN); - if (!tmp_buf) { - tmp_buf = vmalloc(len); - if (!tmp_buf) - return -ENOMEM; - } - memcpy(tmp_buf, p->buf, p->buf_len); - p->buf = tmp_buf; - p->buf_len = len; + p->buf = kmalloc(len, GFP_NOFS); + if (!p->buf) + return -ENOMEM; + /* +* The real size of the buffer is bigger, this will let the +* fast path happen most of the time +*/ + p->buf_len = ksize(p->buf); } else { - if 
(is_vmalloc_addr(p->buf)) { - tmp_buf = vmalloc(len); - if (!tmp_buf) - return -ENOMEM; - memcpy(tmp_buf, p->buf, p->buf_len); - vfree(p->buf); - } else { - tmp_buf = krealloc(p->buf, len, GFP_NOFS); - if (!tmp_buf) { - tmp_buf = vmalloc(len); - if (!tmp_buf) - return -ENOMEM; - memcpy(tmp_buf, p->buf, p->buf_len); - kfree(p->buf); - } - } - p->buf = tmp_buf; - p->buf_len = len; + p->buf = krealloc(p->buf, len, GFP_NOFS); + if (!p->buf) + return -ENOMEM; + p->buf_len = ksize(p->buf); } + + path_len = p->end - p->start; + old_buf_len = p->buf_len; + if (p->reversed) { tmp_buf = p->buf + old_buf_len - path_len - 1; p->end = p->buf + p->buf_len - 1; @@ -911,9 +900,7 @@ static int iterate_dir_item(struct btrfs_root *root, struct btrfs_path *path, struct btrfs_dir_item *di; struct btrfs_key di_key; char *buf = NULL; - char *buf2 = NULL; - int buf_len; - int buf_virtual = 0; + const int buf_len = PATH_MAX; u32 name_len; u32 data_len; u32 cur; @@ -923,7 +910,6 @@ static int iterate_dir_item(struct btrfs_root *root, struct btrfs_path *path, int num; u8 type; - buf_len = PAGE_SIZE; buf = kmalloc(buf_len, GFP_NOFS); if (!buf) { ret = -ENOMEM; @@ -945,30 +931,12 @@ static int iterate_dir_item(struct btrfs_root *root, struct btrfs_path *path, type = btrfs_dir_type(eb, di); btrfs_dir_item_key_to_cpu(eb, di, &di_key); + /* +* Path too long +*/ if (name_len + data_len > buf_len) { - buf_len = PAGE_ALIGN(name_len + data_len); - if (buf_virtual) { -
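As a userspace analogue of the scheme the changelog describes (short paths live in an inline buffer, longer ones move to a single heap allocation), here is a minimal sketch. It uses malloc/realloc in place of kmalloc/krealloc, a fixed 230 byte inline array, and hypothetical names throughout; it also copies the inline contents on the first growth, which is a choice of this sketch and not a statement about what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_SIZE 230	/* inline size quoted in the changelog */

/* Hypothetical stand-in for fs_path: buf points inline or at the heap. */
struct sketch_path {
	char *buf;
	size_t buf_len;
	char inline_buf[SKETCH_INLINE_SIZE];
};

static void sketch_path_init(struct sketch_path *p)
{
	assert(p != NULL);
	p->buf = p->inline_buf;
	p->buf_len = sizeof(p->inline_buf);
	p->buf[0] = '\0';
}

/* Make sure at least len bytes are available behind p->buf. */
static int sketch_path_ensure(struct sketch_path *p, size_t len)
{
	char *tmp;

	if (p->buf_len >= len)
		return 0;	/* fast path: current buffer is big enough */

	if (p->buf == p->inline_buf) {
		/* First time the inline buffer does not suffice. */
		tmp = malloc(len);
		if (!tmp)
			return -ENOMEM;
		memcpy(tmp, p->buf, p->buf_len);
	} else {
		/* Keep the old pointer until realloc is known to succeed. */
		tmp = realloc(p->buf, len);
		if (!tmp)
			return -ENOMEM;
	}

	p->buf = tmp;
	p->buf_len = len;
	return 0;
}

/* Release a heap buffer if there is one and fall back to inline storage. */
static void sketch_path_free(struct sketch_path *p)
{
	if (p->buf != p->inline_buf)
		free(p->buf);
	sketch_path_init(p);
}
```

A caller would run sketch_path_init() once, call sketch_path_ensure() before writing a longer name, and finish with sketch_path_free(). Because the struct points into itself while the inline buffer is in use, it must not be copied by value.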
[PATCH 3/6] btrfs: send: squeeze bitfilelds in fs_path
We know that buf_len is at most PATH_MAX, 4k, and can merge it with the reversed member. This saves 3 bytes in favor of inline_buf.

Signed-off-by: David Sterba
---
 fs/btrfs/send.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index ea427624e842..cb12c2ec37dc 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -53,8 +53,8 @@ struct fs_path {
 			char *end;
 
 			char *buf;
-			int buf_len;
-			unsigned int reversed:1;
+			unsigned short buf_len:15;
+			unsigned short reversed:1;
 			char inline_buf[];
 		};
 		char pad[PAGE_SIZE];
-- 
1.7.9
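The effect of merging the two members can be checked in userspace with offsetof. The sketch below uses simplified, hypothetical copies of the struct (without the surrounding union and padding) and assumes a compiler that accepts unsigned short bit-fields, as GCC and Clang do; the exact number of bytes gained depends on the ABI.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PATH_MAX 4096	/* assumed value of PATH_MAX */

/* A 15-bit field holds values up to 32767, so PATH_MAX fits. */
static_assert(SKETCH_PATH_MAX <= 0x7fff, "PATH_MAX must fit into 15 bits");

/* Simplified layout before the patch: full int plus a 1-bit flag. */
struct sketch_fs_path_old {
	char *start;
	char *end;
	char *buf;
	int buf_len;
	unsigned int reversed:1;
	char inline_buf[];
};

/* Simplified layout after the patch: both share one 16-bit unit. */
struct sketch_fs_path_new {
	char *start;
	char *end;
	char *buf;
	unsigned short buf_len:15;
	unsigned short reversed:1;
	char inline_buf[];
};

/* Bytes by which inline_buf moves towards the start of the struct. */
static size_t sketch_inline_buf_gain(void)
{
	return offsetof(struct sketch_fs_path_old, inline_buf) -
	       offsetof(struct sketch_fs_path_new, inline_buf);
}
```

On a typical x86-64 GCC build I would expect sketch_inline_buf_gain() to return 3, matching the changelog, but treat that as an expectation to verify rather than a guarantee.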
[PATCH 1/6] btrfs: send: remove prepared member from fs_path
The member is used only to return value back from fs_path_prepare_for_add, we can do it locally and save 8 bytes for the inline_buf path. Signed-off-by: David Sterba --- fs/btrfs/send.c | 26 +- 1 files changed, 13 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 04c07ed51df5..524086a882f9 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -51,7 +51,6 @@ struct fs_path { struct { char *start; char *end; - char *prepared; char *buf; int buf_len; @@ -338,7 +337,8 @@ static int fs_path_ensure_buf(struct fs_path *p, int len) return 0; } -static int fs_path_prepare_for_add(struct fs_path *p, int name_len) +static int fs_path_prepare_for_add(struct fs_path *p, int name_len, + char **prepared) { int ret; int new_len; @@ -354,11 +354,11 @@ static int fs_path_prepare_for_add(struct fs_path *p, int name_len) if (p->start != p->end) *--p->start = '/'; p->start -= name_len; - p->prepared = p->start; + *prepared = p->start; } else { if (p->start != p->end) *p->end++ = '/'; - p->prepared = p->end; + *prepared = p->end; p->end += name_len; *p->end = 0; } @@ -370,12 +370,12 @@ out: static int fs_path_add(struct fs_path *p, const char *name, int name_len) { int ret; + char *prepared; - ret = fs_path_prepare_for_add(p, name_len); + ret = fs_path_prepare_for_add(p, name_len, &prepared); if (ret < 0) goto out; - memcpy(p->prepared, name, name_len); - p->prepared = NULL; + memcpy(prepared, name, name_len); out: return ret; @@ -384,12 +384,12 @@ out: static int fs_path_add_path(struct fs_path *p, struct fs_path *p2) { int ret; + char *prepared; - ret = fs_path_prepare_for_add(p, p2->end - p2->start); + ret = fs_path_prepare_for_add(p, p2->end - p2->start, &prepared); if (ret < 0) goto out; - memcpy(p->prepared, p2->start, p2->end - p2->start); - p->prepared = NULL; + memcpy(prepared, p2->start, p2->end - p2->start); out: return ret; @@ -400,13 +400,13 @@ static int fs_path_add_from_extent_buffer(struct fs_path *p, unsigned long off, int len) { int 
ret; + char *prepared; - ret = fs_path_prepare_for_add(p, len); + ret = fs_path_prepare_for_add(p, len, &prepared); if (ret < 0) goto out; - read_extent_buffer(eb, p->prepared, off, len); - p->prepared = NULL; + read_extent_buffer(eb, prepared, off, len); out: return ret; -- 1.7.9 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] Btrfs send updates - reduce memory consumption
[Sorry if you see this twice, first attempt hasn't appeared in the list yet]

This reduces size of the path buffer in common case. Has been tested by xfstests, but at the moment v3.13 with or without this patch blows, so I'm sending it anyway.

Based on current btrfs-next/master.

David Sterba (6):
  btrfs: send: remove prepared member from fs_path
  btrfs: send: remove virtual_mem member from fs_path
  btrfs: send: squeeze bitfilelds in fs_path
  btrfs: send: lower memory requirements in common case
  btrfs: send: remove BUG from process_all_refs
  btrfs: send: remove BUG_ON from name_cache_delete

 fs/btrfs/send.c | 153 ++
 1 files changed, 62 insertions(+), 91 deletions(-)

-- 
1.7.9
[PATCH 5/6] btrfs: send: remove BUG from process_all_refs
There are only 2 static callers, the BUG would normally be never reached, but let's be nice.

Signed-off-by: David Sterba
---
 fs/btrfs/send.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 4e3a3d413417..b0bf4ff40b5b 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -3568,7 +3568,10 @@ static int process_all_refs(struct send_ctx *sctx,
 		root = sctx->parent_root;
 		cb = __record_deleted_ref;
 	} else {
-		BUG();
+		btrfs_err(sctx->send_root->fs_info,
+				"Wrong command %d in process_all_refs", cmd);
+		ret = -EINVAL;
+		goto out;
 	}
 
 	key.objectid = sctx->cmp_key->objectid;
-- 
1.7.9
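The same pattern, turning an "impossible" branch from a hard stop into a logged error return, looks like this in a small userspace sketch. The command values, callbacks and function names are hypothetical stand-ins, and fprintf replaces btrfs_err.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical command values; the real send code defines its own. */
enum sketch_cmd {
	SKETCH_CMD_NEW_REFS = 1,
	SKETCH_CMD_DELETED_REFS = 2,
};

typedef int (*sketch_ref_cb)(void);

static int sketch_record_new_ref(void)
{
	return 0;
}

static int sketch_record_deleted_ref(void)
{
	return 0;
}

/*
 * Pick a callback for the command. An unknown command is reported and
 * turned into -EINVAL instead of aborting the whole program.
 */
static int sketch_process_all_refs(int cmd)
{
	sketch_ref_cb cb;
	int ret;

	if (cmd == SKETCH_CMD_NEW_REFS) {
		cb = sketch_record_new_ref;
	} else if (cmd == SKETCH_CMD_DELETED_REFS) {
		cb = sketch_record_deleted_ref;
	} else {
		fprintf(stderr, "Wrong command %d in sketch_process_all_refs\n",
			cmd);
		ret = -EINVAL;
		goto out;
	}

	assert(cb != NULL);	/* both known commands set a callback */
	ret = cb();
out:
	return ret;
}
```

The caller then sees an ordinary negative errno and can unwind, which is the "be nice" behaviour the changelog asks for.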
[PATCH 6/6] btrfs: send: remove BUG_ON from name_cache_delete
If cleaning the name cache fails, we could try to proceed at the cost of some memory leak. This is not expected to happen often.

Signed-off-by: David Sterba
---
 fs/btrfs/send.c | 11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index b0bf4ff40b5b..7b17b778eaf7 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -1849,13 +1849,20 @@ static void name_cache_delete(struct send_ctx *sctx,
 
 	nce_head = radix_tree_lookup(&sctx->name_cache,
 			(unsigned long)nce->ino);
-	BUG_ON(!nce_head);
+	if (!nce_head) {
+		btrfs_err(sctx->send_root->fs_info,
+			"name_cache_delete lookup failed ino %llu cache size %d, leaking memory",
+			nce->ino, sctx->name_cache_size);
+	}
 
 	list_del(&nce->radix_list);
 	list_del(&nce->list);
 	sctx->name_cache_size--;
 
-	if (list_empty(nce_head)) {
+	/*
+	 * We may not get to the final release of nce_head if the lookup fails
+	 */
+	if (nce_head && list_empty(nce_head)) {
 		radix_tree_delete(&sctx->name_cache, (unsigned long)nce->ino);
 		kfree(nce_head);
 	}
-- 
1.7.9
Re: [GIT PULL] Btrfs
On Mon 03 Feb 2014 12:54:05 PM EST, David Sterba wrote:
> On Thu, Jan 30, 2014 at 04:52:54PM -0500, Chris Mason wrote:
>> Chris Mason (3) commits (+64/-32):
>> Btrfs: setup inode location during btrfs_init_inode_locked (+9/-9)
>> Btrfs: don't use ram_bytes for uncompressed inline items (+52/-22)
>
> The patches are CC: stable, but haven't gone through the mailinglist.
> Are they still going to be picked by stable?

We do need both in -stable, I'll help with backports.

-chris
[PATCH v3] btrfs: add simple debugfs interface
Help during debugging to export various interesting information and tunables without the need of extra mount options or ioctls. Usage: * declare your variable in sysfs.h, and include where you need it * define the variable in sysfs.c and make it visible via debugfs_create_TYPE Depends on CONFIG_DEBUG_FS. Signed-off-by: David Sterba --- v3: - fix typo in changelog v2: - added missing return to btrfs_init_debugfs - updated error handling to btrfs_init_sysfs, the cleanup is done in btrfs_exit_sysfs - removed #ifdef in btrfs_exit_sysfs, fs/btrfs/sysfs.c | 33 +++-- 1 files changed, 27 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 782374d8fd19..b725e4574448 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -24,6 +24,7 @@ #include #include #include +#include #include "ctree.h" #include "disk-io.h" @@ -593,6 +594,12 @@ static int add_device_membership(struct btrfs_fs_info *fs_info) /* /sys/fs/btrfs/ entry */ static struct kset *btrfs_kset; +/* /sys/kernel/debug/btrfs */ +static struct dentry *btrfs_debugfs_root_dentry; + +/* Debugging tunables and exported data */ +u64 btrfs_debugfs_test; + int btrfs_sysfs_add_one(struct btrfs_fs_info *fs_info) { int error; @@ -636,27 +643,41 @@ failure: return error; } +static int btrfs_init_debugfs(void) +{ +#ifdef CONFIG_DEBUG_FS + btrfs_debugfs_root_dentry = debugfs_create_dir("btrfs", NULL); + if (!btrfs_debugfs_root_dentry) + return -ENOMEM; + + debugfs_create_u64("test", S_IRUGO | S_IWUGO, btrfs_debugfs_root_dentry, + &btrfs_debugfs_test); +#endif + return 0; +} + int btrfs_init_sysfs(void) { int ret; + btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj); if (!btrfs_kset) return -ENOMEM; - init_feature_attrs(); + ret = btrfs_init_debugfs(); + if (ret) + return ret; + init_feature_attrs(); ret = sysfs_create_group(&btrfs_kset->kobj, &btrfs_feature_attr_group); - if (ret) { - kset_unregister(btrfs_kset); - return ret; - } - return 0; + return ret; } void btrfs_exit_sysfs(void) { 
sysfs_remove_group(&btrfs_kset->kobj, &btrfs_feature_attr_group); kset_unregister(btrfs_kset); + debugfs_remove_recursive(btrfs_debugfs_root_dentry); } -- 1.7.9 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] Btrfs
On Thu, Jan 30, 2014 at 04:52:54PM -0500, Chris Mason wrote:
> Chris Mason (3) commits (+64/-32):
> Btrfs: setup inode location during btrfs_init_inode_locked (+9/-9)
> Btrfs: don't use ram_bytes for uncompressed inline items (+52/-22)

The patches are CC: stable, but haven't gone through the mailing list. Are they still going to be picked by stable?

The commit ids:

90d3e592e99b8e374ead2b45148abf506493a959
514ac8ad8793a097c0c9d89202c642479d6dfa34

but unfortunately neither applies directly to anything 3.10+

david
Re: Receive on same subvolume
On Jan 29, 2014, at 2:26 PM, Matthew Lai wrote: > Hello, > > Is this supposed to work? (/data is the root volume, /data/a is a subvolume) > > btrfs subvolume snapshot /data/a /data/b > # make some changes in b > btrfs send -p /data/a /data/b > delta > btrfs receive /data/a < delta > > I'm getting "ERROR: could not find parent subvolume" on receive. > What I'm trying to do is to back up using send/receive, but I don't have 50% > free space, and (please correct me if I'm wrong) since receive doesn't do > deduplication, I want to use snapshot to do the initial bootstrapping, > instead of send/receive without a parent. I think you've oversimplified your commands, because it looks like you're using send/receive on the same file system. But if it's a backup, necessarily you'd have to be sending the subvolume(s) to another file system on another disk (either on the same system or remotely). So that needs some clarification. Also, btrfs send requires subvolumes to be read only. Are they? And btrfs incremental receive expects the identical parent already on the destination. Is it? Also, while I'm not certain it matters, man page says to use -f to specify files. I haven't tested < and >. But then also the step where you create this intermediate snapshot file isn't necessary, just combine the send receive commands through pipe. https://btrfs.wiki.kernel.org/index.php/Incremental_Backup Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: disable snapshot aware defrag for now
On 02/03/2014 09:48 AM, David Sterba wrote:
> On Wed, Jan 29, 2014 at 04:05:30PM -0500, Josef Bacik wrote:
>> It's just broken and it's taking a lot of effort to fix it, so for now just
>> disable it so people can defrag in peace. Thanks,
>>
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Josef Bacik
>> ---
>>  fs/btrfs/inode.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 3b65987..8c0bc31 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -2628,7 +2628,7 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
>>  			EXTENT_DEFRAG, 1, cached_state);
>>  	if (ret) {
>>  		u64 last_snapshot = btrfs_root_last_snapshot(&root->root_item);
>> -		if (last_snapshot >= BTRFS_I(inode)->generation)
>> +		if (0 && last_snapshot >= BTRFS_I(inode)->generation)
>
> That's not very flexible, how are we supposed to test that in the meantime?
> Editing sources is not the preferred way.

Well since I'm the only one currently working on fixing it I'm not worried
about it. If anybody else wants to fix it they can easily change it
themselves. It is so totally broken that I don't want it being turned on by
anybody who can't edit this and change it themselves. Thanks,

Josef
[PATCH] btrfs: send: replace check with an assert in gen_unique_name
The buffer passed to snprintf can hold the fully expanded format string,
64 = 3x largest ULL + 3x char + trailing null. I don't think that removing the
check entirely is a good idea, hence the ASSERT.

Signed-off-by: David Sterba
---
 fs/btrfs/send.c |    6 +-
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 730dce395858..f65355dfc882 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -1408,11 +1408,7 @@ static int gen_unique_name(struct send_ctx *sctx,
 	while (1) {
 		len = snprintf(tmp, sizeof(tmp), "o%llu-%llu-%llu",
 				ino, gen, idx);
-		if (len >= sizeof(tmp)) {
-			/* should really not happen */
-			ret = -EOVERFLOW;
-			goto out;
-		}
+		ASSERT(len < sizeof(tmp));
 
 		di = btrfs_lookup_dir_item(NULL, sctx->send_root,
 				path, BTRFS_FIRST_FREE_OBJECTID,
-- 
1.7.9
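As a quick sanity check of the "64 = 3x largest ULL + 3x char + trailing null" arithmetic, the longest possible name can be expanded in a shell. This is only a sketch and assumes unsigned long long is 64 bits wide, so the largest value is 2^64 - 1 with 20 decimal digits; the variable names are made up for the example.

```shell
#!/bin/bash
# Longest possible expansion of "o%llu-%llu-%llu": three maximal 64-bit
# unsigned values plus the literal 'o' and the two '-' separators.
ULL_MAX=18446744073709551615    # 2^64 - 1, 20 decimal digits (assumed width)
name=$(printf 'o%s-%s-%s' "$ULL_MAX" "$ULL_MAX" "$ULL_MAX")
name_len=${#name}               # 3 * 20 digits + 3 literal characters
buf_needed=$((name_len + 1))    # plus the trailing NUL byte
echo "longest name: $name_len chars, buffer needed: $buf_needed bytes"
```

So snprintf returns at most 63 for this format, which is strictly less than a 64-byte buffer, and the condition in the ASSERT holds for every possible input.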
[PATCH RESEND] xfstests: add test for btrfs incremental send data corruption
Btrfs incremental send had an issue where it would detect a non-existent
file hole and then overwrite the file section that hole covers with zeroes,
overriding file data that it shouldn't.

The respective btrfs kernel patch that fixed this issue is titled:

    Btrfs: fix send file hole detection leading to data corruption
    (https://patchwork.kernel.org/patch/3544831/)

Signed-off-by: Filipe David Borba Manana
Reviewed-by: Josef Bacik
---

This is a patch resend, without any changes to the test, since Dave Chinner
told in his last e-mail to resend any patches that he might have missed on
the last patch merge party.

 tests/btrfs/034     | 101 +++
 tests/btrfs/034.out |   6 +++
 tests/btrfs/group   |   1 +
 3 files changed, 108 insertions(+)
 create mode 100755 tests/btrfs/034
 create mode 100644 tests/btrfs/034.out

diff --git a/tests/btrfs/034 b/tests/btrfs/034
new file mode 100755
index 000..db792de
--- /dev/null
+++ b/tests/btrfs/034
@@ -0,0 +1,101 @@
+#! /bin/bash
+# FS QA Test No. btrfs/034
+#
+# Test for a btrfs incremental send data corruption issue due to
+# bad detection of file holes.
+#
+#---
+# Copyright (c) 2014 Filipe Manana. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=`mktemp -d`
+
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	rm -fr $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_need_to_be_root
+
+rm -f $seqres.full
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+# Create a file such that its file extent items span at least 3 btree leafs.
+# This is necessary to trigger a btrfs incremental send bug where file hole
+# detection was not correct, leading to data corruption by overriding latest
+# data regions of a file with zeroes.
+
+run_check $XFS_IO_PROG -f -c "truncate 104857600" $SCRATCH_MNT/foo
+
+for ((i = 0; i < 940; i++))
+do
+	OFFSET=$((32768 + i * 8192))
+	LEN=$((OFFSET + 8192))
+	run_check $XFS_IO_PROG -c "falloc -k $OFFSET $LEN" $SCRATCH_MNT/foo
+	run_check $XFS_IO_PROG -c "pwrite -S 0xf0 $OFFSET 4096" $SCRATCH_MNT/foo
+done
+
+run_check $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT \
+	$SCRATCH_MNT/mysnap1
+
+run_check $BTRFS_UTIL_PROG filesystem sync $SCRATCH_MNT
+run_check $XFS_IO_PROG -c "truncate 3882008" $SCRATCH_MNT/foo
+
+run_check $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT \
+	$SCRATCH_MNT/mysnap2
+
+run_check $BTRFS_UTIL_PROG send $SCRATCH_MNT/mysnap1 -f $tmp/1.snap
+run_check $BTRFS_UTIL_PROG send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \
+	-f $tmp/2.snap
+
+md5sum $SCRATCH_MNT/foo | _filter_scratch
+md5sum $SCRATCH_MNT/mysnap1/foo | _filter_scratch
+md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
+
+_scratch_unmount
+_check_btrfs_filesystem $SCRATCH_DEV
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+run_check $BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/1.snap
+md5sum $SCRATCH_MNT/mysnap1/foo | _filter_scratch
+
+run_check $BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/2.snap
+md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
+
+_scratch_unmount
+_check_btrfs_filesystem $SCRATCH_DEV
+
+status=0
+exit
diff --git a/tests/btrfs/034.out b/tests/btrfs/034.out
new file mode 100644
index 000..808e6b4
--- /dev/null
+++ b/tests/btrfs/034.out
@@ -0,0 +1,6 @@
+QA output created by 034
+9023ed93111c422d82e9cd54043a6fb0  SCRATCH_MNT/foo
+8e58ce8749d203f29f6b8f6990da722f  SCRATCH_MNT/mysnap1/foo
+9023ed93111c422d82e9cd54043a6fb0  SCRATCH_MNT/mysnap2/foo
+8e58ce8749d203f29f6b8f6990da722f  SCRATCH_MNT/mysnap1/foo
+9023ed93111c422d82e9cd54043a6fb0  SCRATCH_MNT/mysnap2/foo
diff --git a/tests/btrfs/group b/tests/btrfs/group
index b29236c..f9f062f 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -36,3 +36,4 @@
 031 auto quick
 032 auto quick
 033 auto quick
+034 auto quick
-- 
1.7.9.5
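To make the numbers in the test a little easier to follow, here is a small sketch that only redoes the offset arithmetic from the loop, without touching any filesystem. The variable names and the notion of an 8 KiB "slot" are mine, not the test's; the constants (940 iterations, base 32768, stride 8192, truncate to 3882008) are taken from the patch.

```shell
#!/bin/bash
# Recompute where the test's 940 writes start and where the later
# truncate lands relative to them. Pure arithmetic, no I/O.
base=32768
stride=8192
count=940
truncate_to=3882008

first_offset=$((base + 0 * stride))
last_offset=$((base + (count - 1) * stride))
# Index of the 8 KiB slot that contains the truncation point.
cut_slot=$(( (truncate_to - base) / stride ))
# Slots that start entirely beyond the new end of file.
dropped=$((count - 1 - cut_slot))

echo "writes start at offsets $first_offset .. $last_offset"
echo "truncate to $truncate_to falls in slot $cut_slot; $dropped later slots are cut off"
```

So, as far as the arithmetic goes, the second snapshot sees roughly the first half of the written regions, which is the state the incremental send has to reproduce without zeroing data it should keep.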
btrfs raid5 unmountable
Hi,

since Freenode is doomed today, I'm asking the direct way.

The following filesystem:

Label: 'data'  uuid: 3a6fd6d7-5943-4cad-b56f-2e6dcabff453
	Total devices 6 FS bytes used 7.02TiB
	devid    1 size 1.82TiB used 1.82TiB path /dev/sda3
	devid    2 size 2.73TiB used 2.48TiB path /dev/sdc3
	devid    3 size 931.38GiB used 931.38GiB path /dev/sdd3
	devid    5 size 931.51GiB used 931.51GiB path /dev/sde1
	devid    6 size 931.51GiB used 931.51GiB path /dev/sdf1
	devid    7 size 2.73TiB used 2.48TiB path /dev/sdb3

Btrfs v3.12-dirty

If I try to mount it, from dmesg:

[30644.681210] parent transid verify failed on 32059176910848 wanted 259627 found 259431
[30644.681307] parent transid verify failed on 32059176910848 wanted 259627 found 259431
[30644.681399] btrfs bad tree block start 0 32059176910848
[30644.681407] Failed to read block groups: -5
[30644.776879] btrfs: open_ctree failed

btrfs check aborts with (many of the 1st lines):

[...]
Ignoring transid failure
parent transid verify failed on 32059196616704 wanted 259627 found 259432
parent transid verify failed on 32059196616704 wanted 259627 found 259432
Check tree block failed, want=32059196616704, have=32059196747776
parent transid verify failed on 32059196616704 wanted 259627 found 259432
Ignoring transid failure
parent transid verify failed on 32059196616704 wanted 259627 found 259432
Ignoring transid failure
parent transid verify failed on 32059177230336 wanted 259627 found 259431
Ignoring transid failure
parent transid verify failed on 32059196620800 wanted 259627 found 259432
parent transid verify failed on 32059196620800 wanted 259627 found 259432
Check tree block failed, want=32059196620800, have=1983699371120445514
Check tree block failed, want=32059196620800, have=1983699371120445514
Check tree block failed, want=32059196620800, have=1983699371120445514
read block failed check_tree_block
btrfs: cmds-check.c:2212: check_owner_ref: Assertion `!(rec->is_root)' failed.
Aborted

What happened before:

One disk was faulty, I added a new one and removed the old one, followed by
a balance.

So far so good.

Some days after this I accidentally removed a SATA power connector from
another drive, without noticing it at first. Worked about an hour on the
system, building a new kernel on another filesystem. Rebooted with my new
kernel and the FS was not mountable. I noticed the "missing" disk and
reattached the power.

So far I tried:

mount -o recovery
btrfs check
(after google) btrfs-zero-log

Sadly no luck. However, I can get my files with btrfs restore. The
filesystem contains mainly media files, so it is not so bad if they were
lost, but restoring them from backups and sources will need at least about
a week. (Most of the files are mirrored on a private server, but even with
100MBit this takes a lot of time ; )

Any idea how to recover this FS?

Kind Regards
Tetja Rediske
[PATCH] Btrfs: add regression test for running snapshot and send concurrently
From: Wang Shilong

Btrfs send would fail if a snapshot was created concurrently; this test is
to make sure we have fixed the bug.

Signed-off-by: Wang Shilong
---
 tests/btrfs/034     | 75 +
 tests/btrfs/034.out |  2 ++
 tests/btrfs/group   |  1 +
 3 files changed, 78 insertions(+)
 create mode 100644 tests/btrfs/034
 create mode 100644 tests/btrfs/034.out

diff --git a/tests/btrfs/034 b/tests/btrfs/034
new file mode 100644
index 000..e27e3cf
--- /dev/null
+++ b/tests/btrfs/034
@@ -0,0 +1,75 @@
+#!/bin/bash
+# FS QA Test No. btrfs/034
+#
+# Regression test for running snapshots and send concurrently.
+#
+#---
+# Copyright (c) 2014 Fujitsu. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+
+_scratch_mkfs > /dev/null 2>&1
+_scratch_mount
+
+
+touch $SCRATCH_MNT/foo
+
+# get file with fragments by using backwards writes.
+for i in `seq 10240 -1 1`; do
+	$XFS_IO_PROG -f -d -c "pwrite $(($i * 4096)) 4096" \
+	$SCRATCH_MNT/foo > /dev/null | _filter_xfs_io
+done
+
+$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT \
+	$SCRATCH_MNT/snap_1 >> $seqres.full 2>&1
+
+$BTRFS_UTIL_PROG send -f $SCRATCH_MNT/send_file \
+	$SCRATCH_MNT/snap_1 >> $seqres.full 2>&1 &
+
+pid=$!
+
+$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT/snap_1 \
+	$SCRATCH_MNT/snap_2 >> $seqres.full 2>&1
+
+wait $pid || echo "Failed to send, see dmesg"
+
+echo "Silence is golden"
+status=0 ; exit
diff --git a/tests/btrfs/034.out b/tests/btrfs/034.out
new file mode 100644
index 000..4c8873c
--- /dev/null
+++ b/tests/btrfs/034.out
@@ -0,0 +1,2 @@
+QA output created by 034
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index b29236c..f9f062f 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -36,3 +36,4 @@
 031 auto quick
 032 auto quick
 033 auto quick
+034 auto quick
-- 
1.8.4
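The core of the test is the shell pattern "start send in the background, do the snapshot, then collect the send's exit status with wait". A stripped-down sketch of just that pattern follows; the two functions are hypothetical stand-ins for the btrfs commands, so it runs anywhere and says nothing about btrfs itself.

```shell
#!/bin/bash
# Stand-ins for the two btrfs commands; both are hypothetical placeholders.
long_running_send()   { sleep 1; return 0; }
concurrent_snapshot() { return 0; }

long_running_send &        # start the "send" in the background
pid=$!                     # remember its PID
concurrent_snapshot        # run the "snapshot" while the send is in flight

# wait returns the exit status of the background job, so a failed send
# is noticed even though it ran asynchronously.
if wait "$pid"; then
	result="send ok"
else
	result="Failed to send, see dmesg"
fi
echo "$result"
```

Because the expected output of the xfstests case is only "Silence is golden", any "Failed to send" line printed by the wait branch makes the output differ from 034.out and the test fail.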
Re: [PATCH] Btrfs: disable snapshot aware defrag for now
On Wed, Jan 29, 2014 at 04:05:30PM -0500, Josef Bacik wrote:
> It's just broken and it's taking a lot of effort to fix it, so for now just
> disable it so people can defrag in peace. Thanks,
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Josef Bacik
> ---
>  fs/btrfs/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3b65987..8c0bc31 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2628,7 +2628,7 @@ static int btrfs_finish_ordered_io(struct
> btrfs_ordered_extent *ordered_extent)
>  			EXTENT_DEFRAG, 1, cached_state);
>  	if (ret) {
>  		u64 last_snapshot = btrfs_root_last_snapshot(&root->root_item);
> -		if (last_snapshot >= BTRFS_I(inode)->generation)
> +		if (0 && last_snapshot >= BTRFS_I(inode)->generation)

That's not very flexible, how are we supposed to test that in the meantime?
Editing sources is not the preferred way.

I was thinking about adding a config option that would cover any
experimental/broken features, this one being the first, as we currently have
no other way to disable it. I'd rather avoid adding a temporary mount option.
Re: Receive on same subvolume
Hi Matthew,

I'm not sure what you are trying to achieve. Couldn't you simply do another
snapshot of the subvolume? I don't understand why you want to use
send/receive on the same subvolume, to be honest.

Regards,
Felix

On Wed, Jan 29, 2014 at 10:26 PM, Matthew Lai wrote:
> Hello,
>
> Is this supposed to work? (/data is the root volume, /data/a is a subvolume)
>
> btrfs subvolume snapshot /data/a /data/b
> # make some changes in b
> btrfs send -p /data/a /data/b > delta
> btrfs receive /data/a < delta
>
> I'm getting "ERROR: could not find parent subvolume" on receive.
>
> What I'm trying to do is to back up using send/receive, but I don't have 50%
> free space, and (please correct me if I'm wrong) since receive doesn't do
> deduplication, I want to use snapshot to do the initial bootstrapping,
> instead of send/receive without a parent.
>
> Thanks!
> Matthew