Re: Possible deadlock when writing

2018-12-01 Thread Janos Toth F.
I obviously can't be sure (due to the obscure nature of this issue) but I think I observed similar behavior. For me it usually kicked in during scheduled defragmentation runs. I initially suspected it might have something to do with running defrag on files which are still opened for appending writes

Re: lazytime mount option—no support in Btrfs

2018-08-23 Thread Janos Toth F.
On Tue, Aug 21, 2018 at 4:10 PM Austin S. Hemmelgarn wrote: > Also, once you've got the space cache set up by mounting once writable > with the appropriate flag and then waiting for it to initialize, you > should not ever need to specify the `space_cache` option again. True. I am not sure why I
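
A minimal sketch of that one-time setup, assuming the v2 free space tree is the flag in question (device and mount point are placeholders):

    mount -o space_cache=v2 /dev/sdX /mnt/data   # first writable mount builds the cache
    umount /mnt/data
    mount /dev/sdX /mnt/data                     # later mounts keep using it without the option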

Re: lazytime mount option—no support in Btrfs

2018-08-21 Thread Janos Toth F.
> >> so pretty much everyone who wants to avoid the overhead from them can just > >> use the `noatime` mount option. It would be great if someone finally fixed this old bug then: https://bugzilla.kernel.org/show_bug.cgi?id=61601 Until then, it seems practically impossible to use both noatime
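
For context, noatime is applied per mount, so with several subvolumes of the same filesystem it ends up on each fstab line; a hypothetical sketch (UUID and subvolume names are placeholders):

    UUID=<fs-uuid>  /      btrfs  noatime,subvol=rootvol  0 0
    UUID=<fs-uuid>  /home  btrfs  noatime,subvol=home     0 0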

Re: Writeback errors in kernel log with Linux 4.15 (m=s=raid1, d=raid5, 5 disks)

2018-02-01 Thread Janos Toth F.
48 8b 44 24 48 65 48 33 04 25 28 00 00 00 75 0b 48 83 c4 50 5b 5d c3 <0f> ff eb c0 e8 c1 af ef ff 90 48 85 ff 53 48 c7 c3 60 5b 83 89 [ 333.197459] ---[ end trace 4427bc8429f8bec7 ]--- On Fri, Feb 2, 2018 at 2:28 AM, Jano

Writeback errors in kernel log with Linux 4.15 (m=s=raid1, d=raid5, 5 disks)

2018-02-01 Thread Janos Toth F.
I started seeing these on my d=raid5 filesystem after upgrading to Linux 4.15. Some files created since the upgrade seem to be corrupted. The disks seem to be fine (according to btrfs device stats and smartmontools device logs). The rest of the Btrfs filesystems (with m=s=d=single profiles) do
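
The kind of check referred to above, as a sketch (mount point and device names are placeholders):

    btrfs device stats /mnt/raid5    # per-device read/write/csum error counters
    smartctl -a /dev/sdX             # drive-side error log (smartmontools)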

Re: deleted subvols don't go away?

2017-08-27 Thread Janos Toth F.
ID=5 is the default ("root" or "toplevel") subvolume, which can't be deleted anyway (at least not normally; I am not sure if some debug-magic can achieve that). I just checked this (out of curiosity) and all my Btrfs filesystems report something very similar to yours (I thought DELETED was a made up

Re: Btrfs Raid5 issue.

2017-08-21 Thread Janos Toth F.
I lost enough Btrfs m=d=s=RAID5 filesystems in past experiments (I didn't try using RAID5 for metadata and system chunks in the last few years) to faulty SATA cables + hotplug-enabled SATA controllers (where a disk could disappear and reappear "as the wind blew"). Since then, I made a habit of

Re: [RFC] Checksum of the parity

2017-08-13 Thread Janos Toth F.
On Sun, Aug 13, 2017 at 8:45 PM, Chris Murphy wrote: > Further, the error detection of corrupt reconstruction is why I say > Btrfs is not subject *in practice* to the write hole problem. [2] > > [1] > I haven't tested the raid6 normal read case where a stripe contains >

Re: btrfs issue with mariadb incremental backup

2017-08-12 Thread Janos Toth F.
On Sat, Aug 12, 2017 at 11:34 PM, Chris Murphy wrote: > On Fri, Aug 11, 2017 at 11:08 PM, wrote: > > I think the problem is that the script does things so fast that the > snapshot is not always consistent on disk before btrfs send starts. > It's
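
A hedged sketch of one way to give the snapshot time to settle before btrfs send starts (paths are hypothetical, and whether a plain sync is the workaround suggested in the thread is an assumption):

    btrfs subvolume snapshot -r /srv/mariadb /srv/.snapshots/db-now
    sync                                        # let the snapshot reach disk before sending
    btrfs send /srv/.snapshots/db-now | btrfs receive /mnt/backup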

Re: BTRFS warning: unhandled fiemap cache detected

2017-08-09 Thread Janos Toth F.
As far as I can tell, it's nothing to worry about. (I have thousands of these warnings.) I don't know why the patch was submitted for 4.13 but not applied to the next 4.12.x, since it looks like a trivial tiny fix for an annoying problem. On Wed, Aug 9, 2017 at 10:48 AM, Mario Fugazzi®

Re: how to benchmark schedulers

2017-08-08 Thread Janos Toth F.
I think you should consider using Linux 4.12, which has BFQ (bfq-mq) for blk-mq. So you don't need out-of-tree BFQ patches if you switch to blk-mq (but now you are free to do so even if you have HDDs or SSDs which benefit from software schedulers, since you have some multi-queue schedulers for
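
A sketch of what switching to blk-mq with BFQ looked like on kernels of that era (device name is a placeholder):

    # boot with scsi_mod.use_blk_mq=1 on pre-5.0 kernels, then:
    cat /sys/block/sdX/queue/scheduler         # lists the available multi-queue schedulers
    echo bfq > /sys/block/sdX/queue/scheduler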

Re: write corruption due to bio cloning on raid5/6

2017-07-29 Thread Janos Toth F.
Reply to the TL;DR part, so TL;DR marker again... Well, I live on the other extreme now. I want as few filesystems as possible and practical (it's obviously impossible to have a real backup within the same fs and/or device, and with the current size/performance/price differences between HDD and SSD,

Re: write corruption due to bio cloning on raid5/6

2017-07-28 Thread Janos Toth F.
a balance with usage=33 filters. I guess either of those (especially balance) could potentially cause scrub to hang. On Thu, Jul 27, 2017 at 10:44 PM, Duncan <1i5t5.dun...@cox.net> wrote: > Janos Toth F. posted on Thu, 27 Jul 2017 16:14:47 +0200 as excerpted: > >> * This is off-top
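
The balance mentioned above would look roughly like this (mount point is a placeholder; which chunk types the filters covered is not spelled out in the preview):

    btrfs balance start -dusage=33 -musage=33 /mnt/data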

Re: write corruption due to bio cloning on raid5/6

2017-07-27 Thread Janos Toth F.
> It should only affect the dio-written files, the mentioned bug makes > btrfs write garbage into those files, so checksum fails when reading > files, nothing else from this bug. Thanks for confirming that. I thought so but I removed the affected temporary files even before I knew they were

Re: write corruption due to bio cloning on raid5/6

2017-07-24 Thread Janos Toth F.
I accidentally ran into this problem (it's pretty silly because I almost never run RC kernels or do dio writes but somehow I just happened to do both at once, right before I read your patch notes). I didn't initially catch any issues (I see no related messages in the kernel log) but after seeing

Re: Btrfs/SSD

2017-05-13 Thread Janos Toth F.
> Anyway, that 20-33% left entirely unallocated/unpartitioned > recommendation still holds, right? I never liked that idea. And I really disliked how people considered it to be (and even passed it down as) some magical, absolutely foolproof fail-safe thing (because it's not). 1: Unless you

Re: "corrupt leaf, invalid item offset size pair"

2017-05-08 Thread Janos Toth F.
Maybe someone more talented will be able to assist you, but in my experience this kind of damage is fatal in practice (even if you could theoretically fix it, it's probably easier to recreate the fs and restore the content from backup, or use the rescue tool to save some of the old content which

Re: btrfs filesystem keeps allocating new chunks for no apparent reason

2017-04-10 Thread Janos Toth F.
>> The command-line also rejects a number of perfectly legitimate >> arguments that BTRFS does understand too though, so that's not much >> of a test. > > Which are those? I didn't encounter any... I think this bug still stands unresolved (for 3+ years, probably because most people use init-rd/fs

Re: linux 4.8 kernel OOM

2017-02-26 Thread Janos Toth F.
So far 4.10.0 seems to be flawless for me. All the strange OOMs (which may or may not have been related to Btrfs, but it looked that way), random unexplained Btrfs mount failures (as well as various other things totally unrelated to filesystems, like sdhc card reader driver problems) which were

Re: btrfs recovery

2017-01-28 Thread Janos Toth F.
I usually compile my kernels with CONFIG_X86_RESERVE_LOW=640 and CONFIG_X86_CHECK_BIOS_CORRUPTION=N because 640 kilobytes seems like a very cheap price to pay in order to avoid worrying about this (and skip the associated checking + monitoring). Out of curiosity (after reading this email) I set
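
The corresponding .config fragment for the settings mentioned above:

    CONFIG_X86_RESERVE_LOW=640
    # CONFIG_X86_CHECK_BIOS_CORRUPTION is not set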

Re: RAID56 status?

2017-01-23 Thread Janos Toth F.
On Mon, Jan 23, 2017 at 7:57 AM, Brendan Hide wrote: > > raid0 stripes data in 64k chunks (I think this size is tunable) across all > devices, which is generally far faster in terms of throughput in both > writing and reading data. I remember seeing some proposals for

Re: Uncorrectable errors with RAID1

2017-01-16 Thread Janos Toth F.
> BTRFS uses a 2 level allocation system. At the higher level, you have > chunks. These are just big blocks of space on the disk that get used for > only one type of lower level allocation (Data, Metadata, or System). Data > chunks are normally 1GB, Metadata 256MB, and System depends on the
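
To see how that two-level allocation looks on a given filesystem, something like (mount point is a placeholder):

    btrfs filesystem df /mnt/data       # per-type (Data/Metadata/System) chunk usage
    btrfs filesystem usage /mnt/data    # adds the per-device allocation view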

Re: [PATCH] recursive defrag cleanup

2017-01-04 Thread Janos Toth F.
I separated these 9 camera storages into 9 subvolumes (so now I have 10 subvols in total in this filesystem, including the "root" subvol). It's obviously way too early to talk about long-term performance but I can already tell that recursive defrag does NOT descend into "child" subvolumes (it does not pick
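
Since -r stops at subvolume boundaries, the per-subvolume workaround is roughly this (paths are illustrative, assuming each camera subvolume sits under one parent directory):

    for sv in /mnt/cams/*/ ; do
        btrfs filesystem defragment -r "$sv"
    done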

Re: [PATCH] recursive defrag cleanup

2017-01-03 Thread Janos Toth F.
On Tue, Jan 3, 2017 at 5:01 PM, Austin S. Hemmelgarn wrote: > I agree on this point. I actually hadn't known that it didn't recurse into > sub-volumes, and that's a pretty significant caveat that should be > documented (and ideally fixed, defrag doesn't need to worry about

Re: [PATCH] recursive defrag cleanup

2017-01-03 Thread Janos Toth F.
So, in order to defrag "everything" in the filesystem (everything that can be defragmented / potentially needs defrag) I need to run: 1: a recursive defrag starting from the root subvolume (to pick up all the files in all the possible subvolumes and directories) 2: a non-recursive defrag on the root subvolume +
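
Expressed as commands, the steps listed above would look something like this (mount point is a placeholder; whether this really covers "everything" is the open question of the thread):

    btrfs filesystem defragment -r /mnt/top   # 1: recurse over files from the root subvolume
    btrfs filesystem defragment /mnt/top      # 2: the root subvolume itself (non-recursive)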

Re: [PATCH] recursive defrag cleanup

2016-12-28 Thread Janos Toth F.
I still find the defrag tool a little bit confusing from a user perspective: - Does the recursive defrag (-r) also defrag the specified directory's extent tree or should one run two separate commands for completeness (one with -r and one without -r)? - What's the target scope of the extent tree

Re: some free space cache corruptions

2016-12-25 Thread Janos Toth F.
I am not sure I can remember a time when btrfs check did not print this "cache and super generation don't match, space cache will be invalidated" message, so I started ignoring it a long time ago because I never seemed to have problems with missing free space and never got any similar
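
For what it's worth, the v1 cache can be dropped and rebuilt if the message bothers anyone; a sketch (device and mount point are placeholders):

    mount -o clear_cache /dev/sdX /mnt/data   # regenerates the v1 space cache on this mount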

Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread Janos Toth F.
I realize this is only loosely related (if at all) to this topic, but what about these two possible features: - a mount option, or - an attribute (which could be set on directories and/or sub-volumes and applied to any new files created below these) which effectively forces every read/write

Re: btrfs flooding the I/O subsystem and hanging the machine, with bcache cache turned off

2016-12-01 Thread Janos Toth F.
Is there any fundamental reason not to support huge writeback caches? (I mean, besides working around bugs and/or questionably poor design choices which no one wishes to fix.) The obvious drawback is the increased risk of data loss upon hardware failure or kernel panic but why couldn't the user be
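
For context, the writeback cache is already bounded by the dirty-page sysctls; "huge" would simply mean raising these limits (values are only examples):

    sysctl -w vm.dirty_background_bytes=1073741824   # ~1 GiB before background writeback starts
    sysctl -w vm.dirty_bytes=8589934592              # ~8 GiB hard limit on dirty pages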

Re: RFC: raid with a variable stripe size

2016-11-29 Thread Janos Toth F.
I would love to have the stripe element size (per disk portions of logical "full" stripes) changeable online with balance anyway (starting from 512 byte/disk, not placing artificial arbitrary limitations on it at the low end). A small stripe size (for example 4k/disk or even 512byte/disk if you

Re: RFC: raid with a variable stripe size

2016-11-18 Thread Janos Toth F.
> 2016-11-18 23:32 GMT+03:00 Janos Toth F. <toth.f.ja...@gmail.com>: >> Based on the comments of this patch, stripe size could theoretically >> go as low as 512 byte: >> https://mail-archive.com/linux-btrfs@vger.kernel.org/msg56011.html >> If these very small (0.5

Re: RFC: raid with a variable stripe size

2016-11-18 Thread Janos Toth F.
Based on the comments of this patch, stripe size could theoretically go as low as 512 byte: https://mail-archive.com/linux-btrfs@vger.kernel.org/msg56011.html If these very small (0.5k-2k) stripe sizes could really work (it's possible to implement such changes and it does not degrade performance

Re: [Bug 186671] New: OOM on system with just rsync running 32GB of ram 30GB of pagecache

2016-11-18 Thread Janos Toth F.
It could be totally unrelated but I have a similar problem: processes get randomly OOM'd when I am doing anything "sort of heavy" on my Btrfs filesystems. I did some "evil tuning", so I assumed that must be the problem (even if the values looked sane for my system). Thus, I kept cutting back on

btrfs check --repair: ERROR: cannot read chunk root

2016-10-30 Thread Janos Toth F.
I stopped using Btrfs RAID-5 after encountering this problem two times (once due to a failing SATA cable, once due to a random kernel problem which caused the SATA or the block device driver to reset/crash). As far as I can tell, the main problem is that after a de- and a subsequent re-attach (on

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2016-07-23 Thread Janos Toth F.
It seems like I accidentally managed to break my Btrfs/RAID5 filesystem yet again, in a similar fashion. This time around, I ran into some random libata driver issue (?) instead of a faulty hardware part, but the end result is quite similar. I issued the command (replacing X with valid letters

Unexpectedly slow removal of fragmented files (RAID-5)

2016-04-16 Thread Janos Toth F.
As you can see from the attached terminal log below, file deletion can take an unexpectedly long time, even if there is little disk I/O from other tasks. Listing the contents of similar directories (<=1000 files with ~1Gb size) can also be surprisingly slow (several seconds for a simple ls

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-11-06 Thread Janos Toth F.
I created a fresh RAID-5 mode Btrfs on the same 3 disks (including the faulty one which is still producing numerous random read errors) and Btrfs now seems to work exactly as I would anticipate. I copied some data and verified the checksum. The data is readable and correct regardless of the

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-11-04 Thread Janos Toth F.
Well. Now I am really confused about Btrfs RAID-5! So, I replaced all SATA cables (which are explicitly marked as being rated for SATA3 speeds) and all the 3x2Tb WD Red 2.0 drives with 3x4Tb Seagate Constellation ES 3 drives and started from scratch. I secure-erased every drive, created an empty

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Janos Toth F.
I went through all the recovery options I could find (from read-only to "extraordinarily dangerous"). Nothing seemed to work. A Windows-based proprietary recovery tool (ReclaiMe) could scratch the surface, but only that (it showed me the whole original folder structure after a few

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Janos Toth F.
r diagnosing how/why the filesystem got into > this unrecoverable state. > > A single device having issues should not cause the whole filesystem to > become unrecoverable. > > On Wed, Oct 21, 2015 at 9:09 AM, Janos Toth F. <toth.f.ja...@gmail.com> wrote: >> I went thr

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Janos Toth F.
<3>[ 267.246167] BTRFS (device sdd): parent transid verify failed on 38719488 wanted 101765 found 101223 <3>[ 267.246706] BTRFS (device sdd): parent transid verify failed on 38719488 wanted 101765 found 101223 <3>[ 267.246727] BTRFS: Failed to read block groups: -5 <3>[ 267.26

Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-19 Thread Janos Toth F.
I was in the middle of replacing the drives of my NAS one-by-one (I wished to move to bigger and faster storage in the end), so I used one more SATA drive + SATA cable than usual. Unfortunately, the extra cable turned out to be faulty and it looks like it caused some heavy damage to the file