Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-11-02 Thread Christoph Anton Mitterer
On Sat, 2018-11-03 at 09:34 +0800, Su Yue wrote: > Sorry for the late reply cause I'm busy at other things. No worries :-) > I just looked through related codes and found the bug. > The patches can fix it. So no need to do more tests. > Thanks to your tests and patience. :) Thanks for fixing :-)

Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-11-02 Thread Christoph Anton Mitterer
Hey Su. Anything further I need to do in this matter or can I consider it "solved" and you won't need further testing by my side, but just PR the patches of that branch? :-) Thanks, Chris. On Sat, 2018-10-27 at 14:15 +0200, Christoph Anton Mitterer wrote: > Hey. > > > Wi

Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-10-27 Thread Christoph Anton Mitterer
Hey. Without the last patches on 4.17: checking extents checking free space cache checking fs roots ERROR: errors found in fs roots Checking filesystem on /dev/mapper/system UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c found 619543498752 bytes used, error(s) found total csum bytes: 602382204

Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-10-18 Thread Christoph Anton Mitterer
Hey. So I'm back from a longer vacation and now had the time to try out your patches from below: On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote: > I found the errors should blame to something about inode_extref check > in lowmem mode. > I have written three patches to detect and report errors

Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-09-05 Thread Christoph Anton Mitterer
On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote: > Agreed with Qu, btrfs-check shall not try to do any write. Well.. it could have been just some coincidence :-) > I found the errors should blame to something about inode_extref check > in lowmem mode. So you mean errors in btrfs-check... and

Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-09-04 Thread Christoph Anton Mitterer
On Tue, 2018-09-04 at 17:14 +0800, Qu Wenruo wrote: > However the backtrace can't tell which process caused such fsync > call. > (Maybe LVM user space code?) Well it was just literally before btrfs-check exited... so I blindly guessed... but arguably it could be just some coincidence. LVM tools

Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-09-03 Thread Christoph Anton Mitterer
Hey. On Fri, 2018-08-31 at 10:33 +0800, Su Yue wrote: > Can you please fetch btrfs-progs from my repo and run lowmem check > in readonly? > Repo: https://github.com/Damenly/btrfs-progs/tree/lowmem_debug > It's based on v4.17.1 plus additional output for debug only. I've adapted your patch to

fsck lowmem mode only: ERROR: errors found in fs roots

2018-08-30 Thread Christoph Anton Mitterer
Hey. I've the following on a btrfs that's basically the system fs for my notebook: When booting from a USB stick with: # uname -a Linux heisenberg 4.17.0-3-amd64 #1 SMP Debian 4.17.17-1 (2018-08-18) x86_64 GNU/Linux # btrfs --version btrfs-progs v4.17 ... a lowmem mode fsck gives no error: #

Re: [PATCH v2 1/2] btrfs-progs: Rename OPEN_CTREE_FS_PARTIAL to OPEN_CTREE_TEMPORARY_SUPER

2018-07-12 Thread Christoph Anton Mitterer
Hey. Better late than never ;-) Just to confirm: At least since 4.16.1, I could btrfs-restore from the broken fs image again (that I've described in "spurious full btrfs corruption" from around mid March). So the regression in btrfsprogs has in fact been fixed by these patches, it seems.

Re: call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-29 Thread Christoph Anton Mitterer
On Fri, 2018-06-29 at 09:10 +0800, Qu Wenruo wrote: > Maybe it's the old mkfs causing the problem? > Although mkfs.btrfs added device size alignment much earlier than > kernel, it's still possible that the old mkfs doesn't handle the > initial > device and extra device (mkfs.btrfs will always

Re: call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-28 Thread Christoph Anton Mitterer
Hey Qu and Nikolay. On Thu, 2018-06-28 at 22:58 +0800, Qu Wenruo wrote: > Nothing special. Btrfs-progs will handle it pretty well. Since this a remote system where the ISP provides only a rescue image with pretty old kernel/btrfs-progs, I had to copy a current local binary and use that... but

Re: call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-28 Thread Christoph Anton Mitterer
On Thu, 2018-06-28 at 22:09 +0800, Qu Wenruo wrote: > > [ 72.168662] WARNING: CPU: 0 PID: 242 at /build/linux- > > uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 > > btrfs_update_device+0x1b2/0x1c0 > It looks like it's the old WARN_ON() for unaligned device size. > Would you please verify if it is

call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-28 Thread Christoph Anton Mitterer
Hey. On a 4.16.16 kernel with a RAID 1 btrfs I got the following messages since today. Data seems still to be readable (correctly)... and there are no other errors (like SATA errors) in the kernel log. Any idea what these could mean? Thanks, Chris. [ 72.168662] WARNING: CPU: 0 PID: 242 at

in which directions does btrfs send -p | btrfs receive work

2018-06-06 Thread Christoph Anton Mitterer
Hey. Just wondered about the following: When I have a btrfs which acts as a master and from which I make copies of snapshots on it via send/receive (with using -p at send) to other btrfs which acts as copies like this: master +--> copy1 +--> copy2 \--> copy3 and if now e.g. the

Re: Btrfs progs release 4.16.1

2018-04-25 Thread Christoph Anton Mitterer
On Wed, 2018-04-25 at 07:22 -0400, Austin S. Hemmelgarn wrote: > While I can understand Duncan's point here, I'm inclined to agree > with > David Same from my side... and I run a multi-PiB storage site (though not with btrfs). Cosmetically one shouldn't do this in a bugfix release, this should

Re: spurious full btrfs corruption

2018-03-26 Thread Christoph Anton Mitterer
Hey Qu. Some update on the corruption issue on my Fujitsu notebook: Finally got around running some memtest on it... and few seconds after it started I already got this: https://paste.pics/1ff8b13b94f31082bc7410acfb1c6693 So plenty of bad memory... I'd say it's probably not so unlikely that

Re: spurious full btrfs corruption

2018-03-21 Thread Christoph Anton Mitterer
Just some addition on this: On Fri, 2018-03-16 at 01:03 +0100, Christoph Anton Mitterer wrote: > The issue that newer btrfs-progs/kernel don't restore anything at all > from my corrupted fs: 4.13.3 seems to be already buggy... 4.7.3 works, but interestingly btrfs-find-super seems t

Re: Status of RAID5/6

2018-03-21 Thread Christoph Anton Mitterer
Hey. Some things would IMO be nice to get done/clarified (i.e. documented in the Wiki and manpages) from users'/admin's POV: Some basic questions: - Starting with which kernels (including stable kernel versions) does it contain the fixes for the bigger issues from some time ago? - Exactly what

Re: [PATCH] btrfs-progs: mkfs: add uuid and otime to ROOT_ITEM of FS_TREE

2018-03-19 Thread Christoph Anton Mitterer
On Mon, 2018-03-19 at 14:02 +0100, David Sterba wrote: > We can do that by a special purpose tool. No average user will ever run (even know) about that... Could you perhaps either do it automatically in fsck (which is IMO also a bad idea as fsck should be read-only per default)... or at least add

Re: spurious full btrfs corruption

2018-03-15 Thread Christoph Anton Mitterer
Hey. Found some time to move on with this: First, I think from my side (i.e. restoring as much as possible) I'm basically done now, so everything left over here is looking for possible bugs/etc. I have from my side no indication that my corruptions were actually a bug in btrfs... the new

Re: zerofree btrfs support?

2018-03-14 Thread Christoph Anton Mitterer
Hey. On Wed, 2018-03-14 at 20:38 +0100, David Sterba wrote: > I have a prototype code for that and after the years, seeing the > request > again, I'm not against adding it as long as it's not advertised as a > security feature. I'd expect that anyone in the security area should know that

Re: Ongoing Btrfs stability issues

2018-03-13 Thread Christoph Anton Mitterer
On Tue, 2018-03-13 at 20:36 +0100, Goffredo Baroncelli wrote: > A checksum mismatch, is returned as -EIO by a read() syscall. This is > an event handled badly by most part of the programs. Then these programs must simply be fixed... otherwise they'll also fail under normal circumstances with

Re: Ongoing Btrfs stability issues

2018-03-12 Thread Christoph Anton Mitterer
On Mon, 2018-03-12 at 22:22 +0100, Goffredo Baroncelli wrote: > Unfortunately no, the likelihood might be 100%: there are some > patterns which trigger this problem quite easily. See The link which > I posted in my previous email. There was a program which creates a > bad checksum (in COW+DATASUM

Re: Ongoing Btrfs stability issues

2018-03-11 Thread Christoph Anton Mitterer
On Sun, 2018-03-11 at 18:51 +0100, Goffredo Baroncelli wrote: > > COW is needed to properly checksum the data. Otherwise is not > possible to ensure the coherency between data and checksum (however I > have to point out that BTRFS fails even in this case [*]). > We could rearrange this sentence,

Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 23:31 +0500, Roman Mamedov wrote: > QCOW2 would add a second layer of COW > on top of > Btrfs, which sounds like a nightmare. I've just seen there is even a nocow option "specifically" for btrfs... it seems however that it doesn't disable the CoW of qcow, but rather that of

Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 16:50 +0100, Adam Borowski wrote: > Since we're on a btrfs mailing list Well... my original question was whether someone could make zerofree support for btrfs (which I think would be best if someone who knows how btrfs really works)... thus I directed the question to this

Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 19:37 +0500, Roman Mamedov wrote: > Note you can use it on HDDs too, even without QEMU and the like: via > using LVM > "thin" volumes. I use that on a number of machines, the benefit is > that since > TRIMed areas are "stored nowhere", those partitions allow for > incredibly

Re: Ongoing Btrfs stability issues

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 14:04 +0200, Nikolay Borisov wrote: > So for OLTP workloads you definitely want nodatacow enabled, bear in > mind this also disables crc checksumming, but your db engine should > already have such functionality implemented in it. Unlike repeated claims made here on the list

Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 09:16 +0100, Adam Borowski wrote: > Do you want zerofree for thin storage optimization, or for security? I don't think one can really use it for security (neither on SSD nor HDD). On both, zeroed blocks may still be readable by forensic measures. So optimisation, i.e. digging

zerofree btrfs support?

2018-03-09 Thread Christoph Anton Mitterer
Hi. Just wondered... was it ever planned (or is there some equivalent) to get support for btrfs in zerofree? Thanks, Chris. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at

call trace on btrfs send/receive

2018-03-09 Thread Christoph Anton Mitterer
Hey. The following still happens with 4.15 kernel/progs: btrfs send -p oldsnap newsnap | btrfs receive /some/other/fs Mar 10 00:48:10 heisenberg kernel: WARNING: CPU: 5 PID: 32197 at /build/linux-PFKtCE/linux-4.15.4/fs/btrfs/send.c:6487 btrfs_ioctl_send+0x48f/0xfb0 [btrfs] Mar 10 00:48:10

Re: spurious full btrfs corruption

2018-03-08 Thread Christoph Anton Mitterer
Hey. On Tue, 2018-03-06 at 09:50 +0800, Qu Wenruo wrote: > > These were the two files: > > -rw-r--r-- 1 calestyo calestyo 90112 Feb 22 16:46 'Lady In The > > Water/05.mp3' > > -rw-r--r-- 1 calestyo calestyo 4892407 Feb 27 23:28 > > '/home/calestyo/share/music/Lady In The Water/05.mp3' > > > >

Re: spurious full btrfs corruption

2018-03-05 Thread Christoph Anton Mitterer
Hey Qu. On Thu, 2018-03-01 at 09:25 +0800, Qu Wenruo wrote: > > - For my personal data, I have one[0] Seagate 8 TB SMR HDD, which I > > backup (send/receive) on two further such HDDs (all these are > > btrfs), and (rsync) on one further with ext4. > > These files have all their SHA512 sums

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
And you have any other ideas on how to dubs that filesystem? Or at least backup as much as possible? Thanks, Chris.

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
e8f1bc1493855e32b7a2a019decc3c353d94daf6 That bug... When was that introduced and how can I find out whether an fs was affected/corrupted by this? Cause I've mounted and wrote to some extremely important (to me) fs recently. Thanks, Chris.

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
A scrub now gave: # btrfs scrub start -Br /dev/disk/by-label/system ERROR: scrubbing /dev/disk/by-label/system failed for device id 1: ret=-1, errno=5 (Input/output error) scrub canceled for b6050e38-716a-40c3-a8df-fcf1dd7e655d scrub started at Wed Feb 21 17:42:39 2018 and was aborted

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Spurious corruptions seem to continue [ 69.688652] BTRFS critical (device dm-0): unable to find logical 4503658729209856 length 4096 [ 69.688656] BTRFS critical (device dm-0): unable to find logical 4503658729209856 length 4096 [ 69.688658] BTRFS critical (device dm-0): unable to find

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Interestingly, I got another one only within minutes after the scrub: Feb 21 15:23:49 heisenberg kernel: BTRFS warning (device dm-0): csum failed root 257 ino 7703 off 56852480 csum 0x42d1b69c expected csum 0x3ce55621 mirror 1 Feb 21 15:23:52 heisenberg kernel: BTRFS warning (device dm-0): csum

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Hi Nikolay. Thanks. On Wed, 2018-02-21 at 08:34 +0200, Nikolay Borisov wrote: > This looks like the one fixed by > e8f1bc1493855e32b7a2a019decc3c353d94daf6 . It's tagged for stable so > you > should get it eventually. Another consequence of this was that I couldn't sync/umount or shutdown

BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-20 Thread Christoph Anton Mitterer
Hi. Not sure if that's a bug in btrfs... maybe someone's interested in it. Cheers, Chris. # uname -a Linux heisenberg 4.14.0-3-amd64 #1 SMP Debian 4.14.17-1 (2018-02-14) x86_64 GNU/Linux Feb 21 04:55:51 heisenberg kernel: BUG: unable to handle kernel paging request at 9fb75f827100 Feb

Re: block group 11778977169408 has wrong amount of free space

2017-09-03 Thread Christoph Anton Mitterer
Did another mount with clear_cache,rw (cause it was ro before)... now I get even more errors: # btrfs check /dev/mapper/data-a2 ; echo $? Checking filesystem on /dev/mapper/data-a2 UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db checking extents checking free space cache block group 9857516175360 has

Re: block group 11778977169408 has wrong amount of free space

2017-09-03 Thread Christoph Anton Mitterer
Just checked, and mounting with clear_cache, and then re-fscking doesn't even fix the problem... Output stays the same. Cheers, Chris. smime.p7s Description: S/MIME cryptographic signature

block group 11778977169408 has wrong amount of free space

2017-09-03 Thread Christoph Anton Mitterer
Hey. Just got the following: $ uname -a Linux heisenberg 4.12.0-1-amd64 #1 SMP Debian 4.12.6-1 (2017-08-12) x86_64 GNU/Linux $ btrfs version btrfs-progs v4.12 on a filesystem: # btrfs check /dev/mapper/data-a2 ; echo $? Checking filesystem on /dev/mapper/data-a2 UUID:

call trace on send/receive

2017-08-31 Thread Christoph Anton Mitterer
Hey. Just got the following call trace with: $ uname -a Linux heisenberg 4.12.0-1-amd64 #1 SMP Debian 4.12.6-1 (2017-08-12) x86_64 GNU/Linux $ btrfs version btrfs-progs v4.12 Sep 01 06:10:12 heisenberg kernel: [ cut here ] Sep 01 06:10:12 heisenberg kernel: WARNING:

Re: deleted subvols don't go away?

2017-08-28 Thread Christoph Anton Mitterer
Thanks... Still a bit strange that it displays that entry... especially with a generation that seems newer than what I thought was the actually last generation on the fs. Cheers, Chris.

deleted subvols don't go away?

2017-08-27 Thread Christoph Anton Mitterer
Hey. Just wondered... On a number of filesystems I've removed several subvolumes (with -c)... even called btrfs filesystem sync afterwards... and waited quite a while (with the fs mounted rw) until no disk activity seems to happen anymore. Yet all these fs show some deleted subvols e.g.: btrfs

Re: BTRFS warning (device dm-0): unhandled fiemap cache detected

2017-08-20 Thread Christoph Anton Mitterer
On Mon, 2017-08-21 at 10:43 +0800, Qu Wenruo wrote: > Harmless, it is only designed to merge fiemap output. Thanks for the info :) On Mon, 2017-08-21 at 10:57 +0800, Qu Wenruo wrote: > Quite strange, according to upstream git log, that commit is merged  > between v4.12-rc7 and v4.12. > Maybe I

BTRFS warning (device dm-0): unhandled fiemap cache detected

2017-08-20 Thread Christoph Anton Mitterer
Hey. Just got the following with 4.12.6: Aug 21 03:29:51 heisenberg kernel: BTRFS warning (device dm-0): unhandled fiemap cache detected: offset=0 phys=812641906688 len=12288 flags=0x0 Aug 21 03:29:56 heisenberg kernel: BTRFS warning (device dm-0): unhandled fiemap cache detected: offset=0

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Christoph Anton Mitterer
On Wed, 2017-08-16 at 09:53 -0400, Austin S. Hemmelgarn wrote: > Go try BTRFS on top of dm-integrity, or on a  > system with T10-DIF or T13-EPP support When dm-integrity is used... would that be enough for btrfs to do a proper repair in the RAID+nodatacow case? I assume it can't do repairs now

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Christoph Anton Mitterer
Just out of curiosity: On Wed, 2017-08-16 at 09:12 -0400, Chris Mason wrote: > Btrfs couples the crcs with COW because this (which sounds like you want it to stay coupled that way)... plus > It's possible to protect against all three without COW, but all  > solutions have their own tradeoffs

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-15 Thread Christoph Anton Mitterer
On Tue, 2017-08-15 at 07:37 -0400, Austin S. Hemmelgarn wrote: > Go look at Chrome, or Firefox, or Opera, or any other major web > browser.  >   At minimum, they will safely bail out if they detect corruption in > the  > user profile and can trivially resync the profile from another system > if  >

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 11:53 -0400, Austin S. Hemmelgarn wrote: > Quite a few applications actually _do_ have some degree of secondary  > verification or protection from a crash.  Go look at almost any > database  > software. Then please give proper references for this! This is from 2015, where

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 10:23 -0400, Austin S. Hemmelgarn wrote: > Assume you have higher level verification.  Would you rather not be > able  > to read the data regardless of if it's correct or not, or be able to  > read it and determine yourself if it's correct or not? What would be the

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 15:46 +0800, Qu Wenruo wrote: > The problem here is, if you enable csum and even data is updated  > correctly, only metadata is trashed, then you can't even read out > the  > correct data. So what? This problem occurs anyway *only* in case of a crash,.. and *only* if

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 14:36 +0800, Qu Wenruo wrote: > > And how are you going to write your data and checksum atomically > > when > > doing in-place updates? > > Exactly, that's the main reason I can figure out why btrfs disables  > checksum for nodatacow. Still, I don't get the problem here...

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-12 Thread Christoph Anton Mitterer
On Sat, 2017-08-12 at 00:42 -0700, Christoph Hellwig wrote: > And how are you going to write your data and checksum atomically when > doing in-place updates? Maybe I misunderstand something, but what's the big deal with not doing it atomically (I assume you mean in terms of actually writing to

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-11 Thread Christoph Anton Mitterer
Qu Wenruo wrote: >Although Btrfs can disable data CoW, nodatacow also disables data  >checksum, which is another main feature for btrfs. Then the two should probably be decoupled and support for nodatacow+checksumming be implemented?! I'm not an expert, but I wouldn't see why this

Re: FAILED: patch "[PATCH] Btrfs: fix early ENOSPC due to delalloc" failed to apply to 4.12-stable tree

2017-08-04 Thread Christoph Anton Mitterer
> all of the metadata it is supposed to. This fixes early ENOSPCs we > were > seeing when doing a btrfs receive to populate a new filesystem, as > well > as early ENOSPCs Christoph saw when doing a big cp -r onto Btrfs. > > Fixes: 957780eb2788 ("Btrfs: introduce ticketed enospc

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-03 Thread Christoph Anton Mitterer
On Thu, 2017-08-03 at 20:08 +0200, waxhead wrote: > Brendan Hide wrote: > > The title seems alarmist to me - and I suspect it is going to be  > > misconstrued. :-/ > > > > From the release notes at  > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Li > >

Re: [PATCH 00/14 RFC] Btrfs: Add journal for raid5/6 writes

2017-08-01 Thread Christoph Anton Mitterer
Hi. Stupid question: Would the write hole be closed already, if parity was checksummed? Cheers, Chris.

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
me, glad to hear it! I hadn't been able to reproduce the issue > outside of Facebook. Can I add your tested-by? Sure, but better use my other mail address for it, if you don't mind: Christoph Anton Mitterer <m...@christoph.anton.mitterer.name> > > I assume you'll take care to get that pat

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 10:32 -0700, Omar Sandoval wrote: > If that doesn't work, could you please also try > https://patchwork.kernel.org/patch/9829593/? Okay, tried the patch now, applied upon: Linux 4.12.0-trunk-amd64 #1 SMP Debian 4.12.2-1~exp1 (2017-07-18) x86_64 GNU/Linux (that is the Debian

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 11:14 -0700, Omar Sandoval wrote: > Yes, that's a safe enough workaround. It's a good idea to change the > parameters back after the copy. you mean even without having the fix, right? So AFAIU, the bug doesn't really cause FS corruption, but just "false" ENOSPC and these

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 10:55 -0700, Omar Sandoval wrote: > Against 4.12 would be best, thanks! okay,.. but that will take a while to compile... in the meantime... do you know whether it's more or less safe to use the 4.9 kernel without any fix, when I change the parameters mentioned before,

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 10:32 -0700, Omar Sandoval wrote: > Could you try 4.12? Linux 4.12.0-trunk-amd64 #1 SMP Debian 4.12.2-1~exp1 (2017-07-18) x86_64 GNU/Linux from Debian experimental, doesn't fix the issue... > If that doesn't work, could you please also try >

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 15:00 +, Martin Raiber wrote: > there are patches on this list/upstream which could fix this ( e.g. > "fix > delalloc accounting leak caused by u32 overflow"/"fix early ENOSPC > due > to delalloc"). mhh... it's a bit problematic to test these on that nodes... > Do you

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 15:00 +, Martin Raiber wrote: > It would be interesting if lowering the dirty ratio is a viable > work-around (sysctl vm.dirty_background_bytes=314572800 && sysctl > vm.dirty_bytes=1258291200). > > Regards, > Martin I took away a trailing 0 for each of them... and then
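For orientation, the byte values in the workaround above are easier to read in MiB. The following is only a small arithmetic sketch: the two sysctl names and numbers are quoted from the message, while the helper name `drop_trailing_zero` is made up for illustration, and "took away a trailing 0" is assumed to mean dividing each value by ten.

```python
MIB = 1024 * 1024

# Values quoted from Martin Raiber's suggested workaround.
suggested = {
    "vm.dirty_background_bytes": 314572800,
    "vm.dirty_bytes": 1258291200,
}


def drop_trailing_zero(value: int) -> int:
    """Remove the last decimal digit, which must be 0 (hypothetical helper)."""
    if value % 10 != 0:
        raise ValueError("value does not end in 0")
    return value // 10


# Assumption: "took away a trailing 0" means each value divided by ten.
reduced = {name: drop_trailing_zero(value) for name, value in suggested.items()}

for name, value in suggested.items():
    print(f"{name}: {value // MIB} MiB suggested, "
          f"{reduced[name] // MIB} MiB after dropping a trailing 0")
```

This only converts units; it does not change any kernel setting, and whether the smaller values are a sufficient workaround is not something the sketch can tell.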

Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
Oh and I should add: After such error, cp goes on copying (with other files)... Same issue occurs when I do something like tar -cf - /media | tar -xf Cheers, Chris.

strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
Hey. The following happens on Debian stretch systems: # uname -a Linux lcg-lrz-admin 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux What I have are VMs, which run with root fs as ext4 and which I want to migrate to btrfs. So I've added further disk images and then

Re: Exactly what is wrong with RAID5/6

2017-06-21 Thread Christoph Anton Mitterer
On Wed, 2017-06-21 at 16:45 +0800, Qu Wenruo wrote: > Btrfs is always using device ID to build up its device mapping. > And for any multi-device implementation (LVM,mdadm) it's never a > good  > idea to use device path. Isn't it rather the other way round? Using the ID is bad? Don't you remember

Re: [PATCH 1/2] btrfs: warn about RAID5/6 being experimental at mount time

2017-03-29 Thread Christoph Anton Mitterer
On Wed, 2017-03-29 at 06:39 +0200, Adam Borowski wrote: > Too many people come complaining about losing their data -- and > indeed, > there's no warning outside a wiki and the mailing list tribal > knowledge. > Message severity chosen for consistency with XFS -- "alert" makes > dmesg > produce

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-02-03 Thread Christoph Anton Mitterer
Hey Qu On Fri, 2017-02-03 at 14:20 +0800, Qu Wenruo wrote: > Great thanks for that! You're welcome. :) > I also added missing error message output for other places I found, > and  > updated the branch, the name remains as "lowmem_tests" > > Please try it. # btrfs check /dev/nbd0 ; echo $?

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-02-01 Thread Christoph Anton Mitterer
On Wed, 2017-02-01 at 09:06 +0800, Qu Wenruo wrote: > https://github.com/adam900710/btrfs-progs/tree/lowmem_fixes > > Which is also rebased to latest v4.9.1. Same game as last time, applied to 4.9, no RW mount between the runs. btrfs-progs v4.9 WITHOUT patch: *** #

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-29 Thread Christoph Anton Mitterer
On Sun, 2017-01-29 at 12:27 +0800, Qu Wenruo wrote: > Sorry for the late reply, in Chinese New Year vacation. No worries... and happy new year then ;) > I'll update the patchset soon to address it. Just tell me and I re-check. > Thanks again for your detailed output and patience, Thanks as

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-26 Thread Christoph Anton Mitterer
On Thu, 2017-01-26 at 11:10 +0800, Qu Wenruo wrote: > Would you please try lowmem_tests branch of my repo? >  > That branch contains a special debug output for the case you  > encountered, which should help to debug the case. Here the

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-25 Thread Christoph Anton Mitterer
On Thu, 2017-01-26 at 11:10 +0800, Qu Wenruo wrote: > In fact, the result without patches is not really needed for current > stage. > > Feel free to skip them until the patched ones passed. > Which should save you some time. Well the idea is, that if I do further writes in the meantime (by

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-25 Thread Christoph Anton Mitterer
On Wed, 2017-01-25 at 12:16 +0800, Qu Wenruo wrote: > https://github.com/adam900710/btrfs-progs/tree/lowmem_fixes Just finished trying your new patches. Same game as last time, applied to 4.9, no RW mount between the runs. btrfs-progs v4.9 WITHOUT patch: *** # btrfs

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-24 Thread Christoph Anton Mitterer
On Wed, 2017-01-25 at 12:16 +0800, Qu Wenruo wrote: > New patches are out now. > > Although I just updated  > 0001-btrfs-progs-lowmem-check-Fix-wrong-block-group-check.patch to > fix  > all similar bugs. > > You could get it from github: >

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-24 Thread Christoph Anton Mitterer
On Wed, 2017-01-25 at 08:44 +0800, Qu Wenruo wrote: > Thanks for the test, You're welcome... I'm happy if I can help :) Just tell me once you think you found something, and I'll repeat the testing. Cheers, Chris.

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-24 Thread Christoph Anton Mitterer
Hey Qu. I was giving your patches a try, again on the very same fs (which saw however writes in the meantime), from my initial report. btrfs-progs v4.9 WITHOUT patch: *** # btrfs check /dev/nbd0 ; echo $? checking extents checking free space cache checking fs roots

Re: RAID56 status?

2017-01-23 Thread Christoph Anton Mitterer
On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote: > We've been focusing on the single-drive use cases internally.  This > year  > that's changing as we ramp up more users in different places.   > Performance/stability work and raid5/6 are the top of my list right > now. +1 Would be nice to

Re: RAID56 status?

2017-01-23 Thread Christoph Anton Mitterer
Just wondered... is there any larger known RAID56 deployment? I mean something with real-world production systems and ideally many different IO scenarios, failures, pulling disks randomly and perhaps even so many disks that it's also likely to hit something like silent data corruption (on the

Re: RAID56 status?

2017-01-22 Thread Christoph Anton Mitterer
On Sun, 2017-01-22 at 22:39 +, Hugo Mills wrote: >    It's still all valid. Nothing's changed. > >    How would you like it to be updated? "Nope, still broken"? The kernel version mentioned there is 4.7... so no one (at least endusers) really knows whether it's just no longer maintained or

Re: RAID56 status?

2017-01-22 Thread Christoph Anton Mitterer
On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote: > Therefore my question: whats the status of raid5/6 is in btrfs? > Is it somehow "production"-ready by now? AFAIK, what's on the - apparently already no longer updated -  https://btrfs.wiki.kernel.org/index.php/Status still applies, and RAID56

Re: [PATCH] btrfs-progs: lowmem-check: Fix wrong extent tree iteration

2017-01-20 Thread Christoph Anton Mitterer
On Fri, 2017-01-20 at 15:58 +0800, Qu Wenruo wrote: > Nice to hear that, although the -5 error seems to be caught > I'll locate the problem and then send the patch. > > Thanks for your testing! You're welcome... just ping me once I should do another run. Cheers, Chris.

Re: [PATCH] btrfs-progs: lowmem-check: Fix wrong extent tree iteration

2017-01-19 Thread Christoph Anton Mitterer
Hey Qu. On Wed, 2017-01-18 at 16:48 +0800, Qu Wenruo wrote: > To Christoph, > > Would you please try this patch, and to see if it suppress the block > group > warning? I did another round of fsck in both modes (original/lowmem), first WITHOUT your patch, then WITH it... both on progs version

Re: corruption: yet another one after deleting a ro snapshot

2017-01-17 Thread Christoph Anton Mitterer
On Wed, 2017-01-18 at 08:41 +0800, Qu Wenruo wrote: > Since we have your extent tree and root tree dump, I think we should > be  > able to build a image to reproduce the case. +1 > BTW, your fs is too large for us to really do some verification or > other  > work. Sure I know... but that's

Re: corruption: yet another one after deleting a ro snapshot

2017-01-17 Thread Christoph Anton Mitterer
On 17 January 2017 09:53:19 CET, Qu Wenruo wrote: >Just lowmem false alert, as extent-tree dump shows complete fine >result. > >I'll CC you and adds your reported-by tag when there is any update on >this case. Fine, just one thing left right more from my side on this

Re: corruption: yet another one after deleting a ro snapshot

2017-01-16 Thread Christoph Anton Mitterer
On Mon, 2017-01-16 at 13:47 +0800, Qu Wenruo wrote: > > > And I highly suspect if the subvolume 6403 is the RO snapshot you > > > just removed. > > > > I guess there is no way to find out whether it was that snapshot, > > is > > there? > > "btrfs subvolume list" could do it. Well that was

Re: corruption: yet another one after deleting a ro snapshot

2017-01-15 Thread Christoph Anton Mitterer
On Mon, 2017-01-16 at 11:16 +0800, Qu Wenruo wrote: > It would be very nice if you could paste the output of > "btrfs-debug-tree -t extent " and "btrfs-debug-tree -t > root  > " > > That would help us to fix the bug in lowmem mode. I'll send you the link in a private mail ... if any other

Re: corruption: yet another one after deleting a ro snapshot

2017-01-15 Thread Christoph Anton Mitterer
On Mon, 2017-01-16 at 09:38 +0800, Qu Wenruo wrote: > So the fs is REALLY corrupted. *sigh* ... (not as in fuck-I'm-losing-my-data™ ... but as in *sigh* another-possibly-deeply-hidden-bug-in-btrfs-that-might-eventually-cause-data-loss...) > BTW, lowmem mode seems to have a new false alert when

Re: corruption: yet another one after deleting a ro snapshot

2017-01-15 Thread Christoph Anton Mitterer
On Thu, 2017-01-12 at 10:38 +0800, Qu Wenruo wrote: > IIRC, RO mount won't continue background deletion. I see. > Would you please try 4.9 btrfs-progs? Done now, see results (lowmem and original mode) below: # btrfs version btrfs-progs v4.9 # btrfs check /dev/nbd0 ; echo $? Checking

Re: corruption: yet another one after deleting a ro snapshot

2017-01-11 Thread Christoph Anton Mitterer
Hey Qu, On Thu, 2017-01-12 at 09:25 +0800, Qu Wenruo wrote: > And since you just deleted a subvolume and unmount it soon Indeed, I unmounted it pretty quickly afterwards... I had mounted it (ro) in the meantime, and did a whole find mntpoint > /dev/null on it just to see whether going through the

Re: corruption: yet another one after deleting a ro snapshot

2017-01-11 Thread Christoph Anton Mitterer
Oops, forgot to copy and paste the actual fsck output O:-) # btrfs check /dev/mapper/data-a3 ; echo $? Checking filesystem on /dev/mapper/data-a3 UUID: 326d292d-f97b-43ca-b1e8-c722d3474719 checking extents ref mismatch on [37765120 16384] extent item 0, found 1 Backref 37765120 parent 6403 root

corruption: yet another one after deleting a ro snapshot

2017-01-11 Thread Christoph Anton Mitterer
Hey. Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04) x86_64 GNU/Linux btrfs-progs v4.7.3 I've had this already at least once, a year or so ago: I was doing backups (incremental via send/receive). After everything was copied, I unmounted the destination fs, ran a fsck, all

yet another call trace during send/receive

2017-01-11 Thread Christoph Anton Mitterer
Hi. On Debian sid: $ uname -a Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04) x86_64 GNU/Linux $ btrfs version btrfs-progs v4.7.3 During a: # btrfs send -p foo bar | btrfs receive baz Jan 11 20:43:10 heisenberg kernel: [ cut here ] Jan 11 20:43:10

Re: some free space cache corruptions

2016-12-28 Thread Christoph Anton Mitterer
On Mon, 2016-12-26 at 00:12 +, Duncan wrote: > By themselves, free-space cache warnings are minor and not a serious  > issue at all -- the cache is just that, a cache, designed to speed  > operation but not actually necessary, and btrfs can detect and route  > around space-cache corruption

some free space cache corruptions

2016-12-25 Thread Christoph Anton Mitterer
Hey. Had the following on a Debian sid: Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02) x86_64 GNU/Linux btrfs-progs v4.7.3 I was doing a btrfs check of a rather big btrfs (8TB device, nearly full), having many snapshots on it, all incrementally sent from another 8TB device,

csum errors during btrfs check

2016-12-23 Thread Christoph Anton Mitterer
Hey. Had the following on a Debian sid: Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02) x86_64 GNU/Linux btrfs-progs v4.7.3 (It's not so long ago that I ran some longer memtest86+ on the respective system. So memory should be ok.) It was again an 8 TB SATA disk connected

strange btrfs deadlock

2016-12-22 Thread Christoph Anton Mitterer
Hey. Had the following on a Debian sid: Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02) x86_64 GNU/Linux I was basically copying data between several filesystems all on SATA disks attached via USB. Unfortunately I have only little data... The first part may be totally
