Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-11-02 Thread Christoph Anton Mitterer
On Sat, 2018-11-03 at 09:34 +0800, Su Yue wrote:
> Sorry for the late reply; I've been busy with other things.
No worries :-)


> I just looked through related codes and found the bug.
> The patches can fix it. So no need to do more tests.
> Thanks to your tests and patience. :)
Thanks for fixing :-)


Best wishes,
Chris.



Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-11-02 Thread Christoph Anton Mitterer
Hey Su.

Is there anything further I need to do in this matter, or can I
consider it "solved", i.e. you won't need further testing from my side
and will just submit the patches from that branch? :-)

Thanks,
Chris.

On Sat, 2018-10-27 at 14:15 +0200, Christoph Anton Mitterer wrote:
> Hey.
> 
> 
> Without the last patches on 4.17:
> 
> checking extents
> checking free space cache
> checking fs roots
> ERROR: errors found in fs roots
> Checking filesystem on /dev/mapper/system
> UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
> found 619543498752 bytes used, error(s) found
> total csum bytes: 602382204
> total tree bytes: 2534309888
> total fs tree bytes: 1652097024
> total extent tree bytes: 160432128
> btree space waste bytes: 459291608
> file data blocks allocated: 7334036647936
>  referenced 730839187456
> 
> 
> With the last patches, on 4.17:
> 
> checking extents
> checking free space cache
> checking fs roots
> checking only csum items (without verifying data)
> checking root refs
> Checking filesystem on /dev/mapper/system
> UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
> found 619543498752 bytes used, no error found
> total csum bytes: 602382204
> total tree bytes: 2534309888
> total fs tree bytes: 1652097024
> total extent tree bytes: 160432128
> btree space waste bytes: 459291608
> file data blocks allocated: 7334036647936
>  referenced 730839187456
> 
> 
> Cheers,
> Chris.
> 



Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-10-27 Thread Christoph Anton Mitterer
Hey.


Without the last patches on 4.17:

checking extents
checking free space cache
checking fs roots
ERROR: errors found in fs roots
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
found 619543498752 bytes used, error(s) found
total csum bytes: 602382204
total tree bytes: 2534309888
total fs tree bytes: 1652097024
total extent tree bytes: 160432128
btree space waste bytes: 459291608
file data blocks allocated: 7334036647936
 referenced 730839187456


With the last patches, on 4.17:

checking extents
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
found 619543498752 bytes used, no error found
total csum bytes: 602382204
total tree bytes: 2534309888
total fs tree bytes: 1652097024
total extent tree bytes: 160432128
btree space waste bytes: 459291608
file data blocks allocated: 7334036647936
 referenced 730839187456


Cheers,
Chris.



Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-10-18 Thread Christoph Anton Mitterer
Hey.

So I'm back from a longer vacation and now had the time to try out your
patches from below:

On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
> I found the errors should blame to something about inode_extref check
> in lowmem mode.
> I have writeen three patches to detect and report errors about
> inode_extref. For your convenience, it's based on v4.17:
> https://github.com/Damenly/btrfs-progs/tree/ext_ref_v4.17
> 
> This repo should report more errors. Because one of those is just
> Whac-A-Mole, I will make it better and send them to the ML later.


This is the output it gives:
checking extents
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
found 617228185600 bytes used, no error found
total csum bytes: 600139124
total tree bytes: 2516172800
total fs tree bytes: 1639890944
total extent tree bytes: 156532736
btree space waste bytes: 455772589
file data blocks allocated: 7431727771648
 referenced 732073979904

(Just a bit strange that the UUID line is not at the beginning... but
other than that, no more error message, it seems.)


Cheers,
Chris.



Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-09-05 Thread Christoph Anton Mitterer
On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
> Agreed with Qu, btrfs-check shall not try to do any write.

Well.. it could have been just some coincidence :-)


> I found the errors should blame to something about inode_extref check
> in lowmem mode.

So you mean errors in btrfs-check... and it was a false positive?


> I have written three patches to detect and report errors about
> inode_extref. For your convenience, it's based on v4.17:
> https://github.com/Damenly/btrfs-progs/tree/ext_ref_v4.17

I hope I can test them soon; it could take a bit longer as I'm about to
head off on vacation.


Cheers,
Chris.



Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-09-04 Thread Christoph Anton Mitterer
On Tue, 2018-09-04 at 17:14 +0800, Qu Wenruo wrote:
> However the backtrace can't tell which process caused such fsync
> call.
> (Maybe LVM user space code?)

Well it was just literally before btrfs-check exited... so I blindly
guessed... but arguably it could be just some coincidence.

LVM tools are installed, but since I no longer use any PVs/LVs/etc. ...
I'd doubt they'd do anything here.


Cheers,
Chris.



Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-09-03 Thread Christoph Anton Mitterer
Hey.


On Fri, 2018-08-31 at 10:33 +0800, Su Yue wrote:
> Can you please fetch btrfs-progs from my repo and run lowmem check
> in readonly?
> Repo: https://github.com/Damenly/btrfs-progs/tree/lowmem_debug
> It's based on v4.17.1 plus additional output for debug only.

I've adapted your patch to 4.17 from Debian (i.e. not the 4.17.1).


First I ran it again with the pristine 4.17 from Debian:
# btrfs check --mode=lowmem /dev/mapper/system ; echo $?
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
checking extents
checking free space cache
checking fs roots
ERROR: errors found in fs roots
found 435924422656 bytes used, error(s) found
total csum bytes: 423418948
total tree bytes: 2218328064
total fs tree bytes: 1557168128
total extent tree bytes: 125894656
btree space waste bytes: 429599230
file data blocks allocated: 5193373646848
 referenced 555255164928
[ 1248.687628] [ cut here ]
[ 1248.688352] generic_make_request: Trying to write to read-only block-device 
dm-0 (partno 0)
[ 1248.689127] WARNING: CPU: 3 PID: 933 at 
/build/linux-LgHyGB/linux-4.17.17/block/blk-core.c:2180 
generic_make_request_checks+0x43d/0x610
[ 1248.689909] Modules linked in: dm_crypt algif_skcipher af_alg dm_mod 
snd_hda_codec_hdmi snd_hda_codec_realtek intel_rapl snd_hda_codec_generic 
x86_pkg_temp_thermal intel_powerclamp i915 iwlwifi btusb coretemp btrtl btbcm 
uvcvideo kvm_intel snd_hda_intel btintel videobuf2_vmalloc bluetooth 
snd_hda_codec kvm videobuf2_memops videobuf2_v4l2 videobuf2_common cfg80211 
snd_hda_core irqbypass videodev jitterentropy_rng drm_kms_helper 
crct10dif_pclmul snd_hwdep crc32_pclmul drbg ghash_clmulni_intel intel_cstate 
snd_pcm ansi_cprng ppdev intel_uncore drm media ecdh_generic iTCO_wdt snd_timer 
iTCO_vendor_support rtsx_pci_ms crc16 snd intel_rapl_perf memstick joydev 
mei_me rfkill evdev soundcore sg parport_pc pcspkr serio_raw fujitsu_laptop mei 
i2c_algo_bit parport shpchp sparse_keymap pcc_cpufreq lpc_ich button
[ 1248.693639]  video battery ac ip_tables x_tables autofs4 btrfs 
zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic 
raid1 raid0 multipath linear md_mod sd_mod uas usb_storage crc32c_intel 
rtsx_pci_sdmmc mmc_core ahci xhci_pci libahci aesni_intel ehci_pci aes_x86_64 
libata crypto_simd xhci_hcd ehci_hcd cryptd glue_helper psmouse i2c_i801 
scsi_mod rtsx_pci e1000e usbcore usb_common
[ 1248.696956] CPU: 3 PID: 933 Comm: btrfs Not tainted 4.17.0-3-amd64 #1 Debian 
4.17.17-1
[ 1248.698118] Hardware name: FUJITSU LIFEBOOK E782/FJNB253, BIOS Version 2.11 
07/15/2014
[ 1248.699299] RIP: 0010:generic_make_request_checks+0x43d/0x610
[ 1248.700495] RSP: 0018:ac89827c7d88 EFLAGS: 00010286
[ 1248.701702] RAX:  RBX: 98f4848a9200 RCX: 0006
[ 1248.702930] RDX: 0007 RSI: 0082 RDI: 98f49e2d6730
[ 1248.704170] RBP: 98f484f6d460 R08: 033e R09: 00aa
[ 1248.705422] R10: ac89827c7e60 R11:  R12: 
[ 1248.706675] R13: 0001 R14:  R15: 
[ 1248.707928] FS:  7f92842018c0() GS:98f49e2c() 
knlGS:
[ 1248.709190] CS:  0010 DS:  ES:  CR0: 80050033
[ 1248.710448] CR2: 55fc6fe1a5b0 CR3: 000407f62001 CR4: 001606e0
[ 1248.711707] Call Trace:
[ 1248.712960]  ? do_writepages+0x4b/0xe0
[ 1248.714201]  ? blkdev_readpages+0x20/0x20
[ 1248.715441]  ? do_writepages+0x4b/0xe0
[ 1248.716684]  generic_make_request+0x64/0x400
[ 1248.717935]  ? finish_wait+0x80/0x80
[ 1248.719181]  ? mempool_alloc+0x67/0x1a0
[ 1248.720425]  ? submit_bio+0x6c/0x140
[ 1248.721663]  submit_bio+0x6c/0x140
[ 1248.722902]  submit_bio_wait+0x53/0x80
[ 1248.724139]  blkdev_issue_flush+0x7c/0xb0
[ 1248.725377]  blkdev_fsync+0x2f/0x40
[ 1248.726612]  do_fsync+0x38/0x60
[ 1248.727849]  __x64_sys_fsync+0x10/0x20
[ 1248.729086]  do_syscall_64+0x55/0x110
[ 1248.730323]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1248.731565] RIP: 0033:0x7f928354d161
[ 1248.732805] RSP: 002b:7ffd35e3f5d8 EFLAGS: 0246 ORIG_RAX: 
004a
[ 1248.734067] RAX: ffda RBX: 55fc09c0c260 RCX: 7f928354d161
[ 1248.735342] RDX: 55fc09c13e28 RSI: 55fc0899f820 RDI: 0004
[ 1248.736614] RBP: 55fc09c0c2d0 R08: 0005 R09: 55fc09c0da70
[ 1248.738001] R10: 009e R11: 0246 R12: 
[ 1248.739272] R13: 55fc0899d213 R14: 55fc09c0c290 R15: 0001
[ 1248.740542] Code: 24 54 03 00 00 48 8d 74 24 08 48 89 df c6 05 3e 03 d9 00 
01 e8 d5 63 01 00 44 89 e2 48 89 c6 48 c7 c7 80 e1 e6 ad e8 a3 4e d1 ff <0f> 0b 
4c 8b 63 08 e9 7b fc ff ff 80 3d 15 03 d9 00 00 0f 85 94
[ 1248.741909] ---[ end trace c2f580dbd579028c ]---
1

Not really sure why btrfs-check apparently tries to write to the device...

fsck lowmem mode only: ERROR: errors found in fs roots

2018-08-30 Thread Christoph Anton Mitterer
Hey.

I've the following on a btrfs that's basically the system fs for my
notebook:


When booting from a USB stick with:
# uname -a
Linux heisenberg 4.17.0-3-amd64 #1 SMP Debian 4.17.17-1
(2018-08-18) x86_64 GNU/Linux

# btrfs --version
btrfs-progs v4.17

... a lowmem mode fsck gives an error:

# btrfs check --mode=lowmem /dev/mapper/system ; echo $?
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
checking extents
checking free space cache
checking fs roots
ERROR: errors found in fs roots
found 495910952960 bytes used, error(s) found
total csum bytes: 481840472
total tree bytes: 2388819968
total fs tree bytes: 1651097600
total extent tree bytes: 161841152
btree space waste bytes: 446707102
file data blocks allocated: 6651878428672
 referenced 542320984064
1

... while a normal mode fsck doesn't give one:

# btrfs check /dev/mapper/system ; echo $?
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
checking extents
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 495910952960 bytes used, no error found
total csum bytes: 481840472
total tree bytes: 2388819968
total fs tree bytes: 1651097600
total extent tree bytes: 161841152
btree space waste bytes: 446707102
file data blocks allocated: 6651878428672
 referenced 542320984064
0

There were no unusual kernel log messages.


Back in the normal system (no longer USB)... I spotted this:
Aug 30 18:31:29 heisenberg kernel: BTRFS info (device dm-0): the free
space cache file (22570598400) is invalid, skip it

but not sure whether it's related (probably not)... and I haven't seen
such a free space cache file issue (or any other btrfs errors) in a
long while (I usually watch my kernel log once after booting has
finished).


Any ideas? Perhaps it's just yet another lowmem false positive...
Anything I can do to help debug this?


Apart from this I haven't noticed any corruptions recently... I'm just
about to make a full backup of the fs (or rather, of a rw snapshot of
most of the data) with tar, so most data will soon be read at least
once... and I would probably notice any further errors that are
otherwise silent.


Cheers,
Chris.



Re: [PATCH v2 1/2] btrfs-progs: Rename OPEN_CTREE_FS_PARTIAL to OPEN_CTREE_TEMPORARY_SUPER

2018-07-12 Thread Christoph Anton Mitterer
Hey.

Better late than never ;-)

Just to confirm: at least since 4.16.1, I can btrfs-restore from the
broken fs image again (the one I described in "spurious full btrfs
corruption" from around mid March).

So the regression in btrfsprogs has in fact been fixed by these
patches, it seems.


Thanks,
Chris.

On Wed, 2018-04-11 at 15:29 +0800, Qu Wenruo wrote:
> The old flag OPEN_CTREE_FS_PARTIAL is in fact quite easy to be
> confused
> with OPEN_CTREE_PARTIAL, which allow btrfs-progs to open damaged
> filesystem (like corrupted extent/csum tree).
> 
> However OPEN_CTREE_FS_PARTIAL, unlike its name, is just allowing
> btrfs-progs to open fs with temporary superblocks (which only has 6
> basic trees on SINGLE meta/sys chunks).
> 
> The usage of FS_PARTIAL is really confusing here.
> 
> So rename OPEN_CTREE_FS_PARTIAL to OPEN_CTREE_TEMPORARY_SUPER, and
> add
> extra comment for its behavior.
> Also rename BTRFS_MAGIC_PARTIAL to BTRFS_MAGIC_TEMPORARY to keep the
> naming consistent.
> 
> And with above comment, the usage of FS_PARTIAL in dump-tree is
> obviously incorrect, fix it.
> 
> Fixes: 8698a2b9ba89 ("btrfs-progs: Allow inspect dump-tree to show
> specified tree block even some tree roots are corrupted")
> Signed-off-by: Qu Wenruo 
> ---
> changelog:
> v2:
>   New patch
> ---
>  cmds-inspect-dump-tree.c |  2 +-
>  convert/main.c   |  4 ++--
>  ctree.h  |  8 +---
>  disk-io.c| 12 ++--
>  disk-io.h| 10 +++---
>  mkfs/main.c  |  2 +-
>  6 files changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/cmds-inspect-dump-tree.c b/cmds-inspect-dump-tree.c
> index 0802b31e9596..e6510851e8f4 100644
> --- a/cmds-inspect-dump-tree.c
> +++ b/cmds-inspect-dump-tree.c
> @@ -220,7 +220,7 @@ int cmd_inspect_dump_tree(int argc, char **argv)
>   int uuid_tree_only = 0;
>   int roots_only = 0;
>   int root_backups = 0;
> - unsigned open_ctree_flags = OPEN_CTREE_FS_PARTIAL;
> + unsigned open_ctree_flags = OPEN_CTREE_PARTIAL;
>   u64 block_only = 0;
>   struct btrfs_root *tree_root_scan;
>   u64 tree_id = 0;
> diff --git a/convert/main.c b/convert/main.c
> index 6bdfab40d0b0..80f3bed84c84 100644
> --- a/convert/main.c
> +++ b/convert/main.c
> @@ -1140,7 +1140,7 @@ static int do_convert(const char *devname, u32 convert_flags, u32 nodesize,
>   }
>  
>   root = open_ctree_fd(fd, devname, mkfs_cfg.super_bytenr,
> -  OPEN_CTREE_WRITES | OPEN_CTREE_FS_PARTIAL);
> +  OPEN_CTREE_WRITES | OPEN_CTREE_TEMPORARY_SUPER);
>   if (!root) {
>   error("unable to open ctree");
>   goto fail;
> @@ -1230,7 +1230,7 @@ static int do_convert(const char *devname, u32 convert_flags, u32 nodesize,
>   }
>  
>   root = open_ctree_fd(fd, devname, 0,
> -  OPEN_CTREE_WRITES | OPEN_CTREE_FS_PARTIAL);
> +  OPEN_CTREE_WRITES | OPEN_CTREE_TEMPORARY_SUPER);
>   if (!root) {
>   error("unable to open ctree for finalization");
>   goto fail;
> diff --git a/ctree.h b/ctree.h
> index fa861ba0b4c3..80d4e59a66ce 100644
> --- a/ctree.h
> +++ b/ctree.h
> @@ -45,10 +45,12 @@ struct btrfs_free_space_ctl;
>  #define BTRFS_MAGIC 0x4D5F53665248425FULL /* ascii _BHRfS_M, no null */
>  
>  /*
> - * Fake signature for an unfinalized filesystem, structures might be partially
> - * created or missing.
> + * Fake signature for an unfinalized filesystem, which only has barebone tree
> + * structures (normally 6 near empty trees, on SINGLE meta/sys temporary chunks)
> + *
> + * ascii !BHRfS_M, no null
>   */
> -#define BTRFS_MAGIC_PARTIAL 0x4D5F536652484221ULL /* ascii !BHRfS_M, no null */
> +#define BTRFS_MAGIC_TEMPORARY 0x4D5F536652484221ULL
>  
>  #define BTRFS_MAX_MIRRORS 3
>  
> diff --git a/disk-io.c b/disk-io.c
> index 58eae709e0e8..9e8b1e9d295c 100644
> --- a/disk-io.c
> +++ b/disk-io.c
> @@ -1117,14 +1117,14 @@ static struct btrfs_fs_info *__open_ctree_fd(int fp, const char *path,
>   fs_info->ignore_chunk_tree_error = 1;
>  
>   if ((flags & OPEN_CTREE_RECOVER_SUPER)
> -  && (flags & OPEN_CTREE_FS_PARTIAL)) {
> +  && (flags & OPEN_CTREE_TEMPORARY_SUPER)) {
>   fprintf(stderr,
> - "cannot open a partially created filesystem for recovery");
> + "cannot open a filesystem with temporary super block for recovery");
>   goto out;
>   }
>  
> - if (flags & OPEN_CTREE_FS_PARTIAL)
> - sbflags = SBREAD_PARTIAL;
> + if (flags & OPEN_CTREE_TEMPORARY_SUPER)
> + sbflags = SBREAD_TEMPORARY;
>  
>   ret = btrfs_scan_fs_devices(fp, path, _devices, sb_bytenr, sbflags,
>   (flags & OPEN_CTREE_NO_DEVICES));
> @@ -1285,8 +1285,8 @@ static int check_super(struct btrfs_super_block *sb, unsigned sbflags)
>   int csum_size;
>  
>   if 

Re: call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-29 Thread Christoph Anton Mitterer
On Fri, 2018-06-29 at 09:10 +0800, Qu Wenruo wrote:
> Maybe it's the old mkfs causing the problem?
> Although mkfs.btrfs added device size alignment much earlier than the
> kernel, it's still possible that the old mkfs doesn't handle the
> initial device and extra devices (mkfs.btrfs will always create a
> temporary fs on the first device, then add all the other devices to
> the system) the same way.

Well who knows,.. at least now everything's fine again :-)

Thanks guys!

Chris.


Re: call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-28 Thread Christoph Anton Mitterer
Hey Qu and Nikolay.


On Thu, 2018-06-28 at 22:58 +0800, Qu Wenruo wrote:
> Nothing special. Btrfs-progs will handle it pretty well.
Since this is a remote system where the ISP provides only a rescue
image with a pretty old kernel/btrfs-progs, I had to copy a current
local binary and use that... but that seems to have worked quite well.

> Because the WARN_ON() is newly added.
Ah I see.

> Yep, latest will warn about it, and --repair can also fix it too.
Great.


On Thu, 2018-06-28 at 17:25 +0300, Nikolay Borisov wrote:
> Was this an old FS or a fresh one?
You mean in terms of original fs creation? Probably rather oldish...
I'd guess at least a year, or maybe even 2-3 or more.

> Looking at the callstack this seems to have occurred due to the
> "btrfs_set_device_total_bytes(leaf, dev_item,
> btrfs_device_get_disk_total_bytes(device));" call. Meaning the total
> bytes of the disk were unaligned. Perhaps this has been like that for
> quite some time, then you did a couple of kernel upgrades (this
> WARN_ON was added later than 4.11) and just now you happened to
> delete a chunk which would trigger a device update on-disk?
Could be...


The following was however still a bit strange:
sda2 and sdb2 are the partitions on the two HDDs forming the RAID1.


root@rescue ~ # ./btrfs rescue fix-device-size /dev/sda2
Fixed device size for devid 2, old size: 999131127296 new size: 999131123712
Fixed super total bytes, old size: 1998262251008 new size: 1998262247424
Fixed unaligned/mismatched total_bytes for super block and device items
root@rescue ~ # ./btrfs rescue fix-device-size /dev/sdb2
No device size related problem found

As you can see, no alignment issues were found on sdb2.

I created these at the same time...
I don't think (but cannot exclude with 100% certainty) that this server
ever lost a disk (in that case I could imagine that newer progs/kernel
might have created sdb2 with proper alignment).

Looking at the partitions:

root@rescue ~ # gdisk -l /dev/sda
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)  End (sector)  Size        Code  Name
   1              2048       2097151  1023.0 MiB  EF02  BIOS boot partition
   2           2097152    1953525134  930.5 GiB   8300  Linux filesystem
root@rescue ~ # gdisk -l /dev/sdb
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdb: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)  End (sector)  Size        Code  Name
   1              2048       2097151  1023.0 MiB  EF02  BIOS boot partition
   2           2097152    1953525134  930.5 GiB   8300  Linux filesystem


Both the same... so if there was no device replace or so... then I
wonder why only one device was affected.


Cheers,
Chris.


Re: call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-28 Thread Christoph Anton Mitterer
On Thu, 2018-06-28 at 22:09 +0800, Qu Wenruo wrote:
> > [   72.168662] WARNING: CPU: 0 PID: 242 at /build/linux-
> > uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565
> > btrfs_update_device+0x1b2/0x1c0
> It looks like it's the old WARN_ON() for unaligned device size.
> Would you please verify if it is the case?

# blockdev --getsize64 /dev/sdb2 /dev/sda2
999131127296
999131127296


Since getsize64 returns bytes and not sectors, I suppose it would need
to be aligned to at least 1024?

999131127296 / 1024 = 975713991.5

So it's not.
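
For reference, a quick way to check this (a sketch; judging from the
fix-device-size output quoted elsewhere in this thread, the kernel
actually wants alignment to the 4096-byte sectorsize):

  # device size in bytes modulo the 4096-byte sectorsize; 0 means aligned
  size=$(blockdev --getsize64 /dev/sda2)
  echo $(( size % 4096 ))

999131127296 % 4096 = 3584, so indeed unaligned.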


> If so, "btrfs rescue fix-device-size" should handle it pretty well.

I guess this needs to be done with the fs unmounted?
Anything to consider since I have RAID1 (except from running it on both
devices)?


Also, it's a bit strange that this error never occurred before (though
the btrfs-rescue manpage says the kernel has checked for this since
4.11).

It would further be nice if btrfs-check would warn about this.


Thanks,
Chris.


call trace: WARNING: at /build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 btrfs_update_device

2018-06-28 Thread Christoph Anton Mitterer
Hey.

On a 4.16.16 kernel with a RAID 1 btrfs, I've been getting the
following messages since today.

Data seems still to be readable (correctly)... and there are no other
errors (like SATA errors) in the kernel log.

Any idea what these could mean?

Thanks,
Chris.


[   72.168662] WARNING: CPU: 0 PID: 242 at 
/build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 
btrfs_update_device+0x1b2/0x1c0 [btrfs]
[   72.168701] Modules linked in: cpufreq_userspace cpufreq_powersave 
cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_policy 
ipt_REJECT nf_reject_ipv4 xt_comment xt_tcpudp nf_conntrack_ipv4 powernow_k8 
nf_defrag_ipv4 edac_mce_amd snd_hda_intel kvm_amd snd_hda_codec ccp rng_core 
snd_hda_core kvm snd_hwdep irqbypass snd_pcm wmi_bmof radeon snd_timer ttm 
xt_multiport snd pcspkr soundcore drm_kms_helper k8temp ohci_pci ata_generic 
pata_atiixp ohci_hcd ehci_pci sg wmi xt_conntrack drm nf_conntrack i2c_algo_bit 
ehci_hcd usbcore button sp5100_tco usb_common shpchp i2c_piix4 iptable_filter 
binfmt_misc sunrpc hwmon_vid ip_tables x_tables autofs4 btrfs zstd_decompress 
zstd_compress xxhash sd_mod raid10 raid456 async_raid6_recov
[   72.168776]  async_memcpy async_pq async_xor async_tx libcrc32c 
crc32c_generic xor raid6_pq raid1 raid0 multipath linear md_mod evdev ahci 
libahci serio_raw libata r8169 mii scsi_mod
[   72.168820] CPU: 0 PID: 242 Comm: btrfs-cleaner Not tainted 4.16.0-2-amd64 
#1 Debian 4.16.16-2
[   72.168852] Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7551/KA780G 
(MS-7551), BIOS V16.6 05/12/2010
[   72.168907] RIP: 0010:btrfs_update_device+0x1b2/0x1c0 [btrfs]
[   72.168939] RSP: 0018:bd5a810a3d60 EFLAGS: 00010206
[   72.168973] RAX: 0fff RBX: 938e847f8000 RCX: 00e8a0db1e00
[   72.169006] RDX: 1000 RSI: 3f5c RDI: 938e7a8015e0
[   72.169040] RBP: 938e8fb97a00 R08: bd5a810a3d10 R09: bd5a810a3d18
[   72.169073] R10: 0003 R11: 3000 R12: 
[   72.169106] R13: 3f3c R14: 938e7a8015e0 R15: 938e8f0c6328
[   72.169140] FS:  () GS:938e9dc0() 
knlGS:
[   72.169177] CS:  0010 DS:  ES:  CR0: 80050033
[   72.169210] CR2: 7fcff92ce000 CR3: 00020575e000 CR4: 06f0
[   72.169243] Call Trace:
[   72.169304]  btrfs_remove_chunk+0x2a9/0x8c0 [btrfs]
[   72.169359]  btrfs_delete_unused_bgs+0x323/0x3f0 [btrfs]
[   72.169415]  ? __btree_submit_bio_start+0x20/0x20 [btrfs]
[   72.169469]  cleaner_kthread+0x152/0x160 [btrfs]
[   72.169506]  kthread+0x113/0x130
[   72.169540]  ? kthread_create_worker_on_cpu+0x70/0x70
[   72.169575]  ? SyS_exit_group+0x10/0x10
[   72.169610]  ret_from_fork+0x35/0x40
[   72.169643] Code: 4c 89 f7 45 31 c0 ba 10 00 00 00 4c 89 ee e8 16 23 ff ff 
4c 89 f7 e8 9e ef fc ff e9 de fe ff ff 41 bc f4 ff ff ff e9 db fe ff ff <0f> 0b 
eb b7 e8 85 4c 1a c5 0f 1f 44 00 00 66 66 66 66 90 41 55 
[   72.169705] ---[ end trace ed549af9d9cf6190 ]---
[   72.170009] WARNING: CPU: 0 PID: 242 at 
/build/linux-uwVqDp/linux-4.16.16/fs/btrfs/ctree.h:1565 
btrfs_update_device+0x1b2/0x1c0 [btrfs]
[   72.170050] Modules linked in: cpufreq_userspace cpufreq_powersave 
cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_policy 
ipt_REJECT nf_reject_ipv4 xt_comment xt_tcpudp nf_conntrack_ipv4 powernow_k8 
nf_defrag_ipv4 edac_mce_amd snd_hda_intel kvm_amd snd_hda_codec ccp rng_core 
snd_hda_core kvm snd_hwdep irqbypass snd_pcm wmi_bmof radeon snd_timer ttm 
xt_multiport snd pcspkr soundcore drm_kms_helper k8temp ohci_pci ata_generic 
pata_atiixp ohci_hcd ehci_pci sg wmi xt_conntrack drm nf_conntrack i2c_algo_bit 
ehci_hcd usbcore button sp5100_tco usb_common shpchp i2c_piix4 iptable_filter 
binfmt_misc sunrpc hwmon_vid ip_tables x_tables autofs4 btrfs zstd_decompress 
zstd_compress xxhash sd_mod raid10 raid456 async_raid6_recov
[   72.170152]  async_memcpy async_pq async_xor async_tx libcrc32c 
crc32c_generic xor raid6_pq raid1 raid0 multipath linear md_mod evdev ahci 
libahci serio_raw libata r8169 mii scsi_mod
[   72.170204] CPU: 0 PID: 242 Comm: btrfs-cleaner Tainted: GW
4.16.0-2-amd64 #1 Debian 4.16.16-2
[   72.170241] Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7551/KA780G 
(MS-7551), BIOS V16.6 05/12/2010
[   72.170300] RIP: 0010:btrfs_update_device+0x1b2/0x1c0 [btrfs]
[   72.170333] RSP: 0018:bd5a810a3d60 EFLAGS: 00010206
[   72.170367] RAX: 0fff RBX: 938e847f8000 RCX: 00e8a0db1e00
[   72.170401] RDX: 1000 RSI: 3f5c RDI: 938e7a8015e0
[   72.170434] RBP: 938e8fb97a00 R08: bd5a810a3d10 R09: bd5a810a3d18
[   72.170468] R10: 0003 R11: 3000 R12: 
[   72.170501] R13: 3f3c R14: 938e7a8015e0 R15: 938e8f0c6328
[   

in which directions does btrfs send -p | btrfs receive work

2018-06-06 Thread Christoph Anton Mitterer
Hey.

Just wondered about the following:

When I have a btrfs which acts as a master, and from which I make
copies of its snapshots via send/receive (using -p at send) to other
btrfs filesystems which act as copies, like this:

master +--> copy1
       +--> copy2
       \--> copy3

and if now e.g. the device of master breaks, can I carry on with
*incremental* send -p / receive backups from one of the copies?

Which of the following two would work (or both?):

A) Redesignating a copy to be the new master, e.g.:

   old-copy1/new-master +--> new-disk/new-copy1
                        +--> copy2
                        \--> copy3

   Obviously send/receiving to new-copy1 should work at least, but
   would that work as well to copy2/copy3 (with -p), since they're
   based on (and probably using UUIDs from) the snapshot on the old
   broken master?

B) Let a new device be the master and move on from that (kinda creating
   a "send/receive cycle"):
   1st:
   copy1 +--> new-disk/new-master

   from then on (when new snapshots should be incrementally sent):
   new-master +--> copy1
              +--> copy2
              \--> copy3

   Again, not sure whether send/receiving to copy2/3 would work, since
   they're based on snapshots/parents from the old broken master.
   And I'm even more unsure whether this back-and-forth send/receiving,
   from copy1 -> new-master -> copy1, would work.
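
For reference, the flow I mean, as a minimal sketch with hypothetical
mount points (AFAIU, receive locates the -p parent on the destination
by its UUID, which is why I worry about the UUIDs after redesignating
a master):

  # initial full transfer of a read-only snapshot
  btrfs send /mnt/master/snap-1 | btrfs receive /mnt/copy1

  # later: take a new ro snapshot and send only the delta against the
  # parent snapshot that both sides already have
  btrfs subvolume snapshot -r /mnt/master/data /mnt/master/snap-2
  btrfs send -p /mnt/master/snap-1 /mnt/master/snap-2 | btrfs receive /mnt/copy1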


Any expert having some definite idea? :-)

Thanks,
Chris.


Re: Btrfs progs release 4.16.1

2018-04-25 Thread Christoph Anton Mitterer
On Wed, 2018-04-25 at 07:22 -0400, Austin S. Hemmelgarn wrote:
> While I can understand Duncan's point here, I'm inclined to agree
> with 
> David

Same from my side... and I run a multi-PiB storage site (though not
with btrfs).

Cosmetically speaking one shouldn't do this in a bugfix release, but it
should really have no impact on the real world.

The typical sysadmin will use some stable distribution anyway... and is
there any that already ships 4.16?

Cheers,
Chris.


Re: spurious full btrfs corruption

2018-03-26 Thread Christoph Anton Mitterer
Hey Qu.

Some update on the corruption issue on my Fujitsu notebook:


Finally got around to running some memtest on it... and a few seconds
after it started I already got this:
https://paste.pics/1ff8b13b94f31082bc7410acfb1c6693

So plenty of bad memory...

I'd say it's probably not so unlikely that *this* was the actual reason
for btrfs-metadata corruption.

It would perfectly fit to the symptom that I saw shortly before the fs
was completely destroyed:
The spurious csum errors on reads that went away when I read the file
again.



I'd guess you also found no further issue with the v1 space cache
and/or the tree log in the meantime?
So it's probably safe to turn them on again?



We (aka you, plus me testing fixes) can still look into the issue that
newer btrfs-progs no longer recover anything from the broken fs, while
older ones do.
I can keep the image around, so no need to hurry from your side.



Cheers,
Chris.


Re: spurious full btrfs corruption

2018-03-21 Thread Christoph Anton Mitterer
Just some addition on this:

On Fri, 2018-03-16 at 01:03 +0100, Christoph Anton Mitterer wrote:
> The issue that newer btrfs-progs/kernel don't restore anything at all
> from my corrupted fs:

4.13.3 seems to be already buggy...

4.7.3 works, but interestingly btrfs-find-super seems to hang on it
forever with 100% CPU but apparently no disk IO (it works in later
versions, where it finishes in a few seconds).


Cheers,
Chris.


Re: Status of RAID5/6

2018-03-21 Thread Christoph Anton Mitterer
Hey.

Some things would IMO be nice to get done/clarified (i.e. documented in
the Wiki and manpages) from users'/admin's  POV:

Some basic questions:
- Starting with which kernels (including stable kernel versions) does
it contain the fixes for the bigger issues from some time ago?

- Exactly what does not work yet (only the write hole?)?
  What's the roadmap for such non-working things?

- Ideally some explicit confirmations of what's considered to work,
  like:
  - compression+raid?
  - rebuild / replace of devices?
  - changing raid lvls?
  - repairing data (i.e. picking the right block according to csums in
case of silent data corruption)?
  - scrub (and scrub+repair)?
  - anything to consider with raid when doing snapshots, send/receive
or defrag?
  => and for each of these: for which raid levels?

  Perhaps also confirmation for previous issues:
  - I vaguely remember there were issues with either device delete or
replace and that one of them was possibly super-slow?
  - I also remember there were cases in which a fs could end up in
permanent read-only state?


- Clarifying questions on what is expected to work and how things are
  expected to behave, e.g.:
  - Can one unplug a device (without deleting/removing it first) just
    under operation, and will btrfs survive it?
  - If an error is found (e.g. silent data corruption based on csums),
when will it repair (fix = write the repaired data) the data?
On the read that finds the bad data?
Only on scrub (i.e. do users need to regularly run scrubs)? 
  - What happens if error cannot be repaired, e.g. no csum information
or all blocks bad?
EIO? Or are there cases where it gives no EIO (I guess at least in
nodatacow case)
  - What happens if data cannot be fixed (i.e. trying to write the
repaired block again fails)?
And if the repaired block is written, will it be immediately
checked again (to find cases of blocks that give different results
again)?
  - Will a scrub check only the data on "one" device... or will it
check all the copies (or parity blocks) on all devices in the raid?
  - Does a fsck check all devices or just one?
  - Does a balance implicitly contain a scrub?
  - If a rebuild/repair/reshape is performed... can these be
interrupted? What if they are forcibly interrupted (power loss)?


- Explaining common workflows:
  - Replacing a faulty or simply an old disk.
How to stop btrfs from using a device (without bricking the fs)?
How to do the rebuild.
  - Best practices, like: should one do regular balances (and if so, as
asked above, do these include the scrubs, so basically: is it
enough to do one of them)
  - How to grow/shrink raid btrfs... and if this is done... how to
replicate the data already on the fs to the newly added disks (or
is this done automatically - and if so, how to see that it's
finished)?
  - What will actually trigger repairs? (i.e. one wants to get silent
block errors fixed ASAP and not only when the data is read - and
when it's possibly to late)
  - In the rebuild/repair phase (e.g. one replaces a device): Can one
    somehow give priority to the rebuild/repair? (E.g. in case of a
    degraded raid, one may want to get that solved ASAP and rather slow
    down other reads or stop them completely.)
  - Is there anything to notice when btrfs raid is placed above dm-
crypt from a security PoV?
With MD raid that wasn't much of a problem as it's typically placed
below dm-crypt... but btrfs raid would need to be placed above it.
So maybe there are some known attacks against crypto modes, if
equal (RAID 1 / 10) or similar/equal (RAID 5/6) data is written
above multiple crypto devices? (Probably something one would need
to ask their experts).


- Maintenance tools
  - How to get the status of the RAID? (Querying kernel logs is IMO
    rather a bad way for this; see the sketch after this list.)
    This includes:
    - Is the raid degraded or not?
    - Are scrubs/repairs/rebuilds/reshapes in progress, and how far are
      they? (Reshape would be: if the raid level is changed or the raid
      grown/shrunk: has all data been replicated enough to be
      "complete" for the desired raid lvl/number of devices/size?)
  - What should one regularly do? Scrubs? Balance? How often?
    Do we get any automatic (but configurable) tools for this?
  - There should be support in commonly used tools, e.g. Icinga/Nagios
    check_raid.
  - Ideally there should also be some desktop notification tool which
    tells about raid (and btrfs errors in general), as small
    installations with raids typically run no Icinga/Nagios but rely
    on e.g. email or gui notifications.
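
For at least part of the status questions, the closest existing tooling
I'm aware of today, as a sketch (hardly a replacement for proper
monitoring):

  # per-device error counters (read/write/flush/corruption/generation)
  btrfs device stats /mnt

  # kick off a scrub and query its progress later
  btrfs scrub start /mnt
  btrfs scrub status /mnt

  # allocation overview, e.g. to judge whether a balance would help
  btrfs filesystem usage /mnt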

I think especially for such tools it's important that these are
maintained by upstream (and yes, I know you guys are rather fs
developers)... but since these tools are so vital, having them done by
a 3rd party can easily lead to the situation where something changes in

Re: [PATCH] btrfs-progs: mkfs: add uuid and otime to ROOT_ITEM of FS_TREE

2018-03-19 Thread Christoph Anton Mitterer
On Mon, 2018-03-19 at 14:02 +0100, David Sterba wrote:
> We can do that by a special purpose tool.

No average user will ever run (or even know about) that...

Could you perhaps either do it automatically in fsck (which is IMO also
a bad idea, as fsck should be read-only per default)... or at least add
a warning to fsck, like "Info: Please run tool foo to get bar done."?

Cheers,
Chris.


Re: spurious full btrfs corruption

2018-03-15 Thread Christoph Anton Mitterer
Hey.

Found some time to move on with this:


First, I think from my side (i.e. restoring as much as possible) I'm
basically done now, so everything left over here is looking for
possible bugs/etc.

I have from my side no indication that my corruptions were actually a
bug in btrfs... the new notebook used to be unstable for some time and
it might be just that.
Also that second occurrence of csum errors (when I made a image from
the broken fs to external HDD) kinda hints that it may be a memory
issue (though I haven't found time to run memtest86+ yet).

So let's just suppose that btrfs code is as rocksolid as its raid56 is
;-P and assume the issues were caused by some unlucky memory corruption
that just happened to hit the wrong (important) meta-data.




The issue that newer btrfs-progs/kernel don't restore anything at all
from my corrupted fs:

On Fri, 2018-03-09 at 07:48 +0800, Qu Wenruo wrote:
> > So something changed after 4.14, which makes the tools no longer
> > being
> > able to restore at least that what they could restore at 4.14.
> 
> This seems to be a regression.
> But I'm not sure if it's the kernel to blame or the btrfs-progs.
> 
> > 
> > 
> > => Some bug recently introduced in btrfs-progs?
> 
> Is the "block mapping error" message from kernel or btrfs-progs?

All messages are from progs unless noted otherwise.
/dev/mapper/restore being the image from the broken SSD fs.
Everything below was on the OLD laptop (which has probably no memory or
whichever issues) under kernel 4.15.4 and progs 4.15.1.

# btrfs-find-root /dev/mapper/restore 
Couldn't map the block 4503658729209856
No mapping for 4503658729209856-4503658729226240
Couldn't map the block 4503658729209856
Superblock thinks the generation is 2083143
Superblock thinks the level is 1
Found tree root at 58572800 gen 2083143 level 1
Well block 27820032(gen: 2083133 level: 1) seems good, but generation/level 
doesn't match, want gen: 2083143 level: 1
Well block 25526272(gen: 2083132 level: 1) seems good, but generation/level 
doesn't match, want gen: 2083143 level: 1
Well block 21807104(gen: 2083131 level: 1) seems good, but generation/level 
doesn't match, want gen: 2083143 level: 1
Well block 11829248(gen: 2083130 level: 1) seems good, but generation/level 
doesn't match, want gen: 2083143 level: 1
Well block 8716288(gen: 2083129 level: 1) seems good, but generation/level 
doesn't match, want gen: 2083143 level: 1
Well block 6209536(gen: 2083128 level: 1) seems good, but generation/level 
doesn't match, want gen: 2083143 level: 1




# btrfs-debug-tree -b 27820032 /dev/mapper/restore 
btrfs-progs v4.15.1
Couldn't map the block 4503658729209856
No mapping for 4503658729209856-4503658729226240
Couldn't map the block 4503658729209856
bytenr mismatch, want=4503658729209856, have=0
node 27820032 level 1 items 2 free 491 generation 2083133 owner 1
fs uuid b6050e38-716a-40c3-a8df-fcf1dd7e655d
chunk uuid ae6b0cc6-bbc5-4131-b3f3-41b748f5a775
key (EXTENT_TREE ROOT_ITEM 0) block 27836416 (1699) gen 2083133
key (1853 INODE_ITEM 0) block 28000256 (1709) gen 2083133

=> I *think* (but not 100% sure - would need to double check if it's
important for you to know), that the older progs/kernel showed me much
more here




# btrfs-debug-tree /dev/mapper/restore 
btrfs-progs v4.15.1
Couldn't map the block 4503658729209856
No mapping for 4503658729209856-4503658729226240
Couldn't map the block 4503658729209856
bytenr mismatch, want=4503658729209856, have=0
ERROR: unable to open /dev/mapper/restore

=> same here: I *think* (but not 100% sure - would need to double check
if it's important for you to know), that the older progs/kernel showed
me much more here




# btrfs-debug-tree -b 27836416 /dev/mapper/restore 
btrfs-progs v4.15.1
Couldn't map the block 4503658729209856
No mapping for 4503658729209856-4503658729226240
Couldn't map the block 4503658729209856
bytenr mismatch, want=4503658729209856, have=0
leaf 27836416 items 63 free space 6131 generation 2083133 owner 1
leaf 27836416 flags 0x1(WRITTEN) backref revision 1
fs uuid b6050e38-716a-40c3-a8df-fcf1dd7e655d
chunk uuid ae6b0cc6-bbc5-4131-b3f3-41b748f5a775
item 0 key (EXTENT_TREE ROOT_ITEM 0) itemoff 15844 itemsize 439
generation 2083133 root_dirid 0 bytenr 27328512 level 2 refs 1
lastsnap 0 byte_limit 0 bytes_used 182190080 flags 0x0(none)
uuid ----
drop key (0 UNKNOWN.0 0) level 0
item 1 key (DEV_TREE ROOT_ITEM 0) itemoff 15405 itemsize 439
generation 2083129 root_dirid 0 bytenr 9502720 level 1 refs 1
lastsnap 0 byte_limit 0 bytes_used 114688 flags 0x0(none)
uuid ----
drop key (0 UNKNOWN.0 0) level 0
item 2 key (FS_TREE INODE_REF 6) itemoff 15388 itemsize 17
index 0 namelen 7 name: default
item 3 key (FS_TREE ROOT_ITEM 0) itemoff 14949 itemsize 439

Re: zerofree btrfs support?

2018-03-14 Thread Christoph Anton Mitterer
Hey.


On Wed, 2018-03-14 at 20:38 +0100, David Sterba wrote:
> I have a prototype code for that and after the years, seeing the
> request
> again, I'm not against adding it as long as it's not advertised as a
> security feature.
I'd expect anyone in the security area to know that securely deleting
data is not done by overwriting it (even overwriting it multiple times
may not be enough).
So I don't think that it would be btrfs' or zerofree's duty to teach
that to users.

The later's manpage doesn't advertise it for such purpose and even
contains a (though perhaps a bit too vague) warning:
>It  may  however  be useful in other situations: for instance it can be
>used to make it more difficult to retrieve deleted  data.  Beware  that
>securely  deleting  sensitive  data  is not in general an easy task and
>usually requires writing several times on the deleted blocks.

They should probably drop the first "can be used to make it difficult"
sentence... and add that even overwriting multiple times is often not
enough.


> The zeroing simply builds on top of the trim code, so it's merely
> adding
> the ioctl interface and passing down the desired operation.
Well, I think what would be really mandatory if such support is added
to a 3rd-party tool is that it definitely continues to work (without
causing corruption or the like), even if btrfs changes.

And if it's just using existing btrfs kernel code (and zerofree itself
would mostly do nothing)... then that seems quite promising. :-)


I personally don't need it that desperately anymore, since I got
discard support working in my qemu... but others may still benefit from
it, so if it's easy, why not!? :-)

Cheers,
Chris.


Re: Ongoing Btrfs stability issues

2018-03-13 Thread Christoph Anton Mitterer
On Tue, 2018-03-13 at 20:36 +0100, Goffredo Baroncelli wrote:
> A checksum mismatch, is returned as -EIO by a read() syscall. This is
> an event handled badly by most part of the programs.
Then these programs must simply be fixed... otherwise they'll also fail
under normal circumstances with btrfs, if there is any corruption.


> The problem is the following: there is a time window between the
> checksum computation and the writing the data on the disk (which is
> done at the lower level via a DMA channel), where if the data is
> update the checksum would mismatch. This happens if we have two
> threads, where the first commits the data on the disk, and the second
> one updates the data (I think that both VM and database could behave
> so).
Well that's clear... but isn't that time window also there if the
extent is just written without CoW (regardless of checksumming)?
Obviously there would need to be some protection here anyway, so that
such data is served e.g. from RAM before the write has completed, so
that a read wouldn't take place while the write has only half
finished?!
So I'd naively assume one could just extend that protection to the
completion of the checksum write...



> In btrfs, a checksum mismatch creates an -EIO error during the
> reading. In a conventional filesystem (or a btrfs filesystem w/o
> datasum) there is no checksum, so this problem doesn't exist.
If ext writes an extent (can't that be up to 128MiB there?), then I'm
sure it cannot write that atomically (in terms of hardware)... so there
is likely some protection around this operation ensuring there are no
concurrent reads of that particular extent from the disk while the
write hasn't finished yet.



> > Even if not... I should be only a problem in case of a crash during
> > that,.. and than I'd still prefer to get the false positive than
> > bad
> > data.
> 
> How you can know if it is a "bad data" or a "bad checksum" ?
Well as I've said, in my naive thinking this should only be a problem
in case of a crash... and then, yes, one cannot say whether it's bad
data or a bad checksum (that's exactly what I'm saying)... but I'd
rather prefer to know that something might be fishy than not knowing
anything and perhaps even getting good data "RAID-repaired" with bad
data...


Cheers,
Chris.


Re: Ongoing Btrfs stability issues

2018-03-12 Thread Christoph Anton Mitterer
On Mon, 2018-03-12 at 22:22 +0100, Goffredo Baroncelli wrote:
> Unfortunately no, the likelihood might be 100%: there are some
> patterns which trigger this problem quite easily. See The link which
> I posted in my previous email. There was a program which creates a
> bad checksum (in COW+DATASUM mode), and the file became unreadable.
But that rather seems like a plain bug?!

No reason that would conceptually make checksumming+notdatacow
impossible.

AFAIU, the conceptual thing would be about:
- data is written in nodatacow
  => thus a checksum must be written as well, so write it
- what can then of course happen is
  - both csum and data are written => fine
  - csum is written but data not and then some crash => csum will show
that => fine
  - data is written but csum not and then some crash => csum will give
false positive

Still, better a few false positives than many unnoticed data
corruptions and no true raid repair.


> If you cannot know if a checksum is bad or the data is bad, the
> checksum is not useful at all!
Why not? It's anyway only uncertain in the case of a crash... and it at
least tells you that something is fishy.
A program which cares about its data will have its own journaling
means and can simply recover via those... or users could then just
restore from a backup.
Or one could provide some API/userland tool to recompute the csums of
the affected file (and possibly live with bad data).


> If I read correctly what you wrote, it seems that you consider a
> "minor issue" the fact that the checksum is not correct. If you
> accept the possibility that a checksum might be wrong, you wont trust
> anymore the checksum; so the checksum became not useful.
There's simply no disadvantage compared to not having checksumming at
all in the nodatacow case.
Because then you never have any idea whether your data is correct or
not... while the case of checksumming + nodatacow, which can give a
false positive on a crash when the data was written correctly but the
checksum wasn't, at least covers the other cases of data corruption
(silent data corruption; csum written, but data not or only partially,
in case of a crash).


> Again, you are assuming that the likelihood of having a bad checksum
> is low. Unfortunately this is not true. There are pattern which
> exploits this bug with a likelihood=100%.

Okay, I don't understand why this would be so, and I wouldn't have
assumed that the IO pattern can affect it heavily... but I'm not really
a btrfs expert.

My blind assumption would have been that writing an extent of data
takes much longer to complete than writing the corresponding checksum.

Even if not... it should only be a problem in case of a crash during
that... and then I'd still prefer to get the false positive rather than
bad data.


Anyway... it's not going to happen, so the discussion is pointless.
I think people can probably use dm-integrity (which btw does no CoW
either (IIRC) and can still provide integrity... ;-) ) to see whether
their data is valid.
Not nice, but since it won't change in btrfs, a possible alternative.
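
For whoever wants to try that route, a minimal sketch, assuming a
dedicated (empty!) device /dev/sdX and the defaults (IIRC crc32c
checksums):

  # format the device with dm-integrity metadata (this wipes it!)
  integritysetup format /dev/sdX

  # open the mapping; reads through it return EIO on checksum mismatch
  # instead of silently handing back bad data
  integritysetup open /dev/sdX integr
  mkfs.ext4 /dev/mapper/integr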


Cheers,
Chris.


Re: Ongoing Btrfs stability issues

2018-03-11 Thread Christoph Anton Mitterer
On Sun, 2018-03-11 at 18:51 +0100, Goffredo Baroncelli wrote:
> 
> COW is needed to properly checksum the data. Otherwise is not
> possible to ensure the coherency between data and checksum (however I
> have to point out that BTRFS fails even in this case [*]).
> We could rearrange this sentence, saying that: if you want checksum,
> you need COW...

No,... not really... the meta-data is anyway always CoWed... so if you
do checksums *and* notdatacow, the only thing that could possibly
happen (in the worst case) is that data which actually made it
correctly to the disk is falsely determined bad, because the metadata
(i.e. the checksums) wasn't updated correctly.

That however is probably much less likely than the other way round,
i.e. bad data went to disk and would be detected with checksumming.


I had lots of discussions about this here on the list, and no one ever
brought up a real argument against it... I also had an off-list
discussion with Chris Mason who IIRC confirmed that it would actually
work as I imagine it... with the only two problems:
- good data possibly be marked bad because of bad checksums
- reads giving back EIO where people would rather prefer bad data
(not really sure if this were really his two arguments,... I'd have to
look it up, so don't nail me down).


Long story short:

In any case, I think giving back bad data without EIO is unacceptable.
If someone really doesn't care (e.g. because he has higher level
checksumming and possibly even repair) he could still manually disable
checksumming.
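
For completeness, that manual disabling is what people effectively do
today via the C attribute; a sketch (note that +C only takes effect on
files created while empty):

  mkdir /srv/vm-images
  chattr +C /srv/vm-images        # new files in here inherit nodatacow
  qemu-img create -f raw /srv/vm-images/disk.raw 20G
  lsattr /srv/vm-images/disk.raw  # should show the C attribute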

The small chance of having a false positive weighs IMO far less than
having very large amounts of data (DBs and VM images are our typical
cases) completely unprotected.

And not having checksumming with notdatacow breaks any safe raid repair
(so in that case "repair" may even overwrite good data),... which is
IMO also unacceptable.
And the typical use cases for nodatacow (VMs, DBs) are in turn not so
uncommon to want RAID.


I really like btrfs... and it's not that other fs (which typically
have no checksumming at all) would perform better here... but not
having it for these major use cases is a big disappointment for me.


Cheers,
Chris.


Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 23:31 +0500, Roman Mamedov wrote:
> QCOW2 would add a second layer of COW
> on top of
> Btrfs, which sounds like a nightmare.

I've just seen there is even a nocow option "specifically" for btrfs...
it seems however that it doesn't disable the CoW of qcow, but rather
that of btrfs... (thus silently also the checksumming).


Does plain qcow2 really CoW on every write? I've always assumed it
would only CoW when one makes snapshots or so...
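
For reference, the option I mean, as a sketch; as said, it appears to
set btrfs nodatacow on the image file rather than change qcow2's own
behaviour:

  # create the qcow2 image with the btrfs C (nodatacow) attribute set
  qemu-img create -f qcow2 -o nocow=on disk.qcow2 20G
  lsattr disk.qcow2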


Cheers,
Chris.


Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 16:50 +0100, Adam Borowski wrote:
> Since we're on a btrfs mailing list
Well... my original question was whether someone could add zerofree
support for btrfs (which I think would best be done by someone who
knows how btrfs really works)... thus I directed the question to this
list and not to some qemu list :-)


> It works only with scsi and virtio-scsi drivers.  Most qemu setups
> use
> either ide (ouch!) or virtio-blk.
Seems my libvirt-created VMs use "sata" per default... and in the
meantime it does seem to work with that, too.
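
For reference, the qemu-side knob this is about, as a sketch of a plain
qemu invocation with a hypothetical image name (libvirt has an
equivalent discard='unmap' attribute on the <driver> element):

  qemu-system-x86_64 \
    -device virtio-scsi-pci,id=scsi0 \
    -drive file=disk.qcow2,if=none,id=d0,discard=unmap \
    -device scsi-hd,drive=d0,bus=scsi0.0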


Thanks :-)

Chris.


Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 19:37 +0500, Roman Mamedov wrote:
> Note you can use it on HDDs too, even without QEMU and the like: via
> using LVM
> "thin" volumes. I use that on a number of machines, the benefit is
> that since
> TRIMed areas are "stored nowhere", those partitions allow for
> incredibly fast
> block-level backups, as it doesn't have to physically read in all the
> free
> space, let alone any stale data in there. LVM snapshots are also way
> more
> efficient with thin volumes, which helps during backup.
I was thinking about using those... but then I'd have to use loop
device files I guess... which I also want to avoid.



> > dm-crypt per default blocks discard.
> 
> Out of misguided paranoia. If your crypto is any good (and last I
> checked AES
> was good enough), there's really not a lot to gain for the "attacker"
> knowing
> which areas of the disk are used and which are not.
I'm not an expert here... but a) I think it would be independent of AES
and rather the encryption mode (e.g. XTS) which protects here or not...
and b) we've seen too many attacks on crypto based on smart statistics
and knowing which blocks on a medium are actually data or just "random
crypto noise" (and you know that when using TRIM) can already tell a
lot.
At least it could tell an attacker how much data there is on a fs.
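
For completeness, whoever weighs that trade-off differently can pass
discards through; a sketch:

  # one-shot, when opening the mapping:
  cryptsetup open --allow-discards /dev/sdX2 cryptroot

  # or persistently via /etc/crypttab, with the "discard" option:
  # cryptroot  /dev/sdX2  none  luks,discard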

 
> It works, just not with some of the QEMU virtualized disk device
> drivers.
> You don't need to use qemu-img to manually dig holes either, it's all
> automatic.
You're right... seems like in older versions one needed to set
virtio-scsi as the device driver (which I possibly missed), but
nowadays it even seems to work with sata.



> QEMU deallocates parts of its raw images for those areas which have
> been
> TRIM'ed by the guest. In fact I never use qcow2, always raw images
> only.
> Yet, boot a guest, issue fstrim, and see the raw file while still
> having the
> same size, show much lower actual disk usage in "du".
Works with qcow2 as well... heck, even Windows can do it (though it has
no fstrim, and it seems one needs to run defrag, which probably does,
next to defragmentation, also what fstrim does).
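
An easy way to watch this, as a sketch (hypothetical image name):

  # inside the guest: trim all mounted filesystems, verbosely
  fstrim -av

  # on the host afterwards: the apparent size stays constant while the
  # actual disk usage drops
  du -h --apparent-size disk.raw
  du -h disk.raw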


Fine for me... though non-qemu users may still be interested in having
zerofree.


Cheers,
Chris.


Re: Ongoing Btrfs stability issues

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 14:04 +0200, Nikolay Borisov wrote:
> So for OLTP workloads you definitely want nodatacow enabled, bear in
> mind this also disables crc checksumming, but your db engine should
> already have such functionality implemented in it.

Unlike repeated claims made here on the list and in other places... I
wouldn't know of *any* DB system which actually does this by default,
or in a way that would be comparable to filesystem-level checksumming.


Look back in the archives... when I asked several times for
checksumming support *with* nodatacow, I evaluated the existing status
for the big ones (postgres, mysql, sqlite, bdb)... and all of them had
this either not enabled by default, not at all, or requiring special
support from the program using the DB.


Similar btw: no single VM image type I've evaluated back then had any
form of checksumming integrated.


Still, one of the major deficiencies (not in comparison to other fs,
but in comparison to how it should be) of btrfs unfortunately :-(


Cheers,
Chris.


Re: zerofree btrfs support?

2018-03-10 Thread Christoph Anton Mitterer
On Sat, 2018-03-10 at 09:16 +0100, Adam Borowski wrote:
> Do you want zerofree for thin storage optimization, or for security?
I don't think one can really use it for security (neither on SSD nor
HDD).
On both, zeroed blocks may still be readable by forensic measures.

So optimisation, i.e. digging holes in VM image files and make them
sparse.


> For the former, you can use fstrim; this is enough on any modern SSD;
> on HDD
> you can rig the block device to simulate TRIM by writing zeroes.  I'm
> sure
> one of dm-* can do this, if not -- should be easy to add, there's
> also
> qemu-nbd which allows control over discard, but incurs a performance
> penalty
> compared to playing with the block layer.

Writing zeros is of course possible... but rather ugly... one really
needs to write *everything*, while a smart tool could just zero those
block groups that have been used (while everything else is still zero
from the original image file).
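To illustrate (mount point hypothetical; the brute-force way really
writes every free block, while fstrim merely tells the device which
ranges are unused):

# dd if=/dev/zero of=/mnt/vm/zerofill bs=1M ; sync ; rm /mnt/vm/zerofill
# fstrim -v /mnt/vm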

TRIM/discard... not sure how far this is really a solution.

The first thing that comes to my mind is that *if* the discard would
propagate down below a dm-crypt layer (e.g. in my case there is:
SSD->partitions->dmcrypt->LUKS->btrfs->image-files-I-want-to-zero),
it has effects on security, which is why dm-crypt blocks discard by
default.
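(For reference, one has to allow it explicitly; device name
hypothetical:

# cryptsetup open --allow-discards /dev/sdX2 cryptdata

or persistently via the "discard" option in /etc/crypttab.)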

Some time ago I had a look at whether qemu would support that on its
own... i.e. the guest and its btrfs would normally use discard, the
image file below would mark the blocks as discarded, and later on one
could use some qemu-img command to dig holes into exactly those
locations.
Back then it didn't seem to work.
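For reference, recent QEMU versions apparently expose this via drive
options; a sketch, untested here and with hypothetical file names:

qemu-system-x86_64 ... \
  -drive file=disk.qcow2,if=none,id=d0,discard=unmap,detect-zeroes=unmap \
  -device virtio-scsi-pci -device scsi-hd,drive=d0

# qemu-img convert -O qcow2 disk.qcow2 disk-compact.qcow2   # re-compact offline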

But even if it does work in the meantime, a proper zerofree
implementation would be beneficial for all non-qemu/qcow2 users (e.g.
if one uses raw images in qemu, the whole thing couldn't work except by
really zeroing the blocks inside the guest).


Cheers,
Chris.


zerofree btrfs support?

2018-03-09 Thread Christoph Anton Mitterer
Hi.

Just wondered... was it ever planned (or is there some equivalent) to
get support for btrfs in zerofree?

Thanks,
Chris.


call trace on btrfs send/receive

2018-03-09 Thread Christoph Anton Mitterer
Hey.

The following still happens with 4.15 kernel/progs:

btrfs send -p oldsnap newsnap | btrfs receive /some/other/fs

Mar 10 00:48:10 heisenberg kernel: WARNING: CPU: 5 PID: 32197 at 
/build/linux-PFKtCE/linux-4.15.4/fs/btrfs/send.c:6487 
btrfs_ioctl_send+0x48f/0xfb0 [btrfs]
Mar 10 00:48:10 heisenberg kernel: Modules linked in: udp_diag tcp_diag 
inet_diag algif_skcipher af_alg uas vhost_net vhost tap xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 
nf_nat tun bridge stp llc ctr ccm fuse ebtable_filter ebtables devlink 
cpufreq_userspace cpufreq_powersave cpufreq_conservative ip6t_REJECT 
nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter 
ip6_tables xt_policy ipt_REJECT nf_reject_ipv4 xt_comment nf_conntrack_ipv4 
nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack binfmt_misc 
iptable_filter joydev snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_codec_generic arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp 
coretemp btusb btrtl btbcm btintel kvm_intel iwldvm bluetooth kvm irqbypass 
rtsx_pci_sdmmc rtsx_pci_ms memstick mmc_core
Mar 10 00:48:10 heisenberg kernel:  mac80211 iTCO_wdt crct10dif_pclmul 
iTCO_vendor_support uvcvideo crc32_pclmul videobuf2_vmalloc videobuf2_memops 
ghash_clmulni_intel videobuf2_v4l2 drbg intel_cstate videobuf2_core iwlwifi 
ansi_cprng intel_uncore videodev ecdh_generic crc16 media intel_rapl_perf sg 
psmouse i915 i2c_i801 snd_hda_intel pcspkr cfg80211 rtsx_pci snd_hda_codec 
rfkill snd_hda_core snd_hwdep drm_kms_helper fujitsu_laptop snd_pcm 
sparse_keymap drm video snd_timer ac button snd battery mei_me lpc_ich 
soundcore i2c_algo_bit mei mfd_core shpchp loop parport_pc ppdev sunrpc lp 
parport ip_tables x_tables autofs4 dm_crypt dm_mod raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c raid1 
raid0 multipath linear md_mod btrfs crc32c_generic xor zstd_decompress 
zstd_compress xxhash raid6_pq
Mar 10 00:48:10 heisenberg kernel:  uhci_hcd sd_mod usb_storage crc32c_intel 
ahci libahci aesni_intel libata ehci_pci aes_x86_64 evdev xhci_pci crypto_simd 
cryptd glue_helper xhci_hcd ehci_hcd serio_raw scsi_mod e1000e ptp usbcore 
pps_core usb_common
Mar 10 00:48:10 heisenberg kernel: CPU: 5 PID: 32197 Comm: btrfs Not tainted 
4.15.0-1-amd64 #1 Debian 4.15.4-1
Mar 10 00:48:10 heisenberg kernel: Hardware name: FUJITSU LIFEBOOK 
E782/FJNB253, BIOS Version 2.11 07/15/2014
Mar 10 00:48:10 heisenberg kernel: RIP: 0010:btrfs_ioctl_send+0x48f/0xfb0 
[btrfs]
Mar 10 00:48:10 heisenberg kernel: RSP: 0018:a4cc0a377c48 EFLAGS: 00010293
Mar 10 00:48:10 heisenberg kernel: RAX:  RBX: 958718b1140c 
RCX: 0001
Mar 10 00:48:10 heisenberg kernel: RDX: 0001 RSI: 0015 
RDI: 958718b1140c
Mar 10 00:48:10 heisenberg kernel: RBP: 9587617c1c00 R08: 4000 
R09: 0060
Mar 10 00:48:10 heisenberg kernel: R10: 0015 R11: 0246 
R12: 958718b11000
Mar 10 00:48:10 heisenberg kernel: R13: 9587b7cfdad0 R14: 95850d8d4000 
R15: 958718b11000
Mar 10 00:48:10 heisenberg kernel: FS:  7f5f0866a8c0() 
GS:95881e34() knlGS:
Mar 10 00:48:10 heisenberg kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Mar 10 00:48:10 heisenberg kernel: CR2: 7f5f073a4e38 CR3: 0001e6b56004 
CR4: 001606e0
Mar 10 00:48:10 heisenberg kernel: Call Trace:
Mar 10 00:48:10 heisenberg kernel:  ? kmem_cache_alloc_trace+0x14b/0x1a0
Mar 10 00:48:10 heisenberg kernel:  ? 
insert_reserved_file_extent.constprop.69+0x2c1/0x2f0 [btrfs]
Mar 10 00:48:10 heisenberg kernel:  ? btrfs_opendir+0x3e/0x70 [btrfs]
Mar 10 00:48:10 heisenberg kernel:  ? _cond_resched+0x15/0x40
Mar 10 00:48:10 heisenberg kernel:  ? __kmalloc_track_caller+0x190/0x220
Mar 10 00:48:10 heisenberg kernel:  ? __check_object_size+0xaf/0x1b0
Mar 10 00:48:10 heisenberg kernel:  _btrfs_ioctl_send+0x80/0x110 [btrfs]
Mar 10 00:48:10 heisenberg kernel:  ? task_change_group_fair+0xb3/0x100
Mar 10 00:48:10 heisenberg kernel:  ? cpu_cgroup_fork+0x66/0x90
Mar 10 00:48:10 heisenberg kernel:  btrfs_ioctl+0xfab/0x2450 [btrfs]
Mar 10 00:48:10 heisenberg kernel:  ? enqueue_entity+0x106/0x6b0
Mar 10 00:48:10 heisenberg kernel:  ? enqueue_task_fair+0x67/0x7d0
Mar 10 00:48:10 heisenberg kernel:  ? do_vfs_ioctl+0xa4/0x630
Mar 10 00:48:10 heisenberg kernel:  do_vfs_ioctl+0xa4/0x630
Mar 10 00:48:10 heisenberg kernel:  ? _do_fork+0x14d/0x3f0
Mar 10 00:48:10 heisenberg kernel:  SyS_ioctl+0x74/0x80
Mar 10 00:48:10 heisenberg kernel:  do_syscall_64+0x6f/0x130
Mar 10 00:48:10 heisenberg kernel:  entry_SYSCALL_64_after_hwframe+0x21/0x86
Mar 10 00:48:10 heisenberg kernel: RIP: 0033:0x7f5f07493f07
Mar 10 00:48:10 heisenberg kernel: RSP: 002b:7fff8a4619d8 EFLAGS: 0246 
ORIG_RAX: 0010
Mar 10 00:48:10 heisenberg kernel: RAX: ffda RBX: 55b941872270 
RCX: 7f5f07493f07
Mar 10 00:48:10 heisenberg kernel: 

Re: spurious full btrfs corruption

2018-03-08 Thread Christoph Anton Mitterer
Hey.


On Tue, 2018-03-06 at 09:50 +0800, Qu Wenruo wrote:
> > These were the two files:
> > -rw-r--r-- 1 calestyo calestyo   90112 Feb 22 16:46 'Lady In The
> > Water/05.mp3'
> > -rw-r--r-- 1 calestyo calestyo 4892407 Feb 27 23:28
> > '/home/calestyo/share/music/Lady In The Water/05.mp3'
> > 
> > 
> > -rw-r--r-- 1 calestyo calestyo 1904640 Feb 22 16:47 'The Hunt For
> > Red October [Intrada]/21.mp3'
> > -rw-r--r-- 1 calestyo calestyo 2968128 Feb 27 23:28
> > '/home/calestyo/share/music/The Hunt For Red October
> > [Intrada]/21.mp3'
> > 
> > with the former (smaller one) being the corrupted one (i.e. the one
> > returned by btrfs-restore).
> > 
> > Both are (in terms of filesize) multiples of 4096... what does that
> > mean now?
> 
> That means either we lost some file extents or inode items.
> 
> Btrfs-restore only found EXTENT_DATA, which contains the pointer to
> the
> real data, and inode number.
> But no INODE_ITEM is found, which records the real inode size, so it
> can
> only use EXTENT_DATA to rebuild as much data as possible.
> That why all recovered one is aligned to 4K.
> 
> So some metadata is also corrupted.

But can that also happen to just some files?
Anyway... it's still strange that it hit just those two (which hadn't
been touched in a long time).


> > However, all the qcow2 files from the restore are more or less
> > garbage.
> > During the btrfs-restore it already complained on them, that it
> > would
> > loop too often on them and whether I want to continue or not (I
> > choose
> > n and on another full run I choose y).
> > 
> > Some still contain a partition table, some partitions even
> > filesystems
> > (btrfs again)... but I cannot mount them.
> 
> I think the same problem happens on them too.
> 
> Some data is lost while some are good.
> Anyway, they would be garbage.

Again, still strange... that so many files (of those that I really
checked) were fully okay... while those 4 were all broken.

When it only uses EXTENT_DATA, would that mean that it basically breaks
at every border where the file is split up into multiple extents (which
is of course likely for the (CoWed) images that I had)?



> > 
> > > Would you please try to restore the fs on another system with
> > > good
> > > memory?
> > 
> > Which one? The originally broken fs from the SSD?
> 
> Yep.
> 
> > And what should I try to find out here?
> 
> During restore, if the csum error happens again on the newly created
> destination btrfs.
> (And I recommend use mount option nospace_cache,notreelog on the
> destination fs)

So an update on this (everything on the OLD notebook with likely good
memory):

I booted again from USBstick (with 4.15 kernel/progs),
luksOpened+losetup+luksOpened (yes two dm-crypt, the one from the
external restore HDD, then the image file of the SSD which again
contained dmcrypt+LUKS, of which one was the broken btrfs).
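(Roughly this chain, all device/file names hypothetical:

# cryptsetup luksOpen /dev/sdb1 rescue_hdd       # outer dm-crypt (rescue HDD)
# mount /dev/mapper/rescue_hdd /mnt/rescue
# losetup -f --show /mnt/rescue/ssd.img          # prints e.g. /dev/loop0
# cryptsetup luksOpen /dev/loop0 broken_ssd      # inner dm-crypt (SSD image)
# btrfs restore -v /dev/mapper/broken_ssd /mnt/target
)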

As I've mentioned before... btrfs-restore (and the other tools for
trying to find the bytenr) immediately fail here.
They give some "block mapping error" and produce no output.

This worked on my first rescue attempt (where I had 4.12 kernel/progs).

Since I had no 4.12 kernel/progs at hand anymore, I went to an even
older rescue stick, which has 4.7 kernel/progs (if I'm not wrong).
There it worked again (on the same image file).

So something changed after 4.14, which makes the tools no longer able
to restore at least what they could restore at 4.14.


=> Some bug recently introduced in btrfs-progs?




I then finished the dump (from the OLD notebook with likely good RAM)
with 4.7 kernel/progs... to the very same external HDD I've used
before.

And afterwards I:
diff -qr --no-dereference restoreFromNEWnotebook/ restoreFromOLDnotebook/

=> No differences were found, except one further file that was in the
new restoreFromOLDnotebook. It could be that this was a file which I
deleted from the old restore because of csum errors, but I don't really
remember (actually I seem to remember that there were a few which I
deleted).

Since all other files were equal (that is at least in terms of file
contents and symlink targets - I didn't compare the metadata like
permissions, dates and owners)... the qcow2 images are garbage as well.

=> No csum errors were recorded in the kernel log during the diff, and
since both the (remaining) restore results from the NEW notebook and
the ones just made on the OLD one were read for the diff... I'd guess
that no further corruption happened during the recent btrfs-restore.





On to the next work site:

> > > This -28 (ENOSPC) seems to show that the extent tree of the new
> > > btrfs
> > > is
> > > corrupted.
> > 
> > "new" here is dm-1, right? Which is the fresh btrfs I've created on
> > some 8TB HDD for my recovery works.
> > While that FS shows me:
> > [26017.690417] BTRFS info (device dm-2): disk space caching is
> > enabled
> > [26017.690421] BTRFS info (device dm-2): has skinny extents
> > [26017.798959] BTRFS info (device dm-2): bdev /dev/mapper/data-a4
> > errs:
> 

Re: spurious full btrfs corruption

2018-03-05 Thread Christoph Anton Mitterer
Hey Qu.

On Thu, 2018-03-01 at 09:25 +0800, Qu Wenruo wrote:
> > - For my personal data, I have one[0] Seagate 8 TB SMR HDD, which I
> >   backup (send/receive) on two further such HDDs (all these are
> >   btrfs), and (rsync) on one further with ext4.
> >   These files have all their SHA512 sums attached as XATTRs, which
> > I
> >   regularly test. So I think I can be pretty sure, that there was
> > never
> >   a case of silent data corruption and the RAM on the E782 is fine.
> 
> Good backup practice can't be even better.

Well, I still would want to add some tape- and/or optical-based
solution...
But having this depends a bit on having a good way to do incremental
backups, i.e. I wouldn't want to write full copies of everything to
tape/BluRay over and over again, but just the actually added data and
records of metadata changes.
The former (adding just the new files) is rather easy; what's missing
is recording any changes in metadata (moved/renamed/deleted files,
changes in file dates, permissions, XATTRs etc.).
Also I would always want to back up complete files, i.e. not just the
changes to a file, even if just one byte of a 4 GiB file changed... and
I wouldn't want files split across mediums.

send/receive sounds like a candidate for this (except it works only on
changes, not full files), but I would prefer to have everything in a
standard format like tar which one can rather easily recover manually
if there are failures in the backups.
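GNU tar's listed-incremental mode comes at least close to that: it
always stores changed files whole, and the snapshot file records
directory contents, so deletions and renames can be replayed. A sketch,
paths hypothetical:

tar --create --listed-incremental=/backup/data.snar \
    --file=/backup/data-level1.tar /srv/data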


Another missing piece is a tool which (on my explicit command) adds
hash sums to the files, and which can verify them.
Actually I wrote such a tool already, but as a shell script, and it
simply forks so often that it became extremely slow with millions of
small files.
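The core of it is really no more than this (a sketch; the XATTR name
user.sha512 is just my convention), and it shows the problem, namely at
least two forks per file:

find . -type f -print0 | while IFS= read -r -d '' f; do
    # set the checksum
    setfattr -n user.sha512 -v "$(sha512sum -- "$f" | cut -d' ' -f1)" -- "$f"
done

find . -type f -print0 | while IFS= read -r -d '' f; do
    # verify it
    [ "$(getfattr --only-values -n user.sha512 -- "$f")" = \
      "$(sha512sum -- "$f" | cut -d' ' -f1)" ] || echo "MISMATCH: $f"
done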
I often found it very useful to have that kind of checksumming in
addition to the kind e.g. btrfs does, which is not at the level of
whole files.
So if something goes wrong, like now, I can verify not only whether
single extents are valid, but also the whole chain of them that
comprises a file... and that for exactly the point in time at which I
declared "now, as it is, the file is valid"... rather than
automatically on every write, as filesystem-level checksumming would do
it.
In the current case,... for many files where I had such whole-file-
csums, verifying whether what btrfs-restore gave me was valid or not,
was very easy because of them.


> Normally I won't blame memory unless strange behavior happens, from
> unexpected freeze to strange kernel panic.
Me neither... I think bad RAM happens rather rarely these days but
my case may actually be one.


> Netconsole would help here, especially when U757 has an RJ45.
> As long as you have another system which is able to run nc, it should
> catch any kernel message, and help us to analyse if it's really a
> memory
> corruption.
Ah thanks... I wasn't even aware of that ^^
I'll have a look at it when I start inspecting the U757 again in the
next weeks.
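For the record, something like this should already do; addresses
hypothetical:

# modprobe netconsole netconsole=@/eth0,6666@192.168.0.2/

and on the receiving machine e.g.:

# nc -u -l 6666

(the exact nc syntax depends on the netcat flavour).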


> > - The notebooks SSD is a Samsung SSD 850 PRO 1TB, the same which I
> >   already used with the old notebook.
> >   A long SMART check after the corruption, brought no errors.
> 
> Also using that SSD with smaller capacity, it's less possible for the
> SSD.
Sorry, what do you mean? :)


> Normally I won't blame memory, but even newly created btrfs, without
> any
> powerloss, it still reports csum error, then it maybe the problem.
That was also my idea...
I may mix up things, but I think I even found a csum error later on the
rescue USB stick (which is also btrfs)... would need to double check
that, though.

> > - So far, in the data I checked (which as I've said, excludes a
> > lot,..
> >   especially the QEMU images)
> >   I found only few cases, where the data I got from btrfs restore
> > was
> >   really bad.
> >   Namely, two MP3 files. Which were equal to their backup
> > counterparts,
> >   but just up to some offset... and the rest of the files were just
> >   missing.
> 
> Offset? Is that offset aligned to 4K?
> Or some strange offset?

These were the two files:
-rw-r--r-- 1 calestyo calestyo   90112 Feb 22 16:46 'Lady In The Water/05.mp3'
-rw-r--r-- 1 calestyo calestyo 4892407 Feb 27 23:28 
'/home/calestyo/share/music/Lady In The Water/05.mp3'


-rw-r--r-- 1 calestyo calestyo 1904640 Feb 22 16:47 'The Hunt For Red October 
[Intrada]/21.mp3'
-rw-r--r-- 1 calestyo calestyo 2968128 Feb 27 23:28 
'/home/calestyo/share/music/The Hunt For Red October [Intrada]/21.mp3'

with the former (smaller one) being the corrupted one (i.e. the one
returned by btrfs-restore).

Both are (in terms of filesize) multiples of 4096... what does that
mean now?


> > - Especially recovering the VM images will take up some longer
> > time...
> >   (I think I cannot really trust what came out from the btrfs restore
> >   here, since these already brought csum errs before)

In the meantime I had a look at the remaining files that I got from the
btrfs-restore (I haven't run it again so far from the OLD notebook, so
these are only the results from the NEW notebook

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Do you have any other ideas on how to debug that filesystem? Or at least
how to back up as much as possible?

Thanks, Chris.


Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
 e8f1bc1493855e32b7a2a019decc3c353d94daf6 

That bug... When was it introduced, and how can I find out whether an fs
was affected/corrupted by it? Because I've recently mounted and written
to some extremely important (to me) fs.

Thanks, Chris.


Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
A scrub now gave:
# btrfs scrub start -Br /dev/disk/by-label/system
ERROR: scrubbing /dev/disk/by-label/system failed for device id 1: ret=-1, 
errno=5 (Input/output error)
scrub canceled for b6050e38-716a-40c3-a8df-fcf1dd7e655d
scrub started at Wed Feb 21 17:42:39 2018 and was aborted after 00:04:18
total bytes scrubbed: 116.59GiB with 1 errors
error details: csum=1
corrected errors: 0, uncorrectable errors: 0, unverified errors: 0



with that in the kernel log
Feb 21 17:43:09 heisenberg kernel: BTRFS warning (device dm-0): checksum error 
at logical 16401395712 on dev /dev/mapper/system, sector 32033976, root 1830, 
inode 42609350, offset 6754201600, length 4096, links 1 (path: 
var/lib/libvirt/images/subsurface.qcow2)
Feb 21 17:43:09 heisenberg kernel: BTRFS warning (device dm-0): checksum error 
at logical 16401395712 on dev /dev/mapper/system, sector 32033976, root 257, 
inode 42609350, offset 6754201600, length 4096, links 1 (path: 
var/lib/libvirt/images/subsurface.qcow2)
Feb 21 17:43:09 heisenberg kernel: BTRFS error (device dm-0): bdev 
/dev/mapper/system errs: wr 0, rd 0, flush 0, corrupt 1, gen 0

Feb 21 17:46:13 heisenberg kernel: usb 1-2: USB disconnect, device number 2
Feb 21 17:46:57 heisenberg kernel: btrfs_printk: 128 callbacks suppressed
Feb 21 17:46:57 heisenberg kernel: BTRFS critical (device dm-0): unable to find 
logical 4503658729209856 length 16384
Feb 21 17:46:57 heisenberg kernel: BTRFS critical (device dm-0): unable to find 
logical 4503658729209856 length 4096
Feb 21 17:46:57 heisenberg kernel: BTRFS critical (device dm-0): unable to find 
logical 4503658729209856 length 4096
Feb 21 17:46:57 heisenberg kernel: BTRFS critical (device dm-0): unable to find 
logical 4503658729209856 length 16384


any idea on what to do?


Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Spurious corruptions seem to continue


[   69.688652] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.688656] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.688658] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   69.702870] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.702872] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.702875] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   69.865433] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.865436] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.865439] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   69.908030] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.908035] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.908040] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   69.949323] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.949326] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.949329] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   69.971228] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.971231] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.971235] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   69.998081] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.998084] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   69.998087] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.049415] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.049420] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.049424] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.067896] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.067900] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.067903] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.095769] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.095772] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.095775] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.106943] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.106946] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.106948] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.127554] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.127557] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.127561] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.133413] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.133415] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.133418] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.142557] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.142560] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.142564] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.166941] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.166944] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.166948] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.186688] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.186691] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.186693] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 16384
[   70.204750] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.204753] BTRFS critical (device dm-0): unable to find logical
4503658729209856 length 4096
[   70.204755] BTRFS critical (device dm-0): unable to find logical
4503658729209856 

Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Interestingly, I got another one only within minutes after the scrub:
Feb 21 15:23:49 heisenberg kernel: BTRFS warning (device dm-0): csum failed 
root 257 ino 7703 off 56852480 csum 0x42d1b69c expected csum 0x3ce55621 mirror 1
Feb 21 15:23:52 heisenberg kernel: BTRFS warning (device dm-0): csum failed 
root 257 ino 7703 off 66146304 csum 0xc739c174 expected csum 0x62e6ce8b mirror 1
Feb 21 15:23:56 heisenberg kernel: BTRFS warning (device dm-0): csum failed 
root 257 ino 7703 off 87212032 csum 0x183aff6d expected csum 0x3dacaab0 mirror 1


The file (a tgz - which seems to unpack fine) was probably read, but
definitely not written to in ages...

SMART for the SSD looks ok...

Strange...

Cheers,
Chris.



Re: BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-21 Thread Christoph Anton Mitterer
Hi Nikolay.

Thanks.

On Wed, 2018-02-21 at 08:34 +0200, Nikolay Borisov wrote:
> This looks like the one fixed by
> e8f1bc1493855e32b7a2a019decc3c353d94daf6 . It's tagged for stable so
> you
> should get it eventually.

Another consequence of this was that I couldn't sync/umount or shut
down properly anymore.

And now after hard reset I found this in the kernel logs:
Feb 21 14:49:29 heisenberg kernel: BTRFS warning (device dm-0): csum failed 
root 257 ino 49103564 off 2076672 csum 0xe1f5b83a expected csum 0x0e0adf97 
mirror 1
Feb 21 14:49:29 heisenberg kernel: BTRFS warning (device dm-0): csum failed 
root 257 ino 49103505 off 4464640 csum 0x0b661193 expected csum 0xe9c939a3 
mirror 1
Feb 21 14:49:45 heisenberg kernel: BTRFS warning (device dm-0): csum failed 
root 257 ino 47533539 off 139264 csum 0x4d704dc7 expected csum 0x2303d9f7 
mirror 1


That may be totally unrelated to the above bug (I just may not have
noticed it earlier), but I checked that now:
# btrfs inspect-internal inode-resolve 49103564 /
//usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.9.2
# btrfs inspect-internal inode-resolve 49103505 /
//usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.9.2
# btrfs inspect-internal inode-resolve 47533539 /
//usr/lib/python2.7/dist-packages/libxml2.py

AFAIU inode-resolve should give me the files belonging to the above
broken inodes?

# dpkg -S //usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.9.2
libqt5widgets5:amd64: /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.9.2
# dpkg -S //usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.9.2
libqt5gui5:amd64: /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.9.2
# dpkg -S //usr/lib/python2.7/dist-packages/libxml2.py
python-libxml2: /usr/lib/python2.7/dist-packages/libxml2.py

Which belong to these Debian packages.

# debsums -asc libqt5widgets5 libqt5gui5 python-libxml2
#

Which are apparently correct (as far as Debian is concerned, which
keeps hash sums of "all" its packages' files).


Interestingly, another:
# btrfs scrub start -Br /dev/disk/by-label/system
scrub done for b6050e38-716a-40c3-a8df-fcf1dd7e655d
scrub started at Wed Feb 21 14:52:45 2018 and finished after 00:23:25
total bytes scrubbed: 629.61GiB with 0 errors

#

returned no further error...
What does that mean now? How could btrfs correct the error (did it - I
have no RAID or so)?
Anything further I should do to check the consistency of my filesystem?


Thanks,
Chris.


BUG: unable to handle kernel paging request at ffff9fb75f827100

2018-02-20 Thread Christoph Anton Mitterer
Hi.

Not sure if that's a bug in btrfs... maybe someone's interested in it.

Cheers,
Chris.

# uname -a
Linux heisenberg 4.14.0-3-amd64 #1 SMP Debian 4.14.17-1 (2018-02-14) x86_64 
GNU/Linux


Feb 21 04:55:51 heisenberg kernel: BUG: unable to handle kernel paging request 
at 9fb75f827100
Feb 21 04:55:51 heisenberg kernel: IP: btrfs_drop_inode+0x16/0x40 [btrfs]
Feb 21 04:55:51 heisenberg kernel: PGD 410806067 P4D 410806067 PUD 0 
Feb 21 04:55:51 heisenberg kernel: Oops:  [#1] SMP PTI
Feb 21 04:55:51 heisenberg kernel: Modules linked in: vhost_net vhost tap 
xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_nat_ipv4 nf_nat tun bridge stp llc ctr ccm fuse ebtable_filter ebtables 
devlink cpufreq_userspace cpufreq_powersave cpufreq_conservative ip6t_REJECT 
nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter 
ip6_tables xt_policy ipt_REJECT nf_reject_ipv4 xt_comment nf_conntrack_ipv4 
nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack iptable_filter 
binfmt_misc arc4 snd_hda_codec_hdmi btusb btrtl btbcm btintel bluetooth 
snd_hda_codec_realtek snd_hda_codec_generic drbg uvcvideo videobuf2_vmalloc 
videobuf2_memops snd_soc_skl snd_usb_audio ansi_cprng snd_soc_skl_ipc cdc_mbim 
cdc_wdm snd_usbmidi_lib snd_soc_sst_ipc cdc_ncm snd_soc_sst_dsp snd_rawmidi 
ecdh_generic
Feb 21 04:55:51 heisenberg kernel:  videobuf2_v4l2 i2c_designware_platform 
iwlmvm snd_hda_ext_core videobuf2_core usbnet i2c_designware_core 
snd_seq_device snd_soc_sst_match mii snd_soc_core videodev snd_compress media 
mac80211 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel 
kvm irqbypass intel_cstate snd_hda_intel intel_uncore iwlwifi snd_hda_codec 
intel_rapl_perf snd_hda_core snd_hwdep snd_pcm pcspkr joydev evdev cfg80211 
serio_raw snd_timer rfkill snd iTCO_wdt sg soundcore iTCO_vendor_support i915 
wmi shpchp drm_kms_helper mei_me battery fujitsu_laptop mei tpm_crb idma64 
button drm sparse_keymap video i2c_algo_bit ac acpi_pad intel_lpss_pci 
intel_lpss mfd_core loop parport_pc ppdev sunrpc lp parport ip_tables x_tables 
autofs4 algif_skcipher af_alg ext4 crc16 mbcache jbd2 fscrypto ecb hid_generic
Feb 21 04:55:51 heisenberg kernel:  usbhid hid dm_crypt dm_mod raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c raid1 
raid0 multipath linear md_mod btrfs crc32c_generic xor zstd_decompress 
zstd_compress xxhash uas raid6_pq uhci_hcd ehci_pci ehci_hcd usb_storage sd_mod 
crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel 
aes_x86_64 crypto_simd glue_helper cryptd e1000e xhci_pci ahci ptp libahci 
xhci_hcd psmouse pps_core libata sdhci_pci sdhci i2c_i801 scsi_mod mmc_core 
usbcore usb_common
Feb 21 04:55:51 heisenberg kernel: CPU: 3 PID: 50 Comm: kswapd0 Not tainted 
4.14.0-3-amd64 #1 Debian 4.14.17-1
Feb 21 04:55:51 heisenberg kernel: Hardware name: FUJITSU LIFEBOOK 
U757/FJNB2A5, BIOS Version 1.16 07/05/2017
Feb 21 04:55:51 heisenberg kernel: task: 9fbff9523040 task.stack: 
ad5e8378
Feb 21 04:55:51 heisenberg kernel: RIP: 0010:btrfs_drop_inode+0x16/0x40 [btrfs]
Feb 21 04:55:51 heisenberg kernel: RSP: 0018:ad5e83783c28 EFLAGS: 00010286
Feb 21 04:55:51 heisenberg kernel: RAX: 0001 RBX: 9fbe05d69330 
RCX: 
Feb 21 04:55:51 heisenberg kernel: RDX: 9fb75f827000 RSI: c075f2b0 
RDI: 9fbe05d69330
Feb 21 04:55:51 heisenberg kernel: RBP: 9fbff2040800 R08: 9fbff1ddea08 
R09: ad5e83783d88
Feb 21 04:55:51 heisenberg kernel: R10: 001bf318 R11:  
R12: 9fbe05d693b8
Feb 21 04:55:51 heisenberg kernel: R13: 9fbe05d69488 R14: 9fbfc26cecc0 
R15: 
Feb 21 04:55:51 heisenberg kernel: FS:  () 
GS:9fc01dd8() knlGS:
Feb 21 04:55:51 heisenberg kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Feb 21 04:55:51 heisenberg kernel: CR2: 9fb75f827100 CR3: 00041020a004 
CR4: 003606e0
Feb 21 04:55:51 heisenberg kernel: DR0:  DR1:  
DR2: 
Feb 21 04:55:51 heisenberg kernel: DR3:  DR6: fffe0ff0 
DR7: 0400
Feb 21 04:55:51 heisenberg kernel: Call Trace:
Feb 21 04:55:51 heisenberg kernel:  iput+0xf7/0x210
Feb 21 04:55:51 heisenberg kernel:  __dentry_kill+0xce/0x160
Feb 21 04:55:51 heisenberg kernel:  shrink_dentry_list+0xe0/0x2d0
Feb 21 04:55:51 heisenberg kernel:  prune_dcache_sb+0x52/0x70
Feb 21 04:55:51 heisenberg kernel:  super_cache_scan+0xf7/0x1a0
Feb 21 04:55:51 heisenberg kernel:  shrink_slab.part.49+0x1e8/0x3e0
Feb 21 04:55:51 heisenberg kernel:  shrink_node+0x123/0x300
Feb 21 04:55:51 heisenberg kernel:  kswapd+0x299/0x6f0
Feb 21 04:55:51 heisenberg kernel:  ? mem_cgroup_shrink_node+0x190/0x190
Feb 21 04:55:51 heisenberg kernel:  kthread+0x11a/0x130
Feb 21 04:55:51 heisenberg kernel:  ? kthread_create_on_node+0x70/0x70
Feb 21 04:55:51 heisenberg 

Re: block group 11778977169408 has wrong amount of free space

2017-09-03 Thread Christoph Anton Mitterer
Did another mount with clear_cache,rw (because it was ro before)... now
I get even more errors:
# btrfs check  /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 9857516175360 has wrong amount of free space
failed to load free space cache for block group 9857516175360
block group 11778977169408 has wrong amount of free space
failed to load free space cache for block group 11778977169408
checking fs roots
checking csums
checking root refs
found 4404625330176 bytes used, no error found
total csum bytes: 4293007908
total tree bytes: 7511883776
total fs tree bytes: 1856258048
total extent tree bytes: 1097842688
btree space waste bytes: 887738230
file data blocks allocated: 4397113446400
 referenced 4515055595520
0

what the???



Re: block group 11778977169408 has wrong amount of free space

2017-09-03 Thread Christoph Anton Mitterer
Just checked: mounting with clear_cache and then re-fscking doesn't
even fix the problem...

Output stays the same.
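(For what it's worth, newer btrfs-progs can also drop the v1 cache
offline; untested here whether it behaves any differently:

# btrfs check --clear-space-cache v1 /dev/mapper/data-a2
)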

Cheers,
Chris.



block group 11778977169408 has wrong amount of free space

2017-09-03 Thread Christoph Anton Mitterer
Hey.

Just got the following:
$ uname -a
Linux heisenberg 4.12.0-1-amd64 #1 SMP Debian 4.12.6-1 (2017-08-12)
x86_64 GNU/Linux

$ btrfs version
btrfs-progs v4.12

on a filesystem:

# btrfs check  /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 11778977169408 has wrong amount of free space
failed to load free space cache for block group 11778977169408
checking fs roots
checking csums
checking root refs
found 4404625739776 bytes used, no error found
total csum bytes: 4293007908
total tree bytes: 7511900160
total fs tree bytes: 1856258048
total extent tree bytes: 1097859072
btree space waste bytes: 887753954
file data blocks allocated: 4397113839616
 referenced 4515055988736
0

Any idea what could cause these free space issues and how to clean them
up? I thought that should work with recent kernels... and could it mean
that some data will be corrupted when I e.g. mount with clear_cache?

Interestingly, $? is still 0... even though errors were found.
And kernel log shows nothing.


Cheers,
Chris.



call trace on send/receive

2017-08-31 Thread Christoph Anton Mitterer
Hey.

Just got the following call trace with:
$ uname -a
Linux heisenberg 4.12.0-1-amd64 #1 SMP Debian 4.12.6-1 (2017-08-12) x86_64 
GNU/Linux
$ btrfs version
btrfs-progs v4.12


Sep 01 06:10:12 heisenberg kernel: [ cut here ]
Sep 01 06:10:12 heisenberg kernel: WARNING: CPU: 3 PID: 7411 at 
/build/linux-fHlJSJ/linux-4.12.6/fs/btrfs/send.c:6310 
btrfs_ioctl_send+0x6c7/0x1100 [btrfs]
Sep 01 06:10:12 heisenberg kernel: Modules linked in: udp_diag tcp_diag 
inet_diag ext4 jbd2 fscrypto ecb mbcache algif_skcipher af_alg uas vhost_net 
vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat nf_nat_ipv4 nf_nat tun bridge stp llc ctr ccm fuse ebtable_filter 
ebtables cpufreq_userspace cpufreq_powersave cpufreq_conservative joydev 
rtsx_pci_sdmmc ip6t_REJECT nf_reject_ipv6 xt_tcpudp mmc_core rtsx_pci_ms arc4 
memstick nf_conntrack_ipv6 nf_defrag_ipv6 intel_rapl ip6table_filter 
x86_pkg_temp_thermal intel_powerclamp ip6_tables coretemp iTCO_wdt 
iTCO_vendor_support iwldvm kvm_intel mac80211 kvm xt_policy irqbypass 
crct10dif_pclmul ipt_REJECT nf_reject_ipv4 crc32_pclmul snd_hda_codec_hdmi 
xt_comment ghash_clmulni_intel btusb btrtl nf_conntrack_ipv4 btbcm 
nf_defrag_ipv4 intel_cstate btintel
Sep 01 06:10:12 heisenberg kernel:  iwlwifi uvcvideo videobuf2_vmalloc 
bluetooth videobuf2_memops videobuf2_v4l2 videobuf2_core xt_multiport 
intel_uncore videodev snd_hda_codec_realtek xt_conntrack cfg80211 i915 
ecdh_generic media crc16 intel_rapl_perf pcspkr psmouse i2c_i801 
snd_hda_codec_generic sg nf_conntrack rtsx_pci rfkill iptable_filter 
snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep drm_kms_helper 
fujitsu_laptop snd_pcm mei_me snd_timer sparse_keymap mei drm button video snd 
battery ac soundcore i2c_algo_bit lpc_ich loop mfd_core shpchp binfmt_misc 
parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 dm_crypt dm_mod 
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx 
libcrc32c raid1 raid0 multipath linear md_mod btrfs crc32c_generic xor raid6_pq 
uhci_hcd sd_mod usb_storage crc32c_intel
Sep 01 06:10:12 heisenberg kernel:  aesni_intel aes_x86_64 crypto_simd cryptd 
glue_helper ahci libahci xhci_pci evdev libata ehci_pci xhci_hcd ehci_hcd 
serio_raw scsi_mod e1000e ptp usbcore pps_core usb_common
Sep 01 06:10:12 heisenberg kernel: CPU: 3 PID: 7411 Comm: btrfs Not tainted 
4.12.0-1-amd64 #1 Debian 4.12.6-1
Sep 01 06:10:12 heisenberg kernel: Hardware name: FUJITSU LIFEBOOK 
E782/FJNB253, BIOS Version 2.11 07/15/2014
Sep 01 06:10:12 heisenberg kernel: task: 8e278c6b9040 task.stack: 
a24888cb4000
Sep 01 06:10:12 heisenberg kernel: RIP: 0010:btrfs_ioctl_send+0x6c7/0x1100 
[btrfs]
Sep 01 06:10:12 heisenberg kernel: RSP: 0018:a24888cb7cb8 EFLAGS: 00010293
Sep 01 06:10:12 heisenberg kernel: RAX:  RBX: 8e26c914d40c 
RCX: 0015
Sep 01 06:10:12 heisenberg kernel: RDX: 0001 RSI: 0020 
RDI: 8e26c914d40c
Sep 01 06:10:12 heisenberg kernel: RBP: 7ffd0be90c60 R08: 8b43c5c0 
R09: 0020
Sep 01 06:10:12 heisenberg kernel: R10: a24888cb7ea0 R11: 8e278c6b9040 
R12: 7ffd0be90c60
Sep 01 06:10:12 heisenberg kernel: R13: 8e255392c000 R14: 8e26c914d000 
R15: 8e246b9a3600
Sep 01 06:10:12 heisenberg kernel: FS:  7fc1fd0b28c0() 
GS:8e285e2c() knlGS:
Sep 01 06:10:12 heisenberg kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Sep 01 06:10:12 heisenberg kernel: CR2: 7fc1fc084e38 CR3: 00013f0ee000 
CR4: 001406e0
Sep 01 06:10:12 heisenberg kernel: Call Trace:
Sep 01 06:10:12 heisenberg kernel:  ? memcg_kmem_get_cache+0x50/0x160
Sep 01 06:10:12 heisenberg kernel:  ? cpumask_next_and+0x26/0x40
Sep 01 06:10:12 heisenberg kernel:  ? select_task_rq_fair+0x9bf/0xa40
Sep 01 06:10:12 heisenberg kernel:  ? btrfs_ioctl+0x80b/0x2450 [btrfs]
Sep 01 06:10:12 heisenberg kernel:  ? account_entity_enqueue+0xc5/0xf0
Sep 01 06:10:12 heisenberg kernel:  ? enqueue_entity+0x110/0x6e0
Sep 01 06:10:12 heisenberg kernel:  ? enqueue_task_fair+0x7e/0x6b0
Sep 01 06:10:12 heisenberg kernel:  ? do_vfs_ioctl+0x9f/0x600
Sep 01 06:10:12 heisenberg kernel:  ? do_vfs_ioctl+0x9f/0x600
Sep 01 06:10:12 heisenberg kernel:  ? _do_fork+0x148/0x3e0
Sep 01 06:10:12 heisenberg kernel:  ? SyS_ioctl+0x74/0x80
Sep 01 06:10:12 heisenberg kernel:  ? system_call_fast_compare_end+0xc/0x97
Sep 01 06:10:12 heisenberg kernel: Code: 4c 89 e7 89 ee 49 89 dc e8 47 01 57 ca 
48 c7 04 24 ff ff ff ff c7 44 24 28 00 00 00 00 48 c7 44 24 10 00 00 00 00 e9 
2e fb ff ff <0f> ff e9 bd f9 ff ff 48 63 44 24 08 48 89 04 24 e9 5a fd ff ff 
Sep 01 06:10:12 heisenberg kernel: ---[ end trace 9f9174d6f4d21959 ]---


send/receive processes seem to continue running... so either there is
actually something broken (and then the userland tools should also
notice this) or this is harmless and it shouldn't go to the kernel log,
I guess...

Cheers,
Chris.


Re: deleted subvols don't go away?

2017-08-28 Thread Christoph Anton Mitterer
Thanks...

Still a bit strange that it displays that entry... especially with a
generation that seems newer than what I thought was the actual last
generation on the fs.

Cheers,
Chris.



deleted subvols don't go away?

2017-08-27 Thread Christoph Anton Mitterer
Hey.

Just wondered...
On a number of filesystems I've removed several subvolumes (with -c)...
even called btrfs filesystem sync afterwards... and waited quite a
while (with the fs mounted rw) until no disk activity seemed to happen
anymore.
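(Roughly this sequence, subvolume name hypothetical; newer progs also
offer "btrfs subvolume sync", which blocks until queued deletions are
actually cleaned up:

# btrfs subvolume delete -c /thefs/some-subvol   # -c: wait for the commit
# btrfs filesystem sync /thefs
# btrfs subvolume sync /thefs
)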

Yet all these fs still show some deleted subvols, e.g.:

btrfs subvolume list -pagud /thefs
ID 5 gen 10502 parent 0 top level 0 uuid - path /DELETED


Any ideas?



btw: it seems the -d option is missing from the manpages, at least up
to progs version 4.12


Cheers,
Chris.



Re: BTRFS warning (device dm-0): unhandled fiemap cache detected

2017-08-20 Thread Christoph Anton Mitterer
On Mon, 2017-08-21 at 10:43 +0800, Qu Wenruo wrote:
> Harmless, it is only designed to merge fiemap output.
Thanks for the info :)


On Mon, 2017-08-21 at 10:57 +0800, Qu Wenruo wrote:
> Quite strange, according to upstream git log, that commit is merged 
> between v4.12-rc7 and v4.12.
> Maybe I misunderstand the stable kernel release cycle.

Seems it was only added in 4.12.7? Maybe a typo?

Cheers,
Chris.



BTRFS warning (device dm-0): unhandled fiemap cache detected

2017-08-20 Thread Christoph Anton Mitterer
Hey.

Just got the following with 4.12.6:

Aug 21 03:29:51 heisenberg kernel: BTRFS warning (device dm-0): unhandled 
fiemap cache detected: offset=0 phys=812641906688 len=12288 flags=0x0
Aug 21 03:29:56 heisenberg kernel: BTRFS warning (device dm-0): unhandled 
fiemap cache detected: offset=0 phys=812641906688 len=12288 flags=0x0
Aug 21 03:30:58 heisenberg kernel: BTRFS warning (device dm-0): unhandled 
fiemap cache detected: offset=0 phys=813760614400 len=32768 flags=0x0
Aug 21 03:31:15 heisenberg kernel: BTRFS warning (device dm-0): unhandled 
fiemap cache detected: offset=0 phys=812641906688 len=12288 flags=0x0

Is it what should be fixed with:
https://patchwork.kernel.org/patch/9803291/

?

Is this harmless or must I assume that some part of my data/fs is now
corrupt and should I recover from backup?

Thanks,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Christoph Anton Mitterer
On Wed, 2017-08-16 at 09:53 -0400, Austin S. Hemmelgarn wrote:
> Go try BTRFS on top of dm-integrity, or on a 
> system with T10-DIF or T13-EPP support

When dm-integrity is used... would that be enough for btrfs to do a
proper repair in the RAID+nodatacow case? I assume it can't do repairs
there right now, because how should it know which copy is valid?
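Setup-wise, that stacking would look roughly like this (assuming the
integritysetup tool from recent cryptsetup, devices hypothetical):

# integritysetup format /dev/sdX1 && integritysetup open /dev/sdX1 integ0
# integritysetup format /dev/sdY1 && integritysetup open /dev/sdY1 integ1
# mkfs.btrfs -m raid1 -d raid1 /dev/mapper/integ0 /dev/mapper/integ1

The integrity layer should then turn a corrupted copy into a plain read
error, so btrfs' RAID repair would know which mirror is the valid one,
even with nodatacow.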


>  (which you should have access to 
> given the amount of funding CERN gets)
Hehe, CERN may get that funding (I don't know),... but the universities
rather don't ;-)


> Except it isn't clear with nodatacow, because it might be a false
> positive.

Sure, I never claimed the opposite... just that I'd expect this to be
less likely than the other way round, and less of a problem in
practice.



> SUSE is pathological case of brain-dead defaults.  Snapper needs to 
> either die or have some serious sense beat into it.  When you turn
> off 
> the automatic snapshot generation for everything but updates and set
> the 
> retention policy to not keep almost everything, it's actually not bad
> at 
> all.

Well, still, with CoW (unless you have some form of deduplication,
which in e.g. their use case would have to be on the layers below
btrfs), your storage usage will probably grow more significantly than
without.

And as you've mentioned yourself in the other mail, there's still the
issue with fragmentation.


> Snapshots work fine with nodatacow, each block gets CoW'ed once when 
> it's first written to, and then goes back to being NOCOW.  The only 
> caveat is that you probably want to defrag either once everything
> has 
> been rewritten, or right after the snapshot.

I thought defrag would unshare the reflinks?

Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Christoph Anton Mitterer
Just out of curiosity:


On Wed, 2017-08-16 at 09:12 -0400, Chris Mason wrote:
> Btrfs couples the crcs with COW because

this (which sounds like you want it to stay coupled that way)...

plus


> It's possible to protect against all three without COW, but all 
> solutions have their own tradeoffs and this is the setup we
> chose.  It's 
> easy to trust and easy to debug and at scale that really helps.

... this (which sounds more like you think the checksumming is so
helpful that it would be nice in the nodatacow case as well).

What does that mean now? Things will stay as they are... or it may
become a goal to get checksumming for nodatacow (while of course still
retaining the possibility to disable both, datacow AND checksumming)?


> In general, production storage environments prefer clearly defined 
> errors when the storage has the wrong data.  EIOs happen often, and
> you 
> want to be able to quickly pitch the bad data and replicate in good 
> data.

Which would also rather point towards getting clear EIOs (and thus
checksumming) in the nodatacow case.



> My real goal is to make COW fast enough that we can leave it on for
> the 
> database applications too.  Obviously I haven't quite finished that
> one 
> yet ;)

Well the question is, even if you manage that sooner or later, will
everyone be fully satisfied by this?!
I've mentioned earlier on the list that I manage one of the many big
data/computing centres for LHC.
Our use case is typically big plain storage servers connected via some
higher level storage management system (http://dcache.org/)... with
mostly write once/read many.

So apart from some central DBs for the storage management system
itself, CoW is mostly no issue for us.
But I've talked to some friend at the local super computing centre and
they have rather general issues with CoW at their virtualisation
cluster.
Like SUSE's snapper making many snapshots, which apparently leads the
storage images of VMs to explode (in terms of space usage).
For some of their storage backends there simply seems to be no
deduplication available (or there are other reasons preventing its
usage).

From that I'd guess there would still be people who want the nice
features of btrfs (snapshots, checksumming, etc.) while still being
able to use nodatacow in specific cases.


> But I'd rather keep the building block of all the other btrfs 
> features in place than try to do crcs differently.

Mhh I see, what a pity.


Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-15 Thread Christoph Anton Mitterer
On Tue, 2017-08-15 at 07:37 -0400, Austin S. Hemmelgarn wrote:
> Go look at Chrome, or Firefox, or Opera, or any other major web
> browser. 
>   At minimum, they will safely bail out if they detect corruption in
> the 
> user profile and can trivially resync the profile from another system
> if 
> the user has profile sync set up.

Aha,... I'd rather see a concrete reference to some white paper or
code, where one can really see that these programs actually *do* their
own checksumming.
But even from what you claim here now (that they'd only detect the
corruption and then resync the profile from another system, which is
nothing other than recovering from a backup), I wouldn't see the big
problem with EIO.


> Go take a look at any enterprise 
> database application from a reasonable company, it will almost
> always 
> support replication across systems and validate data it reads.

Okay, I already showed you that PostgreSQL, MySQL, BDB and sqlite
can't do it, or don't do it by default... so which one do you mean by
the enterprise DB (Oracle?), and where's the reference that shows that
they really do general checksumming? And that EIO would be a problem
for their recovery strategies?

And again, we're not talking about the WALs (or whatever these programs
call it) which are there to handle a crash... we are talking about
silent data corruption.



> Agreed, but there's also the counter argument for log files that
> most 
> people who are not running servers rarely (if ever) look at old
> logs, 
> and it's the old logs that are the most likely to have at rest 
> corruption (the longer something sits idle on media, the more likely
> it 
> will suffer from a media error).

I wouldn't have any valid proof that it's really the "idle" data which
is the most likely to suffer silent corruptions (at least not for all
types of storage medium), but even if this is the case as you say...
then it's probably more likely to hit the /usr/, /lib/ and so on stuff
on stable distros... logs are typically rotated and then at least once
re-written (when compressed).


> Go install OpenSUSE in a VM.  Look at what filesystem it uses.  Go 
> install Solaris in a VM, lo and behold it uses ZFS _with no option
> for 
> anything else_ as it's root filesystem.  Go install a recent version
> of 
> Windows server in a VM, notice that it also has the option of a
> properly 
> checked filesystem (ReFS).  Go install FreeBSD in a VM, notice that
> it 
> provides the option (which is actively recommended by many people
> who 
> use FreeBSD) to install with root on ZFS.  Install Android or Chrome
> OS 
> (or AOSP or Chromium OS) in a VM.  Root the system and take a look
> at 
> the storage stack, both of them use dm-verity, and Android (and
> possibly 
> Chrome OS too, not 100% certain) uses per-file AEAD through the VFS 
> encryption API on encrypted devices.

So your argument for not adding support for this is basically:
People don't or shouldn't use btrfs for this? o.O



>   The fact that some OS'es blindly 
> trust the underlying storage hardware is not our issue, it's their 
> issue, and it shouldn't be 'fixed' by BTRFS because it doesn't just 
> affect their customers who run the OS in a VM on BTRFS.

Then you can probably drop checksumming from btrfs altogether. And,
with the same "argument", any other advanced feature.
For resilience there is hardware RAID or Linux' MD RAID... so no need
to keep it in btrfs o.O


> Most enterprise database apps offer support for
> replication, 
> and quite a few do their own data validation when reading from the 
> database.
First of all,... replication != the capability to detect silent data
corruption.

You still haven't named a single one which does checksumming by
default. At least those which are quite popular in the FLOSS world all
don't seem to.



Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 11:53 -0400, Austin S. Hemmelgarn wrote:
> Quite a few applications actually _do_ have some degree of secondary 
> verification or protection from a crash.  Go look at almost any
> database 
> software.
Then please give proper references for this!

This is from 2015, where you already claimed this, and I looked up all
the bigger DBs: they either couldn't do it at all, didn't do it by
default, or it required application support (i.e. from the programs
using the DB):
https://www.spinics.net/lists/linux-btrfs/msg50258.html


> It usually will not have checksumming, but it will almost 
> always have support for a journal, which is enough to cover the 
> particular data loss scenario we're talking about (unexpected
> unclean 
> shutdown).

I don't think that's what we're talking about:
We're talking about people wanting checksumming to notice e.g. silent
data corruption.

The crash case is only the corner case of what happens if the data is
written correctly but the csums are not.


> In my own experience, the things that use nodatacow fall into one of
> 4 
> categories:
> 1. Cases where the data is non-critical, and data loss will be 
> inconvenient but not fatal.  Systemd journal files are a good example
> of 
> this, as are web browser profiles when you're using profile sync.

I'd guess many people would want to have their log files valid and
complete. Same for their profiles (especially since people concerned
about their integrity might not want to have these synced to
Mozilla/Google etc.)


> 2. Cases where the upper level can reasonably be expected to have
> some 
> degree of handling, even if it's not correction.  VM disk images and 
> most database applications fall into this category.

No. Wrong. Or prove to me that I'm wrong ;-)
And these two (VMs, DBs) are actually *the* main use cases for
nodatacow.


> And I and most other sysadmins I know would prefer the opposite with
> the 
> addition of a secondary notification method.  You can still hook the 
> notification to stop the application, but you don't have to if you
> don't 
> want to (and in cases 1 and 2 I listed above, you probably don't want
> to).

Then I guess btrfs is generally not the right thing for such people, as
in the CoW case it will also give them EIO on any corruptions and their
programs will fail.



Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 10:23 -0400, Austin S. Hemmelgarn wrote:
> Assume you have higher level verification.  Would you rather not be
> able 
> to read the data regardless of if it's correct or not, or be able to 
> read it and determine yourself if it's correct or not?

What would then be the difference here to the
CoW+checksumming+some-data-corruption case?!
btrfs would also give EIO, and all these applications you mention would
fail then.

As I've said previously, one could provide end users with the means to
still access the faulty data. Or they could simply mount with
nochecksum.




> For almost 
> anybody, the answer is going to be the second case, because the 
> application knows better than the OS if the data is correct (and 
> 'correct' may be a threshold, not some binary determination).
You've made that claim already once with VMs and DBs, and your claim
proved simply wrong.

Most applications don't do this kind of verification.

And those that do probably rather just check whether the data is valid
and, if not, give an error, or at best fall back to some automatic
backups (e.g. what package managers do).

I know of only a few programs that would really be capable of using data
they know is bogus and recovering from that automagically... the only
examples I'd know are some archive formats which include error-correcting
codes. And I really mean using the blocks for recovery for which the csum
wouldn't verify (i.e. the ones that give an EIO)... without ECCs, how
would a program know what to do with such data?


I cannot imagine that many people would choose the second option, to be
honest.
Working with bogus data?! What would be the benefit of this?



>   At that 
> point, you need to make the checksum error a warning instead of 
> returning -EIO.  How do you intend to communicate that warning back
> to 
> the application?  The kernel log won't work, because on any
> reasonably 
> secure system it's not visible to anyone but root.

Still same problem with CoW + any data corruption...

>   There's also no side 
> channel for the read() system calls that you can utilize.  That then 
> means that the checksums end up just being a means for the
> administrator 
> to know some data wasn't written correctly, but they should know
> that 
> anyway because the system crashed.

No, they'd have no idea whether any - or which - data was written during
the crash.



> Looking at this from a different angle: Without background, what
> would 
> you assume the behavior to be for this?  For most people, the
> assumption 
> would be that this provides the same degree of data safety that the 
> checksums do when the data is CoW.

I don't think the average user would have any such assumption. Most
people likely don't even know that there is implicitly no checksumming
if nodatacow is enabled.


What people may however have heard is that btrfs does do checksumming,
and they'd assume that their filesystem always gives them just valid
data (or an error)... and IMO that's actually what every modern fs
should do per default.
Relying on higher levels to provide such means is simply not realistic.



Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 15:46 +0800, Qu Wenruo wrote:
> The problem here is, if you enable csum and even data is updated 
> correctly, only metadata is trashed, then you can't even read out
> the 
> correct data.

So what?
This problem occurs anyway *only* in case of a crash... and *only* if
nodatacow+checksumming were used.
A case in which currently the user can only hope that his data
is fine (unless higher levels provide some checksumming means[0]), or
anyway needs to recover from a backup.

Intuitively I'd also say it's much less likely that the data (which is
more in terms of space) is written correctly while the checksum is not.
Or is it?



[0] And when I investigated, back when the discussion rose up the first
time and some list member claimed that most typical cases (DBs, VM
images) would anyway do their own checksumming... I came to the
conclusion that most did not even support it, and even if they did,
it's not enabled per default and not really *full* checksumming in
most cases.



> As btrfs csum checker will just prevent you from reading out any
> data 
> which doesn't match with csum.
As I've said before, a tool could be provided that re-computes the
checksums then (making the data accessible again)... or one could
simply mount the fs with nochecksum or some other special option which
allows bypassing any checks.

> Now it's not just data corruption, but data loss then.
I think the former is worse than the latter. The latter gives you a
chance of noticing it, and either recovering from a backup, regenerating
the data (if possible), or manually marking the data as being "good"
(though corrupted) again.


Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-14 Thread Christoph Anton Mitterer
On Mon, 2017-08-14 at 14:36 +0800, Qu Wenruo wrote:
> > And how are you going to write your data and checksum atomically
> > when
> > doing in-place updates?
> 
> Exactly, that's the main reason I can figure out why btrfs disables 
> checksum for nodatacow.

Still, I don't get the problem here...

Yes, it cannot be done atomically (without workarounds like a journal or
so), but this should only be an issue in case of a crash or similar.

And in this case nodatacow+nochecksum is anyway already bad: it's also
not atomic, so data may be completely garbage (e.g. half written)...
just that no one will ever notice.

The only problem that nodatacow + checksumming + non-atomic should give
is when the data was actually correctly written at a crash but the
checksum was not, in which case the bogus checksum would invalidate the
good data on the next read.

Or am I missing something?


To me that sounds still much better than having no protection at all.


Cheers,
Chris.



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-12 Thread Christoph Anton Mitterer
On Sat, 2017-08-12 at 00:42 -0700, Christoph Hellwig wrote:
> And how are you going to write your data and checksum atomically when
> doing in-place updates?

Maybe I misunderstand something, but what's the big deal with not doing
it atomically (I assume you mean in terms of actually writing to the
physical medium)? Isn't that anyway already a problem in case of a
crash?

And isn't that the case also with all forms of e.g. software RAID (when
not having a journal)?

And as I've said, what's the worst thing that can happen? In a crash
the data may not have been completely written - with or without
checksumming. So what's the difference in trying the checksumming (and
doing it successfully in all non-crash cases)?
My understanding was (but that may be wrong of course, I'm not a
filesystem expert at all) that the worst that can happen is that data
and csum aren't *both* fully written (in all possible combinations), so
we'd have four cases in total:

data=good csum=good => fine
data=bad  csum=bad  => doesn't matter whether csummed or not, and whether atomic or not
data=bad  csum=good => the csum will tell us that the data is bad
data=good csum=bad  => the only real problem; the data would actually be good, but the csum is not





Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-11 Thread Christoph Anton Mitterer
Qu Wenruo wrote:
>Although Btrfs can disable data CoW, nodatacow also disables data 
>checksum, which is another main feature for btrfs.

Then the two should probably be decoupled, and support for
nodatacow+checksumming be implemented?!

I'm not an expert, but I don't see why this shouldn't be possible
(especially since metadata is AFAIC anyway *always* CoWed +
checksummed).
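(For context, nodatacow today is set per mount or per file - a rough
sketch of both forms; /dev/sdX and vm.img are placeholder names, and
chattr +C only takes effect on a file that is still empty:

# mount -o nodatacow /dev/sdX /mnt      <- whole filesystem
# touch /mnt/vm.img
# chattr +C /mnt/vm.img                 <- single file, before writing data

In either form the data checksums are dropped together with CoW, which
is exactly the coupling questioned here.)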


Nearly a year ago I had some off-list mails exchanged with CM and AFAIU
he said it would technically be possible...


What's the worst thing that can happen?! IMO, that noCoWed data would
have been correctly written on a crash, but not the checksum, whereby
the (bad) checksum would invalidate the actually good data.
How likely is that compared to the other way round? I'd guess not very.
And even then, it's IMO still better to have false positives (which
the higher application layers should take care of anyway) than to not
notice silent data corruption at all.


Of course checksumming would possibly impact performance, but anyone
could anyway still use nodatacow+nochecksum (or any other fs) if they
focus more on performance than on data integrity.
But all those who focus on integrity would get that, even in the
nodatacow case.


IIRC, CM brought up as an argument that some people would rather get the
bad data than nothing at all (respectively EIO)... but for those, btrfs
is probably a bad choice anyway (at least in the normal non-nodatacow
case)... also, any application should properly deal with EIO... and
last but not least, one could still provide a special tool that, after a
crash (with possibly non-matching data/csum), allows a user to find such
cases and decide what to do... so a user/admin who rather takes the
bad data and tries forensic recovery could be given a tool like
btrfs csum --recompute-invalid-csums (or some better name), with which
either all (or just some paths') csums are re-written in case they don't
match.


Cheers,
Chris.



Re: FAILED: patch "[PATCH] Btrfs: fix early ENOSPC due to delalloc" failed to apply to 4.12-stable tree

2017-08-04 Thread Christoph Anton Mitterer
Hey.

Could someone of the devs put some attention on this...?

Thanks,
Chris :-)


On Mon, 2017-07-31 at 18:06 -0700, gre...@linuxfoundation.org wrote:
> The patch below does not apply to the 4.12-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git
> commit
> id to <sta...@vger.kernel.org>.
> 
> thanks,
> 
> greg k-h
> 
> -- original commit in Linus's tree --
> 
> From 17024ad0a0fdfcfe53043afb969b813d3e020c21 Mon Sep 17 00:00:00
> 2001
> From: Omar Sandoval <osan...@fb.com>
> Date: Thu, 20 Jul 2017 15:10:35 -0700
> Subject: [PATCH] Btrfs: fix early ENOSPC due to delalloc
> 
> If a lot of metadata is reserved for outstanding delayed allocations,
> we
> rely on shrink_delalloc() to reclaim metadata space in order to
> fulfill
> reservation tickets. However, shrink_delalloc() has a shortcut where
> if
> it determines that space can be overcommitted, it will stop early.
> This
> made sense before the ticketed enospc system, but now it means that
> shrink_delalloc() will often not reclaim enough space to fulfill any
> tickets, leading to an early ENOSPC. (Reservation tickets don't care
> about being able to overcommit, they need every byte accounted for.)
> 
> Fix it by getting rid of the shortcut so that shrink_delalloc()
> reclaims
> all of the metadata it is supposed to. This fixes early ENOSPCs we
> were
> seeing when doing a btrfs receive to populate a new filesystem, as
> well
> as early ENOSPCs Christoph saw when doing a big cp -r onto Btrfs.
> 
> Fixes: 957780eb2788 ("Btrfs: introduce ticketed enospc
> infrastructure")
> Tested-by: Christoph Anton Mitterer <m...@christoph.anton.mitterer.name>
> Cc: sta...@vger.kernel.org
> Reviewed-by: Josef Bacik <jba...@fb.com>
> Signed-off-by: Omar Sandoval <osan...@fb.com>
> Signed-off-by: David Sterba <dste...@suse.com>
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a6635f07b8f1..e3b0b4196d3d 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4825,10 +4825,6 @@ static void shrink_delalloc(struct
> btrfs_fs_info *fs_info, u64 to_reclaim,
>   else
>   flush = BTRFS_RESERVE_NO_FLUSH;
>   spin_lock(&space_info->lock);
> - if (can_overcommit(fs_info, space_info, orig, flush,
> false)) {
> - spin_unlock(&space_info->lock);
> - break;
> - }
>   if (list_empty(&space_info->tickets) &&
>   list_empty(&space_info->priority_tickets)) {
>   spin_unlock(&space_info->lock);
> 



Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-03 Thread Christoph Anton Mitterer
On Thu, 2017-08-03 at 20:08 +0200, waxhead wrote:
> Brendan Hide wrote:
> > The title seems alarmist to me - and I suspect it is going to be 
> > misconstrued. :-/
> > 
> > From the release notes at 
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Li
> > nux/7/html/7.4_Release_Notes/chap-Red_Hat_Enterprise_Linux-
> > 7.4_Release_Notes-Deprecated_Functionality.html
> > "Btrfs has been deprecated
> > 

Wow... not that this would have any direct effect... it's still quite
alarming, isn't it?

This is not meant as criticism, but I often wonder myself where
btrfs is going!? :-/

It's in the kernel now since when? 2009? And while the extremely basic
things (snapshots, etc.) seem to work quite stably... other things seem
to be rather stuck (RAID?)... not to mention many things that have
been kinda "promised" (fancy different compression algos, n-parity
RAID).
There are no higher-level management tools (e.g. RAID
management/monitoring, etc.)... there are still some rather serious
issues (the attacks/corruptions likely possible via UUID collisions)...
One thing that I've been missing for a long time would be checksumming
with nodatacow.
Also it has always been said that the actual performance tuning would
still lie ahead?!


I really like btrfs and use it on all my personal systems... and I
haven't had any data loss so far (only a number of serious-looking
false positives due to bugs in btrfs check ;-) )... but one
still reads every now and then from people here on the list who seem to
suffer more serious losses.



So is there any concrete roadmap? Or priority tasks? Is there a lack of
developers?

Cheers,
Chris.



Re: [PATCH 00/14 RFC] Btrfs: Add journal for raid5/6 writes

2017-08-01 Thread Christoph Anton Mitterer
Hi.

Stupid question:
Would the write hole be closed already, if parity was checksummed?

Cheers,
Chris.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 14:48 -0700, Omar Sandoval wrote:
> Just to be sure, did you explicitly write 0 to these?
Nope... that seemed to have been the default value, i.e. I used
sysctl(8) in read (and not set) mode here.



> These sysctls are
> really confusing, see https://www.kernel.org/doc/Documentation/sysctl
> /vm.txt.
> Basically, there are two ways to specify these, either as a ratio of
> system memory (vm.dirty_ratio and vm.dirty_background_ratio) or a
> static
> number of bytes (vm.dirty_bytes and vm.dirty_background_bytes). If
> you
> set one, the other appears as 0, and the kernel sets the ratios by
> default. But if you explicitly set them to 0, the kernel is going to
> flush stuff extremely aggressively.
I see,... not sure why both are 0 here... at least I didn't change it
myself - must be something from the distro?
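(For reference, a quick way to see which of the two forms is in effect -
a sketch; the values shown are the usual kernel defaults and may differ
per distro/configuration:

# sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_bytes vm.dirty_background_bytes
vm.dirty_ratio = 20
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_background_bytes = 0

Here the non-zero ratios are the active limits, and the *_bytes
counterparts merely read back as 0.)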


> Awesome, glad to hear it! I hadn't been able to reproduce the issue
> outside of Facebook. Can I add your tested-by?
Sure, but better use my other mail address for it, if you don't mind:
Christoph Anton Mitterer <m...@christoph.anton.mitterer.name>


> > I assume you'll take care to get that patch into stable kernels?
> > Is this patch alone enough to recommend the Debian maintainers to
> > include it into their 4.9 long term stable kernels?
> 
> I'll mark it for stable, assuming Debian tracks the upstream LTS
> releases it should get in.
Okay :-)

Nevertheless I'll open a bug at their BTS, just to be safe.


Thanks :)

Chris.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 10:32 -0700, Omar Sandoval wrote:
> If that doesn't work, could you please also try
> https://patchwork.kernel.org/patch/9829593/?

Okay, tried the patch now, applied on top of:
Linux 4.12.0-trunk-amd64 #1 SMP Debian 4.12.2-1~exp1 (2017-07-18) x86_64 
GNU/Linux
(that is the Debian source package, with all their further patches and
their kernel config).

with the parameters at their defaults:
# sysctl vm.dirty_bytes
vm.dirty_bytes = 0
# sysctl vm.dirty_background_bytes
vm.dirty_background_bytes = 0



Tried copying the whole image three times (before, every copy of the
whole image had at least one error, so that should be
"proof" enough that it fixes the issue) onto the btrfs fs... no errors
this time...

Looks good :-)


I assume you'll take care to get that patch into stable kernels?
Is this patch alone enough to recommend the Debian maintainers to
include it into their 4.9 long term stable kernels?

And would you recommend this as an "urgent" fix?


Cheers,
Chris.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 11:14 -0700, Omar Sandoval wrote:
> Yes, that's a safe enough workaround. It's a good idea to change the
> parameters back after the copy.
you mean even without having the fix, right?

So AFAIU, the bug doesn't really cause FS corruption, but just "false"
ENOSPC, and these happen only during heavy metadata creation (e.g. during
operations like mine)?



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 10:55 -0700, Omar Sandoval wrote:
> Against 4.12 would be best, thanks!
okay,.. but that will take a while to compile...


in the meantime... do you know whether it's more or less safe to use
the 4.9 kernel without any fix, when I change the parameters mentioned
before, during the massive copying?

Cheers,
Chris.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 10:32 -0700, Omar Sandoval wrote:
> Could you try 4.12?
Linux 4.12.0-trunk-amd64 #1 SMP Debian 4.12.2-1~exp1 (2017-07-18)
x86_64 GNU/Linux
from Debian experimental, doesn't fix the issue...


>  If that doesn't work, could you please also try
> https://patchwork.kernel.org/patch/9829593/?
Against 4.9?


Cheers,
Chris.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 15:00 +, Martin Raiber wrote:
> there are patches on this list/upstream which could fix this ( e.g.
> "fix
> delalloc accounting leak caused by u32 overflow"/"fix early ENOSPC
> due
> to delalloc").

mhh... it's a bit problematic to test these on those nodes...


> Do you use compression?

nope...


> It would be interesting if lowering the dirty ratio is a viable
> work-around (sysctl vm.dirty_background_bytes=314572800 && sysctl
> vm.dirty_bytes=1258291200).

doesn't seem to change anything.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
On Thu, 2017-07-20 at 15:00 +, Martin Raiber wrote:
> It would be interesting if lowering the dirty ratio is a viable
> work-around (sysctl vm.dirty_background_bytes=314572800 && sysctl
> vm.dirty_bytes=1258291200).
> 
> Regards,
> Martin

I took away a trailing 0 for each of them... and then it goes through
without error

sysctl vm.dirty_bytes=125829120
vm.dirty_bytes = 125829120
sysctl vm.dirty_background_bytes=31457280
vm.dirty_background_bytes = 31457280


But what does that mean now... could there still be any corruption?
And does one need to set the values permanently (until this is fixed in
stable), or is this just necessary for such large copying
operations?
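(If one wanted to keep such a workaround across reboots, the usual
mechanism would be a sysctl drop-in - a sketch, with a hypothetical
file name:

# cat /etc/sysctl.d/99-dirty-workaround.conf
vm.dirty_background_bytes = 31457280
vm.dirty_bytes = 125829120
# sysctl --system

and revert it again once a fixed kernel is running.)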


Cheers,
Chris.



Re: strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
Oh and I should add:
After such an error, cp goes on copying (with other files)...

Same issue occurs when I do something like tar -cf - /media | tar -xf


Cheers,
Chris.



strange No space left on device issues

2017-07-20 Thread Christoph Anton Mitterer
Hey.

The following happens on Debian stretch systems:
# uname -a
Linux lcg-lrz-admin 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) 
x86_64 GNU/Linux

What I have are VMs whose root fs is ext4 and which I want to
migrate to btrfs.
So I've added further disk images and then did something like this:
- mkfs.btrfs --nodiscard --label system /dev/sdc2 (i.e. the new image)
- mounted that at /mnt
- created a subvol "root" in it
- stopped all services on the node
- remount,ro /
- mount --bind / /media
- cp -a /media/ /mnt/subvol/
- and then I'd go on to move everything into place, install the bootloader etc. (commands sketched below)
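Roughly, as actual commands (a sketch; the subvolume name "root" is an
assumption, matching the error paths below):

# mkfs.btrfs --nodiscard --label system /dev/sdc2
# mount /dev/sdc2 /mnt
# btrfs subvolume create /mnt/root
# mount -o remount,ro /
# mount --bind / /media
# cp -a /media/. /mnt/root/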

That used to always work, and does when I try the same with ext4
instead of btrfs on the new images.

But with btrfs I get spurious "No space left on device" errors like:
cp: cannot create regular file
'/mnt/root/X/media/usr/share/doc/openjdk-8-jre-
headless/api/java/security/PrivilegedExceptionAction.html': No space
left on device
cp: cannot create regular file
'/mnt/root/X/media/usr/share/doc/openjdk-8-jre-
headless/api/java/security/Provider.Service.html': No space left on
device
cp: cannot create regular file
'/mnt/root/X/media/usr/share/doc/openjdk-8-jre-
headless/api/javax/script/AbstractScriptEngine.html': No space left on
device

or:
cp: preserving permissions for
‘/mnt/root/X/usr/include/c++/6/gnu/javax/crypto/keyring/BaseKeyring.h’:
No space left on device
cp: preserving permissions for ‘/mnt/root/X/usr/share/doc/cmake-
data/html/variable/CMAKE_CXX_STANDARD_REQUIRED.html’: No space left on
device


All these happen always (when I create a fresh btrfs on the volume and
start over), each time with different files... and btrfs filesystem df
shows plenty of space left, in terms of >15 GB.


Any ideas?

Cheers,
Chris.



Re: Exactly what is wrong with RAID5/6

2017-06-21 Thread Christoph Anton Mitterer
On Wed, 2017-06-21 at 16:45 +0800, Qu Wenruo wrote:
> Btrfs is always using device ID to build up its device mapping.
> And for any multi-device implementation (LVM,mdadam) it's never a
> good 
> idea to use device path.

Isn't it rather the other way round? Using the ID is bad? Don't you
remember our discussion about using leaked UUIDs (or accidental
collisions) for all kinds of attacks?


Cheers,
Chris.



Re: [PATCH 1/2] btrfs: warn about RAID5/6 being experimental at mount time

2017-03-29 Thread Christoph Anton Mitterer
On Wed, 2017-03-29 at 06:39 +0200, Adam Borowski wrote:
> Too many people come complaining about losing their data -- and
> indeed,
> there's no warning outside a wiki and the mailing list tribal
> knowledge.
> Message severity chosen for consistency with XFS -- "alert" makes
> dmesg
> produce nice red background which should get the point across.

Wouldn't it be much better to disallow:
- creation
AND
- mounting
of btrfs unless some special switch like:
--yes-i-know-this-is-still-extremely-experimental
is given for the time being?

Normal users typically don't look at any such kernel log messages - and
expert users (who do) know anyway that it's still unstable.


Cheers,
Chris.



Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-02-03 Thread Christoph Anton Mitterer
Hey Qu

On Fri, 2017-02-03 at 14:20 +0800, Qu Wenruo wrote:
> Great thanks for that!
You're welcome. :)


> I also added missing error message output for other places I found,
> and 
> updated the branch, the name remains as "lowmem_tests"
> 
> Please try it.
# btrfs check /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used, no error found
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
0

# btrfs check /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used, no error found
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
0

=> looks good this time :)


btw: In your commit messages, please change my email to
cales...@scientia.net everywhere... I accidentally used my university
address (christoph.anton.mitte...@lmu.de) sometimes when sending mail.

Cheers,
Chris.



Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-02-01 Thread Christoph Anton Mitterer
On Wed, 2017-02-01 at 09:06 +0800, Qu Wenruo wrote:
> https://github.com/adam900710/btrfs-progs/tree/lowmem_fixes
> 
> Which is also rebased to latest v4.9.1.

Same game as last time, applied to 4.9, no RW mount between the runs.


btrfs-progs v4.9 WITHOUT patch:
***
# btrfs check /dev/nbd0 ; echo $?
checking extents
137

=> it would be nice if btrfs-progs could give a message on why it failed,
i.e. "not enough memory" or so.


# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6997067956224 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[7702516334592 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[8051482427392 1073741824] used 1073741824 but extent items 
used 1084751872
ERROR: block group[8116980678656 1073741824] used 1073741824 but extent items 
used 0
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used err is -5
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
1
=> error still occurs *without* patch

=> increased VM memory here

# btrfs check /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used err is 0
total csum bytes: 7330834320
total tree bytes: 10902437888

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-29 Thread Christoph Anton Mitterer
On Sun, 2017-01-29 at 12:27 +0800, Qu Wenruo wrote:
> Sorry for the late reply, in Chinese New Year vacation.
No worries... and happy new year then ;)


> I'll update the patchset soon to address it.
Just tell me and I re-check.


> Thanks again for your detailed output and patience,
Thanks as well :)

Chris.



Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-26 Thread Christoph Anton Mitterer
On Thu, 2017-01-26 at 11:10 +0800, Qu Wenruo wrote:
> Would you please try lowmem_tests branch of my repo?
> 
> That branch contains a special debug output for the case you 
> encountered, which should help to debug the case.

Here's the output with your patches (again, not having applied the
unnecessary fs-tests patches):

In the output below I've replaced filenames with "[snip..snap]" and
exchanged some of the xattr values.
In case you should need their original values for testing, just tell me
and I'll send them to you off-list.

btrfs-progs v4.9 WITH patches:
**
# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
ERROR: root 6031 EXTENT_DATA[277 524288] datasum missing, have: 36864, expect: 
45056 ret: 0
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
=== fs tree leaf dump: slot: 136 ===
leaf 5960902508544 items 191 free space 60 generation 2775 owner 6031
fs uuid 326d292d-f97b-43ca-b1e8-c722d3474719
chunk uuid 5da7e818-7f0b-43c1-b465-fdfaa52da633
item 0 key (274 EXTENT_DATA 7733248) itemoff 16230 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807724716032 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 1 key (274 EXTENT_DATA 7864320) itemoff 16177 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807724834816 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 2 key (274 EXTENT_DATA 7995392) itemoff 16124 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807724953600 nr 122880
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 3 key (274 EXTENT_DATA 8126464) itemoff 16071 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725076480 nr 122880
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 4 key (274 EXTENT_DATA 8257536) itemoff 16018 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725199360 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 5 key (274 EXTENT_DATA 8388608) itemoff 15965 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725318144 nr 114688
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 6 key (274 EXTENT_DATA 8519680) itemoff 15912 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725432832 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 7 key (274 EXTENT_DATA 8650752) itemoff 15859 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725551616 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 8 key (274 EXTENT_DATA 8781824) itemoff 15806 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725670400 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 9 key (274 EXTENT_DATA 8912896) itemoff 15753 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725789184 nr 122880
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 10 key (274 EXTENT_DATA 9043968) itemoff 15700 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807725912064 nr 114688
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 11 key (274 EXTENT_DATA 9175040) itemoff 15647 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807726026752 nr 114688
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 12 key (274 EXTENT_DATA 9306112) itemoff 15594 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 6807726141440 nr 118784
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 13 key (274 EXTENT_DATA 9437184) itemoff 15541 itemsize 53
generation 2775 type 1 (regular)
extent data disk byte 

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-25 Thread Christoph Anton Mitterer
On Thu, 2017-01-26 at 11:10 +0800, Qu Wenruo wrote:
> In fact, the result without patches is not really needed for current
> stage.
> 
> Feel free to skip them until the patched ones passed.
> Which should save you some time.

Well, the idea is that if I do further writes in the meantime (by
adding new backup data), then things in the fs could change (I blindly
assume) in such a way that the false positive isn't triggered any more
- not because a patch would finally have it fixed, but simply because
things on the fs changed...

That's why I've repeated it every time so far - just to see that the
issues are still there.


> Would you please try lowmem_tests branch of my repo?
> 
> That branch contains a special debug output for the case you 
> encountered, which should help to debug the case.

Sure, tomorrow.

Best wishes,
Chris.



Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-25 Thread Christoph Anton Mitterer
On Wed, 2017-01-25 at 12:16 +0800, Qu Wenruo wrote:
> https://github.com/adam900710/btrfs-progs/tree/lowmem_fixes

Just finished trying your new patches.

Same game as last time, applied to 4.9, no RW mount between the runs.


btrfs-progs v4.9 WITHOUT patch:
***
# btrfs check /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used err is 0
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
0


# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6997067956224 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[7702516334592 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[8051482427392 1073741824] used 1073741824 but extent items 
used 1084751872
ERROR: block group[8116980678656 1073741824] used 1073741824 but extent items 
used 0
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used err is -5
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
1

btrfs-progs v4.9 WITH patches:

Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-24 Thread Christoph Anton Mitterer
On Wed, 2017-01-25 at 12:16 +0800, Qu Wenruo wrote:
> New patches are out now.
> 
> Although I just updated 
> 0001-btrfs-progs-lowmem-check-Fix-wrong-block-group-check.patch to
> fix 
> all similar bugs.
> 
> You could get it from github:
> https://github.com/adam900710/btrfs-progs/tree/lowmem_fixes

Sure, will take a while, though (hopefully get it done tomorrow)


> Unfortunately, I didn't find the cause of the remaining error of
> that 
> missing csum.
> And considering the size of your fs, btrfs-image is not possible, so
> I'm 
> afraid you need to test the patches every time it updates.

No worries :-)


Cheers,
Chris.



Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-24 Thread Christoph Anton Mitterer
On Wed, 2017-01-25 at 08:44 +0800, Qu Wenruo wrote:
> Thanks for the test,

You're welcome... I'm happy if I can help :)

Just tell me once you think you found something, and I'll repeat the
testing.


Cheers,
Chris.



Re: [PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update

2017-01-24 Thread Christoph Anton Mitterer
Hey Qu.

I was giving your patches a try, again on the very same fs from my
initial report (which however saw further writes in the meantime).

btrfs-progs v4.9 WITHOUT patch:
***
# btrfs check /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used err is 0
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
0

# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6997067956224 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[7702516334592 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[8051482427392 1073741824] used 1073741824 but extent items 
used 1084751872
ERROR: block group[8116980678656 1073741824] used 1073741824 but extent items 
used 0
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7519512838144 bytes used err is -5
total csum bytes: 7330834320
total tree bytes: 10902437888
total fs tree bytes: 2019704832
total extent tree bytes: 1020149760
btree space waste bytes: 925714197
file data blocks allocated: 7509228494848
 referenced 7630551511040
1

=> so the fs would still show the symptoms


Then, with no RW mount to the fs in between, 4.9 with the following of
your patches:

Re: RAID56 status?

2017-01-23 Thread Christoph Anton Mitterer
On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote:
> We've been focusing on the single-drive use cases internally.  This
> year 
> that's changing as we ramp up more users in different places.  
> Performance/stability work and raid5/6 are the top of my list right
> now.
+1

Would be nice to get some feedback on what happens behind the scenes...
 actually I think a regular btrfs development blog could be generally a
nice thing :)

Cheers,
Chris.



Re: RAID56 status?

2017-01-23 Thread Christoph Anton Mitterer
Just wondered... is there any larger known RAID56 deployment? I mean
something with real-world production systems and ideally many different
 IO scenarios, failures, pulling disks randomly and perhaps even so
many disks that it's also likely to hit something like silent data
corruption (on the disk level)?

Has CM already migrated all of Facebook's storage to btrfs RAID56?! ;-)
Well, at least facebook.com seems still online ;-P *kidding*

I mean, the good thing in having such a massive production-like
environment - especially when it's not just one homogeneous usage
pattern - is that it would help to build up quite some trust in the
code (once the already known bugs are fixed).



Cheers,
Chris.



Re: RAID56 status?

2017-01-22 Thread Christoph Anton Mitterer
On Sun, 2017-01-22 at 22:39 +, Hugo Mills wrote:
>    It's still all valid. Nothing's changed.
> 
>    How would you like it to be updated? "Nope, still broken"?

The kernel version mentioned there is 4.7... so no one (at least
end users) really knows whether it's just no longer maintained or still
up-to-date with nothing changed... :(


Cheers,
Chris



Re: RAID56 status?

2017-01-22 Thread Christoph Anton Mitterer
On Sun, 2017-01-22 at 22:22 +0100, Jan Vales wrote:
> Therefore my question: whats the status of raid5/6 is in btrfs?
> Is it somehow "production"-ready by now?
AFAIK, what's on the - apparently no longer updated -
https://btrfs.wiki.kernel.org/index.php/Status still applies, and
RAID56 is not yet usable for anything near production.

Cheers,
Chris.



Re: [PATCH] btrfs-progs: lowmem-check: Fix wrong extent tree iteration

2017-01-20 Thread Christoph Anton Mitterer
On Fri, 2017-01-20 at 15:58 +0800, Qu Wenruo wrote:
> Nice to hear that, although the -5 error seems to be caught
> I'll locate the problem and then send the patch.
> 
> Thanks for your testing!

You're welcome... just ping me when I should do another run.

Cheers,
Chris.



Re: [PATCH] btrfs-progs: lowmem-check: Fix wrong extent tree iteration

2017-01-19 Thread Christoph Anton Mitterer
Hey Qu.


On Wed, 2017-01-18 at 16:48 +0800, Qu Wenruo wrote:
> To Christoph,
> 
> Would you please try this patch, and to see if it suppress the block
> group
> warning?
I did another round of fsck in both modes (original/lowmem), first
WITHOUT your patch, then WITH it... both on progs version 4.9... no
further RW mount between these 4 runs:


btrfs-progs v4.9 WITHOUT patch:
***
# btrfs check /dev/nbd0 ; echo $?
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7469206884352 bytes used err is 0
total csum bytes: 7281779252
total tree bytes: 10837262336
total fs tree bytes: 2011906048
total extent tree bytes: 1015349248
btree space waste bytes: 922444044
file data blocks allocated: 7458369622016
 referenced 7579485159424
0

# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items 
used 1207959552
ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items 
used 1074266112
ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[6997067956224 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[7702516334592 1073741824] used 1073741824 but extent items 
used 0
ERROR: block group[8051482427392 1073741824] used 1073741824 but extent items 
used 1084751872
ERROR: block group[8116980678656 1073741824] used 1073217536 but extent items 
used 0
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
found 7469206884352 bytes used err is -5
total csum bytes: 7281779252
total tree bytes: 10837262336
total fs tree bytes: 2011906048
total extent tree bytes: 1015349248
btree space waste bytes: 922444044
file data 

Re: corruption: yet another one after deleting a ro snapshot

2017-01-17 Thread Christoph Anton Mitterer
On Wed, 2017-01-18 at 08:41 +0800, Qu Wenruo wrote:
> Since we have your extent tree and root tree dump, I think we should
> be 
> able to build an image to reproduce the case.
+1

> BTW, your fs is too large for us to really do some verification or
> other 
> work.

Sure, I know... but that's simply the one which I work with the most and
where I stumble over such things.

I have e.g. a smaller one (well, still 1 TB in total, 500 GB used) which
is the root fs of my notebook... but no real issues with that so
far ^^


Cheers,
Chris.



Re: corruption: yet another one after deleting a ro snapshot

2017-01-17 Thread Christoph Anton Mitterer
On 17 January 2017 09:53:19 CET, Qu Wenruo wrote:
>Just lowmem false alert, as extent-tree dump shows complete fine
>result.
>
>I'll CC you and adds your reported-by tag when there is any update on 
>this case.

Fine; just one more thing from my side on this issue:
Do you want me to leave the fs untouched until I can verify a lowmem mode fix?
Or is it OK to go on using it (and running backups on it)?

Cheers,
Chris.


Re: corruption: yet another one after deleting a ro snapshot

2017-01-16 Thread Christoph Anton Mitterer
On Mon, 2017-01-16 at 13:47 +0800, Qu Wenruo wrote:
> > > And I highly suspect if the subvolume 6403 is the RO snapshot you
> > > just removed.
> > 
> > I guess there is no way to find out whether it was that snapshot,
> > is
> > there?
> 
> "btrfs subvolume list" could do it."
Well, that much was clear... but I rather meant something that also shows me
the path of the deleted subvol.
Anyway:
# btrfs subvolume list /data/data-a/3/
ID 6029 gen 2528 top level 5 path data
ID 6031 gen 3208 top level 5 path backups
ID 7285 gen 3409 top level 5 path snapshots/_external-fs/data-a1/data/2017-01-11_1

So since I only had two further snapshots in snapshots/_external-fs/data-a1/data/ it must be the deleted one.
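
(In hindsight, one way to check this: btrfs-progs can also list subvolumes
that are deleted but not yet cleaned up, a minimal sketch, assuming a
progs version recent enough to have the -d flag:

# btrfs subvolume list -d /data/data-a/3/

IIRC anything still pending removal by the cleaner shows up there with its
path replaced by DELETED.)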

btw: data is empty, and backups actually contains some files (~25k,
~360GB)... these are not created via send/receive, but either via cp or
rsync.
And they are always in the same subvol (i.e. the backups subvol isn't
deleted like the snapshots are).


> Also checked the extent tree, the result is a little interesting:
> 1) Most tree backref are good.
> In fact, 3 of all the 4 errors reported are tree blocks shared by
> other subvolumes, like:
> 
> item 77 key (5120 METADATA_ITEM 1) itemoff 13070 itemsize 42
>   extent refs 2 gen 11 flags TREE_BLOCK|FULL_BACKREF
>   tree block skinny level 1
>   tree block backref root 7285
>   tree block backref root 6572
> 
> This means the tree blocks are shared by 2 other subvolumes,
> 7285 and 6572.
> 
> 7285 subvolume is completely OK, while 6572 is also undergoing
> subvolume 
> deletion(while real deletion doesn't start yet).
Well there were in total three snapshots, the still existing:
snapshots/_external-fs/data-a1/data/2017-01-11_1
and two earlier ones,
one from around 2016-09-16_1 (= probably ID 6572?), one even a bit
earlier from 2016-08-19_1 (probably ID 6403?).
Each one was created with
send -p | receive, using the respectively earlier one as parent.
So it's
quite reasonable that they share the extents, and also that they're shared
by 2 subvols.
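
(For reference, the backup pattern is roughly the following, a sketch with
hypothetical snapshot names rather than the literal paths used here:

# btrfs subvolume snapshot -r /master/data /master/snapshots/2017-01-11_1
# btrfs send -p /master/snapshots/2016-09-16_1 \
      /master/snapshots/2017-01-11_1 | btrfs receive /backup/snapshots/

i.e. each new ro snapshot is sent incrementally against the previous one.)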



> And considering the generation, I assume 6403 is deleted before 6572.
Don't remember which one of the 2 subvols from 2016 I've deleted first,
the older or the more recent one... my bash history implies this
order:
 4203  btrfs subvolume delete 2016-08-19_1
 4204  btrfs subvolume delete 2016-09-16_1


> So we're almost clear that btrfs (maybe only btrfsck) doesn't handle it
> well if there are multiple subvolumes undergoing deletion.
> 
> This gives us enough info to try to build such an image by ourselves now.
> (Although still quite hard to do though.)
Well keep me informed if you actually find/fix something  :)


> And for the scary lowmem mode, it's false alert.
> 
> I manually checked the used size of a block group and it's OK.
So you're going to fix this?


> BTW, most of your block groups are completely used, without any free
> space.
> But interestingly, most data extent sizes are just 512K: larger than the
> compressed extent upper limit, but still quite small.
Not sure if I understand this...


> In other words, your fs seems to be fragmented considering the upper 
> limit of a data extent is 128M.
> (Or your case is quite common in common world?)
No, I don't think I understand what you mean :D
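
(If it helps, I could inspect the extent layout of individual files with
e.g. filefrag and report typical extent sizes, along the lines of:

# filefrag -v /data/data-a/3/backups/somefile

...with "somefile" being whatever file you'd consider interesting.)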


> So you are mostly OK to mount it rw any time you want, and I have 
> already downloaded the raw data.
Okay, I've remounted it now RW, called btrfs filesystem sync, and
waited until the HDD became silent and showed no further activity.

(again with btrfs-progs v4.9)

# btrfs check /dev/nbd0 ; echo $?
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 7469206884352 bytes used err is 0
total csum bytes: 7281779252
total tree bytes: 10837262336
total fs tree bytes: 2011906048
total extent tree bytes: 1015349248
btree space waste bytes: 922444044
file data blocks allocated: 7458369622016
 referenced 7579485159424
0


=> as you can see, original mode now claims that things are fine.


# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but 

Re: corruption: yet another one after deleting a ro snapshot

2017-01-15 Thread Christoph Anton Mitterer
On Mon, 2017-01-16 at 11:16 +0800, Qu Wenruo wrote:
> It would be very nice if you could paste the output of
> "btrfs-debug-tree -t extent " and "btrfs-debug-tree -t root "
> 
> That would help us to fix the bug in lowmem mode.
I'll send you the link in a private mail ... if any other developer
needs it, just ask me or Qu for the link.
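
(i.e. essentially something like the following, with /dev/nbd0 being the
device of this fs, and gzip just to keep the upload manageable:

# btrfs-debug-tree -t extent /dev/nbd0 | gzip > extent-tree.dump.gz
# btrfs-debug-tree -t root /dev/nbd0 | gzip > root-tree.dump.gz
)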


> BTW, if it's possible, would you please try to run btrfs-check
> before 
> your next deletion on ro-snapshots?
You mean in general, whenever I do my next runs of backups or
snapshot cleanups?
Sure, actually I did this this time as well (in original mode, though),
and no error was found.

What should I look out for?
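
(My routine would then look roughly like this, with the snapshot path being
just a placeholder:

# btrfs check /dev/mapper/data-a3           # with the fs unmounted
# mount /dev/mapper/data-a3 /mnt
# btrfs subvolume delete /mnt/old-snapshot
# btrfs subvolume sync /mnt                 # wait for the cleaner to finish
# umount /mnt
# btrfs check /dev/mapper/data-a3

...assuming a btrfs-progs new enough for "subvolume sync".)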


> Not really needed, as all corruption happens on tree block of root 6403,
> it means, if it's a real corruption, it will only disturb you (make fs
> suddenly RO) when you try to modify something (leaves under that node) in
> that subvolume.
Ah... and it couldn't cause corruption to the same data blocks if they
were used by another snapshot?



> And I highly suspect if the subvolume 6403 is the RO snapshot you
> just removed.
I guess there is no way to find out whether it was that snapshot, is
there?



> If 'btrfs subvolume list' can't find that subvolume, then I think it's
> mostly OK for you to RW mount and wait for the subvolume to be fully
> deleted.
>
> And I think you have already provided enough data for us to, at least
> try to, reproduce the bug.

I won't do the remount,rw this night, so you have the rest of your
day/night time to think of anything further I should test or provide
you with from that fs... then it will be "gone" (in the sense of
mounted RW).
Just give your veto if I should wait :)


Thanks,
Chris.

smime.p7s
Description: S/MIME cryptographic signature


Re: corruption: yet another one after deleting a ro snapshot

2017-01-15 Thread Christoph Anton Mitterer
On Mon, 2017-01-16 at 09:38 +0800, Qu Wenruo wrote:
> So the fs is REALLY corrupted.
*sigh* ... (not as in fuck-I'm-losing-my-data™ ... but as in *sigh*
another-possibly-deeply-hidden-bug-in-btrfs-that-might-eventually-
cause-data-loss...)

> BTW, lowmem mode seems to have a new false alert when checking the block
> group item.

Anything you want me to check there?


> Did you have any "lightweight" method to reproduce the bug?
Na, not at all... as I've said this already happened to me once before,
and in both cases I was cleaning up old ro-snapshots.

At least in the current case the fs was only ever filled via
send/receive (well apart from minor mkdirs or so)... so there shouldn't
have been any "extreme ways" of using it.

I think (but am not sure) that this was also the case on the other
occasion that happened to me with a different fs (i.e. I think it was
also a backup 8TB disk).


> For example, on a 1G btrfs fs with moderate operations, for example 
> 15min or so, to reproduce the bug?
Well, I could try to reproduce it, but I guess you'd have far better
means to do so.

As I've said I was mostly doing send (with -p) | receive to do
incremental backups... and after a while I was cleaning up the old
snapshots on the backup fs.
Of course the snapshot subvols are pretty huge... as I've said, close to
8TB (7.5 or so)... everything from quite big files (4GB) to very small
ones, symlinks (no devices/sockets/fifos)... perhaps some hardlinks...
Some refcopied files. The whole fs has compression enabled.
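
(With "refcopied" I mean reflinked, i.e. what e.g.
cp --reflink=always srcfile dstfile
produces.)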


> > Shall I rw-mount the fs and do sync and wait and retry? Or is there
> > anything else that you want me to try before in order to get the
> > kernel bug (if any) or btrfs-progs bug nailed down?
> 
> Personally speaking, rw mount would help, to verify if it's just a bug
> that will disappear after the deletion is done.
Well, but then we might lose any chance to further track it down.

And even if it went away, it would still at least be a bug in terms of
an fsck false positive, if not more (in the sense that... corruptions
may happen if some affected parts of the fs are used while not cleaned up
again).


> But considering the size of your fs, it may not be a good idea as we 
> don't have reliable method to recover/rebuild extent tree yet.

So what do you effectively want now?
Wait and try something else?
RW mount and recheck to see whether it goes away with that? (And even
if it does, should I rather re-create/populate the fs from scratch just
to be sure?)

What I can also offer in addition... as mentioned a few times
previously, I do have full lists of the reg-files/dirs/symlinks as well
as SHA512 sums of each of the reg-files, as they are expected to be on
the fs, or rather in the snapshot.
So I can offer to do a full verification pass of these, to see whether
anything is missing or (file)data actually corrupted.

Of course that will take a while, and even if everything verifies, I'm
still not really sure whether I'd trust that fs anymore ;-)


Cheers,
Chris.



Re: corruption: yet another one after deleting a ro snapshot

2017-01-15 Thread Christoph Anton Mitterer
On Thu, 2017-01-12 at 10:38 +0800, Qu Wenruo wrote:
> IIRC, RO mount won't continue background deletion.
I see.


> Would you please try 4.9 btrfs-progs?

Done now, see results (lowmem and original mode) below:

# btrfs version
btrfs-progs v4.9

# btrfs check /dev/nbd0 ; echo $?
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ref mismatch on [37765120 16384] extent item 0, found 1
Backref 37765120 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [37765120 16384]
owner ref check failed [37765120 16384]
ref mismatch on [5120 16384] extent item 0, found 1
Backref 5120 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [5120 16384]
owner ref check failed [5120 16384]
ref mismatch on [78135296 16384] extent item 0, found 1
Backref 78135296 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [78135296 16384]
owner ref check failed [78135296 16384]
ref mismatch on [5960381235200 16384] extent item 0, found 1
Backref 5960381235200 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [5960381235200 16384]
checking free space cache
checking fs roots
checking csums
checking root refs
found 7483995824128 bytes used err is 0
total csum bytes: 7296183880
total tree bytes: 10875944960
total fs tree bytes: 2035286016
total extent tree bytes: 1015988224
btree space waste bytes: 920641324
file data blocks allocated: 8267656339456
 referenced 8389440876544
0


# btrfs check --mode=lowmem /dev/nbd0 ; echo $?
Checking filesystem on /dev/nbd0
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but extent items used 0
ERROR: block group[5063795802112 1073741824] used 1073741824 but extent items used 0
ERROR: block group[5171169984512 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[5216267141120 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[5290355326976 1073741824] used 1073741824 but extent items used 0
ERROR: block group[5445511020544 1073741824] used 1073741824 but extent items used 1074266112
ERROR: block group[6084387405824 1073741824] used 1073741824 but extent items used 0
ERROR: block group[6104788500480 1073741824] used 1073741824 but extent items used 0
ERROR: block group[6878956355584 1073741824] used 1073741824 but extent items used 0
ERROR: block group[6997067956224 1073741824] used 1073741824 

Re: corruption: yet another one after deleting a ro snapshot

2017-01-11 Thread Christoph Anton Mitterer
Hey Qu,

On Thu, 2017-01-12 at 09:25 +0800, Qu Wenruo wrote:
> And since you just deleted a subvolume and unmount it soon
Indeed, I unmounted it pretty quickly afterwards...

I had mounted it (ro) in the meantime, and did a whole
find mountpoint > /dev/null
on it just to see whether going through the file hierarchy causes any
kernel errors already.
There are about 1.2 million files on the fs (in now only one snapshot)
and that took some 3-5 mins...
Not sure whether it continues to delete the subvol when it's mounted
ro... if so, it would have had some time.

However, another fsck afterwards:
# btrfs check /dev/mapper/data-a3 ; echo $?
Checking filesystem on /dev/mapper/data-a3
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ref mismatch on [37765120 16384] extent item 0, found 1
Backref 37765120 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [37765120 16384]
owner ref check failed [37765120 16384]
ref mismatch on [5120 16384] extent item 0, found 1
Backref 5120 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [5120 16384]
owner ref check failed [5120 16384]
ref mismatch on [78135296 16384] extent item 0, found 1
Backref 78135296 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [78135296 16384]
owner ref check failed [78135296 16384]
ref mismatch on [5960381235200 16384] extent item 0, found 1
Backref 5960381235200 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [5960381235200 16384]
checking free space cache
checking fs roots
checking csums
checking root refs
found 7483995824128 bytes used err is 0
total csum bytes: 7296183880
total tree bytes: 10875944960
total fs tree bytes: 2035286016
total extent tree bytes: 1015988224
btree space waste bytes: 920641324
file data blocks allocated: 8267656339456
 referenced 8389440876544
0


> , I assume the btrfs is still doing background subvolume deletion,
> maybe it's just a false alert from btrfsck.
If one deletes a subvol and unmounts too fast, will this already cause
a corruption, or does btrfs simply continue the cleanup during the next
time(s) it's mounted?
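
(If one wanted to be sure, I guess one could wait for the cleaner
explicitly before unmounting, assuming a btrfs-progs version that has the
subcommand:

# btrfs subvolume sync /data/data-a/3/

which should block until all deleted subvolumes are actually cleaned up.)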



> Would you please try btrfs check --mode=lowmem using latest btrfs-progs?
Here we go, however still with v4.7.3:

# btrfs check --mode=lowmem /dev/mapper/data-a3 ; echo $?
Checking filesystem on /dev/mapper/data-a3
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ERROR: block group[74117545984 1073741824] used 1073741824 but extent items used 0
ERROR: block group[239473786880 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[500393050112 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[581997428736 1073741824] used 1073741824 but extent items used 0
ERROR: block group[626557714432 1073741824] used 1073741824 but extent items used 0
ERROR: block group[668433645568 1073741824] used 1073741824 but extent items used 0
ERROR: block group[948680261632 1073741824] used 1073741824 but extent items used 0
ERROR: block group[982503129088 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1039411445760 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1054443831296 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[1190809042944 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1279392743424 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1481256206336 1073741824] used 1073741824 but extent items used 0
ERROR: block group[1620842643456 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[1914511032320 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3055361720320 1073741824] used 1073741824 but extent items used 0
ERROR: block group[3216422993920 1073741824] used 1073741824 but extent items used 0
ERROR: block group[3670615785472 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3801612288000 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[3828455833600 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4250973241344 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4261710659584 1073741824] used 1073741824 but extent items used 1074266112
ERROR: block group[4392707162112 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4558063403008 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4607455526912 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4635372814336 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4640204652544 1073741824] used 1073741824 but extent items used 0
ERROR: block group[4642352136192 1073741824] used 1073741824 but extent items used 1207959552
ERROR: block group[4681006841856 1073741824] used 1073741824 but 

Re: corruption: yet another one after deleting a ro snapshot

2017-01-11 Thread Christoph Anton Mitterer
Oops forgot to copy and past the actual fsck output O:-)
# btrfs check /dev/mapper/data-a3 ; echo $?
Checking filesystem on /dev/mapper/data-a3
UUID: 326d292d-f97b-43ca-b1e8-c722d3474719
checking extents
ref mismatch on [37765120 16384] extent item 0, found 1
Backref 37765120 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [37765120 16384]
owner ref check failed [37765120 16384]
ref mismatch on [5120 16384] extent item 0, found 1
Backref 5120 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [5120 16384]
owner ref check failed [5120 16384]
ref mismatch on [78135296 16384] extent item 0, found 1
Backref 78135296 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [78135296 16384]
owner ref check failed [78135296 16384]
ref mismatch on [5960381235200 16384] extent item 0, found 1
Backref 5960381235200 parent 6403 root 6403 not found in extent tree
backpointer mismatch on [5960381235200 16384]
checking free space cache
checking fs roots
checking csums
checking root refs
found 7483995824128 bytes used err is 0
total csum bytes: 7296183880
total tree bytes: 10875944960
total fs tree bytes: 2035286016
total extent tree bytes: 1015988224
btree space waste bytes: 920641324
file data blocks allocated: 8267656339456
 referenced 8389440876544
0



Also, I've found the previous occurrence of apparently the same issue:
https://www.spinics.net/lists/linux-btrfs/msg45190.html


What's the suggested way of reporting bugs? Here on the list?
kernel.org bugzilla?
It's a bit worrying that even just I myself have reported quite a number
of likely bugs here on the ML which never got a reaction from a
developer and thus likely still sleep under the hood :-/


Cheers,
Chris.



corruption: yet another one after deleting a ro snapshot

2017-01-11 Thread Christoph Anton Mitterer
Hey.

Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04)
x86_64 GNU/Linux
btrfs-progs v4.7.3

I've had this already at least once some year ago or so:

I was doing backups (incremental via send/receive).
After everything was copied, I unmounted the destination fs, ran a
fsck, all fine.
Then I mounted it again and did nothing but delete the old snapshot.
After that, another fsck with the following errors:


Usually I have quite positive experiences with btrfs (things seem to be
fine even after a crash or accidental removal of the USB cable which
attaches the HDD)... but I'm shocked every time when supposedly
simple and basic operations like this cause such corruptions.
Kinda gives one the feeling as if quite deep bugs are still everywhere
in place, especially as such "hard to explain" errors happen every now
and then (take e.g. my mails "strange btrfs deadlock" and "csum errors
during btrfs check" from the last days... and I don't seem to be the
only one who suffers from such problems, even with the basic parts of
btrfs which are considered to be stable - I mean we're not talking
about RAID56 here)... sigh :-(


While these files are precious, I have in total 4 copies of all these
files, 3 on btrfs and 1 on ext4 (just to be on the safe side if btrfs
gets corrupted for no good reason :-( ), so I could do some
debugging here if some developer tells me what to do.


Anyway... what should I do to repair the fs? Or is it better to simply
re-create that backup from scratch?


Cheers,
Chris.



yet another call trace during send/receive

2017-01-11 Thread Christoph Anton Mitterer
Hi.

On Debian sid:
$ uname -a
Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04) x86_64 GNU/Linux

$ btrfs version
btrfs-progs v4.7.3

During a:
# btrfs send -p foo bar | btrfs receive baz


Jan 11 20:43:10 heisenberg kernel: [ cut here ]
Jan 11 20:43:10 heisenberg kernel: WARNING: CPU: 6 PID: 10042 at /build/linux-zDY19G/linux-4.8.15/fs/btrfs/send.c:6117 btrfs_ioctl_send+0x533/0x1280 [btrfs]
Jan 11 20:43:10 heisenberg kernel: Modules linked in: udp_diag tcp_diag inet_diag algif_skcipher af_alg uas vhost_net vhost macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat tun bridge stp llc fuse ctr ccm ebtable_filter ebtables joydev rtsx_pci_ms memstick rtsx_pci_sdmmc mmc_core iTCO_wdt iTCO_vendor_support cpufreq_userspace cpufreq_powersave cpufreq_conservative ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_policy ipt_REJECT nf_reject_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack iptable_filter binfmt_misc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore
Jan 11 20:43:10 heisenberg kernel:  intel_rapl_perf psmouse pcspkr uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media btusb btrtl btbcm btintel sg bluetooth crc16 arc4 iwldvm mac80211 iwlwifi cfg80211 rtsx_pci rfkill fjes snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic tpm_tis tpm_tis_core tpm i915 fujitsu_laptop battery snd_hda_intel snd_hda_codec lpc_ich i2c_i801 ac mfd_core shpchp i2c_smbus snd_hda_core snd_hwdep snd_pcm snd_timer e1000e snd soundcore ptp pps_core video button mei_me mei drm_kms_helper drm i2c_algo_bit loop parport_pc ppdev sunrpc lp parport ip_tables x_tables autofs4 dm_crypt dm_mod raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod btrfs crc32c_generic xor raid6_pq uhci_hcd usb_storage
Jan 11 20:43:10 heisenberg kernel:  sd_mod crc32c_intel ahci libahci aesni_intel xhci_pci aes_x86_64 xhci_hcd libata glue_helper lrw ehci_pci gf128mul ablk_helper ehci_hcd cryptd evdev usbcore scsi_mod serio_raw usb_common
Jan 11 20:43:10 heisenberg kernel: CPU: 6 PID: 10042 Comm: btrfs Tainted: G    W   4.8.0-2-amd64 #1 Debian 4.8.15-2
Jan 11 20:43:10 heisenberg kernel: Hardware name: FUJITSU LIFEBOOK E782/FJNB23E, BIOS Version 1.11 05/24/2012
Jan 11 20:43:10 heisenberg kernel:  0286 248adbdb b3b1f925 
Jan 11 20:43:10 heisenberg kernel:   b3874ffe 9ebe7e9f4424 7ffcbf0ea5d0
Jan 11 20:43:10 heisenberg kernel:  9ebc0d644000 9ebe7e9f4000 9ebe5e44fb20 9ebd4270ae00
Jan 11 20:43:10 heisenberg kernel: Call Trace:
Jan 11 20:43:10 heisenberg kernel:  [] ? dump_stack+0x5c/0x77
Jan 11 20:43:10 heisenberg kernel:  [] ? __warn+0xbe/0xe0
Jan 11 20:43:10 heisenberg kernel:  [] ? btrfs_ioctl_send+0x533/0x1280 [btrfs]
Jan 11 20:43:10 heisenberg kernel:  [] ? memcg_kmem_get_cache+0x50/0x150
Jan 11 20:43:10 heisenberg kernel:  [] ? kmem_cache_alloc+0x122/0x530
Jan 11 20:43:10 heisenberg kernel:  [] ? sched_slice.isra.57+0x51/0xc0
Jan 11 20:43:10 heisenberg kernel:  [] ? update_cfs_rq_load_avg+0x200/0x4c0
Jan 11 20:43:10 heisenberg kernel:  [] ? task_rq_lock+0x46/0xa0
Jan 11 20:43:10 heisenberg kernel:  [] ? btrfs_ioctl+0x97c/0x2370 [btrfs]
Jan 11 20:43:10 heisenberg kernel:  [] ? enqueue_task_fair+0x5c/0x940
Jan 11 20:43:10 heisenberg kernel:  [] ? sched_clock+0x5/0x10
Jan 11 20:43:10 heisenberg kernel:  [] ? check_preempt_curr+0x50/0x90
Jan 11 20:43:10 heisenberg kernel:  [] ? wake_up_new_task+0x156/0x200
Jan 11 20:43:10 heisenberg kernel:  [] ? do_vfs_ioctl+0x9f/0x5f0
Jan 11 20:43:10 heisenberg kernel:  [] ? _do_fork+0x14d/0x3f0
Jan 11 20:43:10 heisenberg kernel:  [] ? SyS_ioctl+0x74/0x80
Jan 11 20:43:10 heisenberg kernel:  [] ? system_call_fast_compare_end+0xc/0x96
Jan 11 20:43:10 heisenberg kernel: ---[ end trace 3831b8afbd0cbc9e ]---
Jan 11 20:43:45 heisenberg kernel: BTRFS info (device dm-2): The free space cache file (7525348933632) is invalid. skip it


The send/receive seems to continue running...
Not sure if the free space cache file entry is related (btw: a btrfs
check directly before didn't find that error - actually, yet another
fsck directly before that brought a message that the super generation
and space cache generation would mismatch (or something like that) and
that it would be invalidated... so it's kinda strange that this happens
at all).


Cheers,
Chris.



Re: some free space cache corruptions

2016-12-28 Thread Christoph Anton Mitterer
On Mon, 2016-12-26 at 00:12 +, Duncan wrote:
> By themselves, free-space cache warnings are minor and not a serious 
> issue at all -- the cache is just that, a cache, designed to speed 
> operation but not actually necessary, and btrfs can detect and route 
> around space-cache corruption on-the-fly so by itself it's not a big
> deal.
Well... sure about that? Haven't we recently had that serious bug in
the FST, which could cause data corruption because btrfs used space as
free while it wasn't?


> These warnings are however hints that something out of the routine has
> happened
Which again just likely means that there was/is some bug in btrfs...
other than that, why should it suddenly get some corrupted cache, when
only ro-snapshots were removed in between?


> unless the filesystem itself, or a scrub, etc, has fixed things in
> the mean time.  (And as I said, the space-cache is only a cache, designed
> to speed things up, cache corruption is fairly common and btrfs can and
> does deal with it without issue.
When finishing the most recent backups, the fs in question got pretty
full, and the error message I'd spotted during btrfs check appeared in
the kernel log as well:
Dec 29 03:03:11 heisenberg kernel: BTRFS warning (device dm-1): block group 5431552376832 has wrong amount of free space
Dec 29 03:03:11 heisenberg kernel: BTRFS warning (device dm-1): failed to load free space cache for block group 5431552376832, rebuilding it now
(fs was NOT mounted with clear_cache)

Which implies it was now rebuilt.

However, after a subsquent fsck, the same error occurs there again:
# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 5431552376832 has wrong amount of free space
failed to load free space cache for block group 5431552376832
checking fs roots
checking csums
checking root refs
found 7571911602176 bytes used err is 0
total csum bytes: 7381752972
total tree bytes: 11145035776
total fs tree bytes: 2100396032
total extent tree bytes: 1137082368
btree space waste bytes: 996179488
file data blocks allocated: 7560766566400
 referenced 7681157672960
0


> 2) It recently came to the attention of the devs that the existing btrfs
> mount-option method of clearing the free-space cache only clears it for
> block-groups/chunks it encounters on-the-fly.  It doesn't do a systematic
> beginning-to-end clear.
So that calls for fixing the documentation as well?!



> 3) As a result of #2, the devs only very recently added support in btrfs
> check for a /full/ space-cache-v1 clear, using the new
> --clear-space-cache option.  But your btrfs-progs v4.7.3 is too old to
> support it.  I know it's in the v4.9 I just upgraded to... checking the
> wiki it appears the option was added in btrfs-progs v4.8.3 (v4.8.4 for
> v2 cache).

And is the new option stable?! ;-)
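
(Once on a new enough progs, I suppose the full clear would then be, with
the fs unmounted:

# btrfs check --clear-space-cache v1 /dev/mapper/data-a2

...at least that's how I read the wiki.)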

> Tho if you haven't recently run a scrub, I'd do that as well
Well, I did a full verification using my own checksums (i.e. every
regular file in the fs has a SHA512 sum attached as an XATTR)... since
that caused all data to be read, this should be equivalent to a scrub
(at least for the regular files' data, not necessarily the metadata),
shouldn't it?
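
(As a sketch, the verification does essentially the following, assuming
the sum is stored in a user xattr, here hypothetically named user.sha512:

find /mnt -type f -print0 | while IFS= read -r -d '' f; do
    stored=$(getfattr --only-values -n user.sha512 -- "$f" 2>/dev/null)
    actual=$(sha512sum -- "$f" | cut -d' ' -f1)
    [ "$stored" = "$actual" ] || echo "MISMATCH: $f"
done

so every byte of file data gets read, just as a scrub would read it.)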


Cheers,
Chris.



some free space cache corruptions

2016-12-25 Thread Christoph Anton Mitterer
Hey.

Had the following on a Debian sid:
Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02)
x86_64 GNU/Linux

btrfs-progs v4.7.3


I was doing a btrfs check of a rather big btrfs (8TB device, nearly
full), having many snapshots on it, all incrementally sent from another
8TB device, which in turn functions as the master copy:
# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 6805741969408 bytes used err is 0
total csum bytes: 6634558200
total tree bytes: 10292641792
total fs tree bytes: 2074869760
total extent tree bytes: 1100251136
btree space waste bytes: 885346193
file data blocks allocated: 6922343247872
 referenced 7040929374208
0

=> this already showed an unusual message:
cache and super generation don't match, space cache will be invalidated
Where does it come from?


Then I did some incremental send/receive (-p) from the other 8TB
master btrfs, and another fsck afterwards:

# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 7467006156800 bytes used err is 0
total csum bytes: 7279407560
total tree bytes: 11069603840
total fs tree bytes: 2127314944
total extent tree bytes: 1141342208
btree space waste bytes: 922662895
file data blocks allocated: 7599280926720
 referenced 7720960733184
0

=> all fine...



Afterwards I removed all ro-snapshots except the most recent one... and
repeated the fsck:
# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 5431552376832 has wrong amount of free space
failed to load free space cache for block group 5431552376832
checking fs roots
checking csums
checking root refs
found 7427361222656 bytes used err is 0
total csum bytes: 7240763996
total tree bytes: 10998038528
total fs tree bytes: 2100297728
total extent tree bytes: 1137065984
btree space waste bytes: 992708933
file data blocks allocated: 7416363184128
 referenced 7536754290688
0

=> Isn't that already some indication of a bug? Nothing happened but
the deletion of snapshots, and there is apparently some free space cache
corruption?


Then I tried the usual recipe:
mount /data/data-a/2/ -o clear_cache
kernel said:
Dec 25 22:14:17 heisenberg kernel: BTRFS info (device dm-2): force clearing of disk cache

...re-mounted rw, deleted some regular files... repeated the fsck, and
again:
# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 5431552376832 has wrong amount of free space
failed to load free space cache for block group 5431552376832
checking fs roots
checking csums
checking root refs
found 7427284213760 bytes used err is 0
total csum bytes: 7240689688
total tree bytes: 10997907456
total fs tree bytes: 2100281344
total extent tree bytes: 1137049600
btree space waste bytes: 992679805
file data blocks allocated: 7416286306304
 referenced 7536677412864
0

=> same error again...

Any ideas how to resolve this? And is this some serious error that
could have caused corruptions?


Cheers,
Chris.



csum errors during btrfs check

2016-12-23 Thread Christoph Anton Mitterer
Hey.

Had the following on a Debian sid:
Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02) x86_64 GNU/Linux

btrfs-progs v4.7.3


(It's not so long ago that I ran some longer memtest86+ on the
respective system. So memory should be ok.)


It was again a 8 TB SATA disk connected via USB3.
SMART values are all just okay; the disk has some 600 head-flying
hours, so not *that* extremely much.

When doing another round of btrfs check on it (I only did some
yesterday, after the events described in the "strange btrfs deadlock"
email I've sent to the list earlier today), everything was still okay.
This time, I got csum errors during the "checking extents" phase.
There were always two identical lines of csum errors printed (with the
same address, expected and actual csum).
I've aborted btrfs check (after some 10-15 pairs of csum errors),
repeated it... again csum errors.

Aborted it again, blockdev --setro'ed the device, mounted the fs and
did a find /mnt/ > /dev/null on it... that seemed to all work fine.

Unmounted and repeated the btrfs check... no errors this time (and I
let it complete)...

No messages/errors to the kernel log during the whole time.


Any ideas what that could mean? Are there any known bugs in 4.7.3?

Especially... if the csum errors might have occurred just because of
some electronic glitch in the SATA/USB bridge (okay, unlikely - I'd have
expected the lower bus levels to show such errors - but who knows):
what does btrfs check do if it encounters such csum errors? Does it try
to correct them (thereby possibly writing bad data, in my case)?

That the errors just went away isn't any less worrying... but I'd have
expected that, because of the blockdev --setro, there couldn't have been
any auto-repairs or the like which would have corrected the csums.
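
(For the record, the sequence was essentially:

# blockdev --setro /dev/sdX        # device names here just exemplary
# btrfs check /dev/mapper/data-X

and my understanding is that check doesn't write anything unless --repair
is given; the --setro was just to be extra sure.)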


Any advice on what I could/should do now? Scrubbing[0]? I'd rather
assume faulty hardware. The data on the device is backed up (mostly),
but it's still pretty precious.


Thanks,
Chris.

[0] I also have my own SHA512 sums of all files on disk (in XATTRs),
plus lists of the files that should be present (+/- some recent
changes to the fs)... so I can do really very accurate checks on whether
the data is fully ok.



strange btrfs deadlock

2016-12-22 Thread Christoph Anton Mitterer
Hey.

Had the following on a Debian sid:
Linux heisenberg 4.8.0-2-amd64 #1 SMP Debian 4.8.11-1 (2016-12-02)
x86_64 GNU/Linux


I was basically copying data between several filesystems all on SATA
disks attached via USB.

Unfortunately I have only little diagnostic data...



The first part may be totally unrelated... here I was doing some
recursive diff between data on sdb and sdc (both mounted ro), when I
connected a 3rd disk to the same USB 3.0 hub to which the other two
disks were already connected.

That somehow made sdc fail... (interestingly, sdb seemed to continue
working).

Dec 23 04:36:04 heisenberg kernel: [38080.618202] BTRFS info (device dm-1): disk space caching is enabled
Dec 23 04:36:18 heisenberg kernel: [38093.903212] bash (7006): drop_caches: 3
Dec 23 04:58:44 heisenberg kernel: [39440.832610] scsi host7: uas_pre_reset: timed out
Dec 23 04:58:44 heisenberg kernel: [39440.832760] sd 7:0:0:0: [sdc] tag#4 uas_zap_pending 0 uas-tag 5 inflight: CMD 
Dec 23 04:58:44 heisenberg kernel: [39440.832767] sd 7:0:0:0: [sdc] tag#4 CDB: Read(10) 28 00 3f 03 45 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832777] sd 7:0:0:0: [sdc] tag#5 uas_zap_pending 0 uas-tag 6 inflight: CMD 
Dec 23 04:58:44 heisenberg kernel: [39440.832780] sd 7:0:0:0: [sdc] tag#5 CDB: Read(10) 28 00 3f 03 49 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832785] sd 7:0:0:0: [sdc] tag#6 uas_zap_pending 0 uas-tag 7 inflight: CMD 
Dec 23 04:58:44 heisenberg kernel: [39440.832788] sd 7:0:0:0: [sdc] tag#6 CDB: Read(10) 28 00 3f 03 4d 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832792] sd 7:0:0:0: [sdc] tag#8 uas_zap_pending 0 uas-tag 9 inflight: CMD 
Dec 23 04:58:44 heisenberg kernel: [39440.832796] sd 7:0:0:0: [sdc] tag#8 CDB: Read(10) 28 00 3f 03 51 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832858] sd 7:0:0:0: [sdc] tag#4 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Dec 23 04:58:44 heisenberg kernel: [39440.832864] sd 7:0:0:0: [sdc] tag#4 CDB: Read(10) 28 00 3f 03 45 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832870] blk_update_request: I/O error, dev sdc, sector 1057178952
Dec 23 04:58:44 heisenberg kernel: [39440.832917] sd 7:0:0:0: [sdc] tag#5 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Dec 23 04:58:44 heisenberg kernel: [39440.832921] sd 7:0:0:0: [sdc] tag#5 CDB: Read(10) 28 00 3f 03 49 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832924] blk_update_request: I/O error, dev sdc, sector 1057179976
Dec 23 04:58:44 heisenberg kernel: [39440.832937] BTRFS error (device dm-2): bdev /dev/mapper/data-c errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Dec 23 04:58:44 heisenberg kernel: [39440.832959] sd 7:0:0:0: [sdc] tag#6 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Dec 23 04:58:44 heisenberg kernel: [39440.832963] sd 7:0:0:0: [sdc] tag#6 CDB: Read(10) 28 00 3f 03 4d 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832966] blk_update_request: I/O error, dev sdc, sector 1057181000
Dec 23 04:58:44 heisenberg kernel: [39440.832980] sd 7:0:0:0: [sdc] tag#8 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Dec 23 04:58:44 heisenberg kernel: [39440.832985] sd 7:0:0:0: [sdc] tag#8 CDB: Read(10) 28 00 3f 03 51 48 00 04 00 00
Dec 23 04:58:44 heisenberg kernel: [39440.832988] blk_update_request: I/O error, dev sdc, sector 1057182024
Dec 23 04:58:44 heisenberg kernel: [39440.832995] BTRFS error (device dm-2): bdev /dev/mapper/data-c errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
Dec 23 04:58:44 heisenberg kernel: [39440.833807] sd 7:0:0:0: [sdc] Synchronizing SCSI cache
Dec 23 04:58:45 heisenberg kernel: [39441.072663] sd 7:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Dec 23 04:58:45 heisenberg kernel: [39441.096973] usb 4-2.4: Disable of device-initiated U1 failed.
Dec 23 04:58:45 heisenberg kernel: [39441.100670] usb 4-2.4: Disable of device-initiated U2 failed.
Dec 23 04:58:45 heisenberg kernel: [39441.107663] usb 4-2.4: Set SEL for device-initiated U1 failed.
Dec 23 04:58:45 heisenberg kernel: [39441.55] usb 4-2.4: Set SEL for device-initiated U2 failed.
Dec 23 04:58:45 heisenberg kernel: [39441.188752] usb 4-2.4: reset SuperSpeed USB device number 4 using xhci_hcd
Dec 23 04:58:45 heisenberg kernel: [39441.225703] scsi host8: uas
Dec 23 04:58:45 heisenberg kernel: [39441.227043] scsi 8:0:0:0: Direct-Access     Seagate  Expansion0636 PQ: 0 ANSI: 6
Dec 23 04:58:45 heisenberg kernel: [39441.429443] sd 8:0:0:0: Attached scsi generic sg2 type 0
Dec 23 04:58:45 heisenberg kernel: [39441.429572] sd 8:0:0:0: [sdd] 3907029167 512-byte logical blocks: (2.00 TB/1.82 TiB)
Dec 23 04:58:45 heisenberg kernel: [39441.430756] sd 8:0:0:0: [sdd] Write Protect is off
Dec 23 04:58:45 heisenberg kernel: [39441.430764] sd 8:0:0:0: [sdd] Mode Sense: 2b 00 10 08
Dec 23 04:58:45 heisenberg kernel: [39441.431593] sd 8:0:0:0: [sdd] Write cache: enabled, read 
