[PATCH] fstests: btrfs/004: increase the buffer size of logical-resolve to the maximum value 64K

2018-03-05 Thread Lu Fengqi
Because of commit e76e13ce8c0b ("fsstress: implement the
clonerange/deduperange ioctls"), dedupe makes the number of references to
the same extent item increase so much that the default 4K buffer of
logical-resolve is no longer sufficient.

Signed-off-by: Lu Fengqi 
---
 tests/btrfs/004 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/btrfs/004 b/tests/btrfs/004
index de583cc355d4..0d2efb91dba7 100755
--- a/tests/btrfs/004
+++ b/tests/btrfs/004
@@ -103,7 +103,7 @@ _btrfs_inspect_addr()
    expect_addr=$3
    expect_inum=$4
    file=$5
-   cmd="$BTRFS_UTIL_PROG inspect-internal logical-resolve -P $addr $mp"
+   cmd="$BTRFS_UTIL_PROG inspect-internal logical-resolve -s 65536 -P $addr $mp"
    echo "# $cmd" >> $seqres.full
    out=`$cmd`
    echo "$out" >> $seqres.full
-- 
2.16.2
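
For a rough sense of why 4K stops being enough once dedupe multiplies the references, the sketch below estimates how many references fit into the logical-resolve buffer at 4K and at 64K. It assumes a 16-byte result header followed by three 64-bit values (inode, offset, root) per reference; that layout is an assumption about the ioctl's data container and is not stated in this patch, so treat the numbers as illustrative only.

```python
# Rough capacity estimate for the logical-resolve result buffer.
# ASSUMPTION: 16-byte container header, then 3 x u64 (inum, offset, root)
# per reference. Neither value comes from the patch itself.
HEADER_BYTES = 16
BYTES_PER_REF = 3 * 8


def refs_that_fit(buf_size: int) -> int:
    """Number of (inum, offset, root) triples that fit in buf_size bytes."""
    return max(0, (buf_size - HEADER_BYTES) // BYTES_PER_REF)


default_capacity = refs_that_fit(4096)    # buffer size before the patch
maximum_capacity = refs_that_fit(65536)   # buffer size passed via -s 65536
print(default_capacity, maximum_capacity)
```

Under these assumptions the larger buffer holds roughly sixteen times as many references as the default.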



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ERROR: unsupported checksum algorithm 35355

2018-03-05 Thread Ken Swenson
Hi Qu,

attached is the binary super block as requested.

Thank you,

Ken


[PATCH] btrfs-progs: dump-super: Don't verify csum if csum type or size is unknown

2018-03-05 Thread Qu Wenruo
Reported-by: Ken Swenson 
Signed-off-by: Qu Wenruo 
---
 cmds-inspect-dump-super.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/cmds-inspect-dump-super.c b/cmds-inspect-dump-super.c
index 150c2e5aedf4..85bff262ad85 100644
--- a/cmds-inspect-dump-super.c
+++ b/cmds-inspect-dump-super.c
@@ -339,7 +339,9 @@ static void dump_superblock(struct btrfs_super_block *sb, int full)
    printf("csum\t\t\t0x");
    for (i = 0, p = sb->csum; i < csum_size; i++)
        printf("%02x", p[i]);
-   if (check_csum_sblock(sb, csum_size))
+   if (csum_type != BTRFS_CSUM_TYPE_CRC32 || csum_size != BTRFS_CRC32_SIZE)
+       printf(" [UNKNOWN CSUM TYPE OR SIZE]");
+   else if (check_csum_sblock(sb, csum_size))
        printf(" [match]");
    else
        printf(" [DON'T MATCH]");
-- 
2.16.2
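
The decision order introduced by the patch can be modelled outside of btrfs-progs. The following is a minimal sketch, not the btrfs-progs implementation: it assumes that the checksum is a little-endian CRC-32C over bytes 32..4095 of the superblock and that csum_type is a little-endian 16-bit field at offset 0xC4. Both are assumptions taken from the commonly documented on-disk layout, and the helper names are made up for illustration.

```python
import struct

SUPER_INFO_SIZE = 4096    # size of one superblock copy
CSUM_FIELD_SIZE = 32      # bytes reserved for the checksum at offset 0
CSUM_TYPE_OFFSET = 0xC4   # ASSUMPTION: offset of the le16 csum_type field
CSUM_TYPE_CRC32 = 0
CRC32_SIZE = 4


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def csum_status(sb: bytes, csum_size: int) -> str:
    """Follow the patched order: never verify an unknown type or size."""
    (csum_type,) = struct.unpack_from("<H", sb, CSUM_TYPE_OFFSET)
    if csum_type != CSUM_TYPE_CRC32 or csum_size != CRC32_SIZE:
        return "[UNKNOWN CSUM TYPE OR SIZE]"
    expected = struct.pack("<I", crc32c(sb[CSUM_FIELD_SIZE:SUPER_INFO_SIZE]))
    return "[match]" if sb[:CRC32_SIZE] == expected else "[DON'T MATCH]"
```

With a csum_type of 35355 and a csum_size of 32, as in the reported superblock, this model returns the "unknown" status instead of claiming a match.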



Re: ERROR: unsupported checksum algorithm 35355

2018-03-05 Thread Qu Wenruo


On 2018-03-06 09:51, Ken Swenson wrote:
> Hello,
>  
> Somehow it appears the csum_type on my btrfs file system got corrupted.
> I cannot mount the system in recovery or read only. btrfs check just
> returns "ERROR: unsupported checksum algorithm 35355" as well as btrfs
> recover. The only command I was able to successfully run was btrfs
> inspect-internal dump-super, which I've pasted the output at the end of
> this message.
> 
> Requested information from the wiki:
> Linux 4.15.7-1-ARCH #1 SMP PREEMPT Wed Feb 28 19:01:57 UTC 2018 x86_64
> GNU/Linux
> btrfs-progs v4.15.1
> btrfs fi show: ERROR: unsupported checksum algorithm 35355 ERROR: cannot
> scan /dev/mapper/x: Input/output error
> 
> dmesg:
> [ 11.232399] Btrfs loaded, crc32c=crc32c-intel
> [ 11.233229] BTRFS: device fsid 513f07e1-08be-4d94-a55c-11c6251f6c2f
> devid 1 transid  /dev/dm-2
> [ 488.372891] BTRFS error (device dm-2): unsupported checksum algorithm
> 35355
> [ 488.372894] BTRFS error (device dm-2): superblock checksum mismatch
> [ 488.384902] BTRFS error (device dm-2): open_ctree failed
> 
> Is there anything I can do to recover from this, or am I out of luck? To
> give some background, the disk was working fine until I upgraded to
> kernel 4.15.7 and rebooted.
> 
> superblock: bytenr=65536, device=/dev/mapper/x
> -
> csum_type        35355 (INVALID)

This is obviously corrupted.

Btrfs only supports csum_type 0 (CRC32) so far.

And the value seems to be some garbage.

> csum_size        32

So is the csum size.

> csum            0xf0dbeddd [match]

Surprised to see the csum even matched.

> bytenr            65536
> flags            0x1
>             ( WRITTEN )
> magic            _BHRfS_M [match]
> fsid            513f07e1-08be-4d94-a55c-11c6251f6c2f
> label           
> generation        
> root            186466304
> sys_array_size        129
> chunk_root_generation    7450
> root_level        1
> chunk_root        21004288
> chunk_root_level    1
> log_root        0
> log_root_transid    0
> log_root_level        0
> total_bytes        5000947523584
> bytes_used        420849201152
> sectorsize        4096
> nodesize        16384
> leafsize (deprecated)        16384
> stripesize        4096
> root_dir        6
> num_devices        1
> compat_flags        0x0
> compat_ro_flags        0x0
> incompat_flags        0x176d2169
>             ( MIXED_BACKREF |
>               COMPRESS_LZO |
>               BIG_METADATA |
>               EXTENDED_IREF |
>               SKINNY_METADATA |
>               unknown flag: 0x176d2000 )

And unknown flags also exist.
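
The split between known and unknown bits in 0x176d2169 can be reproduced with a small mask calculation. This is a sketch only: the bit assignments below are written from memory of the kernel's incompat flag definitions of that era (bits 0 through 9) and should be checked against the headers before being relied on.

```python
# ASSUMED incompat flag bits (bit 0 .. bit 9); verify against the kernel headers.
INCOMPAT_FLAGS = {
    0x001: "MIXED_BACKREF",
    0x002: "DEFAULT_SUBVOL",
    0x004: "MIXED_GROUPS",
    0x008: "COMPRESS_LZO",
    0x010: "COMPRESS_ZSTD",
    0x020: "BIG_METADATA",
    0x040: "EXTENDED_IREF",
    0x080: "RAID56",
    0x100: "SKINNY_METADATA",
    0x200: "NO_HOLES",
}


def decode_incompat(flags: int):
    """Return (names of known flags that are set, remaining unknown bits)."""
    known = [name for bit, name in sorted(INCOMPAT_FLAGS.items()) if flags & bit]
    unknown = flags & ~sum(INCOMPAT_FLAGS)
    return known, unknown


known, unknown = decode_incompat(0x176D2169)
print(known, hex(unknown))
```

Only the low bits 0x169 decode to flag names; everything above them is left over as the unknown remainder that dump-super reports.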

And according to later output, all backup super blocks have the same
corruption while csum still matches.

I'm wondering if the memory is corrupted.

It's possible to manually modify the superblock back to a valid state,
since all the corruption is obvious, but I'm not 100% sure that there is
no other corruption.

Please provide the binary superblock dump by:

dd if= bs=1 count=4K skip=64K of=super_copy
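
With bs=1, the dd invocation skips 65536 bytes and then copies 4096 bytes, i.e. exactly the primary superblock. The sketch below performs the same read and also lists where the superblock copies are expected; the mirror offsets of 64 MiB and 256 GiB are an assumption from the usual btrfs layout, not something stated in this thread, and the function names are hypothetical.

```python
PRIMARY_OFFSET = 64 * 1024   # matches skip=64K with bs=1
SUPER_SIZE = 4096            # matches count=4K with bs=1

# ASSUMPTION: superblock mirrors live at 64 MiB and 256 GiB.
MIRROR_OFFSETS = [64 * 1024**2, 256 * 1024**3]


def superblock_offsets(total_bytes: int):
    """Offsets of all superblock copies that fit on a device of this size."""
    candidates = [PRIMARY_OFFSET] + MIRROR_OFFSETS
    return [off for off in candidates if off + SUPER_SIZE <= total_bytes]


def read_superblock(path: str, offset: int = PRIMARY_OFFSET) -> bytes:
    """Read one 4 KiB superblock copy, like the dd command does."""
    with open(path, "rb") as dev:
        dev.seek(offset)
        return dev.read(SUPER_SIZE)


# total_bytes as reported by dump-super for this file system
print(superblock_offsets(5000947523584))
```

Passing one of the mirror offsets to read_superblock() would allow the backup copies to be dumped and compared in the same way.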

Thanks,
Qu


ERROR: unsupported checksum algorithm 35355

2018-03-05 Thread Ken Swenson
Hello,
 
Somehow it appears the csum_type on my btrfs file system got corrupted.
I cannot mount the file system, either in recovery mode or read-only. btrfs
check just returns "ERROR: unsupported checksum algorithm 35355", as does
btrfs recover. The only command I was able to run successfully was btrfs
inspect-internal dump-super, whose output I've pasted at the end of
this message.

Requested information from the wiki:
Linux 4.15.7-1-ARCH #1 SMP PREEMPT Wed Feb 28 19:01:57 UTC 2018 x86_64
GNU/Linux
btrfs-progs v4.15.1
btrfs fi show: ERROR: unsupported checksum algorithm 35355 ERROR: cannot
scan /dev/mapper/x: Input/output error

dmesg:
[ 11.232399] Btrfs loaded, crc32c=crc32c-intel
[ 11.233229] BTRFS: device fsid 513f07e1-08be-4d94-a55c-11c6251f6c2f
devid 1 transid  /dev/dm-2
[ 488.372891] BTRFS error (device dm-2): unsupported checksum algorithm
35355
[ 488.372894] BTRFS error (device dm-2): superblock checksum mismatch
[ 488.384902] BTRFS error (device dm-2): open_ctree failed

Is there anything I can do to recover from this, or am I out of luck? To
give some background, the disk was working fine until I upgraded to
kernel 4.15.7 and rebooted.

superblock: bytenr=65536, device=/dev/mapper/x
-
csum_type        35355 (INVALID)
csum_size        32
csum            0xf0dbeddd [match]
bytenr            65536
flags            0x1
            ( WRITTEN )
magic            _BHRfS_M [match]
fsid            513f07e1-08be-4d94-a55c-11c6251f6c2f
label           
generation        
root            186466304
sys_array_size        129
chunk_root_generation    7450
root_level        1
chunk_root        21004288
chunk_root_level    1
log_root        0
log_root_transid    0
log_root_level        0
total_bytes        5000947523584
bytes_used        420849201152
sectorsize        4096
nodesize        16384
leafsize (deprecated)        16384
stripesize        4096
root_dir        6
num_devices        1
compat_flags        0x0
compat_ro_flags        0x0
incompat_flags        0x176d2169
            ( MIXED_BACKREF |
              COMPRESS_LZO |
              BIG_METADATA |
              EXTENDED_IREF |
              SKINNY_METADATA |
              unknown flag: 0x176d2000 )
cache_generation    
uuid_tree_generation    
dev_item.uuid        880c692f-5270-4c7a-908d-8b3956fb3790
dev_item.fsid        513f07e1-08be-4d94-a55c-11c6251f6c2f [match]
dev_item.type        0
dev_item.total_bytes    5000947523584
dev_item.bytes_used    457439182848
dev_item.io_align    4096
dev_item.io_width    4096
dev_item.sector_size    4096
dev_item.devid        1
dev_item.dev_group    0
dev_item.seek_speed    0
dev_item.bandwidth    0
dev_item.generation    0
sys_chunk_array[2048]:
    item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 20971520)
        length 8388608 owner 2 stripe_len 65536 type SYSTEM|DUP
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 2 sub_stripes 0
            stripe 0 devid 1 offset 20971520
            dev_uuid 880c692f-5270-4c7a-908d-8b3956fb3790
            stripe 1 devid 1 offset 29360128
            dev_uuid 880c692f-5270-4c7a-908d-8b3956fb3790
backup_roots[4]:
    backup 0:
        backup_tree_root:    181174272    gen: 7774    level: 1
        backup_chunk_root:    21004288    gen: 7450    level: 1
        backup_extent_root:    181190656    gen: 7774    level: 2
        backup_fs_root:        809828352    gen: 6404    level: 0
        backup_dev_root:    93585408    gen: 7631    level: 1
        backup_csum_root:    181862400    gen: 7774    level: 2
        backup_total_bytes:    5000947523584
        backup_bytes_used:    420849201152
        backup_num_devices:    1

    backup 1:
        backup_tree_root:    183828480    gen: 7775    level: 1
        backup_chunk_root:    21004288    gen: 7450    level: 1
        backup_extent_root:    183320576    gen: 7775    level: 2
        backup_fs_root:        809828352    gen: 6404    level: 0
        backup_dev_root:    183975936    gen: 7775    level: 1
        backup_csum_root:    185761792    gen: 7775    level: 2
        backup_total_bytes:    5000947523584
        backup_bytes_used:    420849201152
        backup_num_devices:    1

    backup 2:
        backup_tree_root:    186810368    gen: 7776    level: 1
        backup_chunk_root:    21004288    gen: 7450    level: 1
        backup_extent_root:    186613760    gen: 7776    level: 2
        backup_fs_root:        809828352    gen: 6404    level: 0
        backup_dev_root:    183975936    gen: 7775    level: 1
        backup_csum_root:    187056128    gen: 7776    level: 2
        backup_total_bytes:    5000947523584
        backup_bytes_used:    420849201152
        backup_num_devices:    1

    backup 3:
        backup_tree_root:    186466304    gen:     level: 1
        backup_chunk_root:    21004288    gen: 7450    level: 1
        backup_extent_root:    187908096  

Re: spurious full btrfs corruption

2018-03-05 Thread Qu Wenruo


On 2018-03-06 08:57, Christoph Anton Mitterer wrote:
> Hey Qu.
> 
> On Thu, 2018-03-01 at 09:25 +0800, Qu Wenruo wrote:
>>> - For my personal data, I have one[0] Seagate 8 TB SMR HDD, which I
>>>   backup (send/receive) on two further such HDDs (all these are
>>>   btrfs), and (rsync) on one further with ext4.
>>>   These files have all their SHA512 sums attached as XATTRs, which
>>> I
>>>   regularly test. So I think I can be pretty sure, that there was
>>> never
>>>   a case of silent data corruption and the RAM on the E782 is fine.
>>
>> Good backup practice can't be even better.
> 
> Well I still would want to add something tape and/or optical based
> solution...
> But having this depends a bit on having a good way to do incremental
> backups, i.e. I wouldn't want to write full copies of everything to
> tape/BluRay over and over again, but just the actually added data and
> records of metadata changes.
> The former (adding just added files is rather easy), but still
> recording any changes in metadata (moved/renamed/deleted files, changes
> in file dates, permissions, XATTRS etc.).
> Also I would always want to backup complete files, so not just changes
> to a file, even if just one byte changed of a 4 GiB file... and not
> want to have files split over mediums.
> 
> send/receive sounds like a candidate for this (except it works only on
> changes, not full files), but I would prefer to have everything in a
> standard format like tar which one can rather easily recover manually
> if there are failures in the backups.
> 
> 
> Another missing piece is a tool which (at my manual order) adds hash
> sums to the files, and which can verify them
> Actually I wrote such a tool already, but as shell script and it simply
> forks so often, that it became extremely slow at millions of small
> files.
> I often found it so useful to have that kind of checksumming in
> addition to the kind of checksumming e.g. btrfs does which is not at
> the level of whole files.
> So if something goes wrong like now, I cannot only verify whether
> single extents are valid, but also the chain of them that comprises a
> file.. and that just for the point where I defined "now, as it is, the
> file is valid",.. and automatically on any writes, as it would be done
> at file system level checksumming.
> In the current case,... for many files where I had such whole-file-
> csums, verifying whether what btrfs-restore gave me was valid or not,
> was very easy because of them.
> 
> 
>> Normally I won't blame memory unless strange behavior happens, from
>> unexpected freeze to strange kernel panic.
> Me neither... I think bad RAM happens rather rarely these days but
> my case may actually be one.
> 
> 
>> Netconsole would help here, especially when U757 has an RJ45.
>> As long as you have another system which is able to run nc, it should
>> catch any kernel message, and help us to analyse if it's really a
>> memory
>> corruption.
> Ah thanks... I wasn't even aware of that ^^
> I'll have a look at it when I start inspecting the U757 again in the
> next weeks.
> 
> 
>>> - The notebooks SSD is a Samsung SSD 850 PRO 1TB, the same which I
>>>   already used with the old notebook.
>>>   A long SMART check after the corruption, brought no errors.
>>
>> Also using that SSD with smaller capacity, it's less possible for the
>> SSD.
> Sorry, what do you mean? :)

I'm using the same SSD model myself (with a smaller capacity).
So unless something strange happened, I wouldn't blame the SSD.

> 
> 
>> Normally I won't blame memory, but even newly created btrfs, without
>> any
>> powerloss, it still reports csum error, then it maybe the problem.
> That was also my idea...
> I may mix up things, but I think I even found a csum error later on the
> rescue USB stick (which is also btrfs)... would need to double check
> that, though.
> 
>>> - So far, in the data I checked (which as I've said, excludes a
>>> lot,..
>>>   especially the QEMU images)
>>>   I found only few cases, where the data I got from btrfs restore
>>> was
>>>   really bad.
>>>   Namely, two MP3 files. Which were equal to their backup
>>> counterparts,
>>>   but just up to some offset... and the rest of the files were just
>>>   missing.
>>
>> Offset? Is that offset aligned to 4K?
>> Or some strange offset?
> 
> These were the two files:
> -rw-r--r-- 1 calestyo calestyo   90112 Feb 22 16:46 'Lady In The Water/05.mp3'
> -rw-r--r-- 1 calestyo calestyo 4892407 Feb 27 23:28 '/home/calestyo/share/music/Lady In The Water/05.mp3'
> 
> 
> -rw-r--r-- 1 calestyo calestyo 1904640 Feb 22 16:47 'The Hunt For Red October [Intrada]/21.mp3'
> -rw-r--r-- 1 calestyo calestyo 2968128 Feb 27 23:28 '/home/calestyo/share/music/The Hunt For Red October [Intrada]/21.mp3'
> 
> with the former (smaller one) being the corrupted one (i.e. the one
> returned by btrfs-restore).
> 
> Both are (in terms of filesize) multiples of 4096... what does that
> mean now?

That means we lost either some file extents or some inode items.
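
The alignment of the four sizes quoted above can be checked with a simple divmod. This is only a sketch; it assumes a 4096-byte sector size, which is the "4K" asked about but is not confirmed for this particular file system.

```python
SECTOR_SIZE = 4096  # ASSUMPTION: the 4K sector size asked about above

sizes = {
    "Lady In The Water/05.mp3 (restored)": 90112,
    "Lady In The Water/05.mp3 (backup)": 4892407,
    "The Hunt For Red October [Intrada]/21.mp3 (restored)": 1904640,
    "The Hunt For Red October [Intrada]/21.mp3 (backup)": 2968128,
}


def sector_split(size: int):
    """Return (whole sectors, leftover bytes) for a file size."""
    return divmod(size, SECTOR_SIZE)


for name, size in sizes.items():
    sectors, rest = sector_split(size)
    state = "aligned" if rest == 0 else f"{rest} bytes past a sector boundary"
    print(f"{name}: {sectors} sectors, {state}")
```

Both restored files end exactly on a sector boundary while their backup counterparts do not, which fits files that were cut off at an extent boundary rather than at an arbitrary byte.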


Re: spurious full btrfs corruption

2018-03-05 Thread Christoph Anton Mitterer
Hey Qu.

On Thu, 2018-03-01 at 09:25 +0800, Qu Wenruo wrote:
> > - For my personal data, I have one[0] Seagate 8 TB SMR HDD, which I
> >   backup (send/receive) on two further such HDDs (all these are
> >   btrfs), and (rsync) on one further with ext4.
> >   These files have all their SHA512 sums attached as XATTRs, which
> > I
> >   regularly test. So I think I can be pretty sure, that there was
> > never
> >   a case of silent data corruption and the RAM on the E782 is fine.
> 
> Good backup practice can't be even better.

Well, I would still want to add a tape- and/or optical-based
solution...
But that depends a bit on having a good way to do incremental
backups, i.e. I wouldn't want to write full copies of everything to
tape/BluRay over and over again, but just the actually added data and
records of metadata changes.
The former (adding just the added files) is rather easy, but any changes
in metadata (moved/renamed/deleted files, changes in file dates,
permissions, XATTRs etc.) would still need to be recorded.
Also, I would always want to back up complete files, not just the changes
to a file, even if just one byte of a 4 GiB file changed... and I would
not want files split across media.

send/receive sounds like a candidate for this (except it works only on
changes, not full files), but I would prefer to have everything in a
standard format like tar which one can rather easily recover manually
if there are failures in the backups.


Another missing piece is a tool which (on my manual request) adds hash
sums to the files, and which can verify them.
Actually I already wrote such a tool, but as a shell script, and it
forks so often that it became extremely slow with millions of small
files.
I have often found it very useful to have that kind of checksumming in
addition to the kind of checksumming that e.g. btrfs does, which is not at
the level of whole files.
So if something goes wrong like now, I can verify not only whether
single extents are valid, but also the chain of them that comprises a
file... and that just for the point where I defined "now, as it is, the
file is valid",.. and automatically on any writes, as it would be done
at file system level checksumming.
In the current case, for many files where I had such whole-file
csums, verifying whether what btrfs-restore gave me was valid or not
was very easy because of them.


> Normally I won't blame memory unless strange behavior happens, from
> unexpected freeze to strange kernel panic.
Me neither... I think bad RAM happens rather rarely these days, but
my case may actually be one.


> Netconsole would help here, especially when U757 has an RJ45.
> As long as you have another system which is able to run nc, it should
> catch any kernel message, and help us to analyse if it's really a
> memory
> corruption.
Ah thanks... I wasn't even aware of that ^^
I'll have a look at it when I start inspecting the U757 again in the
next weeks.


> > - The notebooks SSD is a Samsung SSD 850 PRO 1TB, the same which I
> >   already used with the old notebook.
> >   A long SMART check after the corruption, brought no errors.
> 
> Also using that SSD with smaller capacity, it's less possible for the
> SSD.
Sorry, what do you mean? :)


> Normally I won't blame memory, but even newly created btrfs, without
> any
> powerloss, it still reports csum error, then it maybe the problem.
That was also my idea...
I may mix up things, but I think I even found a csum error later on the
rescue USB stick (which is also btrfs)... would need to double check
that, though.

> > - So far, in the data I checked (which as I've said, excludes a
> > lot,..
> >   especially the QEMU images)
> >   I found only few cases, where the data I got from btrfs restore
> > was
> >   really bad.
> >   Namely, two MP3 files. Which were equal to their backup
> > counterparts,
> >   but just up to some offset... and the rest of the files were just
> >   missing.
> 
> Offset? Is that offset aligned to 4K?
> Or some strange offset?

These were the two files:
-rw-r--r-- 1 calestyo calestyo   90112 Feb 22 16:46 'Lady In The Water/05.mp3'
-rw-r--r-- 1 calestyo calestyo 4892407 Feb 27 23:28 '/home/calestyo/share/music/Lady In The Water/05.mp3'


-rw-r--r-- 1 calestyo calestyo 1904640 Feb 22 16:47 'The Hunt For Red October [Intrada]/21.mp3'
-rw-r--r-- 1 calestyo calestyo 2968128 Feb 27 23:28 '/home/calestyo/share/music/The Hunt For Red October [Intrada]/21.mp3'

with the former (smaller one) being the corrupted one (i.e. the one
returned by btrfs-restore).

Both are (in terms of filesize) multiples of 4096... what does that
mean now?


> > - Especially recovering the VM images will take up some longer
> > time...
> >   (I think I cannot really trust what came out from the btrfs restore
> >   here, since these already brought csum errs before)

In the meantime I had a look of the remaining files that I got from the
btrfs-restore (haven't run it again so far, from the OLD notebook, so
only the results from the NEW notebook 

Re: [PATCH] btrfs: adjust return values of btrfs_inode_by_name

2018-03-05 Thread David Sterba
On Mon, Mar 05, 2018 at 05:13:37PM +0800, Su Yue wrote:
> Previously, btrfs_inode_by_name() returns 0 which leaves caller to
> check objectid of location even location type is invalid.
> 
> Let btrfs_inode_by_name() returns -EUCLEAN if found corrupted
> location of a dir entry.
> Removal of label out_err also simplifies the function.
> 
> Signed-off-by: Su Yue 

Reviewed-by: David Sterba 

> ---
>  fs/btrfs/inode.c | 22 ++
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 53ca025655fc..c7155f9d7c6f 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -5448,7 +5448,8 @@ void btrfs_evict_inode(struct inode *inode)
>  
>  /*
>   * this returns the key found in the dir entry in the location pointer.
> - * If no dir entries were found, location->objectid is 0.
> + * If no dir entries were found, returns -ENOENT.
> + * If found a corrupted location in dir entry, returns -EUCLEAN.
>   */
>  static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
>  struct btrfs_key *location)
> @@ -5466,27 +5467,27 @@ static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
>  
>   di = btrfs_lookup_dir_item(NULL, root, path, btrfs_ino(BTRFS_I(dir)),
>   name, namelen, 0);
> - if (IS_ERR(di))
> + if (unlikely(!di)) {

unlikely() is not needed here, so I removed it.


Re: [PATCH] btrfs-progs: free-space-cache: Use DIV_ROUND_UP() to replace open code

2018-03-05 Thread David Sterba
On Mon, Mar 05, 2018 at 04:09:12PM +0800, Qu Wenruo wrote:
> Signed-off-by: Qu Wenruo 

Applied, thanks.


Re: [PATCH v3] btrfs: verify max_inline mount parameter

2018-03-05 Thread David Sterba
On Thu, Mar 01, 2018 at 07:51:23PM +0800, Anand Jain wrote:
> >> @@ -605,7 +605,14 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
> >>    case Opt_max_inline:
> >>        num = match_strdup(&args[0]);
> >>        if (num) {
> >> -          info->max_inline = memparse(num, NULL);
> >> +          char *retptr;
> >> +
> >> +          info->max_inline = memparse(num, &retptr);
> > 
> > I missed it in the patch that changed max_inline to u32: memparse
> > returns unsigned long long, so this is not entirely correct and requires
> > a temporary variable.
> > 
> > We should also report if the user-specified value is larger than
> > BTRFS_MAX_METADATA_BLOCKSIZE .
> 
> (Got diverted into something else. Sorry for the delay.)
> 
> Currently -o max_inline can only be up to sectorsize.
> 
> We have MAX_INLINE_EXTENT_BUFFER_SIZE which is 64K and is equal to 
> BTRFS_MAX_METADATA_BLOCKSIZE (also 64K)

BTRFS_MAX_METADATA_BLOCKSIZE is the limit of nodesize,
MAX_INLINE_EXTENT_BUFFER_SIZE exists only for sanity checking.


> I didn't get the point that max_inline is limited by sector size in the 
> current design. Any idea?

The sectorsize is currently required to be equal to the page size, so the
inline file size is effectively limited by that.
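
The two points raised in this thread, parsing into a wide temporary before narrowing to 32 bits and capping the result at sectorsize, can be illustrated with a small model. This is a sketch in Python, not kernel code: parse_size() is a simplified, hypothetical stand-in for memparse() that accepts only decimal digits with an optional K/M/G/T suffix, and rejecting an overflowing value rather than clamping it is a policy choice made here for illustration.

```python
import re

U32_MAX = 0xFFFFFFFF
_SUFFIX_SHIFT = {"": 0, "K": 10, "M": 20, "G": 30, "T": 40}


def parse_size(text: str) -> int:
    """Simplified stand-in for memparse(): digits plus optional K/M/G/T."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if not match:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) << _SUFFIX_SHIFT[match.group(2)]


def checked_max_inline(text: str, sectorsize: int = 4096) -> int:
    """Validate in a wide temporary first, then cap at sectorsize."""
    value = parse_size(text)        # wide, like unsigned long long
    if value > U32_MAX:
        raise ValueError("max_inline does not fit in 32 bits")
    return min(value, sectorsize)   # inline size is limited by sectorsize


# What a blind assignment to a 32-bit field would keep of "4G":
truncated = parse_size("4G") & U32_MAX
print(truncated, checked_max_inline("8K"), checked_max_inline("2048"))
```

The truncated value shows why the temporary matters: 4G silently becomes 0 once only the low 32 bits are kept.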


Re: [PATCH v2] btrfs: rename btrfs_close_extra_device to btrfs_free_extra_devids

2018-03-05 Thread David Sterba
On Tue, Feb 27, 2018 at 12:41:59PM +0800, Anand Jain wrote:
> This function btrfs_close_extra_devices() is about freeing
> extra devids which once it may have belonged to this fsid.
> So rename it and add the comment. The _devid suffix is
> appropriate as this function won't handle devices which are
> outside of the fsid being mounted.
> 
> Signed-off-by: Anand Jain 

Added to next, thanks.


Re: [PATCH] btrfs: Remove root argument from cow_file_range_inline

2018-03-05 Thread David Sterba
On Fri, Mar 02, 2018 at 09:43:15AM +0200, Nikolay Borisov wrote:
> This argument is always set to the root of the inode, which is also
> passed. So let's get a reference inside the function and simplify
> the arg list.
> 
> Signed-off-by: Nikolay Borisov 

Reviewed-by: David Sterba 


Re: [PATCH] btrfs: drop nonvaring variable, instead define it

2018-03-05 Thread David Sterba
On Mon, Feb 26, 2018 at 04:48:14PM +0800, Anand Jain wrote:
> btrfs_defrag_leaves() has min_trans = 0 which doesn't vary.
> Defining it makes sense.

Defining the oldest transaction makes sense and it can be used in more
calls to btrfs_search_forward, please update the patch.


Re: [PATCH] Btrfs: send: fix typo in TLV_PUT

2018-03-05 Thread David Sterba
On Fri, Mar 02, 2018 at 06:05:49PM -0700, Liu Bo wrote:
> According to tlv_put()'s prototype, data and attrlen need to be
> swapped in the macro, but it seems all callers are already aware of
> this misordering and are therefore not affected.
> 
> Signed-off-by: Liu Bo 

Reviewed-by: David Sterba 


Re: How to change/fix 'Received UUID'

2018-03-05 Thread Marc MERLIN
On Mon, Mar 05, 2018 at 10:38:16PM +0300, Andrei Borzenkov wrote:
> > If I absolutely know that the data is the same on both sides, how do I
> > either
> > 1) force back in a 'Received UUID' value on the destination
> 
> I suppose the most simple is to write small program that does it using
> BTRFS_IOC_SET_RECEIVED_SUBVOL.

Understood.
Given that I have not worked with the code at all, what is the best
tool in btrfs-progs to add this to?

btrfstune?
btrfs property set?
other?

David, is this something you'd be willing to add support for?
(to be honest, it'll be quicker for someone who knows the code to add than
for me, but if no one has the time, I'll see if I can have a shot at it)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Re: How to change/fix 'Received UUID'

2018-03-05 Thread Andrei Borzenkov
05.03.2018 19:16, Marc MERLIN пишет:
> Howdy,
> 
> I did a bunch of copies and moving around subvolumes between disks and
> at some point, I did a snapshot dir1/Win_ro.20180205_21:18:31 
> dir2/Win_ro.20180205_21:18:31
> 
> As a result, I lost the ro flag, and apparently 'Received UUID' which is
> now preventing me from restarting the btrfs send/receive.
> 
> I changed the snapshot back to 'ro' but that's not enough:
> 
> Source:
> Name:            Win_ro.20180205_21:18:31
> UUID:            23ccf2bd-f494-e348-b34e-1f28486b2540
> Parent UUID:     -
> Received UUID:   3cc327e1-358f-284e-92e2-4e4fde92b16f
> Creation time:   2018-02-15 20:14:42 -0800
> Subvolume ID:    964
> Generation:      4062
> Gen at creation: 459
> Parent ID:       5
> Top level ID:    5
> Flags:           readonly
> 
> Dest:
> Name:            Win_ro.20180205_21:18:31
> UUID:            a1e8777c-c52b-af4e-9ce2-45ca4d4d2df8
> Parent UUID:     -
> Received UUID:   -
> Creation time:   2018-02-17 22:20:25 -0800
> Subvolume ID:    94826
> Generation:      250714
> Gen at creation: 250540
> Parent ID:       89160
> Top level ID:    89160
> Flags:           readonly
> 
> If I absolutely know that the data is the same on both sides, how do I
> either
> 1) force back in a 'Received UUID' value on the destination

I suppose the most simple is to write small program that does it using
BTRFS_IOC_SET_RECEIVED_SUBVOL.
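
For illustration only, here is a rough sketch of what such a small program
might look like. It is not from this thread and is untested against a real
filesystem. It assumes the struct btrfs_ioctl_received_subvol_args layout
from linux/btrfs.h as seen on x86_64 (200 bytes, ioctl nr 37, magic 0x94);
other architectures may encode the request differently. The helper names
pack_received_args() and set_received_uuid() are made up for this example.
As far as I recall, the kernel only accepts the ioctl on the subvolume's
root directory and refuses read-only subvolumes, so the ro flag would have
to be flipped around the call.

```python
import fcntl
import os
import struct
import uuid

BTRFS_IOCTL_MAGIC = 0x94

# Assumed x86_64 layout of struct btrfs_ioctl_received_subvol_args:
# uuid[16], stransid, rtransid, stime{sec,nsec,pad}, rtime{sec,nsec,pad},
# flags, reserved[16] (written here as 128 pad bytes).
_ARGS_FMT = "=16sQQQI4xQI4xQ128x"


def _iowr(magic, nr, size):
    """Generic Linux _IOWR() encoding (assumption: x86-style ioctl bits)."""
    return (3 << 30) | (size << 16) | (magic << 8) | nr


BTRFS_IOC_SET_RECEIVED_SUBVOL = _iowr(
    BTRFS_IOCTL_MAGIC, 37, struct.calcsize(_ARGS_FMT))


def pack_received_args(received_uuid, stransid, stime_sec=0, stime_nsec=0):
    """Build the ioctl argument; rtransid, rtime and flags are left at 0."""
    return struct.pack(_ARGS_FMT, uuid.UUID(received_uuid).bytes,
                       stransid, 0, stime_sec, stime_nsec, 0, 0, 0)


def set_received_uuid(subvol_path, received_uuid, stransid):
    """Hypothetical helper: issue the ioctl on an open subvolume directory."""
    buf = bytearray(pack_received_args(received_uuid, stransid))
    fd = os.open(subvol_path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BTRFS_IOC_SET_RECEIVED_SUBVOL, buf)
    finally:
        os.close(fd)
```

Which UUID and send transid to pass is a separate question; the values
would have to come from the source subvolume, and getting them wrong may
make later incremental receives silently use the wrong base.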

> 2) force a btrfs receive to work despite the lack of matching 'Received
> UUID' 
> 
> Yes, I could discard and start over, but my 2nd such subvolume is 8TB,
> so I'd really rather not :)
> 
> Any ideas?
> 
> Thanks,
> Marc
> 



Re: btrfs space used issue

2018-03-05 Thread Austin S. Hemmelgarn

On 2018-03-05 10:28, Christoph Hellwig wrote:
> On Sat, Mar 03, 2018 at 06:59:26AM +, Duncan wrote:
>> Indeed.  Preallocation with COW doesn't make the sense it does on an
>> overwrite-in-place filesystem.
>
> It makes a whole lot of sense, it just is a little harder to implement.
>
> There is no reason not to preallocate specific space, or if you aren't
> forced to be fully log structured by the medium, specific blocks to
> COW into.  It just isn't quite as trivial as for a rewrite in place
> file system to implement.

Yes, there's generally no reason not to pre-allocate space, but given 
how BTRFS implements pre-allocation, it doesn't make sense to do so 
pretty much at all for anything but NOCOW files, as it doesn't even 
guarantee that you'll be able to write however much data you 
pre-allocated space for (and it doesn't matter whether you use fallocate 
or just write out a run of zeroes, either way does not work in a manner 
consistent with how other filesystems do).


This has been discussed before. It arose from the behavior (completely
illogical given how fallocate is expected to behave) that you can
fallocate more than half the free space on a BTRFS volume, yet writes
into the pre-allocated space you just reserved will then fail with
-ENOSPC part way through (and they can fail with -ENOSPC for other
reasons too).
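
A small sketch of the consequence for applications, assuming nothing
beyond the behavior described above: even after a successful fallocate,
the writes (and the fsync, since allocation can be delayed) still need
their own error handling. The function name preallocate_then_write() is
made up for this example.

```python
import os


def preallocate_then_write(path, size, chunk=1 << 20):
    """Reserve `size` bytes with posix_fallocate, then fill them.

    Returns None on success, or the errno of the first failing step
    (for example errno.ENOSPC from a write into preallocated space).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        written = 0
        while written < size:
            n = min(chunk, size - written)
            # pwrite may write fewer than n bytes; loop until done.
            written += os.pwrite(fd, b"\xff" * n, written)
        os.fsync(fd)  # delayed allocation errors can surface here
        return None
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
```

On an overwrite-in-place filesystem one would expect this to return None
whenever the fallocate step succeeds; on BTRFS with COW files that is not
guaranteed.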



How to change/fix 'Received UUID'

2018-03-05 Thread Marc MERLIN
Howdy,

I did a bunch of copies and moving around subvolumes between disks and
at some point, I did a snapshot dir1/Win_ro.20180205_21:18:31 
dir2/Win_ro.20180205_21:18:31

As a result, I lost the ro flag, and apparently 'Received UUID' which is
now preventing me from restarting the btrfs send/receive.

I changed the snapshot back to 'ro' but that's not enough:

Source:
Name:            Win_ro.20180205_21:18:31
UUID:            23ccf2bd-f494-e348-b34e-1f28486b2540
Parent UUID:     -
Received UUID:   3cc327e1-358f-284e-92e2-4e4fde92b16f
Creation time:   2018-02-15 20:14:42 -0800
Subvolume ID:    964
Generation:      4062
Gen at creation: 459
Parent ID:       5
Top level ID:    5
Flags:           readonly

Dest:
Name:            Win_ro.20180205_21:18:31
UUID:            a1e8777c-c52b-af4e-9ce2-45ca4d4d2df8
Parent UUID:     -
Received UUID:   -
Creation time:   2018-02-17 22:20:25 -0800
Subvolume ID:    94826
Generation:      250714
Gen at creation: 250540
Parent ID:       89160
Top level ID:    89160
Flags:           readonly

If I absolutely know that the data is the same on both sides, how do I
either
1) force back in a 'Received UUID' value on the destination
2) force a btrfs receive to work despite the lack of matching 'Received
UUID' 

Yes, I could discard and start over, but my 2nd such subvolume is 8TB,
so I'd really rather not :)

Any ideas?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08


Re: btrfs space used issue

2018-03-05 Thread Christoph Hellwig
On Sat, Mar 03, 2018 at 06:59:26AM +, Duncan wrote:
> Indeed.  Preallocation with COW doesn't make the sense it does on an 
> overwrite-in-place filesystem.

It makes a whole lot of sense, it just is a little harder to implement.

There is no reason not to preallocate specific space, or if you aren't
forced to be fully log structured by the medium, specific blocks to
COW into.  It just isn't quite as trivial as for a rewrite in place
file system to implement.


Free space cache file (v1 space cache) corruption

2018-03-05 Thread Qu Wenruo
As the investigation into unexpected btrfs corruption goes on, we have
exposed a strange v1 space cache corruption.

The script has been uploaded as a gist:
https://gist.github.com/adam900710/d37f38070f7fc4d858ffe856c516b426

The script itself is pretty straightforward:

0) Create a btrfs with a large enough data chunk
   The original single data chunk created by mkfs is not large enough.
   Do a full balance to create a large enough data chunk, so the space
   cache will live in a data chunk which also has its own cache.

1) Run some fsstress load on top of dm-log-writes.
   The load is pretty small. Just -n 200 can reproduce it.

   dm-log-writes records all the operations for later analysis.

2) Use dm-log-writes to replay to each FLUSH and FUA operation and run
   fsck
   The script does this manually, just to check both FUA and FLUSH.
   In fact we can use the --check fua option to do it in one line.

   btrfs check won't return an error, as it detects the invalid free
   space cache and just ignores it, but we still get error messages
   related to the free space cache.
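
A sketch of the one-line variant mentioned in step 2, assuming the
replay-log tool from fstests (src/log-writes) and its --log, --replay,
--check and --fsck options as I remember them; please check replay-log's
usage output before relying on this. The device paths below are
placeholders, and replay_check_cmd() is a made-up helper that only builds
the command line without running anything.

```python
import shlex


def replay_check_cmd(log_dev, replay_dev, fsck_cmd, check="fua"):
    """Build a replay-log command that runs fsck_cmd at each check point."""
    if check not in ("flush", "fua"):
        raise ValueError("check must be 'flush' or 'fua'")
    return ["replay-log",
            "--log", log_dev,        # device holding the dm-log-writes log
            "--replay", replay_dev,  # device the log is replayed onto
            "--check", check,        # stop at every FLUSH or FUA entry
            "--fsck", fsck_cmd]      # command executed at each stop


if __name__ == "__main__":
    # Placeholder devices; adjust to the actual test setup.
    cmd = replay_check_cmd("/dev/test/log", "/dev/test/replay",
                           "btrfs check /dev/test/replay")
    print(shlex.join(cmd))
```

Running it once with check="flush" and once with check="fua" would cover
both cases the script handles manually.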

Then we can get some free space cache corruption at both flush and fua
operations.
And some of it can even survive across *several* transactions.

Furthermore, when such corruption happens, the space cache file extent
seems to be CoWed instead of being overwritten.
In my test environment, the whole 64K file extent of the metadata block
group cache just gets CoWed.
(In the previous trans its bytenr is XXX, but in the next trans it's YYY;
the inode size doesn't change at all, but nbytes seems to be increasing.)

Although the kernel and btrfs check can both report such a problem due to
the free space bytes difference, that's already the last line of defense.
The corrupted free space cache passes both the generation and csum checks.

I'll keep digging, but advice from anyone who is familiar with the free
space cache would really help in this case.

Thanks,
Qu





[PATCH] btrfs: adjust return values of btrfs_inode_by_name

2018-03-05 Thread Su Yue
Previously, btrfs_inode_by_name() returned 0, which left the caller to
check the objectid of location even when the location type was invalid.

Let btrfs_inode_by_name() return -EUCLEAN if it finds a corrupted
location in a dir entry.
Removing the out_err label also simplifies the function.

Signed-off-by: Su Yue 
---
 fs/btrfs/inode.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 53ca025655fc..c7155f9d7c6f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5448,7 +5448,8 @@ void btrfs_evict_inode(struct inode *inode)
 
 /*
  * this returns the key found in the dir entry in the location pointer.
- * If no dir entries were found, location->objectid is 0.
+ * If no dir entries were found, returns -ENOENT.
+ * If found a corrupted location in dir entry, returns -EUCLEAN.
  */
 static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
   struct btrfs_key *location)
@@ -5466,27 +5467,27 @@ static int btrfs_inode_by_name(struct inode *dir, struct dentry *dentry,
 
 	di = btrfs_lookup_dir_item(NULL, root, path, btrfs_ino(BTRFS_I(dir)),
 			name, namelen, 0);
-	if (IS_ERR(di))
+	if (unlikely(!di)) {
+		ret = -ENOENT;
+		goto out;
+	}
+	if (IS_ERR(di)) {
 		ret = PTR_ERR(di);
-
-	if (IS_ERR_OR_NULL(di))
-		goto out_err;
+		goto out;
+	}
 
 	btrfs_dir_item_key_to_cpu(path->nodes[0], di, location);
 	if (location->type != BTRFS_INODE_ITEM_KEY &&
 	    location->type != BTRFS_ROOT_ITEM_KEY) {
+		ret = -EUCLEAN;
 		btrfs_warn(root->fs_info,
 "%s gets something invalid in DIR_ITEM (name %s, directory ino %llu, location(%llu %u %llu))",
 			   __func__, name, btrfs_ino(BTRFS_I(dir)),
 			   location->objectid, location->type, location->offset);
-		goto out_err;
 	}
 out:
 	btrfs_free_path(path);
 	return ret;
-out_err:
-	location->objectid = 0;
-	goto out;
 }
 
 /*
@@ -5789,9 +5790,6 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
 	if (ret < 0)
 		return ERR_PTR(ret);
 
-	if (location.objectid == 0)
-		return ERR_PTR(-ENOENT);
-
 	if (location.type == BTRFS_INODE_ITEM_KEY) {
 		inode = btrfs_iget(dir->i_sb, &location, root, NULL);
 		return inode;
-- 
2.16.1
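
For readers skimming the diff, the new return convention can be modelled
in a few lines. This is only a toy model, not kernel code: the directory
is a plain dict, inode_by_name() is a made-up stand-in for
btrfs_inode_by_name(), and only the two key type values (1 and 132) are
taken from the btrfs on-disk format.

```python
import errno

BTRFS_INODE_ITEM_KEY = 1
BTRFS_ROOT_ITEM_KEY = 132
# EUCLEAN is Linux-specific; fall back to its Linux value elsewhere.
EUCLEAN = getattr(errno, "EUCLEAN", 117)


def inode_by_name(dir_items, name):
    """Return (ret, location) where location is (objectid, type, offset).

    ret is 0 on success, -ENOENT if the name is missing, and -EUCLEAN if
    the dir entry points at a key type that cannot be an inode or a root.
    """
    location = dir_items.get(name)
    if location is None:
        return -errno.ENOENT, None
    _objectid, key_type, _offset = location
    if key_type not in (BTRFS_INODE_ITEM_KEY, BTRFS_ROOT_ITEM_KEY):
        return -EUCLEAN, location
    return 0, location
```

With this convention the caller only needs the `ret < 0` check and no
longer has to test location.objectid against 0.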





[PATCH] btrfs-progs: free-space-cache: Use DIV_ROUND_UP() to replace open code

2018-03-05 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 free-space-cache.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/free-space-cache.c b/free-space-cache.c
index 50356d0462dd..f933f9f1cf3f 100644
--- a/free-space-cache.c
+++ b/free-space-cache.c
@@ -54,8 +54,7 @@ static int io_ctl_init(struct io_ctl *io_ctl, u64 size, u64 ino,
 		       struct btrfs_root *root)
 {
 	memset(io_ctl, 0, sizeof(struct io_ctl));
-	io_ctl->num_pages = (size + root->fs_info->sectorsize - 1) /
-				root->fs_info->sectorsize;
+	io_ctl->num_pages = DIV_ROUND_UP(size, root->fs_info->sectorsize);
 	io_ctl->buffer = kzalloc(size, GFP_NOFS);
 	if (!io_ctl->buffer)
 		return -ENOMEM;
-- 
2.16.2
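
As a quick sanity check that the macro and the open-coded expression
agree, here is the same arithmetic in a few lines, assuming the usual
definition DIV_ROUND_UP(n, d) = (n + d - 1) / d for non-negative n and
positive d. The function names are made up for this example.

```python
def div_round_up(n, d):
    """Integer ceiling division, as DIV_ROUND_UP() is usually defined."""
    return (n + d - 1) // d


def open_coded(size, sectorsize):
    """The expression the patch removes, written out."""
    return (size + sectorsize - 1) // sectorsize


if __name__ == "__main__":
    # A 64K cache file with 4K sectors needs 16 pages; one extra byte
    # spills into a 17th page.
    for size in (0, 1, 4096, 65536, 65537):
        assert div_round_up(size, 4096) == open_coded(size, 4096)
        print(size, div_round_up(size, 4096))
```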
