Re: btrfs send extremely slow (almost stuck)

2016-09-04 Thread Qu Wenruo



At 09/05/2016 05:41 AM, Oliver Freyermuth wrote:

Am 30.08.2016 um 02:48 schrieb Qu Wenruo:

Yes.
And more specifically, it doesn't even affect delta backup.

For shared extents caused by reflink/dedupe (out-of-band or even incoming 
in-band), they will be sent as individual files.

The contents are all the same; it just uses more space.


For those interested, I have now actually tested the btrfs send / btrfs receive 
backup for several subvolumes after applying this patch.
The throughput is finally usable, almost hitting network / IO limits as 
expected - ideal so far!
Also delta seemed fine for the subvolumes for which things worked.

However, I now sadly get (for one of my subvolumes):

send ioctl failed with -2: No such file or directory

at some point during the transfer, it sadly seems to be reproducible.
I do not think it's related to this patch, but of course this makes "btrfs 
send" still unusable to me -
I guess it's not ready for general use just yet.
Is there any information I can easily extract / provide to allow the experts to 
fix this issue?


Did you get the half-way send stream?

If the send stream has something, please use the "--no-data" option to send 
the subvolume again to get a metadata-only dump, and upload it for debugging.


Also, please paste "btrfs-debug-tree -t " output for 
debug.

WARN: the above "btrfs-debug-tree" output will contain file names.
You can use the following sed to wipe the file names:

btrfs-debug-tree -t 5 /dev/sda6 | sed "s/name:.*//"
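(Not from the thread, but for illustration: the sed filter can be sanity-checked on a made-up debug-tree-style line before running it on the real output. The sample line below is fabricated and only mimics the DIR_ITEM format; it is not from any real filesystem.)

```shell
# Feed a fabricated debug-tree line through the same filter Qu suggests;
# everything from "name:" onward is stripped, so file names never leave
# the machine.
echo 'item 3 key (256 DIR_ITEM 1234) itemoff 3817 itemsize 40 name: secret.txt' \
    | sed 's/name:.*//'
```

The key/offset/size fields survive, which is what a developer needs for debugging; only the name is wiped.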

Thanks,
Qu


The kernel log shows nothing.

Thanks a lot,
Oliver





--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: let btrfs_delete_unused_bgs() to clean relocated bgs

2016-09-04 Thread Naohiro Aota
On Fri, 2016-09-02 at 09:35 -0400, Josef Bacik wrote:
> On 09/02/2016 03:46 AM, Naohiro Aota wrote:
> > 
> > Currently, btrfs_relocate_chunk() is removing the relocated BG by
> > itself. But the work can be done by btrfs_delete_unused_bgs() (and
> > it's better, since that also trims the BG). Let's dedupe the code.
> > 
> > While btrfs_delete_unused_bgs() already hits the relocated BG, it
> > skips the BG since the BG has the "ro" flag set (to keep the
> > balancing BG intact). On the other hand, btrfs cannot drop the "ro"
> > flag here, to prevent additional writes. So this patch makes use of
> > the "removed" flag. btrfs_delete_unused_bgs() now checks the flag to
> > distinguish whether a read-only BG is relocating or not.
> > 
> 
> This seems racy to me.  We remove the last part of the block group, it
> ends up on the unused_bgs_list, we process this list, see that removed
> isn't set and we skip it, then later we set removed, but it's too late.
> I think the right way is to actually do a transaction, set ->removed,
> manually add it to the unused_bgs_list if it's not already, then end
> the transaction.  This way we are guaranteed to have the bg on the list
> when it is ready to be removed.  This is my analysis after looking at
> it for 10 seconds after being awake for like 30 minutes so if I'm
> missing something let me know.  Thanks,

I don't think a race will happen. Since we are holding
delete_unused_bgs_mutex here, btrfs_delete_unused_bgs() checks the
->removed flag after we unlock the mutex, i.e. we set up the flag
properly. For the case where btrfs_delete_unused_bgs() checks the BG
before we hold delete_unused_bgs_mutex, that BG is removed by it (if
it's empty) and btrfs_relocate_chunk() should never see it.

Regards,
Naohiro


RE: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-04 Thread Zhao Lei
Hi, Sean Fu

> From: Sean Fu [mailto:fxinr...@gmail.com]
> Sent: Sunday, September 04, 2016 7:54 PM
> To: dste...@suse.com
> Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
> zhao...@cn.fujitsu.com; linux-btrfs@vger.kernel.org;
> linux-ker...@vger.kernel.org; Sean Fu 
> Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in
> btrfs_read_chunk_tree.
> 
> The input argument root is already set with "fs_info->chunk_root".
> "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
> "open_ctree".
> “root->fs_info = fs_info” in "btrfs_alloc_root".
> 
The root argument of this function means "any root".
The function was designed to get the chunk root from
"any root" at the beginning.

Since there is only one caller of this function,
and the caller always passes chunk_root as the root argument in
the current code, we can remove the above conversion,
and I suggest renaming root to chunk_root to make it clear,
something like:

- btrfs_read_chunk_tree(struct btrfs_root *root)
+ btrfs_read_chunk_tree(struct btrfs_root *chunk_root)

Thanks
Zhaolei

> Signed-off-by: Sean Fu 
> ---
>  fs/btrfs/volumes.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 366b335..384a6d2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
>   int ret;
>   int slot;
> 
> - root = root->fs_info->chunk_root;
> -
>   path = btrfs_alloc_path();
>   if (!path)
>   return -ENOMEM;
> --
> 2.6.2
> 






Re: gazillions of Incorrect local/global backref count

2016-09-04 Thread Chris Murphy
On Sat, Sep 3, 2016 at 10:50 PM, Christoph Anton Mitterer
 wrote:
> Hey.
>
> I just did a btrfs check on my notebooks root fs, with:
> $ uname -a
> Linux heisenberg 4.7.0-1-amd64 #1 SMP Debian 4.7.2-1 (2016-08-28)
> x86_64 GNU/Linux
> $ btrfs --version
> btrfs-progs v4.7.1
>
>
>
> during:
> checking extents
>
> it found gazillions of these:
> Incorrect local backref count on 1107980288 root 257 owner 17807428
> offset 13568135168 found 2 wanted 3 back 0x2d69990
> Incorrect local backref count on 1107980288 root 257 owner 14055042
> offset 13568135168 found 2 wanted 3 back 0x2d69930
> Incorrect global backref count on 1107980288 found 4 wanted 6
> backpointer mismatch on [1107980288 61440]
> Incorrect local backref count on 1108049920 root 257 owner 17807428
> offset 13568262144 found 2 wanted 5 back 0x2d69ac0
> Incorrect local backref count on 1108049920 root 257 owner 14055042
> offset 13568262144 found 2 wanted 5 back 0x2d69b20
> Incorrect global backref count on 1108049920 found 4 wanted 10
> backpointer mismatch on [1108049920 77824]
>
> See stdout/err[0] logfiles from the check.
>
>
> What do they mean?

https://bugzilla.kernel.org/show_bug.cgi?id=155791
http://www.spinics.net/lists/linux-btrfs/msg58142.html


-- 
Chris Murphy


Re: Re[4]: btrfs check "Couldn't open file system" after error in transaction.c

2016-09-04 Thread Chris Murphy
On Sun, Sep 4, 2016 at 1:23 PM, Hendrik Friedel  wrote:
> Hello,
>
> here the output of btrfsck:
> Checking filesystem on /dev/sdd
> UUID: a8af3832-48c7-4568-861f-e80380dd7e0b
> checking extents
> checking free space cache
> checking fs root
> checking csums
> checking root refs
> checking quota groups
> Ignoring qgroup relation key 24544
> Ignoring qgroup relation key 24610
> Ignoring qgroup relation key 24611
> Ignoring qgroup relation key 25933
> Ignoring qgroup relation key 25934
> Ignoring qgroup relation key 25935
> Ignoring qgroup relation key 25936
> Ignoring qgroup relation key 25937
> Ignoring qgroup relation key 25938
> Ignoring qgroup relation key 25939
> Ignoring qgroup relation key 25939
> Ignoring qgroup relation key 25941
> Ignoring qgroup relation key 25942
> Ignoring qgroup relation key 25958
> Ignoring qgroup relation key 25959
> Ignoring qgroup relation key 25960
> Ignoring qgroup relation key 25961
> Ignoring qgroup relation key 25962
> Ignoring qgroup relation key 25963
> Ignoring qgroup relation key 25964
> Ignoring qgroup relation key 25965
> Ignoring qgroup relation key 25966
> Ignoring qgroup relation key 25966
> Ignoring qgroup relation key 25968
> Ignoring qgroup relation key 25970
> Ignoring qgroup relation key 25971
> Ignoring qgroup relation key 25972
> Ignoring qgroup relation key 25975
> Ignoring qgroup relation key 25976
> Ignoring qgroup relation key 25976
> Ignoring qgroup relation key 25976
> Ignoring qgroup relation key 567172078071971871
> Ignoring qgroup relation key 567172078071971872
> Ignoring qgroup relation key 567172078071971882
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Qgroup is already inconsistent before checking
> Counts for qgroup id: 3102 are different
> our:referenced 174829252608 referenced compressed 174829252608
> disk:   referenced 174829252608 referenced compressed 174829252608
> our:exclusive 2899968 exclusive compressed 2899968
> disk:   exclusive 2916352 exclusive compressed 2916352
> diff:   exclusive -16384 exclusive compressed -16384
> Counts for qgroup id: 25977 are different
> our:referenced 47249391616 referenced compressed 47249391616
> disk:   referenced 47249391616 referenced compressed 47249391616
> our:exclusive 90222592 exclusive compressed 90222592
> disk:   exclusive 90238976 exclusive compressed 90238976
> diff:   exclusive -16384 exclusive compressed -16384
> Counts for qgroup id: 25978 are different
> our:referenced 174829252608 referenced compressed 174829252608
> disk:   referenced 174829252608 referenced compressed 174829252608
> our:exclusive 1064960 exclusive compressed 1064960
> disk:   exclusive 1081344 exclusive compressed 1081344
> diff:   exclusive -16384 exclusive compressed -16384
> Counts for qgroup id: 26162 are different
> our:referenced 65940500480 referenced compressed 65940500480
> disk:   referenced 65866997760 referenced compressed 65866997760
> diff:   referenced 73502720 referenced compressed 73502720
> our:exclusive 3991326720 exclusive compressed 3991326720
> disk:   exclusive 3960582144 exclusive compressed 3960582144
> diff:   exclusive 30744576 exclusive compressed 30744576
> found 8423479726080 bytes used err is 1
> total csum bytes: 8206766844
> total tree bytes: 17669144576
> total fs tree bytes: 7271251968
> total extent tree bytes: 683851776
> total csum bytes: 8206766844
> total tree bytes: 

Re: Re[3]: btrfs check "Couldn't open file system" after error in transaction.c

2016-09-04 Thread Chris Murphy
On Sun, Sep 4, 2016 at 12:51 PM, Hendrik Friedel  wrote:
> Hello again,
>
> before overwriting the filesystem, some last questions:
>
>>>  Maybe
>>> take advantage of the fact it does read only and recreate it. You
>>> could take a btrfs-image and btrfs-debug-tree first,
>>
>> And what do I do with it?
>>
>>> because there's
>>> some bug somewhere: somehow it became inconsistent, and can't be fixed
>>> at mount time or even with btrfs check.
>>
>> Ok, so is there any way to help you finding this bug?
>
> Anything, I can do here?
>
>> Coming back to my objectives:
>> -Understand the reason behind the issue and prevent it in future
>> Finding the bug would help with the above
>>
>> -If not possible to repair the filesystem:
>>-understand if the data that I read from the drive is valid or
>> corrupted
>> Can you answer this?
>>
>> As mentioned: I do have a backup, a month old. The data does not change so
>> regularly, so most should be ok.
>> Now I have two sources of data:
>> the backup and the current degraded filesystem.
>> If data differs, which one do I take? Is it safe to use the more recent
>> one from the degraded filesystem?
>>
> And can you help me on these points?
>
> FYI, I did a
> btrfsck --init-csum-tree /dev/sdd
> btrfs rescue zero-log btrfs-zero-log
> btrfsck /dev/sdd

Curious that this is fixing a parent transid problem... not sure why.
Only a developer working on btrfsck could answer this. They'd need the
btrfs-image before these things were done and see what's wrong with
the file system that causes check to fail. Changing anything changes
the evidence of what was wrong.

>
> now. The last command is still running. It seems to be working; Is there a
> way to be sure, that the data is all ok again?

Not by Btrfs. The problem now is that by running init-csum-tree it
recomputed the csums for everything. If any files were corrupt, they now
have csums based on that corruption, so they will read as OK by Btrfs.
That's the problem with init-csum-tree. So now you need a different
way to confirm or deny whether the files are really good or not.
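One possible approach (a sketch, not an endorsed procedure; the two directory arguments are whatever mount points you use for the backup and the suspect filesystem — nothing here is from the thread) is to compare the trees file by file against the month-old backup:

```shell
# Walk every regular file under the backup tree and report files that are
# missing from, or differ in content on, the current tree. Assumes file
# names contain no newlines.
diff_trees() {
    backup="$1"; current="$2"
    (cd "$backup" && find . -type f) | while read -r f; do
        if [ ! -f "$current/$f" ]; then
            echo "missing: $f"
        elif ! cmp -s "$backup/$f" "$current/$f"; then
            echo "differs: $f"
        fi
    done
}
```

For example, `diff_trees /mnt/backup /mnt/suspect` (hypothetical paths) narrows the manual inspection down to only the files that actually changed since the backup.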



-- 
Chris Murphy


Re: Re[2]: btrfs check "Couldn't open file system" after error in transaction.c

2016-09-04 Thread Chris Murphy
Lost track of this...sorry.

On Sun, Aug 28, 2016 at 12:04 PM, Hendrik Friedel  wrote:
> Hi Chris,
>
> thanks for your reply -especially on a Sunday.
>>>
>>>  I have a filesystem (three disks with no raid)
>>
>>
>> So it's data single *and* metadata single?
>>
> No:
> Data, single: total=8.14TiB, used=7.64TiB
> System, RAID1: total=32.00MiB, used=912.00KiB
> Metadata, RAID1: total=18.00GiB, used=16.45GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>>
>>
>>>  btrfs check will lead to  "Couldn't open file system"

That's a bug worth filing. That bug report will need a URL for where
you put the btrfs-image file.


>>  Maybe
>> take advantage of the fact it does read only and recreate it. You
>> could take a btrfs-image and btrfs-debug-tree first,
>
> And what do I do with it?

Put it somewhere it can live a while, it might be months before a dev
gets around to looking at it. I usually put them on google drive in
the public folder, and then post the URL (get shareable link) in the
bug report.



>
>> because there's
>> some bug somewhere: somehow it became inconsistent, and can't be fixed
>> at mount time or even with btrfs check.
>
> Ok, so is there any way to help you finding this bug?
> Coming back to my objectives:
> -Understand the reason behind the issue and prevent it in future
> Finding the bug would help with the above

No idea.

>
> -If not possible to repair the filesystem:
>-understand if the data that I read from the drive is valid or corrupted
> Can you answer this?

Other than nocow files which do not have csums, Btrfs will spit back
an I/O error and path to the bad file rather than hand over data it
thinks is corrupt (doesn't match csum). So data read from the volume
should be valid.


>
> As mentioned: I do have a backup, a month old. The data does not change so
> regularly, so most should be ok.
> Now I have two sources of data:
> the backup and the current degraded filesystem.
> If data differs, which one do I take? Is it safe to use the more recent one
> from the degraded filesystem?

If data differs you have to figure out a way to inspect the file to
determine which one is correct. Databases have their own consistency
checks, for example, if it's an image, open it in a viewer - big
problems will be visible, small problems might just be one wrong pixel
and you may not even notice it.
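A minimal illustration of such a built-in consistency check (gzip archives carry a CRC, so corruption is detectable with no reference copy; the files below are scratch files, nothing here touches the thread's filesystem):

```shell
# Create a small gzip file, verify it, then simulate corruption by
# truncating it; "gzip -t" fails on the damaged copy.
tmp=$(mktemp -d)
printf 'hello\n' | gzip > "$tmp/ok.gz"
gzip -t "$tmp/ok.gz" && echo "ok.gz: intact"
head -c 20 "$tmp/ok.gz" > "$tmp/bad.gz"
gzip -t "$tmp/bad.gz" 2>/dev/null || echo "bad.gz: corrupt"
```

The same idea applies to other self-checking formats (zip-based documents, SQLite's `PRAGMA integrity_check`, etc.); for formats without internal checks, visual or application-level inspection as described above is the fallback.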


-- 
Chris Murphy


Re: btrfs send extremely slow (almost stuck)

2016-09-04 Thread Oliver Freyermuth
Am 30.08.2016 um 02:48 schrieb Qu Wenruo:
> Yes.
> And more specifically, it doesn't even affect delta backup.
> 
> For shared extents caused by reflink/dedupe (out-of-band or even incoming 
> in-band), they will be sent as individual files.
> 
> The contents are all the same; it just uses more space.

For those interested, I have now actually tested the btrfs send / btrfs receive 
backup for several subvolumes after applying this patch. 
The throughput is finally usable, almost hitting network / IO limits as 
expected - ideal so far! 
Also delta seemed fine for the subvolumes for which things worked. 

However, I now sadly get (for one of my subvolumes): 

send ioctl failed with -2: No such file or directory

at some point during the transfer, it sadly seems to be reproducible. 
I do not think it's related to this patch, but of course this makes "btrfs 
send" still unusable to me - 
I guess it's not ready for general use just yet. 
Is there any information I can easily extract / provide to allow the experts to 
fix this issue?
The kernel log shows nothing. 

Thanks a lot, 
Oliver


Re[4]: btrfs check "Couldn't open file system" after error in transaction.c

2016-09-04 Thread Hendrik Friedel

Hello,

here the output of btrfsck:
Checking filesystem on /dev/sdd
UUID: a8af3832-48c7-4568-861f-e80380dd7e0b
checking extents
checking free space cache
checking fs root
checking csums
checking root refs
checking quota groups
Ignoring qgroup relation key 24544
Ignoring qgroup relation key 24610
Ignoring qgroup relation key 24611
Ignoring qgroup relation key 25933
Ignoring qgroup relation key 25934
Ignoring qgroup relation key 25935
Ignoring qgroup relation key 25936
Ignoring qgroup relation key 25937
Ignoring qgroup relation key 25938
Ignoring qgroup relation key 25939
Ignoring qgroup relation key 25939
Ignoring qgroup relation key 25941
Ignoring qgroup relation key 25942
Ignoring qgroup relation key 25958
Ignoring qgroup relation key 25959
Ignoring qgroup relation key 25960
Ignoring qgroup relation key 25961
Ignoring qgroup relation key 25962
Ignoring qgroup relation key 25963
Ignoring qgroup relation key 25964
Ignoring qgroup relation key 25965
Ignoring qgroup relation key 25966
Ignoring qgroup relation key 25966
Ignoring qgroup relation key 25968
Ignoring qgroup relation key 25970
Ignoring qgroup relation key 25971
Ignoring qgroup relation key 25972
Ignoring qgroup relation key 25975
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 567172078071971871
Ignoring qgroup relation key 567172078071971872
Ignoring qgroup relation key 567172078071971882
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Qgroup is already inconsistent before checking
Counts for qgroup id: 3102 are different
our:referenced 174829252608 referenced compressed 174829252608
disk:   referenced 174829252608 referenced compressed 174829252608
our:exclusive 2899968 exclusive compressed 2899968
disk:   exclusive 2916352 exclusive compressed 2916352
diff:   exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 25977 are different
our:referenced 47249391616 referenced compressed 47249391616
disk:   referenced 47249391616 referenced compressed 47249391616
our:exclusive 90222592 exclusive compressed 90222592
disk:   exclusive 90238976 exclusive compressed 90238976
diff:   exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 25978 are different
our:referenced 174829252608 referenced compressed 174829252608
disk:   referenced 174829252608 referenced compressed 174829252608
our:exclusive 1064960 exclusive compressed 1064960
disk:   exclusive 1081344 exclusive compressed 1081344
diff:   exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 26162 are different
our:referenced 65940500480 referenced compressed 65940500480
disk:   referenced 65866997760 referenced compressed 65866997760
diff:   referenced 73502720 referenced compressed 73502720
our:exclusive 3991326720 exclusive compressed 3991326720
disk:   exclusive 3960582144 exclusive compressed 3960582144
diff:   exclusive 30744576 exclusive compressed 30744576
found 8423479726080 bytes used err is 1
total csum bytes: 8206766844
total tree bytes: 17669144576
total fs tree bytes: 7271251968
total extent tree bytes: 683851776
btree space waste bytes: 2859469730
file data blocks allocated: 16171232772096
referenced 13512171663360

What does that tell us?

Greetings,
Hendrik


-- Originalnachricht --
Von: "Hendrik Friedel" 

Re[3]: btrfs check "Couldn't open file system" after error in transaction.c

2016-09-04 Thread Hendrik Friedel

Hello again,

before overwriting the filesystem, some last questions:


 Maybe
take advantage of the fact it does read only and recreate it. You
could take a btrfs-image and btrfs-debug-tree first,

And what do I do with it?


because there's
some bug somewhere: somehow it became inconsistent, and can't be fixed
at mount time or even with btrfs check.

Ok, so is there any way to help you finding this bug?

Anything, I can do here?


Coming back to my objectives:
-Understand the reason behind the issue and prevent it in future
Finding the bug would help with the above

-If not possible to repair the filesystem:
   -understand if the data that I read from the drive is valid or 
corrupted

Can you answer this?

As mentioned: I do have a backup, a month old. The data does not change 
so regularly, so most should be ok.

Now I have two sources of data:
the backup and the current degraded filesystem.
If data differs, which one do I take? Is it safe to use the more recent 
one from the degraded filesystem?



And can you help me on these points?

FYI, I did a
btrfsck --init-csum-tree /dev/sdd
btrfs rescue zero-log btrfs-zero-log
btrfsck /dev/sdd

now. The last command is still running. It seems to be working; Is there 
a way to be sure, that the data is all ok again?


Regards,
Hendrik




Greetings,
Hendrik






--
Chris Murphy






Re: OOM killer and Btrfs

2016-09-04 Thread Francesco Turco
On Sun, Sep 4, 2016, at 12:06, Markus Trippelsdorf wrote:
> On 2016.09.04 at 11:59 +0200, Francesco Turco wrote:
> > Is the problem already known? Should I report a bug? Is there a patch I
> > can try? Thanks.
> 
> This issue was recently fixed by:
> 
> commit 6b4e3181d7bd5ca5ab6f45929e4a5ffa7ab4ab7f
> Author: Michal Hocko 
> Date:   Thu Sep 1 16:14:41 2016 -0700
> 
> mm, oom: prevent premature OOM killer invocation for high order
> request
> 
> It will be backported to the 4.7.x stable kernel, too.

Great, I will wait for a new 4.7.x release then :)

Thank you for the info!

-- 
https://www.fturco.net/


Re: OOM killer and Btrfs

2016-09-04 Thread Markus Trippelsdorf
On 2016.09.04 at 11:59 +0200, Francesco Turco wrote:
> I use Btrfs on a Gentoo Linux system with kernel 4.7.2. When my computer
> is under heavy I/O load some application often crashes, for example
> ClamAV, Firefox or Portage. I suspect the problem is due to Btrfs, but I
> may be wrong.
> 
> These are the most recent error messages from journalctl, but I have
> many other similar ones in my logs:
> 
> *** BEGIN ***
> 
> Sep 04 10:13:26 desktop kernel: gpg-agent invoked oom-killer:
> gfp_mask=0x27080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK),
> order=2, oom_
> 
> Is the problem already known? Should I report a bug? Is there a patch I
> can try? Thanks.

This issue was recently fixed by:

commit 6b4e3181d7bd5ca5ab6f45929e4a5ffa7ab4ab7f
Author: Michal Hocko 
Date:   Thu Sep 1 16:14:41 2016 -0700

mm, oom: prevent premature OOM killer invocation for high order request

It will be backported to the 4.7.x stable kernel, too.

-- 
Markus


OOM killer and Btrfs

2016-09-04 Thread Francesco Turco
I use Btrfs on a Gentoo Linux system with kernel 4.7.2. When my computer
is under heavy I/O load some application often crashes, for example
ClamAV, Firefox or Portage. I suspect the problem is due to Btrfs, but I
may be wrong.

These are the most recent error messages from journalctl, but I have
many other similar ones in my logs:

*** BEGIN ***

Sep 04 10:13:26 desktop kernel: gpg-agent invoked oom-killer:
gfp_mask=0x27080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK),
order=2, oom_
Sep 04 10:13:26 desktop kernel: gpg-agent cpuset=/ mems_allowed=0
Sep 04 10:13:26 desktop kernel: CPU: 1 PID: 15883 Comm: gpg-agent Not
tainted 4.7.2-gentoo #6
Sep 04 10:13:26 desktop kernel: Hardware name:  /DQ35JO,
BIOS JOQ3510J.86A.1143.2010.1209.0048 12/09/2010
Sep 04 10:13:26 desktop kernel:   8801258ebbb0
813db638 8801258ebd48
Sep 04 10:13:26 desktop kernel:  88009510c800 8801258ebbe8
811bbe3d 8801258ebd48
Sep 04 10:13:26 desktop kernel:   88009510c800
81e30816 001c
Sep 04 10:13:26 desktop kernel: Call Trace:
Sep 04 10:13:26 desktop kernel:  []
dump_stack+0x4d/0x65
Sep 04 10:13:26 desktop kernel:  []
dump_header+0x56/0x16e
Sep 04 10:13:26 desktop kernel:  []
oom_kill_process+0x218/0x3e0
Sep 04 10:13:26 desktop kernel:  []
out_of_memory+0x3ba/0x460
Sep 04 10:13:26 desktop kernel:  []
__alloc_pages_nodemask+0xedd/0xf00
Sep 04 10:13:26 desktop kernel:  []
alloc_kmem_pages_node+0x4a/0xc0
Sep 04 10:13:26 desktop kernel:  []
copy_process.part.50+0x104/0x1760
Sep 04 10:13:26 desktop kernel:  [] ?
check_preempt_wakeup+0x10a/0x240
Sep 04 10:13:26 desktop kernel:  [] ?
__set_task_blocked+0x2d/0x70
Sep 04 10:13:26 desktop kernel:  []
_do_fork+0xc5/0x370
Sep 04 10:13:26 desktop kernel:  [] ?
SyS_pselect6+0x13a/0x220
Sep 04 10:13:26 desktop kernel:  []
SyS_clone+0x14/0x20
Sep 04 10:13:26 desktop kernel:  []
do_syscall_64+0x4b/0xa0
Sep 04 10:13:26 desktop kernel:  []
entry_SYSCALL64_slow_path+0x25/0x25
Sep 04 10:13:26 desktop kernel: Mem-Info:
Sep 04 10:13:26 desktop kernel: active_anon:173869 inactive_anon:274253
isolated_anon:0
 active_file:888485 inactive_file:366424
 isolated_file:0
 unevictable:8 dirty:231 writeback:0
 unstable:0
 slab_reclaimable:240788
 slab_unreclaimable:10484
 mapped:46080 shmem:2372 pagetables:8521
 bounce:0
 free:36342 free_pcp:0 free_cma:0
Sep 04 10:13:26 desktop kernel: Node 0 DMA free:15768kB min:20kB
low:32kB high:44kB active_anon:0kB inactive_anon:0kB active_file:0kB
inacti
Sep 04 10:13:26 desktop kernel: lowmem_reserve[]: 0 3219 7890 7890
Sep 04 10:13:26 desktop kernel: Node 0 DMA32 free:48736kB min:4632kB
low:7928kB high:11224kB active_anon:164348kB inactive_anon:560608kB act
Sep 04 10:13:26 desktop kernel: lowmem_reserve[]: 0 0 4671 4671
Sep 04 10:13:26 desktop kernel: Node 0 Normal free:80864kB min:6720kB
low:11500kB high:16280kB active_anon:531128kB inactive_anon:536404kB a
Sep 04 10:13:26 desktop kernel: lowmem_reserve[]: 0 0 0 0
Sep 04 10:13:26 desktop kernel: Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB
(U) 1*32kB (U) 1*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*
Sep 04 10:13:26 desktop kernel: Node 0 DMA32: 7240*4kB (UME) 2472*8kB
(UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0
Sep 04 10:13:26 desktop kernel: Node 0 Normal: 19846*4kB (UMEH) 29*8kB
(UH) 12*16kB (H) 7*32kB (H) 0*64kB 1*128kB (H) 3*256kB (H) 0*512kB 0*
Sep 04 10:13:26 desktop kernel: Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 04 10:13:26 desktop kernel: 1261746 total pagecache pages
Sep 04 10:13:26 desktop kernel: 4498 pages in swap cache
Sep 04 10:13:26 desktop kernel: Swap cache stats: add 927111, delete
922613, find 398281/626731
Sep 04 10:13:26 desktop kernel: Free swap  = 8024748kB
Sep 04 10:13:26 desktop kernel: Total swap = 8388604kB
Sep 04 10:13:26 desktop kernel: 2079412 pages RAM
Sep 04 10:13:26 desktop kernel: 0 pages HighMem/MovableOnly
Sep 04 10:13:26 desktop kernel: 53317 pages reserved
Sep 04 10:13:26 desktop kernel: [ pid ]   uid  tgid total_vm  rss
nr_ptes nr_pmds swapents oom_score_adj name
Sep 04 10:13:26 desktop kernel: [ 1655] 0  16554624610346   
  88   3   79 0 systemd-journal
Sep 04 10:13:26 desktop kernel: [ 1665] 0  166525033  145   
  16   3   60 0 lvmetad
Sep 04 10:13:26 desktop kernel: [ 1686] 0  1686 8671  251   
  19   3  469 -1000 systemd-udevd
Sep 04 10:13:26 desktop kernel: [ 1805]   108  180528434  243   
  26   4  101 0 systemd-timesyn
Sep 04 10:13:26 desktop kernel: [ 1813]   109  1813 9863 1398   
  24

Re: gazillions of Incorrect local/global backref count

2016-09-04 Thread Christoph Anton Mitterer
On Sun, 2016-09-04 at 05:33 +, Paul Jones wrote:
> The errors are wrong. I nearly ruined my filesystem a few days ago by
> trying to repair similar errors, thankfully all seems ok.
> Check again with btrfs-progs 4.6.1 and see if the errors go away,
> mine did.
> See open bug https://bugzilla.kernel.org/show_bug.cgi?id=155791 for
> more details.

Thanks for the pointer :)
I can at least confirm that my system seems to work normally; scrub
didn't bring up any errors either, nor are there any kernel messages...

The interesting thing... I have some pretty large btrfs on those 8TiB
seagate disks (nearly full with some million files)... which I have
also scanned with v4.7... and no errors.
Only my system fs seems to be "affected".


Well, it's not my first case of false positives in btrfs check
(https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg48325.html)...
so I was more relaxed this time (at least a bit ;-) ).

Cheers,
Chris.
