send | receive: received snapshot is missing recent files

2017-09-05 Thread Dave
I'm running Arch Linux on BTRFS. I use Snapper to take hourly
snapshots and it works without any issues.

I have a bash script that uses send | receive to transfer snapshots to
a couple of external HDDs. The script runs daily on a systemd timer. I
set all this up recently, and at first I only checked that it runs every
day and that the expected snapshots are received.

At a glance, everything looked correct. However, today was my day to
drill down and really make sure everything was working.

To my surprise, the newest received incremental snapshots are missing
all recent files. These new snapshots reflect the system state from
weeks ago and no files more recent than a certain date are in the
snapshots.

However, the snapshots are newly created and newly received. The work
is being done fresh each day when my script runs, but the results are
anchored back in time at this earlier date. Weird.

I'm not really sure where to start troubleshooting, so I'll start by
sharing part of my script. I'm sure the problem is in my script and not
in BTRFS or Snapper functionality. (As I said, the Snapper snapshots are
totally OK before being sent | received.)

These are the key lines of the script I'm using to send | receive a snapshot:

old_num=$(snapper -c "$config" list -t single | awk '/'"$selected_uuid"'/ {print $1}')
old_snap=$SUBVOLUME/.snapshots/$old_num/snapshot
new_num=$(snapper -c "$config" create --print-number)
new_snap=$SUBVOLUME/.snapshots/$new_num/snapshot
btrfs send -c "$old_snap" "$new_snap" | $ssh btrfs receive "$backup_location"

I have to admit that even after reading the following page half a
dozen times, I barely understand the difference between -c and -p.
https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_difference_between_-c_and_-p_in_send.3F

After reading that page again today, I feel like I should switch to -p
(maybe). However, the -c vs -p choice probably isn't my problem.
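For what it's worth, the practical difference can be illustrated with a dry
run. This is only a sketch: DRYRUN=echo just prints the commands instead of
running them (a real run needs root and real read-only snapshots), and the
snapshot paths are invented:

```shell
# Dry-run sketch of -p vs -c; set DRYRUN= to run for real.
DRYRUN=echo
old_snap=/.snapshots/41/snapshot   # assumed to exist on the receiver already
new_snap=/.snapshots/42/snapshot   # the snapshot to transfer

# -p <parent>: true incremental send; only the difference between the
# parent and new_snap crosses the pipe, and receive requires the parent
# to exist on the destination.
cmd_p=$($DRYRUN btrfs send -p "$old_snap" "$new_snap")

# -c <clone-src>: tells send which subvolumes the receiver already has,
# so shared extents can be cloned instead of re-sent; without an implied
# parent this can still behave like a full send.
cmd_c=$($DRYRUN btrfs send -c "$old_snap" "$new_snap")

echo "$cmd_p"
echo "$cmd_c"
```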

Any ideas what my problem could be?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] Mkfs: Rework --rootdir to a more generic behavior

2017-09-05 Thread Qu Wenruo



On 2017年09月06日 03:05, Goffredo Baroncelli wrote:

On 09/05/2017 10:19 AM, Qu Wenruo wrote:



On 2017年09月05日 02:08, David Sterba wrote:

On Mon, Sep 04, 2017 at 03:41:05PM +0900, Qu Wenruo wrote:

mkfs.btrfs --rootdir provides users a method to generate a btrfs
filesystem with pre-written content, without the need for root privilege.

However, the code is quite old and doesn't get much review or testing.
This results in some strange behavior, from customized chunk allocation
(which uses the reserved 0~1M device space) to the lack of a special file
handler (fixed in the previous 2 patches).


The cleanup in this area is most welcome. The patches look good after a
quick look, I'll do another review round.


To save you some time: I found that my rework can't create a new image the
way the old --rootdir can, so it's still not completely the same behavior.
I can fix it easily by creating a large sparse file first and then
truncating it with the current method.

But this really concerns me, do we need to shrink the fs?


I still struggle to understand in what way "mkfs.btrfs --rootdir" would be
better than a "simple tar";

in the first case I have to do
a1) mkfs.btrfs --root-dir  (create the archive)
a2) dd  (copy and truncate the image and store it in the archive)
a3) dd  (take the archived image, and restore it)
a4) btrfs fi resize (expand the image)

in the second case I have to
b1) tar cf ... (create the image and store it in the archive; this is a1+a2)
b2) mkfs.btrfs (create the filesystem with the final size)
b3) tar xf ... (take the archived image and restore it)
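The two workflows above can be put side by side as a dry-run sketch (echo
prints each command instead of executing it; the device, file, and mount
names are invented):

```shell
# Dry-run comparison of the two workflows; set DRYRUN= to execute for
# real (needs mkfs.btrfs, a real device, and root).
DRYRUN=echo

workflow_a() {
    $DRYRUN mkfs.btrfs --rootdir /srv/tree small.img   # a1: shrunken image
    $DRYRUN dd if=small.img of=/backup/small.img       # a2: archive it
    $DRYRUN dd if=/backup/small.img of=/dev/sdX        # a3: restore it
    $DRYRUN btrfs filesystem resize max /mnt           # a4: grow to device size
}

workflow_b() {
    $DRYRUN tar cf /backup/tree.tar -C /srv/tree .     # b1: archive the tree
    $DRYRUN mkfs.btrfs /dev/sdX                        # b2: mkfs at final size
    $DRYRUN tar xf /backup/tree.tar -C /mnt            # b3: restore the tree
}

a_cmds=$(workflow_a)
b_cmds=$(workflow_b)
printf '%s\n%s\n' "$a_cmds" "$b_cmds"
```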


However the code is already written (and it seems simple enough), so a possible 
compromise could be to have the "shrinking" only if another option is passed; 
eg.

mkfs.btrfs --root ...--> populate the filesystem
mkfs.btrfs --shrink --root   --> populate and shrink the filesystem

however I find this useful only if it is possible to create the filesystem in
a file; i.e.

mkfs.btrfs --shrink --root  

where  doesn't have to exist before mkfs.btrfs, and after
a)  contains the image
b)  is the smallest possible size.


Yes, that's the original behavior. And what my rework can't do yet.
It can't determine the size of the device, so it can't continue.

If we decide to follow the original behavior, then I have to create a
sparse file first and truncate the file at the end.

But still quite easy to do.

And if we decide to follow the mkfs.ext4 -d behavior, then I just need to
remove 2 patches from the patchset (the shrink patch and the doc patch,
which add about 100 lines), and slightly modify the rework patch to remove
the O_CREAT open flag.




I definitely don't like the truncate being done by hand by the operator
after mkfs.btrfs (the current behavior).

BTW, I compiled the patches successfully, and they seem to work.

PS: I tried to cross-compile mkfs.btrfs to ARM, but mkfs.btrfs was unable to
work:

$ uname -a
Linux bananapi 4.4.66-bananian #2 SMP Sat May 6 19:26:50 UTC 2017 armv7l 
GNU/Linux
$ sudo ./mkfs.btrfs /dev/loop0
btrfs-progs v4.12.1-5-g3c9451cd
See http://btrfs.wiki.kernel.org for more information.

ERROR: superblock magic doesn't match
Performing full device TRIM /dev/loop0 (10.00GiB) ...
ERROR: open ctree failed

However, this problem exists even with a plain v4.12.1. The first error
seems to suggest that there is some endianness issue.


I'd better get one cheap ARM board if I want to do native debug.

BTW, what's the output of dump-super here?
That may give us some clue to fix it.

Thanks,
Qu



BR
G.Baroncelli



I had a discussion with Austin about this, thread named "[btrfs-progs] Bug in 
mkfs.btrfs -r".
The only equivalent I found is "mkfs.ext4 -d", which can only create a new
file if a size is given, and will not shrink the fs.
(Genext2fs shrinks the fs, but is no longer in e2fsprogs)

If we follow that behavior, the 3rd and 5th patches are not needed, which I'm 
pretty happy with.

Functionally, both behaviors can be implemented with the current method, but
I hope to make sure which is the designed behavior so I can stick to it.

I hope you could make the final decision on this so I can update the patchset.

Thanks,
Qu











Re: \o/ compsize

2017-09-05 Thread Adam Borowski
On Mon, Sep 04, 2017 at 10:33:40PM +0200, A L wrote:
> On 9/4/2017 5:11 PM, Adam Borowski wrote:
> > Hi!
> > Here's a utility to measure used compression type + ratio on a set of files
> > or directories: https://github.com/kilobyte/compsize
> 
> Great tool. Just tried it on some of my backup snapshots.
> 
>    # compsize portage.20170904T2200
>    142432 files.
>    all   78%  329M/ 422M
>    none 100%  227M/ 227M
>    zlib  52%  102M/ 195M
> 
>    # du -sh  portage.20170904T2200
>    787M    portage.20170904T2200
> 
>    # btrfs fi du -s  portage.20170904T2200
>     Total   Exclusive  Set shared  Filename
>     271.61MiB 6.34MiB   245.51MiB portage.20170904T2200
> 
> Interesting results. How do I interpret them?

I've added some documentation; especially in the man page.

(Sorry for not pushing this earlier, Timofey went wild on this tool and I
wanted to avoid conflicts.)

> Compsize also doesn't seem to like some non-standard files and throws an
> error (even though they should be ignored?):
> 
> # compsize usb-backup/volumes/root/root.20170727T2321/
> open("usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350"):
> No such device or address
> 
> # dir
> usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350
> srwx-- 1 root root 0 Dec 31  2015 
> usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350=

Fixed.
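The fix presumably boils down to skipping non-regular files (sockets, FIFOs,
symlinks) before calling open() on them. The same filtering idea can be
sketched in shell; the file names here are invented, with a FIFO standing in
for the screen socket above:

```shell
# Count only regular files, skipping sockets, FIFOs, symlinks, etc.,
# the way a compression-statistics tool needs to.
dir=$(mktemp -d)
echo data > "$dir/regular.txt"
mkfifo "$dir/not-a-file.fifo"       # stand-in for the screen socket
ln -s regular.txt "$dir/link"

counted=0
skipped=0
for f in "$dir"/*; do
    # -f follows symlinks, so also require that it's not a symlink itself
    if [ -f "$f" ] && [ ! -L "$f" ]; then
        counted=$((counted + 1))
    else
        skipped=$((skipped + 1))
    fi
done
echo "counted=$counted skipped=$skipped"
rm -rf "$dir"
```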


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄ 


[PATCH 1/2] btrfs-progs: Use clean_tree_block when btrfs_update_root fail

2017-09-05 Thread Gu Jinxiang
In btrfs_fsck_reinit_root, btrfs_mark_buffer_dirty is used to set the
EXTENT_DIRTY flag on the new block before btrfs_update_root updates the
original root. So we should call clean_tree_block to clear the flag if the
update fails.

Signed-off-by: Gu Jinxiang 
---
 cmds-check.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/cmds-check.c b/cmds-check.c
index 006edbde..6bd55e90 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11652,6 +11652,7 @@ init:
 	ret = btrfs_update_root(trans, root->fs_info->tree_root,
 				&root->root_key, &root->root_item);
 	if (ret) {
+		clean_tree_block(trans, root, c);
 		free_extent_buffer(c);
 		return ret;
 	}
-- 
2.13.5





[PATCH 2/2] btrfs-progs: Replace some BUG_ON by return

2017-09-05 Thread Gu Jinxiang
The following test failed because there is no data/metadata/system
block_group. Replace BUG_ON(ret) with returning ret to avoid crashing,
since there are already enough messages for the user to understand what
happened.

$sudo TEST=003\* make test-fuzz
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
extent-tree.c:2693: btrfs_reserve_extent: BUG_ON `ret` triggered, value -28
/home/fnst/btrfs/btrfs-progs/btrfs[0x419966]
/home/fnst/btrfs/btrfs-progs/btrfs(btrfs_reserve_extent+0xb16)[0x41f500]
/home/fnst/btrfs/btrfs-progs/btrfs(btrfs_alloc_free_block+0x55)[0x41f59b]
/home/fnst/btrfs/btrfs-progs/btrfs[0x46a6ce]
/home/fnst/btrfs/btrfs-progs/btrfs(cmd_check+0x1012)[0x47c885]
/home/fnst/btrfs/btrfs-progs/btrfs(main+0x127)[0x40b055]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2aae83e89f45]
/home/fnst/btrfs/btrfs-progs/btrfs[0x40a939]
Creating a new CRC tree
Checking filesystem on 
/home/fnst/btrfs/btrfs-progs/tests/fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
UUID: 5cb33553-6f6d-4ce8-83fd-20af5a2f8181
Reinitialize checksum tree
failed (ignored, ret=134): /home/fnst/btrfs/btrfs-progs/btrfs check 
--init-csum-tree 
/home/fnst/btrfs/btrfs-progs/tests/fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
mayfail: returned code 134 (SIGABRT), not ignored
test failed for case 003-multi-check-unmounted

Signed-off-by: Gu Jinxiang 
---
 extent-tree.c | 16 ++--
 transaction.c |  8 ++--
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/extent-tree.c b/extent-tree.c
index eed56886..14838a5d 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -2678,11 +2678,13 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 			ret = do_chunk_alloc(trans, info,
 					     num_bytes,
 					     BTRFS_BLOCK_GROUP_METADATA);
-			BUG_ON(ret);
+			if (ret)
+				goto out;
 		}
 		ret = do_chunk_alloc(trans, info,
 				     num_bytes + SZ_2M, data);
-		BUG_ON(ret);
+		if (ret)
+			goto out;
 	}
 
 	WARN_ON(num_bytes < info->sectorsize);
@@ -2690,9 +2692,11 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 			   search_start, search_end, hint_byte, ins,
 			   trans->alloc_exclude_start,
 			   trans->alloc_exclude_nr, data);
-	BUG_ON(ret);
+	if (ret)
+		goto out;
 	clear_extent_dirty(&info->free_space_cache,
 			   ins->objectid, ins->objectid + ins->offset - 1);
+out:
 	return ret;
 }
 
@@ -2761,7 +2765,8 @@ static int alloc_tree_block(struct btrfs_trans_handle *trans,
 	int ret;
 	ret = btrfs_reserve_extent(trans, root, num_bytes, empty_size,
 				   hint_byte, search_end, ins, 0);
-	BUG_ON(ret);
+	if (ret)
+		goto out;
 
 	if (root_objectid == BTRFS_EXTENT_TREE_OBJECTID) {
 		struct pending_extent_op *extent_op;
@@ -2792,6 +2797,7 @@ static int alloc_tree_block(struct btrfs_trans_handle *trans,
 		finish_current_insert(trans, root->fs_info->extent_root);
 		del_pending_extents(trans, root->fs_info->extent_root);
 	}
+out:
 	return ret;
 }
 
@@ -2813,7 +2819,6 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 			   trans->transid, 0, key, level,
 			   empty_size, hint, (u64)-1, &ins);
 	if (ret) {
-		BUG_ON(ret > 0);
 		return ERR_PTR(ret);
 	}
 
@@ -2821,7 +2826,6 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 	if (!buf) {
 		btrfs_free_extent(trans, root, ins.objectid, ins.offset,
 				  0, root->root_key.objectid, level, 0);
-		BUG_ON(1);
 		return ERR_PTR(-ENOMEM);
 	}
 	btrfs_set_buffer_uptodate(buf);
btrfs_set_buffer_uptodate(buf);
diff --git a/transaction.c b/transaction.c
index ad705728..33225002 100644
--- a/transaction.c
+++ b/transaction.c
@@ -165,9 +165,11 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	BUG_ON(ret);
 commit_tree:
 	ret = commit_tree_roots(trans, fs_info);
-	BUG_ON(ret);
+	if (ret)
+		goto error;
 	ret = __commit_transaction(trans, root);
-	BUG_ON(ret);
+	if (ret)
+		goto error;
 	write_ctree_super(trans, fs_info);
 	btrfs_finish_extent_commit(trans, fs_info->extent_root,
 				   &fs_info->pinned_extents);
@@ -177,6 +179,8 @@ commit_tree:
 	fs_info->running_transaction = NULL;
 	fs_info->last_trans_committed = 

Re: read-only for no good reason on 4.9.30

2017-09-05 Thread Chris Murphy
On Sun, Sep 3, 2017 at 11:19 PM, Russell Coker
 wrote:
> I have a system with less than 50% disk space used.  It just started rejecting
> writes due to lack of disk space.

What's the error? Is it ENOSPC? Kinda needs kernel messages, and also
if it's ENOSPC to have mounted with enospc_debug option. Also
everytime this has come up before, devs have asked for

$ grep -R . /sys/fs/btrfs/fsuuid/allocation/

I have no idea how to parse that myself, but there is probably a way
to anticipate ENOSPC from that information if you can learn how to
parse it.
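As a rough sketch of what such parsing could look like: the sample lines
below are invented stand-ins for what that grep might print (assuming the
allocation directory exposes per-block-group-type files like total_bytes
and bytes_used), and summing them gives an allocated-vs-used picture:

```shell
# Hypothetical sample of the grep output; real paths live under
# /sys/fs/btrfs/<fsuuid>/allocation/{data,metadata,system}/.
sample='/sys/fs/btrfs/UUID/allocation/data/total_bytes:1073741824
/sys/fs/btrfs/UUID/allocation/data/bytes_used:536870912
/sys/fs/btrfs/UUID/allocation/metadata/total_bytes:268435456
/sys/fs/btrfs/UUID/allocation/metadata/bytes_used:134217728'

# Sum allocated vs. used across all block-group types; a large gap
# between the two is one rough early-warning sign of ENOSPC trouble.
allocated=$(printf '%s\n' "$sample" | awk -F: '/\/total_bytes:/ {s += $2} END {print s}')
used=$(printf '%s\n' "$sample" | awk -F: '/\/bytes_used:/ {s += $2} END {print s}')
echo "allocated=$allocated used=$used"
```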



>  I ran "btrfs balance" and then it started
> working correctly again.  It seems that a btrfs filesystem if left alone will
> eventually get fragmented enough that it rejects writes (I've had similar
> issues with other systems running BTRFS with other kernel versions).
>
> Is this a known issue?

Sounds like old thread "BTRFS constantly reports "No space left on
device" even with a huge unallocated space" but that's a long time
ago, and kernel ~4.7.3 fixed it. Whatever that was should be in
4.9.30.

Another possibility is there's a small bug in some cases where things
go around the new ticketed enospc infrastructure, and that was fixed
in 4.9.42 and 4.12.6. So you should try one of those and see if it
fixes the problem.





-- 
Chris Murphy


Re: [PATCH 0/5] Mkfs: Rework --rootdir to a more generic behavior

2017-09-05 Thread Goffredo Baroncelli
On 09/05/2017 10:19 AM, Qu Wenruo wrote:
> 
> 
> On 2017年09月05日 02:08, David Sterba wrote:
>> On Mon, Sep 04, 2017 at 03:41:05PM +0900, Qu Wenruo wrote:
>>> mkfs.btrfs --rootdir provides user a method to generate btrfs with
>>> pre-written content while without the need of root privilege.
>>>
>>> However the code is quite old and doesn't get much review or test.
>>> This makes some strange behavior, from customized chunk allocation
>>> (which uses the reserved 0~1M device space) to lack of special file
>>> handler (Fixed in previous 2 patches).
>>
>> The cleanup in this area is most welcome. The patches look good after a
>> quick look, I'll do another review round.
> 
> To save you some time, I found that my rework can't create new image which 
> old --rootdir can do. So it's still not completely the same behavior.
> I can fix it by creating a large sparse file first and then truncate it using 
> current method easily.
> 
> But this really concerns me, do we need to shrink the fs?

I still struggle to understand in what way "mkfs.btrfs --rootdir" would be
better than a "simple tar";

in the first case I have to do
a1) mkfs.btrfs --root-dir  (create the archive)
a2) dd  (copy and truncate the image and store it in the archive)
a3) dd  (take the archived image, and restore it)
a4) btrfs fi resize (expand the image)

in the second case I have to 
b1) tar cf ... (create the image and store it in the archive; this is a1+a2)
b2) mkfs.btrfs (create the filesystem with the final size)
b3) tar xf ... (take the archived image and restore it)


However the code is already written (and it seems simple enough), so a possible 
compromise could be to have the "shrinking" only if another option is passed; 
eg.

mkfs.btrfs --root ...--> populate the filesystem
mkfs.btrfs --shrink --root   --> populate and shrink the filesystem 

however I find this useful only if it is possible to create the filesystem in
a file; i.e.

mkfs.btrfs --shrink --root  

where  doesn't have to exist before mkfs.btrfs, and after
a)  contains the image
b)  is the smallest possible size.

I definitely don't like the truncate being done by hand by the operator
after mkfs.btrfs (the current behavior).

BTW, I compiled the patches successfully, and they seem to work.

PS: I tried to cross-compile mkfs.btrfs to ARM, but mkfs.btrfs was unable to
work:

$ uname -a
Linux bananapi 4.4.66-bananian #2 SMP Sat May 6 19:26:50 UTC 2017 armv7l 
GNU/Linux
$ sudo ./mkfs.btrfs /dev/loop0
btrfs-progs v4.12.1-5-g3c9451cd
See http://btrfs.wiki.kernel.org for more information.

ERROR: superblock magic doesn't match
Performing full device TRIM /dev/loop0 (10.00GiB) ...
ERROR: open ctree failed

However, this problem exists even with a plain v4.12.1. The first error
seems to suggest that there is some endianness issue.

BR
G.Baroncelli

> 
> I had a discussion with Austin about this, thread named "[btrfs-progs] Bug in 
> mkfs.btrfs -r".
> The only equivalent I found is "mkfs.ext4 -d", which can only create new file 
> if size is given and will not shrink fs.
> (Genext2fs shrinks the fs, but is no longer in e2fsprogs)
> 
> If we follow that behavior, the 3rd and 5th patches are not needed, which I'm 
> pretty happy with.
> 
> Functionally, both behavior can be implemented with current method, but I 
> hope to make sure which is the designed behavior so I can stick to it.
> 
> I hope you could make the final decision on this so I can update the patchset.
> 
> Thanks,
> Qu
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: \o/ compsize

2017-09-05 Thread Nick Terrell
On 9/4/17, 8:12 AM, "Adam Borowski"  wrote:
> Hi!
> Here's an utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize
> 
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heuristics that could err on withholding compression too much

Thanks for writing this tool Adam, I'll try it out with zstd! It looks very
useful for benchmarking compression algorithms, much better than measuring
the filesystem size with du/df.

> (Thanks for Knorrie and his python-btrfs project that made figuring out the
> ioctls much easier.)
> 
> Meow!
> -- 
> ⢀⣴⠾⠻⢶⣦⠀ 
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
> ⠈⠳⣄ 
> 
 


Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)

2017-09-05 Thread Josef Bacik
Alright I just reworked the build tree ref stuff and tested it to make sure it 
wasn’t going to give false positives again.  Apparently I had only ever used 
this with very basic existing fs’es and nothing super complicated, so it was 
just broken for anything complex.  I’ve pushed it to my tree, you can just pull 
and build and try again.  This time the stack traces will even work!  Thanks,

Josef

On 9/3/17, 4:21 PM, "Marc MERLIN"  wrote:

On Sun, Sep 03, 2017 at 05:33:33PM +, Josef Bacik wrote:
> Alright pushed, sorry about that.
 
I'm reasonably sure I'm running the new code, but still got this:
[ 2104.336513] Dropping a ref for a root that doesn't have a ref on the block
[ 2104.358226] Dumping block entry [115253923840 155648], num_refs 1, metadata 
0, from disk 1
[ 2104.384037]   Ref root 0, parent 3414272884736, owner 262813, offset 0, 
num_refs 18446744073709551615
[ 2104.412766]   Ref root 418, parent 0, owner 262813, offset 0, num_refs 1
[ 2104.433888]   Root entry 418, num_refs 1
[ 2104.446648]   Root entry 69869, num_refs 0
[ 2104.459904]   Ref action 2, root 69869, ref_root 0, parent 3414272884736, 
owner 262813, offset 0, num_refs 18446744073709551615
[ 2104.496244]   No Stacktrace

Now, in the background I had a monthly md check of the underlying device
(mdadm raid5) running, and got some of the mismatch messages below. Obviously
that's not good, and I'm assuming that md raid5 doesn't have per-block
checksums, so it won't know which drive has the corrupted data.
Does that sound right?

Now, the good news is that btrfs on top does have checksums, so running a
scrub should hopefully find those corrupted blocks if they happen to be in
use by the filesystem (maybe they are free).
But as a reminder, this whole thread started with my FS maybe not being in a
good state, while both check --repair and scrub returned clean. Maybe I'll
use the opportunity to re-run a check --repair, and a scrub after that, to
see what state things are in.

md6: mismatch sector in range 3581539536-3581539544
md6: mismatch sector in range 3581539544-3581539552
md6: mismatch sector in range 3581539552-3581539560
md6: mismatch sector in range 3581539560-3581539568  
md6: mismatch sector in range 3581543792-3581543800
md6: mismatch sector in range 3581543800-3581543808
md6: mismatch sector in range 3581543808-3581543816
md6: mismatch sector in range 3581543816-3581543824
md6: mismatch sector in range 3581544112-3581544120
md6: mismatch sector in range 3581544120-3581544128

As for your patch, no idea why it's not giving me a stacktrace, sorry :-/

Git log of my tree does show:
commit aa162d2908bd7452805ea812b7550232b0b6ed53
Author: Josef Bacik 
Date:   Sun Sep 3 13:32:17 2017 -0400

Btrfs: use be->metadata just in case

I suspect we're not getting the owner in some cases, so we want to just
use the known value.

Signed-off-by: Josef Bacik 

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901



Re: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]

2017-09-05 Thread Liu Bo
On Tue, Sep 05, 2017 at 11:47:26AM +0200, Marco Lorenzo Crociani wrote:
> Hi,
> I was transferring some data with rsync to a btrfs filesystem when I got:
> 
> set 04 14:59:05  kernel: INFO: task kworker/u33:2:25015 blocked for more
> than 120 seconds.
> set 04 14:59:05  kernel:   Not tainted 4.12.10-1.el7.elrepo.x86_64 #1
> set 04 14:59:05  kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> set 04 14:59:05  kernel: kworker/u33:2   D0 25015  2 0x0080
> set 04 14:59:05  kernel: Workqueue: events_unbound
> btrfs_async_reclaim_metadata_space [btrfs]
> set 04 14:59:05  kernel: Call Trace:
> set 04 14:59:05  kernel:  __schedule+0x28a/0x880
> set 04 14:59:05  kernel:  schedule+0x36/0x80
> set 04 14:59:05  kernel:  wb_wait_for_completion+0x64/0x90
> set 04 14:59:05  kernel:  ? remove_wait_queue+0x60/0x60
> set 04 14:59:05  kernel:  __writeback_inodes_sb_nr+0x8e/0xb0
> set 04 14:59:05  kernel:  writeback_inodes_sb_nr+0x10/0x20
> set 04 14:59:05  kernel:  flush_space+0x469/0x580 [btrfs]
> set 04 14:59:05  kernel:  ? dequeue_task_fair+0x577/0x830
> set 04 14:59:05  kernel:  ? pick_next_task_fair+0x122/0x550
> set 04 14:59:05  kernel:  btrfs_async_reclaim_metadata_space+0x112/0x430
> [btrfs]
> set 04 14:59:05  kernel:  process_one_work+0x149/0x360
> set 04 14:59:05  kernel:  worker_thread+0x4d/0x3c0
> set 04 14:59:05  kernel:  kthread+0x109/0x140
> set 04 14:59:05  kernel:  ? rescuer_thread+0x380/0x380
> set 04 14:59:05  kernel:  ? kthread_park+0x60/0x60
> set 04 14:59:05  kernel:  ? do_syscall_64+0x67/0x150
> set 04 14:59:05  kernel:  ret_from_fork+0x25/0x30
> 
> btrfs fi df /data
> Data, single: total=20.63TiB, used=20.63TiB
> System, DUP: total=8.00MiB, used=2.20MiB
> Metadata, DUP: total=41.50GiB, used=40.61GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> btrfs fi show /dev/sdo
> Label: 'Storage'  uuid: 429e42f4-dd9e-4267-b353-aa0831812f87
>   Total devices 1 FS bytes used 20.67TiB
>   devid1 size 36.38TiB used 20.71TiB path /dev/sdo
> 
> Is it serious? Can I provide other info?
>

I think we're still fine here: the stack shows that btrfs is trying to
gain metadata space by flushing dirty pages via the writeback threads, and
perhaps there is simply too much to flush before enough metadata space is
freed.


Thanks,

-liubo


Re: btrfs check --repair now runs in minutes instead of hours? aborting

2017-09-05 Thread Marc MERLIN
On Tue, Sep 05, 2017 at 04:05:04PM +0800, Qu Wenruo wrote:
> > gargamel:~# btrfs fi df /mnt/btrfs_pool1
> > Data, single: total.60TiB, used.54TiB
> > System, DUP: total2.00MiB, used=1.19MiB
> > Metadata, DUP: totalX.00GiB, used.69GiB
> 
> Wait for a minute.
> 
> Does that .69GiB mean 706 MiB? Or did my email client/GMX screw up the
> format (again)?
> This output format must be changed, at least to 0.69 GiB, or 706 MiB.
 
Email client problem. I see control characters in what you quoted.

Let's try again
gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total=10.66TiB, used=10.60TiB  => 10TB
System, DUP: total=64.00MiB, used=1.20MiB=> 1.2MB
Metadata, DUP: total=57.50GiB, used=12.76GiB => 13GB
GlobalReserve, single: total=512.00MiB, used=0.00B  => 0

> You mean lowmem is actually FASTER than original mode?
> That's very surprising.
 
Correct, unless I add --repair and then original mode is 2x faster than
lowmem.

> Is there any special operation done for that btrfs?
> Like offline dedupe or tons of reflinks?

In this case, no.
Note that btrfs check used to take many hours overnight until I did a
git pull of btrfs progs and built the latest from TOT.

> BTW, how many subvolumes do you have in the fs?
 
gargamel:/mnt/btrfs_pool1# btrfs subvolume list . | wc -l
91

If I remove snapshots for btrfs send and historical 'backups':
gargamel:/mnt/btrfs_pool1# btrfs subvolume list . | grep -Ev 
'(hourly|daily|weekly|rw|ro)' | wc -l
5

This looks like a bug. My first guess is that it's related to the number of
subvolumes/reflinks, but I'm not sure, since I don't have many real-world
btrfs filesystems.
> 
> I'll take sometime to look into it.
> 
> Thanks for the very interesting report,

Thanks for having a look :)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Is autodefrag recommended? -- re-duplication???

2017-09-05 Thread Hugo Mills
On Tue, Sep 05, 2017 at 05:01:10PM +0300, Marat Khalili wrote:
> Dear experts,
> 
> My first reaction to just switching autodefrag on was positive, but
> mentions of re-duplication are very scary. The main use of BTRFS here is
> backup snapshots, so re-duplication would be disastrous.
> 
> In order to stick to concrete example, let there be two files, 4KB
> and 4GB in size, referenced in read-only snapshots 100 times each,
> and some 4KB of both files are rewritten each night and then another
> snapshot is created (let's ignore snapshots deletion here). AFAIU
> 8KB of additional space (+metadata) will be allocated each night
> without autodefrag. With autodefrag will it be perhaps 4KB+128KB or
> something much worse?

   I'm going for 132 KiB (4+128).

   Of course, if there's two 4 KiB writes close together, then there's
less overhead, as they'll share the range.
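   Hugo's 132 KiB figure can be reproduced with quick arithmetic,
assuming (as discussed in this thread) that autodefrag rewrites roughly a
128 KiB range around the small write in the large file, while the 4 KiB
file is simply rewritten whole:

```shell
# Back-of-envelope nightly COW cost per snapshot cycle, in KiB.
# Assumption: autodefrag rewrites up to ~128 KiB around the 4 KiB
# write in the 4 GiB file; the 4 KiB file is rewritten entirely.
write_size=4            # one 4 KiB write per file per night
autodefrag_range=128    # assumed defrag rewrite range in the big file

without_autodefrag=$((write_size + write_size))      # both files: 8 KiB
with_autodefrag=$((write_size + autodefrag_range))   # 4 + 128 = 132 KiB
echo "without=${without_autodefrag} KiB, with=${with_autodefrag} KiB"
```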

   Hugo.

-- 
Hugo Mills | Once is happenstance; twice is coincidence; three
hugo@... carfax.org.uk | times is enemy action.
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Re: \o/ compsize

2017-09-05 Thread Qu Wenruo



On 2017年09月05日 22:21, Hans van Kranenburg wrote:

On 09/05/2017 04:02 PM, Qu Wenruo wrote:



On 2017年09月05日 03:52, Timofey Titovets wrote:

2017-09-04 21:42 GMT+03:00 Adam Borowski :

On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:

2017-09-04 18:11 GMT+03:00 Adam Borowski :

Here's an utility to measure used compression type + ratio on a set
of files
or directories: https://github.com/kilobyte/compsize

It should be of great help for users, and also if you:
* muck with compression levels
* add new compression types
* add heuristics that could err on withholding compression too much


Did a brief review, and the result looks quite good.
Especially the same-disk-bytenr case is handled well, so file extents
referring to different parts of the same large extent won't get counted
twice.

Nice job.

But some smaller improvements can still be made (please keep in mind I may
be totally wrong, since I'm not doing a comprehensive review):

Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
which should filter out unrelated results.


No, it does not.

https://patchwork.kernel.org/patch/9767619/


Why not?

Min key = ino, EXTENT_DATA, 0
Max key = ino, EXTENT_DATA, -1

With that min_key and max_key, the result is just what we want.

This also filters out any item that doesn't belong to this ino, and other
things like XATTR items.


Thanks,
Qu



And to improve readability, using the accessor functions defined via
BTRFS_SETGET_STACK_FUNCS() would be a big improvement for reviewers
(so I can check whether the magic numbers are right without manually
calculating the offsets).



Packaged to AUR:
https://aur.archlinux.org/packages/compsize-git/


Nice, I don't even need to build it myself!
(Well, not many dependencies anyway.)



Cool!  I'd wait until people say the code is sane (I don't really know these
ioctls) but if you want to make poor AUR folks our beta testers, that's ok.


The code is sane!
And it even considers inline extents! (Which I didn't consider, BTW, as
inline extents count as metadata, not data, so my first thought was just
to ignore them.)



This is just too handy =)

However, one issue: I did not set a license; your packaging says GPL3.
It would be better to have something compatible with btrfs-progs, which is
GPL2-only.  What about GPL2-or-higher?


Sorry about the license, that was just a copy-paste error; fixed


After adding some related info (like wasted space in pinned extents,
reuse of extents), it'd be nice to have this tool inside btrfs-progs,
either as a part of "fi du" or another command.


That will be useful =)

If improved, I think there is a chance to get it into btrfs-progs.

Thanks,
Qu



P.S.
your code works amazingly fast on my SSD and data %)
150Gb data
-O0 2.12s
-O2 0.51s


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html






Re: \o/ compsize

2017-09-05 Thread Hans van Kranenburg
On 09/05/2017 04:02 PM, Qu Wenruo wrote:
> 
> 
> On 2017-09-05 03:52, Timofey Titovets wrote:
>> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
 2017-09-04 18:11 GMT+03:00 Adam Borowski :
> Here's a utility to measure the compression type and ratio used on a set
> of files or directories: https://github.com/kilobyte/compsize
>
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heuristics that could err on withholding compression too much
> 
> Did a brief review, and the result looks quite good.
> Especially shared disk bytenrs are handled well, so file extents
> referring to different parts of the same large extent won't get counted twice.
> 
> Nice job.
> 
> But some smaller improvements can still be made:
> (Please keep in mind I could be totally wrong, since I'm not doing a
> comprehensive review)
> 
> The search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
> which should filter out unrelated results.

No, it does not.

https://patchwork.kernel.org/patch/9767619/

> And to improve readability, using the BTRFS_SETGET_STACK_FUNCS()-defined
> functions would be a big improvement for reviewers.
> (So I can check whether the magic numbers are right, since I'm lazy
> and don't want to calculate the offsets manually)
> 

 Packaged to AUR:
 https://aur.archlinux.org/packages/compsize-git/
> 
> Nice, I don't even need to build it myself!
> (Well, there aren't many dependencies anyway)
> 
>>>
>>> Cool!  I'd wait until people say the code is sane (I don't really
>>> know these
>>> ioctls) but if you want to make poor AUR folks our beta testers,
>>> that's ok.
> 
> The code is sane!
> And it even considers inline extents! (Which I didn't consider, BTW, as
> inline extents count as metadata, not data, so my first thought was
> just to ignore them.)
> 
>>
>> This is just too handy =)
>>
>>> However, one issue: I did not set a license; your packaging says GPL3.
>>> It would be better to have something compatible with btrfs-progs,
>>> which is GPL2-only.  What about GPL2-or-higher?
>>
>> Sorry about the license, just a copy-paste error; fixed
>>
>>> After adding some related info (like wasted space in pinned extents,
>>> reuse of extents), it'd be nice to have this tool inside btrfs-progs,
>>> either as a part of "fi du" or another command.
>>
>> That will be useful =)
> 
> If improved, I think there is a chance to get it into btrfs-progs.
> 
> Thanks,
> Qu
> 
>>
>> P.S.
>> your code works amazingly fast on my SSD and data %)
>> 150Gb data
>> -O0 2.12s
>> -O2 0.51s
>>


-- 
Hans van Kranenburg


Re: \o/ compsize

2017-09-05 Thread Qu Wenruo



On 2017-09-05 03:52, Timofey Titovets wrote:

2017-09-04 21:42 GMT+03:00 Adam Borowski :

On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:

2017-09-04 18:11 GMT+03:00 Adam Borowski :

Here's a utility to measure the compression type and ratio used on a set of files
or directories: https://github.com/kilobyte/compsize

It should be of great help for users, and also if you:
* muck with compression levels
* add new compression types
* add heuristics that could err on withholding compression too much


Did a brief review, and the result looks quite good.
Especially shared disk bytenrs are handled well, so file extents
referring to different parts of the same large extent won't get counted twice.


Nice job.

But some smaller improvements can still be made:
(Please keep in mind I could be totally wrong, since I'm not doing a
comprehensive review)


The search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
which should filter out unrelated results.


And to improve readability, using the BTRFS_SETGET_STACK_FUNCS()-defined
functions would be a big improvement for reviewers.
(So I can check whether the magic numbers are right, since I'm lazy
and don't want to calculate the offsets manually)




Packaged to AUR:
https://aur.archlinux.org/packages/compsize-git/


Nice, I don't even need to build it myself!
(Well, there aren't many dependencies anyway)



Cool!  I'd wait until people say the code is sane (I don't really know these
ioctls) but if you want to make poor AUR folks our beta testers, that's ok.


The code is sane!
And it even considers inline extents! (Which I didn't consider, BTW, as
inline extents count as metadata, not data, so my first thought was just
to ignore them.)




This is just too handy =)


However, one issue: I did not set a license; your packaging says GPL3.
It would be better to have something compatible with btrfs-progs, which is
GPL2-only.  What about GPL2-or-higher?


Sorry about the license, just a copy-paste error; fixed


After adding some related info (like wasted space in pinned extents, reuse
of extents), it'd be nice to have this tool inside btrfs-progs, either as a
part of "fi du" or another command.


That will be useful =)


If improved, I think there is a chance to get it into btrfs-progs.

Thanks,
Qu



P.S.
your code works amazingly fast on my SSD and data %)
150Gb data
-O0 2.12s
-O2 0.51s




Re: Is autodefrag recommended? -- re-duplication???

2017-09-05 Thread Marat Khalili

Dear experts,

My first reaction to just switching autodefrag on was positive, but
mentions of re-duplication are very scary. The main use of BTRFS here is
backup snapshots, so re-duplication would be disastrous.


To stick to a concrete example, let there be two files, 4KB and
4GB in size, each referenced in read-only snapshots 100 times, with some
4KB of each file rewritten every night before another snapshot is
created (let's ignore snapshot deletion here). AFAIU, 8KB of additional
space (+metadata) will be allocated each night without autodefrag. With
autodefrag, will it be perhaps 4KB+128KB, or something much worse?
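Marat's question can be put into numbers as a back-of-envelope sketch. The assumptions here are exactly the ones in the question: roughly 128 KiB is rewritten around the dirty range of the large file (the per-write figure mentioned elsewhere in the thread), and the 4 KB file simply gets rewritten whole:

```python
KIB = 1024

# Nightly change: 4 KiB rewritten in each of the two files.
plain_cow = 2 * 4 * KIB            # CoW alone: just the two dirty blocks

# With autodefrag (assumption: ~128 KiB rewritten around the dirty
# range of the 4 GiB file, and the 4 KiB file rewritten whole).
autodefrag = 4 * KIB + 128 * KIB

print(plain_cow, autodefrag)       # 8192 vs 135168 bytes per night
```

So under these assumptions the nightly growth goes from 8 KiB to about 132 KiB, a factor of ~16, before metadata. Whether the real kernel behavior matches this exactly is precisely what the thread is debating.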


--

With Best Regards,
Marat Khalili



Re: Is autodefrag recommended?

2017-09-05 Thread Austin S. Hemmelgarn

On 2017-09-05 08:49, Henk Slager wrote:

On Tue, Sep 5, 2017 at 1:45 PM, Austin S. Hemmelgarn
 wrote:


   - You end up duplicating more data than is strictly necessary. This
 is, IIRC, something like 128 KiB for a write.


FWIW, I'm pretty sure you can mitigate this first issue by running a regular
defrag on a semi-regular basis (monthly is what I would probably suggest).


No, both autodefrag and regular defrag duplicate data, so if you keep
snapshots around for weeks or months, it can eat up a significant
amount of space.

I'm not talking about data duplication due to broken reflinks, I'm 
talking about data duplication due to how partial extent rewrites are 
handled in BTRFS.


As a more illustrative example, suppose you've got a 256k file that has
just one extent.  Such a file will require 256k of space for the data.
Now rewrite from 128k to 192k.  The file now technically takes up 320k,
because the region you rewrote is still allocated in the original extent.


I know that sub-extent-size reflinks are handled like this (in the above
example, if you instead use the CLONE ioctl to reflink that range into a
new file, then delete the original, the remaining 192k of space in the
extent ends up unreferenced, but is kept around until the referenced
region itself is no longer referenced; the easiest way to ensure that is
to either rewrite the whole file or defragment it). I'm pretty sure from
reading the code that mid-extent writes are handled this way too, in
which case a full defrag can reclaim that space.
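The accounting in the 256k example above can be sketched as arithmetic (numbers only; the actual on-disk behavior depends on kernel version and write pattern):

```python
KIB = 1024

original_extent = 256 * KIB        # one extent backs the whole file
rewrite = (192 - 128) * KIB        # rewrite of the range [128k, 192k)

# The rewrite allocates a new 64k extent, while the original 256k
# extent stays fully allocated as long as any part of it is referenced.
allocated = original_extent + rewrite
logical_size = 256 * KIB

print(logical_size, allocated)     # a 256k file pinning 320k on disk
```

Defragmenting (or fully rewriting) the file drops the last reference to the original extent, letting the 64k of dead space in it be reclaimed.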



Re: How to disable/revoke 'compression'?

2017-09-05 Thread Qu Wenruo



On 2017-09-05 19:36, Austin S. Hemmelgarn wrote:

On 2017-09-03 19:55, Qu Wenruo wrote:



On 2017-09-04 02:06, Adam Borowski wrote:

On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:

Hi,
I used the mount option 'compression' on some mounted sub volumes. How
can I revoke the compression? Means to delete the option and get all
data uncompressed on this volume.
Is it enough to remount the sub volume without this option? Or is it
necessary to do some additional step (balancing?) to get all stored data
uncompressed.


If you set it via mount option, removing the option is enough to disable
compression for _new_ files.  Other ways are chattr +c and 
btrfs-property,

but if you haven't heard about those you almost surely don't have such
attributes set.

After remounting, you may uncompress existing files.  Balancing won't do
this as it moves extents around without looking inside; defrag on the 
other
hand rewrites extents thus as a side effect it applies new 
[non]compression

settings.  Thus: 「btrfs fi defrag -r /path/to/filesystem」.

Besides that, is it possible to find out the real and compressed size
of a file, or the ratio, for example?


Currently not.

I've once written a tool which does this, but 1. it's extremely slow, 2.
insane, 3. so insane a certain member of this list would kill me had I
distributed the tool.  Thus, I'd need to rewrite it first...


AFAIK the only method to determine the compression ratio is to check 
the EXTENT_DATA key and its corresponding file_extent_item structure.

(Which I assume is how Adam is doing it.)

That structure records its on-disk data size and in-memory data
size. (Both rounded up to sectorsize, which is 4K in most cases.)

So in theory it's possible to determine the compression ratio.

The only method I can think of (maybe I've forgotten some?) is to
use an offline tool (btrfs-debug-tree) to check that.
FS APIs like fiemap don't even support reporting the on-disk data size,
so we can't use them.



But the problem is more complicated, especially when compressed CoW is 
involved.


For example, there is an extent (A) which represents the data for 
inode 258, range [0,128k).

Its on-disk size is just 4K.

And when we write the range [32K, 64K), it gets CoWed and
compressed, resulting in a new file extent (B) for inode 258, range [32K,
64K), with an on-disk size of, say, 4K.


Then file extent layout for 258 will be:
[0,32k):  range [0,32K) of uncompressed Extent A
[32k, 64k): range [0,32k) of uncompressed Extent B
[64k, 128k): range [64k, 128K) of uncompressed Extent A.

And on disk extent size is 4K (compressed Extent A) + 4K (compressed 
Extent B) = 8K.


Before the write, the compression ratio is 4K/128K = 3.125%.
After the write, the compression ratio is 8K/128K = 6.25%.

Not to mention that it's possible to have uncompressed file extents.

So it's complicated even if we're just using an offline tool to determine
the compression ratio of a btrfs compressed file.
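Qu's worked example, reduced to arithmetic (a sketch only; the sizes are exactly those given above):

```python
KIB = 1024

ram_size = 128 * KIB              # logical range [0, 128k) of inode 258

disk_before = 4 * KIB             # compressed extent A only
disk_after = 4 * KIB + 4 * KIB    # extent A still pinned + new extent B

ratio_before = disk_before / ram_size
ratio_after = disk_after / ram_size

print(f"{ratio_before:.3%} -> {ratio_after:.2%}")  # 3.125% -> 6.25%
```

The ratio doubles even though only 32K of logical data was rewritten, because the whole compressed extent A remains allocated alongside the new extent B.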
Out of curiosity, is there any easier method if you just want an 
aggregate ratio for the whole filesystem?  The intuitive option of 
comparing `du -sh` output to how much space is actually used in chunks 
is obviously out because that will count sparse ranges as 'compressed', 
and there should actually be a significant difference in values there 
for an uncompressed filesystem (the chunk usage should be higher).


I could be totally wrong (since I just forgot the quite obvious
SEARCH_TREE ioctl), but according to the btrfs on-disk format, only
EXTENT_DATA contains the compression ratio (ram size and on-disk size).


So to get the ratio for the whole fs, one needs to iterate through the whole
extent tree and follow a backref (any one is enough) to locate the
EXTENT_DATA and get the compression ratio.


That's to say, that will be slow anyway.

Thanks,
Qu




Re: Is autodefrag recommended?

2017-09-05 Thread Henk Slager
On Tue, Sep 5, 2017 at 1:45 PM, Austin S. Hemmelgarn
 wrote:

>>   - You end up duplicating more data than is strictly necessary. This
>> is, IIRC, something like 128 KiB for a write.
>
> FWIW, I'm pretty sure you can mitigate this first issue by running a regular
> defrag on a semi-regular basis (monthly is what I would probably suggest).

No, both autodefrag and regular defrag duplicate data, so if you keep
snapshots around for weeks or months, it can eat up a significant
amount of space.


Re: Is autodefrag recommended?

2017-09-05 Thread A L
There is a drawback in that defragmentation re-dups data that was previously
deduped or shared in snapshots/subvolumes.

 From: Marat Khalili  -- Sent: 2017-09-04 - 11:31 

> Hello list,
> good time of the day,
> 
> More than once I see mentioned in this list that autodefrag option 
> solves problems with no apparent drawbacks, but it's not the default. 
> Can you recommend to just switch it on indiscriminately on all 
> installations?
> 
> I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's 
> Ubuntu that gives us this strange choice, no idea why it's not 4.9). 
> Only spinning rust here, no SSDs.
> 
> --
> 
> With Best Regards,
> Marat Khalili




Re: Is autodefrag recommended?

2017-09-05 Thread Austin S. Hemmelgarn

On 2017-09-04 06:54, Hugo Mills wrote:

On Mon, Sep 04, 2017 at 12:31:54PM +0300, Marat Khalili wrote:

Hello list,
good time of the day,

More than once I see mentioned in this list that autodefrag option
solves problems with no apparent drawbacks, but it's not the
default. Can you recommend to just switch it on indiscriminately on
all installations?

I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's
Ubuntu that gives us this strange choice, no idea why it's not 4.9).
Only spinning rust here, no SSDs.


autodefrag effectively works by taking a small region around every
write or cluster of writes and making that into a stand-alone extent.
I was under the impression that it had some kind of 'random access'
detection heuristic, and only triggered if that flagged the write
patterns as 'random'.


This has two consequences:

  - You end up duplicating more data than is strictly necessary. This
is, IIRC, something like 128 KiB for a write.
FWIW, I'm pretty sure you can mitigate this first issue by running a 
regular defrag on a semi-regular basis (monthly is what I would probably 
suggest).


  - There's an I/O overhead for enabling autodefrag, because it's
increasing the amount of data written.
And this one may not be as much of an issue.  The region being 
rewritten gets written out sequentially, so it will increase the amount 
of data written, but in most cases probably won't increase IO request 
counts to the device by much.  If you care mostly about raw bandwidth, 
then this could still have an impact, but if you care about IOPS, it 
probably won't have much impact unless you're already running the device 
at peak capacity.



Re: How to disable/revoke 'compression'?

2017-09-05 Thread Austin S. Hemmelgarn

On 2017-09-03 19:55, Qu Wenruo wrote:



On 2017-09-04 02:06, Adam Borowski wrote:

On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:

Hi,
I used the mount option 'compression' on some mounted sub volumes. How
can I revoke the compression? Means to delete the option and get all
data uncompressed on this volume.
Is it enough to remount the sub volume without this option? Or is it
necessary to do some additional step (balancing?) to get all stored data
uncompressed.


If you set it via mount option, removing the option is enough to disable
compression for _new_ files.  Other ways are chattr +c and 
btrfs-property,

but if you haven't heard about those you almost surely don't have such
attributes set.

After remounting, you may uncompress existing files.  Balancing won't do
this as it moves extents around without looking inside; defrag on the 
other
hand rewrites extents thus as a side effect it applies new 
[non]compression

settings.  Thus: 「btrfs fi defrag -r /path/to/filesystem」.

Besides that, is it possible to find out the real and compressed size
of a file, or the ratio, for example?


Currently not.

I've once written a tool which does this, but 1. it's extremely slow, 2.
insane, 3. so insane a certain member of this list would kill me had I
distributed the tool.  Thus, I'd need to rewrite it first...


AFAIK the only method to determine the compression ratio is to check the 
EXTENT_DATA key and its corresponding file_extent_item structure.

(Which I assume is how Adam is doing it.)

That structure records its on-disk data size and in-memory data
size. (Both rounded up to sectorsize, which is 4K in most cases.)

So in theory it's possible to determine the compression ratio.

The only method I can think of (maybe I've forgotten some?) is to use an
offline tool (btrfs-debug-tree) to check that.
FS APIs like fiemap don't even support reporting the on-disk data size,
so we can't use them.



But the problem is more complicated, especially when compressed CoW is 
involved.


For example, there is an extent (A) which represents the data for inode 
258, range [0,128k).

Its on-disk size is just 4K.

And when we write the range [32K, 64K), it gets CoWed and compressed,
resulting in a new file extent (B) for inode 258, range [32K, 64K), with
an on-disk size of, say, 4K.


Then file extent layout for 258 will be:
[0,32k):  range [0,32K) of uncompressed Extent A
[32k, 64k): range [0,32k) of uncompressed Extent B
[64k, 128k): range [64k, 128K) of uncompressed Extent A.

And on disk extent size is 4K (compressed Extent A) + 4K (compressed 
Extent B) = 8K.


Before the write, the compression ratio is 4K/128K = 3.125%.
After the write, the compression ratio is 8K/128K = 6.25%.

Not to mention that it's possible to have uncompressed file extents.

So it's complicated even if we're just using an offline tool to determine
the compression ratio of a btrfs compressed file.
Out of curiosity, is there any easier method if you just want an 
aggregate ratio for the whole filesystem?  The intuitive option of 
comparing `du -sh` output to how much space is actually used in chunks 
is obviously out because that will count sparse ranges as 'compressed', 
and there should actually be a significant difference in values there 
for an uncompressed filesystem (the chunk usage should be higher).



Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]

2017-09-05 Thread Marco Lorenzo Crociani

Hi,
I was transferring some data with rsync to a btrfs filesystem when I got:

set 04 14:59:05  kernel: INFO: task kworker/u33:2:25015 blocked for more 
than 120 seconds.

set 04 14:59:05  kernel:   Not tainted 4.12.10-1.el7.elrepo.x86_64 #1
set 04 14:59:05  kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.

set 04 14:59:05  kernel: kworker/u33:2   D0 25015  2 0x0080
set 04 14:59:05  kernel: Workqueue: events_unbound 
btrfs_async_reclaim_metadata_space [btrfs]

set 04 14:59:05  kernel: Call Trace:
set 04 14:59:05  kernel:  __schedule+0x28a/0x880
set 04 14:59:05  kernel:  schedule+0x36/0x80
set 04 14:59:05  kernel:  wb_wait_for_completion+0x64/0x90
set 04 14:59:05  kernel:  ? remove_wait_queue+0x60/0x60
set 04 14:59:05  kernel:  __writeback_inodes_sb_nr+0x8e/0xb0
set 04 14:59:05  kernel:  writeback_inodes_sb_nr+0x10/0x20
set 04 14:59:05  kernel:  flush_space+0x469/0x580 [btrfs]
set 04 14:59:05  kernel:  ? dequeue_task_fair+0x577/0x830
set 04 14:59:05  kernel:  ? pick_next_task_fair+0x122/0x550
set 04 14:59:05  kernel:  btrfs_async_reclaim_metadata_space+0x112/0x430 
[btrfs]

set 04 14:59:05  kernel:  process_one_work+0x149/0x360
set 04 14:59:05  kernel:  worker_thread+0x4d/0x3c0
set 04 14:59:05  kernel:  kthread+0x109/0x140
set 04 14:59:05  kernel:  ? rescuer_thread+0x380/0x380
set 04 14:59:05  kernel:  ? kthread_park+0x60/0x60
set 04 14:59:05  kernel:  ? do_syscall_64+0x67/0x150
set 04 14:59:05  kernel:  ret_from_fork+0x25/0x30

btrfs fi df /data
Data, single: total=20.63TiB, used=20.63TiB
System, DUP: total=8.00MiB, used=2.20MiB
Metadata, DUP: total=41.50GiB, used=40.61GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

btrfs fi show /dev/sdo
Label: 'Storage'  uuid: 429e42f4-dd9e-4267-b353-aa0831812f87
Total devices 1 FS bytes used 20.67TiB
devid1 size 36.38TiB used 20.71TiB path /dev/sdo

Is it serious? Can I provide other info?

Regards,

--
Marco Crociani


Re: btrfs check --repair now runs in minutes instead of hours? aborting

2017-09-05 Thread Duncan
Qu Wenruo posted on Tue, 05 Sep 2017 17:06:35 +0800 as excerpted:

>> See if these numbers, copied and reformatted from his post with spaces
>> inserted either side of the numbers and the equals signs deleted,
>> arrive any less garbled:
>> 
>> Data, single: total 10.60 TiB, used 10.54 TiB
>> System, DUP: total 32.00 MiB, used 1.19 MiB
>> Metadata, DUP: total 58.00 GiB, used 12.69 GiB
>> GlobalReserve, single: total 512.00 MiB, used 0.00 B
>> 
> Thanks a lot for this.

It worked. =:^)  (But thinking about it now, that smiley with an equals 
sign probably won't!)


-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: btrfs check --repair now runs in minutes instead of hours? aborting

2017-09-05 Thread Qu Wenruo



On 2017-09-05 16:54, Duncan wrote:

Qu Wenruo posted on Tue, 05 Sep 2017 16:05:04 +0800 as excerpted:


On 2017-09-05 10:55, Marc MERLIN wrote:


gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total.60TiB, used.54TiB
System, DUP: total2.00MiB, used=1.19MiB
Metadata, DUP: totalX.00GiB, used.69GiB


Wait for a minute.

Is that .69GiB means 706 MiB? Or my email client/GMX screwed up the
format (again)?


It appears to be on your end.  Based on the fact that I'm seeing a bunch
of weird characters in your quote of the message that I didn't see in the
original, I'm guessing it's charset related, very possibly due to the
"equal" sign being an escape character for mime/quoted-printable (tho his
post was text/plain; charset equals utf-8, full 8-bit, so not quoted-
printable encoded at all) and I believe various i18n escapes as well,
with the latter being an issue if the client assumes local charset
despite the utf8 specified in the header.

See if these numbers, copied and reformatted from his post with spaces
inserted either side of the numbers and the equals signs deleted, arrive
any less garbled:

Data, single: total 10.60 TiB, used 10.54 TiB
System, DUP: total 32.00 MiB, used 1.19 MiB
Metadata, DUP: total 58.00 GiB, used 12.69 GiB
GlobalReserve, single: total 512.00 MiB, used 0.00 B


Thanks a lot for this.

I'd better double-check my client setup to avoid such embarrassment.

Thanks,
Qu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs check --repair now runs in minutes instead of hours? aborting

2017-09-05 Thread Duncan
Qu Wenruo posted on Tue, 05 Sep 2017 16:05:04 +0800 as excerpted:

> On 2017-09-05 10:55, Marc MERLIN wrote:
>> 
>> gargamel:~# btrfs fi df /mnt/btrfs_pool1
>> Data, single: total.60TiB, used.54TiB
>> System, DUP: total2.00MiB, used=1.19MiB
>> Metadata, DUP: totalX.00GiB, used.69GiB
> 
> Wait for a minute.
> 
> Is that .69GiB means 706 MiB? Or my email client/GMX screwed up the
> format (again)?

It appears to be on your end.  Based on the fact that I'm seeing a bunch
of weird characters in your quote of the message that I didn't see in the
original, I'm guessing it's charset related, very possibly due to the 
"equal" sign being an escape character for mime/quoted-printable (tho his 
post was text/plain; charset equals utf-8, full 8-bit, so not quoted-
printable encoded at all) and I believe various i18n escapes as well, 
with the latter being an issue if the client assumes local charset 
despite the utf8 specified in the header.
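Duncan's quoted-printable theory is easy to verify mechanically: feeding Marc's original lines through a quoted-printable decoder reproduces the observed garbling, with "=XX" hex pairs collapsing to single bytes ("=58" to 'X', "=51" to 'Q', "=10"/"=12" to invisible control characters) while invalid pairs like "=1." pass through untouched. A sketch using Python's stdlib quopri module:

```python
import quopri

# Marc's original, correct lines (reconstructed from the numbers below).
originals = [
    b"Data, single: total=10.60TiB, used=10.54TiB",
    b"System, DUP: total=32.00MiB, used=1.19MiB",
    b"Metadata, DUP: total=58.00GiB, used=12.69GiB",
    b"GlobalReserve, single: total=512.00MiB, used=0.00B",
]

# Misinterpreting the 8-bit text as quoted-printable mangles every "="
# that is followed by two hex digits, exactly as seen in Qu's quote.
for line in originals:
    print(quopri.decodestring(line))
```

Note how this yields "totalX.00GiB", "totalQ2.00MiB", and the vanished digits after "=10"/"=12", while "used=1.19MiB" and "used=0.00B" survive because '.' is not a hex digit.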

See if these numbers, copied and reformatted from his post with spaces 
inserted either side of the numbers and the equals signs deleted, arrive 
any less garbled:

Data, single: total 10.60 TiB, used 10.54 TiB
System, DUP: total 32.00 MiB, used 1.19 MiB
Metadata, DUP: total 58.00 GiB, used 12.69 GiB
GlobalReserve, single: total 512.00 MiB, used 0.00 B

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: [PATCH 0/5] Mkfs: Rework --rootdir to a more generic behavior

2017-09-05 Thread Qu Wenruo



On 2017-09-05 02:08, David Sterba wrote:

On Mon, Sep 04, 2017 at 03:41:05PM +0900, Qu Wenruo wrote:

mkfs.btrfs --rootdir provides user a method to generate btrfs with
pre-written content while without the need of root privilege.

However the code is quite old and doesn't get much review or test.
This makes some strange behavior, from customized chunk allocation
(which uses the reserved 0~1M device space) to lack of special file
handler (Fixed in previous 2 patches).


The cleanup in this area is most welcome. The patches look good after a
quick look, I'll do another review round.


To save you some time: I found that my rework can't create a new image,
which the old --rootdir could do. So it's still not completely the same behavior.
I can fix it easily by creating a large sparse file first and then
truncating it using the current method.


But this really concerns me: do we need to shrink the fs?

I had a discussion with Austin about this, thread named "[btrfs-progs] 
Bug in mkfs.btrfs -r".
The only equivalent I found is "mkfs.ext4 -d", which can only create a new
file if a size is given, and will not shrink the fs.

(Genext2fs shrinks the fs, but is no longer in e2fsprogs)

If we follow that behavior, the 3rd and 5th patches are not needed, 
which I'm pretty happy with.


Functionally, both behaviors can be implemented with the current method, but
I want to confirm which is the intended behavior so I can stick to it.


I hope you can make the final decision on this so I can update the
patchset.


Thanks,
Qu




Re: btrfs check --repair now runs in minutes instead of hours? aborting

2017-09-05 Thread Qu Wenruo



On 2017-09-05 10:55, Marc MERLIN wrote:

On Tue, Sep 05, 2017 at 09:21:55AM +0800, Qu Wenruo wrote:



On 2017-09-05 09:05, Marc MERLIN wrote:

Ok, I don't want to sound like I'm complaining :) but I updated
btrfs-progs to top of tree in git, installed it, and ran it on an 8TiB
filesystem that used to take 12H or so to check.


How much space is allocated for that 8T fs?
If the metadata is not that large, 10 minutes is plausible.

The fi df output could help here.


gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total.60TiB, used.54TiB
System, DUP: total2.00MiB, used=1.19MiB
Metadata, DUP: totalX.00GiB, used.69GiB


Wait a minute.

Does that .69GiB mean 706 MiB? Or did my email client/GMX screw up the
format (again)?

This output format must be changed, at least to 0.69 GiB, or 706 MiB.

I'll fix this first.


GlobalReserve, single: totalQ2.00MiB, used=0.00B


And without --repair, how much time does it take to run?


Well, funny you should ask, it's now been running for hours, still waiting...

Just before, I ran lowmem, and it was pretty quick too (didn't time it,
but less than 1h):


You mean lowmem is actually FASTER than the original mode?
That's very surprising.

Are there any special operations done on that btrfs?
Like offline dedupe or tons of reflinks?

IIRC the original mode does a quite slow check when there are tons of
reflinks, which may be related.


BTW, how many subvolumes do you have in the fs?


gargamel:/var/local/src/btrfs-progs# btrfs check --mode=lowmem
/dev/mapper/dshelf1
Checking filesystem on /dev/mapper/dshelf1
UUID: 36f5079e-ca6c-4855-8639-ccb82695c18d
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 11674263330816 bytes used, no error found
total csum bytes: 11384482936
total tree bytes: 13738737664
total fs tree bytes: 758988800
total extent tree bytes: 482623488
btree space waste bytes: 1171475737
file data blocks allocated: 12888981110784
  referenced 12930453286912

Now, this is good news for my filesystem being probably clean (previous
versions of lowmem before my git update found issues that were unclear, but
apparently errors in the code, and this version finds nothing)

But I'm not sure why --repair would be fast while a plain check without
--repair would be slow?


This looks like a bug. My first guess is that it's related to the number
of subvolumes/reflinks, but I'm not sure, since I don't have many
real-world btrfs filesystems.


I'll take sometime to look into it.

Thanks for the very interesting report,
Qu



Marc

