Re: [PATCH] btrfs: use proper type for failrec in extent_state

2016-02-23 Thread David Sterba
On Tue, Feb 23, 2016 at 10:06:59AM +0800, Qu Wenruo wrote:
> > I'm not sure why, but my gcc 5.3.1 thinks that a member of failrec can
> > be used uninitialized:
> >
> >   CC [M]  fs/btrfs/extent_io.o
> > fs/btrfs/extent_io.c: In function ‘clean_io_failure’:
> > fs/btrfs/extent_io.c:2133:4: warning: ‘failrec’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> >    repair_io_failure(inode, start, failrec->len,
> >    ^
> >
> After applying your patch on integration-4.6, and compiling, my gcc 
> 5.3.0 didn't give such warning though.
> 
> And I didn't see an official 5.3.1, but only 5.3 so I assume maybe it's 
> related to the .1 from your distribution?

Yes it's distro gcc 5.3 + patches. Arnd sent a patch to fix the warning.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: avoid uninitialized variable warning

2016-02-23 Thread David Sterba
On Mon, Feb 22, 2016 at 10:53:20PM +0100, Arnd Bergmann wrote:
> With CONFIG_SMP and CONFIG_PREEMPT both disabled, gcc decides
> to partially inline the get_state_failrec() function but cannot
> figure out that this means the failrec pointer is always valid
> if the function returns success, which causes a harmless
> warning:
> 
> fs/btrfs/extent_io.c: In function 'clean_io_failure':
> fs/btrfs/extent_io.c:2131:4: error: 'failrec' may be used uninitialized in this function [-Werror=maybe-uninitialized]
> 
> This marks get_state_failrec() and set_state_failrec() both
> as 'noinline', which avoids the warning in all cases for me,
> and seems less ugly than adding a fake initialization.

Thanks for the analysis and the fix, works for me.


Re: [PATCH 2/2] Btrfs: fix lockdep deadlock warning due to dev_replace

2016-02-23 Thread David Sterba
On Fri, Jul 17, 2015 at 04:49:19PM +0800, Liu Bo wrote:
> With this, btrfs/011 no longer produces warnings in dmesg.
> 
> Signed-off-by: Liu Bo 

Patch added to next for 4.6.


Re: [bug] unable to handle kernel paging request when running btrfs/011

2016-02-23 Thread David Sterba
On Sun, Feb 21, 2016 at 09:36:44AM +0800, Qu Wenruo wrote:
> 
> 
> On 02/19/2016 09:41 PM, Anand Jain wrote:
> >
> >
> >   Saw below warn leading to bug when running btrfs/011, not
> >   reproducible. Any idea ?
> >
> Seems like another wq_destroy race.
> 
> But it's hard to locate which wq is the cause from backtrace only.

The name of the workqueue is in the first stack trace: it's readahead. The
crash happens in 'umount'.


Re: [bug] unable to handle kernel paging request when running btrfs/011

2016-02-23 Thread David Sterba
On Sun, Feb 21, 2016 at 09:36:44AM +0800, Qu Wenruo wrote:
> 
> 
> On 02/19/2016 09:41 PM, Anand Jain wrote:
> >
> >
> >   Saw below warn leading to bug when running btrfs/011, not
> >   reproducible. Any idea ?
> >
> Seems like another wq_destroy race.
> 
> But it's hard to locate which wq is the cause from backtrace only.
> 
> What's the wq btrfs_stop_all_workers+0xcd is going to free?

You can always try to guess on your locally built sources, provided the
configs do not enable "too much" debugging. What I get looks reasonable:

(gdb) l *(btrfs_stop_all_workers+0xcd)
0x2bc7d is in btrfs_stop_all_workers (fs/btrfs/disk-io.c:2154).
warning: Source file is more recent than executable.
2149            btrfs_destroy_workqueue(fs_info->delayed_workers);
2150            btrfs_destroy_workqueue(fs_info->caching_workers);
2151            btrfs_destroy_workqueue(fs_info->readahead_workers);
2152            btrfs_destroy_workqueue(fs_info->flush_workers);
2153            btrfs_destroy_workqueue(fs_info->qgroup_rescan_workers);
2154            btrfs_destroy_workqueue(fs_info->extent_workers);
2155    }
2156
2157    static void free_root_extent_buffers(struct btrfs_root *root)
2158    {

> And what's the wq end_workqueue_bio+0x85 is going to add?

(gdb) l *(end_workqueue_bio+0x85)
0x2ab55 is in end_workqueue_bio (fs/btrfs/disk-io.c:726).
721
722             if (bio->bi_rw & REQ_WRITE) {
723                     if (end_io_wq->metadata == BTRFS_WQ_ENDIO_METADATA) {
724                             wq = fs_info->endio_meta_write_workers;
725                             func = btrfs_endio_meta_write_helper;
726                     } else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_FREE_SPACE) {
727                             wq = fs_info->endio_freespace_worker;
728                             func = btrfs_freespace_write_helper;
729                     } else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_RAID56) {
730                             wq = fs_info->endio_raid56_workers;
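
The lookups above can be scripted when a backtrace has many frames. The following is only a sketch: it turns a "symbol+offset" frame into a non-interactive gdb invocation using gdb's -batch and -ex options. The default object path fs/btrfs/btrfs.ko and the helper name gdb_list_command are assumptions for illustration; use whatever locally built object matches the crashing kernel.

```python
import re
import shlex

# Matches backtrace entries such as "btrfs_stop_all_workers+0xcd" or
# "end_workqueue_bio+0x85/0x140" (the "/size" suffix is ignored).
FRAME_RE = re.compile(r"(?P<sym>[A-Za-z_][A-Za-z0-9_.]*)\+(?P<off>0x[0-9a-fA-F]+)")


def gdb_list_command(frame, obj="fs/btrfs/btrfs.ko"):
    """Build a shell command that lists the source around symbol+offset.

    `obj` is a hypothetical path to a locally built object with debug info.
    """
    match = FRAME_RE.search(frame)
    if match is None:
        raise ValueError(f"no symbol+offset found in {frame!r}")
    expr = f"l *({match['sym']}+{match['off']})"
    return f"gdb -batch -ex {shlex.quote(expr)} {shlex.quote(obj)}"


if __name__ == "__main__":
    print(gdb_list_command("btrfs_stop_all_workers+0xcd/0x120 [btrfs]"))
```

The result is only as trustworthy as the match between the local build and the kernel that crashed, as the "Source file is more recent than executable" warning above hints.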


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Nazar Mokrynskyi

> > What is wrong with noatime,relatime? I'm using them for a long time as
> > good compromise in terms of performance.
> The one option ends up canceling the other, as they're both atime related
> options that say to do different things.
>
> I'd have to actually set up a test or do some research to be sure which
> one overrides the other (but someone here probably can say without
> further research), though I'd /guess/ the latter one overrides the earlier
> one, which would effectively make them both pretty much useless, since
> relatime is the normal kernel default and thus doesn't need to be
> specified.
>
> Noatime is strongly recommended for btrfs, however, particularly with
> snapshots, as otherwise, the changes between snapshots can consist mostly
> of generally useless atime changes.
>
> (FWIW, after over a decade of using noatime here (I first used it on the
> then new reiserfs, after finding a recommendation for it on that), I got
> tired of specifying the option on nearly all my fstab entries, and
> nowadays carry a local kernel patch that changes the default to noatime,
> allowing me to drop specifying it everywhere.  I don't claim to be a
> coder, let alone a kernel level coder, but as a gentooer used to building
> from source for over a decade, I've found that I can often find the code
> behind some behavior I'd like to tweak, and given good enough comments, I
> can often create trivial patches to accomplish that tweak, even if it's
> not exactly the code a real C coder would choose to use, which is exactly
> what I've done here.  So now, unless some other atime option is
> specified, my filesystems are all mounted noatime.  =:^)

Well, then I'll leave relatime on root fs and noatime on partition with 
snapshots, thanks.
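
A quick way to spot the conflict discussed above is to scan an fstab options field for more than one atime policy. This is a minimal sketch; the function names are made up for illustration, and it deliberately does not say which option wins, since the thread leaves that open.

```python
# Access-time policies that contradict each other when combined.
ATIME_POLICIES = {"noatime", "relatime", "strictatime"}


def atime_policies(options):
    """Return the atime policy options in the order they appear."""
    return [opt.strip() for opt in options.split(",")
            if opt.strip() in ATIME_POLICIES]


def has_atime_conflict(options):
    """True if the options string names more than one distinct atime policy."""
    return len(set(atime_policies(options))) > 1


if __name__ == "__main__":
    opts = "compress=lzo,noatime,relatime,ssd,subvol=/root"
    print(atime_policies(opts), has_atime_conflict(opts))
```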


Sincerely, Nazar Mokrynskyi
github.com/nazar-pc
Skype: nazar-pc
Diaspora: naza...@diaspora.mokrynskyi.com
Tox: 
A9D95C9AA5F7A3ED75D83D0292E22ACE84BA40E912185939414475AF28FD2B2A5C8EF5261249




smime.p7s
Description: S/MIME cryptographic signature


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Alexander Fougner
2016-02-23 17:55 GMT+01:00 Nazar Mokrynskyi :
> Well, then I'll leave relatime on root fs and noatime on partition with
> snapshots, thanks.

If you snapshot the root filesystem then the atime changes will still
be there, and you'll be having a lot of unnecessary changes between
each snapshot.



Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Nazar Mokrynskyi
But why? I have the relatime option set; if I understand it correctly, it
should not cause changes unless the file contents actually change.


Sincerely, Nazar Mokrynskyi
github.com/nazar-pc
Skype: nazar-pc
Diaspora: naza...@diaspora.mokrynskyi.com
Tox: 
A9D95C9AA5F7A3ED75D83D0292E22ACE84BA40E912185939414475AF28FD2B2A5C8EF5261249



Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Marc MERLIN
Well, since we're on the topic, my backup server btrfs FS has become so
slow that it hangs my system a few seconds here and there and causes
some of my cron jobs to fail.

I'm going to re-create it for the 3rd time (in 3 years), adding bcache this
time, but clearly there is a good chance that this filesystem is indeed
going to crap performance-wise because all it does is take btrfs
receive and rsync backups, with snapshot rotations (daily snapshots, and
they expire after a couple of weeks).

I'm currently doing a very slow defrag to see if it'll help (looks like
it's going to take days).
I'm doing this:
for i in dir1 dir2 debian32 debian64 ubuntu dir4 ; do echo "$i"; time btrfs fi defragment -v -r "$i"; done

But, just to be clear, is there a way I missed to see how fragmented my
filesystem is without running filefrag on millions of files and parsing
the output?
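
For reference, the filefrag-and-parse approach the question hopes to avoid can be sketched as below. It assumes filefrag's usual one-line-per-file summary ("path: N extents found"); the helper name and the returned keys are made up for illustration. As far as I know, with compress=lzo the per-file counts may overstate real fragmentation, since compressed data is stored in small extents, so treat the numbers as a rough indicator only.

```python
import re

# filefrag prints one summary line per file, e.g. "/mnt/a: 12 extents found".
EXTENTS_RE = re.compile(r"^(?P<path>.*): (?P<count>\d+) extents? found$")


def summarize_filefrag(lines):
    """Aggregate extent counts from filefrag summary lines.

    Lines that do not look like a summary are skipped.
    """
    counts = []
    for line in lines:
        match = EXTENTS_RE.match(line.rstrip("\n"))
        if match:
            counts.append(int(match["count"]))
    if not counts:
        return {"files": 0, "extents": 0, "max": 0, "mean": 0.0}
    return {
        "files": len(counts),
        "extents": sum(counts),
        "max": max(counts),
        "mean": sum(counts) / len(counts),
    }


if __name__ == "__main__":
    import sys
    # Example: find /mnt/pool -xdev -type f -exec filefrag {} + | python3 this_script.py
    print(summarize_filefrag(sys.stdin))
```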

Label: 'dshelf2'  uuid: d4a51178-c1e6-4219-95ab-5c5864695bfd
        Total devices 1 FS bytes used 4.25TiB
        devid    1 size 7.28TiB used 4.44TiB path /dev/mapper/dshelf2

btrfs fi df /mnt/btrfs_pool2/
Data, single: total=4.29TiB, used=4.18TiB
System, DUP: total=64.00MiB, used=512.00KiB
Metadata, DUP: total=77.50GiB, used=73.31GiB
GlobalReserve, single: total=512.00MiB, used=31.22MiB

Currently, it's btrfs on top of dm-crypt on top of swraid5

Since I'm about to recreate this after a very slow backup/restore
process, if you have suggestions on how I could better build this
(outside of using a 4.4 kernel this time), they would be appreciated.

Also, should I try running defragment -r from cron from time to time?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Alexander Fougner
2016-02-23 18:18 GMT+01:00 Nazar Mokrynskyi :
> But why? I have relatime option, it should not cause changes unless file
> contents is actually changed if I understand this option correctly.
>

*or* if it is older than 1 day. From the manpages:

relatime
  Update inode access times relative to modify or change time.
  Access time is only updated if the previous access time was
  earlier than the current modify or change time.  (Similar to
  noatime, but it doesn't break mutt or other applications that
  need to know if a file has been read since the last time it
  was modified.)

  Since Linux 2.6.30, the kernel defaults to the behavior
  provided by this option (unless noatime was specified), and
  the strictatime option is required to obtain traditional
  semantics.  In addition, since Linux 2.6.30, the file's last
  access time is always updated if it is more than 1 day old.

Also, if you only use relatime, then you don't need to specify it,
it's the default since 2.6.30 as mentioned above.
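
The rule in the man page excerpt can be written down as a small decision function. This is a sketch of the documented behaviour, not the kernel's code: timestamps are plain seconds, the function name is made up, and treating equal timestamps as "needs update" is an assumption about boundary handling that the excerpt does not spell out.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_updates_atime(atime, mtime, ctime, now):
    """Decide whether a read at time `now` would update atime under relatime.

    All arguments are timestamps in seconds.
    """
    # The file was modified (data or metadata) since it was last read.
    if mtime >= atime or ctime >= atime:
        return True
    # The stored atime is at least a day old, so it is refreshed anyway.
    if now - atime >= DAY_SECONDS:
        return True
    return False


if __name__ == "__main__":
    # Read an hour ago, unchanged since: no atime write.
    print(relatime_updates_atime(atime=10_000, mtime=5_000, ctime=5_000, now=13_600))
    # Unchanged, but last read two days ago: atime is written again.
    print(relatime_updates_atime(atime=10_000, mtime=5_000, ctime=5_000,
                                 now=10_000 + 2 * DAY_SECONDS))
```

The second case is why snapshots taken under relatime still pick up atime changes: any file read at least once a day gets one atime write per day.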




Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Nazar Mokrynskyi

Wow, this is interesting, didn't know it.

I'll probably try noatime instead:)

Sincerely, Nazar Mokrynskyi
github.com/nazar-pc
Skype: nazar-pc
Diaspora: naza...@diaspora.mokrynskyi.com
Tox: 
A9D95C9AA5F7A3ED75D83D0292E22ACE84BA40E912185939414475AF28FD2B2A5C8EF5261249



Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Marc MERLIN
On Tue, Feb 23, 2016 at 09:26:35AM -0800, Marc MERLIN wrote:
> Label: 'dshelf2'  uuid: d4a51178-c1e6-4219-95ab-5c5864695bfd
>         Total devices 1 FS bytes used 4.25TiB
>         devid    1 size 7.28TiB used 4.44TiB path /dev/mapper/dshelf2
> 
> btrfs fi df /mnt/btrfs_pool2/
> Data, single: total=4.29TiB, used=4.18TiB
> System, DUP: total=64.00MiB, used=512.00KiB
> Metadata, DUP: total=77.50GiB, used=73.31GiB
> GlobalReserve, single: total=512.00MiB, used=31.22MiB
> 
> Currently, it's btrfs on top of dm-crypt on top of swraid5

Sorry, I forgot to give the mount options:
/dev/mapper/dshelf2 on /mnt/dshelf2/backup type btrfs 
(rw,noatime,compress=lzo,space_cache,skip_balance,subvolid=257,subvol=/backup)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Nazar Mokrynskyi
Looks like btrfstune -x did nothing; it was probably already applied at
creation time. I'm using rcX kernel versions all the time and a rolling
version of Ubuntu, so this is very likely the case.

One thing I've noticed is much slower mount/umount on HDD than on SSD:

nazar-pc@nazar-pc ~> time sudo umount /backup
0.00user 0.00system 0:00.01elapsed 36%CPU (0avgtext+0avgdata 7104maxresident)k
0inputs+0outputs (0major+784minor)pagefaults 0swaps
nazar-pc@nazar-pc ~> time sudo mount /backup
0.00user 0.00system 0:00.03elapsed 23%CPU (0avgtext+0avgdata 7076maxresident)k
0inputs+0outputs (0major+803minor)pagefaults 0swaps
nazar-pc@nazar-pc ~> time sudo umount /backup_hdd
0.00user 0.11system 0:01.04elapsed 11%CPU (0avgtext+0avgdata 7092maxresident)k
0inputs+15296outputs (0major+787minor)pagefaults 0swaps
nazar-pc@nazar-pc ~> time sudo mount /backup_hdd
0.00user 0.02system 0:04.45elapsed 0%CPU (0avgtext+0avgdata 7140maxresident)k
14648inputs+0outputs (0major+795minor)pagefaults 0swaps

It takes especially long (tens of seconds, with high HDD load) when called
after some idle time rather than back to back.

Once it took something like 20 seconds to unmount the filesystem and around
10 seconds to mount it.


Sincerely, Nazar Mokrynskyi
github.com/nazar-pc
Skype: nazar-pc
Diaspora:naza...@diaspora.mokrynskyi.com
Tox: 
A9D95C9AA5F7A3ED75D83D0292E22ACE84BA40E912185939414475AF28FD2B2A5C8EF5261249

On 22.02.16 20:58, Nazar Mokrynskyi wrote:
On Tue, Feb 16, 2016 at 5:44 AM, Nazar Mokrynskyi wrote:
> I have 2 SSD with BTRFS filesystem (RAID) on them and several subvolumes.
> Each 15 minutes I'm creating read-only snapshot of subvolumes /root, /home
> and /web inside /backup.
> After this I'm searching for last common subvolume on /backup_hdd, sending
> difference between latest common snapshot and simply latest snapshot to
> /backup_hdd.
> On top of all above there is snapshots rotation, so that /backup contains
> much less snapshots than /backup_hdd.
>
> I'm using this setup for last 7 months or so and this is luckily the longest
> period when I had no problems with BTRFS at all.
> However, last 2+ months btrfs receive command loads HDD so much that I can't
> even get list of directories in it.
> This happens even if diff between snapshots is really small.
> HDD contains 2 filesystems - mentioned BTRFS and ext4 for other files, so I
> can't even play mp3 file from ext4 filesystem while btrfs receive is
> running.
> Since I'm running everything each 15 minutes this is a real headache.
>
> My guess is that performance hit might be caused by filesystem fragmentation
> even though there is more than enough empty space. But I'm not sure how to
> properly check this and can't, obviously, run defragmentation on read-only
> subvolumes.
>
> I'll be thankful for anything that might help to identify and resolve this
> issue.
>
> ~> uname -a
> Linux nazar-pc 4.5.0-rc4-haswell #1 SMP Tue Feb 16 02:09:13 CET 2016 x86_64 x86_64 x86_64 GNU/Linux
>
> ~> btrfs --version
> btrfs-progs v4.4
>
> ~> sudo btrfs fi show
> Label: none  uuid: 5170aca4-061a-4c6c-ab00-bd7fc8ae6030
>         Total devices 2 FS bytes used 71.00GiB
>         devid    1 size 111.30GiB used 111.30GiB path /dev/sdb2
>         devid    2 size 111.30GiB used 111.29GiB path /dev/sdc2
>
> Label: 'Backup'  uuid: 40b8240a-a0a2-4034-ae55-f8558c0343a8
>         Total devices 1 FS bytes used 252.54GiB
>         devid    1 size 800.00GiB used 266.08GiB path /dev/sda1
>
> ~> sudo btrfs fi df /
> Data, RAID0: total=214.56GiB, used=69.10GiB
> System, RAID1: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=4.00GiB, used=1.87GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> ~> sudo btrfs fi df /backup_hdd
> Data, single: total=245.01GiB, used=243.61GiB
> System, DUP: total=32.00MiB, used=48.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=10.50GiB, used=8.93GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Relevant mount options:
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030  /            btrfs  compress=lzo,noatime,relatime,ssd,subvol=/root    0 1
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030  /home        btrfs  compress=lzo,noatime,relatime,ssd,subvol=/home    0 1
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030  /backup      btrfs  compress=lzo,noatime,relatime,ssd,subvol=/backup  0 1
> UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030  /web         btrfs  compress=lzo,noatime,relatime,ssd,subvol=/web     0 1
> UUID=40b8240a-a0a2-4034-ae55-f8558c0343a8  /backup_hdd  btrfs  compress=lzo,noatime,relatime,noexec              0 1
As already indicated by Duncan, the amount of snapshots might be just
too much. The fragmentation on the HDD might have become very high. If
there is limited amount of RAM in the system (so limited caching), too
much time is lost in seeks. In addition:

  compress=lzo
this also increases the chance of scattering fr

Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Lionel Bouton
Le 23/02/2016 18:34, Marc MERLIN a écrit :
> On Tue, Feb 23, 2016 at 09:26:35AM -0800, Marc MERLIN wrote:
>> Label: 'dshelf2'  uuid: d4a51178-c1e6-4219-95ab-5c5864695bfd
>>         Total devices 1 FS bytes used 4.25TiB
>>         devid    1 size 7.28TiB used 4.44TiB path /dev/mapper/dshelf2
>>
>> btrfs fi df /mnt/btrfs_pool2/
>> Data, single: total=4.29TiB, used=4.18TiB
>> System, DUP: total=64.00MiB, used=512.00KiB
>> Metadata, DUP: total=77.50GiB, used=73.31GiB
>> GlobalReserve, single: total=512.00MiB, used=31.22MiB
>>
>> Currently, it's btrfs on top of dm-crypt on top of swraid5
> Sorry, I forgot to give the mount options:
> /dev/mapper/dshelf2 on /mnt/dshelf2/backup type btrfs 
> (rw,noatime,compress=lzo,space_cache,skip_balance,subvolid=257,subvol=/backup)

Why don't you use autodefrag? If you have writable snapshots and write
to them heavily it would not be a good idea (depending on how BTRFS
handles this, in most cases you would probably either break the
reflinks or fragment one snapshot to defragment another), but if you only
have read-only snapshots it may work for you (it does for me).

The only BTRFS filesystems where I disabled autodefrag were Ceph OSDs
with heavy in-place updates. Another option would have been to mark
files NoCoW but I didn't want to abandon BTRFS checksumming.

Lionel


Re: [PATCH v2 1/2] btrfs-progs: copy functionality of btrfs-debug-tree to inspect-internal subcommand

2016-02-23 Thread David Sterba
On Mon, Feb 22, 2016 at 03:49:49PM +0100, Alexander Fougner wrote:
> The long-term plan is to merge the features of standalone tools
> into the btrfs binary, reducing the number of shipped binaries.
> 
> Signed-off-by: Alexander Fougner 

Replaced v1, thanks.


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Austin S. Hemmelgarn

On 2016-02-23 12:34, Nazar Mokrynskyi wrote:
> Wow, this is interesting, didn't know it.
>
> I'll probably try noatime instead :)

For what it's worth, due to how it's implemented on almost every UNIX
derived system in existence (including Linux), atimes are essentially
useless.  A majority of the software that has used them over the years
has made the flawed assumption that the atime only gets updated when the
file data is read or modified, when they actually get updated whenever
the file data is read or modified, and when the metadata is modified
(and in some old UNIX systems, the update on file data modification was
simply implemented as a cascade effect from the mtime getting updated).
Mutt is one of the only publicly available programs I know of that
uses them, and it makes this same flawed assumption.  The only software
I know of that uses them right is tmpwatch and tmpreaper, which use them
to clean up /tmp and similar directories when files there haven't been
touched in a long time, and even those have the option to not use atimes.


Now, long rant aside, you may want to also look into the 'lazytime' 
mount option.  It won't reduce fragmentation, but it should improve 
performance overall, the only downsides are that mtimes might be 
incorrect after a crash, and it's only available in newer kernels (I 
think it got added in 4.0 or 4.1, but I'm not certain).



Re: [PATCH] btrfs-progs: fix symlink creation multiple times

2016-02-23 Thread David Sterba
On Thu, Feb 18, 2016 at 09:14:34PM -0500, Hongxu Jia wrote:
> The rule to create symlink in Makefile caused parallel issue:
> $ make -j 40 DESTDIR=/image install BUILD_VERBOSE=1
> ...
>   1 [LN] libbtrfs.so.0
>   2 [LN] libbtrfs.so
>   3 ln -s -f libbtrfs.so.0.1 libbtrfs.so.0
>   4 ln -s -f libbtrfs.so.0.1 libbtrfs.so.0
>   5 ln -s -f libbtrfs.so.0.1 libbtrfs.so
>   6 ln -s -f libbtrfs.so.0.1 libbtrfs.so
> ...
> 
> It failed occasionally:
> ...
> |symlinkat: couldn't stat 'git/libbtrfs.so' even though symlink
> creation succeeded (No such file or directory).
> |ln: failed to create symbolic link 'libbtrfs.so': No such file or directory
> ...
> 
> Signed-off-by: Hongxu Jia 

Applied, thanks.
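
For readers wondering why running the same link step twice in parallel can fail: each `ln -s -f` removes the existing link and then creates a new one, so two runs can interleave. The applied patch changes the Makefile rule and is not reproduced here. The Python sketch below only illustrates one general way to replace a symlink without a window where it is missing, by creating it under a temporary name and renaming it into place. The function name `atomic_symlink` and the temporary-name scheme are assumptions for this example.

```python
import os
import uuid


def atomic_symlink(target, link_path):
    """Create or replace link_path as a symlink to target via rename.

    The link is first created under a unique temporary name in the same
    directory, then renamed over the final name, so concurrent callers
    never observe a missing link. (Illustrative helper, not from the patch.)
    """
    tmp_path = f"{link_path}.tmp.{os.getpid()}.{uuid.uuid4().hex}"
    os.symlink(target, tmp_path)
    try:
        # rename(2) replaces the destination link itself in one step.
        os.replace(tmp_path, link_path)
    except OSError:
        os.unlink(tmp_path)
        raise
```

Calling it twice with the same arguments leaves a single `libbtrfs.so`-style link behind and no temporary files.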


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Marc MERLIN
On Tue, Feb 23, 2016 at 07:01:52PM +0100, Lionel Bouton wrote:
> Why don't you use autodefrag ? If you have writable snapshots and do
> write to them heavily it would not be a good idea (depending on how
> BTRFS handles this in most cases you would probably either break the
> reflinks or fragment a snapshot to defragment another) but if you only
> have read-only snapshots it may work for you (it does for me).
 
It's not a stupid question, I had issues with autodefrag in the past,
and turned it off, but it's been a good 2 years, so maybe it works well
enough now.

> The only BTRFS filesystems where I disabled autodefrag were Ceph OSDs
> with heavy in-place updates. Another option would have been to mark
> files NoCoW but I didn't want to abandon BTRFS checksumming.

Right. I don't have to worry about COW for virtualbox images there, and
the snapshots are read only (well, my script makes read-write snapshots
too, but I almost never use them. Hopefully their presence isn't a
problem, right?)

Thanks for the suggestion.
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Major HDD performance degradation on btrfs receive

2016-02-23 Thread Lionel Bouton
Le 23/02/2016 19:30, Marc MERLIN a écrit :
> On Tue, Feb 23, 2016 at 07:01:52PM +0100, Lionel Bouton wrote:
>> Why don't you use autodefrag ? If you have writable snapshots and do
>> write to them heavily it would not be a good idea (depending on how
>> BTRFS handles this in most cases you would probably either break the
>> reflinks or fragment a snapshot to defragment another) but if you only
>> have read-only snapshots it may work for you (it does for me).
>  
> It's not a stupid question, I had issues with autodefrag in the past,
> and turned it off, but it's been a good 2 years, so maybe it works well
> enough now.
>
>> The only BTRFS filesystems where I disabled autodefrag were Ceph OSDs
>> with heavy in-place updates. Another option would have been to mark
>> files NoCoW but I didn't want to abandon BTRFS checksumming.
> Right. I don't have to worry about COW for virtualbox images there, and
> the snapshots are read only (well, my script makes read-write snapshots
> too, but I almost never use them. Hopefully their presence isn't a
> problem, right?)

I believe autodefrag is only triggering defragmentation on access (write
access only according to the wiki) and uses a queue of limited length
for defragmentation tasks to perform. So the snapshots by themselves
won't cause problems. Even if you access files the defragmentation
should be focused mainly on the versions of the files you access the
most. The real problems probably happen when you access the same file
from several snapshots with lots of internal modifications between the
versions in these snapshots: either autodefrag will break the reflinks
between them or it will attempt to optimize the 2 file versions at
roughly the same time which won't give any benefit but will waste I/O.

Lionel


Documentation for BTRFS error (device dev): bdev /dev/xx errs: wr 22, rd 0, flush 0, corrupt 0, gen 0

2016-02-23 Thread Marc MERLIN
I have a freshly created md raid5 array, with drives that I specifically
scanned one by one block by block, and for good measure, I also scanned
the entire software raid with a check command which took 3 days to run.

Everything passed.

Then, I made a bcache of that device, with an ssd that seems to work fine
otherwise (brand new), and dmcrypted the result

md raid5 - bcache - dmcrypt - btrfs
     ssd /

Now, I'm copying data over with btrfs send, and I'm seeing these slowly
show up and the write counter go up one by one.
BTRFS error (device dm-7): bdev /dev/mapper/oldds1 errs: wr 17, rd 0, flush 0, corrupt 0, gen 0

Where is the documentation for those counters?
Is the write error fatal, or a recovered error?
Should I consider that my filesystem is corrupted as soon as any of
those counters go up?
(I couldn't find an exact meaning of each of them)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Documentation for BTRFS error (device dev): bdev /dev/xx errs: wr 22, rd 0, flush 0, corrupt 0, gen 0

2016-02-23 Thread Duncan
Marc MERLIN posted on Tue, 23 Feb 2016 13:59:11 -0800 as excerpted:

> I have a freshly created md5 array, with drives that I specifically
> scanned one by one block by block, and for good measure, I also scanned
> the entire software raid with a check command which took 3 days to run.
> 
> Everything passed.
> 
> Then, I made a bcache of that device, an ssd that seems to work fine
> otherwise (brand new), and dmcrypted the result
> 
> md5 - bcache - dmcrypt - btrfs
> ssd /
> 
> Now, I'm copying data over with btrfs send, and I'm seeing these slowly
> show up and the write counter go up one by one.
> BTRFS error (device dm-7): bdev /dev/mapper/oldds1 errs: wr 17, rd 0,
> flush 0, corrupt 0, gen 0
> 
> Where is the documentation for those counters?
> Is the write error fatal, or a recovered error?
> Should I consider that my filesystem is corrupted as soon as any of
> those counters go up?
> (I couldn't find an exact meaning of each of them)

I believe all formal documentation of what the error counters actually 
mean is developer-level -- "Trust the Source, Luke."

Unless something has recently been added to the wiki documenting them, 
admin/user level documentation is only the simple mention in the
btrfs-device manpage under stats, and what can be gathered, often by 
reading between the lines or from simply observing real behavior and the 
kernel log when errors increment, from the simple error counter names and 
comments here on this list.

Yet another point supporting the "btrfs is still stabilizing, not yet 
fully stable" position, I suppose, as it could definitely be argued that 
those counters and their visibility, including display in the kernel log 
at mount time, are definitely intended to be consumed at the admin-user 
level, and that it follows that they should be documented at the admin-
user level before the filesystem can properly be defined as fully stable.


Meanwhile, not saying my own admin-user viewpoint is gospel, by any 
stretch, but with the intent of hopefully helping make sense of things...

From my own experience of some months with a failing ssd (as part of a 
raid1 pair with an ssd that was working fine, so I could and did 
regularly scrub the errors and took advantage of the checksummed raid1 
pairing to let it go much further than I would have in other 
circumstances, simply to observe how things worked as it degraded)...

Write error counter increments should be accompanied by kernel log events 
telling you more -- what level of the device stack is returning the 
errors that propagate up to the filesystem level, for instance.  Expected 
would be either bus level timeouts and resets, or storage device errors.  
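
When watching for these increments in the kernel log, the short counter names can be pulled out mechanically. Here is a minimal Python sketch that parses the message format quoted in this thread. The mapping to the longer names is to those printed by `btrfs device stats`; the function and variable names are made up for the example.

```python
import re

# Matches the per-device error summary line as quoted in this thread.
_ERRS_RE = re.compile(
    r"bdev (?P<device>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

# Short log names -> names shown by `btrfs device stats`.
STAT_NAMES = {
    "wr": "write_io_errs",
    "rd": "read_io_errs",
    "flush": "flush_io_errs",
    "corrupt": "corruption_errs",
    "gen": "generation_errs",
}


def parse_btrfs_errs(line):
    """Return (device, {stat_name: count}) or None if the line doesn't match."""
    m = _ERRS_RE.search(line)
    if m is None:
        return None
    counters = {STAT_NAMES[key]: int(m.group(key)) for key in STAT_NAMES}
    return m.group("device"), counters


sample = ("BTRFS error (device dm-7): bdev /dev/mapper/oldds1 errs: "
          "wr 17, rd 0, flush 0, corrupt 0, gen 0")
print(parse_btrfs_errs(sample))
```

Comparing the parsed counters between successive log lines shows which counter moved, which can then be matched against lower-layer messages with nearby timestamps.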

If it's storage device errors, SMART data should show increasing raw 
value reallocated sectors or the like (smartctl -A).  If it's bus errors, 
it could be bad cabling (bad connections or bad shielding, or using 
SATA-150 certified cables for SATA-600 or some such), or, as I saw on an 
old and failing mobo (when I pulled it there were bulging and some 
exploded capacitors) a few years ago, failing filter-capacitors on the 
mobo signalling paths.  Bad power, including the possibility of an 
overloaded UPS that hit one guy I know, is notorious for both this sort 
of issue and memory problems, as well.

Of course bus timeout errors can also be due to lower timeouts on the bus 
(typically 30-second) than on the device (often 2-minute retry time, on 
consumer-level devices), but there's others here with far more knowledge 
in that area, including what to do to try to fix it, than I have, and the 
various options to fix it have been posted multiple times by now, and 
likely will be posted here again.
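
As a small illustration of the timeout mismatch described above, the Python sketch below reads the kernel-side command timeout for a disk and compares it with an assumed worst-case drive retry time. The sysfs path /sys/block/<dev>/device/timeout is what I'd expect for SCSI/SATA disks, the device name `sda` is only a placeholder, and the 120-second figure is just the "often 2-minute" number from the text, not a measured value.

```python
from pathlib import Path


def kernel_command_timeout(device):
    """Return the kernel command timeout in seconds, or None if unavailable.

    Reads /sys/block/<device>/device/timeout (assumed location).
    """
    path = Path("/sys/block") / device / "device" / "timeout"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def timeout_too_short(kernel_timeout_s, drive_retry_s):
    """True if the kernel gives up before the drive finishes retrying."""
    return kernel_timeout_s < drive_retry_s


timeout = kernel_command_timeout("sda")  # placeholder device name
if timeout is not None and timeout_too_short(timeout, 120):
    print(f"kernel timeout {timeout}s is shorter than the assumed 120s drive retry")
```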

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Documentation for BTRFS error (device dev): bdev /dev/xx errs: wr 22, rd 0, flush 0, corrupt 0, gen 0

2016-02-23 Thread Duncan
Duncan posted on Tue, 23 Feb 2016 23:17:06 + as excerpted:

> Marc MERLIN posted on Tue, 23 Feb 2016 13:59:11 -0800 as excerpted:
> 
>> I have a freshly created md5 array, with drives that I specifically
>> scanned one by one block by block, and for good measure, I also scanned
>> the entire software raid with a check command which took 3 days to run.
>> 
>> Everything passed.
>> 
>> Then, I made a bcache of that device, an ssd that seems to work fine
>> otherwise (brand new), and dmcrypted the result
>> 
>> md5 - bcache - dmcrypt - btrfs
>> ssd /
>> 
>> Now, I'm copying data over with btrfs send, and I'm seeing these slowly
>> show up and the write counter go up one by one.
>> BTRFS error (device dm-7): bdev /dev/mapper/oldds1 errs: wr 17, rd 0,
>> flush 0, corrupt 0, gen 0
>> 
>> Where is the documentation for those counters?
>> Is the write error fatal, or a recovered error?
>> Should I consider that my filesystem is corrupted as soon as any of
>> those counters go up?
>> (I couldn't find an exact meaning of each of them)
> 
> I believe all formal documentation of what the error counters actually
> mean is developer-level -- "Trust the Source, Luke."

Forgot to mention, tho you're probably already considering it: if this is 
the same raid5-backed btrfs you were complaining about being slow in the 
other thread and were considering redoing with bcache to an ssd added, as 
seems very likely, and if it /is/ actually storage device or bus errors, 
that could be one reason the previous one was getting so slow...  Maybe it 
wasn't btrfs after all.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: [PATCH] Btrfs: fix listxattrs not listing all xattrs packed in the same item

2016-02-23 Thread Satoru Takeuchi

On 2016/02/23 2:52, fdman...@kernel.org wrote:

From: Filipe Manana 

In the listxattrs handler, we were not listing all the xattrs that are
packed in the same btree item, which happens when multiple xattrs have
a name that when crc32c hashed produce the same checksum value.

Fix this by processing them all.

The following test case for xfstests reproduces the issue:

   seq=`basename $0`
   seqres=$RESULT_DIR/$seq
   echo "QA output created by $seq"
   tmp=/tmp/$$
   status=1 # failure is the default!
   trap "_cleanup; exit \$status" 0 1 2 3 15

   _cleanup()
   {
       cd /
       rm -f $tmp.*
   }

   # get standard environment, filters and checks
   . ./common/rc
   . ./common/filter
   . ./common/attr

   # real QA test starts here
   _supported_fs generic
   _supported_os Linux
   _require_scratch
   _require_attrs

   rm -f $seqres.full

   _scratch_mkfs >>$seqres.full 2>&1
   _scratch_mount

   # Create our test file with a few xattrs. The first 3 xattrs have a name
   # that when given as input to a crc32c function result in the same checksum.
   # This made btrfs list only one of the xattrs through listxattrs system call
   # (because it packs xattrs with the same name checksum into the same btree
   # item).
   touch $SCRATCH_MNT/testfile
   $SETFATTR_PROG -n user.foobar -v 123 $SCRATCH_MNT/testfile
   $SETFATTR_PROG -n user.WvG1c1Td -v qwerty $SCRATCH_MNT/testfile
   $SETFATTR_PROG -n user.J3__T_Km3dVsW_ -v hello $SCRATCH_MNT/testfile
   $SETFATTR_PROG -n user.something -v pizza $SCRATCH_MNT/testfile
   $SETFATTR_PROG -n user.ping -v pong $SCRATCH_MNT/testfile

   # Now call getfattr with --dump, which calls the listxattrs system call.
   # It should list all the xattrs we have set before.
   $GETFATTR_PROG --absolute-names --dump $SCRATCH_MNT/testfile | _filter_scratch

   status=0
   exit


Tested-by: Satoru Takeuchi 

* chris/integration-4.6(HEAD is 790dd8b)

===
# ./check generic/337
FSTYP -- btrfs
PLATFORM  -- Linux/x86_64 fedora23 4.2.7-300.fc23.x86_64
MKFS_OPTIONS  -- /dev/vdc
MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/vdc /scratch_mnt

generic/337  - output mismatch (see /root/xfstests/results//generic/337.out.bad)
--- tests/generic/337.out   2016-02-24 07:26:33.0 +0900
+++ /root/xfstests/results//generic/337.out.bad 2016-02-24 07:43:02.47100 +0900
@@ -1,7 +1,5 @@
 QA output created by 337
 # file: SCRATCH_MNT/testfile
-user.J3__T_Km3dVsW_="hello"
-user.WvG1c1Td="qwerty"
 user.foobar="123"
 user.ping="pong"
 user.something="pizza"
...
(Run 'diff -u tests/generic/337.out /root/xfstests/results//generic/337.out.bad'  to see the entire diff)
Ran: generic/337
Failures: generic/337
Failed 1 of 1 tests


* the above source + your patch


# ./check generic/337
FSTYP -- btrfs
PLATFORM  -- Linux/x86_64 fedora23 4.5.0-rc3-ktest+
MKFS_OPTIONS  -- /dev/vdc
MOUNT_OPTIONS -- /dev/vdc /scratch_mnt

generic/337  0s
Ran: generic/337
Passed all 1 tests


Thanks,
Satoru



Signed-off-by: Filipe Manana 
---
  fs/btrfs/xattr.c | 62 +++-
  1 file changed, 39 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
index f2a20d5..caf643d 100644
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -260,16 +260,12 @@ out:

  ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
  {
-   struct btrfs_key key, found_key;
+   struct btrfs_key key;
struct inode *inode = d_inode(dentry);
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_path *path;
-   struct extent_buffer *leaf;
-   struct btrfs_dir_item *di;
-   int ret = 0, slot;
+   int ret = 0;
size_t total_size = 0, size_left = size;
-   unsigned long name_ptr;
-   size_t name_len;

/*
 * ok we want all objects associated with this id.
@@ -291,6 +287,13 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
goto err;

while (1) {
+   struct extent_buffer *leaf;
+   int slot;
+   struct btrfs_dir_item *di;
+   struct btrfs_key found_key;
+   u32 item_size;
+   u32 cur;
+
leaf = path->nodes[0];
slot = path->slots[0];

@@ -319,28 +322,41 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
goto next;

di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item);
-   if (verify_dir_item(root, leaf, di))
-   goto next;
-
-   name_len = btrfs_dir_name_len(leaf, di);
-   total_size += name_

Re: Documentation for BTRFS error (device dev): bdev /dev/xx errs: wr 22, rd 0, flush 0, corrupt 0, gen 0

2016-02-23 Thread Marc MERLIN
On Tue, Feb 23, 2016 at 11:22:47PM +, Duncan wrote:
> Forgot to mention, tho you're probably already considering it, if this is 
> the same raid5-backed btrfs you were complaining about being slow in the 
> other thread, 

No, that's another one :)
This one was remade from scratch after the filesystem on it got
corrupted.
5 x 4TB swraid5  64GB SSD
  bcache
  dmcrypt
  btrfs

Smart is 100% for all 5 drives, and they passed an extensive test before
I built the new raid and filesystem on them.

> and considering redoing with bcache to an ssd added, as 
> seems very likely, if it /is/ actually storage device or bus errors, that 
> could be one reason the previous one was getting so slow...  Maybe it 
> wasn't btrfs after all.

Good thinking, although in this case, it's a different filesystem.

This filesystem is however on a Sata port multiplier with a 2 meter
cable to an external disk array. 
As a result, bandwidth to it is going to be slow-ish, and the long cable
could be adding I/O errors.

On Tue, Feb 23, 2016 at 11:17:06PM +, Duncan wrote:
> I believe all formal documentation of what the error counters actually 
> mean is developer-level -- "Trust the Source, Luke."
 
Haha, I know that one :)
Although to be fair I was more offering for someone to tell me what
they're supposed to mean, and me updating the wiki to capture that info.

> Yet another point supporting the "btrfs is still stabilizing, not yet 
> fully stable" position, I suppose, as it could definitely be argued that 
> those counters and their visibility, including display in the kernel log 
> at mount time, are definitely intended to be consumed at the admin-user 
> level, and that it follows that they should be documented at the admin-
> user level before the filesystem can properly be defined as fully stable.
 
Yes :) and I'm happy to help make this reality in the wiki at least.
 
> Write error counter increments should be accompanied by kernel log events 
> telling you more -- what level of the device stack is returning the 
> errors that propagate up to the filesystem level, for instance.  Expected 
> would be either bus level timeouts and resets, or storage device errors.  
 
I agree, and I get 0 such errors here, which is why it's weird.

> If it's storage device errors, SMART data should show increasing raw 
> value relocated sectors or the like (smartctl -A).  If it's bus errors, 

Correct, and they are all at 0.

> it could be bad cabling (bad connections or bad shielding, or using 
> SATA-150 certified cables for SATA-600 or some such), or, as I saw on an 

Cabling is indeed a likely culprit, I'm just surprised that if it's the
case, the sata layer is showing me nothing (I'm doing tail -f
/var/log/kern.log and usually I'd see sata or PMP errors there)

> old and failing mobo (when I pulled it there were bulging and some 
> exploded capacitors) a few years ago, failing filter-capacitors on the 
> mobo signalling paths.  Bad power, including the possibility of an 
> overloaded UPS that hit one guy I know, is notorious for both this sort 
> of issue and memory problems, as well.

All true, but wouldn't all of these show up as actual disk errors by the
underlying driver involved too?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Documentation for BTRFS error (device dev): bdev /dev/xx errs: wr 22, rd 0, flush 0, corrupt 0, gen 0

2016-02-23 Thread Duncan
Marc MERLIN posted on Tue, 23 Feb 2016 16:19:44 -0800 as excerpted:

> Cabling is indeed a likely culprit, I'm just surprised that if it's the
> case, the sata layer is showing me nothing (I'm doing tail -f
> /var/log/kern.log and usually I'd see sata or PMP errors there)

That /is/ surprising.  No explanation, there, tho I don't know enough 
about such errors to know if they /always/ tend to show up in the logs, 
or not, only that mine generally have.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Input/Output errors

2016-02-23 Thread Kenny MacDermid
I'm running btrfs on DM-Crypt Luks running on LVM.

Occasionally I get files that are unreadable for some period of time.
Attempting to read from them results in an 

Input/output error

Sometimes they'll come back on their own, and sometimes a scrub seems to
help, but sometimes I just have to delete them.

Nothing shows up in dmesg when these occur, and I can't predict which
files it will be, or what causes it.

It's currently happening running 4.4.1-2-ARCH, but I've seen the same
thing for many previous kernel versions.

Does anyone have any suggestions?



Re: [bug] unable to handle kernel paging request when running btrfs/011

2016-02-23 Thread Qu Wenruo



David Sterba wrote on 2016/02/23 15:45 +0100:
> On Sun, Feb 21, 2016 at 09:36:44AM +0800, Qu Wenruo wrote:
>> On 02/19/2016 09:41 PM, Anand Jain wrote:
>>>   Saw below warn leading to bug when running btrfs/011, not
>>>   reproducible. Any idea ?
>>
>> Seems like another wq_destroy race.
>>
>> But it's hard to locate which wq is the cause from backtrace only.
>>
>> What's the wq btrfs_stop_all_workers+0xcd is going to free?
>
> You can always try to guess on your locally built sources, provided the
> configs do not enable "too much" debugging. What I get and looks
> reasonable:

Never thought that even on a different system, the binary could be so identical.

This should provide good enough info to investigate further.

Great thanks for the advice.
Qu

> (gdb) l *(btrfs_stop_all_workers+0xcd)
> 0x2bc7d is in btrfs_stop_all_workers (fs/btrfs/disk-io.c:2154).
> warning: Source file is more recent than executable.
> 2149        btrfs_destroy_workqueue(fs_info->delayed_workers);
> 2150        btrfs_destroy_workqueue(fs_info->caching_workers);
> 2151        btrfs_destroy_workqueue(fs_info->readahead_workers);
> 2152        btrfs_destroy_workqueue(fs_info->flush_workers);
> 2153        btrfs_destroy_workqueue(fs_info->qgroup_rescan_workers);
> 2154        btrfs_destroy_workqueue(fs_info->extent_workers);
> 2155    }
> 2156
> 2157    static void free_root_extent_buffers(struct btrfs_root *root)
> 2158    {
>
>> And what's the wq end_workqueue_bio+0x85 is going to add?
>
> (gdb) l *(end_workqueue_bio+0x85)
> 0x2ab55 is in end_workqueue_bio (fs/btrfs/disk-io.c:726).
> 721
> 722         if (bio->bi_rw & REQ_WRITE) {
> 723                 if (end_io_wq->metadata == BTRFS_WQ_ENDIO_METADATA) {
> 724                         wq = fs_info->endio_meta_write_workers;
> 725                         func = btrfs_endio_meta_write_helper;
> 726                 } else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_FREE_SPACE) {
> 727                         wq = fs_info->endio_freespace_worker;
> 728                         func = btrfs_freespace_write_helper;
> 729                 } else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_RAID56) {
> 730                         wq = fs_info->endio_raid56_workers;


Re: Input/Output errors

2016-02-23 Thread Marc MERLIN
On Tue, Feb 23, 2016 at 08:40:46PM -0400, Kenny MacDermid wrote:
> I'm running btrfs on DM-Crypt Luks running on LVM.
> 
> Occasionally I get files that are unreadable for some period of time.
> Attempting to read from them results in an 
> 
> Input/output error
> 
> Sometimes they'll come back on their own, and sometimes a scrub seems to
> help, but sometimes I just have to delete them.
> 
> Nothing shows up in dmesg when these occur, and I can't predict which
> files it will be, or what causes it.
> 
> It's currently happening running 4.4.1-2-ARCH, but I've seen the same
> thing for many previous kernel versions.
> 
> Does anyone have any suggestions?

That's weird to say the least, you should at least get *something* in
dmesg.
And you are getting other error messages and btrfs kernel messages in
your logs?

When whatever app you have that's trying to read them fails, I assume
they also fail with cat or less?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Input/Output errors

2016-02-23 Thread Kenny MacDermid
On Tue, Feb 23, 2016 at 05:56:58PM -0800, Marc MERLIN wrote:
> On Tue, Feb 23, 2016 at 08:40:46PM -0400, Kenny MacDermid wrote:
> > I'm running btrfs on DM-Crypt Luks running on LVM.
> > 
> > Occasionally I get files that are unreadable for some period of time.
> > Attempting to read from them results in an 
> > 
> > Input/output error
> > 
> > Sometimes they'll come back on their own, and sometimes a scrub seems to
> > help, but sometimes I just have to delete them.
> > 
> > Nothing shows up in dmesg when these occur, and I can't predict which
> > files it will be, or what causes it.
> > 
> > It's currently happening running 4.4.1-2-ARCH, but I've seen the same
> > thing for many previous kernel versions.
> > 
> > Does anyone have any suggestions?
> 
> That's weird to say the least, you should at least get *something* in
> dmesg.
> And you are getting other error messages and btrfs kernel messages in
> your logs?
> 
> When whatever app you have that's trying to read them fails, I assume
> they also fail with cat or less?

I am getting other, normal btrfs messages. I'll include them at the end.
When it happens I get nothing at all in dmesg/logs.

And yes, cat will fail. I can move the file to another name though,
which I often do to get it out of the way.

They're mounted with:

rw,noatime,compress=lzo,ssd,discard,space_cache,autodefrag,inode_cache

issue_discards=1 is in lvm.conf, and discard in /etc/crypttab. (I'm now
reading that I probably shouldn't have it in fstab though and just run
fstrim.)

I don't know if this is related yet at all, but it /seems/ more likely
to happen after I delete a bunch of data. That could be a red herring
though.

When the file becomes readable again it's perfectly fine. Scrub never
finds any errors.

$ dmesg | grep -i btrfs
[   11.837137] Btrfs loaded
[   11.837403] BTRFS: device label root devid 1 transid 508963 /dev/dm-3
[   11.856203] BTRFS info (device dm-3): disk space caching is enabled
[   11.879366] BTRFS: detected SSD devices, enabling SSD mode
[   12.160267] BTRFS info (device dm-3): turning on discard
[   12.160272] BTRFS info (device dm-3): enabling auto defrag
[   12.160275] BTRFS info (device dm-3): enabling inode map caching
[   12.160277] BTRFS info (device dm-3): disk space caching is enabled
[   14.979093] BTRFS: device label home devid 1 transid 705779 /dev/dm-5
[   15.013978] BTRFS info (device dm-5): use ssd allocation scheme
[   15.013983] BTRFS info (device dm-5): turning on discard
[   15.013987] BTRFS info (device dm-5): enabling auto defrag
[   15.013989] BTRFS info (device dm-5): enabling inode map caching
[   15.013991] BTRFS info (device dm-5): disk space caching is enabled
[   15.100779] BTRFS error (device dm-5): could not find root 8
[   15.102889] BTRFS error (device dm-5): could not find root 8
[   15.105833] BTRFS error (device dm-3): could not find root 8
[   15.105838] BTRFS error (device dm-3): could not find root 8



Re: Input/Output errors

2016-02-23 Thread Chris Murphy
On Tue, Feb 23, 2016 at 8:02 PM, Kenny MacDermid wrote:

>
> rw,noatime,compress=lzo,ssd,discard,space_cache,autodefrag,inode_cache

It sounds like an ssd trim bug. I'd check the firmware for updates. If
it's up to date, I'd drop discard mount option first and try to
reproduce. Or just use the default mount options and try to reproduce,
then add them back one at a time until you discover the culprit.

Also, how many files/directories are there? inode_cache isn't
recommended for most use cases. And space_cache is the default so it
doesn't need to be listed.
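
The "add them back one at a time" approach can be written down as a short plan. This Python sketch just generates the successive mount option strings to try. The function name and the choice of `rw` as the starting point are assumptions for illustration, and it does not mount anything itself.

```python
def option_trials(extra_options, base=("rw",)):
    """Yield option strings: the base set first, then one more option each time."""
    current = list(base)
    yield ",".join(current)
    for opt in extra_options:
        current.append(opt)
        yield ",".join(current)


# Options from the report, minus space_cache (already the default).
reported = ["noatime", "compress=lzo", "ssd", "discard", "autodefrag", "inode_cache"]
for trial in option_trials(reported):
    print(trial)
```

The first trial at which the I/O errors reappear points at the option added in that step, though with an intermittent problem each step needs to run long enough to be convincing.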



-- 
Chris Murphy


[PATCH v2 0/5] Btrfs in-band de-duplication tests cases

2016-02-23 Thread Qu Wenruo
Since we are pushing btrfs in-band de-duplication for v4.6, it's better to
add test cases for this new feature.

Apart from the first basic function test, the rest are all regression tests
for bugs which we found during development.
We also found some bugs with the generic tests, but we need an xfstests
option that allows us to enable dedup for any test case.
(We did it by hacking _scratch_mount and _test_mount to enable dedup for
any test case.)

The sequence numbers start from 200 to avoid any possible conflicts.
The 'new' script returns some hole numbers, which is not well suited to
such a set of related test cases.
Hopefully it's not too hard for the maintainer to modify the sequence numbers.

Qu Wenruo (5):
  fstests: rename _require_btrfs to _require_btrfs_subcommand
  fstests: btrfs: Add basic test for btrfs in-band de-duplication
  fstests: btrfs: Add testcase for btrfs dedup enable disable race test
  fstests: btrfs: Add per inode dedup flag test
  fstests: btrfs: Test inband dedup with balance.

 common/defrag   |   8 
 common/rc   |   2 +-
 tests/btrfs/004 |   2 +-
 tests/btrfs/048 |   1 +
 tests/btrfs/059 |   1 +
 tests/btrfs/200 | 125 
 tests/btrfs/200.out |  19 
 tests/btrfs/201 | 100 +
 tests/btrfs/201.out |   1 +
 tests/btrfs/202 | 116 
 tests/btrfs/202.out |  15 +++
 tests/btrfs/203 |  91 ++
 tests/btrfs/203.out |   3 ++
 tests/btrfs/group   |   4 ++
 14 files changed, 486 insertions(+), 2 deletions(-)
 create mode 100755 tests/btrfs/200
 create mode 100644 tests/btrfs/200.out
 create mode 100755 tests/btrfs/201
 create mode 100644 tests/btrfs/201.out
 create mode 100755 tests/btrfs/202
 create mode 100644 tests/btrfs/202.out
 create mode 100755 tests/btrfs/203
 create mode 100644 tests/btrfs/203.out

-- 
2.7.1





[PATCH 4/5] fstests: btrfs: Add per inode dedup flag test

2016-02-23 Thread Qu Wenruo
This test will check per inode dedup flag.

Signed-off-by: Qu Wenruo 
---
 tests/btrfs/202 | 116 
 tests/btrfs/202.out |  15 +++
 tests/btrfs/group   |   1 +
 3 files changed, 132 insertions(+)
 create mode 100755 tests/btrfs/202
 create mode 100644 tests/btrfs/202.out

diff --git a/tests/btrfs/202 b/tests/btrfs/202
new file mode 100755
index 000..3e5d470
--- /dev/null
+++ b/tests/btrfs/202
@@ -0,0 +1,116 @@
+#! /bin/bash
+# FS QA Test 202
+#
+# Btrfs per inode dedup flag test
+#
+#---
+# Copyright (c) 2016 Fujitsu.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/defrag
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_btrfs_subcommand dedup
+_require_btrfs_subcommand property
+_require_btrfs_fs_feature dedup
+_require_btrfs_mkfs_feature dedup
+
+# File size is twice the maximum file extent size of btrfs,
+# so even if it falls back to non-dedup, it will have at least 2 extents
+file_size=$(( 256 * 1024 * 1024 ))
+dedup_bs=$(( 64 * 1024 ))
+
+_scratch_mkfs "-O dedup" >> $seqres.full 2>&1
+_scratch_mount
+
+# Print "not de-duplicated" if no extent is shared, "de-duplicated" otherwise
+test_file_deduped () {
+   file=$1
+
+   nr_uniq_extents=$(_uniq_extent_count $file)
+   nr_total_extents=$(_extent_count $file)
+
+   if [ $nr_uniq_extents -eq $nr_total_extents ]; then
+   echo "not de-duplicated"
+   else
+   echo "de-duplicated"
+   fi
+}
+
+dedup_write_file () {
+   file=$1
+   size=$2
+
+   $XFS_IO_PROG -f -c "pwrite -b $dedup_bs 0 $size" $file | _filter_xfs_io
+}
+
+print_result () {
+   file=$1
+
+   echo "$(basename $file): $(test_file_deduped $file)"
+}
+_run_btrfs_util_prog dedup enable -b $dedup_bs $SCRATCH_MNT
+touch $SCRATCH_MNT/dedup_file
+touch $SCRATCH_MNT/no_dedup_file
+mkdir $SCRATCH_MNT/dedup_dir
+mkdir $SCRATCH_MNT/no_dedup_dir
+
+_run_btrfs_util_prog property set $SCRATCH_MNT/no_dedup_file dedup disable
+_run_btrfs_util_prog property set $SCRATCH_MNT/no_dedup_dir dedup disable
+
+dedup_write_file $SCRATCH_MNT/tmp $dedup_bs
+# sync to ensure hash is added to dedup tree
+sync
+
+dedup_write_file $SCRATCH_MNT/dedup_file $file_size
+dedup_write_file $SCRATCH_MNT/no_dedup_file $file_size
+dedup_write_file $SCRATCH_MNT/dedup_dir/dedup_dir_default_file $file_size
+dedup_write_file $SCRATCH_MNT/no_dedup_dir/no_dedup_dir_default_file $file_size
+
+print_result $SCRATCH_MNT/dedup_file
+print_result $SCRATCH_MNT/no_dedup_file
+print_result $SCRATCH_MNT/dedup_dir/dedup_dir_default_file
+print_result $SCRATCH_MNT/no_dedup_dir/no_dedup_dir_default_file
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/202.out b/tests/btrfs/202.out
new file mode 100644
index 000..ced9e88
--- /dev/null
+++ b/tests/btrfs/202.out
@@ -0,0 +1,15 @@
+QA output created by 202
+wrote 65536/65536 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 268435456/268435456 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 268435456/268435456 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 268435456/268435456 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 268435456/268435456 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+dedup_file: de-duplicated
+no_dedup_file: not de-duplicated
+dedup_dir_default_file: de-duplicated
+no_dedup_dir_default_file: not de-duplicated
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 76ebea7..0c03cf1 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -121,3 +121,4 @@
 118 auto quick snapshot metadata
 200 auto dedup
 201 auto dedup
+202 auto dedup
-- 
2.7.1




[PATCH 2/5] fstests: btrfs: Add basic test for btrfs in-band de-duplication

2016-02-23 Thread Qu Wenruo
Add basic test for btrfs in-band de-duplication, including:
1) Enable
2) Re-enable
3) On disk extents are referring to same bytenr
4) Disable

Signed-off-by: Qu Wenruo 
---
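Note for readers: the 5% tolerance used by do_dedup_test works out as
follows for the 64K case. This is a sketch only. It assumes every
dedup_bs-sized block ends up as its own file extent, which the patch does
not state, and too_many_misses is a made-up name for the comparison.

```shell
#!/bin/bash
# Sizes taken from the test: a 256M file written with a 64K dedup block size.
file_size=$(( 256 * 1024 * 1024 ))
dedup_bs=$(( 64 * 1024 ))

# Assumption: one file extent per dedup_bs-sized block.
nr_total=$(( file_size / dedup_bs ))

# Integer arithmetic, as in the test: 5% of the total, rounded down.
limit=$(( nr_total * 5 / 100 ))

# Hypothetical wrapper around the same comparison the test performs:
# succeeds when the number of unique extents reaches the limit.
too_many_misses() {
	local nr_uniq=$1 total=$2
	[ "$nr_uniq" -ge $(( total * 5 / 100 )) ]
}

echo "total extents: $nr_total, limit: $limit"
if too_many_misses 1 "$nr_total"; then
	echo "Too high dedup failure rate"
fi
```

With these numbers a fully de-duplicated file has 1 unique extent out of
4096, far below the limit of 204, so nothing is printed by the check.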
 common/defrag   |   8 
 tests/btrfs/200 | 125 
 tests/btrfs/200.out |  19 
 tests/btrfs/group   |   1 +
 4 files changed, 153 insertions(+)
 create mode 100755 tests/btrfs/200
 create mode 100644 tests/btrfs/200.out

diff --git a/common/defrag b/common/defrag
index 942593e..34cc822 100644
--- a/common/defrag
+++ b/common/defrag
@@ -47,6 +47,14 @@ _extent_count()
$XFS_IO_PROG -c "fiemap" $1 | tail -n +2 | grep -v hole | wc -l| $AWK_PROG '{print $1}'
 }
 
+_uniq_extent_count()
+{
+   file=$1
+   $XFS_IO_PROG -c "fiemap" $file >> $seqres.full 2>&1
+   $XFS_IO_PROG -c "fiemap" $file | tail -n +2 | grep -v hole |\
+   $AWK_PROG '{print $3}' | sort | uniq | wc -l
+}
+
 _check_extent_count()
 {
min=$1
diff --git a/tests/btrfs/200 b/tests/btrfs/200
new file mode 100755
index 000..f2ff542
--- /dev/null
+++ b/tests/btrfs/200
@@ -0,0 +1,125 @@
+#! /bin/bash
+# FS QA Test 200
+#
+# Basic btrfs inband dedup test, including:
+# 1) Enable
+# 2) Uniq file extent number
+# 3) Re-enable
+# 4) Disable
+#
+#---
+# Copyright (c) 2016 Fujitsu.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/defrag
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_btrfs_subcommand dedup
+_require_btrfs_fs_feature dedup
+_require_btrfs_mkfs_feature dedup
+
+# File size is twice the maximum file extent size of btrfs,
+# so even if it falls back to non-dedup, it will have at least 2 extents
+file_size=$(( 256 * 1024 * 1024 ))
+
+_scratch_mkfs "-O dedup" >> $seqres.full 2>&1
+_scratch_mount
+
+do_dedup_test()
+{
+   backend=$1
+   dedup_bs=$2
+
+   _run_btrfs_util_prog dedup enable -s $backend -b $dedup_bs $SCRATCH_MNT
+   $XFS_IO_PROG -f -c "pwrite -b $dedup_bs 0 $dedup_bs" \
+   $SCRATCH_MNT/initial_block | _filter_xfs_io
+
+   # sync to ensure dedup hash is added into dedup pool
+   sync
+   $XFS_IO_PROG -f -c "pwrite -b $dedup_bs 0 $file_size" \
+   $SCRATCH_MNT/real_file | _filter_xfs_io
+   # sync again to ensure data are all written to disk and
+   # we can get stable extent map
+   sync
+
+   # Test if real_file is de-duplicated
+   nr_uniq_extents=$(_uniq_extent_count $SCRATCH_MNT/real_file)
+   nr_total_extents=$(_extent_count $SCRATCH_MNT/real_file)
+
+   echo "uniq/total: $nr_uniq_extents/$nr_total_extents" >> $seqres.full
+   # Allow a small amount of dedup misses, as the commit interval or
+   # memory pressure may break up a dedup_bs block and cause
+   # small extents which won't go through the dedup routine
+   if [ $nr_uniq_extents -ge $(( $nr_total_extents * 5 / 100 )) ]; then
+   echo "Too high dedup failure rate"
+   fi
+
+   # Also check the md5sum to ensure data is not corrupted
+   md5=$(_md5_checksum $SCRATCH_MNT/real_file)
+   if [ $md5 != $init_md5 ]; then
+   echo "File after in-band de-duplication is corrupted"
+   fi
+}
+
+# Create the initial file and calculate its checksum without dedup
+$XFS_IO_PROG -f -c "pwrite 0 $file_size" $SCRATCH_MNT/csum_file | \
+   _filter_xfs_io
+init_md5=$(_md5_checksum $SCRATCH_MNT/csum_file)
+echo "md5 of the initial file is $init_md5" >> $seqres.full
+
+# Test inmemory dedup first, use 64K dedup bs to keep compatibility
+# with 64K page size
+do_dedup_test inmemory 64K
+
+# Test ondisk backend, and re-enable function
+do_dedup_test ondisk 64K
+
+# Test 128K(default) dedup bs
+do_dedup_test inmemory 128K
+do_dedup_test ondisk 128K
+
+# Check dedup disable
+_run_btrfs_util_prog dedup disable

[PATCH 1/5] fstests: Add support to check btrfs sysfs features

2016-02-23 Thread Qu Wenruo
Btrfs has its sysfs interface showing what features current kernel/btrfs
module support.

Add _require_btrfs_kernel_feature() to check such interface.

Also rename _require_btrfs() to _require_btrfs_subcommand() to avoid
confusion.

Signed-off-by: Qu Wenruo 
---
 common/rc   | 2 +-
 tests/btrfs/004 | 2 +-
 tests/btrfs/048 | 1 +
 tests/btrfs/059 | 1 +
 4 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/common/rc b/common/rc
index af16c81..ff57862 100644
--- a/common/rc
+++ b/common/rc
@@ -2706,7 +2706,7 @@ _require_deletable_scratch_dev_pool()
 }
 
 # We check for btrfs and (optionally) features of the btrfs command
-_require_btrfs()
+_require_btrfs_subcommand()
 {
cmd=$1
_require_command "$BTRFS_UTIL_PROG" btrfs
diff --git a/tests/btrfs/004 b/tests/btrfs/004
index 905770a..2ce628e 100755
--- a/tests/btrfs/004
+++ b/tests/btrfs/004
@@ -51,7 +51,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_scratch
 _require_no_large_scratch_dev
-_require_btrfs inspect-internal
+_require_btrfs_subcommand inspect-internal
 _require_command "/usr/sbin/filefrag" filefrag
 
 rm -f $seqres.full
diff --git a/tests/btrfs/048 b/tests/btrfs/048
index c2cb4a6..d15346a 100755
--- a/tests/btrfs/048
+++ b/tests/btrfs/048
@@ -48,6 +48,7 @@ _supported_os Linux
 _require_test
 _require_scratch
 _require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 send_files_dir=$TEST_DIR/btrfs-test-$seq
 
diff --git a/tests/btrfs/059 b/tests/btrfs/059
index b9a6ef4..6e7f7ee 100755
--- a/tests/btrfs/059
+++ b/tests/btrfs/059
@@ -51,6 +51,7 @@ _supported_os Linux
 _require_test
 _require_scratch
 _require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 rm -f $seqres.full
 
-- 
2.7.1





[PATCH 3/5] fstests: btrfs: Add testcase for btrfs dedup enable disable race test

2016-02-23 Thread Qu Wenruo
Add a test case to check the btrfs dedup enable/disable race.

Signed-off-by: Qu Wenruo 
---
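Note for readers: the shape of the race test can be sketched without
btrfs at all. In the sketch below, worker stands in for fsstress and
toggler stands in for the dedup enable/disable loop. Both names, the
state file and the timings are made up for illustration. The point is
only that the PIDs saved from $! are what gets waited on and killed.

```shell
#!/bin/bash
# Scratch file standing in for the on/off state being toggled.
state_file=$(mktemp)

# Stand-in for the fsstress run: finite background work.
worker() {
	sleep 1
}

# Stand-in for the enable/disable loop: flips the state until killed.
toggler() {
	while true; do
		echo on > "$state_file"
		sleep 0.1
		echo off > "$state_file"
		sleep 0.1
	done
}

worker &
worker_pid=$!

toggler &
toggler_pid=$!

# Wait for the finite job, then stop the endless one by its saved PID.
wait "$worker_pid"
kill "$toggler_pid"
wait "$toggler_pid" 2>/dev/null || true

rm -f "$state_file"
echo "done"
```

Saving $! right after each background start matters because a later
cleanup can only signal a process whose PID it actually recorded.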
 tests/btrfs/201 | 100 
 tests/btrfs/201.out |   1 +
 tests/btrfs/group   |   1 +
 3 files changed, 102 insertions(+)
 create mode 100755 tests/btrfs/201
 create mode 100644 tests/btrfs/201.out

diff --git a/tests/btrfs/201 b/tests/btrfs/201
new file mode 100755
index 000..4bcad13
--- /dev/null
+++ b/tests/btrfs/201
@@ -0,0 +1,100 @@
+#! /bin/bash
+# FS QA Test 201
+#
+# Basic btrfs inband dedup enable/disable race test
+#
+#---
+# Copyright (c) 2016 Fujitsu.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+   kill $trigger_pid &> /dev/null
+   kill $fsstress_pid &> /dev/null
+   wait
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_btrfs_subcommand dedup
+_require_btrfs_fs_feature dedup
+_require_btrfs_mkfs_feature dedup
+
+# Use 64K dedup size to keep compatibility for 64K page size
+dedup_bs=64K
+
+_scratch_mkfs "-O dedup" >> $seqres.full 2>&1
+_scratch_mount
+
+mkdir -p $SCRATCH_MNT/stressdir
+
+fsstress_work()
+{
+   $FSSTRESS_PROG $(_scale_fsstress_args -p 8 -n 5000) $FSSTRESS_AVOID \
+   -d $SCRATCH_MNT/stressdir > /dev/null 2>&1
+}
+
+trigger_work()
+{
+   while true; do
+   _run_btrfs_util_prog dedup enable -s inmemory \
+   -b $dedup_bs $SCRATCH_MNT
+   sleep 5
+   _run_btrfs_util_prog dedup disable $SCRATCH_MNT
+   sleep 5
+   _run_btrfs_util_prog dedup enable -s ondisk \
+   -b $dedup_bs $SCRATCH_MNT
+   sleep 5
+   _run_btrfs_util_prog dedup disable $SCRATCH_MNT
+   sleep 5
+   done
+}
+
+fsstress_work &
+fsstress_pid=$!
+
+trigger_work &
+trigger_pid=$!
+
+wait $fsstress_pid
+kill $trigger_pid
+wait
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/201.out b/tests/btrfs/201.out
new file mode 100644
index 000..b8969af
--- /dev/null
+++ b/tests/btrfs/201.out
@@ -0,0 +1 @@
+QA output created by 201
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 0b7354b..76ebea7 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -120,3 +120,4 @@
 117 auto quick send clone
 118 auto quick snapshot metadata
 200 auto dedup
+201 auto dedup
-- 
2.7.1





[PATCH 1/5] fstests: rename _require_btrfs to _require_btrfs_subcommand

2016-02-23 Thread Qu Wenruo
Rename _require_btrfs() to _require_btrfs_subcommand() to avoid
confusion, as all other _require_btrfs_* helpers have a clear suffix, like
_require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().

Signed-off-by: Qu Wenruo 
---
 common/rc   | 2 +-
 tests/btrfs/004 | 2 +-
 tests/btrfs/048 | 1 +
 tests/btrfs/059 | 1 +
 4 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/common/rc b/common/rc
index af16c81..ff57862 100644
--- a/common/rc
+++ b/common/rc
@@ -2706,7 +2706,7 @@ _require_deletable_scratch_dev_pool()
 }
 
 # We check for btrfs and (optionally) features of the btrfs command
-_require_btrfs()
+_require_btrfs_subcommand()
 {
cmd=$1
_require_command "$BTRFS_UTIL_PROG" btrfs
diff --git a/tests/btrfs/004 b/tests/btrfs/004
index 905770a..2ce628e 100755
--- a/tests/btrfs/004
+++ b/tests/btrfs/004
@@ -51,7 +51,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_scratch
 _require_no_large_scratch_dev
-_require_btrfs inspect-internal
+_require_btrfs_subcommand inspect-internal
 _require_command "/usr/sbin/filefrag" filefrag
 
 rm -f $seqres.full
diff --git a/tests/btrfs/048 b/tests/btrfs/048
index c2cb4a6..d15346a 100755
--- a/tests/btrfs/048
+++ b/tests/btrfs/048
@@ -48,6 +48,7 @@ _supported_os Linux
 _require_test
 _require_scratch
 _require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 send_files_dir=$TEST_DIR/btrfs-test-$seq
 
diff --git a/tests/btrfs/059 b/tests/btrfs/059
index b9a6ef4..6e7f7ee 100755
--- a/tests/btrfs/059
+++ b/tests/btrfs/059
@@ -51,6 +51,7 @@ _supported_os Linux
 _require_test
 _require_scratch
 _require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 rm -f $seqres.full
 
-- 
2.7.1





[PATCH 5/5] fstests: btrfs: Test inband dedup with balance.

2016-02-23 Thread Qu Wenruo
Btrfs balance relocates data extents, but an extent's hash is removed too
late, at run_delayed_ref() time. This causes the extent ref to be increased
during balance, so that either find_data_references() triggers a WARN_ON()
or run_delayed_refs() fails and causes a transaction abort.

Add such concurrency test for inband dedup and balance.

Signed-off-by: Qu Wenruo 
---
 tests/btrfs/203 | 91 +
 tests/btrfs/203.out |  3 ++
 tests/btrfs/group   |  1 +
 3 files changed, 95 insertions(+)
 create mode 100755 tests/btrfs/203
 create mode 100644 tests/btrfs/203.out

diff --git a/tests/btrfs/203 b/tests/btrfs/203
new file mode 100755
index 000..19dc55c
--- /dev/null
+++ b/tests/btrfs/203
@@ -0,0 +1,91 @@
+#! /bin/bash
+# FS QA Test 203
+#
+# Btrfs inband dedup with balance concurrency test
+#
+#---
+# Copyright (c) 2016 Fujitsu.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   kill $balance_pid &> /dev/null
+   wait
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/reflink
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_cp_reflink
+_require_btrfs_subcommand dedup
+_require_btrfs_fs_feature dedup
+_require_btrfs_mkfs_feature dedup
+
+dedup_bs=$(( 128 * 1024 ))
+file=$SCRATCH_MNT/foo
+nr=2
+
+_scratch_mkfs "-O dedup" >> $seqres.full 2>&1
+_scratch_mount
+
+_run_btrfs_util_prog dedup enable -b $dedup_bs $SCRATCH_MNT
+
+# create the initial file
+$XFS_IO_PROG -f -c "pwrite -b $dedup_bs 0 $dedup_bs" $file | _filter_xfs_io
+
+# make sure hash is added into hash pool
+sync
+
+_btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
+balance_pid=$!
+
+for n in $(seq 1 $nr); do
+   $XFS_IO_PROG -f -c "pwrite -b $dedup_bs 0 $dedup_bs" \
+   ${file}_${n} > /dev/null 2>&1
+done
+
+kill $balance_pid &> /dev/null
+wait
+
+# Sometimes, even after we killed $balance_pid and wait returned,
+# balance may still be running; use balance cancel to wait for it.
+_run_btrfs_util_prog balance cancel $SCRATCH_MNT &> /dev/null
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/203.out b/tests/btrfs/203.out
new file mode 100644
index 000..404394c
--- /dev/null
+++ b/tests/btrfs/203.out
@@ -0,0 +1,3 @@
+QA output created by 203
+wrote 131072/131072 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 0c03cf1..fa90f33 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -122,3 +122,4 @@
 200 auto dedup
 201 auto dedup
 202 auto dedup
+203 auto dedup balance
-- 
2.7.1





Re: [PATCH 1/5] fstests: Add support to check btrfs sysfs features

2016-02-23 Thread Filipe Manana
On Wed, Feb 24, 2016 at 6:35 AM, Qu Wenruo  wrote:
> Btrfs has its sysfs interface showing what features current kernel/btrfs
> module support.
>
> Add _require_btrfs_kernel_feature() to check such interface.


I think you sent the wrong patch. This doesn't add such a function and
the changes are exactly the same as in:

[PATCH 1/5] fstests: rename _require_btrfs to _require_btrfs_subcommand

>
> Also rename _require_btrfs() to _require_btrfs_subcommand() to avoid
> confusion.

So if there's a dedicated patch to do that already (the one I
mentioned above), why do it here again? (and should be a separate
patch anyway, since it's unrelated)

>
> Signed-off-by: Qu Wenruo 
> ---
>  common/rc   | 2 +-
>  tests/btrfs/004 | 2 +-
>  tests/btrfs/048 | 1 +
>  tests/btrfs/059 | 1 +
>  4 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/common/rc b/common/rc
> index af16c81..ff57862 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2706,7 +2706,7 @@ _require_deletable_scratch_dev_pool()
>  }
>
>  # We check for btrfs and (optionally) features of the btrfs command
> -_require_btrfs()
> +_require_btrfs_subcommand()
>  {
> cmd=$1
> _require_command "$BTRFS_UTIL_PROG" btrfs
> diff --git a/tests/btrfs/004 b/tests/btrfs/004
> index 905770a..2ce628e 100755
> --- a/tests/btrfs/004
> +++ b/tests/btrfs/004
> @@ -51,7 +51,7 @@ _supported_fs btrfs
>  _supported_os Linux
>  _require_scratch
>  _require_no_large_scratch_dev
> -_require_btrfs inspect-internal
> +_require_btrfs_subcommand inspect-internal
>  _require_command "/usr/sbin/filefrag" filefrag
>
>  rm -f $seqres.full
> diff --git a/tests/btrfs/048 b/tests/btrfs/048
> index c2cb4a6..d15346a 100755
> --- a/tests/btrfs/048
> +++ b/tests/btrfs/048
> @@ -48,6 +48,7 @@ _supported_os Linux
>  _require_test
>  _require_scratch
>  _require_btrfs "property"
> +_require_btrfs_subcommand "property"
>
>  send_files_dir=$TEST_DIR/btrfs-test-$seq
>
> diff --git a/tests/btrfs/059 b/tests/btrfs/059
> index b9a6ef4..6e7f7ee 100755
> --- a/tests/btrfs/059
> +++ b/tests/btrfs/059
> @@ -51,6 +51,7 @@ _supported_os Linux
>  _require_test
>  _require_scratch
>  _require_btrfs "property"
> +_require_btrfs_subcommand "property"
>
>  rm -f $seqres.full
>
> --
> 2.7.1
>
>
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


Re: [PATCH 1/5] fstests: Add support to check btrfs sysfs features

2016-02-23 Thread Qu Wenruo



Filipe Manana wrote on 2016/02/24 07:27 +:
> On Wed, Feb 24, 2016 at 6:35 AM, Qu Wenruo  wrote:
>> Btrfs has its sysfs interface showing what features current kernel/btrfs
>> module support.
>>
>> Add _require_btrfs_kernel_feature() to check such interface.
>
> I think you sent the wrong patch. This doesn't add such a function and
> the changes are exactly the same as in:
>
> [PATCH 1/5] fstests: rename _require_btrfs to _require_btrfs_subcommand

Oh, this is an old and deprecated patch.
I forgot to clean up the directory...

Please ignore this one.

The other one, "[PATCH 1/5] fstests: rename _require_btrfs to
_require_btrfs_subcommand", is the correct one,
as fstests already provides _require_btrfs_fs_feature().

I'll send the patchset.

Thanks,
Qu

>> Also rename _require_btrfs() to _require_btrfs_subcommand() to avoid
>> confusion.
>
> So if there's a dedicated patch to do that already (the one I
> mentioned above), why do it here again? (and should be a separate
> patch anyway, since it's unrelated)
>
>> Signed-off-by: Qu Wenruo 
>> ---
>>  common/rc   | 2 +-
>>  tests/btrfs/004 | 2 +-
>>  tests/btrfs/048 | 1 +
>>  tests/btrfs/059 | 1 +
>>  4 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/common/rc b/common/rc
>> index af16c81..ff57862 100644
>> --- a/common/rc
>> +++ b/common/rc
>> @@ -2706,7 +2706,7 @@ _require_deletable_scratch_dev_pool()
>>  }
>>
>>  # We check for btrfs and (optionally) features of the btrfs command
>> -_require_btrfs()
>> +_require_btrfs_subcommand()
>>  {
>> cmd=$1
>> _require_command "$BTRFS_UTIL_PROG" btrfs
>> diff --git a/tests/btrfs/004 b/tests/btrfs/004
>> index 905770a..2ce628e 100755
>> --- a/tests/btrfs/004
>> +++ b/tests/btrfs/004
>> @@ -51,7 +51,7 @@ _supported_fs btrfs
>>  _supported_os Linux
>>  _require_scratch
>>  _require_no_large_scratch_dev
>> -_require_btrfs inspect-internal
>> +_require_btrfs_subcommand inspect-internal
>>  _require_command "/usr/sbin/filefrag" filefrag
>>
>>  rm -f $seqres.full
>> diff --git a/tests/btrfs/048 b/tests/btrfs/048
>> index c2cb4a6..d15346a 100755
>> --- a/tests/btrfs/048
>> +++ b/tests/btrfs/048
>> @@ -48,6 +48,7 @@ _supported_os Linux
>>  _require_test
>>  _require_scratch
>>  _require_btrfs "property"
>> +_require_btrfs_subcommand "property"
>>
>>  send_files_dir=$TEST_DIR/btrfs-test-$seq
>>
>> diff --git a/tests/btrfs/059 b/tests/btrfs/059
>> index b9a6ef4..6e7f7ee 100755
>> --- a/tests/btrfs/059
>> +++ b/tests/btrfs/059
>> @@ -51,6 +51,7 @@ _supported_os Linux
>>  _require_test
>>  _require_scratch
>>  _require_btrfs "property"
>> +_require_btrfs_subcommand "property"
>>
>>  rm -f $seqres.full
>>
>> --
>> 2.7.1
