[PATCH] btrfs: remove a FIXME in btrfs_get_acl()

2014-05-19 Thread Zhang Zhen
No function called here returns -ENOENT, so the check is useless.
Remove it, along with the now-redundant braces.

Signed-off-by: Zhang Zhen zhenzhang.zh...@huawei.com
---
 fs/btrfs/acl.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index ff9b399..cae7480 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -53,14 +53,12 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type)
 			return ERR_PTR(-ENOMEM);
 		size = __btrfs_getxattr(inode, name, value, size);
 	}
-	if (size > 0) {
+	if (size > 0)
 		acl = posix_acl_from_xattr(&init_user_ns, value, size);
-	} else if (size == -ENOENT || size == -ENODATA || size == 0) {
-		/* FIXME, who returns -ENOENT?  I think nobody */
+	else if (size == -ENODATA || size == 0)
 		acl = NULL;
-	} else {
+	else
 		acl = ERR_PTR(-EIO);
-	}
 	kfree(value);

 	if (!IS_ERR(acl))
-- 
1.8.1.2
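
For readers who want to see the resulting branch logic in isolation, here is a
minimal sketch in Python. It is not kernel code: classify_acl_size() and the
string labels it returns are hypothetical names made up for illustration, and
"size" stands in for the value returned by __btrfs_getxattr(). One consequence
worth noting is that, if something did return -ENOENT, it would now fall
through to the -EIO branch.

```python
import errno


def classify_acl_size(size):
    """Mirror the three branches left in btrfs_get_acl() after the patch.

    `size` is a positive byte count, 0, or a negative errno.
    The returned strings are illustrative labels, not kernel values.
    """
    if size > 0:
        return "parse"    # posix_acl_from_xattr() parses the xattr value
    elif size == -errno.ENODATA or size == 0:
        return "no-acl"   # acl = NULL: the inode has no ACL of this type
    else:
        return "eio"      # acl = ERR_PTR(-EIO): any other error
```

For example, classify_acl_size(-errno.ENOENT) lands in the last branch, which
is the only behavioural difference from the code before the patch.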


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Convert btrfs software code to ASIC

2014-05-19 Thread Le Nguyen Tran
Hi,

I am Nguyen. I am not a software development engineer but an IC (chip)
development engineer. I plan to develop an IC controller for
Network Attached Storage (NAS). The main idea is to convert software
code into a hardware implementation. Because the chip would be customized
for NAS, its performance would be high and its cost lower than that of a
microprocessor such as an Atom or Xeon (for servers).

I plan to use btrfs as the file system specification for my NAS. The
main point is that I need to understand the btrfs software code in
order to convert it into a hardware implementation. I am wondering if
any of you can help me. If we can get the chip into good shape, we
can start a company and have our own business.

If you are interested in my idea and have further questions, please
send me an email: lntran...@gmail.com

Thanks.

Nguyen.


3.14.2 Debian kernel BTRFS corruption after balance

2014-05-19 Thread Russell Coker
Last night my cron job that runs "/sbin/btrfs fi balance start -dusage=30
-musage=30 /" failed with no space for balancing.  I then ran it manually to
free space for further balancing and ended up running -musage=0 -dusage=60
(dusage=40 resulted in nothing being done).
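
As a rough illustration of why dusage=40 can do nothing while dusage=60 finds
work, here is a small Python sketch of how a usage filter picks block groups.
It rests on an assumption: that a block group is selected when its used space
is below the given percentage of its size, with usage=0 meaning "empty block
groups only". The function name and the sample numbers are made up; this is
not the kernel's code.

```python
def select_block_groups(block_groups, usage_percent):
    """Return the names of block groups a usage=N filter would pick.

    block_groups: list of (name, used_bytes, total_bytes) tuples (made-up data).
    Assumption: a group is selected when used_bytes is below usage_percent
    of total_bytes; usage=0 is treated as "completely empty groups only".
    """
    selected = []
    for name, used, total in block_groups:
        threshold = 1 if usage_percent == 0 else total * usage_percent // 100
        if used < threshold:
            selected.append(name)
    return selected


# Hypothetical block groups: one empty, one half full, one nearly full.
groups = [("bg-a", 0, 1024), ("bg-b", 512, 1024), ("bg-c", 900, 1024)]
print(select_block_groups(groups, 0))
print(select_block_groups(groups, 40))
print(select_block_groups(groups, 60))
```

With this sample data the half-full group is only picked up once the threshold
goes above 50%, which matches the kind of behaviour described above.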

http://www.coker.com.au/bug/btrfs-3.14.2-dmesg.txt.gz

After running the dusage=60 balance I had a kernel panic, apparently due to a
corrupted filesystem; the above URL has the dmesg.

http://www.coker.com.au/bug/btrfs-3.14.2-dmesg2.txt.gz

Then I rebooted the system and got the above error.  It now seems impossible 
to get a read-write filesystem.  I'll try booting with the latest Debian kernel 
(3.14.4) and see if that makes it work.  Otherwise I guess it's 
backup/format/restore.

Would it be worth keeping an image of that filesystem to see if a newer kernel 
can handle it better?

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/



Re: Convert btrfs software code to ASIC

2014-05-19 Thread Fajar A. Nugraha
On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran lntran...@gmail.com wrote:
> Hi,
>
> I am Nguyen. I am not a software development engineer but an IC (chip)
> development engineer. I plan to develop an IC controller for
> Network Attached Storage (NAS). The main idea is to convert software
> code into a hardware implementation. Because the chip would be customized
> for NAS, its performance would be high and its cost lower than that of a
> microprocessor such as an Atom or Xeon (for servers).
>
> I plan to use btrfs as the file system specification for my NAS. The
> main point is that I need to understand the btrfs software code in
> order to convert it into a hardware implementation. I am wondering if
> any of you can help me. If we can get the chip into good shape, we
> can start a company and have our own business.

I'm not sure if that's a good idea.

AFAIK btrfs depends a lot on other Linux subsystems (e.g. vfs, block,
etc.). Rather than converting/reimplementing everything, if your aim is
lower cost, you might have an easier time using something like a MediaTek
SoC (the ones used in smartphones) and running a custom-built Linux with
btrfs support on it.

For documentation,
https://btrfs.wiki.kernel.org/index.php/Main_Page#Developer_documentation
is probably the best place to start.

-- 
Fajar


RE: Convert btrfs software code to ASIC

2014-05-19 Thread Paul Jones
Hi Nguyen,
Perhaps a better idea would be to use a low-cost, low-power SoM
(system-on-module) to run Linux and the btrfs code, and use an FPGA/ASIC to
offload compression/encryption/checksums and possibly to act as a RAID
controller. Since btrfs will be under heavy development for the foreseeable
future, I doubt it would be a good idea to lock it into silicon. With this
approach the mature technologies can be hardware accelerated, and the software
parts remain available for easy upgrades.
It also significantly reduces risk for your project, and VCs like that sort of
thing!

Regards,
Paul.

-----Original Message-----
From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Le Nguyen Tran
Sent: Monday, 19 May 2014 9:07 PM
To: Fajar A. Nugraha
Cc: linux-btrfs
Subject: Re: Convert btrfs software code to ASIC

Hi Nugraha,

Thank you so much for your information. Frankly speaking, no one can confirm
whether a new start-up idea will work or not. The probability of failure is
always high. However, the benefit if it works is also very high.

I do not plan to replicate the C source code exactly. ASIC design always has
some implementation techniques that differ from software (less flexible but
faster).

The main advantages of my proposed chip are:
- Very high performance: The performance of an ASIC is normally more than 10x
higher than that of a processor, because a processor runs only 1-4
instructions sequentially. That is very suitable for a server that receives
many requests from users.
- Low cost: Inside the chip, we can customize for our function only.
In my plan, we do not need a cache (which covers a very large area), and we can
use a low-cost 0.18um technology.
- Low power: Processors run instructions sequentially and access memory (or
cache). As a result, they consume much more power than an ASIC (possibly
10x more).

Actually, ARM processors like MediaTek's are not comparable with an ASIC.
However, as I mentioned, it is just my draft idea. I still need to work more
to verify it.

Thanks.

Nguyen.



Re: 3.15-rc5 btrfs send/receive corruption errors? Does scrub warn of silent corruption?

2014-05-19 Thread Filipe David Manana
On Sat, May 17, 2014 at 11:23 PM, Marc MERLIN m...@merlins.org wrote:
 Before I delete this and start over, anything else you'd like from it?

Can you create an image of the fs with btrfs-image
(https://btrfs.wiki.kernel.org/index.php/Btrfs-image) and upload it
somewhere (or send it to me directly) so I can see if it's easy to
reproduce?

thanks


 Also, the 3.15rc5 deadlock I had did not occur again.
 So, I think it may well have been related to my doing a big
 send/receive.
 See 3.15rc5 deadlock thread.

 Marc

 On Wed, May 14, 2014 at 06:26:18AM -0700, Marc MERLIN wrote:
 On Tue, May 13, 2014 at 09:11:34PM +0100, Filipe David Manana wrote:
   Is there anything you'd like from the subvolumes on the source that
   btrfs cannot process and that I'm going to delete so that I can start
   syncing back from the SSD to the HDD?
 
  For the issue you had with send sending weird path names, I just found
  a case that leads to it (or a crash or some other weird stuff):
 
  https://patchwork.kernel.org/patch/4170401/
 
  But you really need to be using a lot of hard links and deleting them,
  so maybe it's caused by something else.

 Unfortunately, even with your patch, I get:
 legolas:/mnt/btrfs_pool2# btrfs send home_ro.20140507_10\:00\:01 | btrfs receive /mnt/btrfs_pool1/
 At subvol home_ro.20140507_10:00:01
 At subvol home_ro.20140507_10:00:01
 ERROR: chown merlin/.config/google-chrome-mysetup/ ��� failed. No such file or directory


 I just ran btrfsck and I see nothing majorly wrong with the source
 filesystem:
 legolas:~# btrfsck /dev/mapper/disk2 2>&1 | tee /tmp/fsck
 checking extents
 checking free space cache
 checking fs roots
 root 22504 inode 1926322 errors 400, nbytes wrong
 Checking filesystem on /dev/mapper/disk2
 UUID: 6afd4707-876c-46d6-9de2-21c4085b7bed
 free space inode generation (0) did not match free space cache generation (78684)
 free space inode generation (0) did not match free space cache generation (75988)
 free space inode generation (0) did not match free space cache generation (76193)
 free space inode generation (0) did not match free space cache generation (28818)
 free space inode generation (0) did not match free space cache generation (28818)
 free space inode generation (0) did not match free space cache generation (33187)
 free space inode generation (0) did not match free space cache generation (31543)
 free space inode generation (0) did not match free space cache generation (16710)
 found 283033724420 bytes used err is 1
 total csum bytes: 663653972
 total tree bytes: 7333687296
 total fs tree bytes: 5844262912
 total extent tree bytes: 631451648
 btree space waste bytes: 1497868045
 file data blocks allocated: 1081231372288
  referenced 807338209280
 Btrfs v3.14.1


 To be clear, I do not need this to work, this is a snapshot I'm going to
 delete anyway, but if there is anything you'd like me to try or capture
 for you to help with improving the code, please let me know.

 Marc



-- 
Filipe David Manana,

Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men.


Re: send/receive and bedup

2014-05-19 Thread Scott Middleton
On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
> On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:
> > I read so much about BtrFS that I mistook Bedup for Duperemove.
> > Duperemove is actually what I am testing.
>
> I'm currently using programs that find files that are the same, and
> hardlink them together:
> http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html
>
> hardlink.py actually seems to be the faster (memory and CPU) one even
> though it's in python.
> I can get others to run out of RAM on my 8GB server easily :(
>
> Bedup should be better, but last I tried I couldn't get it to work.
> It's been updated since then, I just haven't had the chance to try it
> again since then.
>
> Please post what you find out, or if you have a hardlink maker that's
> better than the ones I found :)



Thanks for that.

I may be completely wrong in my approach.

I am not looking for a file level comparison. Bedup worked fine for
that. I have a lot of virtual images and shadow protect images where
only a few megabytes may be the difference. So a file level hash and
comparison doesn't really achieve my goals.

I thought duperemove may be on a lower level.

https://github.com/markfasheh/duperemove

Duperemove is a simple tool for finding duplicated extents and
submitting them for deduplication. When given a list of files it will
hash their contents on a block by block basis and compare those hashes
to each other, finding and categorizing extents that match each
other. When given the -d option, duperemove will submit those
extents for deduplication using the btrfs-extent-same ioctl.

It defaults to 128k but you can make it smaller.
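
To make the block-by-block idea concrete, here is a minimal Python sketch of
the hashing and grouping step only. It is an illustration under assumptions,
not how duperemove is implemented: find_duplicate_blocks() is a hypothetical
name, SHA-256 is simply my choice of hash, and nothing here calls the
btrfs-extent-same ioctl, so no deduplication actually happens.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 128 * 1024  # 128 KiB, the default block size mentioned above


def find_duplicate_blocks(paths, block_size=BLOCK_SIZE):
    """Group (path, offset) pairs by the SHA-256 digest of each block.

    Returns only digests that occur more than once, i.e. candidate
    duplicate blocks within or across the given files.
    """
    by_digest = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(block_size)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                by_digest[digest].append((path, offset))
                offset += len(block)
    return {d: locs for d, locs in by_digest.items() if len(locs) > 1}
```

Two virtual machine images that differ by only a few megabytes would share
most of their block digests, which is why a block-level comparison fits this
use case better than a whole-file hash. A smaller block_size finds more
matches at the cost of more hashes to keep in memory.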

I hit a hurdle though. The 3TB HDD I used seemed OK when I ran a long
SMART test, but it seems to die every few hours. Admittedly it was part of
a failed mdadm RAID array that I pulled out of a client's machine.

The only other copy I have of the data is the original mdadm array
that was recently replaced with a new server, so I am loath to use
that HDD yet. At least for another couple of weeks!


I am still hopeful duperemove will work.

In another month I will put the 2 x 4TB HDDs online in BtrFS RAID 1
format on the production machine and have a crack at duperemove on
that after hours. I will convert the onsite backup machine, with its
2 x 4TB HDDs, to BtrFS not long after.

The ultimate goal is to be able to back up very large files offsite at
the block level, where maybe a GB changes on a daily basis. I realise
that I will have to make an original copy and manually take that to my
datacentre, but hopefully I can back up multiple clients' data after
hours, or possibly as a constant trickle.

Kind Regards

Scott


Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.

2014-05-19 Thread Chris Mason
On 05/17/2014 01:43 PM, Hugo Mills wrote:
 On Wed, Apr 16, 2014 at 07:12:19PM +0200, David Sterba wrote:
 On Wed, Apr 02, 2014 at 04:29:11PM +0800, Qu Wenruo wrote:
 Convert the old btrfs man pages to new asciidoc and split the huge
 btrfs man page into subcommand man page.

 I'm merging this patchset into the base series of integration because
 several patches need to update the docs and it's no longer feasible to
 keep it in a separate branch from the patches.
 
I've just been poking around in the docs for a completely different
 reason, and I think there's a fairly serious problem (well, as serious
 as problems get with documentation).
 
Take, for example, the format for btrfs fi resize:
 
 'resize' [devid:][+/-]size[gkm]|[devid:]max path::
 
Now, this has just thrown away all of the useful markup which
 indicates the semantics of the command. The asciidoc renders all of
 that text literally and unformatted, making alphasymbolic(*) soup of
 the docs. Compare this to the old roff man page:
 
 \fBbtrfs\fP \fBfilesystem resize\fP 
 [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP
 
This isn't perfect -- we're missing a \fB around the max -- but
 it has text in bold(⁑) and italics(⁂) and neither(☃). I've just looked
 at some of the other pages, and they've also got similar typographical
 problems. This is a lot of fiddly tedious work to get it right, and if
 it doesn't get done now in the initial commit, then we're going to end
 up with poor examples copied for every new feature or docs update,
 making the problem worse before anyone does the work to make it
 better.

Are there issues with the asciidoc form outside of the command summary line?

The reason I ask is that all of these tools have tradeoffs.  If asciidoc
makes our documentation easier to update and easier to keep up to date,
I'm willing to trade that for the perfect summary line.

I think the easiest way to add clarity to the summary (in any markup
language) is by providing examples.  Italics and bolds definitely help,
but examples always win.

-chris



Re: Convert btrfs software code to ASIC

2014-05-19 Thread Duncan
Paul Jones posted on Mon, 19 May 2014 12:24:53 + as excerpted:

 On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran lntran...@gmail.com
 wrote:

 I have a plan to develop an IC controller for Network Attached
 Storage (NAS). The main idea is converting software code into
 hardware implementation.

 I plan to use btrfs as the file system specification for my NAS.

 Perhaps a better idea would be to use a low-cost low-power som module
 to run Linux and btrfs code, and use an FPGA/ASIC to offload
 compression/encryption/checksums and to possibly act as a raid
 controller. Since btrfs will be under heavy development for the
 foreseeable future I doubt it would be a good idea to lock it into
 silicon. Using this approach the mature technologies can be hardware
 accelerated, and the software parts are available for easy upgrades.
 It also significantly reduces risk for your project, and VCs like that
 sort of thing!

This is a very good idea and what I was about to suggest.  Certainly, 
btrfs is still not fully stable, and I really would hate to see the 
current implementation etched in silicon at this time.  However, a hybrid 
approach where the mature bits such as (de-/)compression/checksums/
encryption are hardware etched/accelerated while the more general and 
still developing code is deployed as upgradeable firmware on a system-on-
module sounds like a very good idea indeed, particularly if that firmware 
is deployed as a user-modifiable/replaceable free-as-in-freedom kernel in 
keeping with the spirit of the GPL under which the Linux kernel and thus 
btrfs are written.

In other words... I doubt very much that any list regular here familiar 
with the continuing flow of bugs we see, as well as the roadmapped but 
not yet implemented features that people wanting a hardware 
implementation would certainly be interested in, would find the idea of a 
hardware implementation of anything like current code anything but 
nightmare material. =:^\  Maybe in a couple years... but even then, 
upgradeable firmware with critical mature bits offloaded for hardware 
acceleration sounds like a far better idea.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Convert btrfs software code to ASIC

2014-05-19 Thread Le Nguyen Tran
Hi Paul,

Thank you for your advice. Actually, I currently have ideas for
implementing database management (such as lists and trees) and dynamic
memory allocation in hardware to accelerate file system operations. I
still do not have a clear picture of which parts should be implemented by
a processor (as you advise) and which parts should be accelerated by
hardware. I now need to understand how the btrfs source code operates in
order to decide. I hope that one of you can help me, and if it works, we
can start up our own business.

Thanks.

Nguyen.



3.15-rc5 deadlocked a 2nd time after I was copying photos from an sdcard + common code path that deadlocks all btrfs filesystems

2014-05-19 Thread Marc MERLIN
Ok, that's 2 out of 2.

I was copying pictures from an sdcard (through mmcblk0), and the
filesystem deadlocked.

Unfortunately, when this happened, I copied my pictures (which were still
in RAM) to my 2nd drive, which is also btrfs.
I had to reboot, and of course the last pictures didn't get committed to
disk, but more annoyingly the copy I made to the second drive didn't work
either.
All the filenames got copied to the 2nd drive; some ended up with data,
and others ended up empty.
Why does a deadlock on drive 1 also cause btrfs to fail to write to
drive #2?
This is not the first time; there seem to be common codepaths across all
drives (just like problems on disk array #1 causing syslog to stop
working on the btrfs boot drive).

I tried to capture sysrq+w, but it didn't make it to disk because of that bug.
I do have a remote syslog of the hangs before that, but the capture of sysrq+w
has too much missing data to be useful:
http://marc.merlins.org/tmp/btrfs-hang.txt

Mmmh, maybe the deadlock is more complicated. I had a 2nd syslog stream
going to an ext4 filesystem, exactly to get around that btrfs master
deadlock, and now I see that didn't work either.

If sync hangs, and logging to an ext4 filesystem didn't work, am I
hitting another bug/hardware problem?

Here's what I got at the end:


[194790.138156] FAT-fs (mmcblk0p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
[194790.140892] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[194932.445153] INFO: task IndexedDB:29612 blocked for more than 120 seconds.
[194932.445161]   Tainted: GW 3.15.0-rc5-amd64-i915-preempt-20140216s1 #2
[194932.445163] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[194932.445166] IndexedDB   D 8800ccde8bc0 0 29612   5570 0x0080
[194932.445172]  8801b521fc30 0086 8801b521fc00 
8801b521ffd8
[194932.445178]  8801d622a450 000141c0 88041e3941c0 
8801d622a450
[194932.445182]  8801b521fcd0 0002 810fda1a 
8801b521fc40
[194932.445188] Call Trace:
[194932.445198]  [810fda1a] ? wait_on_page_read+0x3c/0x3c
[194932.445209]  [8161ca1b] io_schedule+0x60/0x7a
[194932.445214]  [810fda28] sleep_on_page+0xe/0x12
[194932.445219]  [8161cdab] __wait_on_bit_lock+0x46/0x8a
[194932.445223]  [810fdae3] __lock_page+0x69/0x6b
[194932.445228]  [81084771] ? autoremove_wake_function+0x34/0x34
[194932.445232]  [81240c41] lock_page+0x1e/0x21
[194932.445237]  [81244779] 
extent_write_cache_pages.isra.16.constprop.32+0x10e/0x2c3
[194932.445243]  [8161d2d4] ? mutex_unlock+0x16/0x18
[194932.445248]  [81239c74] ? btrfs_file_aio_write+0x3e9/0x4b6
[194932.445251]  [81244bd4] extent_writepages+0x4b/0x5c
[194932.445255]  [8122ee1f] ? btrfs_submit_direct+0x3f4/0x3f4
[194932.445262]  [8122d3fa] btrfs_writepages+0x28/0x2a
[194932.445267]  [811082b1] do_writepages+0x1e/0x2c
[194932.445272]  [810ff179] __filemap_fdatawrite_range+0x55/0x57
[194932.445277]  [810ff1ef] filemap_fdatawrite_range+0x13/0x15
[194932.445280]  [8123885a] btrfs_sync_file+0xa8/0x2b3
[194932.445286]  [8132048f] ? __percpu_counter_add+0x8c/0xa6
[194932.445292]  [8117a1a7] vfs_fsync_range+0x18/0x22
[194932.445296]  [8117a1cd] vfs_fsync+0x1c/0x1e
[194932.445299]  [8117a3d9] do_fsync+0x2c/0x4c
[194932.445303]  [8117a5f9] SyS_fdatasync+0x13/0x17
[194932.445308]  [81625bad] system_call_fastpath+0x1a/0x1f
[194932.445395] INFO: task kworker/u16:35:3812 blocked for more than 120 seconds.
[194932.445398]   Tainted: GW 3.15.0-rc5-amd64-i915-preempt-20140216s1 #2
[194932.445400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[194932.445403] kworker/u16:35  D  0  3812  2 0x0080
[194932.445410] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[194932.445414]  88003b647a00 0046 88003b6479d0 
88003b647fd8
[194932.445419]  88003b8ca590 000141c0 88041e3941c0 
88003b8ca590
[194932.445423]  88003b647aa0 0002 810fda1a 
88003b647a10
[194932.445427] Call Trace:
[194932.445432]  [810fda1a] ? wait_on_page_read+0x3c/0x3c
[194932.445437]  [8161c876] schedule+0x73/0x75
[194932.445441]  [8161ca1b] io_schedule+0x60/0x7a
[194932.445445]  [810fda28] sleep_on_page+0xe/0x12
[194932.445450]  [8161cdab] __wait_on_bit_lock+0x46/0x8a
[194932.445454]  [810fdae3] __lock_page+0x69/0x6b
[194932.445458]  [81084771] ? autoremove_wake_function+0x34/0x34
[194932.445461]  [81240c41] lock_page+0x1e/0x21
[194932.445465]  [81244779] 
extent_write_cache_pages.isra.16.constprop.32+0x10e/0x2c3
[194932.445470]  

Re: Convert btrfs software code to ASIC

2014-05-19 Thread Fajar A. Nugraha
On Mon, May 19, 2014 at 8:09 PM, Le Nguyen Tran lntran...@gmail.com wrote:
> I now need to understand the operation of btrfs source code to
> determine. I hope that one of you can help me


Have you read the wiki link?

-- 
Fajar


Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.

2014-05-19 Thread David Sterba
On Sat, May 17, 2014 at 06:43:15PM +0100, Hugo Mills wrote:
I've just been poking around in the docs for a completely different
 reason, and I think there's a fairly serious problem (well, as serious
 as problems get with documentation).
 
Take, for example, the format for btrfs fi resize:
 
 'resize' [devid:][+/-]size[gkm]|[devid:]max path::
 
Now, this has just thrown away all of the useful markup which
 indicates the semantics of the command. The asciidoc renders all of
 that text literally and unformatted, making alphasymbolic(*) soup of
 the docs. Compare this to the old roff man page:
 
 \fBbtrfs\fP \fBfilesystem resize\fP 
 [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP

I think we can restore the formatting with asciidoc.

The line above would become:

*btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]'max path'

or with bold max

*btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]*max* 'path'

I was at first worried that this would not be possible due to limitations of
asciidoc markup, but as this turned out not to be true, I'd rather spend the
boring time to keep the formatting as before.

My personal feeling about the enriched formatting is that the commands
stand out of the text and are easier to catch (as you've mentioned
somewhere in the thread).


Re: Hang while deleting file

2014-05-19 Thread Joshua McKinney
No luck of catching it in the act unfortunately, but I'll bear that
tip in mind for any future issues.

On Mon, May 19, 2014 at 4:00 AM, Chris Murphy li...@colorremedies.com wrote:

 On May 18, 2014, at 10:36 AM, Joshua McKinney jos...@joshka.net wrote:

 https://bugzilla.kernel.org/show_bug.cgi?id=76421

 Perceived issue: SABNZBD hangs, requires restart.
 Diagnosis shows the following in my system log at the time of hang.
 This happens more than once.
 Log:

 [ 5883.464766] INFO: task SABnzbd.py:994 blocked for more than 120 seconds.
 [ 5883.464906]   Not tainted 3.14.4-1-ARCH #1
 [ 5883.464989] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [ 5883.465130] SABnzbd.py  D 880196d1f5c0 0   994  1 
 0x
 [ 5883.465140]  8800765c9ce8 0082 0052
 880196d1f5c0
 [ 5883.465148]  000146c0 8800765c9fd8 000146c0
 880196d1f5c0
 [ 5883.465156]  ffef 880177d9a000 a02d0dbc
 8800765c9c50
 [ 5883.465163] Call Trace:
 [ 5883.465218]  [a02d0dbc] ? __set_extent_bit+0x45c/0x550 [btrfs]
 [ 5883.465252]  [a02d03c3] ? free_extent_state+0x43/0xc0 [btrfs]
 [ 5883.465284]  [a02d0dbc] ? __set_extent_bit+0x45c/0x550 [btrfs]
 [ 5883.465295]  [810b3ba4] ? __wake_up+0x44/0x50
 [ 5883.465304]  [8150b729] schedule+0x29/0x70
 [ 5883.465335]  [a02d1cd2] lock_extent_bits+0x152/0x1e0 [btrfs]
 [ 5883.465344]  [810b4020] ? __wake_up_sync+0x20/0x20
 [ 5883.465375]  [a02bfa59] btrfs_evict_inode+0x139/0x520 [btrfs]
 [ 5883.465387]  [811d5a80] evict+0xb0/0x1c0
 [ 5883.465394]  [811d6335] iput+0xf5/0x1a0
 [ 5883.465402]  [811ca9c5] do_unlinkat+0x1b5/0x300
 [ 5883.465411]  [8117899c] ? vm_munmap+0x4c/0x60
 [ 5883.465418]  [811cb986] SyS_unlink+0x16/0x20
 [ 5883.465427]  [81517769] system_call_fastpath+0x16/0x1b

 Filesystem:
 # btrfs filesystem show
 Btrfs v3.14.1

 running Data RAID, sys/meta RAID10 on 5x4TB. SABNzbd is a usenet
 download program, so the file attempting to be deleted was possibly
 large (GB)

 Recently updated to 3.14 kernel provided in arch linux. I haven't seen
 this issue before the last couple of days.

 Happy to provide more info if necessary.

 I'd include as an attachment to the bug the output from sysrq-w.

 echo 1 > /proc/sys/kernel/sysrq
 echo w > /proc/sysrq-trigger
 dmesg

 https://www.kernel.org/doc/Documentation/sysrq.txt


 Chris Murphy


Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.

2014-05-19 Thread David Sterba
On Mon, May 19, 2014 at 04:01:23PM +0200, David Sterba wrote:
 On Sat, May 17, 2014 at 06:43:15PM +0100, Hugo Mills wrote:
 I've just been poking around in the docs for a completely different
  reason, and I think there's a fairly serious problem (well, as serious
  as problems get with documentation).
  
 Take, for example, the format for btrfs fi resize:
  
  'resize' [devid:][+/-]size[gkm]|[devid:]max path::
  
 Now, this has just thrown away all of the useful markup which
  indicates the semantics of the command. The asciidoc renders all of
  that text literally and unformatted, making alphasymbolic(*) soup of
  the docs. Compare this to the old roff man page:
  
  \fBbtrfs\fP \fBfilesystem resize\fP 
  [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP
 
 I think we can restore the formatting with asciidoc.
 
 The line above would become:
 
 *btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|[devid':]'max path'
 
 or with bold max
 
 *btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|[devid':]*max* 'path'

The correct base string should read

  btrfs filesystem resize [<devid>:][+/-]<size>[kgm]|[<devid>:]max <path>

ie. add <..> around devid and size. That way it's copy-paste-ready.
In this case the italic/underlined text does not IMO add much value.
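
To make the grammar of that argument concrete, here is a minimal Python sketch of a parser for the form `[<devid>:][+/-]<size>[kgm]|[<devid>:]max`. The function name `parse_resize`, the shape of the returned dictionary, and the default devid of 1 are assumptions made for illustration; this is not the btrfs-progs or kernel parser.

```python
import re

# Hypothetical pattern for: [<devid>:][+/-]<size>[kgm] | [<devid>:]max
_RESIZE_RE = re.compile(
    r"(?:(?P<devid>\d+):)?"
    r"(?:(?P<max>max)|(?P<sign>[+-])?(?P<size>\d+)(?P<unit>[kgmKGM])?)"
)

# Unit suffixes interpreted as powers of 1024 (an assumption of this sketch).
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_resize(arg):
    """Split a resize argument into devid, mode and byte count."""
    m = _RESIZE_RE.fullmatch(arg)
    if m is None:
        raise ValueError("not a valid resize argument: %r" % arg)
    # Assumed default: device id 1 when no "<devid>:" prefix is given.
    devid = int(m.group("devid")) if m.group("devid") else 1
    if m.group("max"):
        return {"devid": devid, "mode": "max", "bytes": None}
    unit = (m.group("unit") or "").lower()
    nbytes = int(m.group("size")) * _UNITS.get(unit, 1)
    mode = {"+": "grow", "-": "shrink", None: "set"}[m.group("sign")]
    return {"devid": devid, "mode": mode, "bytes": nbytes}
```

Under these assumptions, `parse_resize("2:+10g")` describes growing device 2 by 10 GiB, and `parse_resize("max")` asks for the maximum size on the default device.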

 My personal feeling about the enriched formatting is that the commands
 stand out of the text and are easier to catch (as you've mentioned
 somewhere in the thread).

The bolded subcommand name seems to be sufficient.

The files are processed by XSL, I think it should be possible to apply
some transformation that would add '...' around <...> automatically
instead of making everybody write that.
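
A rough sketch of such a transformation, written in Python rather than XSL or sed: it wraps every angle-bracketed placeholder (for example `<path>`) in single quotes, which asciidoc renders as emphasized text. The function name `quote_placeholders` and the exact regular expression are assumptions for illustration, not the rule that was eventually merged.

```python
import re

# Matches one angle-bracketed placeholder such as <devid> or <path>.
_PLACEHOLDER_RE = re.compile(r"(<[^<>]+>)")


def quote_placeholders(line):
    """Wrap each <...> placeholder in single quotes for asciidoc."""
    return _PLACEHOLDER_RE.sub(r"'\1'", line)
```

Applied to the resize synopsis, every `<devid>`, `<size>` and `<path>` gains the quotes automatically, so the source text stays copy-paste-ready.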

Proposed changes:
- format all subcommands as bold instead of italic ('' -> **)
- add all missing <...>
- find a way how to add '...' around <...> (xsl or sed or whatever)

Does that work for you?


Re: [ANNOUNCE] xfstests: new mailing list

2014-05-19 Thread Christoph Hellwig
On Sat, May 17, 2014 at 08:19:30AM +1000, Dave Chinner wrote:
 Renaming the test suite takes a lot more work, e.g. renaming/moving
 source trees and fixing all the documentation that points to it...

In that case please call the list xfstests - a name different by a
single character is utterly confusing.  And I definitely see some merit
to the suggestion that we'll just keep the x and allow people to come up
with a nice backronym for it if they care enough.



Re: send/receive and bedup

2014-05-19 Thread Brendan Hide

On 19/05/14 15:00, Scott Middleton wrote:

On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:

On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:

I read so much about BtrFS that I mistook Bedup for Duperemove.
Duperemove is actually what I am testing.

I'm currently using programs that find files that are the same, and
hardlink them together:
http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html

hardlink.py actually seems to be the faster one (memory and CPU), even
though it's in Python.
I can get others to run out of RAM on my 8GB server easily :(


Interesting app.

One issue with hardlinking (unlikely to matter in the backup use case) is
that if you modify a file in place, every hardlink to it changes along
with it - including the ones that you don't want changed.

@Marc: Since you've been using btrfs for a while now I'm sure you've already 
considered whether or not a reflink copy is the better/worse option.
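
The hardlink behaviour described above can be demonstrated with a small Python sketch. It uses an ordinary copy as a stand-in for an independent file, because a true reflink copy needs filesystem support (for example `cp --reflink` on btrfs) that cannot be assumed here. The function name `hardlink_vs_copy` and the file names are made up for illustration.

```python
import os
import shutil
import tempfile


def hardlink_vs_copy():
    """Modify a file in place and report what a hardlink and a copy see."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        linked = os.path.join(d, "linked")
        copied = os.path.join(d, "copied")

        with open(original, "w") as f:
            f.write("v1")
        os.link(original, linked)           # second name for the same inode
        shutil.copyfile(original, copied)   # separate inode, separate data

        # Opening with "w" truncates and rewrites the existing inode.
        with open(original, "w") as f:
            f.write("v2")

        with open(linked) as f:
            linked_content = f.read()
        with open(copied) as f:
            copied_content = f.read()
        same_inode = os.stat(original).st_ino == os.stat(linked).st_ino
        return linked_content, copied_content, same_inode
```

The hardlink follows the in-place change because both names refer to one inode, while the copy keeps the old content. A reflink copy would behave like the copy from the reader's point of view while sharing extents on disk until one side is modified.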



Bedup should be better, but last I tried I couldn't get it to work.
It's been updated since then, I just haven't had the chance to try it
again since then.

Please post what you find out, or if you have a hardlink maker that's
better than the ones I found :)



Thanks for that.

I may be  completely wrong in my approach.

I am not looking for a file level comparison. Bedup worked fine for
that. I have a lot of virtual images and shadow protect images where
only a few megabytes may be the difference. So a file level hash and
comparison doesn't really achieve my goals.

I thought duperemove may be on a lower level.

https://github.com/markfasheh/duperemove

Duperemove is a simple tool for finding duplicated extents and
submitting them for deduplication. When given a list of files it will
hash their contents on a block by block basis and compare those hashes
to each other, finding and categorizing extents that match each
other. When given the -d option, duperemove will submit those
extents for deduplication using the btrfs-extent-same ioctl.

It defaults to 128k but you can make it smaller.
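
As a rough illustration of the block-by-block idea in that description, the Python sketch below hashes fixed-size blocks of in-memory byte strings and groups the locations whose hashes match. It is not duperemove's code: the function name `find_duplicate_blocks`, the use of SHA-256, and working on byte strings instead of files are all assumptions, and nothing here calls the btrfs-extent-same ioctl.

```python
import hashlib
from collections import defaultdict


def find_duplicate_blocks(blobs, block_size=128 * 1024):
    """Return groups of (name, offset) whose blocks have identical hashes.

    blobs maps a name to its contents as bytes.
    """
    seen = defaultdict(list)
    for name, data in blobs.items():
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).hexdigest()
            seen[digest].append((name, offset))
    # Only digests seen at more than one location are dedup candidates.
    return [locations for locations in seen.values() if len(locations) > 1]
```

With two disk images that differ only in a few blocks, most digests would appear in both, which is why a block-level approach can help where a whole-file hash comparison finds nothing. A smaller `block_size` finds more matches at the cost of more hashes to keep in memory.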

I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long
SMART test but seems to die every few hours. Admittedly it was part of
a failed mdadm RAID array that I pulled out of a client's machine.

The only other copy I have of the data is the original mdadm array
that was recently replaced with a new server, so I am loath to use
that HDD yet. At least for another couple of weeks!


I am still hopeful duperemove will work.
Duperemove does look exactly like what you are looking for. The last 
traffic on the mailing list regarding that was in August last year. It 
looks like it was pulled into the main kernel repository on September 1st.


The last commit to the duperemove application was on April 20th this 
year. Maybe Mark (cc'd) can provide further insight on its current status.


--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97



[PATCH 0/5] Documentation updates

2014-05-19 Thread David Sterba
Formatting changes inspired by Hugo's mail and some fixes that I found along
the way. The update to the Availability section removes the "heavy
development, not usable" text.

David Sterba (5):
  btrfs-progs: doc: fix argument notation and typos
  btrfs-progs: doc: remove text for unmerged features
  btrfs-progs: doc: autoformat user-supplied arguments by sed
  btrfs-progs: doc: make all commands and subcommands bold
  btrfs-progs: doc: update the Availability section

 Documentation/Makefile   |  9 ++--
 Documentation/btrfs-balance.txt  | 30 +++---
 Documentation/btrfs-check.txt| 18 
 Documentation/btrfs-convert.txt  |  6 +--
 Documentation/btrfs-debug-tree.txt   |  6 +--
 Documentation/btrfs-device.txt   | 27 +---
 Documentation/btrfs-filesystem.txt   | 70 ++--
 Documentation/btrfs-find-root.txt|  6 +--
 Documentation/btrfs-image.txt| 10 ++---
 Documentation/btrfs-inspect-internal.txt | 18 
 Documentation/btrfs-map-logical.txt  |  6 +--
 Documentation/btrfs-property.txt | 22 +-
 Documentation/btrfs-qgroup.txt   | 36 
 Documentation/btrfs-quota.txt| 16 
 Documentation/btrfs-receive.txt  | 12 +++---
 Documentation/btrfs-replace.txt  | 16 
 Documentation/btrfs-rescue.txt   | 16 
 Documentation/btrfs-restore.txt  | 12 +++---
 Documentation/btrfs-scrub.txt| 26 ++--
 Documentation/btrfs-send.txt | 10 ++---
 Documentation/btrfs-show-super.txt   |  6 +--
 Documentation/btrfs-subvolume.txt| 34 
 Documentation/btrfs-zero-log.txt | 12 +++---
 Documentation/btrfs.txt  | 48 +++---
 Documentation/btrfstune.txt  |  6 +--
 Documentation/fsck.btrfs.txt |  6 +--
 Documentation/mkfs.btrfs.txt | 12 +++---
 27 files changed, 207 insertions(+), 289 deletions(-)

-- 
1.9.0



[PATCH 1/5] btrfs-progs: doc: fix argument notation and typos

2014-05-19 Thread David Sterba
All user-supplied values should be enclosed in <...> to distinguish
them from verbatim strings.

Signed-off-by: David Sterba dste...@suse.cz
---
 Documentation/btrfs-balance.txt  |  6 +++---
 Documentation/btrfs-check.txt|  2 +-
 Documentation/btrfs-filesystem.txt   | 10 +-
 Documentation/btrfs-image.txt|  4 ++--
 Documentation/btrfs-inspect-internal.txt |  2 +-
 Documentation/btrfs-qgroup.txt   | 10 +-
 Documentation/btrfs-scrub.txt|  8 
 Documentation/btrfs-subvolume.txt|  6 +++---
 Documentation/fsck.btrfs.txt |  2 +-
 Documentation/mkfs.btrfs.txt |  2 +-
 10 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/Documentation/btrfs-balance.txt b/Documentation/btrfs-balance.txt
index 1b1861cfd31f..7edc44b150dd 100644
--- a/Documentation/btrfs-balance.txt
+++ b/Documentation/btrfs-balance.txt
@@ -36,11 +36,11 @@ given balance all chunks in a filesystem.
 +
 `Options`
 +
--d[filters]
+-d[<filters>]
 act on data chunks
--m[filters]
+-m[<filters>]
 act on metadata chunks
--s[filters]
+-s[<filters>]
 act on system chunks (only under -f)
 -v
 be verbose
diff --git a/Documentation/btrfs-check.txt b/Documentation/btrfs-check.txt
index 485a49cbc3ec..ce491734a981 100644
--- a/Documentation/btrfs-check.txt
+++ b/Documentation/btrfs-check.txt
@@ -22,7 +22,7 @@ https://btrfs.wiki.kernel.org/index.php/Btrfsck
 
 OPTIONS
 ---
--s|--support superblock::
+-s|--super superblock::
 use superblockth superblock copy.
 --repair::
 try to repair the filesystem.
diff --git a/Documentation/btrfs-filesystem.txt 
b/Documentation/btrfs-filesystem.txt
index de9b3f3c39c4..4ac8711f62c0 100644
--- a/Documentation/btrfs-filesystem.txt
+++ b/Documentation/btrfs-filesystem.txt
@@ -17,7 +17,7 @@ resizing, defragment.
 
 SUBCOMMAND
 --
-'df' [-b] path [path...]::
+'df' [-b] path [path...]::
 Show space usage information for a mount point.
 +
 If '-b' is given, then byte is used as unit. Default unit will be
@@ -59,10 +59,10 @@ lower than 100% because the metadata is duplicated for 
security reasons.
 If all the data and metadata are duplicated (or have a profile like RAID1)
 the Data to disk ratio could be 50%.
 
-'show' [--mounted|--all-devices|path|uuid|device|lable]::
+'show' [--mounted|--all-devices|path|uuid|device|label]::
 Show the btrfs filesystem with some additional info.
 +
-If no option nor path|uuid|device|lable is passed, btrfs shows
+If no option nor path|uuid|device|label is passed, btrfs shows
 information of all the btrfs filesystem both mounted and unmounted.
 If '--mounted' is passed, it would probe btrfs kernel to list mounted btrfs
 filesystem(s);
@@ -109,7 +109,7 @@ don't use it if you use snapshots, have de-duplicated your 
data or made
 copies with `cp --reflink`.
 
 // Some wording are extracted by the resize2fs man page
-'resize' [devid:][+/-]size[gkm]|[devid:]max path::
+'resize' [devid:][+/-]size[gkm]|[devid:]max path::
 Resize a filesystem identified by path for the underlying device
 devid *online*. +
 The devid can be found with 'btrfs filesystem show' and
@@ -133,7 +133,7 @@ partition after reducing the size of the filesystem.  This 
can done using
 it with the new desired size.  When recreating the partition make sure to use
 the same starting disk cylinder as before.
 
-'label' [dev|mount_point] [newlabel]::
+'label' [dev|mountpoint] [newlabel]::
 Show or update the label of a filesystem.
 +
 [device|mountpoint] is used to identify the filesystem. 
diff --git a/Documentation/btrfs-image.txt b/Documentation/btrfs-image.txt
index bd74a86cff44..c41e36d6c59a 100644
--- a/Documentation/btrfs-image.txt
+++ b/Documentation/btrfs-image.txt
@@ -24,10 +24,10 @@ using 1 stripe pointing to primary device, so that file 
system can be
 restored by running tree log reply if possible. To restore without
 changing number of stripes in chunk tree check -o option.
 
--c value::
+-c value::
 Compression level (0 ~ 9).
 
--t value::
+-t value::
 Number of threads (1 ~ 32) to be used to process the image dump or restore.
 
 -o::
diff --git a/Documentation/btrfs-inspect-internal.txt 
b/Documentation/btrfs-inspect-internal.txt
index 4555c70670df..c5f751dc4f71 100644
--- a/Documentation/btrfs-inspect-internal.txt
+++ b/Documentation/btrfs-inspect-internal.txt
@@ -23,7 +23,7 @@ Resolves an inode in subvolume path to all filesystem 
paths.
 -v
 verbose mode. print count of returned paths and ioctl() return value
 
-'logical-resolve' [-Pv] [-s bufsize] logical path::
+'logical-resolve' [-Pv] [-s bufsize] logical path::
 Resolves a logical address in the filesystem mounted at path to all inodes.
 +
 By default, each inode is then resolved to a file system path (similar to the
diff --git a/Documentation/btrfs-qgroup.txt b/Documentation/btrfs-qgroup.txt
index d0544232f353..531febb3a086 100644
--- a/Documentation/btrfs-qgroup.txt
+++ b/Documentation/btrfs-qgroup.txt
@@ -73,15 +73,15 

[PATCH 5/5] btrfs-progs: doc: update the Availability section

2014-05-19 Thread David Sterba
Does not reflect the current state. The wiki contains more details on
the first page.

Signed-off-by: David Sterba dste...@suse.cz
---
 Documentation/btrfs-balance.txt  | 4 +---
 Documentation/btrfs-check.txt| 4 +---
 Documentation/btrfs-device.txt   | 4 +---
 Documentation/btrfs-filesystem.txt   | 4 +---
 Documentation/btrfs-inspect-internal.txt | 4 +---
 Documentation/btrfs-property.txt | 4 +---
 Documentation/btrfs-qgroup.txt   | 4 +---
 Documentation/btrfs-quota.txt| 4 +---
 Documentation/btrfs-receive.txt  | 4 +---
 Documentation/btrfs-replace.txt  | 4 +---
 Documentation/btrfs-rescue.txt   | 4 +---
 Documentation/btrfs-restore.txt  | 4 +---
 Documentation/btrfs-scrub.txt| 4 +---
 Documentation/btrfs-send.txt | 4 +---
 Documentation/btrfs-subvolume.txt| 4 +---
 Documentation/btrfs.txt  | 4 +---
 Documentation/mkfs.btrfs.txt | 4 +---
 17 files changed, 17 insertions(+), 51 deletions(-)

diff --git a/Documentation/btrfs-balance.txt b/Documentation/btrfs-balance.txt
index d34833d6f380..37d8781eee4e 100644
--- a/Documentation/btrfs-balance.txt
+++ b/Documentation/btrfs-balance.txt
@@ -68,9 +68,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-check.txt b/Documentation/btrfs-check.txt
index 073667265c13..027032b2efb8 100644
--- a/Documentation/btrfs-check.txt
+++ b/Documentation/btrfs-check.txt
@@ -38,9 +38,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-device.txt b/Documentation/btrfs-device.txt
index 4f847763bb66..0f7917d894a0 100644
--- a/Documentation/btrfs-device.txt
+++ b/Documentation/btrfs-device.txt
@@ -105,9 +105,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-filesystem.txt 
b/Documentation/btrfs-filesystem.txt
index e3e270ff0957..0ee79cbabc34 100644
--- a/Documentation/btrfs-filesystem.txt
+++ b/Documentation/btrfs-filesystem.txt
@@ -108,9 +108,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-inspect-internal.txt 
b/Documentation/btrfs-inspect-internal.txt
index fe76217365b0..5ae4997b9bc0 100644
--- a/Documentation/btrfs-inspect-internal.txt
+++ b/Documentation/btrfs-inspect-internal.txt
@@ -57,9 +57,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-property.txt b/Documentation/btrfs-property.txt
index 4b0f49e04b20..6b23e2e52aad 100644
--- a/Documentation/btrfs-property.txt
+++ b/Documentation/btrfs-property.txt
@@ -56,9 +56,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-qgroup.txt b/Documentation/btrfs-qgroup.txt
index 12321926b25b..55c4747449ff 100644
--- a/Documentation/btrfs-qgroup.txt
+++ b/Documentation/btrfs-qgroup.txt
@@ -97,9 +97,7 @@ returned in case of failure.
 
 AVAILABILITY
 
-*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
-development,
-and not suitable for any uses other than benchmarking and review.
+*btrfs* is part of btrfs-progs.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
 further details.
 
diff --git a/Documentation/btrfs-quota.txt b/Documentation/btrfs-quota.txt

[PATCH 2/5] btrfs-progs: doc: remove text for unmerged features

2014-05-19 Thread David Sterba
The asciidoc conversion was done on a development branch and there are
portions of text that do not reflect the code.

Signed-off-by: David Sterba dste...@suse.cz
---
 Documentation/btrfs-device.txt |  5 -
 Documentation/btrfs-filesystem.txt | 46 +-
 2 files changed, 1 insertion(+), 50 deletions(-)

diff --git a/Documentation/btrfs-device.txt b/Documentation/btrfs-device.txt
index 7a6bce5c650a..9cd8ad081a5a 100644
--- a/Documentation/btrfs-device.txt
+++ b/Documentation/btrfs-device.txt
@@ -86,11 +86,6 @@ filesystem as listed by blkid.
 Finally, if '--all-devices' or '-d' is passed, all the devices under /dev are 
 scanned.
 
-'disk-usage' [-b] path [path..]::
-Show which chunks are in a device.
-+
-If '-b' is given, byte will be set as unit.
-
 'ready' device::
 Check device to see if it has all of it's devices in cache for mounting.
 
diff --git a/Documentation/btrfs-filesystem.txt 
b/Documentation/btrfs-filesystem.txt
index 4ac8711f62c0..63e3ef676cd3 100644
--- a/Documentation/btrfs-filesystem.txt
+++ b/Documentation/btrfs-filesystem.txt
@@ -17,47 +17,8 @@ resizing, defragment.
 
 SUBCOMMAND
 --
-'df' [-b] path [path...]::
+'df' path [path...]::
 Show space usage information for a mount point.
-+
-If '-b' is given, then byte is used as unit. Default unit will be
-human-readable unit such as KiB/MiB/GiB.
-+
-The command 'btrfs filesystem df' is used to query how many space on the 
-disk(s) are used and an estimation of the free
-space of the filesystem.
-The output of the command 'btrfs filesystem df' shows:
-
-`Disk size`
-the total size of the disks which compose the filesystem.
-
-`Disk allocated`
-the size of the area of the disks used by the chunks.
-
-`Disk unallocated`
-the size of the area of the disks which is free (i.e.
-the differences of the values above).
-
-`Used`
-the portion of the logical space used by the file and metadata.
-
-`Free (estimated)`
-the estimated free space available: i.e. how many space can be used
-by the user. The evaluation cannot be rigorous because it depends by the
-allocation policy (DUP, Single, RAID1...) of the metadata and data chunks. +
-If every chunk is stored as Single the sum of the free (estimated) space
-and the used space  is equal to the disk size.
-Otherwise if all the chunk are mirrored (raid1 or raid10) or duplicated
-the sum of the free (estimated) space and the used space is
-half of the disk size. Normally the free (estimated) is between
-these two limits.
-
-`Data to disk ratio`
-the ratio betwen the logical size (i.e. the space available by
-the chunks) and the disk allocated (by the chunks). Normally it is 
-lower than 100% because the metadata is duplicated for security reasons.
-If all the data and metadata are duplicated (or have a profile like RAID1)
-the Data to disk ratio could be 50%.
 
 'show' [--mounted|--all-devices|path|uuid|device|label]::
 Show the btrfs filesystem with some additional info.
@@ -140,11 +101,6 @@ Show or update the label of a filesystem.
 If a newlabel optional argument is passed, the label is changed.
 NOTE: the maximum allowable length shall be less than 256 chars
 
-'disk-usage' [-tb] path [path...]::
-Show in which disk the chunks are allocated. +
-If '-b' is given,  set byte as unit;
-If '-t' is given, show data in tabular format.
-
 EXIT STATUS
 ---
 'btrfs filesystem' returns a zero exist status if it succeeds. Non zero is
-- 
1.9.0



[PATCH 4/5] btrfs-progs: doc: make all commands and subcommands bold

2014-05-19 Thread David Sterba
Italic format is used for parameters and values, bold makes the text visually
separated.

Signed-off-by: David Sterba dste...@suse.cz
---
 Documentation/btrfs-balance.txt  | 22 +++
 Documentation/btrfs-check.txt| 14 +-
 Documentation/btrfs-convert.txt  |  6 ++---
 Documentation/btrfs-debug-tree.txt   |  6 ++---
 Documentation/btrfs-device.txt   | 20 +++---
 Documentation/btrfs-filesystem.txt   | 22 +++
 Documentation/btrfs-find-root.txt|  6 ++---
 Documentation/btrfs-image.txt|  6 ++---
 Documentation/btrfs-inspect-internal.txt | 16 +--
 Documentation/btrfs-map-logical.txt  |  6 ++---
 Documentation/btrfs-property.txt | 20 +++---
 Documentation/btrfs-qgroup.txt   | 24 -
 Documentation/btrfs-quota.txt| 14 +-
 Documentation/btrfs-receive.txt  | 10 +++
 Documentation/btrfs-replace.txt  | 14 +-
 Documentation/btrfs-rescue.txt   | 14 +-
 Documentation/btrfs-restore.txt  | 10 +++
 Documentation/btrfs-scrub.txt| 20 +++---
 Documentation/btrfs-send.txt |  8 +++---
 Documentation/btrfs-show-super.txt   |  6 ++---
 Documentation/btrfs-subvolume.txt| 28 +--
 Documentation/btrfs-zero-log.txt | 12 -
 Documentation/btrfs.txt  | 46 
 Documentation/btrfstune.txt  |  6 ++---
 Documentation/fsck.btrfs.txt |  6 ++---
 Documentation/mkfs.btrfs.txt |  8 +++---
 26 files changed, 185 insertions(+), 185 deletions(-)

diff --git a/Documentation/btrfs-balance.txt b/Documentation/btrfs-balance.txt
index 7edc44b150dd..d34833d6f380 100644
--- a/Documentation/btrfs-balance.txt
+++ b/Documentation/btrfs-balance.txt
@@ -7,11 +7,11 @@ btrfs-balance - balance btrfs filesystem
 
 SYNOPSIS
 
-'btrfs [filesystem] balance' subcommand|args
+*btrfs [filesystem] balance* subcommand|args
 
 DESCRIPTION
 ---
-'btrfs balance' is used to balance chunks in a btrfs filesystem across
+*btrfs balance* is used to balance chunks in a btrfs filesystem across
 multiple or even single device.
 
 See `btrfs-device`(8) for more details about the effect on device management.
@@ -21,10 +21,10 @@ SUBCOMMAND
 path::
 Balance chunks across the devices *online*.
 +
-'btrfs balance path' is deprecated,
-please use 'btrfs balance start' command instead.
+*btrfs balance path* is deprecated,
+please use *btrfs balance start* command instead.
 
-'start' [options] path::
+*start* [options] path::
 Balance chunks across the devices *online*.
 +
 Balance and/or convert (change allocation profile of) chunks that
@@ -47,28 +47,28 @@ be verbose
 -f
 force reducing of metadata integrity
 
-'pause' path::
+*pause* path::
 Pause running balance.
 
-'cancel' path::
+*cancel* path::
 Cancel running or paused balance.
 
-'resume' path::
+*resume* path::
 Resume interrupted balance.
 
-'status' [-v] path::
+*status* [-v] path::
 Show status of running or paused balance.
 +
 If '-v' option is given, output will be verbose.
 
 EXIT STATUS
 ---
-'btrfs balance' returns a zero exist status if it succeeds. Non zero is
+*btrfs balance* returns a zero exist status if it succeeds. Non zero is
 returned in case of failure.
 
 AVAILABILITY
 
-'btrfs' is part of btrfs-progs. Btrfs filesystem is currently under heavy
+*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy
 development,
 and not suitable for any uses other than benchmarking and review.
 Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
diff --git a/Documentation/btrfs-check.txt b/Documentation/btrfs-check.txt
index ce491734a981..073667265c13 100644
--- a/Documentation/btrfs-check.txt
+++ b/Documentation/btrfs-check.txt
@@ -7,18 +7,18 @@ btrfs-check - check or repair a btrfs filesystem offline
 
 SYNOPSIS
 
-'btrfs check' [options] device
+*btrfs check* [options] device
 
 DESCRIPTION
 ---
-'btrfs check' is used to check or repair a btrfs filesystem offline.
+*btrfs check* is used to check or repair a btrfs filesystem offline.
 
-NOTE: Since btrfs is under heavy development especially the 'btrfs check'
+NOTE: Since btrfs is under heavy development especially the *btrfs check*
 command, it is *highly* recommended to read the following btrfs wiki before
-executing 'btrfs check' with '--repair' option: +
+executing *btrfs check* with '--repair' option: +
 https://btrfs.wiki.kernel.org/index.php/Btrfsck
 
-'btrfsck' is an alias of 'btrfs check' command and is now deprecated.
+*btrfsck* is an alias of *btrfs check* command and is now deprecated.
 
 OPTIONS
 ---
@@ -33,12 +33,12 @@ create a new extent tree.
 
 EXIT STATUS
 ---
-'btrfs check' returns a zero exist status if it succeeds. Non zero is
+*btrfs check* returns a zero exist status if it 

[PATCH 1/4 RESEND] Btrfs: all super blocks of the replaced disk must be scratched

2014-05-19 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com

In a normal scenario when a sys-admin replaces a disk, the
expectation is that btrfs will release the disk completely.

However the below test case gives the wrong impression that
the replaced disk is still in use.

$ btrfs rep start /dev/sde /dev/sdg4 /btrfs
$ mkfs.btrfs /dev/sde
/dev/sde appears to contain an existing filesystem (btrfs).
Error: Use the -f option to force overwrite.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 33 +
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b4660c4..19e68f7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6280,16 +6280,33 @@ int btrfs_scratch_superblock(struct btrfs_device *device)
 {
 	struct buffer_head *bh;
 	struct btrfs_super_block *disk_super;
+	int i;
+	u64 bytenr;
 
-	bh = btrfs_read_dev_super(device->bdev);
-	if (!bh)
-		return -EINVAL;
-	disk_super = (struct btrfs_super_block *)bh->b_data;
+	for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
+		bytenr = btrfs_sb_offset(i);
+		if (bytenr + BTRFS_SUPER_INFO_SIZE >=
+				i_size_read(device->bdev->bd_inode))
+			break;
 
-	memset(disk_super->magic, 0, sizeof(disk_super->magic));
-	set_buffer_dirty(bh);
-	sync_dirty_buffer(bh);
-	brelse(bh);
+		bh = __bread(device->bdev, bytenr / 4096,
+				BTRFS_SUPER_INFO_SIZE);
+		if (!bh)
+			continue;
+
+		disk_super = (struct btrfs_super_block *)bh->b_data;
+		if (btrfs_super_bytenr(disk_super) != bytenr ||
+			btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
+			brelse(bh);
+			continue;
+		}
+
+		memset(disk_super->magic, 0, sizeof(disk_super->magic));
+
+		set_buffer_dirty(bh);
+		sync_dirty_buffer(bh);
+		brelse(bh);
+	}
 
 	return 0;
 }
-- 
1.8.5.3
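
For readers who want to see which superblock copies the loop in this patch would visit on a device of a given size, here is a small user-space sketch in Python. The constants are written from memory of the kernel headers of that era (primary copy at 64 KiB, mirrors at 16 KiB shifted left by 12 bits per mirror, at most 3 copies, 4096 bytes each) and should be treated as assumptions, as should the helper names `sb_offset` and `superblocks_on_device`.

```python
# Assumed values, mirroring the kernel's BTRFS_SUPER_* definitions.
BTRFS_SUPER_INFO_OFFSET = 64 * 1024
BTRFS_SUPER_INFO_SIZE = 4096
BTRFS_SUPER_MIRROR_MAX = 3
BTRFS_SUPER_MIRROR_SHIFT = 12


def sb_offset(mirror):
    """Byte offset of superblock copy number `mirror` (0 is the primary)."""
    if mirror:
        return (16 * 1024) << (BTRFS_SUPER_MIRROR_SHIFT * mirror)
    return BTRFS_SUPER_INFO_OFFSET


def superblocks_on_device(dev_size):
    """Offsets of the copies that fit on a device of dev_size bytes."""
    offsets = []
    for i in range(BTRFS_SUPER_MIRROR_MAX):
        bytenr = sb_offset(i)
        # Same bound check as the patch: stop at the first copy
        # that would not fit on the device.
        if bytenr + BTRFS_SUPER_INFO_SIZE >= dev_size:
            break
        offsets.append(bytenr)
    return offsets
```

Under these assumptions a 1 GiB device carries copies at 64 KiB and 64 MiB, and the third copy at 256 GiB only exists on devices larger than that, which is why the loop has to check the device size before reading each mirror.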



[PATCH 2/4 RESEND] btrfs: btrfs_rm_device() should zero mirror SB as well

2014-05-19 Thread Anand Jain
This fix will ensure all SB copies on the disk are zeroed
when the disk is intentionally removed. This helps to
better manage disks in user land.

This version of patch also merges the Zach patch as below.

 btrfs: don't double brelse on device rm

Signed-off-by: Anand Jain anand.j...@oracle.com
Signed-off-by: Zach Brown z...@redhat.com
---
 fs/btrfs/volumes.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 19e68f7..1567439 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1681,12 +1681,43 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
 	 * remove it from the devices list and zero out the old super
 	 */
 	if (clear_super && disk_super) {
+		u64 bytenr;
+		int i;
+
 		/* make sure this device isn't detected as part of
 		 * the FS anymore
 		 */
 		memset(disk_super->magic, 0, sizeof(disk_super->magic));
 		set_buffer_dirty(bh);
 		sync_dirty_buffer(bh);
+
+		/* clear the mirror copies of super block on the disk
+		 * being removed, 0th copy is been taken care above and
+		 * the below would take of the rest
+		 */
+		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
+			bytenr = btrfs_sb_offset(i);
+			if (bytenr + BTRFS_SUPER_INFO_SIZE >=
+					i_size_read(bdev->bd_inode))
+				break;
+
+			brelse(bh);
+			bh = __bread(bdev, bytenr / 4096,
+					BTRFS_SUPER_INFO_SIZE);
+			if (!bh)
+				continue;
+
+			disk_super = (struct btrfs_super_block *)bh->b_data;
+
+			if (btrfs_super_bytenr(disk_super) != bytenr ||
+				btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
+				continue;
+			}
+			memset(disk_super->magic, 0,
+						sizeof(disk_super->magic));
+			set_buffer_dirty(bh);
+			sync_dirty_buffer(bh);
+		}
 	}
 
 	ret = 0;
-- 
1.8.5.3
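
For orientation, the loop in the patch above visits the superblock copies by byte offset and stops at the first copy that would not fit on the device. The arithmetic can be sketched in userspace C as below. The constants are my recollection of the btrfs headers of this era (primary copy at 64KiB, mirrors at 16KiB << (12 * i), a 4KiB superblock, 3 copies) and should be treated as assumptions; `sb_offset()` and `sb_copies_on_device()` are hypothetical names used only for this illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; check fs/btrfs/ctree.h before relying on them. */
#define SB_INFO_OFFSET   (64ULL * 1024)   /* primary superblock at 64KiB */
#define SB_INFO_SIZE     4096ULL          /* size of one superblock copy */
#define SB_MIRROR_MAX    3                /* number of copies considered */
#define SB_MIRROR_SHIFT  12

/* Byte offset of superblock copy 'mirror' (0 is the primary copy). */
static uint64_t sb_offset(int mirror)
{
	uint64_t start = 16ULL * 1024;

	assert(mirror >= 0 && mirror < SB_MIRROR_MAX);
	if (mirror)
		return start << (SB_MIRROR_SHIFT * mirror);
	return SB_INFO_OFFSET;
}

/*
 * Number of superblock copies that fit on a device of 'dev_size' bytes,
 * using the same bounds check as the loop in the patch: stop as soon as
 * offset + size reaches or passes the end of the device.
 */
static int sb_copies_on_device(uint64_t dev_size)
{
	int i, n = 0;

	for (i = 0; i < SB_MIRROR_MAX; i++) {
		if (sb_offset(i) + SB_INFO_SIZE >= dev_size)
			break;
		n++;
	}
	return n;
}
```

Under these assumed constants the copies sit at 64KiB, 64MiB and 256GiB, so a 1GiB device would carry two copies and only devices larger than 256GiB would carry all three. That is why the loop needs the size check before each read.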



[PATCH 3/4 RESEND] btrfs: add framework to read fs info from btrfs-control

2014-05-19 Thread Anand Jain
This adds the ioctl BTRFS_IOC_GET_FSIDS, which reads the fs
info through btrfs-control. It is needed to optimize the
heavily used btrfs-progs function check_mounted(),
plus a few other minor uses.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/super.c   | 66 +-
 fs/btrfs/volumes.c | 39 +++
 fs/btrfs/volumes.h |  2 ++
 include/uapi/linux/btrfs.h | 19 +
 4 files changed, 120 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d4878dd..b42cd50 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1717,38 +1717,92 @@ static struct file_system_type btrfs_fs_type = {
 };
 MODULE_ALIAS_FS(btrfs);
 
+static int btrfs_ioc_get_fslist(void __user *arg)
+{
+   int ret = 0;
+   u64 sz_fslist_arg;
+   u64 sz_fslist;
+   u64 sz_out;
+   struct btrfs_ioctl_fslist_args *fslist_arg;
+   struct btrfs_ioctl_fslist_args *fslist_arg_tmp;
+   struct btrfs_ioctl_fslist *fslist;
+
+   u64 cnt = 0, ucnt;
+
+   sz_fslist_arg = sizeof(*fslist_arg);
+   sz_fslist = sizeof(*fslist);
+   if (copy_from_user(&ucnt,
+   (struct btrfs_ioctl_fslist_args __user *)(arg +
+   offsetof(struct btrfs_ioctl_fslist_args, count)),
+   sizeof(ucnt)))
+   return -EFAULT;
+
+   cnt = btrfs_get_fslist_cnt();
+
+   if (cnt > ucnt) {
+   if (copy_to_user(arg +
+   offsetof(struct btrfs_ioctl_fslist_args, count),
+   &cnt, sizeof(cnt)))
+   return -EFAULT;
+   return 1;
+   }
+
+   sz_out = sz_fslist_arg + sz_fslist * cnt;
+   fslist_arg_tmp = fslist_arg = memdup_user(arg, sz_out);
+   if (IS_ERR(fslist_arg))
+   return PTR_ERR(fslist_arg);
+   fslist = (struct btrfs_ioctl_fslist *) (++fslist_arg_tmp);
+   cnt = btrfs_get_fslist(fslist, cnt);
+   fslist_arg->count = cnt;
+   if (copy_to_user(arg, fslist_arg, sz_out)) {
+   ret = -EFAULT;
+   goto out;
+   }
+   ret = 0;
+out:
+   kfree(fslist_arg);
+   return ret;
+}
+
 /*
  * used by btrfsctl to scan devices when no FS is mounted
  */
 static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
 {
-   struct btrfs_ioctl_vol_args *vol;
+   struct btrfs_ioctl_vol_args *vol = NULL;
struct btrfs_fs_devices *fs_devices;
int ret = -ENOTTY;
+   void __user *argp = (void __user *)arg;
 
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
 
-   vol = memdup_user((void __user *)arg, sizeof(*vol));
-   if (IS_ERR(vol))
-   return PTR_ERR(vol);
-
switch (cmd) {
case BTRFS_IOC_SCAN_DEV:
+   vol = memdup_user((void __user *)arg, sizeof(*vol));
+   if (IS_ERR(vol))
+   return PTR_ERR(vol);
ret = btrfs_scan_one_device(vol->name, FMODE_READ,
&btrfs_fs_type, &fs_devices);
+   kfree(vol);
break;
case BTRFS_IOC_DEVICES_READY:
+   vol = memdup_user((void __user *)arg, sizeof(*vol));
+   if (IS_ERR(vol))
+   return PTR_ERR(vol);
ret = btrfs_scan_one_device(vol->name, FMODE_READ,
&btrfs_fs_type, &fs_devices);
+   kfree(vol);
if (ret)
break;
ret = !(fs_devices->num_devices == fs_devices->total_devices);
break;
+   case BTRFS_IOC_GET_FSLIST:
+   ret = btrfs_ioc_get_fslist(argp);
+   break;
}
 
-   kfree(vol);
return ret;
 }
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1567439..e22ac22 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6341,3 +6341,42 @@ int btrfs_scratch_superblock(struct btrfs_device *device)
 
return 0;
 }
+
+int btrfs_get_fslist_cnt(void)
+{
+   int cnt = 0;
+   struct btrfs_fs_devices *fs_devices;
+
+   mutex_lock(&uuid_mutex);
+   list_for_each_entry(fs_devices, &fs_uuids, list)
+   cnt++;
+   mutex_unlock(&uuid_mutex);
+
+   return cnt;
+}
+
+u64 btrfs_get_fslist(struct btrfs_ioctl_fslist *fslist, u64 ucnt)
+{
+   u64 cnt = 0;
+   struct btrfs_fs_devices *fs_devices;
+
+   mutex_lock(&uuid_mutex);
+   list_for_each_entry(fs_devices, &fs_uuids, list) {
+   if (!(cnt < ucnt))
+   break;
+   memcpy(fslist->fsid, fs_devices->fsid,
+   BTRFS_FSID_SIZE);
+   fslist->num_devices = fs_devices->num_devices;
+   fslist->missing_devices = fs_devices->missing_devices;
+   fslist->total_devices = fs_devices->total_devices;
+
+   if 

[PATCH 2/2] btrfs: usage error should not be logged into system log

2014-05-19 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com

In my opinion, the system logs in /var/log/messages are
valuable for investigating real system issues in
the data center. People handling data center issues
spend a lot of time and effort analyzing the messages
files. Logging usage errors into /var/log/messages
is something we should avoid.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/sysfs.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 63c2907..f729199 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -374,11 +374,8 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
struct btrfs_root *root = fs_info->fs_root;
int ret;
 
-   if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n')) {
-   pr_err("BTRFS: unable to set label with more than %d bytes\n",
-  BTRFS_LABEL_SIZE - 1);
+   if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n'))
return -EINVAL;
-   }
 
trans = btrfs_start_transaction(root, 0);
if (IS_ERR(trans))
-- 
1.8.5.3



[PATCH 4/4 RESEND] btrfs: scrub maintenance event should be recorded in the messages

2014-05-19 Thread Anand Jain
This helps in understanding and solving problems.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/ioctl.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e174770..ff27c08 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3702,10 +3702,17 @@ static long btrfs_ioctl_scrub(struct file *file, void __user *arg)
goto out;
}
 
+   btrfs_info(root->fs_info, "Scrub started");
+
ret = btrfs_scrub_dev(root->fs_info, sa->devid, sa->start, sa->end,
  &sa->progress, sa->flags & BTRFS_SCRUB_READONLY,
  0);
 
+   if (ret)
+   btrfs_info(root->fs_info, "Scrub failed - %d", ret);
+   else
+   btrfs_info(root->fs_info, "Scrub finished");
+
if (copy_to_user(arg, sa, sizeof(*sa)))
ret = -EFAULT;
 
-- 
1.8.5.3



[PATCH 1/2] btrfs: label should not contain return char

2014-05-19 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com

Generally, if you use
  echo test > /sys/fs/btrfs/<fsid>/label
it appends a newline character at the end, which cannot
be part of the label. The correct command is
  echo -n test > /sys/fs/btrfs/<fsid>/label

This patch checks for this user error.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index c5eb214..63c2907 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -374,7 +374,7 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
struct btrfs_root *root = fs_info->fs_root;
int ret;
 
-   if (len >= BTRFS_LABEL_SIZE) {
+   if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n')) {
pr_err("BTRFS: unable to set label with more than %d bytes\n",
   BTRFS_LABEL_SIZE - 1);
return -EINVAL;
-- 
1.8.5.3



[PATCH RFC] btrfs: revamp /sys/fs/btrfs/fsid/devices

2014-05-19 Thread Anand Jain
As of now, without this patch, the content under the
directory /sys/fs/btrfs/<fsid>/devices is just links to the
block devices.

Moving forward, we need this btrfs sysfs path
to contain more info about the btrfs devices.

This patch provides a framework and, as of now, a fault
notification interface, which is needed to notify btrfs when
a disk disappears.

The idea is to write to
  /sys/fs/btrfs/<fsid>/devices/<disk>/fault
when we get a kobject notification about the disk
disappearance.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/sysfs.c   | 110 +
 fs/btrfs/sysfs.h   |   3 ++
 fs/btrfs/volumes.c |   5 +++
 fs/btrfs/volumes.h |   2 +
 4 files changed, 96 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index f729199..7c80a99 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -31,6 +31,7 @@
 #include "transaction.h"
 #include "sysfs.h"
 #include "volumes.h"
+#include "rcu-string.h"
 
 static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj);
 
@@ -475,19 +476,6 @@ static void __btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info)
wait_for_completion(&fs_info->kobj_unregister);
 }
 
-void btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info)
-{
-   if (fs_info->space_info_kobj) {
-   sysfs_remove_files(fs_info->space_info_kobj, allocation_attrs);
-   kobject_del(fs_info->space_info_kobj);
-   kobject_put(fs_info->space_info_kobj);
-   }
-   kobject_del(fs_info->device_dir_kobj);
-   kobject_put(fs_info->device_dir_kobj);
-   addrm_unknown_feature_attrs(fs_info, false);
-   sysfs_remove_group(&fs_info->super_kobj, &btrfs_feature_attr_group);
-   __btrfs_sysfs_remove_one(fs_info);
-}
 
 const char * const btrfs_feature_set_names[3] = {
[FEAT_COMPAT]= "compat",
@@ -564,36 +552,91 @@ static void init_feature_attrs(void)
}
 }
 
-static int add_device_membership(struct btrfs_fs_info *fs_info)
+
+#define to_btrfs_device(_kobj) container_of(_kobj, struct btrfs_device, device_kobj)
+
+static ssize_t device_kobj_fault_store(struct kobject *dev_kobj,
+   struct kobj_attribute *a, const char *buf, size_t len)
+{
+   struct btrfs_device *dev = to_btrfs_device(dev_kobj);
+
+   if (dev->missing || !dev->bdev)
+   return -EINVAL;
+
+   /* Fixme: Call appropriate device check status handler */
+
+return len;
+}
+
+BTRFS_ATTR_RW(fault, 0200, NULL, device_kobj_fault_store);
+
+static struct attribute *device_kobj_attrs[] = {
+   BTRFS_ATTR_PTR(fault),
+   NULL,
+};
+
+static void device_kobj_release(struct kobject *dev_kobj)
+{
+   /* nothing to free as of now */
+}
+
+struct kobj_type device_ktype = {
+   .sysfs_ops  = &kobj_sysfs_ops,
+   .release= device_kobj_release,
+   .default_attrs  = device_kobj_attrs,
+};
+
+int device_add_kobject(struct btrfs_fs_info *fs_info)
 {
int error = 0;
struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
struct btrfs_device *dev;
 
-   fs_info->device_dir_kobj = kobject_create_and_add("devices",
+   if (!fs_info->device_dir_kobj)
+   fs_info->device_dir_kobj = kobject_create_and_add("devices",
&fs_info->super_kobj);
if (!fs_info->device_dir_kobj)
return -ENOMEM;
 
list_for_each_entry(dev, &fs_devices->devices, dev_list) {
-   struct hd_struct *disk;
-   struct kobject *disk_kobj;
 
-   if (!dev->bdev)
+   if (!dev->bdev || dev->missing || dev->device_kobj.parent)
continue;
 
-   disk = dev->bdev->bd_part;
-   disk_kobj = &part_to_dev(disk)->kobj;
+   error = kobject_init_and_add(&dev->device_kobj, &device_ktype,
+   fs_info->device_dir_kobj, "%s",
+   strrchr(rcu_str_deref(dev->name), '/') + 1);
 
-   error = sysfs_create_link(fs_info->device_dir_kobj,
- disk_kobj, disk_kobj->name);
if (error)
break;
}
-
return error;
 }
 
+/*
+ * Remove the sysfs entries for the devices.
+ * devid provides a perticular devid for which the sysfs entry
+ * has to be removed, if -1 it would remove for all devs
+ */
+void device_rm_kobject(struct btrfs_fs_info *fs_info, u64 devid)
+{
+   struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+   struct btrfs_device *dev;
+
+   if (!fs_info->device_dir_kobj)
+   return;
+
+   list_for_each_entry(dev, &fs_devices->devices, dev_list) {
+   if (!dev->device_kobj.parent)
+   continue;
+
+   if (devid == -1 || devid == dev->devid) {
+   kobject_del(&dev->device_kobj);
+   kobject_put(&dev->device_kobj);
+   }
+   }
+}
+
 /* /sys/fs/btrfs/ entry 

Re: send/receive and bedup

2014-05-19 Thread Konstantinos Skarlatos

On 19/5/2014 7:01 μμ, Brendan Hide wrote:

On 19/05/14 15:00, Scott Middleton wrote:

On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:

On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:

I read so much about BtrFS that I mistaked Bedup with Duperemove.
Duperemove is actually what I am testing.

I'm currently using programs that find files that are the same, and
hardlink them together:
http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html 



hardlink.py actually seems to be the faster (memory and CPU) one event
though it's in python.
I can get others to run out of RAM on my 8GB server easily :(


Interesting app.

An issue with hardlinking (with the backups use-case, this problem 
isn't likely to happen), is that if you modify a file, all the 
hardlinks get changed along with it - including the ones that you 
don't want changed.


@Marc: Since you've been using btrfs for a while now I'm sure you've 
already considered whether or not a reflink copy is the better/worse 
option.




Bedup should be better, but last I tried I couldn't get it to work.
It's been updated since then, I just haven't had the chance to try it
again since then.

Please post what you find out, or if you have a hardlink maker that's
better than the ones I found :)



Thanks for that.

I may be  completely wrong in my approach.

I am not looking for a file level comparison. Bedup worked fine for
that. I have a lot of virtual images and shadow protect images where
only a few megabytes may be the difference. So a file level hash and
comparison doesn't really achieve my goals.

I thought duperemove may be on a lower level.

https://github.com/markfasheh/duperemove

Duperemove is a simple tool for finding duplicated extents and
submitting them for deduplication. When given a list of files it will
hash their contents on a block by block basis and compare those hashes
to each other, finding and categorizing extents that match each
other. When given the -d option, duperemove will submit those
extents for deduplication using the btrfs-extent-same ioctl.

It defaults to 128k but you can make it smaller.

I hit a hurdle though. The 3TB HDD  I used seemed OK when I did a long
SMART test but seems to die every few hours. Admittedly it was part of
a failed mdadm RAID array that I pulled out of a clients machine.

The only other copy I have of the data is the original mdadm array
that was recently replaced with a new server, so I am loathe to use
that HDD yet. At least for another couple of weeks!


I am still hopeful duperemove will work.
Duperemove does look exactly like what you are looking for. The last 
traffic on the mailing list regarding that was in August last year. It 
looks like it was pulled into the main kernel repository on September 
1st.


The last commit to the duperemove application was on April 20th this 
year. Maybe Mark (cc'd) can provide further insight on its current 
status.


I have been testing duperemove and it seems to work just fine, in 
contrast with bedup, which I have been unable to install or compile 
because of the mess with Python versions. I have 2 questions about duperemove:

1) can it use existing filesystem csums instead of calculating its own?
2) can it be included in btrfs-progs so that it becomes a standard 
feature of btrfs?

Thanks


Re: [PATCH 1/2] btrfs: label should not contain return char

2014-05-19 Thread Eric Sandeen
On 5/19/14, 12:04 PM, Anand Jain wrote:
> From: Anand Jain anand.j...@oracle.com
>
> generally if you use
>   echo test > /sys/fs/btrfs/<fsid>/label
> it would introduce return char at the end and it can not
> be part of the label. The correct command is
>   echo -n test > /sys/fs/btrfs/<fsid>/label
>
> This patch will check for this user error

Wouldn't it be a lot better to just strip the \n if it
exists?

> Signed-off-by: Anand Jain anand.j...@oracle.com
> ---
>  fs/btrfs/sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index c5eb214..63c2907 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -374,7 +374,7 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
>   struct btrfs_root *root = fs_info->fs_root;
>   int ret;
>
> - if (len >= BTRFS_LABEL_SIZE) {
> + if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n')) {
>   pr_err("BTRFS: unable to set label with more than %d bytes\n",
>  BTRFS_LABEL_SIZE - 1);

so if I do:

# echo mylabel > /sys/fs/btrfs/<fsid>/label

I'll get:

BTRFS: unable to set label with more than 255 bytes

which would be pretty confusing, IMHO, given the short
label I tried to create.

Just strip out the \n ...

-Eric

>   return -EINVAL;
 



Re: [PATCH 1/2] btrfs: label should not contain return char

2014-05-19 Thread Roman Mamedov
On Tue, 20 May 2014 01:04:30 +0800
Anand Jain anand.j...@oracle.com wrote:

> From: Anand Jain anand.j...@oracle.com
>
> generally if you use
>   echo test > /sys/fs/btrfs/<fsid>/label
> it would introduce return char at the end and it can not
> be part of the label. The correct command is
>   echo -n test > /sys/fs/btrfs/<fsid>/label
>
> This patch will check for this user error

Maybe instead consider checking for one trailing \n, and silently remove it
if passed, so that both of the mentioned variants of 'echo' can be used?

All other sysfs files do not care if you pass an extra \n at the end, e.g.

  echo cfq > /sys/block/sda/queue/scheduler

works fine, doesn't require you to use echo -n cfq.
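
The behaviour suggested in this thread (accept one trailing newline and drop it, instead of rejecting the write) might look roughly like the userspace sketch below. It is not the kernel patch and not a proposed replacement for `btrfs_label_store()`; `label_from_buf()`, `LABEL_SIZE` and the `-1` error return are hypothetical, chosen only to make the idea testable outside the kernel. The 256-byte limit is taken from the "255 bytes" wording of the existing error message.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

#define LABEL_SIZE 256  /* assumed: label holds at most 255 bytes + NUL */

/*
 * Copy a sysfs-style write buffer into 'label', dropping a single
 * trailing newline if there is one. Returns 'len' on success, so the
 * caller can report the whole write as consumed, or -1 if the label is
 * too long or still contains a newline somewhere in the middle.
 */
static ssize_t label_from_buf(char *label, const char *buf, size_t len)
{
	size_t n = len;

	assert(label != NULL && buf != NULL);

	/* "echo test" sends "test\n"; "echo -n test" sends "test". */
	if (n > 0 && buf[n - 1] == '\n')
		n--;

	if (n >= LABEL_SIZE || memchr(buf, '\n', n))
		return -1;

	memcpy(label, buf, n);
	label[n] = '\0';
	return (ssize_t)len;
}
```

With this shape both `echo test` and `echo -n test` store the same label, while an embedded newline is still refused.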

-- 
With respect,
Roman




Re: send/receive and bedup

2014-05-19 Thread Mark Fasheh
On Mon, May 19, 2014 at 06:01:25PM +0200, Brendan Hide wrote:
> On 19/05/14 15:00, Scott Middleton wrote:
>> On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
>> Thanks for that.
>>
>> I may be  completely wrong in my approach.
>>
>> I am not looking for a file level comparison. Bedup worked fine for
>> that. I have a lot of virtual images and shadow protect images where
>> only a few megabytes may be the difference. So a file level hash and
>> comparison doesn't really achieve my goals.
>>
>> I thought duperemove may be on a lower level.
>>
>> https://github.com/markfasheh/duperemove
>>
>> Duperemove is a simple tool for finding duplicated extents and
>> submitting them for deduplication. When given a list of files it will
>> hash their contents on a block by block basis and compare those hashes
>> to each other, finding and categorizing extents that match each
>> other. When given the -d option, duperemove will submit those
>> extents for deduplication using the btrfs-extent-same ioctl.
>>
>> It defaults to 128k but you can make it smaller.
>>
>> I hit a hurdle though. The 3TB HDD  I used seemed OK when I did a long
>> SMART test but seems to die every few hours. Admittedly it was part of
>> a failed mdadm RAID array that I pulled out of a clients machine.
>>
>> The only other copy I have of the data is the original mdadm array
>> that was recently replaced with a new server, so I am loathe to use
>> that HDD yet. At least for another couple of weeks!
>>
>>
>> I am still hopeful duperemove will work.
> Duperemove does look exactly like what you are looking for. The last 
> traffic on the mailing list regarding that was in August last year. It 
> looks like it was pulled into the main kernel repository on September 1st.

I'm confused - you need to avoid a file scan completely? Duperemove does do
that just to be clear.

In your mind, what would be the alternative to that sort of a scan?

By the way, if you know exactly where the changes are you
could just feed the duplicate extents directly to the ioctl via a script. I
have a small tool in the duperemove repository that can do that for you
('make btrfs-extent-same').


> The last commit to the duperemove application was on April 20th this year. 
> Maybe Mark (cc'd) can provide further insight on its current status.

Duperemove will be shipping as supported software in a major SUSE release so
it will be bug fixed, etc as you would expect. At the moment I'm very busy
trying to fix qgroup bugs so I haven't had much time to add features, or
handle external bug reports, etc. Also I'm not very good at advertising my
software which would be why it hasn't really been mentioned on list lately
:)

I would say that state that it's in is that I've gotten the feature set to a
point which feels reasonable, and I've fixed enough bugs that I'd appreciate
folks giving it a spin and providing reasonable feedback.

There's a TODO list which gives a decent idea of what's on my mind for
possible future improvements. I think what I'm most wanting to do right now
is some sort of (optional) writeout to a file of what was done during a run.
The idea is that you could feed that data back to duperemove to improve the
speed of subsequent runs. My priorities may change depending on feedback
from users of course.

I also at some point want to rewrite some of the duplicate extent finding
code as it got messy and could be a bit faster.
--Mark

--
Mark Fasheh


Re: send/receive and bedup

2014-05-19 Thread Mark Fasheh
On Mon, May 19, 2014 at 08:12:03PM +0300, Konstantinos Skarlatos wrote:
> On 19/5/2014 7:01 μμ, Brendan Hide wrote:
>> On 19/05/14 15:00, Scott Middleton wrote:
>> Duperemove does look exactly like what you are looking for. The last 
>> traffic on the mailing list regarding that was in August last year. It 
>> looks like it was pulled into the main kernel repository on September 1st.
>>
>> The last commit to the duperemove application was on April 20th this year. 
>> Maybe Mark (cc'd) can provide further insight on its current status.
>
> I have been testing duperemove and it seems to work just fine, in contrast 
> with bedup that i have been unable to install/compile/sort out the mess 
> with python versions. I have 2 questions about duperemove:
> 1) can it use existing filesystem csums instead of calculating its own?

Not right now, though that may be something we can feed to it in the future.

I haven't thought about this much and to be honest I don't recall *exactly*
how btrfs stores its checksums. That said, I think the feasibility of doing
this comes down to a few things:

1) how expensive is it to get at the on-disk checksums?

This might not make sense if it's simply faster to scan a file than its
checksums.


2) are they stored in a manner which makes sense for dedupe.

By that I mean, do we have a checksum for every X bytes? If so, then
theoretically life is easy - we just set our blocksize to X and load the
checksums into duperemove's internal block checksum tree. If checksums can
cover arbitrarily sized extents then we might not be able to use them at all,
or maybe we would have to 'fill in the blanks' so to speak.


3) what is the tradeoff of false positives?

Btrfs checksums are there for detecting bad blocks, as opposed to duplicate
data. The difference is that btrfs doesn't have to use very strong hashing
as a result. So we just want to make sure that we don't wind up passing
*so* many false positives to the kernel that it was just faster to scan the
file and checksum on our own.


Not that any of those questions are super difficult to answer by the
way, it's more about how much time I've had :)
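
The false-positive concern in question 3 can be illustrated with a small sketch: a checksum match only nominates two blocks as candidates, and a byte-for-byte comparison is still needed before treating them as duplicates. The checksum below is a deliberately weak byte sum chosen so that collisions are easy to produce; it is a stand-in, not CRC32c and not what btrfs or duperemove compute, and `weak_sum()` and `blocks_are_duplicates()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Weak stand-in checksum: the sum of all bytes in the block. */
static uint32_t weak_sum(const unsigned char *p, size_t n)
{
	uint32_t s = 0;

	while (n--)
		s += *p++;
	return s;
}

/*
 * Returns 1 only if both the checksums and the bytes of the two blocks
 * match. *false_positive is set to 1 when the checksums matched but the
 * contents differ, i.e. the cheap test nominated a non-duplicate.
 */
static int blocks_are_duplicates(const unsigned char *a,
				 const unsigned char *b, size_t n,
				 int *false_positive)
{
	assert(a != NULL && b != NULL && false_positive != NULL);

	*false_positive = 0;
	if (weak_sum(a, n) != weak_sum(b, n))
		return 0;		/* cheap rejection, no data compare */
	if (memcmp(a, b, n) != 0) {
		*false_positive = 1;	/* checksum collision */
		return 0;
	}
	return 1;
}
```

The weaker the checksum, the more often the second, expensive comparison runs for nothing, which is the tradeoff being weighed against simply scanning and hashing the file.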


> 2) can it be included in btrfs-progs so that it becomes a standard feature 
> of btrfs?

I have to think about this one personally as it implies some tradeoffs in my
development on duperemove that I'm not sure I want to make yet.
--Mark

--
Mark Fasheh


Re: send/receive and bedup

2014-05-19 Thread Austin S Hemmelgarn
On 2014-05-19 13:12, Konstantinos Skarlatos wrote:
 On 19/5/2014 7:01 μμ, Brendan Hide wrote:
 On 19/05/14 15:00, Scott Middleton wrote:
 On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
 On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:
 I read so much about BtrFS that I mistaked Bedup with Duperemove.
 Duperemove is actually what I am testing.
 I'm currently using programs that find files that are the same, and
 hardlink them together:
 http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html


 hardlink.py actually seems to be the faster (memory and CPU) one event
 though it's in python.
 I can get others to run out of RAM on my 8GB server easily :(

 Interesting app.

 An issue with hardlinking (with the backups use-case, this problem
 isn't likely to happen), is that if you modify a file, all the
 hardlinks get changed along with it - including the ones that you
 don't want changed.

 @Marc: Since you've been using btrfs for a while now I'm sure you've
 already considered whether or not a reflink copy is the better/worse
 option.


 Bedup should be better, but last I tried I couldn't get it to work.
 It's been updated since then, I just haven't had the chance to try it
 again since then.

 Please post what you find out, or if you have a hardlink maker that's
 better than the ones I found :)


 Thanks for that.

 I may be  completely wrong in my approach.

 I am not looking for a file level comparison. Bedup worked fine for
 that. I have a lot of virtual images and shadow protect images where
 only a few megabytes may be the difference. So a file level hash and
 comparison doesn't really achieve my goals.

 I thought duperemove may be on a lower level.

 https://github.com/markfasheh/duperemove

 Duperemove is a simple tool for finding duplicated extents and
 submitting them for deduplication. When given a list of files it will
 hash their contents on a block by block basis and compare those hashes
 to each other, finding and categorizing extents that match each
 other. When given the -d option, duperemove will submit those
 extents for deduplication using the btrfs-extent-same ioctl.

 It defaults to 128k but you can make it smaller.

 I hit a hurdle though. The 3TB HDD  I used seemed OK when I did a long
 SMART test but seems to die every few hours. Admittedly it was part of
 a failed mdadm RAID array that I pulled out of a clients machine.

 The only other copy I have of the data is the original mdadm array
 that was recently replaced with a new server, so I am loathe to use
 that HDD yet. At least for another couple of weeks!


 I am still hopeful duperemove will work.
 Duperemove does look exactly like what you are looking for. The last
 traffic on the mailing list regarding that was in August last year. It
 looks like it was pulled into the main kernel repository on September
 1st.

 The last commit to the duperemove application was on April 20th this
 year. Maybe Mark (cc'd) can provide further insight on its current
 status.

> I have been testing duperemove and it seems to work just fine, in
> contrast with bedup that i have been unable to install/compile/sort out
> the mess with python versions. I have 2 questions about duperemove:
> 1) can it use existing filesystem csums instead of calculating its own?
While this might seem like a great idea at first, it really isn't.
BTRFS uses CRC32c at the moment as its checksum algorithm, and while
that is relatively good at detecting small differences (i.e. a single
bit flipped out of every 64 or so bytes), it is known to have issues
with hash collisions.  Normally, the data on disk won't change enough
even from a media error to cause a hash collision, but when you start
using it to compare extents that aren't known to be the same to begin
with, and then try to merge those extents, you run the risk of serious
file corruption.  Also, AFAIK, BTRFS doesn't expose the block checksum
to userspace directly, so this would require some kernelspace support
(although I may be wrong about this, in which case I retract that
statement).
> 2) can it be included in btrfs-progs so that it becomes a standard
> feature of btrfs?
I would definitely like to second this suggestion, I hear a lot of
people talking about how BTRFS has batch deduplication, but it's almost
impossible to make use of without extra software or writing your own code.





[PATCH v2] Btrfs: ensure readers see new data after a clone operation

2014-05-19 Thread Filipe David Borba Manana
We were cleaning the clone target file range from the page cache before
we did replace the file extent items in the fs tree. This was racy,
as right after cleaning the relevant range from the page cache and before
replacing the file extent items, a read against that range could be
performed by another task and populate again the page cache with stale
data (stale after the cloning finishes). This would result in reads after
the clone operation successfully finishes to get old data (and potentially
for a very long time). Therefore evict the pages after replacing the file
extent items, so that subsequent reads will always get the new data.

Similarly, we were prone to races while cloning the file extent items
because we weren't locking the target range and waiting for any existing
ordered extents against that range to complete. It was possible that
after cloning the extent items, a write operation that was performed
before the clone operation and overlaps the same range, would end up
undoing all or part of the work the clone operation did (a worker task
running inode.c:btrfs_finish_ordered_io). Therefore lock the target
range in the io tree, wait for all pending ordered extents against that
range to finish and then safely perform the cloning.

The issue of reading stale data after the clone operation is easy to
reproduce by running the following C program in a loop until it exits
with return value 1.

  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <errno.h>
  #include <pthread.h>
  #include <sys/stat.h>
  #include <fcntl.h>
  #include <assert.h>
  #include <asm/types.h>
  #include <linux/ioctl.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  #include <sys/ioctl.h>

  #define SRC_FILE "/mnt/sdd/foo"
  #define DST_FILE "/mnt/sdd/bar"
  #define FILE_SIZE (16 * 1024)
  #define PATTERN_SRC 'X'
  #define PATTERN_DST 'Y'

  struct btrfs_ioctl_clone_range_args {
      __s64 src_fd;
      __u64 src_offset, src_length;
      __u64 dest_offset;
  };

  #define BTRFS_IOCTL_MAGIC 0x94
  #define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
                                     struct btrfs_ioctl_clone_range_args)

  static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
  static int clone_done = 0;
  static int reader_ready = 0;
  static int stale_data = 0;

  static void *reader_loop(void *arg)
  {
      char buf[4096], want_buf[4096];

      memset(want_buf, PATTERN_SRC, 4096);
      pthread_mutex_lock(&mutex);
      reader_ready = 1;
      pthread_mutex_unlock(&mutex);

      while (1) {
          int done, fd, ret;

          fd = open(DST_FILE, O_RDONLY);
          assert(fd != -1);

          pthread_mutex_lock(&mutex);
          done = clone_done;
          pthread_mutex_unlock(&mutex);

          ret = read(fd, buf, 4096);
          assert(ret == 4096);
          close(fd);

          if (done) {
              ret = memcmp(buf, want_buf, 4096);
              if (ret == 0) {
                  printf("Found new content\n");
              } else {
                  printf("Found old content\n");
                  pthread_mutex_lock(&mutex);
                  stale_data = 1;
                  pthread_mutex_unlock(&mutex);
              }
              break;
          }
      }
      return NULL;
  }

  int main(int argc, char *argv[])
  {
      pthread_t reader;
      int ret, i, fd;
      struct btrfs_ioctl_clone_range_args clone_args;
      int fd1, fd2;

      ret = remove(SRC_FILE);
      if (ret == -1 && errno != ENOENT) {
          fprintf(stderr, "Error deleting src file: %s\n",
                  strerror(errno));
          return 1;
      }
      ret = remove(DST_FILE);
      if (ret == -1 && errno != ENOENT) {
          fprintf(stderr, "Error deleting dst file: %s\n",
                  strerror(errno));
          return 1;
      }

      fd = open(SRC_FILE, O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
      assert(fd != -1);
      for (i = 0; i < FILE_SIZE; i++) {
          char c = PATTERN_SRC;
          ret = write(fd, &c, 1);
          assert(ret == 1);
      }
      close(fd);
      fd = open(DST_FILE, O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
      assert(fd != -1);
      for (i = 0; i < FILE_SIZE; i++) {
          char c = PATTERN_DST;
          ret = write(fd, &c, 1);
          assert(ret == 1);
      }
      close(fd);

      ret = pthread_create(&reader, NULL, reader_loop, NULL);
      assert(ret == 0);
      while (1) {
          int r;
          pthread_mutex_lock(&mutex);
          r = reader_ready;
          pthread_mutex_unlock(&mutex);
          if (r) break;
      }

      fd1 = open(SRC_FILE, O_RDONLY);
      if (fd1 < 0) {
          fprintf(stderr, "Error open src file: %s\n", strerror(errno));
          return 1;

Re: send/receive and bedup

2014-05-19 Thread Mark Fasheh
On Mon, May 19, 2014 at 01:59:01PM -0400, Austin S Hemmelgarn wrote:
> On 2014-05-19 13:12, Konstantinos Skarlatos wrote:
> > I have been testing duperemove and it seems to work just fine, in
> > contrast with bedup that I have been unable to install/compile/sort out
> > the mess with python versions. I have 2 questions about duperemove:
> > 1) can it use existing filesystem csums instead of calculating its own?
> While this might seem like a great idea at first, it really isn't.
> BTRFS uses CRC32c at the moment as its checksum algorithm, and while
> that is relatively good at detecting small differences (i.e. a single
> bit flipped out of every 64 or so bytes), it is known to have issues
> with hash collisions.  Normally, the data on disk won't change enough
> even from a media error to cause a hash collision, but when you start
> using it to compare extents that aren't known to be the same to begin
> with, and then try to merge those extents, you run the risk of serious
> file corruption.  Also, AFAIK, BTRFS doesn't expose the block checksum
> to userspace directly (although I may be wrong about this, in which case
> I retract the following statement), so this would require some
> kernelspace support.

I'm pretty sure you could get the checksums via ioctl. The thing about dedupe
though is that the kernel always does a byte-by-byte comparison of the file
data before merging it, so we should never corrupt data just because userspace
gave us a bad range to dedupe. That said, I don't necessarily disagree that
it might not be as good an idea as it sounds.
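
The point that a checksum match is only a hint, and that the bytes decide,
can be sketched in userspace as follows. This is an illustration rather than
the kernel's dedupe path: the function names are made up for the example, and
the CRC32C here is a plain bitwise implementation of the Castagnoli
polynomial, not the checksum code btrfs itself uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * A checksum match only nominates a candidate pair. Two different blocks
 * can share a 32-bit CRC, so the final answer comes from comparing bytes.
 */
static bool blocks_are_duplicates(const void *a, const void *b, size_t len)
{
    assert(a != NULL && b != NULL);
    if (crc32c(a, len) != crc32c(b, len))
        return false;              /* different checksums: different data */
    return memcmp(a, b, len) == 0; /* equal checksums: verify the bytes */
}
```

With this shape a colliding checksum costs one wasted comparison and never a
wrong merge.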
--Mark

--
Mark Fasheh


Re: ditto blocks on ZFS

2014-05-19 Thread Martin
On 18/05/14 17:09, Russell Coker wrote:
> On Sat, 17 May 2014 13:50:52 Martin wrote:
[...]
>> Do you see or measure any real advantage?
>
> Imagine that you have a RAID-1 array where both disks get ~14,000 read
> errors.  This could happen due to a design defect common to drives of a
> particular model or some shared environmental problem.  Most errors would
> be corrected by RAID-1 but there would be a risk of some data being lost
> due to both copies being corrupt.  Another possibility is that one disk
> could entirely die (although total disk death seems rare nowadays) and
> the other could have corruption.  If metadata was duplicated in addition
> to being on both disks then the probability of data loss would be reduced.
>
> Another issue is the case where all drive slots are filled with active
> drives (a very common configuration).  To replace a disk you have to
> physically remove the old disk before adding the new one.  If the array
> is a RAID-1 or RAID-5 then ANY error during reconstruction loses data.
> Using dup for metadata on top of the RAID protections (IE the ZFS ditto
> idea) means that case doesn't lose you data.

Your example there is for the case where in effect there is no RAID. How
is that case any better than what is already done for btrfs duplicating
metadata?



So...


What real-world failure modes do the ditto blocks usefully protect against?

And how does that compare for failure rates and against what is already
done?


For example, we have RAID1 and RAID5 to protect against any one RAID
chunk being corrupted or for the total loss of any one device.

There is a second part to that in that another failure cannot be
tolerated until the RAID is remade.


Hence, we have RAID6 that protects against any two failures for a chunk
or device. Hence with just one failure, you can tolerate a second
failure whilst rebuilding the RAID.
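
To make the single-failure case concrete, here is a minimal sketch of
RAID5-style XOR parity over equally sized chunks. It is a toy model and not
the btrfs or md implementation; the function names and the in-memory layout
are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* parity[i] is the XOR of byte i of every data chunk. */
static void xor_parity(const uint8_t *const chunks[], size_t nchunks,
                       size_t len, uint8_t *parity)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t p = 0;
        for (size_t c = 0; c < nchunks; c++)
            p ^= chunks[c][i];
        parity[i] = p;
    }
}

/*
 * Rebuild the one missing chunk (index `lost`) from the surviving chunks
 * and the parity. chunks[lost] is never read. A second missing or corrupt
 * chunk cannot be recovered this way, which is the window described above.
 */
static void rebuild_chunk(const uint8_t *const chunks[], size_t nchunks,
                          size_t lost, const uint8_t *parity, size_t len,
                          uint8_t *out)
{
    assert(lost < nchunks);
    for (size_t i = 0; i < len; i++) {
        uint8_t v = parity[i];
        for (size_t c = 0; c < nchunks; c++)
            if (c != lost)
                v ^= chunks[c][i];
        out[i] = v;
    }
}
```

Note that the rebuild trusts every surviving byte: a silently corrupt
survivor produces a silently wrong reconstruction.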


And then we supposedly have safety-by-design where the filesystem itself
is using a journal and barriers/sync to ensure that the filesystem is
always kept in a consistent state, even after an interruption to any writes.


*What other failure modes* should we guard against?


There has been mention of fixing metadata keys from single bit flips...

Should Hamming codes be used instead of a CRC so that we can have
multiple bit error detect, single bit error correct functionality for
all data both in RAM and on disk for those systems that do not use ECC RAM?

Would that be useful?...
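
For reference, "single bit error correct, double bit error detect" looks like
this on the smallest useful scale: an extended Hamming (8,4) code protecting
one 4-bit nibble. This is a textbook sketch, not a proposal for an on-disk
format, and all names in it are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum secded_status { SECDED_OK, SECDED_CORRECTED, SECDED_DOUBLE_ERROR };

static unsigned bit(unsigned w, unsigned pos)
{
    return (w >> pos) & 1u;
}

/*
 * Codeword layout: bit positions 1..7 form Hamming(7,4) with parity bits at
 * positions 1, 2, 4 and data bits at 3, 5, 6, 7. Bit 0 is an overall parity
 * bit that makes the number of set bits in the whole byte even.
 */
static uint8_t secded_encode(uint8_t nibble)
{
    assert(nibble < 16);
    unsigned d0 = bit(nibble, 0), d1 = bit(nibble, 1);
    unsigned d2 = bit(nibble, 2), d3 = bit(nibble, 3);
    unsigned p1 = d0 ^ d1 ^ d3; /* checks positions 3, 5, 7 */
    unsigned p2 = d0 ^ d2 ^ d3; /* checks positions 3, 6, 7 */
    unsigned p4 = d1 ^ d2 ^ d3; /* checks positions 5, 6, 7 */
    unsigned w = (p1 << 1) | (p2 << 2) | (d0 << 3) | (p4 << 4) |
                 (d1 << 5) | (d2 << 6) | (d3 << 7);
    unsigned overall = 0;

    for (unsigned pos = 1; pos <= 7; pos++)
        overall ^= bit(w, pos);
    return (uint8_t)(w | overall);
}

static enum secded_status secded_decode(uint8_t code, uint8_t *nibble)
{
    assert(nibble != NULL);
    unsigned w = code, syndrome = 0, overall = 0;
    enum secded_status st = SECDED_OK;

    for (unsigned pos = 1; pos <= 7; pos++)
        if (bit(w, pos))
            syndrome ^= pos;     /* XOR of set positions; 0 if consistent */
    for (unsigned pos = 0; pos <= 7; pos++)
        overall ^= bit(w, pos);  /* 0 if the total parity is still even */

    if (overall) {
        /* Odd number of flips: treat as one and flip it back.
         * Syndrome 0 here means the overall parity bit itself flipped. */
        w ^= 1u << syndrome;
        st = SECDED_CORRECTED;
    } else if (syndrome) {
        /* Even parity but inconsistent checks: two bits flipped. */
        return SECDED_DOUBLE_ERROR;
    }
    *nibble = (uint8_t)(bit(w, 3) | (bit(w, 5) << 1) |
                        (bit(w, 6) << 2) | (bit(w, 7) << 3));
    return st;
}
```

The cost in this toy is 100% overhead (4 check bits per 4 data bits); wider
codes amortize better, but three or more flipped bits can still be
miscorrected, so a code like this would complement a checksum rather than
replace it.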


Regards,
Martin



Re: [ANNOUNCE] xfstests: new mailing list

2014-05-19 Thread Dave Chinner
On Mon, May 19, 2014 at 07:55:41AM -0700, Christoph Hellwig wrote:
> On Sat, May 17, 2014 at 08:19:30AM +1000, Dave Chinner wrote:
> > Renaming the test suite takes a lot more work - e.g. renaming/moving
> > source trees and fixing all the documentation that points to it...
>
> In that case please call the list xfstests - a name different by a
> single character is utterly confusing.  And I definitely see some merit
> to the suggestion that we'll just keep the x and allow people to come up
> with a nice backronym for it if they care enough.

What is important is that we have a separate list for the filesystem
test suite we use, not whether the name has an x in it or not.
Arguing about whether there should or should not be an 'x' in the
mailing list name is just a waste of time - it's not going to make
me change the name of the list.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: ditto blocks on ZFS

2014-05-19 Thread Brendan Hide

On 2014/05/19 10:36 PM, Martin wrote:
> On 18/05/14 17:09, Russell Coker wrote:
>> On Sat, 17 May 2014 13:50:52 Martin wrote:
> [...]
>>> Do you see or measure any real advantage?
> [snip]
This is extremely difficult to measure objectively. Subjectively ... see
below.

> [snip]
>
> *What other failure modes* should we guard against?


I know I'd sleep a /little/ better at night knowing that a double disk 
failure on a raid5/1/10 configuration might ruin a ton of data along 
with an obscure set of metadata in some long tree paths - but not the 
entire filesystem.


The other use-case/failure mode - where you are somehow unlucky enough 
to have sets of bad sectors/bitrot on multiple disks that simultaneously 
affect the only copies of the tree roots - is an extremely unlikely 
scenario. As unlikely as it may be, the scenario is a very painful 
consequence in spite of VERY little corruption. That is where the 
peace-of-mind/bragging rights come in.


--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97



Re: send/receive and bedup

2014-05-19 Thread Konstantinos Skarlatos

On 19/5/2014 8:38 PM, Mark Fasheh wrote:
> On Mon, May 19, 2014 at 06:01:25PM +0200, Brendan Hide wrote:
>> On 19/05/14 15:00, Scott Middleton wrote:
>>> On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
>>> Thanks for that.
>>>
>>> I may be completely wrong in my approach.
>>>
>>> I am not looking for a file level comparison. Bedup worked fine for
>>> that. I have a lot of virtual images and shadow protect images where
>>> only a few megabytes may be the difference. So a file level hash and
>>> comparison doesn't really achieve my goals.
>>>
>>> I thought duperemove may be on a lower level.
>>>
>>> https://github.com/markfasheh/duperemove
>>>
>>> Duperemove is a simple tool for finding duplicated extents and
>>> submitting them for deduplication. When given a list of files it will
>>> hash their contents on a block by block basis and compare those hashes
>>> to each other, finding and categorizing extents that match each
>>> other. When given the -d option, duperemove will submit those
>>> extents for deduplication using the btrfs-extent-same ioctl.
>>>
>>> It defaults to 128k but you can make it smaller.
>>>
>>> I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long
>>> SMART test but seems to die every few hours. Admittedly it was part of
>>> a failed mdadm RAID array that I pulled out of a client's machine.
>>>
>>> The only other copy I have of the data is the original mdadm array
>>> that was recently replaced with a new server, so I am loath to use
>>> that HDD yet. At least for another couple of weeks!
>>>
>>> I am still hopeful duperemove will work.

>> Duperemove does look exactly like what you are looking for. The last
>> traffic on the mailing list regarding that was in August last year. It
>> looks like it was pulled into the main kernel repository on September 1st.
>
> I'm confused - you need to avoid a file scan completely? Duperemove does do
> that just to be clear.
>
> In your mind, what would be the alternative to that sort of a scan?
>
> By the way, if you know exactly where the changes are you
> could just feed the duplicate extents directly to the ioctl via a script. I
> have a small tool in the duperemove repository that can do that for you
> ('make btrfs-extent-same').
>
>> The last commit to the duperemove application was on April 20th this year.
>> Maybe Mark (cc'd) can provide further insight on its current status.
>
> Duperemove will be shipping as supported software in a major SUSE release so
> it will be bug fixed, etc as you would expect. At the moment I'm very busy
> trying to fix qgroup bugs so I haven't had much time to add features, or
> handle external bug reports, etc. Also I'm not very good at advertising my
> software which would be why it hasn't really been mentioned on list lately
> :)
>
> I would say that the state that it's in is that I've gotten the feature set
> to a point which feels reasonable, and I've fixed enough bugs that I'd
> appreciate folks giving it a spin and providing reasonable feedback.

Well, after having good results with duperemove with a few gigs of data,
I tried it on a 500 GB subvolume. After it scanned all files, it has been
stuck at 100% of one CPU core for about 5 hours, and still hasn't done any
deduping. My CPU is an Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz, so I
guess that's not the problem. So I guess the speed of duperemove drops
dramatically as data volume increases.
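
The "hash block by block, then compare the hashes" approach from the
duperemove description quoted above can be sketched as follows. This is not
duperemove's code: FNV-1a is a stand-in hash picked for brevity, the function
names are hypothetical, and the deliberately naive pairwise hash comparison
is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a, 64-bit. A stand-in hash, chosen only because it is short. */
static uint64_t fnv1a64(const uint8_t *p, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;

    while (len--) {
        h ^= *p++;
        h *= 0x100000001b3ULL;
    }
    return h;
}

/*
 * Count the whole blocks in `data` that are byte-identical to an earlier
 * block. Hashes nominate candidates; memcmp confirms them. A trailing
 * partial block is ignored. Returns (size_t)-1 if allocation fails.
 */
static size_t count_duplicate_blocks(const uint8_t *data, size_t len,
                                     size_t block_size)
{
    assert(data != NULL && block_size > 0);
    size_t nblocks = len / block_size;
    size_t dups = 0;

    if (nblocks == 0)
        return 0;
    uint64_t *hashes = malloc(nblocks * sizeof(*hashes));
    if (hashes == NULL)
        return (size_t)-1;

    for (size_t i = 0; i < nblocks; i++)
        hashes[i] = fnv1a64(data + i * block_size, block_size);

    /* Pairwise comparison: quadratic in the number of blocks. */
    for (size_t i = 1; i < nblocks; i++) {
        for (size_t j = 0; j < i; j++) {
            if (hashes[i] == hashes[j] &&
                memcmp(data + i * block_size, data + j * block_size,
                       block_size) == 0) {
                dups++;
                break;
            }
        }
    }
    free(hashes);
    return dups;
}
```

At 128k blocks, 500 GB is roughly four million blocks, so any quadratic step
like the inner loop here would dominate the run time; sorting the hashes or
using a hash table avoids that. Whether that is what is happening in the run
described above is a guess, not something this sketch can show.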




> There's a TODO list which gives a decent idea of what's on my mind for
> possible future improvements. I think what I'm most wanting to do right now
> is some sort of (optional) writeout to a file of what was done during a run.
> The idea is that you could feed that data back to duperemove to improve the
> speed of subsequent runs. My priorities may change depending on feedback
> from users of course.
>
> I also at some point want to rewrite some of the duplicate extent finding
> code as it got messy and could be a bit faster.
> --Mark
>
> --
> Mark Fasheh


Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.

2014-05-19 Thread Qu Wenruo


-------- Original Message --------
Subject: Re: [PATCH 00/27] Replace the old man page with asciidoc and 
man page for each btrfs subcommand.

From: David Sterba dste...@suse.cz
To: Hugo Mills h...@carfax.org.uk, Qu Wenruo 
quwen...@cn.fujitsu.com, linux-btrfs@vger.kernel.org, c...@fb.com

Date: 2014-05-19 22:33

On Mon, May 19, 2014 at 04:01:23PM +0200, David Sterba wrote:

On Sat, May 17, 2014 at 06:43:15PM +0100, Hugo Mills wrote:

I've just been poking around in the docs for a completely different
reason, and I think there's a fairly serious problem (well, as serious
as problems get with documentation).

Take, for example, the format for btrfs fi resize:

'resize' [devid:][+/-]size[gkm]|[devid:]max path::

Now, this has just thrown away all of the useful markup which
indicates the semantics of the command. The asciidoc renders all of
that text literally and unformatted, making alphasymbolic(*) soup of
the docs. Compare this to the old roff man page:

\fBbtrfs\fP \fBfilesystem resize\fP 
[\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP

I think we can restore the formatting with asciidoc.

The line above would become:

*btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]'max path'

or with bold max

*btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]*max* 'path'

The correct base string should read

   btrfs filesystem resize [<devid>:][+/-]<size>[kgm]|[<devid>:]max <path>

i.e. add <..> around devid and size. That way it's copy-paste-ready.
In this case the italic/underlined text does not IMO add much value.

It is completely OK for me.
Since the base string is copy-paste-ready, it would add any extra effort 
to add other markup.

My personal feeling about the enriched formatting is that the commands
stand out of the text and are easier to catch (as you've mentioned
somewhere in the thread).

The bolded subcommand name seems to be sufficient.

The files are processed by XSL, I think it should be possible to apply
some transformation that would add '...' around <...> automatically
instead of making everybody write that.

Proposed changes:
- format all subcommands as bold instead of italic ('' -> **)
- add all missing <...>
- find a way how to add '...' around <...> (xsl or sed or whatever)
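
As one way to picture the last item, here is a small filter that wraps every
angle-bracket placeholder such as <devid> in single quotes. It is only a
sketch of the transformation itself; the function name is hypothetical, and
hooking it into the documentation build (XSL, sed or otherwise) is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `in` to `out`, wrapping each <placeholder> in single quotes.
 * A '<' without a matching '>' is copied unchanged.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int quote_placeholders(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    while (*in) {
        const char *close = (*in == '<') ? strchr(in, '>') : NULL;
        size_t n = close ? (size_t)(close - in) + 1 : 1; /* bytes to copy */
        size_t need = close ? n + 2 : 1;                 /* plus two quotes */

        if (o + need + 1 > outsz)                        /* keep room for NUL */
            return -1;
        if (close)
            out[o++] = '\'';
        memcpy(out + o, in, n);
        o += n;
        if (close)
            out[o++] = '\'';
        in += n;
    }
    out[o] = '\0';
    return 0;
}
```

A placeholder that is already quoted in the source would end up quoted twice,
so a filter like this only makes sense if the source files consistently leave
the quotes out.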

Does that work for you?

That is OK for me, I'll investigate it.

Should I send a new patchset or just delta patches upon the current base?

Thanks,
Qu
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


btrfs diff between snapshots

2014-05-19 Thread Marc MERLIN
As a followup on the discussion we had on how to do a live data switch
on a partition without unmounting it since it's busy, it came back to
how do you know what changed between 2 snapshots.

btrfs send giving a file list of what was added/modified/removed is the
long term answer, and Filipe is working on patches that will offer this
in the future (thanks Filipe).

In the meantime, I found a hack/script that gives a partial diff between
2 snapshots, called it btrfs-diff but then remembered that there isn't a
page for it.

So, I made one, along with example usage:
http://marc.merlins.org/perso/btrfs/post_2014-05-19_Btrfs-diff-Between-Snapshots.html

Hope this helps.

As another temp hack, I tried to look at a quick way to parse btrfs send
output to just spit out filenames, but that wasn't too trivial (as in
with sed/perl).

If someone has something better, please share :)

Thanks,
Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: ditto blocks on ZFS

2014-05-19 Thread Russell Coker
On Mon, 19 May 2014 23:47:37 Brendan Hide wrote:
> This is extremely difficult to measure objectively. Subjectively ... see
> below.
>
> > [snip]
> >
> > *What other failure modes* should we guard against?
>
> I know I'd sleep a /little/ better at night knowing that a double disk
> failure on a raid5/1/10 configuration might ruin a ton of data along
> with an obscure set of metadata in some long tree paths - but not the
> entire filesystem.

My experience is that most disk failures that don't involve extreme physical 
damage (EG dropping a drive on concrete) don't involve totally losing the 
disk.  Much of the discussion about RAID failures concerns entirely failed 
disks, but I believe that is due to RAID implementations such as Linux 
software RAID that will entirely remove a disk when it gives errors.

I have a disk which had ~14,000 errors of which ~2000 errors were corrected by 
duplicate metadata.  If two disks with that problem were in a RAID-1 array 
then duplicate metadata would be a significant benefit.

> The other use-case/failure mode - where you are somehow unlucky enough
> to have sets of bad sectors/bitrot on multiple disks that simultaneously
> affect the only copies of the tree roots - is an extremely unlikely
> scenario. As unlikely as it may be, the scenario is a very painful
> consequence in spite of VERY little corruption. That is where the
> peace-of-mind/bragging rights come in.

http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html

The NetApp research on latent errors on drives is worth reading.  On page 12 
they report latent sector errors on 9.5% of SATA disks per year.  So if you 
lose one disk entirely the risk of having errors on a second disk is higher 
than you would want for RAID-5.  While losing the root of the tree is 
unlikely, losing a directory in the middle that has lots of subdirectories is 
a risk.
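
The arithmetic behind that concern can be sketched like this. It is a
back-of-the-envelope estimate only: it assumes disks fail independently and
treats the 9.5% annual figure as the chance that a given surviving disk has a
latent error at rebuild time, both of which are simplifications. The function
name is made up for the example.

```c
#include <assert.h>

/*
 * Probability that at least one of `surviving_disks` has a latent sector
 * error, given a per-disk probability `p_disk` and assuming independence:
 * 1 - (1 - p)^n.
 */
static double prob_any_latent_error(double p_disk, unsigned surviving_disks)
{
    assert(p_disk >= 0.0 && p_disk <= 1.0);
    double all_clean = 1.0;

    for (unsigned i = 0; i < surviving_disks; i++)
        all_clean *= 1.0 - p_disk;
    return 1.0 - all_clean;
}
```

With p = 0.095, a 4-disk RAID-5 that has lost one disk has three survivors,
and 1 - 0.905^3 comes to about 0.26, so roughly one rebuild in four would hit
at least one latent error under these assumptions.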

I can understand why people wouldn't want ditto blocks to be mandatory.  But 
why are people arguing against them as an option?


As an aside, I'd really like to be able to set RAID levels by subtree.  I'd 
like to use RAID-1 with ditto blocks for my important data and RAID-0 for 
unimportant data.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
