[PATCH] btrfs: remove a FIXME in btrfs_get_acl()
No function returns -ENOENT, so the check is useless. Remove it, along with the redundant braces.

Signed-off-by: Zhang Zhen zhenzhang.zh...@huawei.com
---
 fs/btrfs/acl.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index ff9b399..cae7480 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -53,14 +53,12 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type)
 			return ERR_PTR(-ENOMEM);
 		size = __btrfs_getxattr(inode, name, value, size);
 	}
-	if (size > 0) {
+	if (size > 0)
 		acl = posix_acl_from_xattr(&init_user_ns, value, size);
-	} else if (size == -ENOENT || size == -ENODATA || size == 0) {
-		/* FIXME, who returns -ENOENT?  I think nobody */
+	else if (size == -ENODATA || size == 0)
 		acl = NULL;
-	} else {
+	else
 		acl = ERR_PTR(-EIO);
-	}
 	kfree(value);
 
 	if (!IS_ERR(acl))
--
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
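For readers who want to see the resulting decision logic in isolation, here is a minimal userspace sketch in Python. It is not kernel code: the function name `classify_xattr_result` and the string labels are made up for illustration, and it only mirrors the three branches left after the patch (positive size, -ENODATA or zero, anything else).

```python
import errno


def classify_xattr_result(size):
    """Mirror the three branches that remain in btrfs_get_acl() after the patch.

    `size` plays the role of the getxattr-style return value: a positive
    byte count on success, or a negative errno on failure.
    The returned labels are hypothetical, chosen only for this sketch.
    """
    if size > 0:
        return "parse"    # kernel: posix_acl_from_xattr(...)
    if size == -errno.ENODATA or size == 0:
        return "no_acl"   # kernel: acl = NULL
    return "eio"          # kernel: acl = ERR_PTR(-EIO)
```

Note that with the -ENOENT comparison gone, a -ENOENT return (if some caller ever did produce one) would now fall through to the -EIO branch rather than being treated as "no ACL".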
Convert btrfs software code to ASIC
Hi,

I am Nguyen. I am not a software development engineer but an IC (chip) development engineer. I plan to develop an IC controller for Network Attached Storage (NAS). The main idea is to convert software code into a hardware implementation. Because the chip is customized for NAS, its performance is high, and its cost is lower than that of a microprocessor like Atom or Xeon (for servers).

I plan to use btrfs as the file system specification for my NAS. The main point is that I need to understand the btrfs software code in order to convert it into a hardware implementation. I am wondering if any of you can help me. If we can get the chip into good shape, we can start a company and have our own business.

If you are interested in my idea and have further questions, please send me an email: lntran...@gmail.com

Thanks.
Nguyen.
3.14.2 Debian kernel BTRFS corruption after balance
Last night my cron job that runs /sbin/btrfs fi balance start -dusage=30 -musage=30 / failed with no space for balancing. Then I manually ran it to free space for further balancing and ended up running -musage=0 -dusage=60 (dusage=40 resulted in nothing being done).

http://www.coker.com.au/bug/btrfs-3.14.2-dmesg.txt.gz

After running the dusage=60 balance I had a kernel panic apparently due to a corrupted filesystem, the above URL has the dmesg.

http://www.coker.com.au/bug/btrfs-3.14.2-dmesg2.txt.gz

Then I rebooted the system and got the above error. It now seems impossible to get a read-write filesystem.

I'll try booting with the latest Debian kernel (3.14.4) and see if that makes it work. Otherwise I guess it's backup/format/restore.

Would it be worth keeping an image of that filesystem to see if a newer kernel can handle it better?

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
Re: Convert btrfs software code to ASIC
On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran lntran...@gmail.com wrote:
> Hi, I am Nguyen. I am not a software development engineer but an IC (chip) development engineer. I plan to develop an IC controller for Network Attached Storage (NAS). The main idea is to convert software code into a hardware implementation. Because the chip is customized for NAS, its performance is high, and its cost is lower than that of a microprocessor like Atom or Xeon (for servers).
>
> I plan to use btrfs as the file system specification for my NAS. The main point is that I need to understand the btrfs software code in order to convert it into a hardware implementation. I am wondering if any of you can help me. If we can get the chip into good shape, we can start a company and have our own business.

I'm not sure if that's a good idea. AFAIK btrfs depends a lot on other linux subsystems (e.g. vfs, block, etc). Rather than converting/reimplementing everything, if your aim is lower cost, you might have an easier time using something like a mediatek SOC (the ones used on smartphones) and running a custom-built linux with btrfs support on it.

For documentation, https://btrfs.wiki.kernel.org/index.php/Main_Page#Developer_documentation is probably the best place to start.

--
Fajar
RE: Convert btrfs software code to ASIC
Hi Nguyen,

Perhaps a better idea would be to use a low-cost, low-power SoM (system-on-module) to run Linux and the btrfs code, and use an FPGA/ASIC to offload compression/encryption/checksums and possibly to act as a RAID controller. Since btrfs will be under heavy development for the foreseeable future, I doubt it would be a good idea to lock it into silicon. Using this approach the mature technologies can be hardware accelerated, and the software parts are available for easy upgrades. It also significantly reduces risk for your project, and VCs like that sort of thing!

Regards,
Paul.

-----Original Message-----
From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Le Nguyen Tran
Sent: Monday, 19 May 2014 9:07 PM
To: Fajar A. Nugraha
Cc: linux-btrfs
Subject: Re: Convert btrfs software code to ASIC

Hi Nugraha,

Thank you so much for your information. Frankly speaking, no one can confirm whether a new start-up idea will work or not. The probability of failure is always high. However, the benefit if it works is also very high.

I do not plan to exactly replicate the C source code. There are always some techniques in ASIC design that are implemented differently than in software (less flexible but faster). The main advantages of my proposed chip are:

- Very high performance: The performance of an ASIC chip is normally more than 10x higher than that of processors, because a processor runs only 1-4 instructions sequentially. That is very suitable for a server when there are many requests from users.
- Low cost: Inside the chip, we can customize for our function only. In my plan, we do not need a cache (which covers a very large area), and we can use the low-cost 0.18um technology.
- Low power: Processors run instructions sequentially and access memory (or cache). As a result, they consume much more power than an ASIC chip (it can also be 10x higher).

Actually, ARM processors like mediatek cannot be compared with an ASIC chip. However, as I mentioned, it is just my draft idea. I still need to work more to verify my idea.

Thanks.
Nguyen.

On 5/19/14, Fajar A. Nugraha l...@fajar.net wrote:
> On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran lntran...@gmail.com wrote:
>> Hi, I am Nguyen. I am not a software development engineer but an IC (chip) development engineer. I plan to develop an IC controller for Network Attached Storage (NAS). The main idea is to convert software code into a hardware implementation. Because the chip is customized for NAS, its performance is high, and its cost is lower than that of a microprocessor like Atom or Xeon (for servers).
>>
>> I plan to use btrfs as the file system specification for my NAS. The main point is that I need to understand the btrfs software code in order to convert it into a hardware implementation. I am wondering if any of you can help me. If we can get the chip into good shape, we can start a company and have our own business.
>
> I'm not sure if that's a good idea. AFAIK btrfs depends a lot on other linux subsystems (e.g. vfs, block, etc). Rather than converting/reimplementing everything, if your aim is lower cost, you might have an easier time using something like a mediatek SOC (the ones used on smartphones) and running a custom-built linux with btrfs support on it.
>
> For documentation, https://btrfs.wiki.kernel.org/index.php/Main_Page#Developer_documentation is probably the best place to start.
>
> --
> Fajar
Re: 3.15-rc5 btrfs send/receive corruption errors? Does scrub warn of silent corruption?
On Sat, May 17, 2014 at 11:23 PM, Marc MERLIN m...@merlins.org wrote:
> Before I delete this and start over, anything else you'd like from it?

Can you create an image of the fs with btrfs-image (https://btrfs.wiki.kernel.org/index.php/Btrfs-image) and upload it somewhere (or send it to me directly) to see if it's easy to reproduce?

thanks

> Also, the 3.15rc5 deadlock I had did not occur again. So, I think it may well have been related to my doing a big send/receive. See the 3.15rc5 deadlock thread.
>
> Marc
>
> On Wed, May 14, 2014 at 06:26:18AM -0700, Marc MERLIN wrote:
>> On Tue, May 13, 2014 at 09:11:34PM +0100, Filipe David Manana wrote:
>>>> Is there anything you'd like from the subvolumes on the source that btrfs cannot process and that I'm going to delete so that I can start syncing back from the SSD to the HDD?
>>>
>>> For the issue you had with send sending weird path names, I just found a case that leads to it (or a crash or some other weird stuff):
>>> https://patchwork.kernel.org/patch/4170401/
>>> But you really need to be using a lot of hard links and deleting them, so maybe it's caused by something else.
>>
>> Unfortunately, even with your patch, I get:
>>
>> legolas:/mnt/btrfs_pool2# btrfs send home_ro.20140507_10\:00\:01 | btrfs receive /mnt/btrfs_pool1/
>> At subvol home_ro.20140507_10:00:01
>> At subvol home_ro.20140507_10:00:01
>> ERROR: chown merlin/.config/google-chrome-mysetup/ ��� failed. No such file or directory
>>
>> I just ran btrfsck and I see nothing majorly wrong with the source filesystem:
>>
>> legolas:~# btrfsck /dev/mapper/disk2 2>&1 | tee /tmp/fsck
>> checking extents
>> checking free space cache
>> checking fs roots
>> root 22504 inode 1926322 errors 400, nbytes wrong
>> Checking filesystem on /dev/mapper/disk2
>> UUID: 6afd4707-876c-46d6-9de2-21c4085b7bed
>> free space inode generation (0) did not match free space cache generation (78684)
>> free space inode generation (0) did not match free space cache generation (75988)
>> free space inode generation (0) did not match free space cache generation (76193)
>> free space inode generation (0) did not match free space cache generation (28818)
>> free space inode generation (0) did not match free space cache generation (28818)
>> free space inode generation (0) did not match free space cache generation (33187)
>> free space inode generation (0) did not match free space cache generation (31543)
>> free space inode generation (0) did not match free space cache generation (16710)
>> found 283033724420 bytes used err is 1
>> total csum bytes: 663653972
>> total tree bytes: 7333687296
>> total fs tree bytes: 5844262912
>> total extent tree bytes: 631451648
>> btree space waste bytes: 1497868045
>> file data blocks allocated: 1081231372288
>> referenced 807338209280
>> Btrfs v3.14.1
>>
>> To be clear, I do not need this to work, this is a snapshot I'm going to delete anyway, but if there is anything you'd like me to try or capture for you to help with improving the code, please let me know.
>>
>> Marc
>
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901

--
Filipe David Manana,

"Reasonable men adapt themselves to the world. Unreasonable men adapt the world to themselves. That's why all progress depends on unreasonable men."
Re: send/receive and bedup
On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
> On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:
>> I read so much about BtrFS that I mistook Bedup for Duperemove. Duperemove is actually what I am testing.
>
> I'm currently using programs that find files that are the same, and hardlink them together:
> http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html
>
> hardlink.py actually seems to be the faster (memory and CPU) one even though it's in python. I can get others to run out of RAM on my 8GB server easily :(
>
> Bedup should be better, but last I tried I couldn't get it to work. It's been updated since then, I just haven't had the chance to try it again since then.
>
> Please post what you find out, or if you have a hardlink maker that's better than the ones I found :)

Thanks for that.

I may be completely wrong in my approach. I am not looking for a file-level comparison. Bedup worked fine for that. I have a lot of virtual images and ShadowProtect images where only a few megabytes may be the difference. So a file-level hash and comparison doesn't really achieve my goals.

I thought duperemove may be on a lower level.

https://github.com/markfasheh/duperemove

"Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. When given a list of files it will hash their contents on a block by block basis and compare those hashes to each other, finding and categorizing extents that match each other. When given the -d option, duperemove will submit those extents for deduplication using the btrfs-extent-same ioctl."

It defaults to 128k but you can make it smaller.

I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long SMART test but seems to die every few hours. Admittedly it was part of a failed mdadm RAID array that I pulled out of a client's machine.

The only other copy I have of the data is the original mdadm array that was recently replaced with a new server, so I am loath to use that HDD yet. At least for another couple of weeks!

I am still hopeful duperemove will work.

In another month I will put the 2 x 4TB HDDs online in BtrFS RAID 1 format on the production machine and have a crack at duperemove on that after hours. I will convert the onsite backup machine, with its 2 x 4TB HDDs, to BtrFS not long after.

The ultimate goal is to be able to back up, at block level, very large files offsite where maybe a GB changes on a daily basis. I realise that I will have to make an original copy and manually take that to my datacentre, but hopefully I can back up multiple clients' data after hours, or possibly as a constant trickle.

Kind Regards
Scott
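To make the block-level idea concrete, here is a minimal sketch in Python of hashing file contents block by block and grouping blocks with matching hashes. This is not duperemove's implementation: the function name `find_duplicate_blocks`, the in-memory `files` mapping, and the choice of SHA-256 are assumptions made for illustration, and no deduplication ioctl is issued.

```python
import hashlib
from collections import defaultdict


def find_duplicate_blocks(files, block_size=128 * 1024):
    """Group identical fixed-size blocks across files.

    `files` maps a (hypothetical) file name to its contents as bytes.
    Returns {hex_digest: [(name, offset), ...]} for every block that
    occurs more than once. The short tail block of a file is skipped
    to keep the sketch simple.
    """
    seen = defaultdict(list)
    for name, data in files.items():
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            if len(block) < block_size:
                continue  # ignore the partial block at the end of the file
            digest = hashlib.sha256(block).hexdigest()
            seen[digest].append((name, offset))
    # Keep only the digests that were seen at two or more locations.
    return {d: locs for d, locs in seen.items() if len(locs) > 1}
```

For two disk images that differ by only a few megabytes, almost every block would land in the returned mapping, which is why a block-level pass can find savings that a whole-file hash comparison misses.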
Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.
On 05/17/2014 01:43 PM, Hugo Mills wrote:
> On Wed, Apr 16, 2014 at 07:12:19PM +0200, David Sterba wrote:
>> On Wed, Apr 02, 2014 at 04:29:11PM +0800, Qu Wenruo wrote:
>>> Convert the old btrfs man pages to new asciidoc and split the huge btrfs man page into subcommand man page.
>>
>> I'm merging this patchset into the base series of integration because several patches need to update the docs and it's no longer feasible to keep it in a separate branch from the patches.
>
> I've just been poking around in the docs for a completely different reason, and I think there's a fairly serious problem (well, as serious as problems get with documentation). Take, for example, the format for btrfs fi resize:
>
> 'resize' [devid:][+/-]size[gkm]|[devid:]max path::
>
> Now, this has just thrown away all of the useful markup which indicates the semantics of the command. The asciidoc renders all of that text literally and unformatted, making alphasymbolic(*) soup of the docs. Compare this to the old roff man page:
>
> \fBbtrfs\fP \fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP
>
> This isn't perfect -- we're missing a \fB around the max -- but it has text in bold(⁑) and italics(⁂) and neither(☃).
>
> I've just looked at some of the other pages, and they've also got similar typographical problems. This is a lot of fiddly tedious work to get it right, and if it doesn't get done now in the initial commit, then we're going to end up with poor examples copied for every new feature or docs update, making the problem worse before anyone does the work to make it better.

Are there issues with the asciidoc form outside of the command summary line? The reason I ask is that all of these tools have tradeoffs. If asciidoc makes our documentation easier to update and easier to keep up to date, I'm willing to trade that for the perfect summary line.

I think the easiest way to add clarity to the summary (in any markup language) is by providing examples. Italics and bolds definitely help, but examples always win.

-chris
Re: Convert btrfs software code to ASIC
Paul Jones posted on Mon, 19 May 2014 12:24:53 + as excerpted:

> On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran lntran...@gmail.com wrote:
>> I have a plan to develop an IC controller for Network Attached Storage (NAS). The main idea is converting software code into hardware implementation. I plan to use btrfs as the file system specification for my NAS.
>
> Perhaps a better idea would be to use a low-cost low-power SoM to run Linux and btrfs code, and use an FPGA/ASIC to offload compression/encryption/checksums and to possibly act as a raid controller. Since btrfs will be under heavy development for the foreseeable future I doubt it would be a good idea to lock it into silicon. Using this approach the mature technologies can be hardware accelerated, and the software parts are available for easy upgrades. It also significantly reduces risk for your project, and VCs like that sort of thing!

This is a very good idea and what I was about to suggest.

Certainly, btrfs is still not fully stable, and I really would hate to see the current implementation etched in silicon at this time. However, a hybrid approach where the mature bits such as (de-/)compression/checksums/encryption are hardware etched/accelerated while the more general and still developing code is deployed as upgradeable firmware on a system-on-module sounds like a very good idea indeed, particularly if that firmware is deployed as a user-modifiable/replaceable free-as-in-freedom kernel in keeping with the spirit of the GPL under which the Linux kernel and thus btrfs are written.

In other words... I doubt very much that any list regular here familiar with the continuing flow of bugs we see, as well as the roadmapped but not yet implemented features that people wanting a hardware implementation would certainly be interested in, would find the idea of a hardware implementation of anything like current code anything but nightmare material. =:^\

Maybe in a couple years... but even then, upgradeable firmware with critical mature bits offloaded for hardware acceleration sounds like a far better idea.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
Re: Convert btrfs software code to ASIC
Hi Paul,

Thank you for your advice. Actually, I currently have ideas for implementing database management (like lists and trees) and dynamic memory allocation in hardware to accelerate the file system operations. I still do not have a clear picture of which parts should be implemented by a processor (as you advise) and which parts should be accelerated by hardware. I now need to understand how the btrfs source code operates in order to decide.

I hope that one of you can help me, and if it works, we can start up our own business.

Thanks.
Nguyen.

On 5/19/14, Paul Jones p...@pauljones.id.au wrote:
> Hi Nguyen,
>
> Perhaps a better idea would be to use a low-cost low-power SoM to run Linux and btrfs code, and use an FPGA/ASIC to offload compression/encryption/checksums and to possibly act as a raid controller. Since btrfs will be under heavy development for the foreseeable future I doubt it would be a good idea to lock it into silicon. Using this approach the mature technologies can be hardware accelerated, and the software parts are available for easy upgrades. It also significantly reduces risk for your project, and VCs like that sort of thing!
>
> Regards,
> Paul.
3.15-rc5 deadlocked a 2nd time after I was copying photos from an sdcard + common code path that deadlocks all btrfs filesystems
Ok, that's 2 out of 2. I was copying pictures from an sdcard (through mmcblk0), and the filesystem deadlocked. Unfortunately, when this happens, I copied my pictures (which were still in RAM) to my 2nd drive which was also btrfs. I had to reboot, and of course the last pictures didn't get committed to disk, but more annoyingly the copy I did to the second drive didn't work either. All the filenames got copied to the 2nd drive, some ended up with data, and others ended up empty. Why does a deadlock on drive 1 also cause btrfs to fail to write to drive #2? This is not the first time, there seem to be common codepaths across all drives (just like disk array #1 having problems causing failure of syslog to work on the boot drive with btrfs). I tried to capture sysrq+w, but it didn't make it to disk because of that bug. I do have remote syslog of the hangs before that though, but the capture of sysrq+w has too much missing data to be useful http://marc.merlins.org/tmp/btrfs-hang.txt Mmmh, maybe the deadlock is more complicated. I had a 2nd syslog stream going to an ext4 filesystem, exactly to get around that btrfs master deadlock, and now I see that didn't work either. If sync hangs, and logging to an ext4 filesystem didn't work, am I hitting another bug/hardware problem? Here's what I got at the end? [194790.138156] FAT-fs (mmcblk0p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive! [194790.140892] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck. [194932.445153] INFO: task IndexedDB:29612 blocked for more than 120 seconds. [194932.445161] Tainted: GW 3.15.0-rc5-amd64-i915-preempt-20140216s1 #2 [194932.445163] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. 
[194932.445166] IndexedDB D 8800ccde8bc0 0 29612 5570 0x0080 [194932.445172] 8801b521fc30 0086 8801b521fc00 8801b521ffd8 [194932.445178] 8801d622a450 000141c0 88041e3941c0 8801d622a450 [194932.445182] 8801b521fcd0 0002 810fda1a 8801b521fc40 [194932.445188] Call Trace: [194932.445198] [810fda1a] ? wait_on_page_read+0x3c/0x3c [194932.445209] [8161ca1b] io_schedule+0x60/0x7a [194932.445214] [810fda28] sleep_on_page+0xe/0x12 [194932.445219] [8161cdab] __wait_on_bit_lock+0x46/0x8a [194932.445223] [810fdae3] __lock_page+0x69/0x6b [194932.445228] [81084771] ? autoremove_wake_function+0x34/0x34 [194932.445232] [81240c41] lock_page+0x1e/0x21 [194932.445237] [81244779] extent_write_cache_pages.isra.16.constprop.32+0x10e/0x2c3 [194932.445243] [8161d2d4] ? mutex_unlock+0x16/0x18 [194932.445248] [81239c74] ? btrfs_file_aio_write+0x3e9/0x4b6 [194932.445251] [81244bd4] extent_writepages+0x4b/0x5c [194932.445255] [8122ee1f] ? btrfs_submit_direct+0x3f4/0x3f4 [194932.445262] [8122d3fa] btrfs_writepages+0x28/0x2a [194932.445267] [811082b1] do_writepages+0x1e/0x2c [194932.445272] [810ff179] __filemap_fdatawrite_range+0x55/0x57 [194932.445277] [810ff1ef] filemap_fdatawrite_range+0x13/0x15 [194932.445280] [8123885a] btrfs_sync_file+0xa8/0x2b3 [194932.445286] [8132048f] ? __percpu_counter_add+0x8c/0xa6 [194932.445292] [8117a1a7] vfs_fsync_range+0x18/0x22 [194932.445296] [8117a1cd] vfs_fsync+0x1c/0x1e [194932.445299] [8117a3d9] do_fsync+0x2c/0x4c [194932.445303] [8117a5f9] SyS_fdatasync+0x13/0x17 [194932.445308] [81625bad] system_call_fastpath+0x1a/0x1f [194932.445395] INFO: task kworker/u16:35:3812 blocked for more than 120 seconds. [194932.445398] Tainted: GW 3.15.0-rc5-amd64-i915-preempt-20140216s1 #2 [194932.445400] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. 
[194932.445403] kworker/u16:35 D 0 3812 2 0x0080 [194932.445410] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1) [194932.445414] 88003b647a00 0046 88003b6479d0 88003b647fd8 [194932.445419] 88003b8ca590 000141c0 88041e3941c0 88003b8ca590 [194932.445423] 88003b647aa0 0002 810fda1a 88003b647a10 [194932.445427] Call Trace: [194932.445432] [810fda1a] ? wait_on_page_read+0x3c/0x3c [194932.445437] [8161c876] schedule+0x73/0x75 [194932.445441] [8161ca1b] io_schedule+0x60/0x7a [194932.445445] [810fda28] sleep_on_page+0xe/0x12 [194932.445450] [8161cdab] __wait_on_bit_lock+0x46/0x8a [194932.445454] [810fdae3] __lock_page+0x69/0x6b [194932.445458] [81084771] ? autoremove_wake_function+0x34/0x34 [194932.445461] [81240c41] lock_page+0x1e/0x21 [194932.445465] [81244779] extent_write_cache_pages.isra.16.constprop.32+0x10e/0x2c3 [194932.445470]
Re: Convert btrfs software code to ASIC
On Mon, May 19, 2014 at 8:09 PM, Le Nguyen Tran lntran...@gmail.com wrote:
> I now need to understand the operation of btrfs source code to determine. I hope that one of you can help me

Have you read the wiki link?

--
Fajar
Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.
On Sat, May 17, 2014 at 06:43:15PM +0100, Hugo Mills wrote:
> I've just been poking around in the docs for a completely different reason, and I think there's a fairly serious problem (well, as serious as problems get with documentation). Take, for example, the format for btrfs fi resize:
>
> 'resize' [devid:][+/-]size[gkm]|[devid:]max path::
>
> Now, this has just thrown away all of the useful markup which indicates the semantics of the command. The asciidoc renders all of that text literally and unformatted, making alphasymbolic(*) soup of the docs. Compare this to the old roff man page:
>
> \fBbtrfs\fP \fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP

I think we can restore the formatting with asciidoc. The line above would become:

*btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]'max path'

or with bold max

*btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]*max* 'path'

I was first worried that this would not be possible due to limitations of asciidoc markup, but as that turned out not to be true, I'd rather spend the boring time to keep the formatting as before.

My personal feeling about the enriched formatting is that the commands stand out of the text and are easier to catch (as you've mentioned somewhere in the thread).
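As an illustration of how mechanical that mapping is, here is a small Python sketch that rewrites roff bold and italic font escapes into the asciidoc inline markup used above. The helper name `roff_to_asciidoc` is made up for this example, and it only handles the three escapes that appear in the quoted synopsis (`\fB...\fP`, `\fI...\fP` and `\-`); it is not a general roff converter and not part of the patchset.

```python
import re


def roff_to_asciidoc(line):
    """Rewrite roff bold/italic font escapes as asciidoc inline markup.

    \\fBtext\\fP becomes *text* (bold) and \\fItext\\fP becomes 'text'
    (emphasis); the roff-escaped hyphen \\- becomes a plain hyphen.
    """
    line = re.sub(r"\\fB(.*?)\\fP", r"*\1*", line)  # bold spans
    line = re.sub(r"\\fI(.*?)\\fP", r"'\1'", line)  # italic spans
    return line.replace("\\-", "-")                 # escaped hyphen


if __name__ == "__main__":
    roff = (r"\fBbtrfs\fP \fBfilesystem resize\fP "
            r"[\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP")
    print(roff_to_asciidoc(roff))
```

Applied to the old roff synopsis, this yields the first of the two asciidoc forms shown above (with the unit letters in their original `gkm` order); making `max` bold still needs a manual touch, since the roff source did not mark it up separately.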
Re: Hang while deleting file
No luck of catching it in the act unfortunately, but I'll bear that tip in mind for any future issues. On Mon, May 19, 2014 at 4:00 AM, Chris Murphy li...@colorremedies.com wrote: On May 18, 2014, at 10:36 AM, Joshua McKinney jos...@joshka.net wrote: https://bugzilla.kernel.org/show_bug.cgi?id=76421 Perceived issue: SABNZBD hangs, requires restart. Diagnosis shows the following in my system log at the time of hang. This happens more than once. Log: [ 5883.464766] INFO: task SABnzbd.py:994 blocked for more than 120 seconds. [ 5883.464906] Not tainted 3.14.4-1-ARCH #1 [ 5883.464989] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [ 5883.465130] SABnzbd.py D 880196d1f5c0 0 994 1 0x [ 5883.465140] 8800765c9ce8 0082 0052 880196d1f5c0 [ 5883.465148] 000146c0 8800765c9fd8 000146c0 880196d1f5c0 [ 5883.465156] ffef 880177d9a000 a02d0dbc 8800765c9c50 [ 5883.465163] Call Trace: [ 5883.465218] [a02d0dbc] ? __set_extent_bit+0x45c/0x550 [btrfs] [ 5883.465252] [a02d03c3] ? free_extent_state+0x43/0xc0 [btrfs] [ 5883.465284] [a02d0dbc] ? __set_extent_bit+0x45c/0x550 [btrfs] [ 5883.465295] [810b3ba4] ? __wake_up+0x44/0x50 [ 5883.465304] [8150b729] schedule+0x29/0x70 [ 5883.465335] [a02d1cd2] lock_extent_bits+0x152/0x1e0 [btrfs] [ 5883.465344] [810b4020] ? __wake_up_sync+0x20/0x20 [ 5883.465375] [a02bfa59] btrfs_evict_inode+0x139/0x520 [btrfs] [ 5883.465387] [811d5a80] evict+0xb0/0x1c0 [ 5883.465394] [811d6335] iput+0xf5/0x1a0 [ 5883.465402] [811ca9c5] do_unlinkat+0x1b5/0x300 [ 5883.465411] [8117899c] ? vm_munmap+0x4c/0x60 [ 5883.465418] [811cb986] SyS_unlink+0x16/0x20 [ 5883.465427] [81517769] system_call_fastpath+0x16/0x1b Filesystem: # btrfs filesystem show Btrfs v3.14.1 running Data RAID, sys/meta RAID10 on 5x4TB. SABNzbd is a usenet download program, so the file attempting to be deleted was possibly large (GB) Recently updated to 3.14 kernel provided in arch linux. I haven't seen this issue before the last couple of days. 
> > Happy to provide more info if necessary.
>
> I'd include as an attachment to the bug the output from sysrq-w.
>
>     echo 1 > /proc/sys/kernel/sysrq
>     echo w > /proc/sysrq-trigger
>     dmesg
>
> https://www.kernel.org/doc/Documentation/sysrq.txt
>
> Chris Murphy
Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.
On Mon, May 19, 2014 at 04:01:23PM +0200, David Sterba wrote:
> On Sat, May 17, 2014 at 06:43:15PM +0100, Hugo Mills wrote:
> > I've just been poking around in the docs for a completely different
> > reason, and I think there's a fairly serious problem (well, as serious
> > as problems get with documentation).
> >
> > Take, for example, the format for btrfs fi resize:
> >
> > 'resize' [devid:][+/-]size[gkm]|[devid:]max path::
> >
> > Now, this has just thrown away all of the useful markup which
> > indicates the semantics of the command. The asciidoc renders all of
> > that text literally and unformatted, making alphasymbolic(*) soup of
> > the docs. Compare this to the old roff man page:
> >
> > \fBbtrfs\fP \fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP
>
> I think we can restore the formatting with asciidoc. The line above
> would become:
>
> *btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|[devid':]'max path'
>
> or with bold max
>
> *btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|[devid':]*max* 'path'

The correct base string should read

btrfs filesystem resize [<devid>:][+/-]<size>[kgm]|[<devid>:]max path

i.e. add <..> around devid and size. That way it's copy-paste-ready. In
this case the italic/underlined text does not IMO add much value.

> My personal feeling about the enriched formatting is that the commands
> stand out of the text and are easier to catch (as you've mentioned
> somewhere in the thread).

The bolded subcommand name seems to be sufficient.

The files are processed by XSL, I think it should be possible to apply
some transformation that would add '...' around <...> automatically
instead of making everybody write that.

Proposed changes:
- format all subcommands as bold instead of italic ('' -> **)
- add all missing <...>
- find a way to add '...' around <...> (xsl or sed or whatever)

Does that work for you?
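To make the last proposed change concrete, here is a minimal sketch of the transformation, assuming user-supplied arguments are written as <name> in the source text. The thread suggests doing this with XSL or sed; this is the same idea written out in C purely for illustration, and quote_angle_args is a made-up name, not part of btrfs-progs. It is deliberately naive: it does not check that brackets are balanced or that an argument is already quoted.

```c
#include <stddef.h>

/*
 * Copy src to dst, wrapping every <...> span in single quotes so that
 * asciidoc renders it emphasized. Returns 0 on success, -1 if dst is
 * too small to hold the result.
 */
int quote_angle_args(const char *src, char *dst, size_t dstlen)
{
	size_t o = 0;
	size_t i;

	for (i = 0; src[i] != '\0'; i++) {
		/* each step writes at most 2 bytes; keep 1 for the NUL */
		if (o + 3 > dstlen)
			return -1;
		if (src[i] == '<') {
			dst[o++] = '\'';
			dst[o++] = '<';
		} else if (src[i] == '>') {
			dst[o++] = '>';
			dst[o++] = '\'';
		} else {
			dst[o++] = src[i];
		}
	}
	if (o + 1 > dstlen)
		return -1;
	dst[o] = '\0';
	return 0;
}
```

Fed the synopsis "[<devid>:][+/-]<size>[kgm]|[<devid>:]max <path>", this yields "['<devid>':][+/-]'<size>'[kgm]|['<devid>':]max '<path>'", so authors would only need to write the plain <...> form.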
Re: [ANNOUNCE] xfstests: new mailing list
On Sat, May 17, 2014 at 08:19:30AM +1000, Dave Chinner wrote:
> Renaming the test suite takes a lot more work - e.g. renaming/moving
> source trees and fixing all the documentation that points to it...

In that case please call the list xfstests - a name different by a
single character is utterly confusing.

And I definitely see some merit in the suggestion that we just keep
the x and allow people to come up with a nice backronym for it if they
care enough.
Re: send/receive and bedup
On 19/05/14 15:00, Scott Middleton wrote:
> On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
> > On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:
> > > I read so much about BtrFS that I mistook Bedup for Duperemove.
> > > Duperemove is actually what I am testing.
> >
> > I'm currently using programs that find files that are the same, and
> > hardlink them together:
> > http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html
> >
> > hardlink.py actually seems to be the faster (memory and CPU) one even
> > though it's in python.
> > I can get others to run out of RAM on my 8GB server easily :(

Interesting app.

An issue with hardlinking (with the backups use-case, this problem isn't
likely to happen), is that if you modify a file, all the hardlinks get
changed along with it - including the ones that you don't want changed.

@Marc: Since you've been using btrfs for a while now I'm sure you've
already considered whether or not a reflink copy is the better/worse
option.

> > Bedup should be better, but last I tried I couldn't get it to work.
> > It's been updated since then, I just haven't had the chance to try it
> > again since then.
> >
> > Please post what you find out, or if you have a hardlink maker that's
> > better than the ones I found :)
>
> Thanks for that.
>
> I may be completely wrong in my approach.
>
> I am not looking for a file level comparison. Bedup worked fine for
> that. I have a lot of virtual images and shadow protect images where
> only a few megabytes may be the difference. So a file level hash and
> comparison doesn't really achieve my goals.
>
> I thought duperemove may be on a lower level.
>
> https://github.com/markfasheh/duperemove
>
> Duperemove is a simple tool for finding duplicated extents and
> submitting them for deduplication. When given a list of files it will
> hash their contents on a block by block basis and compare those hashes
> to each other, finding and categorizing extents that match each
> other. When given the -d option, duperemove will submit those
> extents for deduplication using the btrfs-extent-same ioctl.
>
> It defaults to 128k but you can make it smaller.
>
> I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long
> SMART test but seems to die every few hours. Admittedly it was part of
> a failed mdadm RAID array that I pulled out of a client's machine.
>
> The only other copy I have of the data is the original mdadm array
> that was recently replaced with a new server, so I am loath to use
> that HDD yet. At least for another couple of weeks!
>
> I am still hopeful duperemove will work.

Duperemove does look exactly like what you are looking for. The last
traffic on the mailing list regarding that was in August last year. It
looks like it was pulled into the main kernel repository on September 1st.

The last commit to the duperemove application was on April 20th this
year. Maybe Mark (cc'd) can provide further insight on its current status.

--
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
[PATCH 0/5] Documentation updates
Formatting changes inspired by Hugo's mail and some fixes that I found
along the way. The update in the Availability section removes the "heavy
development, not usable" text.

David Sterba (5):
  btrfs-progs: doc: fix argument notation and typos
  btrfs-progs: doc: remove text for unmerged features
  btrfs-progs: doc: autoformat user-supplied arguments by sed
  btrfs-progs: doc: make all commands and subcommands bold
  btrfs-progs: doc: update the Availability section

 Documentation/Makefile                   |  9 ++--
 Documentation/btrfs-balance.txt          | 30 +++---
 Documentation/btrfs-check.txt            | 18
 Documentation/btrfs-convert.txt          |  6 +--
 Documentation/btrfs-debug-tree.txt       |  6 +--
 Documentation/btrfs-device.txt           | 27 +---
 Documentation/btrfs-filesystem.txt       | 70 ++--
 Documentation/btrfs-find-root.txt        |  6 +--
 Documentation/btrfs-image.txt            | 10 ++---
 Documentation/btrfs-inspect-internal.txt | 18
 Documentation/btrfs-map-logical.txt      |  6 +--
 Documentation/btrfs-property.txt         | 22 +-
 Documentation/btrfs-qgroup.txt           | 36
 Documentation/btrfs-quota.txt            | 16
 Documentation/btrfs-receive.txt          | 12 +++---
 Documentation/btrfs-replace.txt          | 16
 Documentation/btrfs-rescue.txt           | 16
 Documentation/btrfs-restore.txt          | 12 +++---
 Documentation/btrfs-scrub.txt            | 26 ++--
 Documentation/btrfs-send.txt             | 10 ++---
 Documentation/btrfs-show-super.txt       |  6 +--
 Documentation/btrfs-subvolume.txt        | 34
 Documentation/btrfs-zero-log.txt         | 12 +++---
 Documentation/btrfs.txt                  | 48 +++---
 Documentation/btrfstune.txt              |  6 +--
 Documentation/fsck.btrfs.txt             |  6 +--
 Documentation/mkfs.btrfs.txt             | 12 +++---
 27 files changed, 207 insertions(+), 289 deletions(-)

--
1.9.0
[PATCH 1/5] btrfs-progs: doc: fix argument notation and typos
All user-supplied values should be enclosed in ... to distinguish them from verbatim strings. Signed-off-by: David Sterba dste...@suse.cz --- Documentation/btrfs-balance.txt | 6 +++--- Documentation/btrfs-check.txt| 2 +- Documentation/btrfs-filesystem.txt | 10 +- Documentation/btrfs-image.txt| 4 ++-- Documentation/btrfs-inspect-internal.txt | 2 +- Documentation/btrfs-qgroup.txt | 10 +- Documentation/btrfs-scrub.txt| 8 Documentation/btrfs-subvolume.txt| 6 +++--- Documentation/fsck.btrfs.txt | 2 +- Documentation/mkfs.btrfs.txt | 2 +- 10 files changed, 26 insertions(+), 26 deletions(-) diff --git a/Documentation/btrfs-balance.txt b/Documentation/btrfs-balance.txt index 1b1861cfd31f..7edc44b150dd 100644 --- a/Documentation/btrfs-balance.txt +++ b/Documentation/btrfs-balance.txt @@ -36,11 +36,11 @@ given balance all chunks in a filesystem. + `Options` + --d[filters] +-d[filters] act on data chunks --m[filters] +-m[filters] act on metadata chunks --s[filters] +-s[filters] act on system chunks (only under -f) -v be verbose diff --git a/Documentation/btrfs-check.txt b/Documentation/btrfs-check.txt index 485a49cbc3ec..ce491734a981 100644 --- a/Documentation/btrfs-check.txt +++ b/Documentation/btrfs-check.txt @@ -22,7 +22,7 @@ https://btrfs.wiki.kernel.org/index.php/Btrfsck OPTIONS --- --s|--support superblock:: +-s|--super superblock:: use superblockth superblock copy. --repair:: try to repair the filesystem. diff --git a/Documentation/btrfs-filesystem.txt b/Documentation/btrfs-filesystem.txt index de9b3f3c39c4..4ac8711f62c0 100644 --- a/Documentation/btrfs-filesystem.txt +++ b/Documentation/btrfs-filesystem.txt @@ -17,7 +17,7 @@ resizing, defragment. SUBCOMMAND -- -'df' [-b] path [path...]:: +'df' [-b] path [path...]:: Show space usage information for a mount point. + If '-b' is given, then byte is used as unit. Default unit will be @@ -59,10 +59,10 @@ lower than 100% because the metadata is duplicated for security reasons. 
If all the data and metadata are duplicated (or have a profile like RAID1) the Data to disk ratio could be 50%. -'show' [--mounted|--all-devices|path|uuid|device|lable]:: +'show' [--mounted|--all-devices|path|uuid|device|label]:: Show the btrfs filesystem with some additional info. + -If no option nor path|uuid|device|lable is passed, btrfs shows +If no option nor path|uuid|device|label is passed, btrfs shows information of all the btrfs filesystem both mounted and unmounted. If '--mounted' is passed, it would probe btrfs kernel to list mounted btrfs filesystem(s); @@ -109,7 +109,7 @@ don't use it if you use snapshots, have de-duplicated your data or made copies with `cp --reflink`. // Some wording are extracted by the resize2fs man page -'resize' [devid:][+/-]size[gkm]|[devid:]max path:: +'resize' [devid:][+/-]size[gkm]|[devid:]max path:: Resize a filesystem identified by path for the underlying device devid *online*. + The devid can be found with 'btrfs filesystem show' and @@ -133,7 +133,7 @@ partition after reducing the size of the filesystem. This can done using it with the new desired size. When recreating the partition make sure to use the same starting disk cylinder as before. -'label' [dev|mount_point] [newlabel]:: +'label' [dev|mountpoint] [newlabel]:: Show or update the label of a filesystem. + [device|mountpoint] is used to identify the filesystem. diff --git a/Documentation/btrfs-image.txt b/Documentation/btrfs-image.txt index bd74a86cff44..c41e36d6c59a 100644 --- a/Documentation/btrfs-image.txt +++ b/Documentation/btrfs-image.txt @@ -24,10 +24,10 @@ using 1 stripe pointing to primary device, so that file system can be restored by running tree log reply if possible. To restore without changing number of stripes in chunk tree check -o option. --c value:: +-c value:: Compression level (0 ~ 9). --t value:: +-t value:: Number of threads (1 ~ 32) to be used to process the image dump or restore. 
-o:: diff --git a/Documentation/btrfs-inspect-internal.txt b/Documentation/btrfs-inspect-internal.txt index 4555c70670df..c5f751dc4f71 100644 --- a/Documentation/btrfs-inspect-internal.txt +++ b/Documentation/btrfs-inspect-internal.txt @@ -23,7 +23,7 @@ Resolves an inode in subvolume path to all filesystem paths. -v verbose mode. print count of returned paths and ioctl() return value -'logical-resolve' [-Pv] [-s bufsize] logical path:: +'logical-resolve' [-Pv] [-s bufsize] logical path:: Resolves a logical address in the filesystem mounted at path to all inodes. + By default, each inode is then resolved to a file system path (similar to the diff --git a/Documentation/btrfs-qgroup.txt b/Documentation/btrfs-qgroup.txt index d0544232f353..531febb3a086 100644 --- a/Documentation/btrfs-qgroup.txt +++ b/Documentation/btrfs-qgroup.txt @@ -73,15 +73,15
[PATCH 5/5] btrfs-progs: doc: update the Availability section
Does not reflect the current state. The wiki contains more details on the first page. Signed-off-by: David Sterba dste...@suse.cz --- Documentation/btrfs-balance.txt | 4 +--- Documentation/btrfs-check.txt| 4 +--- Documentation/btrfs-device.txt | 4 +--- Documentation/btrfs-filesystem.txt | 4 +--- Documentation/btrfs-inspect-internal.txt | 4 +--- Documentation/btrfs-property.txt | 4 +--- Documentation/btrfs-qgroup.txt | 4 +--- Documentation/btrfs-quota.txt| 4 +--- Documentation/btrfs-receive.txt | 4 +--- Documentation/btrfs-replace.txt | 4 +--- Documentation/btrfs-rescue.txt | 4 +--- Documentation/btrfs-restore.txt | 4 +--- Documentation/btrfs-scrub.txt| 4 +--- Documentation/btrfs-send.txt | 4 +--- Documentation/btrfs-subvolume.txt| 4 +--- Documentation/btrfs.txt | 4 +--- Documentation/mkfs.btrfs.txt | 4 +--- 17 files changed, 17 insertions(+), 51 deletions(-) diff --git a/Documentation/btrfs-balance.txt b/Documentation/btrfs-balance.txt index d34833d6f380..37d8781eee4e 100644 --- a/Documentation/btrfs-balance.txt +++ b/Documentation/btrfs-balance.txt @@ -68,9 +68,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. diff --git a/Documentation/btrfs-check.txt b/Documentation/btrfs-check.txt index 073667265c13..027032b2efb8 100644 --- a/Documentation/btrfs-check.txt +++ b/Documentation/btrfs-check.txt @@ -38,9 +38,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. 
diff --git a/Documentation/btrfs-device.txt b/Documentation/btrfs-device.txt index 4f847763bb66..0f7917d894a0 100644 --- a/Documentation/btrfs-device.txt +++ b/Documentation/btrfs-device.txt @@ -105,9 +105,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. diff --git a/Documentation/btrfs-filesystem.txt b/Documentation/btrfs-filesystem.txt index e3e270ff0957..0ee79cbabc34 100644 --- a/Documentation/btrfs-filesystem.txt +++ b/Documentation/btrfs-filesystem.txt @@ -108,9 +108,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. diff --git a/Documentation/btrfs-inspect-internal.txt b/Documentation/btrfs-inspect-internal.txt index fe76217365b0..5ae4997b9bc0 100644 --- a/Documentation/btrfs-inspect-internal.txt +++ b/Documentation/btrfs-inspect-internal.txt @@ -57,9 +57,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. diff --git a/Documentation/btrfs-property.txt b/Documentation/btrfs-property.txt index 4b0f49e04b20..6b23e2e52aad 100644 --- a/Documentation/btrfs-property.txt +++ b/Documentation/btrfs-property.txt @@ -56,9 +56,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. 
Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. diff --git a/Documentation/btrfs-qgroup.txt b/Documentation/btrfs-qgroup.txt index 12321926b25b..55c4747449ff 100644 --- a/Documentation/btrfs-qgroup.txt +++ b/Documentation/btrfs-qgroup.txt @@ -97,9 +97,7 @@ returned in case of failure. AVAILABILITY -*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy -development, -and not suitable for any uses other than benchmarking and review. +*btrfs* is part of btrfs-progs. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for further details. diff --git a/Documentation/btrfs-quota.txt b/Documentation/btrfs-quota.txt
[PATCH 2/5] btrfs-progs: doc: remove text for unmerged features
The asciidoc conversion was done on a development branch and there are
portions of text that do not reflect the code.

Signed-off-by: David Sterba dste...@suse.cz
---
 Documentation/btrfs-device.txt     |  5 -
 Documentation/btrfs-filesystem.txt | 46 +-
 2 files changed, 1 insertion(+), 50 deletions(-)

diff --git a/Documentation/btrfs-device.txt b/Documentation/btrfs-device.txt
index 7a6bce5c650a..9cd8ad081a5a 100644
--- a/Documentation/btrfs-device.txt
+++ b/Documentation/btrfs-device.txt
@@ -86,11 +86,6 @@ filesystem as listed by blkid.
 Finally, if '--all-devices' or '-d' is passed, all the devices under /dev are
 scanned.
 
-'disk-usage' [-b] path [path..]::
-Show which chunks are in a device.
-+
-If '-b' is given, byte will be set as unit.
-
 'ready' device::
 Check device to see if it has all of it's devices in cache for mounting.
 
diff --git a/Documentation/btrfs-filesystem.txt b/Documentation/btrfs-filesystem.txt
index 4ac8711f62c0..63e3ef676cd3 100644
--- a/Documentation/btrfs-filesystem.txt
+++ b/Documentation/btrfs-filesystem.txt
@@ -17,47 +17,8 @@ resizing, defragment.
 
 SUBCOMMAND
 --
-'df' [-b] path [path...]::
+'df' path [path...]::
 Show space usage information for a mount point.
-+
-If '-b' is given, then byte is used as unit. Default unit will be
-human-readable unit such as KiB/MiB/GiB.
-+
-The command 'btrfs filesystem df' is used to query how many space on the
-disk(s) are used and an estimation of the free
-space of the filesystem.
-The output of the command 'btrfs filesystem df' shows:
-
-`Disk size`
-the total size of the disks which compose the filesystem.
-
-`Disk allocated`
-the size of the area of the disks used by the chunks.
-
-`Disk unallocated`
-the size of the area of the disks which is free (i.e.
-the differences of the values above).
-
-`Used`
-the portion of the logical space used by the file and metadata.
-
-`Free (estimated)`
-the estimated free space available: i.e. how many space can be used
-by the user. The evaluation cannot be rigorous because it depends by the
-allocation policy (DUP, Single, RAID1...) of the metadata and data chunks. +
-If every chunk is stored as Single the sum of the free (estimated) space
-and the used space is equal to the disk size.
-Otherwise if all the chunk are mirrored (raid1 or raid10) or duplicated
-the sum of the free (estimated) space and the used space is
-half of the disk size. Normally the free (estimated) is between
-these two limits.
-
-`Data to disk ratio`
-the ratio betwen the logical size (i.e. the space available by
-the chunks) and the disk allocated (by the chunks). Normally it is
-lower than 100% because the metadata is duplicated for security reasons.
-If all the data and metadata are duplicated (or have a profile like RAID1)
-the Data to disk ratio could be 50%.
 
 'show' [--mounted|--all-devices|path|uuid|device|label]::
 Show the btrfs filesystem with some additional info.
@@ -140,11 +101,6 @@ Show or update the label of a filesystem.
 If a newlabel optional argument is passed, the label is changed.
 NOTE: the maximum allowable length shall be less than 256 chars
 
-'disk-usage' [-tb] path [path...]::
-Show in which disk the chunks are allocated. +
-If '-b' is given, set byte as unit;
-If '-t' is given, show data in tabular format.
-
 EXIT STATUS
 ---
 'btrfs filesystem' returns a zero exist status if it succeeds. Non zero is
--
1.9.0
[PATCH 4/5] btrfs-progs: doc: make all commands and subcommands bold
Italic format is used for parameters and values, bold makes the text visually separated. Signed-off-by: David Sterba dste...@suse.cz --- Documentation/btrfs-balance.txt | 22 +++ Documentation/btrfs-check.txt| 14 +- Documentation/btrfs-convert.txt | 6 ++--- Documentation/btrfs-debug-tree.txt | 6 ++--- Documentation/btrfs-device.txt | 20 +++--- Documentation/btrfs-filesystem.txt | 22 +++ Documentation/btrfs-find-root.txt| 6 ++--- Documentation/btrfs-image.txt| 6 ++--- Documentation/btrfs-inspect-internal.txt | 16 +-- Documentation/btrfs-map-logical.txt | 6 ++--- Documentation/btrfs-property.txt | 20 +++--- Documentation/btrfs-qgroup.txt | 24 - Documentation/btrfs-quota.txt| 14 +- Documentation/btrfs-receive.txt | 10 +++ Documentation/btrfs-replace.txt | 14 +- Documentation/btrfs-rescue.txt | 14 +- Documentation/btrfs-restore.txt | 10 +++ Documentation/btrfs-scrub.txt| 20 +++--- Documentation/btrfs-send.txt | 8 +++--- Documentation/btrfs-show-super.txt | 6 ++--- Documentation/btrfs-subvolume.txt| 28 +-- Documentation/btrfs-zero-log.txt | 12 - Documentation/btrfs.txt | 46 Documentation/btrfstune.txt | 6 ++--- Documentation/fsck.btrfs.txt | 6 ++--- Documentation/mkfs.btrfs.txt | 8 +++--- 26 files changed, 185 insertions(+), 185 deletions(-) diff --git a/Documentation/btrfs-balance.txt b/Documentation/btrfs-balance.txt index 7edc44b150dd..d34833d6f380 100644 --- a/Documentation/btrfs-balance.txt +++ b/Documentation/btrfs-balance.txt @@ -7,11 +7,11 @@ btrfs-balance - balance btrfs filesystem SYNOPSIS -'btrfs [filesystem] balance' subcommand|args +*btrfs [filesystem] balance* subcommand|args DESCRIPTION --- -'btrfs balance' is used to balance chunks in a btrfs filesystem across +*btrfs balance* is used to balance chunks in a btrfs filesystem across multiple or even single device. See `btrfs-device`(8) for more details about the effect on device management. @@ -21,10 +21,10 @@ SUBCOMMAND path:: Balance chunks across the devices *online*. 
+ -'btrfs balance path' is deprecated, -please use 'btrfs balance start' command instead. +*btrfs balance path* is deprecated, +please use *btrfs balance start* command instead. -'start' [options] path:: +*start* [options] path:: Balance chunks across the devices *online*. + Balance and/or convert (change allocation profile of) chunks that @@ -47,28 +47,28 @@ be verbose -f force reducing of metadata integrity -'pause' path:: +*pause* path:: Pause running balance. -'cancel' path:: +*cancel* path:: Cancel running or paused balance. -'resume' path:: +*resume* path:: Resume interrupted balance. -'status' [-v] path:: +*status* [-v] path:: Show status of running or paused balance. + If '-v' option is given, output will be verbose. EXIT STATUS --- -'btrfs balance' returns a zero exist status if it succeeds. Non zero is +*btrfs balance* returns a zero exist status if it succeeds. Non zero is returned in case of failure. AVAILABILITY -'btrfs' is part of btrfs-progs. Btrfs filesystem is currently under heavy +*btrfs* is part of btrfs-progs. Btrfs filesystem is currently under heavy development, and not suitable for any uses other than benchmarking and review. Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for diff --git a/Documentation/btrfs-check.txt b/Documentation/btrfs-check.txt index ce491734a981..073667265c13 100644 --- a/Documentation/btrfs-check.txt +++ b/Documentation/btrfs-check.txt @@ -7,18 +7,18 @@ btrfs-check - check or repair a btrfs filesystem offline SYNOPSIS -'btrfs check' [options] device +*btrfs check* [options] device DESCRIPTION --- -'btrfs check' is used to check or repair a btrfs filesystem offline. +*btrfs check* is used to check or repair a btrfs filesystem offline. 
-NOTE: Since btrfs is under heavy development especially the 'btrfs check' +NOTE: Since btrfs is under heavy development especially the *btrfs check* command, it is *highly* recommended to read the following btrfs wiki before -executing 'btrfs check' with '--repair' option: + +executing *btrfs check* with '--repair' option: + https://btrfs.wiki.kernel.org/index.php/Btrfsck -'btrfsck' is an alias of 'btrfs check' command and is now deprecated. +*btrfsck* is an alias of *btrfs check* command and is now deprecated. OPTIONS --- @@ -33,12 +33,12 @@ create a new extent tree. EXIT STATUS --- -'btrfs check' returns a zero exist status if it succeeds. Non zero is +*btrfs check* returns a zero exist status if it
[PATCH 1/4 RESEND] Btrfs: all super blocks of the replaced disk must be scratched
From: Anand Jain anand.j...@oracle.com

In a normal scenario, when a sys-admin replaces a disk, the expectation
is that btrfs will release the disk completely. However, the test case
below gives the wrong impression that the replaced disk is still in use.

$ btrfs rep start /dev/sde /dev/sdg4 /btrfs
$ mkfs.btrfs /dev/sde
/dev/sde appears to contain an existing filesystem (btrfs).
Error: Use the -f option to force overwrite.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 33 +
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b4660c4..19e68f7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6280,16 +6280,33 @@ int btrfs_scratch_superblock(struct btrfs_device *device)
 {
 	struct buffer_head *bh;
 	struct btrfs_super_block *disk_super;
+	int i;
+	u64 bytenr;
 
-	bh = btrfs_read_dev_super(device->bdev);
-	if (!bh)
-		return -EINVAL;
-	disk_super = (struct btrfs_super_block *)bh->b_data;
+	for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
+		bytenr = btrfs_sb_offset(i);
+		if (bytenr + BTRFS_SUPER_INFO_SIZE >=
+					i_size_read(device->bdev->bd_inode))
+			break;
 
-	memset(disk_super->magic, 0, sizeof(disk_super->magic));
-	set_buffer_dirty(bh);
-	sync_dirty_buffer(bh);
-	brelse(bh);
+		bh = __bread(device->bdev, bytenr / 4096,
+					BTRFS_SUPER_INFO_SIZE);
+		if (!bh)
+			continue;
+
+		disk_super = (struct btrfs_super_block *)bh->b_data;
+		if (btrfs_super_bytenr(disk_super) != bytenr ||
+			btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
+			brelse(bh);
+			continue;
+		}
+
+		memset(disk_super->magic, 0, sizeof(disk_super->magic));
+
+		set_buffer_dirty(bh);
+		sync_dirty_buffer(bh);
+		brelse(bh);
+	}
 
 	return 0;
 }
--
1.8.5.3
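For readers unfamiliar with where the superblock copies live, the loop in the patch above walks a small set of fixed device offsets and stops at the first copy that would not fit on the device. The following is a minimal userspace sketch of that arithmetic. The constant values (64 KiB primary offset, 4 KiB superblock size, 3 copies, mirror shift of 12) are my recollection of the kernel headers of this era and should be treated as assumptions; sb_offset and mirrors_that_fit are made-up names for illustration, not kernel functions.

```c
#include <stdint.h>

/* Assumed values, following my reading of the kernel's btrfs headers. */
#define SUPER_INFO_OFFSET  (64ULL * 1024)	/* primary copy at 64 KiB */
#define SUPER_INFO_SIZE    4096ULL
#define SUPER_MIRROR_MAX   3
#define SUPER_MIRROR_SHIFT 12

/* Hypothetical stand-in for btrfs_sb_offset(): byte offset of one copy. */
uint64_t sb_offset(int mirror)
{
	uint64_t start = 16ULL * 1024;

	if (mirror)
		return start << (SUPER_MIRROR_SHIFT * mirror);
	return SUPER_INFO_OFFSET;
}

/*
 * Count the superblock copies that fit on a device of dev_size bytes,
 * using the same bound check as the patch: stop at the first copy that
 * would reach the end of the device.
 */
int mirrors_that_fit(uint64_t dev_size)
{
	int i, n = 0;

	for (i = 0; i < SUPER_MIRROR_MAX; i++) {
		if (sb_offset(i) + SUPER_INFO_SIZE >= dev_size)
			break;
		n++;
	}
	return n;
}
```

Under these assumptions the copies sit at 64 KiB, 64 MiB and 256 GiB, so a 1 GiB device carries two copies and only devices larger than 256 GiB carry all three. That is why clearing only the primary copy can leave a stale, valid-looking superblock behind.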
[PATCH 2/4 RESEND] btrfs: btrfs_rm_device() should zero mirror SB as well
This fix ensures that all SB copies on the disk are zeroed when the disk
is intentionally removed. This helps to better manage disks in user land.

This version of the patch also merges Zach's patch below.

  btrfs: don't double brelse on device rm

Signed-off-by: Anand Jain anand.j...@oracle.com
Signed-off-by: Zach Brown z...@redhat.com
---
 fs/btrfs/volumes.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 19e68f7..1567439 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1681,12 +1681,43 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
 	 * remove it from the devices list and zero out the old super
 	 */
 	if (clear_super && disk_super) {
+		u64 bytenr;
+		int i;
+
 		/* make sure this device isn't detected as part of
 		 * the FS anymore
 		 */
 		memset(disk_super->magic, 0, sizeof(disk_super->magic));
 		set_buffer_dirty(bh);
 		sync_dirty_buffer(bh);
+
+		/* clear the mirror copies of super block on the disk
+		 * being removed, 0th copy is been taken care above and
+		 * the below would take of the rest
+		 */
+		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
+			bytenr = btrfs_sb_offset(i);
+			if (bytenr + BTRFS_SUPER_INFO_SIZE >=
+					i_size_read(bdev->bd_inode))
+				break;
+
+			brelse(bh);
+			bh = __bread(bdev, bytenr / 4096,
+					BTRFS_SUPER_INFO_SIZE);
+			if (!bh)
+				continue;
+
+			disk_super = (struct btrfs_super_block *)bh->b_data;
+
+			if (btrfs_super_bytenr(disk_super) != bytenr ||
+				btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
+				continue;
+			}
+			memset(disk_super->magic, 0,
+						sizeof(disk_super->magic));
+			set_buffer_dirty(bh);
+			sync_dirty_buffer(bh);
+		}
 	}
 
 	ret = 0;
--
1.8.5.3
[PATCH 3/4 RESEND] btrfs: add framework to read fs info from btrfs-control
This adds ioctl BTRFS_IOC_GET_FSIDS which reads the fs info through
btrfs-control. It is needed to optimize the heavily used btrfs-progs
function check_mounted(), plus a few other minor uses.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/super.c           | 66 +-
 fs/btrfs/volumes.c         | 39 +++
 fs/btrfs/volumes.h         |  2 ++
 include/uapi/linux/btrfs.h | 19 +
 4 files changed, 120 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d4878dd..b42cd50 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1717,38 +1717,92 @@ static struct file_system_type btrfs_fs_type = {
 };
 MODULE_ALIAS_FS("btrfs");
 
+static int btrfs_ioc_get_fslist(void __user *arg)
+{
+	int ret = 0;
+	u64 sz_fslist_arg;
+	u64 sz_fslist;
+	u64 sz_out;
+	struct btrfs_ioctl_fslist_args *fslist_arg;
+	struct btrfs_ioctl_fslist_args *fslist_arg_tmp;
+	struct btrfs_ioctl_fslist *fslist;
+
+	u64 cnt = 0, ucnt;
+
+	sz_fslist_arg = sizeof(*fslist_arg);
+	sz_fslist = sizeof(*fslist);
+	if (copy_from_user(&ucnt,
+		(struct btrfs_ioctl_fslist_args __user *)(arg +
+		offsetof(struct btrfs_ioctl_fslist_args, count)),
+		sizeof(ucnt)))
+		return -EFAULT;
+
+	cnt = btrfs_get_fslist_cnt();
+
+	if (cnt > ucnt) {
+		if (copy_to_user(arg +
+		offsetof(struct btrfs_ioctl_fslist_args, count),
+		&cnt, sizeof(cnt)))
+			return -EFAULT;
+		return 1;
+	}
+
+	sz_out = sz_fslist_arg + sz_fslist * cnt;
+	fslist_arg_tmp = fslist_arg = memdup_user(arg, sz_out);
+	if (IS_ERR(fslist_arg))
+		return PTR_ERR(fslist_arg);
+	fslist = (struct btrfs_ioctl_fslist *) (++fslist_arg_tmp);
+	cnt = btrfs_get_fslist(fslist, cnt);
+	fslist_arg->count = cnt;
+	if (copy_to_user(arg, fslist_arg, sz_out)) {
+		ret = -EFAULT;
+		goto out;
+	}
+	ret = 0;
+out:
+	kfree(fslist_arg);
+	return ret;
+}
+
 /*
  * used by btrfsctl to scan devices when no FS is mounted
  */
 static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 				unsigned long arg)
 {
-	struct btrfs_ioctl_vol_args *vol;
+	struct btrfs_ioctl_vol_args *vol = NULL;
 	struct btrfs_fs_devices *fs_devices;
 	int ret = -ENOTTY;
+	void __user *argp = (void __user *)arg;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	vol = memdup_user((void __user *)arg, sizeof(*vol));
-	if (IS_ERR(vol))
-		return PTR_ERR(vol);
-
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
+		vol = memdup_user((void __user *)arg, sizeof(*vol));
+		if (IS_ERR(vol))
+			return PTR_ERR(vol);
 		ret = btrfs_scan_one_device(vol->name, FMODE_READ,
 					    &btrfs_fs_type, &fs_devices);
+		kfree(vol);
 		break;
 	case BTRFS_IOC_DEVICES_READY:
+		vol = memdup_user((void __user *)arg, sizeof(*vol));
+		if (IS_ERR(vol))
+			return PTR_ERR(vol);
 		ret = btrfs_scan_one_device(vol->name, FMODE_READ,
 					    &btrfs_fs_type, &fs_devices);
+		kfree(vol);
 		if (ret)
 			break;
 		ret = !(fs_devices->num_devices == fs_devices->total_devices);
 		break;
+	case BTRFS_IOC_GET_FSLIST:
+		ret = btrfs_ioc_get_fslist(argp);
+		break;
 	}
 
-	kfree(vol);
 	return ret;
 }
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1567439..e22ac22 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6341,3 +6341,42 @@ int btrfs_scratch_superblock(struct btrfs_device *device)
 
 	return 0;
 }
+
+int btrfs_get_fslist_cnt(void)
+{
+	int cnt = 0;
+	struct btrfs_fs_devices *fs_devices;
+
+	mutex_lock(&uuid_mutex);
+	list_for_each_entry(fs_devices, &fs_uuids, list)
+		cnt++;
+	mutex_unlock(&uuid_mutex);
+
+	return cnt;
+}
+
+u64 btrfs_get_fslist(struct btrfs_ioctl_fslist *fslist, u64 ucnt)
+{
+	u64 cnt = 0;
+	struct btrfs_fs_devices *fs_devices;
+
+	mutex_lock(&uuid_mutex);
+	list_for_each_entry(fs_devices, &fs_uuids, list) {
+		if (!(cnt < ucnt))
+			break;
+		memcpy(fslist->fsid, fs_devices->fsid,
+				BTRFS_FSID_SIZE);
+		fslist->num_devices = fs_devices->num_devices;
+		fslist->missing_devices = fs_devices->missing_devices;
+		fslist->total_devices = fs_devices->total_devices;
+
+		if
[PATCH 2/2] btrfs: usage error should not be logged into system log
From: Anand Jain anand.j...@oracle.com

In my opinion, system logs (/var/log/messages) are valuable information for investigating real system issues at the data center. People handling data center issues spend a lot of time and effort analyzing the messages files. Logging usage errors into /var/log/messages is something we should avoid.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/sysfs.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 63c2907..f729199 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -374,11 +374,8 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 	struct btrfs_root *root = fs_info->fs_root;
 	int ret;
 
-	if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n')) {
-		pr_err("BTRFS: unable to set label with more than %d bytes\n",
-		       BTRFS_LABEL_SIZE - 1);
+	if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n'))
 		return -EINVAL;
-	}
 
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans))
-- 
1.8.5.3
[PATCH 4/4 RESEND] btrfs: scrub maintenance event should be recorded in the messages
so as to help with understanding and solving problems.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/ioctl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e174770..ff27c08 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3702,10 +3702,17 @@ static long btrfs_ioctl_scrub(struct file *file, void __user *arg)
 		goto out;
 	}
 
+	btrfs_info(root->fs_info, "Scrub started");
+
 	ret = btrfs_scrub_dev(root->fs_info, sa->devid, sa->start, sa->end,
 			      &sa->progress, sa->flags & BTRFS_SCRUB_READONLY,
 			      0);
 
+	if (ret)
+		btrfs_info(root->fs_info, "Scrub failed - %d", ret);
+	else
+		btrfs_info(root->fs_info, "Scrub finished");
+
 	if (copy_to_user(arg, sa, sizeof(*sa)))
 		ret = -EFAULT;
 
-- 
1.8.5.3
[PATCH 1/2] btrfs: label should not contain return char
From: Anand Jain anand.j...@oracle.com

Generally, if you use

 echo test > /sys/fs/btrfs/<fsid>/label

it introduces a return (newline) character at the end, which cannot be part of the label. The correct command is

 echo -n test > /sys/fs/btrfs/<fsid>/label

This patch checks for this user error.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index c5eb214..63c2907 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -374,7 +374,7 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 	struct btrfs_root *root = fs_info->fs_root;
 	int ret;
 
-	if (len >= BTRFS_LABEL_SIZE) {
+	if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n')) {
 		pr_err("BTRFS: unable to set label with more than %d bytes\n",
 		       BTRFS_LABEL_SIZE - 1);
 		return -EINVAL;
-- 
1.8.5.3
[PATCH RFC] btrfs: revamp /sys/fs/btrfs/fsid/devices
As of now with out this patch the content under the dir /sys/fs/btrfs/fsid/devices is just links to the block devs. Moving forward we would need the above btrfs sysfs path to contain more info about the btrfs devices. This patch provide a framework and as of now a fault notification interface, which is needed to notify when disk disappear. The idea is to call /sys/fs/btrfs/fsid/devices/disk/fault when we get a kobject notification about the disk disappearance. Signed-off-by: Anand Jain anand.j...@oracle.com --- fs/btrfs/sysfs.c | 110 + fs/btrfs/sysfs.h | 3 ++ fs/btrfs/volumes.c | 5 +++ fs/btrfs/volumes.h | 2 + 4 files changed, 96 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index f729199..7c80a99 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -31,6 +31,7 @@ #include transaction.h #include sysfs.h #include volumes.h +#include rcu-string.h static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj); @@ -475,19 +476,6 @@ static void __btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info) wait_for_completion(fs_info-kobj_unregister); } -void btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info) -{ - if (fs_info-space_info_kobj) { - sysfs_remove_files(fs_info-space_info_kobj, allocation_attrs); - kobject_del(fs_info-space_info_kobj); - kobject_put(fs_info-space_info_kobj); - } - kobject_del(fs_info-device_dir_kobj); - kobject_put(fs_info-device_dir_kobj); - addrm_unknown_feature_attrs(fs_info, false); - sysfs_remove_group(fs_info-super_kobj, btrfs_feature_attr_group); - __btrfs_sysfs_remove_one(fs_info); -} const char * const btrfs_feature_set_names[3] = { [FEAT_COMPAT]= compat, @@ -564,36 +552,91 @@ static void init_feature_attrs(void) } } -static int add_device_membership(struct btrfs_fs_info *fs_info) + +#define to_btrfs_device(_kobj) container_of(_kobj, struct btrfs_device, device_kobj) + +static ssize_t device_kobj_fault_store(struct kobject *dev_kobj, + struct kobj_attribute *a, const char *buf, size_t len) +{ + 
struct btrfs_device *dev = to_btrfs_device(dev_kobj); + + if (dev-missing || !dev-bdev) + return -EINVAL; + + /* Fixme: Call appropriate device check status handler */ + +return len; +} + +BTRFS_ATTR_RW(fault, 0200, NULL, device_kobj_fault_store); + +static struct attribute *device_kobj_attrs[] = { + BTRFS_ATTR_PTR(fault), + NULL, +}; + +static void device_kobj_release(struct kobject *dev_kobj) +{ + /* nothing to free as of now */ +} + +struct kobj_type device_ktype = { + .sysfs_ops = kobj_sysfs_ops, + .release= device_kobj_release, + .default_attrs = device_kobj_attrs, +}; + +int device_add_kobject(struct btrfs_fs_info *fs_info) { int error = 0; struct btrfs_fs_devices *fs_devices = fs_info-fs_devices; struct btrfs_device *dev; - fs_info-device_dir_kobj = kobject_create_and_add(devices, + if (!fs_info-device_dir_kobj) + fs_info-device_dir_kobj = kobject_create_and_add(devices, fs_info-super_kobj); if (!fs_info-device_dir_kobj) return -ENOMEM; list_for_each_entry(dev, fs_devices-devices, dev_list) { - struct hd_struct *disk; - struct kobject *disk_kobj; - if (!dev-bdev) + if (!dev-bdev || dev-missing || dev-device_kobj.parent) continue; - disk = dev-bdev-bd_part; - disk_kobj = part_to_dev(disk)-kobj; + error = kobject_init_and_add(dev-device_kobj, device_ktype, + fs_info-device_dir_kobj, %s, + strrchr(rcu_str_deref(dev-name), '/') + 1); - error = sysfs_create_link(fs_info-device_dir_kobj, - disk_kobj, disk_kobj-name); if (error) break; } - return error; } +/* + * Remove the sysfs entries for the devices. 
+ * devid provides a perticular devid for which the sysfs entry + * has to be removed, if -1 it would remove for all devs + */ +void device_rm_kobject(struct btrfs_fs_info *fs_info, u64 devid) +{ + struct btrfs_fs_devices *fs_devices = fs_info-fs_devices; + struct btrfs_device *dev; + + if (!fs_info-device_dir_kobj) + return; + + list_for_each_entry(dev, fs_devices-devices, dev_list) { + if (!dev-device_kobj.parent) + continue; + + if (devid == -1 || devid == dev-devid) { + kobject_del(dev-device_kobj); + kobject_put(dev-device_kobj); + } + } +} + /* /sys/fs/btrfs/ entry
Re: send/receive and bedup
On 19/5/2014 7:01 μμ, Brendan Hide wrote: On 19/05/14 15:00, Scott Middleton wrote: On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote: On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote: I read so much about BtrFS that I mistaked Bedup with Duperemove. Duperemove is actually what I am testing. I'm currently using programs that find files that are the same, and hardlink them together: http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html hardlink.py actually seems to be the faster (memory and CPU) one event though it's in python. I can get others to run out of RAM on my 8GB server easily :( Interesting app. An issue with hardlinking (with the backups use-case, this problem isn't likely to happen), is that if you modify a file, all the hardlinks get changed along with it - including the ones that you don't want changed. @Marc: Since you've been using btrfs for a while now I'm sure you've already considered whether or not a reflink copy is the better/worse option. Bedup should be better, but last I tried I couldn't get it to work. It's been updated since then, I just haven't had the chance to try it again since then. Please post what you find out, or if you have a hardlink maker that's better than the ones I found :) Thanks for that. I may be completely wrong in my approach. I am not looking for a file level comparison. Bedup worked fine for that. I have a lot of virtual images and shadow protect images where only a few megabytes may be the difference. So a file level hash and comparison doesn't really achieve my goals. I thought duperemove may be on a lower level. https://github.com/markfasheh/duperemove Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. 
When given a list of files it will hash their contents on a block by block basis and compare those hashes to each other, finding and categorizing extents that match each other. When given the -d option, duperemove will submit those extents for deduplication using the btrfs-extent-same ioctl. It defaults to 128k but you can make it smaller. I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long SMART test but seems to die every few hours. Admittedly it was part of a failed mdadm RAID array that I pulled out of a clients machine. The only other copy I have of the data is the original mdadm array that was recently replaced with a new server, so I am loathe to use that HDD yet. At least for another couple of weeks! I am still hopeful duperemove will work. Duperemove does look exactly like what you are looking for. The last traffic on the mailing list regarding that was in August last year. It looks like it was pulled into the main kernel repository on September 1st. The last commit to the duperemove application was on April 20th this year. Maybe Mark (cc'd) can provide further insight on its current status. I have been testing duperemove and it seems to work just fine, in contrast with bedup that i have been unable to install/compile/sort out the mess with python versions. I have 2 questions about duperemove: 1) can it use existing filesystem csums instead of calculating its own? 2) can it be included in btrfs-progs so that it becomes a standard feature of btrfs? Thanks -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
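The block-by-block hashing described above can be sketched in a few lines of C. This is only an illustration of the idea, not duperemove's code: the FNV-1a hash, the helper names hash_blocks() and count_candidate_pairs(), and the tiny block size used in the usage note are all assumptions made for the example (the message above says the real tool defaults to 128k blocks).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit. Chosen here only because it is short; an assumption,
 * not the hash duperemove uses. */
static uint64_t fnv1a(const unsigned char *p, size_t n)
{
	uint64_t h = 14695981039346656037ULL;	/* FNV offset basis */

	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;		/* FNV prime */
	}
	return h;
}

/*
 * Hypothetical helper: hash every full block of 'data' and store the
 * results in 'hashes' (at most 'cap' entries). A trailing partial block
 * is ignored. Returns the number of blocks hashed.
 */
static size_t hash_blocks(const unsigned char *data, size_t len,
			  size_t blocksize, uint64_t *hashes, size_t cap)
{
	size_t n = 0;

	assert(blocksize > 0);
	for (size_t off = 0; off + blocksize <= len && n < cap; off += blocksize)
		hashes[n++] = fnv1a(data + off, blocksize);
	return n;
}

/* Count pairs (i, j) with i < j whose hashes match: dedupe candidates. */
static size_t count_candidate_pairs(const uint64_t *hashes, size_t n)
{
	size_t pairs = 0;

	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (hashes[i] == hashes[j])
				pairs++;
	return pairs;
}
```

For example, hashing the 16 bytes "AAAABBBBAAAACCCC" with a block size of 4 yields four hashes, of which the first and third match. A real tool would use a hash table rather than the quadratic pair scan, and would treat a match only as a candidate to hand to the kernel.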
Re: [PATCH 1/2] btrfs: label should not contain return char
On 5/19/14, 12:04 PM, Anand Jain wrote:
> From: Anand Jain anand.j...@oracle.com
>
> generally if you use
>  echo test > /sys/fs/btrfs/<fsid>/label
> it would introduce return char at the end and it can not be part of the label. The correct command is
>  echo -n test > /sys/fs/btrfs/<fsid>/label
>
> This patch will check for this user error

Wouldn't it be a lot better to just strip the \n if it exists?

> Signed-off-by: Anand Jain anand.j...@oracle.com
> ---
>  fs/btrfs/sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index c5eb214..63c2907 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -374,7 +374,7 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
>  	struct btrfs_root *root = fs_info->fs_root;
>  	int ret;
>
> -	if (len >= BTRFS_LABEL_SIZE) {
> +	if (len >= BTRFS_LABEL_SIZE || strchr(buf, '\n')) {
>  		pr_err("BTRFS: unable to set label with more than %d bytes\n",
>  		       BTRFS_LABEL_SIZE - 1);

so if I do:

 # echo mylabel > /sys/fs/btrfs/<fsid>/label

I'll get:

 BTRFS: unable to set label with more than 255 bytes

which would be pretty confusing, IMHO, given the short label I tried to create. Just strip out the \n ...

-Eric

>  		return -EINVAL;
Re: [PATCH 1/2] btrfs: label should not contain return char
On Tue, 20 May 2014 01:04:30 +0800 Anand Jain anand.j...@oracle.com wrote:

> From: Anand Jain anand.j...@oracle.com
>
> generally if you use
>  echo test > /sys/fs/btrfs/<fsid>/label
> it would introduce return char at the end and it can not be part of the label. The correct command is
>  echo -n test > /sys/fs/btrfs/<fsid>/label
>
> This patch will check for this user error

Maybe instead consider checking for one trailing \n, and silently remove it if passed, so that both of the mentioned variants of 'echo' can be used?

All other sysfs files do not care if you pass an extra \n at the end, e.g.

 echo cfq > /sys/block/sda/queue/scheduler

works fine, and doesn't require you to use echo -n cfq.

-- 
With respect,
Roman
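The behaviour suggested in the two replies above (accept one trailing \n and drop it, rather than rejecting the write) can be sketched in userspace C. This is a minimal sketch, not the kernel code: parse_label() is a hypothetical helper, and LABEL_SIZE is assumed to mirror BTRFS_LABEL_SIZE, which the "255 bytes" message in the thread implies is 256.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LABEL_SIZE 256	/* assumption: mirrors BTRFS_LABEL_SIZE (255 bytes + NUL) */

/*
 * Hypothetical helper: copy a sysfs-style write buffer into 'label',
 * dropping at most one trailing '\n'. Returns 0 on success, or -1 if
 * the label is too long or still contains a newline.
 */
static int parse_label(const char *buf, size_t len, char label[LABEL_SIZE])
{
	assert(buf != NULL && label != NULL);

	/* "echo test > file" appends exactly one newline: accept and drop it. */
	if (len > 0 && buf[len - 1] == '\n')
		len--;

	/* Reject over-long labels and any newline left inside the label. */
	if (len >= LABEL_SIZE || memchr(buf, '\n', len) != NULL)
		return -1;

	memcpy(label, buf, len);
	label[len] = '\0';
	return 0;
}
```

With this shape both "echo test" and "echo -n test" produce the label "test", while a buffer such as "a\nb" is still rejected.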
Re: send/receive and bedup
On Mon, May 19, 2014 at 06:01:25PM +0200, Brendan Hide wrote: On 19/05/14 15:00, Scott Middleton wrote: On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote: Thanks for that. I may be completely wrong in my approach. I am not looking for a file level comparison. Bedup worked fine for that. I have a lot of virtual images and shadow protect images where only a few megabytes may be the difference. So a file level hash and comparison doesn't really achieve my goals. I thought duperemove may be on a lower level. https://github.com/markfasheh/duperemove Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. When given a list of files it will hash their contents on a block by block basis and compare those hashes to each other, finding and categorizing extents that match each other. When given the -d option, duperemove will submit those extents for deduplication using the btrfs-extent-same ioctl. It defaults to 128k but you can make it smaller. I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long SMART test but seems to die every few hours. Admittedly it was part of a failed mdadm RAID array that I pulled out of a clients machine. The only other copy I have of the data is the original mdadm array that was recently replaced with a new server, so I am loathe to use that HDD yet. At least for another couple of weeks! I am still hopeful duperemove will work. Duperemove does look exactly like what you are looking for. The last traffic on the mailing list regarding that was in August last year. It looks like it was pulled into the main kernel repository on September 1st. I'm confused - you need to avoid a file scan completely? Duperemove does do that just to be clear. In your mind, what would be the alternative to that sort of a scan? By the way, if you know exactly where the changes are you could just feed the duplicate extents directly to the ioctl via a script. 
I have a small tool in the duperemove repository that can do that for you ('make btrfs-extent-same').

> The last commit to the duperemove application was on April 20th this year. Maybe Mark (cc'd) can provide further insight on its current status.

Duperemove will be shipping as supported software in a major SUSE release, so it will be bug fixed, etc. as you would expect. At the moment I'm very busy trying to fix qgroup bugs so I haven't had much time to add features, or handle external bug reports, etc. Also I'm not very good at advertising my software, which would be why it hasn't really been mentioned on list lately :)

I would say that the state it's in is that I've gotten the feature set to a point which feels reasonable, and I've fixed enough bugs that I'd appreciate folks giving it a spin and providing reasonable feedback.

There's a TODO list which gives a decent idea of what's on my mind for possible future improvements. I think what I'm most wanting to do right now is some sort of (optional) writeout to a file of what was done during a run. The idea is that you could feed that data back to duperemove to improve the speed of subsequent runs. My priorities may change depending on feedback from users of course.

I also at some point want to rewrite some of the duplicate extent finding code as it got messy and could be a bit faster.

--Mark

--
Mark Fasheh
Re: send/receive and bedup
On Mon, May 19, 2014 at 08:12:03PM +0300, Konstantinos Skarlatos wrote:
> On 19/5/2014 7:01 PM, Brendan Hide wrote:
>> On 19/05/14 15:00, Scott Middleton wrote:
>> Duperemove does look exactly like what you are looking for. The last traffic on the mailing list regarding that was in August last year. It looks like it was pulled into the main kernel repository on September 1st.
>>
>> The last commit to the duperemove application was on April 20th this year. Maybe Mark (cc'd) can provide further insight on its current status.
> I have been testing duperemove and it seems to work just fine, in contrast with bedup that i have been unable to install/compile/sort out the mess with python versions. I have 2 questions about duperemove:
> 1) can it use existing filesystem csums instead of calculating its own?

Not right now, though that may be something we can feed to it in the future.

I haven't thought about this much and to be honest I don't recall *exactly* how btrfs stores its checksums. That said, I think the feasibility of doing this comes down to a few things:

1) How expensive is it to get at the on-disk checksums? This might not make sense if it's simply faster to scan a file than its checksums.

2) Are they stored in a manner which makes sense for dedupe? By that I mean, do we have a checksum for every X bytes? If so, then theoretically life is easy - we just set our blocksize to X and load the checksums into duperemove's internal block checksum tree. If checksums can cover arbitrarily sized extents then we might not be able to use them at all, or maybe we would have to 'fill in the blanks' so to speak.

3) What is the tradeoff of false positives? Btrfs checksums are there for detecting bad blocks, as opposed to duplicate data. The difference is that btrfs doesn't have to use very strong hashing as a result. So we just want to make sure that we don't wind up passing *so* many false positives to the kernel that it would have been faster to scan the file and checksum on our own.

Not that any of those questions are super difficult to answer by the way, it's more about how much time I've had :)

> 2) can it be included in btrfs-progs so that it becomes a standard feature of btrfs?

I have to think about this one personally as it implies some tradeoffs in my development on duperemove that I'm not sure I want to make yet.

--Mark

--
Mark Fasheh
Re: send/receive and bedup
On 2014-05-19 13:12, Konstantinos Skarlatos wrote: On 19/5/2014 7:01 μμ, Brendan Hide wrote: On 19/05/14 15:00, Scott Middleton wrote: On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote: On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote: I read so much about BtrFS that I mistaked Bedup with Duperemove. Duperemove is actually what I am testing. I'm currently using programs that find files that are the same, and hardlink them together: http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html hardlink.py actually seems to be the faster (memory and CPU) one event though it's in python. I can get others to run out of RAM on my 8GB server easily :( Interesting app. An issue with hardlinking (with the backups use-case, this problem isn't likely to happen), is that if you modify a file, all the hardlinks get changed along with it - including the ones that you don't want changed. @Marc: Since you've been using btrfs for a while now I'm sure you've already considered whether or not a reflink copy is the better/worse option. Bedup should be better, but last I tried I couldn't get it to work. It's been updated since then, I just haven't had the chance to try it again since then. Please post what you find out, or if you have a hardlink maker that's better than the ones I found :) Thanks for that. I may be completely wrong in my approach. I am not looking for a file level comparison. Bedup worked fine for that. I have a lot of virtual images and shadow protect images where only a few megabytes may be the difference. So a file level hash and comparison doesn't really achieve my goals. I thought duperemove may be on a lower level. https://github.com/markfasheh/duperemove Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. 
When given a list of files it will hash their contents on a block by block basis and compare those hashes to each other, finding and categorizing extents that match each other. When given the -d option, duperemove will submit those extents for deduplication using the btrfs-extent-same ioctl. It defaults to 128k but you can make it smaller. I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long SMART test but seems to die every few hours. Admittedly it was part of a failed mdadm RAID array that I pulled out of a clients machine. The only other copy I have of the data is the original mdadm array that was recently replaced with a new server, so I am loathe to use that HDD yet. At least for another couple of weeks! I am still hopeful duperemove will work. Duperemove does look exactly like what you are looking for. The last traffic on the mailing list regarding that was in August last year. It looks like it was pulled into the main kernel repository on September 1st. The last commit to the duperemove application was on April 20th this year. Maybe Mark (cc'd) can provide further insight on its current status. I have been testing duperemove and it seems to work just fine, in contrast with bedup that i have been unable to install/compile/sort out the mess with python versions. I have 2 questions about duperemove: 1) can it use existing filesystem csums instead of calculating its own? While this might seem like a great idea at first, it really isn't. BTRFS uses CRC32c at the moment as it's checksum algorithm, and while that is relatively good at detecting small differences (i.e. a single bit flipped out of every 64 or so bytes), it is known to have issues with hash collisions. Normally, the data on disk won't change enough even from a media error to cause a hash collision, but when you start using it to compare extents that aren't known to be the same to begin with, and then try to merge those extents, you run the risk of serious file corruption. 
Also, AFAIK, BTRFS doesn't expose the block checksum to userspace directly (although I may be wrong about this, in which case I retract the following statement); this would therefore require some kernelspace support.

> 2) can it be included in btrfs-progs so that it becomes a standard feature of btrfs?

I would definitely like to second this suggestion. I hear a lot of people talking about how BTRFS has batch deduplication, but it's almost impossible to make use of without extra software or writing your own code.
[PATCH v2] Btrfs: ensure readers see new data after a clone operation
We were cleaning the clone target file range from the page cache before we did replace the file extent items in the fs tree. This was racy, as right after cleaning the relevant range from the page cache and before replacing the file extent items, a read against that range could be performed by another task and populate again the page cache with stale data (stale after the cloning finishes). This would result in reads after the clone operation successfully finishes to get old data (and potentially for a very long time). Therefore evict the pages after replacing the file extent items, so that subsequent reads will always get the new data. Similarly, we were prone to races while cloning the file extent items because we weren't locking the target range and wait for any existing ordered extents against that range to complete. It was possible that after cloning the extent items, a write operation that was performed before the clone operation and overlaps the same range, would end up undoing all or part of the work the clone operation did (a worker task running inode.c:btrfs_finish_ordered_io). Therefore lock the target range in the io tree, wait for all pending ordered extents against that range to finish and then safely perform the cloning. The issue of reading stale data after the clone operation is easy to reproduce by running the following C program in a loop until it exits with return value 1. 
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <pthread.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>
#include <asm/types.h>
#include <linux/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/ioctl.h>

#define SRC_FILE "/mnt/sdd/foo"
#define DST_FILE "/mnt/sdd/bar"
#define FILE_SIZE (16 * 1024)
#define PATTERN_SRC 'X'
#define PATTERN_DST 'Y'

struct btrfs_ioctl_clone_range_args {
	__s64 src_fd;
	__u64 src_offset, src_length;
	__u64 dest_offset;
};

#define BTRFS_IOCTL_MAGIC 0x94
#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
				   struct btrfs_ioctl_clone_range_args)

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int clone_done = 0;
static int reader_ready = 0;
static int stale_data = 0;

static void *reader_loop(void *arg)
{
	char buf[4096], want_buf[4096];

	memset(want_buf, PATTERN_SRC, 4096);
	pthread_mutex_lock(&mutex);
	reader_ready = 1;
	pthread_mutex_unlock(&mutex);

	while (1) {
		int done, fd, ret;

		fd = open(DST_FILE, O_RDONLY);
		assert(fd != -1);

		pthread_mutex_lock(&mutex);
		done = clone_done;
		pthread_mutex_unlock(&mutex);

		ret = read(fd, buf, 4096);
		assert(ret == 4096);
		close(fd);

		if (done) {
			ret = memcmp(buf, want_buf, 4096);
			if (ret == 0) {
				printf("Found new content\n");
			} else {
				printf("Found old content\n");
				pthread_mutex_lock(&mutex);
				stale_data = 1;
				pthread_mutex_unlock(&mutex);
			}
			break;
		}
	}
	return NULL;
}

int main(int argc, char *argv[])
{
	pthread_t reader;
	int ret, i, fd;
	struct btrfs_ioctl_clone_range_args clone_args;
	int fd1, fd2;

	ret = remove(SRC_FILE);
	if (ret == -1 && errno != ENOENT) {
		fprintf(stderr, "Error deleting src file: %s\n", strerror(errno));
		return 1;
	}
	ret = remove(DST_FILE);
	if (ret == -1 && errno != ENOENT) {
		fprintf(stderr, "Error deleting dst file: %s\n", strerror(errno));
		return 1;
	}

	fd = open(SRC_FILE, O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
	assert(fd != -1);
	for (i = 0; i < FILE_SIZE; i++) {
		char c = PATTERN_SRC;
		ret = write(fd, &c, 1);
		assert(ret == 1);
	}
	close(fd);

	fd = open(DST_FILE, O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
	assert(fd != -1);
	for (i = 0; i < FILE_SIZE; i++) {
		char c = PATTERN_DST;
		ret = write(fd, &c, 1);
		assert(ret == 1);
	}
	close(fd);

	ret = pthread_create(&reader, NULL, reader_loop, NULL);
	assert(ret == 0);

	while (1) {
		int r;

		pthread_mutex_lock(&mutex);
		r = reader_ready;
		pthread_mutex_unlock(&mutex);
		if (r)
			break;
	}

	fd1 = open(SRC_FILE, O_RDONLY);
	if (fd1 < 0) {
		fprintf(stderr, "Error open src file: %s\n", strerror(errno));
		return 1;
Re: send/receive and bedup
On Mon, May 19, 2014 at 01:59:01PM -0400, Austin S Hemmelgarn wrote:
> On 2014-05-19 13:12, Konstantinos Skarlatos wrote:
>> I have been testing duperemove and it seems to work just fine, in contrast with bedup that i have been unable to install/compile/sort out the mess with python versions. I have 2 questions about duperemove:
>> 1) can it use existing filesystem csums instead of calculating its own?
> While this might seem like a great idea at first, it really isn't. BTRFS uses CRC32c at the moment as it's checksum algorithm, and while that is relatively good at detecting small differences (i.e. a single bit flipped out of every 64 or so bytes), it is known to have issues with hash collisions. Normally, the data on disk won't change enough even from a media error to cause a hash collision, but when you start using it to compare extents that aren't known to be the same to begin with, and then try to merge those extents, you run the risk of serious file corruption. Also, AFAIK, BTRFS doesn't expose the block checksum to userspace directly (although I may be wrong about this, in which case i retract the following statement) this would therefore require some kernelspace support.

I'm pretty sure you could get the checksums via ioctl. The thing about dedupe though is that the kernel is always doing a byte-by-byte comparison of the file data before merging it, so we should never corrupt data just because userspace gave us a bad range to dedupe.

That said I don't necessarily disagree that it might not be as good an idea as it sounds.

--Mark

--
Mark Fasheh
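The point made in the reply above, that a checksum match only nominates a candidate and a byte-by-byte comparison makes the final decision, can be sketched as follows. This is an illustration only: weak_sum() is a deliberately poor stand-in checksum (it is not CRC32c) so that a collision is easy to show, and blocks_dedupable() is a hypothetical name, not a kernel or duperemove function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Deliberately weak stand-in checksum (NOT CRC32c): a plain byte sum,
 * so that two different blocks can easily share a checksum. */
static uint32_t weak_sum(const unsigned char *p, size_t n)
{
	uint32_t s = 0;

	for (size_t i = 0; i < n; i++)
		s += p[i];
	return s;
}

/*
 * Hypothetical helper: returns 1 only if the two blocks are really
 * identical. The checksum is used purely as a cheap filter.
 */
static int blocks_dedupable(const unsigned char *a, const unsigned char *b,
			    size_t n)
{
	assert(a != NULL && b != NULL);

	/* Different checksums: the data is certainly different. */
	if (weak_sum(a, n) != weak_sum(b, n))
		return 0;

	/* Same checksum: only a candidate, so verify every byte. */
	return memcmp(a, b, n) == 0;
}
```

Here "ab" and "ba" collide under weak_sum() but are still rejected by the memcmp() step, which is why a weak checksum costs extra comparisons rather than correctness in this scheme.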
Re: ditto blocks on ZFS
On 18/05/14 17:09, Russell Coker wrote: On Sat, 17 May 2014 13:50:52 Martin wrote: [...] Do you see or measure any real advantage? Imagine that you have a RAID-1 array where both disks get ~14,000 read errors. This could happen due to a design defect common to drives of a particular model or some shared environmental problem. Most errors would be corrected by RAID-1 but there would be a risk of some data being lost due to both copies being corrupt. Another possibility is that one disk could entirely die (although total disk death seems rare nowadays) and the other could have corruption. If metadata was duplicated in addition to being on both disks then the probability of data loss would be reduced. Another issue is the case where all drive slots are filled with active drives (a very common configuration). To replace a disk you have to physically remove the old disk before adding the new one. If the array is a RAID-1 or RAID-5 then ANY error during reconstruction loses data. Using dup for metadata on top of the RAID protections (IE the ZFS ditto idea) means that case doesn't lose you data. Your example there is for the case where in effect there is no RAID. How is that case any better than what is already done for btrfs duplicating metadata? So... What real-world failure modes do the ditto blocks usefully protect against? And how does that compare for failure rates and against what is already done? For example, we have RAID1 and RAID5 to protect against any one RAID chunk being corrupted or for the total loss of any one device. There is a second part to that in that another failure cannot be tolerated until the RAID is remade. Hence, we have RAID6 that protects against any two failures for a chunk or device. Hence with just one failure, you can tolerate a second failure whilst rebuilding the RAID. 
And then we supposedly have safety-by-design, where the filesystem itself uses a journal and barriers/sync to ensure that the filesystem is always kept in a consistent state, even after an interruption to any writes.

*What other failure modes* should we guard against?

There has been mention of fixing metadata keys from single bit flips... Should Hamming codes be used instead of a CRC so that we can have multiple-bit error detection and single-bit error correction for all data, both in RAM and on disk, for those systems that do not use ECC RAM? Would that be useful?...

Regards,
Martin
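To make the Hamming-code question above concrete, here is a minimal sketch of the classic Hamming(7,4) code in C. It is an illustration under stated assumptions, not anything btrfs implements: the function names are hypothetical, and this plain (7,4) code only corrects a single flipped bit per codeword. Detecting double-bit errors as well would need the extended (SECDED) variant with one more parity bit, which is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Bit at 1-based position 'pos' of codeword 'v' (positions 1..7). */
static uint8_t bit_at(uint8_t v, int pos)
{
	return (uint8_t)((v >> (pos - 1)) & 1u);
}

/*
 * Encode 4 data bits into a 7-bit codeword. Parity bits sit at
 * positions 1, 2 and 4; data bits d1..d4 at positions 3, 5, 6 and 7.
 */
static uint8_t hamming74_encode(uint8_t nibble)
{
	assert(nibble < 16);

	uint8_t d1 = nibble & 1u, d2 = (nibble >> 1) & 1u;
	uint8_t d3 = (nibble >> 2) & 1u, d4 = (nibble >> 3) & 1u;
	uint8_t p1 = d1 ^ d2 ^ d4;	/* covers positions 1, 3, 5, 7 */
	uint8_t p2 = d1 ^ d3 ^ d4;	/* covers positions 2, 3, 6, 7 */
	uint8_t p4 = d2 ^ d3 ^ d4;	/* covers positions 4, 5, 6, 7 */

	return (uint8_t)(p1 | (p2 << 1) | (d1 << 2) | (p4 << 3) |
			 (d2 << 4) | (d3 << 5) | (d4 << 6));
}

/*
 * Decode a 7-bit codeword, correcting at most one flipped bit.
 * '*corrected_pos' receives the repaired position (1..7), or 0 if the
 * codeword was already consistent.
 */
static uint8_t hamming74_decode(uint8_t cw, int *corrected_pos)
{
	int s1 = bit_at(cw, 1) ^ bit_at(cw, 3) ^ bit_at(cw, 5) ^ bit_at(cw, 7);
	int s2 = bit_at(cw, 2) ^ bit_at(cw, 3) ^ bit_at(cw, 6) ^ bit_at(cw, 7);
	int s4 = bit_at(cw, 4) ^ bit_at(cw, 5) ^ bit_at(cw, 6) ^ bit_at(cw, 7);
	int syndrome = s1 | (s2 << 1) | (s4 << 2);

	/* A non-zero syndrome is the position of the flipped bit. */
	if (syndrome)
		cw ^= (uint8_t)(1u << (syndrome - 1));
	if (corrected_pos)
		*corrected_pos = syndrome;

	return (uint8_t)(bit_at(cw, 3) | (bit_at(cw, 5) << 1) |
			 (bit_at(cw, 6) << 2) | (bit_at(cw, 7) << 3));
}
```

The cost is visible in the layout: three parity bits protect four data bits, which is far more overhead than a CRC over a whole block, and is part of the tradeoff the question above is asking about.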
Re: [ANNOUNCE] xfstests: new mailing list
On Mon, May 19, 2014 at 07:55:41AM -0700, Christoph Hellwig wrote:
> On Sat, May 17, 2014 at 08:19:30AM +1000, Dave Chinner wrote:
>> Renaming the test suite takes a lot more work - e.g. renaming/moving source trees and fixing all the documentation that points to it...
>
> In that case please call the list xfstests - a name different by a single character is utterly confusing.
>
> And I definitely see some merit to the suggestion that we'll just keep the x and allow people to come up with a nice backronym for it if they care enough.

What is important is that we have a separate list for the filesystem test suite we use, not whether the name has an x in it or not. Arguing about whether there should or should not be an 'x' in the mailing list name is just a waste of time - it's not going to make me change the name of the list.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: ditto blocks on ZFS
On 2014/05/19 10:36 PM, Martin wrote:
> On 18/05/14 17:09, Russell Coker wrote:
>> On Sat, 17 May 2014 13:50:52 Martin wrote:
>> [...]
>>> Do you see or measure any real advantage?
> [snip]

This is extremely difficult to measure objectively. Subjectively ... see below.

> [snip]
>
> *What other failure modes* should we guard against?

I know I'd sleep a /little/ better at night knowing that a double disk failure on a raid5/1/10 configuration might ruin a ton of data along with an obscure set of metadata in some long tree paths - but not the entire filesystem.

The other use-case/failure mode - where you are somehow unlucky enough to have sets of bad sectors/bitrot on multiple disks that simultaneously affect the only copies of the tree roots - is an extremely unlikely scenario. As unlikely as it may be, the scenario is a very painful consequence in spite of VERY little corruption. That is where the peace-of-mind/bragging rights come in.

--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
Re: send/receive and bedup
On 19/5/2014 8:38 PM, Mark Fasheh wrote:
> On Mon, May 19, 2014 at 06:01:25PM +0200, Brendan Hide wrote:
>> On 19/05/14 15:00, Scott Middleton wrote:
>>> On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
>>> Thanks for that.
>>>
>>> I may be completely wrong in my approach.
>>>
>>> I am not looking for a file level comparison. Bedup worked fine for that. I have a lot of virtual images and shadow protect images where only a few megabytes may be the difference. So a file level hash and comparison doesn't really achieve my goals.
>>>
>>> I thought duperemove may be on a lower level.
>>>
>>> https://github.com/markfasheh/duperemove
>>>
>>> Duperemove is a simple tool for finding duplicated extents and submitting them for deduplication. When given a list of files it will hash their contents on a block by block basis and compare those hashes to each other, finding and categorizing extents that match each other. When given the -d option, duperemove will submit those extents for deduplication using the btrfs-extent-same ioctl.
>>>
>>> It defaults to 128k but you can make it smaller.
>>>
>>> I hit a hurdle though. The 3TB HDD I used seemed OK when I did a long SMART test but seems to die every few hours. Admittedly it was part of a failed mdadm RAID array that I pulled out of a client's machine.
>>>
>>> The only other copy I have of the data is the original mdadm array that was recently replaced with a new server, so I am loath to use that HDD yet. At least for another couple of weeks!
>>>
>>> I am still hopeful duperemove will work.
>>
>> Duperemove does look exactly like what you are looking for. The last traffic on the mailing list regarding that was in August last year. It looks like it was pulled into the main kernel repository on September 1st.
>
> I'm confused - you need to avoid a file scan completely? Duperemove does do that, just to be clear.
>
> In your mind, what would be the alternative to that sort of a scan?
>
> By the way, if you know exactly where the changes are you could just feed the duplicate extents directly to the ioctl via a script. I have a small tool in the duperemove repository that can do that for you ('make btrfs-extent-same').
>
>> The last commit to the duperemove application was on April 20th this year. Maybe Mark (cc'd) can provide further insight on its current status.
>
> Duperemove will be shipping as supported software in a major SUSE release so it will be bug fixed, etc as you would expect. At the moment I'm very busy trying to fix qgroup bugs so I haven't had much time to add features, or handle external bug reports, etc. Also I'm not very good at advertising my software which would be why it hasn't really been mentioned on list lately :)
>
> I would say that the state that it's in is that I've gotten the feature set to a point which feels reasonable, and I've fixed enough bugs that I'd appreciate folks giving it a spin and providing reasonable feedback.

Well, after having good results with duperemove with a few gigs of data, I tried it on a 500 GB subvolume. After it scanned all files, it has been stuck at 100% of one CPU core for about 5 hours, and still hasn't done any deduping. My CPU is an Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz, so I guess that's not the problem. So I guess the speed of duperemove drops dramatically as data volume increases.

> There's a TODO list which gives a decent idea of what's on my mind for possible future improvements. I think what I'm most wanting to do right now is some sort of (optional) writeout to a file of what was done during a run. The idea is that you could feed that data back to duperemove to improve the speed of subsequent runs. My priorities may change depending on feedback from users of course.
>
> I also at some point want to rewrite some of the duplicate extent finding code as it got messy and could be a bit faster.
>
> 	--Mark
>
> --
> Mark Fasheh
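The block-by-block hashing step described in the quoted duperemove README can be sketched roughly as follows. This is only an illustration of the idea, not duperemove's code: the function name is hypothetical, SHA-256 is an assumed choice of hash, and the sketch stops at reporting matching blocks rather than merging them into extents or calling the btrfs-extent-same ioctl.

```python
import hashlib
from collections import defaultdict


def find_duplicate_blocks(streams, block_size=128 * 1024):
    """Group identical fixed-size blocks across several binary streams.

    streams: iterable of (name, binary file object) pairs.
    Returns {hex digest: [(name, offset), ...]} for digests seen more
    than once. The 128 KiB default mirrors the block size mentioned in
    the thread.
    """
    by_digest = defaultdict(list)
    for name, stream in streams:
        offset = 0
        while True:
            block = stream.read(block_size)
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()
            by_digest[digest].append((name, offset))
            offset += len(block)
    return {d: locs for d, locs in by_digest.items() if len(locs) > 1}
```

A caller would open each file in binary mode and pass `(path, file_object)` pairs. Matching hashes only mark candidates; a careful tool would still compare the bytes (or let the kernel do so) before sharing extents.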
Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.
Original Message
Subject: Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.
From: David Sterba dste...@suse.cz
To: Hugo Mills h...@carfax.org.uk, Qu Wenruo quwen...@cn.fujitsu.com, linux-btrfs@vger.kernel.org, c...@fb.com
Date: 2014-05-19 22:33

> On Mon, May 19, 2014 at 04:01:23PM +0200, David Sterba wrote:
>> On Sat, May 17, 2014 at 06:43:15PM +0100, Hugo Mills wrote:
>>> I've just been poking around in the docs for a completely different reason, and I think there's a fairly serious problem (well, as serious as problems get with documentation).
>>>
>>> Take, for example, the format for btrfs fi resize:
>>>
>>> 'resize' [devid:][+/-]size[gkm]|[devid:]max path::
>>>
>>> Now, this has just thrown away all of the useful markup which indicates the semantics of the command. The asciidoc renders all of that text literally and unformatted, making alphasymbolic(*) soup of the docs. Compare this to the old roff man page:
>>>
>>> \fBbtrfs\fP \fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP
>>
>> I think we can restore the formatting with asciidoc. The line above would become:
>>
>> *btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]'max path'
>>
>> or with bold max
>>
>> *btrfs* *filesystem resize* ['devid':][+/-]'size'[kgm]|['devid':]*max* 'path'
>
> The correct base string should read
>
> btrfs filesystem resize [<devid>:][+/-]<size>[kgm]|[<devid>:]max path
>
> i.e. add <..> around devid and size. That way it's copy-paste-ready. In this case the italic/underlined text does not IMO add much value.

It is completely OK for me. Since the base string is copy-paste-ready, it would add any extra effort to add other markup.

> My personal feeling about the enriched formatting is that the commands stand out of the text and are easier to catch (as you've mentioned somewhere in the thread). The bolded subcommand name seems to be sufficient.
>
> The files are processed by XSL, I think it should be possible to apply some transformation that would add '...' around <...> automatically instead of making everybody write that.
>
> Proposed changes:
> - format all subcommands as bold instead of italic ('' -> **)
> - add all missing <...>
> - find a way to add '...' around <...> (xsl or sed or whatever)
>
> Does that work for you?

That is OK for me, I'll investigate it.

Should I send a new patchset or just delta patches upon the current base?

Thanks,
Qu
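The "xsl or sed or whatever" step proposed above could look something like the following regular-expression pass over the asciidoc source. It is a rough sketch under assumptions: placeholders are taken to be written in angle brackets such as <devid>, the function name is invented for the example, and the real btrfs-progs build may handle this differently (for instance in the XSL stage).

```python
import re

# Matches <placeholder> unless it is already wrapped in single quotes.
# Whitespace inside the brackets is excluded so that ordinary prose
# such as "a < b and c > d" is left alone.
_PLACEHOLDER = re.compile(r"(?<!')<([^<>\s]+)>(?!')")


def quote_placeholders(line):
    """Wrap each bare <placeholder> in single quotes (asciidoc emphasis)."""
    return _PLACEHOLDER.sub(r"'<\1>'", line)
```

Because already-quoted placeholders are skipped, the pass can be run repeatedly over the same file without stacking quotes.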
btrfs diff between snapshots
As a follow-up to the discussion we had on how to do a live data switch on a busy partition without unmounting it, the question came back to how you know what changed between 2 snapshots.

btrfs send giving a file list of what was added/modified/removed is the long term answer, and Filipe is working on patches that will offer this in the future (thanks Filipe).

In the meantime, I found a hack/script that gives a partial diff between 2 snapshots, called it btrfs-diff but then remembered that there isn't a page for it. So, I made one, along with example usage:
http://marc.merlins.org/perso/btrfs/post_2014-05-19_Btrfs-diff-Between-Snapshots.html

Hope this helps.

As another temp hack, I tried to look at a quick way to parse btrfs send output to just spit out filenames, but that wasn't too trivial (as in with sed/perl). If someone has something better, please share :)

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
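For comparison, a filesystem-agnostic way to get an added/modified/removed list is to walk both snapshot directories and compare file metadata. The sketch below is not the btrfs-diff script linked above and does not use btrfs send; the function names are hypothetical, and it has to stat every file in both trees, so it will be far slower than anything that uses btrfs's own change tracking.

```python
import os


def snapshot_index(root):
    """Map each file's path (relative to root) to (mode, size, mtime_ns).

    Only entries that os.walk reports as files are indexed; directories
    are skipped because their size and mtime change whenever an entry is
    added or removed, which would add noise to the diff.
    """
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            rel = os.path.relpath(full, root)
            index[rel] = (st.st_mode, st.st_size, st.st_mtime_ns)
    return index


def diff_snapshots(old_root, new_root):
    """Return (added, modified, removed) lists of relative paths."""
    old = snapshot_index(old_root)
    new = snapshot_index(new_root)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, modified, removed
```

A file rewritten with the same size and a restored mtime would be missed; comparing content hashes would catch that at a much higher I/O cost.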
Re: ditto blocks on ZFS
On Mon, 19 May 2014 23:47:37 Brendan Hide wrote:
> This is extremely difficult to measure objectively. Subjectively ... see below.
>
>> [snip]
>>
>> *What other failure modes* should we guard against?
>
> I know I'd sleep a /little/ better at night knowing that a double disk failure on a raid5/1/10 configuration might ruin a ton of data along with an obscure set of metadata in some long tree paths - but not the entire filesystem.

My experience is that most disk failures that don't involve extreme physical damage (e.g. dropping a drive on concrete) don't involve totally losing the disk. Much of the discussion about RAID failures concerns entirely failed disks, but I believe that is due to RAID implementations such as Linux software RAID that will entirely remove a disk when it gives errors.

I have a disk which had ~14,000 errors of which ~2000 errors were corrected by duplicate metadata. If two disks with that problem were in a RAID-1 array then duplicate metadata would be a significant benefit.

> The other use-case/failure mode - where you are somehow unlucky enough to have sets of bad sectors/bitrot on multiple disks that simultaneously affect the only copies of the tree roots - is an extremely unlikely scenario. As unlikely as it may be, the scenario is a very painful consequence in spite of VERY little corruption. That is where the peace-of-mind/bragging rights come in.

http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html

The NetApp research on latent errors on drives is worth reading. On page 12 they report latent sector errors on 9.5% of SATA disks per year. So if you lose one disk entirely the risk of having errors on a second disk is higher than you would want for RAID-5.

While losing the root of the tree is unlikely, losing a directory in the middle that has lots of subdirectories is a risk.

I can understand why people wouldn't want ditto blocks to be mandatory. But why are people arguing against them as an option?

As an aside, I'd really like to be able to set RAID levels by subtree. I'd like to use RAID-1 with ditto blocks for my important data and RAID-0 for unimportant data.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
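The RAID-5 rebuild concern above can be put into rough numbers. The sketch below makes two simplifying assumptions that are not in the cited paper: disks fail independently, and the 9.5% annual figure is treated as the chance that any given surviving disk carries an undetected latent error at rebuild time (regular scrubbing would lower that). The function name is made up for the example.

```python
def p_latent_error_during_rebuild(p_disk, surviving_disks):
    """Chance that at least one surviving disk has a latent sector error.

    p_disk: assumed probability that a single disk carries a latent error.
    surviving_disks: number of disks that must be read in full to rebuild.
    """
    if not 0.0 <= p_disk <= 1.0:
        raise ValueError("p_disk must be a probability")
    if surviving_disks < 0:
        raise ValueError("surviving_disks must be non-negative")
    # Complement of "every surviving disk is clean".
    return 1.0 - (1.0 - p_disk) ** surviving_disks


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, round(p_latent_error_during_rebuild(0.095, n), 3))
```

Under these assumptions a four-disk RAID-5 that has lost one disk (three survivors) has roughly a 26% chance of hitting a latent error somewhere during the rebuild, which is the kind of case where a second copy of metadata would help.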