Re: Triple parity and beyond

2013-11-21 Thread joystick
On 21/11/2013 02:28, Stan Hoeppner wrote: On 11/20/2013 10:16 AM, James Plank wrote: Hi all -- no real comments, except as I mentioned to Ric, my tutorial in FAST last February presents Reed-Solomon coding with Cauchy matrices, and then makes special note of the common pitfall of assuming that

Re: Triple parity and beyond

2013-11-21 Thread David Brown
On 20/11/13 19:09, John Williams wrote: On Wed, Nov 20, 2013 at 2:31 AM, David Brown david.br...@hesbynett.no wrote: That's certainly a reasonable way to look at it. We should not limit the possibilities for high-end systems because of the limitations of low-end systems that are unlikely to

Re: Triple parity and beyond

2013-11-21 Thread David Brown
On 20/11/13 19:34, Andrea Mazzoleni wrote: Hi David, The choice of ZFS to use powers of 4 was likely not optimal, because to multiply by 4, it has to do two multiplications by 2. I can agree with that. I didn't copy ZFS's choice here David, it was not my intention to suggest that you
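
A minimal sketch of the point about powers of 4, assuming byte-wide GF(2^8) arithmetic with the reduction polynomial 0x11D (the polynomial is an assumption here; the thread snippet does not name one). The helper names gf_mul2 and gf_mul4 are hypothetical. Multiplying by 4 is written as two chained multiplications by 2, which is the extra cost mentioned above:

```python
GF_POLY = 0x11D  # assumed reduction polynomial for GF(2^8); not stated in the thread


def gf_mul2(x: int) -> int:
    """Multiply one byte by 2 in GF(2^8): shift left, then reduce on overflow."""
    x <<= 1
    if x & 0x100:          # bit 8 set means the product left the 8-bit field
        x ^= GF_POLY       # reduce modulo the field polynomial
    return x & 0xFF


def gf_mul4(x: int) -> int:
    """Multiply one byte by 4 as two successive multiplications by 2."""
    return gf_mul2(gf_mul2(x))


if __name__ == "__main__":
    for value in (0x01, 0x40, 0x80):
        print(f"{value:#04x} * 2 = {gf_mul2(value):#04x}, * 4 = {gf_mul4(value):#04x}")
```

Each parity step based on powers of 4 therefore pays for two shift-and-reduce operations per byte where a power-of-2 generator pays for one.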

Re: [PATCH 1/2] Documentation: filesystems: add new btrfs mount options

2013-11-21 Thread Stefan Behrens
On Wed, 20 Nov 2013 15:05:51 +0100, David Sterba wrote: Two new options were added in 3.12: commit and rescan_uuid_tree CC: linux-...@vger.kernel.org Signed-off-by: David Sterba dste...@suse.cz --- Documentation/filesystems/btrfs.txt | 12 +++- 1 file changed, 11 insertions(+), 1

Re: Triple parity and beyond

2013-11-21 Thread David Brown
On 21/11/13 02:28, Stan Hoeppner wrote: On 11/20/2013 10:16 AM, James Plank wrote: Hi all -- no real comments, except as I mentioned to Ric, my tutorial in FAST last February presents Reed-Solomon coding with Cauchy matrices, and then makes special note of the common pitfall of assuming that

Re: Triple parity and beyond

2013-11-21 Thread David Brown
On 21/11/13 10:54, Adam Goryachev wrote: On 21/11/13 20:07, David Brown wrote: I can see plenty of reasons why raid15 might be a good idea, and even raid16 for 5 disk redundancy, compared to multi-parity sets. However, it costs a lot in disk space. For example, with 20 disks at 1 TB each,

[PATCH 5/5] Btrfs: reclaim the reserved metadata space at background

2013-11-21 Thread Miao Xie
Before applying this patch, the task had to reclaim the metadata space by itself if the metadata space was not enough. When the task started the space reclamation, all the other tasks which wanted to reserve the metadata space were blocked. In some cases, they would be blocked for a long time,

Re: Triple parity and beyond

2013-11-21 Thread Piergiorgio Sartor
Hi David, On Thu, Nov 21, 2013 at 09:31:46PM +0100, David Brown wrote: [...] If this can all be done to give the user an informed choice, then it sounds good. that would be my target. To _offer_ more options to the (advanced) user. It _must_ always be under user control. One issue here is

Re: Actual effect of mkfs.btrfs -m raid10 /dev/sdX ... -d raid10 /dev/sdX ...

2013-11-21 Thread Jeff Mahoney
On 11/19/13, 12:12 AM, deadhorseconsulting wrote: In theory (going by the man page and available documentation, not 100% clear) does the following command indeed actually work as advertised and specify how metadata should be placed and kept only on the devices specified after the -m flag?

Re: Triple parity and beyond

2013-11-21 Thread Stan Hoeppner
On 11/21/2013 1:05 AM, John Williams wrote: On Wed, Nov 20, 2013 at 10:52 PM, Stan Hoeppner s...@hardwarefreak.com wrote: On 11/20/2013 8:46 PM, John Williams wrote: For myself or any machines I managed for work that do not need high IOPS, I would definitely choose triple- or quad-parity

[PATCH v2] btrfs: fix leaks during sysfs teardown

2013-11-21 Thread Jeff Mahoney
Filipe noticed that we were leaking the features attribute group after umount. His fix of just calling sysfs_remove_group() wasn't enough since that removes just the supported features and not the unsupported features. This patch changes the unknown feature handling to add them individually so we

Re: [PATCH] btrfs: fix leaks during sysfs teardown

2013-11-21 Thread Jeff Mahoney
On 11/21/13, 8:07 AM, Filipe David Manana wrote: On Wed, Nov 20, 2013 at 9:59 PM, Jeff Mahoney je...@suse.com wrote: Filipe noticed that we were leaking the features attribute group after umount. His fix of just calling sysfs_remove_group() wasn't enough since that removes just the supported

Re: Triple parity and beyond

2013-11-21 Thread Piergiorgio Sartor
On Thu, Nov 21, 2013 at 11:13:29AM +0100, David Brown wrote: [...] Ah, you are trying to find which disk has incorrect data so that you can change just that one disk? There are dangers with that... Hi David, http://neil.brown.name/blog/20100211050355 I think we already did the exercise,

Re: Triple parity and beyond

2013-11-21 Thread Stan Hoeppner
On 11/21/2013 2:08 AM, joystick wrote: On 21/11/2013 02:28, Stan Hoeppner wrote: ... WRT rebuild times, once drives hit 20TB we're looking at 18 hours just to mirror a drive at full streaming bandwidth, assuming 300MB/s average--and that is probably being kind to the drive makers. With 6 or
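
The 18-hour figure quoted above can be checked with a rough calculation. This is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant streaming rate, and the function name mirror_hours is hypothetical:

```python
def mirror_hours(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to copy a whole drive at a constant streaming rate.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 MB = 1e6 bytes.
    """
    seconds = (capacity_tb * 1e12) / (rate_mb_per_s * 1e6)
    return seconds / 3600.0


if __name__ == "__main__":
    # 20 TB drive at an average of 300 MB/s, as in the message above
    print(f"{mirror_hours(20, 300):.1f} hours")
```

Under these assumptions the result is about 18.5 hours, consistent with the estimate in the message; any slowdown from concurrent I/O would only lengthen it.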

Re: Triple parity and beyond

2013-11-21 Thread David Brown
On 21/11/13 21:52, Piergiorgio Sartor wrote: Hi David, On Thu, Nov 21, 2013 at 09:31:46PM +0100, David Brown wrote: [...] If this can all be done to give the user an informed choice, then it sounds good. that would be my target. To _offer_ more options to the (advanced) user. It _must_

Re: Triple parity and beyond

2013-11-21 Thread H. Peter Anvin
On 11/21/2013 04:30 PM, Stan Hoeppner wrote: The rebuild time of a parity array normally has little to do with CPU overhead. Unless you have to fall back to table driven code. Anyway, this looks like a great concept. Now we just need to implement it ;) -hpa -- To unsubscribe from

Re: Triple parity and beyond

2013-11-21 Thread David Brown
On 22/11/13 01:30, Stan Hoeppner wrote: I don't like it either. It's a compromise. But as RAID1/10 will soon be unusable due to URE probability during rebuild, I think it's a relatively good compromise for some users, some workloads. An alternative is to move to 3-way raid1 mirrors rather

[PATCH 2/5] Btrfs: just do dirty page flush for the inode with compression before direct IO

2013-11-21 Thread Miao Xie
As the comment in btrfs_direct_IO says, only the compressed pages need to be flushed again to make sure they are on the disk; the common pages do not. So we add an if statement to check whether the inode has compressed pages, and if not, skip the flush. And in order to prevent the write ranges

Re: [PATCH] btrfs: fix leaks during sysfs teardown

2013-11-21 Thread Filipe David Manana
On Wed, Nov 20, 2013 at 9:59 PM, Jeff Mahoney je...@suse.com wrote: Filipe noticed that we were leaking the features attribute group after umount. His fix of just calling sysfs_remove_group() wasn't enough since that removes just the supported features and not the unknown features (but a

Re: [PATCH 1/2 v2] Documentation: filesystems: add new btrfs mount options

2013-11-21 Thread Chris Mason
Quoting David Sterba (2013-11-21 11:48:03) Two new options were added in 3.12: commit and rescan_uuid_tree CC: linux-...@vger.kernel.org Signed-off-by: David Sterba dste...@suse.cz --- v2: fix typo pointed out by Stefan You just caught me before the pull request. V1 was already in

[PATCH 4/5] Btrfs: remove unnecessary lock in may_commit_transaction()

2013-11-21 Thread Miao Xie
The reason is: - The per-cpu counter has its own lock to protect itself. - Here we need to get an exact value. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/extent-tree.c | 9 + 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/fs/btrfs/extent-tree.c

[PATCH 3/5] Btrfs: remove the unnecessary flush when preparing the pages

2013-11-21 Thread Miao Xie
Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/file.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 82d0342..27f65e4 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1286,12 +1286,11 @@ again:

Re: Triple parity and beyond

2013-11-21 Thread Piergiorgio Sartor
On Wed, Nov 20, 2013 at 07:28:37PM -0600, Stan Hoeppner wrote: [...] It's always perilous to follow a Ph.D., so I guess I'm feeling suicidal today. ;) I'm not attempting to marginalize Andrea's work here, but I can't help but ponder what the real value of triple parity RAID is, or quad, or

[PATCH 1/5] Btrfs: wake up the tasks that wait for the io earlier

2013-11-21 Thread Miao Xie
The tasks that wait for the IO_DONE flag just care about the io of the dirty pages, so it is better to wake up them immediately after all the pages are written, not the whole process of the io completes. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/ordered-data.c | 14 ++

[PATCH 1/2 v2] Documentation: filesystems: add new btrfs mount options

2013-11-21 Thread David Sterba
Two new options were added in 3.12: commit and rescan_uuid_tree CC: linux-...@vger.kernel.org Signed-off-by: David Sterba dste...@suse.cz --- v2: fix typo pointed out by Stefan Documentation/filesystems/btrfs.txt | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git

Re: [PATCH 1/5] Btrfs: wake up the tasks that wait for the io earlier

2013-11-21 Thread Liu Bo
On Thu, Nov 21, 2013 at 09:43:14PM +0800, Miao Xie wrote: The tasks that wait for the IO_DONE flag just care about the io of the dirty pages, so it is better to wake up them immediately after all the pages are written, not the whole process of the io completes. This doesn't seem to make sense,

Re: [PATCH 1/5] Btrfs: wake up the tasks that wait for the io earlier

2013-11-21 Thread Miao Xie
On Fri, 22 Nov 2013 12:30:40 +0800, Liu Bo wrote: On Thu, Nov 21, 2013 at 09:43:14PM +0800, Miao Xie wrote: The tasks that wait for the IO_DONE flag just care about the io of the dirty pages, so it is better to wake up them immediately after all the pages are written, not the whole process of

Re: [PATCH v4 2/3] btrfs-progs: fs show should handle if subvol(s) mounted

2013-11-21 Thread David Sterba
On Thu, Nov 21, 2013 at 11:32:36AM +0800, Anand Jain wrote: On 11/20/2013 10:18 PM, David Sterba wrote: On Fri, Nov 15, 2013 at 07:25:34PM +0800, Anand Jain wrote: static int btrfs_scan_kernel(void *search) { - int ret = 0, fd; - FILE *f; - struct mntent *mnt; - struct