Re: RAID6 questions

2013-06-02 Thread Felix Blanke
Hi,

FYI: RAID5/6 hit mainline in 3.9; with 3.8 you will not be able to use
those RAID levels.

Regards,
Felix

On Sat, Jun 1, 2013 at 11:23 PM, Hugo Mills h...@carfax.org.uk wrote:
 On Sat, Jun 01, 2013 at 02:07:53PM -0700, ronnie sahlberg wrote:
 Hi List,

 I have a filesystem that is spanning about 10 devices.
 It is currently using RAID1 for both data and metadata.

 In order to get higher availability and be able to handle multi-device
 failures, I would like to change from RAID1 to RAID6.


 Is it possible/stable/supported/recommended to change data from RAID1 to 
 RAID6 ?
 (I assume btrfs fi balance ...  is used for this?)

Yes.

 Metadata is currently RAID1; is it supported to put metadata as RAID6 too?
 It would be odd to have lesser protection for metadata than data.
 Optimally I would like a mode where metadata is mirrored onto all the
 spindles in the filesystem, not just 2 in RAID1 or n in RAID6.

Yes, that should be supported.
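
   As a rough sketch of what the conversion itself would look like
(assuming a raid5/6-capable kernel and btrfs-progs; check the balance
documentation for your versions, as the filter syntax has changed over
time), the balance convert filters do the job:

   # btrfs balance start -dconvert=raid6 -mconvert=raid6 /mountpoint

   That rewrites both data and metadata chunks into the new profile as it
goes, so expect it to take a while on a ten-device filesystem.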

 I'm running a 3.8.0 kernel.

The btrfs RAID-5 and RAID-6 implementations aren't really ready for
 production use, so right now I wouldn't recommend using them for
 anything other than for testing purposes with data that's replaceable.

Hugo.

 --
 === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
   PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- w.w.w.  : England's batting scorecard ---


Is there a way to flag specific directories nodatacow?

2013-06-02 Thread George Mitchell
I am seeing massive journal corruptions that seem to be unique to btrfs 
and I am suspecting that cow might be causing them.  My bandaid fix for 
this will be to mark the /var filesystem nodatacow at boot.  But I am 
wondering if there is any way to flag a particular directory as 
nodatacow outside of the mount process.  I would like to be able to 
mark /var/log/journal as nodatacow for example, without having to 
declare it a subvolume and mount it separately.



Possible solution to the open_ctree boot bug ...

2013-06-02 Thread George Mitchell
I am seeing a huge improvement in boot performance since doing a 
system-wide, file-by-file defragmentation of metadata.  In fact, in the four 
sequential boots since completing this process, I have not seen one 
open_ctree failure so far.  This leads me to suspect that the open_ctree 
boot failures that have been plaguing me since install have been related 
to metadata fragmentation.  So I would advise anyone else experiencing 
open_ctree boot problems to defragment their metadata and see if that 
helps.  It certainly seems to have helped me in that regard.
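
For reference, a sketch of the kind of file-by-file pass I mean (assuming 
plain find plus btrfs-progs; adjust the starting path to taste):

# given a directory, 'btrfs filesystem defragment' works on its metadata
# rather than on the files inside it, so walking the tree directory by
# directory defragments the metadata a piece at a time
find / -xdev -type d -exec btrfs filesystem defragment {} \;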



Re: [PATCH 2/6] Btrfs-progs: add btrfsck functionality to btrfs

2013-06-02 Thread Dieter Ries
Hi everybody,

Am 08.02.2013 01:36, schrieb Ian Kumlien:
 diff --git a/cmds-check.c b/cmds-check.c
 index 71e98de..8e4cce0 100644
 --- a/cmds-check.c
 +++ b/cmds-check.c

[...]

 @@ -3574,7 +3579,8 @@ int main(int ac, char **av)
  					(unsigned long long)bytenr);
  				break;
  			case '?':
 -				print_usage();
 +			case 'h':
 +				usage(cmd_check_usage);
  		}
  		if (option_index == 1) {
  			printf("enabling repair mode\n");

For this to have any effect, 'h' must be added to getopt_long(), see
attached patch 1.

However, this results in btrfsck -h and --help doing different things:

--help prints the usage message to stdout and exits with exit(0).
-h prints the usage message to stderr and exits with exit(129).

I made a patch to fix this, see attached patch 2.
What it doesn't fix, though, is that -h/--help and -? don't do the same
thing. That is more complicated, as getopt_long() returns '?' for unknown
options.

Cheers,

Dieter
From 11aabdb018aed3c5b6a1616178883fd879152856 Mon Sep 17 00:00:00 2001
From: Dieter Ries m...@dieterries.net
Date: Sun, 2 Jun 2013 17:30:09 +0200
Subject: [PATCH 1/2] Btrfs-progs: Fix 'btrfsck/btrfs check -h'

For the '-h' option to be usable, getopts_long() has to know it.

Signed-off-by: Dieter Ries m...@dieterries.net
---
 cmds-check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index 1e5e005..ff9298d 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4065,7 +4065,7 @@ int cmd_check(int argc, char **argv)
 
 	while(1) {
 		int c;
-		c = getopt_long(argc, argv, "as:", long_options,
+		c = getopt_long(argc, argv, "ahs:", long_options,
 				&option_index);
 		if (c < 0)
 			break;
-- 
1.8.1.3

From 52d9e47bfa0936a14baa48e8ad6ecdd820295809 Mon Sep 17 00:00:00 2001
From: Dieter Ries m...@dieterries.net
Date: Sun, 2 Jun 2013 17:32:15 +0200
Subject: [PATCH 2/2] Btrfs-progs: Fix '--help' to '-h' inconsistency in
 btrfsck/btrfs check

This patch fixes the following inconsistency between calling
btrfsck/btrfs check with the -h or --help options:
--help prints the usage message to stdout and exits with exit(0).
-h prints the usage message to stderr and exits with exit(129).

To achieve this, usage_command_usagestr() is made available via
commands.h.

Signed-off-by: Dieter Ries m...@dieterries.net
---
 cmds-check.c | 5 -
 commands.h   | 2 ++
 help.c   | 2 +-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index ff9298d..093c859 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4078,8 +4078,11 @@ int cmd_check(int argc, char **argv)
 					(unsigned long long)bytenr);
 				break;
 			case '?':
-			case 'h':
 				usage(cmd_check_usage);
+				break;
+			case 'h':
+				usage_command_usagestr(cmd_check_usage, "check", 1, 0);
+				exit(0);
 		}
 		if (option_index == 1) {
 			printf("enabling repair mode\n");
diff --git a/commands.h b/commands.h
index 15c616d..814452f 100644
--- a/commands.h
+++ b/commands.h
@@ -73,6 +73,8 @@ extern const char * const generic_cmd_help_usage[];
 void usage(const char * const *usagestr);
 void usage_command(const struct cmd_struct *cmd, int full, int err);
 void usage_command_group(const struct cmd_group *grp, int all, int err);
+void usage_command_usagestr(const char * const *usagestr,
+const char *token, int full, int err);
 
 void help_unknown_token(const char *arg, const struct cmd_group *grp);
 void help_ambiguous_token(const char *arg, const struct cmd_group *grp);
diff --git a/help.c b/help.c
index 6d04293..effb72e 100644
--- a/help.c
+++ b/help.c
@@ -102,7 +102,7 @@ static int usage_command_internal(const char * const *usagestr,
 	return ret;
 }
 
-static void usage_command_usagestr(const char * const *usagestr,
+void usage_command_usagestr(const char * const *usagestr,
    const char *token, int full, int err)
 {
 	FILE *outf = err ? stderr : stdout;
-- 
1.8.1.3



RAID10 total capacity incorrect

2013-06-02 Thread Tim Eggleston

Hi list,

I have a 4-device RAID10 array of 2TB drives on btrfs. It works great. I 
recently added an additional 4 drives to the array. There is only about 
2TB in use across the whole array (which should have an effective 
capacity of about 8TB). However I have noticed that when I issue btrfs 
filesystem df against the mountpoint, in the total field, I get the 
same value as the used field:


root@mckinley:/# btrfs fi df /mnt/shares/btrfsvol0
Data, RAID10: total=2.06TB, used=2.06TB
System, RAID10: total=64.00MB, used=188.00KB
System: total=4.00MB, used=0.00
Metadata, RAID10: total=3.00GB, used=2.29GB

Here's my btrfs filesystem show:

root@mckinley:/# btrfs fi show
Label: 'btrfsvol0'  uuid: 1a735971-3ad7-4046-b25b-e834a74f2fbb
Total devices 8 FS bytes used 2.06TB
devid7 size 1.82TB used 527.77GB path /dev/sdk1
devid8 size 1.82TB used 527.77GB path /dev/sdg1
devid6 size 1.82TB used 527.77GB path /dev/sdi1
devid5 size 1.82TB used 527.77GB path /dev/sde1
devid4 size 1.82TB used 527.77GB path /dev/sdj1
devid2 size 1.82TB used 527.77GB path /dev/sdf1
devid1 size 1.82TB used 527.77GB path /dev/sdh1
devid3 size 1.82TB used 527.77GB path /dev/sdc1

This is running the Ubuntu build of kernel 3.9.4 and btrfs-progs from 
git (v0.20-rc1-324-g650e656).


Am I being an idiot and missing something here? I must admit that I 
still find the df output a bit cryptic (entirely my failure to 
understand, nothing else), but on another system with only a single 
device the total field returns the capacity of the device.


Cheers!

 ---tim



Re: RAID10 total capacity incorrect

2013-06-02 Thread Hugo Mills
On Sun, Jun 02, 2013 at 05:17:11PM +0100, Tim Eggleston wrote:
 Hi list,
 
 I have a 4-device RAID10 array of 2TB drives on btrfs. It works
 great. I recently added an additional 4 drives to the array. There
 is only about 2TB in use across the whole array (which should have
 an effective capacity of about 8TB). However I have noticed that
 when I issue btrfs filesystem df against the mountpoint, in the
 total field, I get the same value as the used field:
 
 root@mckinley:/# btrfs fi df /mnt/shares/btrfsvol0
 Data, RAID10: total=2.06TB, used=2.06TB
 System, RAID10: total=64.00MB, used=188.00KB
 System: total=4.00MB, used=0.00
 Metadata, RAID10: total=3.00GB, used=2.29GB
 
 Here's my btrfs filesystem show:
 
 root@mckinley:/# btrfs fi show
 Label: 'btrfsvol0'  uuid: 1a735971-3ad7-4046-b25b-e834a74f2fbb
   Total devices 8 FS bytes used 2.06TB
   devid7 size 1.82TB used 527.77GB path /dev/sdk1
   devid8 size 1.82TB used 527.77GB path /dev/sdg1
   devid6 size 1.82TB used 527.77GB path /dev/sdi1
   devid5 size 1.82TB used 527.77GB path /dev/sde1
   devid4 size 1.82TB used 527.77GB path /dev/sdj1
   devid2 size 1.82TB used 527.77GB path /dev/sdf1
   devid1 size 1.82TB used 527.77GB path /dev/sdh1
   devid3 size 1.82TB used 527.77GB path /dev/sdc1

   You have 8*527.77 GB = 4222.16 GB of raw space allocated for all
purposes. Since RAID-10 takes twice the raw bytes to store data, that
gives you 2111.08 GB of usable space so far.

   From the df output, 2.06 TB ~= 2109.44 GB is allocated as data, and
all of that space is used. 3.00 GB is allocated as metadata, and most
of that is used. That adds up (within rounding errors) to the 2111.08
GB above.

   Additional space will be allocated from the available unallocated
space as the FS needs it.

 This is running the Ubuntu build of kernel 3.9.4 and btrfs-progs
 from git (v0.20-rc1-324-g650e656).
 
 Am I being an idiot and missing something here? I must admit that I
 still find the df output a bit cryptic (entirely my failure to
 understand, nothing else), but on another system with only a single
 device the total field returns the capacity of the device.

   That's probably already fully-allocated, so used=size in btrfs fi
show. If it's a single device, then you're probably not using any
replication, so the raw storage is equal to the possible storage.

   HTH,
   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- I can resist everything except temptation ---




Re: RAID10 total capacity incorrect

2013-06-02 Thread Tim Eggleston

Hi Hugo,

Thanks for your reply, good to know it's not an error as such (just me 
being an idiot!).



Additional space will be allocated from the available unallocated
space as the FS needs it.


So I guess my question becomes, how much of that available unallocated 
space do I have? Instinctively the btrfs df output feels like it's 
missing an equivalent to the size column from vanilla df.


Is there a method of getting this in a RAID situation? I understand that 
btrfs RAID is more complicated than md RAID, so it's ok if the answer at 
this point is no...


Thanks again,

 ---tim



Re: RAID10 total capacity incorrect

2013-06-02 Thread Chris Murphy

On Jun 2, 2013, at 12:17 PM, Tim Eggleston li...@timeggleston.co.uk wrote:
 
 root@mckinley:/# btrfs fi df /mnt/shares/btrfsvol0
 Data, RAID10: total=2.06TB, used=2.06TB
 System, RAID10: total=64.00MB, used=188.00KB
 System: total=4.00MB, used=0.00
 Metadata, RAID10: total=3.00GB, used=2.29GB
 
 
 Am I being an idiot and missing something here? 

No, it's confusing. btrfs fi df doesn't show free space. The first value is 
what space the fs has allocated for the data usage type, and the 2nd value is 
how much of that allocation is actually being used. I personally think the 
allocated value is useless for mortal users. I'd rather have some idea of what 
free space I have left, and the regular df command presents this in an annoying 
way also because it shows the total volume size, not accounting for the double 
consumption of raid1. So no matter how you slice it, it's confusing.


Chris Murphy


Re: RAID10 total capacity incorrect

2013-06-02 Thread Hugo Mills
On Sun, Jun 02, 2013 at 05:52:38PM +0100, Tim Eggleston wrote:
 Hi Hugo,
 
 Thanks for your reply, good to know it's not an error as such (just
 me being an idiot!).
 
 Additional space will be allocated from the available unallocated
 space as the FS needs it.
 
 So I guess my question becomes, how much of that available
 unallocated space do I have? Instinctively the btrfs df output feels
 like it's missing an equivalent to the size column from vanilla
 df.

   Look at btrfs fi show -- you have size and used there, so the
difference there will give you the unallocated space.
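
   As a worked example from your fi show output above (assuming the usual
binary units, and rounding): each device is 1.82 TB with 527.77 GB used,
so about 1336 GB per device is unallocated, which is roughly 10.4 TB of
raw unallocated space across all eight devices. At RAID-10's two copies,
that's about 5.2 TB of usable space still to come.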

 Is there a method of getting this in a RAID situation? I understand
 that btrfs RAID is more complicated than md RAID, so it's ok if the
 answer at this point is no...

   Not in any obvious (and non-surprising) way. Basically, any way you
could work it out is going to give someone a surprise because they
were thinking of it some other way around. The problem is that until
the space is allocated, the FS can't know how that space needs to be
allocated (to data/metadata, or with what replication type and hence
overheads), so we can't necessarily give a reliable estimate.
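
   (As a rough illustration: 100 GB of spare raw space could end up holding
about 100 GB of single data, about 50 GB of RAID-1/RAID-10 data, or some
mix of data and metadata with different overheads again, and the FS can't
know which until the chunks are actually allocated.)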

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
 --- If you're not part of the solution, you're part --- 
   of the precipitate.




Re: RAID10 total capacity incorrect

2013-06-02 Thread Hugo Mills
On Sun, Jun 02, 2013 at 12:52:40PM -0400, Chris Murphy wrote:
 
 On Jun 2, 2013, at 12:17 PM, Tim Eggleston li...@timeggleston.co.uk wrote:
  
  root@mckinley:/# btrfs fi df /mnt/shares/btrfsvol0
  Data, RAID10: total=2.06TB, used=2.06TB
  System, RAID10: total=64.00MB, used=188.00KB
  System: total=4.00MB, used=0.00
  Metadata, RAID10: total=3.00GB, used=2.29GB
  
  
  Am I being an idiot and missing something here? 

 No, it's confusing. btrfs fi df doesn't show free space. The first
 value is what space the fs has allocated for the data usage type,
 and the 2nd value is how much of that allocation is actually being
 used. I personally think the allocated value is useless for mortal
 users. I'd rather have some idea of what free space I have left, and
 the regular df command presents this in an annoying way also because
 it shows the total volume size, not accounting for the double
 consumption of raid1. So no matter how you slice it, it's confusing.

   It's the nature of the beast, unfortunately. So far, nobody's
managed to come up with a simple method of showing free space and
space usage that isn't going to be misleading somehow.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
 --- If you're not part of the solution, you're part --- 
   of the precipitate.




Re: Is there a way to flag specific directories nodatacow?

2013-06-02 Thread Liu Bo
On Sun, Jun 02, 2013 at 07:40:52AM -0700, George Mitchell wrote:
 I am seeing massive journal corruptions that seem to be unique to
 btrfs and I am suspecting that cow might be causing them.  My
 bandaid fix for this will be to mark the /var filesystem nodatacow
 at boot.  But I am wondering if their is any way to flag a
 particular directory as nodatacow outside of the mount process.  I
 would like to be able to mark /var/log/journal as nodatacow for
 example, without having to declare it a subvolume and mount it
 separately.

Hi George,

We actually have per-file/directory nodatacow :)

But please note that if you set nodatacow on a particular directory, only
newly created or zero-size files in that directory will follow the nocow rule.

'chattr' in the latest e2fsprogs can fit your requirements,
# chattr +C /var/log/journal
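
To confirm it took effect (just a sketch), the 'C' attribute should then
show up on the directory itself:
# lsattr -d /var/log/journal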

Also, what kind of massive journal corruptions?  Does it look like a
btrfs specific bug?

thanks,
liubo


csum failed during rebalance

2013-06-02 Thread John Haller
Hi,

I added a new drive to an existing RAID 0 array. Every
attempt to rebalance the array fails:
# btrfs filesystem balance /share/bd8
ERROR: error during balancing '/share/bd8' - Input/output error
# dmesg | tail
btrfs: found 1 extents
btrfs: relocating block group 10752513540096 flags 1
btrfs: found 5 extents
btrfs: found 5 extents
btrfs: relocating block group 10751439798272 flags 1
btrfs: found 1 extents
btrfs: found 1 extents
btrfs: relocating block group 10048138903552 flags 1
btrfs csum failed ino 365 off 221745152 csum 3391451932 private 3121065028
btrfs csum failed ino 365 off 221745152 csum 3391451932 private 3121065028

An earlier rebalance attempt had the same csum error on a different inode:
btrfs csum failed ino 312 off 221745152 csum 3391451932 private 3121065028
btrfs csum failed ino 312 off 221745152 csum 3391451932 private 3121065028

Every rebalance attempt fails the same way, but with a different inum.

Here is the array:
# btrfs filesystem show
Label: 'bd8'  uuid: b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
Total devices 4 FS bytes used 7.37TB
devid4 size 3.64TB used 52.00GB path /dev/sde
devid1 size 3.64TB used 3.32TB path /dev/sdf1
devid3 size 3.64TB used 2.92TB path /dev/sdc
devid2 size 3.64TB used 2.97TB path /dev/sdb

While I didn't finish the scrub, no errors were found:
# btrfs scrub status -d /share/bd8
scrub status for b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
scrub device /dev/sdf1 (id 1) status
scrub resumed at Sun Jun  2 20:29:06 2013, running for 10360 seconds
total bytes scrubbed: 845.53GB with 0 errors
scrub device /dev/sdb (id 2) status
scrub resumed at Sun Jun  2 20:29:06 2013, running for 10360 seconds
total bytes scrubbed: 869.38GB with 0 errors
scrub device /dev/sdc (id 3) status
scrub resumed at Sun Jun  2 20:29:06 2013, running for 10360 seconds
total bytes scrubbed: 706.04GB with 0 errors
scrub device /dev/sde (id 4) history
scrub started at Sun Jun  2 12:48:36 2013 and finished after 0 seconds
total bytes scrubbed: 0.00 with 0 errors

Mount options:
/dev/sdf1 on /share/bd8 type btrfs (rw,flushoncommit)

Kernel 3.9.4

John


Re: Is there a way to flag specific directories nodatacow?

2013-06-02 Thread George Mitchell

On 06/02/2013 06:28 PM, Liu Bo wrote:

On Sun, Jun 02, 2013 at 07:40:52AM -0700, George Mitchell wrote:

I am seeing massive journal corruptions that seem to be unique to
btrfs and I am suspecting that cow might be causing them.  My
bandaid fix for this will be to mark the /var filesystem nodatacow
at boot.  But I am wondering if there is any way to flag a
particular directory as nodatacow outside of the mount process.  I
would like to be able to mark /var/log/journal as nodatacow for
example, without having to declare it a subvolume and mount it
separately.

Hi George,

We actually have per-file/directory nodatacow :)

But please note that if you set nodatacow on a particular directory, only
newly created or zero-size files in that directory will follow the nocow rule.

'chattr' in the latest e2fsprogs can fit your requirements,
# chattr +C /var/log/journal

Also, what kind of massive journal corruptions?  Does it look like a
btrfs specific bug?

thanks,
liubo



Thanks Liu,

That helps a lot! I am very familiar with chattr/lsattr from my ext3 
days, but didn't know where to look for btrfs options. From what you are 
telling me the nodatacow option is identical to the nodatacow option for 
ext3. Do the other ext3 options work for btrfs also?


As far as the corruption issue goes, I actually don't know whether the 
corruptions are real or whether they are being caused by the way the 
`journalctl --verify` command is interfacing with the filesystem. My 
suspicion is that metadata fragmentation *might* be somehow messing with 
`journalctl --verify`, since I can simply run `journalctl` and all 
the data flows out without error. I just cleaned out the 
/var/log/journal directory and started fresh and in no time I am seeing 
corruptions according to `journalctl --verify`. Here is what the output 
looks like:


==

[root@localhost aide]# journalctl --verify
Invalid object contents at 
130624 
0%
File corruption detected at 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0628-0004de2c1807989c.journal:130624 
(of 131072, 99%).
FAIL: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0628-0004de2c1807989c.journal 
(Bad message)
PASS: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/user-501@e1447322cf904d028439c2d3f17d032e-065a-0004de2c18d6d96d.journal
Invalid object contents at 
125264 
0%
File corruption detected at 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-069a-0004de2c5e323847.journal:125264 
(of 131072, 95%).
FAIL: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-069a-0004de2c5e323847.journal 
(Bad message)
PASS: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/user-501@e1447322cf904d028439c2d3f17d032e-06a8-0004de2c73b5f19d.journal
Invalid object contents at 
128408 
0%
File corruption detected at 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0709-0004de2cedab583c.journal:128408 
(of 131072, 97%).
FAIL: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0709-0004de2cedab583c.journal 
(Bad message)
Invalid object contents at 
126736 
0%
File corruption detected at 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-077f-0004de2d20abe261.journal:126736 
(of 131072, 96%).
FAIL: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-077f-0004de2d20abe261.journal 
(Bad message)
Invalid object contents at 
129600 
0%
File corruption detected at 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-07ec-0004de2d7c50c186.journal:129600 
(of 131072, 98%).
FAIL: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-07ec-0004de2d7c50c186.journal 
(Bad message)
PASS: 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/user-501@e1447322cf904d028439c2d3f17d032e-07f1-0004de2d87392b08.journal
Invalid object contents at 
129256 
0%
File corruption detected at 
/var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0862-0004de2e9a6decf4.journal:129256 
(of 131072, 98%).
FAIL: 

Re: Is there a way to flag specific directories nodatacow?

2013-06-02 Thread George Mitchell

On 06/02/2013 06:28 PM, Liu Bo wrote:

On Sun, Jun 02, 2013 at 07:40:52AM -0700, George Mitchell wrote:

I am seeing massive journal corruptions that seem to be unique to
btrfs and I am suspecting that cow might be causing them.  My
bandaid fix for this will be to mark the /var filesystem nodatacow
at boot.  But I am wondering if there is any way to flag a
particular directory as nodatacow outside of the mount process.  I
would like to be able to mark /var/log/journal as nodatacow for
example, without having to declare it a subvolume and mount it
separately.

Hi George,

We actually have per-file/directory nodatacow :)

But please note that if you set nodatacow on a particular directory, only
newly created or zero-size files in that directory will follow the nocow rule.

'chattr' in the latest e2fsprogs can fit your requirements,
# chattr +C /var/log/journal

Also, what kind of massive journal corruptions?  Does it look like a
btrfs specific bug?

thanks,
liubo


I am also assuming that all directories later created under 
/var/log/journal will inherit the nodatacow profile?



Re: Is there a way to flag specific directories nodatacow?

2013-06-02 Thread Liu Bo
On Sun, Jun 02, 2013 at 07:19:50PM -0700, George Mitchell wrote:
 On 06/02/2013 06:28 PM, Liu Bo wrote:
 On Sun, Jun 02, 2013 at 07:40:52AM -0700, George Mitchell wrote:
 I am seeing massive journal corruptions that seem to be unique to
 btrfs and I am suspecting that cow might be causing them.  My
 bandaid fix for this will be to mark the /var filesystem nodatacow
 at boot.  But I am wondering if there is any way to flag a
 particular directory as nodatacow outside of the mount process.  I
 would like to be able to mark /var/log/journal as nodatacow for
 example, without having to declare it a subvolume and mount it
 separately.
 Hi George,
 
 We actually have per-file/directory nodatacow :)
 
 But please note that if you set nodatacow on a particular directory, only
 newly created or zero-size files in that directory will follow the nocow rule.
 
 'chattr' in the latest e2fsprogs can fit your requirements,
 # chattr +C /var/log/journal
 
 Also, what kind of massive journal corruptions?  Does it look like a
 btrfs specific bug?
 
 thanks,
 liubo
 
 
 I am also assuming that all directories later created under
 /var/log/journal will inherit the nodatacow profile?

Yes, indeed.
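
A quick way to check the inheritance (a sketch; 'testfile' is just a
made-up name):
# touch /var/log/journal/testfile
# lsattr /var/log/journal/testfile
The 'C' attribute should show up for the newly created file.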


Re: Is there a way to flag specific directories nodatacow?

2013-06-02 Thread Liu Bo
On Sun, Jun 02, 2013 at 07:11:10PM -0700, George Mitchell wrote:
 On 06/02/2013 06:28 PM, Liu Bo wrote:
 On Sun, Jun 02, 2013 at 07:40:52AM -0700, George Mitchell wrote:
 I am seeing massive journal corruptions that seem to be unique to
 btrfs and I am suspecting that cow might be causing them.  My
 bandaid fix for this will be to mark the /var filesystem nodatacow
 at boot.  But I am wondering if there is any way to flag a
 particular directory as nodatacow outside of the mount process.  I
 would like to be able to mark /var/log/journal as nodatacow for
 example, without having to declare it a subvolume and mount it
 separately.
 Hi George,
 
 We actually have per-file/directory nodatacow :)
 
 But please note that if you set nodatacow on a particular directory, only
 newly created or zero-size files in that directory will follow the nocow rule.
 
 'chattr' in the latest e2fsprogs can fit your requirements,
 # chattr +C /var/log/journal
 
 Also, what kind of massive journal corruptions?  Does it look like a
 btrfs specific bug?
 
 thanks,
 liubo
 
 
 Thanks Liu,
 
 That helps a lot! I am very familiar with chattr/lsattr from my ext3
 days, but didn't know where to look for btrfs options. From what you
 are telling me the nodatacow option is identical to the nodatacow option
 for ext3. Do the other ext3 options work for btrfs also?

Besides nodatacow, compression is also supported on a per-file/directory
basis.
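
As a sketch, it is set the same way with chattr, e.g.

# chattr +c /path/to/dir

and newly created files under that directory are then compressed.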

 
 As far as the corruption issue goes, I actually don't know whether the
 corruptions are real or whether they are being caused by the way the
 `journalctl --verify` command is interfacing with the filesystem. My
 suspicion is that metadata fragmentation *might* be somehow messing
 with `journalctl --verify`, since I can simply run `journalctl`
 and all the data flows out without error. I just cleaned out the
 /var/log/journal directory and started fresh and in no time I am
 seeing corruptions according to `journalctl --verify`. Here is what
 the output looks like:

That's weird, AFAIK it shouldn't be.

Does 'dmesg' also complain when these corruptions from 'journalctl --verify'
occur?  (well, I'm expecting some csum errors, maybe...)

 
 ==
 
 [root@localhost aide]# journalctl --verify
 Invalid object contents at 
 130624
 0%
 File corruption detected at 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0628-0004de2c1807989c.journal:130624
 (of 131072, 99%).
 FAIL: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0628-0004de2c1807989c.journal
 (Bad message)
 PASS: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/user-501@e1447322cf904d028439c2d3f17d032e-065a-0004de2c18d6d96d.journal
 Invalid object contents at 
 125264
 0%
 File corruption detected at 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-069a-0004de2c5e323847.journal:125264
 (of 131072, 95%).
 FAIL: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-069a-0004de2c5e323847.journal
 (Bad message)
 PASS: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/user-501@e1447322cf904d028439c2d3f17d032e-06a8-0004de2c73b5f19d.journal
 Invalid object contents at 
 128408
 0%
 File corruption detected at 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0709-0004de2cedab583c.journal:128408
 (of 131072, 97%).
 FAIL: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-0709-0004de2cedab583c.journal
 (Bad message)
 Invalid object contents at 
 126736
 0%
 File corruption detected at 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-077f-0004de2d20abe261.journal:126736
 (of 131072, 96%).
 FAIL: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-077f-0004de2d20abe261.journal
 (Bad message)
 Invalid object contents at 
 129600
 0%
 File corruption detected at 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-07ec-0004de2d7c50c186.journal:129600
 (of 131072, 98%).
 FAIL: 
 /var/log/journal/8846d97f611b49aa9f3d48eeac6a81f2/system@e1447322cf904d028439c2d3f17d032e-07ec-0004de2d7c50c186.journal
 (Bad message)
 PASS: 
 

Re: Is there a way to flag specific directories nodatacow?

2013-06-02 Thread A. C. Censi
On Sun, Jun 2, 2013 at 11:11 PM, George Mitchell geo...@chinilu.com wrote:

 So I want to try forcing nodatacow on this directory and see what happens.
 If that doesn't work, I suppose the next step will be to place this one
 directory on an ext4 filesystem and mount it externally to the btrfs
 /var/log.


I have the same kind of errors on an ext4 file system (Arch Linux 64-bit
on a MacBook Air). To me they seem to be related to power loss events
caused by battery depletion when sleeping for a long time.

Anyway, besides the long initialization time of the journalctl display, there
is no error in dmesg or in the /var/log files. The errors are probably
related to the journal's log metadata.


--
A. C. Censi
accensi [em] gmail [ponto] com
accensi [em] montreal [ponto] com [ponto] br


Re: RAID10 total capacity incorrect

2013-06-02 Thread Duncan
Hugo Mills posted on Sun, 02 Jun 2013 18:43:59 +0100 as excerpted:

 On Sun, Jun 02, 2013 at 12:52:40PM -0400, Chris Murphy wrote:
 
 [I]t's confusing. btrfs fi df doesn't show free space. The first
 value is what space the fs has allocated for the data usage type,
 and the 2nd value is how much of that allocation is actually being
 used. I personally think the allocated value is useless for mortal
 users. I'd rather have some idea of what free space I have left, and
 the regular df command presents this in an annoying way also because it
 shows the total volume size, not accounting for the double consumption
 of raid1. So no matter how you slice it, it's confusing.
 
 It's the nature of the beast, unfortunately. So far, nobody's
 managed to come up with a simple method of showing free space and space
 usage that isn't going to be misleading somehow.

btrfs.wiki.kernel.org covers this topic as well as I guess it's possible 
to be covered at this point, in the FAQ.  I definitely recommend reading 
the user documentation section there, to any btrfs or potential btrfs 
user who hasn't done so already, as it really does cover a lot of 
questions, tho certainly not all (as my posting history here, after 
reading it, demonstrates).

Home page (easiest to remember):

https://btrfs.wiki.kernel.org

Direct link to the documentation section on that page (perhaps more 
useful as a bookmark):

https://btrfs.wiki.kernel.org/index.php/Main_Page#Documentation

The FAQ:

https://btrfs.wiki.kernel.org/index.php/FAQ

Direct link to FAQ section 4.4, which starts the questions that deal with 
space (4.4-4.9):

https://btrfs.wiki.kernel.org/index.php/FAQ#Why_does_df_show_incorrect_free_space_for_my_RAID_volume.3F

In addition, people using multiple devices should read the sysadmin guide 
and multiple devices pages (which can be found under the docs link 
above), tho they don't really cover space questions.  (But the raid-10 
diagram in the sysadmin guide may be helpful in visualizing what's going 
on.)

In particular, see the "Why is free space so complicated?" question/
answer, which explains the why of Hugo's answer -- I don't believe it's 
yet implemented, but the plan is to allow different subvolumes, which can 
be created at any time, to have different raid levels.  Between differing 
data and metadata levels and differing subvolume levels, in the general 
case there's simply no reasonable way to reliably report on the 
unallocated space, since there's no way to know which raid level it'll be 
allocated as, until it actually happens.

Of course the answer in limited specific cases can be known.  Here, I'm 
just deploying multiple btrfs filesystems across two SSD devices, 
generally raid1[1] for both data/metadata, with no intention of having 
differing level subvolumes, so I can simply run regular df and divide the 
results in half in my head.  btrfs filesystem df gives me different, much 
more technical information, so it's useful, but not as simply useful as 
regular df, halving the numbers in my head.
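
(A sketch with made-up numbers: if plain df reports a 4.0 TB filesystem 
with 1.0 TB used on two-copy raid1, I read that as roughly 2.0 TB of 
usable capacity with about 0.5 TB of data on it.)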

Tim (the OP)'s case is similarly knowable since he's raid10 both data/
metadata across originally four, now eight, similarly sized 2TB devices 
(unlike me, he's apparently using the same btrfs across the entire 
physical device, all now eight devices), assuming he never chooses 
anything other than raid10 data/metadata for subvolumes, and sticks with 
two-mirror-copy raid10 once N-way mirroring becomes possible.

btrfs raid10, like its raid1, is limited to two mirror-copies, so with 
eight similarly-sized devices and the caveat that he has already 
rebalanced across all eight devices since doubling from four, he's running 
raid10 with 4-way striping and two-way mirroring.

I'd guess normal df (not btrfs filesystem df) and doing the math in his 
head will be the simplest for him, as it is for me.

But it's worth noting that normal df with math in your head isn't 
/always/ going to be the answer, as things start getting rather more 
complex as soon as different sized devices get thrown into the mix, or 
raid1/10 on an /odd/ number of devices (tho there the math simply gets a 
bit more complex since it's no longer integers), let alone the case of 
differing data/metadata allocation mode, without even considering the 
case of subvolumes having different modes, since I don't think that's 
implemented yet.

But in the simple cases of data/metadata of the same raid level on either 
just one or an even number of devices, regular df, doing the math in your 
head, should be the simplest and most direct answer.  As I said, btrfs 
filesystem df and btrfs filesystem show are useful, but for more 
technical purposes or in the complex cases where there's no easy way to 
just do the math on normal df.

---
[1] My single exception is a separate tiny /boot, one to each device, --
mixed data/metadata DUP mode, as they're a quarter gig each.  I went 
separate here and separately installed grub2 to each device as well, so I 
can independently boot from