Re: [PATCH v2 2/2] btrfs-progs: limit the min value of total_bytes

2012-10-08 Thread Robin Dong
2012/10/1 David Sterba :
> On Fri, Sep 28, 2012 at 11:02:40AM +0800, Robin Dong wrote:
>> From: Robin Dong 
>>
>> Using mkfs.btrfs like:
>>
>> mkfs.btrfs -b 1048576 /dev/sda
>>
>> will report error:
>>
>>   mkfs.btrfs: volumes.c:796: btrfs_alloc_chunk: Assertion `!(ret)' failed.
>>   Aborted
>>
>> because the length of dev_extent is 4MB.
>>
>> But if we use mkfs.btrfs with 8MB total bytes, the newly mounted btrfs
>> filesystem would not contain even one empty file. So 12MB will be good
>> min-value for block_count.
>
> I'm not able to create a single file even on a 12MB filesystem
> (with -d single -m single --mixed), so any limit that would let the mkfs
> finish normally should be fine. For the single/single case it's 5MB but
> for the dup/dup it's 156MB. It's due to the known bug in the blockgroup
> creation with multiple devices (applies on dup as well here) that leads
> to:
>
> # btrfs fi df .
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Data+Metadata, DUP: total=64.00MB, used=24.00KB
> Data+Metadata: total=8.00MB, used=0.00
>
> 8*2 + 4 + 64*2 + 8 = 156
>
> so, 12M is too small to avoid the mkfs crash.

Thanks for your notice!
>
> david



-- 
--
Best Regard
Robin Dong
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/2] btrfs-progs: limit the max value of leafsize and nodesize

2012-10-08 Thread Robin Dong
From: Robin Dong 

Using mkfs.btrfs like:

mkfs.btrfs -l 131072 /dev/sda

returns no error, but after mounting the filesystem, dmesg reports:

BTRFS: couldn't mount because metadata blocksize (131072) was too large

The leafsize and nodesize are equal at present, so we just use one function
"check_leaf_or_node_size" to limit leaf and node size below 
BTRFS_MAX_METADATA_BLOCKSIZE.

Signed-off-by: Robin Dong 
Reviewed-by: David Sterba 
---
 ctree.h |6 ++
 mkfs.c  |   29 +++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/ctree.h b/ctree.h
index 7f55229..75c1e0a 100644
--- a/ctree.h
+++ b/ctree.h
@@ -111,6 +111,12 @@ struct btrfs_trans_handle;
 #define BTRFS_DEV_ITEMS_OBJECTID 1ULL
 
 /*
+ * the max metadata block size.  This limit is somewhat artificial,
+ * but the memmove costs go through the roof for larger blocks.
+ */
+#define BTRFS_MAX_METADATA_BLOCKSIZE 65536
+
+/*
  * we can actually store much bigger names, but lets not confuse the rest
  * of linux
  */
diff --git a/mkfs.c b/mkfs.c
index dff5eb8..93672b9 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1201,6 +1201,27 @@ static int zero_output_file(int out_fd, u64 size, u32 sectorsize)
return ret;
 }
 
+static int check_leaf_or_node_size(u32 size, u32 sectorsize)
+{
+	if (size < sectorsize) {
+		fprintf(stderr,
+			"Illegal leafsize (or nodesize) %u (smaller than sectorsize %u)\n",
+			size, sectorsize);
+		return -1;
+	} else if (size > BTRFS_MAX_METADATA_BLOCKSIZE) {
+		fprintf(stderr,
+			"Illegal leafsize (or nodesize) %u (larger than %u)\n",
+			size, BTRFS_MAX_METADATA_BLOCKSIZE);
+		return -1;
+	} else if (size & (sectorsize - 1)) {
+		fprintf(stderr,
+			"Illegal leafsize (or nodesize) %u (not aligned to %u)\n",
+			size, sectorsize);
+		return -1;
+	}
+	return 0;
+}
+
 int main(int ac, char **av)
 {
char *file;
@@ -1291,14 +1312,10 @@ int main(int ac, char **av)
}
}
sectorsize = max(sectorsize, (u32)getpagesize());
-   if (leafsize < sectorsize || (leafsize & (sectorsize - 1))) {
-   fprintf(stderr, "Illegal leafsize %u\n", leafsize);
+   if (check_leaf_or_node_size(leafsize, sectorsize))
exit(1);
-   }
-   if (nodesize < sectorsize || (nodesize & (sectorsize - 1))) {
-   fprintf(stderr, "Illegal nodesize %u\n", nodesize);
+   if (check_leaf_or_node_size(nodesize, sectorsize))
exit(1);
-   }
ac = ac - optind;
if (ac == 0)
print_usage();
-- 
1.7.3.2
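
For readers who want to try the new check outside of mkfs, here is a minimal standalone sketch of the same three rules: not smaller than the sector size, not larger than 64KB, and aligned to the sector size. The enum values and the name classify_metadata_size are made up for this illustration; only the limit and the comparisons come from the patch. The mask test assumes the sector size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as BTRFS_MAX_METADATA_BLOCKSIZE in the patch. */
#define MAX_METADATA_BLOCKSIZE 65536u

/* Hypothetical result codes, used only in this sketch. */
enum size_check {
	SIZE_OK = 0,
	SIZE_TOO_SMALL,
	SIZE_TOO_LARGE,
	SIZE_UNALIGNED,
};

/* Apply the checks in the same order as check_leaf_or_node_size(). */
enum size_check classify_metadata_size(uint32_t size, uint32_t sectorsize)
{
	/* The mask test below is only valid for a power-of-two sector size. */
	assert(sectorsize != 0 && (sectorsize & (sectorsize - 1)) == 0);

	if (size < sectorsize)
		return SIZE_TOO_SMALL;
	if (size > MAX_METADATA_BLOCKSIZE)
		return SIZE_TOO_LARGE;
	if (size & (sectorsize - 1))
		return SIZE_UNALIGNED;
	return SIZE_OK;
}
```

With a 4096-byte sector size, classify_metadata_size(131072, 4096) yields SIZE_TOO_LARGE, which is the "-l 131072" case from the changelog.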



[PATCH v3 2/2] btrfs-progs: limit the min value of total_bytes

2012-10-08 Thread Robin Dong
From: Robin Dong 

Using mkfs.btrfs like:

mkfs.btrfs -b 1048576 /dev/sda

will report error:

mkfs.btrfs: volumes.c:796: btrfs_alloc_chunk: Assertion `!(ret)' failed.
Aborted

because the length of dev_extent is 4MB.

For the single/single case it's 5MB but
for the dup/dup it's 156MB. It's due to the known bug in the blockgroup
creation with multiple devices (applies on dup as well here) that leads to:

# btrfs fi df .
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Data+Metadata, DUP: total=64.00MB, used=24.00KB
Data+Metadata: total=8.00MB, used=0.00

8*2 + 4 + 64*2 + 8 = 156

Signed-off-by: Robin Dong 
Reviewed-by: David Sterba 
---
 mkfs.c  |7 ++-
 utils.h |1 +
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index 93672b9..982a113 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1345,7 +1345,12 @@ int main(int ac, char **av)
&dev_block_count, &mixed, nodiscard);
if (block_count == 0)
block_count = dev_block_count;
-   else if (block_count > dev_block_count) {
+   else if (block_count < BTRFS_MKFS_SYSTEM_MAX_SIZE) {
+   fprintf(stderr, "Illegal total number of bytes %llu "
+   "(smaller than %u)\n",
+   (unsigned long long)block_count, BTRFS_MKFS_SYSTEM_MAX_SIZE);
+   exit(1);
+   } else if (block_count > dev_block_count) {
fprintf(stderr, "%s is smaller than requested size\n", file);
exit(1);
}
diff --git a/utils.h b/utils.h
index c147c12..b63f69a 100644
--- a/utils.h
+++ b/utils.h
@@ -20,6 +20,7 @@
 #define __UTILS__
 
 #define BTRFS_MKFS_SYSTEM_GROUP_SIZE (4 * 1024 * 1024)
+#define BTRFS_MKFS_SYSTEM_MAX_SIZE   (156 * 1024 * 1024)
 
 int make_btrfs(int fd, const char *device, const char *label,
   u64 blocks[6], u64 num_bytes, u32 nodesize,
-- 
1.7.3.2
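
The 156MB figure can be reproduced from the "btrfs fi df" output quoted in the changelog. The following is a small sketch of that arithmetic and of the lower-bound test. The helper names are invented here, DUP chunks are assumed to occupy twice their reported size on the device, and the requested size is carried as a 64-bit value and formatted with PRIu64.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MIB (1024ULL * 1024ULL)

/* Chunk sizes taken from the "btrfs fi df" output in the changelog. */
uint64_t min_dup_fs_bytes(void)
{
	uint64_t system_dup = 8 * MIB;     /* System, DUP        */
	uint64_t system_single = 4 * MIB;  /* System             */
	uint64_t mixed_dup = 64 * MIB;     /* Data+Metadata, DUP */
	uint64_t mixed_single = 8 * MIB;   /* Data+Metadata      */

	/* Assumption: a DUP chunk is stored twice on the device. */
	return 2 * system_dup + system_single + 2 * mixed_dup + mixed_single;
}

/*
 * Hypothetical helper: return 0 if the requested size is acceptable,
 * otherwise write a message into msg and return -1.
 */
int check_total_bytes(uint64_t requested, char *msg, size_t msg_len)
{
	uint64_t min_bytes = min_dup_fs_bytes();

	assert(msg != NULL && msg_len > 0);
	msg[0] = '\0';
	if (requested < min_bytes) {
		snprintf(msg, msg_len,
			 "Illegal total number of bytes %" PRIu64
			 " (smaller than %" PRIu64 ")",
			 requested, min_bytes);
		return -1;
	}
	return 0;
}
```

Passing 1048576, the "-b 1048576" value from the changelog, fails the check, while exactly 156 * 1024 * 1024 bytes passes.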



Re: btrfs send/receive review by vfs folks

2012-10-08 Thread Jan Schmidt
Hi Alex,

On Thu, October 04, 2012 at 17:59 (+0200), Alex Lyakas wrote:
> as I promised, here is some code for you to look at.

And quite a lot of it. I hadn't thought of such a big change when I wrote
"preferably in form of a patch".

As a side note, your patch doesn't follow the general kernel coding style (but
read on before you rework this one).

> First I will describe the approach in general.
>
> # Get rid of the pipe. Instead, user-space passes a buffer and kernel
> fills the specified user-space buffer with commands.
> # When the buffer is full, kernel stops generating commands and
> returns a checkpoint to the user-space.
> # User-space does whatever it wants with the returned buffer, and then
> calls the kernel again, with a buffer and a checkpoint that was
> returned by the kernel from previous SEND ioctl().
> # Kernel re-arms itself to the specified checkpoint, and fills the
> specified buffer with commands, attaches a new checkpoint and so on.
> # Eventually kernel signals to the user that there are no more commands.

We had that in the very beginning of btrfs send. Having only a single ioctl
saves a whole lot of system calls.

> I realize this is a big change, and a new IOCTL has to be introduced
> in order not to break current user-kernel protocol.
> The pros as I see them:
> # One data-copy is avoided (no pipe). For WRITE commands two
> data-copies are avoided (no read_buf needed)

I'm not sure I understand those correctly. If you're talking about the user mode
part, we could simply pass stdout to the kernel, saving the unnecessary pipe and
copy operations in between without introducing a new buffer.

> # ERESTARTSYS issue disappears. If needed, ioctl is restarted, but
> there is no problem with that, it will simply refill the buffer from
> the same checkpoint.

This is the subject of this thread and the thing I'd like to focus on currently.

> Cons:
> # Instead of one ioctl(), many ioctls() are issued to finish the send.
> # Big code change

Two big cons. I'd like to quote Alexander's suggestions again:

On Wed, August 01, 2012 at 14:09 (+0200), Alexander Block wrote:
> I have two possible solutions in my mind.
> 1. Store some kind of state in the ioctl arguments so that we can
> continue where we stopped when the ioctl reenters. This would however
> complicate the code a lot.
> 2. Spawn a thread when the ioctl is called and leave the ioctl
> immediately. I don't know if ERESTARTSYS can happen in vfs_xxx calls
> if they happen from a non syscall thread.

What do you think about those two?

I like the first suggestion. Combining single-ioctl with signal handling
capabilities feels like the right choice. When we get ERESTARTSYS, we know
exactly how many bytes made it to user mode. To reach a comfortable state for a
restart, we can store part of the stream together with the meta information in
our internal state before returning to user mode. The ioctl will be restarted
sooner or later and our internal state tells us where to proceed.

Thanks,
-Jan
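
To make the "store state and continue where we stopped" idea concrete, here is a minimal user-space sketch; it is not kernel code. It models the send stream as a fixed byte sequence and the interrupted ioctl as a call that may deliver only part of it. The names struct send_state, send_some and the budget parameter are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical resume state: how much of the stream was already delivered. */
struct send_state {
	size_t delivered;
};

/*
 * Deliver up to `budget` bytes of `stream` into `out`, starting where the
 * previous call stopped. A small budget stands in for a call that was
 * interrupted by a signal. Returns the number of bytes delivered by this
 * call; 0 means the stream is finished (or the budget was 0).
 */
size_t send_some(struct send_state *st, const char *stream, size_t stream_len,
		 char *out, size_t budget)
{
	size_t left;
	size_t n;

	assert(st != NULL && st->delivered <= stream_len);
	left = stream_len - st->delivered;
	n = left < budget ? left : budget;
	memcpy(out, stream + st->delivered, n);
	st->delivered += n;	/* record progress before returning */
	return n;
}
```

Calling send_some repeatedly with the same send_state yields the same bytes as one uninterrupted call would, which is the property a restarted ioctl needs.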


Re: [PATCH v3 1/2] btrfs-progs: limit the max value of leafsize and nodesize

2012-10-08 Thread David Sterba
Hi,

please let us know what changed from v2 -> v3. The usual place for such
comments is pointed out below:

On Mon, Oct 08, 2012 at 04:10:40PM +0800, Robin Dong wrote:
[changelog ...]
> Signed-off-by: Robin Dong 
> Reviewed-by: David Sterba 
> ---

here

>  ctree.h |6 ++
>  mkfs.c  |   29 +++--
>  2 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/ctree.h b/ctree.h
> index 7f55229..75c1e0a 100644
[diff ...]

thanks,
david


Re: btrfs send/receive review by vfs folks

2012-10-08 Thread Alex Lyakas
Hi Jan,
thanks for taking time to look at the code.


On Mon, Oct 8, 2012 at 11:26 AM, Jan Schmidt  wrote:
> Hi Alex,
>
> On Thu, October 04, 2012 at 17:59 (+0200), Alex Lyakas wrote:
>> as I promised, here is some code for you to look at.
>
> And quite a lot of it. I hadn't thought of such a big change when I wrote
> "preferably in form of a patch".
>
> As a side note, your patch doesn't follow the general kernel coding style (but
> read on before you rework this one).
>
>> First I will describe the approach in general.
>>
>> # Get rid of the pipe. Instead, user-space passes a buffer and kernel
>> fills the specified user-space buffer with commands.
>> # When the buffer is full, kernel stops generating commands and
>> returns a checkpoint to the user-space.
>> # User-space does whatever it wants with the returned buffer, and then
>> calls the kernel again, with a buffer and a checkpoint that was
>> returned by the kernel from previous SEND ioctl().
>> # Kernel re-arms itself to the specified checkpoint, and fills the
>> specified buffer with commands, attaches a new checkpoint and so on.
>> # Eventually kernel signals to the user that there are no more commands.
>
> We had that in the very beginning of btrfs send. Having only a single ioctl
> saves a whole lot of system calls.
>
>> I realize this is a big change, and a new IOCTL has to be introduced
>> in order not to break current user-kernel protocol.
>> The pros as I see them:
>> # One data-copy is avoided (no pipe). For WRITE commands two
>> data-copies are avoided (no read_buf needed)
>
> I'm not sure I understand those correctly. If you're talking about the user
> mode part, we could simply pass stdout to the kernel, saving the unnecessary
> pipe and copy operations in between without introducing a new buffer.
What I meant is the following:
# For non-WRITE commands the flow is: put the command onto send_buf,
copy to pipe, then user-space copies it out from the pipe. With my
code: put command onto send_buf, then copy to user-space buffer
(copy_to_user). So one data-copy is avoided (2 vs 3).
# For WRITE commands: read data onto read_buf, then copy to send_buf,
then copy to pipe, then user-mode copies to its buffer. With my code:
read onto send_buf, then copy to user-space buffer. So 2 data-copies
are avoided (2 vs 4).
Does it make sense?

>
>> # ERESTARTSYS issue disappears. If needed, ioctl is restarted, but
>> there is no problem with that, it will simply refill the buffer from
>> the same checkpoint.
>
> This is the subject of this thread and the thing I'd like to focus on
> currently.
>
>> Cons:
>> # Instead of one ioctl(), many ioctls() are issued to finish the send.
>> # Big code change
>
> Two big cons. I'd like to quote Alexander's suggestions again:
>
> On Wed, August 01, 2012 at 14:09 (+0200), Alexander Block wrote:
>> I have two possible solutions in my mind.
>> 1. Store some kind of state in the ioctl arguments so that we can
>> continue where we stopped when the ioctl reenters. This would however
>> complicate the code a lot.
>> 2. Spawn a thread when the ioctl is called and leave the ioctl
>> immediately. I don't know if ERESTARTSYS can happen in vfs_xxx calls
>> if they happen from a non syscall thread.
>
> What do you think about those two?
I am not familiar enough with Linux kernel - will the second one work?

>
> I like the first suggestion. Combining single-ioctl with signal handling
> capabilities feels like the right choice. When we get ERESTARTSYS, we know
> exactly how many bytes made it to user mode. To reach a comfortable state for
> a restart, we can store part of the stream together with the meta information
> in our internal state before returning to user mode.
I thought that ERESTARTSYS never returns to user mode (this is what I
saw in my tests also).

> The ioctl will be restarted sooner or later and our internal state tells us
> where to proceed.

Ok, so you say that we should maintain checkpoint-like information and
use it if the ioctl is automatically restarted. This is quite close to
what I have done, I believe; we still need all the capabilities of
re-arming tree search, saving context, skipping commands etc, that I
have written. Did I understand your proposal correctly?

But tell me one thing: let's say we call vfs_write() and it returns
-ERESTARTSYS. Can we be sure that it wrote 0 bytes to the pipe in this
call? Otherwise, we don't know how many bytes it wrote, so we cannot
resume it correctly.

Thanks,
Alex.
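
Whether vfs_write() can return -ERESTARTSYS after a partial write is a kernel-internal question that the following sketch does not settle. As a user-space analogy only: POSIX write() reports EINTR only when no data was transferred and otherwise returns a short count, so a caller that tracks the running total always knows where to resume. The name write_all and its done parameter are invented for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write len bytes to fd, retrying after short writes and EINTR.
 * *done always holds the number of bytes written so far, so a caller can
 * resume from buf + *done after an error. Returns 0 on success, -1 on error.
 */
int write_all(int fd, const char *buf, size_t len, size_t *done)
{
	assert(buf != NULL && done != NULL);
	*done = 0;
	while (*done < len) {
		ssize_t n = write(fd, buf + *done, len - *done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* nothing was written by this call */
			return -1;
		}
		*done += (size_t)n;	/* short write: account for it, go on */
	}
	return 0;
}
```

The running total in *done plays the role of the checkpoint discussed in this thread: after any failure it says exactly how far the stream got.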



>
> Thanks,
> -Jan


Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Goffredo Baroncelli
On Mon, Oct 8, 2012 at 8:08 AM, Swâmi Petaramesh  wrote:
> Le 08/10/2012 00:47, Goffredo Baroncelli a écrit :
>> Please could you clarify if you are using the "autodefrag" options
>> when you have the performance problem ?
>
> I use autodefrag on all volumes systematically, except on volumes that I
> use for really big files that would always be defragmenting (i.e.
> virtual machines, but I seldom use virtual machines, and my performance
> problem is general, I don't take VMs into consideration in this respect...)

The autodefrag option is per filesystem, not per subvolume. The settings
of the first subvolume are used for the other ones as well.

So, because you mount the same filesystem both on

> /dev/VG1/BTR_POOL  /  btrfs  subvol=UBUNTU/@,space_cache,autodefrag,compress=lzo,relatime  0 0

and

> /dev/VG1/BTR_POOL  /data/VBOX_HIDDEN  btrfs  subvol=DATA/VBOX_HIDDEN,space_cache,compress=lzo,noatime  0 0
> /dev/VG1/BTR_POOL  /data/VBOX_VMS  btrfs  subvol=DATA/VBOX_VMS,space_cache,compress=lzo,noatime  0 0

the autodefrag option is also enabled for DATA/VBOX_HIDDEN and DATA/VBOX_VMS.

>
>> Are you in a position to reduce the number of snapshots from hundreds
>> to a few?
>
> I might do this as an experiment, but it would defeat one of the
> prominent purposes for which I use BTRFS, and the number of snapshots
> would grow immediately again, as I use the excellent OpenSuSE "snapper"
> tool, that makes a snapshot every single hour...

I am not suggesting this as a solution, but it would help to
investigate the problem.

If you don't need the snapshots, you can delete them, wait until the
cleaner kernel thread has done its job (this can take a while), then
reboot the machine.
If I am right, performance should improve.

I fear that the combination of autodefrag and the high number of
snapshots could be the root cause of the bad performance.

>
> Kind regards.

Ciao
Goffredo

> --
> Swâmi Petaramesh  http://petaramesh.org PGP 9076E32E
> Ne cherchez pas : Je ne suis pas sur Facebook.
>


Re: Wiki (scrub)

2012-10-08 Thread Goffredo Baroncelli
On Mon, Oct 8, 2012 at 12:51 AM, Alex  wrote:
> David Sterba  jikos.cz> writes:
>
>>
>> Hi,
>>
>> On Sun, Oct 07, 2012 at 12:07:43PM +, Alex wrote:
>> > The official wiki seems to have lost references to "scrub" if not other
>> > commands.
>
> Sorry, I felt sure that scrub was listed on
> https://btrfs.wiki.kernel.org/index.php/Btrfs%28command%29

It was an old page that I started, but it is now largely unmaintained. We could
remove it (putting up a banner stating that this page is unmaintained)...

>
>> I'm using a filter to see only page changes:
>> * select 'User' namespace
>> * [x] Invert selection
>> * Go
>>
>
> Thanks for this!
>
>


working quota example?

2012-10-08 Thread matthieu Barthélemy
Hi all,

I tried without success to get a working Btrfs+quota setup.
I created a new Btrfs filesystem on a new
partition, then activated quota management ('btrfs quota enable'), and
created a few subvolumes.
I created a qgroup (with id 100) with 'btrfs qgroup create', and tried to
apply a quota on one of my subvolumes using 'btrfs qgroup limit'

So far I've been unable to get this working, I can create a file (using dd)
inside the subvolume that will happily eat all my FS space without triggering
anything that could look like a quota limitation.
btrfs-progs help is not really useful, it's more like a quick reminder than
a real 'help'.
So I have 2 questions for experienced Btrfs developers and users:
-Could someone post a working example of a quota configuration on 1 or
several subvolumes? Minimal/simplest working configuration.
-How can I see the used/remaining space for each subvolume that has a quota
set? I guess it should be done with 'btrfs qgroup show', but its output is
rather terse (it returns '0/100 0 0' on my system).

Thanks in advance for your help, and all the work done to bring us so many
features.


Re: [PATCH 2/2 v3] Btrfs: snapshot-aware defrag

2012-10-08 Thread Liu Bo
On 10/03/2012 10:02 PM, Chris Mason wrote:
> On Tue, Sep 25, 2012 at 07:07:53PM -0600, Liu Bo wrote:
>> On 09/26/2012 01:39 AM, Mitch Harder wrote:
>>> On Mon, Sep 17, 2012 at 4:58 AM, Liu Bo  wrote:
>>>> This comes from one of btrfs's project ideas,
>>>> As we defragment files, we break any sharing from other snapshots.
>>>> The balancing code will preserve the sharing, and defrag needs to grow this
>>>> as well.
>>>>
>>>> Now we're able to fill the blank with this patch, in which we make full
>>>> use of backref walking stuff.
>>>>
>>>> Here is the basic idea,
>>>> o  set the writeback ranges started by defragment with flag EXTENT_DEFRAG
>>>> o  at endio, after we finish updating fs tree, we use backref walking to
>>>>    find all parents of the ranges and re-link them with the new COWed file
>>>>    layout by adding corresponding backrefs.
>>>>
>>>> Originally patch by Li Zefan
>>>> Signed-off-by: Liu Bo
>>>
>>> I'm hitting the WARN_ON in record_extent_backrefs() indicating a
>>> problem with the return value from iterate_inodes_from_logical().
> 
> Me too.  It triggers reliably with mount -o autodefrag, and then crashes
> in the next function ;)
> 
> -chris
> 

Hi Chris, Mitch,

I'm afraid that I may need a little more time to fix all the bugs in it, because
there seem to be some backref walking bugs mixed in, and at least 4 different
crashes make it harder to address them.

I run a 1G random-write fio job in the background, followed by creating
20 snapshots in the background, and mount with -o autodefrag.

So if your crash is quite stable in one place, please let me know the steps.

thanks,
liubo




Re: working quota example?

2012-10-08 Thread Arne Jansen
On 08.10.2012 14:09, matthieu Barthélemy wrote:
> Hi all,
> 
> I tried without success to get a working Btrfs+quota setup.
> I created a new Btrfs filesystem on a new
> partition, then activated quota management ('btrfs quota enable'), and
> created a few subvolumes.
> I created a qgroup (with id 100) with 'btrfs qgroup create', and tried to
> apply a quota on one of my subvolumes using 'btrfs qgroup limit'
> 
> So far I've been unable to get this working, I can create a file (using dd)
> inside the subvolume that will happily eat all my FS space without triggering
> anything that could look like a quota limitation.
> btrfs-progs help is not really useful, it's more like a quick reminder than
> a real 'help'.
> So I have 2 questions for experienced Btrfs developers and users:
> -Could someone post a working example of a quota configuration on 1 or
> several subvolumes? Minimal/simplest working configuration.

# mkfs.btrfs /dev/sdx

WARNING! - Btrfs cloned-148-g8935d84 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdx
nodesize 4096 leafsize 4096 sectorsize 4096 size 931.51GB
Btrfs cloned-148-g8935d84
# mount /dev/sdx /mnt/test
# btrfs quota enable /mnt/test
# btrfs sub create /mnt/test/sub1
Create subvolume '/mnt/test/sub1'
# dd if=/dev/zero of=/mnt/test/sub1/file1 bs=1048576 count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.000873268 s, 1.2 GB/s
# sync
# btrfs qgroup show /mnt/test
0/257 1052672 1052672

> -How can I see the used/remaining space for each subvolume that has a quota
> set? I guess it should be done with 'btrfs qgroup show', but its output is
> rather terse (it returns '0/100 0 0' on my system).
>  

Right. 2 things to note:
 a) quota only shows up after some time. To enforce this, you can sync the fs.
 b) The output is too terse, some UI design is necessary here. The output
means:

qgroup  references  exclusive
0/257   1052672     1052672

Please refer here  for a discussion of the
meaning of those values.
Your mistake was to create the group 0/100 yourself. The command 'qgroup
create' is only needed to create quota groups that contain several
subvolumes; the qgroup of a single subvolume (0/257 above) already exists.

To limit the subvol:

# btrfs qgroup limit 2m /mnt/test/sub1
# dd if=/dev/zero of=/mnt/test/sub1/file1 bs=10485760 count=1
dd: writing `/mnt/test/sub1/file1': Disk quota exceeded
1+0 records in
0+0 records out
1966080 bytes (2.0 MB) copied, 0.0056283 s, 349 MB/s

Hope that helps :)

-Arne
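
Until the tool prints column headers, a script or small program can split the terse line itself. This is a minimal sketch that parses one row in the "level/id references exclusive" layout shown above; struct qgroup_row and parse_qgroup_row are names invented for this example, and the layout is assumed to be exactly that of the output in this mail.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One row of the terse "btrfs qgroup show" output shown above. */
struct qgroup_row {
	uint64_t level;       /* 0 for a subvolume's own qgroup */
	uint64_t id;          /* subvolume id for level 0       */
	uint64_t referenced;  /* the "references" column, bytes */
	uint64_t exclusive;   /* the "exclusive" column, bytes  */
};

/* Parse "level/id references exclusive"; return 0 on success, -1 otherwise. */
int parse_qgroup_row(const char *line, struct qgroup_row *row)
{
	assert(line != NULL && row != NULL);
	if (sscanf(line, "%" SCNu64 "/%" SCNu64 " %" SCNu64 " %" SCNu64,
		   &row->level, &row->id, &row->referenced,
		   &row->exclusive) != 4)
		return -1;
	return 0;
}
```

For the row "0/257 1052672 1052672", both byte counts are the 1048576 bytes written by dd plus 4096 bytes.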

> Thanks in advance for your help, and all the work done to bring us so many
> features.


Re: [PATCH 2/2 v3] Btrfs: snapshot-aware defrag

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 06:18:26AM -0600, Liu Bo wrote:
> On 10/03/2012 10:02 PM, Chris Mason wrote:
> > On Tue, Sep 25, 2012 at 07:07:53PM -0600, Liu Bo wrote:
> >> On 09/26/2012 01:39 AM, Mitch Harder wrote:
> >>> On Mon, Sep 17, 2012 at 4:58 AM, Liu Bo  wrote:
> >>>> This comes from one of btrfs's project ideas,
> >>>> As we defragment files, we break any sharing from other snapshots.
> >>>> The balancing code will preserve the sharing, and defrag needs to grow
> >>>> this as well.
> >>>>
> >>>> Now we're able to fill the blank with this patch, in which we make full
> >>>> use of backref walking stuff.
> >>>>
> >>>> Here is the basic idea,
> >>>> o  set the writeback ranges started by defragment with flag EXTENT_DEFRAG
> >>>> o  at endio, after we finish updating fs tree, we use backref walking to
> >>>>    find all parents of the ranges and re-link them with the new COWed
> >>>>    file layout by adding corresponding backrefs.
> >>>>
> >>>> Originally patch by Li Zefan
> >>>> Signed-off-by: Liu Bo
> >>>
> >>> I'm hitting the WARN_ON in record_extent_backrefs() indicating a
> >>> problem with the return value from iterate_inodes_from_logical().
> > 
> > Me too.  It triggers reliably with mount -o autodefrag, and then crashes
> > in the next function ;)
> > 
> > -chris
> > 
> 
> Hi Chris, Mitch,
> 
> I'm afraid that I may need a little more time to fix all the bugs in it,
> because there seem to be some backref walking bugs mixed in, and at least
> 4 different crashes make it harder to address them.
>
> I run a 1G random-write fio job in the background, followed by creating
> 20 snapshots in the background, and mount with -o autodefrag.
> 
> So if your crash is quite stable in one place, please let me know the steps.

I have a notmuch mail database.  I just receive mail with auto defrag on
and it crashes.  Chrome databases may do it as well.

If it helps, I have compression too.

-chris



[PATCH] Btrfs: remove repeated eb->pages check in, disk-io.c/csum_dirty_buffer

2012-10-08 Thread Wang Sheng-Hui
In csum_dirty_buffer, we first get eb from page->private.
Then we check if the page is the first page of eb. Later
we check it again. Remove the repeated check here.

Signed-off-by: Wang Sheng-Hui 
---
 fs/btrfs/disk-io.c |8 +++-
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 22e98e0..8919c56 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -432,14 +432,12 @@ static int csum_dirty_buffer(struct btrfs_root *root, struct page *page)
tree = &BTRFS_I(page->mapping->host)->io_tree;

eb = (struct extent_buffer *)page->private;
-   if (page != eb->pages[0])
-   return 0;
-   found_start = btrfs_header_bytenr(eb);
-   if (found_start != start) {
+   if (page != eb->pages[0]) {
WARN_ON(1);
return 0;
}
-   if (eb->pages[0] != page) {
+   found_start = btrfs_header_bytenr(eb);
+   if (found_start != start) {
WARN_ON(1);
return 0;
}
-- 
1.7.5.4



Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Swâmi Petaramesh
Le 08/10/2012 13:38, Goffredo Baroncelli a écrit :
> The autodefrag option is per filesystem, not per subvolume. The settings
> of the first subvolume are used for the other ones as well.
Uh !

So there is no point in creating several subvols, some whose files
should be autodefragged and some not? That's too bad :-(

> I fear that the combination of autodefrag and the high number of
> snapshots could be the root cause of the bad performance.

As a test, I will try to remove *most* of my snapshots and see if it
helps...

Thanks for the suggestion !

Kind regards.

-- 
Swâmi Petaramesh  http://petaramesh.org PGP 9076E32E
Ne cherchez pas : Je ne suis pas sur Facebook.



Re: btrfs send/receive review by vfs folks

2012-10-08 Thread Jan Schmidt
On Mon, October 08, 2012 at 13:38 (+0200), Alex Lyakas wrote:
>>> I realize this is a big change, and a new IOCTL has to be introduced
>>> in order not to break current user-kernel protocol.
>>> The pros as I see them:
>>> # One data-copy is avoided (no pipe). For WRITE commands two
>>> data-copies are avoided (no read_buf needed)
>>
>> I'm not sure I understand those correctly. If you're talking about the user
>> mode part, we could simply pass stdout to the kernel, saving the unnecessary
>> pipe and copy operations in between without introducing a new buffer.
> What I meant is the following:
> # For non-WRITE commands the flow is: put the command onto send_buf,
> copy to pipe, then user-space copies it out from the pipe. With my
> code: put command onto send_buf, then copy to user-space buffer
> (copy_to_user). So one data-copy is avoided (2 vs 3).
> # For WRITE commands: read data onto read_buf, then copy to send_buf,
> then copy to pipe, then user-mode copies to its buffer. With my code:
> read onto send_buf, then copy to user-space buffer. So 2 data-copies
> are avoided (2 vs 4).
> Does it make sense?

I'd rather just focus on the ERESTARTSYS issue for now.

>> On Wed, August 01, 2012 at 14:09 (+0200), Alexander Block wrote:
>>> I have two possible solutions in my mind.
>>> 1. Store some kind of state in the ioctl arguments so that we can
>>> continue where we stopped when the ioctl reenters. This would however
>>> complicate the code a lot.
>>> 2. Spawn a thread when the ioctl is called and leave the ioctl
>>> immediately. I don't know if ERESTARTSYS can happen in vfs_xxx calls
>>> if they happen from a non syscall thread.
>>
>> What do you think about those two?
> I am not familiar enough with Linux kernel - will the second one work?

I'm not an expert here, either. My uneducated guess is that even a spawned kernel
thread can have signals pending, in which case we would be in the same trouble.

>> I like the first suggestion. Combining single-ioctl with signal handling
>> capabilities feels like the right choice. When we get ERESTARTSYS, we know
>> exactly how many bytes made it to user mode. To reach a comfortable state for a
>> restart, we can store part of the stream together with the meta information in
>> our internal state before returning to user mode.
> I thought that ERESTARTSYS never returns to user mode (this is what I
> saw in my tests also).

That error isn't returned to a user process, that's correct. From a kernel
developer's perspective, we're returning to user space, though, waiting for the
pending signal to be processed.

>> The ioctl will be restarted sooner or later and our internal state tells us 
>> where to proceed.
> 
> Ok, so you say that we should maintain checkpoint-like information and
> use it if the ioctl is automatically restarted. This is quite close to
> what I have done, I believe; we still need all the capabilities of
> re-arming tree search, saving context, skipping commands etc, that I
> have written. Did I understand your proposal correctly?

In some way, yes. If possible, I'd like to (with decreasing priority)
- stick with the stream format
- maintain ioctl-compatibility
- have a much smaller patch
- store less state
- stick with the pipe

I haven't had time to completely understand your checkpoint concept, and
one-big-chunk patches are hard to read. But the size of the added data
structures scares me. Shouldn't we be able to do this with as little information
as a single btrfs key and a buffer of generated but not yet pushed stream data?
This would also save us skipping commands or output bytes.
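
A minimal sketch of the "single key plus pending buffer" idea, written in Python
rather than kernel C and purely illustrative. `SendState`, `send`, `write_some`,
`Interrupted` and the integer keys are hypothetical names, not btrfs code. It
assumes that an interrupted write pushed zero bytes, which is the property
discussed further down in this mail.

```python
from dataclasses import dataclass
from typing import Optional


class Interrupted(Exception):
    """Stands in for -ERESTARTSYS: the write was interrupted before any byte went out."""


@dataclass
class SendState:
    # Key of the last command that was fully generated (None = nothing yet).
    last_key: Optional[int] = None
    # Stream bytes that were generated but not yet pushed to the consumer.
    pending: bytes = b""


def send(state, commands, write_some):
    """Push the stream, resuming from `state`.

    `commands` is an iterable of (key, payload) pairs in ascending key order.
    `write_some(buf)` writes a prefix of `buf` and returns how many bytes it
    took, or raises Interrupted. Returns True when done, False if interrupted.
    """
    try:
        # First drain whatever an earlier, interrupted call left behind.
        while state.pending:
            n = write_some(state.pending)
            state.pending = state.pending[n:]
        for key, payload in commands:
            if state.last_key is not None and key <= state.last_key:
                continue  # already generated before the interruption
            state.pending = payload
            state.last_key = key
            while state.pending:
                n = write_some(state.pending)
                state.pending = state.pending[n:]
    except Interrupted:
        return False
    return True
```

Because the state records both the key and the unsent bytes, a restarted call
neither has to skip output bytes nor regenerate commands that were already sent.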

> But tell me one thing: let's say we call vfs_write() and it returns
> -ERESTARTSYS. Can we be sure that it wrote 0 bytes to the pipe in this
> call?

From the version of fs/pipe.c in my working directory: yes. If we rely on this,
we'd better check if this is something to rely on, but I hope it is.

-Jan


Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Richard W.M. Jones

I'm tracking this bug here:

https://bugzilla.redhat.com/show_bug.cgi?id=863978

Since approx. last week I'm seeing lots of failures in btrfs.  The
common factor seems to be that the filesystem is created (mkfs.btrfs
/dev/sda1) and then it is immediately used -- eg.  mounted or some
btrfs subtool is run on it.  There is no pause or sync between the
operations.

Typical errors include:

mkfs.btrfs /dev/sda1
mount -o  /dev/sda1 /sysroot/
[   96.384211] device fsid 962db3c0-4153-450b-9ca7-c9216e81afe3 devid 1 transid 3 /dev/sda1
[   96.385314] device fsid 962db3c0-4153-450b-9ca7-c9216e81afe3 devid 1 transid 3 /dev/sda1
[   96.394158] btrfs: disk space caching is enabled
[   96.428656] btrfs: failed to recover relocation
[   96.437190] btrfs: open_ctree failed

and:

btrfsck /dev/sda1
Check tree block failed, want=139264, have=0
Check tree block failed, want=139264, have=0
Check tree block failed, want=139264, have=0
read block failed check_tree_block
Couldn't read chunk root

(There are plenty of others, see the above bug link)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top


fix btrfs-progs build

2012-10-08 Thread Christian Hesse
Hello everybody,

man pages for btrfs-progs are compressed by gzip by default. In Makefile the
variable GZIP is used; this evaluates to 'gzip gzip' on my system. From man
gzip:

> The environment variable GZIP can hold a set of default options for gzip.
> These options are interpreted first and can be overwritten by explicit
> command line parameters.

So using any other variable name fixes this. Patch is attached.
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}
diff --git a/man/Makefile b/man/Makefile
index 4a90b75..f7b57f7 100644
--- a/man/Makefile
+++ b/man/Makefile
@@ -1,4 +1,4 @@
-GZIP=gzip
+GZIPCMD=gzip
 INSTALL= install
 
 prefix ?= /usr/local
@@ -12,22 +12,22 @@ MANPAGES = mkfs.btrfs.8.gz btrfsctl.8.gz btrfsck.8.gz btrfs-image.8.gz \
 all: $(MANPAGES)
 
 mkfs.btrfs.8.gz: mkfs.btrfs.8.in
-	$(GZIP) -n -c mkfs.btrfs.8.in > mkfs.btrfs.8.gz
+	$(GZIPCMD) -n -c mkfs.btrfs.8.in > mkfs.btrfs.8.gz
 
 btrfs.8.gz: btrfs.8.in
-	$(GZIP) -n -c btrfs.8.in > btrfs.8.gz
+	$(GZIPCMD) -n -c btrfs.8.in > btrfs.8.gz
 
 btrfsctl.8.gz: btrfsctl.8.in
-	$(GZIP) -n -c btrfsctl.8.in > btrfsctl.8.gz
+	$(GZIPCMD) -n -c btrfsctl.8.in > btrfsctl.8.gz
 
 btrfsck.8.gz: btrfsck.8.in
-	$(GZIP) -n -c btrfsck.8.in > btrfsck.8.gz
+	$(GZIPCMD) -n -c btrfsck.8.in > btrfsck.8.gz
 
 btrfs-image.8.gz: btrfs-image.8.in
-	$(GZIP) -n -c btrfs-image.8.in > btrfs-image.8.gz
+	$(GZIPCMD) -n -c btrfs-image.8.in > btrfs-image.8.gz
 
 btrfs-show.8.gz: btrfs-show.8.in
-	$(GZIP) -n -c btrfs-show.8.in > btrfs-show.8.gz
+	$(GZIPCMD) -n -c btrfs-show.8.in > btrfs-show.8.gz
 
 clean :
 	rm -f $(MANPAGES)


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 08:16:42AM -0600, Richard W.M. Jones wrote:
> 
> I'm tracking this bug here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=863978
> 
> Since approx. last week I'm seeing lots of failures in btrfs.  The
> common factor seems to be that the filesystem is created (mkfs.btrfs
> /dev/sda1) and then it is immediately used -- eg.  mounted or some
> btrfs subtool is run on it.  There is no pause or sync between the
> operations.

This was a problem on older btrfs-progs, but this commit:

btrfs-progs-0.19.20120817git043a639-1.fc19.i686

(043a639) has long had the fixes to flush things after mkfs.  Is there
any chance the guest you're testing had an ancient progs on it?

-chris


Re: fix btrfs-progs build

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 08:17:13AM -0600, Christian Hesse wrote:
> Hello everybody,
> 
> man pages for btrfs-progs are compressed by gzip by default. In Makefile the
> variable GZIP is used; this evaluates to 'gzip gzip' on my system. From man
> gzip:
> 
> > The environment variable GZIP can hold a set of default options for gzip.
> > These options are interpreted first and can be overwritten by explicit
> > command line parameters.
> 
> So using any other variable name fixes this. Patch is attached.

Ok, which system is this?  Just curious, I'll pull in the patch.

-chris


Re: fix btrfs-progs build

2012-10-08 Thread Christian Hesse
Chris Mason  on Mon, 2012/10/08 10:29:
> On Mon, Oct 08, 2012 at 08:17:13AM -0600, Christian Hesse wrote:
> > Hello everybody,
> > 
> > man pages for btrfs-progs are compressed by gzip by default. In Makefile
> > the variable GZIP is used; this evaluates to 'gzip gzip' on my system.
> > From man gzip:
> > 
> > > The environment variable GZIP can hold a set of default options for
> > > gzip. These options are interpreted first and can be overwritten by
> > > explicit command line parameters.
> > 
> > So using any other variable name fixes this. Patch is attached.
> 
> Ok, which system is this?  Just curious, I'll pull in the patch.

This is Arch Linux with gzip 1.5-1.
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


Re: fix btrfs-progs build

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 08:30:31AM -0600, Christian Hesse wrote:
> Chris Mason  on Mon, 2012/10/08 10:29:
> > On Mon, Oct 08, 2012 at 08:17:13AM -0600, Christian Hesse wrote:
> > > Hello everybody,
> > > 
> > > man pages for btrfs-progs are compressed by gzip by default. In Makefile
> > > the variable GZIP is used; this evaluates to 'gzip gzip' on my system.
> > > From man gzip:
> > > 
> > > > The environment variable GZIP can hold a set of default options for
> > > > gzip. These options are interpreted first and can be overwritten by
> > > > explicit command line parameters.
> > > 
> > > So using any other variable name fixes this. Patch is attached.
> > 
> > Ok, which system is this?  Just curious, I'll pull in the patch.
> 
> This is Arch Linux with gzip 1.5-1.

Strange, I'm running Arch Linux with gzip 1.5-1 and it builds.
I wonder if something else is expanding it.  I'll take the patch
regardless, there's no reason to add build problems when we don't need
to.

-chris



Re: Experiences: Why BTRFS had to yield for ZFS

2012-10-08 Thread Casper Bang
> Thanks for taking the time to write this up and follow through on the thread.
> It's always interesting to hear situations where btrfs doesn't work
> well.
> 
> There are three basic problems with the database workloads on btrfs.
> First is that we have higher latencies on writes because we are feeding
> everything through helper threads for crcs.  Usually the extra latencies
> don't show up because we have enough work in the pipeline to keep the
> drive busy.
> 
> I don't believe the UEK kernels have the recent changes to do some of
> the crc work inline (without handing off) for smaller synchronous IOs.
> 
> Second, on O_SYNC writes btrfs will write both the file metadata and
> data into a special tree so we can be crash safe.  For big files this
> tends to spend a lot of time looking for the extents in the file that
> have changed.
> 
> Josef fixed that up and it is queued for the next merge window.
> 
> The third problem is that lots of random writes tend to make lots of
> metadata.  If this doesn't fit in ram, we can end up doing many reads
> that slow things down.  We're working on this now as well, but recent
> kernels change how we cache things and should improve the results.

I feel I should update my previous thread about performance issues using btrfs
in light of recent findings. We have discovered that, in all likelihood, what we
experienced and what was described, was not a problem with btrfs per se, but a
result of a more general issue which btrfs was just really good at exposing
(using threads more aggressively than zfs?!).

Various benchmarks in Java (thread-pool setup/shutdown) and C (pthreads creation
and joining) have shown that our Xeon/E5-2620 server with the latest Oracle
Unbreakable Linux is very slow at serving up new threads (benchmarks
available upon request).

Java threading benchmark on Xeon/E5-2620 @ 2.0GHz:
Oracle Unbreakable Linux: 1m49s realtime, 3m17s sys-time
Ubuntu:   5s realtime, 3.9s sys-time.
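
For readers who want to try a comparable measurement, here is a rough sketch of
a thread create/join timing loop. It is a Python stand-in, not the Java or C
benchmark referred to above, and `time_thread_churn` is a hypothetical name;
absolute numbers from it are not comparable with the figures quoted here.

```python
import threading
import time


def time_thread_churn(count):
    """Create, start and join `count` short-lived threads one after another.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for _ in range(count):
        worker = threading.Thread(target=lambda: None)
        worker.start()
        worker.join()
    return time.perf_counter() - start
```

Running the same loop with the same `count` on two installations of the same
hardware gives a first hint whether thread creation itself is the slow part.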

We are not sure how to continue investigating why the Oracle Linux/Kernel 
performs so poorly (scheduler, kernel config etc?), but it seems pretty obvious 
that this issue should be raised with Oracle rather than the btrfs developers - 
though we'll probably look into using another OS entirely. As such, apologies 
for creating the noise, btrfs was not to blame!

If you do have a suspicion or insight on the matter (perhaps work for Oracle, or
know OUK?), of course we'd love a follow-up off-list.

Kind regards,
Casper



Re: fix btrfs-progs build

2012-10-08 Thread Christian Hesse
Chris Mason  on Mon, 2012/10/08 10:33:
> On Mon, Oct 08, 2012 at 08:30:31AM -0600, Christian Hesse wrote:
> > Chris Mason  on Mon, 2012/10/08 10:29:
> > > On Mon, Oct 08, 2012 at 08:17:13AM -0600, Christian Hesse wrote:
> > > > Hello everybody,
> > > > 
> > > > man pages for btrfs-progs are compressed by gzip by default. In
> > > > Makefile the variable GZIP is used; this evaluates to 'gzip gzip' on
> > > > my system. From man gzip:
> > > > 
> > > > > The environment variable GZIP can hold a set of default options for
> > > > > gzip. These options are interpreted first and can be overwritten by
> > > > > explicit command line parameters.
> > > > 
> > > > So using any other variable name fixes this. Patch is attached.
> > > 
> > > Ok, which system is this?  Just curious, I'll pull in the patch.
> > 
> > This is Arch Linux with gzip 1.5-1.
> 
> Strange, I'm running Arch Linux with gzip 1.5-1 and it builds.
> I wonder if something else is expanding it.  I'll take the patch
> regardless, there's no reason to add build problems when we don't need
> to.

This happens if you have exported GZIP to your environment. So probably most
people are not affected.
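
A small model of why an exported GZIP leads to 'gzip gzip', as a sketch under
two assumptions: that make passes a makefile-assigned variable on to recipe
commands when that variable was already present in the environment, and that
gzip prepends the contents of $GZIP to its arguments. The function names are
hypothetical and the option splitting is simplified.

```python
import shlex


def make_child_env(parent_env, makefile_vars):
    """Environment seen by a recipe command in this simplified model.

    A makefile assignment overrides an inherited environment variable, and
    only variables that were already in the environment are exported again.
    """
    child = dict(parent_env)
    for name, value in makefile_vars.items():
        if name in parent_env:
            child[name] = value
    return child


def gzip_effective_args(env, argv):
    """Arguments gzip ends up parsing: options from $GZIP first, then argv."""
    return shlex.split(env.get("GZIP", "")) + list(argv)


recipe_args = ["-n", "-c", "mkfs.btrfs.8.in"]

# GZIP exported by the user, Makefile says GZIP=gzip: the word "gzip" is
# prepended and treated like a file name.
broken = gzip_effective_args(
    make_child_env({"GZIP": "-9"}, {"GZIP": "gzip"}), recipe_args)

# Same environment with the patched Makefile (GZIPCMD=gzip): $GZIP is untouched.
fixed = gzip_effective_args(
    make_child_env({"GZIP": "-9"}, {"GZIPCMD": "gzip"}), recipe_args)
```

In this model `broken` starts with the stray word "gzip", while `fixed` keeps
the user's own default option and nothing else.
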
-- 
main(a){char*c=/*Schoene Gruesse */"B?IJj;MEH"
"CX:;",b;for(a/*Chris   get my mail address:*/=0;b=c[a++];)
putchar(b-1/(/*   gcc -o sig sig.c && ./sig*/b/42*2-3)*42);}


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Richard W.M. Jones
On Mon, Oct 08, 2012 at 10:27:57AM -0400, Chris Mason wrote:
> On Mon, Oct 08, 2012 at 08:16:42AM -0600, Richard W.M. Jones wrote:
> > 
> > I'm tracking this bug here:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=863978
> > 
> > Since approx. last week I'm seeing lots of failures in btrfs.  The
> > common factor seems to be that the filesystem is created (mkfs.btrfs
> > /dev/sda1) and then it is immediately used -- eg.  mounted or some
> > btrfs subtool is run on it.  There is no pause or sync between the
> > operations.
> 
> This was a problem on older btrfs-progs, but this commit:
> 
> btrfs-progs-0.19.20120817git043a639-1.fc19.i686
> 
> (043a639) has long had the fixes to flush things after mkfs.  Is there
> any chance the guest you're testing had an ancient progs on it?

We have a couple of guests where this fails.  One has
btrfs-progs-0.19.20120817git043a639-1.fc19.i686.  The other has
btrfs-progs-0.19-20.fc18 which appears to be based on
btrfs-progs-0.19.20120817git043a639.tar.bz2 plus some upstream
patches.

What is the commit which we need?  I can't see anything related to
this in the btrfs-progs git log.

I should note this was all working fine until very recently (under 5
days ago).  Nothing has changed in btrfs-progs in Fedora for a few
months.  Could this be related to a kernel change?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 08:57:30AM -0600, Richard W.M. Jones wrote:
> On Mon, Oct 08, 2012 at 10:27:57AM -0400, Chris Mason wrote:
> > On Mon, Oct 08, 2012 at 08:16:42AM -0600, Richard W.M. Jones wrote:
> > > 
> > > I'm tracking this bug here:
> > > 
> > > https://bugzilla.redhat.com/show_bug.cgi?id=863978
> > > 
> > > Since approx. last week I'm seeing lots of failures in btrfs.  The
> > > common factor seems to be that the filesystem is created (mkfs.btrfs
> > > /dev/sda1) and then it is immediately used -- eg.  mounted or some
> > > btrfs subtool is run on it.  There is no pause or sync between the
> > > operations.
> > 
> > This was a problem on older btrfs-progs, but this commit:
> > 
> > btrfs-progs-0.19.20120817git043a639-1.fc19.i686
> > 
> > (043a639) has long had the fixes to flush things after mkfs.  Is there
> > any chance the guest you're testing had an ancient progs on it?
> 
> We have a couple of guests where this fails.  One has
> btrfs-progs-0.19.20120817git043a639-1.fc19.i686.  The other has
> btrfs-progs-0.19-20.fc18 which appears to be based on
> btrfs-progs-0.19.20120817git043a639.tar.bz2 plus some upstream
> patches.
> 
> What is the commit which we need?  I can't see anything related to
> this in the btrfs-progs git log.

Sorry, I was remembering wrong.  I fixed this up in the kernel by
running invalidate_bdev during mount.  I just double checked and the
invalidates look right, so something strange must be going on.

If it is possible to reproduce this reliably, could you please check and
see if syncs do fix it?  We saw this often with xfstests in the past,
but haven't seen it since the invalidates were added.

-chris


Re: [PATCH 2/2 v3] Btrfs: snapshot-aware defrag

2012-10-08 Thread Mitch Harder
On Mon, Oct 8, 2012 at 8:19 AM, Chris Mason  wrote:
> On Mon, Oct 08, 2012 at 06:18:26AM -0600, Liu Bo wrote:
>> On 10/03/2012 10:02 PM, Chris Mason wrote:
>> > On Tue, Sep 25, 2012 at 07:07:53PM -0600, Liu Bo wrote:
>> >> On 09/26/2012 01:39 AM, Mitch Harder wrote:
>> >>> On Mon, Sep 17, 2012 at 4:58 AM, Liu Bo  wrote:
>> >>>> This comes from one of btrfs's project ideas,
>> >>>> As we defragment files, we break any sharing from other snapshots.
>> >>>> The balancing code will preserve the sharing, and defrag needs to grow this
>> >>>> as well.
>> >>>>
>> >>>> Now we're able to fill the blank with this patch, in which we make full use of
>> >>>> backref walking stuff.
>> >>>>
>> >>>> Here is the basic idea,
>> >>>> o  set the writeback ranges started by defragment with flag EXTENT_DEFRAG
>> >>>> o  at endio, after we finish updating fs tree, we use backref walking to find
>> >>>>    all parents of the ranges and re-link them with the new COWed file layout by
>> >>>>    adding corresponding backrefs.
>> >>>>
>> >>>> Originally patch by Li Zefan
>> >>>> Signed-off-by: Liu Bo
>> >>>
>> >>> I'm hitting the WARN_ON in record_extent_backrefs() indicating a
>> >>> problem with the return value from iterate_inodes_from_logical().
>> >
>> > Me too.  It triggers reliably with mount -o autodefrag, and then crashes
>> > in the next function ;)
>> >
>> > -chris
>> >
>>
>> Hi Chris, Mitch,
>>
>> I'm afraid that I may need a little more time to fix all bugs in it because there
>> seem to be some backref walking bugs mixed in, and at least 4 different crashes
>> make it harder to address bugs.
>>
>> I use a 1G random write fio job running in the background, followed by creating
>> 20 snapshots in the background, and mount -o autodefrag.
>>
>> So if your crash is quite stable in one place, please let me know the steps.
>
> I have a notmuch mail database.  I just receive mail with auto defrag on
> and it crashes.  Chrome databases may do it as well.
>
> If it helps, I have compression too.
>
> -chris
>

I can usually reproduce fairly quickly, but I don't have a test that
fails in exactly the same spot every time.

My tests usually involve manipulating kernel git sources with
autodefrag (and usually lzo compression).  I have also hit a similar
error when balancing a partition with multiple snapshots.

I'll go back and review my methods for replicating, and see if any of
them can reproduce predictably.


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Richard W.M. Jones
On Mon, Oct 08, 2012 at 11:04:19AM -0400, Chris Mason wrote:
> On Mon, Oct 08, 2012 at 08:57:30AM -0600, Richard W.M. Jones wrote:
> > On Mon, Oct 08, 2012 at 10:27:57AM -0400, Chris Mason wrote:
> > > On Mon, Oct 08, 2012 at 08:16:42AM -0600, Richard W.M. Jones wrote:
> > > > 
> > > > I'm tracking this bug here:
> > > > 
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=863978
> > > > 
> > > > Since approx. last week I'm seeing lots of failures in btrfs.  The
> > > > common factor seems to be that the filesystem is created (mkfs.btrfs
> > > > /dev/sda1) and then it is immediately used -- eg.  mounted or some
> > > > btrfs subtool is run on it.  There is no pause or sync between the
> > > > operations.
> > > 
> > > This was a problem on older btrfs-progs, but this commit:
> > > 
> > > btrfs-progs-0.19.20120817git043a639-1.fc19.i686
> > > 
> > > (043a639) has long had the fixes to flush things after mkfs.  Is there
> > > any chance the guest you're testing had an ancient progs on it?
> > 
> > We have a couple of guests where this fails.  One has
> > btrfs-progs-0.19.20120817git043a639-1.fc19.i686.  The other has
> > btrfs-progs-0.19-20.fc18 which appears to be based on
> > btrfs-progs-0.19.20120817git043a639.tar.bz2 plus some upstream
> > patches.
> > 
> > What is the commit which we need?  I can't see anything related to
> > this in the btrfs-progs git log.
> 
> Sorry, I was remembering wrong.  I fixed this up in the kernel by
> running invalidate_bdev during mount.  I just double checked and the
> invalidates look right, so something strange must be going on.
> 
> If it is possible to reproduce this reliably, could you please check and
> see if syncs do fix it?  We saw this often with xfstests in the past,
> but haven't seen it since the invalidates were added.

Unfortunately I'm struggling to reproduce this outside of our build
system (Koji).  I will keep you informed if I do manage to reproduce
it locally.  Adding fsync /dev/sda1 was also my first instinct :-)
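
For reference, a minimal sketch of what "fsync on the device" could look like
from a test harness. `flush_device` is a hypothetical helper name, and whether
this is enough to avoid the failures above is exactly what is still unknown; it
only shows the call sequence.

```python
import os


def flush_device(path):
    """Open `path` (for example a block device node) and fsync it."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Ask the kernel to write out dirty data it holds for this file.
        os.fsync(fd)
    finally:
        os.close(fd)
```

It would be called between mkfs and the first mount, e.g.
`flush_device("/dev/sda1")`, which needs sufficient privileges on the device.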

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 09:15:14AM -0600, Richard W.M. Jones wrote:
> On Mon, Oct 08, 2012 at 11:04:19AM -0400, Chris Mason wrote:
> > On Mon, Oct 08, 2012 at 08:57:30AM -0600, Richard W.M. Jones wrote:
> > > On Mon, Oct 08, 2012 at 10:27:57AM -0400, Chris Mason wrote:
> > > > On Mon, Oct 08, 2012 at 08:16:42AM -0600, Richard W.M. Jones wrote:
> > > > > 
> > > > > I'm tracking this bug here:
> > > > > 
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=863978
> > > > > 
> > > > > Since approx. last week I'm seeing lots of failures in btrfs.  The
> > > > > common factor seems to be that the filesystem is created (mkfs.btrfs
> > > > > /dev/sda1) and then it is immediately used -- eg.  mounted or some
> > > > > btrfs subtool is run on it.  There is no pause or sync between the
> > > > > operations.
> > > > 
> > > > This was a problem on older btrfs-progs, but this commit:
> > > > 
> > > > btrfs-progs-0.19.20120817git043a639-1.fc19.i686
> > > > 
> > > > (043a639) has long had the fixes to flush things after mkfs.  Is there
> > > > any chance the guest you're testing had an ancient progs on it?
> > > 
> > > We have a couple of guests where this fails.  One has
> > > btrfs-progs-0.19.20120817git043a639-1.fc19.i686.  The other has
> > > btrfs-progs-0.19-20.fc18 which appears to be based on
> > > btrfs-progs-0.19.20120817git043a639.tar.bz2 plus some upstream
> > > patches.
> > > 
> > > What is the commit which we need?  I can't see anything related to
> > > this in the btrfs-progs git log.
> > 
> > Sorry, I was remembering wrong.  I fixed this up in the kernel by
> > running invalidate_bdev during mount.  I just double checked and the
> > invalidates look right, so something strange must be going on.
> > 
> > If it is possible to reproduce this reliably, could you please check and
> > see if syncs do fix it?  We saw this often with xfstests in the past,
> > but haven't seen it since the invalidates were added.
> 
> Unfortunately I'm struggling to reproduce this outside of our build
> system (Koji).  I will keep you informed if I do manage to reproduce
> it locally.  Adding fsync /dev/sda1 was also my first instinct :-)

When we saw this during xfstests, the fsync wasn't sufficient.  It was
really pretty maddening and the invalidate was a nuke it from orbit
style solution.

The kernel side of the invalidate may have changed, so your first
instinct of a kernel change is probably right.

-chris



Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Swâmi Petaramesh
Hi again Goffredo,

Le 08/10/2012 13:38, Goffredo Baroncelli a écrit :
> I fear that the combination of autodefrag and the high number of
> snapshots could be the root cause of the bad performance.
I've removed, on one of my machines, all snapshots but three per subvol
(keeping the oldest and newest), going from about 30 per subvol to 3,
and for the complete filesystem from 120+ to about a dozen.

Then I let btrfs-cleaner do its job.

After that the machine boots to GUI in a bit less than 2 minutes, where
it was more than 4 minutes previously.

The machine now seems much more reactive and swift.

So it seems that the number of active snapshots (or is it the number of
subvols whatsoever ??) dramatically impacts performance...

Thanks for the suggestion.

-- 
Swâmi Petaramesh  http://petaramesh.org PGP 9076E32E
Ne cherchez pas : Je ne suis pas sur Facebook.



Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Josef Bacik
On Sun, Oct 07, 2012 at 03:26:32AM -0600, Swâmi Petaramesh wrote:
> Hi,
> 
> I have 4 machines, all converted to BTRFS about 6 months ago, now all
> running Ubuntu Quantal with kernel 3.5.0-17
> 
> The matter is that all these machines are now getting slower and slower
> everyday, every disk access causing the disk to be 100% busy for long
> periods, to the point that I'm now seriously considering migrating
> everything back to ext4...
> 
> From the start BTRFS was "not very fast", still satisfactory, but now it
> becomes truly unusable.
> 
> On one machine, I now have a typical complete boot time to a usable GUI
> that is over 4 minutes, with the HD still very busy for a couple more
> minutes afterwards, where it used to be around 35-40 seconds in ext4 !
> 
> Is there anything I could do to speed things back (without losing all my
> snapshots or doubling the size of data on disk)...?
> 
> I had already moved back from BTRFS to ext4 about 18 months ago. I found
> it had improved, so I came back to BTRFS, and I would rather not have to
> revert again :-/
> 
> Any advice or help greatly appreciated.
> 

Can you get sysrq+w when you are seeing slowness?  Slow boot times usually
mean you don't have space_cache enabled or your cache is being evicted for some
reason. Can you check dmesg after bootup for messages related to space cache?
Thanks,

Josef


Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Swâmi Petaramesh
Le 08/10/2012 18:09, Josef Bacik a écrit :
> Can you get sysrq+w when you are seeing slowness?  Slow boot times usually
> mean you don't have space_cache enabled or your cache is being evicted for some
> reason. Can you check dmesg after bootup for messages related to space cache?
> Thanks,
 I have a few :

Oct  8 15:27:26 tethys kernel: [16174.736603] btrfs: free space inode
generation (0) did not match free space cache generation (106988)
Oct  8 15:27:27 tethys kernel: [16175.976784] btrfs: free space inode
generation (0) did not match free space cache generation (30727)
Oct  8 15:27:28 tethys kernel: [16176.420719] btrfs: free space inode
generation (0) did not match free space cache generation (48040)
Oct  8 15:27:28 tethys kernel: [16176.710972] btrfs: free space inode
generation (0) did not match free space cache generation (30745)

...in syslog, but that's about all... and not during boot...
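
To count such messages across several machines, a small parser along these
lines may help. It is only a sketch: `space_cache_mismatches` is a hypothetical
helper, and the pattern is taken from the wording of the log lines above.

```python
import re

_MISMATCH = re.compile(
    r"free space inode generation \((\d+)\) did not match "
    r"free space cache generation \((\d+)\)"
)


def space_cache_mismatches(log_text):
    """Return (inode_generation, cache_generation) pairs found in `log_text`."""
    # Long log lines may have been wrapped, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(int(inode), int(cache)) for inode, cache in _MISMATCH.findall(flat)]
```

Feeding it the output of dmesg or a syslog file gives one pair per message, so
the length of the result is the number of caches that were rejected.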

I used to have many more of these in syslog, but solved it by
booting once with the "clear_cache" option, which made that boot
extremely slow, but seemed to fix it...

(Remember I have such issues on several machines, it is highly
improbable that all of them would get their cache ignored...?)

Kind regards.

-- 
Swâmi Petaramesh  http://petaramesh.org PGP 9076E32E
Ne cherchez pas : Je ne suis pas sur Facebook.



Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Josef Bacik
On Mon, Oct 08, 2012 at 10:15:51AM -0600, Swâmi Petaramesh wrote:
> Le 08/10/2012 18:09, Josef Bacik a écrit :
> > Can you get sysrq+w when you are seeing slowness?  Slow boot times usually
> > mean you don't have space_cache enabled or your cache is being evicted for some
> > reason. Can you check dmesg after bootup for messages related to space cache?
> > Thanks,
>  I have a few :
> 
> Oct  8 15:27:26 tethys kernel: [16174.736603] btrfs: free space inode
> generation (0) did not match free space cache generation (106988)
> Oct  8 15:27:27 tethys kernel: [16175.976784] btrfs: free space inode
> generation (0) did not match free space cache generation (30727)
> Oct  8 15:27:28 tethys kernel: [16176.420719] btrfs: free space inode
> generation (0) did not match free space cache generation (48040)
> Oct  8 15:27:28 tethys kernel: [16176.710972] btrfs: free space inode
> generation (0) did not match free space cache generation (30745)
> 
> ...in syslog, but that's about all... and not during boot...
> 
> I used to have many more of these in syslog, but solved it by
> booting once with the "clear_cache" option, which made that boot
> extremely slow but seemed to fix it...
> 
> (Remember I have such issues on several machines, it is highly
> improbable that all of them would get their cache ignored...?)

Well, what happens is that on an actually used fs it ends up being more
fragmented than the amount we're allowed to preallocate for our space cache,
so we don't write anything out. It's very likely that all of your machines
could be hitting that.  I put a patch into 3.6 to increase the cache size so
that wouldn't happen as much; perhaps move to 3.6 and see if you see some
improvement?  Thanks,

Josef


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread David Sterba
On Mon, Oct 08, 2012 at 04:15:14PM +0100, Richard W.M. Jones wrote:
> Unfortunately I'm struggling to reproduce this outside of our build
> system (Koji).  I will keep you informed if I do manage to reproduce
> it locally.  Adding fsync /dev/sda1 was also my first instinct :-)

Have you updated the VM/guest related packages recently? This may be a
bug in the VM drivers.

david


Re: BTRFS, getting darn slower everyday

2012-10-08 Thread Goffredo Baroncelli

On 10/08/2012 05:50 PM, Swâmi Petaramesh wrote:
> Hi again Goffredo,
>
> On 08/10/2012 13:38, Goffredo Baroncelli wrote:
> > I fear that the combination of autodefrag and the high number of
> > snapshots could be the root cause of the bad performance.
>
> I've removed, on one of my machines, all snapshots but three per subvol
> (keeping the oldest and newest), going from about 30 per subvol to 3,
> for the complete filesystem from 120+ to about a dozen.
>
> Then I let btrfs-cleaner do its job.
>
> After that the machine boots to GUI in a bit less than 2 minutes, where
> it was more than 4 minutes previously.
>
> The machine now seems much more reactive and swift.
>
> So it seems that the number of active snapshots (or is it the number of
> subvols whatsoever ??) dramatically impacts performance...


Is the autodefrag option still enabled?
I believe that snapshots are quite cheap, unless you update the shared
files one at a time, which should be what autodefrag does.

But that is only my supposition.

Could you please try without the autodefrag option on a machine with a
high number of snapshots? I am curious...






Thanks for the suggestion.





btrfs receive to subdirectory

2012-10-08 Thread Rory Campbell-Lange
I can send snapshots to , but not /. Please advise
if what I am doing is incorrect.

Rory

Format usb3 disk and mount
root@orchard:/bkp# mkfs.btrfs /dev/sdb1
> WARNING! - Btrfs v0.20-rc1-37-g91d9eec IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
> fs created label (null) on /dev/sdb1
>   nodesize 4096 leafsize 4096 sectorsize 4096 size 698.64GB
> Btrfs v0.20-rc1-37-g91d9eec
mount /dev/sdb1 /mnt


Create snapshots on /bkp share
root@orchard:/bkp# btrfs subvolume snapshot -r subvol snaps/snap_081012_1715
> Create a readonly snapshot of 'subvol' in 'snaps/snap_081012_1715'
root@orchard:/bkp# mutt -f subvol/INBOX/
> 1561 kept, 18 deleted.
root@orchard:/bkp# btrfs subvolume snapshot -r subvol snaps/snap_081012_1716
> Create a readonly snapshot of 'subvol' in 'snaps/snap_081012_1716'

Send base backup to /mnt
root@orchard:/bkp# btrfs send snaps/snap_081012_1715 | btrfs receive /mnt
> At subvol snaps/snap_081012_1715
> At subvol snap_081012_1715

Send incremental backup to /mnt
root@orchard:/bkp# btrfs send -p snaps/snap_081012_1715 \
   snaps/snap_081012_1716 | btrfs receive /mnt
> At subvol snaps/snap_081012_1716
> At snapshot snap_081012_1716

root@orchard:/bkp# ls /mnt
snap_081012_1715  snap_081012_1716

Results:
root@orchard:/bkp# btrfs subvolume list /bkp 
> ID 259 gen 62 top level 5 path subvol
> ID 278 gen 60 top level 5 path snaps/snap_081012_1715
> ID 279 gen 62 top level 5 path snaps/snap_081012_1716
root@orchard:/bkp# btrfs subvolume list /mnt 
> ID 256 gen 8 top level 5 path snap_081012_1715
> ID 259 gen 9 top level 5 path snap_081012_1716

Restart:
root@orchard:/bkp# btrfs subvolume del /mnt/snap_081012_171*
> Delete subvolume '/mnt/snap_081012_1715'
> Delete subvolume '/mnt/snap_081012_1716'

Try and snap to /mnt/
root@orchard:/bkp# mkdir /mnt/snaps
root@orchard:/bkp# btrfs send snaps/snap_081012_1715 | btrfs receive /mnt/snaps
> At subvol snaps/snap_081012_1715
> At subvol snap_081012_1715
root@orchard:/bkp# btrfs send -p snaps/snap_081012_1715 \
   snaps/snap_081012_1716 | btrfs receive /mnt/snaps
> At subvol snaps/snap_081012_1716
> At snapshot snap_081012_1716
> ERROR: open snaps/snap_081012_1715 failed. No such file or directory
root@orchard:/bkp# ls /mnt/snaps
> snap_081012_1715



-- 
Rory Campbell-Lange
r...@campbell-lange.net


Two Issues with Btrfs Delayed Cleaner Process (linux-next)

2012-10-08 Thread Mitch Harder
I've run across two issues with the delayed cleaner process running a
kernel based on the 3.6.0 btrfs-next branch in Josef's git repository.

(1)  I'm getting an error when trying to list my subvolumes whenever
the cleaner thread is running:

# btrfs su li /mnt/benchmark/
ERROR: Failed to lookup path for root 0 - No such file or directory

As long as the cleaner thread is idle, I can run this command without error.

(2)  I ran into an issue on a slower x86 machine (AMD Athlon XP 2600+)
where the cleaner thread literally required an hour to finish deleting
a subvolume that contained the sources for a kernel I had previously
built.

The machine was responsive the whole time, and the cleaner thread
never required much more than 5-10% of the CPU, leaving ample idle
time.

Interestingly, every attempt to replicate this behaviour resulted in
the cleaner thread finishing in a few seconds.

My first issue replicates every time the cleaner thread is running.

I'll need to work on the second issue for a while to see if I can get
it to replicate.


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Richard W.M. Jones
On Mon, Oct 08, 2012 at 06:42:27PM +0200, David Sterba wrote:
> On Mon, Oct 08, 2012 at 04:15:14PM +0100, Richard W.M. Jones wrote:
> > Unfortunately I'm struggling to reproduce this outside of our build
> > system (Koji).  I will keep you informed if I do manage to reproduce
> > it locally.  Adding fsync /dev/sda1 was also my first instinct :-)
> 
> Have you updated the VM/guest related packages recently? This may be a
> bug in the VM drivers.

qemu hasn't been updated for over a week.  However I'm having a hard
time understanding how even a change to qemu's caching would in any
way affect only btrfs and nothing else.  The libguestfs test suite is
extremely comprehensive and tests many other filesystems, and none of
them are failing.

These guests are built on the fly from the latest packages in Fedora,
so any other package might be the cause, but it seems like the kernel
is the most likely candidate.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/


[PATCH 2/4] btrfs-progs: btrfsck: Print which filesystem to be checked to stdout

2012-10-08 Thread Dieter Ries
This patch makes btrfsck print which filesystem is about to be checked
to stdout. This should be helpful when analyzing (copied and pasted)
output of btrfsck.

Signed-off-by: Dieter Ries 
---
 btrfsck.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/btrfsck.c b/btrfsck.c
index 67f4a9d..516bcdf 100644
--- a/btrfsck.c
+++ b/btrfsck.c
@@ -3540,6 +3540,8 @@ int main(int ac, char **av)
} else if(ret) {
fprintf(stderr, "%s is currently mounted. Aborting.\n", av[optind]);
return -EBUSY;
+   } else {
+   printf("Checking filesystem on %s\n",av[optind]);
}
 
info = open_ctree_fs_info(av[optind], bytenr, rw, 1);
-- 
1.7.3.GIT



[PATCH 0/4] Resend: btrfs-progs: Some cosmetic changes (mainly) to btrfsck

2012-10-08 Thread Dieter Ries
Hi,

I sent an earlier version of this patchset before, and didn't get any response,
so here is the next try:

This patch series implements some mainly cosmetic changes to
btrfs-progs, most of them in btrfsck.

As this is my first contribution here, I'd kindly ask you for feedback,
and whether work like this is appreciated in general.

Cheers,

Dieter

Dieter Ries (4):
  btrfs-progs: Remove redundant "Btrfs" string from version string
  btrfs-progs: btrfsck: Print which filesystem to be checked to stdout
  btrfs-progs: btrfsck: Print feedback about fscking to stdout.
  btrfs-progs: btrfsck: Remove binary error code output

 btrfsck.c  |   29 ++---
 version.sh |2 +-
 2 files changed, 23 insertions(+), 8 deletions(-)

-- 
1.7.3.GIT



[PATCH 4/4] btrfs-progs: btrfsck: Remove binary error code output

2012-10-08 Thread Dieter Ries
This patch changes the output after checking a filesystem. Before, the
default output was "found x bytes used err is 0", where the last integer
corresponds to the return value of check_root_refs(), which is either 1
or 0. Now this value is evaluated, and a message saying if errors were
found or not is printed.

Signed-off-by: Dieter Ries 
---
 btrfsck.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/btrfsck.c b/btrfsck.c
index 83275cd..45ce681 100644
--- a/btrfsck.c
+++ b/btrfsck.c
@@ -3613,8 +3613,11 @@ out:
   "backup data and re-format the FS. *\n\n");
ret = 1;
}
-   printf("found %llu bytes used err is %d\n",
-  (unsigned long long)bytes_used, ret);
+   if (ret)
+   printf("Filesystem is damaged! One or more errors found.\n");
+   else
+   printf("Filesystem is clean! No errors found.\n");
+   printf("%llu bytes used\n",(unsigned long long)bytes_used);
printf("total csum bytes: %llu\n",(unsigned long long)total_csum_bytes);
printf("total tree bytes: %llu\n",
   (unsigned long long)total_btree_bytes);
-- 
1.7.3.GIT



[PATCH 3/4] btrfs-progs: btrfsck: Print feedback about fscking to stdout.

2012-10-08 Thread Dieter Ries
Status reports of the checking process should be printed to stdout
instead of stderr, as that is normal program output and not related to
problems in btrfsck. This patch changes this behaviour and adds the
output "Done!" after each of the parts.

Signed-off-by: Dieter Ries 
---
 btrfsck.c |   20 +++-
 1 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/btrfsck.c b/btrfsck.c
index 516bcdf..83275cd 100644
--- a/btrfsck.c
+++ b/btrfsck.c
@@ -3559,7 +3559,7 @@ int main(int ac, char **av)
 
root = info->fs_root;
 
-   fprintf(stderr, "checking extents\n");
+   printf("checking extents... ");
if (rw)
trans = btrfs_start_transaction(root, 1);
 
@@ -3567,22 +3567,32 @@ int main(int ac, char **av)
fprintf(stderr, "Reinit crc root\n");
ret = btrfs_fsck_reinit_root(trans, info->csum_root);
if (ret) {
+   printf("\n");
fprintf(stderr, "crc root initialization failed\n");
return -EIO;
}
goto out;
}
ret = check_extents(trans, root, repair);
-   if (ret)
+   if (ret) {
fprintf(stderr, "Errors found in extent allocation tree\n");
+   printf("\n");
+   }
+   else
+   printf("Done!\n");
 
-   fprintf(stderr, "checking fs roots\n");
+   printf("checking fs roots... ");
ret = check_fs_roots(root, &root_cache);
-   if (ret)
+   if (ret) {
+   printf("\n");
goto out;
+   }
+   else
+   printf("Done!\n");
 
-   fprintf(stderr, "checking root refs\n");
+   printf("checking root refs... ");
ret = check_root_refs(root, &root_cache);
+   printf("Done!\n");
 out:
free_root_recs(&root_cache);
if (rw) {
-- 
1.7.3.GIT
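
One detail worth keeping in mind with messages such as "checking extents... "
that do not end in a newline: stdout is typically line-buffered on a terminal
and fully buffered otherwise, so the partial line may not show up until the
next newline is printed, and it can be reordered relative to stderr output.
A minimal sketch of one way to handle that, assuming two hypothetical helpers,
progress_begin() and progress_end(), that are not part of this patch:

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: print "<what>... " and flush,
 * so the partial line is visible before the (possibly slow) check runs.
 */
void progress_begin(FILE *out, const char *what)
{
	fprintf(out, "%s... ", what);
	fflush(out);
}

/*
 * Finish the line: "Done!" on success, a bare newline on failure so that
 * error text printed elsewhere starts on a line of its own.
 */
void progress_end(FILE *out, int ret)
{
	fputs(ret ? "\n" : "Done!\n", out);
	fflush(out);
}
```

In btrfsck this would be used as progress_begin(stdout, "checking extents")
before the check and progress_end(stdout, ret) after it; whether the extra
flushes are wanted there is a judgment call for the maintainers.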



[PATCH 1/4] btrfs-progs: Remove redundant "Btrfs" string from version string

2012-10-08 Thread Dieter Ries
In the first line of version.sh, $v was set to "Btrfs vx.yy", and in
the end "Btrfs $v" was echoed to the version.h file. This resulted in
the version string "Btrfs Btrfs vx.yy". This patch removes the second
occurrence of "Btrfs".

Signed-off-by: Dieter Ries 
---
 version.sh |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/version.sh b/version.sh
index af3e441..2c1aff1 100644
--- a/version.sh
+++ b/version.sh
@@ -46,7 +46,7 @@ fi
  
 echo "#ifndef __BUILD_VERSION" > .build-version.h
 echo "#define __BUILD_VERSION" >> .build-version.h
-echo "#define BTRFS_BUILD_VERSION \"Btrfs $v\"" >> .build-version.h
+echo "#define BTRFS_BUILD_VERSION \"$v\"" >> .build-version.h
 echo "#endif" >> .build-version.h
 
 diff -q version.h .build-version.h >& /dev/null
-- 
1.7.3.GIT



Re: working quota example?

2012-10-08 Thread matthieu Barthélemy
> On Mon, Oct 8, 2012 at 3:01 PM, Arne Jansen  wrote:
>>
>>
>> Please refer here  for a discussion of the
>> meaning of those values.

Thanks, I already read your paper a few days ago, but it was a quick
reading in order to try to understand the concepts.


>> Your mistake was to create the group 0/100 yourself. The command qgroup
>> create is only needed to create quota groups of subvolumes.

Yes, for me it wasn't clear if one could create quotas without having
first to play with qgroups explicitly.
Thanks for pointing that out !

>>
>> To limit the subvol:
>>
>> # btrfs qgroup limit 2m /mnt/test/sub1
>> # dd if=/dev/zero of=/mnt/test/sub1/file1 bs=10485760 count=1
>> dd: writing `/mnt/test/sub1/file1': Disk quota exceeded
>> 1+0 records in
>> 0+0 records out
>> 1966080 bytes (2.0 MB) copied, 0.0056283 s, 349 MB/s
>>
>> Hope that helps :)
>
Totally :)

Are there any plans to get a better 'btrfs quota show' output? Maybe
with more details, maybe a simple '1 subvolume + all its snapshots'
accounting.
Maybe I missed something, and I admit I didn't read all the btrfs-progs
patches related to qgroups, but there doesn't seem to be an option to show
a subvolume's quota limit (not the referenced/exclusive usage counters). Am
I right?

Thanks again Arne for your help!


Re: working quota example?

2012-10-08 Thread Arne Jansen
On 10/08/12 21:31, matthieu Barthélemy wrote:

> 
> Are there any plans to get a better 'btrfs quota show' output?

Definitely. The first priority was to get the kernel part running; once
that is settled, we can improve the user mode part. There's also still
some work to do to make the tracking qgroups more presentable.

> Maybe with more details, maybe a simple ' 1 subvolume + all its
> snapshots'  accounting.

Well, there's no such thing as 'the snapshot of a subvolume'. In btrfs,
each snapshot is instantly a subvolume in its own right. Btrfs doesn't
really keep track of which snapshot is a descendant of which.

> Maybe I missed something, and I admit I didn't read all the btrfs-progs
> patches related to qgroups, but there doesn't seem to be an option to
> show a subvolume's quota limit (not the referenced/exclusive usage
> counters). Am I right?

Probably, I don't remember :) That should be easy to fix anyway.

> 
> Thanks again Arne for your help!
> 
> 
> 



Re: btrfs receive to subdirectory

2012-10-08 Thread Arne Jansen
On 10/08/12 18:30, Rory Campbell-Lange wrote:
> I can send snapshots to , but not /. Please advise
> if what I am doing is incorrect.
> 
> Rory
> 
> Format usb3 disk and mount
>   root@orchard:/bkp# mkfs.btrfs /dev/sdb1
>   > WARNING! - Btrfs v0.20-rc1-37-g91d9eec IS EXPERIMENTAL
>   > WARNING! - see http://btrfs.wiki.kernel.org before using
>   > fs created label (null) on /dev/sdb1
>   >   nodesize 4096 leafsize 4096 sectorsize 4096 size 698.64GB
>   > Btrfs v0.20-rc1-37-g91d9eec
>   mount /dev/sdb1 /mnt
> 
> 
> Create snapshots on /bkp share
>   root@orchard:/bkp# btrfs subvolume snapshot -r subvol snaps/snap_081012_1715
>   > Create a readonly snapshot of 'subvol' in 'snaps/snap_081012_1715'
>   root@orchard:/bkp# mutt -f subvol/INBOX/
>   > 1561 kept, 18 deleted.
>   root@orchard:/bkp# btrfs subvolume snapshot -r subvol snaps/snap_081012_1716
>   > Create a readonly snapshot of 'subvol' in 'snaps/snap_081012_1716'
> 
> Send base backup to /mnt
>   root@orchard:/bkp# btrfs send snaps/snap_081012_1715 | btrfs receive /mnt
>   > At subvol snaps/snap_081012_1715
>   > At subvol snap_081012_1715
> 
> Send incremental backup to /mnt
>   root@orchard:/bkp# btrfs send -p snaps/snap_081012_1715 \
>snaps/snap_081012_1716 | btrfs receive /mnt
>   > At subvol snaps/snap_081012_1716
>   > At snapshot snap_081012_1716
> 
>   root@orchard:/bkp# ls /mnt
>   snap_081012_1715  snap_081012_1716
> 
> Results:
>   root@orchard:/bkp# btrfs subvolume list /bkp 
>   > ID 259 gen 62 top level 5 path subvol
>   > ID 278 gen 60 top level 5 path snaps/snap_081012_1715
>   > ID 279 gen 62 top level 5 path snaps/snap_081012_1716
>   root@orchard:/bkp# btrfs subvolume list /mnt 
>   > ID 256 gen 8 top level 5 path snap_081012_1715
>   > ID 259 gen 9 top level 5 path snap_081012_1716
> 
> Restart:
>   root@orchard:/bkp# btrfs subvolume del /mnt/snap_081012_171*
>   > Delete subvolume '/mnt/snap_081012_1715'
>   > Delete subvolume '/mnt/snap_081012_1716'
> 
> Try and snap to /mnt/
>   root@orchard:/bkp# mkdir /mnt/snaps
>   root@orchard:/bkp# btrfs send snaps/snap_081012_1715 | btrfs receive /mnt/snaps
>   > At subvol snaps/snap_081012_1715
>   > At subvol snap_081012_1715
>   root@orchard:/bkp# btrfs send -p snaps/snap_081012_1715 \
>  snaps/snap_081012_1716 | btrfs receive /mnt/snaps
>   > At subvol snaps/snap_081012_1716
>   > At snapshot snap_081012_1716
>   > ERROR: open snaps/snap_081012_1715 failed. No such file or directory
> root@orchard:/bkp# ls /mnt/snaps
> > snap_081012_1715

The target has to be a subvol also. But interestingly enough, it also
fails for a subvol. The base send works, the incremental fails, because
btrfs receive can't find snaps/snap_081012_1715. If you give /mnt/snaps
as the target for the base and just /mnt for the incremental, it works.
There's clearly something broken there...
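
One possible reading of the transcript, offered purely as an assumption:
receive seems to take the parent's path relative to the top of the
destination filesystem (here snaps/snap_081012_1715) and open it relative to
the target directory. The two only coincide when the target is the mount
point itself. The sketch below just illustrates that path arithmetic;
resolve_parent() is a hypothetical helper and not btrfs-progs code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical illustration, not btrfs-progs code: build the path that would
 * be opened if the parent's path (relative to the top of the destination
 * filesystem) were appended to the receive target directory.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int resolve_parent(const char *target_dir, const char *parent_path,
		   char *out, size_t out_len)
{
	int n = snprintf(out, out_len, "%s/%s", target_dir, parent_path);

	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With target /mnt/snaps this yields /mnt/snaps/snaps/snap_081012_1715, which
does not exist in the transcript; with target /mnt it yields
/mnt/snaps/snap_081012_1715, which does. That would match both the ENOENT and
the workaround of receiving the incremental into /mnt.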

-arne

> 
> 
> 



Re: Experiences: Why BTRFS had to yield for ZFS

2012-10-08 Thread Avi Miller
Hi,

On 09/10/2012, at 1:38 AM, Casper Bang  wrote:

> If you do have a suspicion or insight on the matter (perhaps work for Oracle,
> or know OUK?), of course we'd love a followup offline this list.


I've sent an email to Casper to follow this up offline.

Thanks,
Avi

--
Oracle 
Avi Miller | Principal Program Manager | +61 (412) 229 687
Oracle Linux and Virtualization
417 St Kilda Road, Melbourne, Victoria 3004 Australia








Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Richard W.M. Jones
On Mon, Oct 08, 2012 at 04:15:14PM +0100, Richard W.M. Jones wrote:
> On Mon, Oct 08, 2012 at 11:04:19AM -0400, Chris Mason wrote:
> > On Mon, Oct 08, 2012 at 08:57:30AM -0600, Richard W.M. Jones wrote:
> > > On Mon, Oct 08, 2012 at 10:27:57AM -0400, Chris Mason wrote:
> > > > On Mon, Oct 08, 2012 at 08:16:42AM -0600, Richard W.M. Jones wrote:
> > > > > 
> > > > > I'm tracking this bug here:
> > > > > 
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=863978
> > > > > 
> > > > > Since approx. last week I'm seeing lots of failures in btrfs.  The
> > > > > common factor seems to be that the filesystem is created (mkfs.btrfs
> > > > > /dev/sda1) and then it is immediately used -- eg.  mounted or some
> > > > > btrfs subtool is run on it.  There is no pause or sync between the
> > > > > operations.
> > > > 
> > > > This was a problem on older btrfs-progs, but this commit:
> > > > 
> > > > btrfs-progs-0.19.20120817git043a639-1.fc19.i686
> > > > 
> > > > (043a639) has long had the fixes to flush things after mkfs.  Is there
> > > > any chance the guest you're testing had an ancient progs on it?
> > > 
> > > We have a couple of guests where this fails.  One has
> > > btrfs-progs-0.19.20120817git043a639-1.fc19.i686.  The other has
> > > btrfs-progs-0.19-20.fc18 which appears to be based on
> > > btrfs-progs-0.19.20120817git043a639.tar.bz2 plus some upstream
> > > patches.
> > > 
> > > What is the commit which we need?  I can't see anything related to
> > > this in the btrfs-progs git log.
> > 
> > Sorry, I was remembering wrong.  I fixed this up in the kernel by
> > running invalidate_bdev during mount.  I just double checked and the
> > invalidates look right, so something strange must be going on.
> > 
> > If it is possible to reproduce this reliably, could you please check and
> > see if syncs do fix it?  We saw this often with xfstests in the past,
> > but haven't seen it since the invalidates were added.
> 
> Unfortunately I'm struggling to reproduce this outside of our build
> system (Koji).  I will keep you informed if I do manage to reproduce
> it locally.  Adding fsync /dev/sda1 was also my first instinct :-)

I have now reproduced this bug locally.

Adding sync() + fsync of each /dev/sd* device after the mkfs command
does appear to fix the problem.

However it's a little bit difficult to know for sure because I might
just be changing the timing of things by adding these calls.
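
For reference, a minimal C sketch of the kind of flush described above: a
global sync() followed by an fsync() on each device node. The helper names
flush_device() and flush_all() are illustrative assumptions, not code from
libguestfs or btrfs-progs, and the sketch relies on Linux accepting fsync()
on a descriptor opened read-only.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from libguestfs or btrfs-progs): flush cached
 * writes for one path. Returns 0 on success, -1 if open(), fsync() or
 * close() fails.
 */
int flush_device(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;

	ret = fsync(fd);	/* write back dirty data for this file or device */
	if (close(fd) < 0)
		ret = -1;
	return ret;
}

/* Flush everything, then each listed path; returns the number of failures. */
int flush_all(const char *const *paths, int count)
{
	int failures = 0;
	int i;

	sync();			/* ask the kernel to write out all dirty data */
	for (i = 0; i < count; i++)
		if (flush_device(paths[i]) != 0)
			failures++;
	return failures;
}
```

Calling flush_all() with the /dev/sd* nodes right after mkfs would mirror the
workaround above, with the same caveat that it may only be changing the
timing.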

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/


Re: Anyone seeing lots of "Check tree block failed" and other errors with latest kernel?

2012-10-08 Thread Chris Mason
On Mon, Oct 08, 2012 at 03:22:30PM -0600, Richard W.M. Jones wrote:
> 
> I have now reproduced this bug locally.
> 
> Adding sync() + fsync of each /dev/sd* device after the mkfs command
> does appear to fix the problem.
> 
> However it's a little bit difficult to know for sure because I might
> just be changing the timing of things by adding these calls.

Ok, what's a rough idea of the mainline git equiv of the buggy kernel?

-chris



Re: [RFC][PATCH] Btrfs-progs, btrfsck: add block group check function

2012-10-08 Thread Miao Xie
On Tue, 2 Oct 2012 20:21:56 -0400, Chris Mason wrote:
> On Thu, Aug 02, 2012 at 04:14:37AM -0600, Miao Xie wrote:
>> From: Chen Yang 
>>
>> This patch adds the function to check correspondence between block group,
>> chunk and device extent.
> 
> Excellent, thank you.  Could you please put this into a git tree?  I'm
> having trouble with whitespace (tabs -> spaces) in the patch.

You can pull it from

git://github.com/miaoxie/btrfs-progs.git block-group-check

I have rebased it on the latest tree.

Thanks
Miao


Re: [PATCH V4 0/7 ] Btrfs-progs: enhance btrfs subvol list only to show read-only snapshots

2012-10-08 Thread Miao Xie
On Tue, 2 Oct 2012 20:03:04 -0400, Chris Mason wrote:
> On Tue, Sep 18, 2012 at 04:35:25AM -0600, Miao Xie wrote:
>> We want 'btrfs subvolume list' to be able to list only read-only subvolumes;
>> this patch set introduces a new option 'r' to implement it.
>>
>> You can use the command like that:
>>
>>  btrfs subvolume list -r 
>>
>> Changelog v3 -> v4:
>> - modify the check method which is used to check if btrfs_root_item contains
>>   otime and uuid or not.
>> - add filter set and comparer set which are used to manage the filters and
>>   comparers specified by the users.
>> - re-base the read-only subvolume list function.
>>
>> Changelog v2 -> v3:
>> - re-implement list_subvols()
>> - re-implement this read-only subvolume list function based on the new
>>   list_subvols()
>>
>> Changelog v1 -> v2:
>> - address the comments from Goffredo Baroncelli
>>   i,   change the changelog of the patches and make them more elaborate.
>>   ii,  move the function declarations to a new head file.
>>   iii, add the introduction of the new option 'r' into the man page
>>
>> We can pull the patches from the URL
>>
>>  git://github.com/miaoxie/btrfs-progs.git master
>>
>> This patchset is against the patches of Liu Bo, Anand Jain and mine which
>> were sent several days ago. We can also pull those patches from the above URL.
> 
> These are all really useful additions!  How do you plan on using the
> table option? (just curious).

It is just used to make the output easier to read. The old output always prints
lots of inessential repeated strings; it is very ugly and makes us dizzy. So
we added this option to make users happy.

Thanks
Miao