[PATCH v2] Btrfs-progs: update usage message for cmds-restore

2013-07-10 Thread Filipe David Borba Manana
Mention that a target path argument is mandatory unless
the -l option is supplied. Also document the -l option,
which was previously missing from the usage message.

$ btrfs restore -v /dev/sdb3
usage: btrfs restore [options] <device>

Try to restore files from a damaged filesystem (unmounted)

-s  get snapshots
-v  verbose
-i  ignore errors
-o  overwrite
-t  tree location
-f <offset>  filesystem location
-u <block>  super mirror
-d  find dir
$ echo $?
129

After specifying a target path, the command works as expected:

$ btrfs restore -v /dev/sdb3 files2/
Restoring files2/file1
Done searching
$ echo $?
0

V2: Updated command synopsis by suggestion of Anand Jain.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 cmds-restore.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index eca528d..d362d79 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -956,7 +956,7 @@ out:
 }
 
 const char * const cmd_restore_usage[] = {
-	"btrfs restore [options] <device>",
+	"btrfs restore [options] <device> <path> | -l <device>",
 	"Try to restore files from a damaged filesystem (unmounted)",
 	"",
 	"-s              get snapshots",
@@ -967,6 +967,7 @@ const char * const cmd_restore_usage[] = {
 	"-f <offset>     filesystem location",
 	"-u <block>      super mirror",
 	"-d              find dir",
+	"-l              list roots",
 	NULL
 };
 
-- 
1.7.9.5
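
The rule stated in the new synopsis can be sketched as a small standalone check. This is only an illustration of the synopsis, not code from cmds-restore.c: the helper name restore_args_ok() and its parameters are hypothetical, and it assumes option parsing has already consumed the flags.

```c
#include <assert.h>

/*
 * Hypothetical helper, not taken from cmds-restore.c.
 * list_roots:     non-zero when -l was given
 * remaining_args: positional arguments left after option parsing
 * Returns 1 when the invocation matches the synopsis, 0 otherwise.
 */
int restore_args_ok(int list_roots, int remaining_args)
{
	assert(remaining_args >= 0);

	if (list_roots)
		return remaining_args >= 1;	/* -l <device> */
	return remaining_args >= 2;		/* <device> <path> */
}
```

With this check, "btrfs restore -v /dev/sdb3" (one positional argument, no -l) would be rejected, which matches the usage error shown in the commit message.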

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs crashes

2013-07-10 Thread Franziska Näpelt

I created a bug:

https://bugzilla.kernel.org/show_bug.cgi?id=60544

On 08.07.2013 15:24, Josef Bacik wrote:

On Mon, Jul 08, 2013 at 08:46:17AM +0200, Franziska Näpelt wrote:

Hi everybody,

we are using a btrfs RAID 1 with four 2TB hard drives on a Debian 7.1
(Kernel 3.9.6).

After about one year of operation, an error appeared in the messages log and
the filesystem was remounted read-only.

After that I restarted the system, but that didn't fix the problem. The
btrfs filesystem could no longer be mounted.

I attach four logs:

- when the error occurred
- after the error occurred
- when rebooting the system
- when I tried to mount manually

Before I rebuilt the filesystem (formatting everything and creating a new
btrfs-pool) I made a btrfs-image. I can provide it to you.



Can you file a bugzilla at bugzilla.kernel.org (make sure the component is set
to btrfs) with all of this information and a link to the image, and the output
of btrfsck?  Please use the most recent version of btrfs-progs:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

Thanks,

Josef


[PATCH] Btrfs: num_tolerated_disk_barrier_failures is incorrect for RAID6

2013-07-10 Thread Ilya Dryomov
Currently num_tolerated_disk_barrier_failures gets the value of
fs_devices->num_devices in the RAID6 case.  But, RAID6 can tolerate only
two simultaneous failures, so set it to 2.

CC: Stefan Behrens sbehr...@giantdisaster.de
Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/disk-io.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b8b60b6..aecf788 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3258,7 +3258,7 @@ int btrfs_calc_num_tolerated_disk_barrier_failures(
 						     BTRFS_BLOCK_GROUP_RAID10)) {
 						num_tolerated_disk_barrier_failures = 1;
 					} else if (flags &
-						   BTRFS_BLOCK_GROUP_RAID5) {
+						   BTRFS_BLOCK_GROUP_RAID6) {
 						num_tolerated_disk_barrier_failures = 2;
 					}
 				}
-- 
1.7.10.4
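
The tolerance rule behind this patch can be sketched in isolation. The profile bits and the function name below are hypothetical stand-ins rather than the kernel's definitions, and the sketch looks at a single profile only, whereas the kernel function takes the minimum over all block group types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical profile bits for illustration; not the kernel's values. */
#define PROFILE_RAID1	(1ULL << 0)
#define PROFILE_RAID10	(1ULL << 1)
#define PROFILE_RAID5	(1ULL << 2)
#define PROFILE_RAID6	(1ULL << 3)
#define PROFILE_ALL	(PROFILE_RAID1 | PROFILE_RAID10 | \
			 PROFILE_RAID5 | PROFILE_RAID6)

/*
 * Return how many devices may fail before data stored with the given
 * profile is lost. No profile bit set means no redundancy.
 */
int tolerated_device_failures(uint64_t flags)
{
	assert((flags & ~PROFILE_ALL) == 0);	/* reject unknown bits */

	if (flags & PROFILE_RAID6)
		return 2;	/* two parity stripes */
	if (flags & (PROFILE_RAID1 | PROFILE_RAID10 | PROFILE_RAID5))
		return 1;	/* one extra copy or one parity stripe */
	return 0;		/* single or striped only */
}
```

The point of the patch is the first branch: the value 2 belongs to RAID6, not RAID5.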



Re: [PATCH] Btrfs: num_tolerated_disk_barrier_failures is incorrect for RAID6

2013-07-10 Thread Stefan Behrens
On Wed, 10 Jul 2013 14:54:30 +0300, Ilya Dryomov wrote:
> Currently num_tolerated_disk_barrier_failures gets the value of
> fs_devices->num_devices in the RAID6 case.  But, RAID6 can tolerate only
> two simultaneous failures, so set it to 2.
>
> CC: Stefan Behrens <sbehr...@giantdisaster.de>
> Signed-off-by: Ilya Dryomov <idryo...@gmail.com>
> ---
>  fs/btrfs/disk-io.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index b8b60b6..aecf788 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3258,7 +3258,7 @@ int btrfs_calc_num_tolerated_disk_barrier_failures(
>  						     BTRFS_BLOCK_GROUP_RAID10)) {
>  						num_tolerated_disk_barrier_failures = 1;
>  					} else if (flags &
> -						   BTRFS_BLOCK_GROUP_RAID5) {
> +						   BTRFS_BLOCK_GROUP_RAID6) {
>  						num_tolerated_disk_barrier_failures = 2;
>  					}
>  				}

Too late :), Henrik Nordvik already fixed it with commit 15b0a89d7.



Re: performance loss with lots of snapshots

2013-07-10 Thread Josef Bacik
On Wed, Jul 10, 2013 at 12:54:44PM +1000, Russell Coker wrote:
> There are two uses of backups, recovering from user errors (IE deleting the
> wrong file) and recovering from sysadmin errors or hardware failures (IE
> disks are dead or wiped).  For the former use I'm mainly using BTRFS
> snapshots on many systems.
>
> A problem that I have had on more than a few occasions (most recently on the
> latest Debian 3.9 kernel) is of severe performance loss.  A few days ago this
> happened on a workstation running an Intel 120G SSD device for the root
> filesystem which was being used for basic workstation tasks (kmail, GIMP,
> OpenOffice, etc).  The /home and / subvols had about 400 snapshots between
> them (which doesn't seem like a huge number) when the system became unusably
> slow while running a scrub from a cron job, programs like GIMP became stuck
> in D state.  The system in question has 8G of RAM and very light load, there
> shouldn't be any reason for it not giving good performance while the scrub
> was in progress and it definitely should have performed well when the scrub
> was cancelled.  But it didn't return to decent performance until I deleted
> about 300 snapshots.
>
> This has happened to me often enough that I can probably reproduce it on a
> VM.  What kernel should I use for such tests?
>
> If I get a virtual machine in a state where it has ongoing performance
> problems would any of the BTRFS developers like root access to debug it?

There is a memory-leak-like problem in scrub: it doesn't free the csums it has
looked up until after it's done scrubbing an area, which can lead to OOMs or
degraded performance.  Btrfs-next has the fix, as does the pull request that
just went to Linus, so pick whichever you want, run again, and see if that
helps.  I imagine you are probably seeing two things, first that OOM-ish
behavior and then some other performance gotcha with a fair number of snapshots,
but just in case.  Thanks,

Josef


Re: [PATCH V2 1/2] Btrfs-progs: make pretty_sizes() work less error prone

2013-07-10 Thread David Sterba
On Tue, Jul 09, 2013 at 01:24:43PM -0700, Zach Brown wrote:
> > The original codes don't handle error gracefully and some places
> > forget to free memory. We can allocate memory before calling pretty_sizes(),
> > for example, we can use static memory allocation and we don't have to deal
> > with memory allocation fails.
>
> I agree that callers shouldn't have to know to free allocated memory.
>
> But I think that we can do better and not have callers need to worry
> about per-call string storage at all.
>
> How about something like this?

Neat trick! A few neat-picks below. Besides, I guess we can use this
sort of trick with the fi-df patches.

> --- a/utils.c
> +++ b/utils.c
> @@ -1153,12 +1153,13 @@ out:
>  
>  static char *size_strs[] = { "", "KB", "MB", "GB", "TB",
>  			    "PB", "EB", "ZB", "YB"};

I'll drop the ZB, YB suffixes.

> --- a/utils.h
> +++ b/utils.h
> @@ -44,7 +44,15 @@ int check_mounted_where(int fd, const char *file, char *where, int size,
>  				 struct btrfs_fs_devices **fs_devices_mnt);
>  int btrfs_device_already_in_root(struct btrfs_root *root, int fd,
>  				 int super_offset);
> -char *pretty_sizes(u64 size);
> +
> +void pretty_size_snprintf(u64 size, char *str, size_t str_bytes);
> +#define pretty_sizes(size) 						\

and rename it to pretty_size, as it takes only one number

> +	({								\
> +		static __thread char _str[16];				\

16 is not enough for exabyte scale; that needs at least 20 bytes + 1 for
the terminating 0.

len(str(2**64)) = 20

-> 24

> +		pretty_size_snprintf(size, _str, sizeof(_str));	\

		pretty_size_snprintf((size), _str, sizeof(_str));	\

As these are only trivial changes I'll fix them at commit time.

> +		_str;							\
> +	})
> +
>  int get_mountpt(char *dev, char *mntpt, size_t size);
>  int btrfs_scan_block_devices(int run_ioctl);
>  u64 parse_size(char *s);
> -- 
> 1.7.11.7


[PATCH] btrfs-progs: per-thread, per-call pretty buffer

2013-07-10 Thread David Sterba
From: Zach Brown z...@redhat.com

From: Zach Brown z...@redhat.com

We don't need callers to manage string storage for each pretty_sizes()
call.  We can use a macro to have per-thread and per-call static storage
so that pretty_sizes() can be used as many times as needed in printf()
arguments without requiring a bunch of supporting variables.

This lets us have a natural interface at the cost of requiring __thread
and TLS from gcc and a small amount of static storage.  This seems
better than the current code or doing something with illegible format
specifier macros.

Signed-off-by: Zach Brown z...@redhat.com
Signed-off-by: David Sterba dste...@suse.cz
---

I've updated the rest of pretty_size callers in targets that were not built by
default.

 btrfs-calc-size.c | 13 +++--
 btrfs-fragments.c |  2 +-
 cmds-filesystem.c | 27 +--
 cmds-scrub.c  |  8 
 mkfs.c|  4 +---
 utils.c   | 19 ++-
 utils.h   | 10 +-
 7 files changed, 37 insertions(+), 46 deletions(-)

diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c
index c4adfb0..5aa0b70 100644
--- a/btrfs-calc-size.c
+++ b/btrfs-calc-size.c
@@ -162,18 +162,11 @@ out_print:
 		       stat.total_inline, stat.total_nodes, stat.total_leaves,
 		       level + 1);
 	} else {
-		char *total_size;
-		char *inline_size;
-
-		total_size = pretty_sizes(stat.total_bytes);
-		inline_size = pretty_sizes(stat.total_inline);
-
 		printf("\t%s total size, %s inline data, %Lu nodes, "
 		       "%Lu leaves, %d levels\n",
-		       total_size, inline_size, stat.total_nodes,
-		       stat.total_leaves, level + 1);
-		free(total_size);
-		free(inline_size);
+		       pretty_size(stat.total_bytes),
+		       pretty_size(stat.total_inline),
+		       stat.total_nodes, stat.total_leaves, level + 1);
 	}
 out:
 	btrfs_free_path(path);
diff --git a/btrfs-fragments.c b/btrfs-fragments.c
index a012fe1..7ec77e7 100644
--- a/btrfs-fragments.c
+++ b/btrfs-fragments.c
@@ -87,7 +87,7 @@ print_bg(FILE *html, char *name, u64 start, u64 len, u64 used, u64 flags,
 
 	fprintf(html, "<p>%s chunk starts at %lld, size is %s, %.2f%% used, "
 		      "%.2f%% fragmented</p>\n", chunk_type(flags), start,
-		      pretty_sizes(len), 100.0 * used / len, 100.0 * frag);
+		      pretty_size(len), 100.0 * used / len, 100.0 * frag);
 	fprintf(html, "<img src=\"%s\" border=\"1\" />\n", name);
 }
 
diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index f41a72a..222e458 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -111,8 +111,6 @@ static int cmd_df(int argc, char **argv)
 
 	for (i = 0; i < sargs->total_spaces; i++) {
 		char description[80];
-		char *total_bytes;
-		char *used_bytes;
 		int written = 0;
 		u64 flags = sargs->spaces[i].flags;
 
@@ -155,10 +153,9 @@ static int cmd_df(int argc, char **argv)
 			written += 7;
 		}
 
-		total_bytes = pretty_sizes(sargs->spaces[i].total_bytes);
-		used_bytes = pretty_sizes(sargs->spaces[i].used_bytes);
-		printf("%s: total=%s, used=%s\n", description, total_bytes,
-		       used_bytes);
+		printf("%s: total=%s, used=%s\n", description,
+			pretty_size(sargs->spaces[i].total_bytes),
+			pretty_size(sargs->spaces[i].used_bytes));
 	}
 	close(fd);
 	free(sargs);
@@ -192,7 +189,6 @@ static void print_one_uuid(struct btrfs_fs_devices *fs_devices)
 	char uuidbuf[37];
 	struct list_head *cur;
 	struct btrfs_device *device;
-	char *super_bytes_used;
 	u64 devs_found = 0;
 	u64 total;
 
@@ -204,25 +200,20 @@ static void print_one_uuid(struct btrfs_fs_devices *fs_devices)
 	else
 		printf("Label: none ");
 
-	super_bytes_used = pretty_sizes(device->super_bytes_used);
 
 	total = device->total_devs;
 	printf(" uuid: %s\n\tTotal devices %llu FS bytes used %s\n", uuidbuf,
-	       (unsigned long long)total, super_bytes_used);
-
-	free(super_bytes_used);
+	       (unsigned long long)total,
+	       pretty_size(device->super_bytes_used));
 
 	list_for_each(cur, &fs_devices->devices) {
-		char *total_bytes;
-		char *bytes_used;
 		device = list_entry(cur, struct btrfs_device, dev_list);
-		total_bytes = pretty_sizes(device->total_bytes);
-		bytes_used = pretty_sizes(device->bytes_used);
+
 		printf("\tdevid %4llu size %s used %s path %s\n",
 		       (unsigned long long)device->devid,
-		       total_bytes, bytes_used, device->name);
-  

Re: [PATCH] btrfs-progs: per-thread, per-call pretty buffer

2013-07-10 Thread Wang Shilong
Hello David,

> From: Zach Brown <z...@redhat.com>

duplicate information.

> From: Zach Brown <z...@redhat.com>
>
> We don't need callers to manage string storage for each pretty_sizes()
> call.  We can use a macro to have per-thread and per-call static storage
> so that pretty_sizes() can be used as many times as needed in printf()
> arguments without requiring a bunch of supporting variables.
>
> This lets us have a natural interface at the cost of requiring __thread
> and TLS from gcc and a small amount of static storage.  This seems
> better than the current code or doing something with illegible format
> specifier macros.
>
> Signed-off-by: Zach Brown <z...@redhat.com>
> Signed-off-by: David Sterba <dste...@suse.cz>

OK.  Please add my tag: Acked-by: Wang Shilong <wangs.f...@cn.fujitsu.com>
(I already gave my tag in the previous thread to Zach, with you in Cc!)

Zach's solution is better, but I did at least report the problem and try to
fix it, didn't I?

Thanks
Wang


[PATCH] Btrfs-progs: fix restore command leaving corrupted files

2013-07-10 Thread Filipe David Borba Manana
When there are files that have parts shared with snapshots, the
restore command was incorrectly restoring them, as it was not
taking into account the offset and number of bytes fields from
the file extent item. Besides leaving the recovered file corrupt,
it was also inefficient as it read and wrote more data than needed
(with each extent copy overwriting portions of the one previously
written).

The following steps and small C program show how to reproduce this
corruption issue:

$ mkfs.btrfs -f  /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ ./write_file /mnt/btrfs/foobar
$ du -b /mnt/btrfs/foobar
1048926 /mnt/btrfs/foobar
$ md5sum /mnt/btrfs/foobar
f9f778f3a7410c40e4ed104a3a63c3c4  /mnt/btrfs/foobar

$ btrfs subvolume snapshot /mnt/btrfs /mnt/btrfs/my_snap
$ perl -e 'open($f, "+<", "/mnt/btrfs/foobar"); seek($f, 4096, 0); print $f "\xff"; close($f);'
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar
$ umount /mnt/btrfs

$ btrfs restore /dev/sdb3 /tmp/copy
$ du -b /tmp/copy/foobar
1048926 /tmp/copy/foobar
$ md5sum /tmp/copy/foobar
88db338cbc1c44dfabae083f1ce642d5  /tmp/copy/foobar
$ od -t x1 -j 8192 -N 4 /tmp/copy/foobar
0020000 41 00 00 00
0020004
$ mount /dev/sdb3 /mnt/btrfs
$ od -t x1 -j 8192 -N 4 /mnt/btrfs/foobar
0020000 00 00 00 00
0020004
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar

$ cat write_file.c:

int main(int argc, char *argv[])
{
    int fd;
    unsigned char buf[BUF_SIZE];

    if (argc < 2) {
        fprintf(stderr, "Use:  %s filepath\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
    assert(fd >= 0);
    memset(buf, 0, BUF_SIZE);
    buf[0] = 65;
    assert(write(fd, buf, BUF_SIZE) == BUF_SIZE);
    assert(close(fd) == 0);
    return 0;
}

Tested this change with zlib, lzo compression and file sizes larger
than 1GiB, and found no regression or other corruption issues (so far
at least).

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 cmds-restore.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..9688599 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -272,6 +272,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	u64 bytenr;
 	u64 ram_size;
 	u64 disk_size;
+	u64 num_bytes;
 	u64 length;
 	u64 size_left;
 	u64 dev_bytenr;
@@ -288,7 +289,9 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	disk_size = btrfs_file_extent_disk_num_bytes(leaf, fi);
 	ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
 	offset = btrfs_file_extent_offset(leaf, fi);
-	size_left = disk_size;
+	num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
+	size_left = num_bytes;
+	bytenr += offset;
 
 	if (offset)
 		printf("offset is %Lu\n", offset);
@@ -296,7 +299,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	if (disk_size == 0)
 		return 0;
 
-	inbuf = malloc(disk_size);
+	inbuf = malloc(size_left);
 	if (!inbuf) {
 		fprintf(stderr, "No memory\n");
 		return -1;
@@ -351,8 +354,8 @@ again:
 		goto again;
 
 	if (compress == BTRFS_COMPRESS_NONE) {
-		while (total < ram_size) {
-			done = pwrite(fd, inbuf+total, ram_size-total,
+		while (total < num_bytes) {
+			done = pwrite(fd, inbuf+total, num_bytes-total,
 				      pos+total);
 			if (done < 0) {
 				ret = -1;
@@ -365,7 +368,7 @@ again:
 		goto out;
 	}
 
-	ret = decompress(inbuf, outbuf, disk_size, ram_size, compress);
+	ret = decompress(inbuf, outbuf, num_bytes, ram_size, compress);
 	if (ret) {
 		num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
 					      bytenr, length);
-- 
1.7.9.5
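
The idea behind the fix can be illustrated with a simplified, in-memory model. This is a sketch only, assuming an uncompressed extent; struct extent_ref and copy_extent_slice() are hypothetical names that merely mirror the offset and number-of-bytes fields of a file extent item, and nothing here is btrfs-progs code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a btrfs file extent item. */
struct extent_ref {
	uint64_t disk_bytenr;		/* start of the whole extent on "disk" */
	uint64_t disk_num_bytes;	/* size of the whole extent */
	uint64_t offset;		/* start of the file's slice inside the extent */
	uint64_t num_bytes;		/* length of the slice the file references */
	uint64_t file_pos;		/* where the slice belongs in the file */
};

/*
 * Copy only the slice that the file references, not the whole extent.
 * Copying disk_num_bytes from disk_bytenr instead would overwrite
 * neighbouring file data with bytes that belong to a snapshot.
 * Returns the number of bytes copied.
 */
uint64_t copy_extent_slice(const unsigned char *disk,
			   const struct extent_ref *ref,
			   unsigned char *file)
{
	assert(ref->offset + ref->num_bytes <= ref->disk_num_bytes);

	memcpy(file + ref->file_pos,
	       disk + ref->disk_bytenr + ref->offset,
	       ref->num_bytes);
	return ref->num_bytes;
}
```

This corresponds to the two key lines of the patch: the read position is advanced by the extent offset, and the amount read and written is num_bytes rather than the full on-disk or in-RAM extent size.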



Re: [PATCH] btrfs-progs: per-thread, per-call pretty buffer

2013-07-10 Thread David Sterba
On Wed, Jul 10, 2013 at 11:31:17PM +0800, Wang Shilong wrote:
> Hello David,
>
> > From: Zach Brown <z...@redhat.com>
> >
> duplicate information.

git-send-email tricked me; the line is not present in the tree.

> > From: Zach Brown <z...@redhat.com>
> >
> > We don't need callers to manage string storage for each pretty_sizes()
> > call.  We can use a macro to have per-thread and per-call static storage
> > so that pretty_sizes() can be used as many times as needed in printf()
> > arguments without requiring a bunch of supporting variables.
> >
> > This lets us have a natural interface at the cost of requiring __thread
> > and TLS from gcc and a small amount of static storage.  This seems
> > better than the current code or doing something with illegible format
> > specifier macros.
> >
> > Signed-off-by: Zach Brown <z...@redhat.com>
> > Signed-off-by: David Sterba <dste...@suse.cz>
>
> OK.  Please add my tag: Acked-by: Wang Shilong <wangs.f...@cn.fujitsu.com>
> (I already gave my tag in the previous thread to Zach, with you in Cc!)
>
> Zach's solution is better, but I did at least report the problem and try to
> fix it, didn't I?

Oh sorry, I'll add the tag of course. I was so excited by Zach's patch
that I missed it.

david


Re: [PATCH V2 1/2] Btrfs-progs: make pretty_sizes() work less error prone

2013-07-10 Thread Zach Brown
> Neat trick! A few neat-picks below.

Indeed, those are all good fixes.

> As these are only trivial changes I'll fix them at commit time.

Great, thanks David!

- z
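
For readers following the thread, the per-call buffer trick can be tried in isolation. This is a minimal sketch under stated assumptions, not the committed code: the formatter is a simplified stand-in for pretty_size_snprintf(), the 24-byte buffer follows David's suggestion, and the macro relies on the GCC/Clang statement-expression and __thread extensions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static const char *const size_strs[] = { "", "KB", "MB", "GB", "TB",
					 "PB", "EB" };
#define NUM_SIZE_STRS (sizeof(size_strs) / sizeof(size_strs[0]))

/* Simplified formatter: divide by 1024 until the value drops below 1024. */
void pretty_size_snprintf(uint64_t size, char *str, size_t str_bytes)
{
	double fraction = (double)size;
	size_t num_divs = 0;

	assert(str != NULL);
	if (str_bytes == 0)
		return;

	while (fraction >= 1024.0 && num_divs + 1 < NUM_SIZE_STRS) {
		fraction /= 1024.0;
		num_divs++;
	}
	snprintf(str, str_bytes, "%.2f%s", fraction, size_strs[num_divs]);
}

/*
 * Statement expression: every textual use of the macro gets its own
 * static, thread-local buffer, so several uses in one printf() call
 * do not overwrite each other.
 */
#define pretty_size(size)						\
	({								\
		static __thread char _str[24];				\
		pretty_size_snprintf((size), _str, sizeof(_str));	\
		_str;							\
	})
```

A call such as printf("%s of %s\n", pretty_size(used), pretty_size(total)) is safe because the two macro uses expand to two distinct buffers. The buffer is per call site, not per call, so a loop that stores the returned pointer and then runs the same call site again will see the earlier string overwritten.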


[PATCH v2] Btrfs-progs: fix restore command leaving corrupted files

2013-07-10 Thread Filipe David Borba Manana
When there are files that have parts shared with snapshots, the
restore command was incorrectly restoring them, as it was not
taking into account the offset and number of bytes fields from
the file extent item. Besides leaving the recovered file corrupt,
it was also inefficient as it read and wrote more data than needed
(with each extent copy overwriting portions of the one previously
written).

The following steps and small C program show how to reproduce this
corruption issue:

$ mkfs.btrfs -f  /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ ./write_file /mnt/btrfs/foobar
$ du -b /mnt/btrfs/foobar
1048926 /mnt/btrfs/foobar
$ md5sum /mnt/btrfs/foobar
f9f778f3a7410c40e4ed104a3a63c3c4  /mnt/btrfs/foobar

$ btrfs subvolume snapshot /mnt/btrfs /mnt/btrfs/my_snap
$ perl -e 'open($f, "+<", "/mnt/btrfs/foobar"); seek($f, 4096, 0); print $f "\xff"; close($f);'
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar
$ umount /mnt/btrfs

$ btrfs restore /dev/sdb3 /tmp/copy
$ du -b /tmp/copy/foobar
1048926 /tmp/copy/foobar
$ md5sum /tmp/copy/foobar
88db338cbc1c44dfabae083f1ce642d5  /tmp/copy/foobar
$ od -t x1 -j 8192 -N 4 /tmp/copy/foobar
0020000 41 00 00 00
0020004
$ mount /dev/sdb3 /mnt/btrfs
$ od -t x1 -j 8192 -N 4 /mnt/btrfs/foobar
0020000 00 00 00 00
0020004
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar

$ cat write_file.c:

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <string.h>
 #include <assert.h>

 #define BUF_SIZE (60 * 1024 * 1024 + 33350)

int main(int argc, char *argv[])
{
    int fd;
    unsigned char buf[BUF_SIZE];

    if (argc < 2) {
        fprintf(stderr, "Use:  %s filepath\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
    assert(fd >= 0);
    memset(buf, 0, BUF_SIZE);
    buf[0] = 65;
    assert(write(fd, buf, BUF_SIZE) == BUF_SIZE);
    assert(close(fd) == 0);
    return 0;
}

Tested this change with zlib, lzo compression and file sizes larger
than 1GiB, and found no regression or other corruption issues (so far
at least).

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: updated commit message to include the C preprocessor macros
in the C program.

 cmds-restore.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..9688599 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -272,6 +272,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	u64 bytenr;
 	u64 ram_size;
 	u64 disk_size;
+	u64 num_bytes;
 	u64 length;
 	u64 size_left;
 	u64 dev_bytenr;
@@ -288,7 +289,9 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	disk_size = btrfs_file_extent_disk_num_bytes(leaf, fi);
 	ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
 	offset = btrfs_file_extent_offset(leaf, fi);
-	size_left = disk_size;
+	num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
+	size_left = num_bytes;
+	bytenr += offset;
 
 	if (offset)
 		printf("offset is %Lu\n", offset);
@@ -296,7 +299,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	if (disk_size == 0)
 		return 0;
 
-	inbuf = malloc(disk_size);
+	inbuf = malloc(size_left);
 	if (!inbuf) {
 		fprintf(stderr, "No memory\n");
 		return -1;
@@ -351,8 +354,8 @@ again:
 		goto again;
 
 	if (compress == BTRFS_COMPRESS_NONE) {
-		while (total < ram_size) {
-			done = pwrite(fd, inbuf+total, ram_size-total,
+		while (total < num_bytes) {
+			done = pwrite(fd, inbuf+total, num_bytes-total,
 				      pos+total);
 			if (done < 0) {
 				ret = -1;
@@ -365,7 +368,7 @@ again:
 		goto out;
 	}
 
-	ret = decompress(inbuf, outbuf, disk_size, ram_size, compress);
+	ret = decompress(inbuf, outbuf, num_bytes, ram_size, compress);
 	if (ret) {
 		num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
 					      bytenr, length);
-- 
1.7.9.5



Re: [PATCH 1/2] Btrfs-progs: remove duplicated code in cmds-restore.c

2013-07-10 Thread David Sterba
On Tue, Jul 09, 2013 at 07:49:53PM +0100, Filipe David Borba Manana wrote:
> The module cmds-restore.c was defining its own next_leaf()
> function, which did exactly the same as btrfs_next_leaf()
> from ctree.c.

This has been removed by Eric's patch present in the integration
branches:
 Btrfs-progs: remove cut & paste btrfs_next_leaf from restore
 http://www.spinics.net/lists/linux-btrfs/msg24477.html

but now Chris has a fix in the master branch,
 btrfs-restore: deal with NULL returns from read_node_slot
 https://git.kernel.org/cgit/linux/kernel/git/mason/btrfs-progs.git/commit/?id=194aa4a1bd6447bb545286d0bcb0b0be8204d79f

The code of the updated next_leaf() is not identical to btrfs_next_leaf(),
and I think 'restore' could be more tolerant of partially corrupted
structures, so both functions could make sense in the end.

david


Re: [PATCH] btrfs-progs: per-thread, per-call pretty buffer

2013-07-10 Thread Hugo Mills
   Sorry to be a pain in the arse at this late stage of the patch, but
I've only just noticed.

On Wed, Jul 10, 2013 at 04:30:15PM +0200, David Sterba wrote:
>  static char *size_strs[] = { "", "KB", "MB", "GB", "TB",
> -			    "PB", "EB", "ZB", "YB"};
> -char *pretty_sizes(u64 size)
> +			    "PB", "EB"};

   These are SI (power of 10) prefixes...

> +void pretty_size_snprintf(u64 size, char *str, size_t str_bytes)
>  {
>  	int num_divs = 0;
> -        int pretty_len = 16;
>  	float fraction;
> -	char *pretty;
> +
> +	if (str_bytes == 0)
> +		return;
>  
>  	if( size < 1024 ){
>  		fraction = size;
> @@ -1172,13 +1173,13 @@ char *pretty_sizes(u64 size)
>  			num_divs ++;
>  		}
>  
> -		if (num_divs >= ARRAY_SIZE(size_strs))
> -			return NULL;
> +		if (num_divs >= ARRAY_SIZE(size_strs)) {
> +			str[0] = '\0';
> +			return;
> +		}
>  		fraction = (float)last_size / 1024;

   ... and this is working in IEC (power of 2) units.

   Can we fix this discrepancy, please? Also note that SI uses k for
10^3, but IEC uses K for 2^10. Just inserting an "i" in the middle of
each element of size_strs should deal with the problem.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Charting the inexorable advance of Western syphilisation... ---   
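
Hugo's suggestion can be sketched as follows. This is an illustration only, not the code that was committed: pretty_size_iec() and iec_strs are hypothetical names, and the formatter is a simplified version of the one quoted above. The division step stays 1024 and only the suffix table changes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* IEC (power of 2) suffixes: an "i" inserted into each original entry. */
static const char *const iec_strs[] = { "", "KiB", "MiB", "GiB", "TiB",
					"PiB", "EiB" };
#define NUM_IEC_STRS (sizeof(iec_strs) / sizeof(iec_strs[0]))

/* Format size with 1024-based steps and matching IEC suffixes. */
void pretty_size_iec(uint64_t size, char *str, size_t str_bytes)
{
	double fraction = (double)size;
	size_t num_divs = 0;

	assert(str != NULL);
	if (str_bytes == 0)
		return;

	while (fraction >= 1024.0 && num_divs + 1 < NUM_IEC_STRS) {
		fraction /= 1024.0;
		num_divs++;
	}
	snprintf(str, str_bytes, "%.2f%s", fraction, iec_strs[num_divs]);
}
```

With this table, 1536 bytes is printed as 1.50KiB, so the label agrees with the divisor that was actually used.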




[PATCH v3] Btrfs-progs: fix restore command leaving corrupted files

2013-07-10 Thread Filipe David Borba Manana
When there are files that have parts shared with snapshots, the
restore command was incorrectly restoring them, as it was not
taking into account the offset and number of bytes fields from
the file extent item. Besides leaving the recovered file corrupt,
it was also inefficient as it read and wrote more data than needed
(with each extent copy overwriting portions of the one previously
written).

The following steps and small C program show how to reproduce this
corruption issue:

$ mkfs.btrfs -f  /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ ./write_file /mnt/btrfs/foobar
$ du -b /mnt/btrfs/foobar
1048926 /mnt/btrfs/foobar
$ md5sum /mnt/btrfs/foobar
f9f778f3a7410c40e4ed104a3a63c3c4  /mnt/btrfs/foobar

$ btrfs subvolume snapshot /mnt/btrfs /mnt/btrfs/my_snap
$ perl -e 'open($f, "+<", "/mnt/btrfs/foobar"); seek($f, 4096, 0); print $f "\xff"; close($f);'
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar
$ umount /mnt/btrfs

$ btrfs restore /dev/sdb3 /tmp/copy
$ du -b /tmp/copy/foobar
1048926 /tmp/copy/foobar
$ md5sum /tmp/copy/foobar
88db338cbc1c44dfabae083f1ce642d5  /tmp/copy/foobar
$ od -t x1 -j 8192 -N 4 /tmp/copy/foobar
0020000 41 00 00 00
0020004
$ mount /dev/sdb3 /mnt/btrfs
$ od -t x1 -j 8192 -N 4 /mnt/btrfs/foobar
0020000 00 00 00 00
0020004
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar

$ cat write_file.c:

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <string.h>
 #include <assert.h>

 #define BUF_SIZE (1 * 1024 * 1024 + 350)

int main(int argc, char *argv[])
{
	int fd;
	unsigned char *buf = malloc(BUF_SIZE);

	assert(buf != NULL);
	if (argc < 2) {
		fprintf(stderr, "Use:  %s filepath\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
	assert(fd >= 0);
	memset(buf, 0, BUF_SIZE);
	buf[0] = 65;
	assert(write(fd, buf, BUF_SIZE) == BUF_SIZE);
	assert(close(fd) == 0);

	return 0;
}

Tested this change with zlib, lzo compression and file sizes larger
than 1GiB, and found no regression or other corruption issues (so far
at least).

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: updated commit message to include the C preprocessor macros
in the C program.
V3: updated commit message again to reflect the file size used in
the example in the C program macro.

 cmds-restore.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..9688599 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -272,6 +272,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	u64 bytenr;
 	u64 ram_size;
 	u64 disk_size;
+	u64 num_bytes;
 	u64 length;
 	u64 size_left;
 	u64 dev_bytenr;
@@ -288,7 +289,9 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	disk_size = btrfs_file_extent_disk_num_bytes(leaf, fi);
 	ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
 	offset = btrfs_file_extent_offset(leaf, fi);
-	size_left = disk_size;
+	num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
+	size_left = num_bytes;
+	bytenr += offset;
 
 	if (offset)
 		printf("offset is %Lu\n", offset);
@@ -296,7 +299,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	if (disk_size == 0)
 		return 0;
 
-	inbuf = malloc(disk_size);
+	inbuf = malloc(size_left);
 	if (!inbuf) {
 		fprintf(stderr, "No memory\n");
 		return -1;
@@ -351,8 +354,8 @@ again:
 		goto again;
 
 	if (compress == BTRFS_COMPRESS_NONE) {
-		while (total < ram_size) {
-			done = pwrite(fd, inbuf+total, ram_size-total,
+		while (total < num_bytes) {
+			done = pwrite(fd, inbuf+total, num_bytes-total,
 				      pos+total);
 			if (done < 0) {
 				ret = -1;
@@ -365,7 +368,7 @@ again:
 		goto out;
 	}
 
-	ret = decompress(inbuf, outbuf, disk_size, ram_size, compress);
+	ret = decompress(inbuf, outbuf, num_bytes, ram_size, compress);
 	if (ret) {
 		num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
 					      bytenr, length);
-- 
1.7.9.5



Re: [PATCH 1/2] Btrfs-progs: remove duplicated code in cmds-restore.c

2013-07-10 Thread Filipe David Manana
On Wed, Jul 10, 2013 at 5:12 PM, David Sterba dste...@suse.cz wrote:
> On Tue, Jul 09, 2013 at 07:49:53PM +0100, Filipe David Borba Manana wrote:
>> The module cmds-restore.c was defining its own next_leaf()
>> function, which did exactly the same as btrfs_next_leaf()
>> from ctree.c.
>
> This has been removed by Eric's patch present in the integration
> branches:
>  Btrfs-progs: remove cut & paste btrfs_next_leaf from restore
> http://www.spinics.net/lists/linux-btrfs/msg24477.html

Oh, didn't notice that.

>
> but now Chris has a fix in the master branch,
>  btrfs-restore: deal with NULL returns from read_node_slot
> https://git.kernel.org/cgit/linux/kernel/git/mason/btrfs-progs.git/commit/?id=194aa4a1bd6447bb545286d0bcb0b0be8204d79f
>
> the code of updated next_leaf is not identical to btrfs_next_leaf and I
> think 'restore' could be more tolerant to partially corrupted
> structures, so both functions could make sense in the end.

Ok, I understand now why both exist.

So please just ignore this patch and the following one
(https://patchwork.kernel.org/patch/2825425/).

thanks

>
> david



--
Filipe David Manana,

Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men.


Re: btrfs: stat(2) and /proc/pid/maps returns different devices

2013-07-10 Thread Mark Fasheh
On Mon, Jul 08, 2013 at 11:54:46PM +0200, David Sterba wrote:
> On Thu, Jul 04, 2013 at 01:51:38PM +0400, Andrew Vagin wrote:
>> We are not the first to suffer from this problem:
>> https://bugzilla.redhat.com/show_bug.cgi?id=711881
>> http://marc.info/?l=linux-btrfs&m=130074451403261
>> https://bugzilla.openvz.org/show_bug.cgi?id=2653
>>
>> And about 2 years ago Mark Fasheh tried to fix this problem:
>> http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps

And basically nobody cared :/

>>
>> Eric Biederman suggested not creating a new method and using vfs_getattr,
>> but there are a few problems:
>> * fanotify doesn't have a dentry, but its fdinfo contains the device.
>> * vfs_getattr can fail, and which device should be shown in this case?
>> * vfs_getattr retrieves many more parameters, so there is a question
>>   about performance degradation.
>>
>> So I have a question: Can two inodes from different subvolumes have
>> equal inode numbers?
>
> Yes, subvolumes are separate inode number spaces.
>
>> If someone has any suggestions on how to fix this problem or any
>> explanation of why this is not a problem at all, please write here.
>
> The xstat syscall instead of the potentially heavyweight vfs_getattr
> could fix that, but it's not merged. For suse kernels we've taken the
> hackish approach of patching fs/proc/task_mmu.c:show_map_vma() (and the
> nommu variant) and use vfs_getattr only for btrfs.
>
> http://kernel.opensuse.org/cgit/kernel-source/tree/patches.suse/btrfs-use-correct-device-for-maps.patch?id=2434fa6ee93a83b117461eb13f24272606677fec
>
> Only a temporary and not upstreamable solution, but without it the core
> packaging tool zypper would not work correctly.

As far as I can tell we'll be carrying this patch until a better
solution is possible.

When that will happen, I don't know.
--Mark

--
Mark Fasheh


[PATCH v4] Btrfs-progs: fix restore command leaving corrupted files

2013-07-10 Thread Filipe David Borba Manana
When there are files that have parts shared with snapshots, the
restore command was incorrectly restoring them, as it was not
taking into account the offset and number of bytes fields from
the file extent item. Besides leaving the recovered file corrupt,
it was also inefficient as it read and wrote more data than needed
(with each extent copy overwriting portions of the one previously
written).

The following steps and small C program show how to reproduce this
corruption issue:

$ mkfs.btrfs -f  /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ ./write_file /mnt/btrfs/foobar
$ du -b /mnt/btrfs/foobar
1048926 /mnt/btrfs/foobar
$ md5sum /mnt/btrfs/foobar
f9f778f3a7410c40e4ed104a3a63c3c4  /mnt/btrfs/foobar

$ btrfs subvolume snapshot /mnt/btrfs /mnt/btrfs/my_snap
$ perl -e 'open($f, "+<", "/mnt/btrfs/foobar"); seek($f, 4096, 0); print $f "\xff"; close($f);'
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar
$ umount /mnt/btrfs

$ btrfs restore /dev/sdb3 /tmp/copy
$ du -b /tmp/copy/foobar
1048926 /tmp/copy/foobar
$ md5sum /tmp/copy/foobar
88db338cbc1c44dfabae083f1ce642d5  /tmp/copy/foobar
$ od -t x1 -j 8192 -N 4 /tmp/copy/foobar
0020000 41 00 00 00
0020004
$ mount /dev/sdb3 /mnt/btrfs
$ od -t x1 -j 8192 -N 4 /mnt/btrfs/foobar
0020000 00 00 00 00
0020004
$ md5sum /mnt/btrfs/foobar
b983fcefd4622a03a78936484c40272b  /mnt/btrfs/foobar

$ cat write_file.c:

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <string.h>
 #include <assert.h>

 #define BUF_SIZE (1 * 1024 * 1024 + 350)

int main(int argc, char *argv[])
{
	int fd;
	unsigned char *buf = malloc(BUF_SIZE);

	assert(buf != NULL);
	if (argc < 2) {
		fprintf(stderr, "Use:  %s filepath\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
	assert(fd >= 0);
	memset(buf, 0, BUF_SIZE);
	buf[0] = 65;
	assert(write(fd, buf, BUF_SIZE) == BUF_SIZE);
	assert(close(fd) == 0);

	return 0;
}

Tested this change with zlib, lzo compression and file sizes larger
than 1GiB, and found no regression or other corruption issues (so far
at least).

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: updated commit message to include the C preprocessor macros
in the C program.
V3: updated commit message again to reflect the file size used in
the example in the C program macro.
V4: fixed wrong path in commit message in the perl command line.

 cmds-restore.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..9688599 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -272,6 +272,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	u64 bytenr;
 	u64 ram_size;
 	u64 disk_size;
+	u64 num_bytes;
 	u64 length;
 	u64 size_left;
 	u64 dev_bytenr;
@@ -288,7 +289,9 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	disk_size = btrfs_file_extent_disk_num_bytes(leaf, fi);
 	ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
 	offset = btrfs_file_extent_offset(leaf, fi);
-	size_left = disk_size;
+	num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
+	size_left = num_bytes;
+	bytenr += offset;
 
 	if (offset)
 		printf("offset is %Lu\n", offset);
@@ -296,7 +299,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	if (disk_size == 0)
 		return 0;
 
-	inbuf = malloc(disk_size);
+	inbuf = malloc(size_left);
 	if (!inbuf) {
 		fprintf(stderr, "No memory\n");
 		return -1;
@@ -351,8 +354,8 @@ again:
 		goto again;
 
 	if (compress == BTRFS_COMPRESS_NONE) {
-		while (total < ram_size) {
-			done = pwrite(fd, inbuf+total, ram_size-total,
+		while (total < num_bytes) {
+			done = pwrite(fd, inbuf+total, num_bytes-total,
 				      pos+total);
 			if (done < 0) {
 				ret = -1;
@@ -365,7 +368,7 @@ again:
 		goto out;
 	}
 
-	ret = decompress(inbuf, outbuf, disk_size, ram_size, compress);
+	ret = decompress(inbuf, outbuf, num_bytes, ram_size, compress);
 	if (ret) {
 		num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
 					      bytenr, length);
-- 
1.7.9.5



Re: [PATCH] btrfs-progs: per-thread, per-call pretty buffer

2013-07-10 Thread David Sterba
On Wed, Jul 10, 2013 at 05:16:23PM +0100, Hugo Mills wrote:
> Sorry to be a pain in the arse at this late stage of the patch, but
> I've only just noticed.

No worries, good to have this one fixed.

david


[PATCH] btrfs-progs: use IEC units for sizes

2013-07-10 Thread David Sterba
As implemented now, we use 1024-based units but report them with
1000-based (SI) suffixes; let's finally fix that and add optional unit
bases later.

Signed-off-by: David Sterba dste...@suse.cz
---
 utils.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/utils.c b/utils.c
index bce06f1..2e24cb0 100644
--- a/utils.c
+++ b/utils.c
@@ -1173,8 +1173,7 @@ out:
 	return ret;
 }
 
-static char *size_strs[] = { "", "KB", "MB", "GB", "TB",
-			    "PB", "EB"};
+static char *size_strs[] = { "", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};
 void pretty_size_snprintf(u64 size, char *str, size_t str_bytes)
 {
 	int num_divs = 0;
-- 
1.8.2



Re: btrfs: stat(2) and /proc/pid/maps returns different devices

2013-07-10 Thread Mark Fasheh
On Wed, Jul 10, 2013 at 09:31:05AM -0700, Mark Fasheh wrote:
> As far as I can tell we'll be carrying this patch until a better
> solution is possible.
>
> When that will happen, I don't know.
>	--Mark

Well, what do I get when I pretend I don't care any more? The little voice
in my head says keep plugging away. Here's another attempt at fixing this
problem in a sane manner. Basically, this time we're adding a flag to
s_flags which btrfs sets. Proc will see the flag and call ->getattr().

This compiles, but it needs testing (which I will get to soon). It still has
a bunch of problems in my honest opinion but maybe if we get something
acceptable upstream we can work from there.

Also, as Andrew pointed out, there's more than one place which returns a
different device than stat(2) does, so I probably need to update more sites
to deal with this.

Does anyone see a problem with this approach?
--Mark

--
Mark Fasheh

From: Mark Fasheh mfas...@suse.de

vfs: allow /proc/PID/maps to get device from stat

stat(2) on btrfs returns a custom device, but proc uses s_dev from the super
block. This causes problems because software (and users) are not expecting
the kernel to return different devices from these calls.

This patch fixes the problem by adding a new superblock flag,
MS_PROC_USE_ST. When the proc code sees this flag, it will call the file
system's ->getattr() method to extract a device as opposed to getting it
directly from s_dev.
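
For anyone who wants to check the mismatch from user space, here is a
small sketch that pulls the major:minor field out of one /proc/PID/maps
line and compares it with an st_dev value as returned by stat(2). The
helper names maps_line_dev and maps_dev_matches are made up for this
example, and major()/minor() from <sys/sysmacros.h> assume glibc.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/*
 * Parse the "major:minor" device field of one /proc/PID/maps line, e.g.
 *   00400000-0040b000 r-xp 00000000 08:02 1234 /bin/cat
 * Returns 0 on success, -1 if the line does not have that layout.
 * Scansets are used for the address fields so that 64-bit addresses
 * are skipped as text rather than converted.
 */
static int maps_line_dev(const char *line, unsigned int *dev_major,
			 unsigned int *dev_minor)
{
	assert(line && dev_major && dev_minor);
	if (sscanf(line, "%*[0-9a-f]-%*[0-9a-f] %*s %*[0-9a-f] %x:%x",
		   dev_major, dev_minor) != 2)
		return -1;
	return 0;
}

/* Returns 1 if the maps line reports the same device as st_dev, else 0. */
static int maps_dev_matches(const char *line, dev_t st_dev)
{
	unsigned int dev_major, dev_minor;

	if (maps_line_dev(line, &dev_major, &dev_minor) != 0)
		return 0;
	return dev_major == major(st_dev) && dev_minor == minor(st_dev);
}
```

Feeding it the maps line of a file mapped from a btrfs subvolume, plus
the st_dev from stat(2) on the same file, should show the disagreement
described above on an unpatched kernel.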

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 fs/btrfs/super.c        |  1 +
 fs/proc/generic.c       | 30 ++++++++++++++++++++++++++++++
 fs/proc/internal.h      |  1 +
 fs/proc/task_mmu.c      |  2 +-
 fs/proc/task_nommu.c    |  2 +-
 include/uapi/linux/fs.h |  1 +
 6 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f0857e0..67be4ef 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -822,6 +822,7 @@ static int btrfs_fill_super(struct super_block *sb,
 	sb->s_flags |= MS_POSIXACL;
 #endif
 	sb->s_flags |= MS_I_VERSION;
+	sb->s_flags |= MS_PROC_USE_ST;
 	err = open_ctree(sb, fs_devices, (char *)data);
 	if (err) {
 		printk("btrfs: open_ctree failed\n");
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index a2596af..eca8195 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -24,6 +24,8 @@
 #include <linux/spinlock.h>
 #include <linux/completion.h>
 #include <asm/uaccess.h>
+#include <linux/fs.h>
+#include <linux/dcache.h>
 
 #include "internal.h"
 
@@ -637,3 +639,31 @@ void *PDE_DATA(const struct inode *inode)
 	return __PDE_DATA(inode);
 }
 EXPORT_SYMBOL(PDE_DATA);
+
+static dev_t proc_get_dev_from_stat(struct inode *inode)
+{
+	struct dentry *dentry = d_find_any_alias(inode);
+	struct kstat kstat;
+
+	if (!dentry)
+		goto out_error;
+
+	if (inode->i_op->getattr(NULL, dentry, &kstat))
+		goto out_error_dput;
+
+	dput(dentry);
+	return kstat.dev;
+
+out_error_dput:
+	dput(dentry);
+out_error:
+	return inode->i_sb->s_dev;
+}
+
+dev_t proc_get_map_dev(struct inode *inode)
+{
+	if (inode->i_sb->s_flags & MS_PROC_USE_ST)
+		return proc_get_dev_from_stat(inode);
+	else
+		return inode->i_sb->s_dev;
+}
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index d600fb0..24808b0 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -192,6 +192,7 @@ static inline struct proc_dir_entry *pde_get(struct proc_dir_entry *pde)
 	return pde;
 }
 extern void pde_put(struct proc_dir_entry *);
+dev_t proc_get_map_dev(struct inode *inode);
 
 /*
  * inode.c
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3e636d8..9226600 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -272,7 +272,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
 
 	if (file) {
 		struct inode *inode = file_inode(vma->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(inode);
 		ino = inode->i_ino;
 		pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
 	}
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 56123a6..892d84a 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -150,7 +150,7 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma,
 
 	if (file) {
 		struct inode *inode = file_inode(vma->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(inode);
 		ino = inode->i_ino;
 		pgoff = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
 	}
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index a4ed56c..b4173a3 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -88,6 +88,7 @@ struct inodes_stat_t {
 #define MS_STRICTATIME	(1<<24) /* Always perform atime updates */
 
 /* These sb flags are internal to the kernel */
+#define MS_PROC_USE_ST 

Re: btrfs: stat(2) and /proc/pid/maps returns different devices

2013-07-10 Thread David Sterba
On Wed, Jul 10, 2013 at 10:45:45AM -0700, Mark Fasheh wrote:
> Well, what do I get when I pretend I don't care any more? The little voice
> in my head says keep plugging away. Here's another attempt at fixing this
> problem in a sane manner. Basically, this time we're adding a flag to
> s_flags which btrfs sets. Proc will see the flag and call ->getattr().
>
> This compiles, but it needs testing (which I will get to soon). It still has
> a bunch of problems in my honest opinion but maybe if we get something
> acceptable upstream we can work from there.
>
> Also, as Andrew pointed out, there's more than one place which returns a
> different device than stat(2) does, so I probably need to update more sites
> to deal with this.
>
> Does anyone see a problem with this approach?

The approach looks ok to me, the implementation is internal to vfs and
fairly minimal. The bit that bothers me is the name of the flag, it's
completely unobvious what it means.

There are some differences to the linked suse patch:

> +static dev_t proc_get_dev_from_stat(struct inode *inode)
> +{
> +	struct dentry *dentry = d_find_any_alias(inode);

This does the dentry - inode mapping, while originally there was

file->f_path

Passing just the inode to proc_get_dev_from_stat unnecessarily drops the
available information that's about to be retrieved again.

> +	struct kstat kstat;
> +
> +	if (!dentry)
> +		goto out_error;
> +	if (inode->i_op->getattr(NULL, dentry, &kstat))

The suse patch calls vfs_getattr that in turn calls

security_inode_getattr(path->mnt, path->dentry);

That would be missing.

Plus checks for presence of the ->getattr operation. Though this is
superfluous with btrfs, I suggest using vfs_getattr here, which will
fix all of the above.

> +		goto out_error_dput;
> +
> +	dput(dentry);
> +	return kstat.dev;
> +
> +out_error_dput:
> +	dput(dentry);
> +out_error:
> +	return inode->i_sb->s_dev;
> +}


Re: lz4 status?

2013-07-10 Thread David Sterba
On Sun, Jun 30, 2013 at 12:35:09PM -0500, Mitch Harder wrote:
> There's been a parallel effort to incorporate a general set of lz4
> patches in the kernel.
>
> I see these patches are currently queued up in the linux-next tree, so
> we may see them in the 3.11 kernel.

The patches are now merged into 3.11.

> It looks like lz4 and lz4hc will be provided.

Regarding HC mode, there are some core compression code changes needed
in order to fully utilize its potential, namely a larger chunk size
that's compressed at a time. There was some tiny yet measurable gain of
HC against ordinary mode compared on the current 4k-at-a-time
implementation, but the space savings did not justify the speed drop of
HC mode.

I can't say if the patchset will be ready for 3.12 though.

david


[PATCH] Btrfs-progs: restore can now recover file xattrs

2013-07-10 Thread Filipe David Borba Manana
This change adds a new option to the restore command, named -x,
that makes it restore file extended attributes too. This is an
optional behaviour and it's disabled by default.
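
As a rough user-space illustration of the last step (applying recovered
name/value pairs to the restored file), the sketch below sets each
attribute with fsetxattr(2) and, like the patch, reports a failure and
carries on instead of aborting. It is not the patch code: struct
xattr_pair and apply_xattrs are hypothetical names, and it uses glibc's
<sys/xattr.h> rather than the libattr header the patch includes.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical container for one recovered extended attribute. */
struct xattr_pair {
	const char *name;   /* e.g. "user.comment" */
	const void *value;  /* raw attribute data */
	size_t size;        /* length of value in bytes */
};

/*
 * Apply every pair to fd. A failure is reported on stderr and counted,
 * but does not stop the remaining attributes from being tried.
 * Returns the number of attributes that could not be set.
 */
static size_t apply_xattrs(int fd, const char *file_name,
			   const struct xattr_pair *pairs, size_t count)
{
	size_t failures = 0;
	size_t i;

	assert(pairs != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (fsetxattr(fd, pairs[i].name, pairs[i].value,
			      pairs[i].size, 0)) {
			int err = errno;

			fprintf(stderr, "Error setting extended attribute %s"
				" on file %s: %s\n", pairs[i].name,
				file_name, strerror(err));
			failures++;
		}
	}
	return failures;
}
```

Whether an attribute can actually be set depends on the target
filesystem and on privileges (for instance the trusted.* and security.*
namespaces), which is one reason to keep going after a failure.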

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 cmds-restore.c |  113 +++-
 1 file changed, 112 insertions(+), 1 deletion(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..0f6169e 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -30,6 +30,8 @@
 #include <lzo/lzoconf.h>
 #include <lzo/lzo1x.h>
 #include <zlib.h>
+#include <sys/types.h>
+#include <attr/xattr.h>
 
 #include "ctree.h"
 #include "disk-io.h"
@@ -47,6 +49,7 @@ static int get_snaps = 0;
 static int verbose = 0;
 static int ignore_errors = 0;
 static int overwrite = 0;
+static int get_xattrs = 0;
 
 #define LZO_LEN 4
 #define PAGE_CACHE_SIZE 4096
@@ -412,6 +415,105 @@ again:
 }
 
 
+static int set_file_xattrs(struct btrfs_root *root, u64 inode,
+			   int fd, const char *file_name)
+{
+	struct btrfs_key key;
+	struct btrfs_path *path;
+	struct extent_buffer *leaf;
+	struct btrfs_dir_item *di;
+	u32 name_len = 0;
+	u32 data_len = 0;
+	u32 len = 0;
+	char *name = NULL;
+	char *data = NULL;
+	int ret = 0;
+
+	key.objectid = inode;
+	key.type = BTRFS_XATTR_ITEM_KEY;
+	key.offset = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	leaf = path->nodes[0];
+	while (1) {
+		if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+			do {
+				ret = next_leaf(root, path);
+				if (ret < 0) {
+					fprintf(stderr, "Error searching for"
+						" extended attributes: %d\n",
+						ret);
+					goto out;
+				} else if (ret) {
+					/* No more leaves to search */
+					goto out;
+				}
+				leaf = path->nodes[0];
+			} while (!leaf);
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+
+		if (key.type != BTRFS_XATTR_ITEM_KEY || key.objectid != inode)
+			break;
+
+		di = btrfs_item_ptr(leaf, path->slots[0],
+				    struct btrfs_dir_item);
+
+		len = btrfs_dir_name_len(leaf, di);
+		if (len > name_len) {
+			free(name);
+			name = (char *) malloc(len + 1);
+			if (!name) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		read_extent_buffer(leaf, name, (unsigned long)(di + 1), len);
+		name[len] = '\0';
+		name_len = len;
+
+		len = btrfs_dir_data_len(leaf, di);
+		if (len > data_len) {
+			free(data);
+			data = (char *) malloc(len);
+			if (!data) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		read_extent_buffer(leaf, data,
+				   (unsigned long)(di + 1) + name_len, len);
+		data_len = len;
+
+		if (fsetxattr(fd, name, data, data_len, 0)) {
+			int err = errno;
+
+			fprintf(stderr, "Error setting extended attribute %s"
+				" on file %s: %s", name, file_name,
+				strerror(err));
+		}
+
+		path->slots[0]++;
+	}
+	ret = 0;
+out:
+	btrfs_free_path(path);
+	free(name);
+	free(data);
+
+	return ret;
+}
+
+
 static int copy_file(struct btrfs_root *root, int fd, struct btrfs_key *key,
 		     const char *file)
 {
@@ -535,6 +637,11 @@ set_size:
 		if (ret)
 			return ret;
 	}
+	if (get_xattrs) {
+		ret = set_file_xattrs(root, key->objectid, fd, file);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
@@ -966,6 +1073,7 @@ const char * const cmd_restore_usage[] = {
 	"Try to restore files from a damaged filesystem (unmounted)",
 	"",
 	"-s  get snapshots",
+	"-x  get extended attributes",
 	"-v  verbose",
 	"-i  ignore errors",
 	"-o  overwrite",
@@ 

[PATCH v2] Btrfs-progs: restore can now recover file xattrs

2013-07-10 Thread Filipe David Borba Manana
This change adds a new option to the restore command, named -x,
that makes it restore file extended attributes too. This is an
optional behaviour and it's disabled by default.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: added missing new line at end of error message.

 cmds-restore.c |  113 +++-
 1 file changed, 112 insertions(+), 1 deletion(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..cb8754a 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -30,6 +30,8 @@
 #include <lzo/lzoconf.h>
 #include <lzo/lzo1x.h>
 #include <zlib.h>
+#include <sys/types.h>
+#include <attr/xattr.h>
 
 #include "ctree.h"
 #include "disk-io.h"
@@ -47,6 +49,7 @@ static int get_snaps = 0;
 static int verbose = 0;
 static int ignore_errors = 0;
 static int overwrite = 0;
+static int get_xattrs = 0;
 
 #define LZO_LEN 4
 #define PAGE_CACHE_SIZE 4096
@@ -412,6 +415,105 @@ again:
 }
 
 
+static int set_file_xattrs(struct btrfs_root *root, u64 inode,
+			   int fd, const char *file_name)
+{
+	struct btrfs_key key;
+	struct btrfs_path *path;
+	struct extent_buffer *leaf;
+	struct btrfs_dir_item *di;
+	u32 name_len = 0;
+	u32 data_len = 0;
+	u32 len = 0;
+	char *name = NULL;
+	char *data = NULL;
+	int ret = 0;
+
+	key.objectid = inode;
+	key.type = BTRFS_XATTR_ITEM_KEY;
+	key.offset = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	leaf = path->nodes[0];
+	while (1) {
+		if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+			do {
+				ret = next_leaf(root, path);
+				if (ret < 0) {
+					fprintf(stderr, "Error searching for"
+						" extended attributes: %d\n",
+						ret);
+					goto out;
+				} else if (ret) {
+					/* No more leaves to search */
+					goto out;
+				}
+				leaf = path->nodes[0];
+			} while (!leaf);
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+
+		if (key.type != BTRFS_XATTR_ITEM_KEY || key.objectid != inode)
+			break;
+
+		di = btrfs_item_ptr(leaf, path->slots[0],
+				    struct btrfs_dir_item);
+
+		len = btrfs_dir_name_len(leaf, di);
+		if (len > name_len) {
+			free(name);
+			name = (char *) malloc(len + 1);
+			if (!name) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		read_extent_buffer(leaf, name, (unsigned long)(di + 1), len);
+		name[len] = '\0';
+		name_len = len;
+
+		len = btrfs_dir_data_len(leaf, di);
+		if (len > data_len) {
+			free(data);
+			data = (char *) malloc(len);
+			if (!data) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		read_extent_buffer(leaf, data,
+				   (unsigned long)(di + 1) + name_len, len);
+		data_len = len;
+
+		if (fsetxattr(fd, name, data, data_len, 0)) {
+			int err = errno;
+
+			fprintf(stderr, "Error setting extended attribute %s"
+				" on file %s: %s\n", name, file_name,
+				strerror(err));
+		}
+
+		path->slots[0]++;
+	}
+	ret = 0;
+out:
+	btrfs_free_path(path);
+	free(name);
+	free(data);
+
+	return ret;
+}
+
+
 static int copy_file(struct btrfs_root *root, int fd, struct btrfs_key *key,
 		     const char *file)
 {
@@ -535,6 +637,11 @@ set_size:
 		if (ret)
 			return ret;
 	}
+	if (get_xattrs) {
+		ret = set_file_xattrs(root, key->objectid, fd, file);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
@@ -966,6 +1073,7 @@ const char * const cmd_restore_usage[] = {
 	"Try to restore files from a damaged filesystem (unmounted)",
 	"",
 	"-s  get snapshots",
+	"-x  get extended attributes",
 	"-v  verbose",
-i 

[PATCH v3] Btrfs-progs: restore can now recover file xattrs

2013-07-10 Thread Filipe David Borba Manana
This change adds a new option to the restore command, named -x,
that makes it restore file extended attributes too. This is an
optional behaviour and it's disabled by default.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: added missing new line at end of error message.
V3: return with 0 when there are no more leaves.

 cmds-restore.c |  113 +++-
 1 file changed, 112 insertions(+), 1 deletion(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..32ba89d 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -30,6 +30,8 @@
 #include <lzo/lzoconf.h>
 #include <lzo/lzo1x.h>
 #include <zlib.h>
+#include <sys/types.h>
+#include <attr/xattr.h>
 
 #include "ctree.h"
 #include "disk-io.h"
@@ -47,6 +49,7 @@ static int get_snaps = 0;
 static int verbose = 0;
 static int ignore_errors = 0;
 static int overwrite = 0;
+static int get_xattrs = 0;
 
 #define LZO_LEN 4
 #define PAGE_CACHE_SIZE 4096
@@ -412,6 +415,105 @@ again:
 }
 
 
+static int set_file_xattrs(struct btrfs_root *root, u64 inode,
+			   int fd, const char *file_name)
+{
+	struct btrfs_key key;
+	struct btrfs_path *path;
+	struct extent_buffer *leaf;
+	struct btrfs_dir_item *di;
+	u32 name_len = 0;
+	u32 data_len = 0;
+	u32 len = 0;
+	char *name = NULL;
+	char *data = NULL;
+	int ret = 0;
+
+	key.objectid = inode;
+	key.type = BTRFS_XATTR_ITEM_KEY;
+	key.offset = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	leaf = path->nodes[0];
+	while (1) {
+		if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+			do {
+				ret = next_leaf(root, path);
+				if (ret < 0) {
+					fprintf(stderr, "Error searching for"
+						" extended attributes: %d\n",
+						ret);
+					goto out;
+				} else if (ret) {
+					/* No more leaves to search */
+					ret = 0;
+					goto out;
+				}
+				leaf = path->nodes[0];
+			} while (!leaf);
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+
+		if (key.type != BTRFS_XATTR_ITEM_KEY || key.objectid != inode)
+			break;
+
+		di = btrfs_item_ptr(leaf, path->slots[0],
+				    struct btrfs_dir_item);
+
+		len = btrfs_dir_name_len(leaf, di);
+		if (len > name_len) {
+			free(name);
+			name = (char *) malloc(len + 1);
+			if (!name) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		read_extent_buffer(leaf, name, (unsigned long)(di + 1), len);
+		name[len] = '\0';
+		name_len = len;
+
+		len = btrfs_dir_data_len(leaf, di);
+		if (len > data_len) {
+			free(data);
+			data = (char *) malloc(len);
+			if (!data) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+		read_extent_buffer(leaf, data,
+				   (unsigned long)(di + 1) + name_len, len);
+		data_len = len;
+
+		if (fsetxattr(fd, name, data, data_len, 0)) {
+			int err = errno;
+
+			fprintf(stderr, "Error setting extended attribute %s"
+				" on file %s: %s\n", name, file_name,
+				strerror(err));
+		}
+
+		path->slots[0]++;
+	}
+out:
+	btrfs_free_path(path);
+	free(name);
+	free(data);
+
+	return ret;
+}
+
+
 static int copy_file(struct btrfs_root *root, int fd, struct btrfs_key *key,
 		     const char *file)
 {
@@ -535,6 +637,11 @@ set_size:
 		if (ret)
 			return ret;
 	}
+	if (get_xattrs) {
+		ret = set_file_xattrs(root, key->objectid, fd, file);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
@@ -966,6 +1073,7 @@ const char * const cmd_restore_usage[] = {
 	"Try to restore files from a damaged filesystem (unmounted)",
 	"",
 	"-s  get snapshots",
+   -x  

[PATCH v4] Btrfs-progs: restore can now recover file xattrs

2013-07-10 Thread Filipe David Borba Manana
This change adds a new option to the restore command, named -x,
that makes it restore file extended attributes too. This is an
optional behaviour and it's disabled by default.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: added missing new line at end of error message.
V3: return with 0 when there are no more leaves.
V4: fix back return value to 0 when no more xattrs are found.

 cmds-restore.c |  114 +++-
 1 file changed, 113 insertions(+), 1 deletion(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index e48df40..5199476 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -30,6 +30,8 @@
 #include lzo/lzoconf.h
 #include lzo/lzo1x.h
 #include zlib.h
+#include sys/types.h
+#include attr/xattr.h
 
 #include ctree.h
 #include disk-io.h
@@ -47,6 +49,7 @@ static int get_snaps = 0;
 static int verbose = 0;
 static int ignore_errors = 0;
 static int overwrite = 0;
+static int get_xattrs = 0;
 
 #define LZO_LEN 4
 #define PAGE_CACHE_SIZE 4096
@@ -412,6 +415,106 @@ again:
 }
 
 
+static int set_file_xattrs(struct btrfs_root *root, u64 inode,
+  int fd, const char *file_name)
+{
+   struct btrfs_key key;
+   struct btrfs_path *path;
+   struct extent_buffer *leaf;
+   struct btrfs_dir_item *di;
+   u32 name_len = 0;
+   u32 data_len = 0;
+   u32 len = 0;
+   char *name = NULL;
+   char *data = NULL;
+   int ret = 0;
+
+   key.objectid = inode;
+   key.type = BTRFS_XATTR_ITEM_KEY;
+   key.offset = 0;
+
+   path = btrfs_alloc_path();
+   if (!path)
+   return -ENOMEM;
+
+   ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+   if (ret < 0)
+   goto out;
+
+   leaf = path->nodes[0];
+   while (1) {
+   if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+   do {
+   ret = next_leaf(root, path);
+   if (ret < 0) {
+   fprintf(stderr, "Error searching for"
+" extended attributes: %d\n",
+   ret);
+   goto out;
+   } else if (ret) {
+   /* No more leaves to search */
+   ret = 0;
+   goto out;
+   }
+   leaf = path->nodes[0];
+   } while (!leaf);
+   continue;
+   }
+
+   btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
+
+   if (key.type != BTRFS_XATTR_ITEM_KEY || key.objectid != inode)
+   break;
+
+   di = btrfs_item_ptr(leaf, path->slots[0],
+   struct btrfs_dir_item);
+
+   len = btrfs_dir_name_len(leaf, di);
+   if (len > name_len) {
+   free(name);
+   name = (char *) malloc(len + 1);
+   if (!name) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   }
+   read_extent_buffer(leaf, name, (unsigned long)(di + 1), len);
+   name[len] = '\0';
+   name_len = len;
+
+   len = btrfs_dir_data_len(leaf, di);
+   if (len > data_len) {
+   free(data);
+   data = (char *) malloc(len);
+   if (!data) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   }
+   read_extent_buffer(leaf, data,
+  (unsigned long)(di + 1) + name_len, len);
+   data_len = len;
+
+   if (fsetxattr(fd, name, data, data_len, 0)) {
+   int err = errno;
+
+   fprintf(stderr, "Error setting extended attribute %s"
+" on file %s: %s\n", name, file_name,
+   strerror(err));
+   }
+
+   path->slots[0]++;
+   }
+   ret = 0;
+out:
+   btrfs_free_path(path);
+   free(name);
+   free(data);
+
+   return ret;
+}
+
+
 static int copy_file(struct btrfs_root *root, int fd, struct btrfs_key *key,
 const char *file)
 {
@@ -535,6 +638,11 @@ set_size:
if (ret)
return ret;
}
+   if (get_xattrs) {
+   ret = set_file_xattrs(root, key->objectid, fd, file);
+   if (ret)
+   return ret;
+   }
return 0;
 }
 
@@ -966,6 +1074,7 @@ const char * const cmd_restore_usage[] = {
Try to restore files from a damaged filesystem 

[PATCH 4/5] Btrfs: batch the extent state operation in the end io handle of the read page

2013-07-10 Thread Miao Xie
It is unnecessary to unlock the extent one page at a time; we can do it
in batches, which makes random reads ~6% faster.
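
The batching idea can be sketched outside the kernel as follows. This
is only an illustration under stated assumptions: struct demo_batch and
the demo_* functions are hypothetical, demo_release() merely counts
calls where the patch would call endio_readpage_release_extent(), and
the final demo_flush() is an assumed way of releasing the last pending
range. Only the up-to-date path is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accumulator; the patch keeps these as local variables. */
struct demo_batch {
	uint64_t start;          /* start of the pending contiguous range */
	uint64_t len;            /* 0 means nothing is pending */
	unsigned int releases;   /* number of release calls issued */
	uint64_t released_bytes; /* total bytes covered by those calls */
};

/* Stand-in for the real unlock: only records that it was called. */
static void demo_release(struct demo_batch *b, uint64_t start, uint64_t len)
{
	(void)start;
	b->releases++;
	b->released_bytes += len;
}

/* Feed one completed page range [start, end], end inclusive. */
static void demo_add_range(struct demo_batch *b, uint64_t start, uint64_t end)
{
	uint64_t len = end + 1 - start;

	assert(b && end >= start);
	if (!b->len) {
		/* Nothing pending: start a new batch. */
		b->start = start;
		b->len = len;
	} else if (b->start + b->len == start) {
		/* Contiguous with the pending batch: extend it. */
		b->len += len;
	} else {
		/* Gap: release the pending batch, then start a new one. */
		demo_release(b, b->start, b->len);
		b->start = start;
		b->len = len;
	}
}

/* Release whatever is still pending once all pages are processed. */
static void demo_flush(struct demo_batch *b)
{
	if (b->len) {
		demo_release(b, b->start, b->len);
		b->start = 0;
		b->len = 0;
	}
}
```

Feeding three adjacent 4096-byte pages followed by one distant page
results in two release calls instead of four, which is where the
reduced state-tree locking comes from.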

Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/extent_io.c | 70 ++--
 1 file changed, 40 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 9f4dedf..8f95418 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -762,15 +762,6 @@ static void cache_state(struct extent_state *state,
}
 }
 
-static void uncache_state(struct extent_state **cached_ptr)
-{
-   if (cached_ptr && (*cached_ptr)) {
-   struct extent_state *state = *cached_ptr;
-   *cached_ptr = NULL;
-   free_extent_state(state);
-   }
-}
-
 /*
  * set some bits on a range in the tree.  This may require allocations or
  * sleeping, so the gfp mask is used to indicate what is allowed.
@@ -2395,6 +2386,18 @@ static void end_bio_extent_writepage(struct bio *bio, int err)
bio_put(bio);
 }
 
+static void
+endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 len,
+ int uptodate)
+{
+   struct extent_state *cached = NULL;
+   u64 end = start + len - 1;
+
+   if (uptodate && tree->track_uptodate)
+   set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
+   unlock_extent_cached(tree, start, end, &cached, GFP_ATOMIC);
+}
+
 /*
  * after a readpage IO is done, we need to:
  * clear the uptodate bits on error
@@ -2417,6 +2420,8 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
u64 start;
u64 end;
u64 len;
+   u64 extent_start = 0;
+   u64 extent_len = 0;
int mirror;
int ret;
 
@@ -2425,8 +2430,6 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 
do {
struct page *page = bvec->bv_page;
-   struct extent_state *cached = NULL;
-   struct extent_state *state;
struct inode *inode = page->mapping->host;
 
pr_debug("end_bio_extent_readpage: bi_sector=%llu, err=%d, "
@@ -2452,17 +2455,6 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
if (++bvec <= bvec_end)
prefetchw(&bvec->bv_page->flags);
 
-   spin_lock(&tree->lock);
-   state = find_first_extent_bit_state(tree, start, EXTENT_LOCKED);
-   if (likely(state && state->start == start)) {
-   /*
-* take a reference on the state, unlock will drop
-* the ref
-*/
-   cache_state(state, &cached);
-   }
-   spin_unlock(&tree->lock);
-
mirror = io_bio->mirror_num;
if (likely(uptodate && tree->ops &&
   tree->ops->readpage_end_io_hook)) {
@@ -2501,18 +2493,11 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
test_bit(BIO_UPTODATE, &bio->bi_flags);
if (err)
uptodate = 0;
-   uncache_state(&cached);
continue;
}
}
 readpage_ok:
-   if (uptodate && tree->track_uptodate) {
-   set_extent_uptodate(tree, start, end, &cached,
-   GFP_ATOMIC);
-   }
-   unlock_extent_cached(tree, start, end, &cached, GFP_ATOMIC);
-
-   if (uptodate) {
+   if (likely(uptodate)) {
loff_t i_size = i_size_read(inode);
pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
unsigned offset;
@@ -2528,8 +2513,33 @@ readpage_ok:
}
unlock_page(page);
offset += len;
+
+   if (unlikely(!uptodate)) {
+   if (extent_len) {
+   endio_readpage_release_extent(tree,
+ extent_start,
+ extent_len, 1);
+   extent_start = 0;
+   extent_len = 0;
+   }
+   endio_readpage_release_extent(tree, start,
+ end - start + 1, 0);
+   } else if (!extent_len) {
+   extent_start = start;
+   extent_len = end + 1 - start;
+   } else if (extent_start + extent_len == start) {
+   extent_len += end + 1 - start;
+   } else {
+   endio_readpage_release_extent(tree, extent_start,
+ 

[PATCH 2/5] Btrfs: add branch prediction hints in the read page end IO function

2013-07-10 Thread Miao Xie
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/extent_io.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 4bfbcc5..c9b28cf 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2503,7 +2503,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 
spin_lock(&tree->lock);
state = find_first_extent_bit_state(tree, start, EXTENT_LOCKED);
-   if (state && state->start == start) {
+   if (likely(state && state->start == start)) {
/*
 * take a reference on the state, unlock will drop
 * the ref
@@ -2513,7 +2513,8 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
spin_unlock(&tree->lock);
 
mirror = io_bio->mirror_num;
-   if (uptodate && tree->ops && tree->ops->readpage_end_io_hook) {
+   if (likely(uptodate && tree->ops &&
+  tree->ops->readpage_end_io_hook)) {
ret = tree->ops->readpage_end_io_hook(page, start, end,
  state, mirror);
if (ret)
@@ -2522,12 +2523,15 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
clean_io_failure(start, page);
}
 
-   if (!uptodate && tree->ops && tree->ops->readpage_io_failed_hook) {
+   if (likely(uptodate))
+   goto readpage_ok;
+
+   if (tree->ops && tree->ops->readpage_io_failed_hook) {
ret = tree->ops->readpage_io_failed_hook(page, mirror);
if (!ret && !err &&
test_bit(BIO_UPTODATE, &bio->bi_flags))
uptodate = 1;
-   } else if (!uptodate) {
+   } else {
/*
 * The generic bio_readpage_error handles errors the
 * following way: If possible, new read requests are
@@ -2548,7 +2552,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
continue;
}
}
-
+readpage_ok:
if (uptodate && tree->track_uptodate) {
set_extent_uptodate(tree, start, end, &cached,
GFP_ATOMIC);
-- 
1.8.1.4



[PATCH 3/5] Btrfs: don't cache the csum value into the extent state tree

2013-07-10 Thread Miao Xie
Before this patch, we cached the csum value in the extent state tree when
reading data from the disk, which increased lock contention on the state
tree.

Now we store the csum value in the bio structure or another unshared
structure instead, which reduces the lock contention.
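
As a rough illustration of keeping checksums in a private, unshared
structure, the sketch below allocates the checksum slots together with
the per-request struct, so no shared tree or lock is touched when they
are read or written. The patch adds a trailing "u8 csum[0]" member to
struct btrfs_dio_private; the sketch uses a C99 flexible array instead
and assumes one 32-bit checksum per block. All demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-request state with its checksums stored inline. */
struct demo_dio_private {
	uint64_t bytes;    /* length of the request in bytes */
	size_t nr_csums;   /* number of slots in csum[] */
	uint32_t csum[];   /* C99 flexible array member */
};

/* Allocate the struct and one checksum slot per block in one go. */
static struct demo_dio_private *demo_dio_alloc(uint64_t bytes,
					       uint32_t blocksize)
{
	struct demo_dio_private *dip;
	size_t n;

	assert(blocksize != 0);
	n = (size_t)((bytes + blocksize - 1) / blocksize);
	dip = calloc(1, sizeof(*dip) + n * sizeof(dip->csum[0]));
	if (!dip)
		return NULL;
	dip->bytes = bytes;
	dip->nr_csums = n;
	return dip;
}

/* Return the slot for the block containing 'offset', or NULL if out of range. */
static uint32_t *demo_dio_csum_at(struct demo_dio_private *dip,
				  uint64_t offset, uint32_t blocksize)
{
	size_t idx;

	assert(dip && blocksize != 0);
	idx = (size_t)(offset / blocksize);
	return idx < dip->nr_csums ? &dip->csum[idx] : NULL;
}
```

Because the slots live and die with the request, freeing the struct
also frees the checksums; nothing has to be removed from a shared tree
afterwards.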

Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/btrfs_inode.h |  21 +
 fs/btrfs/ctree.h   |   4 +-
 fs/btrfs/disk-io.c |   5 ++-
 fs/btrfs/extent_io.c   | 113 -
 fs/btrfs/extent_io.h   |  10 ++---
 fs/btrfs/file-item.c   |  81 +++
 fs/btrfs/inode.c   |  85 +++--
 fs/btrfs/volumes.h |   7 +++
 8 files changed, 163 insertions(+), 163 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 08b286b..d0ae226 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -218,6 +218,27 @@ static inline int btrfs_inode_in_log(struct inode *inode, u64 generation)
return 0;
 }
 
+struct btrfs_dio_private {
+   struct inode *inode;
+   u64 logical_offset;
+   u64 disk_bytenr;
+   u64 bytes;
+   void *private;
+
+   /* number of bios pending for this dio */
+   atomic_t pending_bios;
+
+   /* IO errors */
+   int errors;
+
+   /* orig_bio is our btrfs_io_bio */
+   struct bio *orig_bio;
+
+   /* dio_bio came from fs/direct-io.c */
+   struct bio *dio_bio;
+   u8 csum[0];
+};
+
 /*
  * Disable DIO read nolock optimization, so new dio readers will be forced
  * to grab i_mutex. It is used to avoid the endless truncate due to
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index f5b4b72..d52ec5d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3556,12 +3556,14 @@ int btrfs_find_name_in_ext_backref(struct btrfs_path *path,
   struct btrfs_inode_extref **extref_ret);
 
 /* file-item.c */
+struct btrfs_dio_private;
 int btrfs_del_csums(struct btrfs_trans_handle *trans,
struct btrfs_root *root, u64 bytenr, u64 len);
 int btrfs_lookup_bio_sums(struct btrfs_root *root, struct inode *inode,
  struct bio *bio, u32 *dst);
 int btrfs_lookup_bio_sums_dio(struct btrfs_root *root, struct inode *inode,
- struct bio *bio, u64 logical_offset);
+ struct btrfs_dio_private *dip, struct bio *bio,
+ u64 logical_offset);
 int btrfs_insert_file_extent(struct btrfs_trans_handle *trans,
 struct btrfs_root *root,
 u64 objectid, u64 pos,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dfe6864..290b83f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -576,8 +576,9 @@ static noinline int check_leaf(struct btrfs_root *root,
return 0;
 }
 
-static int btree_readpage_end_io_hook(struct page *page, u64 start, u64 end,
-  struct extent_state *state, int mirror)
+static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
+ u64 phy_offset, struct page *page,
+ u64 start, u64 end, int mirror)
 {
struct extent_io_tree *tree;
u64 found_start;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c9b28cf..9f4dedf 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1837,64 +1837,6 @@ out:
return ret;
 }
 
-void extent_cache_csums_dio(struct extent_io_tree *tree, u64 start, u32 csums[],
-   int count)
-{
-   struct rb_node *node;
-   struct extent_state *state;
-
-   spin_lock(&tree->lock);
-   /*
-* this search will find all the extents that end after
-* our range starts.
-*/
-   node = tree_search(tree, start);
-   BUG_ON(!node);
-
-   state = rb_entry(node, struct extent_state, rb_node);
-   BUG_ON(state->start != start);
-
-   while (count) {
-   state->private = *csums++;
-   count--;
-   state = next_state(state);
-   }
-   spin_unlock(&tree->lock);
-}
-
-static inline u64 __btrfs_get_bio_offset(struct bio *bio, int bio_index)
-{
-   struct bio_vec *bvec = bio->bi_io_vec + bio_index;
-
-   return page_offset(bvec->bv_page) + bvec->bv_offset;
-}
-
-void extent_cache_csums(struct extent_io_tree *tree, struct bio *bio, int bio_index,
-   u32 csums[], int count)
-{
-   struct rb_node *node;
-   struct extent_state *state = NULL;
-   u64 start;
-
-   spin_lock(&tree->lock);
-   do {
-   start = __btrfs_get_bio_offset(bio, bio_index);
-   if (state == NULL || state->start != start) {
-   node = tree_search(tree, start);
-   BUG_ON(!node);
-
-   state = rb_entry(node, struct 

[PATCH 1/5] Btrfs: remove unnecessary argument of bio_readpage_error()

2013-07-10 Thread Miao Xie
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/extent_io.c | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f8586a9..4bfbcc5 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2202,8 +2202,7 @@ out:
  */
 
 static int bio_readpage_error(struct bio *failed_bio, struct page *page,
-   u64 start, u64 end, int failed_mirror,
-   struct extent_state *state)
+   u64 start, u64 end, int failed_mirror)
 {
struct io_failure_record *failrec = NULL;
u64 private;
@@ -2212,6 +2211,7 @@ static int bio_readpage_error(struct bio *failed_bio, struct page *page,
struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree;
+   struct extent_state *state;
struct bio *bio;
int num_copies;
int ret;
@@ -2297,21 +2297,18 @@ static int bio_readpage_error(struct bio *failed_bio, struct page *page,
 * matter what the error is, it is very likely to persist.
 */
pr_debug("bio_readpage_error: cannot repair, num_copies == 1. "
-"state=%p, num_copies=%d, next_mirror %d, "
-"failed_mirror %d\n", state, num_copies,
-failrec->this_mirror, failed_mirror);
+"num_copies=%d, next_mirror %d, failed_mirror %d\n",
+num_copies, failrec->this_mirror, failed_mirror);
free_io_failure(inode, failrec, 0);
return -EIO;
}
 
-   if (!state) {
-   spin_lock(&tree->lock);
-   state = find_first_extent_bit_state(tree, failrec->start,
-   EXTENT_LOCKED);
-   if (state && state->start != failrec->start)
-   state = NULL;
-   spin_unlock(&tree->lock);
-   }
+   spin_lock(&tree->lock);
+   state = find_first_extent_bit_state(tree, failrec->start,
+   EXTENT_LOCKED);
+   if (state && state->start != failrec->start)
+   state = NULL;
+   spin_unlock(&tree->lock);
 
/*
 * there are two premises:
@@ -2541,7 +2538,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 * can't handle the error it will return -EIO and we
 * remain responsible for that page.
 */
-   ret = bio_readpage_error(bio, page, start, end, mirror, NULL);
+   ret = bio_readpage_error(bio, page, start, end, mirror);
if (ret == 0) {
uptodate =
test_bit(BIO_UPTODATE, &bio->bi_flags);
-- 
1.8.1.4
