Re: btrfs oops (autodefrag related?)

2012-03-12 Thread Chris Samuel
Forgot citations, sorry!

On Tuesday 13 March 2012 11:13:28 Chris Samuel wrote:

> Note though that some people are reporting regressions with
> premature  ENOSPC in 3.3-rc7, to quote:
>
> # - bisected down to 5500cdb (Btrfs: increase the global block
> # reserve estimates). After reverting this one Linus master works
> # for me again.

is from https://lkml.org/lkml/2012/3/9/624

> Though after reversion they hit ENOSPC much later (but still 
> prematurely).

is from https://lkml.org/lkml/2012/3/12/231

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP


signature.asc
Description: This is a digitally signed message part.


Re: immutable (WORM) file system

2012-03-12 Thread Chris Samuel
On Tuesday 13 March 2012 10:40:39 David Sterba wrote:

> (I just know that the flag is there and is related to the question,
> haven't tested it myself and do not know what was the original
> intention.)

Not sure it helps, but the commits for these were:

commit fdebe2bd70047e057827cba85ba31b2545e31900
Author: Yan 
Date:   Mon Jan 14 13:26:08 2008 -0500

Btrfs: Add readonly inode flag

This patch adds readonly inode flag support.  A file with this flag
can't be modified, but can be deleted.

Signed-off-by: Chris Mason 

and..

commit cb6db4e57632ba8589cc2f9fe1d0aa9116b87ab8
Author: Jeff Mahoney 
Date:   Mon Aug 15 17:27:21 2011 +

btrfs: btrfs_permission's RO check shouldn't apply to device nodes

This patch tightens the read-only access checks in btrfs_permission to
 match the constraints in inode_permission. Currently, even though the
 device node itself will be unmodified, read-write access to device nodes
 is denied when the device node resides on a read-only subvolume or
 is a file that has been marked read-only by the btrfs conversion utility.

 With this patch applied, the check only affects regular files,
 directories, and symlinks. It also restructures the code a bit so that
 we don't duplicate the MAY_WRITE check for both tests.

Signed-off-by: Jeff Mahoney 
Signed-off-by: Chris Mason 

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



Re: btrfs oops (autodefrag related?)

2012-03-12 Thread Chris Samuel
On Tuesday 13 March 2012 11:04:35 Chris Mason wrote:

> This one was fixed in the 3.3 series.  You can pull from my
> for-linus repo for a commit against 3.2.

Note though that some people are reporting regressions with premature 
ENOSPC in 3.3-rc7, to quote:

# - bisected down to 5500cdb (Btrfs: increase the global block
# reserve estimates). After reverting this one Linus master works
# for me again.

Though after reversion they hit ENOSPC much later (but still 
prematurely).

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



Re: immutable (WORM) file system

2012-03-12 Thread Duncan
Chester posted on Mon, 12 Mar 2012 14:52:47 -0500 as excerpted:

> On Mon, Mar 12, 2012 at 2:36 PM, Fong Vang  wrote:
>> Does anyone know if there's a plan to provide an option to make a BTRFS
>> filesystem a WORM (write-once-read-many)?  So essentially, once a file
>> has been written it can be neither altered nor deleted.  To delete would
>> require a newfs.  I know that there are extended attributes but root can
>> alter those on individual files.
>>
>> Anyway, thanks in advance for any help.

> There's support for Read-only snapshots, so you might be able to use
> that with some clever scripting =\

[Quote rearranged into standard contextual quote and reply order.]

If I'm reading between the lines correctly, that's not going to suffice.  
It sounds to me like the request is for an indelible/immutable journal, 
potentially backed by a legal requirement.

Think (US) SOX, Sarbanes-Oxley, requirements, or similar elsewhere.  Or 
consider a criminal computer-related case where an investigator's every 
action while analyzing a suspect's drive is write-once recorded, creating 
a permanent record to be used in court.  Or for that matter, consider the 
audit trail of an electronic voting machine.

In such cases an indelible log that cannot be altered without destruction 
is vital.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs oops (autodefrag related?)

2012-03-12 Thread Chris Mason
On Mon, Mar 12, 2012 at 09:32:54PM +0200, Avi Kivity wrote:
> Because I'm such a btrfs fanboi I'm running btrfs on my /, all past
> experience notwithstanding.  In an attempt to recover some performance,
> I enabled autodefrag, and got this in return:

Hi Avi,

This one was fixed in the 3.3 series.  You can pull from my for-linus
repo for a commit against 3.2.

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus

The individual fix is here:

http://git.kernel.org/?p=linux/kernel/git/mason/linux-btrfs.git;a=commit;h=87826df0ec36fc28884b4ddbb3f3af41c4c2008f

-chris

> 
> [567304.937620] [ cut here ]
> [567304.938525] kernel BUG at fs/btrfs/inode.c:1588!
> [567304.938525] invalid opcode:  [#1] SMP
> [567304.938525] CPU 0
> [567304.938525] Modules linked in: vfat fat usb_storage binfmt_misc
> tcp_lp ppdev parport_pc lp parport fuse ebtable_nat ebtables be2iscsi
> ipt_MASQUERADE iscsi_boot_sysfs iptable_nat nf_nat bnx2i xt_CHECKSUM
> cnic iptable_mangle bridge stp llc uio cxgb4i cxgb4 cxgb3i libcxgbi
> cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr
> iscsi_tcp libiscsi_tcp libiscsi lockd rfcomm scsi_transport_iscsi bnep
> nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_conntrack_ipv6
> nf_defrag_ipv6 xt_state ip6table_filter ip6_tables nf_conntrack
> snd_hda_codec_hdmi snd_hda_codec_conexant btusb bluetooth snd_hda_intel
> snd_hda_codec uvcvideo snd_hwdep snd_seq arc4 snd_seq_device snd_pcm
> videodev media thinkpad_acpi iwlwifi i2c_i801 snd_timer mac80211 tpm_tis
> tpm tpm_bios v4l2_compat_ioctl32 e1000e cfg80211 iTCO_wdt snd
> snd_page_alloc soundcore iTCO_vendor_support microcode joydev rfkill
> vhost_net macvtap macvlan tun virtio_net kvm_intel kvm sunrpc uinput xts
> gf128mul dm_crypt btrfs zlib_deflate libcrc32c sdhci_pci sdhci mmc_core
> wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
> scsi_wait_scan]
> [567304.938525]
> [567304.938525] Pid: 533, comm: btrfs-fixup-1 Not tainted
> 3.2.6-3.fc16.x86_64 #1 LENOVO 4174BH4/4174BH4
> [567304.938525] RIP: 0010:[]  []
> btrfs_writepage_fixup_worker+0x152/0x160 [btrfs]
> [567304.948748] RSP: 0018:88011276fde0  EFLAGS: 00010246
> [567304.948748] RAX:  RBX: ea000416a000 RCX:
> 0664a000
> [567304.948748] RDX: 8800452cbde8 RSI:  RDI:
> 8800452cbde8
> [567304.948748] RBP: 88011276fe30 R08: 88011e21a780 R09:
> 88011276fd98
> [567304.948748] R10: 0001 R11: 0010 R12:
> 0664a000
> [567304.948748] R13: 88002ecc0190 R14:  R15:
> 0664afff
> [567304.948748] FS:  () GS:88011e20()
> knlGS:
> [567304.948748] CS:  0010 DS:  ES:  CR0: 8005003b
> [567304.948748] CR2: 07ff00ac2000 CR3: bbd92000 CR4:
> 000426e0
> [567304.948748] DR0:  DR1:  DR2:
> 
> [567304.948748] DR3:  DR6: 0ff0 DR7:
> 0400
> [567304.948748] Process btrfs-fixup-1 (pid: 533, threadinfo
> 88011276e000, task 8801125e5c80)
> [567304.948748] Stack:
> [567304.948748]  880114cb4ea8 88002ecc0030 
> 88009b09b8c0
> [567304.948748]   880116896360 880114cb4ed0
> 8801168963b0
> [567304.948748]  880116896378 88011276fe98 88011276fee0
> a0163647
> [567304.948748] Call Trace:
> [567304.948748]  [] worker_loop+0x147/0x530 [btrfs]
> [567304.948748]  [] ? btrfs_queue_worker+0x2e0/0x2e0
> [btrfs]
> [567304.948748]  [] kthread+0x8c/0xa0
> [567304.948748]  [] kernel_thread_helper+0x4/0x10
> [567304.948748]  [] ? kthread_worker_fn+0x190/0x190
> [567304.948748]  [] ? gs_change+0x13/0x13
> [567304.948748] Code: 00 48 8b 7d b8 48 8d 4d c8 41 b8 50 00 00 00 4c 89
> fa 4c 89 e6 e8 3f 8f 01 00 eb b3 48 89 df e8 c5 bb fd e0 0f 1f 44 00 00
> eb 92 <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41
> [567304.948748] RIP  []
> btrfs_writepage_fixup_worker+0x152/0x160 [btrfs]
> [567304.948748]  RSP 
> [567305.036430] ---[ end trace 642b0cfbec5885d3 ]---
> 
> The thread that died was doing some O_DIRECT I/O.
> 
> 
> -- 
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
> 


Re: immutable (WORM) file system

2012-03-12 Thread David Sterba
On Mon, Mar 12, 2012 at 12:36:13PM -0700, Fong Vang wrote:
> Does anyone know if there's a plan to provide an option to make a
> BTRFS filesystem a WORM (write-once-read-many)?  So essentially, once a
> file has been written it can be neither altered nor deleted.  To delete
> would require a newfs.  I know that there are extended attributes but
> root can alter those on individual files.

There's a btrfs-specific per-inode flag, BTRFS_INODE_READONLY. It is used
in just one place (to check inode permissions upon write) and denies any
write operation. There's no direct way (like an ioctl call) to either
add or remove this flag, so it fulfils one half of what you want.

It's possible to enhance mkfs to set this flag on all files when
prefilled from a directory (option -r). For a change in a mounted
filesystem, a new ioctl doing the one-way switch to RO would be needed.

(I just know that the flag is there and is related to the question,
haven't tested it myself and do not know what was the original
intention.)


david


Re: immutable (WORM) file system

2012-03-12 Thread Chester
There's support for Read-only snapshots, so you might be able to use
that with some clever scripting =\
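
A minimal sketch of what such scripting might look like, assuming a
subvolume and a snapshot directory chosen by the caller (the function names,
the paths and the DRY_RUN switch are made up for illustration). It only
covers the "cannot be altered" half of the request: a read-only snapshot can
still be deleted by a sufficiently privileged user.

```shell
#!/bin/sh
# Sketch only: take a read-only snapshot of a subvolume under a given name.
# ro_snapshot_cmd and ro_snapshot are hypothetical helpers, not btrfs-progs
# commands. Paths containing whitespace are not handled in the printed form.

# Build the command line as text so it can be inspected before running.
ro_snapshot_cmd() {
    src=$1        # subvolume to snapshot
    dest_dir=$2   # directory that will hold the snapshots
    name=$3       # snapshot name, e.g. "$(date +%Y%m%d-%H%M%S)"
    printf 'btrfs subvolume snapshot -r %s %s/%s\n' "$src" "$dest_dir" "$name"
}

# With DRY_RUN=1 (the default in this sketch) the command is only printed.
ro_snapshot() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        ro_snapshot_cmd "$1" "$2" "$3"
    else
        btrfs subvolume snapshot -r "$1" "$2/$3"
    fi
}
```

Something like `DRY_RUN=0 ro_snapshot /data /data/.snapshots "$(date +%Y%m%d-%H%M%S)"`
run from cron would then accumulate read-only copies over time.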

On Mon, Mar 12, 2012 at 2:36 PM, Fong Vang  wrote:
> Does anyone know if there's a plan to provide an option to make a
> BTRFS filesystem a WORM (write-once-read-many)?  So essentially, once a
> file has been written it can be neither altered nor deleted.  To delete
> would require a newfs.  I know that there are extended attributes but
> root can alter those on individual files.
>
> Anyway, thanks in advance for any help.


immutable (WORM) file system

2012-03-12 Thread Fong Vang
Does anyone know if there's a plan to provide an option to make a
BTRFS filesystem a WORM (write-once-read-many)?  So essentially, once a
file has been written it can be neither altered nor deleted.  To delete
would require a newfs.  I know that there are extended attributes but
root can alter those on individual files.

Anyway, thanks in advance for any help.


btrfs oops (autodefrag related?)

2012-03-12 Thread Avi Kivity
Because I'm such a btrfs fanboi I'm running btrfs on my /, all past
experience notwithstanding.  In an attempt to recover some performance,
I enabled autodefrag, and got this in return:

[567304.937620] [ cut here ]
[567304.938525] kernel BUG at fs/btrfs/inode.c:1588!
[567304.938525] invalid opcode:  [#1] SMP
[567304.938525] CPU 0
[567304.938525] Modules linked in: vfat fat usb_storage binfmt_misc
tcp_lp ppdev parport_pc lp parport fuse ebtable_nat ebtables be2iscsi
ipt_MASQUERADE iscsi_boot_sysfs iptable_nat nf_nat bnx2i xt_CHECKSUM
cnic iptable_mangle bridge stp llc uio cxgb4i cxgb4 cxgb3i libcxgbi
cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr
iscsi_tcp libiscsi_tcp libiscsi lockd rfcomm scsi_transport_iscsi bnep
nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_conntrack_ipv6
nf_defrag_ipv6 xt_state ip6table_filter ip6_tables nf_conntrack
snd_hda_codec_hdmi snd_hda_codec_conexant btusb bluetooth snd_hda_intel
snd_hda_codec uvcvideo snd_hwdep snd_seq arc4 snd_seq_device snd_pcm
videodev media thinkpad_acpi iwlwifi i2c_i801 snd_timer mac80211 tpm_tis
tpm tpm_bios v4l2_compat_ioctl32 e1000e cfg80211 iTCO_wdt snd
snd_page_alloc soundcore iTCO_vendor_support microcode joydev rfkill
vhost_net macvtap macvlan tun virtio_net kvm_intel kvm sunrpc uinput xts
gf128mul dm_crypt btrfs zlib_deflate libcrc32c sdhci_pci sdhci mmc_core
wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded:
scsi_wait_scan]
[567304.938525]
[567304.938525] Pid: 533, comm: btrfs-fixup-1 Not tainted
3.2.6-3.fc16.x86_64 #1 LENOVO 4174BH4/4174BH4
[567304.938525] RIP: 0010:[]  []
btrfs_writepage_fixup_worker+0x152/0x160 [btrfs]
[567304.948748] RSP: 0018:88011276fde0  EFLAGS: 00010246
[567304.948748] RAX:  RBX: ea000416a000 RCX:
0664a000
[567304.948748] RDX: 8800452cbde8 RSI:  RDI:
8800452cbde8
[567304.948748] RBP: 88011276fe30 R08: 88011e21a780 R09:
88011276fd98
[567304.948748] R10: 0001 R11: 0010 R12:
0664a000
[567304.948748] R13: 88002ecc0190 R14:  R15:
0664afff
[567304.948748] FS:  () GS:88011e20()
knlGS:
[567304.948748] CS:  0010 DS:  ES:  CR0: 8005003b
[567304.948748] CR2: 07ff00ac2000 CR3: bbd92000 CR4:
000426e0
[567304.948748] DR0:  DR1:  DR2:

[567304.948748] DR3:  DR6: 0ff0 DR7:
0400
[567304.948748] Process btrfs-fixup-1 (pid: 533, threadinfo
88011276e000, task 8801125e5c80)
[567304.948748] Stack:
[567304.948748]  880114cb4ea8 88002ecc0030 
88009b09b8c0
[567304.948748]   880116896360 880114cb4ed0
8801168963b0
[567304.948748]  880116896378 88011276fe98 88011276fee0
a0163647
[567304.948748] Call Trace:
[567304.948748]  [] worker_loop+0x147/0x530 [btrfs]
[567304.948748]  [] ? btrfs_queue_worker+0x2e0/0x2e0
[btrfs]
[567304.948748]  [] kthread+0x8c/0xa0
[567304.948748]  [] kernel_thread_helper+0x4/0x10
[567304.948748]  [] ? kthread_worker_fn+0x190/0x190
[567304.948748]  [] ? gs_change+0x13/0x13
[567304.948748] Code: 00 48 8b 7d b8 48 8d 4d c8 41 b8 50 00 00 00 4c 89
fa 4c 89 e6 e8 3f 8f 01 00 eb b3 48 89 df e8 c5 bb fd e0 0f 1f 44 00 00
eb 92 <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41
[567304.948748] RIP  []
btrfs_writepage_fixup_worker+0x152/0x160 [btrfs]
[567304.948748]  RSP 
[567305.036430] ---[ end trace 642b0cfbec5885d3 ]---

The thread that died was doing some O_DIRECT I/O.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



[PATCH 3/3] Btrfs-progs: bring 'subvol get-default' back in

2012-03-12 Thread Ilya Dryomov
Commit bab2c565 accidentally broke the 'subvol get-default' command by
removing almost all of the underlying code.  Bring it back with some
fixes and improvements.

Signed-off-by: Ilya Dryomov 
---
 btrfs-list.c |   81 +-
 ctree.h  |2 +
 2 files changed, 82 insertions(+), 1 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index cc1dc66..00c428b 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -552,6 +552,60 @@ build:
return full;
 }
 
+static int get_default_subvolid(int fd, u64 *default_id)
+{
+   struct btrfs_ioctl_search_args args;
+   struct btrfs_ioctl_search_key *sk = &args.key;
+   struct btrfs_ioctl_search_header *sh;
+   u64 found = 0;
+   int ret;
+
+   memset(&args, 0, sizeof(args));
+
+   /*
+* search for a dir item with a name 'default' in the tree of
+* tree roots, it should point us to a default root
+*/
+   sk->tree_id = 1;
+
+   /* don't worry about ancient format and request only one item */
+   sk->nr_items = 1;
+
+   sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->max_type = BTRFS_DIR_ITEM_KEY;
+   sk->min_type = BTRFS_DIR_ITEM_KEY;
+   sk->max_offset = (u64)-1;
+   sk->max_transid = (u64)-1;
+
+   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+   if (ret < 0)
+   return ret;
+
+   /* the ioctl returns the number of items it found in nr_items */
+   if (sk->nr_items == 0)
+   goto out;
+
+   sh = (struct btrfs_ioctl_search_header *)args.buf;
+
+   if (sh->type == BTRFS_DIR_ITEM_KEY) {
+   struct btrfs_dir_item *di;
+   int name_len;
+   char *name;
+
+   di = (struct btrfs_dir_item *)(sh + 1);
+   name_len = btrfs_stack_dir_name_len(di);
+   name = (char *)(di + 1);
+
+   if (!strncmp("default", name, name_len))
+   found = btrfs_disk_key_objectid(&di->location);
+   }
+
+out:
+   *default_id = found;
+   return 0;
+}
+
 static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 {
int ret;
@@ -663,12 +717,32 @@ static int __list_subvol_fill_paths(int fd, struct root_lookup *root_lookup)
return 0;
 }
 
-int list_subvols(int fd, int print_parent)
+int list_subvols(int fd, int print_parent, int get_default)
 {
struct root_lookup root_lookup;
struct rb_node *n;
+   u64 default_id;
int ret;
 
+   if (get_default) {
+   ret = get_default_subvolid(fd, &default_id);
+   if (ret) {
+   fprintf(stderr, "ERROR: can't perform the search - %s\n",
+   strerror(errno));
+   return ret;
+   }
+   if (default_id == 0) {
+   fprintf(stderr, "ERROR: 'default' dir item not found\n");
+   return ret;
+   }
+
+   /* no need to resolve roots if FS_TREE is default */
+   if (default_id == BTRFS_FS_TREE_OBJECTID) {
+   printf("ID 5 (FS_TREE)\n");
+   return ret;
+   }
+   }
+
ret = __list_subvol_search(fd, &root_lookup);
if (ret) {
fprintf(stderr, "ERROR: can't perform the search - %s\n",
@@ -696,6 +770,11 @@ int list_subvols(int fd, int print_parent)
char *path;
 
entry = rb_entry(n, struct root_info, rb_node);
+   if (get_default && entry->root_id != default_id) {
+   n = rb_prev(n);
+   continue;
+   }
+
resolve_root(&root_lookup, entry, &parent_id, &level, &path);
if (print_parent) {
printf("ID %llu parent %llu top level %llu path %s\n",
diff --git a/ctree.h b/ctree.h
index 5309059..141ec59 100644
--- a/ctree.h
+++ b/ctree.h
@@ -1416,6 +1416,8 @@ BTRFS_SETGET_FUNCS(dir_type, struct btrfs_dir_item, type, 8);
 BTRFS_SETGET_FUNCS(dir_name_len, struct btrfs_dir_item, name_len, 16);
 BTRFS_SETGET_FUNCS(dir_transid, struct btrfs_dir_item, transid, 64);
 
+BTRFS_SETGET_STACK_FUNCS(stack_dir_name_len, struct btrfs_dir_item, name_len, 16);
+
 static inline void btrfs_dir_item_key(struct extent_buffer *eb,
  struct btrfs_dir_item *item,
  struct btrfs_disk_key *key)
-- 
1.7.9.1



[PATCH 2/3] Btrfs-progs: refactor resolve_root() function a bit

2012-03-12 Thread Ilya Dryomov
Don't pass a pointer to root_id to resolve_root().  It's always the same as
ri->root_id, and passing a pointer hints that root_id can somehow change,
which is not true.

Signed-off-by: Ilya Dryomov 
---
 btrfs-list.c |   21 ++---
 1 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 44a73de..cc1dc66 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -200,7 +200,7 @@ static int add_root(struct root_lookup *root_lookup,
  * in by lookup_ino_path
  */
 static int resolve_root(struct root_lookup *rl, struct root_info *ri,
-   u64 *root_id, u64 *parent_id, u64 *top_id, char **path)
+   u64 *parent_id, u64 *top_id, char **path)
 {
char *full_path = NULL;
int len = 0;
@@ -254,7 +254,6 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri,
}
}
 
-   *root_id = ri->root_id;
*path = full_path;
 
return 0;
@@ -692,23 +691,23 @@ int list_subvols(int fd, int print_parent)
n = rb_last(&root_lookup.root);
while (n) {
struct root_info *entry;
-   u64 root_id;
u64 level;
u64 parent_id;
char *path;
+
entry = rb_entry(n, struct root_info, rb_node);
-   resolve_root(&root_lookup, entry, &root_id, &parent_id,
-   &level, &path);
+   resolve_root(&root_lookup, entry, &parent_id, &level, &path);
if (print_parent) {
printf("ID %llu parent %llu top level %llu path %s\n",
-   (unsigned long long)root_id,
+   (unsigned long long)entry->root_id,
(unsigned long long)parent_id,
(unsigned long long)level, path);
} else {
printf("ID %llu top level %llu path %s\n",
-   (unsigned long long)root_id,
+   (unsigned long long)entry->root_id,
(unsigned long long)level, path);
}
+
free(path);
n = rb_prev(n);
}
@@ -914,17 +913,17 @@ char *path_for_root(int fd, u64 root)
n = rb_last(&root_lookup.root);
while (n) {
struct root_info *entry;
-   u64 root_id;
u64 parent_id;
u64 level;
char *path;
+
entry = rb_entry(n, struct root_info, rb_node);
-   resolve_root(&root_lookup, entry, &root_id, &parent_id, &level,
-   &path);
-   if (root_id == root)
+   resolve_root(&root_lookup, entry, &parent_id, &level, &path);
+   if (entry->root_id == root)
ret_path = path;
else
free(path);
+
n = rb_prev(n);
}
 
-- 
1.7.9.1



[PATCH 1/3] Btrfs-progs: nuke redundant zeroing in __list_subvol_search()

2012-03-12 Thread Ilya Dryomov
There's no need to zero out things twice.

Signed-off-by: Ilya Dryomov 
---
 btrfs-list.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 5f4a9be..44a73de 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -569,10 +569,6 @@ static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
root_lookup_init(root_lookup);
memset(&args, 0, sizeof(args));
 
-   root_lookup_init(root_lookup);
-
-   memset(&args, 0, sizeof(args));
-
/* search in the tree of tree roots */
sk->tree_id = 1;
 
-- 
1.7.9.1



[PATCH 0/3] Btrfs-progs: fix 'subvol get-default' command

2012-03-12 Thread Ilya Dryomov
'btrfs subvol get-default' has been broken for a while, fix it.  Patches
1 and 2 are straightforward cleanups in that area, patch 3 fixes the
problem.
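
For scripts that consume the command's output, here is a rough sketch of
pulling the subvolume ID out of it. It assumes the output lines have the
shapes printed by the code in patch 3 ("ID 5 (FS_TREE)" or "ID <n> top level
<m> path <p>"); the helper name default_subvol_id is made up for this
example.

```shell
#!/bin/sh
# Sketch only: read 'btrfs subvolume get-default' output on stdin and print
# the numeric ID from the first line whose first field is "ID".
# default_subvol_id is a hypothetical helper, not part of btrfs-progs.
default_subvol_id() {
    awk '$1 == "ID" { print $2; exit }'
}

# Intended use (mount point is an example):
#   btrfs subvolume get-default /mnt | default_subvol_id
```

Matching on the leading "ID" field keeps the helper independent of whatever
follows the number, so both output shapes are handled the same way.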

Thanks,

Ilya


Ilya Dryomov (3):
  Btrfs-progs: nuke redundant zeroing in __list_subvol_search()
  Btrfs-progs: refactor resolve_root() function a bit
  Btrfs-progs: bring 'subvol get-default' back in

 btrfs-list.c |  106 +-
 ctree.h  |2 +
 2 files changed, 92 insertions(+), 16 deletions(-)

-- 
1.7.9.1



Re: kernel BUG at fs/btrfs/delayed-inode.c:1466!

2012-03-12 Thread Johannes Hirte
On Mon, 12 Mar 2012 15:21:49 +0100, Jacek Luczak wrote:

> > 2) A *regression* in 3.3.0-rc6-00197-g9f8050c
> > - completely unusable as reports ENOSPC
> > - to reproduce, mount volume and issue:
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > On my host this shows:
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 423
> > - remount to reset:
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 1
> > # umount /btrfs/
> > # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 423
> > - bisected down to 5500cdb (Btrfs: increase the global block reserve
> > estimates). After reverting this one Linus master works for me
> > again.
> 
> With the above patch reverted, after a longer run I got ENOSPC again:
> 1) # df -hP /btrfs
> FilesystemSize  Used Avail Use% Mounted on
> /dev/mapper/vg00-btrfs  195G  179G   11G  95% /btrfs
> 2) # rm -f /btrfs/dd
> rm: cannot remove `/btrfs/dd': No space left on device
> 3) strace
> unlink("/btrfs/dd")= -1 ENOSPC (No space left on device)
> 4) last message from kernel (except WARN_ONs):
> btrfs: fail to dirty inode 116882385 error -28
> 
> I remounted the volume and after that I was able to remove the dd file
> from the volume. In dmesg there's a bunch of new WARN_ONs:
> [ cut here ]
> WARNING: at fs/btrfs/extent-tree.c:4185
> btrfs_free_block_groups+0x17d/0x2b8 [btrfs]()
> Hardware name: ProLiant BL460c G6
> Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
> autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
> ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
> ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
> dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac
> parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm
> hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core
> hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
> Pid: 9518, comm: umount Tainted: GW
> 3.3.0-rc6-00197-g9f8050c-dirty #1 Call Trace:
>  [] ? print_oops_end_marker+0x9/0x20
>  [] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
>  [] ? warn_slowpath_common+0x78/0x8d
>  [] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
>  [] ? close_ctree+0x1e1/0x380 [btrfs]
>  [] ? dispose_list+0x27/0x31
>  [] ? evict_inodes+0xc5/0xcc
>  [] ? generic_shutdown_super+0x4d/0xc1
>  [] ? kill_anon_super+0x9/0x11
>  [] ? btrfs_kill_super+0xd/0x73 [btrfs]
>  [] ? deactivate_locked_super+0x2f/0x5f
>  [] ? sys_umount+0x2c1/0x30b
>  [] ? system_call_fastpath+0x16/0x1b
> ---[ end trace fd6da849e53b77dd ]---
> [ cut here ]
> WARNING: at fs/btrfs/extent-tree.c:4186
> btrfs_free_block_groups+0x198/0x2b8 [btrfs]()
> Hardware name: ProLiant BL460c G6
> Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
> autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
> ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
> ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
> dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac
> parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm
> hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core
> hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
> Pid: 9518, comm: umount Tainted: GW
> 3.3.0-rc6-00197-g9f8050c-dirty #1 Call Trace:
>  [] ? print_oops_end_marker+0x9/0x20
>  [] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
>  [] ? warn_slowpath_common+0x78/0x8d
>  [] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
>  [] ? close_ctree+0x1e1/0x380 [btrfs]
>  [] ? dispose_list+0x27/0x31
>  [] ? evict_inodes+0xc5/0xcc
>  [] ? generic_shutdown_super+0x4d/0xc1
>  [] ? kill_anon_super+0x9/0x11
>  [] ? btrfs_kill_super+0xd/0x73 [btrfs]
>  [] ? deactivate_locked_super+0x2f/0x5f
>  [] ? sys_umount+0x2c1/0x30b
>  [] ? system_call_fastpath+0x16/0x1b
> ---[ end trace fd6da849e53b77de ]---
> [ cut here ]
> WARNING: at fs/btrfs/extent-tree.c:4187
> btrfs_free_block_groups+0x1b3/0x2b8 [btrfs]()
> Hardware name: ProLiant BL460c G6
> Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
> autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
> ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
> ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
> dm_region_hash dm_log 

Re: kernel BUG at fs/btrfs/delayed-inode.c:1466!

2012-03-12 Thread Jacek Luczak
2012/3/10 Jacek Luczak :
> 2012/3/9 David Sterba :
>> On Fri, Mar 09, 2012 at 12:08:12PM +0100, Jacek Luczak wrote:
>>> For this one I've created also a report [1].
>>> >
>>> > so there is probably other problem in reservations and it just blew up 
>>> > during
>>> > the unlink call.
>>>
>>> Could be as this came up after a longer time of throwing above WARN_ON.
>>>
>>> I'm now cloning the Linus tree. Lets see if both will pop up on there.
>>
>> The 3.3-rc6 should help in one case, with
>>
>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/15268
>>
>> but I was able to reproduce the WARN_ON even with this patch, didn't get
>> to debugging it again yet.
>>
>
>
> The story so far looks like this:
> 1) kernel 3.2.7:
> - the BUG_ON triggers after the CI env (doing builds) has been running
> for a longer while. This has already been reproduced twice.
> - WARN_ON spams heavily, even after the BUG_ON pops up.
> - possible relation between WARN_ON and BUG_ON.

The WARN_ON still pops up in 3.3.0-rc6-00197-g9f8050c, but it did not
trigger the BUG_ON after ~300 occurrences.

> 2) A *regression* in 3.3.0-rc6-00197-g9f8050c
> - completely unusable as reports ENOSPC
> - to reproduce, mount volume and issue:
> # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> On my host this shows:
> # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 423
> - remount to reset:
> # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 1
> # umount /btrfs/
> # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
> # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> touch: cannot touch `/btrfs/dd': No space left on device
> 423
> - bisected down to 5500cdb (Btrfs: increase the global block reserve
> estimates). After reverting this one Linus master works for me again.

With the above patch reverted, after a longer run I got ENOSPC again:
1) # df -hP /btrfs
FilesystemSize  Used Avail Use% Mounted on
/dev/mapper/vg00-btrfs  195G  179G   11G  95% /btrfs
2) # rm -f /btrfs/dd
rm: cannot remove `/btrfs/dd': No space left on device
3) strace
unlink("/btrfs/dd")= -1 ENOSPC (No space left on device)
4) last message from kernel (except WARN_ONs):
btrfs: fail to dirty inode 116882385 error -28

I remounted the volume and after that I was able to remove the dd file
from the volume. In dmesg there's a bunch of new WARN_ONs:
[ cut here ]
WARNING: at fs/btrfs/extent-tree.c:4185
btrfs_free_block_groups+0x17d/0x2b8 [btrfs]()
Hardware name: ProLiant BL460c G6
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac
parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm
hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core
hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
Pid: 9518, comm: umount Tainted: G        W    3.3.0-rc6-00197-g9f8050c-dirty #1
Call Trace:
 [] ? print_oops_end_marker+0x9/0x20
 [] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
 [] ? warn_slowpath_common+0x78/0x8d
 [] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
 [] ? close_ctree+0x1e1/0x380 [btrfs]
 [] ? dispose_list+0x27/0x31
 [] ? evict_inodes+0xc5/0xcc
 [] ? generic_shutdown_super+0x4d/0xc1
 [] ? kill_anon_super+0x9/0x11
 [] ? btrfs_kill_super+0xd/0x73 [btrfs]
 [] ? deactivate_locked_super+0x2f/0x5f
 [] ? sys_umount+0x2c1/0x30b
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace fd6da849e53b77dd ]---
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:4186
btrfs_free_block_groups+0x198/0x2b8 [btrfs]()
Hardware name: ProLiant BL460c G6
Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac
parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm
hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core
hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
Pid: 9518, comm: umount Tainted: G        W    3.3.0-rc6-00197-g9f8050c-dirty #1
Call Trace:
 [] ? print_oops_end_marker+0x9/0x20
 [] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
 [] ? warn_slowpath_common+0x78/0x8d
 [] ? btrfs_free_block_groups+0x198/0x2b8 

Re: ext3/4, btrfs, ocfs2: How to assure that cleancache_invalidate_fs is called on every superblock free

2012-03-12 Thread Jan Kara
  Hello,

On Fri 09-03-12 14:40:22, Andor Daam wrote:
> Is it ever possible for a superblock for a mounted filesystem to be
> free'd without a previous call to unmount the filesystem?
  No, I don't think so (well, except for cases where we do not manage to
fully set up the superblock). But be aware that mount/umount need not
really be the entry points you are looking for, since a filesystem can be
mounted several times. Rather, deactivate_locked_super() is the place you
are looking for...
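
A rough user-space sketch of why mount/umount are not the right hook
points, assuming teardown runs only when the last active reference to the
superblock is dropped. This is not kernel code: fake_sb, fake_mount,
fake_umount and invalidate_calls are made-up names for illustration, and
the real logic lives in fs/super.c.

```c
#include <assert.h>

/* Illustrative stand-in for a superblock; not the kernel's struct super_block. */
struct fake_sb {
	int active;            /* number of active references (mounts) */
	int invalidate_calls;  /* how many times the teardown hook ran */
};

/* Each additional mount of the same filesystem takes one more reference. */
static void fake_mount(struct fake_sb *sb)
{
	sb->active++;
}

/*
 * Dropping a reference: the teardown hook runs only when the last
 * active reference goes away, not on every umount.
 */
static void fake_umount(struct fake_sb *sb)
{
	assert(sb->active > 0);
	if (--sb->active == 0)
		sb->invalidate_calls++;  /* stands in for cleancache_invalidate_fs() */
}
```

With two mounts of the same filesystem, the first fake_umount() leaves
invalidate_calls at 0 and only the second one runs the hook.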

> I need to be certain that the function cleancache_invalidate_fs, which is
> at the moment called by deactivate_locked_super (fs/super.c) [1], is
> called before every free on a superblock of cleancache-enabled
> filesystems.  Is this already the case or are there situations in which
> this does not happen?
> 
> It would be interesting to know this, as we are planning to have
> cleancache save pointers to superblocks of every mounted
> cleancache-enabled filesystem [2] and it would be fatal if a
> superblock is free'd without cleancache being notified.

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2][RESEND] Btrfs: avoid possible use-after-free in clear_extent_bit()

2012-03-12 Thread Li Zefan
clear_extent_bit()
{
	next_node = rb_next(&state->rb_node);
	...
	clear_state_bit(state);  <-- this may free next_node
	if (next_node) {
		state = rb_entry(next_node);
		...
	}
}

clear_state_bit() calls merge_state(), which may free the next node
of the extent_state passed in, so clear_extent_bit() may end up
referencing freed memory.

Signed-off-by: Li Zefan 
---

no more Chinese characters in this section. ;)

---
 fs/btrfs/extent_io.c |   36 +++++++++++++++++++++---------------
 1 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c968c95..20f2f5a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -392,6 +392,15 @@ static int split_state(struct extent_io_tree *tree, struct extent_state *orig,
return 0;
 }
 
+static struct extent_state *next_state(struct extent_state *state)
+{
+   struct rb_node *next = rb_next(&state->rb_node);
+   if (next)
+   return rb_entry(next, struct extent_state, rb_node);
+   else
+   return NULL;
+}
+
 /*
  * utility function to clear some bits in an extent state struct.
  * it will optionally wake up any one waiting on this state (wake == 1)
@@ -399,10 +408,11 @@ static int split_state(struct extent_io_tree *tree, struct extent_state *orig,
  * If no bits are set on the state struct after clearing things, the
  * struct is freed and removed from the tree
  */
-static void clear_state_bit(struct extent_io_tree *tree,
-   struct extent_state *state,
-   int *bits, int wake)
+static struct extent_state *clear_state_bit(struct extent_io_tree *tree,
+   struct extent_state *state,
+   int *bits, int wake)
 {
+   struct extent_state *next;
int bits_to_clear = *bits & ~EXTENT_CTLBITS;
 
if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) {
@@ -415,6 +425,7 @@ static void clear_state_bit(struct extent_io_tree *tree,
if (wake)
wake_up(&state->wq);
if (state->state == 0) {
+   next = next_state(state);
if (state->tree) {
rb_erase(&state->rb_node, &tree->state);
state->tree = NULL;
@@ -424,7 +435,9 @@ static void clear_state_bit(struct extent_io_tree *tree,
}
} else {
merge_state(tree, state);
+   next = next_state(state);
}
+   return next;
 }
 
 static struct extent_state *
@@ -456,7 +469,6 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
struct extent_state *state;
struct extent_state *cached;
struct extent_state *prealloc = NULL;
-   struct rb_node *next_node;
struct rb_node *node;
u64 last_end;
int err;
@@ -508,14 +520,11 @@ hit_next:
WARN_ON(state->end < start);
last_end = state->end;
 
-   if (state->end < end && !need_resched())
-   next_node = rb_next(&state->rb_node);
-   else
-   next_node = NULL;
-
/* the state doesn't have the wanted bits, go ahead */
-   if (!(state->state & bits))
+   if (!(state->state & bits)) {
+   state = next_state(state);
goto next;
+   }
 
/*
 * |  desired range  |
@@ -569,16 +578,13 @@ hit_next:
goto out;
}
 
-   clear_state_bit(tree, state, &bits, wake);
+   state = clear_state_bit(tree, state, &bits, wake);
 next:
if (last_end == (u64)-1)
goto out;
start = last_end + 1;
-   if (start <= end && next_node) {
-   state = rb_entry(next_node, struct extent_state,
-rb_node);
+   if (start <= end && state && !need_resched())
goto hit_next;
-   }
goto search_again;
 
 out:
-- 1.7.3.1 



[PATCH 2/2] Btrfs: avoid possible use-after-free in clear_extent_bit()

2012-03-12 Thread Li Zefan
clear_extent_bit()
{
	next_node = rb_next(&state->rb_node);
	...
	clear_state_bit(state);  <-- this may free next_node
	if (next_node) {
		state = rb_entry(next_node);
		...
	}
}

clear_state_bit() calls merge_state(), which may free the next node
of the extent_state passed in, so clear_extent_bit() may end up
referencing freed memory.

Signed-off-by: Li Zefan 
---

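Testing showed that clear_state_bit() frees the state outright in 80% of
cases. So the earlier, simpler patch would goto back and search the rbtree
again in 80% of cases, which would be slow. Therefore clear_state_bit()
now returns the next state instead.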
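
The pattern can be sketched in user space as follows. This is a
simplified illustration, not the btrfs code: it uses a doubly linked list
instead of an rbtree, merges only with the following node, and all names
(struct range, push_front, clear_range_bits, clear_all, list_len) are
made up for the example. The point is that the caller never keeps a
"next" pointer across the call that may free nodes; it continues from
whatever the clearing function returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for struct extent_state. */
struct range {
	unsigned bits;
	struct range *prev, *next;
};

static struct range *push_front(struct range **head, unsigned bits)
{
	struct range *r = malloc(sizeof(*r));
	assert(r != NULL);
	r->bits = bits;
	r->prev = NULL;
	r->next = *head;
	if (*head)
		(*head)->prev = r;
	*head = r;
	return r;
}

static void unlink_range(struct range **head, struct range *r)
{
	if (r->prev)
		r->prev->next = r->next;
	else
		*head = r->next;
	if (r->next)
		r->next->prev = r->prev;
}

/*
 * Clear 'bits' in 'cur'. Either 'cur' itself or its successor may be
 * freed here, so the successor is looked up inside this function and
 * returned, rather than being cached by the caller beforehand.
 */
static struct range *clear_range_bits(struct range **head,
				      struct range *cur, unsigned bits)
{
	struct range *next;

	cur->bits &= ~bits;
	if (cur->bits == 0) {
		next = cur->next;      /* read the successor before freeing */
		unlink_range(head, cur);
		free(cur);
		return next;
	}
	/* Merge: absorb a successor that now carries identical bits. */
	next = cur->next;
	if (next && next->bits == cur->bits) {
		unlink_range(head, next);
		free(next);
	}
	return cur->next;              /* looked up after the merge */
}

/* Walk the list, always continuing from the returned pointer. */
static void clear_all(struct range **head, unsigned bits)
{
	struct range *cur = *head;

	while (cur)
		cur = clear_range_bits(head, cur, bits);
}

static int list_len(const struct range *r)
{
	int n = 0;

	for (; r; r = r->next)
		n++;
	return n;
}
```

Had clear_all() saved cur->next before calling clear_range_bits(), the
merge case would leave it holding a pointer to a freed node, which is the
bug described in the changelog above.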

---
 fs/btrfs/extent_io.c |   36 +++++++++++++++++++++---------------
 1 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c968c95..20f2f5a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -392,6 +392,15 @@ static int split_state(struct extent_io_tree *tree, struct extent_state *orig,
return 0;
 }
 
+static struct extent_state *next_state(struct extent_state *state)
+{
+   struct rb_node *next = rb_next(&state->rb_node);
+   if (next)
+   return rb_entry(next, struct extent_state, rb_node);
+   else
+   return NULL;
+}
+
 /*
  * utility function to clear some bits in an extent state struct.
  * it will optionally wake up any one waiting on this state (wake == 1)
@@ -399,10 +408,11 @@ static int split_state(struct extent_io_tree *tree, struct extent_state *orig,
  * If no bits are set on the state struct after clearing things, the
  * struct is freed and removed from the tree
  */
-static void clear_state_bit(struct extent_io_tree *tree,
-   struct extent_state *state,
-   int *bits, int wake)
+static struct extent_state *clear_state_bit(struct extent_io_tree *tree,
+   struct extent_state *state,
+   int *bits, int wake)
 {
+   struct extent_state *next;
int bits_to_clear = *bits & ~EXTENT_CTLBITS;
 
if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) {
@@ -415,6 +425,7 @@ static void clear_state_bit(struct extent_io_tree *tree,
if (wake)
wake_up(&state->wq);
if (state->state == 0) {
+   next = next_state(state);
if (state->tree) {
rb_erase(&state->rb_node, &tree->state);
state->tree = NULL;
@@ -424,7 +435,9 @@ static void clear_state_bit(struct extent_io_tree *tree,
}
} else {
merge_state(tree, state);
+   next = next_state(state);
}
+   return next;
 }
 
 static struct extent_state *
@@ -456,7 +469,6 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
struct extent_state *state;
struct extent_state *cached;
struct extent_state *prealloc = NULL;
-   struct rb_node *next_node;
struct rb_node *node;
u64 last_end;
int err;
@@ -508,14 +520,11 @@ hit_next:
WARN_ON(state->end < start);
last_end = state->end;
 
-   if (state->end < end && !need_resched())
-   next_node = rb_next(&state->rb_node);
-   else
-   next_node = NULL;
-
/* the state doesn't have the wanted bits, go ahead */
-   if (!(state->state & bits))
+   if (!(state->state & bits)) {
+   state = next_state(state);
goto next;
+   }
 
/*
 * |  desired range  |
@@ -569,16 +578,13 @@ hit_next:
goto out;
}
 
-   clear_state_bit(tree, state, &bits, wake);
+   state = clear_state_bit(tree, state, &bits, wake);
 next:
if (last_end == (u64)-1)
goto out;
start = last_end + 1;
-   if (start <= end && next_node) {
-   state = rb_entry(next_node, struct extent_state,
-rb_node);
+   if (start <= end && state && !need_resched())
goto hit_next;
-   }
goto search_again;
 
 out:
-- 1.7.3.1 


[PATCH 1/2] Btrfs: make clear_extent_bit() always return 0 on success

2012-03-12 Thread Li Zefan
Currently it returns a set of bits that were cleared, but this return
value is not used at all.

Moreover, it doesn't seem to be useful, because we may clear the bits
of several extent_states, but only the cleared bits of the last one are
returned.

Signed-off-by: Li Zefan 
---
 fs/btrfs/extent_io.c |   19 +++++++------------
 1 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a55fbe6..c968c95 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -394,18 +394,16 @@ static int split_state(struct extent_io_tree *tree, struct extent_state *orig,
 
 /*
  * utility function to clear some bits in an extent state struct.
- * it will optionally wake up any one waiting on this state (wake == 1), or
- * forcibly remove the state from the tree (delete == 1).
+ * it will optionally wake up any one waiting on this state (wake == 1)
  *
  * If no bits are set on the state struct after clearing things, the
  * struct is freed and removed from the tree
  */
-static int clear_state_bit(struct extent_io_tree *tree,
+static void clear_state_bit(struct extent_io_tree *tree,
struct extent_state *state,
int *bits, int wake)
 {
int bits_to_clear = *bits & ~EXTENT_CTLBITS;
-   int ret = state->state & bits_to_clear;
 
if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) {
u64 range = state->end - state->start + 1;
@@ -427,7 +425,6 @@ static int clear_state_bit(struct extent_io_tree *tree,
} else {
merge_state(tree, state);
}
-   return ret;
 }
 
 static struct extent_state *
@@ -449,8 +446,7 @@ alloc_extent_state_atomic(struct extent_state *prealloc)
  *
  * the range [start, end] is inclusive.
  *
- * This takes the tree lock, and returns < 0 on error, > 0 if any of the
- * bits were already set, or zero if none of the bits were already set.
+ * This takes the tree lock, and returns < 0 on error.
  */
 int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 int bits, int wake, int delete,
@@ -464,7 +460,6 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
struct rb_node *node;
u64 last_end;
int err;
-   int set = 0;
int clear = 0;
 
if (delete)
@@ -547,7 +542,7 @@ hit_next:
if (err)
goto out;
if (state->end <= end) {
-   set |= clear_state_bit(tree, state, &bits, wake);
+   clear_state_bit(tree, state, &bits, wake);
if (last_end == (u64)-1)
goto out;
start = last_end + 1;
@@ -568,13 +563,13 @@ hit_next:
if (wake)
wake_up(&state->wq);
 
-   set |= clear_state_bit(tree, prealloc, &bits, wake);
+   clear_state_bit(tree, prealloc, &bits, wake);
 
prealloc = NULL;
goto out;
}
 
-   set |= clear_state_bit(tree, state, &bits, wake);
+   clear_state_bit(tree, state, &bits, wake);
 next:
if (last_end == (u64)-1)
goto out;
@@ -591,7 +586,7 @@ out:
if (prealloc)
free_extent_state(prealloc);
 
-   return set;
+   return 0;
 
 search_again:
if (start > end)
-- 1.7.3.1 