[PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Josef Bacik
There have been many sporadic reports of the following panic:

[ cut here ]
kernel BUG at fs/btrfs/extent-tree.c:5498!
invalid opcode:  [#1] PREEMPT SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 7
Modules linked in: btrfs zlib_deflate libcrc32c netconsole configfs 
ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand 
acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 
nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath kvm uinput 
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq 
snd_seq_device snd_pcm snd_timer snd hp_wmi i5400_edac sparse_keymap iTCO_wdt 
rfkill edac_core tg3 shpchp iTCO_vendor_support soundcore wmi floppy 
snd_page_alloc pcspkr i5k_amb [last unloaded: btrfs]

Pid: 28504, comm: btrfs-endio-wri Tainted: GW   2.6.39-rc2+ #35 
Hewlett-Packard HP xw6600 Workstation/0A9Ch
RIP: 0010:[]  [] 
alloc_reserved_file_extent+0x9a/0x1e5 [btrfs]
RSP: 0018:88000b4319f0  EFLAGS: 00010286
RAX: ffe4 RBX: 880009fdc438 RCX: 880020c216d0
RDX: 88000b4318c0 RSI: 00d5 RDI: 
RBP: 88000b431a70 R08: ffe4 R09: 880020c216d0
R10: 0001 R11: 88000b431b10 R12: 88000b431b10
R13: 00b2 R14:  R15: 88002225f2f8
FS:  () GS:88003e40() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 003738ca6940 CR3: 2a39a000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process btrfs-endio-wri (pid: 28504, threadinfo 88000b43, task 
880032278000)
Stack:
 0001 88002a92 881d 038d
  0005 88003aa38000 81481012
 88000c3bb480 8800241d01c8 88000b431a60 880031a040a8
Call Trace:
 [] ? sub_preempt_count+0x97/0xaa
 [] run_clustered_refs+0x61b/0x700 [btrfs]
 [] ? sub_preempt_count+0xe/0xaa
 [] ? spin_lock+0xe/0x10 [btrfs]
 [] btrfs_run_delayed_refs+0xd1/0x1ab [btrfs]
 [] ? _raw_spin_unlock+0x4a/0x57
 [] __btrfs_end_transaction+0x89/0x1ed [btrfs]
 [] btrfs_end_transaction+0x15/0x17 [btrfs]
 [] btrfs_finish_ordered_io+0x29c/0x2bf [btrfs]
 [] btrfs_writepage_end_io_hook+0x81/0x8d [btrfs]
 [] end_bio_extent_writepage+0xae/0x159 [btrfs]
 [] bio_endio+0x2d/0x2f
 [] end_workqueue_fn+0x111/0x120 [btrfs]
 [] worker_loop+0x192/0x4d1 [btrfs]
 [] ? btrfs_queue_worker+0x22c/0x22c [btrfs]
 [] kthread+0xa0/0xa8
 [] ? trace_hardirqs_on_caller+0x111/0x135
 [] kernel_thread_helper+0x4/0x10
 [] ? retint_restore_args+0x13/0x13
 [] ? __init_kthread_worker+0x5b/0x5b
 [] ? gs_change+0x13/0x13
Code: 44 8b 45 90 0f 84 58 01 00 00 80 88 88 00 00 00 08 41 83 c0 18 4c 89 e1 
48 8b 72 20 4c 89 ff 48 89 c2 e8 1f b4 ff ff 85 c0 74 04 <0f> 0b eb fe 48 8b 03 
48 89 45 c8 8b 73 40 48 89 c7 e8 bc 98 ff
RIP  [] alloc_reserved_file_extent+0x9a/0x1e5 [btrfs]
 RSP 
---[ end trace 81d1c68cb00af83e ]---

This is because we have been releasing the delalloc bytes before ending the
transaction.  However, because of the way we make allocations, any updates to
the extent tree are delayed and only run when the transaction runs, so at that
point we still need the space we had reserved.  So instead release the delalloc
bytes _after_ we end the transaction so that we don't get this false ENOSPC.
Thanks,

Signed-off-by: Josef Bacik 
---
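The ordering argument above can be modeled in a few lines of C. This is only
a toy sketch, not btrfs code: struct toy_rsv, struct toy_trans and every
toy_*/finish_* function are hypothetical names. It assumes that the delayed
extent tree updates draw from the same reserve that the release call shrinks.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a block reserve: just a byte counter. */
struct toy_rsv {
	unsigned long reserved;
};

/* Hypothetical transaction handle with queued (delayed) updates. */
struct toy_trans {
	struct toy_rsv *rsv;		/* reserve the delayed updates draw from */
	unsigned long delayed_cost;	/* bytes needed once the updates run */
};

/* Give back up to `bytes` of whatever is still reserved. */
void toy_release(struct toy_rsv *rsv, unsigned long bytes)
{
	if (bytes > rsv->reserved)
		bytes = rsv->reserved;
	rsv->reserved -= bytes;
}

/* Ending the transaction runs the queued updates against the reserve. */
int toy_end_transaction(struct toy_trans *trans)
{
	assert(trans && trans->rsv);
	if (trans->rsv->reserved < trans->delayed_cost)
		return -ENOSPC;
	trans->rsv->reserved -= trans->delayed_cost;
	trans->delayed_cost = 0;
	return 0;
}

/* Old order: release first, then end the transaction. */
int finish_release_first(struct toy_trans *trans, unsigned long bytes)
{
	toy_release(trans->rsv, bytes);
	return toy_end_transaction(trans);
}

/* Patched order: end the transaction first, then release what is left. */
int finish_release_last(struct toy_trans *trans, unsigned long bytes)
{
	int ret = toy_end_transaction(trans);

	toy_release(trans->rsv, bytes);
	return ret;
}
```

With 4096 bytes reserved and a 4096-byte delayed update queued,
finish_release_first() returns -ENOSPC in this model, while
finish_release_last() returns 0.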
 fs/btrfs/inode.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ade00e7..b1e5b11 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1783,9 +1783,13 @@ out:
 		if (trans)
 			btrfs_end_transaction_nolock(trans, root);
 	} else {
-		btrfs_delalloc_release_metadata(inode, ordered_extent->len);
 		if (trans)
 			btrfs_end_transaction(trans, root);
+		/*
+		 * Release after the transaction ends so it covers the delayed
+		 * ref updates
+		 */
+		btrfs_delalloc_release_metadata(inode, ordered_extent->len);
 	}
 
 	/* once for us */
@@ -5897,8 +5901,8 @@ out_unlock:
 				     ordered->file_offset + ordered->len - 1,
 				     &cached_state, GFP_NOFS);
 out:
-	btrfs_delalloc_release_metadata(inode, ordered->len);
 	btrfs_end_transaction(trans, root);
+	btrfs_delalloc_release_metadata(inode, ordered->len);
 	ordered_offset = ordered->file_offset + ordered->len;
 	btrfs_put_ordered_extent(ordered);
 	btrfs_put_ordered_extent(ordered);
-- 
1.7.2.3

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.ht

Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Arne Jansen

On 13.04.2011 20:54, Josef Bacik wrote:

> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index ade00e7..b1e5b11 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -1783,9 +1783,13 @@ out:
>  		if (trans)
>  			btrfs_end_transaction_nolock(trans, root);
>  	} else {
> -		btrfs_delalloc_release_metadata(inode, ordered_extent->len);
>  		if (trans)
>  			btrfs_end_transaction(trans, root);
> +		/*
> +		 * Release after the transaction ends so it covers the delayed
> +		 * ref updates
> +		 */
> +		btrfs_delalloc_release_metadata(inode, ordered_extent->len);


I think calling end_transaction doesn't guarantee that all delayed refs
have run; that only happens if end_transaction leads to a transaction commit.
Another problem I see is that commit_transaction just uses the block_rsv
of whatever trans happened to call commit, even if trans->block_rsv
has been set to a block_rsv other than trans_block_rsv or
delalloc_block_rsv. In other words, the delayed refs are run from a
non-deterministic block_rsv.
But it's late, I'll think more about it tomorrow.
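
The two concerns above can be illustrated with a small C model. It is a
sketch under assumptions, not the kernel's logic: struct sim_fs, struct
sim_rsv, struct sim_handle and the sim_* functions are hypothetical, and the
batch size passed to sim_end_transaction() is an arbitrary stand-in for
"only some of the delayed refs get run".

```c
#include <assert.h>

/* Hypothetical block reserve that only counts how many refs it paid for. */
struct sim_rsv {
	const char *name;
	unsigned long used;
};

/* Hypothetical filesystem state: number of delayed refs still queued. */
struct sim_fs {
	unsigned long pending_refs;
};

/* Hypothetical transaction handle bound to one reserve. */
struct sim_handle {
	struct sim_fs *fs;
	struct sim_rsv *rsv;
};

/* Run up to `count` pending refs and charge each one to `rsv`. */
unsigned long sim_run_delayed_refs(struct sim_fs *fs, struct sim_rsv *rsv,
				   unsigned long count)
{
	unsigned long run = count < fs->pending_refs ? count : fs->pending_refs;

	assert(fs && rsv);
	fs->pending_refs -= run;
	rsv->used += run;
	return run;
}

/* Ending a handle only processes a bounded batch of refs. */
void sim_end_transaction(struct sim_handle *h, unsigned long batch)
{
	sim_run_delayed_refs(h->fs, h->rsv, batch);
}

/* Committing drains every ref, charged to whichever handle commits. */
void sim_commit_transaction(struct sim_handle *h)
{
	sim_run_delayed_refs(h->fs, h->rsv, h->fs->pending_refs);
}
```

With ten refs queued, ending a handle with a batch of four leaves six
pending. If a second handle bound to a different reserve then commits, those
six are charged to that other reserve, which is the non-determinism described
above.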

-Arne



Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Yan, Zheng
On Thu, Apr 14, 2011 at 2:54 AM, Josef Bacik  wrote:
> This is because we have been releasing the delalloc bytes before ending the
> transaction.  However the way we make allocations, any updates to the
> extent_tree are delayed and then run when the transaction runs, so we still
> have plenty of space that we need to use.  So instead release the delalloc
> bytes _after_ we end the transaction so that we don't get this false ENOSPC.
> Thanks,
>

This is wrong, because btrfs_run_delayed_refs uses the global block reserve.



Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Josef Bacik

On 04/13/2011 06:08 PM, Arne Jansen wrote:

> I think calling end_transaction doesn't guarantee that all delayed refs
> have run; that only happens if end_transaction leads to a transaction commit.
> Another problem I see is that commit_transaction just uses the block_rsv
> of whatever trans happened to call commit, even if trans->block_rsv
> has been set to a block_rsv other than trans_block_rsv or
> delalloc_block_rsv. In other words, the delayed refs are run from a
> non-deterministic block_rsv.
> But it's late, I'll think more about it tomorrow.



Yeah, you are right, not all delayed refs will be run. But hopefully the
refs we put into the delayed ref tree while creating our new extents will
be run, which will be enough for the math to come out right.  Hopefully ;),


Josef

Re: [PATCH] Btrfs: do not release delalloc space until after we end the transaction

2011-04-13 Thread Josef Bacik

On 04/13/2011 08:26 PM, Yan, Zheng wrote:

> On Thu, Apr 14, 2011 at 2:54 AM, Josef Bacik wrote:

>> This is because we have been releasing the delalloc bytes before ending the
>> transaction.  However the way we make allocations, any updates to the
>> extent_tree are delayed and then run when the transaction runs, so we still
>> have plenty of space that we need to use.  So instead release the delalloc
>> bytes _after_ we end the transaction so that we don't get this false ENOSPC.
>> Thanks,
>
> This is wrong, because btrfs_run_delayed_refs uses global block reservation.



I don't see anywhere in the delayed ref code that specifically uses the
global block reserve; where is that?  And if that is what is supposed to
happen, why are we charging the metadata we will use for modifying the
extent tree to the delalloc reserve?  It seems to me we should either

1) Use the delalloc block reserve for running the delayed refs
that are created by inserting our extent, since that is where the
reservation is currently made, or

2) Stop charging the reservations for modifying the extent tree to the
delalloc block reserve and charge them to the global reserve instead, and
then actually make sure that the global reserve is used when we do the
delayed ref updating.
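
To make the two options concrete, here is a minimal C sketch of the
accounting. All names (struct acct_fs, struct acct_rsv, enum acct_policy and
the acct_* functions) are hypothetical and not taken from btrfs. The only
point it tries to capture is that the reserve that gets charged and the
reserve that gets used have to be the same one, whichever option is picked.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical block reserve: a byte counter. */
struct acct_rsv {
	unsigned long reserved;
};

/* Which reserve pays for extent tree updates: option 1 or option 2. */
enum acct_policy {
	ACCT_DELALLOC_RSV,	/* option 1: charge and use the delalloc reserve */
	ACCT_GLOBAL_RSV,	/* option 2: charge and use the global reserve */
};

struct acct_fs {
	struct acct_rsv delalloc;
	struct acct_rsv global;
	enum acct_policy policy;
};

/* Pick the reserve the current policy designates. */
static struct acct_rsv *acct_pick(struct acct_fs *fs)
{
	return fs->policy == ACCT_DELALLOC_RSV ? &fs->delalloc : &fs->global;
}

/* Consume `bytes` from one specific reserve, or fail with -ENOSPC. */
int acct_use(struct acct_rsv *rsv, unsigned long bytes)
{
	assert(rsv);
	if (rsv->reserved < bytes)
		return -ENOSPC;
	rsv->reserved -= bytes;
	return 0;
}

/* Charge the metadata an extent insert will need later on. */
void acct_reserve(struct acct_fs *fs, unsigned long bytes)
{
	acct_pick(fs)->reserved += bytes;
}

/* Run the delayed ref against the same reserve that was charged. */
int acct_run_delayed_ref(struct acct_fs *fs, unsigned long bytes)
{
	return acct_use(acct_pick(fs), bytes);
}
```

In this model, charging the delalloc reserve and then calling acct_use() on
an empty global reserve fails with -ENOSPC even though the space was
reserved, while acct_run_delayed_ref() succeeds under either policy.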


Thanks,

Josef