Re: kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-22 Thread Jim Schutt

Jim Schutt wrote:

Hi Miao,

Miao Xie wrote:

Hi, Jim

Could you test the attached patch for me? I have done some quick
tests, it worked well. But I'm not sure if it can fix
the bug you reported or not, so I need your help!


So far I haven't been able to reproduce with your patch
applied.  I'd like to test for a few more days, though,
before calling it good.

Thanks for the patch -- I'll let you know what more
testing brings.


I've been running this patch on top of 3.0-rc4 for
the last couple days, and have been unable to reproduce
the BUG.

Thanks -- Jim



-- Jim



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-21 Thread Miao Xie
On Tue, 21 Jun 2011 02:08:54 +0200, David Sterba wrote:
> On Mon, Jun 20, 2011 at 06:12:10PM +0800, Miao Xie wrote:
>> From 457f39393b2e3d475fbba029b90b6a4e17b94d43 Mon Sep 17 00:00:00 2001
>> From: Miao Xie 
>> Date: Mon, 20 Jun 2011 17:21:51 +0800
>> Subject: [PATCH] btrfs: fix inconsonant inode information
>>
>> When iputting the inode, We may leave the delayed nodes if they have some
>> delayed items that have not been dealt with. So when the inode is read again,
>> we must look up the relative delayed node, and use the information in it to
>> initialize the inode. Or we will get inconsonant inode information.
>>
>> Signed-off-by: Miao Xie 
>> ---
>>  fs/btrfs/delayed-inode.c |  104 +++---
>>  fs/btrfs/delayed-inode.h |    1 +
>>  fs/btrfs/inode.c         |   12 -
>>  3 files changed, 91 insertions(+), 26 deletions(-)
>>
>> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
>> index f1cbd02..280755e 100644
>> --- a/fs/btrfs/delayed-inode.c
>> +++ b/fs/btrfs/delayed-inode.c
>> @@ -82,19 +82,16 @@ static inline struct btrfs_delayed_root *btrfs_get_delayed_root(
>>  	return root->fs_info->delayed_root;
>>  }
>>  
>> -static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
>> -		struct inode *inode)
>> +static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
>>  {
>> -	struct btrfs_delayed_node *node;
>>  	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>>  	struct btrfs_root *root = btrfs_inode->root;
>>  	u64 ino = btrfs_ino(inode);
>> -	int ret;
>> +	struct btrfs_delayed_node *node;
>>  
>> -again:
>>  	node = ACCESS_ONCE(btrfs_inode->delayed_node);
> 
> Do you still need the volatile access here now that the again: label has
> been removed? It does not break anything if it stays, but it raises
> questions ...

The rule for ACCESS_ONCE, as stated by Linus, is:

  if you access unlocked values, you use ACCESS_ONCE().

(See: http://yarchive.net/comp/linux/ACCESS_ONCE.html)
So I think it is still needed.
> 
>>  	if (node) {
>> -		atomic_inc(&node->refs);	/* can be accessed */
>> +		atomic_inc(&node->refs);
>>  		return node;
>>  	}
>>  
>> @@ -103,7 +100,9 @@ again:
>>  	if (node) {
>>  		if (btrfs_inode->delayed_node) {
>>  			spin_unlock(&root->inode_lock);
>> -			goto again;
>> +			BUG_ON(btrfs_inode->delayed_node != node);
>> +			atomic_inc(&node->refs);	/* can be accessed */
>> +			return node;
>>  		}
>>  		btrfs_inode->delayed_node = node;
>>  		atomic_inc(&node->refs);	/* can be accessed */
>> @@ -113,6 +112,23 @@ again:
>>  	}
>>  	spin_unlock(&root->inode_lock);
>>  
>> +	return NULL;
>> +}
>> +
>> +static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
>> +		struct inode *inode)
>> +{
>> +	struct btrfs_delayed_node *node;
>> +	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>> +	struct btrfs_root *root = btrfs_inode->root;
>> +	u64 ino = btrfs_ino(inode);
>> +	int ret;
>> +
>> +again:
>> +	node = btrfs_get_delayed_node(inode);
> 
> ... aha, it has been moved here, which copies the original logic.
> Now the read of btrfs_inode->delayed_node is inside a function, and I do not
> think the compiler could optimize that read in a way that would require
> ACCESS_ONCE.

The reason I rewrote btrfs_get_delayed_node() is this:
the old btrfs_get_delayed_node() may overlook an existing delayed node that
holds the up-to-date information (such as new directory indexes and up-to-date
inode information) and conclude that there is no corresponding delayed node.
btrfs then uses the max index number in the fs tree to initialize ->index_cnt,
but that number may be wrong: the delayed node may hold directory indexes whose
index numbers are greater than the one in the fs tree. In that case, the same
index number may be allocated twice, and we hit the EEXIST error.

> 
> And there is another ACCESS_ONCE in btrfs_remove_delayed_node. I wonder
> what's the reason for that. Sorry to abuse this thread, but I'd like to
> be sure about protection of the ->delayed_node members inside
> btrfs_inode. Can you please comment on that?

OK, I'll add some comments.

> Recall this message from the original report:
> 
> [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0)
> into the insertion tree of the delayed node(root id:
> 262, inode id: 258, errno: -17)
> 
> (-17 == -EEXIST)
> 
> It comes from a printk after the return from __btrfs_add_delayed_item (which
> can return -EEXIST) in btrfs_insert_delayed_dir_index. I haven't
> looked further, but it seems that the item is being inserted (at least)
> twice and I suspect missing locking or other type of protection.

Re: kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-20 Thread David Sterba
On Mon, Jun 20, 2011 at 06:12:10PM +0800, Miao Xie wrote:
> From 457f39393b2e3d475fbba029b90b6a4e17b94d43 Mon Sep 17 00:00:00 2001
> From: Miao Xie 
> Date: Mon, 20 Jun 2011 17:21:51 +0800
> Subject: [PATCH] btrfs: fix inconsonant inode information
> 
> When iputting the inode, We may leave the delayed nodes if they have some
> delayed items that have not been dealt with. So when the inode is read again,
> we must look up the relative delayed node, and use the information in it to
> initialize the inode. Or we will get inconsonant inode information.
> 
> Signed-off-by: Miao Xie 
> ---
>  fs/btrfs/delayed-inode.c |  104 +++---
>  fs/btrfs/delayed-inode.h |    1 +
>  fs/btrfs/inode.c         |   12 -
>  3 files changed, 91 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> index f1cbd02..280755e 100644
> --- a/fs/btrfs/delayed-inode.c
> +++ b/fs/btrfs/delayed-inode.c
> @@ -82,19 +82,16 @@ static inline struct btrfs_delayed_root *btrfs_get_delayed_root(
>  	return root->fs_info->delayed_root;
>  }
>  
> -static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
> -		struct inode *inode)
> +static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
>  {
> -	struct btrfs_delayed_node *node;
>  	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>  	struct btrfs_root *root = btrfs_inode->root;
>  	u64 ino = btrfs_ino(inode);
> -	int ret;
> +	struct btrfs_delayed_node *node;
>  
> -again:
>  	node = ACCESS_ONCE(btrfs_inode->delayed_node);

Do you still need the volatile access here now that the again: label has
been removed? It does not break anything if it stays, but it raises
questions ...

>  	if (node) {
> -		atomic_inc(&node->refs);	/* can be accessed */
> +		atomic_inc(&node->refs);
>  		return node;
>  	}
>  
> @@ -103,7 +100,9 @@ again:
>  	if (node) {
>  		if (btrfs_inode->delayed_node) {
>  			spin_unlock(&root->inode_lock);
> -			goto again;
> +			BUG_ON(btrfs_inode->delayed_node != node);
> +			atomic_inc(&node->refs);	/* can be accessed */
> +			return node;
>  		}
>  		btrfs_inode->delayed_node = node;
>  		atomic_inc(&node->refs);	/* can be accessed */
> @@ -113,6 +112,23 @@ again:
>  	}
>  	spin_unlock(&root->inode_lock);
>  
> +	return NULL;
> +}
> +
> +static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
> +		struct inode *inode)
> +{
> +	struct btrfs_delayed_node *node;
> +	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
> +	struct btrfs_root *root = btrfs_inode->root;
> +	u64 ino = btrfs_ino(inode);
> +	int ret;
> +
> +again:
> +	node = btrfs_get_delayed_node(inode);

... aha, it has been moved here, which copies the original logic.
Now the read of btrfs_inode->delayed_node is inside a function, and I do not
think the compiler could optimize that read in a way that would require
ACCESS_ONCE.

And there is another ACCESS_ONCE in btrfs_remove_delayed_node. I wonder
what's the reason for that. Sorry to abuse this thread, but I'd like to
be sure about protection of the ->delayed_node members inside
btrfs_inode. Can you please comment on that?

Recall this message from the original report:

[ 5447.554187] err add delayed dir index item(name: pglog_0.965_0)
into the insertion tree of the delayed node(root id:
262, inode id: 258, errno: -17)

(-17 == -EEXIST)

It comes from a printk after the return from __btrfs_add_delayed_item (which
can return -EEXIST) in btrfs_insert_delayed_dir_index. I haven't
looked further, but it seems that the item is being inserted (at least)
twice and I suspect missing locking or other type of protection.


thanks,
david

> +	if (node)
> +		return node;
> +
>  	node = kmem_cache_alloc(delayed_node_cache, GFP_NOFS);
>  	if (!node)
>  		return ERR_PTR(-ENOMEM);
> @@ -548,19 +564,6 @@ struct btrfs_delayed_item *__btrfs_next_delayed_item(
>  	return next;
>  }
>  
> -static inline struct btrfs_delayed_node *btrfs_get_delayed_node(
> -		struct inode *inode)
> -{
> -	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
> -	struct btrfs_delayed_node *delayed_node;
> -
> -	delayed_node = btrfs_inode->delayed_node;
> -	if (delayed_node)
> -		atomic_inc(&delayed_node->refs);
> -
> -	return delayed_node;
> -}
> -
>  static inline struct btrfs_root *btrfs_get_fs_root(struct btrfs_root *root,
>  		u64 root_id)
>  {
> @@ -1404,8 +1407,7 @@ end:
>  
>  int btrfs_inode_delayed_dir_index_count(struct inod

Re: kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-20 Thread Jim Schutt

Hi Miao,

Miao Xie wrote:

Hi, Jim

Could you test the attached patch for me? 
I have done some quick tests, it worked well. But I'm not sure if it can fix
the bug you reported or not, so I need your help!


So far I haven't been able to reproduce with your patch
applied.  I'd like to test for a few more days, though,
before calling it good.

Thanks for the patch -- I'll let you know what more
testing brings.

-- Jim



Thanks
Miao

On Fri, 17 Jun 2011 10:10:31 -0600, Jim Schutt wrote:

Hi,

I've hit this delayed-inode BUG several times.  I'm using btrfs
as the data store for Ceph OSDs, and testing a heavy write load.
The kernel I'm running is a recent commit (f8f44f09eaa) from
Linus' tree with the for-chris branch (commit ed0ca14021e5) of
  git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
merged in.

Please let me know what I can do to help resolve this.

[ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the 
insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
[ 5447.569766] ------------[ cut here ]------------
[ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
[ 5447.580672] invalid opcode:  [#1] SMP
[ 5447.584806] CPU 2
[ 5447.586646] Modules linked in: loop btrfs zlib_deflate lzo_compress 
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp 
i2c_dev i2c_core ext3 jbd scsi_transport_iscsi rds ib_ipoib rdma_ucm rdma_cm 
ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror 
dm_region_hash dm_log dm_multipath scsi_dh dm_mod video sbs sbshc pci_slot 
battery acpi_pad ac kvm sg ses enclosure sd_mod megaraid_sas ide_cd_mod cdrom 
qla2xxx ib_mthca scsi_transport_fc ib_mad scsi_tgt ib_core button serio_raw 
ata_piix i5k_amb libata hwmon i5000_edac scsi_mod tpm_tis edac_core ehci_hcd 
pcspkr iTCO_wdt tpm dcdbas tpm_bios iTCO_vendor_support uhci_hcd rtc nfs 
nfs_acl auth_rpcgss fscache lockd sunrpc tg3 bnx2 e1000 [last unloaded: 
freq_table]
[ 5447.660248]
[ 5447.661744] Pid: 7622, comm: cosd Not tainted 3.0.0-rc3-00178-gbfc8ccb #34 
Dell Inc. PowerEdge 1950/0DT097
[ 5447.671421] RIP: 0010:[]  [] 
btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
[ 5447.681922] RSP: 0018:88021c0edaf8  EFLAGS: 00010292
[ 5447.687351] RAX: 009e RBX: 880085bf0480 RCX: 00012e0f
[ 5447.694487] RDX:  RSI: 0046 RDI: 819aed98
[ 5447.701631] RBP: 88021c0edb48 R08: 88021c0ed908 R09: 8189ef98
[ 5447.708783] R10:  R11: 0006 R12: ffef
[ 5447.715923] R13: 880072e64240 R14: 880072e64288 R15: 000d
[ 5447.723065] FS:  7fefc66a9940() GS:88022fc8() 
knlGS:
[ 5447.731178] CS:  0010 DS:  ES:  CR0: 8005003b
[ 5447.736934] CR2: 004042b0 CR3: 0001ca4ef000 CR4: 06e0
[ 5447.744087] DR0:  DR1:  DR2: 
[ 5447.751218] DR3:  DR6: 0ff0 DR7: 0400
[ 5447.758343] Process cosd (pid: 7622, threadinfo 88021c0ec000, task 
8801d92996b0)
[ 5447.766422] Stack:
[ 5447.768429]  0001d2da3700 88021c0edb88 8802238d50f8 
880225627000
[ 5447.775866]  8801e32f5e58 8801d2da3700  
8801d9aec510
[ 5447.783291]  000d 8802238d50f8 88021c0edbf8 
a0641c4e
[ 5447.790721] Call Trace:
[ 5447.793191]  [] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
[ 5447.800156]  [] btrfs_add_link+0x12b/0x191 [btrfs]
[ 5447.806517]  [] btrfs_add_nondir+0x31/0x58 [btrfs]
[ 5447.812876]  [] btrfs_create+0xf9/0x197 [btrfs]
[ 5447.818961]  [] vfs_create+0x72/0x92
[ 5447.824090]  [] do_last+0x22c/0x40b
[ 5447.829133]  [] path_openat+0xc0/0x2ef
[ 5447.834438]  [] ? __perf_event_task_sched_out+0x24/0x44
[ 5447.841216]  [] ? perf_event_task_sched_out+0x59/0x67
[ 5447.847846]  [] do_filp_open+0x3d/0x87
[ 5447.853156]  [] ? strncpy_from_user+0x43/0x4d
[ 5447.859072]  [] ? getname_flags+0x2e/0x80
[ 5447.864636]  [] ? do_getname+0x14b/0x173
[ 5447.870112]  [] ? audit_getname+0x16/0x26
[ 5447.875682]  [] ? spin_lock+0xe/0x10
[ 5447.880882]  [] do_sys_open+0x69/0xae
[ 5447.886153]  [] sys_open+0x20/0x22
[ 5447.891114]  [] system_call_fastpath+0x16/0x1b
[ 5447.897124] Code: 85 c0 41 89 c4 74 28 49 8b 45 10 49 8b 4d 00 45 89 e0 48 8b 75 
c0 48 c7 c7 bb 43 69 a0 48 8b 90 e8 02 00 00 31 c0 e8 ce fb 9b e0 <0f> 0b eb fe 
4c 89 f7 e8 81 7b d2 e0 4c 89 ef e8 d4 e4 ff ff 48
[ 5447.916562] RIP  [] 
btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
[ 5447.924683]  RSP 
[ 5447.928514] ---[ end trace 461a7f9887994fe0 ]---

-- Jim



Re: kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-20 Thread Miao Xie
Hi, Jim

Could you test the attached patch for me? 
I have done some quick tests, it worked well. But I'm not sure if it can fix
the bug you reported or not, so I need your help!

Thanks
Miao

On Fri, 17 Jun 2011 10:10:31 -0600, Jim Schutt wrote:
> Hi,
> 
> I've hit this delayed-inode BUG several times.  I'm using btrfs
> as the data store for Ceph OSDs, and testing a heavy write load.
> The kernel I'm running is a recent commit (f8f44f09eaa) from
> Linus' tree with the for-chris branch (commit ed0ca14021e5) of
>   git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
> merged in.
> 
> Please let me know what I can do to help resolve this.
> 
> [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the 
> insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
> [ 5447.569766] ------------[ cut here ]------------
> [ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
> [ 5447.580672] invalid opcode:  [#1] SMP
> [ 5447.584806] CPU 2
> [ 5447.586646] Modules linked in: loop btrfs zlib_deflate lzo_compress 
> ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
> nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge 
> stp i2c_dev i2c_core ext3 jbd scsi_transport_iscsi rds ib_ipoib rdma_ucm 
> rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror 
> dm_region_hash dm_log dm_multipath scsi_dh dm_mod video sbs sbshc pci_slot 
> battery acpi_pad ac kvm sg ses enclosure sd_mod megaraid_sas ide_cd_mod cdrom 
> qla2xxx ib_mthca scsi_transport_fc ib_mad scsi_tgt ib_core button serio_raw 
> ata_piix i5k_amb libata hwmon i5000_edac scsi_mod tpm_tis edac_core ehci_hcd 
> pcspkr iTCO_wdt tpm dcdbas tpm_bios iTCO_vendor_support uhci_hcd rtc nfs 
> nfs_acl auth_rpcgss fscache lockd sunrpc tg3 bnx2 e1000 [last unloaded: 
> freq_table]
> [ 5447.660248]
> [ 5447.661744] Pid: 7622, comm: cosd Not tainted 3.0.0-rc3-00178-gbfc8ccb #34 
> Dell Inc. PowerEdge 1950/0DT097
> [ 5447.671421] RIP: 0010:[]  [] 
> btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
> [ 5447.681922] RSP: 0018:88021c0edaf8  EFLAGS: 00010292
> [ 5447.687351] RAX: 009e RBX: 880085bf0480 RCX: 
> 00012e0f
> [ 5447.694487] RDX:  RSI: 0046 RDI: 
> 819aed98
> [ 5447.701631] RBP: 88021c0edb48 R08: 88021c0ed908 R09: 
> 8189ef98
> [ 5447.708783] R10:  R11: 0006 R12: 
> ffef
> [ 5447.715923] R13: 880072e64240 R14: 880072e64288 R15: 
> 000d
> [ 5447.723065] FS:  7fefc66a9940() GS:88022fc8() 
> knlGS:
> [ 5447.731178] CS:  0010 DS:  ES:  CR0: 8005003b
> [ 5447.736934] CR2: 004042b0 CR3: 0001ca4ef000 CR4: 
> 06e0
> [ 5447.744087] DR0:  DR1:  DR2: 
> 
> [ 5447.751218] DR3:  DR6: 0ff0 DR7: 
> 0400
> [ 5447.758343] Process cosd (pid: 7622, threadinfo 88021c0ec000, task 
> 8801d92996b0)
> [ 5447.766422] Stack:
> [ 5447.768429]  0001d2da3700 88021c0edb88 8802238d50f8 
> 880225627000
> [ 5447.775866]  8801e32f5e58 8801d2da3700  
> 8801d9aec510
> [ 5447.783291]  000d 8802238d50f8 88021c0edbf8 
> a0641c4e
> [ 5447.790721] Call Trace:
> [ 5447.793191]  [] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
> [ 5447.800156]  [] btrfs_add_link+0x12b/0x191 [btrfs]
> [ 5447.806517]  [] btrfs_add_nondir+0x31/0x58 [btrfs]
> [ 5447.812876]  [] btrfs_create+0xf9/0x197 [btrfs]
> [ 5447.818961]  [] vfs_create+0x72/0x92
> [ 5447.824090]  [] do_last+0x22c/0x40b
> [ 5447.829133]  [] path_openat+0xc0/0x2ef
> [ 5447.834438]  [] ? __perf_event_task_sched_out+0x24/0x44
> [ 5447.841216]  [] ? perf_event_task_sched_out+0x59/0x67
> [ 5447.847846]  [] do_filp_open+0x3d/0x87
> [ 5447.853156]  [] ? strncpy_from_user+0x43/0x4d
> [ 5447.859072]  [] ? getname_flags+0x2e/0x80
> [ 5447.864636]  [] ? do_getname+0x14b/0x173
> [ 5447.870112]  [] ? audit_getname+0x16/0x26
> [ 5447.875682]  [] ? spin_lock+0xe/0x10
> [ 5447.880882]  [] do_sys_open+0x69/0xae
> [ 5447.886153]  [] sys_open+0x20/0x22
> [ 5447.891114]  [] system_call_fastpath+0x16/0x1b
> [ 5447.897124] Code: 85 c0 41 89 c4 74 28 49 8b 45 10 49 8b 4d 00 45 89 e0 48 
> 8b 75 c0 48 c7 c7 bb 43 69 a0 48 8b 90 e8 02 00 00 31 c0 e8 ce fb 9b e0 <0f> 
> 0b eb fe 4c 89 f7 e8 81 7b d2 e0 4c 89 ef e8 d4 e4 ff ff 48
> [ 5447.916562] RIP  [] 
> btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
> [ 5447.924683]  RSP 
> [ 5447.928514] ---[ end trace 461a7f9887994fe0 ]---
>

Re: kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-19 Thread Miao Xie
On Fri, 17 Jun 2011 10:10:31 -0600, Jim Schutt wrote:
> I've hit this delayed-inode BUG several times.  I'm using btrfs
> as the data store for Ceph OSDs, and testing a heavy write load.
> The kernel I'm running is a recent commit (f8f44f09eaa) from
> Linus' tree with the for-chris branch (commit ed0ca14021e5) of
>   git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
> merged in.
> 
> Please let me know what I can do to help resolve this.
> 
> [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the 
> insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
> [ 5447.569766] ------------[ cut here ]------------
> [ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
> [ 5447.580672] invalid opcode:  [#1] SMP
> [ 5447.584806] CPU 2
> [ 5447.586646] Modules linked in: loop btrfs zlib_deflate lzo_compress 
> ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
> nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge 
> stp i2c_dev i2c_core ext3 jbd scsi_transport_iscsi rds ib_ipoib rdma_ucm 
> rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror 
> dm_region_hash dm_log dm_multipath scsi_dh dm_mod video sbs sbshc pci_slot 
> battery acpi_pad ac kvm sg ses enclosure sd_mod megaraid_sas ide_cd_mod cdrom 
> qla2xxx ib_mthca scsi_transport_fc ib_mad scsi_tgt ib_core button serio_raw 
> ata_piix i5k_amb libata hwmon i5000_edac scsi_mod tpm_tis edac_core ehci_hcd 
> pcspkr iTCO_wdt tpm dcdbas tpm_bios iTCO_vendor_support uhci_hcd rtc nfs 
> nfs_acl auth_rpcgss fscache lockd sunrpc tg3 bnx2 e1000 [last unloaded: 
> freq_table]
> [ 5447.660248]
> [ 5447.661744] Pid: 7622, comm: cosd Not tainted 3.0.0-rc3-00178-gbfc8ccb #34 
> Dell Inc. PowerEdge 1950/0DT097
> [ 5447.671421] RIP: 0010:[]  [] 
> btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
> [ 5447.681922] RSP: 0018:88021c0edaf8  EFLAGS: 00010292
> [ 5447.687351] RAX: 009e RBX: 880085bf0480 RCX: 
> 00012e0f
> [ 5447.694487] RDX:  RSI: 0046 RDI: 
> 819aed98
> [ 5447.701631] RBP: 88021c0edb48 R08: 88021c0ed908 R09: 
> 8189ef98
> [ 5447.708783] R10:  R11: 0006 R12: 
> ffef
> [ 5447.715923] R13: 880072e64240 R14: 880072e64288 R15: 
> 000d
> [ 5447.723065] FS:  7fefc66a9940() GS:88022fc8() 
> knlGS:
> [ 5447.731178] CS:  0010 DS:  ES:  CR0: 8005003b
> [ 5447.736934] CR2: 004042b0 CR3: 0001ca4ef000 CR4: 
> 06e0
> [ 5447.744087] DR0:  DR1:  DR2: 
> 
> [ 5447.751218] DR3:  DR6: 0ff0 DR7: 
> 0400
> [ 5447.758343] Process cosd (pid: 7622, threadinfo 88021c0ec000, task 
> 8801d92996b0)
> [ 5447.766422] Stack:
> [ 5447.768429]  0001d2da3700 88021c0edb88 8802238d50f8 
> 880225627000
> [ 5447.775866]  8801e32f5e58 8801d2da3700  
> 8801d9aec510
> [ 5447.783291]  000d 8802238d50f8 88021c0edbf8 
> a0641c4e
> [ 5447.790721] Call Trace:
> [ 5447.793191]  [] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
> [ 5447.800156]  [] btrfs_add_link+0x12b/0x191 [btrfs]
> [ 5447.806517]  [] btrfs_add_nondir+0x31/0x58 [btrfs]
> [ 5447.812876]  [] btrfs_create+0xf9/0x197 [btrfs]
> [ 5447.818961]  [] vfs_create+0x72/0x92
> [ 5447.824090]  [] do_last+0x22c/0x40b
> [ 5447.829133]  [] path_openat+0xc0/0x2ef
> [ 5447.834438]  [] ? __perf_event_task_sched_out+0x24/0x44
> [ 5447.841216]  [] ? perf_event_task_sched_out+0x59/0x67
> [ 5447.847846]  [] do_filp_open+0x3d/0x87
> [ 5447.853156]  [] ? strncpy_from_user+0x43/0x4d
> [ 5447.859072]  [] ? getname_flags+0x2e/0x80
> [ 5447.864636]  [] ? do_getname+0x14b/0x173
> [ 5447.870112]  [] ? audit_getname+0x16/0x26
> [ 5447.875682]  [] ? spin_lock+0xe/0x10
> [ 5447.880882]  [] do_sys_open+0x69/0xae
> [ 5447.886153]  [] sys_open+0x20/0x22
> [ 5447.891114]  [] system_call_fastpath+0x16/0x1b
> [ 5447.897124] Code: 85 c0 41 89 c4 74 28 49 8b 45 10 49 8b 4d 00 45 89 e0 48 
> 8b 75 c0 48 c7 c7 bb 43 69 a0 48 8b 90 e8 02 00 00 31 c0 e8 ce fb 9b e0 <0f> 
> 0b eb fe 4c 89 f7 e8 81 7b d2 e0 4c 89 ef e8 d4 e4 ff ff 48
> [ 5447.916562] RIP  [] 
> btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
> [ 5447.924683]  RSP 
> [ 5447.928514] ---[ end trace 461a7f9887994fe0 ]---

Thanks for your report. I will deal with it.

Miao

> 
> -- Jim
> 
> 



kernel BUG at fs/btrfs/delayed-inode.c:1301!

2011-06-17 Thread Jim Schutt

Hi,

I've hit this delayed-inode BUG several times.  I'm using btrfs
as the data store for Ceph OSDs, and testing a heavy write load.
The kernel I'm running is a recent commit (f8f44f09eaa) from
Linus' tree with the for-chris branch (commit ed0ca14021e5) of
  git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
merged in.

Please let me know what I can do to help resolve this.

[ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the 
insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
[ 5447.569766] ------------[ cut here ]------------
[ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
[ 5447.580672] invalid opcode:  [#1] SMP
[ 5447.584806] CPU 2
[ 5447.586646] Modules linked in: loop btrfs zlib_deflate lzo_compress 
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp 
i2c_dev i2c_core ext3 jbd scsi_transport_iscsi rds ib_ipoib rdma_ucm rdma_cm 
ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror 
dm_region_hash dm_log dm_multipath scsi_dh dm_mod video sbs sbshc pci_slot 
battery acpi_pad ac kvm sg ses enclosure sd_mod megaraid_sas ide_cd_mod cdrom 
qla2xxx ib_mthca scsi_transport_fc ib_mad scsi_tgt ib_core button serio_raw 
ata_piix i5k_amb libata hwmon i5000_edac scsi_mod tpm_tis edac_core ehci_hcd 
pcspkr iTCO_wdt tpm dcdbas tpm_bios iTCO_vendor_support uhci_hcd rtc nfs 
nfs_acl auth_rpcgss fscache lockd sunrpc tg3 bnx2 e1000 [last unloaded: 
freq_table]
[ 5447.660248]
[ 5447.661744] Pid: 7622, comm: cosd Not tainted 3.0.0-rc3-00178-gbfc8ccb #34 
Dell Inc. PowerEdge 1950/0DT097
[ 5447.671421] RIP: 0010:[]  [] 
btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
[ 5447.681922] RSP: 0018:88021c0edaf8  EFLAGS: 00010292
[ 5447.687351] RAX: 009e RBX: 880085bf0480 RCX: 00012e0f
[ 5447.694487] RDX:  RSI: 0046 RDI: 819aed98
[ 5447.701631] RBP: 88021c0edb48 R08: 88021c0ed908 R09: 8189ef98
[ 5447.708783] R10:  R11: 0006 R12: ffef
[ 5447.715923] R13: 880072e64240 R14: 880072e64288 R15: 000d
[ 5447.723065] FS:  7fefc66a9940() GS:88022fc8() 
knlGS:
[ 5447.731178] CS:  0010 DS:  ES:  CR0: 8005003b
[ 5447.736934] CR2: 004042b0 CR3: 0001ca4ef000 CR4: 06e0
[ 5447.744087] DR0:  DR1:  DR2: 
[ 5447.751218] DR3:  DR6: 0ff0 DR7: 0400
[ 5447.758343] Process cosd (pid: 7622, threadinfo 88021c0ec000, task 
8801d92996b0)
[ 5447.766422] Stack:
[ 5447.768429]  0001d2da3700 88021c0edb88 8802238d50f8 
880225627000
[ 5447.775866]  8801e32f5e58 8801d2da3700  
8801d9aec510
[ 5447.783291]  000d 8802238d50f8 88021c0edbf8 
a0641c4e
[ 5447.790721] Call Trace:
[ 5447.793191]  [] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
[ 5447.800156]  [] btrfs_add_link+0x12b/0x191 [btrfs]
[ 5447.806517]  [] btrfs_add_nondir+0x31/0x58 [btrfs]
[ 5447.812876]  [] btrfs_create+0xf9/0x197 [btrfs]
[ 5447.818961]  [] vfs_create+0x72/0x92
[ 5447.824090]  [] do_last+0x22c/0x40b
[ 5447.829133]  [] path_openat+0xc0/0x2ef
[ 5447.834438]  [] ? __perf_event_task_sched_out+0x24/0x44
[ 5447.841216]  [] ? perf_event_task_sched_out+0x59/0x67
[ 5447.847846]  [] do_filp_open+0x3d/0x87
[ 5447.853156]  [] ? strncpy_from_user+0x43/0x4d
[ 5447.859072]  [] ? getname_flags+0x2e/0x80
[ 5447.864636]  [] ? do_getname+0x14b/0x173
[ 5447.870112]  [] ? audit_getname+0x16/0x26
[ 5447.875682]  [] ? spin_lock+0xe/0x10
[ 5447.880882]  [] do_sys_open+0x69/0xae
[ 5447.886153]  [] sys_open+0x20/0x22
[ 5447.891114]  [] system_call_fastpath+0x16/0x1b
[ 5447.897124] Code: 85 c0 41 89 c4 74 28 49 8b 45 10 49 8b 4d 00 45 89 e0 48 8b 75 
c0 48 c7 c7 bb 43 69 a0 48 8b 90 e8 02 00 00 31 c0 e8 ce fb 9b e0 <0f> 0b eb fe 
4c 89 f7 e8 81 7b d2 e0 4c 89 ef e8 d4 e4 ff ff 48
[ 5447.916562] RIP  [] 
btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
[ 5447.924683]  RSP 
[ 5447.928514] ---[ end trace 461a7f9887994fe0 ]---

-- Jim

