[Lustre-discuss] LBUG with 1.8.2 during rm

2010-04-28 Thread Patrick Winnertz
Hey,

After my test Lustre filesystem had become quite full, I ran rm -rf * on it and 
got this error message:

[  135.094107] Lustre: mgc192.168@tcp: Reactivating import
[  135.094706] Lustre: Server MGS on device /dev/hda5 has started
[  137.827630] Lustre: spfs-MDT: temporarily refusing client connection 
from 192.168@tcp
[  137.828076] LustreError: 2100:0:(ldlm_lib.c:1848:target_send_reply_msg()) 
@@@ processing error (-11)  r...@decada00 x1333718444276272/t0 o38->@:0/0 
lens 368/0 e 0 to 0 dl 1272455883 ref 1 fl Interpret:/0/0 rc -11/0
[  155.830871] Lustre: spfs-MDT: temporarily refusing client connection 
from 192.168@tcp
[  155.831308] LustreError: 2099:0:(ldlm_lib.c:1848:target_send_reply_msg()) 
@@@ processing error (-11)  r...@de96b800 x1333718444276280/t0 o38->@:0/0 
lens 368/0 e 0 to 0 dl 1272455901 ref 1 fl Interpret:/0/0 rc -11/0
[  169.705870] Lustre: 2124:0:(mds_lov.c:1167:mds_notify()) MDS spfs-MDT: 
add target spfs-OST_UUID
[  169.770049] Lustre: 2052:0:(mds_lov.c:1203:mds_notify()) MDS spfs-MDT: 
in recovery, not resetting orphans on spfs-OST_UUID
[  169.802342] Lustre: spfs-mdtlov.lov: set parameter stripesize=1048576
[  173.866361] LustreError: 2103:0:(mds_open.c:1666:mds_close()) @@@ no handle 
for file close ino 1738772: cookie 0x5a629f5fe2dc51f1  r...@ded9f600 
x1333718444266421/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 
408/528 e 0 to 0 dl 1272456350 ref 2 fl Interpret:/4/0 rc 0/0
[  173.867518] LustreError: 2103:0:(ldlm_lib.c:1848:target_send_reply_msg()) 
@@@ processing error (-116)  r...@ded9f600 x1333718444266421/t0 o35->51740db3-
b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 408/432 e 0 to 0 dl 1272456350 ref 2 fl 
Interpret:/4/0 rc -116/0
[  174.635986] LustreError: 2104:0:(mds_open.c:1666:mds_close()) @@@ no handle 
for file close ino 1671195: cookie 0x5a629f5fe2dc47a2  r...@dfac4c00 
x1333718444266751/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 
408/528 e 0 to 0 dl 1272455836 ref 2 fl Interpret:/4/0 rc 0/0
[  174.637154] LustreError: 2104:0:(mds_open.c:1666:mds_close()) Skipped 2 
previous similar messages
[  176.182138] LustreError: 2103:0:(mds_open.c:1666:mds_close()) @@@ no handle 
for file close ino 893277: cookie 0x5a629f5fe2dc22e2  r...@dfac9400 
x1333718444267408/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 
408/528 e 0 to 0 dl 1272455838 ref 2 fl Interpret:/4/0 rc 0/0
[  176.183281] LustreError: 2103:0:(mds_open.c:1666:mds_close()) Skipped 4 
previous similar messages
[  177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
[  177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
LBUG
[  177.490559] Pid: 2099, comm: ll_mdt_00
[  177.490759] 
[  177.490760] Call Trace:
[  177.491067]  [] libcfs_debug_dumpstack+0x58/0x80 [libcfs]
[  177.491423]  [] lbug_with_loc+0x6d/0xc0 [libcfs]
[  177.491779]  [] mds_orphan_add_link+0xd85/0xd90 [mds]
[  177.492162]  [] __ldiskfs_journal_stop+0x24/0x50 
[ldiskfs]
[  177.492527]  [] mds_reint_unlink+0x1e8f/0x3b80 [mds]
[  177.492867]  [] mds_reint_rec+0x133/0x3d0 [mds]
[  177.493189]  [] mds_reint+0x229/0x740 [mds]
[  177.493583]  [] lustre_msg_get_flags+0x104/0x200 [ptlrpc]
[  177.493941]  [] mds_handle+0x17c9/0xa180 [mds]
[  177.494243]  [] _spin_lock+0x5/0x7
[  177.494505]  [] _spin_lock_irqsave+0x23/0x29
[  177.494801]  [] lock_timer_base+0x19/0x35
[  177.495081]  [] __mod_timer+0xc0/0xc9
[  177.495392]  [] lustre_msg_get_transno+0x10c/0x210 
[ptlrpc]
[  177.495738]  [] _spin_lock+0x5/0x7
[  177.496046]  [] 
target_queue_recovery_request+0xaf2/0x1750 [ptlrpc]
[  177.496428]  [] __do_softirq+0x143/0x16b
[  177.496707]  [] _spin_lock+0x5/0x7
[  177.496980]  [] mds_handle+0x29b3/0xa180 [mds]
[  177.497282]  [] net_rx_action+0xa4/0x1be
[  177.497558]  [] __do_softirq+0x143/0x16b
[  177.497879]  [] lustre_msg_get_conn_cnt+0x104/0x200 
[ptlrpc]
[  177.498238]  [] common_interrupt+0x23/0x28
[  177.498566]  [] ptlrpc_update_export_timer+0x56/0x670 
[ptlrpc]
[  177.498972]  [] ptlrpc_check_req+0x16/0x220 [ptlrpc]
[  177.499305]  [] lprocfs_counter_add+0x5b/0x150 [lvfs]
[  177.499673]  [] ptlrpc_server_handle_request+0xb29/0x1d90 
[ptlrpc]
[  177.500065]  [] enqueue_task+0x52/0x5d
[  177.500336]  [] try_to_wake_up+0x15c/0x165
[  177.500637]  [] lc_watchdog_touch+0x9b/0x270 [libcfs]
[  177.501675]  [] lc_watchdog_disa
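
For anyone triaging a similar console log, the assertion and LBUG records can be pulled out with a short script. This is only a minimal Python sketch, not a Lustre tool: the helper names unwrap and find_failures are hypothetical, and the regular expressions are an assumption based solely on the line format visible in the log above. Mail-wrapped lines are rejoined before matching.

```python
import re

# A record starts with a kernel timestamp such as "[  177.489667]".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]")

# Assumed layout, taken from the log above: "pid:0:(file.c:line:function()) message".
LOCATION = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)"
)


def unwrap(lines):
    """Join wrapped continuation lines onto the timestamped line before them."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] = records[-1].rstrip() + " " + line.strip()
    return records


def find_failures(lines):
    """Return (pid, file, line, function, message) for ASSERTION and LBUG records."""
    hits = []
    for record in unwrap(lines):
        m = LOCATION.search(record)
        if m and (m["msg"].startswith("ASSERTION") or m["msg"].strip() == "LBUG"):
            hits.append(
                (int(m["pid"]), m["file"], int(m["line"]), m["func"], m["msg"].strip())
            )
    return hits


# Four lines copied from the log above, including the mail wrapping.
sample = [
    "[  177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) ",
    "ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1",
    "[  177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) ",
    "LBUG",
]
hits = find_failures(sample)
for hit in hits:
    print(hit)
```

To scan a saved log instead of the sample, pass an open file object to find_failures; it only iterates over lines.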

Re: [Lustre-discuss] LBUG with 1.8.2 during rm

2010-04-28 Thread Johann Lombardi
On Wed, Apr 28, 2010 at 01:50:13PM +0200, Patrick Winnertz wrote:
> [  177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
> ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
> [  177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
> LBUG

This is a known problem with open-unlinked directories in 1.8.2.
There is a fix attached to bug 22177.
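
To illustrate what "open-unlinked directory" refers to, here is a minimal Python sketch that creates one on a local POSIX filesystem. It is an illustration of the general situation only: it does not touch Lustre, does not reproduce the LBUG, and the directory name "victim" is made up. The link count it prints depends on the local filesystem and says nothing about what the MDS expects.

```python
import os
import stat
import tempfile

# Create a scratch directory on a local filesystem (not Lustre).
parent = tempfile.mkdtemp()
victim = os.path.join(parent, "victim")  # hypothetical name
os.mkdir(victim)

# Hold the directory open, then remove it by name while it is still open.
fd = os.open(victim, os.O_RDONLY)
os.rmdir(victim)

# The name is gone, but the open descriptor still refers to the inode.
info = os.fstat(fd)
name_exists = os.path.exists(victim)
still_a_directory = stat.S_ISDIR(info.st_mode)
print("name exists:", name_exists)
print("still a directory:", still_a_directory)
print("link count reported after removal:", info.st_nlink)

# Closing the last descriptor lets the filesystem release the inode.
os.close(fd)
os.rmdir(parent)
```

An rm -rf running while another process still has a directory open leaves the directory in this state until the last descriptor is closed.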

Johann
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] LBUG with 1.8.2 during rm

2010-07-14 Thread Derek Yarnell
Hi,

We seem to have hit this bug too now.  Is the fix in 1.8.3?  The bugzilla entry 
isn't exactly clear.  Additionally, the bug 
(https://bugzilla.lustre.org/show_bug.cgi?id=22177) is still marked as ASSIGNED; 
does that mean the fix still has problems?

Thanks,
derek

On Apr 28, 2010, at 7:59 AM, Johann Lombardi wrote:

> On Wed, Apr 28, 2010 at 01:50:13PM +0200, Patrick Winnertz wrote:
>> [  177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
>> ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
>> [  177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
>> LBUG
> 
> This is a known problem with open-unlinked directory in 1.8.2.
> There is a fix attached to bug 22177.
> 
> Johann

Derek Yarnell
UNIX Systems Administrator
University of Maryland
Institute for Advanced Computer Studies





Re: [Lustre-discuss] LBUG with 1.8.2 during rm

2010-07-14 Thread Andreas Dilger
On 2010-07-14, at 17:05, Derek Yarnell wrote:
> We seem to have hit this bug too now.  Is the fix in 1.8.3?  The bugzilla 
> entry isn't exactly clear.  Additionally, the bug 
> (https://bugzilla.lustre.org/show_bug.cgi?id=22177) is still marked as 
> ASSIGNED; does that mean the fix still has problems?

The patch shows in bugzilla as landed-1.8.3, and the 1.8.3 ChangeLog also lists 
a fix for bug 22177, so I'd say it is fixed.

> On Apr 28, 2010, at 7:59 AM, Johann Lombardi wrote:
> 
>> On Wed, Apr 28, 2010 at 01:50:13PM +0200, Patrick Winnertz wrote:
>>> [  177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
>>> ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
>>> [  177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) 
>>> LBUG
>> 
>> This is a known problem with open-unlinked directory in 1.8.2.
>> There is a fix attached to bug 22177.
>> 
>> Johann
> 
> Derek Yarnell
> UNIX Systems Administrator
> University of Maryland
> Institute for Advanced Computer Studies


Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.



Re: [Lustre-discuss] LBUG with 1.8.2 during rm

2010-07-15 Thread Peter Jones
Yes, confirmed. Every site that reported this issue reported it resolved 
with this patch.

Andreas Dilger wrote:
> 
> The patch shows in bugzilla as landed-1.8.3, and the 1.8.3 ChangeLog also 
> lists a fix for bug 22177, so I'd say it is fixed.
>
>   
>


Re: [Lustre-discuss] LBUG with 1.8.2 during rm

2010-07-15 Thread Derek Yarnell
Hi,

Thanks, yes, this fixed it for us too.  I'm just running an lfsck today to make 
sure everything looks good.  Sorry for misreading the bugzilla entry; I'll just 
chalk that up to being tired.

Thanks,
derek

On Jul 15, 2010, at 8:51 AM, Peter Jones wrote:

> Yes, confirmed. Every site that reported this issue reported it resolved with 
> this patch.
> 
> Andreas Dilger wrote:
>> 
>> The patch shows in bugzilla as landed-1.8.3, and the 1.8.3 ChangeLog also 
>> lists a fix for bug 22177, so I'd say it is fixed.
>> 
>>  
> 

Derek Yarnell
UNIX Systems Administrator
University of Maryland
Institute for Advanced Computer Studies


