On Wed, 2006-03-08 at 14:29 +0100, Herbert Poetzl wrote:
> On Tue, Mar 07, 2006 at 07:30:26PM +0100, Pallai Roland wrote:
> > 
> >  I've a weird problem, sometimes a random file "stucks" after 1-2 weeks
> > uptime on a xfs partition within a vs. the xfs is laying on lvm2 &
> > 
> > SysRQ+t - very long, copied only the stucked processes, but not all of
> > them is here, cause the 'dmesg' buffer is too small and I haven't a
> > serial console :(
 Today I got another stuck file, and this time I have a full SysRq+t
dump of the system; just ask if it would help.

 In my last mail I mentioned that only 'pdflush' was in state D on the
host, but I was wrong: 'xfssyncd' is also stuck in D. Both of them are
in the new SysRq+t dump.

 I'm not a kernel hacker, but I've noticed that every time exactly one
of the stuck processes is inside dm_request() while pdflush is stuck in
a dm_* function too. So I think maybe it's not a dangling inode lock in
the xfs code, but some kind of deadlock in the dm_* code..?
Don't laugh, as I said, INAKH! :)
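
 To check that pattern over a whole SysRq+t dump without reading it by
hand, something like the following could list the D-state tasks and the
dm_* symbols in each call trace. It is only a rough sketch: the names
blocked_tasks() and dm_symbols() are made up for illustration, and the
regular expressions assume the dump layout seen in the traces in this
mail (task header line, then "<address>{symbol+offset}" entries).
Entries that were cut off before the closing brace are simply skipped.

```python
import re

# Task header, e.g. "pdflush       D 0000000000000000     0  9262     11 ..."
# Assumed layout: name, one-letter state, hex word, free stack, pid.
TASK_RE = re.compile(r"^(\S+)\s+([A-Z])\s+[0-9a-f]{8,16}\s+\d+\s+(\d+)\s")
# Call trace entry, e.g. "{dm_request+122}"; truncated entries do not match.
SYM_RE = re.compile(r"\{([^{}+]+)\+\d+\}")


def blocked_tasks(dump):
    """Return the tasks in state D, each with the symbols from its trace."""
    tasks = []
    current = None
    for line in dump.splitlines():
        m = TASK_RE.match(line)
        if m:
            current = {
                "name": m.group(1),
                "state": m.group(2),
                "pid": int(m.group(3)),
                "symbols": [],
            }
            tasks.append(current)
        elif current is not None:
            # Any other line belongs to the most recent task header.
            current["symbols"].extend(SYM_RE.findall(line))
    return [t for t in tasks if t["state"] == "D"]


def dm_symbols(task):
    """Return the device-mapper (dm_*) symbols found in one task's trace."""
    return [s for s in task["symbols"] if s.startswith("dm_")]
```

 Feeding it the saved dump, e.g. blocked_tasks(open("sysrq-t.txt").read())
with a hypothetical file name, and printing dm_symbols() for each result
would show at a glance which stuck tasks are sitting in dm code.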

pdflush       D 0000000000000000     0  9262     11          9326  9367 (L-TLB)
ffff81005b743aa8 0000000000000046 0000000300000000 ffff8100761f0ed0
       0000000000000000 0000000000000000 0000000000000096 0000000000000003
       ffff81005b743a18 ffffffff8012de13
Call Trace:<ffffffff8012de13>{__wake_up+67} <ffffffff803581d3>{dm_table_unplug_all+51}
       <ffffffff8035605d>{dm_unplug_all+29} <ffffffff8015f290>{sync_page+0}
       <ffffffff803de614>{io_schedule+52} <ffffffff8015f2d8>{sync_page+72}
       <ffffffff803de9e1>{__wait_on_bit_lock+65} <ffffffff8015fa34>{__lock_page+164}
       <ffffffff8014a820>{wake_bit_function+0} <ffffffff8014a820>{wake_bit_function+0}
       <ffffffff8016c56a>{pagevec_lookup_tag+26} <ffffffff801aceaf>{mpage_writepages+351}
       <ffffffff88106250>{:xfs:linvfs_writepage+0} <ffffffff801ab580>{__sync_single_inode+112}
       <ffffffff801ab8c1>{__writeback_single_inode+417} <ffffffff80358145>{dm_table_any_congest
       <ffffffff803560b8>{dm_any_congested+72} <ffffffff80358177>{dm_table_any_congested+71}
       <ffffffff801abab2>{sync_sb_inodes+482} <ffffffff8014a1c0>{keventd_create_kthread+0}
       <ffffffff801abc15>{writeback_inodes+133} <ffffffff80166a7e>{wb_kupdate+206}
       <ffffffff80167510>{pdflush+0} <ffffffff80167464>{__pdflush+292}
       <ffffffff8016754a>{pdflush+58} <ffffffff801669b0>{wb_kupdate+0}
       <ffffffff8014a182>{kthread+146} <ffffffff8010ea5a>{child_rip+8}
       <ffffffff8014a1c0>{keventd_create_kthread+0} <ffffffff8014a0f0>{kthread+0}
       <ffffffff8010ea52>{child_rip+0}

glftpd        D ffff81006f666000     0 18518      1         18526 24134 (NOTLB)
ffff81006ddfba28 0000000000000086 0000000000000292 ffffffff80355fda
       ffff81007ff82e00 0000000000000001 ffff8100422b2140 ffffffff80227256
       0000000000000001 ffffc200000d9040
Call Trace:<ffffffff80355fda>{dm_request+122} <ffffffff80227256>{generic_make_request+262}
       <ffffffff803dee98>{__down+152} <ffffffff8012dd50>{default_wake_function+0}
       <ffffffff803dec8a>{__down_failed+53} <ffffffff88108951>{:xfs:.text.lock.xfs_buf+25}
       <ffffffff88106d94>{:xfs:_pagebuf_find+372} <ffffffff88106e82>{:xfs:xfs_buf_get_flags+82}
       <ffffffff88106f8a>{:xfs:xfs_buf_read_flags+26} <ffffffff880f9feb>{:xfs:xfs_trans_read_bu
       <ffffffff880b601c>{:xfs:xfs_alloc_read_agf+108} <ffffffff880f9075>{:xfs:_xfs_trans_commi
       <ffffffff880b5a93>{:xfs:xfs_alloc_fix_freelist+291}
       <ffffffff880fa917>{:xfs:xfs_trans_log_inode+39} <ffffffff803deaf2>{__down_read+18}
       <ffffffff880b65b8>{:xfs:xfs_free_extent+152} <ffffffff880df434>{:xfs:xfs_efd_init+68}
       <ffffffff880fa65b>{:xfs:xfs_trans_get_efd+43} <ffffffff880c52d8>{:xfs:xfs_bmap_finish+23
       <ffffffff880e73f4>{:xfs:xfs_itruncate_finish+420} <ffffffff880ed3a1>{:xfs:xfs_log_reserv
       <ffffffff8810015e>{:xfs:xfs_inactive+558} <ffffffff8810dfb1>{:xfs:linvfs_clear_inode+161
       <ffffffff801a07d0>{clear_inode+224} <ffffffff801a18bd>{generic_delete_inode+205}
       <ffffffff801a1b2b>{iput+123} <ffffffff80197223>{sys_unlink+259}
       <ffffffff8011fd01>{ia32_sysret+0}


> looks like some xfs inode lock is not released properly
> the reasons for this can be various, updating to the
> latest kernel and vserver patches might help here ...
> 
> anyway, will have a more detailed look at it later.
 Thanks in advance. I'm trying different kernels in the meantime; I have
now rebooted into 2.6.15.6-vs2.1.1-rc10.


--
 d

