Nasf, all,

Post-recovery analysis suggests that the "ls" hangs on my system were caused by 
inconsistencies with MDT striping (DNE); see some details below.

I'm not sure if I should file this in Jira but wanted to document at least here.


- Nasf mentioned agent inodes for cross-MDT objects. Could it be that 
cross-MDT objects being *missing* causes such hangs?

In my case, secondary trouble arose because LFSCK got stuck and could not 
resolve the issue.

As for how this might have happened on my system: I did have a couple of 
unexpected *HA double failures* for MDS nodes a few months ago. These failures 
may have interfered with Lustre recovery. The double failures were due to 
NetworkManager in RHEL7 tearing down the entire network stack on a server when 
just *one* of its network interfaces goes offline, as happens to the 
host-to-host heartbeat link on the active HA node when the peer node reboots.  
{Solution: (a) disable NetworkManager control of the heartbeat interfaces, and 
(b) use Corosync RRP, i.e., secondary heartbeat links; a rough sketch of both 
follows.}
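
A minimal sketch of the two changes, assuming a RHEL7 ifcfg-style network setup 
and corosync 2.x with a udpu nodelist; the interface name and ring addresses 
below are placeholders, not my actual configuration:

        # (a) /etc/sysconfig/network-scripts/ifcfg-em2  (heartbeat NIC; name is a placeholder)
        #     keep NetworkManager away from this interface
        NM_CONTROLLED=no
        ONBOOT=yes

        # (b) /etc/corosync/corosync.conf (excerpt): redundant ring protocol
        totem {
            version: 2
            rrp_mode: passive            # fail over to the second ring if the first goes down
        }
        nodelist {
            node {
                ring0_addr: 10.0.0.1     # primary heartbeat link (placeholder)
                ring1_addr: 10.0.1.1     # secondary heartbeat link (placeholder)
            }
            # (one node shown; each node needs both ring addresses)
        }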


- To clarify, I did *not* do an MDT file-level restore; I only mentioned it as 
an item from the operations manual, from which I gathered which kinds of 
Lustre-internal files are expendable (such as the LFSCK and OI files), and 
extrapolated a bit from there. If useful, I could do additional analysis.


- Possibly related:  https://jira.whamcloud.com/browse/LU-11584 , which also 
mentioned "ls -l" hanging.


Nasf, thank you for clarifying agent inodes. While my "rm" of all of them at 
the ldiskfs level may have removed a few more than strictly necessary, the fact 
that *all* of my MDT-striped dirs were implicated (see below), plus one outlier, 
made their removal a promising avenue of attack.
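
(For completeness, a sketch of the kind of enumeration involved: something like 
the following lists zero-permission inodes under the namespace of an 
ldiskfs-mounted MDT. The mount point is a placeholder, and as Nasf notes such 
inodes are normally legitimate agent inodes, so this only lists, never removes:)

        # list zero-permission entries under the MDT namespace
        # (/mnt/mdt0 is a placeholder; mount the ldiskfs read-only)
        find /mnt/mdt0/ROOT -perm 0000 -ls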



Best wishes,
Michael


---------------------------
From post-recovery analysis
---------------------------

Prior to my intervention at the MDT ldiskfs level (removal of all 0-permission 
files), I had determined on a client the dirs for which "ls -l" would hang by 
running a number of du(1) processes in parallel and seeing which ones got stuck:

        ls -d /home/[A-Z]*/* /home/[a-z]* | xargs -n1 -P12 du

(The glob is two-level to account for voluminous project and software build 
directories separately from the equally plentiful user homes.) After the last 
input had been taken up by a du(1) process and had indeed finished (as monitored 
by ps(8) and/or lsof(8); see the sketch after the list), I found stuck du 
processes for:

        /home/SHARE/XXX123/XXX-env/
        /home/SHARE/g-cXXX/
        /home/SHARE/g-fXXX/
        /home/krXXX/
        /home/SOFT/spXXX/
        /home/mcXXX/
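
(A sketch of the kind of ps(8) check referred to above; the stuck du processes 
show up in uninterruptible sleep, i.e. "D" state:)

        # show du processes stuck in uninterruptible sleep ("D" state),
        # with the wchan column hinting at where they are blocked
        ps -eo pid,stat,wchan:32,args | awk '$2 ~ /^D/ && / du /'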

Now, I had used MDT striping only sparingly when initially populating the file 
system. Strikingly, from my notes, the dirs that I had created with an MDT 
stripe count of 2 were a strict subset of the ones that failed, and in fact make 
up all but one of them:

        lfs mkdir -i 1 -c 2 /home/mcXXX
        lfs mkdir -i 1 -c 2 /home/krXXX
        lfs mkdir -i 1 -c 2 /home/SOFT/spXXX
        lfs mkdir -i 1 -c 2 /home/SHARE/g-fXXX
        lfs mkdir -i 1 -c 2 /home/SHARE/g-cXXX
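
(For anyone comparing against their own system: the stripe settings of an 
existing directory can be queried with lfs getdirstripe, e.g.:)

        # query the MDT stripe count/index of a directory
        lfs getdirstripe /home/mcXXX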



> On 2019-06-16, at 09:00 , Yong, Fan <fan.y...@intel.com> wrote:
> 
> Hi Michael,
> 
> An inode with zero permission and zero owner/group is NOT equal to 
> corruption; instead, it is quite possibly a Lustre agent inode for a 
> cross-MDTs object. In your case, for the file ".bash_history", its name entry 
> exists on MDT0 (with a local agent inode), while the object itself resides on 
> MDT1. The permission and owner/group information for an agent inode are 
> always zero. On the other hand, its time bits are valid. That also indicates 
> a Lustre backend agent inode, not a corrupted one.
> 
> Usually, if there is data corruption for some inode, then the output may be 
> like:
> 
> ?????  1 ??? ???            xxx ??? .bash_history
> 
> You resolved the stuck issue by removing these 'trouble' agent inodes, but 
> that may not be the root cause (and may cause data loss), although I do not 
> know what the root cause is. Anyway, if you have restored the system from a 
> file-level backup, then you may have lost the clues for the root cause.
> 
