Sorry, I misread “abnormal”. Anything I can check to help diagnose the slowness?
Thanks,
- Dong

> On May 2, 2018, at 2:53 AM, 代栋 <daidon...@gmail.com> wrote:
>
> Thanks very much for your reply.
>
> I used Lustre 2.9.0 and ran “lctl lfsck_start -M lustre-MDT0000 -A -t all -r”
> to start LFSCK.
>
> Could you tell me more about the slowness? I mean, scanning around 300K
> inodes should not take that much time (80 mins). These files were just
> created using a script after a fresh build of Lustre (no complex metadata
> operations at all).
>
> Got it, so the 30-sec interval is just for checking the status of the MDT.
> Another question: for layout checking, does lfsck need to compare the
> metadata stored on the MDT (in LayoutEA) with the metadata stored on the
> OSTs (FID in LMA? not very sure) for orphan objects? When is this metadata
> gathered into one place for checking? I am asking because previously I
> thought the periodic queries from the OSTs to the MDT were doing this job.
>
> Thanks,
> - Dong
>
>> On May 1, 2018, at 9:49 PM, Yong, Fan <fan.y...@intel.com> wrote:
>>
>> Inline comments.
>>
>> --
>> Cheers,
>> Nasf
>>
>>> -----Original Message-----
>>> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org]
>>> On Behalf Of 代栋
>>> Sent: Wednesday, May 2, 2018 5:36 AM
>>> To: lustre-discuss@lists.lustre.org
>>> Subject: [lustre-discuss] Is there a way to have faster lustre file
>>> system checker (lfsck)?
>>>
>>> Hi all,
>>>
>>> I am still new to Lustre, so please let me know if I should send this
>>> message to devel-list.
>>>
>>> This week, I tried to run LFSCK on a very small cluster configuration
>>> (1 MDT and 3 OSTs). In this Lustre file system, I used about 300K
>>> inodes. It took me about 80 mins to finish an LFSCK run. And, more
>>> importantly, while LFSCK is running, the CPU utilization on both the
>>> MDT and the OSTs is 100%, taken by the lfsck thread.
>>
>> Which version of Lustre are you running, and what LFSCK command line did
>> you use?
>>
>>> I understand that lfsck is operating in an online mode, so it is slow.
>>> But I am wondering whether there is any way to accelerate this,
>>> especially if I am allowed to run it offline, for example, during
>>> weekly maintenance.
>>
>> Your slowness is abnormal and is not related to online mode. LFSCK can
>> NOT be run in offline mode.
>>
>>> After checking the lfsck kernel logs, I noticed that in the phase 2
>>> scanning on the OSTs, there is a 30-second interval between queries to
>>> the MDTs. I am wondering whether there is any reason for this
>>> 30-second interval, and whether lfsck on the OSTs would be faster if we
>>> removed it.
>>
>> Normally, the master engine on the MDT will notify the LFSCK engine on
>> the OST when the first phase is done. But we can NOT guarantee that the
>> LFSCK engine on the MDT is always alive during the LFSCK (maybe because
>> of some failure, or network trouble, or a node crash, and so on), so in
>> the 2nd-phase scanning, if the LFSCK engine on the OST does not receive
>> the notification from the MDT, it needs to query the LFSCK status (on
>> the MDT) periodically. If the MDT finishes the 1st-phase scanning
>> earlier than the OST, then there will be no such query. Anyway, such
>> queries are NOT the reason for your slow LFSCK.
>>
>>> Thanks,
>>> - Dong
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
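
Editor's illustration: the fallback Nasf describes for the OST-side engine (wait for the MDT's "first phase done" notification, and query the MDT status periodically only if that notification has not arrived) can be sketched as a wait-with-timeout loop. This is only a user-space Python model of the idea, not Lustre's kernel code; the function name, its parameters, and the return values are all hypothetical, and the 30-second default simply mirrors the interval mentioned in the thread.

```python
import threading


def wait_for_phase1_done(notified, query_mdt_status, interval_s=30.0,
                         max_queries=None):
    """Model of the OST-side wait (hypothetical names, not Lustre's API).

    notified:          threading.Event set when the MDT's notification arrives.
    query_mdt_status:  callable returning True once the MDT reports that its
                       first-phase scan is done.
    Returns (how, queries): how the wait ended and how many queries were sent.
    """
    queries = 0
    while True:
        # Event.wait() returns True if the event was set before the timeout,
        # so a notification that arrives in time means no query is sent.
        if notified.wait(timeout=interval_s):
            return "notified", queries
        # No notification within the interval: fall back to asking the MDT.
        queries += 1
        if query_mdt_status():
            return "queried", queries
        if max_queries is not None and queries >= max_queries:
            return "gave_up", queries


if __name__ == "__main__":
    # Simulated lost notification: the MDT reports "done" on the third query.
    answers = iter([False, False, True])
    print(wait_for_phase1_done(threading.Event(), lambda: next(answers),
                               interval_s=0.01))
```

In this model the interval only controls how often the OST asks about a status it is already waiting on, which is consistent with the statement in the thread that these queries are not the cause of the slow scan.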