Many thanks. I will run the namespace check again and see the results. BTW, is the rate control mechanism enabled by default? Can I disable it?
I might be asking stupid questions, but how are MDT-objects and OST-objects paired with each other during phase 1 scanning? That is, does the OST send object metadata to the MDT, or does the MDT send object metadata to the OST, for pairing? Could you point me to the source code, so that I can look for more details?

Thanks,
- Dong

> On May 2, 2018, at 3:43 AM, Yong, Fan <fan.y...@intel.com> wrote:
>
> Inline comments.
>
> --
> Cheers,
> Nasf
>
>> -----Original Message-----
>> From: 代栋 [mailto:daidon...@gmail.com]
>> Sent: Wednesday, May 2, 2018 4:06 PM
>> To: Yong, Fan <fan.y...@intel.com>
>> Cc: lustre-discuss@lists.lustre.org
>> Subject: Re: [lustre-discuss] Is there a way to have faster lustre file
>> system checker (lfsck)?
>>
>> Sorry, I misread "abnormal". Is there anything I can check to help
>> diagnose the slowness?
>>
>> Thanks,
>> - Dong
>>
>>> On May 2, 2018, at 2:53 AM, 代栋 <daidon...@gmail.com> wrote:
>>>
>>> Thanks very much for your reply.
>>>
>>> I used Lustre 2.9.0 and ran
>>> "lctl lfsck_start -M lustre-MDT0000 -A -t all -r" to start LFSCK.
>>>
> You can first try "lctl lfsck_start -M lustre-MDT0000 -A -t namespace -r"
> to check the namespace LFSCK speed. Since you have only one MDT, it should
> be much faster. If it is, then check
> "lctl lfsck_start -M lustre-MDT0000 -A -t layout -r".
>
>>> Could you tell me more about the slowness? I mean, scanning around 300K
>>> inodes should not take that much time (80 mins). These files were just
>>> created using a script after a fresh build of Lustre (no complex
>>> metadata operations at all).
>>>
> I do not know what caused such slowness. There may be many factors. Have
> you set some fail_loc? If not, you may need to enable LFSCK debug on both
> the MDT and OST, then collect and analyze the Lustre debug logs.
>
>>> Got it, so the 30-sec interval is just for checking the status of the
>>> MDT. Another question: for layout checking, does lfsck need to compare
>>> metadata stored in the MDT (in LayoutEA) and metadata stored in the OSTs
>>> (FID in LMA? not very sure) for orphan objects? When is this metadata
>>> gathered into one place for checking? I am asking because previously I
>>> thought the periodic queries from OSTs to the MDT were doing this job.
>>>
> In short, all the MDT-object/OST-object pairs have been marked during the
> 1st phase scanning. So if some OST-objects are left unmarked, they may be
> orphans, which will be handled during the 2nd phase scanning.
>
>>> Thanks,
>>> - Dong
>>>
>>>> On May 1, 2018, at 9:49 PM, Yong, Fan <fan.y...@intel.com> wrote:
>>>>
>>>> Inline comments.
>>>>
>>>> --
>>>> Cheers,
>>>> Nasf
>>>>
>>>>> -----Original Message-----
>>>>> From: lustre-discuss
>>>>> [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of 代栋
>>>>> Sent: Wednesday, May 2, 2018 5:36 AM
>>>>> To: lustre-discuss@lists.lustre.org
>>>>> Subject: [lustre-discuss] Is there a way to have faster lustre file
>>>>> system checker (lfsck)?
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I am still new to Lustre, so please let me know if I should send
>>>>> this message to devel-list.
>>>>>
>>>>> This week, I tried to run LFSCK over a very small cluster
>>>>> configuration (1 MDT and 3 OSTs). In this Lustre, I used about 300K
>>>>> inodes. It took me about 80 mins to finish an LFSCK run. And, more
>>>>> importantly, while LFSCK is running, on both the MDT and the OSTs,
>>>>> the CPU utilization is 100%, taken by the lfsck thread.
>>>>
>>>> Which version of Lustre are you using, and what LFSCK command line did
>>>> you use?
>>>>
>>>>> I understand that lfsck is operating in an online mode, so it is
>>>>> slow. But I am wondering whether there is any way to accelerate this,
>>>>> especially if I am allowed to run it offline, for example, during
>>>>> weekly maintenance.
>>>>
>>>> Your slowness is abnormal and is not related to online mode. LFSCK can
>>>> NOT be run in offline mode.
>>>>
>>>>> After checking the lfsck kernel logs, I noticed that in the phase 2
>>>>> scanning on OSTs, there is a 30-second interval between queries to
>>>>> the MDTs. I am wondering whether there is any reason to have this
>>>>> 30-second interval, and whether lfsck on OSTs would be faster if we
>>>>> removed it?
>>>>
>>>> Normally, the master engine on the MDT will notify the LFSCK engine on
>>>> the OST when the first phase is done. But we can NOT guarantee that the
>>>> LFSCK engine on the MDT is always alive during the LFSCK (maybe because
>>>> of some failure, network trouble, a node crash, and so on), so in the
>>>> 2nd phase scanning, if the LFSCK engine on the OST does not receive the
>>>> notification from the MDT, it needs to query the LFSCK status (on the
>>>> MDT) periodically. If the MDT finished the 1st phase scanning earlier
>>>> than the OST, then there will be no such query. Anyway, such queries
>>>> are NOT the reason for your slow LFSCK.
>>>>
>>>>> Thanks,
>>>>> - Dong
>>>>> _______________________________________________
>>>>> lustre-discuss mailing list
>>>>> lustre-discuss@lists.lustre.org
>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
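
For readers following the thread, the two-phase idea in Nasf's reply (mark every OST-object that some MDT-object's layout refers to, then treat whatever is left unmarked as an orphan candidate) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Lustre code: `OstObject`, `phase1_mark`, `phase2_orphans` and the FID strings are all hypothetical names made up for illustration, and the thread does not say which side performs the marking or how the information travels between MDT and OST.

```python
from dataclasses import dataclass


@dataclass
class OstObject:
    """Hypothetical stand-in for an OST-object; not a Lustre structure."""
    fid: str
    marked: bool = False  # set once some MDT-object's layout refers to it


def phase1_mark(mdt_layouts, ost_objects):
    """Phase 1: walk every MDT-object and mark the OST-objects it references."""
    for stripe_fids in mdt_layouts.values():
        for ost_fid in stripe_fids:
            obj = ost_objects.get(ost_fid)
            if obj is not None:  # references to missing objects are ignored here
                obj.marked = True


def phase2_orphans(ost_objects):
    """Phase 2: any OST-object still unmarked is an orphan candidate."""
    return sorted(fid for fid, obj in ost_objects.items() if not obj.marked)


# Toy data (assumption): two MDT-objects whose layouts list three OST-objects,
# plus one OST-object ("ost-d") that no layout refers to.
mdt_layouts = {"mdt-1": ["ost-a", "ost-b"], "mdt-2": ["ost-c"]}
ost_objects = {fid: OstObject(fid) for fid in ["ost-a", "ost-b", "ost-c", "ost-d"]}

phase1_mark(mdt_layouts, ost_objects)
orphans = phase2_orphans(ost_objects)
print(orphans)  # ['ost-d']
```

In this model the second phase needs no further comparison of metadata; it only looks at the marks left by the first phase, which matches the "marked during the 1st phase, handled during the 2nd phase" description above.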