Many thanks. I will try to run namespace check again and see the results. BTW, 
is the rate control mechanism enabled by default? Can I disable it?

I might be asking stupid questions, but how are MDT-objects and OST-objects 
paired with each other during phase 1 scanning? I mean, does the OST send object 
metadata to the MDT, or does the MDT send object metadata to the OST for pairing? 
Could you point me to the source code so that I can look for more details?

thanks,
- Dong

> On May 2, 2018, at 3:43 AM, Yong, Fan <fan.y...@intel.com> wrote:
> 
> Inline comments.
> 
> --
> Cheers,
> Nasf
> 
>> -----Original Message-----
>> From: 代栋 [mailto:daidon...@gmail.com]
>> Sent: Wednesday, May 2, 2018 4:06 PM
>> To: Yong, Fan <fan.y...@intel.com>
>> Cc: lustre-discuss@lists.lustre.org
>> Subject: Re: [lustre-discuss] Is there a way to have faster lustre file
>> system checker (lfsck)?
>> 
>> Sorry, I misread “abnormal”.  Anything I can check to help diagnose the
>> slowness?
>> 
>> Thanks,
>> - Dong
>> 
>>> On May 2, 2018, at 2:53 AM, 代栋 <daidon...@gmail.com> wrote:
>>> 
>>> Thanks very much for your reply.
>>> 
>>> I used Lustre 2.9.0 and ran “lctl lfsck_start -M lustre-MDT0000 -A -t all
>>> -r” to start LFSCK.
>>> 
> You can try "lctl lfsck_start -M lustre-MDT0000 -A -t namespace -r" first 
> to check the namespace LFSCK speed. Since you have only one MDT, it should be 
> much faster. If it is, then try "lctl lfsck_start -M lustre-MDT0000 -A -t 
> layout -r".
> 
>>> Could you tell me more about the slowness? I mean, scanning around 300K
>>> inodes should not take that much time (80 mins). These files were just
>>> created using a script after a fresh build of Lustre (no complex metadata
>>> operations at all).
>>> 
> I do not know what caused such slowness. There may be many factors. Have you 
> set some fail_loc? If not, you may need to enable LFSCK debug on both the MDT 
> and OST, then collect and analyze the Lustre debug logs.
> 
> 
>>> Got it, so the 30-sec interval is just for checking the status of the MDT.
>>> Another question: for layout checking, does lfsck need to compare metadata
>>> stored in the MDT (in LayoutEA) and metadata stored in the OSTs (FID in LMA?
>>> not very sure) for orphan objects? When is this metadata gathered into one
>>> place for checking? I am asking because I previously thought the periodic
>>> queries from the OSTs to the MDT were doing this job.
>>> 
> In short, all the MDT-object/OST-object pairs have been marked during the 1st 
> phase scanning. So if some OST-objects remain unmarked, they may be orphans, 
> which will be handled during the 2nd phase scanning.
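A toy sketch of the marking idea described above may help picture it. This is not Lustre code: the function name, the dictionary of layouts, and the string object IDs are all made up for illustration, and the real LFSCK works on FIDs and on-disk attributes rather than in-memory sets.

```python
# Toy model only: names and data shapes are illustrative, not Lustre internals.
from typing import Dict, Iterable, List, Set


def find_orphan_candidates(
    mdt_layouts: Dict[str, List[str]],
    ost_objects: Iterable[str],
) -> Set[str]:
    """Return the OST-object IDs that no MDT-object layout refers to."""
    marked: Set[str] = set()

    # Phase 1: walk every MDT-object and mark each OST-object it references.
    for referenced in mdt_layouts.values():
        marked.update(referenced)

    # Phase 2: anything left unmarked is only a *candidate* orphan.
    return {obj for obj in ost_objects if obj not in marked}


if __name__ == "__main__":
    layouts = {"file-a": ["ost0:1", "ost1:1"], "file-b": ["ost2:1"]}
    objects = ["ost0:1", "ost1:1", "ost2:1", "ost2:2"]
    print(find_orphan_candidates(layouts, objects))  # {'ost2:2'}
```

In this model the unmarked object is only reported, since the reply above says orphans are handled in the 2nd phase without saying how.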
> 
> 
>>> Thanks,
>>> - Dong
>>> 
>>> 
>>>> On May 1, 2018, at 9:49 PM, Yong, Fan <fan.y...@intel.com> wrote:
>>>> 
>>>> Inline comments.
>>>> 
>>>> --
>>>> Cheers,
>>>> Nasf
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: lustre-discuss
>>>>> [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of 代栋
>>>>> Sent: Wednesday, May 2, 2018 5:36 AM
>>>>> To: lustre-discuss@lists.lustre.org
>>>>> Subject: [lustre-discuss] Is there a way to have faster lustre file
>>>>> system checker (lfsck)?
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I am still new to Lustre, so please let me know if I should send
>>>>> this message to devel-list.
>>>>> 
>>>>> This week, I tried to run LFSCK over a very small cluster
>>>>> configuration (1 MDT and 3 OSTs).  In this Lustre, I used about 300K
>>>>> inodes.  It took me about 80 mins to finish an LFSCK run.  And, more
>>>>> importantly, while I am running LFSCK, on both the MDT and OSTs, the CPU
>>>>> utilization is 100%, taken by the lfsck thread.
>>>> 
>>>> Which version of Lustre and what is the LFSCK command line you used?
>>>> 
>>>> 
>>>>> I understand that lfsck is operating in an online mode, so it is
>>>>> slow.  But I am wondering whether there is any way to accelerate this,
>>>>> especially if I am allowed to run it offline, for example, during weekly
>>>>> maintenance.
>>>> 
>>>> Your slowness is abnormal and not related to online mode. LFSCK can NOT
>>>> be run in offline mode.
>>>> 
>>>> 
>>>>> 
>>>>> After checking the lfsck kernel logs, I noticed that in the phase 2
>>>>> scanning on OSTs, there is a 30-second interval between queries to
>>>>> the MDTs.  I am wondering whether there is any reason to have this
>>>>> 30-second interval, and whether lfsck on OSTs would be faster if we
>>>>> removed it.
>>>> 
>>>> Normally, the master engine on the MDT will notify the LFSCK engine on the
>>>> OST when the first phase is done. But we can NOT guarantee that the LFSCK
>>>> engine on the MDT is always alive during the LFSCK (maybe because of some
>>>> failure, network trouble, node crash, and so on), so in the 2nd phase
>>>> scanning, if the LFSCK engine on the OST does not receive the notification
>>>> from the MDT, it needs to query the LFSCK status (on the MDT) periodically.
>>>> If the MDT finished the 1st phase scanning earlier than the OST, then there
>>>> will be no such query. Anyway, such queries are NOT the reason for your slow
>>>> LFSCK.
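The notify-or-poll fallback described above can be pictured with a small sketch. It is only a model of the behaviour as explained in this thread, not the actual LFSCK code: the two callables, the `max_queries` cap, and the return strings are hypothetical, and the 30-second default simply mirrors the interval observed in the logs.

```python
# Illustrative model of "wait for notification, otherwise poll the MDT".
# All names here are hypothetical; none of this is Lustre source.
from typing import Callable


def wait_for_phase1_done(
    wait_for_notify: Callable[[float], bool],
    query_mdt_done: Callable[[], bool],
    interval: float = 30.0,
    max_queries: int = 100,
) -> str:
    """Return how the OST side learned that the MDT finished phase 1."""
    for _ in range(max_queries):
        # Normal path: the MDT's notification arrives within the interval.
        if wait_for_notify(interval):
            return "notified"
        # Fallback: no notification (MDT failure, network trouble, ...),
        # so ask the MDT for its LFSCK status directly.
        if query_mdt_done():
            return "queried"
    return "gave-up"


if __name__ == "__main__":
    answers = iter([False, False, True])
    result = wait_for_phase1_done(
        wait_for_notify=lambda timeout: False,  # notification never arrives
        query_mdt_done=lambda: next(answers),   # third query succeeds
    )
    print(result)  # queried
```

If the notification has already arrived, the first wait returns immediately and no query is sent, which matches the remark that there is no such query when the MDT finishes the 1st phase first.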
>>>> 
>>>> 
>>>>> 
>>>>> Thanks,
>>>>> - Dong
>>>>> _______________________________________________
>>>>> lustre-discuss mailing list
>>>>> lustre-discuss@lists.lustre.org
>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
