On 26/11/2014 18:19, Craig Tierney - NOAA Affiliate wrote:
Thomas,

We backported the patch. It was just a one-liner to put changelog entries at the tail, rather than the head, of the list. After the last catch-up of the changelogs completed, I created a bunch of new files while robinhood was not running. The processing rate is still about 400 entries per second. In particular, it looked like it was processing about 1024 records every 2.5 seconds.

So I looked in the configuration and saw that I had:

  # clear changelog every 1024 records:
    batch_ack_count = 1024 ;
Craig,

This is strange. The behavior you describe sounds exactly like the problem the patch is supposed to fix: every changelog_clear() call to the MDS stalls changelog delivery for a while.

Are there a lot of stacked records? You can see this on the MDS, as far as I can remember, in /proc/fs/lustre/*mdd*/changelog_user or something like that:
it shows the last record id and the last cleared record.
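
As a rough illustration of what those two numbers tell you, here is a minimal Python sketch. It assumes the backlog is simply the last record id minus the last cleared record, and that no new records arrive while catching up. The function names and the record ids are hypothetical; the ids are chosen only so that the backlog matches the roughly 144 million events mentioned in this thread.

```python
def changelog_backlog(last_record_id: int, last_cleared_id: int) -> int:
    """Records emitted by the MDS but not yet cleared by the reader."""
    return max(0, last_record_id - last_cleared_id)


def catchup_hours(backlog: int, records_per_second: float) -> float:
    """Hours needed to drain the backlog, ignoring records that arrive meanwhile."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return backlog / records_per_second / 3600.0


if __name__ == "__main__":
    # Hypothetical ids giving a backlog of 144 million records.
    backlog = changelog_backlog(last_record_id=150_000_000,
                                last_cleared_id=6_000_000)
    # The two processing rates reported in this thread.
    for rate in (375, 1666):
        print(f"{rate} records/s -> {catchup_hours(backlog, rate):.1f} hours")
```

With these inputs the estimate is about 107 hours at 375 records/s and about 24 hours at 1666 records/s.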

I don't know why this would slow things down, I thought it was just an update optimization. I ran some tests with a different changelog user and it seemed dumping the changelogs and updating the position should never be a limitation as I was able to grab over 100,000 entries and reset the count in a few seconds.
OK.

So I updated batch_ack_count to 10,000. Now the change log processing rate seemed to go up to 1666 logs/second (over 30 seconds). This is better. If the rate is limited by the database performance, then there probably isn't much more I can do (comparing to scan rates).
Running "grep STAT" on the robinhood log would help identify the limitation you hit. If you want to sample stats over a shorter period than the default (which is 15 or 20 minutes), you can change "stats_interval" in the config.


What do people use for a value of batch_ack_count on large, PB sized, filesystems?
I think a good value is a few seconds' worth of changelog processing. So 10k is a good value in your case.
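
The rule of thumb above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the 6-second window is an assumed reading of "a few seconds". The printed line reuses the batch_ack_count syntax quoted earlier in this thread.

```python
def suggest_batch_ack_count(records_per_second: float, seconds_per_ack: float) -> int:
    """Batch size covering roughly `seconds_per_ack` seconds of processing."""
    if records_per_second <= 0 or seconds_per_ack <= 0:
        raise ValueError("both arguments must be positive")
    return max(1, round(records_per_second * seconds_per_ack))


def config_line(count: int) -> str:
    """Format the value the way it appears in the config excerpt above."""
    return f"batch_ack_count = {count} ;"


if __name__ == "__main__":
    # 1666 records/s is the rate reported in this thread; 6 s is an assumption.
    print(config_line(suggest_batch_ack_count(1666, 6)))
```

At 1666 records/s, a 6-second window gives 9996, which is in line with the 10k value discussed here.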


Regards

Thanks,
Craig


On Tue, Nov 18, 2014 at 3:00 AM, LEIBOVICI Thomas <[email protected] <mailto:[email protected]>> wrote:

    Hi Craig,

    No, such a slow processing speed is not expected.
    Given the Lustre versions you run, this slow processing may
    be due to the following Lustre bug:

    https://jira.hpdd.intel.com/browse/LU-5405

    It is an MDS fix. For now the fix has only landed in Lustre 2.5.4. I
    don't know if it can be backported to Lustre 2.4...

    Regards,
    Thomas


    On 11/17/14 21:11, Craig Tierney - NOAA Affiliate wrote:
    Hi,

    I have just installed Robinhood 2.5.3 to monitor a Lustre 2.4.3
    system. The client on the server is running the 2.5.3 version.
    When I did an initial scan of another test system I saw scan
    rates of about 1000-2000 entries per second. While I had
    configured robinhood to monitor this new system, the Robinhood
    server was not running when we started to copy data to the new
    filesystem.  From the changelog statistics, I am about 144m
    events behind. Processing the changelogs seems to be going at
    only 375 entries per second.

    Is this typical?  I would have expected the processing of
    changelog events to be much faster than this or at least as fast
    as a normal file scan.

    Thanks,
    Craig


    


    _______________________________________________
    robinhood-support mailing list
    [email protected]  
<mailto:[email protected]>
    https://lists.sourceforge.net/lists/listinfo/robinhood-support



