Hi Andrew,

As Scott suggested, looking at the pipeline stats in the log will help 
point out where the bottleneck is in your configuration: the filesystem 
scan speed or the Robinhood DB. Pipeline stats show where the current 
operations are queued up and their average duration.
If the pipeline looks empty, then the bottleneck is the Lustre 
readdir/stat rate.

In any case, it is important to take care of your DB tuning.
I don't know whether this is the case with MariaDB, but the MySQL 
default parameters were not appropriate at all, and some tuning was 
required to get good enough DB performance.

Here is a set of parameters from /etc/my.cnf that we use to tune our 
systems (you can use http://mysqltuner.pl to help you):

key_buffer_size=512M
thread_cache_size=64
query_cache_size=512M
query_cache_limit=512M
sort_buffer_size=512M
read_rnd_buffer_size=1M
table_cache=8K
tmp_table_size=1G
max_heap_table_size=1G

innodb_file_per_table
innodb_buffer_pool_size=64G
innodb_max_dirty_pages_pct=20

# See http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-innodb-log-file-size
innodb_log_file_size=500M
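
The linked post's sizing method can be sketched roughly as follows: sample 
Innodb_os_log_written on the running server a minute apart, then size the 
redo log files so the log group holds about an hour of writes. The helper 
and the example rate below are illustrative, not measured values (the 
sampling itself needs a live server):

```python
def recommended_log_file_size(bytes_per_minute, n_log_files=2,
                              target_minutes=60):
    """Size each InnoDB redo log file so that the whole log group
    holds roughly target_minutes of redo writes (the rule of thumb
    from the blog post linked above)."""
    return (bytes_per_minute * target_minutes) // n_log_files

# Example: ~10 MiB/min of redo writes -> ~300 MiB per log file
size_bytes = recommended_log_file_size(10 * 1024 * 1024)
print(size_bytes // (1024 * 1024), "MiB")  # prints: 300 MiB
```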

Also, looking at your rbh-report stats, I'd suggest you set the Lustre 
default stripe count to 1, as most of your files are smaller than a 
single stripe.
This should speed up stat() operations for newly created files, as the 
Lustre client will only have to get the size from 1 OSS instead of 
querying several servers.
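
For context, the rbh-report numbers in your message already make the 
case. A rough check, assuming the common 1 MiB default stripe size 
(verify the actual value on your filesystem):

```python
# Average file size from the rbh-report output: 744.42 KB
avg_file_size = int(744.42 * 1024)   # bytes
stripe_size = 1 << 20                # 1 MiB: a common default (assumption)

# With stripe_count > 1, a stat() has to query every object in the
# layout; at these sizes a single stripe holds the whole file anyway.
print(avg_file_size < stripe_size)   # prints: True
```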

Regards,
Thomas

On 04/21/14 16:41, Scott Nolin wrote:
> Hi,
>
> I think you probably should be looking not at TB/day but at the number 
> of files (records) per unit time. Robinhood should report stats 
> periodically in the log, and this can be useful to see how your scan is 
> going in more detail.
>
> Those millions of tiny files will certainly have a cost in scan time. 
> It should take just as long to read the file attributes for a 1K file 
> as for a 100G file I think.
>
> The bottleneck I think comes down to the database performance and MDS 
> performance.
>
> The robinhood developers probably have more insight.
>
> Scott
>
> On 4/20/2014 2:52 AM, Andrew Elwell wrote:
>> Hi folks,
>>
>> I suspect this is a "how long is a piece of string" question, but
>> roughly what order of scan speed do other sites see on large systems?
>>
>> We have a 3PB /scratch hosted on sonnexion appliances (Cray) so I'm
>> running 2 instances of robinhood (one on each of two esDM nodes) --
>> one as a lustre changelog, and the other performing a --scan -O
>> --no-gc -d to help with the initial DB population (it's a fresh
>> install of MariaDB10 on a 3rd host - dedicated LUN for /var/lib/mysql
>> but without any SSD devices)
>>
>> I'm seeing an average of 15-20TB/day for the scan - is this normal?
>> Also, some of our users have huge directory structures with millions
>> of directories and tiny (o240k) files within them *cough* openfoam --
>> do other sites see this and how do they deal with the filetype mix?
>>
>>
>> so far in (~7d) I have:
>> type    ,      count,     volume,   avg_size
>> symlink ,     269149,   19.68 MB,         77
>> dir     ,   41570192,  160.88 GB,    4.06 KB
>> file    ,  194195639,  134.64 TB,  744.42 KB
>> fifo    ,          3,          0,          0
>>
>> Total: 236034983 entries, 148206163990949 bytes (134.79 TB)
>>
>>
>> Many thanks
>>
>> Andrew


_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
