Hi Thomas,

Thanks for the information and suggestions.  Robinhood v3 sounds quite exciting 
- I will give some thought as to how we might be able to participate in an 
early testing phase.  I think our timeline probably points toward moving into 
production with v2, though.  I am by no means a MySQL expert.  I'd very much 
appreciate it if you could point me to tuned my.cnf examples.

If I may ask for comment on high-availability using the current Robinhood 
releases and our lhsm+tmpfs setup - my intuition is that dual controller, 
direct-attached storage and two hosts each running a mysqld and an rbh manager 
- one for lhsm and one for tmpfs - would work well; the storage can be two 
separate LUNs (probably made from SSDs), one for each database.  Then with a 
pretty conventional corosync/pacemaker setup, we can have one node ready to 
take over the services of the other as required but be running in an 
active-active mode in normal operation.  Currently, our file system has 220M 
inodes in use (growing), which suggests each host having at least 256GB of 
memory.  Having each LUN be a couple TB or so should allow for future growth 
(which we expect).  Is this the kind of setup which is generally recommended if 
one needs a highly available Robinhood setup?  I suspect that alternatives like 
mysql master-slave replication or DRBD will impact performance and/or not work 
quite as well - is my concern unfounded?  FWIW, our cluster currently has ~22k 
cores, and we'll be growing to ~50k next year.  Our combined 
create/open/close/rename/unlink activity is recently averaging in the few 
thousand ops/sec, though being in the 10s of thousands ops/sec isn't uncommon.

Thanks again,
Craig Prescott
UF Research Computing

From: LEIBOVICI Thomas [mailto:[email protected]]
Sent: Monday, December 01, 2014 5:13 AM
To: Prescott,Craig P; [email protected]
Subject: Re: [robinhood-support] lhsm custom purge command and DB server sizing

Hi Craig,

It sounds like Robinhood v3 would perfectly match your need to manage these 
multiple use-cases in a single robinhood instance. This is basically what it is 
designed for. This major version is currently under development, and it will 
likely be available in 2H2015. Not sure how it fits in your planning...
Depending on your time requirement to set up these use-cases in full 
production, you could get an early version of robinhood in early 2015 that does 
not implement all planned features of rbhv3, but at least implements lhsm 
archiving and pool-to-pool migration in a single instance.
If you agree to such an early testing phase, just let us know.
I believe this answers your question 1).

2) lhsm and tmpfs can run on the same client, as long as they are registered as 
different changelog readers, and access distinct databases.
I'd be concerned about a fight between the 2 instances for releasing disk 
space in OST pools: one will want to run "lhsm release", and the other "lfs 
migrate". However, I understood you just want to replicate data with lhsm, but 
don't plan to "release" it. So it will be OK.

3) A few recommendations:
- Keep your robinhood client as close to the MDS as possible: no access 
through LNET routers that would introduce extra latency for Lustre RPCs.
- I'm not sure running the DB on a different host is a good idea because it 
introduces network latency for robinhood DB requests, whereas they must fire 
at tens of thousands per second to sustain the filesystem workload.
The robinhood daemon itself doesn't require much memory or CPU, so it can 
perfectly well live with a DB engine on the same host.
- The database size is not expected to change much between Lustre 1.8 
and 2.5. Most of the space is consumed by namespace management and storage of 
stripe information, which are about the same in the 2 versions. Lustre/HSM adds 
only one or 2 more table fields, so not a lot of additional space.
- Tune your /etc/my.cnf (we can provide you examples).
- For the HW:
    * It is nice to have most of the DB in memory: about 1 KB per entry is a 
good sizing rule (e.g. 128GB of memory for 128M entries). If you need to run 2 
robinhood DBs on the same host, you'll have to double it.
    * Of course, a fast DB backend like an SSD is better than a spinning disk 
(some benchmarks here: 
https://github.com/cea-hpc/robinhood/wiki/tmpfs_admin_guide#entry-processor-pipeline-options)
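
To make the memory sizing rule above concrete, here is a rough sketch of the 
arithmetic in Python. The function name, its parameters, and the exact figure 
of 1000 bytes per entry are illustrative assumptions (the rule of thumb is only 
"about 1k per entry"), and decimal GB is assumed so that the result matches the 
128GB for 128M entries example. Treat the output as an order-of-magnitude 
estimate, not a guarantee.

```python
def db_memory_gb(entries, bytes_per_entry=1000, db_instances=1):
    """Estimate RAM (in decimal GB) to keep robinhood DB(s) mostly in memory.

    bytes_per_entry=1000 is an assumed reading of the "1k/entry" rule of
    thumb; db_instances is the number of robinhood DBs hosted on one node.
    """
    if entries < 0 or db_instances < 1:
        raise ValueError("entries must be >= 0 and db_instances >= 1")
    return entries * bytes_per_entry * db_instances / 1e9


if __name__ == "__main__":
    # One DB holding 128M entries.
    print(db_memory_gb(128_000_000))
    # Two DBs (e.g. lhsm and tmpfs) on the same host: the estimate doubles.
    print(db_memory_gb(128_000_000, db_instances=2))
```

The entry count would come from the number of inodes in use on the file 
system, with some headroom added for expected growth.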

Regards,
Thomas

On 11/25/14 20:39, Prescott,Craig P wrote:
Hello,

A few months back we took a look at Robinhood Policy Engine on a Lustre 2.5.x 
testbed with eyes toward accomplishing two main goals: a) automatic migration 
between OST pools based upon atime without changing the path, and b) 
replication of particular data to an external file system.  It seemed like 
current Robinhood software releases could do this by running a tmpfs manager 
with a custom purge command (lfs migrate) to handle the migration between 
pools, and an lhsm manager for the replication.

With the upgrade and expansion of our production file system from 1.8.9 to 2.5 
being planned for the new year (new system), we are considering how best to 
bring the above two goals into production.  We want to have the Robinhood 
components we need in place when the new system goes into production, and we 
want to avoid ever having to rescan the file system.  So I have the following 
questions I hope I can get some input on:


1)      At the time I was looking, the lhsm manager could not have a custom 
purge command (appended).  If it could, then we would only need to run the lhsm 
manager, which would be ideal - it could handle both of our use cases and we 
could have a single changelog reader and a single database.  Is this possible 
with current software releases?

2)      If we need to run both lhsm and tmpfs managers, can they be run 
simultaneously from the same client?

3)      We intend to use a database running on a different node than the 
Robinhood managers.  I am not sure how to wisely choose HW for mysqld to keep 
up with the changelogs.  We can use stats from our current 1.8.9-based file 
system and cluster activity to size this, but I'm not sure what to look at.  
Any pointers here?

Thanks,
Craig Prescott
UF Research Computing

_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
