Howdy,
We are running Lustre 2.4.3 and I'm attempting to get the changelog
functionality working so that I don't have to do full scans.
The robinhood service is running, and on startup it reports that it is
using the correct config file (Starting Robinhood for
/etc/robinhood.d/tmpfs/scratch.conf [ OK ]).
Robinhood packages:
robinhood-webgui-2.5.0-1.noarch.x86_64
robinhood-adm-2.5.2-1.noarch.x86_64
robinhood-tmpfs-2.5.2-1.lustre2.4.el6.x86_64
robinhood-backup-2.5.2-1.lustre2.4.el6.x86_64
Lustre packages:
kernel-2.6.32-358.23.2.el6_lustre.x86_64
lustre-2.4.3-2.6.32_358.23.2.el6_lustre.x86_64.x86_64
lustre-tests-2.4.3-2.6.32_358.23.2.el6_lustre.x86_64.x86_64
lustre-iokit-1.4.0-1.noarch
lustre-modules-2.4.3-2.6.32_358.23.2.el6_lustre.x86_64.x86_64
lustre-ldiskfs-4.1.0-2.6.32_358.23.2.el6_lustre.x86_64.x86_64
kernel-devel-2.6.32-358.23.2.el6_lustre.x86_64
lustre-osd-ldiskfs-2.4.3-2.6.32_358.23.2.el6_lustre.x86_64.x86_64
Here's the /etc/robinhood.d/tmpfs/scratch.conf file
General
{
fs_path = "/scratch";
fs_type = lustre;
# check that objects are in the same device as 'fs_path',
# so it will not traverse mount points
stay_in_fs = TRUE ;
# check that the filesystem is mounted
check_mounted = TRUE ;
}
Log
{
log_file = "/var/log/robinhood/tmp_fs.log";
report_file = "/var/log/robinhood/reports.log";
alert_file = "/var/log/robinhood/alerts.log";
}
ListManager
{
MySQL
{
server = localhost;
db = robinhood_scratch;
user = robinhood;
password_file = /etc/robinhood.d/.dbpassword;
}
}
# FS Scan configuration
FS_Scan
{
# simple scan interval (fixed)
#scan_interval = 2d ;
scan_interval = 4h ;
# min/max for adaptive scan interval:
# the more the filesystem is full, the more frequently it is scanned.
#min_scan_interval = 24h ;
#max_scan_interval = 7d ;
# number of threads used for scanning the filesystem
nb_threads_scan = 2 ;
# when a scan fails, this is the delay before retrying
scan_retry_delay = 1h ;
# timeout for operations on the filesystem
scan_op_timeout = 1h ;
# exit if operation timeout is reached?
exit_on_timeout = TRUE ;
# external command called on scan termination
# special arguments can be specified: {cfg} = config file path,
# {fspath} = path to managed filesystem
#completion_command = "/path/to/my/script.sh -f {cfg} -p {fspath}" ;
# Internal scheduler granularity (for testing end of scan, hangs, ...)
spooler_check_interval = 1min ;
# Memory preallocation parameters
nb_prealloc_tasks = 256 ;
Ignore
{
# ignore ".snapshot" and ".snapdir" directories (don't scan them)
type == directory
and
( name == ".snapdir" or name == ".snapshot" )
}
}
# ChangeLog Reader configuration
# Parameters for processing MDT changelogs :
ChangeLog
{
# 1 MDT block for each MDT :
MDT
{
# name of the first MDT
mdt_name = "MDT0000" ;
# id of the persistent changelog reader
# as returned by "lctl changelog_register" command
reader_id = "cl1" ;
}
# clear changelog every 1024 records:
batch_ack_count = 1024 ;
force_polling = ON ;
polling_interval = 1s ;
queue_max_size = 1000 ;
queue_max_age = 5s ;
queue_check_interval = 1s ;
}
And here's the output of rbh-report -a
rbh-report -a
Using config file '/etc/robinhood.d/tmpfs/scratch.conf'.
Filesystem scan activity:
Current scan interval: 4.0h
Previous filesystem scan:
start: 2014/06/12 10:33:54
duration: 1h 21min 22s
Last filesystem scan:
status: running
start: 2014/06/12 12:56:10 (4d 1h 51min 41s ago)
last action: 2014/06/12 17:41:24 (3d 21h 06min 27s ago)
Statistics:
entries scanned: 2424665
errors: 0
timeouts: 0
# threads: 2
average speed: 145.88 entries/sec
>>> current speed: 65.81 entries/sec
Changelog stats:
Last read record id: 26728103
Last read record time: 2014/06/10 22:31:49.203102
Last receive time: 2014/06/16 14:40:00
Last committed record id: 26706138
Changelog stats:
type total (diff) (rate)
MARK: 0
CREAT: 5096121 (+15589) (17.32/sec)
MKDIR: 281879 (+7014) (7.79/sec)
HLINK: 174
SLINK: 38692 (+4) (0.00/sec)
MKNOD: 278
UNLNK: 810946 (+235) (0.26/sec)
RMDIR: 28027
RENME: 2298320 (+15505) (17.23/sec)
RNMTO: 0
OPEN: 0
CLOSE: 8920215 (+16234) (18.04/sec)
LYOUT: 0
TRUNC: 0
SATTR: 5610684 (+61312) (68.12/sec)
XATTR: 103
HSM: 0
MTIME: 3266967 (+394) (0.44/sec)
CTIME: 329209
ATIME: 0
Storage usage has never been checked
No purge was performed on this filesystem
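To put numbers on how far behind the changelog reader appears to be, here is a minimal Python sketch using only values copied from the report above. It assumes that "Last read record time" is the timestamp carried by the record itself and that "Last receive time" is when Robinhood received it; I have not confirmed that interpretation against the Robinhood documentation.

```python
from datetime import datetime

# Values copied from the rbh-report -a output above.
last_read_id = 26728103
last_committed_id = 26706138
# Fractional seconds (.203102) dropped for simplicity.
last_read_record_time = datetime(2014, 6, 10, 22, 31, 49)
last_receive_time = datetime(2014, 6, 16, 14, 40, 0)

# Records read from the MDT but not yet committed to the database.
uncommitted = last_read_id - last_committed_id

# Assumed meaning: how old the newest record read was when it was received.
lag = last_receive_time - last_read_record_time
hours = lag.seconds // 3600
minutes = lag.seconds % 3600 // 60

print(f"uncommitted records: {uncommitted}")
print(f"changelog lag: {lag.days}d {hours}h {minutes}min")
```

If that reading is right, the reader is working through records that are more than five days old, which would be consistent with the stale user statistics below.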
When I look at user statistics, they are not current. For example, the user
below is reported as using 5.56 GB, but that user has had roughly 20 TB in
their scratch directory for over a week.
rbh-report -u jsmith
Using config file '/etc/robinhood.d/tmpfs/scratch.conf'.
user , type, count, spc_used, avg_size
jsmith , symlink, 140, 0, 23
jsmith , dir, 726, 2.95 MB, 4.17 KB
jsmith , file, 7312, 5.56 GB, 794.96 KB
jsmith , chr, 2, 0, 0
Total: 8180 entries, 5972529152 bytes used (5.56 GB)
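For scale, a quick Python sketch of the discrepancy. The byte count is taken from the "Total" line above; treating "roughly 20TB" as 20 TiB is my own assumption, not something Robinhood reported.

```python
# "Total" line from rbh-report -u jsmith
reported_bytes = 5972529152
# Assumption: "roughly 20TB" taken as 20 TiB (binary units)
expected_bytes = 20 * 2**40

# Robinhood's "5.56 GB" matches the byte count only in binary units (GiB).
reported_gib = reported_bytes / 2**30
fraction = reported_bytes / expected_bytes

print(f"reported: {reported_gib:.2f} GiB")
print(f"fraction of expected usage: {fraction:.4%}")
```

In other words, the database accounts for well under 0.1% of what this user should be holding.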
Does anyone see anything wrong with my changelog configuration?
Thanks, Mike