What's the main limiting factor for purge speed?

I've got one problem user who has millions of small files.  Robinhood has been 
diligently working away at purging the files, but it's presently going at about 
6 deletes per second, which strikes me as pretty slow.  Adding an index to the 
last_access column in the ENTRIES table seemed to help boost the DB query. (If 
this seems like good practice, it might be worth mentioning in the 
documentation.)
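
For anyone wanting to see the effect of such an index, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than the MySQL database Robinhood normally runs on. The simplified three-column ENTRIES table, the index name idx_last_access, and the shape of the purge-candidate query are all assumptions made for illustration, not Robinhood's actual schema or SQL.

```python
import sqlite3

# In-memory stand-in for the Robinhood database (assumption: simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ENTRIES (id TEXT PRIMARY KEY, last_access INTEGER, size INTEGER)"
)
conn.executemany(
    "INSERT INTO ENTRIES VALUES (?, ?, ?)",
    [(f"fid{i}", i, 4096) for i in range(1000)],
)

# Hypothetical purge-candidate query: oldest-accessed entries first.
query = "SELECT id FROM ENTRIES WHERE last_access < ? ORDER BY last_access LIMIT 100"


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (500,))
    return " ".join(row[3] for row in rows)


before = plan(query)

# The same CREATE INDEX syntax is accepted by MySQL; the index name is made up.
conn.execute("CREATE INDEX idx_last_access ON ENTRIES (last_access)")

after = plan(query)

print("before:", before)
print("after: ", after)
```

Without the index the plan is a full table scan followed by a sort; with it, the database can walk the index in last_access order and stop once it has enough rows.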

From what I can see, the purge module got its list of 100,000 files 
(db_result_size_max = 100000) from the database and has been busy purging 
them.  I don't suspect that the change log reader is creating a conflict 
that's hampering performance.
I'm including a snippet from the log file below in the hopes it might be 
useful. I'm using v2.5.1.
It's worth noting that the change log reader is about 12 hours behind because I 
stopped robinhood yesterday to add the index to the database.
Does robinhood batch the unlink operations?  Is there anything else I'm missing 
that would explain why the purge is crawling along?

Thanks,
Peter

2014/04/24 09:51:18 [24573/1] STATS | ==================== Dumping stats at 2014/04/24 09:51:18 =====================
2014/04/24 09:51:18 [24573/1] STATS | ======== General statistics =========
2014/04/24 09:51:18 [24573/1] STATS | Daemon start time: 2014/04/23 23:11:17
2014/04/24 09:51:18 [24573/1] STATS | Started modules: log_reader,purge,rmdir
2014/04/24 09:51:18 [24573/1] STATS | ChangeLog reader #0:
2014/04/24 09:51:18 [24573/1] STATS |    fs_name    =   scratch
2014/04/24 09:51:18 [24573/1] STATS |    mdt_name   =   MDT0000
2014/04/24 09:51:18 [24573/1] STATS |    reader_id  =   cl1
2014/04/24 09:51:18 [24573/1] STATS |    records read        = 7542885
2014/04/24 09:51:18 [24573/1] STATS |    interesting records = 6883204
2014/04/24 09:51:18 [24573/1] STATS |    suppressed records  = 659681
2014/04/24 09:51:18 [24573/1] STATS |    records pending     = 4470
2014/04/24 09:51:18 [24573/1] STATS |    last received            = 2014/04/24 09:51:17
2014/04/24 09:51:18 [24573/1] STATS |    last read record time    = 2014/04/23 22:53:19.980761
2014/04/24 09:51:18 [24573/1] STATS |    last read record id      = 451984893
2014/04/24 09:51:18 [24573/1] STATS |    last pushed record id    = 451979737
2014/04/24 09:51:18 [24573/1] STATS |    last committed record id = 451968294
2014/04/24 09:51:18 [24573/1] STATS |    last cleared record id   = 451967508
2014/04/24 09:51:18 [24573/1] STATS |    read speed               = 351.39 record/sec
2014/04/24 09:51:18 [24573/1] STATS |    processing speed ratio   = 2.08
2014/04/24 09:51:18 [24573/1] STATS |    status                   = busy
2014/04/24 09:51:18 [24573/1] STATS |    ChangeLog stats:
2014/04/24 09:51:18 [24573/1] STATS |    MARK: 0, CREAT: 3836391, MKDIR: 5646, HLINK: 791, SLINK: 476, MKNOD: 1, UNLNK: 518361
2014/04/24 09:51:18 [24573/1] STATS |    RMDIR: 1058, RENME: 12599, RNMTO: 0, OPEN: 0, CLOSE: 0, LYOUT: 5, TRUNC: 0, SATTR: 2659523
2014/04/24 09:51:18 [24573/1] STATS |    XATTR: 0, HSM: 0, MTIME: 508034, CTIME: 0, ATIME: 0

2014/04/24 09:51:18 [24573/1] STATS | ==== EntryProcessor Pipeline Stats ===
2014/04/24 09:51:18 [24573/1] STATS | Idle threads: 0
2014/04/24 09:51:18 [24573/1] STATS | Id constraints count: 10000 (hash min=0/max=7/avg=1.3)
2014/04/24 09:51:18 [24573/1] STATS | Stage              | Wait | Curr | Done |     Total | ms/op |
2014/04/24 09:51:18 [24573/1] STATS |  0: GET_FID        |    0 |    0 |    0 |         0 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS |  1: GET_INFO_DB    |  428 |    0 | 8924 |   6890996 |  2.58 |
2014/04/24 09:51:18 [24573/1] STATS |  2: GET_INFO_FS    |  305 |    7 |   15 |   6349673 | 20.29 |
2014/04/24 09:51:18 [24573/1] STATS |  3: REPORTING      |    0 |    0 |    0 |   6028488 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS |  4: PRE_APPLY      |    0 |    0 |    0 |   6322256 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS |  5: DB_APPLY       |  280 |    1 |   40 |   6321975 |  5.49 | 53.19% batched (avg batch size: 3.1)
2014/04/24 09:51:18 [24573/1] STATS |  6: CHGLOG_CLR     |    0 |    0 |    0 |   6881424 |  0.01 |
2014/04/24 09:51:18 [24573/1] STATS |  7: RM_OLD_ENTRIES |    0 |    0 |    0 |         0 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS | DB ops: get=2781136/ins=3546501/upd=2481706/rm=293768
2014/04/24 09:51:18 [24573/1] STATS | --- Pipeline stage details ---
2014/04/24 09:51:18 [24573/1] STATS | GET_INFO_DB   : first: changelog record #451969026, fid=[0x2020ba885:0x88e9:0x0], status=waiting
2014/04/24 09:51:18 [24573/1] STATS | GET_INFO_DB   : last: changelog record #451979736, fid=[0x2020b9bb8:0x731a:0x0], status=waiting
2014/04/24 09:51:18 [24573/1] STATS | GET_INFO_FS   : first: changelog record #451968658, fid=[0x2020ba981:0x19347:0x0], status=processing
2014/04/24 09:51:18 [24573/1] STATS | GET_INFO_FS   : last: changelog record #451969026, fid=[0x2020b9a83:0x1452e:0x0], status=waiting
2014/04/24 09:51:18 [24573/1] STATS | DB_APPLY      : first: changelog record #451968295, fid=[0x2020ba981:0x19324:0x0], status=processing
2014/04/24 09:51:18 [24573/1] STATS | DB_APPLY      : last: changelog record #451968657, fid=[0x2020ba97b:0x1c6be:0x0], status=waiting

2014/04/24 09:51:18 [24573/1] STATS | ============ Purge stats ============
2014/04/24 09:51:18 [24573/1] STATS | idle purge threads       = 4
2014/04/24 09:51:18 [24573/1] STATS | purge operations pending = 0
2014/04/24 09:51:18 [24573/1] STATS | purge status:
2014/04/24 09:51:18 [24573/1] STATS |     successfully purged            = 269227
2014/04/24 09:51:18 [24573/1] STATS |     accessed since last update     = 42365
2014/04/24 09:51:18 [24573/1] STATS |     whitelisted/ignored            = 67
2014/04/24 09:51:18 [24573/1] STATS | total purged volume = 1148391424 (1.07 GB)
2014/04/24 09:51:18 [24573/1] STATS | last file submitted  0 s ago
2014/04/24 09:51:18 [24573/1] STATS | last file handled    0 s ago
2014/04/24 09:51:18 [24573/1] STATS | last file purged     0 s ago
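
As a rough cross-check of the figures in this snippet, the averages can be worked out directly from the timestamps and counters above. The sketch below assumes the purge has been running for the whole time since the daemon start, which may not be the case, so it gives an average since start rather than the current rate. The variable and function names are made up for the example.

```python
from datetime import datetime

FMT = "%Y/%m/%d %H:%M:%S"


def seconds_between(start, end):
    """Elapsed seconds between two timestamps in the log's date format."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Values copied from the stats dump above.
daemon_start = "2014/04/23 23:11:17"
stats_time = "2014/04/24 09:51:18"
last_received = "2014/04/24 09:51:17"
last_read_record = "2014/04/23 22:53:19"  # fractional seconds dropped
purged_files = 269227
purged_bytes = 1148391424

uptime = seconds_between(daemon_start, stats_time)

# Average purge rate since daemon start (assumes purge ran the whole time).
purge_rate = purged_files / uptime

# How far the changelog reader is behind the newest record received.
lag_hours = seconds_between(last_read_record, last_received) / 3600

# Average size of a purged file, confirming that these are small files.
avg_file_bytes = purged_bytes / purged_files

print(f"average purge rate: {purge_rate:.2f} files/sec")
print(f"changelog lag:      {lag_hours:.1f} hours")
print(f"average file size:  {avg_file_bytes:.0f} bytes")
```

With these numbers the average works out to roughly 7 purges per second, the changelog reader is roughly 11 hours behind, and the average purged file is a little over 4 KB.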


_______________________________________________
robinhood-support mailing list
https://lists.sourceforge.net/lists/listinfo/robinhood-support