In this case, I'm not in the business of deleting these files myself.  I've got 
experience purging files like this, but for this task, I have to let robinhood 
do its thing.

I think the DB delays you're seeing are related to the change log reader 
inserting and updating records.  When I look at the process list on the 
MySQL server, none of the queries are related to the purge module; they just 
look like the standard insert/update queries from the change log reader.

-peter


On Apr 24, 2014, at 13:17, Adam Brenner <[email protected]> wrote:

> On Thu, Apr 24, 2014 at 7:42 AM, Doherty, Peter Charles
> <[email protected]> wrote:
>> What's the main limiting factor for purge speed?
>> 
>> I've got one problem user who has millions of small files.
> 
> Only one? Buy some lotto tickets, have some real good luck ;-p
> 
> 
>> Robinhood has been diligently working away at purging the files, but it's
>> presently going at about 6 deletes per second, which strikes me as pretty
>> slow. Adding an index to the last_access column in the ENTRIES table seemed
>> to help boost the DB query. (If this seems like good practice, it might be
>> worth mentioning in the documentation.)
> 
> It depends. By adding another index, inserts incur an extra penalty.
> Depending on how the index was created (as an additional column in a
> current index or as an entirely new key), the costs vary. Ideally it
> would be an additional column in a current index, so it's within the
> same lookup.
> 
> I am not sure how RBH actually performs file deletions, but as a
> general FYI, use rsync. A great write-up _used to_ exist; it is gone now,
> but the Wayback Machine was able to retrieve it:
> 
> https://web.archive.org/web/20130929001850/http://linuxnote.net/jianingy/en/linux/a-fast-way-to-remove-huge-number-of-files.html
> 
> As a note for Thomas and the other RBH devs: This may be a faster way
> to purge files: https://gist.github.com/jzwinck/5692534
> 
> 
>> Does robinhood batch the unlink operations?  Is there anything else I'm
>> missing that would explain why the purge is crawling along?
> 
> From the output, it appears your bottleneck is not the actual file
> deletion operation but rather the database: the GET_INFO_DB stage
> (select statements) along with DB_APPLY.
> 
> I suggest you run
> 
>  wget mysqltuner.pl
>  perl mysqltuner.pl
> 
> And see if you can improve your database performance on the
> GET_INFO_DB operation. You may also want to increase the number of
> threads to perform this action (only 7 with none of them idle).
> 
>> 
>> 2014/04/24 09:51:18 [24573/1] STATS | ==== EntryProcessor Pipeline Stats ===
>> 2014/04/24 09:51:18 [24573/1] STATS | Idle threads: 0
>> 2014/04/24 09:51:18 [24573/1] STATS | Id constraints count: 10000 (hash min=0/max=7/avg=1.3)
>> 2014/04/24 09:51:18 [24573/1] STATS | Stage              | Wait | Curr | Done |     Total | ms/op |
>> 2014/04/24 09:51:18 [24573/1] STATS |  0: GET_FID        |    0 |    0 |    0 |         0 |  0.00 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  1: GET_INFO_DB    |  428 |    0 | 8924 |   6890996 |  2.58 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  2: GET_INFO_FS    |  305 |    7 |   15 |   6349673 | 20.29 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  3: REPORTING      |    0 |    0 |    0 |   6028488 |  0.00 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  4: PRE_APPLY      |    0 |    0 |    0 |   6322256 |  0.00 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  5: DB_APPLY       |  280 |    1 |   40 |   6321975 |  5.49 | 53.19% batched (avg batch size: 3.1)
>> 2014/04/24 09:51:18 [24573/1] STATS |  6: CHGLOG_CLR     |    0 |    0 |    0 |   6881424 |  0.01 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  7: RM_OLD_ENTRIES |    0 |    0 |    0 |         0 |  0.00 |
> 
> 
> 
> 
> --
> Adam Brenner
> Computer Science, Undergraduate Student
> Donald Bren School of Information and Computer Sciences
> 
> System Administrator, HPC Cluster
> Office of Information Technology
> http://hpc.oit.uci.edu/
> 
> University of California, Irvine
> www.ics.uci.edu/~aebrenne/
> [email protected]

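On the last_access index discussed in the quoted thread: a minimal sketch of how one might check whether such an index is actually picked up by a query. It uses Python's built-in sqlite3 module purely as a stand-in (the real robinhood database is MySQL, whose planner and EXPLAIN output differ), and the table layout, the index name idx_last_access, and the sample rows are all made up for illustration.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified stand-in for the ENTRIES table.
conn.execute(
    "CREATE TABLE ENTRIES (id TEXT PRIMARY KEY, owner TEXT, last_access INTEGER)"
)
conn.executemany(
    "INSERT INTO ENTRIES VALUES (?, ?, ?)",
    [(f"fid{i}", "someuser", i) for i in range(1000)],
)

# A purge-style lookup: entries not accessed since some cutoff.
sql = "SELECT id FROM ENTRIES WHERE last_access < 100"

plan_before = query_plan(conn, sql)   # no index on last_access yet
conn.execute("CREATE INDEX idx_last_access ON ENTRIES (last_access)")
plan_after = query_plan(conn, sql)    # planner can now use the new index

print("before:", plan_before)
print("after: ", plan_after)
```

The flip side, as noted in the quoted reply, is that every insert and every update of last_access then has to maintain the extra index as well, so the read-side gain is not free.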

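For a rough feel of what the ms/op column in the quoted stats implies, the figures can be turned into per-thread throughput ceilings. This is only a back-of-the-envelope sketch: it assumes ms/op is the average wall-clock time one thread spends per operation in that stage, which is my reading of the output and not something the log itself states. The helper name ops_per_second is made up.

```python
# Average ms per operation, copied from the quoted EntryProcessor stats.
MS_PER_OP = {
    "GET_INFO_DB": 2.58,
    "GET_INFO_FS": 20.29,
    "DB_APPLY": 5.49,
}

def ops_per_second(ms_per_op, threads=1):
    """Upper bound on throughput if each of `threads` workers needs
    ms_per_op milliseconds per operation (assumed interpretation)."""
    return threads * 1000.0 / ms_per_op

for stage, ms in MS_PER_OP.items():
    print(f"{stage}: about {ops_per_second(ms):.0f} ops/s per thread")
```

Passing threads=N scales the estimate linearly, which is optimistic, since the threads share the same database and filesystem.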