In this case, I'm not in the business of deleting these files myself. I've got experience purging files like this, but for this task, I have to let robinhood do its thing.
I think the DB delays you're seeing are related to the change log reader inserting and updating new records. When I look at the process list on the MySQL server, none of the queries are related to the purge module; they just look like the standard insert/update queries from the change log reader.

-peter

On Apr 24, 2014, at 13:17, Adam Brenner <[email protected]> wrote:

> On Thu, Apr 24, 2014 at 7:42 AM, Doherty, Peter Charles
> <[email protected]> wrote:
>> What's the main limiting factor for purge speed?
>>
>> I've got one problem user who has millions of small files.
>
> Only one? Buy some lotto tickets, have some real good luck ;-p
>
>> Robinhood has been diligently working away at purging the files, but it's
>> presently going at about 6 deletes per second, which strikes me as pretty
>> slow.
>> Adding an index to the last_access column in the ENTRIES table seemed to help
>> boost the DB query. (If this seems like good practice, it might be worth
>> mentioning in the documentation.)
>
> It depends. By adding another index, inserts have an extra penalty.
> Depending on how the index was created (as an additional row to a
> current index or as an entirely new key), the costs vary. Ideally it
> would be an additional row to a current index, so it's within the same
> lookup.
>
> I am not sure how RBH actually performs file deletions, but as a
> general FYI, use rsync. A great write-up _used to_ exist; the
> Wayback Machine was able to retrieve it:
>
> https://web.archive.org/web/20130929001850/http://linuxnote.net/jianingy/en/linux/a-fast-way-to-remove-huge-number-of-files.html
>
> As a note for Thomas and the other RBH devs: this may be a faster way
> to purge files: https://gist.github.com/jzwinck/5692534
>
>> Does robinhood batch the unlink operations? Is there anything else I'm
>> missing that would explain why the purge is crawling along?
>
> From the output, it appears your bottleneck is not the actual file
> deletion operation but rather the database: the GET_INFO_DB (select
> statements) along with DB_APPLY.
>
> I suggest you run
>
>   wget mysqltuner.pl
>   perl mysqltuner.pl
>
> and see if you can improve your database performance on the
> GET_INFO_DB operation. You may also want to increase the number of
> threads to perform this action (only 7, with none of them idle).
>
>> 2014/04/24 09:51:18 [24573/1] STATS | ==== EntryProcessor Pipeline Stats ===
>> 2014/04/24 09:51:18 [24573/1] STATS | Idle threads: 0
>> 2014/04/24 09:51:18 [24573/1] STATS | Id constraints count: 10000 (hash min=0/max=7/avg=1.3)
>> 2014/04/24 09:51:18 [24573/1] STATS | Stage              | Wait | Curr | Done |   Total | ms/op |
>> 2014/04/24 09:51:18 [24573/1] STATS |  0: GET_FID        |    0 |    0 |    0 |       0 |  0.00 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  1: GET_INFO_DB    |  428 |    0 | 8924 | 6890996 |  2.58 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  2: GET_INFO_FS    |  305 |    7 |   15 | 6349673 | 20.29 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  3: REPORTING      |    0 |    0 |    0 | 6028488 |  0.00 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  4: PRE_APPLY      |    0 |    0 |    0 | 6322256 |  0.00 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  5: DB_APPLY       |  280 |    1 |   40 | 6321975 |  5.49 | 53.19% batched (avg batch size: 3.1)
>> 2014/04/24 09:51:18 [24573/1] STATS |  6: CHGLOG_CLR     |    0 |    0 |    0 | 6881424 |  0.01 |
>> 2014/04/24 09:51:18 [24573/1] STATS |  7: RM_OLD_ENTRIES |    0 |    0 |    0 |       0 |  0.00 |
>
> --
> Adam Brenner
> Computer Science, Undergraduate Student
> Donald Bren School of Information and Computer Sciences
>
> System Administrator, HPC Cluster
> Office of Information Technology
> http://hpc.oit.uci.edu/
>
> University of California, Irvine
> www.ics.uci.edu/~aebrenne/
> [email protected]
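
For anyone who wants to compare stages in the EntryProcessor pipeline stats quoted in this thread without eyeballing the table, here is a minimal Python sketch. It assumes each stage row has been unwrapped onto a single log line (the mail quoting split them), and the helper names `parse_stages` and `rank_by_backlog` are made up for illustration; they are not part of robinhood. Ranking by the Wait column is only a rough indicator of where entries are queuing, not a definitive diagnosis.

```python
import re

# One stage row, e.g. "| 1: GET_INFO_DB | 428 | 0 | 8924 | 6890996 | 2.58 |".
# The column order (Wait, Curr, Done, Total, ms/op) is taken from the log header.
ROW = re.compile(
    r"\|\s*(\d+):\s*(\w+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|"
    r"\s*(\d+)\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|"
)


def parse_stages(log_text):
    """Return one dict per pipeline stage row found in the log text."""
    stages = []
    for line in log_text.splitlines():
        match = ROW.search(line)
        if match:
            stages.append({
                "index": int(match.group(1)),
                "name": match.group(2),
                "wait": int(match.group(3)),
                "curr": int(match.group(4)),
                "done": int(match.group(5)),
                "total": int(match.group(6)),
                "ms_per_op": float(match.group(7)),
            })
    return stages


def rank_by_backlog(stages):
    """Sort stages so the one with the most waiting entries comes first."""
    return sorted(stages, key=lambda s: s["wait"], reverse=True)


# Stage rows copied from the stats in this thread, unwrapped by hand.
SAMPLE = """\
2014/04/24 09:51:18 [24573/1] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
2014/04/24 09:51:18 [24573/1] STATS | 1: GET_INFO_DB | 428 | 0 | 8924 | 6890996 | 2.58 |
2014/04/24 09:51:18 [24573/1] STATS | 2: GET_INFO_FS | 305 | 7 | 15 | 6349673 | 20.29 |
2014/04/24 09:51:18 [24573/1] STATS | 3: REPORTING | 0 | 0 | 0 | 6028488 | 0.00 |
2014/04/24 09:51:18 [24573/1] STATS | 4: PRE_APPLY | 0 | 0 | 0 | 6322256 | 0.00 |
2014/04/24 09:51:18 [24573/1] STATS | 5: DB_APPLY | 280 | 1 | 40 | 6321975 | 5.49 | 53.19% batched (avg batch size: 3.1)
2014/04/24 09:51:18 [24573/1] STATS | 6: CHGLOG_CLR | 0 | 0 | 0 | 6881424 | 0.01 |
2014/04/24 09:51:18 [24573/1] STATS | 7: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
"""

if __name__ == "__main__":
    for stage in rank_by_backlog(parse_stages(SAMPLE)):
        if stage["wait"] > 0:
            print(f'{stage["name"]:<15} wait={stage["wait"]:>4} '
                  f'ms/op={stage["ms_per_op"]:.2f}')
```

In this sample the three stages with a non-empty queue are GET_INFO_DB, GET_INFO_FS and DB_APPLY. Note that GET_INFO_FS has the highest ms/op of the three, so the Wait column and the per-operation cost are worth reading together.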
