Hi,

Thanks for the responses. We discovered our problems by looking through the Apache logs. Unfortunately we are using the JSPUI rather than the XMLUI, so I guess we can't use sitemap.xmap. We are looking into setting up some restrictions in our robots.txt to see if that will help.
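As a rough illustration of the robots.txt idea, the sketch below uses Python's standard urllib.robotparser to check what a candidate rule set would allow before it goes live. The paths (/browse, /simple-search), the sample handle URL and the Crawl-delay value are assumptions for illustration only, not taken from any real configuration; adjust them to the URLs your own JSPUI actually serves. Note also that not every crawler honours Crawl-delay.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a JSPUI site. The disallowed paths and the
# crawl delay are illustrative assumptions, not a recommended configuration.
ROBOTS_TXT = """\
User-agent: gsa-crawler
Disallow: /browse
Disallow: /simple-search

User-agent: *
Crawl-delay: 10
Disallow: /browse
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Is gsa-crawler allowed to fetch a browse page? An item page?
print(parser.can_fetch("gsa-crawler", "/browse?type=author"))
print(parser.can_fetch("gsa-crawler", "/handle/123456789/1"))

# Crawl delay that applies to a bot without its own section.
print(parser.crawl_delay("SomeOtherBot"))
```

Checking the rules this way only shows how a well-behaved parser reads the file; a bot that ignores robots.txt would still need to be handled at the web server level.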
We also talked about restricting bots' bandwidth by using mod_bw. Does anybody have any experience with that?

Regards,
Ene Rammer Nielsen, Roskilde University Library.

-----Original message-----
From: Andrea Schweer [mailto:schw...@waikato.ac.nz]
Sent: 8 April 2013 07:35
To: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Bots and cpu

Hi,

On 07/04/13 02:14, Sims, Richard B wrote:
> our site's Google Search Appliance was intensely indexing the content served
> by this system. When that indexing completed, the load average went back to
> nil. Looking back in our system monitoring graphs, I saw that this load spike
> occurred every Tuesday morning - when the GSA was doing a full indexing run.

When we experienced some problems with GSA in one of "my" repositories, we managed to improve the situation quite significantly by adding gsa-crawler to the list of known bots in the main XMLUI sitemap.xmap. See here for some background:
http://www.mail-archive.com/dspace-tech@lists.sourceforge.net/msg19537.html

We're also disallowing Discovery and the browse indexes for gsa-crawler; it gets all relevant pages via the sitemap anyway.

Sorry about the short e-mail, I'm about to head home. I can give some more details about the sitemap.xmap changes if anyone is interested.

cheers,
Andrea

--
Dr Andrea Schweer
IRR Technical Specialist, ITS Information Systems
The University of Waikato, Hamilton, New Zealand

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette