On 16 Apr 2009, at 16:44, Mark H. Wood wrote: > On Tue, Apr 14, 2009 at 04:46:59PM +0200, Ruijgrok, P.T. (Peter) > wrote: >> Problem 2: >> --------------- >> checker -d 10h processes approximately 250 bitstreams / hour >> >> checker -d 1874/12233 (collection handle) processes approximately >> 20.000 >> bitstreams / hour >> >> So running the checker every night for eg 10 hours is not >> possible..... >> >> >> Cause ( I think): >> ---------------------- >> select bitstream_id >> from most_recent_checksum where to_be_processed = true >> order by date_trunc('milliseconds', >> last_process_end_date), >> bitstream_id ASC LIMIT 1; >> >> This statement requests the next bitstream to check and it takes +10 >> seconds to find the next bitstream on our 320.000+ dspace. > > I don't *know* what is causing this, and I'm not yet a DBMS > performance expert, but my eye is attracted to the ORDER BY clause. > The monitor is having to build a temporary table of some probably very > sizable subset of 320.000 records, and sort it. There is no index on > last_process_end_date. That may not be significant, but I suppose > that the query planner might be able to eliminate the sorting if that > column were indexed. It might be able to make other optimizations as > well, but I'm just guessing.
My first suggestion would be to take out that call to date_trunc(). Even if last_process_end_date is indexed, the results of truncating it won't be. It's unnecessary and probably a significant performance hit. So, time select bitstream_id from most_recent_checksum where to_be_processed = true order by date_trunc('milliseconds', last_process_end_date), bitstream_id ASC LIMIT 1; against select bitstream_id from most_recent_checksum where to_be_processed = true order by last_process_end_date, bitstream_id ASC LIMIT 1; and see what the difference is. -- Simon Brown <st...@cam.ac.uk> - Cambridge University Computing Service +44 1223 3 34714 - New Museums Site, Pembroke Street, Cambridge CB2 3QH ------------------------------------------------------------------------------ Stay on top of everything new and different, both inside and around Java (TM) technology - register by April 22, and save $200 on the JavaOne (SM) conference, June 2-5, 2009, San Francisco. 300 plus technical and hands-on sessions. Register today. Use priority code J9JMT32. http://p.sf.net/sfu/p _______________________________________________ DSpace-tech mailing list DSpace-tech@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspace-tech