Hi all,

I'm currently developing a standalone stats package to analyse DSpace logs and 
generate usage reports based on some custom requirements (similar to the 
built-in stats, but with per-author view/download statistics, searching for any 
item/handle to display its views/downloads, etc.).

Access to a few tables, such as handle, metadatavalue, item2bundle, 
bundle2bitstream and bitstream, is needed to resolve handles to IDs, generate 
reports based on author names, print metadata values for items, and so on.
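
To make that concrete, here is a rough sketch of the kind of lookups I mean. It 
runs against a toy in-memory SQLite copy rather than the real database, and the 
column names, the resource type value of 2 for items, the field id 3 for 
"author" and the sample rows are all my assumptions for illustration, not a 
verified copy of the DSpace schema.

```python
import sqlite3

# Toy stand-in for the tables named above. Column names are an assumed
# subset for illustration, not a verified copy of the real DSpace schema.
SCHEMA = """
CREATE TABLE handle (handle TEXT, resource_type_id INTEGER, resource_id INTEGER);
CREATE TABLE item2bundle (item_id INTEGER, bundle_id INTEGER);
CREATE TABLE bundle2bitstream (bundle_id INTEGER, bitstream_id INTEGER);
CREATE TABLE bitstream (bitstream_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE metadatavalue (item_id INTEGER, metadata_field_id INTEGER, text_value TEXT);
"""

ITEM_TYPE = 2  # assumed resource_type_id meaning "item"


def item_id_for_handle(conn, handle):
    """Resolve a handle string to an item ID, or None if it is unknown."""
    row = conn.execute(
        "SELECT resource_id FROM handle"
        " WHERE handle = ? AND resource_type_id = ?",
        (handle, ITEM_TYPE),
    ).fetchone()
    return row[0] if row else None


def bitstreams_for_item(conn, item_id):
    """List (bitstream_id, name) pairs reachable from an item via its bundles."""
    return conn.execute(
        "SELECT b.bitstream_id, b.name"
        " FROM item2bundle i2b"
        " JOIN bundle2bitstream b2b ON b2b.bundle_id = i2b.bundle_id"
        " JOIN bitstream b ON b.bitstream_id = b2b.bitstream_id"
        " WHERE i2b.item_id = ?"
        " ORDER BY b.bitstream_id",
        (item_id,),
    ).fetchall()


def items_for_author(conn, author, author_field_id):
    """Find item IDs whose metadata value for the given field matches an author."""
    rows = conn.execute(
        "SELECT DISTINCT item_id FROM metadatavalue"
        " WHERE metadata_field_id = ? AND text_value = ?"
        " ORDER BY item_id",
        (author_field_id, author),
    ).fetchall()
    return [r[0] for r in rows]


def demo_connection():
    """Build an in-memory database holding one made-up item."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO handle VALUES ('123456789/1', 2, 42)")
    conn.execute("INSERT INTO item2bundle VALUES (42, 5)")
    conn.execute("INSERT INTO bundle2bitstream VALUES (5, 7)")
    conn.execute("INSERT INTO bitstream VALUES (7, 'thesis.pdf')")
    conn.execute("INSERT INTO metadatavalue VALUES (42, 3, 'Example, Author')")
    return conn
```

The per-item download counts from the logs would then be keyed on the bitstream 
IDs, and the per-author reports on the item IDs.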

At the moment, I'm too paranoid to let my stats app query DSpace's DB directly, 
so I'm just regularly dumping tables from DSpace and loading them into a 
separate database. However, I'm aware that DSpace protects itself from running 
out of resources with a managed connection pool, and that I'm only doing read 
operations (no UPDATEs or INSERTs; those would only ever happen on my separate 
stats DB).
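
For reference, the dump step amounts to something like the sketch below, which 
just assembles a pg_dump command line for the tables listed earlier. The 
database name "dspace" and the output file name are placeholders, and this is a 
simplified illustration rather than my exact script.

```python
import shlex

# Tables the stats app needs, as listed earlier in this message.
STATS_TABLES = ["handle", "metadatavalue", "item2bundle",
                "bundle2bitstream", "bitstream"]


def build_dump_command(dbname, tables, outfile):
    """Return a pg_dump argument list that dumps only the given tables."""
    cmd = ["pg_dump", "--no-owner", "--file", outfile]
    for table in tables:
        cmd += ["--table", table]
    cmd.append(dbname)
    return cmd


if __name__ == "__main__":
    # "dspace" and the file name are placeholders, not my real settings.
    argv = build_dump_command("dspace", STATS_TABLES, "dspace_stats_tables.sql")
    print(shlex.join(argv))
```

The resulting file is then loaded into the stats database, e.g. with psql.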

Can anyone either (a) confirm that my paranoia is justified, or (b) point out 
some safe ways of querying the DSpace DB from a standalone app? For example, 
should I increase max_connections in PostgreSQL to $dspacepool + $statspool + 
a bit of overhead?
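
Spelled out, the sizing I have in mind is just the sum below. The pool sizes 
and headroom in the example are made-up numbers; the "reserved" default of 3 
matches PostgreSQL's default for superuser_reserved_connections, but whether 
this is the right way to budget connections is exactly what I'm asking.

```python
def required_max_connections(dspace_pool, stats_pool, reserved=3, headroom=5):
    """Smallest max_connections covering both pools plus some overhead.

    reserved mirrors PostgreSQL's superuser_reserved_connections (default 3);
    headroom is an arbitrary allowance for ad-hoc psql sessions, backups, etc.
    """
    for name, value in (("dspace_pool", dspace_pool),
                        ("stats_pool", stats_pool),
                        ("reserved", reserved),
                        ("headroom", headroom)):
        if value < 0:
            raise ValueError(f"{name} must not be negative")
    return dspace_pool + stats_pool + reserved + headroom


def pools_fit(max_connections, dspace_pool, stats_pool, reserved=3, headroom=5):
    """True if the configured limit is at least the computed requirement."""
    needed = required_max_connections(dspace_pool, stats_pool, reserved, headroom)
    return max_connections >= needed


if __name__ == "__main__":
    # Hypothetical numbers: a 30-connection DSpace pool and a 5-connection
    # stats pool.
    print(required_max_connections(30, 5))
```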

At the moment, dumping to a separate DB works fine, but it probably doesn't 
scale very well and just seems like an ugly -- albeit 'safe' -- hack.

Any suggestions or pointers are appreciated.

Cheers,

Kim.

--
Kim Shepherd
IRR Technical Specialist
ITS Systems & Development
The University of Waikato
DDI +64 7 838 4025


_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
