Hi,

Here at the University of Manchester (UK) we are currently assessing 
requirements for setting up and supporting an institutional repository. We have 
some questions regarding hardware architecture and scale-up. 

Our plans are ambitious. We intend the repository to include all University 
research outputs and anticipate it will host a backlog of <200,000 records with 
10,000 - 20,000 records added annually. We expect a reasonable fraction of 
these to include fulltext.

We expect our total number of registered contributors to be around 9,000 (staff 
and students). However, the number of users submitting content concurrently will 
be low, i.e. <20. In contrast, we expect data reads to be high (many thousands 
per day), as we plan to deliver metadata dynamically to a range of websites, 
including staff personal websites, research centre websites, faculty websites, 
the main University website and external services (e.g. metadata harvesters). 

We believe a multi-server, n-tiered hardware architecture that can scale up 
over time is wise. We'd appreciate hearing about any experiences people have 
had with such an architecture. Specifically:

1) How are people dealing with scale-up issues and is anyone supporting 
multi-server, n-tiered architectures?
2) Are there any known problems with separating the application, database and 
filestore layers when installing/configuring/running DSpace?
3) Has anyone run the DSpace application layer in a clustered/load-balanced 
multi-server environment? What were your experiences/problems?
4) Does anyone run the database server for DSpace using a multi-server 
architecture? What were your experiences/problems?
5) Does DSpace support large file stores, i.e. >1 terabyte, across multiple 
partitions with replication across dual-sited servers (possibly using a SAN)? 
What were your experiences/problems?
6) Is anyone separating database, server logs and fulltext file storage space, 
and if so, how?
7) How much storage space are you using for your database, fulltext, search 
indexes and server logs per 10,000 records? 
8) Has anyone published any stress-testing results for large repositories using 
DSpace?
9) What backup mechanisms are people using for database, fulltext file store, 
index and server logs?

Any comments on any of these would be great.

We are keen to share our own experiences with the community once we've had some.

We'd also very much appreciate an opportunity to arrange a site visit to 
discuss experiences, if convenient (unfortunately only within the UK). 

I'm posting this message to a number of tech lists so please excuse any 
cross-postings.

Thanks in advance, Phil

***********************************
Dr PR Butler
Institutional Repository Manager
Red 1.3 The John Rylands University Library, 
The University of Manchester, 
Oxford Road, 
Manchester M13 9PP, UK
Tel: +44 (0)161 275 1514 (internal x51514)
Email: [EMAIL PROTECTED]
Web: http://www.manchester.ac.uk/institutionalrepositoryproject
***********************************


_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech