Hi,

Here at the University of Manchester (UK) we are currently assessing requirements for setting up and supporting an institutional repository. We have some questions regarding hardware architecture and scale-up.
Our plans are ambitious. We intend the repository to include all University research outputs and anticipate it will host a backlog of fewer than 200,000 records, with 10,000 - 20,000 records added annually. We expect a reasonable fraction of these to include fulltext. We expect our total number of registered contributors to be around 9,000 (staff and students). However, the number of users submitting content concurrently will be low, i.e. fewer than 20. In contrast, we expect data reads to be high (many thousands per day), as we plan to deliver metadata dynamically to a range of websites, including staff personal websites, research centre websites, faculty websites, the main University website and external services (e.g. metadata harvesters).

We believe a multi-server, n-tiered hardware architecture that can scale up over time is wise. We'd appreciate hearing about any experiences people have had with such an architecture. Specifically:

1) How are people dealing with scale-up issues, and is anyone supporting multi-server, n-tiered architectures?
2) Are there any known problems with separating the application, database and filestore layers when installing/configuring/running DSpace?
3) Has anyone run the DSpace application layer in a clustered/load-balanced multi-server environment? What were your experiences/problems?
4) Does anyone run the database server for DSpace using a multi-server architecture? What were your experiences/problems?
5) Does DSpace support large file stores, i.e. more than 1 terabyte across multiple partitions, with replication across dual-sited servers (possibly using a SAN)? What were your experiences/problems?
6) Is anyone separating database, fulltext, server logs and fulltext file storage space, and if so, how?
7) How much storage space are you using for your database, fulltext, search indexes and server logs per 10,000 records?
8) Has anyone published any stress-testing results for large repositories using DSpace?
9) What backup mechanisms are people using for database, fulltext file store, index and server logs?

Any comments on any of these would be great. We are keen to share our own experiences with the community once we've had some. Also we'd very much appreciate an opportunity to have a site visit to discuss experiences if convenient (unfortunately only within the UK).

I'm posting this message to a number of tech lists so please excuse any cross-postings.

Thanks in advance,
Phil

***********************************
Dr PR Butler
Institutional Repository Manager
Red 1.3
The John Rylands University Library,
The University of Manchester,
Oxford Road, Manchester M13 9PP, UK
Tel: +44 (0)161 275 1514 (internal x51514)
Email: [EMAIL PROTECTED]
Web: http://www.manchester.ac.uk/institutionalrepositoryproject
***********************************