We are in the process of "moving" a Collection of documents out of our main 
DSpace repository and into a separate DSpace instance.  Our reasons for 
doing this are as follows:

 1.  We currently have 139,000 documents in this Collection and will eventually 
have approximately 900,000.  Those 139,000 documents currently make up 56% of 
our entire repository.
 2.  This particular Collection is being added to DSpace as part of a mass 
digitization effort; however, it does not contain any metadata and is not nearly 
as useful to our researchers as the remainder of the repository.
 3.  The processing times for filter-media, browse/search index builds and 
updates, etc. are significantly increased by the large number of documents 
in this Collection, and online performance is negatively affected.
 4.  The document ingest process is slowly degrading in direct proportion to 
the number of documents in our repository.
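
For a sense of scale, the figures in point 1 can be worked through as a rough back-of-the-envelope sketch.  It assumes the rest of the repository stays at its current size, which is an illustrative assumption and not something stated above.

```python
# Back-of-the-envelope estimate using the figures from point 1.
collection_now = 139_000      # documents in the Collection today
share_now = 0.56              # the Collection's share of the whole repository
collection_final = 900_000    # expected eventual size of the Collection

total_now = collection_now / share_now   # implied size of the whole repository
rest = total_now - collection_now        # everything outside the Collection

# ASSUMPTION (not from the original message): the rest of the
# repository does not grow while the Collection does.
share_final = collection_final / (collection_final + rest)

print(f"repository today: ~{total_now:,.0f} documents")
print(f"outside the Collection: ~{rest:,.0f} documents")
print(f"projected share of the Collection: {share_final:.0%}")
```

Under that assumption the Collection would come to dominate the repository even more heavily than it does today.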

I just came across the documentation on "Registration" in DSpace 1.5.1, as an 
alternative to a physical export/import from one database/instance to a 
separate one, and I'm thinking this would be a much easier and quicker way to 
migrate these documents from our main repository to the new one we've created 
just to house this one Collection.  My understanding is that you are basically 
exporting only the metadata from one instance to the other: the documents stay 
in their original location in /<dspace>/assetstore(n), and the new Item in the 
new repository contains a pointer (in the bitstream table...?) to the original 
location of the document.
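
As a minimal sketch of what registration looks like in practice: as far as I recall from the DSpace documentation, registration goes through the item importer, and each registered file is listed in the item's `contents` file as `-r -s n -f filepath`, where `n` is the asset store number and `filepath` is relative to that asset store.  The helper name and the example path below are hypothetical, and the line format should be checked against the 1.5.1 documentation before relying on it.

```python
def registration_contents_line(store_number: int, relative_path: str) -> str:
    """Build one `contents` line that registers a file in place.

    ASSUMPTION: the `-r -s n -f filepath` layout is taken from memory of the
    DSpace registration documentation; verify it for your version.
    """
    if store_number < 0:
        raise ValueError("asset store number must be non-negative")
    if relative_path.startswith("/"):
        raise ValueError("path should be relative to the asset store directory")
    return f"-r -s {store_number} -f {relative_path}"


# Hypothetical file sitting under asset store 0.
line = registration_contents_line(0, "scans/report-0001.pdf")
print(line)
```

One such line per file would go into the `contents` file of each item directory, next to the exported metadata.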

Is my understanding correct?  If so, given that we are leaving the original 
assetstore directories as they are, but moving the metadata to a separate 
database (in the same PostgreSQL environment on the same server) and deploying 
our web application in a separate directory on the same machine, are we 
really going to see a significant improvement in processing time and 
performance in the original repository?  I think we will, but I would like to 
hear from anyone who has had a similar situation or has an idea of what we can 
expect.

Also, I'm trying to figure out exactly how filter-media, the browse/search 
index builds and updates, etc. are going to work if we end up using the 
"Registration" technique.  Will we end up with a single /<dspace>/search 
directory with combined indexes, or will we have two separate ones?  How will 
our searching and browsing be impacted?

Any thoughts or recommendations are welcome!
Thanks in advance,
Sue


Sue Walker-Thornton
ConITS Contract
NASA Langley Research Center
Integrated Library Systems Application & Database Administrator
130 Research Drive
Hampton, VA  23666
Office: (757) 224-4074
Fax:    (757) 224-4001
Pager: (757) 988-2547
Email:  susan.m.thorn...@nasa.gov

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech