I recommend looking at the old AIP prototype (see
http://wiki.dspace.org/index.php/AipPrototype
for docs and fossilized code) as a starting point.  It extended the
METS implementation to cover Communities, Collections, and every
aspect of Items, including most administrative metadata.  Although it
stopped short of fully representing the EPerson, Group, and Policy
objects, those would be straightforward to add.  It does show some of
the issues involved in building a copy of an archive from scratch.

Migrating and mirroring content between repositories was one of the
use cases for AIPs.  *Every* existing interchange mechanism (batch
import, packager, etc.) loses *some* details of the Item and its child
objects.  Only the AIP was complete, and IIRC even it had a little bug
or two (e.g. bitstream sequence IDs weren't always restored perfectly).
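
One way to see what a given interchange mechanism drops is to compare a
simple per-item summary taken before export with one taken after
re-import.  The sketch below is only an illustration: the dictionary
layout (`metadata`, `bitstreams`, `checksum`, `sequence_id`) and the
function name `diff_items` are made up for the example and are not
DSpace APIs.  You would have to produce the summaries yourself from
each instance.

```python
def diff_items(original, restored):
    """Return a list of differences between two item summaries.

    Each summary is a hypothetical dict of the form
    {"metadata": {field: [values]}, "bitstreams": [{"name", "checksum", "sequence_id"}]}.
    """
    problems = []

    # Compare metadata field by field, covering fields present on either side.
    fields = set(original["metadata"]) | set(restored["metadata"])
    for field in sorted(fields):
        before = original["metadata"].get(field)
        after = restored["metadata"].get(field)
        if before != after:
            problems.append(f"metadata {field}: {before!r} -> {after!r}")

    # Match bitstreams by checksum, since internal IDs differ between instances.
    before_bits = {b["checksum"]: b for b in original["bitstreams"]}
    after_bits = {b["checksum"]: b for b in restored["bitstreams"]}
    for checksum in sorted(before_bits):
        old = before_bits[checksum]
        new = after_bits.get(checksum)
        if new is None:
            problems.append(f"bitstream {old['name']}: missing after restore")
        elif old["sequence_id"] != new["sequence_id"]:
            problems.append(
                f"bitstream {old['name']}: sequence_id "
                f"{old['sequence_id']} -> {new['sequence_id']}"
            )
    for checksum in sorted(set(after_bits) - set(before_bits)):
        problems.append(f"bitstream {after_bits[checksum]['name']}: unexpected after restore")

    return problems


# Made-up sample data: the second bitstream comes back with a new sequence ID.
original = {
    "metadata": {"dc.title": ["Thesis"]},
    "bitstreams": [
        {"name": "a.pdf", "checksum": "aaa", "sequence_id": 1},
        {"name": "b.pdf", "checksum": "bbb", "sequence_id": 2},
    ],
}
restored = {
    "metadata": {"dc.title": ["Thesis"]},
    "bitstreams": [
        {"name": "a.pdf", "checksum": "aaa", "sequence_id": 1},
        {"name": "b.pdf", "checksum": "bbb", "sequence_id": 3},
    ],
}

for line in diff_items(original, restored):
    print(line)
```

An empty result means the two summaries agree on everything the
summary records; it says nothing about details the summary leaves out.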

good luck!

  -- Larry

On Jun 17, 2009, at 4:56 PM, Mark H. Wood wrote:

> We're working with a partner who wants to keep a separate test instance
> with content tracking the (sizable) live repository fairly closely.
> The requirement that I've been given is to entirely replace the
> content from live every week or two.
>
> Deleting 17,000 items (and over 20,000 bitstreams) is an all-day
> operation, and then comes the loading phase.  It would save a lot of
> time if I could export the Community/Collection structure, EPerson and
> Group objects, registries, and anything else that's not an Item,
> Bundle, or Bitstream; drop and recreate the database; empty the
> assetstore and history; reload the noncontent tables; and then begin
> loading.
>
> So I'm looking at adding export/import for all of those objects,
> probably to XML.  In the case of Community and Collection I guess the
> best thing would be to just do a single exporter producing the same
> XML dialect consumed by the existing Community and Collection
> Structure Importer.  Likewise for the registries, it seems.  The other
> classes would need importers built as well as exporters.  Comments?
>
> Or is there a smarter way to make a consistent clone of a DSpace
> instance, with its own Handles, that is writable but doesn't affect
> the original?  (The Handle business, plus the need to quiesce the
> production site to ensure consistency across database and assetstore,
> is why I don't just use tar and pg_dump.)
>
> -- 
> Mark H. Wood, Lead System Programmer   mw...@iupui.edu
> Friends don't let friends publish revisable-form documents.
> ------------------------------------------------------------------------------
> Crystal Reports - New Free Runtime and 30 Day Trial
> Check out the new simplified licensing option that enables unlimited
> royalty-free distribution of the report engine for externally facing
> server and web deployment.
> http://p.sf.net/sfu/businessobjects
> _______________________________________________
> DSpace-tech mailing list
> DSpace-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-tech

