Thanks, both Tim and Helix.

Yes, I initially looked into the "-r" mode, but then realized that, as
Tim mentioned, our development instance doesn't necessarily create
proper handles.  Our development instance is more of a code-testing
ground, and we don't sync the content very frequently.  Also, the
date-related metadata isn't necessarily correct either, as the date of
accession into the development instance (for quality assurance) isn't
the accession date we'd want.

I think I'll have to rely on a two-step approach: first ingesting via
AIP to get the community/collection hierarchy and bitstreams, then
cleaning up the metadata of the resulting community to remove the "old"
URIs, accession dates, etc.
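
A minimal sketch of that second step, in Python, working on a CSV of
the ingested community from "Export Metadata" before it goes back
through the Bulk Metadata Editor.  Everything specific here is an
assumption rather than something confirmed in this thread: that
duplicated values appear in one cell joined by "||", that production
Handles use the 10568 prefix (guessed from "-p 10568/0"), and that the
newest date is the production one.  The function names are
hypothetical.

```python
import csv
import io

SEP = "||"             # assumed multi-value separator in the metadata CSV
PROD_PREFIX = "10568"  # assumed production Handle prefix (from "-p 10568/0")
DATE_FIELDS = ("dc.date.accessioned", "dc.date.available")


def clean_cell(field, cell):
    """Drop the "old" value from a cell that holds duplicated values."""
    values = [v for v in cell.split(SEP) if v]
    if len(values) < 2:
        return cell  # nothing duplicated, leave untouched
    if field == "dc.identifier.uri":
        # Keep only URIs under the production Handle prefix.
        kept = [v for v in values if f"/{PROD_PREFIX}/" in v]
        return SEP.join(kept) if kept else cell
    if field in DATE_FIELDS:
        # ISO 8601 UTC timestamps sort lexicographically, so max() is the newest.
        return max(values)
    return cell


def clean_csv(text):
    """Return the CSV text with duplicated URI/date values reduced to one."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Headers may carry a language suffix such as "dc.identifier.uri[en_US]".
        writer.writerow(
            {k: clean_cell(k.split("[")[0], v or "") for k, v in row.items()}
        )
    return out.getvalue()
```

If the production dates are not reliably the newest ones, the max()
rule would need replacing with something stricter, so it is worth
checking a few rows by hand before importing the result.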

Thanks for bouncing some ideas around!

Alan

On 01/16/2014 06:39 PM, Tim Donohue wrote:
> Hi Alan,
>
> On 1/16/2014 9:10 AM, Alan Orth wrote:
>> Hi,
>>
>> I've got a development instance where we uploaded a few hundred items
>> (in one community and several collections).  Our editors spent some time
>> manually uploading bitstreams to many of these items.  Now I want to
>> migrate the community and its hierarchy to the production instance.  We
>> can't use the CSV via "Export Metadata" because of the bitstreams, so
>> I've been looking at using AIP, i.e.:
>>
>> dspace packager -s -a -t AIP -e m...@us.org -p 10568/0 33474.zip
>>
>> This works great, but the resulting items now have two of each of the
>> following fields:
>>      - dc.date.accessioned
>>      - dc.date.available
>>      - dc.identifier.uri
>>
>> I can't figure out a workflow that doesn't produce this effect...
>
> These three fields are unfortunately auto-generated by DSpace whenever
> you treat an AIP as a submission information package (SIP), which is
> what the -s option does. Essentially, the '-s' option assumes this is new
> content, so DSpace defines these fields as:
>     * dc.date.accessioned - the date this new content was added to DSpace
>     * dc.date.available - the date this new content became available
> in DSpace (i.e. finished approval workflow)
>     * dc.identifier.uri - the assigned Handle for this object
>
> For your situation, you may need to consider some metadata related
> questions.
>
> * Does your development instance assign proper Handles?  If not, then
> you *need* Production to assign a new dc.identifier.uri.  This may
> mean that you'll have to unfortunately do some post-metadata cleanup
> (perhaps via the Bulk Metadata Editor) of the invalid "development"
> handles in the dc.identifier.uri fields. DSpace never overwrites or
> removes existing metadata.
>
> * Do you want the "date.accessioned" and "date.available" fields to be
> set to the dates the Item was added to *development* or to
> *production*? If the latter, again, you may unfortunately need to do
> some post-metadata cleanup, as DSpace specifically *never*
> removes/overwrites existing metadata fields.
>
>
> Depending on your setup/answers to these questions, there are three
> possible AIP import options I can see:
>
> 1a) Use the "Restore/Replace" option (-r) instead when migrating to
> Production.
>
> If you treat this as an AIP "restoration" then DSpace will skip
> creating "date.accessioned", "date.available" and "identifier.uri"
> fields and assume that the provided values in the AIPs are correct (as
> it assumes you are restoring a set of deleted objects).  WARNING: If
> the 'dc.identifier.uri' in the AIP does NOT correspond to a valid
> Handle, then you will end up with invalid Handles in Production! (See
> next option.)
>
> More on Restore/Replace:
> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-Restoring/ReplacingusingAIP(s)
>
>
> 1b) When using "Restore/Replace", you may want/need to override some
> of the default options. For example, restoration will always assume
> the 'dc.identifier.uri' is a valid Handle (so a new Handle will not be
> assigned). Restoration will also always attempt to restore an object
> under the *specified* parent object in the AIP -- so, this means if a
> Collection was under a Community with ID "123456789/1" in your
> development instance, then it will be restored under a Community of
> the *same ID* in Production.
>
> Luckily, these defaults can be overridden. See the 'ignoreHandle' and
> 'ignoreParent' Advanced options documented here:
>
> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-AdditionalPackagerOptions
>
>
> 2) The other option is to still use the Submission (-s) option, but use
> one or more of the Advanced options (in 1b) to tweak the defaults.
>
> I know this is a lot of info, but hopefully it gives you some ideas to
> go on.
>
> - Tim

-- 
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my telephone; my 
wish has come true because I can no longer figure out how to use my telephone." 
-Bjarne Stroustrup, inventor of C++
GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c


_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
