Hi,

I've just decided I will export the metadata (CSV) and clean it up
manually, then re-import before I export via AIP.  This works great for
the dc.identifier.uri (handle link), but I just realized that
dc.date.accessioned and dc.date.available aren't in the exported metadata.

I assume these fields are in the database, so I'll have to use SQL to
clean them up after importing via AIP?  I'm not sure where to look in
the DB...
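
For the CSV cleanup step, a minimal Python sketch of dropping the "old" handle from dc.identifier.uri might look like the following. It rests on several assumptions that are not confirmed in this thread: that multiple values in a cell are joined with "||" (the default separator in DSpace metadata CSVs), that the column header is exactly dc.identifier.uri with no language suffix, and that production handles start with http://hdl.handle.net/10568/ (the prefix is taken from the packager command quoted below). The function names are illustrative only.

```python
import csv
import io

SEPARATOR = "||"  # assumed multi-value separator in the exported CSV


def keep_matching_values(cell, prefix):
    """Keep only the values in a multi-valued cell that start with prefix.

    If no value matches, the cell is returned unchanged so that an item
    never ends up with an empty identifier by accident.
    """
    values = [v for v in cell.split(SEPARATOR) if v]
    kept = [v for v in values if v.startswith(prefix)]
    return SEPARATOR.join(kept) if kept else cell


def clean_uri_column(csv_text, prefix, column="dc.identifier.uri"):
    """Return the CSV text with stale values removed from one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = keep_matching_values(row[column], prefix)
        writer.writerow(row)
    return out.getvalue()
```

Calling clean_uri_column(exported_text, "http://hdl.handle.net/10568/") would leave single-valued cells alone and trim cells that carry both a development and a production handle.  Reviewing the result before re-importing is still a good idea.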

Thanks,

Alan

On 01/17/2014 09:12 AM, Alan Orth wrote:
> Thanks, both Tim and Helix.
>
> Yes, I initially looked into the "-r" mode, but then realized that, as
> Tim mentioned, our development instance doesn't necessarily create
> proper handles.  Our development instance is more of a code-testing
> ground, and we don't sync the content very frequently.  Also, the
> date-related metadata isn't necessarily correct either, as the
> accession into the development instance (for quality assurance) isn't
> necessarily the accession date we'd want.
>
> I think I'll have to rely on a two-step approach: first ingesting via
> AIP to get the community/collection hierarchy and bitstreams, then
> cleaning up the metadata of the resulting community to remove the "old"
> URIs, accession dates, etc.
>
> Thanks for bouncing some ideas around!
>
> Alan
>
> On 01/16/2014 06:39 PM, Tim Donohue wrote:
>> Hi Alan,
>>
>> On 1/16/2014 9:10 AM, Alan Orth wrote:
>>> Hi,
>>>
>>> I've got a development instance where we uploaded a few hundred items
>>> (in one community and several collections).  Our editors spent some time
>>> manually uploading bitstreams to many of these items.  Now I want to
>>> migrate the community and its hierarchy to the production instance.  We
>>> can't use the CSV via "Export Metadata" because of the bitstreams, so
>>> I've been looking at using AIP, i.e.:
>>>
>>> dspace packager -s -a -t AIP -e m...@us.org -p 10568/0 33474.zip
>>>
>>> This works great, but the resulting items now have two of each of the
>>> following fields:
>>>      - dc.date.accessioned
>>>      - dc.date.available
>>>      - dc.identifier.uri
>>>
>>> I can't figure out a workflow that doesn't produce this effect...
>> These three fields are unfortunately auto-generated by DSpace whenever
>> you treat an AIP as a submission information package (SIP), which is
>> what the -s option does. Essentially, the '-s' option assumes this is
>> new content, so DSpace defines these fields as:
>>     * dc.date.accessioned - the date this new content was added to DSpace
>>     * dc.date.available - the date this new content became available
>> in DSpace (i.e. finished approval workflow)
>>     * dc.identifier.uri - the assigned Handle for this object
>>
>> For your situation, you may need to consider some metadata-related
>> questions.
>>
>> * Does your development instance assign proper Handles?  If not, then
>> you *need* Production to assign a new dc.identifier.uri.  This may
>> mean that you'll unfortunately have to do some post-import metadata
>> cleanup (perhaps via the Bulk Metadata Editor) of the invalid
>> "development" handles in the dc.identifier.uri fields. DSpace never
>> overwrites or removes existing metadata.
>>
>> * Do you want the "date.accessioned" and "date.available" fields to be
>> set to the dates the Item was added to *development* or to
>> *production*? If the latter, again, you may unfortunately need to do
>> some post-import metadata cleanup, as DSpace specifically *never*
>> removes/overwrites existing metadata fields.
>>
>>
>> Depending on your setup and your answers to these questions, there
>> are three possible AIP import options I can see:
>>
>> 1a) Use the "Restore/Replace" option (-r) instead when migrating to
>> Production.
>>
>> If you treat this as an AIP "restoration" then DSpace will skip
>> creating "date.accessioned", "date.available" and "identifier.uri"
>> fields and assume that the provided values in the AIPs are correct (as
>> it assumes you are restoring a set of deleted objects).  WARNING: If
>> the 'dc.identifier.uri' in the AIP does NOT correspond to a valid
>> Handle, then you will end up with invalid Handles in Production! (See
>> next option.)
>>
>> More on Restore/Replace:
>> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-Restoring/ReplacingusingAIP(s)
>>
>>
>> 1b) When using "Restore/Replace", you may want/need to override some
>> of the default options. For example, restoration will always assume
>> the 'dc.identifier.uri' is a valid Handle (so a new Handle will not be
>> assigned). Restoration will also always attempt to restore an object
>> under the *specified* parent object in the AIP -- so, this means if a
>> Collection was under a Community with ID "123456789/1" in your
>> development instance, then it will be restored under a Community of
>> the *same ID* in Production.
>>
>> Luckily, these defaults can be overridden. See the 'ignoreHandle' and
>> 'ignoreParent' Advanced options documented here:
>>
>> https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore#AIPBackupandRestore-AdditionalPackagerOptions
>>
>>
>> 2) The other option is to still use the Submission (-s) option, but
>> use one or more of the Advanced options (in 1b) to tweak the defaults.
>>
>> I know this is a lot of info, but hopefully it gives you some ideas to
>> go on.
>>
>> - Tim
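
To compare the import variants Tim lists, a small Python helper that assembles the packager argument list can make the differences easier to see. This is only a sketch: the -s, -r, -a, -t, -e and -p flags come from the thread, but passing 'ignoreHandle' and 'ignoreParent' as "-o name=value" is an assumption based on the linked AIP documentation rather than anything confirmed here, and the e-mail address and function name are placeholders.

```python
def packager_command(mode, eperson, package, parent=None, options=None):
    """Build the argument list for a recursive AIP import.

    mode    -- "-s" (submit as new content) or "-r" (restore/replace)
    eperson -- e-mail address of the DSpace user running the import
    package -- path to the AIP zip file
    parent  -- optional Handle of the parent object (passed with -p)
    options -- optional dict of advanced options, emitted as -o name=value
               (assumed syntax, see the linked documentation)
    """
    if mode not in ("-s", "-r"):
        raise ValueError("mode must be '-s' or '-r'")
    cmd = ["dspace", "packager", mode, "-a", "-t", "AIP", "-e", eperson]
    if parent:
        cmd += ["-p", parent]
    for name, value in (options or {}).items():
        cmd += ["-o", f"{name}={value}"]
    cmd.append(package)
    return cmd


# Option 1b: restore mode, but ignore the Handle and parent stored in the AIP.
restore_cmd = packager_command(
    "-r", "admin@example.org", "33474.zip",
    parent="10568/0",
    options={"ignoreHandle": "true", "ignoreParent": "true"},
)
```

Joining the list with shlex.join(restore_cmd) gives a command line that can be reviewed before anything is run against Production.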

-- 
Alan Orth
alan.o...@gmail.com
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my telephone; my 
wish has come true because I can no longer figure out how to use my telephone." 
-Bjarne Stroustrup, inventor of C++
GPG Public Key: 0xf92c4bd91084bb5de14e20be9470dd588dd1026c



_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
