I developed a set of four related command line apps for DSpace:
1) a lister / report generator
2) a policy tool, that adds / removes policies
3) a metadata tool, that adds / removes specific metadata values
4) a bitstream replacer tool
All except 4 work on a set of DSpace objects specified by two command line
arguments:
--root ROOT - where ROOT is a handle or object type followed by an ID
--type TYPE - where TYPE is one of collection, item, bundle, or
bitstream
--root COLLECTION.10 --type BITSTREAM
means work on all bitstreams in collection with ID 10
--root handle/12345 --type ITEM
means work on all items contained in the object designated by handle
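As an illustration only, here is a minimal Python sketch of how a ROOT value of either form could be told apart. It is not the code from the pull request: the helper name parse_root, the Root tuple, and the exact set of accepted object types are assumptions made for this example.

```python
from typing import NamedTuple

# Assumed set of object types; the real tools may accept a different set.
OBJECT_TYPES = {"COMMUNITY", "COLLECTION", "ITEM", "BUNDLE", "BITSTREAM"}


class Root(NamedTuple):
    kind: str  # "HANDLE" or one of OBJECT_TYPES
    ref: str   # the handle string, or the numeric ID as text


def parse_root(value: str) -> Root:
    """Classify a --root value as a handle or a TYPE.ID pair (hypothetical helper)."""
    if "/" in value:
        # Values with a slash are treated as handles, e.g. "handle/12345".
        return Root("HANDLE", value)
    obj_type, sep, obj_id = value.partition(".")
    if not sep or obj_type.upper() not in OBJECT_TYPES or not obj_id.isdigit():
        raise ValueError(f"not a handle or TYPE.ID: {value!r}")
    return Root(obj_type.upper(), obj_id)


print(parse_root("COLLECTION.10"))
print(parse_root("handle/12345"))
```

Treating anything with a slash as a handle is a simplification chosen for the sketch; the actual tools may validate handles more strictly.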
There is an additional argument, --doWorkFlowItems, that restricts sets to
items in workflows and by extension to bundles or bitstreams in items in
workflows.
The lister generates tsv or txt formatted output, printing properties of the
selected set of DSpace objects. Its --include option determines which
properties are printed. You can choose to print IDs and handles, as well as
policy information, or specify select item metadata fields. You can include an
item's 'withdrawn' status or a bundle's embargo state. Bitstream reports may
print mimeType, checksum, ... When printing DSpace objects, you can choose to
print properties of enclosing DSpace objects. For example, when printing
bitstreams in a collection, you can include bundle names, item handles, even
item metadata values by using options like these:
--include
'object,name,mimeType,BUNDLE.name,ITEM.handle,ITEM.dc.contributor.author'
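To make the structure of such an --include value concrete, the following Python sketch splits it into pairs of (enclosing object type, property). This is an assumption about how the value could be interpreted, not the lister's actual parser; parse_include and ENCLOSING_TYPES are hypothetical names.

```python
from typing import List, Optional, Tuple

# Assumed prefixes that refer to an enclosing object rather than the listed one.
ENCLOSING_TYPES = {"COMMUNITY", "COLLECTION", "ITEM", "BUNDLE"}


def parse_include(spec: str) -> List[Tuple[Optional[str], str]]:
    """Split an --include value into (enclosing type or None, property) pairs."""
    pairs: List[Tuple[Optional[str], str]] = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        head, sep, rest = entry.partition(".")
        if sep and head in ENCLOSING_TYPES:
            # "BUNDLE.name" means the name of the enclosing bundle.
            pairs.append((head, rest))
        else:
            # Plain entries are properties of the listed object itself.
            pairs.append((None, entry))
    return pairs


for scope, prop in parse_include("object,name,mimeType,BUNDLE.name,ITEM.handle"):
    print(scope or "(self)", prop)
```

Splitting only at the first dot keeps dotted property names after the prefix in one piece.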
The lister works nicely with the other commands, since all four commands use
the same mechanism to select the objects they work on. For example you might
use the lister to review which DSpace objects need policy or metadata changes.
After applying changes, it comes in handy for checking that the changes made
are in fact the ones that were intended.
The policy tool decides which action to apply to each DSpaceObject selected by
the --root and --type parameters based on three options:
--action [ADD | DEL ] - whether to add or delete policies
--dspace_action [READ | WRITE | REMOVE | ... ]
--who [group | eperson]
For example
dspace bulk-pols -r handle/712657 -t BITSTREAM --action ADD --dspace_action
WRITE --who EPERSON.monikam
gives the eperson monikam WRITE privileges on all bitstreams
contained in the object behind the given handle, which may be a community,
collection, or item.
dspace bulk-pols -r handle/712657 -t BITSTREAM -a DEL -d READ -w
GROUP.Anonymous
removes the READ permission from the Anonymous group.
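The option set above can be mirrored in a short Python argparse sketch. It only models the command line shown in the two examples; it is not the tool's implementation. The functions build_parser and split_who are hypothetical, and the rule that --who is always GROUP.name or EPERSON.name is inferred from the examples.

```python
import argparse
from typing import Tuple


def build_parser() -> argparse.ArgumentParser:
    """Model the bulk-pols options from the examples (illustrative only)."""
    parser = argparse.ArgumentParser(prog="bulk-pols")
    parser.add_argument("-r", "--root", required=True)
    parser.add_argument("-t", "--type", required=True, type=str.upper,
                        choices=["COLLECTION", "ITEM", "BUNDLE", "BITSTREAM"])
    parser.add_argument("-a", "--action", required=True, choices=["ADD", "DEL"])
    # The list of DSpace actions is left open here (READ, WRITE, REMOVE, ...).
    parser.add_argument("-d", "--dspace_action", required=True)
    parser.add_argument("-w", "--who", required=True)
    return parser


def split_who(who: str) -> Tuple[str, str]:
    """Split 'GROUP.Anonymous' or 'EPERSON.monikam' into (kind, name)."""
    kind, sep, name = who.partition(".")
    if not sep or kind.upper() not in ("GROUP", "EPERSON") or not name:
        raise ValueError(f"expected GROUP.name or EPERSON.name, got {who!r}")
    return kind.upper(), name


args = build_parser().parse_args(
    ["-r", "handle/712657", "-t", "BITSTREAM",
     "-a", "DEL", "-d", "READ", "-w", "GROUP.Anonymous"]
)
print(args.action, args.dspace_action, split_who(args.who))
```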
The metadata tool works similarly to the policy tool. Of course, it only makes
sense to apply it to item sets.
The bitstream replacer works on single bitstreams. It is related to the other
tools in that it selects the bitstream to work on in the same fashion, that is,
with --root and --type arguments.
I developed these commands in connection with a project here at Princeton,
where I needed to add a cover page to all bitstreams in original bundles in a
community. The lister gave me the list of bitstreams. Printing the list in txt
format allowed me to grep for name=ORIGINAL. I included the mimeType in the
listing, so I would only work on PDF documents. Including the internalId
allowed me to use the file right from the assetstore and feed it into my 'add
the cover page' script. I replaced the old bitstream using the IDs printed
earlier to define the --root parameter to the bitstream replacer. Finally I
used the lister to check on the access policies of the bitstreams. Right now I
run the lister command in a cronjob to watch the submission progress in one of
our communities.
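The filtering step of that workflow might look like the Python sketch below. The layout of the lister's txt output is not spelled out in this message, so the sketch assumes one object per line with whitespace-separated key=value fields and a BUNDLE.name key; the sample lines are made up purely to exercise the code.

```python
from typing import Dict, Iterable, List


def parse_line(line: str) -> Dict[str, str]:
    """Turn one 'key=value key=value ...' line into a dict (assumed layout)."""
    fields: Dict[str, str] = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def select_pdf_originals(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Keep bitstreams in ORIGINAL bundles whose mimeType is application/pdf."""
    selected = []
    for line in lines:
        fields = parse_line(line)
        if (fields.get("BUNDLE.name") == "ORIGINAL"
                and fields.get("mimeType") == "application/pdf"):
            selected.append(fields)
    return selected


# Made-up sample lines, not real lister output.
sample = [
    "object=BITSTREAM.1 mimeType=application/pdf BUNDLE.name=ORIGINAL internalId=111",
    "object=BITSTREAM.2 mimeType=text/plain BUNDLE.name=LICENSE internalId=222",
    "object=BITSTREAM.3 mimeType=image/png BUNDLE.name=ORIGINAL internalId=333",
]
for row in select_pdf_originals(sample):
    print(row["object"], row["internalId"])
```

Values containing spaces, such as some file names, would need a different split rule than the one assumed here.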
I wrote more detailed documentation which is part of the pull request that I
created for this code. Here at Princeton we are still running 1.8. The bulk-do
code mostly lives in its own package and should play well with version 3 (I
have not tried it). The PR is based on the master. In other words unless you
run pre 1.8, merging this into your version should be relatively painless -
and it goes without saying - I'd help sort out conflicts.
The PR is at https://github.com/DSpace/DSpace/pull/560 and the documentation
is at
https://github.com/akinom/DSpace/blob/prq_bulk_commands/dspace-api/src/main/java/org/dspace/app/bulkdo/README.md
I believe this code would be useful for many DSpace administrators. It would
be straightforward to add a JSON/XML output format to offer this functionality
in the REST API. So please have a look, send feedback, and possibly step up as
a volunteer tester / reviewer.
Monika
—
Monika Mevenkamp
phone: 609-258-4161
123 693 Alexander Street, Princeton University, Princeton, NJ 08544