On Thu, May 10, 2012 at 04:54:17PM -0400, Diego Menéndez wrote:
> Is DSpace able to retrieve possibly large data files directly from tape
> archives? I wonder how the response to the user would be handled if the
> file is not readily available and the user has to wait until the
> tape is placed in the tape drive. Alternatively, is there any extension
> that achieves this? If so, does anyone have experience with it that they
> would like to share?

I don't believe there is anything in stock DSpace to do this.

There are several layers to this problem:

o  You noted the issue of long access delays.  I don't recall anything
   in the user interface design which would permit the insertion of a
   "please wait for retrieval of offline resources" page in the flow.
   One might return a status page saying that the request has failed
   temporarily because the file is being fetched, and invite the user
   to repeat the request after a short wait.  One could even send an
   email to a registered user when the bitstream is available.  (Heh,
   extend the EPerson model a little and one could offer to send a TXT
   instead.)

o  I would approach the actual storage and retrieval as a new type of
   assetstore.  It would have to deal with the linearity of tape
   storage and possibly the great length of bitstreams' internal
   identifiers.  You'd probably want a catalog of tape contents stored
   in a new database table.  It sounds like it would be fun to write.
   See dspace-api:org.dspace.storage.bitstore.BitstreamStorageManager
   for the place to begin exploring.

   (I just realized that you may be talking about something like a tape
   containing a single 'tar' archive or the like, or even several
   tapes containing segments of such an archive.  I had been thinking
   of individual files on ANSI-labelled tapes, which may give you a
   clue to my age. :-)  Archive container files could be even more fun.)

   I'd probably want to cache a few recently-retrieved files on disk.
   More fun stuff to write.
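
To make the pieces above a little more concrete, here is a rough sketch of
how the catalog, the disk cache, and the "come back later" response might
fit together.  It is written in Python for brevity and is not DSpace code:
every name in it (TapeBackedStore, Result, read_from_tape, stage_pending,
the 300-second retry hint) is hypothetical, and the "cache" is an in-memory
dictionary standing in for files on disk.  Pending requests are sorted by
tape and position so that each tape would be mounted once and read front
to back.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Result:
    """Outcome of a request: either the bytes, or a hint to retry later."""
    ready: bool
    data: Optional[bytes] = None
    retry_after_seconds: Optional[int] = None


class TapeBackedStore:
    """Hypothetical tape-backed store; not part of DSpace."""

    def __init__(self,
                 catalog: Dict[str, Tuple[str, int]],
                 read_from_tape: Callable[[str, int], bytes],
                 cache_size: int = 2,
                 retry_after_seconds: int = 300) -> None:
        self.catalog = catalog                # bitstream id -> (tape label, position)
        self.read_from_tape = read_from_tape  # assumed to block until the data is read
        self.cache_size = cache_size
        self.retry_after_seconds = retry_after_seconds
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()  # least recently used first
        self.pending: List[str] = []          # ids waiting to be staged

    def request(self, bitstream_id: str) -> Result:
        if bitstream_id not in self.catalog:
            raise KeyError(bitstream_id)
        if bitstream_id in self.cache:
            self.cache.move_to_end(bitstream_id)  # mark as recently used
            return Result(ready=True, data=self.cache[bitstream_id])
        if bitstream_id not in self.pending:
            self.pending.append(bitstream_id)
        return Result(ready=False, retry_after_seconds=self.retry_after_seconds)

    def stage_pending(self) -> List[str]:
        """Read all pending bitstreams into the cache, in tape order."""
        staged = []
        for bitstream_id in sorted(self.pending, key=lambda b: self.catalog[b]):
            tape, position = self.catalog[bitstream_id]
            self.cache[bitstream_id] = self.read_from_tape(tape, position)
            self.cache.move_to_end(bitstream_id)
            while len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)    # evict the least recently used
            staged.append(bitstream_id)
        self.pending.clear()
        return staged
```

The list returned by stage_pending() is where an email (or TXT)
notification could hook in.  Note that with a cache smaller than the batch
being staged, items read early in the batch can be evicted before anyone
fetches them, so a real design would need a smarter policy than plain LRU.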

I wonder whether the cost of developing all that would be less than
the cost of enough disk drives to replace the tapes.  Or are these
existing tapes to be registered in the assetstore as-is?

I have heard rumors that someone at IU Bloomington has an assetstore
implemented on IBM HPSS nearline (tape staged to disk) storage.  I
haven't heard any details.  The tape robots and staging process are
speedy -- for tape -- but would still require considerable patience
from the user if the required bitstream is not currently cached.

-- 
Mark H. Wood, Lead System Programmer   [email protected]
Asking whether markets are efficient is like asking whether people are smart.


_______________________________________________
Dspace-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-general
