I have been looking at making DSpace use an S3 bucket when it stores a database entry, so that the metadata goes into the Oracle database and the content goes directly to S3. This is far less costly than using EBS volumes at the 50 TB scale we are looking at.
With s3cmd, using HTTP access to S3 performs much better than going through an operating system call. An HTTP PUT or GET goes straight between the file's current storage location and S3. The operating system call first copies the file down to the EC2 instance, then copies it over to S3. That copy is sequential, so start it and take a nap.

When DSpace writes the content outside the database, the command would be something like:

    s3cmd put filename.pdf s3://something/subdir1/subdir2/filename.pdf

Note that I made S3 look like a file system, which it is not. But doing it this way makes S3 look like a file system to the end user.

Operating system file access through s3fs is really slow and not that reliable. The mount tends to fail and has to be remounted from time to time.

I think the DSpace ItemImport.java class can be modified to write the external data to S3 this way. Has this been looked at in the past? Is there a clean way to do it?

Thank you.

Charles Keagle
Sr. Cloud Engineer | 2nd Watch
http://www.2ndwatch.com
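P.S. To make the s3cmd call above concrete, here is a minimal Java sketch of how the command line could be assembled before handing it to a process launcher. It is only an illustration under assumptions: the class name S3PutCommand, its methods, and the bucket and prefix arguments are hypothetical and are not part of DSpace or ItemImport.java. It assumes s3cmd is installed and configured on the host, and it uses the s3:// URI form that `s3cmd put` accepts.

```java
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of DSpace: builds the s3cmd invocation
// that copies one content file into a bucket under a pseudo-directory prefix.
final class S3PutCommand {

    private S3PutCommand() {}

    // Joins bucket, prefix and file name into an s3:// URI.
    // S3 has no real directories; the slashes are simply part of the object key.
    static String s3Uri(String bucket, String prefix, String fileName) {
        // Drop leading and trailing slashes so the key has no empty segments.
        String cleanPrefix = prefix.replaceAll("^/+|/+$", "");
        String key = cleanPrefix.isEmpty() ? fileName : cleanPrefix + "/" + fileName;
        return "s3://" + bucket + "/" + key;
    }

    // Returns the argument list for: s3cmd put <localFile> <s3Uri>
    static List<String> build(Path localFile, String bucket, String prefix) {
        String fileName = localFile.getFileName().toString();
        return List.of("s3cmd", "put", localFile.toString(),
                s3Uri(bucket, prefix, fileName));
    }
}
```

The resulting list could be passed to `new ProcessBuilder(cmd).inheritIO().start()`, checking the exit code afterwards. Keeping the command as a list of arguments, instead of one concatenated string, avoids quoting problems with file names that contain spaces.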
_______________________________________________ DSpace-tech mailing list DSpace-tech@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspace-tech List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette