We are in the process of migrating several hundred gigabytes of
repository content from a CMS to a Fedora 3.x installation.
One decision we still have to make is whether to store the
assets (mostly PDF files at the moment) as managed or external content.
Some of the PDF files can be several hundred megabytes in size.
 
The conversion strategy so far has been to create FOXML on disk
with several datastreams embedded, and then ingest it using the client
command-line scripts. With the large PDF files embedded as datastreams,
the Java client crashes with out-of-memory errors, even when I increase
the heap to a seemingly sufficient size (-Xms512m -Xmx640m).
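
As a rough back-of-the-envelope sketch of why the embedded approach might
exhaust the heap: base64 encoding inflates binary content by a factor of
4/3. Whether the client holds the whole encoded datastream in memory at
once is an assumption on my part, and the function name is only for
illustration.

```python
def base64_size(raw_bytes: int) -> int:
    """Size in bytes of the padded base64 encoding of raw_bytes bytes.

    Every 3 input bytes become 4 output characters; a final partial
    group is padded to a full 4 characters. Line breaks are not counted.
    """
    return 4 * ((raw_bytes + 2) // 3)


MIB = 1024 * 1024

# A 300 MiB PDF grows to 400 MiB once embedded as base64 text.
pdf_size = 300 * MIB
encoded_mib = base64_size(pdf_size) // MIB
```

If the encoded text is also kept as a Java String or copied into a buffer
during parsing (again, an assumption), the footprint grows further, so a
640 MB heap may be too small for files of this size.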
 
So I wonder, what kind of content are other users storing? What are the
largest datastreams you have stored? And do you ingest them
with FOXML in one go, or use something like an API-M call to add the
datastream after the object has already been created?
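
One alternative to embedding that I am aware of is a FOXML datastream that
points at the file by URL through a contentLocation element instead of
carrying it inline as binaryContent. The sketch below only builds such a
fragment. The helper name, datastream ID, label, and URL are hypothetical,
and it assumes the repository can fetch that URL at ingest time.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
ET.register_namespace("foxml", FOXML_NS)


def datastream_by_reference(ds_id, label, mime_type, url, control_group="M"):
    """Build a FOXML datastream element that references content by URL.

    control_group "M" is managed content, "E" is external content.
    Nothing is base64-encoded here; only the location is recorded.
    """
    ds = ET.Element(
        f"{{{FOXML_NS}}}datastream",
        {
            "ID": ds_id,
            "STATE": "A",
            "CONTROL_GROUP": control_group,
            "VERSIONABLE": "true",
        },
    )
    version = ET.SubElement(
        ds,
        f"{{{FOXML_NS}}}datastreamVersion",
        {"ID": f"{ds_id}.0", "LABEL": label, "MIMETYPE": mime_type},
    )
    ET.SubElement(
        version,
        f"{{{FOXML_NS}}}contentLocation",
        {"TYPE": "URL", "REF": url},
    )
    return ET.tostring(ds, encoding="unicode")


# Hypothetical example values, not taken from our repository.
fragment = datastream_by_reference(
    "PDF", "Full text", "application/pdf", "http://example.org/files/report.pdf"
)
```

The FOXML file then stays small regardless of the PDF size, which is the
property I am after.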
 
Any thoughts appreciated.
 
Etienne Posthumus
resident propellerhead
TU Delft Library
Netherlands
---
http://www.library.tudeflt.nl/
_______________________________________________
Fedora-commons-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
