Re: [fcrepo-user] Akubra and multiple file systems

Peter Gorman Fri, 19 Apr 2013 07:57:57 -0700

Is this in the KB?

Peter C. Gorman
Head, University of Wisconsin Digital Collections Center
[email protected]
(608) 265-5291




On Apr 18, 2013, at 11:47 AM, Scott Prater wrote:

> We're using the approach you describe here at UW Madison.  Basically, we 
> have a number of mounted file systems, with a structure that looks like 
> this:
> 
> /fedora/datamp/data01
> /fedora/datamp/data02
> /fedora/datamp/data03...
> 
> Then we have directories with symbolic links into the mounted file 
> systems:
> 
> /fedora/data/objects/a -> /fedora/datamp/data01/objects/a
> /fedora/data/datastreams/a -> /fedora/datamp/data01/datastreams/a
> /fedora/data/objects/b -> /fedora/datamp/data02/objects/b
> /fedora/data/datastreams/b -> /fedora/datamp/data02/datastreams/b
> /fedora/data/objects/c -> /fedora/datamp/data03/objects/c
> /fedora/data/datastreams/c -> /fedora/datamp/data03/datastreams/c
> 
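A minimal sketch of how a symlink layout like the one above could be scripted. It runs against a temporary directory rather than /fedora, and the function name, the assignment dictionary, and the choice to pre-create the target directories are assumptions for illustration, not part of the setup described in this thread.

```python
import os
import tempfile
from pathlib import Path


def link_shards(mount_root, link_root, assignment,
                stores=("objects", "datastreams")):
    """Create link_root/<store>/<shard> -> mount_root/<fs>/<store>/<shard>.

    `assignment` maps a top-level hash directory (e.g. "a") to the name
    of the mounted file system that should hold it (e.g. "data01").
    """
    created = []
    for shard, fs_name in sorted(assignment.items()):
        for store in stores:
            target = Path(mount_root) / fs_name / store / shard
            target.mkdir(parents=True, exist_ok=True)
            link = Path(link_root) / store / shard
            link.parent.mkdir(parents=True, exist_ok=True)
            # Leave anything that already exists at the link path alone.
            if not link.is_symlink() and not link.exists():
                os.symlink(target, link)
            created.append((link, target))
    return created


def demo():
    """Build the three-shard example from the email under a temp directory."""
    base = Path(tempfile.mkdtemp())
    assignment = {"a": "data01", "b": "data02", "c": "data03"}
    return base, link_shards(base / "datamp", base / "data", assignment)


if __name__ == "__main__":
    _, links = demo()
    for link, target in links:
        print(f"{link} -> {target}")
```

On a real system the mount root and link root would be /fedora/datamp and /fedora/data, and the links would need to exist before Fedora first writes to the stores.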
> And in our akubra.xml file, we have the object and datastream roots and 
> hash paths configured:
> 
> <bean name="fsObjectStore" class="org.akubraproject.fs.FSBlobStore" 
> singleton="true">
>     <constructor-arg value="urn:example.org:fsObjectStore"/>
>     <constructor-arg value="/fedora/data/objects"/>
> </bean>
> 
> <bean name="fsObjectStoreMapper" 
> class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper" 
> singleton="true">
>     <constructor-arg value="#/##/##"/>
> </bean>
> 
> <bean name="fsDatastreamStore" class="org.akubraproject.fs.FSBlobStore" 
> singleton="true">
>     <constructor-arg value="urn:example.org:fsDatastreamStore"/>
>     <constructor-arg value="/fedora/data/datastreams"/>
> </bean>
> 
> <bean name="fsDatastreamStoreMapper" 
> class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper" 
> singleton="true">
>     <constructor-arg value="#/##/##"/>
> </bean>
> 
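To illustrate what a pattern such as "#/##/##" does, here is a rough sketch of a hash-path mapping. It assumes the path is built from the leading hex digits of an MD5 digest of the blob id, which is my understanding of HashPathIdMapper but should be checked against the Fedora source. The function names and the sample id are hypothetical.

```python
import hashlib


def hash_path(blob_id: str, pattern: str = "#/##/##") -> str:
    """Replace each '#' in the pattern with the next hex digit of the hash."""
    # Assumption: MD5 over the id string; the real mapper may differ in detail.
    digest = hashlib.md5(blob_id.encode("utf-8")).hexdigest()
    if pattern.count("#") > len(digest):
        raise ValueError("pattern needs more digits than the hash provides")
    digits = iter(digest)
    return "".join(next(digits) if ch == "#" else ch for ch in pattern)


def top_level_dirs(pattern: str) -> int:
    """Number of possible top-level directories: 16 per '#' in the first segment."""
    return 16 ** pattern.split("/")[0].count("#")


if __name__ == "__main__":
    print(hash_path("info:fedora/demo:1"))
    print(top_level_dirs("##"), top_level_dirs("#/##/##"))
```

With this reading, "##" allows 256 top-level directories and "#/##/##" allows 16, which matches the counts in the original question below.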
> We have four mount points on our production machine, with four top-level 
> directories allocated to each mount point (the single leading # in the 
> pattern yields 16 top-level directories in total).
> 
> One of the beauties of hashed directory and file paths is that all the 
> file systems should fill up evenly: the hash spreads objects roughly 
> uniformly across the top-level directories, and therefore across the 
> file systems they are linked to.
> 
> /fedora/datamp/data01:  68%
> /fedora/datamp/data02:  68%
> /fedora/datamp/data03:  68%
> /fedora/datamp/data04:  69%
> 
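The even fill can be checked with a small simulation. This sketch hashes synthetic ids, takes the first hex digit as the top-level directory, and counts how many land on each of four mount points. The ids, the use of MD5, and the assignment of four consecutive hex digits per mount point are all assumptions for illustration. It counts objects only, so it says nothing about skew caused by a few very large datastreams.

```python
import hashlib
from collections import Counter

HEX = "0123456789abcdef"
# Hypothetical assignment: four consecutive hex digits per mount point.
MOUNT_FOR_DIGIT = {d: f"data{i // 4 + 1:02d}" for i, d in enumerate(HEX)}


def mount_shares(n_ids: int = 20000) -> dict:
    """Return the fraction of synthetic ids that land on each mount point."""
    counts = Counter()
    for n in range(n_ids):
        blob_id = f"info:fedora/demo:{n}"  # synthetic ids, not real data
        first_digit = hashlib.md5(blob_id.encode("utf-8")).hexdigest()[0]
        counts[MOUNT_FOR_DIGIT[first_digit]] += 1
    mounts = sorted(set(MOUNT_FOR_DIGIT.values()))
    return {mount: counts[mount] / n_ids for mount in mounts}


if __name__ == "__main__":
    for mount, share in mount_shares().items():
        print(f"{mount}: {share:.1%}")
```

Each mount point should come out close to a quarter of the ids, which is consistent with the near-identical percentages reported above.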
> -- Scott
> 
> On 04/18/2013 10:42 AM, Gary Phillips wrote:
>> Hello,
>> 
>> I've spent some time looking at akubra and the HashPathIdMapper to get a
>> feel for how we would distribute our datastreamStore over multiple file
>> systems.  The default configuration (##) creates 256 potential
>> directories.  Changing that to # (or something like #/##) would give us
>> 16 top level sub-directories, which we could work with.  However, there
>> is another issue, in that I don't see how we can easily predict how each
>> of those directories would grow, if I am understanding how the files are
>> distributed across the directories.
>> 
>> I assume one solution might involve sym linking the top level
>> directories over to a few directories that each correspond to a mount point.
>> 
>> Have other people tackled this particular problem (and what solutions
>> did you come up with) or found another way around distributing large
>> amounts of Fedora data over multiple mount points?  Thanks in advance.
>>
>> _______________________________________________
>> Fedora-commons-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>> 
> 
> 
> -- 
> Scott Prater
> Shared Development Group
> General Library System
> University of Wisconsin - Madison
> [email protected]
> 5-5415
> 
