We're using the approach you describe here at UW Madison.  Basically, we 
have a number of mounted file systems, with a structure that looks like 
this:

/fedora/datamp/data01
/fedora/datamp/data02
/fedora/datamp/data03...

Then we have directories with symbolic links into the mounted file 
systems:

/fedora/data/objects/a -> /fedora/datamp/data01/objects/a
/fedora/data/datastreams/a -> /fedora/datamp/data01/datastreams/a
/fedora/data/objects/b -> /fedora/datamp/data02/objects/b
/fedora/data/datastreams/b -> /fedora/datamp/data02/datastreams/b
/fedora/data/objects/c -> /fedora/datamp/data03/objects/c
/fedora/data/datastreams/c -> /fedora/datamp/data03/datastreams/c
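
A layout like this could be scripted instead of linked by hand. What 
follows is only a sketch: the function name, the sixteen hex top-level 
directories (matching a single leading # in the hash pattern), and the 
round-robin assignment of directories to mount points are assumptions 
for illustration. Any fixed assignment would work as well.

```python
import os

HEX_DIGITS = "0123456789abcdef"

def link_hash_dirs(link_root, mount_points, stores=("objects", "datastreams")):
    """Create link_root/<store>/<hex> symlinks pointing into the mount points.

    Top-level hash directories are dealt out to the mount points in turn
    (an assumption for this sketch, not a requirement).
    Returns a dict mapping each link path to its target.
    """
    created = {}
    for store in stores:
        os.makedirs(os.path.join(link_root, store), exist_ok=True)
        for i, digit in enumerate(HEX_DIGITS):
            mount = mount_points[i % len(mount_points)]
            target = os.path.join(mount, store, digit)
            os.makedirs(target, exist_ok=True)  # real directory on the mount
            link = os.path.join(link_root, store, digit)
            if not os.path.islink(link):
                os.symlink(target, link)
            created[link] = target
    return created
```

It would be called with something like 
link_hash_dirs("/fedora/data", ["/fedora/datamp/data01", "/fedora/datamp/data02"]), 
run before Fedora has written anything under /fedora/data.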

And in our akubra.xml file, we have the object and datastream roots and 
hash paths configured:

<bean name="fsObjectStore" class="org.akubraproject.fs.FSBlobStore" 
singleton="true">
     <constructor-arg value="urn:example.org:fsObjectStore"/>
     <constructor-arg value="/fedora/data/objects"/>
</bean>

<bean name="fsObjectStoreMapper" 
class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper" 
singleton="true">
     <constructor-arg value="#/##/##"/>
</bean>

<bean name="fsDatastreamStore" class="org.akubraproject.fs.FSBlobStore" 
singleton="true">
     <constructor-arg value="urn:example.org:fsDatastreamStore"/>
     <constructor-arg value="/fedora/data/datastreams"/>
</bean>

<bean name="fsDatastreamStoreMapper" 
class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper" 
singleton="true">
     <constructor-arg value="#/##/##"/>
</bean>
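
To get a feel for what the "#/##/##" pattern does to a given object, 
here is a rough Python sketch. It assumes the mapper replaces each # 
with successive hex digits of an MD5 hash of the object's URI; that is 
my understanding of HashPathIdMapper, but check it against the source 
before relying on it. The function names and the sample PID are made up.

```python
import hashlib

def hash_path(uri, pattern="#/##/##"):
    """Replace each '#' in pattern with the next hex digit of md5(uri)."""
    digits = iter(hashlib.md5(uri.encode("utf-8")).hexdigest())
    return "".join(next(digits) if ch == "#" else ch for ch in pattern)

def top_level_dir_count(pattern):
    """Possible top-level directories: 16 for each '#' in the first segment."""
    return 16 ** pattern.split("/")[0].count("#")

# Hypothetical PID, for illustration only.
print(hash_path("info:fedora/demo:1"))
print(top_level_dir_count("#/##/##"), top_level_dir_count("##"))
```

The first path segment is the one that gets symlinked, so a single 
leading # keeps the number of links to manage at sixteen per store.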

We have four mount points on our production machine, with four top-level 
directories allocated to each mount point.

One of the beauties of hashed directory and file paths is that all the 
file systems should fill up evenly: the hash spreads objects more or 
less uniformly across the top-level directories, and so across the file 
systems behind them.  Current usage on our mount points:

/fedora/datamp/data01:  68%
/fedora/datamp/data02:  68%
/fedora/datamp/data03:  68%
/fedora/datamp/data04:  69%
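
That kind of balance is what a uniform hash should give. A quick 
simulation, again assuming the first MD5 hex digit of the URI picks the 
top-level directory and that the sixteen directories are dealt out to 
four mount points in turn (the PIDs are synthetic, and mount_shares is 
a made-up helper):

```python
import hashlib
from collections import Counter

def mount_shares(uris, n_mounts=4):
    """Fraction of objects whose top-level hash directory lands on each mount."""
    counts = Counter()
    for uri in uris:
        first_digit = hashlib.md5(uri.encode("utf-8")).hexdigest()[0]
        # Directory index 0..15, assigned to mounts round-robin (assumption).
        counts[int(first_digit, 16) % n_mounts] += 1
    total = sum(counts.values())
    return [counts[m] / total for m in range(n_mounts)]

shares = mount_shares("info:fedora/demo:%d" % n for n in range(20000))
print(["%.1f%%" % (100 * s) for s in shares])
```

Note that this counts objects, not bytes. Disk usage evens out the same 
way only when object and datastream sizes aren't badly skewed.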

-- Scott

On 04/18/2013 10:42 AM, Gary Phillips wrote:
> Hello,
>
> I've spent some time looking at akubra and the HashPathIdMapper to get a
> feel for how we would distribute our datastreamStore over multiple file
> systems.  The default configuration (##) creates 256 potential
> directories.  Changing that to # (or something like #/##) would give us
> 16 top level sub-directories, which we could work with.  However, there
> is another issue, in that I don't see how we can easily predict how each
> of those directories would grow, if I am understanding how the files are
> distributed across the directories.
>
> I assume one solution might involve sym linking the top level
> directories over to a few directories that each correspond to a mount point.
>
> Have other people tackled this particular problem (and what solutions
> did you come up with) or found another way around distributing large
> amounts of Fedora data over multiple mount points?  Thanks in advance.
>
> _______________________________________________
> Fedora-commons-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>


-- 
Scott Prater
Shared Development Group
General Library System
University of Wisconsin - Madison
[email protected]
5-5415
